Task scheduling gets stuck - java

I am currently trying to get a grip on OptaPlanner, as it seems to be the perfect solution for a problem I have.
Basically, the project job scheduling example is what I am going for, but as I only know my Java basics, it is way too complex to start with. So I am starting with a very limited example and working my way up from there:
I have tasks with a duration and one defined predecessor.
The planning entity is the time each task starts.
I have a hard score that penalizes a task that starts before startTime + duration of its predecessor. I also have a soft score that tries to reduce gaps, keeping the overall process as short as possible.
public HardSoftScore calculateScore(Schedule schedule) {
    int hardScore = 0;
    int softScore = 0;
    for (Task task : schedule.getTaskList()) {
        int endTime = task.getAllocation().getStartTime() + task.getDuration();
        softScore = -endTime;
        for (Task task2 : schedule.getTaskList()) {
            if (task.getId() != task2.getId()) {
                if (task2.getPredecessorId() == task.getId()) {
                    if (endTime > task2.getAllocation().getStartTime()) {
                        hardScore += task2.getAllocation().getStartTime() - endTime;
                    }
                }
            }
        }
    }
    return HardSoftScore.valueOf(hardScore, softScore);
}
This is the solver config:
<?xml version="1.0" encoding="UTF-8"?>
<solver>
  <!--<environmentMode>FAST_ASSERT</environmentMode>-->

  <!-- Domain model configuration -->
  <solutionClass>com.foo.scheduler.domain.Schedule</solutionClass>
  <planningEntityClass>com.foo.scheduler.domain.Task</planningEntityClass>

  <!-- Score configuration -->
  <scoreDirectorFactory>
    <scoreDefinitionType>HARD_SOFT</scoreDefinitionType>
    <simpleScoreCalculatorClass>com.foo.scheduler.solver.score.SchedulingSimpleScoreCalculator</simpleScoreCalculatorClass>
  </scoreDirectorFactory>

  <!-- Optimization algorithms configuration -->
  <termination>
    <maximumSecondsSpend>100</maximumSecondsSpend>
  </termination>
  <constructionHeuristic>
    <constructionHeuristicType>FIRST_FIT</constructionHeuristicType>
  </constructionHeuristic>
  <localSearch>
    <acceptor>
      <entityTabuSize>7</entityTabuSize>
    </acceptor>
    <forager>
      <acceptedCountLimit>1000</acceptedCountLimit>
    </forager>
  </localSearch>
</solver>
The problem is that this works great as long as I only use the hard score, but of course the resulting schedule then has gaps. As soon as I add the soft score, everything gets stuck after about 10 steps. Why?
[...]
2014-05-03 20:01:31,966 [main] DEBUG Step index (10), time spend (495), score (-35hard/-66soft), best score (-34hard/-68soft), accepted/selected move count (1000/19884) for picked step (com.foo.scheduler.domain.Task#35480096 => com.foo.scheduler.domain.Allocation#f9a4520).
2014-05-03 20:03:11,471 [main] DEBUG Step index (11), time spend (100000), score (-35hard/-65soft), best score (-34hard/-68soft), accepted/selected move count (0/105934687) for picked step (com.foo.scheduler.domain.Task#7050c91f => com.foo.scheduler.domain.Allocation#47c44bd4).

A selected move count of 105934687 at step 11 clearly indicates that no move is being accepted. I don't see how a soft score could trigger that, though.
There is only one explanation:
The EntityTabuAcceptor doesn't accept any move, because they are all tabu, which means every move's planning entities are in the tabu list. This is possible if you have a very small dataset (14 or fewer planning entities). Turn on TRACE logging and the log will confirm this.
Each of these workarounds should fix it:
Use Late Acceptance
<acceptor>
  <lateAcceptanceSize>400</lateAcceptanceSize>
</acceptor>
<forager>
  <acceptedCountLimit>1</acceptedCountLimit>
</forager>
Use <entityTabuRatio> instead of <entityTabuSize>
Mess with the <entityTabuSize> based on the dataset size with SolverFactory.getSolverConfig(). Not recommended!
Why less than 14 planning entities?
Because by default you get a <changeMoveSelector> and a <swapMoveSelector>. The <swapMoveSelector> swaps 2 entities, making both tabu if it wins a step. The tabu size is counted in steps, so if 7 swap moves win steps in a row, there can be 14 entities in the tabu list.

Related

Optaplanner skipping Phases and returning empty solution

I'm working on a scheduling project for classes (teachers, lessons, time). I'm using OptaPlanner as part of a Spring Boot application. The test code compiles and runs correctly, but the result contains an empty solution. In the log output I see this message:
rted: time spent (11), best score (0hard/0soft), environment mode (REPRODUCIBLE), move thread count (NONE), random (JDK with seed 0).
2021-09-28 22:39:26.619 INFO 2579 --- [pool-1-thread-1] o.o.core.impl.solver.DefaultSolver : Skipped all phases (2): out of 0 planning entities, none are movable (non-pinned).
2021-09-28 22:39:26.620 INFO 2579 --- [pool-1-thread-1] o.o.core.impl.solver.DefaultSolver : Solving ended: time spent (16), best score (0hard/0soft), score calculation speed (62/sec), phase total (2), environment mode (REPRODUCIBLE), move thread count (NONE).
The problem is in the test calculator I wrote: I'm trying to loop over the possible solution and actually decrease the cost a bit sometimes, or even increase it, but it isn't taking effect. I'm looping and trying to log the objects, but nothing is being logged. This is the code of the calculator:
public class ScheduleScoreCalculator implements EasyScoreCalculator<ScheduleTable, HardSoftScore> {

    @Override
    public HardSoftScore calculateScore(ScheduleTable scheduleTable) {
        int hardScore = 0;
        int softScore = 0;
        List<ScheduledClass> scheduledClassList = scheduleTable.getScheduledClasses();
        System.out.println(scheduleTable);
        System.out.println("Hmmmmm ---"); // this is logged but the score is not changing
        for (ScheduledClass a : scheduledClassList) {
            for (ScheduledClass b : scheduledClassList) {
                if (a.getTeacher().getTeacherId() > 17000L) {
                    hardScore += 18;
                }
                if (a.getTimeslot() != null && a.getTimeslot().equals(b.getTimeslot())
                        && a.getId() < b.getId()) {
                    if (a.getTeacher() != null && a.getTeacher().equals(b.getTeacher())) {
                        hardScore--;
                    }
                    if (a.getTeacher().equals(b.getTeacher())) {
                        hardScore--;
                    }
                } else {
                    hardScore++;
                    softScore += 2;
                }
            }
        }
        return HardSoftScore.of(hardScore, softScore);
    }
}
So please, any idea why OptaPlanner might skip creating possible solutions?
The issue was simpler than I thought. The solution class annotated with @PlanningSolution has a property "scheduledClasses" annotated with @PlanningEntityCollectionProperty. My mistake was that this property was initialized with an empty list (ArrayList); the fix was to initialize it properly. In retrospect, I think the documentation is partly to blame here: the provided example didn't mention that this collection must not be null (otherwise an exception is raised), and it must not be an empty list either. You need to initialize it with entity instances, without setting any value for the movable properties (those annotated with @PlanningVariable).
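A minimal sketch of the kind of initialization that fixed it (ScheduleTable, ScheduledClass and the setters are assumptions based on the question, not the real project code):

// The @PlanningEntityCollectionProperty list must be non-null AND non-empty,
// otherwise the solver logs "out of 0 planning entities, none are movable" and skips all phases.
ScheduleTable problem = new ScheduleTable();
List<ScheduledClass> classesToSchedule = new ArrayList<>();
for (long id = 1; id <= 10; id++) {
    ScheduledClass sc = new ScheduledClass();
    sc.setId(id);
    // Do NOT assign the @PlanningVariable fields (timeslot, teacher) here;
    // OptaPlanner fills them in during solving.
    classesToSchedule.add(sc);
}
problem.setScheduledClasses(classesToSchedule);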
Thanks to @Lukáš Petrovický, as his comment helped me do the correct investigation!

Drools comparing Sets

I'm trying to use OptaPlanner to replace myself in scheduling our work planning.
The system has a MySQL database containing the necessary information and relationships. For this issue I'll only use the three tables I need:
Employees --> Have Skills
Jobs --> Have Skills
Skills
In Drools I have the rule
rule 'required Skills'
when
    Job(employee != null, missingSkillCount > 0, $missingSkillCount : missingSkillCount)
then
    scoreHolder.addHardConstraintMatch(kcontext, -10 * $missingSkillCount);
end
In class Job I have a function getMissingSkillCount():
public int getMissingSkillCount() {
    if (this.employee == null) {
        return 0;
    }
    int count = 0;
    for (Skill skill : this.reqskills) {
        if (!this.employee.getSkills().contains(skill)) {
            count++;
        }
    }
    return count;
}
When I run my program, OptaPlanner returns that none of my workers have any skills...
However, when I manually use this function (adapted to accept an Employee as a parameter), public int getMissingSkillCount(Employee employee), it does return the correct values.
I'm puzzled! I somewhat understand that contains is checking for the same object instance instead of the content of the object. But then I don't understand how to do this efficiently...
1) Are your Jobs in the Drools working memory? I presume they are your @PlanningEntity and the instances are in a @PlanningEntityCollectionProperty on your @PlanningSolution, so they will be. You can verify this by just matching a rule on Job() and doing a System.out.println.
2) Try writing the constraint as a ConstraintStream (see docs) and putting a debug breakpoint in the getMissingSkillCount() > 0 lambda to see what's going on (a sketch follows this list).
3) Temporarily turn on FULL_ASSERT to validate there is no score corruption.
4) Turn on DEBUG and then TRACE logging for optaplanner, to see what's going on inside.
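A hedged sketch of what the ConstraintStream from point 2 could look like (recent OptaPlanner 8.x style; exact method names vary slightly between versions, and everything except the weight and constraint name taken from the Drools rule above is an assumption):

import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;
import org.optaplanner.core.api.score.stream.Constraint;
import org.optaplanner.core.api.score.stream.ConstraintFactory;
import org.optaplanner.core.api.score.stream.ConstraintProvider;

public class JobSchedulingConstraintProvider implements ConstraintProvider {

    @Override
    public Constraint[] defineConstraints(ConstraintFactory factory) {
        return new Constraint[] { requiredSkills(factory) };
    }

    Constraint requiredSkills(ConstraintFactory factory) {
        return factory.forEach(Job.class)
                // Same condition as the DRL rule; a breakpoint in this lambda shows
                // the missingSkillCount values the solver actually sees.
                .filter(job -> job.getEmployee() != null && job.getMissingSkillCount() > 0)
                .penalize(HardSoftScore.ofHard(10), Job::getMissingSkillCount)
                .asConstraint("required Skills");
    }
}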
Still wondering what makes the difference between letting OptaPlanner run getMissingSkillCount() and using it "manually".
I fixed it by overriding equals(); that should have been my first clue!
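For completeness, a minimal sketch of what that fix could look like on the Skill class (field names are assumptions; whenever equals() is overridden, hashCode() must be overridden consistently, because Set.contains() relies on both):

import java.util.Objects;

public class Skill {

    private Long id;
    private String name;

    // getters/setters omitted

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Skill)) {
            return false;
        }
        Skill other = (Skill) o;
        // Compare by business identity instead of object reference,
        // so contains() matches skills loaded from different queries.
        return Objects.equals(id, other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}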

Apache Flink Using Windows to induce a delay before writing to Sink

I am wondering whether it is possible with Flink windowing to induce a 10-minute delay from when the data enters the pipeline until it is written to a table in Cassandra.
My initial intention was to write each transaction to a table in Cassandra and query the table using a range key at the web layer but due to the volume of data, I am looking at options to delay the write for N seconds. This means that my table will only ever have data that is at least 10 minutes old.
The small diagram below shows 10-minute windows that roll every minute. As time moves on, I only want to write data to Cassandra that is older than 10 minutes (the parts in green). Is this even possible with Flink?
I could create 11 minute windows that roll every minute but I would end up throwing 90% of the data away, which seems a waste.
Final Solution
I created my own flavour of FlinkKafkaConsumer09 called DelayedKafkaConsumer. The main reason for this is to override the creation of the KafkaFetcher:
public class DelayedKafkaConsumer<T> extends FlinkKafkaConsumer09<T> {

    private ConsumerRecordFunction applyDelayAction;
    .............

    @Override
    protected AbstractFetcher<T, ?> createFetcher(SourceContext<T> sourceContext,
            Map<KafkaTopicPartition, Long> assignedPartitionsWithInitialOffsets,
            SerializedValue<AssignerWithPeriodicWatermarks<T>> watermarksPeriodic,
            SerializedValue<AssignerWithPunctuatedWatermarks<T>> watermarksPunctuated,
            StreamingRuntimeContext runtimeContext, OffsetCommitMode offsetCommitMode) throws Exception {
        return new DelayedKafkaFetcher<>(
                sourceContext, assignedPartitionsWithInitialOffsets, watermarksPeriodic, watermarksPunctuated,
                runtimeContext.getProcessingTimeService(), runtimeContext.getExecutionConfig().getAutoWatermarkInterval(),
                runtimeContext.getUserCodeClassLoader(), runtimeContext.getTaskNameWithSubtasks(),
                runtimeContext.getMetricGroup(), this.deserializer, this.properties, this.pollTimeout, useMetrics, applyDelayAction);
    }
The DelayedKafkaFetcher has a small piece of code in its runFetchLoop that sleeps for n milliseconds before emitting the record:
private void delayMessage(Long msgTransactTime, Long nowMinusDelay) throws InterruptedException {
    if (msgTransactTime > nowMinusDelay) {
        Long sleepTimeout = msgTransactTime - nowMinusDelay;
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug(format("Message with transaction time {0}ms is not older than {1}ms. Sleeping for {2}", msgTransactTime, nowMinusDelay, sleepTimeout));
        }
        TimeUnit.MILLISECONDS.sleep(sleepTimeout);
    }
}
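For comparison only, not part of the solution above: in later Flink versions a similar delay can be achieved without a custom consumer by buffering records in keyed state and firing an event-time timer. A rough sketch, where the Transaction type, the String key and the fixed 10-minute constant are all assumptions:

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Buffers each record and re-emits it once it is at least 10 minutes old (event time),
// so the downstream Cassandra sink only ever sees data that is older than the delay.
public class DelayedEmit extends KeyedProcessFunction<String, Transaction, Transaction> {

    private static final long DELAY_MS = 10 * 60 * 1000L;
    private transient MapState<Long, Transaction> pending;

    @Override
    public void open(Configuration parameters) {
        pending = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("pending", Long.class, Transaction.class));
    }

    @Override
    public void processElement(Transaction value, Context ctx, Collector<Transaction> out) throws Exception {
        long fireAt = ctx.timestamp() + DELAY_MS; // record's event timestamp + 10 minutes
        pending.put(fireAt, value);
        ctx.timerService().registerEventTimeTimer(fireAt);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Transaction> out) throws Exception {
        Transaction buffered = pending.get(timestamp);
        if (buffered != null) {
            out.collect(buffered); // now at least 10 minutes old
            pending.remove(timestamp);
        }
    }
}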

How to automatically collapse repetitive log output in log4j

Every once in a while, a server or database error causes thousands of the same stack trace in the server log files. It might be a different error/stacktrace today than a month ago, but it causes the log files to rotate completely, and I no longer have visibility into what happened before. (Alternatively, I don't want to run out of disk space, which for reasons outside my control is limited right now; I'm addressing that issue separately.) At any rate, I don't need thousands of copies of the same stack trace; just a dozen or so should be enough.
I would like it if I could have log4j/log4j2/another system automatically collapse repetitive errors, so that they don't fill up the log files. For example, a threshold of maybe 10 or 100 exceptions from the same place might trigger log4j to just start counting, and wait until they stop coming, then output a count of how many more times they appeared.
What pre-made solutions exist (a quick survey with links is best)? If this is something I should implement myself, what is a good pattern to start with and what should I watch out for?
Thanks!
Will the BurstFilter do what you want? If not, please create a Jira issue with the algorithm that would work for you and the Log4j team would be happy to consider it. Better yet, if you can provide a patch it would be much more likely to be incorporated.
Log4j's BurstFilter will certainly help prevent you filling your disks. Remember to configure it so that it applies in as limited a section of code as you can, or you'll filter out messages you might want to keep (that is, don't use it on your appender, but on a particular logger that you isolate in your code).
I wrote a simple utility class at one point that wrapped a logger and filtered based on n messages within a given Duration. I used instances of it around most of my warning and error logs to protect against the off chance that I'd run into problems like you did. It worked pretty well for my situation, especially because it was easy to adapt quickly to different situations.
Something like:
...
public DurationThrottledLogger(Logger logger, Duration throttleDuration, int maxMessagesInPeriod) {
    ...
}

public void info(String msg) {
    getMsgAddendumIfNotThrottled().ifPresent(addendum -> logger.info(msg + addendum));
}

private synchronized Optional<String> getMsgAddendumIfNotThrottled() {
    LocalDateTime now = LocalDateTime.now();
    String msgAddendum;
    if (throttleDuration.compareTo(Duration.between(lastInvocationTime, now)) <= 0) {
        // last one was sent longer than throttleDuration ago - send it and reset everything
        if (throttledInDurationCount == 0) {
            msgAddendum = " [will throttle future msgs within throttle period]";
        } else {
            msgAddendum = String.format(" [previously throttled %d msgs received before %s]",
                    throttledInDurationCount, lastInvocationTime.plus(throttleDuration).format(formatter));
        }
        totalMessageCount++;
        throttledInDurationCount = 0;
        numMessagesSentInCurrentPeriod = 1;
        lastInvocationTime = now;
        return Optional.of(msgAddendum);
    } else if (numMessagesSentInCurrentPeriod < maxMessagesInPeriod) {
        // within throttle period, but haven't sent max messages yet - send it
        msgAddendum = String.format(" [message %d of %d within throttle period]",
                numMessagesSentInCurrentPeriod + 1, maxMessagesInPeriod);
        totalMessageCount++;
        numMessagesSentInCurrentPeriod++;
        return Optional.of(msgAddendum);
    } else {
        // throttle it
        totalMessageCount++;
        throttledInDurationCount++;
        return emptyOptional;
    }
}
I'm pulling this from an old version of the code, unfortunately, but the gist is there. I wrote a bunch of static factory methods that I mainly used because they let me write a single line of code to create one of these for that one log message:
} catch (IOException e) {
    DurationThrottledLogger.error(logger, Duration.ofSeconds(1), "Received IO Exception. Exiting current reader loop iteration.", e);
}
This probably won't be as important in your case; for us, we were using a somewhat underpowered graylog instance that we could hose down fairly easily.

What's the most effective way to execute a task some distant time in the future on App Engine?

I have an application on App Engine which is consuming some data. After parsing that data, it will know that it needs to execute something in a period of time - possibly not for a number of hours or weeks.
What is the best way to execute a piece of code after some arbitrary amount of time on App Engine?
I figured using countdownMillis or etaMillis on a TaskQueue task would work, but I haven't seen any evidence of anyone doing the same thing, especially for such long time frames.
Is that the best approach, or is there a better way?
If you are able to persist an object in the datastore with all of the relevant information for future processing (including when the processing for the object's data should begin), you could have a cron job periodically query the datastore with a date/time range filter and trigger processing any of the above objects at the appropriate time.
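A minimal sketch of what that cron handler could look like with the low-level com.google.appengine.api.datastore API ("PendingWork", "processAt" and process() are hypothetical names, not from the question):

// Runs from a cron entry (e.g. every few minutes): pick up every item whose
// scheduled processing time has passed and hand it off for processing.
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Query query = new Query("PendingWork")
        .setFilter(new Query.FilterPredicate("processAt",
                Query.FilterOperator.LESS_THAN_OR_EQUAL, new Date()));
for (Entity pending : datastore.prepare(query).asIterable()) {
    process(pending);                     // hypothetical processing of the stored data
    datastore.delete(pending.getKey());   // or flag it as done so it isn't picked up again
}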
We successfully use TaskQueue's countdown parameter for sending emails to customers 7 days after they registered and for number of other needs.
Task queues are a core/basic API/service and are pretty reliable. In my opinion, task queue ETA/countdown is the best way to go, unless you:
need the ability to programmatically see what is in the queue
need the ability to programmatically delete a certain task from the queue
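A minimal sketch of the countdown approach described above, using com.google.appengine.api.taskqueue (the handler URL and parameter are hypothetical):

// Enqueue a push task that will only be delivered roughly 7 days from now.
Queue queue = QueueFactory.getDefaultQueue();
queue.add(TaskOptions.Builder
        .withUrl("/tasks/sendFollowUpEmail")          // hypothetical handler URL
        .param("customerId", "42")                    // hypothetical parameter
        .countdownMillis(TimeUnit.DAYS.toMillis(7))); // must stay under the ~30-day ETA limit below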
I'm using the task queue as a scheduler. There is a 30-day max ETA, declared in QueueConstants and applied in QueueImpl.
//Returns the maximum time into the future that a task may be scheduled.
private static final long MAX_ETA_DELTA_MILLIS = 2592000000L;
1000ms * 60s * 60m * 24hr * 30days = 2592000000ms
private long determineEta(TaskOptions taskOptions) {
    Long etaMillis = taskOptions.getEtaMillis();
    Long countdownMillis = taskOptions.getCountdownMillis();
    if (etaMillis == null) {
        if (countdownMillis == null) {
            return currentTimeMillis();
        } else {
            if (countdownMillis > QueueConstants.getMaxEtaDeltaMillis()) {
                throw new IllegalArgumentException("ETA too far into the future");
            }
            if (countdownMillis < 0) {
                throw new IllegalArgumentException("Negative countdown is not allowed");
            }
            return currentTimeMillis() + countdownMillis;
        }
    } else {
        if (countdownMillis == null) {
            if (etaMillis - currentTimeMillis() > QueueConstants.getMaxEtaDeltaMillis()) {
                throw new IllegalArgumentException("ETA too far into the future");
            }
            if (etaMillis < 0) {
                throw new IllegalArgumentException("Negative ETA is invalid");
            }
            return etaMillis;
        } else {
            throw new IllegalArgumentException(
                    "Only one or neither of EtaMillis and CountdownMillis may be specified");
        }
    }
}
I do the following:
Enqueue a task with a delay configured as you mention. Have the task processing change datastore entries in a known way (for example: set a flag).
Have a stragglers low frequency cron job, to perform any processing that has somehow been missed by an enqueued task (for example: an uncaught exception happened in the task).
For this to work, ensure that the processing called by the tasks and cron job are idempotent.
Enjoy?
I think the task queue is a good strategy, but it has one big problem: "If a push task is created successfully, it will eventually be deleted (at most seven days after the task successfully executes)." (source)
I would instead use the datastore. Here is one strategy you can take:
1. Insert a record into the datastore once you have completed "parsing that data".
2. Check the current date against the create/insert date to see how much time has passed since your job was completed/started (clearly, you don't want to do this every minute; maybe once a day or every hour).
3. Execute the next task that you need to do as soon as the condition in step 2 passes your "arbitrary amount of time".
Here is how you can add a record to the datastore, to get you started:
Entity parsDataHolder = new Entity("parsing_data_done", guestbookKey);
parsDataHolder.setProperty("date", date);
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
datastore.put(parsDataHolder);
