I have taken the cloud balancing problem in a different direction:
A problem where you need to fill up all the computers with processes, and computers can overfill (by changing the overfill constraint to a soft one).
The change was straightforward; I added the following constraints:
Constraint unfilledCpuPowerTotal(ConstraintFactory constraintFactory) {
    return constraintFactory.forEach(CloudProcess.class)
            .groupBy(CloudProcess::getComputer, sum(CloudProcess::getRequiredCpuPower))
            .filter((computer, requiredCpuPower) -> requiredCpuPower < computer.getCpuPower())
            .penalize("unfilledCpuPowerTotal",
                    HardSoftScore.ONE_HARD,
                    (computer, requiredCpuPower) -> computer.getCpuPower() - requiredCpuPower);
}

Constraint unfilledMemoryTotal(ConstraintFactory constraintFactory) {
    return constraintFactory.forEach(CloudProcess.class)
            .groupBy(CloudProcess::getComputer, sum(CloudProcess::getRequiredMemory))
            .filter((computer, requiredMemory) -> requiredMemory < computer.getMemory())
            .penalize("unfilledMemoryTotal",
                    HardSoftScore.ONE_HARD,
                    (computer, requiredMemory) -> computer.getMemory() - requiredMemory);
}

Constraint unfilledNetworkBandwidthTotal(ConstraintFactory constraintFactory) {
    return constraintFactory.forEach(CloudProcess.class)
            .groupBy(CloudProcess::getComputer, sum(CloudProcess::getRequiredNetworkBandwidth))
            .filter((computer, requiredNetworkBandwidth) -> requiredNetworkBandwidth < computer.getNetworkBandwidth())
            .penalize("unfilledNetworkBandwidthTotal",
                    HardSoftScore.ONE_HARD,
                    (computer, requiredNetworkBandwidth) -> computer.getNetworkBandwidth() - requiredNetworkBandwidth);
}

Constraint unusedComputer(ConstraintFactory constraintFactory) {
    return constraintFactory.forEach(CloudComputer.class)
            .ifNotExists(CloudProcess.class, equal(Function.identity(), CloudProcess::getComputer))
            .penalize("unusedComputers",
                    HardSoftScore.ONE_HARD,
                    computer -> computer.getCpuPower() + computer.getMemory() + computer.getNetworkBandwidth());
}
I have also removed the cost constraint because it doesn't make sense in this context.
However, I don't want the planner to dump all the available processes into computers.
Meaning, if all the computers are already full and there are unused processes, I would like them to stay that way rather than force more overfill penalty onto a computer.
I guess this can be done by somehow ignoring the init penalty, but I can't seem to understand where or how to implement that idea.
I also thought about adding a "dummy" computer entity that just holds processes with no penalty (the planner would still fill the regular computers, because not filling them results in a soft penalty), but that seems like a lot of work and requires big changes to almost every part of the project, so if there is a way to implement the first idea it would be preferred.
What you're describing is called over-constrained planning.
Most likely, you are looking for nullable variables.
Your idea with a dummy is called a "virtual value".
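For the nullable-variable route, a minimal sketch, assuming the CloudBalance domain above and an existing "computerRange" value range provider:
@PlanningEntity
public class CloudProcess {

    // nullable = true lets the solver leave a process unassigned instead of
    // being forced to overfill a computer.
    @PlanningVariable(valueRangeProviderRefs = "computerRange", nullable = true)
    private CloudComputer computer;

    // ... existing fields, getters and setters
}
You would typically pair this with a constraint that penalizes unassigned processes at a lower score level than the overfill penalty, so the solver still assigns a process whenever doing so hurts less than leaving it out.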
Related
I've been trying to get over-constrained planning to work for my situation, but I keep running into issues where assignments that violate hard constraints are still being made. Apologies if this has been answered before, but most examples/solutions I have seen are centered around Drools, and I'm using the Constraint Streams API on this project. I'm using the Quarkus 1.4.2 implementation of OptaPlanner, if that helps.
Below are some example constraints of what I'm trying to accomplish:
private Constraint unnassignedPerson(ConstraintFactory constraintFactory) {
    return constraintFactory.from(Assignment.class)
            .filter(assignment -> assignment.getPerson() == null)
            .penalize("Unassigned", HardMediumSoftScore.ONE_MEDIUM);
}

private Constraint numberAssignmentConflict(ConstraintFactory constraintFactory) {
    return constraintFactory.from(Assignment.class)
            .join(Assignment.class,
                    Joiners.equal(Assignment::getPerson),
                    Joiners.equal(Assignment::getNumber),
                    Joiners.lessThan(Assignment::getId))
            .penalize("Number Conflict", HardMediumSoftScore.of(2, 0, 0));
}

private Constraint tooLittleSpaceBetweenResourceAssignment(ConstraintFactory constraintFactory) {
    return constraintFactory.from(Assignment.class)
            .join(Assignment.class, Joiners.equal(Assignment::getPerson), Joiners.lessThan(Assignment::getId))
            .filter((assignment, assignment2) -> !assignment.getResourceId().equals(assignment2.getResourceId()))
            .filter((assignment, assignment2) -> inRange(1, assignment.getNumber(), assignment2.getNumber()))
            .penalize("Not enough space between assignments of different resource (requires 1)", HardMediumSoftScore.of(1, 0, 0));
}
(inRange is a simple local function to get the absolute difference between two numbers)
Note that each of these works on its own in terms of honoring the nullable planning variable; it's only when both are enabled that I get unexpected results. When both are enabled, the one with the lower hard score is still assigned in the solution despite showing up as a hard constraint violation in the debug log (which in my local testing always finishes at -12hard/-2medium/0soft).
Any insight on what I might be doing wrong would be much appreciated, and thanks in advance :)
As a follow up, it appears the Joiners.lessThan(Assignment::getId) portion of my assignment conflict constraint is not compatible with nullable assignments. I removed that and added some more explicit checks instead, and now things are working like they should :D
Pseudo-adaptation for anyone it might help:
private Constraint numberAssignmentConflict(ConstraintFactory constraintFactory) {
    return constraintFactory.from(Assignment.class)
            .join(Assignment.class,
                    Joiners.equal(Assignment::getPerson),
                    Joiners.equal(Assignment::getNumber))
            .filter((assignment, assignment2) -> assignment.getPerson() != null && assignment2.getPerson() != null)
            .filter((assignment, assignment2) -> !assignment.getId().equals(assignment2.getId()))
            .penalize("Number Conflict", HardMediumSoftScore.of(2, 0, 0));
}
Doesn't the first constraint have to be a fromUnfiltered(Assignment.class) rather than from(Assignment.class)? I believe that from() does not pass entities with unassigned planning variables, hence the ONE_MEDIUM penalty would never be applied.
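A minimal sketch of that suggestion, assuming the same Assignment domain as above:
private Constraint unnassignedPerson(ConstraintFactory constraintFactory) {
    // fromUnfiltered() also passes entities whose planning variables are still null,
    // so the medium penalty actually applies to unassigned entities.
    return constraintFactory.fromUnfiltered(Assignment.class)
            .filter(assignment -> assignment.getPerson() == null)
            .penalize("Unassigned", HardMediumSoftScore.ONE_MEDIUM);
}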
I'm trying to use OptaPlanner to replace myself in scheduling our work planning.
The system has a MySQL database containing the necessary information and relationships. For this issue I'll only use the three tables I need:
Employees --> Have Skills
Jobs --> Have Skills
Skills
In Drools I have the rule
rule 'required Skills'
    when
        Job(employee != null, missingSkillCount > 0, $missingSkillCount : missingSkillCount)
    then
        scoreHolder.addHardConstraintMatch(kcontext, -10 * $missingSkillCount);
end
In the Job class I have the function getMissingSkillCount():
public int getMissingSkillCount() {
    if (this.employee == null) {
        return 0;
    }
    int count = 0;
    for (Skill skill : this.reqskills) {
        if (!this.employee.getSkills().contains(skill)) {
            count++;
        }
    }
    return count;
}
When I run my program, OptaPlanner reports that none of my workers have any skills...
However, when I manually use this function (adapted to accept an Employee as a parameter), public int getMissingSkillCount(Employee employee), it does return the correct values.
I'm puzzled! I somehow understand that contains is checking for the same object instance instead of the content of the object. But then I don't understand how to do this efficiently...
1) Are your Jobs in the Drools working memory? I presume they are your @PlanningEntity and the instances are in a @PlanningEntityCollectionProperty on your @PlanningSolution, so they will be. You can verify this by just matching a rule on Job() and doing a System.out.println.
2) Try writing the constraint as a ConstraintStream (see the docs) and putting a debug breakpoint in the getMissingSkillCount() > 0 lambda to see what's going on; a sketch follows after this list.
3) Temporarily turn on FULL_ASSERT to validate there is no score corruption.
4) Turn on DEBUG and then TRACE logging for optaplanner, to see what's going on inside.
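For point 2, a minimal sketch of such a ConstraintStream, assuming a HardSoftScore and the Job getters shown in the question:
private Constraint requiredSkills(ConstraintFactory constraintFactory) {
    // from() only passes Jobs whose planning variable is initialized,
    // mirroring the employee != null check in the Drools rule.
    return constraintFactory.from(Job.class)
            .filter(job -> job.getMissingSkillCount() > 0)
            .penalize("required Skills",
                    HardSoftScore.ONE_HARD,
                    job -> 10 * job.getMissingSkillCount());
}
A breakpoint in the filter lambda then shows exactly which Jobs report missing skills during solving.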
Still wondering what makes the difference between letting Optaplanner run getMissingSkillCount() and using it "manually".
I fixed it by overriding equals(); that should have been my first clue!
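For reference, a minimal sketch of such an override on Skill, assuming it carries a unique id field (the field name is illustrative):
import java.util.Objects;

public class Skill {

    private Long id;

    // Content-based equality: two Skill instances with the same id are equal,
    // so List.contains() in getMissingSkillCount() matches by id rather than
    // by object identity.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Skill)) {
            return false;
        }
        return Objects.equals(id, ((Skill) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }
}
Whenever equals() is overridden, hashCode() should be overridden consistently, otherwise hash-based collections (HashSet, HashMap) will still treat equal skills as different.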
Let's say I want to make some transformation 'A' configurable. This transformation manages some state using a state store and also requires repartitioning, which means the repartitioning is done only if the transformation is enabled. Now if I run the application 3 times (possibly as a rolling upgrade) in the following way (or any other combination):
Transformation 'A' is disabled
Transformation 'A' is enabled
Transformation 'A' is disabled
Given that all 3 runs use the same cluster of Kafka brokers:
If EOS is enabled, will the EOS guarantee hold across all 3 runs?
If EOS is not enabled, is there a case which may cause message loss (failing to provide even at-least-once)?
The topology code, to give a better understanding of what I am trying to do:
KStream<String, Cab> kStream = getStreamsBuilder()
        .stream("topic_a", Consumed.with(keySerde, valueSerde))
        .transformValues(() -> transformer1)
        .transformValues(() -> transformer2, "stateStore_a")
        .flatMapValues(events -> events);

mayBeEnrichAgain(kStream, keySerde, valueSerde)
        .selectKey((ignored, event) -> event.getAnotherId())
        .through(INTERMEDIATE_TOPIC_2, Produced.with(keySerde, valueSerde)) // this repartitioning will always be there
        .transformValues(() -> transformer3, "stateStore_b")
        .to(txStreamsConfig.getAlertTopic(), Produced.with(keySerde, valueSerde));
private <E extends Cab> KStream<String, E> mayBeEnrichAgain(final KStream<String, E> kStream,
                                                            final Serde<String> keySerde,
                                                            final Serde<E> valueSerde) {
    if (enrichmentEnabled) { // repartitioning is configurable
        return kStream.selectKey((ignored, event) -> event.id())
                .through(INTERMEDIATE_TOPIC_1, Produced.with(keySerde, valueSerde))
                .transformValues(enricher1)
                .transformValues(enricher2);
    } else {
        return kStream;
    }
}
You cannot simply change the topology without potentially breaking it.
Hard to say in general if inserting the through-topic will break the application in the first place.
If it does not break, you might "lose" data when you remove the topic, as some unprocessed data might still be in that topic, and after removing it the topology would no longer read that data.
In general, you should reset an application cleanly or use a new application.id if you upgrade your app to a newer version that changes the structure of the topology.
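For illustration, a minimal sketch of the new-application.id option, assuming a Properties-based Streams configuration (the id and bootstrap values are placeholders):
Properties props = new Properties();
// A new application.id starts a logically new application: a new consumer group,
// new internal changelog/repartition topics, and offsets taken from auto.offset.reset.
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "cab-alerts-v2");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
KafkaStreams streams = new KafkaStreams(topology, props);
streams.start();
For resetting the existing application.id instead, Kafka ships the kafka-streams-application-reset tool, usually combined with KafkaStreams#cleanUp() to wipe local state.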
I am currently developing an Eclipse RCP application that stores per-project preferences using the Eclipse preferences mechanism through ProjectScope. At first this seemed to work very well, but we have run into trouble when (read-)accessing these preferences in multithreaded scenarios while changes are being made to the workspace at the same time. What appears to be particularly problematic is accessing such a preference node (ProjectScope.getNode()) while the project is being deleted by an asynchronous user action (right click on Project -> Delete Project). In such cases we get quite a mix of
org.osgi.service.prefs.BackingStoreException
java.io.FileNotFoundException
org.eclipse.core.runtime.CoreException
Essentially they all complain that the underlying file is no longer there.
Initial attempts to fix this using checks like IProject.exists() or isAccessible() and even going so far as checking the presence of the actual .prefs file were as futile as expected: They only make the exceptions less likely but do not really prevent them.
So my question is: How are you supposed to safely access things like ProjectScope.getNode()? Do you need to go so far to put every read into a WorkspaceJob or is there some other, clever way to prevent the above problems like putting the read access in Display.asyncExec()?
Although I tried, I did not really find answers to the above question in the Eclipse documentation.
Usually, scheduling rules are used to coordinate concurrent access to resources in the workspace.
I've never worked with ProjectScope'd preferences, but if they are stored within a project or its metadata, then a scheduling rule should help to coordinate access. If you are running the preferences access code in a Job, then setting an appropriate scheduling rule should do:
For example:
IProject project = getProjectForPreferences( projectPreferences );
ISchedulingRule rule = project.getWorkspace().getRuleFactory().modifyRule( project );
Job job = new Job( "Access Project Preferences" ) {
    @Override
    protected IStatus run( IProgressMonitor monitor ) {
        if( project.exists() ) {
            // read or write project preferences
        }
        return Status.OK_STATUS;
    }
};
job.setRule( rule );
job.schedule();
The code acquires a rule to modify the project, and the Job is guaranteed to run only when no other job with a conflicting rule is running.
If your code isn't running within a job, you can also manually acquire a lock with IJobManager.beginRule() and endRule().
For example:
ISchedulingRule rule = ...;
try {
    jobManager.beginRule( rule, monitor );
    if( project.exists() ) {
        // read or write project preferences
    }
} finally {
    jobManager.endRule( rule );
}
As awkward as it looks, the call to beginRule must be within the try block, see the JavaDoc for more details.
I am trying to get a grasp on Google App Engine programming and wonder what the difference between these two methods is - if there even is a practical difference.
Method A)
public Collection<Conference> getConferencesToAttend(Profile profile)
{
    List<String> keyStringsToAttend = profile.getConferenceKeysToAttend();
    List<Conference> conferences = new ArrayList<Conference>();
    for (String conferenceString : keyStringsToAttend)
    {
        conferences.add(ofy().load().key(Key.create(Conference.class, conferenceString)).now());
    }
    return conferences;
}
Method B)
public Collection<Conference> getConferencesToAttend(Profile profile)
{
    List<String> keyStringsToAttend = profile.getConferenceKeysToAttend();
    List<Key<Conference>> keysToAttend = new ArrayList<>();
    for (String keyString : keyStringsToAttend) {
        keysToAttend.add(Key.<Conference>create(keyString));
    }
    return ofy().load().keys(keysToAttend).values();
}
the "conferenceKeysToAttend" list is guaranteed to only have unique Conferences - does it even matter then which of the two alternatives I choose? And if so, why?
Method A loads entities one by one, while method B does a bulk load, which is cheaper, since you're making just one network round trip to Google's datacenter. You can observe this by measuring the time taken by both methods while loading a bunch of keys multiple times.
While doing a bulk load, you need to be cautious about the loaded entities if the datastore operation throws an exception: the operation might succeed even when some of the entities are not loaded.
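For example, a minimal sketch (reusing the keysToAttend list from Method B) of detecting entities that were not loaded, assuming load().keys() only includes in its result map the keys it could actually resolve:
Map<Key<Conference>, Conference> loaded = ofy().load().keys(keysToAttend);
if (loaded.size() < keysToAttend.size()) {
    // some requested conferences were not found; handle or retry here
}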
The answer depends on the size of the list. If we are talking about hundreds or more, you should not make a single batch. I couldn't find documentation on what the limit is, but there is one. If it is not that many, definitely go with loading one by one. But you should make the calls asynchronous by not calling the now() function immediately:
List<LoadResult<Conference>> conferences = new ArrayList<>();
// ofy().load().key(...) returns a LoadResult that loads asynchronously in the background.
conferences.add(ofy().load().key(Key.create(Conference.class, conferenceString)));
And when you need the actual data:
for (LoadResult<Conference> conferenceResult : conferences) {
    Conference c = conferenceResult.now();
    // ...
}