Given a java.util.Collection what is the easiest way to create an endless java.util.Iterator which returns those elements such that they show up according to a given distribution (org.apache.commons.math.distribution)?
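final List<Object> l = new ArrayList<Object>(coll);
Iterator<Object> i = new Iterator<Object>() {
    public boolean hasNext() { return true; }
    public Object next() {
        // distribution.nextInt(lo, hi) is assumed to yield an index in [lo, hi)
        return l.get(distribution.nextInt(0, l.size()));
    }
    public void remove() { throw new UnsupportedOperationException(); }
};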
Your problem is then how to convert the Distribution classes in the apache library to implement the nextInt method. I have to say that it is far from obvious to me how you can actually do this from the Distribution interface.
One (slightly rubbish) way I can think of is to generate an EmpiricalDistribution (in the random package) from a dataset drawn with the probabilities defined by your actual distribution, and then use that empirical distribution as the distribution above.
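Another way to sidestep the missing nextInt method, sketched here under assumptions: discretize the distribution into one weight per element and sample indices from the cumulative weights. WeightedEndlessIterator and its constructor parameters are hypothetical names, not part of the Apache library; how the weights are obtained from a Distribution object is deliberately left open.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

/** Hypothetical helper: endless iterator that draws elements according to per-element weights. */
final class WeightedEndlessIterator<T> implements Iterator<T> {
    private final List<T> elements;
    private final double[] cumulative; // cumulative[i] = weights[0] + ... + weights[i]
    private final Random rnd;

    WeightedEndlessIterator(Collection<? extends T> coll, double[] weights, Random rnd) {
        this.elements = new ArrayList<>(coll);
        if (elements.isEmpty() || weights.length != elements.size()) {
            throw new IllegalArgumentException("need exactly one weight per element");
        }
        this.cumulative = new double[weights.length];
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] < 0.0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            sum += weights[i];
            cumulative[i] = sum;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("weights must not all be zero");
        }
        this.rnd = rnd;
    }

    @Override public boolean hasNext() { return true; }

    @Override public T next() {
        // scale a uniform draw to the total weight, then find the first bucket it falls into
        double u = rnd.nextDouble() * cumulative[cumulative.length - 1];
        for (int i = 0; i < cumulative.length; i++) {
            if (u < cumulative[i]) {
                return elements.get(i);
            }
        }
        return elements.get(elements.size() - 1); // only reachable through rounding
    }
}
```

The weights do not need to sum to 1, since the draw is scaled by their total. For large collections the linear scan could be replaced by a binary search over cumulative.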
Solution for Gaussian distribution
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Random;
import java.util.SortedMap;
import java.util.Map.Entry;
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.ImmutableSortedMap;
import com.google.common.collect.Lists;
import com.google.common.collect.Multimap;
import com.google.common.collect.ImmutableSortedMap.Builder;
/**
* Endless sequence with gaussian distribution.
*
* @param <T> the type of the elements
* @author Michael Locher
*/
public class GaussianSequence<T> implements Iterable<T>, Iterator<T> {
private static final int HISTOGRAMM_SAMPLES = 50000;
private static final int HISTOGRAMM_ELEMENTS = 100;
private static final int HISTOGRAMM_LENGTH = 80;
private static final double DEFAULT_CUTOFF = 4.0;
private final List<T> elements;
private final int maxIndex;
private final Random rnd;
private final double scaling;
private final double halfCount;
/**
* Creates this.
* @param rnd the source of randomness to use
* @param elements the elements to deliver
*/
public GaussianSequence(final Random rnd, final Collection<T> elements) {
this(rnd, DEFAULT_CUTOFF, elements);
}
private GaussianSequence(final Random rnd, final double tailCutOff, final Collection<T> elements) {
super();
this.rnd = rnd;
this.elements = new ArrayList<T>(elements);
if (this.elements.isEmpty()) {
throw new IllegalArgumentException("no elements provided");
}
this.maxIndex = this.elements.size() - 1;
this.halfCount = this.elements.size() / 2.0;
this.scaling = this.halfCount / tailCutOff;
}
/**
* {@inheritDoc}
*/
@Override
public Iterator<T> iterator() {
return this;
}
/**
* {@inheritDoc}
*/
@Override
public boolean hasNext() {
return true;
}
/**
* {@inheritDoc}
*/
@Override
public void remove() {
throw new UnsupportedOperationException();
}
/**
* {@inheritDoc}
*/
@Override
public T next() {
return this.elements.get(sanitizeIndex(determineNextIndex()));
}
private int determineNextIndex() {
final double z = this.rnd.nextGaussian();
return (int) (this.halfCount + (this.scaling * z));
}
private int sanitizeIndex(final int index) {
if (index < 0) {
return 0;
}
if (index > this.maxIndex) {
return this.maxIndex;
}
return index;
}
/**
* Prints a histogramm to stdout.
* @param args not used
*/
public static void main(final String[] args) {
final PrintWriter out = new PrintWriter(new OutputStreamWriter(System.out, Charset.forName("UTF-8")), true);
plotHistogramm(new Random(), out);
}
private static void plotHistogramm(final Random rnd, final PrintWriter out) {
// build elements
final Multimap<Integer, Integer> results = ArrayListMultimap.create();
final List<Integer> elements = Lists.newArrayListWithCapacity(HISTOGRAMM_ELEMENTS);
for (int i = 1; i < HISTOGRAMM_ELEMENTS; i++) {
elements.add(i);
}
// sample sequence
final Iterator<Integer> randomSeq = new GaussianSequence<Integer>(rnd, elements);
for (int j = 0; j < HISTOGRAMM_SAMPLES; j++) {
final Integer sampled = randomSeq.next();
results.put(sampled, sampled);
}
// count and sort results
final Builder<Integer, Integer> r = ImmutableSortedMap.naturalOrder();
for (final Entry<Integer, Collection<Integer>> e : results.asMap().entrySet()) {
final int count = e.getValue().size();
r.put(e.getKey(), count);
}
// plot results
final SortedMap<Integer, Integer> sortedAndCounted = r.build();
final double histogramScale = (double) HISTOGRAMM_LENGTH / Collections.max(sortedAndCounted.values());
for (final Entry<Integer, Integer> e : sortedAndCounted.entrySet()) {
out.format("%3d [%4d]", e.getKey(), e.getValue());
final StringBuilder c = new StringBuilder();
final int lineLength = (int) (histogramScale * e.getValue());
for (int i = 0; i < lineLength; i++) {
c.append('*');
}
out.println(c);
}
}
}
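To make the index mapping in determineNextIndex() and sanitizeIndex() easier to follow, here is the same arithmetic pulled out into a standalone function. GaussianIndexSketch and indexFor are illustrative names, not part of the class above; the formula itself is taken from it.

```java
final class GaussianIndexSketch {
    /** Maps a standard-normal draw z onto an index in [0, size - 1]. */
    static int indexFor(double z, int size, double tailCutOff) {
        double halfCount = size / 2.0;           // centre of the index range
        double scaling = halfCount / tailCutOff; // tailCutOff standard deviations span half the range
        int raw = (int) (halfCount + scaling * z);
        return Math.max(0, Math.min(size - 1, raw)); // clamp draws beyond the cut-off
    }
}
```

With 100 elements and the default cut-off of 4.0, halfCount is 50 and scaling is 12.5, so z = 0 lands on index 50 and z = 2 on index 75. A draw of z = 5 would give 112 and is clamped to 99; z = -5 gives -12 and is clamped to 0. The clamping means the first and last elements absorb the (small) probability mass of the tails.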
I have started experimenting with the Jenetics library, however I am having some issues with trying to make a very easy "custom" set of gene/chromosomes.
What I tried to do was to create a custom chromosome with a different (random) number of custom genes inside. The genes simply contain an integer value, just for the sake of simplicity. For the same reason, the contents of a gene can only be numbers ranging from 0 to 9, and a gene is considered valid only if it does NOT contain the number 9 (again, deliberately trivial, but I just wanted to make them custom).
Here is my code:
CustomGene:
public class CustomGene implements Gene<Integer, CustomGene> {
private Integer value;
private CustomGene(Integer value) {
this.value = value;
}
public static CustomGene of(Integer value) {
return new CustomGene(value);
}
public static ISeq<CustomGene> seq(Integer min, Integer max, int length) {
Random r = RandomRegistry.getRandom();
return MSeq.<CustomGene>ofLength(length).fill(() ->
new CustomGene(random.nextInt(r, min, max))
).toISeq();
}
@Override
public Integer getAllele() {
return value;
}
@Override
public CustomGene newInstance() {
final Random random = RandomRegistry.getRandom();
return new CustomGene(Math.abs(random.nextInt(9)));
}
@Override
public CustomGene newInstance(Integer integer) {
return new CustomGene(integer);
}
@Override
public boolean isValid() {
return value != 9;
}
}
CustomChromosome:
import org.jenetics.Chromosome;
import org.jenetics.util.ISeq;
import org.jenetics.util.RandomRegistry;
import java.util.Iterator;
import java.util.Random;
public class CustomChromosome implements Chromosome<CustomGene> {
private ISeq<CustomGene> iSeq;
private final int length;
public CustomChromosome(ISeq<CustomGene> genes) {
this.iSeq = genes;
this.length = iSeq.length();
}
public static CustomChromosome of(ISeq<CustomGene> genes) {
return new CustomChromosome(genes);
}
@Override
public Chromosome<CustomGene> newInstance(ISeq<CustomGene> iSeq) {
this.iSeq = iSeq;
return this;
}
@Override
public CustomGene getGene(int i) {
return iSeq.get(i);
}
@Override
public int length() {
return iSeq.length();
}
@Override
public ISeq<CustomGene> toSeq() {
return iSeq;
}
@Override
public Chromosome<CustomGene> newInstance() {
final Random random = RandomRegistry.getRandom();
ISeq<CustomGene> genes = ISeq.empty();
for (int i = 0; i < length; i++) {
genes = genes.append(CustomGene.of(Math.abs(random.nextInt(9))));
}
return new CustomChromosome(genes);
}
@Override
public Iterator<CustomGene> iterator() {
return iSeq.iterator();
}
@Override
public boolean isValid() {
return iSeq.stream().allMatch(CustomGene::isValid);
}
}
Main:
import org.jenetics.Genotype;
import org.jenetics.Optimize;
import org.jenetics.engine.Engine;
import org.jenetics.engine.EvolutionResult;
import org.jenetics.util.Factory;
import org.jenetics.util.RandomRegistry;
import java.util.Random;
public class Main {
private static int maxSum = - 100;
private static Integer eval(Genotype<CustomGene> gt) {
final int sum = gt.getChromosome().stream().mapToInt(CustomGene::getAllele).sum();
if(sum > maxSum)
maxSum = sum;
return sum;
}
public static void main(String[] args) {
final Random random = RandomRegistry.getRandom();
Factory<Genotype<CustomGene>> g =
Genotype.of(CustomChromosome.of(CustomGene.seq(0, 9, Math.abs(random.nextInt(9)) + 1)));
Engine<CustomGene, Integer> engine = Engine
.builder(Main::eval, g)
.optimize(Optimize.MAXIMUM)
.populationSize(100)
.build();
Genotype<CustomGene> result = engine.stream().limit(10000)
.collect(EvolutionResult.toBestGenotype());
System.out.println(eval(result));
result.getChromosome().stream().forEach(i -> {
System.out.print(i.getAllele() + " ");
});
System.out.println();
System.out.println(maxSum);
}
}
I do not understand why I get this output:
13 (evaluated result)
1 8 0 4 (all the alleles from the genes of the chosen chromosome)
32 (the actual maximum fitness found)
We can clearly see a difference between the genotype which had the biggest fitness function and the chosen genotype. Why?
I know I'm doing something wrong and it's probably a silly mistake, but I really can't seem to understand what I am doing wrong. Could you please help me out?
Lots of thanks!
As posted by the creator of the library here, the answer was:
you violated the contract of the Chromosome.newInstance(ISeq) method. This method must return a new chromosome instance. After fixing this
@Override
public Chromosome<CustomGene> newInstance(ISeq<CustomGene> iSeq) {
return new CustomChromosome(iSeq);
}
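To illustrate why returning this breaks things, here is a small stand-alone sketch. Chromo is a hypothetical stand-in, not the Jenetics API, and the gene values are chosen only to echo the numbers in the question. The assumption is that the engine calls newInstance(ISeq) on existing individuals to build altered offspring; if that call mutates the receiver, an individual that was already evaluated silently changes its genes.

```java
import java.util.Arrays;
import java.util.List;

final class NewInstanceContractSketch {
    /** Hypothetical stand-in for a chromosome; not the Jenetics API. */
    static class Chromo {
        List<Integer> genes;

        Chromo(List<Integer> genes) { this.genes = genes; }

        // Variant as in the question: overwrites this object's genes and returns the same reference.
        Chromo newInstanceMutating(List<Integer> newGenes) {
            this.genes = newGenes;
            return this;
        }

        // Contract-respecting variant: the receiver is left untouched.
        Chromo newInstanceFresh(List<Integer> newGenes) {
            return new Chromo(newGenes);
        }

        int sum() {
            int s = 0;
            for (int g : genes) {
                s += g;
            }
            return s;
        }
    }

    /** Fitness of the original individual after an offspring was derived from it. */
    static int parentSumAfterOffspring(boolean mutating) {
        Chromo parent = new Chromo(Arrays.asList(8, 8, 8, 8)); // fitness 32
        List<Integer> altered = Arrays.asList(1, 8, 0, 4);     // fitness 13
        if (mutating) {
            parent.newInstanceMutating(altered);
        } else {
            parent.newInstanceFresh(altered);
        }
        return parent.sum();
    }
}
```

With the mutating variant the parent's fitness drops from 32 to 13 although nobody asked to change the parent, which matches the mismatch between the recorded maximum and the returned genotype described in the question.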
I have a static ArrayList (masterLog) in my main driver class. The ArrayList contains Event objects; each Event object has an ArrayList (heats) as a field, and each Heat object has an ArrayList (racers) as a field. Now when I have the following line of code:
System.out.println(ChronoTimer1009System.getMasterLog().get(0).getHeats().get(getCurHeat()).getRacers().toString());
this prints [] even though the racers list IS NOT empty!
When I call this:
System.out.println(getHeats().get(getCurHeat()).getRacers());
this prints the properly filled list.
I think I need to sync the masterLog ArrayList but I am unsure how. I have tried syncing it the way other threads on Stack Exchange have recommended, but no luck.
It seems like the static ArrayList masterLog is updated two levels deep but not three levels deep, if that makes sense.
What am I doing wrong?
UPDATE:
Maybe this will help explain:
In my main (driver) class, I have a static ArrayList called masterLog. The purpose of this ArrayList is to store instances of Event objects for later data retrieval. Now, without making it too complicated, the Event class contains an ArrayList called heats, and the Heat class contains an ArrayList called racers. When I access the masterLog ArrayList at some point in the program (when the other ArrayLists are populated with data), say for example by the call "masterLog.get(0).getHeats().get(0).getRacers()", the masterLog does not find any data in the racers ArrayList. It does, however, find data in the heats ArrayList. In other words, the object instance that is stored in the masterLog only updates information to a depth of 2 (not 3, if that makes sense).
UPDATE:
Here is some code:
ChronoTimer1009System class (driver)
package main;
import java.io.DataOutputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.Stack;
public class ChronoTimer1009System {
private Event curEvent;
private static Channel[] channels = new Channel[8];
private boolean state;
private static Stack<Log> log;
private static ArrayList<Event> masterLog;
private static Printer p;
public static Time globalTime;
private int oldLogSize; //used only in this.export()
public ChronoTimer1009System() throws UserErrorException{
for(int i=0; i<channels.length; ++i){channels[i] = new Channel(SensorType.NONE);} // initialize channels
masterLog = new ArrayList<Event>(); //this holds references to each event
this.newEvent(EventType.IND);
this.state = false; //system is initally off
log = new Stack<Log>();
p = new Printer();
globalTime = null;
oldLogSize = 0;
}
public void newEvent(EventType e) throws UserErrorException {
switch(e){
case IND: this.curEvent = new IND();ChronoTimer1009System.masterLog.add(this.curEvent);break;
case PARIND: this.curEvent = new PARIND();ChronoTimer1009System.masterLog.add(this.curEvent);break;
case GRP: this.curEvent = new GRP();ChronoTimer1009System.masterLog.add(this.curEvent);break;
case PARGRP: this.curEvent = new PARGRP();ChronoTimer1009System.masterLog.add(this.curEvent);break;
}
for(Channel x : channels){if(x.getState()) x.toggleState();}
}
public void on() throws UserErrorException{
if(state) throw new IllegalStateException();
this.curEvent = new IND();
ChronoTimer1009System.globalTime = new Time(0);
state = true;
}
public void reset() throws UserErrorException{
if(state) state = false;
on();
}
public void exit(){
this.curEvent = null;
ChronoTimer1009System.globalTime = null;
if(!state) throw new IllegalStateException();
state = false;
}
public static Time searchElapsedByID(int idNum){
Time toReturn = null;
for(Log item : log){
if(item.getCompetitorNumber() == idNum){
toReturn = item.getElapsedTime(); break;
}
}
return toReturn;
}
/**
* @return the curEvent
*/
public Event getCurEvent() {
return curEvent;
}
/**
* @return the state
*/
public boolean isState() {
return state;
}
public static Channel getChan(int chan){
if(chan < 1 || chan > 8) throw new IllegalArgumentException("Argument is not in range");
return channels[chan-1];
}
public static void export(){
//*****FORMAT JSON*****
//before formating, a sort of the runners within each heat is needed to determine place.
String toJson = "{\"events\":[";
System.out.println(ChronoTimer1009System.getMasterLog().get(0).getHeats().get(0).getRacers().size());
//iterate through each event
for(int i = 0; i < ChronoTimer1009System.getMasterLog().size(); ++i){
//iterate through each heat of each event
toJson += "{\"name\":\"" + ChronoTimer1009System.getMasterLog().get(i).getType().toString() + "\",\"heats\":[";
for(int j = 0; j < ChronoTimer1009System.getMasterLog().get(i).getHeats().size(); ++j){
//iterate through each competitor in each heat
toJson += "{\"runners\":[";
System.out.println(ChronoTimer1009System.getMasterLog().get(i).getHeats().size());
// NOTE: on the next line, the getRacers() part has a size of zero when it isn't empty.
ArrayList<Competitor> x = sortByPlace(ChronoTimer1009System.getMasterLog().get(i).getHeats().get(j).getRacers());
for(int k = 0; k < x.size(); ++k){
//notice we are working with a sorted copy
//TODO make Competitor endTime the elapsed time
toJson += "{\"place\":\"" + String.valueOf(k+1) + "\",\"compNum\":\"" + x.get(k).getIdNum() + "\", \"elapsed\":\"" + x.get(k).getEndTime().toString() + "\"},";
}
toJson += "]},";
}
toJson += "]},";
}
toJson += "}";
System.out.println(toJson);
/*try{
URL site = new URL("http://7-dot-eastern-cosmos-92417.appspot.com/chronoserver");
HttpURLConnection conn = (HttpURLConnection) site.openConnection();
conn.setRequestMethod("POST");
conn.setDoInput(true);
conn.setDoOutput(true);
DataOutputStream out = new DataOutputStream(conn.getOutputStream());
String data = "data=" + toJson;
out.writeBytes(data);
out.flush();
out.close();
System.out.println("Done sent to server");
new InputStreamReader(conn.getInputStream());
}
catch (Exception e)
{
e.printStackTrace();
}*/
}
private static ArrayList<Competitor> sortByPlace(ArrayList<Competitor> unsorted)
{
ArrayList<Competitor> whole = (ArrayList<Competitor>) unsorted.clone();
ArrayList<Competitor> left = new ArrayList<Competitor>();
ArrayList<Competitor> right = new ArrayList<Competitor>();
int center;
if(whole.size()==1)
return whole;
else
{
center = whole.size()/2;
// copy the left half of whole into the left.
for(int i=0; i<center; i++)
{
left.add(whole.get(i));
}
//copy the right half of whole into the new arraylist.
for(int i=center; i<whole.size(); i++)
{
right.add(whole.get(i));
}
// Sort the left and right halves of the arraylist.
left = sortByPlace(left);
right = sortByPlace(right);
// Merge the results back together.
merge(left,right,whole);
}
return whole;
}
private static void merge(ArrayList<Competitor> left, ArrayList<Competitor> right, ArrayList<Competitor> whole) {
int leftIndex = 0;
int rightIndex = 0;
int wholeIndex = 0;
// As long as neither the left nor the right arraylist has
// been used up, keep taking the smaller of left.get(leftIndex)
// or right.get(rightIndex) and storing it in whole at wholeIndex.
while (leftIndex < left.size() && rightIndex < right.size())
{
if ((left.get(leftIndex).getEndTime().compareTo(right.get(rightIndex)))<0)
{
whole.set(wholeIndex,left.get(leftIndex));
leftIndex++;
}
else
{
whole.set(wholeIndex, right.get(rightIndex));
rightIndex++;
}
wholeIndex++;
}
ArrayList<Competitor>rest;
int restIndex;
if (leftIndex >= left.size()) {
// The left arraylist has been used up...
rest = right;
restIndex = rightIndex;
}
else {
// The right arraylist has been used up...
rest = left;
restIndex = leftIndex;
}
// Copy the rest of whichever arraylist (left or right) was
// not used up.
for (int i=restIndex; i<rest.size(); i++) {
whole.set(wholeIndex, rest.get(i));
wholeIndex++;
}
}
/**
* @return the log
*/
public static Stack<Log> getLog() {
return log;
}
/**
* @return the masterLog
*/
public static ArrayList<Event> getMasterLog() {
return masterLog;
}
/**
* @return the p
*/
public static Printer getPrinter() {
return p;
}
}
Event Class:
package main;
import java.util.ArrayList;
public abstract class Event extends Display{
private ArrayList<Heat> heats;
private int curHeat; //private means only this class can modify, not the subclasses
private Competitor curComp;
private String name;
public Event(String name) throws UserErrorException{
this.name = name;
heats = new ArrayList<Heat>();
curHeat = -1;
curComp = null;
createRun();
}
/**
* This method will be used by all EventTypes and will not change
* regardless of the EventType.
* @throws UserErrorException
*/
public void createRun() throws UserErrorException{
heats.add(new Heat()); ++curHeat;
}
/**
* @return the heats
*/
public ArrayList<Heat> getHeats() {
return heats;
}
/**
* @return the name
*/
public String getName() {
return name;
}
/**
* @return the currentHeat
*/
public int getCurHeat() {
return curHeat;
}
/**
* @return the curComp
*/
public Competitor getCurComp() {
return curComp;
}
/**
* @param curComp the curComp to set
*/
public void setCurComp(Competitor curComp) {
this.curComp = curComp;
}
/* (non-Javadoc)
* @see Display#displayHeatNumber()
*/
@Override
public String displayHeatNumber() {
// TODO Auto-generated method stub
return "Heat: " + (curHeat+1);
}
/* (non-Javadoc)
* @see Display#displayFinished()
*/
@Override
public String displayFinished() {
String toReturn = "";
boolean noRunners = true;
for(Competitor x : getHeats().get(getCurHeat()).getRacers()){
if(x.getEndTime() != null){
toReturn += "\n" + x.getIdNum() + " " + (ChronoTimer1009System.searchElapsedByID(x.getIdNum()).equals(new Time(Integer.MAX_VALUE, Integer.MAX_VALUE, Integer.MAX_VALUE, Integer.MAX_VALUE)) ? "DNF" : ChronoTimer1009System.searchElapsedByID(x.getIdNum()).toString() + " F");
noRunners = false;
}
}
if(noRunners){toReturn = "no runners have finished";}
return toReturn;
}
public abstract void endRun() throws UserErrorException;
public abstract void trigChan(int chan, boolean dnf) throws UserErrorException;
public abstract void cancel(int ln) throws UserErrorException;
public abstract EventType getType();
}
Heat class:
package main;
import java.util.ArrayList;
public class Heat {
private ArrayList<Competitor> racers;
//private ArrayList<Competitor> racers;
private int currentCompetitor;
/**
* Constructor
*/
public Heat(){
racers = new ArrayList<Competitor>();
//racers = new ArrayList<Competitor>();
currentCompetitor = 0;
}
/**
* Set selected racer as next on to start
* @param x the racer to start next
*/
public void setNextCompetitor(Competitor x){
int pos = racers.indexOf(x);
if(pos == -1 || pos<currentCompetitor) throw new IllegalArgumentException("Competitor not in the race! Please add first");
for(int i = pos; i>currentCompetitor; --i){
racers.set(i, racers.get(i-1));
}
racers.set(currentCompetitor, x);
}
/**
* Take the selected runner (the next runner) out from the race
* @param racer the runner to be cleared
*/
public void clearNextCompetitor() throws UserErrorException {
if(racers.size()-(currentCompetitor)<1) throw new UserErrorException("No runners to clear!");
for(int i = currentCompetitor+1; i<racers.size(); ++i){
racers.set(i-1, racers.get(i));
}
racers.remove(racers.size()-1);
}
/**
* basically a remove method
* @param x
*/
public void remove(Competitor x){
int pos = racers.indexOf(x);
if(pos < 0) throw new IllegalArgumentException("runner does not exists");
racers.remove(pos);
}
/**
* Swaps two runners positions in line
*/
public void swap() throws UserErrorException{
int count = 0;
for(Competitor x : racers){
if(x.getStartTime() == null) ++count;
}
if(count > 1 && currentCompetitor + 1 <= racers.size()){
Competitor first = racers.get(currentCompetitor);
Competitor second = racers.get(currentCompetitor+1);
racers.set(currentCompetitor, second);
racers.set(currentCompetitor+1, first);
}
else{
throw new UserErrorException("Not enough competitors to swap");
}
}
/**
* Add a competitor to the end of the current line of competitors if any
* @param x the competitor to add
*/
public boolean addCompetitor(Competitor x) throws UserErrorException{
if(x.getIdNum() < 0 || x.getIdNum() > 99999) throw new UserErrorException("ID number out of range");
if(x.getRunNum() < 0) throw new IllegalArgumentException("Run Num Out of range");
boolean add = true;
for(Competitor i : racers){
if(i.getIdNum() == x.getIdNum()){
add = false;
break;
}
}
if(add){
racers.add(x);
}
return add;
}
/**
* Retrieve the next competitor if there is one
* @return the next competitor
*/
public Competitor getNextCompetitor() throws UserErrorException{
if(!hasNextCompetitor()) throw new UserErrorException("There are no more competitors!");
while(racers.get(currentCompetitor).isCompeting()){++currentCompetitor;}
return racers.get(currentCompetitor++);
}
/**
* used to fix the order of the queue after cancel is called
*/
public void fix(EventType x){
switch(x){
case IND:
--currentCompetitor;
break;
case GRP: case PARGRP: case PARIND:
for(int i = 0; i<racers.size(); ++i){
if(racers.get(i).getStartTime() == null){
currentCompetitor = i;
break;
}
}
break;
}
}
/**
* Is there another competitor to go?
* @return whether or not there is another competitor to go.
*/
public boolean hasNextCompetitor(){
return currentCompetitor < racers.size();
}
/**
* Return the list of competitors in this heat
* @return the racers
*/
public ArrayList<Competitor> getRacers(){
return racers;
}
}
In the export method of the ChronoTimer1009System class, I point out where the error is and what is happening.
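One possible explanation, judging only from the code posted above: the constructor calls newEvent(EventType.IND), which stores the new event in masterLog, but on() later assigns a brand-new IND to curEvent without adding it to masterLog. If racers are then added through getCurEvent(), they end up in an event that masterLog never sees, while masterLog.get(0) still holds the original event with its one empty heat. That would match the "two levels deep but not three" symptom. The sketch below reproduces this with simplified, hypothetical classes (MasterLogSketch and its Event are stand-ins, not the real ones).

```java
import java.util.ArrayList;
import java.util.List;

final class MasterLogSketch {
    /** Simplified stand-in for Event: one heat, represented by its list of racer ids. */
    static class Event {
        final List<Integer> racers = new ArrayList<>();
    }

    static final List<Event> masterLog = new ArrayList<>();
    Event curEvent;

    MasterLogSketch() {
        masterLog.clear(); // the original re-creates the static list in the constructor
        newEvent();
    }

    // Mirrors newEvent(): the new event is registered in masterLog.
    void newEvent() {
        curEvent = new Event();
        masterLog.add(curEvent);
    }

    // Mirrors on() as posted: curEvent is replaced, masterLog still holds the old event.
    void onAsPosted() {
        curEvent = new Event();
    }

    // One possible fix: keep the event that is already registered.
    void onKeepingEvent() {
        if (curEvent == null) {
            newEvent();
        }
    }
}
```

Whether on() should keep the current event or register a new one is a design decision for the real program; the point is only that curEvent and the logged event must be the same object. No synchronization is involved here, since the lists are never updated from another thread in this scenario.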
It seems that the centerpiece of Java Streams' parallelization is the ForEachTask. Understanding its logic appears to be essential to acquiring the mental model necessary to anticipate the concurrent behavior of client code written against the Streams API. Yet I find my anticipations contradicted by the actual behavior.
For reference, here is the key compute() method (java/util/stream/ForEachOps.java:253):
public void compute() {
Spliterator<S> rightSplit = spliterator, leftSplit;
long sizeEstimate = rightSplit.estimateSize(), sizeThreshold;
if ((sizeThreshold = targetSize) == 0L)
targetSize = sizeThreshold = AbstractTask.suggestTargetSize(sizeEstimate);
boolean isShortCircuit = StreamOpFlag.SHORT_CIRCUIT.isKnown(helper.getStreamAndOpFlags());
boolean forkRight = false;
Sink<S> taskSink = sink;
ForEachTask<S, T> task = this;
while (!isShortCircuit || !taskSink.cancellationRequested()) {
if (sizeEstimate <= sizeThreshold ||
(leftSplit = rightSplit.trySplit()) == null) {
task.helper.copyInto(taskSink, rightSplit);
break;
}
ForEachTask<S, T> leftTask = new ForEachTask<>(task, leftSplit);
task.addToPendingCount(1);
ForEachTask<S, T> taskToFork;
if (forkRight) {
forkRight = false;
rightSplit = leftSplit;
taskToFork = task;
task = leftTask;
}
else {
forkRight = true;
taskToFork = leftTask;
}
taskToFork.fork();
sizeEstimate = rightSplit.estimateSize();
}
task.spliterator = null;
task.propagateCompletion();
}
On a high level of description, the main loop keeps breaking down the spliterator, alternately forking off the processing of the chunk and processing it inline, until the spliterator refuses to split further or the remaining size is below the computed threshold.
Now consider the above algorithm in the case of unsized streams, where the whole is not being split into roughly equal halves; instead chunks of predetermined size are being repeatedly taken from the head of the stream. In this case the "suggested target size" of the chunk is abnormally large, which basically means that the chunks are never re-split into smaller ones.
The algorithm would therefore appear to alternately fork off one chunk, then process one inline. If each chunk takes the same time to process, this should result in no more than two cores being used. However, the actual behavior is that all four cores on my machine are occupied. Obviously, I am missing an important piece of the puzzle with that algorithm.
What is it that I'm missing?
Appendix: test code
Here is a piece of self-contained code which may be used to test the behavior which is the subject of this question:
package test;
import static java.util.concurrent.TimeUnit.NANOSECONDS;
import static java.util.concurrent.TimeUnit.SECONDS;
import static test.FixedBatchSpliteratorWrapper.withFixedSplits;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
public class Parallelization {
static final AtomicLong totalTime = new AtomicLong();
static final ExecutorService pool = Executors.newFixedThreadPool(4);
public static void main(String[] args) throws IOException {
final long start = System.nanoTime();
final Path inputPath = createInput();
System.out.println("Start processing");
try (PrintWriter w = new PrintWriter(Files.newBufferedWriter(Paths.get("output.txt")))) {
withFixedSplits(Files.newBufferedReader(inputPath).lines(), 200).map(Parallelization::processLine)
.forEach(w::println);
}
final double cpuTime = totalTime.get(), realTime = System.nanoTime() - start;
final int cores = Runtime.getRuntime().availableProcessors();
System.out.println(" Cores: " + cores);
System.out.format(" CPU time: %.2f s\n", cpuTime / SECONDS.toNanos(1));
System.out.format(" Real time: %.2f s\n", realTime / SECONDS.toNanos(1));
System.out.format("CPU utilization: %.2f%%", 100.0 * cpuTime / realTime / cores);
}
private static String processLine(String line) {
final long localStart = System.nanoTime();
double ret = 0;
for (int i = 0; i < line.length(); i++)
for (int j = 0; j < line.length(); j++)
ret += Math.pow(line.charAt(i), line.charAt(j) / 32.0);
final long took = System.nanoTime() - localStart;
totalTime.getAndAdd(took);
return NANOSECONDS.toMillis(took) + " " + ret;
}
private static Path createInput() throws IOException {
final Path inputPath = Paths.get("input.txt");
try (PrintWriter w = new PrintWriter(Files.newBufferedWriter(inputPath))) {
for (int i = 0; i < 6_000; i++) {
final String text = String.valueOf(System.nanoTime());
for (int j = 0; j < 20; j++)
w.print(text);
w.println();
}
}
return inputPath;
}
}
package test;
import static java.util.Spliterators.spliterator;
import static java.util.stream.StreamSupport.stream;
import java.util.Comparator;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Stream;
public class FixedBatchSpliteratorWrapper<T> implements Spliterator<T> {
private final Spliterator<T> spliterator;
private final int batchSize;
private final int characteristics;
private long est;
public FixedBatchSpliteratorWrapper(Spliterator<T> toWrap, long est, int batchSize) {
final int c = toWrap.characteristics();
this.characteristics = (c & SIZED) != 0 ? c | SUBSIZED : c;
this.spliterator = toWrap;
this.batchSize = batchSize;
this.est = est;
}
public FixedBatchSpliteratorWrapper(Spliterator<T> toWrap, int batchSize) {
this(toWrap, toWrap.estimateSize(), batchSize);
}
public static <T> Stream<T> withFixedSplits(Stream<T> in, int batchSize) {
return stream(new FixedBatchSpliteratorWrapper<>(in.spliterator(), batchSize), true);
}
@Override public Spliterator<T> trySplit() {
final HoldingConsumer<T> holder = new HoldingConsumer<>();
if (!spliterator.tryAdvance(holder)) return null;
final Object[] a = new Object[batchSize];
int j = 0;
do a[j] = holder.value; while (++j < batchSize && tryAdvance(holder));
if (est != Long.MAX_VALUE) est -= j;
return spliterator(a, 0, j, characteristics());
}
@Override public boolean tryAdvance(Consumer<? super T> action) {
return spliterator.tryAdvance(action);
}
@Override public void forEachRemaining(Consumer<? super T> action) {
spliterator.forEachRemaining(action);
}
@Override public Comparator<? super T> getComparator() {
if (hasCharacteristics(SORTED)) return null;
throw new IllegalStateException();
}
@Override public long estimateSize() { return est; }
@Override public int characteristics() { return characteristics; }
static final class HoldingConsumer<T> implements Consumer<T> {
Object value;
@Override public void accept(T value) { this.value = value; }
}
}
Ironically, the answer is almost stated in the question: as the "left" and "right" tasks take turns at being forked vs. processed inline, half of the time the right task, represented by this, i.e. the complete rest of the stream, is being forked off. That means that the forking off of chunks is just slowed down a bit (happening every other time), but clearly it happens.
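The alternation can be replayed in a simplified, single-threaded model. The sketch below is not the JDK code: ForkPatternSketch is a hypothetical name, chunks are just numbers, and short-circuiting and pending counts are ignored. It assumes, as in the question, that every chunk is already below the size threshold, so a chunk is never split again.

```java
import java.util.ArrayList;
import java.util.List;

final class ForkPatternSketch {
    /** Replays the fork/inline decisions of successive compute() calls over numbered chunks. */
    static List<String> trace(int chunkCount) {
        List<String> events = new ArrayList<>();
        int next = 0;       // index of the chunk the next trySplit() would hand out
        int invocation = 0; // which compute() call is being modelled
        while (true) {
            String who = "compute#" + invocation + ": ";
            // first loop pass (forkRight == false): the freshly split chunk is forked
            if (next >= chunkCount) {
                events.add(who + "nothing left to split, processes empty remainder");
                break;
            }
            events.add(who + "forks chunk " + next++);
            // second loop pass (forkRight == true): the remainder is forked, the chunk is kept
            if (next >= chunkCount) {
                events.add(who + "nothing left to split, processes empty remainder");
                break;
            }
            int inline = next++;
            events.add(who + "forks remainder as compute#" + (invocation + 1));
            events.add(who + "processes chunk " + inline + " inline");
            invocation++; // the forked remainder repeats the same pattern on another worker
        }
        return events;
    }
}
```

Calling trace(4) lists seven events: compute#0 forks chunk 0, forks the remainder and processes chunk 1 inline; compute#1 does the same with chunks 2 and 3; compute#2 finds nothing left. Since each remainder is forked before its predecessor starts the inline chunk, the chain of compute() calls can spread over all available workers rather than just two.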
We have a memory leak problem: we don't know where too many instances of a certain class are created or what refers to them. This occurs under heavy load in production, and we cannot obtain a heap dump (taking one hangs the HA server for too long). Runtime profiling is also not an option on the production site because of the performance degradation; the customers are happier with a random crash than with agonizing slowness while we monitor and try to fish for the moment of the crash. We don't know how to trigger the crash (leak); it just occurs at some times.
Is there a way to obtain object referrers/instantiation points at runtime from within the application itself?
I looked at http://docs.oracle.com/javase/6/docs/jdk/api/jpda/jdi/com/sun/jdi/ObjectReference.html and it gives an idea that something like this could be possible.
Any pointers on how to achieve this, preferably with custom code and without going the heap-dump way? We have tried reproducing the problem in a test environment, and it seems to be an exhausting wild-goose chase. We now want a brute-force way to find the cause.
It is recommended that you check the code that is causing such leaks. Here are some tutorials and help on the subject:
IBM Article on Handling memory leaks in Java
http://www.ibm.com/developerworks/library/j-leaks/
Some other useful articles
http://www.openlogic.com/wazi/bid/188158/How-to-Fix-Memory-Leaks-in-Java
There is also the Eclipse Memory Analyzer Tool (MAT).
But the most recommended solution will be
Try running jvisualvm from the JVM on the same machine as your program is running and enable profiling.
We solved the issue by collecting stack traces on instantiation and on clone, and by dumping them from a scheduler and whenever memory runs low.
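Stripped to its core, the idea can be sketched as follows. InstantiationTracker and its methods are hypothetical names for illustration, not the production code; the frame limit of 12 is an arbitrary assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: counts object creations per distinct call stack. */
final class InstantiationTracker {
    private static final int MAX_FRAMES = 12; // arbitrary cap to keep keys short
    private static final Map<String, AtomicLong> HITS = new ConcurrentHashMap<>();

    /** Call from the constructor and from clone() of the suspected class. */
    static void record() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        StringBuilder key = new StringBuilder();
        // frame 0 is record() itself, so start with its caller
        for (int i = 1; i < stack.length && i <= MAX_FRAMES; i++) {
            key.append(stack[i]).append('\n');
        }
        HITS.computeIfAbsent(key.toString(), k -> new AtomicLong()).incrementAndGet();
    }

    /** The call stack with the most recorded creations, or null if nothing was recorded. */
    static Map.Entry<String, AtomicLong> hottest() {
        Map.Entry<String, AtomicLong> best = null;
        for (Map.Entry<String, AtomicLong> e : HITS.entrySet()) {
            if (best == null || e.getValue().get() > best.getValue().get()) {
                best = e;
            }
        }
        return best;
    }
}
```

A scheduled task (or a low-memory check) can then log hottest() or the whole map. Capturing a stack trace on every instantiation is not free, so it seems sensible to restrict this to the one suspected class and guard it with a flag.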
We know the Object class that causes the problem, just needed to hunt down where it is born:
@EntityListeners(AbstractDTOJpaEventListener.class)
@MappedSuperclass
public abstract class AbstractDTO implements Storeable, Serializable, Cloneable {
/** */
private static final String SHADOWED_CLASS = "Custom";
/** */
protected final static boolean DEBUG_CUSTOM_INSTANCES = true;
/** */
public static long TARGET_HITRATE_PER_INTERVAL = 400000;
/** */
public static long LOGGING_INTERVAL = Times.MILLISECONDS_IN_TEN_SECONDS;
/** */
private static long previousLoggingTime;
/** */
protected static int hits;
/** */
protected static boolean hitting;
/** */
protected static int hitsWithinInterval;
/**
* @author Martin
*/
public static class Hi {
/**
*
*/
private long hitted;
private final long createdAt;
private final StackTraceElement[] stackTraceElements;
private final String threadName;
/**
* @param threadName
* @param stackTraceElements
*/
public Hi(String threadName, StackTraceElement[] stackTraceElements) {
this.threadName = threadName;
this.createdAt = System.currentTimeMillis();
this.stackTraceElements = stackTraceElements;
}
/**
*
*/
public void hit() {
hitted++;
}
/**
* @return the hitted
*/
public long getHitted() {
return hitted;
}
/**
* @param hitted the hitted to set
*/
public void setHitted(long hitted) {
this.hitted = hitted;
}
/**
* @return the createdAt
*/
public long getCreatedAt() {
return createdAt;
}
/**
* @return the stackTraceElements
*/
public StackTraceElement[] getStackTraceElements() {
return stackTraceElements;
}
/**
* @return the threadName
*/
public String getThreadName() {
return threadName;
}
}
/** */
protected final static Map<String, Hi> INSTANCE_SHADOW = new ConcurrentHashMap<String, Hi>();
private static final Comparator<? super Entry<String, Hi>> COMPARATOR = new Comparator<Entry<String, Hi>>() {
@Override
public int compare(Entry<String, Hi> o1, Entry<String, Hi> o2) {
if (o1 == o2) {
return 0;
}
return -Utils.compareNullSafe(o1.getValue().getHitted(), o2.getValue().getHitted(), Compare.ARG0_FIRST);
}
};
/**
* @param <T>
* @return T
* @see java.lang.Object#clone()
*/
@SuppressWarnings("unchecked")
public <T extends AbstractDTO> T clone() {
try {
return (T) super.clone();
} catch (CloneNotSupportedException e) {
throw new RuntimeException(e);
} finally {
if (DEBUG_CUSTOM_INSTANCES && getClass().getSimpleName().equals(SHADOWED_CLASS)) {
shadowInstance();
}
}
}
/**
*
*/
protected void shadowInstance() {
if (DEBUG_CUSTOM_INSTANCES) {
final long currentTimeMillis = System.currentTimeMillis();
if (TARGET_HITRATE_PER_INTERVAL <= ++hitsWithinInterval) {
hitting = true;
}
if ((TARGET_HITRATE_PER_INTERVAL / 2) <= ++hits) {
final Thread currentThread = Thread.currentThread();
final StackTraceElement[] stackTrace = currentThread.getStackTrace();
final String key = Utils.getPropertyPath(String.valueOf(System.identityHashCode(currentThread)), displayStackLocaktion(stackTrace))
.intern();
Hi hi = INSTANCE_SHADOW.get(key);
if (hi == null) {
synchronized (key) {
hi = INSTANCE_SHADOW.get(key);
if (hi == null) {
INSTANCE_SHADOW.put(key, hi = new Hi(currentThread.getName(), stackTrace));
}
}
}
hi.hit();
}
{
if (getLoggingInterval(currentTimeMillis) != getLoggingInterval(previousLoggingTime)) {
if (hitsWithinInterval < TARGET_HITRATE_PER_INTERVAL) {
if (hitting) {
hitting = false;
} else {
hits = 0; // Reset measuring on the second round, to give bursty hits a chance
}
}
hitsWithinInterval = 0;
previousLoggingTime = currentTimeMillis;
}
}
}
}
/**
* @param time
* @return long
*/
private long getLoggingInterval(long time) {
return time / LOGGING_INTERVAL;
}
/**
* @return String
*/
public static String toStringShadows() {
final ArrayList<Entry<String, Hi>> entries;
synchronized (INSTANCE_SHADOW) {
entries = Convert.toMinimumArrayList(INSTANCE_SHADOW.entrySet());
INSTANCE_SHADOW.clear();
}
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.append(new Timestamp(System.currentTimeMillis()) + " " + SHADOWED_CLASS + " Class instance instantiation summary:\n");
stringBuilder.append("hits=" + hits + ", hitting=" + hitting + ", hitsWithinInterval=" + hitsWithinInterval + ", previousLoggingTime=" + new java.sql.Timestamp(previousLoggingTime));
if (entries.isEmpty()) {
return stringBuilder.toString();
}
Collections.sort(entries, COMPARATOR);
int index = 0;
stringBuilder.append("-----------------------------------------------------------------------");
for (Entry<String, Hi> entry : entries) {
Utils.append(stringBuilder, entry.getValue().getHitted() + "\t" + entry.getKey(), "\n");
}
for (Entry<String, Hi> entry : entries) {
final Hi hi = entry.getValue();
final StackTraceElement[] stackTrace = hi.getStackTraceElements();
final String groupName = entry.getKey();
final String threadName = hi.getThreadName();
stringBuilder.append("\n").append(++index).append('\t');
stringBuilder.append(hi.getHitted()).append("\tpcs\t").append(groupName);
stringBuilder.append("\t").append(new Timestamp(hi.getCreatedAt()).toString()).append('\t').append(threadName)
.append('\t').append(Convert.toString(stackTrace));
}
return stringBuilder.toString();
}
/**
* @param stackTrace
* @return String
*/
private static String displayStackLocaktion(final StackTraceElement[] stackTrace) {
StackTraceElement firstDistinguishingStackTraceElement = null;
for (int index = 0; index < stackTrace.length; index++) {
firstDistinguishingStackTraceElement = stackTrace[index];
if (!Arrays.asList(UNWANTED_LOCATIONS).contains(firstDistinguishingStackTraceElement.getClassName())) {
break;
}
}
StackTraceElement lastDistinguishingStackTraceElement = null;
for (int index = stackTrace.length-1; 0 <= index; index--) {
lastDistinguishingStackTraceElement = stackTrace[index];
if (lastDistinguishingStackTraceElement.getClassName().startsWith(OUR_PACKAGE_DOMAIN)) {
break;
}
}
return Utils.getPropertyPath(displayName(firstDistinguishingStackTraceElement) + "<-"
+ displayName(lastDistinguishingStackTraceElement));
}
/**
* @param firstDistinguishingStackTraceElement
* @return String
*/
private static String displayName(StackTraceElement firstDistinguishingStackTraceElement) {
return Utils.getPropertyPath(firstDistinguishingStackTraceElement.getClassName(), firstDistinguishingStackTraceElement.getMethodName(),
String.valueOf(firstDistinguishingStackTraceElement.getLineNumber()));
}
}
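The core of the technique above, stripped of the throttling and the project-specific helpers, can be sketched as follows. This is a minimal illustration rather than the code we ran: AllocationSiteCounter, Suspect and the framesToKeep parameter are hypothetical names, and capturing a stack trace on every instantiation is expensive, which is why the class above only starts sampling once a hit-rate threshold is reached.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not from the original code): counts creations per call site. */
class AllocationSiteCounter {
    private static final ConcurrentMap<String, AtomicLong> COUNTS =
            new ConcurrentHashMap<String, AtomicLong>();

    /** Call from the constructor and clone() of the suspect class. */
    static void record(int framesToKeep) {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        StringBuilder key = new StringBuilder();
        // stack[0] is record() itself, so the call site starts at index 1.
        for (int i = 1; i < stack.length && i <= framesToKeep; i++) {
            if (key.length() > 0) {
                key.append(" <- ");
            }
            key.append(stack[i].getClassName()).append('.')
               .append(stack[i].getMethodName()).append(':')
               .append(stack[i].getLineNumber());
        }
        // putIfAbsent keeps the first counter if two threads race on a new key.
        AtomicLong fresh = new AtomicLong();
        AtomicLong existing = COUNTS.putIfAbsent(key.toString(), fresh);
        (existing != null ? existing : fresh).incrementAndGet();
    }

    /** Copies the counters into a sorted map, e.g. for periodic logging. */
    static Map<String, Long> snapshot() {
        Map<String, Long> copy = new TreeMap<String, Long>();
        for (Map.Entry<String, AtomicLong> e : COUNTS.entrySet()) {
            copy.put(e.getKey(), e.getValue().get());
        }
        return copy;
    }

    static void reset() {
        COUNTS.clear();
    }
}

/** Stand-in for the leaking class. */
class Suspect {
    Suspect() {
        AllocationSiteCounter.record(3);
    }
}
```

Logging snapshot() from a scheduled task then shows which call sites dominate; the counts say where instances are born, not who still holds references to them.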
I have what amounts to an Iterator<Integer>... actually it's a class Thing that accepts a Visitor<SomeObject> and calls visit() for a subset of the SomeObjects it contains, and I have to implement Visitor<SomeObject> so it does something like this:
// somehow get all the Id's from each of the SomeObject that Thing lets me visit
public int[] myIdExtractor(Thing thing)
{
final SomeCollection c = new SomeCollection();
thing.visitObjects(new Visitor<SomeObject>()
{
public void visit(SomeObject obj) { c.add(obj.getId()); }
}
);
return convertToPrimitiveArray(c);
}
I need to extract an int[] containing the results, and I'm not sure what to use for SomeCollection and convertToPrimitiveArray. The number of results is unknown ahead of time and will be large (10K-500K). Is there anything that would be a better choice than using ArrayList<Integer> for SomeCollection, and this:
public int[] convertToPrimitiveArray(List<Integer> ints)
{
int N = ints.size();
int[] array = new int[N];
int j = 0;
for (Integer i : ints)
{
array[j++] = i;
}
return array;
}
Efficiency and memory usage are of some concern.
It's not too difficult to come up with a class that collects ints in an array (even if you are not using some library which does it for you).
public class IntBuffer {
private int[] values = new int[10];
private int size = 0;
public void add(int value) {
if (!(size < values.length)) {
values = java.util.Arrays.copyOf(values, values.length*2);
}
values[size++] = value;
}
public int[] toArray() {
return java.util.Arrays.copyOf(values, size);
}
}
(Disclaimer: This is stackoverflow, I have not even attempted to compile this code.)
As an alternative you could use DataOutputStream to store the ints in a ByteArrayOutputStream.
final ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
final DataOutputStream out = new DataOutputStream(byteOut);
...
out.writeInt(value);
...
out.flush();
final byte[] bytes = byteOut.toByteArray();
final int[] ints = new int[bytes.length/4];
final ByteArrayInputStream byteIn = new ByteArrayInputStream(bytes);
final DataInputStream in = new DataInputStream(byteIn);
for (int ct=0; ct<ints.length; ++ct) {
ints[ct] = in.readInt();
}
(Disclaimer: This is stackoverflow, I have not even attempted to compile this code.)
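If you go the stream route, the read-back loop can be replaced by a bulk copy through java.nio, since DataOutputStream.writeInt writes big-endian and ByteBuffer defaults to big-endian as well. A rough sketch, where StreamIntCollector is a hypothetical name and the unchecked rethrow is just one way of dealing with the IOException that an in-memory stream should never raise:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

/** Hypothetical collector: stores ints as bytes, converts them back in bulk. */
class StreamIntCollector {
    private final ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(byteOut);

    void add(int value) {
        try {
            out.writeInt(value); // four bytes, most significant byte first
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new IllegalStateException(e);
        }
    }

    int[] toArray() {
        byte[] bytes = byteOut.toByteArray();
        int[] ints = new int[bytes.length / 4];
        // ByteBuffer is big-endian by default, matching writeInt.
        ByteBuffer.wrap(bytes).asIntBuffer().get(ints);
        return ints;
    }
}
```

This still copies the data twice (toByteArray, then the bulk get), so it does not change the memory picture much compared to the loop.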
You could look at something like pjc to handle this. That is a collections framework made for primitives.
For benchmarking's sake I put together a test program using an LFSR generator to prevent the compiler from optimizing out test arrays. I couldn't download pjc, but I assume its timing should be similar to Tom's IntBuffer class, which is by far the winner. The ByteArrayOutputStream approach is about the same speed as my original ArrayList<Integer> approach. I'm running J2SE 6u13 on a 3GHz Pentium 4, and with approx 2^20 values, after JIT has run its course, the IntBuffer approach takes roughly 40 msec (only 40 nsec per item!) above and beyond a reference implementation using a "forgetful" collection that just stores the last argument to visit() (so the compiler doesn't optimize it out). The other two approaches take on the order of 300 msec, about 8x as slow.
edit: I suspect the problem with the Stream approach is that there is the potential for exceptions which I had to catch, not sure.
(for arguments run PrimitiveArrayTest 1 2)
package com.example.test.collections;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class PrimitiveArrayTest {
interface SomeObject {
public int getX();
}
interface Visitor {
public void visit(SomeObject obj);
}
public static class PlainObject implements SomeObject
{
private int x;
public int getX() { return this.x; }
public void setX(int x) { this.x = x; }
}
public static class Thing
{
/* here's a LFSR
* see http://en.wikipedia.org/wiki/Linear_feedback_shift_register
* and http://www.ece.cmu.edu/~koopman/lfsr/index.html
*/
private int state;
final static private int MASK = 0x80004;
private void _next()
{
this.state = (this.state >>> 1)
^ (-(this.state & 1) & MASK);
}
public Thing(int state) { this.state = state; }
public void setState(int state) { this.state = state; }
public void inviteVisitor(Visitor v, int terminationPoint)
{
PlainObject obj = new PlainObject();
while (this.state != terminationPoint)
{
obj.setX(this.state);
v.visit(obj);
_next();
}
}
}
static public abstract class Collector implements Visitor
{
abstract public void initCollection();
abstract public int[] getCollection();
public int[] extractX(Thing thing, int startState, int endState)
{
initCollection();
thing.setState(startState);
thing.inviteVisitor(this, endState);
return getCollection();
}
public void doit(Thing thing, int startState, int endState)
{
System.out.printf("%s.doit(thing,%d,%d):\n",
getClass().getName(),
startState,
endState);
long l1 = System.nanoTime();
int[] result = extractX(thing,startState,endState);
long l2 = System.nanoTime();
StringBuilder sb = new StringBuilder();
sb.append(String.format("%d values calculated in %.4f msec ",
result.length, (l2-l1)*1e-6));
int N = 3;
if (result.length <= 2*N)
{
sb.append("[");
for (int i = 0; i < result.length; ++i)
{
if (i > 0)
sb.append(", ");
sb.append(result[i]);
}
sb.append("]");
}
else
{
int sz = result.length;
sb.append(String.format("[%d, %d, %d... %d, %d, %d]",
result[0], result[1], result[2],
result[sz-3], result[sz-2], result[sz-1]));
}
System.out.println(sb.toString());
}
}
static public class Collector0 extends Collector
{
int lastint = 0;
@Override public int[] getCollection() { return new int[]{lastint}; }
@Override public void initCollection() {}
@Override public void visit(SomeObject obj) {lastint = obj.getX(); }
}
static public class Collector1 extends Collector
{
final private List<Integer> ints = new ArrayList<Integer>();
@Override public int[] getCollection() {
int N = this.ints.size();
int[] array = new int[N];
int j = 0;
for (Integer i : this.ints)
{
array[j++] = i;
}
return array;
}
@Override public void initCollection() { }
@Override public void visit(SomeObject obj) { ints.add(obj.getX()); }
}
static public class Collector2 extends Collector
{
/*
* adapted from http://stackoverflow.com/questions/1167060
* by Tom Hawtin
*/
private int[] values;
private int size = 0;
@Override public void visit(SomeObject obj) { add(obj.getX()); }
@Override public void initCollection() { values = new int[32]; }
private void add(int value) {
if (!(this.size < this.values.length)) {
this.values = java.util.Arrays.copyOf(
this.values, this.values.length*2);
}
this.values[this.size++] = value;
}
@Override public int[] getCollection() {
return java.util.Arrays.copyOf(this.values, this.size);
}
}
static public class Collector3 extends Collector
{
/*
* adapted from http://stackoverflow.com/questions/1167060
* by Tom Hawtin
*/
final ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
final DataOutputStream out = new DataOutputStream(this.byteOut);
int size = 0;
@Override public int[] getCollection() {
try
{
this.out.flush();
final int[] ints = new int[this.size];
final ByteArrayInputStream byteIn
= new ByteArrayInputStream(this.byteOut.toByteArray());
final DataInputStream in = new DataInputStream(byteIn);
for (int ct=0; ct<ints.length; ++ct) {
ints[ct] = in.readInt();
}
return ints;
}
catch (IOException e) { /* gulp */ }
return new int[0]; // failure!?!??!
}
@Override public void initCollection() { }
@Override public void visit(SomeObject obj) {
try {
this.out.writeInt(obj.getX());
++this.size;
}
catch (IOException e) { /* gulp */ }
}
}
public static void main(String args[])
{
int startState = Integer.parseInt(args[0]);
int endState = Integer.parseInt(args[1]);
Thing thing = new Thing(0);
// let JIT do its thing
for (int i = 0; i < 20; ++i)
{
Collector[] collectors = {new Collector0(), new Collector1(), new Collector2(), new Collector3()};
for (Collector c : collectors)
{
c.doit(thing, startState, endState);
}
System.out.println();
}
}
}
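To see how many values the arguments 1 2 produce, the LFSR step from Thing can be run on its own and the steps counted. The sketch below copies the mask and the update rule from the test program; LfsrCycleCount and stepsBetween are hypothetical names. State 2 is the direct predecessor of state 1 (2 shifts down to 1 with no feedback), so walking from 1 to 2 covers the whole cycle minus one step. If the mask is maximal-length for 20 bits, as the Koopman table linked in the code lists it, that is 2^20 - 2 values.

```java
/** Counts how many LFSR steps lie between two states (hypothetical helper). */
class LfsrCycleCount {
    static final int MASK = 0x80004; // same feedback mask as in Thing

    /** One Galois LFSR step, identical to Thing._next(). */
    static int next(int state) {
        return (state >>> 1) ^ (-(state & 1) & MASK);
    }

    /** Number of values inviteVisitor() would hand to the visitor. */
    static int stepsBetween(int start, int end) {
        int state = start;
        int steps = 0;
        while (state != end) {
            state = next(state);
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(stepsBetween(1, 2));
    }
}
```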
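Instead of convertToPrimitiveArray, you can use List.toArray(T[] a), although it only works with the wrapper type: a List cannot hold int, so what you get back is an Integer[] rather than an int[]:
List<Integer> al = new ArrayList<Integer>();
// populate al
Integer[] values = al.toArray(new Integer[al.size()]);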
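Because a List can only hold boxed Integer values, getting an int[] out of it always involves an unboxing pass. On Java 8 or later (an assumption; the code in this thread targets Java 6), a stream can do that pass in one expression. A minimal sketch, with UnboxDemo and toIntArray as hypothetical names:

```java
import java.util.List;

class UnboxDemo {
    /** Unboxes every element; throws NullPointerException on a null element. */
    static int[] toIntArray(List<Integer> boxed) {
        return boxed.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

This is shorter than the explicit loop but does the same work, so it does not address the boxing overhead of collecting into the list in the first place.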
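For your other concerns, LinkedList might be slightly better than ArrayList, given that you don't know the size of your result set in advance: it never has to re-copy its contents when it grows, although it pays for that with a separate node object per element.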
If performance is really a problem, you may be better off hand-managing an int[] yourself, and using System.arraycopy() each time it grows; the boxing/unboxing from int to Integer that you need for any Collection could hurt.
As with any performance-related question, of course, test and make sure it really matters before spending too much time optimizing.