DouglasPeuckerSimplifier usage


I am attempting to reduce shape files generated from OSM path data. I am using the DouglasPeuckerSimplifier implementation from JTS (Java Topology Suite).
I want to build a GeoJSON route map for a specific GTFS (General Transit Feed Specification) feed. I can't use the point set straight from the map because it is too heavy: I end up with multi-megabyte JSON files.
My code looks like this. I have included the loop that populates the input just to give you some confidence that the input array is valid. What I am really asking about is the last three lines, and the general concept of taking a path from OSM and reducing the number of points in it, which I thought was exactly what Douglas-Peucker was all about.
ArrayList<Geometry> points = new ArrayList<Geometry>();
GeometryFactory gf = new GeometryFactory();
for (Object sh : shape_points) {
    double thisShapeLat = ((Shapes) sh).getshapePtLat();
    double thisShapeLon = ((Shapes) sh).getshapePtLon();
    // avoid identical consecutive points
    if (lastShapeLat == thisShapeLat && lastShapeLon == thisShapeLon) continue;
    lastShapeLat = thisShapeLat;
    lastShapeLon = thisShapeLon;
    Coordinate coord = new Coordinate(thisShapeLon, thisShapeLat);
    // System.err.println("added coord="+coord);
    points.add(gf.createPoint(coord));
}
Geometry[] points_ar = (Geometry[]) points.toArray(new Geometry[points.size()]);
GeometryCollection geometries = new GeometryCollection(points_ar, gf);
DouglasPeuckerSimplifier simplifier = new DouglasPeuckerSimplifier(geometries);
simplifier.setDistanceTolerance(0.00001);
Geometry result = simplifier.getResultGeometry();
No matter what value I set for the tolerance, I get the same points out (result) as I put in (points). It's not doing anything at all.
I have also called the static simplify() method, with the same result, i.e. nothing.

You need to pass a LineString, not a GeometryCollection, as the parameter to simplify.
Coordinate list2[] = new Coordinate[coords.size()];
list2 = coords.toArray(list2);
CoordinateArraySequence cas=new CoordinateArraySequence(list2);
LineString ls = new LineString(cas,gf);
Geometry result=DouglasPeuckerSimplifier.simplify(ls,0.001);
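To see why the input has to be a connected line, it may help to look at what Douglas-Peucker does: it keeps the two end points of a path, finds the intermediate point farthest from the straight segment between them, and keeps that point (and recurses on both halves) only if its distance exceeds the tolerance. A collection of separate Point geometries has no segments to measure against, which is a plausible reading of why nothing was removed. The following is a minimal, self-contained sketch of the algorithm on plain {x, y} arrays; it is not the JTS implementation, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical and this is not JTS code.
class DouglasPeuckerSketch {

    // Distance from point p to the segment a-b, with every point given as {x, y}.
    static double distanceToSegment(double[] p, double[] a, double[] b) {
        double dx = b[0] - a[0];
        double dy = b[1] - a[1];
        double lengthSquared = dx * dx + dy * dy;
        if (lengthSquared == 0.0) {
            return Math.hypot(p[0] - a[0], p[1] - a[1]);
        }
        // Project p onto the segment and clamp the projection to its end points.
        double t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / lengthSquared;
        t = Math.max(0.0, Math.min(1.0, t));
        return Math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy));
    }

    // Returns the simplified path; the first and last points are always kept.
    static List<double[]> simplify(List<double[]> path, double tolerance) {
        if (path.size() < 3) {
            return new ArrayList<>(path);
        }
        boolean[] keep = new boolean[path.size()];
        keep[0] = true;
        keep[path.size() - 1] = true;
        mark(path, 0, path.size() - 1, tolerance, keep);
        List<double[]> result = new ArrayList<>();
        for (int i = 0; i < path.size(); i++) {
            if (keep[i]) {
                result.add(path.get(i));
            }
        }
        return result;
    }

    // Marks the farthest point between first and last if it exceeds the tolerance, then recurses.
    private static void mark(List<double[]> path, int first, int last,
                             double tolerance, boolean[] keep) {
        double maxDistance = 0.0;
        int farthest = -1;
        for (int i = first + 1; i < last; i++) {
            double d = distanceToSegment(path.get(i), path.get(first), path.get(last));
            if (d > maxDistance) {
                maxDistance = d;
                farthest = i;
            }
        }
        if (farthest != -1 && maxDistance > tolerance) {
            keep[farthest] = true;
            mark(path, first, farthest, tolerance, keep);
            mark(path, farthest, last, tolerance, keep);
        }
    }

    public static void main(String[] args) {
        List<double[]> path = new ArrayList<>();
        path.add(new double[] {0.0, 0.0});
        path.add(new double[] {1.0, 0.01}); // nearly on the line, should be dropped
        path.add(new double[] {2.0, 0.0});
        path.add(new double[] {3.0, 2.0});  // a real corner, should be kept
        path.add(new double[] {4.0, 0.0});
        System.out.println("points kept: " + simplify(path, 0.1).size());
    }
}
```

Note that the tolerance is in the same units as the coordinates, so with longitude/latitude input it is expressed in degrees.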

Related

dl4j lstm not successful

I'm trying to copy the exercise about halfway down the page at this link:
https://d2l.ai/chapter_recurrent-neural-networks/sequence.html
The exercise uses a sine function to create 1000 data points between -1 and 1 and uses a recurrent network to approximate the function.
Below is the code I used. I'm going back to study why this isn't working, as it doesn't make much sense to me right now given that I was easily able to use a feed-forward network to approximate this function.
//get data
ArrayList<DataSet> list = new ArrayList();
DataSet dss = DataSetFetch.getDataSet(Constants.DataTypes.math, "sine", 20, 500, 0, 0);
DataSet dsMain = dss.copy();
if (!dss.isEmpty()){
list.add(dss);
}
if (list.isEmpty()){
return;
}
//format dataset
list = DataSetFormatter.formatReccurnent(list, 0);
//get network
int history = 10;
ArrayList<LayerDescription> ldlist = new ArrayList<>();
LayerDescription l = new LayerDescription(1,history, Activation.RELU);
ldlist.add(l);
LayerDescription ll = new LayerDescription(history, 1, Activation.IDENTITY, LossFunctions.LossFunction.MSE);
ldlist.add(ll);
ListenerDescription ld = new ListenerDescription(20, true, false);
MultiLayerNetwork network = Reccurent.getLstm(ldlist, 123, WeightInit.XAVIER, new RmsProp(), ld);
//train network
final List<DataSet> lister = list.get(0).asList();
DataSetIterator iter = new ListDataSetIterator<>(lister, 50);
network.fit(iter, 50);
network.rnnClearPreviousState();
//test network
ArrayList<DataSet> resList = new ArrayList<>();
DataSet result = new DataSet();
INDArray arr = Nd4j.zeros(lister.size()+1);
INDArray holder;
if (list.size() > 1){
//test on training data
System.err.println("oops");
}else{
//test on original or scaled data
for (int i = 0; i < lister.size(); i++) {
holder = network.rnnTimeStep(lister.get(i).getFeatures());
arr.putScalar(i,holder.getFloat(0));
}
}
//add originaldata
resList.add(dsMain);
//result
result.setFeatures(dsMain.getFeatures());
result.setLabels(arr);
resList.add(result);
//display
DisplayData.plot2DScatterGraph(resList);
Can you explain the code I would need for an LSTM network with 1 input, 10 hidden units and 1 output to approximate a sine function?
I'm not using any normalization, as the function is already in the range -1 to 1, and I'm using the Y input as the feature and the following Y input as the label to train the network.
You will notice I am building a class that allows for easier construction of nets. I have tried throwing many changes at the problem, but I am sick of guessing.
Here are some examples of my results. Blue is the data, red is the result.

This is one of those times where you go from wondering why this was not working to wondering how in the hell your original results were as good as they were.
My failing was not understanding the documentation clearly and also not understanding BPTT.
With feed-forward networks each iteration is stored as a row and each input as a column. An example is [dataset.size, networkinputs.size].
With recurrent input, however, it is reversed: each row is an input and each column is an iteration in time, which is necessary to activate the state of the LSTM chain of events. At minimum my input needed to be [0, networkinputs.size, dataset.size], but it could also be [dataset.size, networkinputs.size, statelength.size].
In my previous example I was training the network with data in the format [dataset.size, networkinputs.size, 1]. So, from my low-resolution understanding, the LSTM network should never have worked at all, but somehow it produced at least something.
There may also have been some issue with converting the dataset to a list, as I also changed how I feed the network, but I think the bulk of the issue was the data structure.
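To make the shape difference concrete, here is a small sketch in plain Java that cuts a sine series into sliding windows laid out as [dataset.size, networkinputs.size, statelength.size], with the value following each window as its label. The class and method names are hypothetical, the step size of 0.01 is an arbitrary choice, and converting the arrays into DL4J's INDArray/DataSet types is deliberately left out.

```java
// Illustrative sketch only; names are hypothetical and no DL4J types are used.
class SineWindows {

    // Features shaped [examples][inputs = 1][timeSteps]: each example is a short history.
    static double[][][] windows(double[] series, int timeSteps) {
        int examples = series.length - timeSteps;
        double[][][] features = new double[examples][1][timeSteps];
        for (int e = 0; e < examples; e++) {
            for (int t = 0; t < timeSteps; t++) {
                features[e][0][t] = series[e + t];
            }
        }
        return features;
    }

    // Label for each example: the value that immediately follows its window.
    static double[] nextValues(double[] series, int timeSteps) {
        int examples = series.length - timeSteps;
        double[] labels = new double[examples];
        for (int e = 0; e < examples; e++) {
            labels[e] = series[e + timeSteps];
        }
        return labels;
    }

    public static void main(String[] args) {
        double[] series = new double[1000];
        for (int i = 0; i < series.length; i++) {
            series[i] = Math.sin(0.01 * i);
        }
        double[][][] features = windows(series, 10);
        double[] labels = nextValues(series, 10);
        System.out.println("features: [" + features.length + ", "
                + features[0].length + ", " + features[0][0].length + "]");
        System.out.println("labels: [" + labels.length + "]");
    }
}
```

Compare this with the earlier [dataset.size, networkinputs.size, 1] layout, where every example carries a single time step and so gives the recurrent state nothing to unroll over.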
Below are my new results

It is hard to tell what is going on without seeing the full code. For a start, I don't see an RnnOutputLayer specified. You could take a look at this, which shows you how to build an RNN in DL4J.
If your RNN setup is correct, this could be a tuning issue. You can find more on tuning here. Adam is probably a better choice of updater than RMSProp. And tanh is probably a good choice for the activation of your output layer, since its range is (-1,1). Other things to check or tweak: learning rate, number of epochs, and the setup of your data (for example, are you trying to predict too far out?).

I want to make a graph with sensing data

I need to draw a graph with data that comes from an Arduino.
The data is sent as a String and I want to draw the graph from part of that data.
For example,
if the Arduino sends "1234567890", the graph should be drawn from "12345".
Here is my code:
ArrayList<Integer> colors = new ArrayList<>();
ArrayList<String> test1 = new ArrayList<>();
ArrayList<Entry> value1 = new ArrayList<>();
test1.add("123456909090");
test1.add("234567909090");
test1.add("334567909090");
test1.add("434567909090");
for (int i = 0; i < 4; i++){
String a = test1.get(i);
a.substring(0,6);
float b = Float.parseFloat(a);
value1.add(new Entry(i,b));
}
ScatterDataSet set1 = new ScatterDataSet(value1);
and an exception is raised:
FATAL EXCEPTION: java.lang.ArithmeticException: divide by zero
I never divide anywhere in my code.
How can I solve it?
And if there is a better way, let me know.
Thanks for reading.

As far as I can see, you should validate the data when you receive it, before you use it, with some control mechanism such as an if check. That will help you. In my limited programming experience, division by zero (any number / 0) is a problem. If you have to perform such an operation, I would replace the zero with a small value like 0.0001 or 0.000001, or discard the data that was sent and wait for the next reading; while you wait, you can keep using the old data. It depends on you and your project.
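One concrete thing to check in the question's loop, whether or not it is related to the exception: Java strings are immutable, so a.substring(0,6) returns a new String and leaves a unchanged, which means the whole twelve-character string is what gets parsed. The sketch below combines that fix with the kind of validation suggested above. The helper name leadingValue is made up for illustration, and returning an empty Optional for bad input is just one possible policy.

```java
import java.util.Optional;

// Illustrative sketch only; the class and method names are hypothetical.
class SensorParsing {

    // Parses the first `digits` characters of a raw reading, or returns empty if that is not possible.
    static Optional<Float> leadingValue(String raw, int digits) {
        if (raw == null || digits <= 0 || raw.length() < digits) {
            return Optional.empty();
        }
        // substring returns a NEW string; the result has to be assigned and used.
        String prefix = raw.substring(0, digits);
        try {
            return Optional.of(Float.parseFloat(prefix));
        } catch (NumberFormatException e) {
            return Optional.empty(); // malformed reading, let the caller skip it
        }
    }

    public static void main(String[] args) {
        String[] readings = {"123456909090", "234567909090", "12", "12ab56909090"};
        for (int i = 0; i < readings.length; i++) {
            Optional<Float> value = leadingValue(readings[i], 6);
            if (value.isPresent()) {
                System.out.println(i + " -> " + value.get());
            } else {
                System.out.println(i + " -> skipped");
            }
        }
    }
}
```

In the question's loop, the returned value would be what goes into new Entry(i, b), and readings that come back empty could simply be skipped.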

How to manually cross fold evaluate naive bayes in weka?

I'm using my own bag-of-words model instead of Weka's StringToWordVector (this turned out to be a mistake, but as it's only a school project, I'd like to finish it with my approach), so I cannot use its CrossFoldEvaluation, as my BoW dictionary would contain the words of the training data too.
for (int n = 0; n < folds; n++) {
List<String> allData = getAllReviews(); // 2000 reviews
List<String> trainingData = getTrainingReviews(n, folds); // random 1800 reviews
List<String> testData = getTestReviews(n, folds); // random 200 reviews
bagOfWordsModel.train(trainingData); // builds a vocabulary of 1800 training reviews
Instances inst = bagOfWordsModel.vectorize(allData); // returns 1800 instances with the class attribute set to positive or negative, and 200 without
// todo: evaluate
Classifier cModel = (Classifier) new NaiveBayes();
cModel.buildClassifier(inst);
Evaluation eTest = new Evaluation(inst);
eTest.evaluateModel(cModel, inst);
// print results
String strSummary = eTest.toSummaryString();
System.out.println(strSummary);
}
How can I evaluate this now? I thought Weka would automatically try to determine the class attribute of the instances that have no value for it. Instead, it tells me: weka.filters.supervised.attribute.Discretize: Cannot handle missing class values!

As you have both a training set and a testing set, you should train the classifier on the training data, which should be labelled, and then use the trained model to classify the unlabeled test data.
Classifier cModel = new NaiveBayes();
cModel.buildClassifier(trainingData);
And then, with the use of the following line you should be able to classify an unknown instance and get a prediction:
double clsLabel = cModel.classifyInstance(testData.instance(0));
Or you could use the Evaluation class to make predictions on the entire test set.
Evaluation evaluation = new Evaluation(trainingData);
evaluation.evaluateModel(cModel, testData);
You have pointed out that you are attempting to implement your own cross-validation by taking a random subset of the data. There is a method in the Evaluation class that does k-fold cross-validation for you (crossValidateModel).
Evaluation evaluation = new Evaluation(trainingData);
evaluation.crossValidateModel(cModel, trainingData, 10, new Random(1));
Note: Cross-validation is used when you don't have a separate test set: a subset of the training data is held out of training and then used to evaluate performance.
K-fold cross-validation splits the training data into K subsets. It puts one of the subsets aside and uses the remaining to train the classifier, returning to the subset set aside to evaluate the model. It then repeats this process until it has used each subset as the test set.
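If the vocabulary has to be rebuilt from the training reviews in every fold, as in the question, the folds can also be produced by hand. Below is a minimal sketch of just the splitting step described above, working on index lists only. The class and method names are made up for illustration, and turning the indices into Weka Instances is left to the question's own bagOfWordsModel.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; names are hypothetical and no Weka types are used.
class KFoldSketch {

    // Shuffles the indices 0..n-1 with a fixed seed and deals them into k folds.
    static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    // Training indices for one round: every fold except the one set aside for testing.
    static List<Integer> trainingIndices(List<List<Integer>> folds, int testFold) {
        List<Integer> training = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != testFold) {
                training.addAll(folds.get(f));
            }
        }
        return training;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = folds(2000, 10, 1L);
        for (int f = 0; f < folds.size(); f++) {
            List<Integer> test = folds.get(f);
            List<Integer> training = trainingIndices(folds, f);
            // Here the vocabulary would be built from `training` only, then `test` evaluated.
            System.out.println("fold " + f + ": train=" + training.size() + " test=" + test.size());
        }
    }
}
```

Because each index lands in exactly one fold, every review is used for testing exactly once and never appears in the training set of its own round.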

When training, only input the instances that have their class set.
In this line:
cModel.buildClassifier(inst);
you are training a naive Bayes classifier. Input only the training examples(!). Evaluate against all data (with labels!). Evaluation checks the predicted label against the actual label, if I remember correctly.
The 200 data points without a class label seem useless; what are they for?

Need some help for deeplearning4j single RBM usage

I have a bunch of sensors and I really just want to reconstruct the input.
So what I want is this:
after I have trained my model I will pass in my feature matrix
get the reconstructed feature matrix back
I want to investigate which sensor values are completely different from the reconstructed value
Therefore I thought an RBM would be the right choice, and since I am used to Java, I tried to use deeplearning4j. But I got stuck very early. When I run the following code, I face two problems:
The result is far away from a correct prediction, most of them are simply [1.00,1.00,1.00].
I would expect to get back 4 values (which is the number of inputs expected to be reconstructed)
So what do I have to tune to a) get a better result and b) get the reconstructed inputs back?
public static void main(String[] args) {
// Customizing params
Nd4j.MAX_SLICES_TO_PRINT = -1;
Nd4j.MAX_ELEMENTS_PER_SLICE = -1;
Nd4j.ENFORCE_NUMERICAL_STABILITY = true;
final int numRows = 4;
final int numColumns = 1;
int outputNum = 3;
int numSamples = 150;
int batchSize = 150;
int iterations = 100;
int seed = 123;
int listenerFreq = iterations/5;
DataSetIterator iter = new IrisDataSetIterator(batchSize, numSamples);
// Loads data into generator and format consumable for NN
DataSet iris = iter.next();
iris.normalize();
//iris.scale();
System.out.println(iris.getFeatureMatrix());
NeuralNetConfiguration conf = new NeuralNetConfiguration.Builder()
// Gaussian for visible; Rectified for hidden
// Set contrastive divergence to 1
.layer(new RBM.Builder()
.nIn(numRows * numColumns) // Input nodes
.nOut(outputNum) // Output nodes
.activation("tanh") // Activation function type
.weightInit(WeightInit.XAVIER) // Weight initialization
.lossFunction(LossFunctions.LossFunction.XENT)
.updater(Updater.NESTEROVS)
.build())
.seed(seed) // Locks in weight initialization for tuning
.iterations(iterations)
.learningRate(1e-1f) // Backprop step size
.momentum(0.5) // Speed of modifying learning rate
.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT) // ^^ Calculates gradients
.build();
Layer model = LayerFactories.getFactory(conf.getLayer()).create(conf);
model.setListeners(Arrays.asList((IterationListener) new ScoreIterationListener(listenerFreq)));
model.fit(iris.getFeatureMatrix());
System.out.println(model.activate(iris.getFeatureMatrix(), false));
}

For b), when you call activate(), you get a list of "nlayers" arrays. Every array in the list is the activation for one layer. The array itself is composed of rows: 1 row per input vector; each column contains the activation for every neuron in this layer and this observation (input).
Once all layers have been activated with some input, you can get the reconstruction with the RBM.propDown() method.
As for a), I'm afraid it is very tricky to train an RBM correctly. So you really want to play with every parameter and, more importantly, monitor various metrics during training that will give you some hint about whether it is training correctly or not. Personally, I like to plot:
The score() on the training corpus, which is the reconstruction error after every gradient update; check that it decreases.
The score() on another development corpus: useful to be warned when overfitting occurs;
The norm of the parameter vector: it has a large impact on the score
Both activation maps (= XY rectangular plot of the activated neurons of one layer over the corpus), just after initialization and after N steps: this helps detecting unreliable training (e.g.: when all is black/white, when a large part of all neurons are never activated, etc.)
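Once a reconstruction is available, the comparison the question asks for (which sensor values differ strongly from their reconstructed values) is a per-feature error check. Here is a minimal sketch on plain arrays; the class name, the method name and the fixed threshold are all assumptions for illustration, and extracting the rows from the DL4J matrices is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and the threshold policy are hypothetical.
class ReconstructionCheck {

    // Returns the indices of sensors whose absolute reconstruction error exceeds the threshold.
    static List<Integer> deviatingSensors(double[] input, double[] reconstruction, double threshold) {
        if (input.length != reconstruction.length) {
            throw new IllegalArgumentException("input and reconstruction must have the same length");
        }
        List<Integer> deviating = new ArrayList<>();
        for (int i = 0; i < input.length; i++) {
            if (Math.abs(input[i] - reconstruction[i]) > threshold) {
                deviating.add(i);
            }
        }
        return deviating;
    }

    public static void main(String[] args) {
        double[] input = {0.1, 0.5, 0.9, 0.3};            // one row of sensor values
        double[] reconstruction = {0.12, 0.48, 0.2, 0.31}; // made-up reconstruction of that row
        System.out.println("deviating sensors: " + deviatingSensors(input, reconstruction, 0.1));
    }
}
```

This only works if the reconstruction has as many values as the input (4 here), which is why the output of activate() with its 3 hidden units cannot be used for it directly.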

How to try something until available object is found

The title is probably a bit confusing, but I don't really know how to explain this. I have a list of objects, in this case locations, and those locations can be occupied by a player. If the selected location is already occupied, how can I try to find a new location, and continue doing so until a non-occupied location is found?
I already know that there are 20 locations, I could manually check each and every one of those locations and see if it's occupied, but is there a better way to do this?
Here is a snippet of my code.
List<Location> spawnList = arena.getManager().getRandomSpawns(); // Returns a list of possible locations
Location random = spawnList.get(new Random().nextInt(spawnList.size())); // Selects a random location from the list
if (random.isOccupied()) {
/* Location is occupied, find another one from the list, and continue doing this until non-occupied location is found */
}
Sorry if you didn't understand, I don't know a good way of explaining this.

List<Location> spawnList = arena.getManager().getRandomSpawns();
Location random;
Random r = new Random();
do {
    random = spawnList.get(r.nextInt(spawnList.size()));
} while (random.isOccupied());
This will loop forever if all locations are occupied, so you should check for that first.

You can choose one of two ways:
Push: when a location becomes available, notify that it is now available (by calling a method, for example).
Polling: something like what you are doing now. It is possible to hold a collection of available locations; when a location becomes available, it is added to the collection, and you can wait for the collection to have values. I would suggest a blocking queue.
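A minimal sketch of the blocking-queue idea, using java.util.concurrent.LinkedBlockingQueue: freed locations are put into the queue, and whoever needs a spawn takes one out, optionally waiting for a while. The class FreeSpawnPool and its method names are made up for illustration, and the type parameter T stands in for the question's Location.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; the class and its method names are hypothetical.
class FreeSpawnPool<T> {

    private final BlockingQueue<T> free = new LinkedBlockingQueue<>();

    // Call this whenever a location becomes available again.
    void release(T location) {
        free.offer(location);
    }

    // Returns a free location immediately, or null if none is available right now.
    T tryAcquire() {
        return free.poll();
    }

    // Waits up to the given time for a free location; returns null on timeout.
    T acquire(long timeout, TimeUnit unit) throws InterruptedException {
        return free.poll(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        FreeSpawnPool<String> pool = new FreeSpawnPool<>();
        pool.release("spawn-1");
        pool.release("spawn-2");
        System.out.println(pool.tryAcquire());
        System.out.println(pool.acquire(100, TimeUnit.MILLISECONDS));
        System.out.println(pool.acquire(100, TimeUnit.MILLISECONDS)); // nothing left
    }
}
```

Locations come out in the order they were released rather than at random, so this suits the case where any free spot will do.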

The trivial approach would be to randomize a location in a loop until one is found:
List<Location> spawnList = arena.getManager().getRandomSpawns(); // Returns a list of possible locations
Location random = spawnList.get(new Random().nextInt(spawnList.size())); // Selects a random location from the list
while (random.isOccupied()) {
random = spawnList.get(new Random().nextInt(spawnList.size()));
}
The problem here is that this may take a very long time if most of the locations are already occupied.
A "safer" approach, which promises the same order of performance regardless of the percentage of pre-occupied locations could be to shuffle the list of locations, and then simply iterate through it:
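List<Location> spawnList = new LinkedList<Location>(arena.getManager().getRandomSpawns());
Collections.shuffle(spawnList); // randomize the order once
Location random = null;
for (Location loc : spawnList) {
    if (!loc.isOccupied()) {
        random = loc;
        break; // stop at the first free location
    }
}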

You can declare a flag that records whether a candidate Location has been found, and use a while loop to generate random Locations, like this:
Location random = null;
boolean foundLocation = false;
while (!foundLocation)
{
    random = spawnList.get(new Random().nextInt(spawnList.size()));
    if (!random.isOccupied())
    {
        foundLocation = true;
    }
}
Note: This assumes that at least one Location in the list is not occupied. If all of the Locations are occupied, the code above cannot be used, because it will loop forever. It is better to first check that at least one Location in the list is not occupied.

Instead of stochastically probing until you hit an empty spot, you should
first collect all available locations and then
pick a random free location.
Random rnd = new Random();
List<Integer> freeLocations = new ArrayList<>();
for (int i = 0; i < spawnList.size(); i++)
    if (!spawnList.get(i).isOccupied()) freeLocations.add(i);
Location random =
    spawnList.get(freeLocations.get(rnd.nextInt(freeLocations.size())));
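The same collect-then-pick idea can be written as a self-contained sketch that also covers the case where every location is occupied, since nextInt(0) would throw. The Spot interface is a hypothetical stand-in for the question's Location type, and returning an Optional is just one way to signal that nothing is free.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Illustrative sketch only; Spot is a hypothetical stand-in for Location.
class SpawnPicker {

    interface Spot {
        boolean isOccupied();
    }

    // Collects all free spots first, then picks one of them uniformly at random.
    static <T extends Spot> Optional<T> pickFree(List<T> spots, Random rnd) {
        List<T> free = new ArrayList<>();
        for (T spot : spots) {
            if (!spot.isOccupied()) {
                free.add(spot);
            }
        }
        if (free.isEmpty()) {
            return Optional.empty(); // every spot is taken, nothing to pick
        }
        return Optional.of(free.get(rnd.nextInt(free.size())));
    }

    public static void main(String[] args) {
        Spot occupied = () -> true;
        Spot available = () -> false;
        List<Spot> spots = Arrays.asList(occupied, available, occupied);
        Optional<Spot> picked = pickFree(spots, new Random());
        System.out.println("found a free spot: " + picked.isPresent());
    }
}
```

This takes one pass over the list no matter how many locations are occupied, and it cannot loop forever.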
