How can I pause/serialize a genetic algorithm in Encog? - java

How can I pause a genetic algorithm in Encog 3.4 (the version currently under development in Github)?
I am using the Java version of Encog.
I am trying to modify the Lunar example that comes with Encog. I want to pause/serialize the genetic algorithm and then continue/deserialize at a later stage.
When I call train.pause(), it simply returns null, which is obvious from the code since the method always returns null.
I would assume this would be fairly straightforward: there could be a scenario in which I train a neural network, use it for some predictions, and then continue training with the genetic algorithm as more data arrives before resuming with more predictions, all without having to restart the training from the beginning.
Please note that I am not trying to serialize or persist a neural network but rather the entire genetic algorithm.

Not all trainers in Encog support the simple pause/resume. If they do not support it, they return null, like this one. The genetic algorithm trainer is much more complex than a simple propagation trainer that supports pause/resume. To save the state of the genetic algorithm, you must save the entire population, as well as the scoring function (which may or may not be serializable). I modified the Lunar Lander example to show you how you might save/reload your population of neural networks to do this.
You can see that it trains 50 iterations, then round-trips (load/saves) the genetic algorithm, then trains 50 more.
package org.encog.examples.neural.lunar;

import java.io.File;
import java.io.IOException;

import org.encog.Encog;
import org.encog.engine.network.activation.ActivationTANH;
import org.encog.ml.MLMethod;
import org.encog.ml.MLResettable;
import org.encog.ml.MethodFactory;
import org.encog.ml.ea.population.Population;
import org.encog.ml.genetic.MLMethodGeneticAlgorithm;
import org.encog.ml.genetic.MLMethodGenomeFactory;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.pattern.FeedForwardPattern;
import org.encog.util.obj.SerializeObject;

public class LunarLander {

    public static BasicNetwork createNetwork() {
        FeedForwardPattern pattern = new FeedForwardPattern();
        pattern.setInputNeurons(3);
        pattern.addHiddenLayer(50);
        pattern.setOutputNeurons(1);
        pattern.setActivationFunction(new ActivationTANH());
        BasicNetwork network = (BasicNetwork) pattern.generate();
        network.reset();
        return network;
    }

    public static void saveMLMethodGeneticAlgorithm(String file, MLMethodGeneticAlgorithm ga) throws IOException {
        // The genome factory holds a reference to a non-serializable factory,
        // so clear it before serializing the population; it is rebuilt on load.
        ga.getGenetic().getPopulation().setGenomeFactory(null);
        SerializeObject.save(new File(file), ga.getGenetic().getPopulation());
    }

    public static MLMethodGeneticAlgorithm loadMLMethodGeneticAlgorithm(String filename)
            throws ClassNotFoundException, IOException {
        Population pop = (Population) SerializeObject.load(new File(filename));
        // Restore the genome factory that was cleared before saving.
        pop.setGenomeFactory(new MLMethodGenomeFactory(new MethodFactory() {
            @Override
            public MLMethod factor() {
                final BasicNetwork result = createNetwork();
                ((MLResettable) result).reset();
                return result;
            }
        }, pop));

        // Create a new trainer (the scoring function is supplied again here);
        // the population size of 1 is a placeholder, since the real population
        // is attached on the next line.
        MLMethodGeneticAlgorithm result = new MLMethodGeneticAlgorithm(new MethodFactory() {
            @Override
            public MLMethod factor() {
                return createNetwork();
            }
        }, new PilotScore(), 1);
        result.getGenetic().setPopulation(pop);
        return result;
    }

    public static void main(String args[]) {
        BasicNetwork network = createNetwork();

        MLMethodGeneticAlgorithm train;
        train = new MLMethodGeneticAlgorithm(new MethodFactory() {
            @Override
            public MLMethod factor() {
                final BasicNetwork result = createNetwork();
                ((MLResettable) result).reset();
                return result;
            }
        }, new PilotScore(), 500);

        try {
            int epoch = 1;
            for (int i = 0; i < 50; i++) {
                train.iteration();
                System.out.println("Epoch #" + epoch + " Score:" + train.getError());
                epoch++;
            }
            train.finishTraining();

            // Round-trip the GA and then train again
            LunarLander.saveMLMethodGeneticAlgorithm("/Users/jeff/projects/trainer.bin", train);
            train = LunarLander.loadMLMethodGeneticAlgorithm("/Users/jeff/projects/trainer.bin");

            // Train again
            for (int i = 0; i < 50; i++) {
                train.iteration();
                System.out.println("Epoch #" + epoch + " Score:" + train.getError());
                epoch++;
            }
            train.finishTraining();
        } catch (IOException ex) {
            ex.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
System.out.println("\nHow the winning network landed:");
network = (BasicNetwork)train.getMethod();
NeuralPilot pilot = new NeuralPilot(network,true);
System.out.println(pilot.scorePilot());
Encog.getInstance().shutdown();
}
}
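A couple of details worth noting in the save/load helpers above: the genome factory is set to null before the population is serialized because it holds a reference to the anonymous MethodFactory, which is not serializable, and the load method installs a fresh MLMethodGenomeFactory before handing the population to a new trainer. Likewise, the scoring function (PilotScore here) is never serialized at all; it is simply constructed again when the trainer is rebuilt, which is why a non-serializable scoring function is not a problem with this approach.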

Related

Sorting strings via stream

I am doing a coding exercise where I take the raw data from a CSV file and print it in order of lowest to highest literacy rate.
For example:
Adult literacy rate, population 15+ years, female (%),United Republic of Tanzania,2015,76.08978
Adult literacy rate, population 15+ years, female (%),Zimbabwe,2015,85.28513
Adult literacy rate, population 15+ years, male (%),Honduras,2014,87.39595
Adult literacy rate, population 15+ years, male (%),Honduras,2015,88.32135
Adult literacy rate, population 15+ years, male (%),Angola,2014,82.15105
Turns into:
Niger (2015), female, 11.01572
Mali (2015), female, 22.19578
Guinea (2015), female, 22.87104
Afghanistan (2015), female, 23.87385
Central African Republic (2015), female, 24.35549
My code:
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LiteracyComparison {

    public static void main(String[] args) throws IOException {
        List<String> literacy = new ArrayList<>();
        try (Scanner scanner = new Scanner(Paths.get("literacy.csv"))) {
            while (scanner.hasNextLine()) {
                String row = scanner.nextLine();
                String[] line = row.split(",");
                line[2] = line[2].trim().substring(0, line[2].length() - 5);
                line[3] = line[3].trim();
                line[4] = line[4].trim();
                line[5] = line[5].trim();
                String l = line[3] + " (" + line[4] + "), " + line[2] + ", " + line[5];
                literacy.add(l);
            }
        }
        // right about where I get lost
        literacy.stream().sorted();
    }
}
Now that I have converted the raw data into the correct format, I am just lost on how to sort it.
I am also wondering if there is a more efficient way to do this via streams. Please and thank you.
I took a few liberties while refactoring your code, but the idea is the same. This could be further improved but it is not intended to be a perfect solution, just something to answer your question and put you on the right track.
The main idea here is to create a nested class called LiteracyData, which stores the summary you had before as a String. However, we also want to store the literacy rate so we have something to sort by. Then you can use a Java Comparator to define your own method for comparing custom classes, in this case LiteracyData. Finally, tie it all together by calling the sort function on your List, while passing in the custom Comparator as an argument. That will sort your list. You can then print it to view the results.
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Scanner;

public class LiteracyComparison {

    // Define a class that stores your data
    public class LiteracyData {
        private String summary;
        private float rate;

        public LiteracyData(String summary, float rate) {
            super();
            this.summary = summary;
            this.rate = rate;
        }
    }

    // This is a custom Comparator we defined for sorting LiteracyData
    public class LiteracySorter implements Comparator<LiteracyData> {
        @Override
        public int compare(LiteracyData d1, LiteracyData d2) {
            return Float.compare(d1.rate, d2.rate);
        }
    }

    public void run() {
        List<LiteracyData> literacy = new ArrayList<>();
        try (Scanner scanner = new Scanner(Paths.get("literacy.csv"))) {
            while (scanner.hasNextLine()) {
                String row = scanner.nextLine();
                String[] line = row.split(",");
                line[2] = line[2].trim().substring(0, line[2].length() - 5);
                line[3] = line[3].trim();
                line[4] = line[4].trim();
                line[5] = line[5].trim();
                String l = line[3] + " (" + line[4] + "), " + line[2] + ", " + line[5];
                LiteracyData data = new LiteracyData(l, Float.parseFloat(line[5]));
                literacy.add(data);
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }

        // Sort the list using your custom LiteracyData comparator
        literacy.sort(new LiteracySorter());

        // Iterate through the list and print each item to ensure it is sorted
        for (LiteracyData data : literacy) {
            System.out.println(data.summary);
        }
    }

    public static void main(String[] args) throws IOException {
        LiteracyComparison comparison = new LiteracyComparison();
        comparison.run();
    }
}
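To address the streams part of the question as well: once the data is in LiteracyData objects, the same sort can be expressed as a stream pipeline inside run(), for example:

literacy.stream()
        .sorted(new LiteracySorter())
        .map(data -> data.summary)
        .forEach(System.out::println);

Note that sorted() returns a new, sorted stream rather than sorting the underlying list in place, which is why the bare literacy.stream().sorted() line in the original code had no visible effect.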

Language detection with Apache Tika

I am currently trying to get to grips with Apache Tika and set up language detection that checks all the key values of my various properties files for the correct language of the respective file. Unfortunately the detection is not really good: many keys are not recognized with the correct language, and I don't know how to do better. An API-based solution is out of the question, because my task is to find a free approach, and most free services only allow 1000 calls per day (in German alone I have more than 14000 keys).
If you know how I can improve the current code, or maybe have another solution, please let me know!
Thanks a lot,
Pascal
This is my current code:
import java.util.Set;

import org.apache.tika.language.LanguageIdentifier;

public class detect {

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws Exception {
        final MyPropAllKeys mPAK = new MyPropAllKeys("messages_forCheck.properties");
        final Set<Object> keys = mPAK.getAllKeys();
        for (final Object key : keys) {
            final String keyString = key.toString();
            final String keyValueString = mPAK.getPropertyValue(keyString);
            detect(keyValueString, key);
        }
    }

    public static void detect(String keyValueString, Object key) {
        final LanguageIdentifier languageIdentifier = new LanguageIdentifier(keyValueString);
        final String language = languageIdentifier.getLanguage();
        if (!language.equals("de")) {
            System.out.println(language + " " + key + ": " + keyValueString);
        }
    }
}
For example, here are some of the results:
pt de.segal.baoss.platform.entity.BackgroundTaskType.MASS_INVOICE_DOCUMENT_CREATION: Rechnungsdokumente erzeugen
sk de.segal.baoss.purchase.supplier.creditorNumber: Kreditorennummer
no de.segal.baoss.module.crm.revenueLastYear: Umsatz vergangenes Jahr
no de.segal.baoss.module.op.customerReturn.action.createCreditEntry: Gutschrift erstellen
All of these are definitely German.
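One direction that may help: the LanguageIdentifier used above is deprecated, and newer Tika versions ship a pluggable detector (the tika-langdetect module) that may handle these strings better. A minimal sketch of the replacement API, assuming Tika 1.14+ with the tika-langdetect dependency on the classpath:

import org.apache.tika.langdetect.OptimaizeLangDetector;
import org.apache.tika.language.detect.LanguageDetector;
import org.apache.tika.language.detect.LanguageResult;

// Load the models once and reuse the detector for all keys.
LanguageDetector detector = new OptimaizeLangDetector().loadModels();
LanguageResult result = detector.detect(keyValueString);
if (!result.isReasonablyCertain() || !result.getLanguage().equals("de")) {
    System.out.println(result.getLanguage() + " " + key + ": " + keyValueString);
}

Keep in mind that single UI labels like "Gutschrift erstellen" are often too short for any statistical detector to classify reliably, so concatenating several values per file and detecting the language of the file as a whole may give far better results than testing key by key.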

Saving and Loading Trained Stanford classifier in java

I have a dataset of 1 million labelled sentences and am using it to find sentiment through maximum entropy. I am using the Stanford Classifier for this:
public class MaximumEntropy {

    static ColumnDataClassifier cdc;

    public static float calMaxEntropySentiment(String text) {
        initializeProperties();
        float sentiment = (getMaxEntropySentiment(text));
        return sentiment;
    }

    public static void initializeProperties() {
        cdc = new ColumnDataClassifier(
                "\\stanford-classifier-2016-10-31\\properties.prop");
    }

    public static int getMaxEntropySentiment(String tweet) {
        String filteredTweet = TwitterUtils.filterTweet(tweet);
        System.out.println("Reading training file");
        // The full training file is re-read and the classifier re-trained on every call.
        Classifier<String, String> cl = cdc.makeClassifier(cdc.readTrainingExamples(
                "\\stanford-classifier-2016-10-31\\labelled_sentences.txt"));
        Datum<String, String> d = cdc.makeDatumFromLine(filteredTweet);
        System.out.println(filteredTweet + " ==> " + cl.classOf(d) + " " + cl.scoresOf(d));
        // System.out.println("Class score is: " + cl.scoresOf(d).getCount(cl.classOf(d)));
        if ("0".equals(cl.classOf(d))) {
            return 0;
        } else {
            return 4;
        }
    }
}
My data is labelled 0 or 1. Right now the whole dataset is read for each tweet, which takes a lot of time considering the size of the dataset.
My query is: is there any way to first train the classifier and then load it when a tweet's sentiment needs to be found? I think this approach would take less time. Correct me if I am wrong.
The following link covers this, but there is nothing for the Java API.
Saving and Loading Classifier
Any help would be appreciated.
Yes; the easiest way to do this is using Java's default serialization mechanism to serialize a classifier. A useful helper here is the IOUtils class:
IOUtils.writeObjectToFile(classifier, "/path/to/file");
To read the classifier:
Classifier<String, String> cl = IOUtils.readObjectFromFile(new File("/path/to/file"));
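Putting this together with the code in the question, the train-once / classify-many flow might look like this sketch (the paths are placeholders, exception handling is omitted, and cdc and TwitterUtils are carried over from the question):

// Train once and serialize the resulting classifier.
Classifier<String, String> cl = cdc.makeClassifier(cdc.readTrainingExamples(
        "\\stanford-classifier-2016-10-31\\labelled_sentences.txt"));
IOUtils.writeObjectToFile(cl, "\\stanford-classifier-2016-10-31\\classifier.ser");

// Later (e.g., at application startup), load the trained classifier once...
Classifier<String, String> loaded =
        IOUtils.readObjectFromFile(new File("\\stanford-classifier-2016-10-31\\classifier.ser"));

// ...and reuse it for every tweet, without re-reading the training file.
Datum<String, String> d = cdc.makeDatumFromLine(TwitterUtils.filterTweet(tweet));
System.out.println(loaded.classOf(d));

Note that the ColumnDataClassifier itself is still needed at classification time, since it turns a line of text into a Datum using the same feature definitions that were used during training.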

Does Lucene (Java framework) by default calculate the tf-idf and cosine similarity of a document against the term?

I am developing a search-engine-based application using the Lucene Java framework, and I am confused by the default scoring functionality provided by Lucene: does the scoring implement tf-idf and cosine similarity by default, or do we have to do something else?
public class LuceneTester {

    String indexDir = "C:\\Users\\hamda\\Documents\\NetBeansProjects\\luceneDemo\\Index";
    String dataDir = "C:\\Users\\hamda\\Documents\\NetBeansProjects\\luceneDemo\\Data";
    Indexer indexer;
    Searcher searcher;

    public static void main(String[] args) {
        LuceneTester tester;
        try {
            tester = new LuceneTester();
            tester.createIndex();
            tester.search("DataGuides");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }

    private void createIndex() throws IOException {
        indexer = new Indexer(indexDir);
        int numIndexed;
        long startTime = System.currentTimeMillis();
        numIndexed = indexer.createIndex(dataDir, new TextFileFilter());
        long endTime = System.currentTimeMillis();
        indexer.close();
        System.out.println(numIndexed + " File indexed, time taken: "
                + (endTime - startTime) + " ms");
    }
I am getting the document score at the end of the search function below:
    private void search(String searchQuery) throws IOException, ParseException {
        searcher = new Searcher(indexDir);
        long startTime = System.currentTimeMillis();
        TopDocs hits = searcher.search(searchQuery);
        long endTime = System.currentTimeMillis();
        System.out.println(hits.totalHits +
                " documents found. Time :" + (endTime - startTime));
        for (ScoreDoc scoreDoc : hits.scoreDocs) {
            Document doc = searcher.getDocument(scoreDoc);
            System.out.println(scoreDoc.score + " File: "
                    + doc.get(LuceneConstants.FILE_PATH));
        }
        searcher.close();
    }
}
I have googled it and found this:
how can I implement the tf-idf and cosine similarity in Lucene?
Any help will be highly appreciated :)
As of Lucene 6.0, the default similarity implementation is BM25Similarity, which implements BM25.
If you want to use the old standard similarity implementation, use ClassicSimilarity.
For a comparison of the two, you might check out:
Doug Turnbull's BM25 The Next Generation of Lucene Relevance
ElasticSearch's BM25 vs Lucene Default Similarity
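If you do want the old TF-IDF behaviour back, the similarity has to be set in both places, at index time and at search time. Here is a minimal sketch against the raw Lucene 6.x API (the question's Indexer/Searcher wrappers would need to pass these through; the directory and analyzer setup are assumptions):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.similarities.ClassicSimilarity;

// Index time: make the writer score with TF-IDF (ClassicSimilarity) instead of BM25.
IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
config.setSimilarity(new ClassicSimilarity());
IndexWriter writer = new IndexWriter(directory, config);

// Search time: the searcher must use the same similarity, or scores will be inconsistent.
IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(directory));
searcher.setSimilarity(new ClassicSimilarity());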
As I was going through some details on http://lucene.apache.org/, I found that the Lucene scoring model by default used the DefaultSimilarity class, which extends TFIDFSimilarity (http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html).
So the documentation states that the scoring model by default implements tf-idf and cosine similarity. Anyway, I may be wrong, so feel free to correct me :)

Need to pass test case in QC through Java

Could anyone help me with the issue below?
I want to pass test cases in QC through Java. I used com4j and got as far as the test sets, but I am unable to fetch the test cases under the respective test set.
Could anyone please tell me how to pass test cases in QC through com4j?
import com.qc.ClassFactory;
import com.qc.ITDConnection;
import com.qc.ITestLabFolder;
import com.qc.ITestSetFactory;
import com.qc.ITestSetTreeManager;
import com.qc.ITestSetFolder;
import com.qc.IList;
import com.qc.ITSTest;
import com.qc.ITestSet;
import com.qc.ITestFactory;
import com4j.*;
import com4j.stdole.*;
import com4j.tlbimp.*;
import com4j.tlbimp.def.*;
import com4j.tlbimp.driver.*;
import com4j.util.*;
import com4j.COM4J;
import java.util.*;
import com.qc.IRun;
import com.qc.IRunFactory;

public class Qc_Connect {

    public static void main(String[] args) {
        String url = "http://abc/qcbin/";
        String domain = "abc";
        String project = "xyz";
        String username = "132222";
        String password = "Xyz";
        String strTestLabPath = "Root\\Test\\";
        String strTestSetName = "TestQC";
        try {
            ITDConnection itd = ClassFactory.createTDConnection();
            itd.initConnectionEx(url);
            System.out.println("Connected To QC:" + itd.connected());
            itd.connectProjectEx(domain, project, username, password);
            System.out.println("Logged into QC");

            ITestSetFactory objTestSetFactory = (itd.testSetFactory()).queryInterface(ITestSetFactory.class);
            ITestSetTreeManager objTestSetTreeManager = (itd.testSetTreeManager()).queryInterface(ITestSetTreeManager.class);
            ITestSetFolder objTestSetFolder = (objTestSetTreeManager.nodeByPath(strTestLabPath)).queryInterface(ITestSetFolder.class);
            IList its1 = objTestSetFolder.findTestSets(strTestSetName, true, null);
            System.out.println("No. of Test Set:" + its1.count());

            ITestSet tst = (ITestSet) objTestSetFolder.findTestSets(strTestSetName, true, null).queryInterface(ITSTest.class);
            System.out.println(tst.name());
            //System.out.println(its1.queryInterface(ITestSet.class).name());
            /* foreach (ITestSet testSet : its1.queryInterface(ITestSet.class)) {
                ITestSetFolder tsFolder = (ITestSetFolder) testSet.TestSetFolder;
                ITSTestFactory tsTestFactory = (ITSTestFactory) testSet.TSTestFactory;
                List tsTestList = tsTestFactory.NewList("");
            } */
            /* Com4jObject comObj = (Com4jObject) its1.item(0);
            ITestSet tst = comObj.queryInterface(ITestSet.class);
            System.out.println("Test Set Name : " + tst.name());
            System.out.println("Test Set ID : " + tst.id());
            System.out.println("Test Set ID : " + tst.status());
            System.out.println("Test Set ID : "); */

            System.out.println(its1.count());
            System.out.println("TestSet Present");
            Iterator itr = its1.iterator();
            System.out.println(itr.hasNext());
            while (itr.hasNext()) {
                Com4jObject comObj = (Com4jObject) itr.next();
                ITestSet sTestSet = comObj.queryInterface(ITestSet.class);
                System.out.println(sTestSet.name());
                Com4jObject comObj2 = sTestSet.tsTestFactory();
                ITestSetFactory test = comObj2.queryInterface(ITestSetFactory.class);
            }

            // ITSTest tsTest = null;
            /* comObj = (Com4jObject) its1.item(1);
            ITSTest tst2 = comObj.queryInterface(ITSTest.class);
            System.out.println(tst2.name()); */
            /* foreach (ITSTest tsTest : tst2) {
                IRun lastRun = (IRun) tsTest.lastRun();
                if (lastRun == null) {
                    IRunFactory runFactory = (IRunFactory) tsTest.runFactory;
                    String date = "20160203";
                    IRun run = (IRun) runFactory.addItem(date);
                    run.status("Pass");
                    run.autoPost();
                }
            } */
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
I know the post is quite old. I had to struggle a lot with OTA in Java and couldn't find a complete post on solving the issue.
Now I have working code after a lot of research, so I thought I'd share it in case someone is looking for help.
Here is the complete solution:
ITestFactory sTestFactory = (connection.testFactory())
        .queryInterface(ITestFactory.class);
ITest iTest1 = (sTestFactory.item(12081)).queryInterface(ITest.class);
System.out.println(iTest1.execDate());
System.out.println(iTest1.name());

ITestSetFactory sTestSetFactory = (connection.testSetFactory())
        .queryInterface(ITestSetFactory.class);
ITestSet sTestSet = (sTestSetFactory.item(1402))
        .queryInterface(ITestSet.class);
System.out.println(sTestSet.name() + "\n Test Set ID" + sTestSet.id());

IBaseFactory testFactory1 = sTestSet.tsTestFactory().queryInterface(
        IBaseFactory.class);
testFactory1.addItem(iTest1);
System.out.println("Test case has been Added");
System.out.println(testFactory1.newList("").count());

IList tsTestlist = testFactory1.newList("");
ITSTest tsTest;
for (int tsTestIndex = 1; tsTestIndex <= tsTestlist.count(); tsTestIndex++) {
    Com4jObject comObj = (Com4jObject) tsTestlist.item(tsTestIndex);
    tsTest = comObj.queryInterface(ITSTest.class);
    if (tsTest.name().equalsIgnoreCase("[3]TC_OTA_API_Test")) {
        System.out.println("Hostname" + tsTest.hostName() + "\n"
                + tsTest.name() + "\n" + tsTest.status());
        IRun lastRun = (IRun) tsTest.lastRun();
        // don't update test if it may have been modified by someone else
        if (lastRun == null) {
            System.out.println("I am here last Run = Null");
            IRunFactory runFactory = tsTest.runFactory().queryInterface(
                    IRunFactory.class);
            System.out.println(runFactory.newList("").count());
            String runName = "TestRun_Automated";
            Com4jObject comObjRunForThisTS = runFactory.addItem(runName);
            IRun runObjectForThisTS = comObjRunForThisTS
                    .queryInterface(IRun.class);
            runObjectForThisTS.status("Passed");
            runObjectForThisTS.post();
            runObjectForThisTS.refresh();
        }
    }
}
Why not build a client to access the REST API instead of passing through the OTA interface?
Once you build a basic client, you can post runs and update their status quite easily.
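As a rough illustration of that suggestion (the endpoint paths follow the ALM REST documentation, but the host, domain, project, credentials, and run XML below are placeholders, and real code needs proper cookie and error handling):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class AlmRestSketch {
    public static void main(String[] args) throws Exception {
        String base = "http://abc/qcbin";

        // 1. Authenticate with Basic auth; ALM answers with an LWSSO session cookie.
        HttpURLConnection auth = (HttpURLConnection)
                new URL(base + "/authentication-point/authenticate").openConnection();
        String credentials = Base64.getEncoder().encodeToString("132222:Xyz".getBytes("UTF-8"));
        auth.setRequestProperty("Authorization", "Basic " + credentials);
        if (auth.getResponseCode() != 200) {
            throw new IllegalStateException("Authentication failed: " + auth.getResponseCode());
        }
        String cookie = auth.getHeaderField("Set-Cookie");

        // 2. POST a run entity; the fields your project requires depend on its workflow.
        String runXml = "<Entity Type=\"run\"><Fields>"
                + "<Field Name=\"name\"><Value>TestRun_Automated</Value></Field>"
                + "<Field Name=\"status\"><Value>Passed</Value></Field>"
                + "</Fields></Entity>";
        HttpURLConnection post = (HttpURLConnection)
                new URL(base + "/rest/domains/abc/projects/xyz/runs").openConnection();
        post.setRequestMethod("POST");
        post.setDoOutput(true);
        post.setRequestProperty("Content-Type", "application/xml");
        post.setRequestProperty("Cookie", cookie);
        try (OutputStream out = post.getOutputStream()) {
            out.write(runXml.getBytes("UTF-8"));
        }
        System.out.println("Run POST returned HTTP " + post.getResponseCode());
    }
}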
If you use C#/VB.NET this is easily done. But since you are working in Java, I would suggest putting an interface layer above the DLLs to handle these operations. That will be much easier than using com4j.
Similar query; the following may help you. I would suggest dropping the idea of using com4j and using the solution provided in the thread below, which is proven, fail-safe and auto-recoverable:
QC API JAR to connect using java
It has always been difficult to use com4j, especially for HP QC/ALM, as the DLLs for QC are faulty and there are memory leak/allocation problems that frequently crash DLL executions on certain platforms.
