Printing result into two columns error for java - java

Its just as the title stated I want to print something into two columns but I seem to be running to an error.
int noOfHorseInList = 7;
String horseName[] = {"Blitz", "Sparky", "Sandy", "Rolo", "Beano",
"Flash", "", "", ""};
int horseAge[] = {2, 4, 1, 4, 4, 2, 0, 0, 0};
int timeTakenInSeconds[] = {600, -1, 500, 430, 412, -1, 0, 0, 0};
int option = 0;
System.out.println("Please choose from the following selection");
System.out.println("1. Show horses that finished\n2. Show horse details\n3. Adding a Late Entrant");
option = kb.nextInt();
kb.hasNextLine();
switch(option)
{
case 1:
//Finishers
String showHorseThatFinished = "";
showHorseThatFinished = finishers(noOfHorseInList, horseName, timeTakenInSeconds);
String timeOfHorse = "";
for(int i = 0; i < noOfHorseInList; i++)
{
if(timeTakenInSeconds[i] > -1)
{
timeOfHorse = timeOfHorse + "\n" + timeTakenInSeconds[i];
}
}
System.out.println("Horses Name\tFinishing Time");
System.out.println(showHorseThatFinished + "\t" + timeOfHorse);
break;
case 2:
//Horse details
break;
case 3:
//Adding Late Entrant
break;
}
}
public static String finishers(int noOfHorseInList,String horseName[],int timeTakenInSeconds[])
{
String showHorseThatFinished = "";
for(int i = 0; i < noOfHorseInList; i++)
{
if(timeTakenInSeconds[i] > -1)
{
showHorseThatFinished = showHorseThatFinished + "\n" + horseName[i];
}
}
return showHorseThatFinished;
}
This is my output, I want the values into two columns next to each other

You are actually calculating too much...
take a look at this method:
finishers(int noOfHorseInList, String horseName[], int timeTakenInSeconds[])
modify it like this:
public static String finishers(int noOfHorseInList, String horseName[], int timeTakenInSeconds[]) {
String showHorseThatFinished = "";
for (int i = 0; i < noOfHorseInList; i++) {
if (timeTakenInSeconds[i] > -1) {
showHorseThatFinished = showHorseThatFinished + "\n" + horseName[i] + "\t\t\t" + timeTakenInSeconds[i];
}
}
return showHorseThatFinished;
}
and your table will be done with no need of the
for(int i = 0; i < noOfHorseInList; i++)
{
in the case 1:

Append horse names and times in the same function.
Use StringBuffer for appending. It's more efficient.
public static String finishers(int noOfHorseInList,String horseName[],int timeTakenInSeconds[]) {
StringBuffer showHorseThatFinished = new StringBuffer();
for(int i = 0; i < noOfHorseInList; i++) {
// You need all horses to be written. Mark somehow that their time is erroneous
String time = "ERROR";
if (timeTakenInSeconds[i] > -1) {
time = "" + timeTakenInSeconds[i];
}
showHorseThatFinished.append(horseName[i]);
showHorseThatFinished.append("\t");
showHorseThatFinished.append(time);
showHorseThatFinished.append("\n");
}
}
return showHorseThatFinished.toString();
}
There is no need for this piece of code anymore
String timeOfHorse = "";
for(int i = 0; i < noOfHorseInList; i++)
{
if(timeTakenInSeconds[i] > -1)
{
timeOfHorse = timeOfHorse + "\n" + timeTakenInSeconds[i];
}
}
Just
System.out.println(showHorseThatFinished);

How about changing the following line in the for loop to -
timeOfHorse = timeOfHorse + "\t" + timeTakenInSeconds[i];

Related

Tiny GP output in text file

I've recently stumbled upon Tiny GP (A Genetic Programming program), and I found it pretty useful, so I decided to change all System.out.println() in the program to a write to text file method.
Problem: In the text file, for some reason, only says "PROBLEM SOLVED", instead of printing out generations and other things that it is supposed to (see code).
Tiny GP modified class file:
package main;
/*
* Program: tiny_gp.java
*
* Author: Riccardo Poli (email: rpoli#essex.ac.uk)
*
* Modified by Preston Tang
*/
import java.util.*;
import java.io.*;
import java.text.DecimalFormat;
public class tiny_gp {
String Name;
double[] fitness;
char[][] pop;
static Random rd = new Random();
static final int ADD = 110,
SUB = 111,
MUL = 112,
DIV = 113,
FSET_START = ADD,
FSET_END = DIV;
static double[] x = new double[FSET_START];
static double minrandom, maxrandom;
static char[] program;
static int PC;
static int varnumber, fitnesscases, randomnumber;
static double fbestpop = 0.0, favgpop = 0.0;
static long seed;
static double avg_len;
static final int MAX_LEN = 10000,
POPSIZE = 100000,
DEPTH = 5,
GENERATIONS = 100,
TSIZE = 2;
public static final double PMUT_PER_NODE = 0.05,
CROSSOVER_PROB = 0.9;
public static double[][] targets;
public double run() {
/* Interpreter */
char primitive = program[PC++];
if (primitive < FSET_START) {
return (x[primitive]);
}
switch (primitive) {
case ADD:
return (run() + run());
case SUB:
return (run() - run());
case MUL:
return (run() * run());
case DIV: {
double num = run(), den = run();
if (Math.abs(den) <= 0.001) {
return (num);
} else {
return (num / den);
}
}
}
return (0.0); // should never get here
}
public int traverse(char[] buffer, int buffercount) {
if (buffer[buffercount] < FSET_START) {
return (++buffercount);
}
switch (buffer[buffercount]) {
case ADD:
case SUB:
case MUL:
case DIV:
return (traverse(buffer, traverse(buffer, ++buffercount)));
}
return (0); // should never get here
}
public void setup_fitness(String fname) {
try {
int i, j;
String line;
BufferedReader in
= new BufferedReader(
new FileReader(fname));
line = in.readLine();
StringTokenizer tokens = new StringTokenizer(line);
varnumber = Integer.parseInt(tokens.nextToken().trim());
randomnumber = Integer.parseInt(tokens.nextToken().trim());
minrandom = Double.parseDouble(tokens.nextToken().trim());
maxrandom = Double.parseDouble(tokens.nextToken().trim());
fitnesscases = Integer.parseInt(tokens.nextToken().trim());
targets = new double[fitnesscases][varnumber + 1];
if (varnumber + randomnumber >= FSET_START) {
Write("too many variables and constants");
//System.out.println("too many variables and constants");
}
for (i = 0; i < fitnesscases; i++) {
line = in.readLine();
tokens = new StringTokenizer(line);
for (j = 0; j <= varnumber; j++) {
targets[i][j] = Double.parseDouble(tokens.nextToken().trim());
}
}
in.close();
} catch (FileNotFoundException e) {
Write("ERROR: Please provide a data file");
//System.out.println("ERROR: Please provide a data file");
System.exit(0);
} catch (Exception e) {
Write("ERROR: Incorrect data format");
//System.out.println("ERROR: Incorrect data format");
System.exit(0);
}
}
public double fitness_function(char[] Prog) {
int i = 0, len;
double result, fit = 0.0;
len = traverse(Prog, 0);
for (i = 0; i < fitnesscases; i++) {
for (int j = 0; j < varnumber; j++) {
x[j] = targets[i][j];
}
program = Prog;
PC = 0;
result = run();
fit += Math.abs(result - targets[i][varnumber]);
}
return (-fit);
}
public int grow(char[] buffer, int pos, int max, int depth) {
char prim = (char) rd.nextInt(2);
int one_child;
if (pos >= max) {
return (-1);
}
if (pos == 0) {
prim = 1;
}
if (prim == 0 || depth == 0) {
prim = (char) rd.nextInt(varnumber + randomnumber);
buffer[pos] = prim;
return (pos + 1);
} else {
prim = (char) (rd.nextInt(FSET_END - FSET_START + 1) + FSET_START);
switch (prim) {
case ADD:
case SUB:
case MUL:
case DIV:
buffer[pos] = prim;
one_child = grow(buffer, pos + 1, max, depth - 1);
if (one_child < 0) {
return (-1);
}
return (grow(buffer, one_child, max, depth - 1));
}
}
return (0); // should never get here
}
public int print_indiv(char[] buffer, int buffercounter) {
int a1 = 0, a2;
if (buffer[buffercounter] < FSET_START) {
if (buffer[buffercounter] < varnumber) {
Write("X" + (buffer[buffercounter] + 1) + " ");
//System.out.print("X" + (buffer[buffercounter] + 1) + " ");
} else {
WriteDouble(x[buffer[buffercounter]]);
//System.out.print(x[buffer[buffercounter]]);
}
return (++buffercounter);
}
switch (buffer[buffercounter]) {
case ADD:
Write("(");
//System.out.print("(");
a1 = print_indiv(buffer, ++buffercounter);
Write(" + ");
//System.out.print(" + ");
break;
case SUB:
Write("(");
//System.out.print("(");
a1 = print_indiv(buffer, ++buffercounter);
Write(" - ");
//System.out.print(" - ");
break;
case MUL:
Write("(");
//System.out.print("(");
a1 = print_indiv(buffer, ++buffercounter);
Write(" * ");
//System.out.print(" * ");
break;
case DIV:
Write("(");
//System.out.print("(");
a1 = print_indiv(buffer, ++buffercounter);
Write(" / ");
//System.out.print(" / ");
break;
}
a2 = print_indiv(buffer, a1);
Write(")");
//System.out.print(")");
return (a2);
}
public static char[] buffer = new char[MAX_LEN];
public char[] create_random_indiv(int depth) {
char[] ind;
int len;
len = grow(buffer, 0, MAX_LEN, depth);
while (len < 0) {
len = grow(buffer, 0, MAX_LEN, depth);
}
ind = new char[len];
System.arraycopy(buffer, 0, ind, 0, len);
return (ind);
}
public char[][] create_random_pop(int n, int depth, double[] fitness) {
char[][] pop = new char[n][];
int i;
for (i = 0; i < n; i++) {
pop[i] = create_random_indiv(depth);
fitness[i] = fitness_function(pop[i]);
}
return (pop);
}
public void stats(double[] fitness, char[][] pop, int gen) {
int i, best = rd.nextInt(POPSIZE);
int node_count = 0;
fbestpop = fitness[best];
favgpop = 0.0;
for (i = 0; i < POPSIZE; i++) {
node_count += traverse(pop[i], 0);
favgpop += fitness[i];
if (fitness[i] > fbestpop) {
best = i;
fbestpop = fitness[i];
}
}
avg_len = (double) node_count / POPSIZE;
favgpop /= POPSIZE;
Write("Generation=" + gen + " Avg Fitness=" + (-favgpop)
+ " Best Fitness=" + (-fbestpop) + " Avg Size=" + avg_len
+ "\nBest Individual: ");
//System.out.print("Generation=" + gen + " Avg Fitness=" + (-favgpop)
// + " Best Fitness=" + (-fbestpop) + " Avg Size=" + avg_len
// + "\nBest Individual: ");
print_indiv(pop[best], 0);
Write("\n");
//System.out.print("\n");
//System.out.flush();
}
public int tournament(double[] fitness, int tsize) {
int best = rd.nextInt(POPSIZE), i, competitor;
double fbest = -1.0e34;
for (i = 0; i < tsize; i++) {
competitor = rd.nextInt(POPSIZE);
if (fitness[competitor] > fbest) {
fbest = fitness[competitor];
best = competitor;
}
}
return (best);
}
public int negative_tournament(double[] fitness, int tsize) {
int worst = rd.nextInt(POPSIZE), i, competitor;
double fworst = 1e34;
for (i = 0; i < tsize; i++) {
competitor = rd.nextInt(POPSIZE);
if (fitness[competitor] < fworst) {
fworst = fitness[competitor];
worst = competitor;
}
}
return (worst);
}
public char[] crossover(char[] parent1, char[] parent2) {
int xo1start, xo1end, xo2start, xo2end;
char[] offspring;
int len1 = traverse(parent1, 0);
int len2 = traverse(parent2, 0);
int lenoff;
xo1start = rd.nextInt(len1);
xo1end = traverse(parent1, xo1start);
xo2start = rd.nextInt(len2);
xo2end = traverse(parent2, xo2start);
lenoff = xo1start + (xo2end - xo2start) + (len1 - xo1end);
offspring = new char[lenoff];
System.arraycopy(parent1, 0, offspring, 0, xo1start);
System.arraycopy(parent2, xo2start, offspring, xo1start,
(xo2end - xo2start));
System.arraycopy(parent1, xo1end, offspring,
xo1start + (xo2end - xo2start),
(len1 - xo1end));
return (offspring);
}
public char[] mutation(char[] parent, double pmut) {
int len = traverse(parent, 0), i;
int mutsite;
char[] parentcopy = new char[len];
System.arraycopy(parent, 0, parentcopy, 0, len);
for (i = 0; i < len; i++) {
if (rd.nextDouble() < pmut) {
mutsite = i;
if (parentcopy[mutsite] < FSET_START) {
parentcopy[mutsite] = (char) rd.nextInt(varnumber + randomnumber);
} else {
switch (parentcopy[mutsite]) {
case ADD:
case SUB:
case MUL:
case DIV:
parentcopy[mutsite]
= (char) (rd.nextInt(FSET_END - FSET_START + 1)
+ FSET_START);
}
}
}
}
return (parentcopy);
}
public void print_parms() {
Write("-- TINY GP (Java version) --\n");
//System.out.print("-- TINY GP (Java version) --\n");
Write("SEED=" + seed + "\nMAX_LEN=" + MAX_LEN
+ "\nPOPSIZE=" + POPSIZE + "\nDEPTH=" + DEPTH
+ "\nCROSSOVER_PROB=" + CROSSOVER_PROB
+ "\nPMUT_PER_NODE=" + PMUT_PER_NODE
+ "\nMIN_RANDOM=" + minrandom
+ "\nMAX_RANDOM=" + maxrandom
+ "\nGENERATIONS=" + GENERATIONS
+ "\nTSIZE=" + TSIZE
+ "\n----------------------------------\n");
// System.out.print("SEED=" + seed + "\nMAX_LEN=" + MAX_LEN
// + "\nPOPSIZE=" + POPSIZE + "\nDEPTH=" + DEPTH
// + "\nCROSSOVER_PROB=" + CROSSOVER_PROB
// + "\nPMUT_PER_NODE=" + PMUT_PER_NODE
// + "\nMIN_RANDOM=" + minrandom
// + "\nMAX_RANDOM=" + maxrandom
// + "\nGENERATIONS=" + GENERATIONS
// + "\nTSIZE=" + TSIZE
// + "\n----------------------------------\n");
}
public tiny_gp(String fname, long s) {
fitness = new double[POPSIZE];
seed = s;
if (seed >= 0) {
rd.setSeed(seed);
}
setup_fitness(fname);
for (int i = 0; i < FSET_START; i++) {
x[i] = (maxrandom - minrandom) * rd.nextDouble() + minrandom;
}
pop = create_random_pop(POPSIZE, DEPTH, fitness);
}
public void evolve() {
int gen = 0, indivs, offspring, parent1, parent2, parent;
double newfit;
char[] newind;
print_parms();
stats(fitness, pop, 0);
for (gen = 1; gen < GENERATIONS; gen++) {
if (fbestpop > -1e-5) {
Write("PROBLEM SOLVED\n");
//System.out.print("PROBLEM SOLVED\n");
System.exit(0);
}
for (indivs = 0; indivs < POPSIZE; indivs++) {
if (rd.nextDouble() < CROSSOVER_PROB) {
parent1 = tournament(fitness, TSIZE);
parent2 = tournament(fitness, TSIZE);
newind = crossover(pop[parent1], pop[parent2]);
} else {
parent = tournament(fitness, TSIZE);
newind = mutation(pop[parent], PMUT_PER_NODE);
}
newfit = fitness_function(newind);
offspring = negative_tournament(fitness, TSIZE);
pop[offspring] = newind;
fitness[offspring] = newfit;
}
stats(fitness, pop, gen);
}
Write("PROBLEM *NOT* SOLVED\n");
//System.out.print("PROBLEM *NOT* SOLVED\n");
System.exit(1);
}
public void Write(String context) {
FileWriter fileWriter;
try {
fileWriter = new FileWriter("GP.txt");
try (BufferedWriter bufferedWriter = new BufferedWriter(fileWriter)) {
bufferedWriter.write(context);
}
} catch (IOException ex) {
}
}
public void WriteDouble(double context) {
FileWriter fileWriter;
try {
fileWriter = new FileWriter("GP.txt");
try (BufferedWriter bufferedWriter = new BufferedWriter(fileWriter)) {
String ncontext = Double.toString(context);
bufferedWriter.write(ncontext);
}
} catch (IOException ex) {
}
}
};
The Functions Mapper file that uses the Tiny GP class file:
package function_mapper;
import javax.swing.JOptionPane;
import main.*;
public class Function_Mapper {
public static void main(String[] args) {
String fname = JOptionPane.showInputDialog(null, "File Name", "Input Dialog", JOptionPane.INFORMATION_MESSAGE);
long s = -1;
if (args.length == 2) {
s = Integer.valueOf(args[0]).intValue();
fname = args[1];
}
if (args.length == 1) {
fname = args[0];
}
tiny_gp gp = new tiny_gp(fname, s);
gp.evolve();
}
}
Much help appreciated, thanks!
The Write method overwrites the contents of the file on each invocation. There are two ways to fix this.
An easier one, is to append file, instead of overwriting it. It could be achieved by passing append argument to the FileWriter (I simplified code a little bit along the way).
// true on the next line means "append"
try (Writer writer = new FileWriter("GP.txt", true)) {
writer.write(Double.toString(context));
} catch (IOException ex) {
}
A harder, but much fore efficient one is to openwriter in the constructor, use it in Write method, and close in the specially introduced close method of the tiny_gp.

Error message for StorageResource type

I've been trying to work on this problem for a while now but to no avail. When I run the code I get this error message: incompatible types: edu.duke.StorageResource cannot be converted to java.lang.String on line String geneList = FMG.storeAll(dna);. Does this mean I'm trying to make edu.duke object work with a java.lang.String type object? What would we go about resolving this issue?
Here's my code so far:
package coursera_java_duke;
import java.io.*;
import edu.duke.FileResource;
import edu.duke.StorageResource;
import edu.duke.DirectoryResource;
public class FindMultiGenes5 {
public int findStopIndex(String dna, int index) {
int stop1 = dna.indexOf("TGA", index);
if (stop1 == -1 || (stop1 - index) % 3 != 0) {
stop1 = dna.length();
}
int stop2 = dna.indexOf("TAA", index);
if (stop2 == -1 || (stop2 - index) % 3 != 0) {
stop2 = dna.length();
}
int stop3 = dna.indexOf("TAG", index);
if (stop3 == -1 || (stop3 - index) % 3 != 0) {
stop3 = dna.length();
}
return Math.min(stop1, Math.min(stop2, stop3));
}
public StorageResource storeAll(String dna) {
//CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCAdna = "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
String geneAL = new String();
String sequence = dna.toUpperCase();
StorageResource store = new StorageResource();
int index = 0;
while (true) {
index = sequence.indexOf("ATG", index);
if (index == -1)
break;
int stop = findStopIndex(sequence, index + 3);
if (stop != sequence.length()) {
String gene = dna.substring(index, stop + 3);
store.add(gene);
//index = sequence.substring(index, stop + 3).length();
index = stop + 3; // start at the end of the stop codon
}else{ index = index + 3;
}
}
return store;//System.out.println(sequence);
}
public void testStorageFinder() {
DirectoryResource dr = new DirectoryResource();
StorageResource dnaStore = new StorageResource();
for (File f : dr.selectedFiles()) {
FileResource fr = new FileResource(f);
String s = fr.asString();
dnaStore = storeAll(s);
printGenes(dnaStore);
}
System.out.println("size = " + dnaStore.size());
}
public String readStrFromFile(){
FileResource readFile = new FileResource();
String DNA = readFile.asString();
//System.out.println("DNA: " + DNA);
return DNA;
}//end readStrFromFile() method;
public float calCGRatio(String gene){
gene = gene.toUpperCase();
int len = gene.length();
int CGCount = 0;
for(int i=0; i<len; i++){
if(gene.charAt(i) == 'C' || gene.charAt(i) == 'G')
CGCount++;
}//end for loop
System.out.println("CGCount " + CGCount + " Length: " + len + " Ratio: " + (float)CGCount/len);
return (float)CGCount/len;
}//end of calCGRatio() method;
public void printGenes(StorageResource sr){
//create a FindMultiGenesFile object FMG
FindMultiGenes5 FMG = new FindMultiGenes5();
//read a DNA sequence from file
String dna = FMG.readStrFromFile();
String geneList = FMG.storeAll(dna);
//store all genes into a document
StorageResource dnaStore = new StorageResource();
System.out.println("\n There are " + geneList.size() + " genes. ");
int longerthan60 = 0;
int CGGreaterthan35 = 0;
for(int i=0; i<geneList.size(); i++){
if(!dnaStore.contains(geneList.get(i)))
dnaStore.add(geneList.get(i));
if(geneList.get(i).length() > 60) longerthan60++;
if(FMG.calCGRatio(geneList.get(i)) > 0.35) CGGreaterthan35++;
}
System.out.println("dnaStore.size: " + dnaStore.size());
System.out.println("\n There are " + dnaStore.size() + " genes. ");
System.out.println("There are " + longerthan60 + " genes longer than 60.");
System.out.println("There are " + CGGreaterthan35 + " genes with CG ratio greater than 0.35.");
}//end main();
}
I found your post as I am also doing a similar course at Duke using those edu.duke libraries.
When I get that error message it is because I'm using the wrong method to access it.
Try FMD.data() to get an iterable of all of the gene strings.

Split 2 string and join

I have 2 string which I want to join as per my requirements. Say I have
String sa = {"as,asd,asdf"};
String qw = {"12,123,1234"};
String[] separated = ItemSumm.split(",");
String[] separateds = Itemumm.split(",");
StringBuffer sb = new StringBuffer();
for (int i = 0; i < separateds.length; i++)
{
if (separated.length == i + 1)
{
sb.append(separated[i] + "(" + separateds[i] + ")");
} else
{
sb.append(separated[i] + "(" + separateds[i] + "),");
}
}
deleteListItem.list_summ.setText(sb.toString());
it gives as(12),asd(123),asdf(1234)
But problem is , it can be like
String sa = {"as,asdf"};
String qw = {"12,123,1234"};
So in this I want like
as(12),asdf(123),1234
Try this code :
String sa = {"as,asd"};
String qw = {"12,123,1234"};
String[] separated = ItemSumm.split(",");
String[] separateds = Itemumm.split(",");
StringBuffer sb = new StringBuffer();
for (int i = 0; i < separateds.length; i++) {
if (separated.length == i + 1) {
if(separated.length == i) {
sb.append(separateds[i] + "");
} else {
sb.append(separated[i] + "(" + separateds[i] + ")");
}
} else {
if(separated.length == i) {
sb.append("," + separateds[i]);
} else {
sb.append(separated[i] + "(" + separateds[i] + "),");
}
}
}
deleteListItem.list_summ.setText(sb.toString());
// Answer : as(12),asd(123),1234
String sa = {"as,asd,asdf"};
String qw = {"12,123,1234"};
String[] separated = ItemSumm.split(",");
String[] separateds = Itemumm.split(",");
StringBuffer sb = new StringBuffer();
// first loop through separated, starting with a comma
for (int i = 0; i < separated.length; i++) {
sb.append(",").append(separated[i]).append("(").append(separateds[i]).append(")"));
}
// append remaining items in separateds
for (int i = separated.length; i < separateds.length; i++) {
sb.append(",").append(separateds[i]);
}
deleteListItem.list_summ.setText(sb.toString().substring(1)); // remove starting comma
if the lenghts of the strings are the sa, do the join
if (separated.length == i + 1 && (separated[i].lenght == separateds[i].lenght))

Sentence comparison with NLP

I used lingpipe for sentence detection but I don't have any idea if there is a better tool. As far as I have understood, there is no way to compare two sentences and see if they mean the same thing.
Is there anyother good source where I can have a pre-built method for comparing two sentences and see if they are similar?
My requirement is as below:
String sent1 = "Mary and Meera are my classmates.";
String sent2 = "Meera and Mary are my classmates.";
String sent3 = "I am in Meera and Mary's class.";
// several sentences will be formed and basically what I need to do is
// this
boolean bothAreEqual = compareOf(sent1, sent2);
sop(bothAreEqual); // should print true
boolean bothAreEqual = compareOf(sent2, sent3);
sop(bothAreEqual);// should print true
How to test if the meaning of two sentences are the same: this would be a too open-ended question.
However, there are methods for comparing two sentences and see if they are similar. There are many possible definition for similarity that can be tested with pre-built methods.
See for example http://en.wikipedia.org/wiki/Levenshtein_distance
Distance between
'Mary and Meera are my classmates.'
and 'Meera and Mary are my classmates.':
6
Distance between
'Mary and Meera are my classmates.'
and 'Alice and Bobe are not my classmates.':
14
Distance between
'Mary and Meera are my classmates.'
and 'Some totally different sentence.':
29
code:
public class LevenshteinDistance {
private static int minimum(int a, int b, int c) {
return Math.min(Math.min(a, b), c);
}
public static int computeDistance(CharSequence str1,
CharSequence str2) {
int[][] distance = new int[str1.length() + 1][str2.length() + 1];
for (int i = 0; i <= str1.length(); i++){
distance[i][0] = i;
}
for (int j = 0; j <= str2.length(); j++){
distance[0][j] = j;
}
for (int i = 1; i <= str1.length(); i++){
for (int j = 1; j <= str2.length(); j++){
distance[i][j] = minimum(
distance[i - 1][j] + 1,
distance[i][j - 1] + 1,
distance[i - 1][j - 1]
+ ((str1.charAt(i - 1) == str2.charAt(j - 1)) ? 0 : 1));
}
}
int result = distance[str1.length()][str2.length()];
//log.debug("distance:"+result);
return result;
}
public static void main(String[] args) {
String sent1="Mary and Meera are my classmates.";
String sent2="Meera and Mary are my classmates.";
String sent3="Alice and Bobe are not my classmates.";
String sent4="Some totally different sentence.";
System.out.println("Distance between \n'"+sent1+"' \nand '"+sent2+"': \n"+computeDistance(sent1, sent2));
System.out.println("Distance between \n'"+sent1+"' \nand '"+sent3+"': \n"+computeDistance(sent1, sent3));
System.out.println("Distance between \n'"+sent1+"' \nand '"+sent4+"': \n"+computeDistance(sent1, sent4));
}
}
Here is wat i have come up with. this is just a substitute till i get to the real thing but it might be of some help to people out there..
package com.examples;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import com.aliasi.sentences.MedlineSentenceModel;
import com.aliasi.sentences.SentenceModel;
import com.aliasi.tokenizer.IndoEuropeanTokenizerFactory;
import com.aliasi.tokenizer.Tokenizer;
import com.aliasi.tokenizer.TokenizerFactory;
import com.aliasi.util.Files;
import com.sun.accessibility.internal.resources.accessibility;
public class SentenceWordAnalysisAndLevenshteinDistance {
private static int minimum(int a, int b, int c) {
return Math.min(Math.min(a, b), c);
}
public static int computeDistance(CharSequence str1, CharSequence str2) {
int[][] distance = new int[str1.length() + 1][str2.length() + 1];
for (int i = 0; i <= str1.length(); i++) {
distance[i][0] = i;
}
for (int j = 0; j <= str2.length(); j++) {
distance[0][j] = j;
}
for (int i = 1; i <= str1.length(); i++) {
for (int j = 1; j <= str2.length(); j++) {
distance[i][j] = minimum(
distance[i - 1][j] + 1,
distance[i][j - 1] + 1,
distance[i - 1][j - 1]
+ ((str1.charAt(i - 1) == str2.charAt(j - 1)) ? 0
: 1));
}
}
int result = distance[str1.length()][str2.length()];
return result;
}
static final TokenizerFactory TOKENIZER_FACTORY = IndoEuropeanTokenizerFactory.INSTANCE;
static final SentenceModel SENTENCE_MODEL = new MedlineSentenceModel();
public static void main(String[] args) {
try {
ArrayList<String> sentences = null;
sentences = new ArrayList<String>();
// Reading from text file
// sentences = readSentencesInFile("D:\\sam.txt");
// Giving sentences
// ArrayList<String> sentences = new ArrayList<String>();
sentences.add("Mary and Meera are my classmates.");
sentences.add("Mary and Meera are my classmates.");
sentences.add("Meera and Mary are my classmates.");
sentences.add("Alice and Bobe are not my classmates.");
sentences.add("Some totally different sentence.");
// Self-implemented
wordAnalyser(sentences);
// Internet referred
// levenshteinDistance(sentences);
} catch (Exception e) {
// TODO: handle exception
e.printStackTrace();
}
}
private static ArrayList<String> readSentencesInFile(String path) {
ArrayList<String> sentencesList = new ArrayList<String>();
try {
System.out.println("Reading file from : " + path);
File file = new File(path);
String text = Files.readFromFile(file, "ISO-8859-1");
System.out.println("INPUT TEXT: ");
System.out.println(text);
List<String> tokenList = new ArrayList<String>();
List<String> whiteList = new ArrayList<String>();
Tokenizer tokenizer = TOKENIZER_FACTORY.tokenizer(
text.toCharArray(), 0, text.length());
tokenizer.tokenize(tokenList, whiteList);
System.out.println(tokenList.size() + " TOKENS");
System.out.println(whiteList.size() + " WHITESPACES");
String[] tokens = new String[tokenList.size()];
String[] whites = new String[whiteList.size()];
tokenList.toArray(tokens);
whiteList.toArray(whites);
int[] sentenceBoundaries = SENTENCE_MODEL.boundaryIndices(tokens,
whites);
System.out.println(sentenceBoundaries.length
+ " SENTENCE END TOKEN OFFSETS");
if (sentenceBoundaries.length < 1) {
System.out.println("No sentence boundaries found.");
return new ArrayList<String>();
}
int sentStartTok = 0;
int sentEndTok = 0;
for (int i = 0; i < sentenceBoundaries.length; ++i) {
sentEndTok = sentenceBoundaries[i];
System.out.println("SENTENCE " + (i + 1) + ": ");
StringBuffer sentenceString = new StringBuffer();
for (int j = sentStartTok; j <= sentEndTok; j++) {
sentenceString.append(tokens[j] + whites[j + 1]);
}
System.out.println(sentenceString.toString());
sentencesList.add(sentenceString.toString());
sentStartTok = sentEndTok + 1;
}
} catch (IOException e) {
// TODO: handle exception
e.printStackTrace();
}
return sentencesList;
}
private static void levenshteinDistance(ArrayList<String> sentences) {
System.out.println("\nLevenshteinDistance");
for (int i = 0; i < sentences.size(); i++) {
System.out.println("Distance between \n'" + sentences.get(0)
+ "' \nand '" + sentences.get(i) + "': \n"
+ computeDistance(sentences.get(0),
sentences.get(i)));
}
}
private static void wordAnalyser(ArrayList<String> sentences) {
System.out.println("No.of Sentences : " + sentences.size());
List<String> stopWordsList = getStopWords();
List<String> tokenList = new ArrayList<String>();
ArrayList<List<String>> filteredSentences = new ArrayList<List<String>>();
for (int i = 0; i < sentences.size(); i++) {
tokenList = new ArrayList<String>();
List<String> whiteList = new ArrayList<String>();
Tokenizer tokenizer = TOKENIZER_FACTORY.tokenizer(sentences.get(i)
.toCharArray(), 0, sentences.get(i).length());
tokenizer.tokenize(tokenList, whiteList);
System.out.print("Sentence " + (i + 1) + ": " + tokenList.size()
+ " TOKENS, ");
System.out.println(whiteList.size() + " WHITESPACES");
filteredSentences.add(filterStopWords(tokenList, stopWordsList));
}
for (int i = 0; i < sentences.size(); i++) {
System.out.println("\n" + (i + 1) + ". Comparing\n '"
+ sentences.get(0) + "' \nwith\n '" +
sentences.get(i)
+ "' : \n");
System.out.println(filteredSentences.get(0) + "\n and \n"
+ filteredSentences.get(i));
System.out.println("Percentage of similarity: "
+ calculateSimilarity(filteredSentences.get(0),
filteredSentences.get(i))
+ "%");
}
}
private static double calculateSimilarity(List<String> list1,
List<String> list2) {
int length1 = list1.size();
int length2 = list2.size();
int count1 = 0;
int count2 = 0;
double result1 = 0.0;
double result2 = 0.0;
int least, highest;
if (length2 > length1) {
least = length1;
highest = length2;
} else {
least = length2;
highest = length1;
}
// computing result1
for (String string1 : list1) {
if (list2.contains(string1))
count1++;
}
result1 = (count1 * 100) / length1;
// computing result2
for (String string2 : list2) {
if (list1.contains(string2))
count2++;
}
result2 = (count2 * 100) / length2;
double avg = (result1 + result2) / 2;
return avg;
}
private static List<String> getStopWords() {
String stopWordsString = ".,a,able,about,across,after,all,almost,also,am,among,an,and,any,are,as,at,be,because,been,but,by,can,cannot,could,dear,did,do,does,either,else,ever,every,for,from,get,got,had,has,have,he,her,hers,him,his,how,however,i,if,in,into,is,it,its,just,least,let,like,likely,may,me,might,most,must,my,neither,no,nor,not,of,off,often,on,only,or,other,our,own,rather,said,say,says,she,should,since,so,some,than,that,the,their,them,then,there,these,they,this,tis,to,too,twas,us,wants,was,we,were,what,when,where,which,while,who,whom,why,will,with,would,yet,you,your";
List<String> stopWordsList = new ArrayList<String>();
List<String> stopWordTokenList = new ArrayList<String>();
List<String> whiteList = new ArrayList<String>();
Tokenizer tokenizer = TOKENIZER_FACTORY.tokenizer(
stopWordsString.toCharArray(), 0, stopWordsString.length());
tokenizer.tokenize(stopWordTokenList, whiteList);
for (int i = 0; i < stopWordTokenList.size(); i++) {
// System.out.println((i + 1) + ":" + tokenList.get(i));
if (!stopWordTokenList.get(i).equals(",")) {
stopWordsList.add(stopWordTokenList.get(i));
}
}
System.out.println("No.of stop words: " + stopWordsList.size());
return stopWordsList;
}
private static List<String> filterStopWords(List<String> tokenList,
List<String> stopWordsList) {
List<String> filteredSentenceWords = new ArrayList<String>();
for (String sentenceToken : tokenList) {
if (!stopWordsList.contains(sentenceToken)) {
filteredSentenceWords.add(sentenceToken);
}
}
return filteredSentenceWords;
}
}

Adding commas to output

This is what it does:
2,
4,
6,
8,
10
I would like for it to be horizontal:
2, 4, 6, 8, 10
try {
PrintWriter fout = new PrintWriter(new BufferedWriter(new FileWriter("numbers.dat")));
for(int i = start; i <= 100; i = i + 2) {
fout.println(i+",");
}
Another attempt
for(int i = start; i <= 100; i += 2)
System.out.print(i + (i > 98 ? "\n" : ", "));
In the write() method:
for(int i = start; i <= 100; i = i + 2) {
if (i > start) {
fout.print(",");
}
fout.println(i);
}
Then when you call output(), it will display as comma separated list. And for screen display
while((outputline = fin.readLine()) != null) {
System.out.print(outputline + " ");
}
System.out.println();
Alternatively, skip saving the comma in the file and displaying will be as follows
int count = 0;
while((outputline = fin.readLine()) != null) {
if (count > 0)
System.out.print(", ");
System.out.print(outputline);
count++;
}
System.out.println();
If you mean to do it on printing to your .dat file:
for(int i = start; i <= 100; i = i + 2) {
fout.print(i);
if(i < 100) {
fout.print(", ");
} else {
fout.println();
}
}
If it is when printing to the system output after reading your file, try this:
try {
BufferedReader fin = new BufferedReader(new FileReader("numbers.dat"));
String outputline = "";
StringBuilder result = new StringBuilder();
while((outputline = fin.readLine()) != null) {
result.append(outputline).append(", ");
}
int length = result.length();
result.delete(length - 2, length);
System.out.println(result);
fin.close();
}
This uses a StringBuilder to build up your result and then removes the last , when it is done.
Here is a very basic example of what you could do:
int max=11;
for(int i = 0; i < max; i += 2){
System.out.print(i);
if(i < max-2){
System.out.print(", ");
}
}
fout.println(i + ", ");
System.out.println(outputline + ", ");
This will give you commas after all the numbers but it looks like you don't need one after the final number.
You will have to remove the last comma either by using indexOf or checking for the last iteration of your loop.

Categories

Resources