java.lang.NullPointerException when outputting a term frequency-inverse document frequency (TF-IDF) matrix in Java

I have code that outputs the TF-IDF value for every word in each file in a directory. I am trying to put these values into a matrix where each row corresponds to a file in the directory and each column to a word in the files, but I am having difficulty doing so and need some help.
When I try to output the matrix, I get a java.lang.NullPointerException.
The values start to appear, but then they stop and the exception is thrown.
This is the code:
public class TestTF_IDF {
public static void main(String[] args) throws UnsupportedEncodingException, FileNotFoundException{
//Test code for TfIdf
TfIdf tf = new TfIdf("E:/Thesis/ThesisWork/data1");
//Contains file name being processed
//String file;
tf.buildAllDocuments();
int numDocuments = tf.documents.size();
Double matrix[][] = new Double[numDocuments][];
int documentIndex = 0;
for (String file : tf.documents.keySet())
{
// System.out.println("File \t" + file);
Map<String, Double[]> myMap =
tf.documents.get(file).getF_TF_TFIDF();
int numWords = myMap.size();
matrix[documentIndex] = new Double[numWords];
int wordIndex = 0;
for (String key : myMap.keySet())
{
Double[] values = myMap.get(key);
matrix[documentIndex][wordIndex] = values[2];
wordIndex++;
//System.out.print("file="+ file+ "term=" +key + values[2]+" ");
}
documentIndex++;
for(int i=0; i<numDocuments;i++){
for(int j=0; j<numWords;j++){
System.out.print("file="+ file+ matrix[i][j]+ " "); //error here
}
}
}
}//public static void main(String[] args)
}//public class TestTF_IDF
Any ideas? Thanks.

Although the question is remarkably unclear, here is what I tried to guess based on the question and the comments.
import java.util.Map;
public class TestTF_IDF
{
public static void main(String[] args) throws Exception
{
TfIdf tf = new TfIdf("E:/Thesis/ThesisWork/data1");
tf.buildAllDocuments();
int numDocuments = tf.documents.size();
Double[][][] matrix = new Double[numDocuments][][];
int documentIndex = 0;
for (String file : tf.documents.keySet())
{
System.out.println("File \t" + file);
Map<String, Double[]> myMap =
tf.documents.get(file).getF_TF_TFIDF();
int numWords = myMap.size();
matrix[documentIndex] = new Double[numWords][];
int wordIndex = 0;
for (String key : myMap.keySet())
{
Double[] values = myMap.get(key);
matrix[documentIndex][wordIndex] = values;
wordIndex++;
}
documentIndex++;
}
}
}
class Document
{
public Map<String, Double[]> getF_TF_TFIDF()
{
return null;
}
}
class TfIdf
{
public Map<String, Document> documents;
TfIdf(String s)
{
}
public void buildAllDocuments()
{
}
}
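A likely cause of the NullPointerException in the question's code is that the printing loops sit inside the per-file loop. They run over all numDocuments rows while only the rows up to the current file have been allocated, so matrix[i] is still null for later rows. They also use the current file's numWords as the bound for every row. The sketch below fills the matrix first and prints afterwards, bounding the inner loop by each row's own length. The class name TfIdfMatrixSketch, the method buildMatrix, and the nested map standing in for tf.documents are assumptions made for illustration, not part of the original TfIdf class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TfIdfMatrixSketch {

    // Builds one row per document. Each row holds the value at index 2
    // (the TF-IDF score in the question's layout) for every term of that document.
    static Double[][] buildMatrix(Map<String, Map<String, Double[]>> documents) {
        Double[][] matrix = new Double[documents.size()][];
        int documentIndex = 0;
        for (Map<String, Double[]> terms : documents.values()) {
            Double[] row = new Double[terms.size()];
            int wordIndex = 0;
            for (Double[] values : terms.values()) {
                row[wordIndex++] = values[2];
            }
            matrix[documentIndex++] = row;
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in data for tf.documents.
        Map<String, Map<String, Double[]>> documents = new LinkedHashMap<>();
        Map<String, Double[]> a = new LinkedHashMap<>();
        a.put("cat", new Double[] {1.0, 0.5, 0.35});
        a.put("dog", new Double[] {1.0, 0.5, 0.0});
        Map<String, Double[]> b = new LinkedHashMap<>();
        b.put("dog", new Double[] {2.0, 1.0, 0.0});
        documents.put("a.txt", a);
        documents.put("b.txt", b);

        Double[][] matrix = buildMatrix(documents);

        // Print only after every row exists, and bound j by the row's own length.
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                System.out.print(matrix[i][j] + " ");
            }
            System.out.println();
        }
    }
}
```

Note that the rows in this sketch are ragged: column j of one row does not necessarily refer to the same word as column j of another. Aligning columns across files would additionally require a shared vocabulary of all words.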

Related

IndexOutOfBoundsException for automation

I am trying to automate an application. For that, I am using a hash map for the Excel data set, and I have created methods that perform actions on that data.
The class file to execute is shown below:
@Test
public void testLAP_Creamix() throws Exception {
try {
launchMainApplication();
Lapeyre_frMain Lapeyre_frMainPage = new Lapeyre_frMain(tool, test, user, application);
HashMap<String, ArrayList<String>> win = CreamixWindowsDataset.main();
SortedSet<String> keys = new TreeSet<>(win.keySet());
for (String i : keys) {
System.out.println("########### Test = " + win.get(i).get(0) + " ###########");
Lapeyre_frMainPage.EnterTaille(win.get(i).get(1));
Lapeyre_frMainPage.SelectCONFIGURATION(win.get(i).get(2));
Lapeyre_frMainPage.SelectPLANVASQUE(win.get(i).get(3));
Lapeyre_frMainPage.SelectCOULEUR(win.get(i).get(4));
Lapeyre_frMainPage.SelectPOIGNEES(win.get(i).get(5));
Lapeyre_frMainPage.SelectTYPE_DE_MEUBLE(win.get(i).get(6));
Lapeyre_frMainPage.VerifyPanierPrice(win.get(i).get(7));
Lapeyre_frMainPage.VerifyECO_PARTPrice(win.get(i).get(8));
Lapeyre_frMainPage.ClickCREAMIXReinit();
System.out.println("########### Test End ##############");
}
test.setResult("pass");
} catch (AlreadyRunException e) {
} catch (Exception e) {
verificationErrors.append(e.getMessage());
throw e;
}
}
Hash map code:
public static HashMap<String, ArrayList<String>> main() throws IOException {
final String DatasetSheet = "src/test/resources/CreamixDataSet.xlsx";
final String DatasetTab = "Creamix";
Object[][] ab = DataLoader.ReadMyExcelData(DatasetSheet, DatasetTab);
int rowcount = DataLoader.myrowCount(DatasetSheet, DatasetTab);
int colcount = DataLoader.mycolCount(DatasetSheet, DatasetTab);
HashMap<String, ArrayList<String>> map = new HashMap<String, ArrayList<String>>();
// i = 2 to avoid column names
for (int i = 2; i < rowcount;) {
ArrayList<String> mycolvalueslist = new ArrayList<String>();
for (int j = 0; j < colcount;) {
mycolvalueslist.add(ab[i][j].toString());
j++;
}
map.put(ab[i][0].toString(), mycolvalueslist);
i++;
}
return map;
}
Query: I was able to run this code a few days ago, but now, after adding some new columns, it gives me the following error:
IndexOutOfBoundsException Index 7 out of bounds for length 7
I am not able to trace the issue. What should I look for? Please help!
for (String i : keys) {
ArrayList<String> arr = win.get(i); // debug here and watch its size
Lapeyre_frMainPage.EnterTaille(arr.get(1));
}
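The message "Index 7 out of bounds for length 7" says that a list with 7 elements was asked for its element at index 7, while the test reads indices 0 through 8 and therefore needs 9 values per row. One way to narrow this down is to check every row's size before running the test steps. The sketch below does that; the class name RowSizeCheckSketch, the method findShortRows, and the sample rows are hypothetical and only illustrate the check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RowSizeCheckSketch {

    // The test in the question reads indices 0..8, so each row needs 9 values.
    static final int EXPECTED_COLUMNS = 9;

    // Returns the keys of all rows that hold fewer values than expected.
    static List<String> findShortRows(Map<String, ArrayList<String>> rows, int expectedColumns) {
        List<String> shortRows = new ArrayList<>();
        for (Map.Entry<String, ArrayList<String>> entry : rows.entrySet()) {
            if (entry.getValue().size() < expectedColumns) {
                shortRows.add(entry.getKey());
            }
        }
        return shortRows;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows: one with 7 values, one with 9.
        Map<String, ArrayList<String>> win = new HashMap<>();
        win.put("T1", new ArrayList<>(Arrays.asList("T1", "60", "c", "p", "col", "h", "m")));
        win.put("T2", new ArrayList<>(Arrays.asList("T2", "80", "c", "p", "col", "h", "m", "1", "2")));
        System.out.println("Rows with missing columns: " + findShortRows(win, EXPECTED_COLUMNS));
    }
}
```

If every row turns out to be short, the column count returned by the data loader is the first thing to inspect, since the inner loop in the hash map code copies exactly colcount values per row.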

Java ArrayIndexOutOfBoundsException keeps appearing while trying to find most occurring word in file

I am currently building a program which reads a file and prints the most occurring words and how many times each word appears like so:
package WordLookUp;
import java.util.*;
import java.io.*;
import java.lang.*;
public class WordLookUp {
private String[] mostWords;
private Scanner reader;
private String line;
private FileReader fr;
private BufferedReader br;
private List<String> original;
private String token = " ";
public WordLookUp(String file) throws Exception {
this.reader = new Scanner(new File(file));
this.original = new ArrayList<String>();
while (this.reader.hasNext()) { //reads file and stores it in string
this.token = this.reader.next();
this.original.add(token); //adds it to my arrayList
}
}
public void findMostOccurringWords() {
List<String> mostOccur = new ArrayList<String>();
List<Integer> count = new ArrayList<Integer>();
int counter = 0;
this.mostWords = this.token.split(" "); //storing read lines in mostWords arrayList
try {
for (int i = 0; i < original.size(); i++) {
if (this.original.equals(this.mostWords[i])) {
counter++; //increase counter
mostOccur.add(this.mostWords[i]);
count.add(counter);
}
}
for (int i = 0; i < mostOccur.size(); i++) {
System.out.println("Word: " + mostOccur.get(i) + " count: " + count.get(i));
}
} catch (ArrayIndexOutOfBoundsException ae) {
System.out.println("Illegal index");
}
}
}
package WordLookUp;
import java.util.*;
import java.io.*;
public class Main {
public static void main(String[] args) throws Exception {
// TODO Auto-generated method stub
WordLookUp wL = new WordLookUp("tiny1.txt");
wL.findMostOccurringWords();
}
}
So when I run my program, it prints the message from my catch block: "Illegal index". I think the problem is in my findMostOccurringWords method. To me the logic feels correct, but I don't know why it is throwing an ArrayIndexOutOfBoundsException. I tried playing with the for loops and tried to go from int i = 0 to i < mostOccur.size() - 1, but that does not work either. Is my logic wrong? I am not allowed to use a HashMap, and our professor gave us a hint that we can do this assignment easily with arrays and ArrayLists (no other built-in functions, but regexes are highly recommended for the rest of the assignment). I declared a private FileReader and BufferedReader up there because I am trying to see whether they would work better. Thanks for the advice!
Can you try the following code? I think your current algorithm is wrong.
public class WordLookUp {
private List<String> original;
private List<String> mostOccur = new ArrayList<String>();
private List<Integer> count = new ArrayList<Integer>();
public WordLookUp(String file) throws Exception {
try(Scanner reader = new Scanner(new File(file));){
this.original = new ArrayList<String>();
String token = " ";
while (reader.hasNext()) { //reads file and stores it in string
token = reader.next();
this.original.add(token); //adds it to my arrayList
findMostOccurringWords(token);
}
}
}
public void findMostOccurringWords(String token) {
int counter = 0;
String[] mostWords = token.split(" "); //storing read lines in mostWords arrayList
try {
for (int i = 0; i < mostWords.length; i++) {
for(int j = 0; j < this.original.size(); j++) {
if (original.get(j).equals(mostWords[i])) {
counter++; //increase counter
}
}
if (mostOccur.contains(mostWords[i])) {
count.set(mostOccur.indexOf(mostWords[i]),counter);
}else {
mostOccur.add(mostWords[i]);
count.add(counter);
}
}
} catch (ArrayIndexOutOfBoundsException ae) {
System.out.println("Illegal index");
}
}
public void count() {
for (int i = 0; i < mostOccur.size(); i++) {
System.out.println("Word: " + mostOccur.get(i) + " count: " + count.get(i));
}
}
}
public class Main {
public static void main(String[] args) throws Exception {
// TODO Auto-generated method stub
WordLookUp wL = new WordLookUp("F:\\gc.log");
wL.count();
}
}
Here in this loop:
for (int i = 0; i < mostOccur.size(); i++) {
System.out.println("Word: " + mostOccur.get(i) + " count: " + count.get(i));
}
You check that i is within bounds for mostOccur but not for count. I would add a condition to make sure it is in bounds for both, such as:
for (int i = 0; i < mostOccur.size() && i < count.size(); i++) {
System.out.println("Word: " + mostOccur.get(i) + " count: " + count.get(i));
}
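The exception in the question's code most likely comes from this.token.split(" "): token holds only the last word the Scanner read, so mostWords has a single element while the loop runs up to original.size(). In addition, this.original.equals(this.mostWords[i]) compares a List with a String, which is never true. A minimal sketch of counting with two parallel ArrayLists and no HashMap could look like this; the class name WordCountSketch and its methods are hypothetical, and the input words are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WordCountSketch {
    // words.get(i) has been seen counts.get(i) times.
    final List<String> words = new ArrayList<>();
    final List<Integer> counts = new ArrayList<>();

    // Records one occurrence of the word, adding it on first sight.
    void add(String word) {
        int index = words.indexOf(word);
        if (index >= 0) {
            counts.set(index, counts.get(index) + 1);
        } else {
            words.add(word);
            counts.add(1);
        }
    }

    // Returns the index of the most frequent word, or -1 if nothing was added.
    int indexOfMostFrequent() {
        int best = -1;
        for (int i = 0; i < counts.size(); i++) {
            if (best < 0 || counts.get(i) > counts.get(best)) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        WordCountSketch counter = new WordCountSketch();
        for (String token : Arrays.asList("the", "cat", "the", "hat", "the", "cat")) {
            counter.add(token);
        }
        for (int i = 0; i < counter.words.size(); i++) {
            System.out.println("Word: " + counter.words.get(i) + " count: " + counter.counts.get(i));
        }
        System.out.println("Most frequent: " + counter.words.get(counter.indexOfMostFrequent()));
    }
}
```

Because both lists only ever grow together inside add, they always have the same size, so a single bound check in the printing loop is enough.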

Needing to update my outfile after items are changed

My program lets me edit and change the values of items that are stored in my outfile. However, the changed numbers are only updated inside the program itself. For example, if I sell 10 ketchups, then my program has 0, but my outfile still says I have 10. I need my outfile to update along with my program. I came up with an overwrite method, but all it currently does is add content on a new line in the outfile. I am not sure how I would go about actually updating the information stored in the outfile. Any help would be great.
Code:
public class Driver {
public static ArrayList<Item> list = new ArrayList<Item>();
static double myBalance = 100;
/*static ArrayList<Item> list = new ArrayList<Item>();*/
/**
* @param args
* @throws IOException
* @throws FileNotFoundException
*/
public static void main(String[] args) throws IOException {
// TODO Auto-generated method stub
ArrayList<String> inventoryList = new ArrayList<String>();
BufferedReader readIn = null;
try {
readIn = new BufferedReader(new FileReader("inventory.out"));
readIn.lines().forEach(inventoryList::add);
} catch (Exception e) {
e.printStackTrace();
} finally {
if(readIn != null) {
readIn.close();
}
}
for (int i = 0; i < 4; i++) {
String item = inventoryList.get(i);// input String like the one you would read from a file
String delims = "[,]"; //delimiter - a comma is used to separate your tokens (name, qty,cost, price)
String[] tokens = item.split(delims); // split it into tokens and place in a 2D array.
String name = tokens[0]; System.out.println(name);
double cost = Double.parseDouble(tokens[1]);System.out.println(cost);
int qty = Integer.parseInt(tokens[2]);System.out.println(qty);
double price = Double.parseDouble(tokens[3]);System.out.println(price);
list.add(new Item(name, cost, qty, price));
}
sell("Mayo", 10);
buy("Ketchup", 20);
remove_item("Ketchup");
add_item("Tums", 20, 10, 5);
overwrite("New line");
PrintAll();
}
// Method to sell items from the arraylist
public static void sell(String itemName, int amount) {
for (int i = 0; i < list.size(); i++) {
if (list.get(i).getName().equals(itemName)) {
int number = i;
list.get(number).qty -= amount;
myBalance += list.get(number).getPrice() * amount;
}
}
}
// Method to buy more of the items in our array list
public static void buy(String itemName, int amount) {
for (int i = 0; i < list.size(); i++) {
if (list.get(i).getName().equals(itemName)) {
int number = i;
list.get(number).qty += amount;
myBalance -= list.get(number).getPrice() * amount;
}
}
}
// Method to remove an item completely from our inventory
public static void remove_item(String itemName) {
for (int i = 0; i < list.size(); i++) {
if (list.get(i).getName().equals(itemName)) {
int number = i;
list.remove(number);
}
}
}
public static void add_item(String itemName, double itemCost, int qty, double itemPrice) {
list.add(new Item(itemName, itemCost, qty, itemPrice));
}
public static void PrintAll() {
String output = "";
for(Item i : list) {
int everything = i.getQty();
String everything2 = i.getName().toString();
output += everything +" "+ everything2 + "\n";
}
JOptionPane.showMessageDialog(null, "Your current balance is: $" + myBalance + "\n" + "Current stock:" + "\n" + output);
}
public static void overwrite(String update) {
try
{
String filename= "inventory.out";
FileWriter fw = new FileWriter(filename,true); //the true will append the new data
fw.write("\n"+"add a line");//appends the string to the file
fw.close();
}
catch(IOException ioe)
{
System.err.println("IOException: " + ioe.getMessage());
}
}
}
Outfile contents:
Ketchup,1,10,2
Mayo,2,20,3
Bleach,3,30,4
Lysol,4,40,5
If you know the name of your outfile, clear it whenever it needs to be updated and then write to it again. You can use the code below to erase the contents of a file.
PrintWriter writer = new PrintWriter(file);
writer.print("");
writer.close();
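Clearing the file and writing it again can be combined into one step, because opening a file for writing without the append flag replaces its contents. The sketch below rewrites the whole file from the in-memory list after each change. The Item class here is a hypothetical stand-in for the question's Item (whose source is not shown), and the names InventoryRewriteSketch, toLines, and save are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class InventoryRewriteSketch {

    // Hypothetical stand-in for the question's Item class.
    static final class Item {
        final String name;
        final double cost;
        int qty;
        final double price;

        Item(String name, double cost, int qty, double price) {
            this.name = name;
            this.cost = cost;
            this.qty = qty;
            this.price = price;
        }
    }

    // Formats each item as name,cost,qty,price, the order the question's parser reads.
    static List<String> toLines(List<Item> items) {
        List<String> lines = new ArrayList<>();
        for (Item item : items) {
            lines.add(item.name + "," + item.cost + "," + item.qty + "," + item.price);
        }
        return lines;
    }

    // Replaces the file's contents; Files.write truncates an existing file by default.
    static void save(Path file, List<Item> items) throws IOException {
        Files.write(file, toLines(items));
    }

    public static void main(String[] args) throws IOException {
        List<Item> list = new ArrayList<>();
        list.add(new Item("Ketchup", 1, 10, 2));
        list.add(new Item("Mayo", 2, 20, 3));

        // A temporary file is used here so the demo does not touch a real inventory.out.
        Path file = Files.createTempFile("inventory", ".out");
        save(file, list);
        list.get(0).qty -= 10; // sell 10 ketchups
        save(file, list);      // the file now reflects the new quantity
        System.out.println(Files.readAllLines(file));
    }
}
```

In this sketch the doubles are written as 1.0 rather than 1, which Double.parseDouble reads back without trouble.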

Implementing Elimination of Immediate Left-Recursion in Java

I am working on a generic implementation in Java that eliminates left recursion from a grammar. My code reads input like the following, where each nonterminal is on one line and its productions are on the next line:
E
E+T|T
T
T*F|F
F
(E)|id|number
The required output should look like this:
E->[TE']
T->[FT']
F->[(E), id, number]
E'->[+TE', !]
T'->[*FT', !]
I wrote the following code, which stores the input in ArrayLists, iterates over them, and produces the output:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
public class IleftRecursion {
//storing each line in its corresponding Arraylist
static ArrayList<String> leftRules = new ArrayList<>();
static ArrayList<String> rightRules = new ArrayList<>();
public static void read_file(String file) throws IOException {
FileReader in = new FileReader(file);
BufferedReader br = new BufferedReader(in);
String line;
while ((line = br.readLine()) != null) {
leftRules.add(line);
rightRules.add(br.readLine());
}
br.close();
}
public static void ss() {
for (int i = 0; i < leftRules.size(); i++) {
for (int j = 1; j <= i - 1; j++) {
//splitting inputs on bars "|" to iterate through them
for (String x : rightRules.get(i).split("\\|")) {
if (x.contains(leftRules.get(j))) {
String f = "";
String ff = "";
for (int k=0; k<rightRules.get(k).split("\\|").length;k++) {
f = x;
f = f.replaceAll(leftRules.get(i), rightRules.get(k).split("\\|")[k]);
ff += f;
}
rightRules.remove(i);
rightRules.add(i, ff);
}
}
}
//Recursive or Not boolean
boolean isRec = false;
for (String z : rightRules.get(i).split("\\|")) {
if (z.startsWith(leftRules.get(i))) {
isRec = true;
break;
}
}
if (isRec) {
String a = "";
String b = "";
for (String s : rightRules.get(i).split("\\|")) {
if (s.startsWith(leftRules.get(i))) {
b += s.replaceAll(leftRules.get(i), "") + leftRules.get(i) + "',";
} else {
a += s + leftRules.get(i) + "'";
}
}
b += "!";
if(a.length()>=1)
a.substring(1, a.length() - 1);
rightRules.add(i, a);
rightRules.add(i + 1, b);
leftRules.add(leftRules.get(i) + "'");
}
}
}
public static void main(String[] args) throws IOException {
read_file("Sample.in");
ss();
for (int i=0;i<leftRules.size();i++)
{
System.out.print(leftRules.get(i)+"->");
System.out.println("["+rightRules.get(i)+"]");
}
}
}
I debugged the code many times, trying to figure out why I am getting output like this:
E->[TE']
T->[+TE',!]
F->[T]
E'->[T*F]
It is missing one rule, and not all of the new productions are generated correctly, but I could not fix it. Could anyone help me with that?
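For reference, the standard rewrite for immediate left recursion turns A -> A a1 | ... | b1 | ... into A -> b1 A' | ... and A' -> a1 A' | ... | !, where "!" stands for the empty string as in the required output above. The sketch below applies that rewrite to one rule at a time and builds new lists instead of modifying the rule lists while iterating over them. It is only a sketch under stated assumptions: the names LeftRecursionSketch and eliminate are hypothetical, it handles immediate left recursion only (not indirect), and it assumes an alternative is left-recursive exactly when it starts with the nonterminal's name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LeftRecursionSketch {

    // Rewrites one rule. The result maps the head to its new alternatives and,
    // if the rule was left-recursive, the primed nonterminal to its alternatives.
    static Map<String, List<String>> eliminate(String head, String body) {
        List<String> recursiveTails = new ArrayList<>(); // the a_i parts of A -> A a_i
        List<String> others = new ArrayList<>();         // the b_i alternatives
        for (String alternative : body.split("\\|")) {
            if (alternative.startsWith(head)) {
                recursiveTails.add(alternative.substring(head.length()));
            } else {
                others.add(alternative);
            }
        }
        Map<String, List<String>> result = new LinkedHashMap<>();
        if (recursiveTails.isEmpty()) {
            result.put(head, others); // nothing to rewrite
            return result;
        }
        String fresh = head + "'";
        List<String> headRules = new ArrayList<>();
        for (String beta : others) {
            headRules.add(beta + fresh);
        }
        List<String> freshRules = new ArrayList<>();
        for (String alpha : recursiveTails) {
            freshRules.add(alpha + fresh);
        }
        freshRules.add("!"); // empty alternative
        result.put(head, headRules);
        result.put(fresh, freshRules);
        return result;
    }

    public static void main(String[] args) {
        String[][] grammar = { {"E", "E+T|T"}, {"T", "T*F|F"}, {"F", "(E)|id|number"} };
        Map<String, List<String>> primed = new LinkedHashMap<>();
        for (String[] rule : grammar) {
            Map<String, List<String>> rewritten = eliminate(rule[0], rule[1]);
            // Print the original nonterminal now; keep the primed ones for the end.
            System.out.println(rule[0] + "->" + rewritten.remove(rule[0]));
            primed.putAll(rewritten);
        }
        for (Map.Entry<String, List<String>> entry : primed.entrySet()) {
            System.out.println(entry.getKey() + "->" + entry.getValue());
        }
    }
}
```

For the sample grammar, this prints the five lines listed as the required output in the question.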

Code to read the dataset

Here I read the dataset, extract the data lines (not the attributes), and print them. Next I need to sort the dataset, which is now stored in an ArrayList. How do I sort it?
public static void main(String args[]) throws Exception
{
String filen, jsnfl;
Customiseddata data = new Customiseddata();
data.setAlgorithm("C4.5");
data.setUserName("Dahlia");
System.out.println("Enter the file name");
sc = new Scanner(System.in);
filen = sc.nextLine();
data.setFileName("input_files/" + filen);
Mainclass main = new Mainclass();
main.build(data);
}
public void build(Customiseddata data) throws Exception
{
int extension;
String filename;
filename = data.getFileName();
extension = filename.lastIndexOf('.');
String extensionType = filename.substring(extension + 1,
filename.length());
if (extensionType.equalsIgnoreCase("csv"))
{
readcsv(filename);
}
else if (extensionType.equalsIgnoreCase("arff"))
{
readarff(filename);
}
}
public void readarff(String filename) throws Exception
{
@SuppressWarnings("unused")
int filesize, attributesize, c = 0, i;
@SuppressWarnings("unused")
float v = 0;
String s, line1;
ArrayList<String> filelines;
ArrayList<String> attributes;
Customiseddata data = new Customiseddata();
Arfffilereader arfffile = new Arfffilereader();
Extractdata exdata = new Extractdata();
exdata = arfffile.extractInputArff(filename);
filelines = exdata.getFileLines();
attributes = exdata.getAttributes();
filesize = filelines.size();
attributesize = attributes.size();
data.setFilesize(filesize);
System.out.println("Print the attributes");
System.out.println("--------------------");
for (i = 0; i < attributesize; i++)
{
System.out.println(attributes.get(i));
}
System.out.println("\t");
System.out.println("Print the filelines");
System.out.println("--------------------");
for (int j = 0; j < filesize; j++)
{
System.out.println(filelines.get(j));
}
}
But after this I need to sort the dataset.
Since the elements of the list are Strings and since String implements Comparable, sorting a list is as simple as:
Collections.sort(theList);
Note however that it will sort the list in place. If you don't want that, make a copy of the list and sort that copy.
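As a small illustration of the copy-then-sort approach, the sketch below sorts a copy and leaves the original list untouched. The class name SortCopySketch, the method sortedCopy, and the sample data lines are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SortCopySketch {

    // Returns a sorted copy; the list passed in keeps its original order.
    static List<String> sortedCopy(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical data lines standing in for exdata.getFileLines().
        List<String> filelines = Arrays.asList("sunny,hot,no", "rainy,mild,yes", "overcast,cool,yes");
        System.out.println("sorted:   " + sortedCopy(filelines));
        System.out.println("original: " + filelines);
    }
}
```

Keep in mind that String ordering is lexicographic, so a line starting with "10" sorts before a line starting with "9"; sorting by a numeric attribute would need a custom Comparator.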
