Why do I only get one result of TF-IDF? - java

// Calculating term frequency
System.out.println("Please enter the required word :");
Scanner scan = new Scanner(System.in);
String word = scan.nextLine();
String[] array = word.split(" ");
int filename = 11;
String[] fileName = new String[filename];
int a = 0;
int totalCount = 0;
int wordCount = 0;
for (a = 0; a < filename; a++) {
try {
System.out.println("The word inputted is " + word);
File file = new File(
"C:\\Users\\user\\fypworkspace\\TextRenderer\\abc" + a
+ ".txt");
System.out.println(" _________________");
System.out.print("| File = abc" + a + ".txt | \t\t \n");
for (int i = 0; i < array.length; i++) {
totalCount = 0;
wordCount = 0;
Scanner s = new Scanner(file);
{
while (s.hasNext()) {
totalCount++;
if (s.next().equals(array[i]))
wordCount++;
}
System.out.print(array[i] + " ---> Word count = "
+ "\t\t " + "|" + wordCount + "|");
System.out.print(" Total count = " + "\t\t " + "|"
+ totalCount + "|");
System.out.printf(" Term Frequency = | %8.4f |",
(double) wordCount / totalCount);
System.out.println("\t ");
}
}
} catch (FileNotFoundException e) {
System.out.println("File is not found");
}
}
System.out.println("Please enter the required word :");
Scanner scan2 = new Scanner(System.in);
String word2 = scan2.nextLine();
String[] array2 = word2.split(" ");
int numofDoc;
for (int b = 0; b < array2.length; b++) {
numofDoc = 0;
for (int i = 0; i < filename; i++) {
try {
BufferedReader in = new BufferedReader(new FileReader(
"C:\\Users\\user\\fypworkspace\\TextRenderer\\abc"
+ i + ".txt"));
int matchedWord = 0;
Scanner s2 = new Scanner(in);
{
while (s2.hasNext()) {
if (s2.next().equals(array2[b]))
matchedWord++;
}
}
if (matchedWord > 0)
numofDoc++;
} catch (IOException e) {
System.out.println("File not found.");
}
}
System.out.println(array2[b]
+ " --> This number of files that contain the term "
+ numofDoc);
double inverseTF = Math.log10((float) numDoc / numofDoc);
System.out.println(array2[b] + " --> IDF " + inverseTF );
double TFIDF = (((double) wordCount / totalCount) * inverseTF );
System.out.println(array2[b] + " --> TFIDF " + TFIDF);
}
}
Hi, this is my code for calculating term frequency and TF-IDF. The first part calculates the term frequency of the given words in each file. The second part is supposed to calculate the TF-IDF for each file using the term frequencies from the first part, but I only get a single value. It should print one TF-IDF value per document.
Example output for term frequency :
The word input is 'is'
| File = abc0.txt |
is ---> Word count = |2| Total count = |150| Term Frequency = | 0.0133 |
The word inputted is 'is'
| File = abc1.txt |
is ---> Word count = |0| Total count = |9| Term Frequency = | 0.0000 |
The TF-IDF
is --> This number of files that contain the term 7
is --> IDF 0.1962946357308887
is --> TFIDF 0.0028607962606519654
I expected one value per file: I have 10 files, so I should get 10 TF-IDF values, one for each file. Instead, only one result is printed. Can someone point out my mistake?

The println statement you expect to be repeated per file is
double TFIDF = (((double) wordCount / totalCount) * inverseTF );
System.out.println(array2[b] + " --> TFIDF " + TFIDF);
but it is contained in the single loop
for (int b = 0; b < array2.length; b++)
only. If you want to print this line once per file, you have to wrap these statements in another loop over all files.
Since this is homework I won't include the final code, but here is another hint: the TFIDF calculation also uses the variables wordCount and totalCount, which are specific to each file/word pair. By the time the second loop runs they hold only the values from the last file and word processed, so you need to either store them per file/word pair or recalculate them in your final loop.

The part that prints the TF-IDF needs to be moved inside the for loop that iterates over all the files, i.e.:
System.out.println(array2[b]
+ " --> This number of files that contain the term "
+ numofDoc);
double inverseTF = Math.log10((float) numDoc / numofDoc);
System.out.println(array2[b] + " --> IDF " + inverseTF );
double TFIDF = (((double) wordCount / totalCount) * inverseTF );
System.out.println(array2[b] + " --> TFIDF " + TFIDF);
}
}
}
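Both answers point to the same fix: compute the term frequency per file and multiply it by the IDF inside a loop over the files. The following is a minimal sketch of that idea, assuming the documents are already tokenized in memory instead of being read from the abc*.txt files. The class and method names (TfIdfSketch, termFrequency, inverseDocumentFrequency, tfIdfPerDocument) are hypothetical and not taken from the original post.

```java
import java.util.Arrays;
import java.util.List;

class TfIdfSketch {
    // Term frequency: occurrences of the term divided by the token count of one document.
    static double termFrequency(String[] tokens, String term) {
        if (tokens.length == 0) {
            return 0.0;
        }
        int count = 0;
        for (String token : tokens) {
            if (token.equals(term)) {
                count++;
            }
        }
        return (double) count / tokens.length;
    }

    // Inverse document frequency: log10(number of documents / documents containing the term).
    static double inverseDocumentFrequency(List<String[]> docs, String term) {
        int containing = 0;
        for (String[] tokens : docs) {
            if (Arrays.asList(tokens).contains(term)) {
                containing++;
            }
        }
        // Assumption: return 0 when no document contains the term, to avoid dividing by zero.
        return containing == 0 ? 0.0 : Math.log10((double) docs.size() / containing);
    }

    // One TF-IDF value per document: the IDF is shared, the TF is recomputed for each document.
    static double[] tfIdfPerDocument(List<String[]> docs, String term) {
        double idf = inverseDocumentFrequency(docs, term);
        double[] result = new double[docs.size()];
        for (int i = 0; i < docs.size(); i++) {
            result[i] = termFrequency(docs.get(i), term) * idf;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.asList(
                "this is a test".split(" "),
                "no match here".split(" "));
        double[] scores = tfIdfPerDocument(docs, "is");
        for (int i = 0; i < scores.length; i++) {
            System.out.println("document " + i + " --> TFIDF " + scores[i]);
        }
    }
}
```

The key design point is that the IDF is computed once per term, while the term frequency is recomputed (or looked up) for every document inside the final loop.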

Related

How can I format the output with System.out?

I am creating a Premier League football table in my spare time and have run into a problem with the output format.
You enter the input in the form (" HomeTeam : AwayTeam : HomeScore : AwayScore ").
When you are done with the list, you enter "quit" to stop the program.
My issue is that the scores come out like this:
(" HomeTeam | AwayTeam | HomeScore | AwayScore ")
I want it to print like this: (" HomeTeam [HomeScore] | AwayTeam [AwayScore] ")
I have tried many variations of System.out.println to no avail, even adding several boolean conditions to get the output I want. I am truly at a loss and it is frustrating. I hope someone can give me some tips; the code is attached.
Edited for loop:
for (int i = 0; i < counter; i++) { // A loop
String[] words = product_list[i].split(":");
System.out.println(words[0].trim() + "[" + words[2].trim() + "]" + " | " + words[1].trim() + "[" + words[3].trim() + "]");
}
This should work:
Scanner sc = new Scanner(System.in);
public void outputScore(String input) {
String[] words = input.trim().split("\\s+");
String satisfied = sc.nextLine();
if (satisfied.equals("quit")) {
System.out.println(words[0] + " [" + words[4] + "] | " + words[2] + " [" + words[6] + "]");
}
}
This is what the method should look like when you call it:
outputScore(sc.nextLine());
Here is the code to your edited question:
String [] product_list = new String [100];
int counter = 0;
Scanner scanner = new Scanner(System.in);
System.out.println("Input as follows:");
System.out.println("Home team : Away team : Home score : Away score");
String line = null;
while (!(line = scanner.nextLine()).equals("")) {
if (line.equals("quit")) {
break;
} else {
product_list[counter] = line;
System.out.println("Home team : Away team : Home score : Away score");
}
counter++;
}
for (int i = 0; i < counter; i++) {
String[] words = product_list[i].split(":");
System.out.println(words[0].trim() + " [" + words[2].trim() + "] | " + words[1].trim() + " [" + words[3].trim() + "]");
}
Hope this helps.
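The bracketed layout asked for in the question can also be produced with String.format, which keeps the template in one place. This is a small sketch only; the class name ScoreFormatSketch, the method formatResult and the sample team names are hypothetical and not part of the original answer.

```java
class ScoreFormatSketch {
    // Turns "Home : Away : homeScore : awayScore" into "Home [homeScore] | Away [awayScore]".
    static String formatResult(String line) {
        String[] parts = line.split(":");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields separated by ':' but got: " + line);
        }
        return String.format("%s [%s] | %s [%s]",
                parts[0].trim(), parts[2].trim(), parts[1].trim(), parts[3].trim());
    }

    public static void main(String[] args) {
        System.out.println(formatResult("Chelsea : Arsenal : 2 : 1"));
    }
}
```

Checking the field count before indexing avoids an ArrayIndexOutOfBoundsException when a line is typed incorrectly.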

Reading only some strings from a file and storing them in a stack in Java

the file is:
The Lord Of the Rings
J.R.R. Tolkein
Great Expectations
Charles Dickens
Green Eggs and Ham
Dr. Seuss
Tom Sawyer
Mark Twain
Moby Dick
Herman Melville
The Three Musketeers
Alexander Dumas
The Hunger Games
Suzanne Collins
1984
George Orwell
Gone With the Wind
Margret Mitchell
Life of Pi
Yann Martel
The title of each book is on one line and its author is on the next line.
I only want to read the first five books and authors from the file and store them in a stack: add the first two books with their authors to the stack, then remove the last one. How can I do it?
This is what I did:
Stack<Book> readingList = new Stack<>();
File myFile = new File("books.txt");
Scanner input = new Scanner(myFile);
int i = 0;
while (input.hasNext()) {
readingList.push(new Book(input.nextLine(), input.nextLine()));
System.out.println("Adding: " + readingList.lastElement().getInfo());
readingList.push(new Book(input.nextLine(), input.nextLine()));
System.out.println("Adding: " + readingList.lastElement().getInfo());
System.out.println("Reading: " + readingList.pop().getInfo());
}
Assuming each book takes two lines in the file (title, then author), as in your example:
to read only the first five books, use a for loop instead of while:
Stack<Book> readingList = new Stack<>();
File myFile = new File("books.txt");
Scanner input = new Scanner(myFile);
int counter = 0; // number of books read so far
for (int numberOfBookToRead = 0; numberOfBookToRead < 3; numberOfBookToRead++) {
try {
if (input.hasNextLine() && counter < 4) { // first four books: push two, then pop one
readingList.push(new Book(input.nextLine(), input.nextLine()));
counter += 1;
System.out.println("Adding: " + readingList.lastElement().getInfo());
readingList.push(new Book(input.nextLine(), input.nextLine()));
counter += 1;
System.out.println("Adding: " + readingList.lastElement().getInfo());
System.out.println("Reading: " + readingList.pop().getInfo()); // pop() already removes the book
} else if (input.hasNextLine()) { // fifth book
readingList.push(new Book(input.nextLine(), input.nextLine()));
counter += 1;
System.out.println("Adding: " + readingList.lastElement().getInfo());
}
} catch (Exception e) {
e.printStackTrace();
}
}
Afterwards, to remove the last item from the stack:
readingList.pop();
Because I have a lot of time, here is the same function with a variable maximum number of books:
Stack<Book> readingList = new Stack<>();
File myFile = new File("books.txt");
Scanner input = new Scanner(myFile);
int counter = 0; // number of books read so far
int maxNumberOfBooks = 10; // could be any number you wish
for (int numberOfBookToRead = 0; numberOfBookToRead < (maxNumberOfBooks + 1) / 2; numberOfBookToRead++) {
try {
if (input.hasNextLine() && counter + 2 <= maxNumberOfBooks) { // room for a full pair
readingList.push(new Book(input.nextLine(), input.nextLine()));
counter += 1;
System.out.println("Adding: " + readingList.lastElement().getInfo());
readingList.push(new Book(input.nextLine(), input.nextLine()));
counter += 1;
System.out.println("Adding: " + readingList.lastElement().getInfo());
System.out.println("Reading: " + readingList.pop().getInfo()); // pop() already removes the book
} else if (input.hasNextLine() && counter < maxNumberOfBooks) { // only one book left to read
readingList.push(new Book(input.nextLine(), input.nextLine()));
counter += 1;
System.out.println("Adding: " + readingList.lastElement().getInfo());
}
} catch (Exception e) {
e.printStackTrace();
}
}
You might want to check for exceptions, in case of a bad file
Stack<Book> readingList = new Stack<>();
File myFile = new File("books.txt");
Scanner input = new Scanner(myFile);
for (int count = 0; count < 5; count++) {
try {
readingList.push(new Book(input.nextLine(), input.nextLine()));
System.out.println("Adding: " + readingList.lastElement().getInfo());
System.out.println("Reading: " + readingList.pop().getInfo());
}
catch(Exception e) {
System.out.println(e.getMessage());
}
}
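For comparison, here is a self-contained sketch of the "read at most five books, push two, pop one" pattern that works on a list of lines, so it can be run without books.txt. Everything in it is an assumption made for illustration: the Book class is a hypothetical stand-in for the one in the question, the names ReadingListSketch and load are invented, and ArrayDeque is used in place of java.util.Stack.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class ReadingListSketch {
    // Hypothetical stand-in for the Book class from the question.
    static class Book {
        final String title;
        final String author;

        Book(String title, String author) {
            this.title = title;
            this.author = author;
        }

        String getInfo() {
            return title + " by " + author;
        }
    }

    // Reads at most maxBooks (title, author) pairs; after every second push the top book is popped.
    static Deque<Book> load(List<String> lines, int maxBooks) {
        Deque<Book> stack = new ArrayDeque<>();
        Iterator<String> it = lines.iterator();
        int read = 0;
        while (read < maxBooks && it.hasNext()) {
            String title = it.next();
            if (!it.hasNext()) {
                break; // a title without an author line: stop instead of throwing
            }
            stack.push(new Book(title, it.next()));
            read++;
            if (read % 2 == 0) {
                stack.pop(); // "read" the book that was just added
            }
        }
        return stack;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Moby Dick", "Herman Melville",
                "1984", "George Orwell",
                "Life of Pi", "Yann Martel");
        for (Book book : load(lines, 5)) {
            System.out.println("Still on the stack: " + book.getInfo());
        }
    }
}
```

Checking for a complete pair before pushing means a truncated file ends the loop quietly, so no exception handling is needed for that case.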

Array Length Outcome

My problem is probably ridiculously easy and I'm just missing something. My program crashes because of a missing (null) value in cell 1 of the array during its first iteration. I did some troubleshooting and found that on the first iteration the array length is 1, while on all later iterations it is 2. This incorrect initial length causes the crash. Any ideas?
import java.util.Scanner;
import java.io.*;
/* Takes in all of users personal information, and weather data. Then proceeds to determine status of day + averages of the data values provided, then reports to user*/
public class ClimateSummary
{
public static void main (String [] args) throws FileNotFoundException
{
Scanner sc = new Scanner (new File(args[0]));
String name = sc.nextLine();
String birthCity = sc.next();
String birthState = sc.next();
String loc = sc.next();
int birthDay = sc.nextInt();
String birthMonth = sc.next();
int birthYear = sc.nextInt();
int highTemp = 0;
double avgTemp;
double avgPrecip;
int coldDays = 0;
int hotDays = 0;
int rainyDays = 0;
int niceDays = 0;
int miserableDays = 0;
double totalTemp = 0;
double totalPrecip = 0;
int i = 0;
while(i <= 5)
{
String storage = sc.nextLine();
String[] inputStorage = storage.split(" "); //couldnt find scanf equiv in c for java so using array to store multiple values.
System.out.println(inputStorage[0]);
int tempTemp = Integer.parseInt(inputStorage[0]);
double tempPrecip = Double.parseDouble(inputStorage[1]);
totalTemp = totalTemp + tempTemp;
totalPrecip = totalPrecip + tempPrecip;
if(highTemp < tempTemp)
{
highTemp = tempTemp;
}
if(tempTemp >= 60.0)
{
hotDays++;
}else{
coldDays++;
}
if(tempPrecip > 0.1)
{
rainyDays++;
}
if(tempTemp >= 60.0 || tempTemp <= 80.0 || tempPrecip == 0.0)
{
niceDays++;
}else if(tempTemp < 32.0 || tempTemp > 90.0 || tempPrecip > 2.0)
{
miserableDays++;
}else{
}
i++;
}
avgTemp = totalTemp/5;
avgPrecip = totalPrecip/5;
System.out.println("Name: " + name);
System.out.println("Place of birth: " + birthCity + "," + birthState);
System.out.println("Data collected at: " + loc);
System.out.println("Date of birth: " + birthMonth + " " + birthDay +", " + birthYear);
System.out.println("");
System.out.println("The highest temperature during this time was " + highTemp + " degrees Fahrenheit");
System.out.println("The average temperature was " + avgTemp + " degrees Fahrenheit");
System.out.println("The average amount of precipitation was " + avgPrecip + " inches");
System.out.println("Number of hot days = " + hotDays);
System.out.println("Number of cold days = " + coldDays);
System.out.println("Number of rainy days = " + rainyDays);
System.out.println("Number of nice days = " + niceDays);
System.out.println("Number of miserable days = " + miserableDays);
System.out.println("Goodbye and have a nice day!");
}
}
Eric Thomas
Columbus
Nebraska
Columbus
18
February
1990
54 0
44 2.2
64 0.06
26 0.5
34 0.02
If your file can produce null values, handle them separately, for example:
if (name == null) {
//do something
}
else {
// do something else;
}
For a good discussion of null checks, see: How to check for null value in java
Also, after splitting a string, you need to check if the array (which is the output) has values at the indices that you are using.
For example:
String name = "A/B/C";
String[] nameArray = name.split("/");
In the above case nameArray has length 3 (valid indices 0 to 2), so nameArray[3] throws an ArrayIndexOutOfBoundsException.
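The length-1 array on the first iteration appears to come from how Scanner handles line breaks: nextInt() stops before the line break that follows the number, so the next nextLine() call returns the empty remainder of that line, and "".split(" ") yields a single empty element. The sketch below shows the effect and one common workaround, with a length check before indexing. The class and method names are hypothetical, and the input is a short string standing in for the question's data file.

```java
import java.util.Scanner;

class ScannerNewlineSketch {
    // Calls nextLine() directly after nextInt(): returns the (empty) rest of the number's line.
    static String lineRightAfterInt(String text) {
        Scanner sc = new Scanner(text);
        sc.nextInt();
        return sc.nextLine();
    }

    // Workaround: consume the rest of the number's line first, then read the real data line.
    static String[] fieldsAfterInt(String text) {
        Scanner sc = new Scanner(text);
        sc.nextInt();
        sc.nextLine(); // discard the leftover line break
        return sc.nextLine().trim().split("\\s+");
    }

    public static void main(String[] args) {
        String text = "1990\n54 0\n";
        System.out.println("Length without the fix: " + lineRightAfterInt(text).split(" ").length);

        String[] fields = fieldsAfterInt(text);
        if (fields.length >= 2) { // check the length before indexing
            int temp = Integer.parseInt(fields[0]);
            double precip = Double.parseDouble(fields[1]);
            System.out.println("temp=" + temp + ", precip=" + precip);
        } else {
            System.out.println("Malformed line, skipping");
        }
    }
}
```

Note also that the question's loop condition i <= 5 runs six times while the sample file has only five data lines, so the loop bound is worth checking as well.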

How do I return the array values despite being inside a nested for loop?

for(int counter = 0; counter < args.length; counter++){
System.out.println("Displaying per words: " + args[counter]);
splitWords = args[counter].toCharArray();
for(int counter2 = 0; counter2 < splitWords.length; counter2++){
System.out.println("Word spliced: " + splitWords[counter2]);
System.out.println("The number equivalent of " + splitWords[counter2] + " is "
+ (int) splitWords[counter2]);
occurenceCount[(int)splitWords[counter2]]++;
System.out.println("The letter " + splitWords[counter2] +
" was shown " + occurenceCount[(int)splitWords[counter2]] + " times.");
}
}
My function doesn't detect counter2 as a variable since it was inside the nested for loop. So how do I get out of this dilemma?
I'm trying to use the argument inputs (string respectively) and post the number of occurrences using an ascii table as reference and, as you see, there's just one obstacle from stopping me from accomplishing that.
Any ideas?
Your primary problem is that you have missed one important fact - your counts are not complete until after your loop has completed.
You therefore need to print out your counts in a separate loop after your first loop is complete.
public void test() {
String[] args = {"Hello"};
int[] occurenceCount = new int[256];
for (int word = 0; word < args.length; word++) {
System.out.println("Displaying per words: " + args[word]);
char[] splitWords = args[word].toCharArray();
for (int character = 0; character < splitWords.length; character++) {
System.out.println("Word spliced: " + splitWords[character]);
System.out.println("The number equivalent of " + splitWords[character] + " is "
+ (int) splitWords[character]);
occurenceCount[(int) splitWords[character]]++;
System.out.println("Word spliced: " + splitWords[character]);
}
}
// Second loop to print the results.
for (int character = 0; character < occurenceCount.length; character++) {
int count = occurenceCount[character];
if (count > 0) {
System.out.println("The letter " + ((char) character)
+ " was shown " + count + " times.");
}
}
}

Array of string from the console

I have a task:
"Send integers n [1 .. 10] from command line. Enter n rows to the Console, find the shortest and the longest line. Print the results and line length."
My idea is: Create array of strings and copy every line from BufferedReader to the array data[i]. Sample of my code:
String[] data = new String[n];
int j=0;
for (int i = 1; i <= n; i++) {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Please, enter " + i + " string: ");
String line = in.readLine();
for (int j=0; j<=data.length;j++){
data[j] = line;
j++;
} ///:~
System.out.println("Your " + i + " string : " + data[j] + "String len: " + line.length());
} ///:~
But I could not find a way to fill the elements of data[i] with each new line from the console.
Can you please give me a small hint?
To fill data, just replace the inner for-loop with a simple assignment using index i-1:
BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); // create the reader once, outside the loop
for (int i = 1; i <= n; i++) {
System.out.println("Please, enter " + i + " string: ");
String line = in.readLine();
data[i-1] = line;
System.out.println("Your " + i + " string : " + data[i-1] + "\nString len: " + line.length());
}
I left the loop from 1 to n instead of 0 to n-1 because you're printing i.
But if you only want the shortest and longest lines, there's no need to store all the lines, you only need to check the length of the current line against the length of the shortest and longest lines and change them appropriately.
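A minimal sketch of that idea: keep only the current shortest and longest line while reading, without an array. It uses Scanner instead of BufferedReader purely to avoid checked exceptions in the example, and the names LineLengthSketch and shortestAndLongest are hypothetical.

```java
import java.util.Scanner;

class LineLengthSketch {
    // Reads up to n lines and returns {shortest, longest}; on ties the earliest line wins.
    // Both entries are null if no line could be read.
    static String[] shortestAndLongest(Scanner in, int n) {
        String shortest = null;
        String longest = null;
        for (int i = 0; i < n && in.hasNextLine(); i++) {
            String line = in.nextLine();
            if (shortest == null || line.length() < shortest.length()) {
                shortest = line;
            }
            if (longest == null || line.length() > longest.length()) {
                longest = line;
            }
        }
        return new String[] { shortest, longest };
    }

    public static void main(String[] args) {
        // A fixed string stands in for console input; new Scanner(System.in) would read the console.
        Scanner in = new Scanner("pear\nfig\nbanana\n");
        String[] result = shortestAndLongest(in, 3);
        System.out.println("Shortest: " + result[0] + ", length " + result[0].length());
        System.out.println("Longest: " + result[1] + ", length " + result[1].length());
    }
}
```

Using null as the "nothing read yet" marker avoids treating a length of 0 as a sentinel, which would misbehave if the user enters an empty line.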
The easiest way is:
data[i-1] = in.readLine();
Thank you all for all your Help :)
Here is my example:
package taskstring;
import java.io.*;
public class TaskString {
public static void main(String[] args) throws java.lang.Exception {
int n = Integer.parseInt(args[0]);
if (n <= 0) {
System.out.println("Wrong! please send more numbers to java");
return;
}
System.out.println("Enjoy! You are going to send " + n + " string(s) to java");
int maxLen = 0;
int minLen = 0;
BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); // create the reader once, outside the loop
for (int i = 1; i <= n; i++) {
System.out.println("Please, enter " + i + " string: ");
String line = in.readLine();
System.out.println("Your " + i + " string : " + "String len: " + line.length());
if (maxLen < line.length()) {
System.out.println("New string is bigger");
maxLen = line.length();
} else {
System.out.println("New string is smaller");
}
if (minLen > line.length() || minLen == 0) {
System.out.println("New string is smaller" + " minLen=" + minLen);
minLen = line.length();
} else {
System.out.println("New string is bigger");
}
//return;
} ///:~
System.out.println("Max row: " + maxLen + "\nMin row: " + minLen);
} ///:~
}
