Sentiment analysis using SentiWordNet

I am in desperate need of some help with the following.
For my master's thesis I have to conduct a sentiment analysis on some Amazon, Twitter and Facebook data. I have saved the data in a CSV document. Now I want to use SentiWordNet to obtain the polarity scores. However, I'm unable to run the script provided on their website using Python.
First of all, I have to say that I am completely new to Java, so please don't blame me for not knowing it all. I have spent a lot of time searching the internet for information or tutorials, with no luck. There was one topic on this site from a person with a similar problem (How to use SentiWordNet), but I ran into a different problem. Whenever I run the script below, I get the following message: ImportError: No module named java.io.BufferedReader. I searched the internet for a solution but couldn't find one. Could someone please help me with how to run this script? For starters, I have already removed the garbage in the sentiwordnet.txt file. The path to the SentiWordNet.txt file is \Users\Mo\Documents\etc. This is also the path to the CSV file. By the way, I'm running this script on OS X with Python 2.7.5.
Thank you so much in advance for your help!
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Set;
import java.util.Vector;
public class SWN3 {
private String pathToSWN = "data"+File.separator+"SentiWordNet_3.0.0.txt";
private HashMap<String, String> _dict;
public SWN3(){
_dict = new HashMap<String, String>();
HashMap<String, Vector<Double>> _temp = new HashMap<String, Vector<Double>>();
try{
BufferedReader csv = new BufferedReader(new FileReader(pathToSWN));
String line = "";
while((line = csv.readLine()) != null)
{
String[] data = line.split("\t");
Double score = Double.parseDouble(data[2])-Double.parseDouble(data[3]);
String[] words = data[4].split(" ");
for(String w:words)
{
String[] w_n = w.split("#");
w_n[0] += "#"+data[0];
int index = Integer.parseInt(w_n[1])-1;
if(_temp.containsKey(w_n[0]))
{
Vector<Double> v = _temp.get(w_n[0]);
if(index>v.size())
for(int i = v.size();i<index; i++)
v.add(0.0);
v.add(index, score);
_temp.put(w_n[0], v);
}
else
{
Vector<Double> v = new Vector<Double>();
for(int i = 0;i<index; i++)
v.add(0.0);
v.add(index, score);
_temp.put(w_n[0], v);
}
}
}
Set<String> temp = _temp.keySet();
for (Iterator<String> iterator = temp.iterator(); iterator.hasNext();) {
String word = (String) iterator.next();
Vector<Double> v = _temp.get(word);
double score = 0.0;
double sum = 0.0;
for(int i = 0; i < v.size(); i++)
score += ((double)1/(double)(i+1))*v.get(i);
for(int i = 1; i<=v.size(); i++)
sum += (double)1/(double)i;
score /= sum;
String sent = "";
if(score>=0.75)
sent = "strong_positive";
else
if(score > 0.25 && score<=0.5)
sent = "positive";
else
if(score > 0 && score<=0.25) // was score>=0.25, which could never be reached for weak positives
sent = "weak_positive";
else
if(score < 0 && score>=-0.25)
sent = "weak_negative";
else
if(score < -0.25 && score>=-0.5)
sent = "negative";
else
if(score<=-0.75)
sent = "strong_negative";
_dict.put(word, sent);
}
}
catch(Exception e){e.printStackTrace();}
}
public String extract(String word, String pos)
{
return _dict.get(word+"#"+pos);
}
}

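Firstly, how are you running the class: from the command line or within an IDE such as Eclipse?
Note that the file is Java source, so it cannot be run with the Python interpreter; the ImportError comes from Python trying to treat the Java import lines as Python imports. It has to be compiled with javac and run with java.
If you are using the command line, you must ensure your classpath has been set properly. If you are unfamiliar with such matters, I would encourage creating a Java project in an IDE and running it from there, as the classpath will be configured for you. See: Creating your first Java project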

Java 8 Search ArrayList with Streams algorithm failing

We are using a Stream to search an ArrayList of strings. The dictionary file is sorted and contains 307107 words, all in lower case.
We use findFirst to look for a match for the text in a TextArea.
As long as the misspelling occurs beyond the third character, the search gives favorable results.
If the misspelled word is like "Charriage", the results are nothing close to a match.
The obvious goal is to get as close to the correct word as possible without having to look at an enormous number of words.
Here is the text we are testing:
Tak acheive it hommaker and aparent as Chariage NOT ME Charriag add missing vowel to Cjarroage
We have made some major changes to the stream search filters, with reasonable improvements.
We will edit the posted code to include ONLY the part of the code where the search is failing,
and below that the code changes made to the stream filters.
Before the code change, if the searchString had a misspelled char at position 1, no results were found in the dictionary. The new search filters fixed that.
We also added more search information by increasing the number of chars used for endsWith.
So what is still failing? If the searchString (the misspelled word) is missing a char at the end of the word and has an incorrect char anywhere from position 1 to 4, the search fails.
We are working on adding and removing chars, but we are not sure this is a workable solution.
Comments or code will be greatly appreciated. If you would like the complete project, we will post it on GitHub; just ask in the comments.
The question is still: how do we fix this search filter when multiple chars are missing from the misspelled word?
After multiple hours of searching for a FREE txt dictionary, this is one of the best.
A sidebar fact: it has 115726 words that are > 5 in length and have a vowel at the end of the word. That means it has 252234 words with no vowel at the end.
Does that mean we have a 32% chance of fixing the issue by adding a vowel to the end of the searchString? NOT a question, just an odd fact!
HERE is a link to the dictionary. Download it and place the words_alpha.txt file on the C drive at C:/A_WORDS/words_alpha.txt
words_alpha.txt
Code Before Changes
}if(found != true){
lvListView.setStyle("-fx-font-size:18.0;-fx-background-color: white;-fx-font-weight:bold;");
for(int indexSC = 0; indexSC < simpleArray.length;indexSC++){
String NewSS = txtMonitor.getText().toLowerCase();
if(NewSS.contains(" ")||(NewSS.matches("[%&/0-9]"))){
String NOT = txtMonitor.getText().toLowerCase();
txtTest.setText(NOT+" Not in Dictionary");
txaML.appendText(NOT+" Not in Dictionary");
onCheckSpelling();
return;
}
int a = NewSS.length();
int Z;
if(a == 0){// manage CR test with two CR's
Z = 0;
}else if(a == 3){
Z = 3;
}else if(a > 3 && a < 5){
Z = 4;
}else if(a >= 5 && a < 8){
Z = 4;
}else{
Z = 5;
}
System.out.println("!!!! NewSS "+NewSS+" a "+a+" ZZ "+Z);
if(Z == 0){// Manage CR in TextArea
noClose = true;
strSF = "AA";
String NOT = txtMonitor.getText().toLowerCase();
//txtTo.setText("Word NOT in Dictionary");// DO NO SEARCH
//txtTest.setText("Word NOT in Dictionaary");
txtTest.setText("Just a Space");
onCheckSpelling();
}else{
txtTest.setText("");
txaML.clear();
txtTest.setText("Word NOT in Dictionaary");
txaML.appendText("Word NOT in Dictionaary");
String strS = searchString.substring(0,Z).toLowerCase();
strSF = strS;
}
// array & list use in stream to add results to ComboBox
List<String> cs = Arrays.asList(simpleArray);
ArrayList<String> list = new ArrayList<>();
cs.stream().filter(s -> s.startsWith(strSF))
//.forEach(System.out::println);
.forEach(list :: add);
for(int X = 0; X < list.size();X++){
String A = (String) list.get(X);
Improved New Code
}if(found != true){
for(int indexSC = 0; indexSC < simpleArray.length;indexSC++){
String NewSS = txtMonitor.getText().toLowerCase();
if(NewSS.contains(" ")||(NewSS.matches("[%&/0-9]"))){
String NOT = txtMonitor.getText().toLowerCase();
txtTest.setText(NOT+" Not in Dictionary");
onCheckSpelling();
return;
}
int a = NewSS.length();
int Z;
if(a == 0){// manage CR test with two CR's
Z = 0;
}else if(a == 3){
Z = 3;
}else if(a > 3 && a < 5){
Z = 4;
}else if(a >= 5 && a < 8){
Z = 4;
}else{
Z = 5;
}
if(Z == 0){// Manage CR
noClose = true;
strSF = "AA";
String NOT = txtMonitor.getText().toLowerCase();
txtTest.setText("Just a Space");
onCheckSpelling();
}else{
txtTest.setText("");
txtTest.setText("Word NOT in Dictionaary");
String strS = searchString.substring(0,Z).toLowerCase();
strSF = strS;
}
ArrayList list = new ArrayList<>();
List<String> cs = Arrays.asList(simpleArray);
// array list & list used in stream foreach filter results added to ComboBox
// Code below provides variables for refined search
int W = txtMonitor.getText().length();
String nF = txtMonitor.getText().substring(0, 1).toLowerCase();
String nE = txtMonitor.getText().substring(W - 2, W);
if(W > 7){
nM = txtMonitor.getText().substring(W-5, W);
System.out.println("%%%%%%%% nE "+nE+" nF "+nF+" nM = "+nM);
}else{
nM = txtMonitor.getText().substring(W-1, W);
System.out.println("%%%%%%%% nE "+nE+" nF "+nF+" nM = "+nM);
}
cs.stream().filter(s -> s.startsWith(strSF)
|| s.startsWith(nF, 0)
&& s.length()<= W+2
&& s.endsWith(nE)
&& s.startsWith(nF)
&& s.contains(nM))
.forEach(list :: add);
for(int X = 0; X < list.size();X++){
String A = (String) list.get(X);
sort(list);
cboSelect.setStyle("-fx-font-weight:bold;-fx-font-size:18.0;");
cboSelect.getItems().add(A);
}// Add search results to cboSelect
break;
Here is a screen shot of the FXML file the controls are named the same as the names used in our code with the exception of the ComboBox

I am adding a JavaFX answer. This app uses Levenshtein Distance. You have to click on Check Spelling to start. You can select a word from the list to replace the current word being checked. I notice Levenshtein Distance returns lots of words so you might want to find other ways to reduce the list down even more.
Main
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javafx.application.Application;
import javafx.collections.FXCollections;
import javafx.collections.ObservableList;
import javafx.scene.Scene;
import javafx.scene.control.Button;
import javafx.scene.control.ListView;
import javafx.scene.control.TextArea;
import javafx.scene.control.TextField;
import javafx.scene.layout.VBox;
import javafx.stage.Stage;
public class App extends Application
{
public static void main(String[] args)
{
launch(args);
}
TextArea taWords = new TextArea("Tak Carrage thiss on hoemaker answe");
TextField tfCurrentWordBeingChecked = new TextField();
//TextField tfMisspelledWord = new TextField();
ListView<String> lvReplacementWords = new ListView();
TextField tfReplacementWord = new TextField();
Button btnCheckSpelling = new Button("Check Spelling");
Button btnReplaceWord = new Button("Replace Word");
List<String> wordList = new ArrayList();
List<String> returnList = new ArrayList();
HandleLevenshteinDistance handleLevenshteinDistance = new HandleLevenshteinDistance();
ObservableList<String> listViewData = FXCollections.observableArrayList();
@Override
public void start(Stage primaryStage)
{
setupListView();
handleBtnCheckSpelling();
handleBtnReplaceWord();
VBox root = new VBox(taWords, tfCurrentWordBeingChecked, lvReplacementWords, tfReplacementWord, btnCheckSpelling, btnReplaceWord);
root.setSpacing(5);
Scene scene = new Scene(root);
primaryStage.setScene(scene);
primaryStage.show();
}
public void handleBtnCheckSpelling()
{
btnCheckSpelling.setOnAction(actionEvent -> {
if (btnCheckSpelling.getText().equals("Check Spelling")) {
wordList = new ArrayList(Arrays.asList(taWords.getText().split(" ")));
returnList = new ArrayList(Arrays.asList(taWords.getText().split(" ")));
loadWord();
btnCheckSpelling.setText("Check Next Word");
}
else if (btnCheckSpelling.getText().equals("Check Next Word")) {
loadWord();
}
});
}
public void handleBtnReplaceWord()
{
btnReplaceWord.setOnAction(actionEvent -> {
int indexOfWordToReplace = returnList.indexOf(tfCurrentWordBeingChecked.getText());
returnList.set(indexOfWordToReplace, tfReplacementWord.getText());
taWords.setText(String.join(" ", returnList));
btnCheckSpelling.fire();
});
}
public void setupListView()
{
lvReplacementWords.setItems(listViewData);
lvReplacementWords.getSelectionModel().selectedItemProperty().addListener((obs, oldSelection, newSelection) -> {
tfReplacementWord.setText(newSelection);
});
}
private void loadWord()
{
if (wordList.size() > 0) {
tfCurrentWordBeingChecked.setText(wordList.get(0));
wordList.remove(0);
showPotentialCorrectSpellings();
}
}
private void showPotentialCorrectSpellings()
{
List<String> potentialCorrentSpellings = handleLevenshteinDistance.getPotentialCorretSpellings(tfCurrentWordBeingChecked.getText().trim());
listViewData.setAll(potentialCorrentSpellings);
}
}
CustomWord Class
/**
*
* @author blj0011
*/
public class CustomWord
{
private int distance;
private String word;
public CustomWord(int distance, String word)
{
this.distance = distance;
this.word = word;
}
public String getWord()
{
return word;
}
public void setWord(String word)
{
this.word = word;
}
public int getDistance()
{
return distance;
}
public void setDistance(int distance)
{
this.distance = distance;
}
@Override
public String toString()
{
return "CustomWord{" + "distance=" + distance + ", word=" + word + '}';
}
}
HandleLevenshteinDistance Class
// Imports inferred from the calls below; FileUtils is assumed to come from Apache Commons IO
// and LevenshteinDistance from Apache Commons Text.
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.commons.io.FileUtils;
import org.apache.commons.text.similarity.LevenshteinDistance;
/**
*
* @author blj0011
*/
public class HandleLevenshteinDistance
{
private List<String> dictionary = new ArrayList<>();
public HandleLevenshteinDistance()
{
try {
//Load DictionaryFrom file
//See if the dictionary file exists. If it doesn't, download it from GitHub.
File file = new File("alpha.txt");
if (!file.exists()) {
FileUtils.copyURLToFile(
new URL("https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt"),
new File("alpha.txt"),
5000,
5000);
}
//Load file content to a List of Strings
dictionary = FileUtils.readLines(file, Charset.forName("UTF8"));
}
catch (IOException ex) {
ex.printStackTrace();
}
}
public List<String> getPotentialCorretSpellings(String misspelledWord)
{
LevenshteinDistance levenshteinDistance = new LevenshteinDistance();
List<CustomWord> customWords = new ArrayList();
dictionary.stream().forEach((wordInDictionary) -> {
int distance = levenshteinDistance.apply(misspelledWord, wordInDictionary);
if (distance <= 2) {
customWords.add(new CustomWord(distance, wordInDictionary));
}
});
Collections.sort(customWords, (CustomWord o1, CustomWord o2) -> o1.getDistance() - o2.getDistance());
List<String> returnList = new ArrayList();
customWords.forEach((item) -> {
System.out.println(item.getDistance() + " - " + item.getWord());
returnList.add(item.getWord());
});
return returnList;
}
}

You just needed to go a little further out into the Dictionary.
We are sure you were getting a lot of suggested words from the Dictionary?
We tested your code and sometimes it found 3000 or more possible matches. WOW!
So here is the BIG improvement. It still needs a lot of testing; we used this line for our tests, with 100% favorable results:
Tske Charriage to hommaker and hommake as hommaer
Our fear is that if the speller really butchers the word, this improvement might not handle that degree of misspelling.
We are sure you know that if the first letter is wrong this will not work,
like zenophobe for xenophobe.
Here is the BIG improvement, tada:
cs.stream().filter(s -> s.startsWith(strSF)
|| s.startsWith(nF, 0)
&& s.length() > 1 && s.length() <= W+3 // <== HERE
&& s.endsWith(nE)
&& s.startsWith(nF)
&& s.contains(nM))
.forEach(list :: add);
You can send the check to my address 55 48 196 195
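One thing worth checking in the filter above: in Java, && binds more tightly than ||, so the predicate is evaluated as "startsWith(strSF) OR (all of the remaining conditions together)". The sketch below makes that grouping explicit with two named predicates. The class name PrefixFilterSketch, the method suggest and the four-word list are hypothetical and only for illustration; the argument values mirror what strSF, nF, nE, nM and W would be for "Charriage" in the posted code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PrefixFilterSketch {
    // Hypothetical helper: the same conditions as the posted filter, with the grouping spelled out.
    static List<String> suggest(List<String> words, String prefix, String first,
                                String end, String mid, int len) {
        // Branch 1: the word starts with the leading characters of the search string.
        Predicate<String> byPrefix = s -> s.startsWith(prefix);
        // Branch 2: every "shape" condition must hold at once.
        Predicate<String> byShape = s -> s.startsWith(first)
                && s.length() > 1 && s.length() <= len + 3
                && s.endsWith(end)
                && s.contains(mid);
        // Equivalent to: byPrefix || (all of byShape)
        return words.stream().filter(byPrefix.or(byShape)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("carriage", "charge", "chariot", "marriage");
        // Values derived from "charriage": prefix "charr", first "c", end "ge", mid "riage", length 9
        System.out.println(suggest(words, "charr", "c", "ge", "riage", 9)); // prints [carriage]
    }
}
```

Naming the two branches makes it easier to see which one let a given word through when the suggestion list grows unexpectedly.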

This question is a possible duplicate: Search suggestion in strings
I think you should be using something like Levenshtein distance or Jaro-Winkler distance. If you can use Apache Commons, I would suggest Apache Commons Lang, which has an implementation of Levenshtein distance. The example below demonstrates that implementation. If you relax the threshold to (distance <= 2), you will potentially get more results.
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.nio.charset.Charset;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.commons.io.FileUtils;
import org.apache.commons.lang3.StringUtils;
/**
*
* @author blj0011
*/
public class Main
{
public static void main(String[] args)
{
try {
System.out.println("Hello World!");
File file = new File("alpha.txt");
if (!file.exists()) {
FileUtils.copyURLToFile(
new URL("https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt"),
new File("alpha.txt"),
5000,
5000);
}
List<String> lines = FileUtils.readLines(file, Charset.forName("UTF8"));
//lines.forEach(System.out::println);
lines.stream().forEach(line -> {
int distance = StringUtils.getLevenshteinDistance(line, "zorilta");
//System.out.println(line + ": " + distance);
if (distance <= 1) {
System.out.println("Did you mean: " + line);
}
});
}
catch (IOException ex) {
Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
Output distance <= 1
Building JavaTestingGround 1.0
------------------------------------------------------------------------
--- exec-maven-plugin:1.5.0:exec (default-cli) @ JavaTestingGround ---
Hello World!
Did you mean: zorilla
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time: 1.329 s
Finished at: 2019-11-01T11:02:48-05:00
Final Memory: 7M/30M
Distance <= 2
Hello World!
Did you mean: corita
Did you mean: gorilla
Did you mean: zoril
Did you mean: zorilla
Did you mean: zorillas
Did you mean: zorille
Did you mean: zorillo
Did you mean: zorils
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time: 1.501 s
Finished at: 2019-11-01T14:03:33-05:00
Final Memory: 7M/34M
See the possible duplicate for more details about Levenshtein Distance.
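If adding a library is not an option, the same idea can be sketched in plain Java. This is a minimal, unoptimized version of the standard dynamic-programming Levenshtein distance, not the Apache Commons implementation; the class name EditDistanceSketch, the method names and the small word list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EditDistanceSketch {
    // Levenshtein distance computed with two rolling rows of the DP table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Keeps dictionary words within maxDistance edits of the misspelled word.
    static List<String> suggestions(List<String> dictionary, String word, int maxDistance) {
        List<String> out = new ArrayList<>();
        for (String candidate : dictionary) {
            // Cheap pre-check: a length gap larger than maxDistance can never be bridged.
            if (Math.abs(candidate.length() - word.length()) > maxDistance) {
                continue;
            }
            if (distance(word, candidate) <= maxDistance) {
                out.add(candidate);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("zoril", "zorilla", "gorilla", "banana");
        System.out.println(suggestions(dictionary, "zorilta", 1)); // prints [zorilla]
        System.out.println(suggestions(dictionary, "zorilta", 2)); // prints [zoril, zorilla, gorilla]
    }
}
```

The length pre-check is one inexpensive way to skip most of a 300,000-word dictionary before running the full distance computation.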

Regarding time consuming calculations in java

I am trying to write code that reads 120 files from a folder and performs some calculations on them. When I debug the code, it works fine; however, the execution time is more than 20 minutes. I am aware that this might be due to a bug in the code. Can someone look into it and suggest possible ways to reduce the execution time? Kindly let me know if I should provide further information. Thank you.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
public class myclass {
static int total = 1;
static int r = 0;
public static void main(String[] args) {
ArrayList<Double> mysignal = new ArrayList<Double>();
ArrayList<Double> mylist = new ArrayList<Double>();
double x;
double a;
myclass obj = new myclass();
String target_dir = "path for folder";
File dir = new File(target_dir);
File[] files = dir.listFiles();
for (File f : files) {
if (f.isFile()) {
BufferedReader inputStream = null;
try {
inputStream = new BufferedReader(new FileReader(f));
String line;
while ((line = inputStream.readLine()) != null) {
System.out.println(line);
mysignal.add(Double.valueOf(line));
total++;
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
a = obj.functioneg(mysignal, total);
mylist.add(r, a);
System.out.println(mylist.get(r));
r++;
}
}
}
public double functioneg(ArrayList<Double> s, int N) {
ArrayList<Double> y = new ArrayList<Double>();
double sum = 0, a1 = 0;
double[] o1 = new double[N - 1];// processed signal
for (int n = 0; n < counter_main - 1; n++) {
for (int k = 0; k < 40; k++) {
if (n - k >= 0) {
a1 = s.get(n - k);
sum = sum + (a1 * a1);// energy
} else
sum = sum + 0;
}
o1[n] = sum;
sum = 0;
}
double sum1 = 0;
double avg;
for (int t = 0; t < counter_main - 1; t++) {
sum1 = sum1 + o1[t];
}
avg = sum1 / N - 1;
return (avg);
}
}

You need to close your InputStream.
After reading each file in the directory (after your try-catch block), add the statement:
inputStream.close();

As andrewdleach pointed out, you should close your input stream.
Additionally you might want to try out the Java 8 function Files#walk (see this question) for more efficiently walking through the files.

First try to comment out the line:
System.out.println(line);
Output to the console is slow (and I mean really slow), and this line basically duplicates the contents of each processed file to the console.
Other than that, you can also accumulate the time spent in the functioneg() method and/or parts of it (for example using System.nanoTime()) to find the most time-consuming parts. Alternatively, run under a debugger and profile by sampling, which is the easiest profiling method and surprisingly effective: just pause the program repeatedly and see where it pauses most frequently.
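The suggestions in these answers can be combined in one small sketch: a reader that is closed automatically by try-with-resources, no per-line console output, a fresh list per file, and System.nanoTime() around the calculation. It also replaces the 40-step inner loop with a running sum. In the question's code, mysignal and total are never reset between files, so each call appears to process everything read so far; that may be the largest cost. The names SignalTimingSketch, readSignal and meanWindowEnergy are hypothetical, and dividing by the number of samples is an assumption about what the original average was meant to be.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SignalTimingSketch {
    // Reads one number per line; try-with-resources closes the reader even if parsing fails.
    static List<Double> readSignal(Path file) throws IOException {
        List<Double> signal = new ArrayList<>(); // new list per file, nothing carried over
        try (BufferedReader in = Files.newBufferedReader(file)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    signal.add(Double.valueOf(line.trim()));
                }
            }
        }
        return signal;
    }

    // Average, over all positions n, of the energy in the last `window` samples ending at n.
    static double meanWindowEnergy(List<Double> s, int window) {
        if (s.isEmpty()) {
            return 0.0;
        }
        double running = 0.0;
        double total = 0.0;
        for (int n = 0; n < s.size(); n++) {
            running += s.get(n) * s.get(n); // sample entering the window
            if (n >= window) {
                running -= s.get(n - window) * s.get(n - window); // sample leaving the window
            }
            total += running;
        }
        return total / s.size();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("signal", ".txt");
        Files.write(tmp, Arrays.asList("1", "2", "3"));
        long start = System.nanoTime();
        double avg = meanWindowEnergy(readSignal(tmp), 2);
        long elapsedMicros = (System.nanoTime() - start) / 1_000;
        System.out.println("avg=" + avg + " elapsed(us)=" + elapsedMicros);
        Files.delete(tmp);
    }
}
```

For the question's calculation the window would be 40. Timing each file this way should show quickly whether reading or computing dominates.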

Correcting and Condensing Java Program

I think I've almost figured out my java program. It is designed to read a text file and find the largest integer by using 10 different threads. I'm getting this error though:
Error:(1, 8) java: class Worker is public, should be declared in a file named Worker.java
I feel my code may be more complex than it needs to be, so I'm trying to figure out how to shrink it down while also fixing the error above. Any assistance would be greatly appreciated, and please let me know if I can clarify anything. Also, does the "worker" class have to be in a separate file? I added it to the same file but am getting the error above.
import java.io.BufferedReader;
import java.io.FileReader;
public class datafile {
public static void main(String[] args) {
int[] array = new int[100000];
int count;
int index = 0;
String datafile = "dataset529.txt"; //string which contains datafile
String line; //current line of text file
try (BufferedReader br = new BufferedReader(new FileReader(datafile))) { //reads in the datafile
while ((line = br.readLine()) != null) { //reads through each line
array[index++] = Integer.parseInt(line); //pulls out the number of each line and puts it in numbers[]
}
}
Thread[] threads = new Thread[10];
worker[] workers = new worker[10];
int range = array.length / 10;
for (count = 0; count < 10; count++) {
int startAt = count * range;
int endAt = startAt + range;
workers[count] = new worker(startAt, endAt, array);
}
for (count = 0; count < 10; count++) {
threads[count] = new Thread(workers[count]);
threads[count].start();
}
boolean isProcessing = false;
do {
isProcessing = false;
for (Thread t : threads) {
if (t.isAlive()) {
isProcessing = true;
break;
}
}
} while (isProcessing);
for (worker worker : workers) {
System.out.println("Max = " + worker.getMax());
}
}
}
public class worker implements Runnable {
private int startAt;
private int endAt;
private int randomNumbers[];
int max = Integer.MIN_VALUE;
public worker(int startAt, int endAt, int[] randomNumbers) {
this.startAt = startAt;
this.endAt = endAt;
this.randomNumbers = randomNumbers;
}
@Override
public void run() {
for (int index = startAt; index < endAt; index++) {
if (randomNumbers != null && randomNumbers[index] > max)
max = randomNumbers[index];
}
}
public int getMax() {
return max;
}
}

I've written a few comments, but I'm going to gather them all in an answer so that anyone in the future can see the aggregate info:
At the end of your source for the readtextfile class (which should be ReadTextFile per Java naming conventions) you have too many closing braces:
        } while (isProcessing);
        for (Worker worker : workers) {
            System.out.println("Max = " + worker.getMax());
        }
    }
}
}
}
The above should end on the first brace that hits the leftmost column. This is a good rule of thumb when writing any Java class: if you have more than one far-left brace, or your last brace isn't far-left, you've probably made a mistake somewhere and should go through and check your braces.
As for your file issues: all your classes should be named following Java conventions, and each public class should be stored in a file called ClassName.java (case sensitive). E.g.:
public class ReadTextFile should be stored in ReadTextFile.java
You can also make Worker an inner class. To do this you can pretty much just copy the source code into the ReadTextFile class (make sure it's outside of the main method, and declare it static so it can be instantiated from the static main method). See this tutorial on inner classes for a quick overview.
As for the rest of your question, Code Review SE is the proper place to ask that, and the smart folks over there will probably provide better answers than I could. However, I'd also suggest that using 10 threads is probably not the most efficient way to find the largest int in a text file (in both development and execution time).
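A condensed version along the lines described above might look like the following sketch. It keeps Worker as a static nested class, so everything lives in one file, and waits for the threads with Thread.join() instead of polling isAlive() in a loop. The names MaxFinderSketch and findMax and the sample array are hypothetical; reading the numbers from the text file is left out.

```java
class MaxFinderSketch {
    // Static nested class: compiled from the same file as the outer class, no Worker.java needed.
    static class Worker implements Runnable {
        private final int[] data;
        private final int from;
        private final int to;
        private int max = Integer.MIN_VALUE;

        Worker(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        public void run() {
            for (int i = from; i < to; i++) {
                if (data[i] > max) {
                    max = data[i];
                }
            }
        }

        int getMax() {
            return max;
        }
    }

    static int findMax(int[] data, int threadCount) {
        Worker[] workers = new Worker[threadCount];
        Thread[] threads = new Thread[threadCount];
        // Ceiling division so the last elements are not dropped when the length is not a multiple.
        int chunk = (data.length + threadCount - 1) / threadCount;
        for (int t = 0; t < threadCount; t++) {
            int from = Math.min(t * chunk, data.length);
            int to = Math.min(from + chunk, data.length);
            workers[t] = new Worker(data, from, to);
            threads[t] = new Thread(workers[t]);
            threads[t].start();
        }
        int max = Integer.MIN_VALUE;
        try {
            for (int t = 0; t < threadCount; t++) {
                threads[t].join(); // blocks until the worker finishes; its result is then safe to read
                max = Math.max(max, workers[t].getMax());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return max;
    }

    public static void main(String[] args) {
        int[] numbers = {3, 41, 7, 29, 15, 8, 2};
        System.out.println("Max = " + findMax(numbers, 3)); // prints Max = 41
    }
}
```

join() avoids the busy-wait loop, which otherwise keeps one core spinning while the workers run.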

Issue with NullPointerException

I am continuing to get this error. Now I have gotten it for my SortSearchUtil. I've tried to do some debugging but can't fix the issue. The error reads:
----jGRASP exec: java PostOffice
Exception in thread "main" java.lang.NullPointerException
at SortSearchUtil.selectionSort(SortSearchUtil.java:106)
at PostOffice.sortLetters(PostOffice.java:73)
at PostOffice.main(PostOffice.java:15)
----jGRASP wedge: exit code for process is 1.
----jGRASP: operation complete.
Line 106 of SortSearchUtil, inside selectionSort, is:
if (array[indexSmallest].compareTo(array[curPos]) > 0)
I don't know what could be wrong with my method. It's a standard method that was given to me by my instructor. I've tried to debug my program but I'm pretty stuck. Here is the method that the error is originating from, selectionSort:
public static void selectionSort(Comparable[] array)
{
    int curPos, indexSmallest, start;
    Comparable temp;
    for (start = 0; start < array.length - 1; start++)
    {
        indexSmallest = start;
        for (curPos = start + 1; curPos < array.length; curPos++)
            if (array[indexSmallest].compareTo(array[curPos]) > 0)
            {
                indexSmallest = curPos;
            }
        // end for
        temp = array[start];
        array[start] = array[indexSmallest];
        array[indexSmallest] = temp;
    } // end for
}
The sortLetters method at the bottom of this PostOffice class is what calls SortSearchUtil.selectionSort:
import java.util.*;
import java.io.*;
public class PostOffice
{
private final int max = 1000;
private Letter [] ltrAra = new Letter[max];
private int count;
public static void main(String [] args)
{
PostOffice postOffice = new PostOffice();
postOffice.readLetters("letters.in");
postOffice.sortLetters();
postOffice.printLetters();
}
public PostOffice()
{
Letter [] Letters = ltrAra;
this.count = 0;
}
public void readLetters(String filename)
{
int count = 0;
int iWork = 0;
Scanner fin = new Scanner(filename);
String toName, toStreet, toCity, toState, toZip;
String fromName, fromStreet, fromCity, fromState, fromZip, temp;
double weight;
String sWork;
fin = FileUtil.openInputFile(filename);
if (fin != null)
{
while (fin.hasNext())
{
toName = fin.nextLine();
toStreet = fin.nextLine();
sWork = fin.nextLine();
iWork = sWork.indexOf(",");
toCity = sWork.substring(0, iWork);
iWork = iWork + 2;
toState = sWork.substring(iWork, iWork + 2);
iWork = iWork + 3;
toZip = sWork.substring(iWork);
fromName = fin.nextLine();
fromStreet = fin.nextLine();
sWork = fin.nextLine();
iWork = sWork.indexOf(",");
fromCity = sWork.substring(0, iWork);
iWork = iWork + 2;
fromState = sWork.substring(iWork, iWork + 2);
iWork = iWork + 3;
fromZip = sWork.substring(iWork);
sWork = fin.nextLine();
weight = Double.parseDouble(sWork);
ltrAra[count] = new Letter(toName, toStreet, toCity, toState, toZip, fromName, fromStreet, fromCity, fromState, fromZip, weight);
count++;
}
fin.close();
}
}
public void sortLetters()
{
SortSearchUtil.selectionSort(ltrAra);
}
public void printLetters()
{
for (Letter ltr : ltrAra)
{
System.out.println(ltr);
System.out.println();
}
}
}
My file "letters.in" looks like this:
Stu Steiner
123 Slacker Lane
Slackerville, IL 09035
Tom Capaul
999 Computer Nerd Court
Dweebsville, NC 28804-1359
0.50
Tom Capaul
999 Computer Nerd Court
Dweebsville, NC 28804-1359
Chris Peters
123 Some St.
Anytown, CA 92111-0389
1.55

You get an NPE because:
You initialize ltrAra as an array of 1000 items, but you read in fewer than 1000 items in readLetters(). So the slots at the end of the array remain null references (remember that creating an array does not populate its elements with objects). The sorting method therefore encounters null references => NPE.
Suggested solution:
You should use an ArrayList instead of an array, because it only contains the elements you actually added, so there are no trailing null slots, and its internal range check stops you from accessing more items than it holds.

In addition to the above answer, which Meno has stated well, you need to understand when you get a NullPointerException.
Your error line: if (array[indexSmallest].compareTo(array[curPos]) > 0)
If we get an NPE on this line, array[indexSmallest] must be null (the stack trace ends in selectionSort itself rather than inside compareTo).
When you invoke a method on null, you get an NPE. Hope this helps you debug down the line.
Also, one of the main reasons to choose an ArrayList over an array is that we do not know the number of elements in advance.
One more suggestion: you can create an ArrayList and then convert it to an array if you want to stick with arrays.
To convert an ArrayList of any class into an array, replace T with the respective class. For example, if you want a String array, replace T with String:
List<T> list = new ArrayList<T>();
T [] students = list.toArray(new T[list.size()]);
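As a concrete instance of the conversion above, here is a minimal sketch with String standing in for T (the question's Letter class is not shown, so it is not used here). It also shows a second option for code that must keep a fixed-size buffer: copy only the filled part before sorting. The names ListToArraySketch, sortedNames and filledPart are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListToArraySketch {
    // Converts the list to an array of exactly list.size() elements, then sorts it.
    static String[] sortedNames(List<String> names) {
        String[] array = names.toArray(new String[0]); // no trailing null slots
        Arrays.sort(array);
        return array;
    }

    // Alternative when a fixed-size buffer is kept: take only the first `count` filled slots.
    static String[] filledPart(String[] buffer, int count) {
        return Arrays.copyOf(buffer, count);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("Tom Capaul");
        names.add("Stu Steiner");
        System.out.println(Arrays.toString(sortedNames(names))); // prints [Stu Steiner, Tom Capaul]

        String[] buffer = new String[1000]; // mostly null, like ltrAra in the question
        buffer[0] = "Tom Capaul";
        buffer[1] = "Chris Peters";
        String[] used = filledPart(buffer, 2);
        Arrays.sort(used); // safe: only the two non-null entries are sorted
        System.out.println(Arrays.toString(used)); // prints [Chris Peters, Tom Capaul]
    }
}
```

Either way, the sort only ever sees elements that were actually read, which is what removes the NullPointerException.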

Getting incorrect Score using SentiWordNet

I'm doing some sentiment analysis using SentiWordNet and I referred to the post here: How to use SentiWordNet. However, I'm getting a score of 0.0 despite trying out various inputs. Is there anything I'm doing wrong here? Thanks!
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Set;
import java.util.Vector;
public class SWN3 {
private String pathToSWN = "C:\\Users\\Malcolm\\Desktop\\SentiWordNet_3.0.0\\home\\swn\\www\\admin\\dump\\SentiWordNet_3.0.0.txt";
private HashMap<String, Double> _dict;
public SWN3(){
_dict = new HashMap<String, Double>();
HashMap<String, Vector<Double>> _temp = new HashMap<String, Vector<Double>>();
try{
BufferedReader csv = new BufferedReader(new FileReader(pathToSWN));
String line = "";
while((line = csv.readLine()) != null)
{
String[] data = line.split("\t");
Double score = Double.parseDouble(data[2])-Double.parseDouble(data[3]);
String[] words = data[4].split(" ");
for(String w:words)
{
String[] w_n = w.split("#");
w_n[0] += "#"+data[0];
int index = Integer.parseInt(w_n[1])-1;
if(_temp.containsKey(w_n[0]))
{
Vector<Double> v = _temp.get(w_n[0]);
if(index>v.size())
for(int i = v.size();i<index; i++)
v.add(0.0);
v.add(index, score);
_temp.put(w_n[0], v);
}
else
{
Vector<Double> v = new Vector<Double>();
for(int i = 0;i<index; i++)
v.add(0.0);
v.add(index, score);
_temp.put(w_n[0], v);
}
}
}
Set<String> temp = _temp.keySet();
for (Iterator<String> iterator = temp.iterator(); iterator.hasNext();) {
String word = (String) iterator.next();
Vector<Double> v = _temp.get(word);
double score = 0.0;
double sum = 0.0;
for(int i = 0; i < v.size(); i++)
score += ((double)1/(double)(i+1))*v.get(i);
for(int i = 1; i<=v.size(); i++)
sum += (double)1/(double)i;
score /= sum;
String sent = "";
if(score>=0.75)
sent = "strong_positive";
else
if(score > 0.25 && score<=0.5)
sent = "positive";
else
if(score > 0 && score<=0.25) // was score>=0.25, which could never be reached for weak positives
sent = "weak_positive";
else
if(score < 0 && score>=-0.25)
sent = "weak_negative";
else
if(score < -0.25 && score>=-0.5)
sent = "negative";
else
if(score<=-0.75)
sent = "strong_negative";
_dict.put(word, score);
}
}
catch(Exception e){e.printStackTrace();}
}
public Double extract(String word)
{
Double total = new Double(0);
if(_dict.get(word+"#n") != null)
total = _dict.get(word+"#n") + total;
if(_dict.get(word+"#a") != null)
total = _dict.get(word+"#a") + total;
if(_dict.get(word+"#r") != null)
total = _dict.get(word+"#r") + total;
if(_dict.get(word+"#v") != null)
total = _dict.get(word+"#v") + total;
return total;
}
public static void main(String[] args) {
SWN3 test = new SWN3();
String sentence="Hello have a Super awesome great day";
String[] words = sentence.split("\\s+");
double totalScore = 0;
for(String word : words) {
word = word.replaceAll("([^a-zA-Z\\s])", "");
if (test.extract(word) == null)
continue;
totalScore += test.extract(word);
}
System.out.println(totalScore);
}
}
Here's the first 10 lines of SentiWordNet.txt
a 00001740 0.125 0 able#1 (usually followed by `to') having the necessary means or skill or know-how or authority to do something; "able to swim"; "she was able to program her computer"; "we were at last able to buy a car"; "able to get a grant for the project"
a 00002098 0 0.75 unable#1 (usually followed by `to') not having the necessary means or skill or know-how; "unable to get to town without a car"; "unable to obtain funds"
a 00002312 0 0 dorsal#2 abaxial#1 facing away from the axis of an organ or organism; "the abaxial surface of a leaf is the underside or side facing away from the stem"
a 00002527 0 0 ventral#2 adaxial#1 nearest to or facing toward the axis of an organ or organism; "the upper side of a leaf is known as the adaxial surface"
a 00002730 0 0 acroscopic#1 facing or on the side toward the apex
a 00002843 0 0 basiscopic#1 facing or on the side toward the base
a 00002956 0 0 abducting#1 abducent#1 especially of muscles; drawing away from the midline of the body or from an adjacent part
a 00003131 0 0 adductive#1 adducting#1 adducent#1 especially of muscles; bringing together or drawing toward the midline of the body or toward an adjacent part
a 00003356 0 0 nascent#1 being born or beginning; "the nascent chicks"; "a nascent insurgency"
a 00003553 0 0 emerging#2 emergent#2 coming into existence; "an emergent republic"

The SentiWordNet .txt file usually comes with extra non-data lines.
You need to remove the first part of it (which includes comments and instructions) and the last two lines:
#
(an empty line)
The parser doesn't know how to handle these lines; once you delete them, you'll be fine.
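Instead of editing the file by hand, the parser could skip such lines itself. The sketch below is one way to do that, assuming the tab-separated column order visible in the sample lines of the question (POS, ID, PosScore, NegScore, terms, gloss). The names SwnLineSketch and polarity are hypothetical; this is not part of the code published with SentiWordNet.

```java
import java.util.Arrays;
import java.util.List;

class SwnLineSketch {
    // Returns PosScore - NegScore for a data line, or null for comments, blank lines and malformed rows.
    static Double polarity(String line) {
        if (line == null) {
            return null;
        }
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return null; // header comments, the trailing "#" and the empty last line
        }
        String[] data = line.split("\t");
        if (data.length < 5) {
            return null; // not enough columns to hold scores and terms
        }
        try {
            return Double.parseDouble(data[2]) - Double.parseDouble(data[3]);
        } catch (NumberFormatException e) {
            return null; // score columns are not numeric
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "# a header comment line",
                "a\t00002098\t0\t0.75\tunable#1\tnot having the necessary means",
                "#",
                "");
        for (String line : lines) {
            System.out.println(polarity(line)); // prints null, -0.75, null, null on separate lines
        }
    }
}
```

In the question's constructor, a single unparsable line throws inside the try block and leaves the dictionary empty, which would explain a total of 0.0 for every input.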
