Why aren't my words coming out less than 8 characters? - java

public String compWord() throws IOException, ClassNotFoundException
{
    // Local constants
    final int MAX_COUNT = 8;
    // Local variables
    BufferedReader reader = new BufferedReader(new FileReader("dictionary.txt")); // Create a new BufferedReader, looking for dictionary.txt
    List<String> lines = new ArrayList<String>(); // New ArrayList to keep track of the lines
    String line;                                  // Current line
    Random rand = new Random();                   // New random object
    String word;                                  // The computer's word
    /********************* Start compWord *********************/
    // Start reading the txt file
    line = reader.readLine();
    // WHILE the line isn't null
    while(line != null)
    {
        // Add the line to lines list
        lines.add(line);
        // Go to the next line
        line = reader.readLine();
    }
    // Set the computers word to a random word in the list
    word = lines.get(rand.nextInt(lines.size()));
    if(word.length() > MAX_COUNT)
        compWord();
    // Return the computer's word
    return word;
}
From what I understand it should only be returning words less than 8 characters? Any idea what I am doing wrong? The if statement should re-call compWord until the word is less than 8 characters, but for some reason I'm still getting words of 10-15 characters.

Look at this code:
if(word.length() > MAX_COUNT)
    compWord();
return word;
If the word that is picked is longer than your limit, you're calling compWord recursively - but ignoring the return value, and just returning the "too long" word anyway.
Personally I would suggest that you avoid the recursion, and instead just use a do/while loop:
String word;
do
{
    word = lines.get(rand.nextInt(lines.size()));
} while (word.length() > MAX_COUNT);
return word;
Alternatively, filter earlier while you read the lines:
while(line != null) {
    if (line.length() <= MAX_COUNT) {
        lines.add(line);
    }
    line = reader.readLine();
}
return lines.get(rand.nextInt(lines.size()));
That way you're only picking out of the valid lines to start with.
Note that using Files.readAllLines is a rather simpler way of reading all the lines from a text file, by the way - and currently you're not closing the file afterwards...
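For example, a rough sketch that combines Files.readAllLines with the do/while above (assuming Java 7+, a UTF-8 dictionary, and a file small enough to hold in memory):
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Random;

public static String compWord() throws IOException {
    final int MAX_COUNT = 8;
    // Reads every line and closes the file in one call.
    List<String> lines = Files.readAllLines(Paths.get("dictionary.txt"), StandardCharsets.UTF_8);
    Random rand = new Random();
    String word;
    do {
        // note: loops forever if the dictionary has no word of MAX_COUNT letters or fewer
        word = lines.get(rand.nextInt(lines.size()));
    } while (word.length() > MAX_COUNT);
    return word;
}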

If the word is longer than 8 characters, you simply call your method again, continue, and nothing changes.
So:
You are getting all the words from the file,
Then getting a random word from the List, and putting it in the word String,
And if the word is longer than 8 characters, the method runs again.
But, at the end, it will always return the word it picked first. The problem is that you call the method recursively and do nothing with the return value: the recursive call does its work, the caller carries on, and in this case it simply returns its own word. It does not matter that the method happens to be recursive.
Instead, I would recommend you use a non-recursive solution, as Skeet recommended, or learn a bit about recursion and how to use it.
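If you do want to keep the recursion, the fix is to actually use the value the recursive call returns; a minimal sketch against the question's own method:
if (word.length() > MAX_COUNT)
    return compWord(); // use the recursive result instead of discarding it
// Return the computer's word
return word;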

Related

Input multiple lines using hasNextLine() is not working in the way that I expected it to

I'm trying to input multiple lines in Java by using hasNextLine() in a while loop.
Scanner sc = new Scanner(System.in);
ArrayList<String> lines = new ArrayList<>();
while (sc.hasNextLine()) {
    lines.add(sc.nextLine());
    System.out.println(lines);
}
The code is inside the main method, but the print call in the while loop doesn't print the last line of my input. Also, the while loop doesn't seem to break.
What should I do to print whole lines of input and finally break the while loop and end the program?
Since an answer that explains why hasNextLine() might be giving an "unexpected" result has been linked / given in a comment, instead of repeating that answer I'm giving you two examples that might give you the "expected" result. Whether either of them suits your needs really depends on what kind of input your program has to deal with.
Assuming you want the loop to be broken by an empty line:
while (true) {
    String curLine = sc.nextLine();
    if (curLine.isEmpty())
        break;
    lines.add(curLine);
    System.out.println(curLine);
}
Assuming you want the loop to be broken by two consecutive empty lines:
while (true) {
    String curLine = sc.nextLine();
    int curSize = lines.size();
    String lastLine = curSize > 0 ? lines.get(curSize - 1) : "";
    if (curLine.isEmpty() && lastLine.isEmpty())
        break;
    lines.add(curLine);
    System.out.println(curLine);
}
// lines.removeIf(e -> e.isEmpty());
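If you would rather keep hasNextLine() as the loop condition, remember that on System.in it only returns false once the input stream is closed (Ctrl+D on Linux/macOS, Ctrl+Z then Enter on Windows). A minimal sketch that prints everything once, after all input has been read:
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ReadAllLines {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        List<String> lines = new ArrayList<>();
        while (sc.hasNextLine()) {   // false only after the input stream ends
            lines.add(sc.nextLine());
        }
        System.out.println(lines);   // print once, after the loop has finished
    }
}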

Parse .csv File in java returns outofbounds exception

I have the following issue: I am trying to parse a .csv file in Java and store three specific columns of it in a two-dimensional array. The code for the method looks like this:
public static void parseFile(String filename) throws IOException{
    FileReader readFile = new FileReader(filename);
    BufferedReader buffer = new BufferedReader(readFile);
    String line;
    String[][] result = new String[10000][3];
    String[] b = new String[6];
    for(int i = 0; i<10000; i++){
        while((line = buffer.readLine()) != null){
            b = line.split(";",6);
            System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]); // Here is where the outofbounds exception occurs...
            result[i][0] = b[0];
            result[i][1] = b[3];
            result[i][2] = b[4];
        }
    }
    buffer.close();
}
I feel like I have to specify this: the .csv file is HUGE. It has 32 columns and (almost) 10,000 entries (!).
When Parsing, I keep getting the following:
XXXXX CHUNKS OF SUCCESFULLY EXTRACTED CODE
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException:3
at ParseCSV.parseFile(ParseCSV.java:24)
at ParseCSV.main(ParseCSV.java:41)
However, I realized that SOME of the content in the file has a strange format: some of the texts inside it, for instance, have newlines in them, even though no newline character seems to be involved in any way. However, if I delete those blank lines manually, the output generated (before the error message appears) adds the data to the array up until the next blank line ...
Does anyone have an idea how to fix this? Any help would be greatly appreciated...
Your first problem is that you probably have at least one blank line in your csv file. You need to replace:
b = line.split(";", 6);
with
b = line.split(";");
if (b.length < 5) {
    System.err.println("Warning, line has only " + b.length +
            " entries, so skipping it:\n" + line);
    continue;
}
If your input can legitimately have new lines or embedded semi-colons within your entries, that is a more complex parsing problem, and you are probably better off using a third-party parsing library, as there are several very good ones.
If your input is not supposed to have new lines in it, the problem probably is \r. Windows uses \r\n to represent a new line, while most other systems just use \n. If multiple people/programs edited your text file, it is entirely possible to end up with stray \r by themselves, which are not easily handled by most parsers.
An easy way to check whether that's your problem is, before you split your line, to do
line = line.replace("\r", "");
If this is a process you are repeating many times, you might need to consider using a Scanner (or library) instead to get more efficient text processing. Otherwise, you can make do with this.
When you have new lines in your CSV file, after this line
while((line = buffer.readLine()) != null){
the variable line will not contain a full CSV line but just some text without any ;
For example, if you have file
column1;column2;column
3 value
after first iteration variable line will have
column1;column2;column
after second iteration it will have
3 value
When you call "3 value".split(";", 6) it will return an array with one element, and later, when you access b[3], it will throw the exception.
The CSV format has many small details that would take a lot of time to implement yourself. This is a good article about the basic rules and examples:
http://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules_and_examples
I would recommend using a ready-made CSV parser instead, such as this one:
https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/CSVParser.html
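For example, with Apache Commons CSV the "split on ; but cope with quoted fields and embedded newlines" problem largely goes away. A rough sketch only: the delimiter and column indexes follow the question, the charset is an assumption, and the exact calls reflect the older 1.x API (newer releases prefer a builder):
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public static void parseFile(String filename) throws Exception {
    try (Reader in = Files.newBufferedReader(Paths.get(filename), StandardCharsets.UTF_8);
         CSVParser parser = CSVFormat.DEFAULT.withDelimiter(';').parse(in)) {
        for (CSVRecord record : parser) {
            if (record.size() < 5) {
                continue; // skip blank or malformed records instead of blowing up
            }
            System.out.println("ID: " + record.get(0)
                    + " Title: " + record.get(3)
                    + " Description: " + record.get(4));
        }
    }
}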
String's split(pattern, limit) method returns an array sized to the number of tokens found, up to the number specified by the limit parameter. Limit is the maximum, not the minimum, number of array elements returned.
"1,2,3" split with (",", 6) will return an array of 3 elements: "1", "2" and "3".
"1,2,3,4,5,6,7" will return 6 elements: "1", "2", "3", "4", "5" and "6,7". The last element is goofy because the split method stopped splitting after the fifth split and returned the rest of the source string as the sixth element.
An empty line is represented as an empty string (""). Splitting "" will return an array of 1 element, the empty string.
In your case, the string array created here
String[] b = new String[6];
and assigned to b is replaced by the array returned by
b = line.split(";",6);
and meets its ultimate fate at the hands of the garbage collector, unseen and unloved.
Worse, in the case of the empty lines, it's replaced by a one element array, so
System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]);
blows up when trying to access b[3].
The suggested solution is to either
while((line = buffer.readLine()) != null){
    if (line.length() != 0)
    {
        b = line.split(";",6);
        System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]); // Here is where the outofbounds exception occurs...
        ...
    }
or (better because the previous could trip over a malformed line)
while((line = buffer.readLine()) != null){
    b = line.split(";",6);
    if (b.length == 6)
    {
        System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]); // Here is where the outofbounds exception occurs...
        ...
    }
You might also want to think about the for loop around the while. I don't think it's doing you any good.
while((line = buffer.readLine()) != null)
is going to read every line in the file, so
for(int i = 0; i<10000; i++){
    while((line = buffer.readLine()) != null){
is going to read every line in the file the first time. Then it's going to make 9999 more attempts to read the file, find nothing new, and exit the while loop each time.
You are also not protected from reading more than 10000 elements: the while loop will read a 10001st line and overrun your array if there are more than 10000 lines in the file. Look into replacing the big array with an ArrayList or Vector, as they will size themselves to fit your file.
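Something along these lines (a sketch only, keeping the question's delimiter and column indexes) drops both the fixed-size array and the redundant outer loop:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public static List<String[]> parseFile(String filename) throws IOException {
    List<String[]> result = new ArrayList<>();          // grows to fit the file
    try (BufferedReader buffer = new BufferedReader(new FileReader(filename))) {
        String line;
        while ((line = buffer.readLine()) != null) {
            String[] b = line.split(";");
            if (b.length < 5) {
                continue;                                // skip blank or malformed lines
            }
            result.add(new String[] { b[0], b[3], b[4] });
        }
    }
    return result;
}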
Please check b.length>0 before accessing b[].

Reject the empty lines at end of text file while reading using Java.

I need to read a text file using Java. Not a problem. But I need to reject the empty lines at the end of the file. The file is quite large, around a million or so lines. I need to process each line, one at a time. Even if they are empty.
But, if the empty lines are at the end of the file, then I need to reject them. Note that there can be multiple empty lines at the end of the file.
Any quick solutions? I almost want to write a FileUtility.trimEmptyLinesAndEnd(File input). But I can't help feeling that someone might have written something like this already.
Any help appreciated.
Note:
I have read this link: Java: Find if the last line of a file is empty.
But this is not what I am trying to do. I need to reject multiple empty lines.
When you find an empty line, increment a counter for the number of empty lines. If the next line is also empty, increment the counter. If you reach the end of the file, just continue on with what you want to do (ignoring the empty lines you found). If you reach a non-empty line, first do whatever you do to process an empty line, repeating it for each empty line you counted. Then process the non-empty line as normal, and continue through the file. Also, don't forget to reset the empty line counter to zero.
Pseudo code:
emptyLines = 0;
while (the file has a next line) {
    if (line is empty) {
        emptyLines++;
    } else {
        if (emptyLines > 0) {
            for (i = 0; i < emptyLines; i++) {
                process empty line;
            }
            emptyLines = 0;
        }
        process line;
    }
}
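In Java that pseudocode could look roughly like this; processLine is just a placeholder for whatever you do with each line:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public static void processWithoutTrailingEmptyLines(String filename) throws IOException {
    try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
        int emptyLines = 0;
        String line;
        while ((line = br.readLine()) != null) {
            if (line.trim().isEmpty()) {
                emptyLines++;               // hold on to empty lines for now
            } else {
                for (int i = 0; i < emptyLines; i++) {
                    processLine("");        // these empties were not trailing after all
                }
                emptyLines = 0;
                processLine(line);
            }
        }
        // anything still counted in emptyLines here was trailing and is simply dropped
    }
}

private static void processLine(String line) {
    // placeholder for the real per-line processing
    System.out.println(line);
}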
You have to read all the lines in your file. You can introduce a guard variable that stores the index of the last non-empty line. At the end, return the subset from zero to that guard.
If you are processing a stream:
read line
if empty
    increase empty-lines counter
else
    if there were some empty lines
        yield as many fake empty lines as the counter stores
        reset counter
    yield line
Thanks for all the responses. I think both Vash - Damian Leszczyński and forgivenson cracked the pseudocode for this problem. I have taken that forward and am providing here the Java code for people who come looking for an answer after me.
@Test
public void test() {
    BufferedReader br = null;
    try {
        String sCurrentLine;
        StringBuffer fileContent = new StringBuffer();
        int consecutiveEmptyLineCounter = 0;
        br = new BufferedReader(new FileReader("D:\\partha\\check.txt"));
        while ((sCurrentLine = br.readLine()) != null) {
            // if this is not an empty line
            if (!(sCurrentLine.trim().length() == 0)) {
                // if there are no empty lines before this line.
                if (!(consecutiveEmptyLineCounter > 0)) {
                    // It is a non empty line, with non empty line prior to this
                    // Or it is the first line of the file.
                    // Don't do anything special with it.
                    // Appending "|" at the end just for ease of debug.
                    System.out.println(sCurrentLine + "|");
                } else {
                    // This is a non empty line, but there were empty lines before this.
                    // The consecutiveEmptyLineCounter is > 0
                    // The "fileContent" already has the previous empty lines.
                    // Add this non empty line to "fileContent" and spit it out.
                    fileContent.append(sCurrentLine);
                    System.out.println(fileContent.toString() + "#");
                    // and by the way, the counter of consecutive empty lines has to be reset.
                    // "fileContent" has to start from a clean slate.
                    consecutiveEmptyLineCounter = 0;
                    fileContent = new StringBuffer();
                }
            } else {
                // this is an empty line
                // Don't execute anything on it.
                // Just keep it in temporary "fileContent"
                // And count up the consecutiveEmptyLineCounter
                fileContent.append(sCurrentLine);
                fileContent.append(System.getProperty("line.separator"));
                consecutiveEmptyLineCounter++;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (br != null)
                br.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
Thanks for all the help.
And, what I have provided here is just one solution. If someone comes across something more clever, please share. I can't shake the feeling off that there should be some FileUtils.trimEmptyLinesAtEnd() sort of method somewhere.
Just read the File backwards. Starting from the first line you read, refrain from processing all blank lines you encounter.
Starting with the first non-blank line you encounter, and thereafter, process all lines whether or not they're blank.
The problem is "intractable" with respect to a neat solution if you read the File forward, since you can never know whether, at some point after a long run of blank lines, there might yet be a non-blank line.
If the processing of lines in order, first-to-last, matters, then there is no neat solution and anything like what you have now is about what there is.
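If reading the file backwards is inconvenient, a two-pass variant gives the same result at the cost of reading the file twice: first find the index of the last non-blank line, then process forward up to that index. A sketch, assuming the file can simply be opened twice:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public static void processWithoutTrailingBlanks(String filename) throws IOException {
    // Pass 1: find the number of the last non-blank line.
    long lastNonBlank = -1;
    long lineNo = 0;
    try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
        String line;
        while ((line = br.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                lastNonBlank = lineNo;
            }
            lineNo++;
        }
    }
    // Pass 2: process every line up to and including that one.
    try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
        String line;
        for (lineNo = 0; lineNo <= lastNonBlank && (line = br.readLine()) != null; lineNo++) {
            System.out.println(line);    // placeholder for the real processing
        }
    }
}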

Using a user inputted string of characters find the longest word that can be made

Basically I want to create a program which simulates the 'Countdown' game on Channel 4. In effect, a user must input 9 letters and the program will search for the largest word in the dictionary that can be made from these letters. I think a tree structure would be better to go with rather than hash tables. I already have a file which contains the words in the dictionary and will be using file I/O.
This is my file io class:
public static void main(String[] args){
    FileIO reader = new FileIO();
    String[] contents = reader.load("dictionary.txt");
}
This is what I have so far in my Countdown class
public static void main(String[] args) throws IOException{
    Scanner scan = new Scanner(System.in);
    String letters = scan.nextLine();
}
I get totally lost from here. I know this is only the start but I'm not looking for answers. I just want a small bit of help and maybe a pointer in the right direction. I'm new to Java and found this question in an interview book and thought I should give it a go.
Thanks in advance
Welcome to the world of Java :)
The first thing I see is that you have two main methods; you don't actually need that. Your program will have a single entry point in most cases, and from there it does all its logic and handles user input and everything else.
You're thinking of a tree structure, which is good, though there might be a better structure to store this in. Try a trie: http://en.wikipedia.org/wiki/Trie
What your program has to do is read all the words from the file line by line, and in this process build your data structure, the tree. When that's done you can ask the user for input and after the input is entered you can search the tree.
Since you asked specifically not to provide answers I won't put code here, but feel free to ask if you're unclear about something
There are only about 800,000 words in the English language, so an efficient solution would be to store those 800,000 words as 800,000 arrays of 26 one-byte integers, each counting how many times a given letter is used in the word. For a 9-character input you convert the query to the same 26-count format, and a word can be formed from the query letters if the query vector is greater than or equal to the word's vector component-wise. You could easily process on the order of 100 queries per second this way.
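A sketch of that counting idea; the names are illustrative and the dictionary is assumed to be a list of lowercase a-z words:
import java.util.List;

public class CountdownSolver {
    // Turn a lowercase word into 26 letter counts.
    private static int[] counts(String s) {
        int[] c = new int[26];
        for (char ch : s.toCharArray()) {
            c[ch - 'a']++;
        }
        return c;
    }

    // True if 'word' can be built from the letters described by 'available'.
    private static boolean fits(int[] word, int[] available) {
        for (int i = 0; i < 26; i++) {
            if (word[i] > available[i]) {
                return false;
            }
        }
        return true;
    }

    public static String longestWord(List<String> dictionary, String letters) {
        int[] available = counts(letters);
        String best = "";
        for (String word : dictionary) {
            if (word.length() > best.length() && fits(counts(word), available)) {
                best = word;
            }
        }
        return best;
    }
}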
I would write a program that starts with all the two-letter words, then does the three-letter words, the four-letter words and so on.
When you do the two-letter words, you'll want some way of picking the first letter, then picking the second letter from what remains. You'll probably want to use recursion for this part. Lastly, you'll check it against the dictionary. Try to write it in a way that means you can re-use the same code for the three-letter words.
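A rough sketch of that pick-and-recurse idea (it does not go strictly two letters, then three, but the structure is the same; the dictionary is assumed to be loaded into a HashSet of lowercase words for cheap lookups):
import java.util.Set;

public class LetterPicker {
    private final Set<String> dictionary;
    private String best = "";

    public LetterPicker(Set<String> dictionary) {
        this.dictionary = dictionary;
    }

    // 'prefix' is the word built so far, 'remaining' the letters still available.
    private void extend(String prefix, String remaining) {
        if (dictionary.contains(prefix) && prefix.length() > best.length()) {
            best = prefix;
        }
        for (int i = 0; i < remaining.length(); i++) {
            // pick letter i, then recurse on whatever is left
            extend(prefix + remaining.charAt(i),
                   remaining.substring(0, i) + remaining.substring(i + 1));
        }
    }

    public String longestWord(String letters) {
        best = "";
        extend("", letters);
        return best;
    }
}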
I believe the power of regular expressions would come in handy in your case:
1) Create a regular expression string with a character class like ^[abcdefghi]*$, with your letters inside instead of "abcdefghi".
2) Use that regular expression as a filter to get a strings array from your text file.
3) Sort it by length. The longest word is what you need!
Check the Regular Expressions Reference for more information.
UPD: Here is a good Java Regex Tutorial.
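A sketch of that approach, assuming a plain-text dictionary.txt and letters made up of lowercase a-z only. Note that a character class on its own does not stop a letter being used more often than it was entered, so treat this as a first approximation:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

public static String longestMatch(String letters) throws IOException {
    // e.g. letters = "abcdefghi" gives the pattern ^[abcdefghi]*$
    Pattern p = Pattern.compile("^[" + letters + "]*$");
    List<String> words = Files.readAllLines(Paths.get("dictionary.txt"));
    return words.stream()
            .filter(w -> p.matcher(w).matches())
            .max(Comparator.comparingInt(String::length))
            .orElse("");
}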
A first approach could be using a tree with all the letters present in the wordlist.
If one node is the end of a word, then is marked as an end-of-word node.
In the picture above, the longest word is banana. But there are other words, like ball, ban, or banal.
So, a node must have:
A character
If it is the end of a word
A list of children. (max 26)
The insertion algorithm is very simple: In each step we "cut" the first character of the word until the word has no more characters.
public class TreeNode {
    public char c;
    private boolean isEndOfWord = false;
    private TreeNode[] children = new TreeNode[26];

    public TreeNode(char c) {
        this.c = c;
    }

    public void put(String s) {
        if (s.isEmpty())
        {
            this.isEndOfWord = true;
            return;
        }
        char first = s.charAt(0);
        int pos = position(first);
        if (this.children[pos] == null)
            this.children[pos] = new TreeNode(first);
        this.children[pos].put(s.substring(1));
    }

    public String search(char[] letters) {
        String word = "";
        String w = "";
        for (int i = 0; i < letters.length; i++)
        {
            TreeNode child = children[position(letters[i])];
            if (child != null)
                w = child.search(letters);
            // this is not efficient. It should be optimized.
            if (w.contains("%")
                    && w.substring(0, w.lastIndexOf("%")).length() > word.length())
                word = w;
        }
        // if a node is end-of-word we add the special char '%'
        return c + (this.isEndOfWord ? "%" : "") + word;
    }

    // if 'a' returns 0, if 'b' returns 1...etc
    public static int position(char c) {
        return ((byte) c) - 97;
    }
}
Example:
public static void main(String[] args) {
    // root
    TreeNode t = new TreeNode('R');
    // for skipping words with "'" in the wordlist
    Pattern p = Pattern.compile(".*\\W+.*");
    int nw = 0;
    try (BufferedReader br = new BufferedReader(new FileReader(
            "files/wordsEn.txt")))
    {
        for (String line; (line = br.readLine()) != null;)
        {
            if (p.matcher(line).find())
                continue;
            t.put(line);
            nw++;
        }
        // line is not visible here.
        br.close();
        System.out.println("number of words : " + nw);
        String res = null;
        // substring(1) because of the root
        res = t.search("vuetsrcanoli".toCharArray()).substring(1);
        System.out.println(res.replace("%", ""));
    }
    catch (Exception e)
    {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
Output:
number of words : 109563
counterrevolutionaries
Notes:
The wordlist is taken from here
the reading part is based on another SO question : How to read a large text file line by line using Java?

Skipping line with BufferedReader

While reading a file with a BufferedReader, I want it to skip blank lines and lines that start with '#'. Ultimately, each individual character is then added to an ArrayList:
inputStream = new BufferedReader(new FileReader(filename));
int c = 0;
while((c = inputStream.read()) != -1) {
    char face = (char) c;
    if (face == '#') {
        //skip line (continue reading at first char of next line)
    }
    else {
        faceList.add(face);
    }
}
Unless I'm mistaken, BufferedReader skips blank lines automatically. Other than that, how would I go about doing this?
Would I just skip()? The length of the lines may vary, so I don't think that would work.
Do not attempt to read the file a character at a time.
Read in one complete line, into a String, on each iteration of your main loop. Next, check if it matches the specific patterns you want to ignore (empty, blanks only, starting with a #, etc). Once you have a line you want to process, only then iterate over the String a character at a time if you need to.
This makes checking for and ignoring blank lines and lines matching a pattern MUCH easier.
while ((line = in.readLine()) != null)
{
    String temp = line.trim();
    if (temp.isEmpty() || temp.startsWith("#"))
        /* ignore line */ ;
    else
        ...
}
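Putting that together for your case: a sketch that skips blank lines and '#' lines and only then adds each remaining character to the list (faceList is assumed to be a List<Character>, as in your code):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public static List<Character> readFaces(String filename) throws IOException {
    List<Character> faceList = new ArrayList<>();
    try (BufferedReader in = new BufferedReader(new FileReader(filename))) {
        String line;
        while ((line = in.readLine()) != null) {
            String temp = line.trim();
            if (temp.isEmpty() || temp.startsWith("#")) {
                continue;                     // skip blank lines and comment lines
            }
            for (char face : line.toCharArray()) {
                faceList.add(face);           // only characters from kept lines
            }
        }
    }
    return faceList;
}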
Use continue. This will continue to the next item in any loop.
if (face == '#') {
    continue;
}
else {
    faceList.add(face);
}
