Is there a reason .contains() would not work with scanner? - java

I am working on a linear search problem that takes a file of names and compares it to a phonebook file of names and numbers. My only task right now is to see how many names are in the phonebook file. Everything works as expected up until the if statement in my main method, but for the life of me, I cannot figure out what I am doing wrong. Through testing, I can print out all the lines in both files, so I know I am reading the files correctly. Output should be 500 / 500 as all the names are in the phonebook file of over a million lines. Please help.
package phonebook;
import java.util.Objects;
import java.util.Scanner;
import java.io.File;
import java.io.FileNotFoundException;
public class Main {
final static String NAME_PATH = "C:\\Users\\{user}\\Downloads\\find.txt";
final static String PHONEBOOK_PATH = "C:\\Users\\{user}\\Downloads\\directory.txt";
private static String[] namesList(File file) {
int count = 0;
try (Scanner scanner = new Scanner(file)) {
while (scanner.hasNextLine()) {
scanner.nextLine();
count++;
}
String[] names = new String[count];
Scanner sc = new Scanner(file);
for (int i = 0; i < count; i++) {
names[i] = sc.nextLine();
}
return names;
} catch (FileNotFoundException e) {
System.out.printf("File not found: %s", NAME_PATH);
return null;
}
}
private static String timeDifference(long timeStart, long timeEnd) {
long difference = timeEnd - timeStart;
long minutes = (difference / 1000) / 60;
long seconds = (difference / 1000) % 60;
long milliseconds = difference - ((minutes * 60000) + (seconds * 1000));
return "Time taken: " + minutes + " min. " + seconds + " sec. " +
milliseconds + " ms.";
}
public static void main(String[] args) {
File findFile = new File(NAME_PATH);
File directoryFile = new File(PHONEBOOK_PATH);
String[] names = namesList(findFile);
int count = 0;
try (Scanner scanner = new Scanner(directoryFile)) {
System.out.println("Start searching...");
long timeStart = System.currentTimeMillis();
for (int i = 0; i < Objects.requireNonNull(names).length; i++) {
while (scanner.hasNextLine()) {
if (scanner.nextLine().contains(names[i])) {
count++;
break;
}
}
}
long timeEnd = System.currentTimeMillis();
System.out.print("Found " + count + " / " + names.length + " entries. " +
timeDifference(timeStart, timeEnd));
} catch (FileNotFoundException e) {
System.out.printf("File not found: %s", PHONEBOOK_PATH);
}
}
}
Output:
Start searching...
Found 1 / 500 entries. Time taken: 0 min. 0 sec. 653 ms.
Process finished with exit code 0

The problem is how you are searching. If you want to search iteratively then you need to re-start the iteration for each name. Otherwise, you are merely searching forward in the phonebook. If the second name in the name list appears before the first name then you will only find one name since you will have exhausted the phonebook before finding anything.
However, repeatedly reading the phonebook file is a costly endeavor. Instead, load the phone list (as you have done for the name list) and then you can iteratively search that list for each element in the name list. The following examples assume you are using List rather than arrays. Using for-each loops to make it obvious what is going on (versus using Stream API).
List<String> names = loadNames();
// each phonebook entry contains the name and the phone number in one string
List<String> phonebook = loadPhonebook();
int numFound = 0;
for (String name : names) {
for (String entry : phonebook) {
if (entry.contains(name)) {
++numFound;
break; // count each name at most once, then move on to the next name
}
}
}
However, this is still an expensive task because you are repeatedly doing nested iterations. Depending on the format of the phonebook file, you should be able to parse out the names and store them in a TreeSet. Each lookup is then O(log n) rather than a scan of the whole list (a HashSet would make it constant time on average).
List<String> names = loadNames();
// phonebookNames are just the names - the phone number has been stripped away
TreeSet<String> phonebookNames = loadPhonebookNames();
int numFound = 0;
for (String name : names) {
if (phonebookNames.contains(name)) {
++numFound;
}
}
Presumably, your assignment will eventually want to use the phone number for something so you probably don't want to drop that on the floor. Instead of parsing out just the name, you can capture the name and the phone number using a Map (key=name, value=phone number). Then you can count the presence of names thusly.
List<String> names = loadNames();
// phonebook is a Map of phone number values keyed on name
Map<String,String> phonebook = loadPhonebook();
int numFound = 0;
for (String name : names) {
if (phonebook.containsKey(name)) {
++numFound;
}
}
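The set-based lookup above can be sketched end to end as follows. This is only a minimal illustration: the entry format ("<number> <name>"), the class name PhonebookLookup and the helper nameOf are assumptions, not something taken from the assignment. Note that it matches whole names exactly, which is stricter than the substring match done by contains() on a line.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PhonebookLookup {
    // ASSUMPTION: each entry looks like "<number> <name>", e.g. "5551234 Alice Smith".
    static String nameOf(String entry) {
        int firstSpace = entry.indexOf(' ');
        return firstSpace < 0 ? entry.trim() : entry.substring(firstSpace + 1).trim();
    }

    // Counts how many of the given names appear as a name in the phonebook.
    static int countFound(List<String> names, List<String> phonebookEntries) {
        Set<String> phonebookNames = new HashSet<>();
        for (String entry : phonebookEntries) {
            phonebookNames.add(nameOf(entry));
        }
        int numFound = 0;
        for (String name : names) {
            if (phonebookNames.contains(name.trim())) {
                ++numFound;
            }
        }
        return numFound;
    }

    public static void main(String[] args) {
        List<String> phonebook = List.of("5551234 Alice Smith", "5559876 Bob Jones");
        List<String> names = List.of("Bob Jones", "Carol White", "Alice Smith");
        // prints "Found 2 / 3"
        System.out.println("Found " + countFound(names, phonebook) + " / " + names.size());
    }
}
```

The phonebook is read once, so the cost is one pass over the file plus one hash lookup per name.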

You move forward in the file for every name (each call to nextLine consumes a line), and the scanner is never reset. You should loop over the names for each line instead.
In your code, if the first name (names[0]) is on the last line of the file, the scanner is already at the end of the file after the first iteration, so when you search for the second name there are no lines left to read.
Try something like this:
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
for (int i = 0; i < Objects.requireNonNull(names).length; i++) {
if (line.contains(names[i])) {
count++;
break;
}
}
}
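One caveat with a single pass like this: if the same name can occur on more than one phonebook line, counting matching lines would count that name more than once. A possible variation, sketched here under that assumption, keeps one flag per name so each name is counted at most once. The class and method names are made up for the example.

```java
import java.util.List;
import java.util.Scanner;

final class SinglePassSearch {
    // Reads the phonebook once and counts how many distinct names were seen.
    static int countNamesFound(Scanner phonebook, List<String> names) {
        boolean[] found = new boolean[names.size()];
        int count = 0;
        // Stop early once every name has been found.
        while (phonebook.hasNextLine() && count < names.size()) {
            String line = phonebook.nextLine();
            for (int i = 0; i < names.size(); i++) {
                if (!found[i] && line.contains(names.get(i))) {
                    found[i] = true;
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A Scanner over a String stands in for the phonebook file here.
        Scanner phonebook = new Scanner("111 Bob Jones\n222 Alice Smith\n333 Bob Jones\n");
        List<String> names = List.of("Alice Smith", "Bob Jones", "Carol White");
        // prints "Found 2 / 3"
        System.out.println("Found " + countNamesFound(phonebook, names) + " / " + names.size());
    }
}
```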

Related

Find a string in a 2D array and then go to corresponding column

So I'm converting a CSV into an array. The first row of the CSV consists of titles that describe what is in each column. In my case: product ID | product name | product cost | quantity
I'm trying to go through the array, find the string item1 and then go to that item's quantity, which is in the same line but in a different column.
For example:
product ID | product name | product cost | quantity
-----001----- | -----item1----- | -----5.99----- | -----3-----
-----002----- | -----item2----- | -----2.99----- | -----5-----
So I want to go through this array, find the string item1 in line index 1, then go to column index 3 and extract the quantity into a variable, so that I can ultimately print out "there are only 3 item1's left" or something of the sort.
This is what I got so far:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
public class test2 {
public static List<List<String>> csvToArray() {
String fileName = "c:\\temp\\test.csv";
File file = new File(fileName);
// this gives you a 2-dimensional array of strings
List<List<String>> lines = new ArrayList<>();
Scanner inputStream;
try {
inputStream = new Scanner(file);
while (inputStream.hasNext()) {
String line = inputStream.next();
String[] values = line.split(",");
// this adds the currently parsed line to the 2-dimensional string array
lines.add(Arrays.asList(values));
}
inputStream.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
return lines;
}
public static void printArray(List<List<String>> lines){
int lineNo = 1;
for (List<String> line : lines) {
int columnNo = 1;
for (String value : line) {
System.out.println("Line " + lineNo + " Column " + columnNo + ": " + value);
columnNo++;
}
lineNo++;
}
}
public static void main(String[] args) {
csvToArray();
printArray(csvToArray());
}
}
As you can see, in the method printArray I'm just printing out the array to get a reference of where I am, but once I try to add ifs I get lost.
Any help would be great :)
Determine the indexes of the columns of interest and iterate through the lines, printing and feeding a variable for future use along the way.
Creating a class to represent a csv line seems to be over-engineering here but depends on the requirements, of course.
Code is not tested.
/*
Preprocessing:
Determine column indexes of product name and quantity.
Use the information available.
Simple case: column indices are known beforehand.
This code scans the csv header lines for given strings.
*/
final String s_HEADER_PRODUCT = "product name";
final String s_HEADER_QUANTITY = "quantity";
HashMap<String, Integer> quantities = new HashMap<String, Integer>(); // generics need the wrapper type, not int
int idxProduct = -1;
int idxQuantity = -1;
List<String> headers = lines.get(0); // lines is a List, so use get() rather than []
int columnNo = 0;
for ( columnNo = 0; columnNo < headers.size(); columnNo++ ) {
if ((idxProduct < 0) && headers.get(columnNo).equals( s_HEADER_PRODUCT )) { idxProduct = columnNo; }
if ((idxQuantity < 0) && headers.get(columnNo).equals( s_HEADER_QUANTITY )) { idxQuantity = columnNo; }
if ((idxProduct >= 0) && (idxQuantity >= 0)) {
break;
}
}
/*
Print out.
After the loop, 'quantities' will hold a map of product names onto quantities.
Assumptions:
- Quantities are integer values.
- Product names are unique within the file.
*/
int lineNo = 0;
for (List<String> line : lines) {
if (lineNo > 0) { // skip header line
System.out.println("Item '" + line.get(idxProduct) + "': " + line.get(idxQuantity) + " specimens left.");
quantities.put ( line.get(idxProduct), Integer.parseInt ( line.get(idxQuantity) ) );
}
lineNo++;
}
The best solution is to map each line to a "Product" object.
This new class will contain attributes like productID, productName, productCost and quantity.
Once each line is mapped to a Product object, you just have to find the product with the productName you want, and then you can access its other properties easily.
It would be better to use List<String[]> instead of List<List<String>>.
But for your problem, you can do something like this:
for (int i = 0; i < lines.size(); i++)
System.out.println("There are only " + lines.get(i).get(3).replace("-", "") + " " + lines.get(i).get(1).replace("-", "") + "'s left");
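The "Product" object idea mentioned above could look roughly like this. It is a sketch only: the class and field names are invented for the example, and it assumes plain comma-separated lines with one header row and no quoted commas.

```java
import java.util.ArrayList;
import java.util.List;

final class ProductLookup {
    // Hypothetical value class: one instance per CSV data row.
    static final class Product {
        final String id;
        final String name;
        final double cost;
        final int quantity;

        Product(String id, String name, double cost, int quantity) {
            this.id = id;
            this.name = name;
            this.cost = cost;
            this.quantity = quantity;
        }
    }

    // ASSUMPTION: columns are id, name, cost, quantity, in that order.
    static List<Product> parse(List<String> csvLines) {
        List<Product> products = new ArrayList<>();
        for (int i = 1; i < csvLines.size(); i++) { // index 0 is the header row
            String[] v = csvLines.get(i).split(",");
            products.add(new Product(v[0].trim(), v[1].trim(),
                    Double.parseDouble(v[2].trim()), Integer.parseInt(v[3].trim())));
        }
        return products;
    }

    // Returns the first product with the given name, or null if there is none.
    static Product findByName(List<Product> products, String name) {
        for (Product p : products) {
            if (p.name.equals(name)) {
                return p;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "product ID,product name,product cost,quantity",
                "001,item1,5.99,3",
                "002,item2,2.99,5");
        Product item = findByName(parse(lines), "item1");
        // prints "There are only 3 item1's left"
        System.out.println("There are only " + item.quantity + " " + item.name + "'s left");
    }
}
```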

Finding Most Frequent Element(s) In A File Of Integers

I am working on a program to find the most frequent element(s) in a text file. Thus far I have made the file read into a List then iterate through the list to find the occurrences of every value and map them in a SortedMap.
The issue is occurring with files where every digit occurs equally. My Map is not filling with all the data and will only contain one of the digits at the end.
Here is my code:
public class FileAnalyzer {
public static void main(String[] args) throws IOException, FileNotFoundException {
System.out.print("Please Enter A File Name: ");
String file = new Scanner(System.in).nextLine();
final long startTime = System.currentTimeMillis();
BufferedReader reader = new BufferedReader(new FileReader(file));
List<Integer> numbers = new ArrayList<>();
SortedMap<Integer, Integer> sortedMap = new TreeMap<>();
String line;
while ((line = reader.readLine()) != null) {
numbers.add(Integer.parseInt(line));
}
Collections.sort(numbers);
int frequency = 0;
int tempNum = 0;
for (int i = 0; i < numbers.size(); i++) {
if (tempNum == numbers.get(i)) {
frequency++;
} else {
if (frequency != 0) {
sortedMap.put((frequency+1), tempNum);
}
frequency = 0;
tempNum = numbers.get(i);
}
}
if (frequency !=0) {
sortedMap.put((frequency+1), tempNum);
}
final long duration = System.currentTimeMillis() - startTime;
System.out.println(sortedMap);
System.out.println("Runtime: " + duration + " ms\n");
System.out.println("Least Frequent Digit(s): " + sortedMap.get(sortedMap.firstKey()) + "\nOccurences: " + sortedMap.firstKey());
}
}
Also this is the text file I am running into issues when reading from:
1
2
1
1
2
1
1
2
1
2
2
2
Thanks in advance!
You should look up the Java Documentation for TreeMap. It is designed to not store duplicate keys, so since you are sorting on frequency as a key, values with the same frequency will be overwritten in your map!
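One way around the overwriting, sketched below, is to key the map on the number and store its count as the value, then collect every number whose count equals the maximum. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MostFrequent {
    // Returns all numbers that share the highest occurrence count, in ascending order.
    static List<Integer> mostFrequent(List<Integer> numbers) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int n : numbers) {
            counts.merge(n, 1, Integer::sum); // number -> occurrences
        }
        int max = 0;
        for (int c : counts.values()) {
            max = Math.max(max, c);
        }
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() == max) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // The sample file from the question: 1 and 2 each occur six times.
        List<Integer> numbers = List.of(1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 2, 2);
        System.out.println(mostFrequent(numbers)); // prints [1, 2]
    }
}
```

Because the keys are the numbers themselves, two numbers with the same frequency no longer collide.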

Issue iterating through two arraylists

EDIT: Thanks so much for all the really quick feedback. Wow. I did just paste it all for you instead of just those two for loops. Thanks.
This may have been totally answered before. I have read SO for the last few years but this is my first post. I have been using the site and others to help solve this so my apologies in advance if this has been answered!
I am iterating through two arraylists. One is derived from user input; the other is a dictionary file converted into an arraylist. I am trying to compare a word in the input with a dictionary word. The input list and the dictionary list are valid, and if I simply iterate through them, they contain what they should (so that isn't the issue). I assume my issue is somewhere in how I am handling the iteration. I'm a fairly novice Java programmer so please go easy on me.
Thanks
public String isSub(String x) throws FileNotFoundException, IOException {
//todo handle X
String out = "**********\nFor input \n" + x + "If you're reading this no match was found.\n**********";
String dictionary;
boolean solve = true;
/// Get dictionary
dictMaker newDict = new dictMaker();
dictionary = newDict.arrayMaker();
List<String> myDict = new ArrayList<String>(Arrays.asList(dictionary.split(",")));
List<String> input = new ArrayList<String>(Arrays.asList(x.split(" ")));
List<String> results = new ArrayList<String>();
//results = input;
String currentWord;
String match = "";
String checker = "";
String fail="";
//Everything to break sub needs to happen here.
while (solve) {
for(int n = 0; n < input.size(); n++) { //outside FOR (INPUT)
if(!fail.equals("")) results.add(fail);
checker = input.get(n).trim();
for(int i = 0; i < myDict.size(); i++) { //inside FOR (dictionary)
currentWord = myDict.get(i).trim();
System.out.print(checker + " " + currentWord + "\n");
if(checker.equals(currentWord)) {
match = currentWord;
results.add(currentWord);
fail="";
} //end if
else {
fail = "No match for " + checker;
}
}//end inside FOR (dictionary)
} //END OUTSIDE FOR (input)
solve=false;
} //end while
out = results.toString();
return out;
}
Output results for input "test tester asdasdfasdlfk"
[test, No match for test, tester, No match for tester]
Carl Manaster gave the correct explanation.
Here's an improved version of your code:
for (int n = 0; n < input.size(); n++) { //outside FOR (INPUT)
String checker = input.get(n).trim();
boolean match = false;
for (int i = 0; i < myDict.size(); i++) { //inside FOR (dictionary)
String currentWord = myDict.get(i).trim();
System.out.print(checker + " " + currentWord + "\n");
if (checker.equals(currentWord)) {
match = true;
results.add(currentWord);
break;
} //end if
} //end inside FOR (dictionary)
if (!match) {
results.add("No match for " + checker);
}
} //END OUTSIDE FOR (input)
Also, consider using a HashMap instead of an ArrayList to store the dictionary and trim the words when you store them to avoid doing it in each pass.
It looks as though every word in input gets compared to every word in your dictionary. So for every word that doesn't match, you get a fail (although you only write the last failure in the dictionary to the results). The problem appears to be that you keep looping even after you have found the word. To avoid this, you probably want to add break to the success case:
if (checker.equals(currentWord)) {
match = currentWord;
results.add(currentWord);
fail = "";
break;
} else {
fail = "No match for " + checker;
}
If you are using a dictionary, you should get it with keys not with index. So it should be
if(myDict.containsKey(checker)){
String currentWord =myDict.get(checker);
System.out.print(checker + " " + currentWord + "\n");
match = currentWord;
results.add(currentWord);
fail = "";
}
else {
fail = "No match for " + checker;
}
I think your code should look more or less like the following.
ArrayList<String> input= new ArrayList<String>();
input.add("ahmet");
input.add("mehmet");
ArrayList<String> results= new ArrayList<String>();
Map<String, String> myDict = new HashMap<String, String>();
myDict.put("key", "ahmet");
myDict.put("key2", "mehmet");
String match="";
String fail="";
for (int n = 0; n < input.size(); n++) { //outside FOR (INPUT)
if (!fail.equals(""))
results.add(fail);
String checker = input.get(n).trim();
for (int i = 0; i < myDict.size(); i++) { //inside FOR (dictionary)
// String currentWord = myDict.get(i).trim();
if(myDict.containsKey(checker)){
String currentWord =myDict.get(checker);
System.out.print(checker + " " + currentWord + "\n");
match = currentWord;
results.add(currentWord);
fail = "";
}
else {
fail = "No match for " + checker;
}
} // end inside FOR (dictionary)
} // end outside FOR (input)
// solve = false; I dont know what is this
//} //end while. no while in my code
return results.toString();
You should put the dictionary into a HashSet and trim the words as you add them. Then you only need to loop over the input list and check dict.contains(inputWord). This avoids running a potentially huge dictionary loop for every input word.
Untested brain dump:
HashSet<String> dictionary = readDictionaryFiles(...);
List<String> input = getInput();
for (String inputString : input)
{
if (dictionary.contains(inputString.trim()))
{
result.add(inputString);
}
}
out = result.toString()
....
And a solution similar to the original posting. The unnecessary loop index variables are removed:
for (String checker : input)
{ // outside FOR (INPUT)
fail = "No match for " + checker;
for (String currentWord : myDict)
{ // inside FOR (dictionary)
System.out.print(checker + " " + currentWord + "\n");
if (checker.equals(currentWord))
{
match = currentWord;
results.add(currentWord);
fail = null;
break;
}
} // end inside FOR (dictionary)
if (fail != null)
{
results.add(fail);
}
} // end outside FOR (input)
solve = false;
return results.toString();
The trim should be done when the elements are added to the list. Trimming the dictionary values on every pass is unnecessary overhead, and so is the inner loop itself. The complexity of the task can be reduced if the dictionary data structure is changed from a List to a Set.
Adding the result of "fail" is moved to the end of the outer loop. Otherwise the result of the last input string is not added to the result list.
The following code is terrible:
else {
fail = "No match for " + checker;
}
The checker does not change within the dictionary loop, yet the fail string is constructed again every time the checker and a dictionary value do not match.
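Putting the Set suggestion together, a minimal sketch might look like this. It assumes the dictionary arrives as one comma-separated string, as in the question; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DictionaryMatcher {
    // Trims each word once, while the set is being built.
    static Set<String> buildDictionary(String commaSeparatedWords) {
        Set<String> dictionary = new HashSet<>();
        for (String word : commaSeparatedWords.split(",")) {
            String trimmed = word.trim();
            if (!trimmed.isEmpty()) {
                dictionary.add(trimmed);
            }
        }
        return dictionary;
    }

    // One set lookup per input word; no inner loop over the dictionary.
    static List<String> match(String input, Set<String> dictionary) {
        List<String> results = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {
            if (dictionary.contains(word)) {
                results.add(word);
            } else {
                results.add("No match for " + word);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> dictionary = buildDictionary("test, tester ,apple");
        // prints [test, tester, No match for asdasdfasdlfk]
        System.out.println(match("test tester asdasdfasdlfk", dictionary));
    }
}
```

Each input word produces exactly one entry in the result, so the "No match" message cannot be added for a word that was in fact found.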

Problems with the Iterator throwing NoSuchElementException

I am trying to write a program that randomizes groups of people. I am experiencing problems with the Iterator.
Here is the code:
@SuppressWarnings("deprecation")
public static void results(List<String> nameslist) {
Scanner scan = new Scanner(System.in);
int groups = 0;
int count = nameslist.size();
int done=0;
do{
System.out.println("How many people do you want per group?");
groups = scan.nextInt();
} while(groups == 0);
Iterator itr = nameslist.listIterator();
int peopledone=0;
while(peopledone<count){
int groupsdone = 0;
while (groupsdone <= groups){
groupsdone++;
peopledone = 0;
System.out.println("Group "+groupsdone+":");
while (peopledone <= groups){
try{
Object obj = itr.next();
System.out.println(obj);
peopledone++;
}catch (NoSuchElementException e){
System.out.println("Error");
Thread.currentThread().stop();
}
}
}
}
A few things to note:
nameslist is a list of letters (a-f) that I put together for testing purposes. Normally, they would be names of people in a class.
I am trying to get it to just list off names until it runs out.
Thanks so much!
You're getting the NoSuchElementException because, due to the nested loops, you are doing too many iterations. Once you reach the end of the list, calling next() on the iterator again throws that exception.
Unless I'm misunderstanding what you're trying to do, this should work (there's probably a more elegant way but it at least corrects your issue):
public static void results(List<String> namesList)
{
Scanner scan = new Scanner(System.in);
int namesPerGroup = 0;
while (namesPerGroup == 0)
namesPerGroup = scan.nextInt();
int group = 0;
int namesInGroup = 0;
System.out.println("Group " + group + ": ");
for (String name : namesList)
{
if (namesInGroup == namesPerGroup)
{
group++;
namesInGroup = 0;
System.out.println("Group " + group + ": ");
}
System.out.println(name);
namesInGroup++;
}
}
You are iterating more times than there are list elements - be sure how many times your loop is going to loop.
What you are trying to do could and should be done in two lines:
Collections.shuffle(nameslist);
List<String> result = nameslist.subList(0, Math.min(count, nameslist.size()));
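If the goal is to split everyone into random groups rather than pick a single group, the shuffle-then-subList idea can be extended as in the sketch below. The method name randomGroups and its parameters are assumptions made for the example. It copies the list first so the caller's order is left alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class GroupMaker {
    // Shuffles a copy of the names and cuts it into consecutive groups.
    static List<List<String>> randomGroups(List<String> names, int perGroup, Random rng) {
        if (perGroup <= 0) {
            throw new IllegalArgumentException("perGroup must be positive");
        }
        List<String> shuffled = new ArrayList<>(names);
        Collections.shuffle(shuffled, rng);
        List<List<String>> groups = new ArrayList<>();
        for (int start = 0; start < shuffled.size(); start += perGroup) {
            // The last group may be smaller than perGroup.
            int end = Math.min(start + perGroup, shuffled.size());
            groups.add(new ArrayList<>(shuffled.subList(start, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "b", "c", "d", "e", "f");
        List<List<String>> groups = randomGroups(names, 4, new Random());
        for (int i = 0; i < groups.size(); i++) {
            System.out.println("Group " + (i + 1) + ": " + groups.get(i));
        }
    }
}
```

The loop bound is derived from the list size, so it cannot run past the end of the list the way the nested iterator loops did.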

Java Dictionary Searcher

I am trying to implement a program that will take a users input, split that string into tokens, and then search a dictionary for the words in that string. My goal for the parsed string is to have every single token be an English word.
For Example:
Input:
aman
Split Method:
a man
a m an
a m a n
am an
am a n
ama n
Desired Output:
a man
I currently have this code which does everything up until the desired output part:
import java.util.Scanner;
import java.io.*;
public class Words {
public static String[] dic = new String[80368];
public static void split(String head, String in) {
// head + " " + in is a segmentation
String segment = head + " " + in;
// count number of dictionary words
int count = 0;
Scanner phraseScan = new Scanner(segment);
while (phraseScan.hasNext()) {
String word = phraseScan.next();
for (int i=0; i<dic.length; i++) {
if (word.equalsIgnoreCase(dic[i])) count++;
}
}
System.out.println(segment + "\t" + count + " English words");
// recursive calls
for (int i=1; i<in.length(); i++) {
split(head+" "+in.substring(0,i), in.substring(i,in.length()));
}
}
public static void main (String[] args) throws IOException {
Scanner scan = new Scanner(System.in);
System.out.print("Enter a string: ");
String input = scan.next();
System.out.println();
Scanner filescan = new Scanner(new File("src:\\dictionary.txt"));
int wc = 0;
while (filescan.hasNext()) {
dic[wc] = filescan.nextLine();
wc++;
}
System.out.println(wc + " words stored");
split("", input);
}
}
I know there are better ways to store the dictionary (such as a binary search tree or a hash table), but I don't know how to implement those anyway.
I am stuck on how to implement a method that would check the split string to see if every segment was a word in the dictionary.
Any help would be great,
Thank you
Splitting the input string every possible way is not going to finish in a reasonable amount of time if you want to support 20 or more characters. Here's a more efficient approach, comments inline:
public static void main(String[] args) throws IOException {
// load the dictionary into a set for fast lookups
Set<String> dictionary = new HashSet<String>();
Scanner filescan = new Scanner(new File("dictionary.txt"));
while (filescan.hasNext()) {
dictionary.add(filescan.nextLine().toLowerCase());
}
// scan for input
Scanner scan = new Scanner(System.in);
System.out.print("Enter a string: ");
String input = scan.next().toLowerCase();
System.out.println();
// place to store list of results, each result is a list of strings
List<List<String>> results = new ArrayList<>();
long time = System.currentTimeMillis();
// start the search, pass empty stack to represent words found so far
search(input, dictionary, new Stack<String>(), results);
time = System.currentTimeMillis() - time;
// list the results found
for (List<String> result : results) {
for (String word : result) {
System.out.print(word + " ");
}
System.out.println("(" + result.size() + " words)");
}
System.out.println();
System.out.println("Took " + time + "ms");
}
public static void search(String input, Set<String> dictionary,
Stack<String> words, List<List<String>> results) {
for (int i = 0; i < input.length(); i++) {
// take the first i characters of the input and see if it is a word
String substring = input.substring(0, i + 1);
if (dictionary.contains(substring)) {
// the beginning of the input matches a word, store on stack
words.push(substring);
if (i == input.length() - 1) {
// there's no input left, copy the words stack to results
results.add(new ArrayList<String>(words));
} else {
// there's more input left, search the remaining part
search(input.substring(i + 1), dictionary, words, results);
}
// pop the matched word back off so we can move onto the next i
words.pop();
}
}
}
Example output:
Enter a string: aman
a man (2 words)
am an (2 words)
Took 0ms
Here's a much longer input:
Enter a string: thequickbrownfoxjumpedoverthelazydog
the quick brown fox jump ed over the lazy dog (10 words)
the quick brown fox jump ed overt he lazy dog (10 words)
the quick brown fox jumped over the lazy dog (9 words)
the quick brown fox jumped overt he lazy dog (9 words)
Took 1ms
If my answer seems silly, it's because you're really close and I'm not sure where you're stuck.
The simplest way given your code above would be to simply add a counter for the number of words and compare that to the number of matched words
int count = 0; int total = 0;
Scanner phraseScan = new Scanner(segment);
while (phraseScan.hasNext()) {
total++;
String word = phraseScan.next();
for (int i=0; i<dic.length; i++) {
if (word.equalsIgnoreCase(dic[i])) count++;
}
}
if(total==count) System.out.println(segment);
Implementing this as a hash-table might be better (it's faster, for sure), and it'd be really easy.
HashSet<String> dict = new HashSet<String>();
dict.add("foo"); // add your data
int count = 0; int total = 0;
Scanner phraseScan = new Scanner(segment);
while (phraseScan.hasNext()) {
total++;
String word = phraseScan.next();
if(dict.contains(word)) count++;
}
There are other, better ways to do this. One is a trie (http://en.wikipedia.org/wiki/Trie), which is a bit slower for lookup but stores data more efficiently. If you have a large dictionary, you might not be able to fit it in memory, so you could use a database or a key-value store like BDB (http://en.wikipedia.org/wiki/Berkeley_DB).
package LinkedList;
import java.util.LinkedHashSet;
public class dictionaryCheck {
private static LinkedHashSet<String> set;
private static int start = 0;
private static boolean flag;
public boolean checkDictionary(String str, int length) {
if (start >= length) {
return flag;
} else {
flag = false;
for (String word : set) {
int wordLen = word.length();
if (start + wordLen <= length) {
if (word.equals(str.substring(start, wordLen + start))) {
start = wordLen + start;
flag = true;
checkDictionary(str, length);
}
}
}
}
return flag;
}
public static void main(String[] args) {
// TODO Auto-generated method stub
set = new LinkedHashSet<String>();
set.add("Jose");
set.add("Nithin");
set.add("Joy");
set.add("Justine");
set.add("Jomin");
set.add("Thomas");
String str = "JoyJustine";
int length = str.length();
boolean c;
dictionaryCheck obj = new dictionaryCheck();
c = obj.checkDictionary(str, length);
if (c) {
System.out
.println("String can be found out from those words in the Dictionary");
} else {
System.out.println("Not Possible");
}
}
}
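For comparison, the same yes/no question (can the whole string be cut into dictionary words?) can be answered without shared static state by a small dynamic-programming table. This is a sketch of that alternative technique, not a rewrite of the code above; the names are made up for the example.

```java
import java.util.Set;

final class WordBreak {
    // reachable[k] is true when the first k characters can be split into dictionary words.
    static boolean canSegment(String str, Set<String> dictionary) {
        boolean[] reachable = new boolean[str.length() + 1];
        reachable[0] = true; // the empty prefix needs no words
        for (int end = 1; end <= str.length(); end++) {
            for (int start = 0; start < end; start++) {
                if (reachable[start] && dictionary.contains(str.substring(start, end))) {
                    reachable[end] = true;
                    break;
                }
            }
        }
        return reachable[str.length()];
    }

    public static void main(String[] args) {
        Set<String> dictionary = Set.of("Jose", "Nithin", "Joy", "Justine", "Jomin", "Thomas");
        System.out.println(canSegment("JoyJustine", dictionary)); // prints true
        System.out.println(canSegment("JoyJust", dictionary));    // prints false
    }
}
```

Each prefix is decided once, so a failed attempt on one split does not leave state behind that affects the next one.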
