Java: Read from Config File & Efficiently Perform Tests Based on Values - java

Okay, I'm sure that I'm not going about this in the most efficient way and I'm looking for some help regarding how to do this more efficiently...
config.txt file contains key/value pairs, where key = name of test and value = whether to execute test
parse through config file and create a list of tests to run
run those tests
Here is how I'm currently going about this
Create an ArrayList by passing a BufferedReader over my config file to a helper function, parseConfig. parseConfig returns a TreeSet, which I pass to the ArrayList constructor.
parseConfig iterates over the lines of the config file. If the value says to perform the test, it adds the name of the test to the TreeSet. At the end it returns the TreeSet.
Iterate over the ArrayList with an enhanced for loop. The body of the loop is basically a long if/else statement: if key.equals("thisTest"), perform thisTest; else if key.equals("thatTest"), perform thatTest; and so on.
It's that last part that I really don't like. It works well enough, but it seems clumsy and inefficient. Since my ArrayList is constructed using a TreeSet, it is in sorted order. I would like to use a more elegant and deterministic method for mapping my keys to tests to perform. Can anyone help me?

I would take a different approach, since all you need from this list is to know whether each entry should be tested.
I would read the file line by line and apply a regular expression to each line. From what I can see it would be simple, with only two groups and a positive lookahead. That way I could extract only the matching lines, build an ArrayList from them, then iterate over the ArrayList and run each test. If you can show what the file looks like, I can help you out with the code.
UPDATE
For example, here is the code I came up with to do the parsing (written in five minutes, so it could be improved):
/**
*
* @param inputFile location of inputFile
* @return {@link ImmutableSet} of tests to run
*/
public static ImmutableSet<String> parseConfigFile(File inputFile){
HashSet<String> innerSet = Sets.newHashSet();
BufferedReader bufferedReader = null;
try {
bufferedReader = new BufferedReader(new FileReader(inputFile));
String newLine = "";
while( (newLine = bufferedReader.readLine()) != null){
Pattern p = Pattern.compile("(.+)=(?=yes|1|true)(.+)");
Matcher m = p.matcher(newLine);
while(m.find()){
//System.out.println(m.group(1));
innerSet.add(m.group(1));
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if(bufferedReader != null)
try {
bufferedReader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return ImmutableSet.copyOf(innerSet);
}
I tested it on a file that looks like this, for example:
SomeTest=true
SomeOtherTest=false
YetAnotherTest=1
LastTest=yes
GogoTest=no
OneMore=0

The answer was to create a HashMap <String, Method> object.
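A minimal sketch of that map-based dispatch, under a few assumptions: it uses Map<String, Runnable> rather than java.lang.reflect.Method to avoid reflection, the class name TestDispatcher and the test bodies are hypothetical, and the keys are borrowed from the sample config file above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TestDispatcher {
    // Records which tests ran, so the effect of the dispatch is observable.
    static final List<String> executed = new ArrayList<>();

    // Registry mapping a config key to the code that runs that test.
    // The names and bodies here are hypothetical placeholders.
    static final Map<String, Runnable> TESTS = new HashMap<>();
    static {
        TESTS.put("SomeTest", () -> executed.add("SomeTest"));
        TESTS.put("YetAnotherTest", () -> executed.add("YetAnotherTest"));
        TESTS.put("LastTest", () -> executed.add("LastTest"));
    }

    // Runs every enabled test that has a registered implementation.
    static void runAll(Set<String> enabled) {
        for (String name : enabled) {
            Runnable test = TESTS.get(name);
            if (test == null) {
                // Unknown keys are reported instead of silently ignored.
                System.err.println("No test registered for key: " + name);
                continue;
            }
            test.run();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the set returned by parseConfig.
        Set<String> enabled = new TreeSet<>(Arrays.asList("SomeTest", "LastTest", "Unknown"));
        runAll(enabled);
        System.out.println(executed);
    }
}
```

Adding a test then means adding one entry to the map rather than another else-if branch, and a lookup costs the same no matter how many tests are registered.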

Related

Iterator over TreeSet causes infinite loop

For this assignment, I'm required to save instances of a custom data class (called User) each containing 2 strings into a TreeSet. I must then search the TreeSet I created for a string taken from each line of another file. The first file is a .csv file in which each line contains an email address and a name, the .txt file contains only addresses. I have to search for every line in the .txt file, and I also have to repeat the entire operation 4000 times.
I can't use .contains to search the TreeSet because I can't search by User, since the .txt file only contains one of the two pieces of information that User does. According to information I've found in various places, I can take the iterator from my TreeSet and use that to retrieve each User in it, and then get the User's username and compare that directly to the string from the second file. I wrote my code exactly as every site I found suggested, but my program still gets stuck at an infinite loop. Here's the search code I have so far:
for (int i = 0; i < 4000; i++)//repeats search operation 4000 times
{
try
{
BufferedReader fromPasswords = new BufferedReader(new FileReader("passwordInput.txt"));
while ((line = fromPasswords.readLine()) != null)
{
Iterator it = a.iterator();
while (it.hasNext())
{
//the infinite loop happens about here, if I put a println statement here it prints over and over
if(it.next().userName.compareTo(line) == 0)
matches++; //this is an int that is supposed to go up by 1 every time a match is found
}
}
}
catch (Exception e)
{
System.out.println("Error while searching TreeSet: " + e);
System.exit(0);
}
}
For some additional info, here's my User class.
class User implements Comparable<User>
{
String userName;
String password;
public User() { userName = "none"; password = "none"; }
public User(String un, String ps) { userName = un; password = ps; }
public int compareTo(User u)
{
return userName.compareToIgnoreCase(u.userName);
}
} //User
I've done everything seemingly correctly but it looks to me like iterator doesn't move its pointer even when I call next(). Does anyone see something I'm missing?
Edit: Thanks to KevinO for pointing this out- a is the name of the TreeSet.
Edit: Here's the declaration of TreeSet.
TreeSet<User> a = new TreeSet<User>();
Are you certain there's an infinite loop? You're opening a file 4000 times and iterating through a collection for every line in the file. Depending on size of the file and the collection this could take a very long time.
Some other things to be aware of:
Later versions of Java have a more succinct way of opening a file and iterating through all the lines: Files.lines
You don't need an Iterator to iterate through a collection. A normal for-each loop will do or convert it to a stream
If all you want to do is count the matches then a stream is just as good
Putting all that together:
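Path path = Paths.get("passwordInput.txt");
Set<User> users = new TreeSet<>();
long matches;
// Files.lines (not Paths.lines) streams the file; it can throw IOException.
// try-with-resources closes the underlying file when the stream is done.
try (Stream<String> lines = Files.lines(path)) {
matches = lines
.mapToLong(l -> users.stream()
// User has no getter in the question, so read the field directly
.map(u -> u.userName).filter(l::equals).count())
.sum();
}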
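Since User.compareTo in the question looks only at userName, a TreeSet lookup with a throwaway probe object may avoid the full scan per line. This is a sketch under assumptions: the helper countMatches and the sample addresses are hypothetical, and because compareTo uses compareToIgnoreCase, the match is case-insensitive, unlike the compareTo(line) == 0 check in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class TreeSetLookup {
    // Same shape as the User class in the question.
    static class User implements Comparable<User> {
        final String userName;
        final String password;

        User(String un, String ps) {
            userName = un;
            password = ps;
        }

        @Override
        public int compareTo(User u) {
            return userName.compareToIgnoreCase(u.userName);
        }
    }

    // Counts how many lines name a user that is present in the set.
    // TreeSet.contains relies on compareTo, so only userName matters and
    // each lookup is O(log n) instead of a full iteration.
    static int countMatches(TreeSet<User> users, List<String> lines) {
        int matches = 0;
        for (String line : lines) {
            if (users.contains(new User(line, ""))) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeSet<User> users = new TreeSet<>();
        users.add(new User("alice@example.com", "pw1"));
        users.add(new User("bob@example.com", "pw2"));
        // Stand-in for the lines of passwordInput.txt.
        List<String> lines = Arrays.asList("bob@example.com", "carol@example.com");
        System.out.println(countMatches(users, lines));
    }
}
```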

Reading text file using stream - Variable in lambda expression

I would like to read a text file, work with the string I have from each line, add a variable and continue.
For example: I have a text file with three lines and I'd like to add a variable to the end of each line which gets upped by one for each line.
To break everything down I made an example program but I still do not get fully behind it.
My text file "test.txt" has three lines of text "Line1", "Line2" and "Line3" and I have an Integer named testNum with a value of 500 which I would like to up after each line.
My code is the following:
String fileName = "test.txt";
int testNum = 1024;
Stream<String> readFileStream = null;
try {
readFileStream = Files.lines(Paths.get(fileName));
} catch (IOException e) {
e.printStackTrace();
}
readFileStream.forEach( line -> {
System.out.println(line+testNum);
testNum++;
});
Now I understand that the issue lies in the lambda expression. Can someone explain to me why I need local variables for the lambda expression and am unable to access variables declared outside from it?
Moreover, I tried to change my code to use a for each instead but for each seems not applicable for "Stream", e.g.:
for(String line : readFileStream){
}
Thanks a lot in advance.
Generally, the time and place where a lambda expression is created differ from the time and place where it is executed. A lambda might be created in method A and executed in method B minutes or hours later, possibly on a different thread. So it would not make sense to let it write to variables with method scope (i.e. stack variables that exist only while the method is executing). This is why Java requires any local variable used in a lambda to be final or effectively final. Read access to those variables is allowed because their values are 'copied' into the lambda expression at creation time.
In your case, it might be easier to give up streams and use Files.readAllLines(...), which returns a List, so you can iterate with a for loop:
List<String> lines = Files.readAllLines(Paths.get(fileName));
int testNum = 500;
for(String line : lines) {
System.out.println(line + testNum);
testNum++;
}
If you want to use streams for this task you can create a specific String consumer https://docs.oracle.com/javase/8/docs/api/java/util/function/Consumer.html
String fileName = "test.txt";
Stream<String> readFileStream = null;
try {
readFileStream = Files.lines(Paths.get(fileName));
} catch (IOException e) {
e.printStackTrace();
}
readFileStream.forEach(
new Consumer<String>() {
int testNum = 1024;
public void accept(String line) {
try {
System.out.println(line + testNum++);
} catch (Exception e) {
e.printStackTrace();
}
}
});
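Another option that keeps the lambda is a mutable holder such as AtomicInteger: the reference stays effectively final while the value inside it changes. A minimal sketch, assuming the lines are already in a List as a stand-in for test.txt; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class LambdaCounter {
    // Appends an increasing number to each line, starting at 'start'.
    static List<String> number(List<String> lines, int start) {
        // The reference 'counter' is never reassigned, so the lambda may
        // capture it; only the value stored inside the object changes.
        AtomicInteger counter = new AtomicInteger(start);
        List<String> out = new ArrayList<>();
        // forEachOrdered processes the lines in their original order.
        lines.stream().forEachOrdered(line -> out.add(line + counter.getAndIncrement()));
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the three lines of test.txt from the question.
        List<String> lines = Arrays.asList("Line1", "Line2", "Line3");
        number(lines, 500).forEach(System.out::println);
    }
}
```

The same lambda would work on the Stream<String> returned by Files.lines, since it only depends on each line and the counter.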

Creating a getList method with a csv file as an input parameter

The first assignment of my algorithms class is that I have to create a program that reads a series of book titles from a provided csv file, sorts them, and then prints them out. The assignment has very specific parameters, and one of them is that I have to create a static List getList(String file) method. The specifics of what this method entails are as follows:
"The method getList should readin the data from the csv
file book.csv. If a line doesn’t follow the pattern
title,author,year then a message should be written
to the standard error stream (see sample output) The
program should continue reading in the next line. NO
exception should be thrown ."
I don't have much experience with the usage of List, ArrayList, or reading in files, so as you can guess this is very difficult for me. Here's what I have so far for the method:
public static List<Book> getList(String file)
{
List<Book> list = new ArrayList<Book>();
return list;
}
Currently, my best guess is to make a for loop and instantiate a new Book object into the List using i as the index, but I wouldn't know how high to set the loop, as I don't have any method to tell the program how, say, many lines there are in the csv. I also wouldn't know how to get it to differentiate each book's title, author, and year in the csv.
Sorry for the long-winded question. I'd appreciate any help. Thanks.
A straightforward way to do this is to read the file line by line and check whether each line is in the correct format. If it is, add a new object to the list with the details from the line; otherwise write your error message and continue.
You can read your file using a BufferedReader. They can read line by line by doing the following:
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
while ((line = br.readLine()) != null) {
// do something with the line here
}
br.close();
Now that you have the lines, you need to verify they are in the correct format. A simple method to do this, is to split the line on commas (since it is a csv file), and check that it has at least 3 elements in the array. You can do so with the String.split(regex) method.
String[] bookDetails = line.split(",");
This would populate the array with the fields from your file. So for example, if the first line was one,two,three, then the array would be ["one","two","three"].
Now you have the values from the line, but you need to verify that it is in the correct format. Since your post specified that it should have 3 fields, we can check this by checking the length of the array we got above. If the length is less than 3, we should output some error message and skip that line.
if(bookDetails.length<3){ //title,author,year
System.err.println("Some error message here"); // output error msg
continue; // skip this line as the format is corrupted
}
Finally, now that we have read the line and verified that the information we need is there in a valid format, we can create a new object and add it to the list. We will use Java's built-in Integer wrapper class to parse the year into a primitive int for the Book constructor: the static method Integer.parseInt(String s) parses a String into an int value. Note that it throws a NumberFormatException if the text is not a valid integer.
list.add(new Book(bookDetails[0], bookDetails[1], Integer.parseInt(bookDetails[2])));
Hopefully this helps you out, and answers your question. A full method of what we did could be the following:
public static List<Book> getList(String file) {
List<Book> list = new ArrayList<Book>();
try {
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
while ((line = br.readLine()) != null) {
String[] bookDetails = line.split(",");
if (bookDetails.length < 3) { // title,author,year
System.err.println("Some error message here");
continue;
}
list.add(new Book(bookDetails[0], bookDetails[1], Integer.parseInt(bookDetails[2])));
}
br.close();
} catch (IOException e) {
e.printStackTrace();
}
return list;
}
And if you would like to test this, a main method can be made with the following code (this is how I tested it).
public static void main(String[] args) {
String file = "books.csv";
List<Book> books = getList(file);
for(Book b : books){
System.out.println(b);
}
}
To test it, make sure you have a file (mine was "books.csv") in your root directory of your Java project. Mine looked like:
bob,jones,1993
bob,dillon,1994
bad,format
good,format,1995
another,good,1992
bad,format2
good,good,1997
With the above main method, getList function, and file, my code generated the following output (note: the error messages appeared in red because they go to the standard error stream; SO doesn't show colors):
Some error message here
Some error message here
[title=bob, author=jones, years=1993]
[title=bob, author=dillon, years=1994]
[title=good, author=format, years=1995]
[title=another, author=good, years=1992]
[title=good, author=good, years=1997]
Feel free to ask questions if you are confused on any part of it. The output shown is from a toString() method I wrote on the Book class that I used for testing the code in my answer.
You can use a do while loop and read it till the end of file. Each new line will represent a Book Object detail.
In a csv all details are comma separated, So you can read the string and each comma will act as a delimiter between attributes of Book.
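The getList method shown earlier can still throw a NumberFormatException when the year field is not numeric, and it leaves the reader open if reading fails. The sketch below is one way to tighten that up, with stated assumptions: the Book class and the parseLine helper are hypothetical, the error text is a placeholder, and a line is treated as valid only if it has exactly three fields (the earlier answer accepts three or more).

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BookReader {
    // Hypothetical Book class; the one from the assignment may differ.
    static class Book {
        final String title;
        final String author;
        final int year;

        Book(String title, String author, int year) {
            this.title = title;
            this.author = author;
            this.year = year;
        }
    }

    // Returns a Book, or null after reporting a malformed line on stderr.
    static Book parseLine(String line) {
        String[] parts = line.split(",");
        if (parts.length == 3) {
            try {
                return new Book(parts[0].trim(), parts[1].trim(),
                        Integer.parseInt(parts[2].trim()));
            } catch (NumberFormatException e) {
                // A non-numeric year counts as a format error, not a crash.
            }
        }
        System.err.println("Skipping malformed line: " + line);
        return null;
    }

    public static List<Book> getList(String file) {
        List<Book> list = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = br.readLine()) != null) {
                Book book = parseLine(line);
                if (book != null) {
                    list.add(book);
                }
            }
        } catch (IOException e) {
            System.err.println("Could not read " + file + ": " + e.getMessage());
        }
        return list;
    }

    public static void main(String[] args) {
        // Exercises the line parser directly, so no file is needed.
        System.out.println(parseLine("bob,jones,1993") != null);
        System.out.println(parseLine("bad,format") != null);
        System.out.println(parseLine("good,format,199x") != null);
    }
}
```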

Remove all objects in an arraylist that exist in another arraylist

I'm trying to read in from two files and store them in two separate arraylists. The files consist of words which are either alone on a line or multiple words separated by commas.
I read each file with the following code (not complete):
ArrayList<String> temp = new ArrayList<>();
FileInputStream fis;
fis = new FileInputStream(fileName);
Scanner scan = new Scanner(fis);
while (scan.hasNextLine()) {
Scanner input = new Scanner(scan.nextLine());
input.useDelimiter(",");
while (input.hasNext()) { // read tokens from the per-line scanner, not the file scanner
String md5 = input.next();
temp.add(md5);
}
}
scan.close();
return temp;
I now need to read two files in and remove all words from the first file which also exist in the second file (there are some duplicate words in the files). I have tried with for-loops and other such stuff, but nothing has worked so any help would be greatly appreciated!
Bonus question: I also need to find out how many duplicates there are in the two files - I've done this by adding both arraylists to a HashSet and then subtracting the size of the set from the combined size of the two arraylists - is this a good solution, or could it be done better?
You can use the removeAll method to remove the items of one list from another list.
To obtain the duplicates you can use the retainAll method, though your approach with the set is also good (and probably more efficient)
The collection facility has a convenient method for this purpose:
list1.removeAll(list2);
First, you need to override the equals method in your custom class to define the matching criteria used when removing from the list
public class CustomClass{
@Override
public boolean equals(Object obj) {
try {
CustomClass licenceDetail = (CustomClass) obj;
return name.equals(licenceDetail.getName());
}
catch (Exception e)
{
return false;
}
}
}
Second you call the removeAll() method
list1.removeAll(list2);
As others have mentioned, use the Collection.removeAll method if you wish to remove all elements that exist in one Collection from the Collection you are invoking removeAll on.
As for your bonus question, I'm a huge fan of Guava's Sets class. I would suggest the use of Sets.intersection as follows:
Sets.intersection(wordSetFromFile1, wordSetFromFile2).size();
Assuming you created a Set of words from both files, you can determine how many distinct words they have in common with that one liner.
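The same two operations can also be sketched with the JDK alone, using removeAll for the difference and retainAll for the intersection. The class name, method names, and sample words below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ListDifference {
    // Returns the words of 'first' that do not occur in 'second'.
    // Duplicates in 'first' that survive are kept.
    static List<String> difference(List<String> first, List<String> second) {
        List<String> result = new ArrayList<>(first);
        // Passing a HashSet makes each membership check O(1) on average.
        result.removeAll(new HashSet<>(second));
        return result;
    }

    // Counts the distinct words that appear in both lists.
    static int commonCount(List<String> first, List<String> second) {
        Set<String> common = new HashSet<>(first);
        common.retainAll(new HashSet<>(second));
        return common.size();
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("apple", "pear", "apple", "fig");
        List<String> file2 = Arrays.asList("pear", "plum", "fig");
        System.out.println(difference(file1, file2));
        System.out.println(commonCount(file1, file2));
    }
}
```

Both helpers copy their input first, because removeAll and retainAll modify the collection they are called on.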

Java ConcurrentHashMap corrupt values

I have a ConcurrentHashMap that exhibits strange behavior on occasion.
When my app first starts up, I read a directory from the file system and load contents of each file into the ConcurrentHashMap using the filename as the key. Some files may be empty, in which case I set the value to "empty".
Once all files have been loaded, a pool of worker threads will wait for external requests. When a request comes in, I call the getData() function where I check if the ConcurrentHashMap contains the key. If the key exists I get the value and check if the value is "empty". If value.contains("empty"), I return "file not found". Otherwise, the contents of the file is returned. When the key does not exist, I try to load the file from the file system.
private String getData(String name) {
String reply = null;
if (map.containsKey(name)) {
reply = map.get(name);
} else {
reply = getDataFromFileSystem(name);
}
if (reply != null && !reply.contains("empty")) {
return reply;
}
return "file not found";
}
On occasion, the ConcurrentHashMap will return the contents of a non-empty file (i.e. value.contains("empty") == false), however the line:
if (reply != null && !reply.contains("empty"))
returns FALSE. I broke down the IF statement into two parts: if (reply != null) and if (!reply.contains("empty")). The first part of the IF statement returns TRUE. The second part returns FALSE. So I decided to print out the variable "reply" in order to determine if the contents of the string does in fact contain "empty". This was NOT the case i.e. the contents did not contain the string "empty". Furthermore, I added the line
int indexOf = reply.indexOf("empty");
Since the variable reply did not contain the string "empty" when I printed it out, I was expecting indexOf to return -1. But the function returned a value approx the length of the string i.e. if reply.length == 15100, then reply.indexOf("empty") was returning 15099.
I experience this issue on a weekly basis, approx 2-3 times a week. This process is restarted on a daily basis therefore the ConcurrentHashMap is re-generated regularly.
Has anyone seen such behavior when using Java's ConcurrentHashMap?
EDIT
private String getDataFromFileSystem(String name) {
String contents = "empty";
try {
File folder = new File(dir);
File[] fileList = folder.listFiles();
for (int i = 0; i < fileList.length; i++) {
if (fileList[i].isFile() && fileList[i].getName().contains(name)) {
String fileName = fileList[i].getAbsolutePath();
FileReader fr = null;
BufferedReader br = null;
try {
fr = new FileReader(fileName);
br = new BufferedReader(fr);
String sCurrentLine;
while ((sCurrentLine = br.readLine()) != null) {
contents += sCurrentLine.trim();
}
if (contents.equals("")) {
contents = "empty";
}
return contents;
} catch (Exception e) {
e.printStackTrace();
if (contents.equals("")) {
contents = "empty";
}
return contents;
} finally {
if (fr != null) {
try {
fr.close();
} catch (Exception e) {
e.printStackTrace();
}
}
if (br != null) {
try {
br.close();
} catch (Exception e) {
e.printStackTrace();
}
}
if (map.containsKey(name)) {
map.remove(name);
}
map.put(name, contents);
}
}
}
} catch (Exception e) {
e.printStackTrace();
if (contents.equals("")) {
contents = "empty";
}
return contents;
}
return contents;
}
I think your problem is that some of your operations should be atomic and they aren't.
For example, one possible thread interleaving scenario is the following:
Thread 1 reads this line in the getData method:
if (map.containsKey(name)) // (1)
the result is false and Thread 1 goes to
reply = getDataFromFileSystem(name); // (2)
in getDataFromFileSystem, you have the following code:
if (map.containsKey(name)) { // (3)
map.remove(name); // (4)
}
map.put(name, contents); // (5)
imagine that another thread (Thread 2) arrives at (1) while Thread 1 is between (4) and (5): name is not in the map, so thread 2 goes to (2) again
Now that does not explain the specific issue you are observing but it illustrates the fact that when you let many threads run concurrently in a section of code without synchronization, weird things can and do happen.
As it stands, I can't find an explanation for the scenario you describe, unless you call reply = map.get(name) more than once in your tests, in which case it is very possible that the 2 calls don't return the same result.
First off, don't even think that there is a bug in ConcurrentHashMap. JDK faults are very rare and even entertaining the idea will pull you away from properly debugging your code.
I think your bug is as follows. Since you are using contains("empty") what happens if the line from the file has the word "empty" in it? Isn't that going to screw things up?
Instead of using contains("empty") I would use ==. Make the "empty" a private static final String then you can use equality on it.
private final static String EMPTY_STRING_REFERENCE = "empty";
...
if (reply != null && reply != EMPTY_STRING_REFERENCE) {
return reply;
}
...
String contents = EMPTY_STRING_REFERENCE;
...
// really this should be if (contents.isEmpty())
if (contents.equals("")) {
contents = EMPTY_STRING_REFERENCE;
}
This is, btw, the only time you should be using == to compare strings. In this case you want to test it by reference and not by contents since lines from your files could actually contain the magic string.
Here are some other points:
In general, whenever you are using the same String in multiple places in your program, it should be pulled up to a static final field. Java will probably do this for you anyway but it makes the code a lot cleaner as well.
@assylias is spot on about race conditions when you make 2 calls to ConcurrentHashMap. For example, instead of doing:
if (map.containsKey(name)) {
reply = map.get(name);
} else {
You should do the following, so that you make only one call to the map.
reply = map.get(name);
if (reply == null) {
In your code you do this:
if (map.containsKey(name)) {
map.remove(name);
}
map.put(name, contents);
That should be rewritten as the following. There is no need to remove before the put, which introduces race conditions as @assylias mentioned.
map.put(name, contents);
You said:
if reply.length == 15100, then reply.indexOf("empty") was returning 15099.
This is not possible with the same reply String. I suspect that you were looking at different threads or in some other way misinterpreting the output. Again, don't be fooled into thinking that there are bugs in java.lang.String.
First, using ConcurrentHashMap does not protect you if you call its methods from multiple threads in sequence. If you call containsKey and get afterwards and another thread calls remove in between you will have a null result. Be sure to call only get and check for null instead of containsKey/get. It's also better regarding performance, because both methods nearly have the same cost.
Second, the weird indexOf call result is either due to a programming error, or points to memory corruption. Is there any native code involved in your application? What are you doing in getDataFromFileSystem? I observed memory corruption when using FileChannel objects from multiple threads.
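The advice in the answers above (one map call instead of a check-then-act sequence, and no magic "empty" substring test) could be sketched with ConcurrentHashMap.computeIfAbsent. This is an illustration under assumptions, not the asker's code: FileCache and its loader function are hypothetical stand-ins for getDataFromFileSystem, and an empty string marks a missing or empty file.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class FileCache {
    static final String NOT_FOUND = "file not found";

    private final Map<String, String> map = new ConcurrentHashMap<>();
    // Hypothetical loader: returns the file contents, or "" if there are none.
    private final Function<String, String> loader;

    FileCache(Function<String, String> loader) {
        this.loader = loader;
    }

    String getData(String name) {
        // A single atomic call replaces containsKey/get/remove/put, so the
        // loader runs at most once per key even with concurrent callers.
        String contents = map.computeIfAbsent(name, loader);
        // isEmpty() tests the marker exactly; file text that happens to
        // contain the word "empty" is returned unchanged.
        return (contents == null || contents.isEmpty()) ? NOT_FOUND : contents;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        FileCache cache = new FileCache(name -> {
            loads.incrementAndGet();
            return name.equals("a.txt") ? "this line mentions empty" : "";
        });
        System.out.println(cache.getData("a.txt"));
        System.out.println(cache.getData("a.txt"));
        System.out.println(cache.getData("b.txt"));
        System.out.println("loads: " + loads.get());
    }
}
```

One trade-off to be aware of: while the loader runs, other threads updating the same part of the map may block, so the ConcurrentHashMap documentation recommends keeping the mapping function short. Slow file reads might be better done outside the map and published with putIfAbsent.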
