reading from a comma separated text file - java

I am trying to write a Java program that simulates a record store shopping cart. The first step is to open the inventory.txt file and read its contents, which are basically what the store has to offer. Then I need to read every line individually and process the id, record, and price.
The current method outputs a result that is very close to what I need. However, it picks up the item id of the next line, as you can see below.
Can someone help me figure out how to process every line in the text document individually and store every piece of data in its own variable, without picking up the id of the next item?
public void openFile(){
try{
x = new Scanner(new File("inventory.txt"));
x.useDelimiter(",");
}
catch(Exception e){
System.out.println("Could not find file");
}
}
public void readFile(){
while(x.hasNext()){
String id = x.next();
String record = x.next();
String price = x.next();
System.out.println(id + " " + record + " " + price);
break;
}
}
.txt document:
11111, "Hush Hush... - Pussycat Dolls", 12.95
22222, "Animal - Ke$ha", 9.95
33333, "Hanging By A Moment - Lifehouse - Single", 4.95
44444, "Have A Nice Day - Bon Jovi", 9.99
55555, "Day & Age - Killers", 10.99
66666, "She Wolf - Shakira", 15.99
77777, "Dark Horse - Nickelback", 12.99
88888, "The E.N.D. - Black Eyed Peas", 10.95
Actual output:
11111 "Hush Hush... - Pussycat Dolls" 12.95
22222
Expected result:
11111 "Hush Hush... - Pussycat Dolls" 12.95

The specific problem here is that you are breaking only on commas, when you should be breaking on commas and newlines. But there are plenty of other corner cases (for example, if a column contains "abc,,,abc", you shouldn't break on those commas). Apache Commons CSV provides a CSVParser that handles all of these corner cases, so you should use it:
http://commons.apache.org/csv/apidocs/org/apache/commons/csv/CSVParser.html
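To see why naive splitting is fragile, here is a minimal sketch that counts the pieces String.split produces when a quoted column contains commas. The class name and the row with id 99999 are made up for illustration and are not part of the inventory file.

```java
public class NaiveSplitSketch {
    // Counts the pieces produced by splitting on every comma, ignoring quotes.
    static int naiveFieldCount(String line) {
        return line.split(",").length;
    }

    public static void main(String[] args) {
        // Hypothetical row: three logical columns, one with commas inside the quotes.
        String line = "99999, \"abc,,,abc\", 1.00";
        // Prints 6, although the row has only 3 columns.
        System.out.println(naiveFieldCount(line));
    }
}
```

A real CSV parser tracks whether it is inside quotes, which is the part that is easy to get wrong by hand.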

You can pass a pattern as the argument to Scanner.useDelimiter. Use this to provide alternates for the delimiter: either a comma or the line separator.
x.useDelimiter(",|" + System.getProperty("line.separator"));
Depending on what your input file uses as the line separator, you may need to change the second option.
The advice in other answers to use an existing CSV library is good: parsing CSV isn't as simple as breaking up the input around commas.
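As a rough sketch of the delimiter approach, the example below reads from an in-memory String instead of inventory.txt so that it is self-contained. The class and method names are hypothetical, and the pattern is a variation on the one above: it uses \R (any line break, Java 8 and later) instead of the platform line separator and also swallows the spaces after each comma. It still assumes no commas inside the quoted titles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterSketch {
    // Reads id, record, price triples from text whose fields are separated
    // by commas and whose rows are separated by line breaks.
    static List<String[]> readRecords(String text) {
        List<String[]> records = new ArrayList<>();
        try (Scanner x = new Scanner(text)) {
            // Either a comma plus optional spaces/tabs, or any line break.
            x.useDelimiter(",[ \\t]*|\\R");
            while (x.hasNext()) {
                String id = x.next();
                String record = x.next();
                String price = x.next();
                records.add(new String[] {id, record, price});
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "11111, \"Hush Hush... - Pussycat Dolls\", 12.95\n"
                + "22222, \"Animal - Ke$ha\", 9.95\n";
        for (String[] r : readRecords(text)) {
            System.out.println(r[0] + " " + r[1] + " " + r[2]);
        }
    }
}
```

To read the real file, the Scanner would be constructed from new File("inventory.txt") as in the question.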

There are multiple ways to achieve this, but staying close to your own approach, you could use one Scanner to read whole lines first (with Java's "line.separator" as the delimiter) and then a second Scanner on each line with a comma as the delimiter.
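A minimal sketch of that two-level approach follows. It uses nextLine() on the outer Scanner rather than setting "line.separator" as a delimiter, which is a small substitution; the class and method names are hypothetical, and the input is an in-memory String so the example runs on its own.

```java
import java.util.Scanner;

public class LineThenFieldSketch {
    // Formats one inventory line as "id record price" by scanning its
    // comma-separated fields. Assumes the line has at least three fields.
    static String formatLine(String line) {
        try (Scanner fields = new Scanner(line)) {
            fields.useDelimiter(",\\s*");
            String id = fields.next();
            String record = fields.next();
            String price = fields.next();
            return id + " " + record + " " + price;
        }
    }

    public static void main(String[] args) {
        String text = "11111, \"Hush Hush... - Pussycat Dolls\", 12.95\n"
                + "22222, \"Animal - Ke$ha\", 9.95\n";
        try (Scanner lines = new Scanner(text)) {
            while (lines.hasNextLine()) {
                String line = lines.nextLine();
                if (!line.trim().isEmpty()) {
                    System.out.println(formatLine(line));
                }
            }
        }
    }
}
```

Because each inner Scanner only ever sees one line, a field can never run over into the next row.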

The problem you're going to face is that CSV is more than just splitting a String on a comma. You have to take "escaped" commas into account (commas you don't want to split on).
I suggest you save yourself a lot of time and headaches and use an existing API.
Apache Commons has already been mentioned. I recently used OpenCSV and found it, IMHO, to be extremely simple to use and powerful.

An easy way to read the entire file into a list of Strings (lines):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
// Named LineReader rather than Scanner so it is not confused with java.util.Scanner
public class LineReader {
    public static List<String> readLines(String filename) throws IOException {
        List<String> lines = new ArrayList<String>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader bufferedReader = new BufferedReader(new FileReader(filename))) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
Then you can process the individual lines as before, since each line is its own String object. That is, if you don't use a CSVParser.

Related

Java - Best way to open a ton of files and search for a word?

I am searching a directory with about 450 files, each file around 20kb. Here is my method:
public void search(String searchWord) throws IOException
{
this.directoryPath = FileSystems.getDefault().getPath(this.directoryString);
this.fileListStream = Files.newDirectoryStream(this.directoryPath);
int fileCount = 0;
for(Path path : this.fileListStream)
{
String fileName = path.getFileName().toString();
if(!fileName.startsWith("."))
{
BufferedReader br = Files.newBufferedReader(path, Charset.defaultCharset());
String line;
while((line = br.readLine()) != null)
{
System.out.println(fileName + ": " + line);
}
fileCount++;
br.close();
}
}
System.out.println("File Count: " + fileCount);
}
My goal is to go word by word and find a match for searchWord and print out the line number and the file name it was found in.
My question is whether I should split each line into an array, search the array for the word, and add the matches to a list, or scan the entire file into an array of words and then search that. Or does it even matter? Also, if there is a better way to do this, please let me know! I'm trying to do this as efficiently as possible due to limited resources.
You shouldn't be looking word by word. Just read the entire line as a String and then use the String.indexOf() method to check whether the line contains the word.
You can use the Scanner class to parse the files and call its next() method to read each word, so you won't need an array or other storage. If possible, also try processing the files on multiple threads, which may improve performance.
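A minimal sketch of the line-at-a-time indexOf() approach, reporting file names and line numbers, might look like this. The class name, method name, and command-line arguments are assumptions made for illustration. Note that indexOf() finds substrings, so searching for "cat" also matches "category".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class WordSearchSketch {
    // Returns the 1-based numbers of the lines in source that contain word.
    // Only one line is held in memory at a time. Closes source when done.
    static List<Integer> matchingLineNumbers(Reader source, String word) {
        List<Integer> numbers = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            int lineNumber = 0;
            while ((line = br.readLine()) != null) {
                lineNumber++;
                if (line.indexOf(word) >= 0) {
                    numbers.add(lineNumber);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return numbers;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: WordSearchSketch <directory> <word>");
            return;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(Paths.get(args[0]))) {
            for (Path path : files) {
                String fileName = path.getFileName().toString();
                if (fileName.startsWith(".") || !Files.isRegularFile(path)) {
                    continue; // skip hidden files and subdirectories
                }
                Reader reader = Files.newBufferedReader(path, Charset.defaultCharset());
                for (int n : matchingLineNumbers(reader, args[1])) {
                    System.out.println(fileName + ":" + n);
                }
            }
        }
    }
}
```

With about 450 files of 20 KB each, this single-threaded version is probably fast enough that threading is only worth adding after measuring.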

Parsing Individual Lines of Multi-Line Text File?

I have a question about something I've done in the past, but never really thought if it was the most efficient method to use.
Let's say I have a text file where each line contains something important, and the lines come in multiple sets, each corresponding to a unique environment. For example:
1
String that I need to parse for specific tokens..
2
String that I need to parse for specific tokens..
String that I need to parse for specific tokens..
3
String that I need to parse for specific tokens..
String that I need to parse for specific tokens..
String that I need to parse for specific tokens..
So given the above input file, my past way of solving this would be something similar to the following (semi-pseudocode!):
BufferedReader inputFile = new BufferedReader(new FileReader("file.txt"));
String text;
//BufferedReader has no hasNextLine(); readLine() returns null at end of file
while((text = inputFile.readLine()) != null)
{
    Scanner line = new Scanner(text);
    //parse the line looking for tokens
}
inputFile.close();
My issue with this is it seems incredibly inefficient to create a new Scanner object for every line I have in my BufferedReader.
Is there a better way to achieve this functionality?
One suggestion may be to scan the whole document by tokens, but my issue with that is I won't be able to keep track of how many strings are a part of the subset (indicated by the integer); or at least I can't think of another solution to that other than to decrement a counter every time I look at a new line.
Thanks in advance!
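Check this out:
// needs java.io.File, java.io.FileReader, java.io.IOException and java.io.LineNumberReader
public static void main(String[] args) throws IOException {
    // LineNumberReader extends BufferedReader, so it can wrap the FileReader directly;
    // try-with-resources closes it when done
    try (LineNumberReader lr = new LineNumberReader(new FileReader(new File("d:/sample.txt")))) {
        String line;
        while ((line = lr.readLine()) != null) {
            // getLineNumber() is the number of lines read so far (1 after the first line)
            System.out.println("Line Number " + lr.getLineNumber() + ": " + line);
        }
    }
}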
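For the count-prefixed layout in the question, one possible sketch is to read the integer header, then read exactly that many lines, and tokenize each line with String.split instead of creating a Scanner per line. The class and method names are hypothetical, and the sketch assumes each header integer is the number of lines that follow it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class GroupParseSketch {
    // Reads groups of the form: a count line, followed by that many data lines.
    static List<List<String>> parseGroups(Reader source) {
        List<List<String>> groups = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String header;
            while ((header = in.readLine()) != null) {
                if (header.trim().isEmpty()) {
                    continue; // tolerate blank lines between groups
                }
                int count = Integer.parseInt(header.trim());
                List<String> group = new ArrayList<>();
                for (int i = 0; i < count; i++) {
                    String line = in.readLine();
                    if (line == null) {
                        throw new IllegalStateException("group ended early");
                    }
                    group.add(line);
                }
                groups.add(group);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return groups;
    }

    public static void main(String[] args) {
        String input = "1\nalpha beta\n2\ngamma\ndelta epsilon\n";
        for (List<String> group : parseGroups(new StringReader(input))) {
            System.out.println("group of " + group.size());
            for (String line : group) {
                // split() tokenizes the line without a Scanner per line
                String[] tokens = line.trim().split("\\s+");
                System.out.println("  " + tokens.length + " token(s): " + line);
            }
        }
    }
}
```

The inner for loop plays the role of the counter mentioned in the question, so the group boundaries never have to be guessed from the tokens.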

Split lines into two Strings using BufferedReader

I want to split each line into two separate strings when reading through the txt file I'm using and later store them in a HashMap. But right now I can't seem to read through the file properly. This is what a small part of my file looks like:
....
CPI Clock Per Instruction
CPI Common Programming Interface [IBM]
.CPI Code Page Information (file name extension) [MS-DOS]
CPI-C Common Programming Interface for Communications [IBM]
CPIO Copy In and Out [Unix]
....
And this is what my code looks like:
try {
BufferedReader br = new BufferedReader(new FileReader("akronymer.txt"));
String line;
String akronym;
String betydning;
while((line = br.readLine()) != null) {
String[] linje = line.split("\\s+");
akronym = linje[0];
betydning = linje[1];
System.out.println(akronym + " || " + betydning);
}
} catch(Exception e) {
System.out.println("Feilen som ble fanget opp: " + e);
}
What I want is to store the acronym in one String and the definition in another String.
The problem is that whitespace in the definition is interpreted as additional fields. You're getting only the first word of the definition in linje[1] because the other words are in other array elements:
["CPI", "Clock", "Per", "Instruction"]
Supply a limit parameter in the two-arg overload of split, to stop at 2 fields:
String[] linje = line.split("\\s+", 2);
E.g. linje[0] will be CPI and linje[1] will be Clock Per Instruction.
If you want to limit your split to only two parts, use split("\\s+", 2). Right now you are splitting the line on every run of whitespace, so every word is stored in a different array position.
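Since the goal is to store the results in a HashMap, here is a small sketch built on split("\\s+", 2). The class and method names are hypothetical. Because the sample file lists CPI more than once, the sketch maps each acronym to a list of definitions; a plain HashMap<String, String> would keep only the last one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AcronymMapSketch {
    // Maps each acronym to all of its definitions, in file order.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> acronyms = new HashMap<>();
        for (String line : lines) {
            // Limit 2: everything after the first run of whitespace stays together.
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue; // skip blank lines and lines without a definition
            }
            acronyms.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
        }
        return acronyms;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "CPI Clock Per Instruction",
                "CPI Common Programming Interface [IBM]",
                "CPIO Copy In and Out [Unix]");
        Map<String, List<String>> acronyms = parse(lines);
        System.out.println(acronyms.get("CPI"));
        System.out.println(acronyms.get("CPIO"));
    }
}
```

In the real program the lines would come from the BufferedReader loop in the question instead of a hard-coded list.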

how to read a set of lines with each line having CRLF(\r\n) as delimiter using java

Using Java, how can I read a paragraph in which each line ends with the delimiter CRLF (\r\n)?
For example
4\r\n
This\r\n
8\r\n
response\r\n
I want to extract 4 and store it in a buffer, and then read 8 and store it as well, for this paragraph.
Please help me.
Use BufferedReader.readLine() to just read the lines, or Scanner if you want to automatically parse the numbers.
I'm not quite sure what output you require, but the BufferedReader class will allow you to read a text file line by line. There are several examples available on the internet.
The readLine method will do the following:
Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.
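As a small sketch of the readLine() approach, the example below reads the CRLF-terminated sample from a StringReader and keeps only the lines that parse as integers. The class and method names are hypothetical, and a file would be read the same way through a FileReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class CrlfLinesSketch {
    // Collects the lines that parse as integers; other lines are skipped.
    static List<Integer> readNumbers(Reader source) {
        List<Integer> numbers = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns each line without its \r\n terminator
            while ((line = reader.readLine()) != null) {
                try {
                    numbers.add(Integer.parseInt(line.trim()));
                } catch (NumberFormatException e) {
                    // not a number (for example "This"), so ignore it
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Prints [4, 8]
        System.out.println(readNumbers(new StringReader("4\r\nThis\r\n8\r\nresponse\r\n")));
    }
}
```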
I'm not really sure what you have to do, but if you need to get the strings out of your text, you can do it with my solution below.
It may not be the best way, but it works, using StringTokenizer with \r and \n as the delimiter characters (StringTokenizer treats every character of the delimiter string as a separate delimiter).
StringTokenizer st1 = new StringTokenizer("4\r\n This\r\n 8\r\n response\r\n", "\r\n");
//iterate through tokens
while (st1.hasMoreTokens()) {
    //trim() drops the space that follows each \r\n in the sample string
    String str = st1.nextToken().trim();
    System.out.println(str);
}
This will print
4
This
8
response
This, of course, is useful if you're not reading from a file but you've just got the full string and you need to extract data from it.
If you're reading from a file you should check the other answers as BufferedReader.readLine is the right way.
EDIT:
Here's the new code:
StringTokenizer st1 = new StringTokenizer("4\r\n This\r\n 8\r\n response\r\n", "\r\n");
//iterate through tokens
while (st1.hasMoreTokens()) {
String str = st1.nextToken();
try{
//Integer.parseInt throws a NumberFormatException if the input String doesn't represent a number, so we catch it and skip that token. This outputs only the numbers in your string.
int i = Integer.parseInt(str.trim());
System.out.println(i);
} catch (NumberFormatException e){}
}
Now it just outputs:
4
8

How to replace multiple occurrences of a string in a text file with a variable entered by the user and save all to a new file?

public static void main(String args[])
{
try
{
File file = new File("input.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "000000", oldtext = "414141";
while((line = reader.readLine()) != null)
{
oldtext += line + "\r\n";
}
reader.close();
// replace a word in a file
//String newtext = oldtext.replaceAll("drink", "Love");
//To replace a line in a file
String newtext = oldtext.replaceAll("This is test string 20000", "blah blah blah");
FileWriter writer = new FileWriter("input.txt");
writer.write(newtext);writer.close();
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
}
}
A couple suggestions on your sample code:
Have the user pass in old and new on the command line (i.e., args[0] and args[1]).
If it's sufficient to do this a line at a time, it's going to be much more efficient to read a line, replace old -> new, then stream it out.
Also check out StringUtils and IOUtils, which may make your life easier in this case.
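The first two suggestions can be sketched roughly as follows: read a line, replace old with new, and stream it straight out to a new file. The class name, method name, and the extra input and output path arguments (args[2] and args[3]) are assumptions made for illustration. The sketch uses String.replace, which treats the old text literally rather than as a regex.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReplaceStreamSketch {
    // Copies in to out one line at a time, replacing every literal
    // occurrence of oldText with newText. Closes both streams when done.
    static void copyReplacing(Reader in, Writer out, String oldText, String newText) {
        try (BufferedReader reader = new BufferedReader(in);
             BufferedWriter writer = new BufferedWriter(out)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace(oldText, newText));
                writer.newLine(); // platform line separator
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 4) {
            System.err.println("usage: ReplaceStreamSketch <old> <new> <inputFile> <outputFile>");
            return;
        }
        // The output file must differ from the input file, because opening
        // it for writing truncates it before anything has been read.
        copyReplacing(Files.newBufferedReader(Paths.get(args[2]), StandardCharsets.UTF_8),
                Files.newBufferedWriter(Paths.get(args[3]), StandardCharsets.UTF_8),
                args[0], args[1]);
    }
}
```

Only one line is in memory at a time, so the file size no longer matters. The trade-off is that a match spanning a line break is not found.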
The easiest option is String.replace(oldString, newString) or String.replaceAll(regex, newString). You can read the one file and write the replaced text into a new file (or do it line by line if you're concerned about file size).
After reading your last comment: that's a totally different story. The preferred solution would be to parse the CSS file into an object model (like a DOM), apply the changes there, and serialize the model back to CSS afterwards. It's much easier to find all color attributes in a DOM and change them than to do the same with search and replace.
I've found a few CSS parsers on the web, but none of them looked capable of writing CSS files.
If you wanted to replace the color names with search and replace, you'd search for 'color:<colorname>' and replace it with 'color:<yourHexColorValue>'. You may have to do the same for 'color:"<colorname>"', because the color name can be set in double quotes (another argument for using a CSS parser).
String.replaceAll() is the easiest way to do it. Just read the complete CSS file into one String, replace all occurrences as suggested above, and write the new String to the same file (or to a temporary file first).
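As a hedged sketch of the search-and-replace variant, the regex below handles both the bare and the double-quoted form in one pass. The class name, method name, and sample colors are made up for illustration. The pattern is deliberately simple: it also rewrites properties that merely end in "color", such as background-color, and it knows nothing about comments or strings in the CSS.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CssColorSketch {
    // Replaces color:<name> and color:"<name>" with color:<hexValue>.
    static String replaceColor(String css, String colorName, String hexValue) {
        // Group 1 keeps "color:" and its spacing. The lookahead stops a
        // name such as "red" from matching the start of "reddish".
        Pattern pattern = Pattern.compile(
                "(color\\s*:\\s*)\"?" + Pattern.quote(colorName) + "\"?(?![\\w-])",
                Pattern.CASE_INSENSITIVE);
        return pattern.matcher(css)
                .replaceAll("$1" + Matcher.quoteReplacement(hexValue));
    }

    public static void main(String[] args) {
        String css = "h1 { color: red; }\np { color:\"red\" }\n";
        System.out.print(replaceColor(css, "red", "#ff0000"));
    }
}
```

Matcher.quoteReplacement guards against a replacement value that happens to contain $ or a backslash, which replaceAll would otherwise interpret.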
