I am trying to use a BufferedReader to read a file containing records separated by commas. I would like to split each string (or record) between two commas, strip the double quotes, and put each of those values into an index of a String array. For example:
say I have these lines in the file:
"0001", "00203", "82409"
"0002", "00204", "82500"
and so on.
I want to put 0001 into a String array[1],
I want 00203 into String array[2],
and so on....
The following code traverses the file, putting all records in column two into String array[2]. This means that after I execute the code below, if I do System.out.println(arr[2]), it will print 00203 and 00204, whereas I would like array[2] to be 00203 and array[5] to be 00204.
Here is my code:
public String[] getArray(String source) throws IOException {
    FileInputStream fileinput = new FileInputStream(source);
    GZIPInputStream gzip = new GZIPInputStream(fileinput);
    InputStreamReader inputstream = new InputStreamReader(gzip);
    BufferedReader bufr = new BufferedReader(inputstream);
    String str = null;
    String[] arr = null;
    while ((str = bufr.readLine()) != null) {
        arr = str.replace("\"", "").split("\\s*,\\s*"); // arr is overwritten on every line
    }
    return arr;
}
Commons CSV was designed for your specific use case. Let's not reinvent the wheel: the code below parses a gzipped CSV into fields and lines, which seems to be what you're trying to do.
public String[][] getInfo(String source) throws IOException {
    try (CSVParser parser = new CSVParser(
            new InputStreamReader(new GZIPInputStream(new FileInputStream(source))),
            CSVFormat.DEFAULT.withIgnoreSurroundingSpaces(true))) {
        List<CSVRecord> records = parser.getRecords();
        String[][] result = new String[records.size()][];
        for (int i = 0; i < records.size(); i++) result[i] = records.get(i).values();
        return result;
    }
}
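For example, with the sample data from your question, the call might look like this (the file name is hypothetical):
String[][] rows = getInfo("records.csv.gz"); // hypothetical gzipped CSV file
System.out.println(rows[0][1]); // prints 00203 for the sample data
System.out.println(rows[1][1]); // prints 00204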
A few modifications like the ones below should work for you.
public String[] getArray(String source) throws IOException {
    FileInputStream fileinput = new FileInputStream(source);
    GZIPInputStream gzip = new GZIPInputStream(fileinput);
    InputStreamReader inputstream = new InputStreamReader(gzip);
    BufferedReader bufr = new BufferedReader(inputstream);
    String str = null;
    List<String> numbers = new LinkedList<String>();
    while ((str = bufr.readLine()) != null) {
        // strip the quotes, split on commas and trim the surrounding spaces
        String[] localArr = str.replace("\"", "").split(",");
        for (String intString : localArr) {
            numbers.add(intString.trim());
        }
    }
    bufr.close();
    return numbers.toArray(new String[numbers.size()]);
}
Have you tried using the Scanner class together with scanner.nextInt()? You would not need to do any stripping then.
Scanner s = new Scanner(inputstream);
// treat quotes, commas and whitespace as delimiters so only the numbers remain
s.useDelimiter("[\",\\s]+");
ArrayList<String> list = new ArrayList<String>();
while (s.hasNextInt()) {
    // note: parsing as int drops leading zeros ("0001" becomes "1")
    list.add(String.valueOf(s.nextInt()));
}
String[] arr = list.toArray(new String[list.size()]);
Not tested:
arr = str.replace("\"", "").replace("(", "").replace(")", "").split(",");
I wanted to ask how I can load the information from a CSV file into a list. I don't have much so far, so a little help would be nice. What I've tried so far is to get the file, but I'm not sure if it's right because I'm not that good at file I/O. After that I'm stuck on how to save it in a list.
List<GameCharacter> characters;
static void loadTextFile(String textFile) throws FileNotFoundException {
//textFile = String.valueOf(new File("C:/Users/User/AppData/Local/Temp/Temp1_2022_WHP.zip/resources/characters.csv"));
textFile = "C:/Users/User/AppData/Local/Temp/Temp1_2022_WHP.zip/resources/characters.csv";
FileInputStream d = new FileInputStream(textFile);
}
Not sure if you have any limitations on what you can and can't use, but have you tried the OpenCSV library?
You can then read the file like this:
try (CSVReader reader = new CSVReader(new FileReader("file.csv"))) {
List<String[]> r = reader.readAll();
r.forEach(x -> System.out.println(Arrays.toString(x)));
}
This will give you a list of string arrays containing all the values for each line.
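If you then want to fill your List<GameCharacter>, you can map each String[] row to an object. The fields of GameCharacter aren't shown in your question, so the constructor below (a name plus one numeric stat) is only an assumption; adjust the indices and types to your actual columns:
List<GameCharacter> characters = new ArrayList<>();
for (String[] row : r) {
    // hypothetical constructor: GameCharacter(String name, int someStat)
    characters.add(new GameCharacter(row[0], Integer.parseInt(row[1])));
}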
You can use Scanner like this
textFile = "C:/Users/User/AppData/Local/Temp/Temp1_2022_WHP.zip/resources/characters.csv";
List<List<String>> mylist = new ArrayList<>();
try (Scanner scanner = new Scanner(new File(textFile))) {
while (scanner.hasNextLine()) {
mylist.add(getRecordFromLine(scanner.nextLine()));
}
}
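getRecordFromLine isn't defined above; a minimal sketch of what it could look like, assuming the values in each line are comma-separated:
private static List<String> getRecordFromLine(String line) {
    List<String> values = new ArrayList<>();
    try (Scanner rowScanner = new Scanner(line)) {
        rowScanner.useDelimiter(",");
        while (rowScanner.hasNext()) {
            values.add(rowScanner.next().trim());
        }
    }
    return values;
}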
You can also use BufferedReader from java.io like the following:
textFile = "C:/Users/User/AppData/Local/Temp/Temp1_2022_WHP.zip/resources/characters.csv";
final String COMMA_DELIMITER = ",";
List<List<String>> myList = new ArrayList<>();
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(textFile))) {
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        String[] values = line.split(COMMA_DELIMITER);
        myList.add(Arrays.asList(values));
    }
}
I wrote the following Java code,
try {
String str = "";
Hashtable< String, String> table = new Hashtable< String, String>();
fis = new FileInputStream("C:\\Users\\Dave\\Desktop\\station.txt");// FileInputStream
isr = new InputStreamReader(fis);
br = new BufferedReader(isr);
String str1 = "012649-99999";
String str2 = "012650-99999";
while ((str = br.readLine()) != null) {
String[] record = str.split("\t");
table.put(record[0], record[1]);
}
String stationName1 = table.get(str1);
String stationName2 = table.get(str2);
} catch(...)
And the content of station.txt is as follows:
012649-99999 SIHCCAJAVRI
012650-99999 TRNSET-HANSMOEN
When I run the program, stationName1 is always null, while the lookup for 012650-99999 (stationName2) works. Can anyone tell me why this happens? Thank you in advance!
@matt: Yes, that's right. When I changed the encoding from 'UTF-8' to 'ANSI', it worked and stationName1 got its value, but why does 'UTF-8' not work in this situation? I always use that format.
The problem is that your text file doesn't contain any \t characters; there are multiple spaces instead. The correct approach is to split on \\s+, which matches one or more whitespace characters.
String[] record = str.split("\\s+");
Moreover, Hashtable is obsolete; use HashMap instead. Here is the full code, which works for me (I have tested it):
String str;
HashMap<String, String> table = new HashMap<>();
FileInputStream fis = new FileInputStream("station.txt");
InputStreamReader isr = new InputStreamReader(fis);
BufferedReader br = new BufferedReader(isr);
String str1 = "012649-99999";
String str2 = "012650-99999";
while ((str = br.readLine()) != null) {
String[] record = str.split("\\s+");
table.put(record[0], record[1]);
}
System.out.println(table.get(str1));
System.out.println(table.get(str2));
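Regarding your follow-up about UTF-8 vs. ANSI: the most likely explanation is that the file was saved as UTF-8 with a byte order mark (BOM), so the first key in the file actually starts with the invisible character \uFEFF and no longer equals "012649-99999". This is only an assumption about your file, but if it is the case, stripping the BOM before inserting into the map fixes the lookup:
// strip a possible UTF-8 BOM from the first token (assumption: the file starts with one)
String key = record[0].replace("\uFEFF", "");
table.put(key, record[1]);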
Could you please replace the following line:
String[] record = str.split("\t");
with this line:
String[] record = str.split("[\\s]+");
and see the result?
Here is a working solution:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.Hashtable;
public class Test {
public static void main(String[] args) {
try {
String str = "";
Hashtable< String, String> table = new Hashtable< String, String>();
FileInputStream fis = new FileInputStream("C:\\Users\\Dave\\Desktop\\station.txt");// FileInputStream
InputStreamReader isr = new InputStreamReader(fis);
BufferedReader br = new BufferedReader(isr);
String str1 = "012649-99999";
String str2 = "012650-99999";
while ((str = br.readLine()) != null) {
System.out.println(str);
String[] record = str.split("[\\s]+");
table.put(record[0], record[1]);
}
br.close();
String stationName1 = table.get(str1);
String stationName2 = table.get(str2);
System.out.println("stationName1:"+stationName1);//
System.out.println("stationName2:"+stationName2);//
} catch(Exception e){
System.out.println(e);
}
}
}
If you're sure the file contains tab characters, the correct way to match them is as follows.
String[] record = str.split("\\t");
The argument to split() is a regex, and the regex for a tab is \t, which can be written as the Java string "\\t" (a literal "\t" in the string works just as well).
Also, don't use Hashtable, use HashMap instead as explained in the other answer.
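A quick way to convince yourself that both spellings behave the same (the sample line is taken from the question):
String row = "012649-99999\tSIHCCAJAVRI";
String[] a = row.split("\t");   // a literal tab character is passed to the regex engine
String[] b = row.split("\\t");  // backslash-t is interpreted by the regex engine as a tab
// both yield {"012649-99999", "SIHCCAJAVRI"}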
I am trying to read a huge file where a new line is not indicated by a space, comma, newline character, or anything else.
Example: line1element1, line1element2, line1element3, line2element1, line2element2, and so on..
The file is a CSV and I am reading it like this:
public static void main(String[] args) throws Exception {
ArrayList<String> list = new ArrayList<>();
String element;
String filename = "E:\\csv.csv";
Scanner scanner = new Scanner(new File(filename));
scanner.useDelimiter(",");
for (int i = 0; i < 50; i++) {
element = scanner.next();
list.add(element);
}
System.out.print(list);
}
This causes issues because element 50 of a line gets combined with element 51, even though a new line should start there.
Use a BufferedReader for this:
String filename = "E:\\csv.csv";
BufferedReader fileReader = null;
//Delimiter used in CSV file
final String DELIMITER = ",";
String line = "";
//Create the file reader
fileReader = new BufferedReader(new FileReader(filename));
//Read the file line by line
while ((line = fileReader.readLine()) != null)
{
//Get all tokens available in line
String[] tokens = line.split(DELIMITER);
for(String token : tokens)
{
//Print all tokens
System.out.println(token);
}
}
Use a BufferedReader, not a Scanner:
File f = ...;
BufferedReader br = new BufferedReader(new FileReader(f));
String line;
while ((line = br.readLine()) != null) {
String[] columns = line.split(",");
}
Why not use CSVParser from Apache Commons or OpenCSV?
Examples here:
OpenCSV Example
Apache Commons example
If you insist on doing this manually, use BufferedReader as the other comments mention.
From your description it seems your file does not have headers for each column. Use uniVocity-parsers to do this for you - it's 3 times faster than Commons CSV & OpenCSV and packed with features.
// you have many configuration options here - check the tutorial. By default values are trimmed and blank lines skipped.
CsvParserSettings settings = new CsvParserSettings();
CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new FileReader(new File("/path/to/your/file.csv")));
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
I have a text file with 300 lines or so, and the format is like this:
Name Amount Unit CountOfOrder
A 1 ml 5000
B 1 mgm 4500
C 4 gm 4200
// more data
I need to read the text file line by line because each line of data should be together for further processing.
Now I just use a string array for each line and access the data by index.
for each line in file:
array[0] = {data from the 'Name' column}
array[1] = {data from the 'Amount' column}
array[2] = {data from the 'Unit' column}
array[3] = {data from the 'CountOfOrder' column}
....
someOtherMethods(array);
....
However, I realized that if the text file changes its format (e.g. two columns are switched, or another column is inserted), it would break my program (accessing by index might be wrong or even cause an exception).
So I would like to use the column title as a reference to access each column. Maybe a HashMap is a good option, but since I have to keep each line of data together, building a HashMap for every line would be too expensive.
Does anyone have any thoughts on this? Please help!
You only need a single HashMap to map your column names to the proper column index. You fill the arrays by indexing with integers as you did before; to retrieve a column by name you'd use array[hashmap.get("Amount")], as in the sketch below.
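A minimal sketch of that idea, assuming the file is tab-separated as in the other answers (the column name "Amount" and someOtherMethods come from your question; the file name is hypothetical):
BufferedReader in = new BufferedReader(new FileReader("data.txt")); // hypothetical file name
Map<String, Integer> columnIndex = new HashMap<>();
String[] headers = in.readLine().split("\t");
for (int i = 0; i < headers.length; i++) {
    columnIndex.put(headers[i], i); // e.g. "Name" -> 0, "Amount" -> 1, ...
}
String line;
while ((line = in.readLine()) != null) {
    String[] array = line.split("\t");
    String amount = array[columnIndex.get("Amount")]; // survives reordered or inserted columns
    someOtherMethods(array); // your existing per-line processing
}
in.close();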
You can read the file using opencsv.
CSVReader reader = new CSVReader(new FileReader("yourfile.txt"), '\t');
List<String[]> lines = reader.readAll();
The first line contains the headers.
You can read each line of the file and, assuming that the first line contains the column headers, parse that line to get all the column names.
String[] column_headers = firstline.split("\t");
This will give you the names of all the columns; after that you just read through the file splitting on tabs, and the values will line up with the headers.
You could do something like this:
BufferedReader in = new BufferedReader(new InputStreamReader(
new FileInputStream(FILE)));
String line = null;
String[] headers = null;
String[] data = null;
Map<String, List<String>> contents = new HashMap<String, List<String>>();
if ((line = in.readLine()) != null) {
headers = line.split("\t");
}
for(String h : headers){
contents.put(h, new ArrayList<String>());
}
while ((line = in.readLine()) != null) {
data = line.split("\t");
if(data.length != headers.length){
throw new IllegalStateException("Row has a different number of columns than the header");
}
for(int i = 0; i < data.length; i++){
contents.get(headers[i]).add(data[i]);
}
}
It would give you flexibility, and would only require making the map once. You can then get the data lists from the map, so it should be a convenient data structure for the rest of your program to use.
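As a quick usage example (assuming the header row is exactly the one shown in your question):
List<String> amounts = contents.get("Amount");
System.out.println(amounts.get(0)); // "1", the Amount of the first data row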
This will give you an individual list for each column.
public static void main(String args[]) throws FileNotFoundException, IOException {
    List<String> headerList = new ArrayList<String>();
    List<String> column1 = new ArrayList<String>();
    List<String> column2 = new ArrayList<String>();
    List<String> column3 = new ArrayList<String>();
    List<String> column4 = new ArrayList<String>();
    int lineCount = 0;
    BufferedReader br = new BufferedReader(new FileReader("file.txt"));
    try {
        String line = br.readLine();
        String tokens[];
        while (line != null) {
            tokens = line.split("\t");
            if (lineCount == 0) {
                // the first line holds the column headers
                for (int count = 0; count < tokens.length; count++) {
                    headerList.add(tokens[count]);
                }
            } else {
                // data lines: one value per column list
                int count = 0;
                column1.add(tokens[count]); ++count;
                column2.add(tokens[count]); ++count;
                column3.add(tokens[count]); ++count;
                column4.add(tokens[count]); ++count;
            }
            lineCount++;
            line = br.readLine(); // advance, otherwise the loop never ends
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        br.close();
    }
}
Using the standard java.util.Scanner:
String aa = " asd 9 1 3 \n d -1 4 2";
Scanner ss = new Scanner(aa);
ss.useDelimiter("\n");
while ( ss.hasNext()){
String line = ss.next();
Scanner fs = new Scanner(line);
System.out.println( "1>"+ fs.next()+" " +fs.nextInt() +" " +fs.nextLong()+" " +fs.nextBigDecimal());
}
Using a bunch of HashMaps is OK; I wouldn't be afraid of that. ;)
If you need to process a lot of data, then try to translate your problem into a data-processing transformation.
For example: read all of your data into HashMaps, but store it in a database using some JPA implementation; then you can work through your data from there. ;)
I have a text file like this:
Item 1
Item 2
Item 3
I need to be able to read each "Item X" into a string and ideally store all the strings as a vector / ArrayList.
I tried:
InputStream is = new FileInputStream("file.txt");
is.read(); //looped for every line of text
but that seems to only handle integers.
Thanks
You have several options here; the easiest would be to use a Scanner (in java.util).
It has several convenience methods like nextLine(), next(), and nextInt(), so you could simply do the following:
Scanner scanner = new Scanner(new File("file.txt"));
List<String> text = new ArrayList<String>();
while (scanner.hasNextLine()) {
text.add(scanner.nextLine());
}
Alternatively you could use a BufferedReader (in java.io):
BufferedReader reader = new BufferedReader(new FileReader("file.txt"));
List<String> text = new ArrayList<String>();
for (String line; (line = reader.readLine()) != null; ) {
text.add(line);
}
However Scanners are generally easier to work with.
You should use FileUtils (from Apache Commons IO) to do this. It has a method named readLines:
public static List<String> readLines(File file, Charset encoding) throws IOException
Reads the contents of a file line by line to a List of Strings. The file is always closed.
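A usage sketch, assuming Apache Commons IO is on the classpath:
List<String> lines = FileUtils.readLines(new File("file.txt"), StandardCharsets.UTF_8);
for (String line : lines) {
    System.out.println(line);
}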
See @BackSlash's comment above to see how you're using InputStream.read() wrong.
@BackSlash also mentioned that you can use java.nio.file.Files#readAllLines, but only if you're using Java 7 or later.
You could use Java 7's Files#readAllLines. A short one-liner and no 3rd party library imports necessary :)
List<String> lines =
Files.readAllLines(Paths.get("file.txt"), StandardCharsets.UTF_8);
BufferedReader br = new BufferedReader(new FileReader("file.txt"));
try {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
String [] tmp ;
while (line != null) {
sb.append(line);
tmp = line.split(" ");
line = br.readLine();
}
String everything = sb.toString();
} finally {
br.close();
}
Scanner scan = new Scanner(new FileInputStream("file.txt"));
scan.nextLine();