I'm working in Java and want to extract data by column from a text file.
"myfile.txt" contents:
ID SALARY RANK
065 12000 1
023 15000 2
035 25000 3
076 40000 4
I want to extract the data individually by column, i.e. ID, SALARY, RANK, etc.
Basically, I want to perform operations on the individual values of each column.
I've listed the data from "myfile.txt" by reading it line by line in a while loop:
while((line = b.readLine()) != null) {
stringBuff.append(line + "\n");
}
Link: Reading selective column data from a text file into a list in Java
The answer under the above link suggests using the following:
String[] columns = line.split(" ");
But how do I use it correctly? Any hint or help would be appreciated.
You can use a regex to split on runs of whitespace, for example:
String text = "ID SALARY RANK\n" +
"065 12000 1\n" +
"023 15000 2\n" +
"035 25000 3\n" +
"076 40000 4\n";
Scanner scanner = new Scanner(text);
//read the first line; I assume the file
//always starts with a header
String nextLine = scanner.nextLine();
//regex to break on any amount of whitespace
String regex = "(\\s)+";
String[] header = nextLine.split(regex);
//this is printing all columns, you can
//access each column from row using the array
//indexes, example header[0], header[1], header[2]...
System.out.println(Arrays.toString(header));
//reading the rows
while (scanner.hasNextLine()) {
String[] row = scanner.nextLine().split(regex);
//this prints every column of the current row; you can
//access each column using the array
//indexes, e.g. row[0], row[1], row[2]...
System.out.println(Arrays.toString(row));
System.out.println(row[0]);//first column (ID)
}
while((line = b.readLine()) != null) {
String[] columns = line.split(" ");
System.out.println("my first column : "+ columns[0] );
System.out.println("my second column : "+ columns[1] );
System.out.println("my third column : "+ columns[2] );
}
Now instead of System.out.println, do whatever you want with your columns.
But I think your columns are separated by tabs so you might want to use split("\t") instead.
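To perform operations per column, the rows can be regrouped by header name. The following is a minimal sketch, not a definitive implementation: the class `ColumnReader` and the methods `readColumns` and `sum` are names made up for illustration, and it assumes the first non-blank line is the header and that fields are separated by whitespace (spaces or tabs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnReader {

    // Groups whitespace-separated values by header name.
    // Assumption: the first non-blank line is the header.
    public static Map<String, List<String>> readColumns(Iterable<String> lines) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        String[] header = null;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = trimmed.split("\\s+");
            if (header == null) {
                header = cells;
                for (String name : header) {
                    columns.put(name, new ArrayList<>());
                }
                continue;
            }
            // Copy at most as many values as there are header columns.
            for (int i = 0; i < header.length && i < cells.length; i++) {
                columns.get(header[i]).add(cells[i]);
            }
        }
        return columns;
    }

    // Example of an operation on one column: sum its values as integers.
    public static long sum(List<String> column) {
        long total = 0;
        for (String value : column) {
            total += Long.parseLong(value);
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ID SALARY RANK",
                "065 12000 1",
                "023 15000 2",
                "035 25000 3",
                "076 40000 4");
        Map<String, List<String>> columns = readColumns(lines);
        System.out.println(columns.get("ID"));          // [065, 023, 035, 076]
        System.out.println(sum(columns.get("SALARY"))); // 92000
    }
}
```

For a real file, the lines could come from `Files.readAllLines(Paths.get("myfile.txt"))` instead of the hard-coded list. The values stay as strings so that IDs such as 065 keep their leading zero.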
I am doing a project where I need to read multiple lines that contain user data from a txt file. This data will be used to create a profile.
For example:
name,lastname,email,hobbies1;hobbies2...hobbiesN,activity1;activity2...activityN
name2,lastname2,email.... and so on
I don't know how many hobbies or activities there are, so I have to store them in an array. All of the variables are on one line.
I tried using a delimiter and split, but when I move on to the next line I get an InputMismatchException.
The simplest way is to change the format.
Instead of separating every field with , try using ; for separating different types of attributes and , for elements of arrays. The end result will be something like:
name; lastname; email; hobbies1, hobbies2, ..., hobbiesN; activity1, activity2, ..., activityN
First you split the String using ; as the delimiter; then, for the fields that allow arrays, you divide the array into its elements by splitting that substring with , as the delimiter.
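The two-level split described above could look roughly like this. It is a sketch under the assumption that the file has already been rewritten in the `;` / `,` format; `ProfileLineParser`, `splitFields` and `splitList` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProfileLineParser {

    // First level: split the line into fields on ';' and trim each one.
    // The -1 limit keeps trailing empty fields instead of dropping them.
    public static String[] splitFields(String line) {
        String[] fields = line.split(";", -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    // Second level: split a list-valued field on ',' and trim each element.
    // An empty field yields an empty list.
    public static List<String> splitList(String field) {
        List<String> items = new ArrayList<>();
        for (String item : field.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        String line = "name; lastname; email; hobbies1, hobbies2; activity1, activity2, activity3";
        String[] fields = splitFields(line);
        String name = fields[0];
        String lastname = fields[1];
        String email = fields[2];
        // Fields 3 and 4 hold the lists; guard against short lines.
        List<String> hobbies = fields.length > 3 ? splitList(fields[3]) : new ArrayList<>();
        List<String> activities = fields.length > 4 ? splitList(fields[4]) : new ArrayList<>();
        System.out.println(name + " " + lastname + " " + email);
        System.out.println(hobbies);    // [hobbies1, hobbies2]
        System.out.println(activities); // [activity1, activity2, activity3]
    }
}
```

Because each line is handled on its own with `split`, moving to the next line does not depend on any scanner delimiter state.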
Read lines, then split on comma, and split on semi-colon where needed.
Don't use Scanner for line-reading a file.
try (BufferedReader in = Files.newBufferedReader(Paths.get("test.txt"))) {
for (String line; (line = in.readLine()) != null; ) {
String[] fields = line.split(",");
String name = (fields.length >= 1 ? fields[0] : "");
String lastname = (fields.length >= 2 ? fields[1] : "");
String email = (fields.length >= 3 ? fields[2] : "");
String[] hobbies = (fields.length >= 4 ? fields[3].split(";") : new String[0]);
String[] activities = (fields.length >= 5 ? fields[4].split(";") : new String[0]);
System.out.println("name=" + name +
", lastname=" + lastname +
", email=" + email +
", hobbies=" + Arrays.toString(hobbies) +
", activities=" + Arrays.toString(activities));
}
}
test.txt
name,lastname,email,hobbies1;hobbies2...hobbiesN,activity1;activity2...activityN
name2,lastname2,email
Output
name=name, lastname=lastname, email=email, hobbies=[hobbies1, hobbies2...hobbiesN], activities=[activity1, activity2...activityN]
name=name2, lastname=lastname2, email=email, hobbies=[], activities=[]
I'm trying to load a CSV file and split 'timespan' into 'begin' and 'end'. If the timespan consists of a single date, 'begin' and 'end' are the same.
timespan,someOtherField, ...
27.03.2017 - 31.03.2017,someOtherValue, ...
31.03.2017,someOtherValue, ...
Result:
begin,end,someOtherField
27.03.2017,31.03.2017,someOtherValue, ...
31.03.2017,31.03.2017,someOtherValue, ...
At the moment I'm loading the file line by line using OpenCSV. This works pretty well, but I don't know how to split one attribute. Probably I have to parse the CSV into an array?
For any line l you can use StringTokenizer to get the tokens separated by ",":
StringTokenizer tokens = new StringTokenizer(l, ",");
The first token is the timespan, so:
String timespan = tokens.nextToken();
Then you can split timespan on " - ":
String[] startEnd = timespan.split(" - ");
Finally, check the length of startEnd. If startEnd.length == 1, begin and end coincide, so the result is startEnd[0],startEnd[0];
otherwise the result is startEnd[0],startEnd[1].
I hope this could help you solve the problem.
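Put together, the steps above can be sketched as follows. This is only an illustration: `TimespanSplitter`, `splitTimespan` and `convertLine` are made-up names, the first field is taken with `indexOf` rather than `StringTokenizer`, and the sketch assumes the timespan is the first field and is neither quoted nor contains a comma.

```java
public class TimespanSplitter {

    // Returns {begin, end}. A single date is used for both values.
    public static String[] splitTimespan(String timespan) {
        String[] parts = timespan.split(" - ");
        if (parts.length == 1) {
            String date = parts[0].trim();
            return new String[] { date, date };
        }
        return new String[] { parts[0].trim(), parts[1].trim() };
    }

    // Rewrites "timespan,rest..." as "begin,end,rest...".
    public static String convertLine(String line) {
        int comma = line.indexOf(',');
        String timespan = comma < 0 ? line : line.substring(0, comma);
        String rest = comma < 0 ? "" : line.substring(comma); // keeps the leading comma
        String[] range = splitTimespan(timespan);
        return range[0] + "," + range[1] + rest;
    }

    public static void main(String[] args) {
        System.out.println(convertLine("27.03.2017 - 31.03.2017,someOtherValue"));
        // 27.03.2017,31.03.2017,someOtherValue
        System.out.println(convertLine("31.03.2017,someOtherValue"));
        // 31.03.2017,31.03.2017,someOtherValue
    }
}
```

The header line would need separate handling, for example by replacing `timespan` with `begin,end` before the data rows are converted.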
Thanks for your answer! I parsed the csv into an extra class and created an object for each record. The code below shows the splitting of the timespan. I will now rebuild a new csv file from all objects.
// Load CSV as Booking objects
ArrayList<Booking> bookings = Utils.readCSV(csvClean);
for (int i = 0; i < bookings.size(); i++) {
String timespan = bookings.get(i).getTimespan();
String begin = "";
String end = "";
if (timespan.contains(" - ")) {
// Split timespan and set values
String[] parts = timespan.split(" - ");
begin = parts[0].trim();
end = parts[1].trim();
bookings.get(i).setBegin(begin);
bookings.get(i).setEnd(end);
} else {
bookings.get(i).setBegin(timespan.trim());
bookings.get(i).setEnd(timespan.trim());
} // end if else
} // end for
I am parsing around 10 CSV files
and tokenizing them. The fourth token, 'PageTitle', sometimes starts with a double quote ("). I handle that case specially, like this:
String page = st.nextToken();
if(page.startsWith("\""))
{
String s;
while(!(s=st.nextToken()).endsWith("\""))
{
System.out.println(page);
page += (","+s);
System.out.println(page);
}
page += (","+s);
page = page.substring(0, page.length());
}
I don't know where I am making a mistake, but I want to read the tokens that start with a double quote, continue over several tokens, and end with a double quote as a single token, like this:
"List of lesbian, gay, bisexual or transgender-related films of 2012"
But I am getting only "List of lesbian, gay
Instead of rolling your own parser, you can use a library like OpenCSV. You will need to do the following:
a) Add the dependency, if you are using Maven
<dependency>
<groupId>net.sf.opencsv</groupId>
<artifactId>opencsv</artifactId>
<version>2.3</version>
</dependency>
b) To illustrate, I have used the following sample data, saved as data.csv in the working directory
one , two , three
four,five,"read , these , numerals"
c) Sample code
CSVReader reader = new CSVReader(new FileReader("data.csv"));
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
// nextLine[] is an array of values from the line
System.out.println("Column 1 :"+nextLine[0]);
System.out.println("Column 2 :"+nextLine[1]);
System.out.println("Column 3 :"+ nextLine[2]);
}
prints :
Column 1 :one
Column 2 : two
Column 3 : three
Column 1 :four
Column 2 :five
Column 3 :read , these , numerals
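If adding a dependency is not an option, the quoting rule can be approximated by hand. The sketch below is not OpenCSV's implementation; `QuotedCsvSplitter` is a hypothetical class that splits a single line on commas outside double quotes and treats a doubled quote inside a quoted field as a literal quote. It does not handle fields that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvSplitter {

    // Splits one CSV line on commas that are outside double quotes.
    // The surrounding quotes are removed; "" inside quotes becomes ".
    public static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("four,five,\"read , these , numerals\""));
        // [four, five, read , these , numerals]
    }
}
```

A maintained library remains the safer choice for real data, since CSV dialects differ in how they escape quotes and line breaks.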
I have a question about retrieving barcode data.
The screenshot below shows my Java application.
For example, the barcode contains "12345-6789".
When I put the cursor on "Mo No." and scan the barcode, the system reads it and fills the "Mo No." field with "12345-6789".
But what I want, once the barcode is scanned, is "12345" in Mo No. and "6789" in Container No.
How should I implement this?
Please advise. Thanks.
You can just ignore whatever comes after the dash (-).
For example:
String barcode="12345-6789";
System.out.println(barcode.substring(0,barcode.indexOf("-"))); //this prints only what comes before the first occurrence of '-'
OUTPUT:
12345
Use String#split
String toSplit = "12345-6789";
String a;
String b;
//check that the string contains the split character, using String#contains
if(toSplit.contains("-"))
{
//split takes RegularExpression!
String[] parts = toSplit.split("-");
a = parts[0]; // =12345
b = parts[1]; // =6789
}
else
{
throw new IllegalArgumentException(toSplit + " does not contain -");
}
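Since both halves of the barcode are needed, one for each field, the two answers above can be combined. This is a small sketch with made-up names (`BarcodeSplitter`, `splitBarcode`); it splits at the first dash only, and wiring the result into the form is left as a comment because the actual UI components are not shown in the question.

```java
public class BarcodeSplitter {

    // Splits at the first '-' and returns {moNo, containerNo}.
    // Throws if the scanned text contains no dash.
    public static String[] splitBarcode(String barcode) {
        int dash = barcode.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException(barcode + " does not contain -");
        }
        return new String[] { barcode.substring(0, dash), barcode.substring(dash + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitBarcode("12345-6789");
        System.out.println(parts[0]); // 12345
        System.out.println(parts[1]); // 6789
        // In the application, these two values would be written to the
        // "Mo No." and "Container No." fields (components not shown here).
    }
}
```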
I have a CSV file containing multiple rows. If the first column of a row is empty, I get an error and am not able to insert the row into the database.
For example:
If the row is: 130,1,datafile8.csv, 2007 ,17,List_date there is no problem reading and inserting it,
but if the row is: ,0,datafile8.csv,Bihar,7,list_Left I am not able to read and insert it. How can I insert null for the empty value so that this row can be inserted into the database?
String keyword = "celldescription.csv";
File makefile = new File(keyword);
BufferedReader r2 = new BufferedReader(new FileReader(makefile));
strLine1 = r2.readLine();
System.out.println (strLine1);
String r="0";int r1=0;
while((strLine1=r2.readLine())!=null)
{
System.out.println (strLine1);
StringTokenizer st2 = new StringTokenizer(strLine1, ",");
// Print the content on the console
String cellvalue = st2.nextToken();
String position = st2.nextToken();
String Docid=st2.nextToken();
String Word=st2.nextToken();
String Count=st2.nextToken();
String List_Entry=st2.nextToken();
String tab3="insert into description(cellvalue,position,Docid,Word,Count,List_Entry) values(?,?,?,?,?,?)";
ps = connection.prepareStatement(tab3);
ps.setString (1,cellvalue );
ps.setString (2,position );
ps.setString (3,Docid);
ps.setString (4,Word );
ps.setString (5,Count );
ps.setString (6,List_Entry );
ps.executeUpdate();
}//end of while
r2.close();
System.out.println("Data is inserted");
}//try closed
When your String strLine1 starts with a comma (,), StringTokenizer skips the empty token. It omits empty strings at the start, at the end, and even in between.
Ex - ,0,datafile8.csv,Bihar,7,list_Left
tokens -> "0", "datafile8.csv", "Bihar", "7" and "list_Left"
It is better to split the string on the comma (,).
Ex -
String[] str = strLine1.split(",",-1);
str[] -> ["", "0", "datafile8.csv", "Bihar", "7", "list_Left"]
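The difference is easy to check side by side. In this sketch, `EmptyFieldDemo` and its methods are names invented for the example; `emptyToNull` is one possible way to turn an empty field into a null before it is bound to the statement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class EmptyFieldDemo {

    // StringTokenizer silently drops empty tokens.
    public static List<String> withTokenizer(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // split with limit -1 keeps leading, inner and trailing empty fields.
    public static String[] withSplit(String line) {
        return line.split(",", -1);
    }

    // Maps an empty or blank field to null, otherwise returns it trimmed.
    public static String emptyToNull(String field) {
        String trimmed = field.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    public static void main(String[] args) {
        String line = ",0,datafile8.csv,Bihar,7,list_Left";
        System.out.println(withTokenizer(line).size());        // 5
        System.out.println(withSplit(line).length);            // 6
        System.out.println(emptyToNull(withSplit(line)[0]));   // null
        System.out.println(Arrays.toString(withSplit(line)));
    }
}
```

When a value comes back as null, it can be bound with `ps.setNull(index, java.sql.Types.VARCHAR)` instead of `ps.setString`, assuming the target column accepts NULL.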
You may want to consider using a Java library for your work with CSV files.
OpenCSV is one; it helped me a lot.
Some of its features:
Arbitrary numbers of values per line
Ignoring commas in quoted elements
Handling quoted entries with embedded carriage returns (ie entries that span multiple lines)
Configurable separator and quote characters (or use sensible defaults)
Read all the entries at once, or use an Iterator style model
Creating csv files from String[] (ie. automatic escaping of embedded quote chars)