Read csv file with Java OpenCSV

I have the following .csv file:
Company ABC
"Jan 1, 2020 - Sep 30, 2020"
Product Country Avg. monthly clients Avg. month charge Parts change Impact In stock Clients in list City
Nissan Maxima USA 6600 0% -18% Low 18
BMW X7 M50i USA 18100 22% 0% Low 28
Volvo XC90 USA 880 0% -12% Low 10
Opel Insignia USA 320 -34% -34% Low 23
Renult Triber USA 140 -18% -36% Low 8
Toyota Yaris USA 880 0% -28% Low 30
Ford Mondeo USA 70 -20% -71% Low 1
The delimiter is a tab character. I tried to read the file with OpenCSV using this code:
@Getter
@Setter
public class CsvLine {
    @CsvBindByPosition(position = 1)
    private String model;
    @CsvBindByPosition(position = 2)
    private String country;
}
String fileName = "C:\\in_progress\\zzz.csv";
List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(fileName))
.withType(CsvLine.class)
.withSeparator(' ')
.withSkipLines(1)
.build()
.parse();
for(CsvLine item: beans){
System.out.println(item.getModel());
}
But I get this output:
X C 9 0
null
I n s i g n i a U S A 3 2 0 - 3 4 % - 3 4 % L o w 2 3
null
T r i b e r
null
Y a r i s U S A 8 8 0 0 % - 2 8 % L o w 3 0
null
M o n d e o U S A 7 0 - 2 0 % - 7 1 % L o w 1
null
null
Do you know how I can read the file properly with Java, preferably with OpenCSV?
Test file https://www.dropbox.com/s/7jo4i3bs6h8at25/zzz.csv?dl=0

If your CSV file really uses the tab character as the field delimiter, it should be sufficient to change the code to:
List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(fileName))
.withType(CsvLine.class)
.withSeparator('\t')
.withSkipLines(2)
.build()
.parse();
I changed the withSeparator argument to '\t' and increased the number of lines to skip to 2.
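To see why the separator matters, here is a minimal plain-Java sketch (not OpenCSV itself) that splits a few lines modelled on the sample file. The class name TabSplitSketch and the method parse are made up for this illustration, and the sample lines are shortened. It also shows that, with the file as posted, the column header is the third line, so a third skipped line would drop it as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenCSV.
final class TabSplitSketch {

    // Skips the first `skip` lines, then splits every remaining line on the separator regex.
    static List<String[]> parse(List<String> lines, String separatorRegex, int skip) {
        List<String[]> rows = new ArrayList<>();
        for (int i = skip; i < lines.size(); i++) {
            // limit -1 keeps trailing empty fields
            rows.add(lines.get(i).split(separatorRegex, -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Company ABC",
                "\"Jan 1, 2020 - Sep 30, 2020\"",
                "Product\tCountry\tAvg. monthly clients",
                "Nissan Maxima\tUSA\t6600",
                "BMW X7 M50i\tUSA\t18100");

        // Tab as separator: "Nissan Maxima" stays in one field.
        for (String[] row : parse(lines, "\t", 3)) {
            System.out.println(row[0] + " | " + row[1]);
        }

        // Space as separator: the product name is cut at its inner space.
        for (String[] row : parse(lines, " ", 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With a space as separator the multi-word product names are broken apart, which is the kind of shifted output shown in the question.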

Related

Using Java and pdfBox to search a pdf for usd amounts

This is for something that could save me about 10 minutes at work; I am not getting paid for it. This is Java, and it's been a while since I touched Java. I'm using pdfBox to search a PDF for just the numbers that are in USD currency format. This is roughly what the document looks like:
Activity Report
Business Date: 10/9/2019 Property Code: me.ra777 Shift: 9 User: me.ra777
Reserve
Account Person Name Start End Days Status Money TypeOfCode Type Location Source GTD Date User
077071543 Smith's, John Middle 9/25/19 9/26/19 1 O 55.50 BAR SNQQ 211 WI MC 9/25/19 me.ra777
877075375 45Lisa, Jo.nes Mid 9/25/19 9/26/19 1 I 99.00 SEG SNKE 138 WI VI 9/25/19 me.ra777
677256813 Jo^hn Wi.ck Ed 9/26/19 9/27/19 1 O 129.00 TRQ SNQQ 132 WI VI 9/26/19 me.ra777
477007406 Guys, Are 9/26/19 9/27/19 1 O 129.00 BAR SNQQ 133 WI VI 9/26/19 me.ra777
977495887 Last, First 9/27/19 9/28/19 1 O 165.00 BAR SNKE 438 WI VI 9/27/19 me.ra777
677472246 Po.or, Rich 9/27/19 9/28/19 1 O 165.00 BAR SNKE 138 WI MC 9/27/19 me.ra777
677457228 Dude, Isn't Here 9/27/19 9/28/19 1 I 180.00 BAR SNQQ 433 WI MC 9/27/19 me.ra777
Date/Time of Printing: 10/10/2019 1:42 PM Software Version: ssrs7x67 Page 1 of 1
If I use a method like this:
public static void oneLine(Scanner sc){
    while (sc.hasNextLine()) {
        String line = sc.nextLine();
        if(line.contains(" WI ")){
            displayArea.append("\n"+line + "\n");
            break;
        }
    }
    sc.close();
}
I would only get this for my output.
077071543 Smith's, John Middle 9/25/19 9/26/19 1 O 55.50 BAR SNQQ 211 WI MC 9/25/19 me.ra777
My desired output would be just
55.50
Maybe even all the USD amounts like this
55.50
99.00
129.00
129.00
165.00
165.00
180.00
Okay a little bit more data about this document. I only need the data in these lines
077071543 Smith's, John Middle 9/25/19 9/26/19 1 O 55.50 BAR SNQQ 211 WI MC 9/25/19 me.ra777
877075375 45Lisa, Jo.nes Mid 9/25/19 9/26/19 1 I 99.00 SEG SNKE 138 WI VI 9/25/19 me.ra777
677256813 Jo^hn Wi.ck Ed 9/26/19 9/27/19 1 O 129.00 TRQ SNQQ 132 WI VI 9/26/19 me.ra777
477007406 Guys, Are 9/26/19 9/27/19 1 O 129.00 BAR SNQQ 133 WI VI 9/26/19 me.ra777
977495887 Last, First 9/27/19 9/28/19 1 O 165.00 BAR SNKE 438 WI VI 9/27/19 me.ra777
677472246 Po.or, Rich 9/27/19 9/28/19 1 O 165.00 BAR SNKE 138 WI MC 9/27/19 me.ra777
677457228 Dude, Isn't Here 9/27/19 9/28/19 1 I 180.00 BAR SNQQ 433 WI MC 9/27/19 me.ra777
Everything in those lines can change EXCEPT the Source column, where it says "WI", and the User column, where it says "me.ra777". People can mess up names, as in "45Lisa, Jo.nes" and "Jo^hn Wi.ck".
Ultimately I still have more work to do after this: I need to add up all the USD amounts and then divide the total by 100, which in this example I believe would give me 9.225 if I did my math right.
I'm really hoping I can just change part of this code like here ....
if(line.contains(" WI ")){
So then I could at least get an output of only the lines I need and I could work a little on my own from there and try to figure the rest out on my own.

Solved it. In short, I had two major methods: find() and getUSD(final String).
find() used:
1. A for loop
2. A while (sc.hasNextLine()) loop
3. An if that checks that the line contains "WI" and does not contain "Software Version "
4. A variable rate = getUSD(line)
5. Some math on the amounts (plus complaining that Java can't tell this is a number without parsing)
6. print("\n" + rate);
getUSD(final String) used:
1. An if/else around Matcher m = Pattern.compile("-?\\d+(\\.\\d+)").matcher(strings);
2. while (m.find())
3. return m.group()
4. There's actually some parsing and some other type conversion too
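The outline above could look roughly like the following sketch. It is not the poster's actual code: the class name UsdAmountSketch, the method names, and the stricter pattern (a standalone token with exactly two decimals) are assumptions made for this example. It keeps only lines containing " WI ", takes the first amount on each, sums the amounts, and divides the total by 100.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and the exact pattern are assumptions.
final class UsdAmountSketch {

    // A whitespace-delimited token such as 55.50: digits, a dot, exactly two digits.
    private static final Pattern AMOUNT = Pattern.compile("(?<!\\S)\\d+\\.\\d{2}(?!\\S)");

    // Returns the first amount found on every line that contains " WI ".
    static List<BigDecimal> extractAmounts(List<String> lines) {
        List<BigDecimal> amounts = new ArrayList<>();
        for (String line : lines) {
            if (!line.contains(" WI ")) {
                continue; // header, footer, or unrelated line
            }
            Matcher m = AMOUNT.matcher(line);
            if (m.find()) {
                amounts.add(new BigDecimal(m.group()));
            }
        }
        return amounts;
    }

    // Adds up the amounts and divides the total by 100.
    static BigDecimal totalDividedBy100(List<BigDecimal> amounts) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal amount : amounts) {
            total = total.add(amount);
        }
        return total.movePointLeft(2);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Activity Report",
                "077071543 Smith's, John Middle 9/25/19 9/26/19 1 O 55.50 BAR SNQQ 211 WI MC 9/25/19 me.ra777",
                "877075375 45Lisa, Jo.nes Mid 9/25/19 9/26/19 1 I 99.00 SEG SNKE 138 WI VI 9/25/19 me.ra777",
                "Date/Time of Printing: 10/10/2019 1:42 PM Software Version: ssrs7x67 Page 1 of 1");
        List<BigDecimal> amounts = extractAmounts(lines);
        for (BigDecimal amount : amounts) {
            System.out.println(amount);
        }
        System.out.println(totalDividedBy100(amounts));
    }
}
```

BigDecimal is used here instead of double so that the currency values are summed exactly; names containing dots such as "Jo.nes" are ignored because the pattern requires digits on both sides of the dot.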

I want to scrape text from an image file and store it in Excel

BOWLING O M R W ECON 0s 45 6 WD NB Losing Dhoni as a batter always
difficult for us - Raina
TABoult 4 0 3 0 925 M 2 3 1 0 The Chennai Super Kings batsman
struck form after lean season and
JETED 6 0 = 4 O 0 0 lauded Dhoni's support at the crease
CHMorris 4 0 4 ns o9 8 1 1 against Delhi Capitals
AR Patel 3 o 3 1 1033 6 3 2 o o “Watch the ball, hit the ball' - Dhoni's
formula for the final over
S o0 e sEoe 10 o o The CSK captain has hit 554 runs in
e PR el 227 balls inthe 20th over of an IPL
match. Thats 13% of all the runs he's
made i this tournament
. Delhi Capitals Innings (target: 180 runs from 20 overs) Talking Points - Is Dhoni babering #EEIEER -
This is my string; I want it in Excel.

Based on the sparse description of what you want to do, I would suggest:
Read the text from the image
Replace all spaces with a semicolon
String csvContent = imgData.replaceAll(" ",";");
Save the text to a csv file
Open the csv file with Excel
The following example assumes that you have managed to retrieve the data, which is then post-processed into the csv format. The contents are written to a file which you can just double-click to see that the data is split into columns as you requested.
String[] data = new String[] {
"BOWLING O M R W ECON 0s 45 6", //notice that your OCR software does not properly recognise the string here
"TABoult 4 0 3 0 925 M 2 3",
"JETED 6 0 = 4 O 0 0"
};
BufferedWriter writer = new BufferedWriter( new FileWriter( System.getProperty( "user.home" ) + System.getProperty( "file.separator" ) + "data.csv" ) );
for( String record : data ) {
writer.write( record.replaceAll( " ", ";" ) );
writer.write( "\n" );
}
writer.close();
As I noted in the comment above, your OCR does not work correctly. I would suggest you take a look at the jsoup HTML parser to get the information and continue from there. Otherwise you will not be satisfied with the result.
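One detail of the replacement step: replacing every single space produces empty columns wherever the OCR text contains two or more spaces in a row. A small sketch of an alternative that collapses runs of whitespace first is shown below; the class name OcrCsvSketch and the method toCsvLine are invented for this example, and it does nothing about misrecognised characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class OcrCsvSketch {

    // Turns one OCR text line into a semicolon-separated record,
    // treating any run of whitespace as a single column break.
    static String toCsvLine(String record) {
        return String.join(";", record.trim().split("\\s+"));
    }

    static List<String> toCsvLines(List<String> records) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            out.add(toCsvLine(record));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList(
                "BOWLING O M R W ECON 0s 45 6",
                "TABoult  4 0 3 0 925 M 2 3");
        for (String line : toCsvLines(data)) {
            System.out.println(line);
        }
    }
}
```

The resulting lines can be written to the file with the same BufferedWriter loop as in the answer.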

driver.get("https://www.espncricinfo.com/series/8048/scorecard/1178425/chennai-super-kings-vs-delhi-capitals-50th-match-indian-premier-league-2019");
WebElement element = driver.findElement(By.xpath("//article[@class='sub-module scorecard'][1]"));
JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript("arguments[0].scrollIntoView(true);", element);
File screen = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
File file = new File("C:\\Users\\user\\Desktop\\screenshot1\\screenshotOfElement2.png");
FileHandler.copy(screen, file);
ITesseract instance = new Tesseract();
instance.setDatapath("C:\\selenium_work\\ScrapingText.PDF\\tessdata");
String result = instance.doOCR(file);
//System.out.println(result);
String[] lines = result.split("\\n");
This is what I am trying.

How to read a tab separated file and select few values from it using java

I have a tab separated file which looks like this
STID STNM TIME TMAX TMAXO TMIN TMINO TAVG TBAD DMAX DMAXO DMIN DMINO DAVG VDEF DBAD SMAX SMAXO SMIN SMINO SAVG SBAD BMAX BMAXO BMIN BMINO BAVG BBAD S5MX S5MXO S5MN S5MNO S5AV S5BD S25X S25XO S25N S25NO S25AV S25BD S60X S60XO S60N S60NO S60AV S60BD HMAX HMAXO HMIN HMINO HAVG HBAD PMAX PMAXO PMIN PMINO PAVG MSLP PBAD AMAX AMAXO ATOT ABAD PDIR PDFQ SDIR SDFQ IBAD WSMX WSMXO WSMN WSMNO WSPD WDEV WMAX WMAXO WBAD RAIN RNUM RMAX RBAD 9AVG 9BAD 2MAX 2MIN 2AVG 2DEV 2BAD HDEG CDEG HTMX HTMXO HTBAD WCMN WCMNO WCBAD
ACME 110 0 76.32 131 69.22 184 71.57 0 69.10 286 61.55 3 66.48 4.22 0 83.16 3 78.24 288 80.85 0 85.37 3 77.74 288 81.77 0 83.12 150 77.86 288 80.58 0 83.84 3 81.23 288 82.34 0 81.54 3 80.94 285 81.29 0 96.82 278 66.82 1 84.59 0 28.74 284 28.67 23 28.71 30.10 0 412.73 130 5.46 0 -996 -999 -996 -999 59 10.92 132 0.00 37 4.34 2.41 14.61 146 0 0.22 19 0.24 0 71.67 0 8.44 0.00 2.49 2.30 0 0.00 7.77 -996 999 288 -996 999 288
ADAX 1 0 73.99 96 68.61 21 71.32 0 70.91 169 62.77 1 68.22 2.58 0 87.15 3 82.99 288 84.83 0 88.32 3 79.54 288 83.59 0 85.06 3 81.84 288 83.31 0 88.48 3 85.21 288 86.61 0 -996 999 -996 999 -996 96 98.40 274 73.27 1 90.20 0 29.08 137 29.01 17 29.04 30.08 0 210.42 151 5.23 0 -996 -999 -996 -999 139 12.83 106 0.00 33 3.65 3.03 19.28 121 0 0.24 23 0.24 0 71.57 0 8.84 0.00 2.07 2.48 0 0.00 6.30 -996 999 288 -996 999 288
ALTU 2 0 75.51 107 68.74 168 71.63 0 70.43 279 64.56 125 67.48 3.50 0 80.60 3 77.88 288 78.91 0 79.11 3 75.96 288 77.08 0 79.97 3 77.23 288 78.41 0 81.95 3 79.57 288 80.55 0 -996 999 -996 999 -996 96 98.36 286 70.28 106 87.18 0 28.68 276 28.60 51 28.64 30.09 0 202.20 123 5.03 0 2 30.80 4 18.63 25 13.72 128 0.00 70 5.79 2.71 18.19 128 0 0.19 19 0.12 0 71.53 0 9.55 0.00 3.71 2.22 0 0.00 7.12 -996 999 288 -996 999 288
I am trying to read this file so that I can append some of the values from this file to another file.
But first of all, I am unable to read the values of the column TMAX, which is the 4th column.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class first {
    public static void main(String[] args)
    {
        // TODO Auto-generated method stub
        String fileName="daily.txt";
        File file = new File(fileName);
        try{
            Scanner inputStream = new Scanner(file);
            while (inputStream.hasNext()){
                String data = inputStream.next();
                String[] values = data.split("\t");
                System.out.println(values[4]);
            }
            inputStream.close();
        }
        catch(FileNotFoundException e){
            e.printStackTrace();
        }
    }
}
When I use the above code the output looks like this
STID
STNM
TIME
TMAX
TMAXO
TMIN
TMINO
TAVG
TBAD
DMAX
DMAXO
DMIN
DMINO
DAVG
VDEF
DBAD
SMAX
SMAXO
SMIN
I want to get an output which displays the values of the specified column numbers.

You need to use nextLine() instead of next() to read the whole line. I also ran your program and found that your file is not truly tab-separated, which is why your split may not work. Fix these two things and you are good to go.
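Put together, the two fixes might look like this minimal sketch. The class name ColumnReaderSketch and the method readColumn are made up for the example, and it assumes no field contains whitespace itself, which holds for the sample rows in the question. Note that the 4th column (TMAX) has the zero-based index 3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical sketch; assumes fields never contain whitespace.
final class ColumnReaderSketch {

    // Returns the values of one zero-based column, skipping the header line.
    static List<String> readColumn(Scanner in, int columnIndex) {
        List<String> values = new ArrayList<>();
        boolean header = true;
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            if (header) {
                header = false; // first non-blank line holds the column names
                continue;
            }
            String[] fields = line.split("\\s+"); // tabs or runs of spaces
            if (columnIndex < fields.length) {
                values.add(fields[columnIndex]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String sample = "STID STNM TIME TMAX TMAXO\n"
                + "ACME 110 0 76.32 131\n"
                + "ADAX 1 0 73.99 96\n";
        try (Scanner in = new Scanner(sample)) {
            System.out.println(readColumn(in, 3));
        }
    }
}
```

For the real file, new Scanner(new File("daily.txt")) can be passed in place of the in-memory sample.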

Here's a sample method which achieves what you're looking for (not tested, but the concept is there). Essentially, you need to read line by line, split each line into some sort of array or list, and collect those into a 2D structure. You can also replace split("\t") with splitting on whitespace in general.
public List<String> getByColumn(int col, File file)
{
    List<ArrayList<String>> arrayOfArrays = null;
    try {
        FileInputStream fis = new FileInputStream(file);
        InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
        BufferedReader br = new BufferedReader(isr);
        String line;
        arrayOfArrays = new ArrayList<ArrayList<String>>();
        while ( ( line = br.readLine() ) != null )
        {
            ArrayList<String> list = new ArrayList<String>(Arrays.asList(line.split("\t")));
            arrayOfArrays.add(list);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    ArrayList<String> output = new ArrayList<String>();
    //can use a foreach loop, or the below method.
    //for (ArrayList<String> l : arrayOfArrays)
    //{
    //    output.add(l.get(col));
    //}
    for ( int i = 1; i < arrayOfArrays.size(); i++ )
    {
        output.add(arrayOfArrays.get(i).get(col));
    }
    return output;
}

You should really use univocity-parsers for that - it will be way faster than your String.split and also helps a lot selecting the columns you want:
//configure the parser
TsvParserSettings parserSettings = new TsvParserSettings();
parserSettings.selectFields("TMAX" /*and others*/);
//then parse
TsvParser parser = new TsvParser(parserSettings);
List<String[]> parsedRows = parser.parseAll(new File("daily.txt"), "UTF-8");
Hope this helps.
Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license)

Use this code to parse the line:
String[] values = data.trim().replaceAll(" +", " ").split(" ");

Print integer barcode with zebra in CPCL language

I have a Java app and I'm trying to print a barcode. Everything works fine unless my data is an integer (only digits). The printing is done in another function, like this:
byte[] configLabel = getConfigLabel();
printerConnection.write(configLabel);
private byte[] getConfigLabel() {
byte[] configLabel = null;
String str=inputbarcode.getText().toString();
String str2 = "link";
StringBuilder print = new StringBuilder("! UTILITIES\r\n");
print.append("IN-MILLIMETERS\r\n");
print.append("SETFF 15 2.5\r\n");
print.append("PRINT\r\n");
print.append("! 0 180 180 180 1\r\n");
print.append("CENTER\r\n");
print.append("BARCODE 128 1 1 50 0 20"+str.toString()+"\r\n");
print.append("T 0 3 0 80"+str.toString()+"\r\n");
print.append("T 0 3 0 100"+str2+"\r\n");
print.append("PRINT\r\n");
configLabel=String.valueOf(print).getBytes();
return configLabel;
}

Your code snippet is missing a separator between the barcode and text commands and the actual barcode or text content. Insert one space character at the end of the following strings in your code:
change "BARCODE 128 1 1 50 0 20" to "BARCODE 128 1 1 50 0 20 "
change "T 0 3 0 80" to "T 0 3 0 80 "
change "T 0 3 0 100" to "T 0 3 0 100 "
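Applied to the method from the question, the fix could be sketched as below. The command sequence is copied from the question; the class name CpclLabelSketch and the method buildLabel are invented for this example, and nothing here has been run against a printer. Presumably digits-only data fails without the space because "20" and the data fuse into a single number such as "2012345".

```java
// Hypothetical sketch; only the trailing spaces differ from the question's commands.
final class CpclLabelSketch {

    // Builds the label text with a space between each command's
    // last parameter and the data that follows it.
    static String buildLabel(String barcode, String caption) {
        StringBuilder print = new StringBuilder("! UTILITIES\r\n");
        print.append("IN-MILLIMETERS\r\n");
        print.append("SETFF 15 2.5\r\n");
        print.append("PRINT\r\n");
        print.append("! 0 180 180 180 1\r\n");
        print.append("CENTER\r\n");
        print.append("BARCODE 128 1 1 50 0 20 ").append(barcode).append("\r\n");
        print.append("T 0 3 0 80 ").append(barcode).append("\r\n");
        print.append("T 0 3 0 100 ").append(caption).append("\r\n");
        print.append("PRINT\r\n");
        return print.toString();
    }

    public static void main(String[] args) {
        // The bytes for printerConnection.write(...) would come from getBytes().
        System.out.print(buildLabel("12345", "link"));
    }
}
```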

I cannot get this code to line up underneath the headers

I have to get my output to line up beneath a heading. No matter what I do, I cannot get it to line up. The item name is also very long, and the words end up wrapping to the next line when I open my outfile. Here is my current output:
8 items are currently available for purchase in Joan's Hardware Store.
----------Joan's Hardware Store-----------
itemID itemName pOrdered pInStore pSold manufPrice sellPrice
1111 Dish Washer 20 20 0 250.50 550.50
2222 Micro Wave 75 75 0 150.00 400.00
3333 Cooking Range 50 50 0 450.00 850.00
4444 Circular Saw 150 150 0 45.00 125.00
5555 Cordless Screwdriver Kit 10 10 0 250.00 299.00
6666 Keurig Programmable Single-Serve 2 2 0 150.00 179.00
7777 Moen Chrome Kitchen Faucet 1 1 0 90.00 104.00
8888 Electric Pressure Washer 0 0 0 150.00 189.00
Total number of items in store: 308
Total inventory: $: 48400.0
Here is my code:
public void endOfDay(PrintWriter outFile)
{
outFile.println (nItems + " items are currently available for purchase in Joan's Hardware Store.");
outFile.println("----------Joan's Hardware Store-----------");
outFile.printf("itemID, itemName, pOrdered, pInStore, pSold, manufPrice, sellPrice");
for (int index = 0; index < nItems; index++)
{
outFile.printf("%n %-5s %-32s %d %d %d %.2f %.2f%n", items[index].itemID , items[index].itemName , items[index].numPord ,items[index].numCurrInSt , items[index].numPSold , items[index].manuprice , items[index].sellingprice);
}
outFile.println("Total number of items in store: " + getTotalOfStock());
outFile.println("Total inventory: $: " + getTotalDollarValueInStore());
} // end endOfDay
Thanks for any help! I have tried many things for hours!!

Basically, you need to format your header the same way you format your lines, for example...
System.out.println("----------Joan's Hardware Store-----------");
System.out.printf("%-6s %-32s %-8s %-8s %-5s %-10s %-8s%n", "itemID", "itemName", "pOrdered", "pInStore", "pSold", "manufPrice", "sellPrice");
System.out.printf("%-6s %-32s %-8d %-8d %-5d %-10.2f %-8.2f%n", "1111", "Dish Washer", 20, 20, 0, 250.50, 550.50);
Results in...
----------Joan's Hardware Store-----------
itemID itemName                         pOrdered pInStore pSold manufPrice sellPrice
1111   Dish Washer                      20       20       0     250.50     550.50
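One way to keep the header and the rows from drifting apart is to define both format strings side by side with the same widths. The sketch below does that; the class name AlignedTableSketch and its methods are made up for this example, and the last column is widened to 9 (an assumption) so that "sellPrice" fits.

```java
import java.util.Locale;

// Hypothetical sketch; widths follow the answer except the last column.
final class AlignedTableSketch {

    // Same widths in both formats: 6, 32, 8, 8, 5, 10, 9.
    private static final String HEADER_FORMAT = "%-6s %-32s %-8s %-8s %-5s %-10s %-9s";
    private static final String ROW_FORMAT = "%-6s %-32s %-8d %-8d %-5d %-10.2f %-9.2f";

    static String header() {
        return String.format(Locale.ROOT, HEADER_FORMAT,
                "itemID", "itemName", "pOrdered", "pInStore", "pSold", "manufPrice", "sellPrice");
    }

    static String row(String id, String name, int ordered, int inStore, int sold,
                      double manufPrice, double sellPrice) {
        return String.format(Locale.ROOT, ROW_FORMAT,
                id, name, ordered, inStore, sold, manufPrice, sellPrice);
    }

    public static void main(String[] args) {
        System.out.println(header());
        System.out.println(row("1111", "Dish Washer", 20, 20, 0, 250.50, 550.50));
        System.out.println(row("6666", "Keurig Programmable Single-Serve", 2, 2, 0, 150.00, 179.00));
    }
}
```

In the question's endOfDay method the same two strings could be passed to outFile.printf with a trailing %n. The columns only line up when the file is viewed in a monospaced font without line wrapping.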
