Java: Extracting values from text files

I have many text files (up to 20), and each file's contents look like this:
21.0|11|1/1/1997
13.3|12|2/1/1997
14.6|9|3/1/1997
Every file has more than 300 lines.
The problem I'm facing is this: how can I extract all of, and only, the first values from each file's contents?
For example, I want to extract the values (21.0, 13.3, 14.6, etc.) so I can determine the maximum and minimum across all 20 files.
I wrote this code, based on my understanding, to try it out on one of the files, but it didn't work:
String inputFileName = "Date.txt";
File inputFile = new File(inputFileName);
Scanner input = new Scanner(inputFile);
int count = 0;
while (input.hasNext()){
double line = input.nextDouble(); //Error occurs "Exception in thread "main" java.util.InputMismatchException"
count++;
double [] lineArray= new double [365];
lineArray[count]= line;
System.out.println(count);
for (double s : lineArray){
System.out.println(s);
System.out.println(count);
I also tried this one:
String inputFileName = "Date.txt";
File inputFile = new File(inputFileName);
Scanner input = new Scanner(inputFile);
while (input.hasNext()){
String line = input.nextLine();
String [] lineArray = line.split("//|");
for (String s : lineArray){
System.out.println(s+" ");
}
Note: I'm still something of a beginner in Java.
I hope I was clear, and thanks.

For each line of text, check whether it contains the pipe character. If it does, grab the first portion of the text and parse it to double.
double val = 0.0;
Scanner fScn = new Scanner(new File("date.txt"));
while (fScn.hasNextLine()) { // Can also use a BufferedReader
    String data = fScn.nextLine();
    if (data.contains("|")) { // Ensure line contains "|"
        val = Double.parseDouble(data.substring(0, data.indexOf("|"))); // grab value
    }
}
fScn.close();

Or you could try some streams, cool stuff
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class MinMaxPrinter {
public static void main(String[] args) {
final List<String> files = Arrays.asList("file", "names", "that", "you", "need");
new MinMaxPrinter().printMinMax(files);
}
public void printMinMax(List<String> fileNames) {
List<Double> numbers = fileNames.stream()
.map(Paths::get)
.flatMap(this::toLines)
.map(line -> line.split("\\|")[0])
.map(Double::parseDouble)
.collect(Collectors.toList());
double max = numbers.stream().max(Double::compare).get();
double min = numbers.stream().min(Double::compare).get();
System.out.println("Min: " + min + " Max: " + max);
}
private Stream<String> toLines(Path path) {
try {
return Files.lines(path);
} catch (IOException e) {
return Stream.empty();
}
}
}

try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String line;
while ((line = br.readLine()) != null) {
String res = line.split("\\|")[0]; // first field of the current line
}
}
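Putting the answers above together, a minimal sketch of finding the minimum and maximum of the first field might look like the following. The class name `FirstColumnStats` and the method `minMax` are hypothetical, and the sketch works on lines that have already been read so that it does not assume any particular file layout. Note that `|` is the alternation operator in a regular expression, which is why it is escaped as `\\|` when passed to `split`.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

public class FirstColumnStats {

    // Returns {min, max} of the first pipe-separated field of each line.
    // Lines without a pipe character are skipped.
    public static double[] minMax(List<String> lines) {
        DoubleSummaryStatistics stats = lines.stream()
                .filter(line -> line.contains("|"))
                .map(line -> line.split("\\|")[0].trim()) // escaped pipe: literal '|'
                .mapToDouble(Double::parseDouble)
                .summaryStatistics();
        return new double[] { stats.getMin(), stats.getMax() };
    }

    public static void main(String[] args) {
        // Sample lines taken from the question
        List<String> lines = Arrays.asList(
                "21.0|11|1/1/1997",
                "13.3|12|2/1/1997",
                "14.6|9|3/1/1997");
        double[] result = minMax(lines);
        System.out.println("Min: " + result[0] + " Max: " + result[1]);
    }
}
```

To cover all 20 files, one option is to collect the lines of every file (for example with `Files.readAllLines`) into a single list before calling `minMax`.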

Related

Excluding header from .csv

I have a .csv file with a header that I would like to skip. I get an error when the header is present, but when it is removed the program runs perfectly fine. I would like my code to skip the header and continue with the rest of the processing.
What the .csv file looks like:
Make Model Speed Fuel BaseMPG ScaleFactor Time Travelled
Ford Mustang 0 20.2 20 0.02 2.3
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
public class Test {
public static void main(String[] args) throws IOException {
List<Vehicle> cars = new ArrayList<Vehicle>();
Scanner scanner = new Scanner(System.in);
System.out.println("Enter the file name:");
String filename = scanner.nextLine();
BufferedReader reader = new BufferedReader(new FileReader(new File(
filename.trim())));
String line = "";
while ((line = reader.readLine()) != null) {
String[] words = line.split(",");
String make = words[0];
String model = words[1];
int currentSpeed = Integer.parseInt(words[2]);
double fuel = Double.parseDouble(words[3]);
double baseMpg = Double.parseDouble(words[4]);
double scaleFactor = Double.parseDouble(words[5]);
double timeTravelled = Double.parseDouble(words[6]);
Vehicle car = new Car(fuel, currentSpeed, baseMpg, scaleFactor,
make, model, timeTravelled);
System.out.println(car);
cars.add(car);
}
FileWriter writer=new FileWriter(new File("ProcessedCars.txt"));
for(Vehicle car:cars)
{
writer.write(car.toString());
writer.flush();
writer.write("\r\n");
}
}
}
Skip the first line in your while loop:
boolean skip = true;
while ((line = reader.readLine()) != null) {
if(skip) {
skip = false; // Skip only the first line
continue;
}
String[] words = line.split(",");
// ...
}
One way to do it is to catch the exception:
try{
int currentSpeed = Integer.parseInt(words[2]);
// ...
}catch(NumberFormatException e){
// Failed to parse speed, input is likely a text, like header
}
Or, if you are sure there is a header, just call readLine() one extra time before your loop.
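The last suggestion, an extra read before the loop, can be sketched as follows. This is only an illustration under stated assumptions: the class `HeaderSkipper`, the method `readRows`, and the comma-separated sample data are hypothetical, and the input is assumed to always start with exactly one header line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class HeaderSkipper {

    // Discards the first line, then splits every remaining non-blank line on commas.
    public static List<String[]> readRows(BufferedReader reader) {
        List<String[]> rows = new ArrayList<>();
        try {
            reader.readLine(); // header line is read and ignored (null on empty input)
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // ignore blank lines
                }
                rows.add(line.split(","));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical comma-separated sample modeled on the question's columns
        String sample = "Make,Model,Speed,Fuel\n"
                + "Ford,Mustang,0,20.2\n";
        BufferedReader reader = new BufferedReader(new StringReader(sample));
        for (String[] row : readRows(reader)) {
            int speed = Integer.parseInt(row[2]);
            System.out.println(row[0] + " " + row[1] + " speed=" + speed);
        }
    }
}
```

Compared with catching NumberFormatException, this keeps genuine parse errors in the data rows visible instead of silently swallowing them.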

Removing all non alphanumeric characters in java

This is a program that reports how many times each word occurs in a text file. The problem is that it also picks up characters like ? and , whereas I only want it to pick up letters. This is just part of the results: {"1"=1, "Cheers"=1, "Fanny"=1, "I=1, "biscuits"=1, "chairz")=1, "cheeahz"=1, "crisps"=1, "jumpers"=1, ?=20, work:=1
import java.io.File;
import java.io.FileReader;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.util.TreeMap;
import java.util.StringTokenizer;
public class Unigrammodel {
public static void main(String [] args){
//Creating BufferedReader to accept the file name from the user
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
String fileName = null;
System.out.print("Please enter the file name with path: ");
try{
fileName = (String) br.readLine();
//Creating the BufferedReader to read the file
File textFile = new File(fileName);
BufferedReader input = new BufferedReader(new FileReader(textFile));
//Creating the Map to store the words and their occurrences
TreeMap<String, Integer> frequencyMap = new TreeMap<String, Integer>();
String currentLine = null;
//Reading line by line from the text file
while((currentLine = input.readLine()) != null){
//Parsing the words from each line
StringTokenizer parser = new StringTokenizer(currentLine);
while(parser.hasMoreTokens()){
String currentWord = parser.nextToken();
//remove all non-alphanumeric from this word
currentWord.replaceAll(("[^A-Za-z0-9 ]"), "");
Integer frequency = frequencyMap.get(currentWord);
if(frequency == null){
frequency = 0;
}
//Putting each word and its occurrence into Map
frequencyMap.put(currentWord, frequency + 1);
}
}
//Displaying the Result
System.out.println(frequencyMap +"\n");
}catch(IOException ie){
ie.printStackTrace();
System.err.println("Your entered path is wrong");
}
}
}
Strings are immutable, so you need to assign the modified string to a variable before adding it to the map.
String wordCleaned= currentWord.replaceAll(("[^A-Za-z0-9 ]"), "");
...
frequencyMap.put(wordCleaned, frequency + 1);
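As a rough, self-contained illustration of the fix, the sketch below counts words after cleaning each token. The class `WordCounter` and the method `countWords` are hypothetical names, and skipping tokens that become empty after cleaning (such as a lone ?) is an extra assumption that the answer above does not cover.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCounter {

    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> frequencyMap = new TreeMap<>();
        StringTokenizer parser = new StringTokenizer(text);
        while (parser.hasMoreTokens()) {
            // replaceAll returns a new String; the original token is left unchanged
            String cleaned = parser.nextToken().replaceAll("[^A-Za-z0-9]", "");
            if (cleaned.isEmpty()) {
                continue; // the token was pure punctuation, e.g. "?"
            }
            frequencyMap.merge(cleaned, 1, Integer::sum);
        }
        return frequencyMap;
    }

    public static void main(String[] args) {
        // prints {Cheers=1, crisps=2, work=1}
        System.out.println(countWords("Cheers ? \"crisps\" work: crisps,"));
    }
}
```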

Input a text file in a two dimensional array with doubles

I'm new to Java.
I want to read a text file and create a two-dimensional array from it. The input looks like this:
12,242 323,2324
23,4434 23,4534
23,434 56,3434
....
34,434 43,3443
I have tried
import java.util.Scanner;
import java.io.File;
import java.io.IOException;
public class InputText {
/**
* @param args the command line arguments
* @throws java.io.IOException
*/
public static void main(String[] args) throws IOException {
int i=0;
File file;
file = new File("file.txt");
Scanner read=new Scanner(file);
while (read.hasNextLine()) {
String line=read.nextLine();
System.out.println(line);
}
}
}
This prints the input, but I cannot get it into an array. I tried different approaches, such as splitting it.
Any suggestions?
Sorry for not being clear. The input I mentioned is doubles separated by spaces. Also, the format I gave is what I get after running the part of the program I wrote. What I see in the text file is the numbers separated by spaces. I tried to implement your suggestion, but nothing seemed to work. I'm really lost here...
If you want to split a line into two numbers, you can use
String[] numbers = line.split("\\s+");
If you want to read a double that uses a comma as the decimal separator:
NumberFormat format = NumberFormat.getInstance(Locale.FRANCE);
...
double d1 = format.parse(numbers[0]).doubleValue();
double d2 = format.parse(numbers[1]).doubleValue();
Personally, I prefer to use a Scanner. In that case, create it with
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
Scanner scanner2 = new Scanner(scanner.nextLine()).useLocale(Locale.FRANCE);
if (!scanner2.hasNextDouble()){
System.out.println("Do not have a pair");
continue;
}
double d1 = scanner2.nextDouble();
if (!scanner2.hasNextDouble()){
System.out.println("Do not have a pair");
continue;
}
double d2 = scanner2.nextDouble();
//do something
}
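To end up with an actual two-dimensional array, the split and the locale-aware parse can be combined roughly as below. This is a sketch, not a definitive implementation: the class `CommaDecimalTable` and the method `parse` are hypothetical, the lines are assumed to have been read already, and every token is assumed to use a comma as the decimal separator.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CommaDecimalTable {

    // Converts lines such as "12,242 323,2324" into rows of doubles.
    public static double[][] parse(List<String> lines) {
        // Locale.FRANCE treats ',' as the decimal separator
        NumberFormat format = NumberFormat.getInstance(Locale.FRANCE);
        double[][] table = new double[lines.size()][];
        for (int row = 0; row < lines.size(); row++) {
            String[] tokens = lines.get(row).trim().split("\\s+");
            table[row] = new double[tokens.length];
            for (int col = 0; col < tokens.length; col++) {
                try {
                    table[row][col] = format.parse(tokens[col]).doubleValue();
                } catch (ParseException e) {
                    throw new IllegalArgumentException("Not a number: " + tokens[col], e);
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Sample lines taken from the question
        double[][] table = parse(Arrays.asList("12,242 323,2324", "23,4434 23,4534"));
        System.out.println(Arrays.deepToString(table));
    }
}
```

Rows may have different lengths here, because each inner array is sized from the tokens found on its own line.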
After reading the line, you will have to split the string again on ','. Each piece then needs to be converted into a number. You can see this below:
import java.util.Scanner;
import java.io.File;
import java.io.IOException;
public class InputText {
/**
* @param args the command line arguments
* @throws java.io.IOException
*/
public static void main(String[] args) throws IOException {
int i = 0;
File file;
file = new File("file.txt");
Scanner read = new Scanner(file);
while (read.hasNextLine()) {
String line = read.nextLine();
String[] numbers = line.split(",");
for (i = 0; i < numbers.length; i++) {
String numStr = numbers[i];
String x = numStr.replaceAll("\\s+", ""); // eliminate any whitespace
Double num = Double.valueOf(x);
System.out.println(" num is: " + num); // Here you can store the number in an array.
}
}
}
}
Try something like this (also add a try-catch statement):
String line;
// my2DArray is assumed to be a String[][] already sized to hold the file's rows and columns
BufferedReader br = new BufferedReader(new FileReader("file.txt"));
int i = 0;
while ((line = br.readLine()) != null) {
    // use comma as separator
    String[] lineArray = line.split(",");
    for (int j = 0; j < lineArray.length; j++) {
        my2DArray[i][j] = lineArray[j];
    }
    i++;
}
br.close();
// print the array row by row
for (int r = 0; r < my2DArray.length; r++) {
    for (int c = 0; c < my2DArray[r].length; c++) {
        System.out.print(my2DArray[r][c] + " ");
    }
    System.out.println();
}

CSV data to 2d array java

I'm probably going about this the wrong way, but my question is: how would I go about filling the fxRates array?
CAD,EUR,GBP,USD
1.0,0.624514066,0.588714763,0.810307
1.601244959,1.0,0.942676548,1.2975
1.698615463,1.060809248,1.0,1.3764
1.234100162,0.772200772,.726532984,1.0
This is the information I have in the CSV file. I was thinking about using the Scanner class to read it, something like:
private double[][] fxRates;
String delimiter = ","
Scanner sc = new Scanner(file);
while (sc.hasNextLine()) {
String line = sc.nextLine();
fxRates = line.split(delimiter)
Your way of solving this problem seems OK. But line.split(",") will return a 1D String array. You cannot assign it to fxRates. And also you should know the number of lines or rows in order to initialize fxRates at the beginning. Otherwise you should use a dynamic list structure like ArrayList.
Supposing you have 50 lines in your file, you can use something like:
private String[][] fxRates = new String[50][];
String delimiter = ",";
Scanner sc = new Scanner(file);
int index = 0;
while (sc.hasNextLine())
{
    String line = sc.nextLine();
    fxRates[index++] = line.split(delimiter);
}
And note that I've declared fxRates as a 2D String array, if you need double values you should do some conversion in place or later on.
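When the number of rows is not known in advance, the ArrayList approach mentioned above might be sketched like this, with the conversion to double done in place. The class `FxRatesLoader` and the method `load` are hypothetical names, and the sketch assumes the first line is always the currency header from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class FxRatesLoader {

    // Reads comma-separated rows of numbers, skipping one header line.
    public static double[][] load(Scanner sc) {
        List<double[]> rows = new ArrayList<>();
        if (sc.hasNextLine()) {
            sc.nextLine(); // skip the header row, e.g. CAD,EUR,GBP,USD
        }
        while (sc.hasNextLine()) {
            String line = sc.nextLine().trim();
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] cells = line.split(",");
            double[] row = new double[cells.length];
            for (int i = 0; i < cells.length; i++) {
                row[i] = Double.parseDouble(cells[i].trim());
            }
            rows.add(row);
        }
        // The list grows as needed; convert to a 2D array once at the end
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        // Sample data taken from the question
        String csv = "CAD,EUR,GBP,USD\n"
                + "1.0,0.624514066,0.588714763,0.810307\n"
                + "1.601244959,1.0,0.942676548,1.2975\n"
                + "1.698615463,1.060809248,1.0,1.3764\n"
                + "1.234100162,0.772200772,.726532984,1.0\n";
        double[][] fxRates = load(new Scanner(csv));
        System.out.println(fxRates.length + " rows, first rate " + fxRates[0][1]);
    }
}
```

For a real file, `new Scanner(file)` can be passed in place of the string-backed Scanner used in `main`.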
import java.nio.charset.Charset;
import java.nio.file.Paths;
import java.nio.file.Files;
import java.io.IOException;
public class CSVReader{
private String readFile(String path, Charset encoding) throws IOException
{
//Read in all bytes from a file at the specified path into a byte array
//This method will fail if there is no file to read at the specified path
byte[] encoded = Files.readAllBytes(Paths.get(path));
//Convert the array of bytes into a string.
return new String(encoded, encoding);
}
public String readFile(String path)
{
try {
//Read the contents of the file at the specified path into one long String
String content = readFile(path, Charset.defaultCharset());
//Display the string. Feel free to comment this line out.
System.out.println("File contents:\n"+content+"\n\n");
//Return the string to caller
return content;
}catch (IOException e){
//This code will only execute if we could not open a file
//Display the error message
System.out.println("Cannot read file "+path);
System.out.println("Make sure the file exists and the path is correct");
//Exit the program
System.exit(1);
}
return null;
}
}
The result of a split operation is a String array, not an array of double. So one step is missing: converting the Strings to doubles:
private double[][] fxRates = new double[maxLines][4];
String delimiter = ",";
int row = 0; // index of the current data row
Scanner sc = new Scanner(file);
if (sc.hasNextLine()) {
    sc.nextLine(); // skip the header line (CAD,EUR,GBP,USD)
}
while (sc.hasNextLine()) {
    String line = sc.nextLine();
    String[] fxRatesAsString = line.split(delimiter);
    for (int i = 0; i < fxRatesAsString.length; i++) {
        fxRates[row][i] = Double.parseDouble(fxRatesAsString[i]);
    }
    row++;
}
Another example:
Double[][] fxRates = new Double[4][];
String delimiter = ",";
// file code goes here
Scanner sc = new Scanner(file);
if (sc.hasNextLine()) {
    sc.nextLine(); // skip the header line (CAD,EUR,GBP,USD)
}
// Read file line by line
for (int auxI = 0; auxI < fxRates.length && sc.hasNextLine(); auxI++) {
    String line = sc.nextLine();
    System.out.println(line);
    String[] fxRatesAsString = line.split(delimiter);
    Double[] fxRatesAsDouble = new Double[fxRatesAsString.length];
    for (int i = 0; i < fxRatesAsString.length; i++) {
        fxRatesAsDouble[i] = Double.parseDouble(fxRatesAsString[i]);
    }
    fxRates[auxI] = fxRatesAsDouble;
}
// to double-check it
for (int y = 0; y < fxRates.length; y++) {
    for (int x = 0; x < fxRates[y].length; x++) {
        System.out.print(fxRates[y][x] + " ");
    }
    System.out.println();
}
I wouldn't recommend parsing CSV this way, because Scanner is too low-level and raw a tool for the job. By analogy, DOM/SAX parsers are a better way to parse XML than regular expressions or anything else that ignores the document structure. There are CSV parsers that offer good APIs and accept configuration options when the reader is initialized. Take a look at the easy-to-use CsvReader. Here is a code sample using it:
package q12967756;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collection;
import static java.lang.Double.parseDouble;
import static java.lang.System.out;
import com.csvreader.CsvReader;
public final class Main {
private Main() {
}
private static final String MOCK =
"CAD,EUR,GBP,USD\n" +
"1.0,0.624514066,0.588714763,0.810307\n" +
"1.601244959,1.0,0.942676548,1.2975\n" +
"1.698615463,1.060809248,1.0,1.3764\n" +
"1.234100162,0.772200772,.726532984,1.0\n";
private static final char SEPARATOR = ',';
public static void main(String[] args) throws IOException {
// final FileReader contentReader = new FileReader("yourfile.csv");
final StringReader contentReader = new StringReader(MOCK);
final CsvReader csv = new CsvReader(contentReader, SEPARATOR);
csv.readHeaders(); // to skip `CAD,EUR,GBP,USD`
final Collection<double[]> temp = new ArrayList<double[]>();
while ( csv.readRecord() ) {
temp.add(parseRawValues(csv.getValues()));
}
final double[][] array2d = temp.toArray(new double[temp.size()][]);
out.println(array2d[3][1]);
}
private static double[] parseRawValues(String[] rawValues) {
final int length = rawValues.length;
final double[] values = new double[length];
for ( int i = 0; i < length; i++ ) {
values[i] = parseDouble(rawValues[i]);
}
return values;
}
}

How to tokenize an input file in java

I'm tokenizing a text file in Java. I want to read an input file, tokenize it, and write certain tokens to an output file. This is what I've done so far:
package org.apache.lucene.analysis;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.StreamTokenizer;
class StringProcessing {
// Create BufferedReader class instance
public static void main(String[] args) throws IOException {
InputStreamReader input = new InputStreamReader(System.in);
BufferedReader keyboardInput = new BufferedReader(input);
System.out.print("Please enter a java file name: ");
String filename = keyboardInput.readLine();
if (!filename.endsWith(".DAT")) {
System.out.println("This is not a DAT file.");
System.exit(0);
}
File File = new File(filename);
if (File.exists()) {
FileReader file = new FileReader(filename);
StreamTokenizer streamTokenizer = new StreamTokenizer(file);
int i = 0;
int numberOfTokensGenerated = 0;
while (i != StreamTokenizer.TT_EOF) {
i = streamTokenizer.nextToken();
numberOfTokensGenerated++;
}
// Output number of characters in the line
System.out.println("Number of tokens = " + numberOfTokensGenerated);
// Output tokens
for (int counter = 0; counter < numberOfTokensGenerated; counter++) {
char character = file.toString().charAt(counter);
if (character == ' ') { System.out.println(); } else { System.out.print(character); }
}
} else {
System.out.println("File does not exist!");
System.exit(0);
}
System.out.println("\n");
}//end main
}//end class
When i run this code, this is what i get:
Please enter a java file name: D://eclipse-java-helios-SR1-win32/LexractData.DAT
Number of tokens = 129
java.io.FileReader@19821fException in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 25
at java.lang.String.charAt(Unknown Source)
at org.apache.lucene.analysis.StringProcessing.main(StringProcessing.java:40)
The input file will look like this:
-K1 Account
--Op1 withdraw
---Param1 an
----Type Int
---Param2 amount
----Type Int
--Op2 deposit
---Param1 an
----Type Int
---Param2 Amount
----Type Int
--CA1 acNo
---Type Int
-K2 CheckAccount
--SC Account
--CA1 credit_limit
---Type Int
-K3 Customer
--CA1 name
---Type String
-K4 Transaction
--CA1 date
---Type Date
--CA2 time
---Type Time
-K5 CheckBook
-K6 Check
-K7 BalanceAccount
--SC Account
I just want to read the strings that start with -K1, -K2, -K3, and so on. Can anyone help me?
The problem is with this line --
char character = file.toString().charAt(counter);
file is a reference to a FileReader, which does not override toString(), so the call falls through to Object.toString(). That returns the class name followed by @ and the hash code in hexadecimal (here a string 25 characters long), not the file's contents. That's why the exception reports an index out of range at 25, i.e. the 26th character.
To read the file correctly, wrap your FileReader in a BufferedReader and append each line returned by readLine() to a StringBuilder.
FileReader fr = new FileReader(filename);
BufferedReader br = new BufferedReader(fr);
StringBuilder sb = new StringBuilder();
String s;
while((s = br.readLine()) != null) {
sb.append(s);
}
// Now use sb.toString() instead of file.toString()
If you want to tokenize the input file, then the obvious choice is to use a Scanner. The Scanner class reads a given input stream and can return either tokens or other scanned types (scanner.nextInt(), scanner.nextLine(), etc.).
import java.util.Scanner;
import java.io.File;
import java.io.IOException;
public static void main(String[] args) throws IOException {
Scanner in = new Scanner(new File("filename.dat"));
while (in.hasNext()) {
String s = in.next(); //get the next token in the file
// Now s contains a token from the file
}
}
Check out Oracle's documentation of the Scanner class for more info.
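For the specific goal in the question, keeping only the lines that start with -K1, -K2, and so on, a Scanner can be combined with a regular expression along these lines. This is a sketch under assumptions: the class `KLineExtractor` and the method `extractNames` are hypothetical, and each such line is assumed to consist of the key followed by a single name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KLineExtractor {

    // One dash, 'K', one or more digits, whitespace, then a single name.
    // Lines such as "--Op1 withdraw" begin with two dashes and do not match.
    private static final Pattern K_LINE = Pattern.compile("-K(\\d+)\\s+(\\S+)\\s*");

    public static List<String> extractNames(Scanner in) {
        List<String> names = new ArrayList<>();
        while (in.hasNextLine()) {
            Matcher m = K_LINE.matcher(in.nextLine());
            if (m.matches()) { // matches() requires the whole line to fit the pattern
                names.add(m.group(2));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A few lines taken from the question's input file
        String sample = "-K1 Account\n--Op1 withdraw\n---Param1 an\n"
                + "-K2 CheckAccount\n--SC Account\n-K3 Customer\n";
        System.out.println(extractNames(new Scanner(sample)));
    }
}
```

Passing `new Scanner(new File("filename.dat"))` instead of the string-backed Scanner would read the same lines from a file.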
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.StringTokenizer;
public class FileTokenize {
    public static void main(String[] args) throws IOException {
        final var lines = Files.readAllLines(Path.of("myfile.txt"));
        // try-with-resources closes the writer even if an exception is thrown
        try (FileWriter writer = new FileWriter("output.txt")) {
            for (String data : lines) {
                StringTokenizer token = new StringTokenizer(data);
                // write each whitespace-separated token on its own line
                while (token.hasMoreTokens()) {
                    writer.write(token.nextToken() + "\n");
                }
            }
        }
    }
}
