Save String with random number of lines to single line array JAVA - java

Here is the method that reads from the file. It splits the input on the # sign, which marks where a new month begins in the text file.
public static String readPurchaseOrder(Scanner sc) {
final String DELIMITER = "#";
try {
while (sc.hasNext()) {
sc.useDelimiter(DELIMITER);
String data = sc.next();
return data;
}
} catch (Exception e) {
System.out.println(e);
}
sc.close();
return null;
}
The text file contains information shown below up to the 12th month
04/01/12#PNW-1234#PA/1234#10
15/01/12#BSE-5566#bT/4674#5#
08/02/12#PNE-3456#Xk/8536#1#
07/03/12#PEA-4567#ZR/7413#3
09/03/12#ESE-6329#HY/7195#30#
03/04/12#ESE-5577#LR/4992#12
23/04/12#PNW-1235#HY/7195#2#
09/05/12#ESE-6329#PV/5732#6
25/05/12#BSE-5566#PV/5732#10#
08/06/12#PNE-3457#kD/9767#1
31/06/12#EMI-6329#ZR/7413#10#
03/07/12#EMI-6329#PV/5732#12
25/07/12#BSE-5566#bT/4674#5#
I am using this to output the information from the file split by the #
for (int i = 0; i <12; i ++){
String str[] = InputFileData.readPurchaseOrder(sC).split("\\n");
for(String s : str){
System.out.println(s);
}
}
It outputs the data like this
04/01/12#PNW-1234#PA/1234#10
15/01/12#BSE-5566#bT/4674#5
08/02/12#PNE-3456#Xk/8536#1
07/03/12#PEA-4567#ZR/7413#3
09/03/12#ESE-6329#HY/7195#30
03/04/12#ESE-5577#LR/4992#12
23/04/12#PNW-1235#HY/7195#2
09/05/12#ESE-6329#PV/5732#6
25/05/12#BSE-5566#PV/5732#10
I want to store each individual line in an array, so that I can then split each line further into its respective variables.

If you would like to collect the results in an array, one line per array element, the easiest way to do it is to use a list (since you don't know in advance the number of lines), and then convert it to an array. The size of an array has to be declared in advance, so you want to use a more flexible data structure if you don't know how big it's going to be.
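import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public static String[] readPurchaseOrder(Scanner sc) {
final String DELIMITER = "#";
List<String> results = new ArrayList<>();
sc.useDelimiter(DELIMITER); // set the delimiter once, before reading
try {
while (sc.hasNext()) {
String data = sc.next();
results.add(data); // add the token (the text up to the next '#') to the list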
}
} catch (Exception e) {
System.out.println(e);
}
sc.close();
// convert the list to an array and return it.
return results.toArray(new String[results.size()]);
}
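Since the stated goal is to break each line into its own variables afterwards, that step might be sketched as follows. This is only an illustration under assumptions: it reads line by line instead of using # as the Scanner delimiter, it assumes one order per line with # between the fields, and the class and method names (PurchaseOrderLines, readLines, splitFields) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class PurchaseOrderLines {

    // Reads every non-empty line from the scanner into an array, one line per element.
    static String[] readLines(Scanner sc) {
        List<String> lines = new ArrayList<>();
        while (sc.hasNextLine()) {
            String line = sc.nextLine().trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines.toArray(new String[0]);
    }

    // Splits one line into its fields. String.split drops trailing empty strings,
    // so a trailing '#' does not produce an extra field.
    static String[] splitFields(String line) {
        return line.split("#");
    }

    public static void main(String[] args) {
        // Two sample lines taken from the question; a real program would pass a file-backed Scanner.
        Scanner sc = new Scanner("04/01/12#PNW-1234#PA/1234#10\n15/01/12#BSE-5566#bT/4674#5#\n");
        for (String line : readLines(sc)) {
            String[] f = splitFields(line);
            System.out.println(f[0] + " | " + f[1] + " | " + f[2] + " | " + f[3]);
        }
    }
}
```

Returning new String[0] from toArray lets the list allocate an array of the right size itself, so the number of lines never has to be known up front.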

Related

How to count duplicate entries in a .csv file?

I have a .csv file that is formatted like this:
ID,date,itemName
456,1-4-2020,Lemon
345,1-3-2020,Bacon
345,1-4-2020,Sausage
123,1-1-2020,Apple
123,1-2-2020,Pineapple
234,1-2-2020,Beer
345,1-4-2020,Cheese
I have already implemented the algorithm to go through the file, scan for the first number, sort by it in ascending order and write a new output:
123,1-1-2020,Apple
123,1-2-2020,Pineapple
234,1-2-2020,Beer
345,1-3-2020,Bacon
345,1-4-2020,Cheese
345,1-4-2020,Sausage
456,1-4-2020,Lemon
My question is, how do I implement my algorithm to make an output that counts the duplicate first number entries and reformat it to make it look like this...
123,1-1-2020,1,Apple
123,1-2-2020,1,Pineapple
234,1-2-2020,1,Beer
345,1-3-2020,1,Bacon
345,1-4-2020,2,Cheese,Sausage
456,1-4-2020,1,Lemon
...so that it counts the number of occurrences for each ID, denotes it with the number of times, and, if the date of that ID is also the same, combines the item names on the same line. Below is my source code (each line in the .csv is made into an object named 'receipt' that has ID, date, and name with their respective get() methods):
public class ReadFile {
private static List<Receipt> readFile() {
List<Receipt> receipts = new ArrayList<>();
try {
BufferedReader reader = new BufferedReader(new FileReader("dataset.csv"));
// Move past the first title line
reader.readLine();
String line = reader.readLine();
// Start reading from second line till EOF, split each string at ","
while (line != null) {
String[] attributes = line.split(",");
Receipt attribute = getAttributes(attributes);
receipts.add(attribute);
line = reader.readLine();
}
reader.close();
} catch (IOException e) {
e.printStackTrace();
}
return receipts;
}
private static Receipt getAttributes(String[] attributes) {
// Get ID located before the first ","
long memberNumber = Long.parseLong(attributes[0]);
// Get date located after the first ","
String date = attributes[1];
// Get name located after the second ","
String name = attributes[2];
return new Receipt(memberNumber, date, name);
}
// Parse the data into new file after sorting
private static void parse(List<Receipt> receipts) {
PrintWriter output = null;
try {
output = new PrintWriter("output.txt");
} catch (FileNotFoundException e) {
e.printStackTrace();
}
// For each receipts, assert the text output stream is not null, print line.
for (Receipt p : receipts) {
assert output != null;
output.println(p.getMemberNumber() + "," + p.getDate() + "," + p.getName());
}
assert output != null;
output.close();
}
// Main method, accept input file, sort and parse
public static void main(String[] args) {
List<Receipt> receipts = readFile();
QuickSort q = new QuickSort();
q.quickSort(receipts);
parse(receipts);
}
}
The easiest way is to use a map.
Sample data from your file.
String[] lines = {
"123,1-1-2020,Apple",
"123,1-2-2020,Pineapple",
"234,1-2-2020,Beer",
"345,1-3-2020,Bacon",
"345,1-4-2020,Cheese",
"345,1-4-2020,Sausage",
"456,1-4-2020,Lemon"};
Create a map. As you read the lines, split them and add them to the map using the compute method. This puts the whole line in if the key (ID and date) doesn't exist yet; otherwise it appends the item name to the existing entry.
The file does not have to be sorted, but the values are appended in the order they are encountered.
Map<String, String> map = new LinkedHashMap<>();
for (String line : lines) {
String[] vals = line.split(",");
// if v is null, add the line
// if v exists, take the existing line and append the last value
map.compute(vals[0]+vals[1], (k,v)->v == null ? line : v +","+vals[2]);
}
for (String line : map.values()) {
String[] fields = line.split(",",3);
int count = fields[2].split(",").length;
System.out.printf("%s,%s,%s,%s%n", fields[0],fields[1],count,fields[2]);
}
This sample run prints
123,1-1-2020,1,Apple
123,1-2-2020,1,Pineapple
234,1-2-2020,1,Beer
345,1-3-2020,1,Bacon
345,1-4-2020,2,Cheese,Sausage
456,1-4-2020,1,Lemon
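A variation on the same idea, sketched under the assumption that every line has exactly three comma-separated fields: keeping a list of item names per ID-and-date key means the count is simply the list size, so the joined string never has to be split again. The class and method names (ReceiptGrouper, group) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReceiptGrouper {

    // Groups item names by "ID,date" and formats each group as ID,date,count,items.
    static List<String> group(List<String> lines) {
        // LinkedHashMap keeps the keys in the order they were first seen.
        Map<String, List<String>> itemsByKey = new LinkedHashMap<>();
        for (String line : lines) {
            String[] vals = line.split(",", 3);
            String key = vals[0] + "," + vals[1];
            itemsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(vals[2]);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : itemsByKey.entrySet()) {
            List<String> items = e.getValue();
            out.add(e.getKey() + "," + items.size() + "," + String.join(",", items));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "345,1-3-2020,Bacon",
                "345,1-4-2020,Cheese",
                "345,1-4-2020,Sausage");
        // Prints:
        // 345,1-3-2020,1,Bacon
        // 345,1-4-2020,2,Cheese,Sausage
        group(lines).forEach(System.out::println);
    }
}
```

Putting a comma between ID and date in the key also avoids accidental collisions that plain concatenation could cause.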

How can I read a txt file so that every word is placed in its own array element, instead of every line?

I have written code that reads through a .txt file and places every line in its own array element (not very hard). It works, but this was not my intention: I want every word placed in its own array element, not every line. Here is my current code. Can someone help? Thank you!
public static ArrayList<String> read_file() {
try {
ArrayList<String> data_base = new ArrayList<String>();
Scanner s1 = new Scanner(new File("C:\\Users\\Jcool\\OneDrive\\A Levels\\Computer Science\\CSV files\\data convert\\convert.txt"));
while(s1.hasNextLine()) {
data_base.add(s1.nextLine());
}
return data_base;
}catch(FileNotFoundException e) {
}
return null;
}
Read the whole file at once and split it into an array.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

private static String readAllBytes(String filePath)
{
String content = "";
try
{
content = new String ( Files.readAllBytes( Paths.get(filePath) ) );
}
catch (IOException e)
{
e.printStackTrace();
}
return content;
}
Once the readAllBytes method is defined, call it like this:
/* String to split. */
String stringToSplit = readAllBytes(filePath);
String[] tempArray;
/* delimiter */
String delimiter = " "; // a space, if the file contains words separated by spaces
/* given string will be split by the argument delimiter provided. */
tempArray = stringToSplit.split(delimiter);
If you mean to split your lines into an array, check this answer.
Take a look at the split(String) method. It returns a String[]. As an example:
String string = "AAA-BBB";
String[] parts = string.split("-");
String part1 = parts[0]; // AAA
String part2 = parts[1]; // BBB
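One caveat with a single-space delimiter is that line breaks and tabs stay inside the tokens. A minimal sketch that splits on any run of whitespace instead is shown here; the class and method names (WordSplitter, words) are hypothetical, and the text is passed in as a String so the example does not depend on any particular file.

```java
import java.util.ArrayList;
import java.util.List;

class WordSplitter {

    // Splits text on runs of whitespace (spaces, tabs, line breaks) and skips empty tokens.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        for (String w : text.split("\\s+")) {
            if (!w.isEmpty()) {
                result.add(w);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [first, line, second, line]
        System.out.println(words("first line\nsecond\tline"));
    }
}
```

The text could come from a helper such as the readAllBytes method in the answer above.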

Read each string from a file in Java

I am new to Java. I just want to read each string in a file and print it on the console.
Code:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = new String();
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
System.out.println(""+data);
}
} catch (IOException e) {
// Error
}
}
If file contains:
Add label abc to xyz
Add instance cdd to pqr
I want to read each word from file and print it to a new line, e.g.
Add
label
abc
...
And afterwards, I want to extract the index of a specific string, for instance get the index of abc.
Can anyone please help me?
It sounds like you want to be able to do two things:
Print all words inside the file
Search the index of a specific word
In that case, I would suggest scanning all lines, splitting on any whitespace character (space, tab, etc.) and storing the words in a collection so you can search it later on. Now the question is: can you have repeats, and in that case which index would you like to print? The first? The last? All of them?
Assuming words are unique, you can simply do:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
ArrayList<String> words = new ArrayList<String>();
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = null;
while ((data = infile.readLine()) != null) {
for (String word : data.split("\\s+")) {
words.add(word);
System.out.println(word);
}
}
} catch (IOException e) {
// Error
}
// search for the index of abc:
for (int i = 0; i < words.size(); i++) {
if (words.get(i).equals("abc")) {
System.out.println("abc index is " + i);
break;
}
}
}
If you don't break, it'll print every index of abc (if words are not unique). You could of course optimize it more if the set of words is very large, but for a small amount of data, this should suffice.
Of course, if you know in advance which words' indices you'd like to print, you could forego the extra data structure (the ArrayList) and simply print that as you scan the file, unless you want the printings (of words and specific indices) to be separate in output.
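If words can repeat and every position is wanted, the same loop without the break can collect the indices instead of printing them. A minimal sketch, with hypothetical class and method names (WordIndex, indicesOf):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordIndex {

    // Returns every zero-based position at which target occurs in words.
    static List<Integer> indicesOf(List<String> words, String target) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (words.get(i).equals(target)) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Add", "label", "abc", "to", "xyz", "abc");
        // Prints [2, 5]
        System.out.println(indicesOf(words, "abc"));
    }
}
```

For only the first or last occurrence, List.indexOf and List.lastIndexOf already do the job and return -1 when the word is absent.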
Split each line you read on any whitespace with the regex \\s+ and print the resulting words in a for loop.
public static void main(String[] args) { // Don't make main throw an exception
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(fstream));
String data;
while ((data = infile.readLine()) != null) {
String[] words = data.split("\\s+"); // Split on whitespace
for (String word : words) { // Iterate through info
System.out.println(word); // Print it
}
}
} catch (IOException e) {
// Report the error instead of silently swallowing it
System.err.println("Error found.");
e.printStackTrace();
}
}
Just add a for-each loop before printing the output:
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
for(String temp : data.split(" "))
System.out.println(temp); // no need to concatenate the empty string.
}
This prints each individual string, obtained from every line read from the file, on its own line.
And afterwards, I want to extract the index of a specific string, for
instance get the index of abc.
I don't know which index you are actually talking about. But if you want the position within each line being read, add a temporary variable count initialised to 0.
Increment it for each word until temp equals abc. Like,
int count = 0;
for(String temp : data.split(" ")){
count++; // count is now the 1-based position of temp within the line
if("abc".equals(temp))
System.out.println("Index of abc is : "+count);
System.out.println(temp);
}
Use the split() method of the String class and adapt it to your needs.
Or
iterate over the characters of the line using its length(), and whenever you hit a non-alphabetic character, take the substring() up to that point and write it on a new line.
List<String> words = new ArrayList<String>();
while ((data = infile.readLine()) != null) {
for(String d : data.split(" ")) {
System.out.println(""+d);
}
words.addAll(Arrays.asList(data.split(" "))); // add the individual words, not the whole line
}
//words List will hold all the words. Do words.indexOf("abc") to get index
if(words.indexOf("abc") < 0) {
System.out.println("word not present");
} else {
System.out.println("word present at index " + words.indexOf("abc"));
}

Reading a .txt file and excluding certain elements

In my journey to complete this program I've run into a little hitch with one of my methods. The method reads a certain .txt file and creates a HashMap in which every word found is a key and the number of times it appears is its value. I have managed to figure this out for another method, but this time the .txt file is in a weird format. Specifically:
more 2
morning's 1
most 3
mostly 1
mythology. 1
native 1
nearly 2
northern 1
occupying 1
of 29
off 1
And so on.
Right now, the method is returning only one line in the file.
Here is my code for the method:
public static HashMap<String,Integer> readVocabulary(String fileName) {
// Declare the HashMap to be returned
HashMap<String, Integer> wordCount = new HashMap();
String toRead = fileName;
try {
FileReader reader = new FileReader(toRead);
BufferedReader br = new BufferedReader(reader);
// The BufferedReader reads the lines
String line = br.readLine();
// Split the line into a String array to loop through
String[] words = line.split(" ");
// for loop goes through every word
for (int i = 0; i < words.length; i++) {
// Case if the HashMap already contains the key.
// If so, just increments the value.
if (wordCount.containsKey(words[i])) {
int n = wordCount.get(words[i]);
wordCount.put(words[i], ++n);
}
// Otherwise, puts the word into the HashMap
else {
wordCount.put(words[i], 1);
}
}
br.close();
}
// Catching the file not found error
// and any other errors
catch (FileNotFoundException fnfe) {
System.err.println("File not found.");
}
catch (Exception e) {
System.err.print(e);
}
return wordCount;
}
The issue is that I'm not sure how to get the method to ignore the 2's and 1's and 29's of the .txt file. I attempted making an 'else if' statement to catch all of these cases but there are too many. Is there a way for me to catch all the ints from, say, 1-100, and exclude them from being keys in the HashMap? I've searched online but have turned up nothing.
Thank you for any help you can give!
How about just doing wordCount.put(words[0], 1) for every line after you've done the split? If the pattern is always "word number", you only need the first item from the split array.
Update after some back and forth
public static HashMap<String,Integer> readVocabulary(String toRead)
{
// Declare the HashMap to be returned
HashMap<String, Integer> wordCount = new HashMap<String, Integer>();
String line = null;
String[] words = null;
int lineNumber = 0;
FileReader reader = null;
BufferedReader br = null;
try {
reader = new FileReader(toRead);
br = new BufferedReader(reader);
// Split the line into a String array to loop through
while ((line = br.readLine()) != null) {
lineNumber++;
words = line.split(" ");
if (words.length == 2) {
if (wordCount.containsKey(words[0]))
{
int n = wordCount.get(words[0]);
wordCount.put(words[0], ++n);
}
// Otherwise, puts the word into the HashMap
else
{
boolean word2IsInteger = true;
try
{
Integer.parseInt(words[1]);
}
catch(NumberFormatException nfe)
{
word2IsInteger = false;
}
if (word2IsInteger) {
wordCount.put(words[0], Integer.parseInt(words[1]));
}
}
}
}
br.close();
br = null;
reader.close();
reader = null;
}
// Catching the file not found error
// and any other errors
catch (FileNotFoundException fnfe) {
System.err.println("File not found.");
}
catch (Exception e) {
System.err.print(e);
}
return wordCount;
}
To check whether a String contains only digits, use String's matches() method, e.g.
if (!words[i].matches("^\\d+$")){
// NOT a String containing only digits
}
This won't require catching exceptions, and it doesn't matter if the number wouldn't fit inside an Integer.
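Put together with the counting loop from the question, the digit check could be used as in the sketch below. It assumes the lines have already been read into a list (for instance with a BufferedReader loop), that only all-digit tokens should be skipped, and that each remaining token counts once per appearance; the class and method names (VocabularyFilter, countWords) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VocabularyFilter {

    // Counts every whitespace-separated token that does not consist of digits only.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> wordCount = new HashMap<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                // Skip blank lines and purely numeric tokens such as "2" or "29".
                if (token.isEmpty() || token.matches("\\d+")) {
                    continue;
                }
                // merge inserts 1 for a new key, otherwise adds 1 to the existing count.
                wordCount.merge(token, 1, Integer::sum);
            }
        }
        return wordCount;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("more 2", "morning's 1", "of 29");
        System.out.println(countWords(lines));
    }
}
```

Because matches() tests the whole token, the ^ and $ anchors are optional there.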
Option 1: Ignore numbers separated by whitespace
Use Integer.parseInt() or Double.parseDouble() and catch the exception.
// for loop goes through every word
for (int i = 0; i < words.length; i++) {
try {
int wordAsInt = Integer.parseInt(words[i]);
} catch(NumberFormatException e) {
// Case if the HashMap already contains the key.
// If so, just increments the value.
if (wordCount.containsKey(words[i])) {
int n = wordCount.get(words[i]);
wordCount.put(words[i], ++n);
}
// Otherwise, puts the word into the HashMap
else {
wordCount.put(words[i], 1);
}
}
}
There is a Double.parseDouble(String) method, which you could use in place of Integer.parseInt(String) above if you wanted to eliminate all numbers, not just integers.
Option 2: Ignore numbers everywhere
Another option is to parse your input one character at a time and ignore any character that isn't a letter. When you reach whitespace, you add the word built from the characters just scanned to your HashMap. Unlike the methods mentioned above, scanning by character lets you ignore numbers even if they appear immediately next to other characters.
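The character-by-character approach could be sketched as follows. This is an illustration under assumptions: only characters for which Character.isLetter is true are kept, so punctuation is dropped too (for example "morning's" would be counted as "mornings"), and the names LetterOnlyCounter and countWords are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class LetterOnlyCounter {

    // Builds words from letters only and counts them; digits and punctuation are ignored.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i <= text.length(); i++) {
            // Treat the end of the text like whitespace so the last word is flushed.
            char c = i < text.length() ? text.charAt(i) : ' ';
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (Character.isWhitespace(c) && current.length() > 0) {
                counts.merge(current.toString(), 1, Integer::sum);
                current.setLength(0);
            }
            // Any other character (digit, punctuation) is simply skipped.
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("most 3\nmythology. 1\nabc123 most"));
    }
}
```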

ArrayList confusion

The code below is my attempt to read from a file of strings, read through each line until a ':' is found, then store and print everything after that. However, the print loop prints out everything that I read in from the file. Can someone spot where I'm going wrong? Thanks.
edit: every line is in this format "Some text here:More text here"
public void openFile() {
try {
scanner = new BufferedReader(new FileReader("calendar.ics"));
} catch (Exception e) {
System.out.println("Could not open file");
}
}
public void readFile() {
ArrayList<String> vals = new ArrayList<String>();
String test;
try {
while ((line = scanner.readLine()) != null)
{
int indexOfComma = line.indexOf("\\:"); // returns firstIndexOf ':'
test = line.substring(indexOfComma+1); // test to be everything after ':'
vals.add(test); // add values to vals
}
} catch(Exception ex){ }
for(int i=0; i<vals.size(); i++){
System.out.println(vals.get(i));
}
}
You don't need to escape your colon.
line.indexOf("\\:");
Change the above line to:
line.indexOf(":");
indexOf takes a plain string, not a regex, so "\\:" searches for a backslash followed by a colon and returns -1 when that is not found.
test = line.substring(indexOfComma+1);
So if indexOfComma is -1, which it certainly will be when the line does not contain a backslash followed by a colon, the line above becomes:
line.substring(0); // same as whole string
As a suggestion, you should use the abstract type as the reference type when declaring your list. So use List instead of ArrayList on the left-hand side of the declaration:
List<String> vals = new ArrayList<String>();
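As a minimal sketch of the corrected logic, the helper below also checks for -1 so that lines without a colon are skipped rather than added whole. Skipping them is an assumption about the desired behaviour, and the class and method names (AfterColon, valuesAfterColon) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AfterColon {

    // Returns, for each line containing ':', everything after the first ':'.
    static List<String> valuesAfterColon(List<String> lines) {
        List<String> vals = new ArrayList<>();
        for (String line : lines) {
            int indexOfColon = line.indexOf(':');
            if (indexOfColon >= 0) {
                vals.add(line.substring(indexOfColon + 1));
            }
        }
        return vals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Some text here:More text here", "no colon in this line");
        // Prints [More text here]
        System.out.println(valuesAfterColon(lines));
    }
}
```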
