How to compare each line of a text file? (Java)

I have a text file with 792 lines whose content looks like this:
der 17788648
und 14355959
die 10939606
Die 10480597
Now I want to check whether "Die" and "die" are equal when lower-cased.
If two strings are equal in lower case, the word should be written to a new text file in lower case and the values summed.
Expected output:
der 17788648
und 14355959
die 21420203
This is what I have so far:
try {
BufferedReader bk = null;
BufferedWriter bw = null;
bk = new BufferedReader(new FileReader("outagain.txt"));
bw = new BufferedWriter(new FileWriter("outagain5.txt"));
List<String> list = new ArrayList<>();
String s = "";
while (s != null) {
s = bk.readLine();
list.add(s);
}
for (int k = 0; k < 793; k++) {
String u = bk.readLine();
if (list.contains(u.toLowerCase())) {
//sum values?
} else {
bw.write(u + "\n");
}
}
System.out.println(list.size());
} catch (Exception e) {
System.out.println("Exception caught : " + e);
}

Instead of list.add(s);, use list.add(s.toLowerCase());. Right now your code is comparing lines of indeterminate case to lower-cased lines.
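For reference, here is a minimal sketch of that change in context (it keeps the outagain.txt name from your code and also switches to a read-then-check loop so the trailing null is never added to the list):
List<String> list = new ArrayList<>();
try (BufferedReader bk = new BufferedReader(new FileReader("outagain.txt"))) {
    String s;
    while ((s = bk.readLine()) != null) {
        list.add(s.toLowerCase()); // store lower-cased lines so later comparisons match
    }
} catch (IOException e) {
    e.printStackTrace();
}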

With Java 8, the most concise approach to standard problems like reading files, comparing, grouping and collecting is the Streams API. At least as long as the file is only a few KB, there will be no problems with that.
Something like:
Map<String, Integer> nameSumMap = Files.lines(Paths.get("test.txt"))
.map(x -> x.split(" "))
.collect(Collectors.groupingBy(x -> x[0].toLowerCase(),
Collectors.summingInt(x -> Integer.parseInt(x[1]))
));
First, you can read the file with Files.lines(), which returns a Stream<String>; then you can split the strings into a Stream<String[]>;
finally you can use the groupingBy() and summingInt() collectors to group by the first element of each array and sum over the second one.
If you don't want to use the stream API, you can also create a HashMap and do your summing manually in the loop.

Use a HashMap to keep track of the unique fields. Before you do a put, do a get to see if the value is already there. If it is, sum the old value with the new one and put it back (this replaces the old entry that has the same key).
package com.foundations.framework.concurrency;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
public class FileSummarizer {
public static void main(String[] args) {
HashMap<String, Long> rows = new HashMap<String, Long>();
String line = "";
BufferedReader reader = null;
try {
reader = new BufferedReader(new FileReader("data.txt"));
while ((line = reader.readLine()) != null) {
String[] tokens = line.split(" ");
String key = tokens[0].toLowerCase();
Long current = Long.parseLong(tokens[1]);
Long previous = rows.get(key);
if(previous != null){
current += previous;
}
rows.put(key, current);
}
}
catch (IOException e) {
e.printStackTrace();
}
finally {
try {
if (reader != null) {
reader.close(); // guard against the reader never having been opened
}
}
catch (IOException e) {
e.printStackTrace();
}
// print the summed totals once the file has been processed
Iterator<String> iterator = rows.keySet().iterator();
while (iterator.hasNext()) {
String key = iterator.next();
String value = rows.get(key).toString();
System.out.println(key + " " + value);
}
}
}
}

The String class has an equalsIgnoreCase method which you can use to compare two strings irrespective of case, so:
String var1 = "Die";
String var2 = "die";
System.out.println(var1.equalsIgnoreCase(var2));
would print true.

If I understood your question correctly, you want to get the prefix (the word) from the file, compare it, get the value behind it, and sum the values for each prefix. Is that about right?
You could use regular expressions to extract the prefixes and values separately. Then you can sum up all values with the same prefix and write one line per prefix to the file.
If you are not familiar with regular expressions, these links could help you:
Regex on tutorialpoint.com
Regex on vogella.com
For additional tutorials, just search for "java regex" or similar terms.
If you do not want to distinguish between upper- and lowercase strings, just convert them all to lower/upper case before comparing them, as #spork explained already.
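For instance, a rough sketch of that idea (the pattern, the outagain.txt file name, and the variable names are only illustrative):
Map<String, Long> sums = new HashMap<>();
Pattern p = Pattern.compile("^(\\S+)\\s+(\\d+)$"); // group 1 = word/prefix, group 2 = value
try (BufferedReader br = new BufferedReader(new FileReader("outagain.txt"))) {
    String line;
    while ((line = br.readLine()) != null) {
        Matcher m = p.matcher(line);
        if (m.matches()) {
            String word = m.group(1).toLowerCase();                  // fold case as discussed
            sums.merge(word, Long.parseLong(m.group(2)), Long::sum); // sum the values per prefix
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}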

Related

How to sort a file in alphabetical order?

I'm currently trying to make a program in Java to sort the content of a file in alphabetical order. The file contains:
camera10:192.168.112.43
camera12:192.168.2.112
camera1:127.0.0.1
camera2:133.192.31.42
camera3:145.154.42.58
camera8:192.168.12.205
camera3:192.168.2.122
camera5:192.168.112.54
camera123:192.168.2.112
camera4:192.168.112.1
camera6:192.168.112.234
camera7:192.168.112.20
camera9:192.168.2.112
I would like to sort them and write the result back into the file (which in this case is "daftarCamera.txt"). But somehow my algorithm sorts them in the wrong way; the result is:
camera10:192.168.112.43
camera123:192.168.2.112
camera12:192.168.2.112
camera1:127.0.0.1
camera2:133.192.31.42
camera3:145.154.42.58
camera3:192.168.2.122
camera4:192.168.112.1
camera5:192.168.112.54
camera6:192.168.112.234
camera7:192.168.112.20
camera8:192.168.12.205
camera9:192.168.2.112
while the result I want is:
camera1:127.0.0.1
camera2:133.192.31.42
camera3:145.154.42.58
camera3:192.168.2.122
camera4:192.168.112.1
camera5:192.168.112.54
camera6:192.168.112.234
camera7:192.168.112.20
camera8:192.168.12.205
camera9:192.168.2.112
camera10:192.168.112.43
camera12:192.168.2.112
camera123:192.168.2.112
Here's the code I use:
public void sortCamera() {
BufferedReader reader = null;
BufferedWriter writer = null;
ArrayList <String> lines = new ArrayList<String>();
try {
reader = new BufferedReader(new FileReader (log));
String currentLine = reader.readLine();
while (currentLine != null){
lines.add(currentLine);
currentLine = reader.readLine();
}
Collections.sort(lines);
writer = new BufferedWriter(new FileWriter(log));
for (String line : lines) {
writer.write(line);
writer.newLine();
}
} catch(IOException e) {
e.printStackTrace();
} finally {
try {
if (reader != null) {
reader.close();
}
if(writer != null){
writer.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
You are currently performing a lexical sort. To perform a numerical sort, you need a Comparator that compares the lines numerically by the value between "camera" and ":" in each line. You could split by ":", then use a regular expression to grab the digits, parse them, and compare. For example:
Collections.sort(lines, (String a, String b) -> {
String[] leftTokens = a.split(":"), rightTokens = b.split(":");
Pattern p = Pattern.compile("camera(\\d+)");
int left = Integer.parseInt(p.matcher(leftTokens[0]).replaceAll("$1"));
int right = Integer.parseInt(p.matcher(rightTokens[0]).replaceAll("$1"));
return Integer.compare(left, right);
});
Making this a fully reproducible example:
List<String> lines = new ArrayList<>(Arrays.asList( //
"camera10:192.168.112.43", //
"camera12:192.168.2.112", //
"camera1:127.0.0.1", //
"camera2:133.192.31.42", //
"camera3:145.154.42.58", //
"camera8:192.168.12.205", //
"camera3:192.168.2.122", //
"camera5:192.168.112.54", //
"camera123:192.168.2.112", //
"camera4:192.168.112.1", //
"camera6:192.168.112.234", //
"camera7:192.168.112.20", //
"camera9:192.168.2.112"));
Collections.sort(lines, (String a, String b) -> {
String[] leftTokens = a.split(":"), rightTokens = b.split(":");
Pattern p = Pattern.compile("camera(\\d+)");
int left = Integer.parseInt(p.matcher(leftTokens[0]).replaceAll("$1"));
int right = Integer.parseInt(p.matcher(rightTokens[0]).replaceAll("$1"));
return Integer.compare(left, right);
});
System.out.println(lines);
And that outputs
[camera1:127.0.0.1, camera2:133.192.31.42, camera3:145.154.42.58, camera3:192.168.2.122, camera4:192.168.112.1, camera5:192.168.112.54, camera6:192.168.112.234, camera7:192.168.112.20, camera8:192.168.12.205, camera9:192.168.2.112, camera10:192.168.112.43, camera12:192.168.2.112, camera123:192.168.2.112]
Your sorting algorithm assumes the ordinary collating sequence, i.e. it sorts the strings in alphabetical order, as if the digits were letters; hence the shorter strings come first.
You need to specify an ad-hoc comparison function that splits the string and extracts the numerical value of the suffix.
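For instance, a compact way to write such a comparison function (the same idea as the comparator shown above, assuming every line starts with the literal prefix camera followed by digits and a colon):
Comparator<String> byCameraNumber = Comparator.comparingInt(
    line -> Integer.parseInt(line.substring("camera".length(), line.indexOf(':'))));
Collections.sort(lines, byCameraNumber);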

Java compare strings from two places and exclude any matches

I'm trying to end up with a results.txt minus any matching items, having successfully compared some string inputs against another .txt file. I've been staring at this code for way too long and I can't figure out why it isn't working. I'm new to coding, so I would appreciate being steered in the right direction! Maybe I need a different approach? Apologies in advance for any loud tutting noises you may make. Using Java 8.
//Sending a String[] into 'searchFile', contains around 8 small strings.
//Example of input: String[]{"name1","name2","name 3", "name 4.zip"}
^ This is my exclusions list.
public static void searchFile(String[] arr, String separator)
{
StringBuilder b = new StringBuilder();
for(int i = 0; i < arr.length; i++)
{
if(i != 0) b.append(separator);
b.append(arr[i]);
String findME = arr[i];
searchInfo(MyApp.getOptionsDir()+File.separator+"file-to-search.txt",findME);
}
}
^ This works fine. I'm then sending the results to 'searchInfo' and trying to match and remove any duplicate (complete, not partial) strings. This is where I am currently failing. The code runs but doesn't produce my desired output; it often matches partial strings rather than complete ones. I also think the 'results.txt' file is being overwritten each time... but I'm not sure, tbh!
file-to-search.txt contains: "name2","name.zip","name 3.zip","name 4.zip" (text file is just a single line)
public static String searchInfo(String fileName, String findME)
{
StringBuffer sb = new StringBuffer();
try {
BufferedReader br = new BufferedReader(new FileReader(fileName));
String line = null;
while((line = br.readLine()) != null)
{
if(line.startsWith("\""+findME+"\""))
{
sb.append(line);
//tried various replace options with no joy
line = line.replaceFirst(findME+"?,", "");
//then goes off with results to create a txt file
FileHandling.createFile("results.txt",line);
}
}
} catch (Exception e) {
e.printStackTrace();
}
return sb.toString();
}
What I'm trying to end up with is a results file MINUS any matching complete strings (not partial strings):
e.g. results.txt to end up with: "name.zip","name 3.zip"
OK, with the information I have, what you can do is this:
List<String> result = new ArrayList<>();
String content = FileUtils.readFileToString(file, "UTF-8");
for (String s : content.split(", ")) {
if (!s.equals(findME)) { // assuming both have string quotes added already
result.add(s);
}
}
FileUtils.write(newFile, String.join(", ", result), "UTF-8");
This uses Apache Commons IO's FileUtils for ease. You may add or remove spaces after the comma as per your need.
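If you'd rather not pull in a library, a roughly equivalent sketch using only the JDK (Java 8) could look like this; fileName and findME are the same variables as in your searchInfo method, results.txt is the output file you mentioned, and this still belongs inside your existing try/catch since these calls throw IOException:
String content = new String(Files.readAllBytes(Paths.get(fileName)), StandardCharsets.UTF_8);
List<String> result = new ArrayList<>();
for (String s : content.split(",")) {          // file-to-search.txt is a single comma-separated line
    if (!s.equals(findME)) {                    // keep only entries that are not a complete match (quotes included)
        result.add(s);
    }
}
Files.write(Paths.get("results.txt"), String.join(",", result).getBytes(StandardCharsets.UTF_8));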

Java how to remove duplicates from ArrayList

I have a CSV file which contains rules and ruleversions. The CSV file looks like this:
CSV FILE:
#RULENAME, RULEVERSION
RULE,01-02-01
RULE,01-02-02
RULE,01-02-34
OTHER_RULE,01-02-04
THIRDRULE, 01-02-04
THIRDRULE, 01-02-04
As you can see, one rule can have one or more rule versions. What I need to do is read this CSV file and put the entries in an array. I am currently doing that with the following script:
private static List<String[]> getRulesFromFile() {
String csvFile = "rulesets.csv";
BufferedReader br = null;
String line = "";
String delimiter = ",";
List<String[]> input = new ArrayList<String[]>();
try {
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
if (!line.startsWith("#")) {
String[] rulesetEntry = line.split(delimiter);
input.add(rulesetEntry);
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return input;
}
But I need to adapt the script so that it saves the information in the following format:
ARRAY (
=> RULE => 01-02-01, 01-02-02, 01-02-04
=> OTHER_RULE => 01-02-34
=> THIRDRULE => 01-02-01, 01-02-02
)
What is the best way to do this? Multidimensional array? And how do I make sure it doesn't save the rulename more than once?
You should use a different data structure, for example a HashMap, like this:
HashMap<String, List<String>> myMap = new HashMap<>();
try {
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
if (!line.startsWith("#")) {
String[] parts = line.split(delimiter);
String key = parts[0];
String value = parts[1];
if (myMap.containsKey(key)) {
myMap.get(key).add(value);
} else {
List<String> values = new ArrayList<String>();
values.add(value);
myMap.put(key, values);
}
}
}
} catch (IOException e) {
e.printStackTrace();
}
This should work!
Using an ArrayList is not a good choice of data structure here.
I would personally suggest a HashMap<String, List<String>> for this particular purpose.
The rules will be your keys and the rule versions will be your values, where each value is a list of strings.
While traversing your original file, just check whether the rule (key) is already present: if so, add the value to the list of rule versions (values) already there, otherwise add a new key and add the value to it.
For instance like this:
public List<String> removeDuplicates(List<String> myList) {
Hashtable<String, String> hashtable=new Hashtable<String, String>();
for(String s:myList) {
hashtable.put(s, s);
}
return new ArrayList<String>(hashtable.values());
}
This is exactly what key-value pairs can be used for. Just take a look at the Map interface. There you can define a unique key holding various elements as its value, which fits your issue perfectly.
Code:
// This collection will take String type as a Key
// and Prevent duplicates in its associated values
Map<String, HashSet<String>> map = new HashMap<String,HashSet<String>>();
// Check if collection contains the Key you are about to enter
// !REPLACE! -> "rule" with the Key you want to enter into your collection
// !REPLACE! -> "whatever" with the Value you want to associate with the key
if(!map.containsKey("rule")){
map.put("rule", new HashSet<String>());
}
// add the value in every case, not only when the key already existed,
// otherwise the first value for a new key would be lost
map.get("rule").add("whatever");
Reference:
Set
Map

Reading a .txt file and excluding certain elements

In my journey to complete this program I've run into a little hitch with one of my methods. The method I am writing reads a certain .txt file and creates a HashMap, setting every word found as a key and the number of times it appears as its value. I have managed to figure this out for another method, but this time the .txt file the method is reading is in a strange format. Specifically:
more 2
morning's 1
most 3
mostly 1
mythology. 1
native 1
nearly 2
northern 1
occupying 1
of 29
off 1
And so on.
Right now, the method only processes the first line of the file.
Here is my code for the method:
public static HashMap<String,Integer> readVocabulary(String fileName) {
// Declare the HashMap to be returned
HashMap<String, Integer> wordCount = new HashMap();
String toRead = fileName;
try {
FileReader reader = new FileReader(toRead);
BufferedReader br = new BufferedReader(reader);
// The BufferedReader reads the lines
String line = br.readLine();
// Split the line into a String array to loop through
String[] words = line.split(" ");
// for loop goes through every word
for (int i = 0; i < words.length; i++) {
// Case if the HashMap already contains the key.
// If so, just increments the value.
if (wordCount.containsKey(words[i])) {
int n = wordCount.get(words[i]);
wordCount.put(words[i], ++n);
}
// Otherwise, puts the word into the HashMap
else {
wordCount.put(words[i], 1);
}
}
br.close();
}
// Catching the file not found error
// and any other errors
catch (FileNotFoundException fnfe) {
System.err.println("File not found.");
}
catch (Exception e) {
System.err.print(e);
}
return wordCount;
}
The issue is that I'm not sure how to get the method to ignore the 2's, 1's, and 29's in the .txt file. I attempted an 'else if' statement to catch all of these cases, but there are too many. Is there a way for me to catch all the ints from, say, 1-100 and exclude them from being keys in the HashMap? I've searched online but have turned up nothing.
Thank you for any help you can give!
How about just doing wordCount.put(words[0], 1) for every line, after you've done the split? If the pattern is always "word number", you only need the first item from the split array.
Update after some back and forth
public static HashMap<String,Integer> readVocabulary(String toRead)
{
// Declare the HashMap to be returned
HashMap<String, Integer> wordCount = new HashMap<String, Integer>();
String line = null;
String[] words = null;
int lineNumber = 0;
FileReader reader = null;
BufferedReader br = null;
try {
reader = new FileReader(toRead);
br = new BufferedReader(reader);
// Split the line into a String array to loop through
while ((line = br.readLine()) != null) {
lineNumber++;
words = line.split(" ");
if (words.length == 2) {
if (wordCount.containsKey(words[0]))
{
int n = wordCount.get(words[0]);
wordCount.put(words[0], ++n);
}
// Otherwise, puts the word into the HashMap
else
{
boolean word2IsInteger = true;
try
{
Integer.parseInt(words[1]);
}
catch(NumberFormatException nfe)
{
word2IsInteger = false;
}
if (word2IsInteger) {
wordCount.put(words[0], Integer.parseInt(words[1]));
}
}
}
}
br.close();
br = null;
reader.close();
reader = null;
}
// Catching the file not found error
// and any other errors
catch (FileNotFoundException fnfe) {
System.err.println("File not found.");
}
catch (Exception e) {
System.err.print(e);
}
return wordCount;
}
To check whether a String contains only digits, use String's matches() method, e.g.
if (!words[i].matches("^\\d+$")){
// NOT a String containing only digits
}
This won't require catching exceptions, and it doesn't matter if the number wouldn't fit inside an Integer.
Option 1: Ignore numbers separated by whitespace
Use Integer.parseInt() (or Double.parseDouble()) and catch the exception.
// for loop goes through every word
for (int i = 0; i < words.length; i++) {
try {
int wordAsInt = Integer.parseInt(words[i]);
} catch(NumberFormatException e) {
// Case if the HashMap already contains the key.
// If so, just increments the value.
if (wordCount.containsKey(words[i])) {
int n = wordCount.get(words[i]);
wordCount.put(words[i], ++n);
}
// Otherwise, puts the word into the HashMap
else {
wordCount.put(words[i], 1);
}
}
}
There is a Double.parseDouble(String) method, which you could use in place of Integer.parseInt(String) above if you wanted to eliminate all numbers, not just integers.
Option 2: Ignore numbers everywhere
Another option is to parse your input one character at a time and ignore any character that isn't a letter. When you reach whitespace, you add the word built from the characters scanned so far to your HashMap. Unlike the methods mentioned above, scanning by character lets you ignore numbers even when they appear immediately next to other characters.
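Here is a rough sketch of that character-by-character idea (just one possible way to do it; fileName is the same parameter your method already takes):
Map<String, Integer> wordCount = new HashMap<>();
StringBuilder current = new StringBuilder();
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
    int c;
    while ((c = br.read()) != -1) {
        if (Character.isLetter(c)) {
            current.append((char) c);               // still inside a word
        } else if (current.length() > 0) {          // hit a digit, space, or punctuation
            wordCount.merge(current.toString(), 1, Integer::sum);
            current.setLength(0);                   // start collecting the next word
        }
    }
    if (current.length() > 0) {                     // flush a trailing word, if any
        wordCount.merge(current.toString(), 1, Integer::sum);
    }
} catch (IOException e) {
    e.printStackTrace();
}
Note that punctuation such as the period in "mythology." is treated as a separator here, which may or may not be what you want.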

Print data from file to array

I need to read this file into an array, not print it to the screen. And yes, I MUST use an array (school project). I'm very new to Java, so any help is appreciated. Any ideas? Thanks.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
public class HangmanProject
{
public static void main(String[] args) throws FileNotFoundException
{
String scoreKeeper; // to keep track of score
int guessesLeft; // to keep track of guesses remaining
String wordList[]; // array to store words
Scanner keyboard = new Scanner(System.in); // to read user's input
System.out.println("Welcome to Hangman Project!");
// Create a scanner to read the secret words file
Scanner wordScan = null;
try {
wordScan = new Scanner(new BufferedReader(new FileReader("words.txt")));
while (wordScan.hasNext()) {
System.out.println(wordScan.next());
}
} finally {
if (wordScan != null) {
wordScan.close();
}
}
}
}
Nick, you just gave us the final piece of the puzzle. If you know the number of lines you will be reading, you can simply define an array of that length before you read the file.
Something like...
String[] wordArray = new String[10];
int index = 0;
String word = null; // word to be read from file...
// Use buffered reader to read each line...
wordArray[index] = word;
index++;
Now, that example isn't going to mean much on its own to be honest, so I wrote these two examples.
The first one uses the concept suggested by Alex, which allows you to read an unknown number of lines from the file.
The only trip-up is if the lines are separated by more than one line feed (i.e. there is an extra blank line between words).
public static void readUnknownWords() {
// Reference to the words file
File words = new File("Words.txt");
// Use a StringBuilder to buffer the content as it's read from the file
StringBuilder sb = new StringBuilder(128);
BufferedReader reader = null;
try {
// Create the reader. A FileReader would be just as fine in this
// example, but hey ;)
reader = new BufferedReader(new FileReader(words));
// The read buffer to use to read data into
char[] buffer = new char[1024];
int bytesRead = -1;
// Read the file until we get to the end
while ((bytesRead = reader.read(buffer)) != -1) {
// Append the results to the string builder
sb.append(buffer, 0, bytesRead);
}
// Split the string builder into individual words by the line break
String[] wordArray = sb.toString().split("\n");
System.out.println("Read " + wordArray.length + " words");
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
reader.close();
} catch (Exception e) {
}
}
}
The second demonstrates how to read the words into an array of known length. This is probably closer to what you actually want.
public static void readKnownWords() {
// This is just the same as the previous example, except we
// know in advance the number of lines we will be reading
File words = new File("Words.txt");
BufferedReader reader = null;
try {
// Create the word array of a known quantity
// The quantity value could be defined as a constant
// ie public static final int WORD_COUNT = 10;
String[] wordArray = new String[10];
reader = new BufferedReader(new FileReader(words));
// Instead of reading to a char buffer, we are
// going to take the easy route and read each line
// straight into a String
String text = null;
// The current array index
int index = 0;
// Read the file till we reach the end
// ps- my file had lots more words, so I put a limit
// in the loop to prevent index out of bounds exceptions
while ((text = reader.readLine()) != null && index < 10) {
wordArray[index] = text;
index++;
}
System.out.println("Read " + wordArray.length + " words");
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
reader.close();
} catch (Exception e) {
}
}
}
If you find either of these useful, I would appreciate it if you would give me a small up-vote and mark Alex's answer as correct, as it's his idea that I've adapted.
Now, if you're really particular about which line break to use, you can find the value used by the system via the System.getProperties().getProperty("line.separator") property.
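For example (fileContents here is just a placeholder for whatever String you read the file into):
String sep = System.getProperties().getProperty("line.separator"); // "\r\n" on Windows, "\n" on Unix-like systems
String[] words = fileContents.split(sep);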
Do you need more help with reading the file, or with getting the String into a parsed array? If you can read the file into a String, simply do:
String[] words = readString.split("\n");
That will split the string at each line break, so assuming this is your text file:
Word1
Word2
Word3
words will be: {Word1, Word2, Word3}
If the words you are reading are stored one per line in the file, you can use hasNextLine() and nextLine() to read the text one line at a time. Using next() will also work, since you just need to put one word into the array, but nextLine() is usually preferred.
As for only using an array, you have two options:
You either declare a large array whose size you are sure will never be less than the total number of words;
Or you go through the file twice: the first time you count the number of elements, then you initialize the array with that count and go through the file a second time, adding the strings as you go.
It is usually recommended to use a dynamic collection such as an ArrayList. You can then use the toArray() method to turn the list into an array, as in the sketch below.
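A short sketch of that approach, assuming the words.txt file from the question with one word per line:
List<String> lines = new ArrayList<>();
try (Scanner wordScan = new Scanner(new BufferedReader(new FileReader("words.txt")))) {
    while (wordScan.hasNextLine()) {
        lines.add(wordScan.nextLine()); // collect every line first
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
String[] wordList = lines.toArray(new String[0]); // the plain array the assignment requires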
