I have Java code that compares two files, but I would like it to keep the file with the most characters and delete the other one. I don't think it should go by file size, because adding just one extra character might leave the file size unchanged. Is that correct?
Any help is appreciated.
Here is my code:
import java.io.*;
import java.util.*;
public class FileComp
{
public static void main (String[] args) throws java.io.IOException
{
BufferedReader br2 = new BufferedReader (new
InputStreamReader(System.in));
String str = ("compt1.txt");
String str1 = ("compt2.txt");
String s1="";
String s2="",s3="",s4="";
String y="",z="";
BufferedReader br = new BufferedReader (new FileReader (str));
BufferedReader br1 = new BufferedReader (new FileReader (str1));
while((z=br1.readLine())!=null)
s3+=z;
while((y=br.readLine())!=null)
s1+=y;
System.out.println ();
int numTokens = 0;
StringTokenizer st = new StringTokenizer (s1);
String[] a = new String[10000];
for(int l=0;l<10000;l++)
{a[l]="";}
int i=0;
while (st.hasMoreTokens())
{
s2 = st.nextToken();
a[i]=s2;
i++;
numTokens++;
}
int numTokens1 = 0;
StringTokenizer st1 = new StringTokenizer (s3);
String[] b = new String[10000];
for(int k=0;k<10000;k++)
{b[k]="";}
int j=0;
while (st1.hasMoreTokens())
{
s4 = st1.nextToken();
b[j]=s4;
j++;
numTokens1++;
}
int x=0;
for(int m=0;m<a.length;m++)
{
if(a[m].equals(b[m])){}
else
{
x++;
System.out.println(a[m] + " -- " +b[m]);
System.out.println();}
}
//////////////////////////////Change this:
System.out.println("Number of differences " + x);
if(x>0){System.out.println("Files are not equal");}
else{System.out.println("Files are equal. No difference found");}
////////////////////////////////////
}
}
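Use the length() method on File, which returns the size in bytes:
File file = new File("File.txt");
long l = file.length();
Every character is stored as at least one byte, so adding a character always makes the file larger. In a multi-byte encoding such as UTF-8, however, the byte count can differ from the character count.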
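As a rough sketch of the character-count approach the question asks about (it is not part of the answer above): the class name KeepLongerFile, the helper names and the choice of UTF-8 are assumptions made for illustration. It decodes each file, compares the number of characters rather than bytes, and deletes the file with fewer characters.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class KeepLongerFile {

    // Counts characters (UTF-16 code units) after decoding the bytes with the given charset.
    static int countChars(byte[] bytes, Charset charset) {
        return new String(bytes, charset).length();
    }

    // Returns the file with fewer characters, or null when both have the same count.
    static Path shorterOf(Path first, Path second, Charset charset) throws IOException {
        int a = countChars(Files.readAllBytes(first), charset);
        int b = countChars(Files.readAllBytes(second), charset);
        if (a == b) {
            return null; // tie: keep both rather than guess
        }
        return a < b ? first : second;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java KeepLongerFile <file1> <file2>");
            return;
        }
        // UTF-8 is an assumption; use the encoding the files were written in.
        Path loser = shorterOf(Paths.get(args[0]), Paths.get(args[1]), StandardCharsets.UTF_8);
        if (loser == null) {
            System.out.println("Both files have the same number of characters; nothing deleted.");
        } else {
            Files.delete(loser);
            System.out.println("Deleted " + loser);
        }
    }
}
```

With the file names from the question, this could be run as java KeepLongerFile compt1.txt compt2.txt. Reading the whole file into memory is fine for small text files; for very large files, counting characters through a Reader would be a better fit.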
I have the following code which counts and displays the number of times each word occurs in the whole text document.
try {
List<String> list = new ArrayList<String>();
int totalWords = 0;
int uniqueWords = 0;
File fr = new File("filename.txt");
Scanner sc = new Scanner(fr);
while (sc.hasNext()) {
String words = sc.next();
String[] space = words.split(" ");
for (int i = 0; i < space.length; i++) {
list.add(space[i]);
}
totalWords++;
}
System.out.println("Words with their frequency..");
Set<String> uniqueSet = new HashSet<String>(list);
for (String word : uniqueSet) {
System.out.println(word + ": " + Collections.frequency(list,word));
}
} catch (Exception e) {
System.out.println("File not found");
}
Is it possible to modify this code to make it so it only counts each occurrence once per line rather than in the entire document?
One can read the file line by line and then apply the counting logic to each line:
File file = new File("filename.txt");
FileReader fileReader = new FileReader(file);
BufferedReader br = new BufferedReader(fileReader);
// Read the line in the file
String line = null;
while ((line = br.readLine()) != null) {
//Code to count the occurrences of the words
}
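One possible way to fill in the counting step, as a sketch only: the class PerLineWordCount and the method countLinesContaining are hypothetical names, and the lines are passed in as a list (they could come from the readLine() loop above or from Files.readAllLines). Each word is counted at most once per line by putting the words of a line into a set first.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class PerLineWordCount {

    // For each word, counts the number of lines that contain it at least once.
    static Map<String, Integer> countLinesContaining(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // The set drops words that are repeated within the same line.
            Set<String> wordsInLine = new HashSet<>(Arrays.asList(line.trim().split("\\s+")));
            for (String word : wordsInLine) {
                if (!word.isEmpty()) { // a blank line yields one empty token
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cat and the dog", "the bird");
        // prints {and=1, bird=1, cat=1, dog=1, the=2}
        System.out.println(countLinesContaining(lines));
    }
}
```

The TreeMap is only there to give a stable, alphabetical print order; a HashMap would count just as well.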
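Yes. A Set is a collection like an ArrayList, with the key difference that it holds no duplicates.
So, put the words of each line into a set before adding them to the list. Note that sc.next() returns a single word, so read whole lines with sc.nextLine() instead.
In your while loop:
while (sc.hasNextLine()) {
    String line = sc.nextLine().trim();
    if (line.isEmpty()) continue; // skip blank lines
    String[] space = line.split("\\s+");
    // convert the space array -> set, dropping duplicates within this line
    Set<String> set = new HashSet<String>(Arrays.asList(space));
    for (String word : set) {
        list.add(word);
    }
    totalWords += space.length;
}
The rest of the code should remain the same.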
For the given text file (text.txt), compute how many times each word appears in the file. The output of the program should be another text file containing, on each line, a word followed by the number of times it appears in the original file. After you finish, change the program so that the words in the output file are sorted alphabetically. Do not use maps; use only basic arrays. My program only reports the count for the single word that I enter from the keyboard. How can I display the counts for all words, not only for one? Thanks.
package worddata;
import java.io.IOException;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.*;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
class WordData {
public FileReader fr = null;
public BufferedReader br =null;
public String [] stringArray;
public int counLine = 0;
public int arrayLength ;
public String s="";
public String stringLine="";
public String filename ="";
public String wordname ="";
public WordData(){
try{
Scanner scan = new Scanner(System.in);
System.out.println("Please enter the filename: ");
filename = scan.nextLine();
Scanner scan2 = new Scanner(System.in);
System.out.println("Please enter a word: ");
wordname = scan.nextLine();
fr = new FileReader(filename);
br = new BufferedReader(fr);
while((s = br.readLine()) != null){
stringLine = stringLine + s;
//System.out.println(s);
stringLine = stringLine + " ";
counLine ++;
}
stringArray = stringLine.split(" ");
arrayLength = stringArray.length;
for (int i = 0; i < arrayLength; i++) {
int c = 1 ;
for (int j = i+1; j < arrayLength; j++) {
if(stringArray[i].equalsIgnoreCase(stringArray[j])){
c++;
for (int j2 = j; j2 < arrayLength; j2++) {
stringArray[j2] = stringArray[j2+1];
arrayLength = arrayLength - 1;
}
if (stringArray[i].equalsIgnoreCase(wordname)){
System.out.println("The word "+wordname+" is present "+c+" times in the specified file.");
}
}
}
}
System.out.println("Total number of lines: "+counLine);
fr.close();
br.close();
}catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) throws IOException {
Scanner scan = new Scanner(System.in);
OutputStream out = new FileOutputStream("output.txt");
System.out.println("Please enter the filename: ");
String filename = scan.nextLine();
System.out.println("Please enter a word: ");
String wordname = scan.nextLine();
int count = 0;
try (LineNumberReader r = new LineNumberReader(new FileReader(filename))) {
String line;
while ((line = r.readLine()) != null) {
for (String element : line.split(" ")) {
if (element.equalsIgnoreCase(wordname)) {
count++;
System.out.println("Word found at line " + r.getLineNumber());
}
}
}
}
FileReader fileReader = new FileReader(filename);
BufferedReader bufferedReader = new BufferedReader(fileReader);
StringBuffer stringBuffer = new StringBuffer();
String line;
while ((line = bufferedReader.readLine()) != null) {
stringBuffer.append(line);
stringBuffer.append("\n");
}
fileReader.close();
System.out.println("The word " + stringBuffer.toString() + " appears " + count + " times.");
int i;
List<String> ls = new ArrayList<String>();
for (i = 1; i <= 1000; i++) {
String str = null;
str = +i + ":- The word "+wordname+" was found " + count +" times";
ls.add(str);
}
String listString = "";
for (String s : ls) {
listString += s + "\n";
}
FileWriter writer = null;
try {
writer = new FileWriter("final.txt");
writer.write(listString);
writer.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
I think the code below does something like what you want.
It does the following:
reads the contents of the input.txt file
removes punctuation marks from the text
makes it one string of words by replacing line breaks with spaces
splits the text into words, using a space as the delimiter
maps all the words to lowercase, trims surrounding whitespace and removes all empty entries
loops over all the words and keeps each word's count in the HashMap
sorts the map by count in reverse order, so the most frequent words come first
writes the entries to a StringBuilder in the format "word : count\n" and then writes that to a text file
final String content = new String(Files.readAllBytes(Paths.get("<PATH TO YOUR PLACE>/input.txt")));
final List<String> words = Arrays.asList(content.replaceAll("\\p{Punct}", "").replace("\n", " ").split(" "));
final Map<String, Integer> wordlist = new HashMap<>();
words.stream()
.map(String::toLowerCase)
.map(String::trim)
.filter(s -> !s.isEmpty())
.forEach(s -> {
wordlist.computeIfPresent(s, (s1, integer) -> ++integer);
wordlist.putIfAbsent(s, 1);
});
final StringBuilder sb = new StringBuilder();
wordlist.entrySet()
.stream()
.sorted(Map.Entry.comparingByValue(Collections.reverseOrder()))
.collect(Collectors.toMap(
Map.Entry::getKey,
Map.Entry::getValue,
(e1, e2) -> e1,
LinkedHashMap::new
)).forEach((s, integer) -> sb.append(s).append(" : ").append(integer).append("\n"));
Files.write(Paths.get("<PATH TO YOUR PLACE>/output.txt"), sb.toString().getBytes());
Hope it helps :-)
Note: the <PATH TO YOUR PLACE> needs to be replaced by the fully qualified path to your text file with words.
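Since the question asks for basic arrays and alphabetical order rather than a map, here is a hedged arrays-only sketch. The class ArrayWordCount and the method countWords are hypothetical names, and treating words case-insensitively is an assumption carried over from the question's use of equalsIgnoreCase. Sorting the array puts equal words next to each other, so each run of identical words can be counted in one pass.

```java
import java.util.Arrays;

class ArrayWordCount {

    // Returns one "word count" line per distinct word, in alphabetical order.
    static String[] countWords(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        String[] words = trimmed.split("\\s+");
        Arrays.sort(words); // equal words become neighbours

        String[] lines = new String[words.length];
        int used = 0;
        int i = 0;
        while (i < words.length) {
            int j = i;
            while (j < words.length && words[j].equals(words[i])) {
                j++; // walk over the run of identical words
            }
            lines[used++] = words[i] + " " + (j - i);
            i = j;
        }
        return Arrays.copyOf(lines, used); // drop the unused tail
    }

    public static void main(String[] args) {
        // prints "a 2", "b 2" and "c 1" on separate lines
        for (String line : countWords("b a c a B")) {
            System.out.println(line);
        }
    }
}
```

The resulting lines could then be written to the output file, for example with Files.write(Paths.get("output.txt"), Arrays.asList(lines)).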
In my code, I am trying to read a file that looks like this:
100
22
123;22
123 342;432
but the output includes the ";" (e.g. {100,22,123;22,123,342;432}).
I am trying to turn the file into an array (e.g. {100,22,123,22,123...}).
Is there a way to read the file but ignore the semicolons?
Thanks!
public static void main(String args [])
{
String[] inFile = readFiles("ElevatorConfig.txt");
for ( int i = 0; i <inFile.length; i = i + 1)
{
System.out.println(inFile[i]);
}
System.out.println(Arrays.toString(inFile));
}
public static String[] readFiles(String file)
{
int ctr = 0;
try{
Scanner s1 = new Scanner(new File(file));
while (s1.hasNextLine()){
ctr = ctr + 1;
s1.next();
}
String[] words = new String[ctr];
Scanner s2 = new Scanner(new File(file));
for ( int i = 0 ; i < ctr ; i = i + 1){
words[i] = s2.next();
}
return words;
}
catch(FileNotFoundException e)
{
return null;
}
}
Replace your readFiles method with the following, which also splits each token on semicolons:
public static String[] readFiles(String file) {
    List<String> retList = new ArrayList<String>();
    // try-with-resources closes the Scanner when done
    try (Scanner s2 = new Scanner(new File(file))) {
        while (s2.hasNext()) {
            // a token such as "123;22" becomes "123" and "22"
            for (String part : s2.next().split(";")) {
                retList.add(part);
            }
        }
    } catch (FileNotFoundException e) {
        return null; // same behaviour as the original method
    }
    // toArray() without an argument returns Object[], which cannot be cast to String[]
    return retList.toArray(new String[0]);
}
Use a regex. Read the entire file into a String (read each token as a String and append a blank space after each token), then split it at whitespace and semicolons.
// x contains the entire contents of the file
String[] words = x.split("[\\s;]+");
The contents of words[] are:
"100", "22", "123", "22", "123", "342", "432"
Remember to parse them to int before using them as numbers.
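The split-and-parse step might look like the following sketch. The class SemicolonSplit and the method parseNumbers are hypothetical names, and the sample string stands in for the file contents. Leading delimiters are stripped first because split() would otherwise produce an empty first element.

```java
import java.util.Arrays;

class SemicolonSplit {

    // Splits on any run of whitespace and/or semicolons and parses each piece as an int.
    static int[] parseNumbers(String content) {
        // Remove leading delimiters so the first element is never empty.
        String cleaned = content.replaceFirst("^[\\s;]+", "");
        if (cleaned.isEmpty()) {
            return new int[0];
        }
        String[] words = cleaned.split("[\\s;]+");
        int[] numbers = new int[words.length];
        for (int i = 0; i < words.length; i++) {
            numbers[i] = Integer.parseInt(words[i]);
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Same layout as the file in the question.
        String content = "100\n22\n123;22\n123 342;432\n";
        // prints [100, 22, 123, 22, 123, 342, 432]
        System.out.println(Arrays.toString(parseNumbers(content)));
    }
}
```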
A simple way is to use a BufferedReader: read the file line by line, then split on semicolons and whitespace.
public static String[] readFiles(String file) throws IOException
{
    StringBuilder sb = new StringBuilder();
    // try-with-resources closes the reader when done
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String line = br.readLine();
        while (line != null) {
            sb.append(line);
            sb.append(System.lineSeparator());
            line = br.readLine();
        }
    }
    String allfilestring = sb.toString();
    // split on whitespace as well as ";" so line breaks do not end up inside the values
    String[] array = allfilestring.split("[\\s;]+");
    return array;
}
You can use split() with a regex to split the string into an array according to your requirements.
String s; // string you have read from the file
String[] s1 = s.split(" |;"); // s1 contains the strings separated by space and ";"
Hope it helps
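Keep the two-pass structure that counts first and then fills the array, but count with hasNext() rather than hasNextLine(), and call useDelimiter("[\\s;]+") on both scanners so that semicolons are treated like whitespace. Otherwise the count will not match the number of values.
I would then just change the way you read your values:
s2.useDelimiter("[\\s;]+");
for (int i = 0; i < ctr; i++) {
    words[i] = "" + s2.nextInt();
}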
Another option is to replace all non-digit characters in the complete file string with a space. That way, any non-numeric character is ignored.
StringBuilder sb = new StringBuilder();
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
    String line = br.readLine();
    while (line != null) {
        sb.append(line);
        sb.append(' '); // keep numbers on adjacent lines apart
        line = br.readLine();
    }
}
String str = sb.toString();
str = str.replaceAll("\\D+", " ").trim();
Now you have a string of numbers separated by spaces, which can be tokenized into number strings.
String[] numbers = str.split("\\s+");
Then convert each one to an int.
class Ideone
{
public static void main (String[] args) throws java.lang.Exception
{//BigInteger bi1, bi2, bi3;
long t,j,n;
int i,x;
BigInteger u,sum,temp,m;
BigInteger[] a=new BigInteger[100009]; long[] b=new long[100009];
long mm=1000000007,f;
Scanner har=new Scanner(System.in);
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
t=har.nextInt();
for(f=0;f<t;f++)
{ temp = BigInteger.valueOf(1); sum = BigInteger.valueOf(0); u = BigInteger.valueOf(0);
n=har.nextInt(); x=har.nextInt(); m=har.nextBigInteger();//String line = br.readLine();
//String line = br.readLine(); // to read multiple integers line
//String[] strs = line.trim().split("\\s+");
//String[] s1 = br.readLine().split(" ");
//StringTokenizer st = new StringTokenizer(br.readLine());
for(i=1;i<=n;i++)
{b[i]=har.nextInt();
// b[i] = Integer.parseInt(st.nextToken());
//b[i] = Long.parseLong(System.console().readLine());
//b[i]=Long.parseLong(s1[i-1]);
}
}
}
}
The input is of the form
t
n x m
a[1] a[2] a[3] ... a[n]
This code runs correctly with Scanner, but if I try to use BufferedReader or StringTokenizer it gives a runtime error.
I am new to Java, and I need to use BigInteger for a later part of the question.
There is nothing wrong with the way you create the reader. You define the BufferedReader as BufferedReader br = new BufferedReader(new InputStreamReader(System.in)); and read a String with br.readLine();.
This snippet works fine and prints the input string back to the console:
public static void main (String[] args) throws IOException
{
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println(br.readLine());
}
Some places where you can get errors:
StringTokenizer
StringTokenizer st = new StringTokenizer(br.readLine());
Before passing the line to the StringTokenizer, make sure that you actually get a String back from br.readLine(), which returns null at the end of the input.
String text = br.readLine();
if (text != null && !text.isEmpty()) {
    // tokenize the line already read instead of reading a second one
    StringTokenizer st = new StringTokenizer(text);
    /* Rest of the code */
}
Loop Condition Variable
for(i=1;i<=n;i++)
I do not see anything for BufferedReader which initializes n. Make sure you initialize it correctly.
Here is a quick code snippet:
public static void main (String[] args) throws Exception
{
long j, n, f, mm = 1000000007;
int i, x;
BigInteger u, sum, temp, m;
BigInteger[] a = new BigInteger[100009]; long[] b=new long[100009];
Scanner har = new Scanner(System.in);
long t = har.nextInt();
for(f=0;f<t;f++)
{
temp = BigInteger.valueOf(1);
sum = BigInteger.valueOf(0);
u = BigInteger.valueOf(0);
n = har.nextInt();
x = har.nextInt();
m = har.nextBigInteger();
/* Goto Next Line */
har.nextLine();
/* Start Reading Line */
String text = har.nextLine();
System.out.println("String Value: " + text);
if(null != text) {
StringTokenizer st = new StringTokenizer(text);
for(i = 1; i <= n; i++)
{
b[i] = Integer.parseInt(st.nextToken());
System.out.println("Value of b[" + i + "] = " + b[i]);
}
}
}
}
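For comparison, here is a sketch that reads the same input with BufferedReader and StringTokenizer only. One common cause of this kind of runtime error is creating both a Scanner and a BufferedReader on System.in: the Scanner may buffer input the other reader then never sees, so readLine() returns null. The class FastInput and the helper parseValues are hypothetical names, and the sketch assumes each test case occupies exactly two lines, as in the format given in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.math.BigInteger;
import java.util.StringTokenizer;

class FastInput {

    // Parses the first n whitespace-separated integers of a line into b[1..n].
    static long[] parseValues(String line, int n) {
        StringTokenizer st = new StringTokenizer(line);
        long[] b = new long[n + 1]; // index 0 is unused, as in the question
        for (int i = 1; i <= n; i++) {
            b[i] = Long.parseLong(st.nextToken());
        }
        return b;
    }

    public static void main(String[] args) throws IOException {
        // One reader only: no Scanner is created on System.in.
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String first = br.readLine();
        if (first == null) {
            return; // no input at all
        }
        int t = Integer.parseInt(first.trim());
        for (int f = 0; f < t; f++) {
            // Second line of the format: n x m
            StringTokenizer header = new StringTokenizer(br.readLine());
            int n = Integer.parseInt(header.nextToken());
            int x = Integer.parseInt(header.nextToken());
            BigInteger m = new BigInteger(header.nextToken());
            // Third line of the format: a[1] ... a[n]
            long[] b = parseValues(br.readLine(), n);
            System.out.println("n=" + n + " x=" + x + " m=" + m + " last=" + b[n]);
        }
    }
}
```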
I'm asking again about counting words and how to store them in an array. So far, all I've got is this:
Scanner sc = new Scanner(System.in);
int count;
void readFile() {
System.out.println("Enter the name of the file: ");
String filNavn = sc.next();
try{
File k = new File(filNavn);
Scanner sc2 = new Scanner(k);
count = 0;
while(sc2.hasNext()) {
count++;
sc2.next();
}
Scanner sc3 = new Scanner(k);
String a[] = new String[count];
for(int i = 0;i<count;i++) {
a[i] =sc3.next();
if ( i == count -1 ) {
System.out.print(a[i] + "\n");
}else{
System.out.print(a[i] + " ");
}
}
System.out.println("Number of words: " + count);
}catch(FileNotFoundException e) {
My code works, but is there a simpler way to do this? My other question is: how do I count the unique words out of the total words in a given file without using HashMap or ArrayList?
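Here's a simpler way to go about it:
public static void main(String[] args) throws IOException {
    File f = new File(filename); // filename: path of the file to read
    String[] res = new String[0];
    try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(f)))) {
        String line;
        while ((line = br.readLine()) != null) {
            if (line.trim().isEmpty()) continue; // a blank line has no words
            String[] tokens = line.trim().split("\\s+");
            // ArrayUtils comes from Apache Commons Lang; addAll appends tokens to res
            res = ArrayUtils.addAll(res, tokens);
        }
    }
    System.out.println("Number of words: " + res.length);
}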
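For the second part of the question (unique words without HashMap or ArrayList), one possible sketch uses only an array: sort a copy, then count the positions where the word changes. The class UniqueWords and the method countUnique are hypothetical names, and the comparison here is case-sensitive.

```java
import java.util.Arrays;

class UniqueWords {

    // Counts distinct words using only an array: sort a copy, then count value changes.
    static int countUnique(String[] words) {
        if (words.length == 0) {
            return 0;
        }
        String[] sorted = Arrays.copyOf(words, words.length); // leave the input untouched
        Arrays.sort(sorted);
        int unique = 1; // the first word is always new
        for (int i = 1; i < sorted.length; i++) {
            if (!sorted[i].equals(sorted[i - 1])) {
                unique++;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        String[] a = "to be or not to be".split("\\s+");
        System.out.println("Number of words: " + a.length);              // prints 6
        System.out.println("Number of unique words: " + countUnique(a)); // prints 4
    }
}
```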