Counting number of special characters in a given file - java

Hi,
In the result, the number of zeros and the number of ones are not the same.
I can't find where the problem is.
Can anyone help me, please?
//Main {
int bufferSize = 10240; //10KB
int fileSize = 10 * 1024 * 1024; //10MB
Random r = new Random();
//Writing 0 and 1 into file
File file = new File("test.txt");
FileWriter fw = new FileWriter(file, false); // false = overwrite: any existing content is replaced each time the file is opened
BufferedWriter bw = new BufferedWriter(fw);
PrintWriter pw = new PrintWriter(bw);
for(int i=0; i<1000; i++){
for(int j=0; j<1000; j++){
if(r.nextBoolean()){
pw.write("0 ");
}else{
pw.write("1 ");
}
}
pw.write("\n");
}
System.out.println("End of writing into file : " + file.getName() + ", in : " + file.getAbsolutePath() + ", and its size : " + file.length());
pw.close();
//Read from file, and counting number of zeros and ones
System.out.println("Reading from file : Scanner method");
Scanner sc = null;
//sc = new Scanner(new BufferedReader(new FileReader(file)));
sc = new Scanner(new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"), bufferSize));
int countZeros=0;
int countOnes=0;
StringTokenizer st = null;
String temp = null;
//Start counting time
long debut = System.nanoTime();
while(sc.hasNext()){
st = new StringTokenizer(sc.next(), " ");
while(st.hasMoreTokens() ){
temp = st.nextToken();
if(temp.compareTo("0")==0 && !Character.isSpaceChar(temp.charAt(0))){
countZeros++;
}
else if(temp.compareTo("1")==0 && !Character.isSpaceChar(temp.charAt(0))){
countOnes++;
}
}
}
//End counting time
long end = System.nanoTime() - debut;
sc.close();
System.out.println("Number of Zeros : " + countZeros);
System.out.println("Number of Ones : " + countOnes);
System.out.println("Total of zeros and Ones : " + (countZeros+countOnes));
System.out.println("Duration of counting zeros and ones : " + end/1000000 + "ms");
System.out.println("************");
System.out.println("Reading from file : BufferedReader method");
countZeros=0;
countOnes=0;
st=null;
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"), bufferSize);
String[] tempLigne = null;
//Start counting time
debut = System.nanoTime();
for(int i=0; (i=br.read())>-1;){
tempLigne = br.readLine().split(" ");
for(int j=0; j<tempLigne.length; j++){
if(tempLigne[j].equals("0")){
countZeros++;
}else if(tempLigne[j].equals("1")){
countOnes++;
}
}
}
//End counting time
end = System.nanoTime() - debut;
br.close();
System.out.println("Number of Zeros : " + countZeros);
System.out.println("Number of Ones : " + countOnes);
System.out.println("Total of zeros and Ones : " + (countZeros+countOnes));
System.out.println("Duration of counting zeros and ones : " + end/1000000 + "ms");
}
}
//Output
End of writing into file : test.txt, in : C:\Users\youness\workspace\ScannerFile\test.txt, and its size : 1990656
Reading from file : Scanner method
Number of Zeros : 499807
Number of Ones : 500193
Total of zeros and Ones : 1000000
Duration of counting zeros and ones : 1020ms
************
Reading from file : BufferedReader method
Number of Zeros : 499303
Number of Ones : 499697
Total of zeros and Ones : 999000
Duration of counting zeros and ones : 177ms
Thank you,
Best regards

The problem lies in the code here:
for(int i=0; (i=br.read())>-1;){
tempLigne = br.readLine().split(" ");
for(int j=0; j<tempLigne.length; j++){
if(tempLigne[j].equals("0")){
countZeros++;
}else if(tempLigne[j].equals("1")){
countOnes++;
}
}
}
br.read() reads a single character. Instead of processing that character, the loop discards it and then reads the rest of the line with br.readLine().
Since there are 1000 lines in the file and the first character of each line (always a 0 or a 1) is discarded, the count comes up 1000 short, which matches the total of 999000.
You can change for(int i=0; (i=br.read())>-1;) to while (br.ready()) so that the loop condition no longer consumes a character. Note that ready() only reports whether a read would block; the usual end-of-file test is to loop until br.readLine() returns null.
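As a minimal sketch of the readLine-until-null loop, assuming the same "0 1 0 ..." line layout as in the question: the class name DigitCounter and the in-memory sample text are made up for illustration, and a StringReader stands in for the file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class DigitCounter {
    /** Returns {zeros, ones} counted over every line of the reader. */
    static int[] count(BufferedReader br) throws IOException {
        int zeros = 0, ones = 0;
        String line;
        // readLine() returns null at end of stream, so the loop test
        // never consumes a character that is then thrown away.
        while ((line = br.readLine()) != null) {
            for (String token : line.split(" ")) {
                if (token.equals("0")) {
                    zeros++;
                } else if (token.equals("1")) {
                    ones++;
                }
            }
        }
        return new int[] {zeros, ones};
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data standing in for test.txt.
        BufferedReader br = new BufferedReader(new StringReader("0 1 1 \n1 0 0 \n"));
        int[] c = count(br);
        System.out.println("Zeros: " + c[0] + ", ones: " + c[1]);
    }
}
```

With the real file you would pass the BufferedReader built from the FileInputStream instead of the StringReader.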

You're generating the 0 and 1 values with the Random class.
Each call to nextBoolean() is an independent, roughly 50/50 draw, so nothing ensures that the number of 0 and 1 characters generated will be exactly the same.
What evens out as you generate more characters is the proportion: the share of 0s and the share of 1s each get closer to 50%.
The absolute difference between the two counts does not have to shrink, so a gap of a few hundred in a million draws, as in your output, is normal.
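To see this for yourself, here is a small sketch that counts the true results of nextBoolean() for two sample sizes. The class name CoinFlips and the seed 42 are arbitrary choices for illustration; the seed only makes runs repeatable.

```java
import java.util.Random;

class CoinFlips {
    /** Returns how many of n calls to nextBoolean() came back true. */
    static int countTrue(Random r, int n) {
        int trues = 0;
        for (int i = 0; i < n; i++) {
            if (r.nextBoolean()) {
                trues++;
            }
        }
        return trues;
    }

    public static void main(String[] args) {
        Random r = new Random(42L); // fixed, arbitrary seed so runs are repeatable
        for (int n : new int[] {1_000, 1_000_000}) {
            int trues = countTrue(r, n);
            // The two counts usually differ, while the share stays near 0.5.
            System.out.printf("n=%d trues=%d falses=%d share=%.4f%n",
                    n, trues, n - trues, (double) trues / n);
        }
    }
}
```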

Related

How many words are in each sentence of the file based in this code Java

I need to count how many words are in each sentence of the file, based on this code.
We have the file called archivo:
File archivo = null;
try {
archivo = new File("Text.txt");
String line;
FileReader fr = new FileReader (archivo);
BufferedReader br = new BufferedReader(fr);
int i,a=0;
while((line=br.readLine())!=null) {
for(i=0;i<line.length();i++){
if(i==0){
if(line.charAt(i)!=' ')
a++;
}else{
if(line.charAt(i-1)==' ')
if(line.charAt(i)!=' ')
a++;
}
}
}
Here we print the number of words, but I also need the number of words per sentence:
System.out.println("There are "+a+" words");
fr.close();
}catch(IOException a){
System.out.println(a);
}
}
}
The text.txt says:
hi
I'm Katie
and I have two cats.
Do it as follows:
String line;
int count=0, totalCount=0;
while((line=br.readLine())!=null) {
count = line.split("\\s+").length;
System.out.println("The number of words in '" + line + "' is: " + count);
totalCount += count;
}
System.out.println("The total number of words in the file is " + totalCount);
Explanation: String::split splits a string into an array of strings around matches of the given regex. The regex \\s+ (the backslash is doubled in Java source) matches one or more whitespace characters. For each line, the program prints count, i.e. the number of words (the length of the array that split returns), and adds it to totalCount. At the end, the program prints totalCount, the total number of words in the file. Note that an empty line still yields a count of 1, and a line with leading whitespace produces an empty first element, so trim the line and check for emptiness first if either can occur.
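A minimal sketch of the per-line count with the blank-line and leading-whitespace cases handled; the class name WordCount and the helper countWords are made up for this example, and the sample lines are the ones from the question.

```java
class WordCount {
    /** Counts whitespace-separated words in one line; blank lines count as 0. */
    static int countWords(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return 0; // "".split("\\s+") would return one empty string
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String[] lines = {"hi", "I'm Katie", "and I have two cats."};
        int totalCount = 0;
        for (String line : lines) {
            int count = countWords(line);
            System.out.println("The number of words in '" + line + "' is: " + count);
            totalCount += count;
        }
        System.out.println("The total number of words is " + totalCount);
    }
}
```

In the file-reading loop you would call countWords(line) in place of line.split("\\s+").length.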

need help about reading numbers inside file

First I create a txt file (a.txt) -- DONE
Create 10 random numbers in a range (for example from 5 to 10) -- DONE
Write these numbers to the txt file -- DONE
Check whether they were written or not -- DONE
Now I need to find: how many numbers there are, the biggest, the smallest, and the sum of the numbers.
But I cannot open that file (a.txt) and search in it. I am only posting the last part; the other parts work. I need some help to understand. This code is also inside another method, not main.
Scanner keyboard = new Scanner(System.in);
boolean again = true;
int max = Integer.MIN_VALUE;
int min = Integer.MAX_VALUE;
int a = 0;
int count = 0;
System.out.println("Enter the filename to write into all analysis: ");
outputFileName = keyboard.nextLine();
File file2 = new File(outputFileName);
if (file2.exists()) {
System.out.println("The file " + outputFileName +
" already exists. Will re-write its content");
}
try {
PrintWriter yaz = new PrintWriter(file2);
// formulas here. created file a.txt need to search into that file biggest smallest and sum of numbers
yaz.println("Numeric data file name: " + inputFileName);
yaz.println("Number of integer: " + numLines);
yaz.println("The total of all integers in file: " + numLines); //fornow
yaz.println("The largest integer in the set: " + max);
yaz.println("The smallest integer in the set " + min);
yaz.close();
System.out.println("Data written to the file.");
} catch (Exception e) {
System.out.printf("ERROR reading from file %s!\n", inputFileName);
System.out.printf("ERROR Message: %s!\n", e.getMessage());
}
So you want code that reads a text file and gives you the biggest value, the smallest value and the average.
You can use the Scanner class for that, with hasNextInt() to find the integers:
File f = new File("F:/some_text_file.txt"); // input your text file here
if(f.exists()){
try{
Scanner sc = new Scanner(f);
int max = Integer.MIN_VALUE;
int min = Integer.MAX_VALUE;
int temp=0, i=0;
double sum=0;
while(sc.hasNextInt()){
temp = sc.nextInt();
if(temp>max) max = temp;
if(temp<min) min =temp;
sum+=(double) temp;
i++;
}
System.out.println("average : " +sum/i);
System.out.println("large : "+max);
System.out.println("small :"+min);
sc.close();
}catch(Exception e){
e.printStackTrace();
}
}
See if this works
You need to read the file into memory. One way to do that is to move the text of the file into a String.
This post will help you: Reading a plain text file in Java
Here's the relevant code:
try(BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
String everything = sb.toString();
}
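Once the text is in a String, the count, sum, largest and smallest values the question asks for can be collected in one pass. This is only a sketch: the class name NumberStats and the sample numbers are invented for illustration, and it uses java.util.IntSummaryStatistics rather than the hand-written max/min variables from the question.

```java
import java.util.IntSummaryStatistics;
import java.util.Scanner;

class NumberStats {
    /** Collects count, sum, min and max of all integers found in the text. */
    static IntSummaryStatistics summarize(String text) {
        IntSummaryStatistics stats = new IntSummaryStatistics();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                if (sc.hasNextInt()) {
                    stats.accept(sc.nextInt());
                } else {
                    sc.next(); // skip any token that is not an integer
                }
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        // Hypothetical content of a.txt, e.g. the String built by the loop above.
        String everything = "5 7 10\n6 9\n";
        IntSummaryStatistics stats = summarize(everything);
        System.out.println("Number of integers: " + stats.getCount());
        System.out.println("The total of all integers in file: " + stats.getSum());
        System.out.println("The largest integer in the set: " + stats.getMax());
        System.out.println("The smallest integer in the set: " + stats.getMin());
    }
}
```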

Read specific data from a .txt file JAVA

I have a problem. I'm trying to read a large .txt file, but I don't need every piece of data that's inside.
My .txt file looks something like this:
8000000 abcdefg hijklmn word word letter
I only need, let's say, the number and the first two text positions: "abcdefg" and "hijklmn" and write it to another file after that. I don't know how to read and write just the data that I need.
Here is my code so far:
BufferedReader br = new BufferedReader(new FileReader("position2.txt"));
BufferedWriter bw = new BufferedWriter(new FileWriter("position.txt"));
String line;
while ((line = br.readLine())!= null){
if(line.isEmpty() || line.trim().equals("") || line.trim().equals("\n")){
continue;
}else{
//bw.write(line + "\n");
String[] data = line.split(" ");
bw.write(data[0] + " " + data[1] + " " + data[2] + "\n");
}
}
br.close();
bw.close();
}
Can you give me some suggestions?
Thanks in advance
UPDATE:
My .txt files are a bit weird. The code above works great when there is only a single " " between the fields. My files can have a \t, several spaces, or a \t and some spaces between the words. How can I proceed now?
Depending on the complexity of your data, you have a few options.
If the lines are simple space-separated values like shown, the simplest is to split the text, and write the values you want to keep to the new file:
try (BufferedReader br = new BufferedReader(new FileReader("text.txt"));
BufferedWriter bw = new BufferedWriter(new FileWriter("data.txt"))) {
String line;
while ((line = br.readLine()) != null) {
String[] values = line.split(" ");
if (values.length >= 3)
bw.write(values[0] + ' ' + values[1] + ' ' + values[2] + '\n');
}
}
If the values might be more complex, you could use a regular expression:
Pattern p = Pattern.compile("^(\\d+ \\w+ \\w+)");
try (BufferedReader br = new BufferedReader(new FileReader("text.txt"));
BufferedWriter bw = new BufferedWriter(new FileWriter("data.txt"))) {
String line;
while ((line = br.readLine()) != null) {
Matcher m = p.matcher(line);
if (m.find())
bw.write(m.group(1) + '\n');
}
}
This ensures that first value is digits only, and second and third values are word-characters only (a-z A-Z _ 0-9).
Assuming all lines of your text file follow the structure you described then you could do this:
Replace FILE_PATH with your actual file path.
public static void main(String[] args) {
try {
Scanner reader = new Scanner(new File("FILE_PATH/myfile.txt"));
PrintWriter writer = new PrintWriter(new File("FILE_PATH/myfile2.txt"));
while (reader.hasNextLine()) {
String line = reader.nextLine();
String[] tokens = line.split(" ");
writer.println(tokens[0] + ", " + tokens[1] + ", " + tokens[2]);
}
writer.close();
reader.close();
} catch (FileNotFoundException ex) {
System.out.println("Error: " + ex.getMessage());
}
}
You'll get something like:
word0, word1, word2
If your files are really huge (more than 50-100 MB, maybe GBs), you are sure that the first word is a number, and you need the two words after it, I would suggest reading one line at a time and iterating through that string. Stop when you find the third space.
String str = br.readLine(); // br is the BufferedReader from the question
int num_spaces = 0, cnt = 0;
String[] arr = {"", "", ""}; // start from empty strings; null elements would prepend the text "null"
while (num_spaces < 3 && cnt < str.length()) {
    if (str.charAt(cnt) == ' ') {
        num_spaces++; // a space ends the current word
    } else {
        arr[num_spaces] += str.charAt(cnt);
    }
    cnt++; // advance, otherwise the loop never terminates
}
If your data is only a couple of MB, or has a lot of numbers inside, there is no need to iterate char by char. Just read line by line, split each line, and then use the words as already mentioned:
else {
String[] res = line.split(" ");
bw.write(res[0] + " " + res[1] + " " + res[2] + "\n"); // the first three words...
}
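For the UPDATE in the question (tabs, several spaces, or a mix between the fields), one option is to split on the regex \\s+ instead of a single space. A minimal sketch, assuming each useful line has at least three fields; the class name FirstThreeFields and the helper firstThree are invented for this example.

```java
class FirstThreeFields {
    /**
     * Returns the first three whitespace-separated fields joined by single
     * spaces, or null when the line has fewer than three fields.
     */
    static String firstThree(String line) {
        // trim() avoids an empty first element when the line starts with whitespace;
        // \\s+ matches any run of spaces and tabs.
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 3) {
            return null;
        }
        return parts[0] + " " + parts[1] + " " + parts[2];
    }

    public static void main(String[] args) {
        String line = "8000000\tabcdefg   hijklmn \t word word letter";
        System.out.println(firstThree(line));
    }
}
```

Inside the read loop you would write the result to the BufferedWriter whenever firstThree(line) is not null.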

Why does my Java program add only half a list of numbers in a file?

I have a space delimited list of 36 numbers in a single line in a file that I am trying to read into an array. My program reads the entire line, but adds only 18 of the numbers. Does anyone see the reason?
Thank you.
StringTokenizer st;
try{
BufferedReader br = new BufferedReader(
new FileReader("Scores.txt"));
String line = br.readLine();
double avg = 0.0;
double sum = 0.0;
int count = 0;
while (line!=null)
{
st = new StringTokenizer(line);
System.out.println("Total tokens : " + st.countTokens());
for(int i = 0; i < st.countTokens(); i++)
{
avg += Double.parseDouble(st.nextToken());
count++;
System.out.println("i: " + i);
}
System.out.println(line);
line = br.readLine();
}
br.close();
sum = avg;
System.out.println("Sum: " + sum);
System.out.println("Count: " + count);
avg = avg/count;
System.out.println("Avg: " + avg);
}catch(Exception e)
st.countTokens() returns the number of tokens remaining, so the loop bound shrinks each time you call nextToken(). After 18 iterations i is 18 and 18 tokens are left, so the condition i < st.countTokens() fails and the loop stops. Instead, I suggest the pattern shown in the StringTokenizer documentation:
StringTokenizer st = new StringTokenizer("this is a test");
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
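The effect is easy to reproduce. In this sketch (the class name TokenLoopDemo and both helper methods are invented for illustration) the first method mimics the loop from the question and the second uses hasMoreTokens().

```java
import java.util.StringTokenizer;

class TokenLoopDemo {
    /** Mimics the question's loop: the bound countTokens() shrinks on every nextToken(). */
    static int buggyLoopCount(String line) {
        StringTokenizer st = new StringTokenizer(line);
        int consumed = 0;
        for (int i = 0; i < st.countTokens(); i++) {
            st.nextToken();
            consumed++;
        }
        return consumed;
    }

    /** Consumes tokens until none are left. */
    static int fixedLoopCount(String line) {
        StringTokenizer st = new StringTokenizer(line);
        int consumed = 0;
        while (st.hasMoreTokens()) {
            st.nextToken();
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        // Build a line of 36 space-separated numbers, as in the question.
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 36; i++) {
            sb.append(i).append(' ');
        }
        String line = sb.toString();
        System.out.println("for loop with countTokens(): " + buggyLoopCount(line));
        System.out.println("while loop with hasMoreTokens(): " + fixedLoopCount(line));
    }
}
```

If you prefer to keep the for loop, reading countTokens() into a local variable once, before the loop starts, also works.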
I suggest using String.split():
String[] lineNum;
int n;
while (line!=null)
{
lineNum = line.split(" ");
n = lineNum.length;
System.out.println("Total numbers : " + n);
for(int i = 0; i < n; i++)
{
avg += Double.parseDouble(lineNum[i]);
System.out.println("i: " + i);
}
count += n;
System.out.println(line);
line = br.readLine();
}

the counter does not working

The code I have been working on is supposed to increment a counter every time it finds a term in a file. The counter represents the number of documents containing the term.
System.out.println("Please enter the required word :");
Scanner scan2 = new Scanner(System.in);
String word2 = scan.nextLine();
String[] array2 = word2.split(" ");
for (int b = 0; b < array.length; b++) {
for (int i = 0; i < filename; i++) {
try {
BufferedReader in = new BufferedReader(new FileReader(
"C:\\Users\\user\\fypworkspace\\TextRenderer\\abc"
+ i + ".txt"));
int numDoc = 0;
int numofDoc = 0;
Scanner s2 = new Scanner(in);
{
while (s2.hasNext()) {
if (s2.next().equals(word2))
numDoc++;
}
}
if (numDoc > 0)
numofDoc++;
System.out.println("File containing the term is "
+ numofDoc);
} catch (IOException e) {
System.out.println("File not found.");
}
The output is :
Please enter the required word :
the
File containing the term is 1
File containing the term is 1
File containing the term is 1
File containing the term is 1
File containing the term is 1
File containing the term is 1
File not found
File containing the term is 1
File containing the term is 1
File containing the term is 1
File containing the term is 1
I would like the output to report that the number of files containing the term is 10.
Would you mind pointing out my mistake? Thanks.
Indent your code properly (in Eclipse, CTRL + SHIFT + F will do it for you).
Give sensible and explicit names to your variables. numDoc and numofDoc are too similar to avoid mistakes.
You are outputting the counter in the inner loop. Try moving your System.out.println("File containing the term is " + numofDoc); out of the second for loop (this is easy to spot if you indent your code properly). Also check that you are outputting the right variable.
Now that you print the result in the proper place, int numofDoc = 0; must also be declared outside the second for loop.
Additionally, you are using String.equals on each token returned by s2.next(), which only matches whole whitespace-delimited words. If you want substring matches instead, look at the documentation of String.contains.
I guess that numDoc represents the number of occurrences in the file and numofDoc represents the number of files.
The problem is that the variable int numofDoc = 0 is declared inside the for loop, so the counter is reset for every new file.
Declare int numofDoc = 0; before the two loops.
So you're setting the value back to 0 every time the loop body is executed.
Declare int numDoc = 0; int numofDoc = 0; outside the for loop.
On every iteration of the for loop they are initialized to 0 and then incremented to 1. That's why you always get 1.
I think you want to do this
public static void main(String[] args)
{
System.out.println("Please enter the required word :");
Scanner scan2 = new Scanner(System.in);
String word2 = scan.nextLine();
String[] array2 = word2.split(" ");
for ( int b = 0; b < array.length; b++ )
{
// Declare before the loop
int numofDoc = 0;
for ( int i = 0; i < filename; i++ )
{
try
{
BufferedReader in = new BufferedReader(new FileReader(
"C:\\Users\\user\\fypworkspace\\TextRenderer\\abc" + i + ".txt"));
int matchedWord = 0;
Scanner s2 = new Scanner(in);
{
while ( s2.hasNext() )
{
if ( s2.next().equals(word2) )
matchedWord++;
}
}
if ( matchedWord > 0 )
numofDoc++;
System.out.println("File containing the term is " + numofDoc);
}
catch ( IOException e )
{
System.out.println("File not found.");
}
}
}
}
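Putting the advice from the answers together, here is a self-contained sketch that declares the counter once and reports it once, after the loop over documents. To keep it runnable without the abc0.txt ... files, the documents are in-memory strings; the class name TermDocumentCount, both helper methods and the sample texts are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

class TermDocumentCount {
    /** True if any whitespace-delimited token in the text equals the term. */
    static boolean containsTerm(String text, String term) {
        try (Scanner s = new Scanner(text)) {
            while (s.hasNext()) {
                if (s.next().equals(term)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Counts how many documents contain the term at least once. */
    static int countDocuments(List<String> documents, String term) {
        int numOfDocs = 0; // declared once, before the loop over documents
        for (String doc : documents) {
            if (containsTerm(doc, term)) {
                numOfDocs++;
            }
        }
        return numOfDocs; // reported once, after the loop
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the contents of the files.
        List<String> documents = Arrays.asList("the cat sat", "a dog", "over the hill", "theory");
        System.out.println("Files containing the term: " + countDocuments(documents, "the"));
    }
}
```

With real files you would build a Scanner over each file inside containsTerm and keep the catch for a missing file around that call, so one missing file does not stop the count.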
