Count all characters in a file including \n etc


I am trying to iterate through a text file and count all of its characters, including newline characters (\n) and everything else. I can only read through the file once. I am also recording letter frequency, the number of lines, the number of words, and so on. I can't figure out where to count the total number of characters (see the code below). I know I need to do it before I use the StringTokenizer (which I am required to use). I have tried several approaches without success. Any help would be appreciated. Thanks in advance.
Note: my variable numChars counts only alphabetic characters (a, b, c, etc.).
Edit: posting the class variables to make the code easier to follow.
private final int NUMCHARS = 26;
private int[] characters = new int[NUMCHARS];
private final int WORDLENGTH = 23;
private int[] wordLengthCount = new int[WORDLENGTH];
private int numChars = 0;
private int numWords = 0;
private int numLines = 0;
private int numTotalChars = 0;
DecimalFormat df = new DecimalFormat("#.##");
public void countLetters(Scanner scan) {
char current;
//int word;
String token1;
while (scan.hasNext()) {
String line = scan.nextLine().toLowerCase();
numLines++;
StringTokenizer token = new StringTokenizer(line,
" , .;:'\"&!?-_\n\t12345678910[]{}()##$%^*/+-");
for (int w = 0; w < token.countTokens(); w++) {
numWords++;
}
while (token.hasMoreTokens()) {
token1 = token.nextToken();
if (token1.length() >= wordLengthCount.length) {
wordLengthCount[wordLengthCount.length - 1]++;
} else {
wordLengthCount[token1.length() - 1]++;
}
}
for (int ch = 0; ch < line.length(); ch++) {
current = line.charAt(ch);
if (current >= 'a' && current <= 'z') {
characters[current - 'a']++;
numChars++;
}
}
}
}

Use string.toCharArray(), something like this (note that nextLine() strips the line terminator, so add 1 per line if newlines must be counted):
while (scan.hasNextLine()) {
    String line = scan.nextLine();
    numberchars += line.toCharArray().length;
    // ...
}
An alternative is to use String.length() directly:
while (scan.hasNextLine()) {
    String line = scan.nextLine();
    numberchars += line.length();
    // ...
}
Using a BufferedReader, which also counts the line terminators, you can do it like this:
BufferedReader reader = new BufferedReader(
new InputStreamReader(
new FileInputStream(file), charsetName));
int charCount = 0;
while (reader.read() > -1) {
charCount++;
}
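The approaches above can be combined into a single pass that still hands each line to a StringTokenizer, as the question requires. The following is a minimal sketch, not the asker's actual class: the name SinglePassCounts, its fields, and the delimiter string are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.StringTokenizer;

final class SinglePassCounts {
    long totalChars; // every char read, including '\n' and '\r'
    int lines;
    int words;

    /** Reads the source once, char by char, and fills in all three counters. */
    static SinglePassCounts count(Reader source) throws IOException {
        SinglePassCounts c = new SinglePassCounts();
        StringBuilder line = new StringBuilder();
        BufferedReader in = new BufferedReader(source);
        int ch;
        while ((ch = in.read()) != -1) {
            c.totalChars++;              // counted before any tokenizing happens
            if (ch == '\n') {
                c.finishLine(line);      // a newline closes the current line
            } else {
                line.append((char) ch);
            }
        }
        if (line.length() > 0) {
            c.finishLine(line);          // last line without a trailing newline
        }
        return c;
    }

    /** Counts the words on one completed line, then clears the buffer. */
    private void finishLine(StringBuilder line) {
        lines++;
        // Hypothetical delimiter set; the question uses a longer one.
        words += new StringTokenizer(line.toString(), " \t\r,.;:!?").countTokens();
        line.setLength(0);
    }
}
```

For a file, something like SinglePassCounts.count(Files.newBufferedReader(path)) could serve as the source. Counting the character before deciding whether it ends a line is what keeps the newline in the total.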

I would read the file char by char with a BufferedReader and use a Guava Multiset to count the chars:
BufferedReader rdr = Files.newBufferedReader(path, charSet);
HashMultiset<Character> ms = HashMultiset.create();
for (int c; (c = rdr.read()) != -1;) {
    ms.add((char) c);
}
for (Multiset.Entry<Character> e : ms.entrySet()) {
    char c = e.getElement();
    int n = e.getCount();
}
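If adding Guava is not an option, the same per-character tally can be kept in a plain java.util.Map. This is a sketch of that swapped-in technique, not the Multiset approach itself; the class name CharFrequency is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Map;
import java.util.TreeMap;

final class CharFrequency {
    /** Counts every char from the reader, including whitespace and line terminators. */
    static Map<Character, Integer> count(Reader in) throws IOException {
        Map<Character, Integer> counts = new TreeMap<>(); // TreeMap keeps keys sorted
        int c;
        while ((c = in.read()) != -1) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the old count
            counts.merge((char) c, 1, Integer::sum);
        }
        return counts;
    }
}
```

Summing the map's values gives the total number of characters, so one loop yields both the frequency table and the overall count. Pass a buffered reader to avoid slow single-char reads on a raw stream.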

Related

Trying to determine how to read input from console that could be over 10,000 characters in length

I'm currently trying to determine how to use a BufferedReader to read from the console. I know the correct syntax for reading from the console, and the program works for smaller inputs. However, any input longer than 5118 characters is truncated, and the console itself will not print any text longer than 5118 characters. The goal is a Java program that reads from the console regardless of the size of the data being read.
The following is the code I have created.
package countAnagrams;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.*;
public class TestClass {
public static int check_For_Missing_Characters(String a1, String b1){
int first_String_Length = a1.length();
int missing_Characters = 0;
for( int y = 0; y < first_String_Length; y++ ){
final char character_To_Check_String = a1.charAt(y);
if ( b1.chars().filter(ch -> ch ==
character_To_Check_String).count() == 0 ){
missing_Characters+=1;
}
}
return missing_Characters;
}
public static int check_For_Duplicate_Characters(String a1, String
b1){
int first_String_Length = a1.length();
int duplicat_Characters = 0;
String found_Characters = "";
for( int y = 0; y < first_String_Length; y++ ){
final char current_Character_To_Check = a1.charAt(y);
long first_String_Count = b1.chars().filter(ch -> ch ==
current_Character_To_Check).count();
long second_String_Count = a1.chars().filter(ch -> ch ==
current_Character_To_Check).count();
long found_String_Count = found_Characters.chars().filter(ch -> ch == current_Character_To_Check).count();
if ( first_String_Count > 0 && second_String_Count > 0 && found_String_Count == 0 ){
duplicat_Characters+=Math.abs(first_String_Count - second_String_Count);
found_Characters = found_Characters +
current_Character_To_Check;
}
}
return duplicat_Characters;
}
public static void main(String[] args) throws Exception {
// TODO Auto-generated method stub
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
int test_Case_Count = Integer.parseInt(br.readLine()); // Reading input from STDIN
for(int x = 0; x < test_Case_Count; x++ ){
int total_Count_Of_Diff_Chars = 0;
StringBuilder first_StringBuilder = new StringBuilder();
int first_String = '0';
while(( first_String = br.read()) != -1 ) {
first_StringBuilder.append((char) first_String );
}
StringBuilder second_StringBuilder = new StringBuilder();
String second_String = "";
while((( second_String = br.readLine()) != null )){
second_StringBuilder.append(second_String);
}
total_Count_Of_Diff_Chars = total_Count_Of_Diff_Chars +
check_For_Missing_Characters(first_StringBuilder.toString(),
second_StringBuilder.toString());
total_Count_Of_Diff_Chars = total_Count_Of_Diff_Chars +
check_For_Missing_Characters(second_StringBuilder.toString(),
first_StringBuilder.toString());
total_Count_Of_Diff_Chars = total_Count_Of_Diff_Chars +
check_For_Duplicate_Characters(second_StringBuilder.toString(),
first_StringBuilder.toString());
System.out.println(total_Count_Of_Diff_Chars);
}
br.close();
}
}
The above code works for input shorter than 5118 characters. I would like to understand what is needed to make it read beyond the 5118-character limit. I'm not sure whether the page I am using is causing the limit or whether I'm missing something. Note that this is written in Java.
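Separately from the truncation issue, the three helper calls in main appear to add up to the sum, over every character, of the absolute difference between its counts in the two strings. Assuming that reading is right, the same number can be computed with one pass per string. The sketch below uses hypothetical names (AnagramDiff, difference) and does not address the 5118-character limit.

```java
import java.util.HashMap;
import java.util.Map;

final class AnagramDiff {
    /** Sum over all chars of |count in a - count in b|. */
    static int difference(String a, String b) {
        Map<Character, Integer> balance = new HashMap<>();
        for (char c : a.toCharArray()) {
            balance.merge(c, 1, Integer::sum);   // +1 for each occurrence in a
        }
        for (char c : b.toCharArray()) {
            balance.merge(c, -1, Integer::sum);  // -1 for each occurrence in b
        }
        int total = 0;
        for (int v : balance.values()) {
            total += Math.abs(v);
        }
        return total;
    }
}
```

In main, each string would then be read with br.readLine(). The posted loop on br.read() != -1 only ends at end of input, so it consumes everything and leaves nothing for the second string.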

Counting the number of each character in a file

I'm reading the contents of a text file char by char, sorting the chars in ascending order, and counting the number of times each char occurs. When I run the program my numbers are way off; for example, there are 7 'A's in the file, but I get 17. I think this means something is wrong either with my counting or with the way I'm reading the chars. Any ideas on what is wrong?
public class CharacterCounts {
public static void main(String[] args) throws IOException{
String fileName = args[0];
BufferedReader in = new BufferedReader(new FileReader(new File(fileName)));
ArrayList<Character> vals = new ArrayList<Character>();
ArrayList<Integer> valCounts = new ArrayList<Integer>();
while(in.read() != -1){
vals.add((char)in.read());
}
Collections.sort(vals);
//This counts how many times each char occurs,
//resets count to 0 upon finding a new char.
int count = 0;
for(int i = 1; i < vals.size(); i++){
if(vals.get(i - 1) == vals.get(i)){
count++;
} else {
valCounts.add(count + 1);
count = 0;
}
}
//Removes duplicates from vals by moving from set then back to ArrayList
Set<Character> hs = new HashSet<Character>();
hs.addAll(vals);
vals.clear();
vals.addAll(hs);
//System.out.print(vals.size() + "," + valCounts.size());
for(int i = 0; i < vals.size(); i++){
//System.out.println(vals.get(i));
System.out.printf("'%c' %d\n", vals.get(i), valCounts.get(i));
}
}
}

When you write
if(vals.get(i - 1) == vals.get(i)){
you are comparing two Character objects by reference, not by value, so the test can be false even when the two characters are equal. You have to compare their values.
You want
if(vals.get(i - 1).equals(vals.get(i))){
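Applied to the sort-then-count approach from the question, that fix might look like the sketch below. It compares with equals() and also records the final run, which the posted loop never adds to valCounts. The names SortedRunCounter and countRuns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SortedRunCounter {
    /** Sorts a copy of the chars and counts the length of each run of equal values. */
    static Map<Character, Integer> countRuns(List<Character> chars) {
        List<Character> sorted = new ArrayList<>(chars);
        Collections.sort(sorted);
        Map<Character, Integer> counts = new LinkedHashMap<>(); // keeps sorted order
        int run = 0;
        for (int i = 0; i < sorted.size(); i++) {
            run++;
            // equals() compares values; == would compare object references
            boolean lastOfRun = i == sorted.size() - 1
                    || !sorted.get(i).equals(sorted.get(i + 1));
            if (lastOfRun) {
                counts.put(sorted.get(i), run);
                run = 0;
            }
        }
        return counts;
    }
}
```

Storing the char and its count together in one map also avoids the separate de-duplication step through a HashSet, whose iteration order is not guaranteed to match the sorted counts.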

I think you are overcomplicating your count logic. In addition, you call read() twice per loop iteration, so you are skipping every other value.
int[] counts = new int[256]; // one slot per value; assumes every char in the file is below 256
int i;
while ((i = in.read()) != -1) { // Note you should only be calling read once for each value
counts[i]++;
}
System.out.println(counts['a']);

Why not use a regex instead? The code becomes simpler and more flexible. Have a look at the code below:
...
final BufferedReader reader = new BufferedReader(new FileReader(filename));
final StringBuilder contents = new StringBuilder();
//read content in a string builder
while(reader.ready()) {
contents.append(reader.readLine());
}
reader.close();
Map<Character,Integer> report = new TreeMap<>();
//init a counter
int count = 0;
//Iterate the chars from 'a' to 'z'
for(char a = 'a'; a <= 'z'; a++ ){
String c = Character.toString(a);
//skip not printable char
if(c.matches("\\W"))
continue;
String C = c.toUpperCase();
//match uppercase and lowercase char
Pattern pattern = Pattern.compile("[" + c + C +"]", Pattern.MULTILINE);
Matcher m = pattern.matcher(contents.toString());
while(m.find()){
count++;
}
if(count>0){
report.put(a, count);
}
//reset the counter
count=0;
}
System.out.println(report);
...

Reading String from the File

What am I doing wrong in the code below when reading the String?
Suppose the following String is passed as a layout:
String m = "..z\n"+
"...\n"+
"...\n"+
"...\n"+
"z..\n"+
"";
My method should return the same layout, but it returns nothing and does not print anything. Please do not suggest using StringBuilder or something similar. Can somebody please help me out with this?
public static Shape makeShape(String layout,char displayChar)
{
Shape result;
int height = 0;
int width = 0;
Scanner data = new Scanner(layout);
char[][] temp;
while(data.hasNextLine())
{
String line = data.nextLine();
height = line.length();
width++;
}
temp = new char[height][width];
Scanner data2 = new Scanner(layout);
while(data.hasNextLine())
{
String line2 = data.nextLine();
if(line2.charAt(0) == '.' && line2.charAt(width) == '.')
throw new FitItException("Empty borders!");
else {
for (int r = 0; r < height; r++)
for (int c = 0; c < width; c++) {
//System.out.println(line2.charAt(c));
if (temp[r][c] == '.') {
temp[r][c] = displayChar;
}
System.out.println(line2.charAt(temp[r][c]));
}
}
}
result = new CreateShape(height, width, displayChar);
return result;
}

Hint: look carefully at these two lines:
Scanner data2 = new Scanner(layout);
while(data.hasNextLine())
Do you see something wrong with ... the ... variable ... names ... ?

Given the code and string above, width would be 5 or 6, because it is incremented once per line (height and width appear to be swapped). No line is longer than 3 characters, so wouldn't line2.charAt(width) throw an exception?
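Putting the two observations together, the parsing step could be sketched as follows. Shape, CreateShape and FitItException are not shown in the question, so this sketch stops at a char[][] grid; the names LayoutGrid and parse are hypothetical.

```java
import java.util.Scanner;

final class LayoutGrid {
    /** Turns a newline-separated layout into a grid; non-'.' cells become displayChar. */
    static char[][] parse(String layout, char displayChar) {
        int height = 0; // number of lines
        int width = 0;  // length of the longest line
        try (Scanner first = new Scanner(layout)) {
            while (first.hasNextLine()) {
                width = Math.max(width, first.nextLine().length());
                height++;
            }
        }
        char[][] grid = new char[height][width];
        // The second pass must read from its own scanner, not the exhausted first one.
        try (Scanner second = new Scanner(layout)) {
            for (int r = 0; second.hasNextLine(); r++) {
                String line = second.nextLine();
                for (int c = 0; c < width; c++) {
                    char ch = c < line.length() ? line.charAt(c) : '.';
                    grid[r][c] = (ch == '.') ? '.' : displayChar;
                }
            }
        }
        return grid;
    }
}
```

Each line is consumed exactly once in the second pass, and the column index never goes past line.length() - 1, which avoids the charAt(width) exception.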

Java - make new string based on old one and lag

I need to get a new string based on an old one and a lag. Basically, I have a string with the alphabet (s = "abc...xyz") and, based on a lag (e.g. 3), each character of the string I type should be replaced with the character that many positions forward in the alphabet. If, let's say, I type "cde" as my string, the output should be "fgh". Any other character in the string (apart from the space, " ") should be removed. Here is what I tried, but it doesn't work:
String code = "abcdefghijklmnopqrstuvwxyzabcd"; //my lag is 4 and I added the first 4 characters to
char old; //avoid OutOfRange issues
char nou;
for (int i = 0; i < code.length() - lag; ++i)
{
old = code.charAt(i);
//System.out.print(old + " ");
nou = code.charAt(i + lag);
//System.out.println(nou + " ");
// if (s.indexOf(old) != 0)
// {
s = s.replace(old, nou);
// }
}
I commented out the output statements for old and nou ("new", but that is a reserved word) because I used them only to test whether the mapping from position i to i + lag works (and it does). If I uncomment the if statement, the code doesn't do anything. If I leave it as it is, it executes the body of the for statement code.length() times, even though my string is not that long. I have also tried writing the for statement as below, but I got lost.
for (int i = 0; i < s.length(); ++i)
{
....
}
Could you help me with this? Or maybe give me some advice on how I should approach the algorithm?
Thanks!

It doesn't work because, as the javadoc of replace() says:
Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar.
(emphasis mine)
So, the first time you meet an 'a' in the string, you replace all the 'a's by 'd'. But then you go to the next char, and if it's a 'd' that was an 'a' before, you replace it once again, etc. etc.
You shouldn't use replace() at all. Instead, you should simply build a new string, using a StringBuilder, by appending each shifted character of the original string:
String dictionary = "abcdefghijklmnopqrstuvwxyz";
StringBuilder sb = new StringBuilder(input.length());
for (int i = 0; i < input.length(); i++) {
char oldChar = input.charAt(i);
int oldCharPositionInDictionary = dictionary.indexOf(oldChar);
if (oldCharPositionInDictionary >= 0) {
int newCharPositionInDictionary =
(oldCharPositionInDictionary + lag) % dictionary.length();
sb.append(dictionary.charAt(newCharPositionInDictionary));
}
else if (oldChar == ' ') {
sb.append(' ');
}
}
String result = sb.toString();
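To make the cascade concrete, the sketch below puts a replace()-based loop next to a version that shifts each character exactly once. The class and method names are hypothetical, and the once-only version uses character arithmetic instead of a dictionary string, which is an assumption that the alphabet is plain a to z.

```java
final class ReplaceCascadeDemo {
    /** Applies replace() for each letter in turn, similar to the loop in the question. */
    static String naiveShift(String s, int lag) {
        for (char old = 'a'; old + lag <= 'z'; old++) {
            // Replaces ALL occurrences, including letters produced by earlier steps.
            s = s.replace(old, (char) (old + lag));
        }
        return s;
    }

    /** Shifts each letter once, keeps spaces, and drops everything else. */
    static String shiftOnce(String s, int lag) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                sb.append((char) ('a' + (c - 'a' + lag) % 26)); // wraps past 'z'
            } else if (c == ' ') {
                sb.append(' ');
            }
        }
        return sb.toString();
    }
}
```

With a lag of 3, naiveShift("ad", 3) returns "yy", because the 'a' becomes 'd' and is then shifted again on every later step, while shiftOnce("ad", 3) returns the expected "dg".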

Try this:
Convert the string to a char array.
Iterate over the char array and change each char by adding the lag.
Create the new String just once (instead of inside the loop) by passing the char array to new String.
String code = "abcdefghijklmnopqrstuvwxyzabcd";
String s = "abcdef";
char[] ch = s.toCharArray();
char[] codes = code.toCharArray();
for (int i = 0; i < ch.length; ++i)
{
    ch[i] = codes[ch[i] - 'a' + 3]; // lag of 3
}
String str = new String(ch);
System.out.println(str);

My answer is something like this.
It shifts every character forward by one position.
It also prints the reversed string.
Have a good day!
package org.owls.sof;
import java.util.Scanner;
public class Main {
private static final String CODE = "abcdefghijklmnopqrstuvwxyz";
@SuppressWarnings("resource")
public static void main(String[] args) {
System.out.print("insert alphabet >> ");
Scanner scanner = new Scanner(System.in);
String s = scanner.next();
char[] char_arr = s.toCharArray();
for(int i = 0; i < char_arr.length; i++){
int order = CODE.indexOf(char_arr[i]) + 1;
if(order%CODE.length() == 0){
char_arr[i] = CODE.charAt(0);
}else{
char_arr[i] = CODE.charAt(order);
}
}
System.out.println(new String(char_arr));
//reverse
System.out.println(reverse(new String(char_arr)));
}
private static String reverse (String str) {
char[] char_arr = str.toCharArray();
for(int i = 0; i < char_arr.length/2; i++){
char tmp = char_arr[i];
char_arr[i] = char_arr[char_arr.length - i - 1];
char_arr[char_arr.length - i - 1] = tmp;
}
return new String(char_arr);
}
}

String alpha = "abcdefghijklmnopqrstuvwxyzabcd"; // alphabet
int N = alpha.length();
int lag = 3; // shift value
String s = "cde"; // input
StringBuilder sb = new StringBuilder();
for (int i = 0, index; i < s.length(); i++) {
index = s.charAt(i) - 'a';
sb.append(alpha.charAt((index + lag) % N));
}
String op = sb.toString(); // output

Java Read Each Line Into Separate Array

I have 1,000 lines of data in a text file and I would like each line to be its own float [].
1,1,1,1,1,1
2,2,2,2,2,2
3,3,3,3,3,3
Would result in:
float[0] = {1,1,1,1,1,1}
float[1] = {2,2,2,2,2,2}
float[2] = {3,3,3,3,3,3}
Is this possible? I could only find examples of loading an entire file into a single array. I tried hardcoding all the arrays, but exceeded the size limit of ~65,000 bytes.

Try the following:
// this list will store all the created arrays
List<float[]> arrays = new ArrayList<float[]>();
// use a BufferedReader to get the handy readLine() function
BufferedReader reader = new BufferedReader(new FileReader("myfile.txt"));
// this reads in all the lines. If you only want the first thousand, just
// replace these loop conditions with a regular counter variable
for (String line = reader.readLine(); line != null; line = reader.readLine()) {
String[] floatStrings = line.split(",");
float[] floats = new float[floatStrings.length];
for (int i = 0; i < floats.length; ++i) {
floats[i] = Float.parseFloat(floatStrings[i]);
}
arrays.add(floats);
}
Note that I haven't added any exception handling (readLine(), for example, throws IOException).
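One way to fill in the missing exception handling is to let IOException propagate to the caller and use try-with-resources so the reader is always closed. This is only a sketch: the class name FloatLines, the method read, and the choice to skip blank lines are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

final class FloatLines {
    /** Parses each non-blank, comma-separated line into its own float[]. */
    static List<float[]> read(Reader source) throws IOException {
        List<float[]> rows = new ArrayList<>();
        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines carry no data
                }
                String[] parts = line.split(",");
                float[] row = new float[parts.length];
                for (int i = 0; i < parts.length; i++) {
                    // throws NumberFormatException on malformed input
                    row[i] = Float.parseFloat(parts[i].trim());
                }
                rows.add(row);
            }
        }
        return rows;
    }
}
```

A call such as FloatLines.read(new FileReader("myfile.txt")) would then return one float[] per line, with rows.get(0) holding the first line's values.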

Use a LineIterator (from Apache Commons IO) to read each line without loading the whole file.
For each line, use a regular expression such as \d+(\.\d+)? to extract the numbers, and iterate over the matches with methods like find() and group().
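The regex step on its own can be sketched with java.util.regex alone. The pattern below is an assumption about the number format (optional minus sign, optional decimal part), and NumberExtractor is a hypothetical name. Reading the lines, whether with LineIterator or a BufferedReader, is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumberExtractor {
    // Assumed format: optional minus sign, digits, optional fractional part.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:\\.\\d+)?");

    /** Returns every number found in the line, in order of appearance. */
    static float[] extract(String line) {
        List<Float> found = new ArrayList<>();
        Matcher m = NUMBER.matcher(line);
        while (m.find()) {
            found.add(Float.parseFloat(m.group()));
        }
        float[] out = new float[found.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = found.get(i);
        }
        return out;
    }
}
```

Compared with split(","), this tolerates stray separators or spaces between the values, at the cost of silently skipping anything that does not look like a number.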

import java.io.FileReader;
public class Check {
    public static void main(String[] args) {
        readingfile();
    }
    public static void readingfile() {
        try {
            FileReader read = new FileReader("D:\\JavaWkspace\\numbers.txt");
            int index;
            String nums1 = "";
            while ((index = read.read()) != -1) {
                if (((char) index) != '\n') {
                    nums1 += String.valueOf((char) index);
                }
            }
            System.out.println("Problem statement: Print out the greatest number on each line:\n" + nums1);
            String f = nums1.substring(0, 14);
            String s = nums1.substring(15, 29);
            String t = nums1.substring(30);
            String[] fs = f.split(",");
            int size = fs.length;
            int[] arr = new int[size];
            for (int i = 0; i < size; i++) {
                arr[i] = Integer.parseInt(fs[i]);
            }
            int max = arr[0];
            for (int i = 0; i < arr.length; i++) {
                if (max < arr[i]) {
                    max = arr[i];
                }
            }
            System.out.println("\nGreatest number in the first line is:" + (max));
            String[] sstr = s.split(",");
            int size2 = sstr.length;
            int[] arr2 = new int[size2];
            for (int i = 0; i < size2; i++) {
                arr2[i] = Integer.parseInt(sstr[i]);
            }
            int max2 = arr2[0];
            for (int i = 0; i < arr2.length; i++) {
                if (max2 < arr2[i]) {
                    max2 = arr2[i];
                }
            }
            System.out.println("\nGreatest number in the second line is:" + (max2));
            String[] str3 = t.split(",");
            int size3 = str3.length;
            int[] arr3 = new int[size3];
            for (int i = 0; i < size3; i++) {
                arr3[i] = Integer.parseInt(str3[i]);
            }
            int max3 = arr3[0];
            for (int i = 0; i < arr3.length; i++) {
                if (max3 < arr3[i]) {
                    max3 = arr3[i];
                }
            }
            System.out.println("\nGreatest number in the third line is:" + (max3));
            read.close();
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}

Loop over the line-delimited contents of the file with .split("\n") and then convert each resulting line into a float array. Here is how to convert a string into a float: http://www.devdaily.com/java/edu/qanda/pjqa00013.shtml
