Counting the number of each character in a file - java

I'm reading the contents of a text file char by char, sorting the chars in ascending order, and counting the number of times each char occurs. When I run the program my numbers are way off; for example, there are 7 'A's in the file, but I get 17. I think this means something is wrong either with my counting or with the way I'm reading the chars. Any ideas on what is wrong?
public class CharacterCounts {
public static void main(String[] args) throws IOException{
String fileName = args[0];
BufferedReader in = new BufferedReader(new FileReader(new File(fileName)));
ArrayList<Character> vals = new ArrayList<Character>();
ArrayList<Integer> valCounts = new ArrayList<Integer>();
while(in.read() != -1){
vals.add((char)in.read());
}
Collections.sort(vals);
//This counts how many times each char occures,
//resets count to 0 upon finding a new char.
int count = 0;
for(int i = 1; i < vals.size(); i++){
if(vals.get(i - 1) == vals.get(i)){
count++;
} else {
valCounts.add(count + 1);
count = 0;
}
}
//Removes duplicates from vals by moving from set then back to ArrayList
Set<Character> hs = new HashSet<Character>();
hs.addAll(vals);
vals.clear();
vals.addAll(hs);
//System.out.print(vals.size() + "," + valCounts.size());
for(int i = 0; i < vals.size(); i++){
//System.out.println(vals.get(i));
System.out.printf("'%c' %d\n", vals.get(i), valCounts.get(i));
}
}
}

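When you write
if(vals.get(i - 1) == vals.get(i)){
you are comparing two Character object references, not the characters they hold. Autoboxing only guarantees shared instances for values in the range 0 to 127, so == can appear to work for ASCII text and then fail for other characters. Compare the values instead:
if(vals.get(i - 1).equals(vals.get(i))){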

I think you are overcomplicating your count logic. In addition, you call read() twice in the loop, so you are skipping every other value.
int[] counts = new int[256]; // one slot per char value 0-255; assumes the file contains no chars above 255
int i;
while ((i = in.read()) != -1) { // Note you should only be calling read once for each value
counts[i]++;
}
System.out.println(counts['a']);
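As a minimal sketch of the single-read loop above, the following variant keeps the counts in a TreeMap instead of a fixed-size array, so it is not limited to 256 values and the result comes out sorted by character. The class name CharCounter and the method countChars are hypothetical names chosen for this illustration; a StringReader stands in for the file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only.
final class CharCounter {

    // Counts every char delivered by the reader, calling read() exactly once per char.
    static Map<Character, Integer> countChars(Reader in) {
        Map<Character, Integer> counts = new TreeMap<>(); // keys stay in ascending order
        try {
            int c;
            while ((c = in.read()) != -1) {
                counts.merge((char) c, 1, Integer::sum); // add 1 to the count for this char
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A FileReader wrapped in a BufferedReader could be passed in the same way.
        System.out.println(countChars(new StringReader("banana"))); // {a=3, b=1, n=2}
    }
}
```

Because the map is keyed by the character itself, no separate sorting or de-duplication step is needed afterwards.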

Why not use a regex instead? The code will be simpler and more flexible. Have a look at the code below:
...
final BufferedReader reader = new BufferedReader(new FileReader(filename));
final StringBuilder contents = new StringBuilder();
//read content in a string builder
while(reader.ready()) {
contents.append(reader.readLine());
}
reader.close();
Map<Character,Integer> report = new TreeMap<>();
//init a counter
int count = 0;
//Iterate the chars from 'a' to 'z' (inclusive)
for(char a = 'a'; a <= 'z'; a++ ){
String c = Character.toString(a);
//skip not printable char
if(c.matches("\\W"))
continue;
String C = c.toUpperCase();
//match uppercase and lowercase char
Pattern pattern = Pattern.compile("[" + c + C +"]", Pattern.MULTILINE);
Matcher m = pattern.matcher(contents.toString());
while(m.find()){
count++;
}
if(count>0){
report.put(a, count);
}
//reset the counter
count=0;
}
System.out.println(report);
...

Related

Sort by number of appearances

For example, I am given a word and have to sort its letters by the number of occurrences in that word; if two letters appear the same number of times, the lexicographically smaller one comes first.
So far I have counted how many times each letter appears in the word, but from here I do not know exactly how to proceed.
The problem requires me to use BufferedReader and BufferedWriter.
import java.util.*;
public class Main {
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
Map<Character, Integer> m = new HashMap<>();
String s = sc.nextLine();
for (int i = 0; i < s.length(); ++i) {
char c = s.charAt(i);
if (m.containsKey(c))
m.put(c, m.get(c) + 1);
else
m.put(c, 1);
}
for (char letter = 'a'; letter <= 'z'; ++letter)
if (m.containsKey(letter))
System.out.println(letter + ": " + m.get(letter));
}
}
For now I print how often each letter appears in the word, but I do not know how to sort the letters by number of occurrences, breaking ties between equally frequent letters in lexicographic order.

I hope this is what you want
public static void main(String[] args) {
Map<Character, Integer> m = new HashMap<>();
String testString = "Instructions";
Map<Character, List<Character>> map = new HashMap<>();
for (int i = 0; i < testString.length(); i++) {
char someChar = testString.charAt(i);
if (someChar == ' ') {
continue;
}
char ch = testString.charAt(i);
List<Character> characters = map.getOrDefault(Character.toLowerCase(ch), new ArrayList<>());
characters.add(ch);
map.put(Character.toLowerCase(ch), characters);
}
List<Map.Entry<Character, List<Character>>> list = new ArrayList<>(map.entrySet());
list.sort((o1, o2) -> {
if (o1.getValue().size() == o2.getValue().size()) {
return o1.getKey() - o2.getKey();/// your lexicographic comparing
}
return o2.getValue().size() - o1.getValue().size();
});
list.forEach(entry -> entry.getValue().forEach(System.out::print));
}

To count the letters in a word, you can use a much simpler method:
define an array of 26 zeros, scan the input line, and increment the matching index in this array. So if you meet 'a' (or 'A', which is the same letter but a different symbol) you increment the value at index 0, for 'b' the value at index 1, and so on.
During this scan you can also track the most frequent symbol, like this:
public static void main(final String[] args) throws IOException {
char maxSymbol = 0;
int maxCount = 0;
final int[] counts = new int[26]; // number of times each letter (a-z) appears in string
try (final BufferedReader br = new BufferedReader(new InputStreamReader(System.in))) {
final String s = br.readLine().toLowerCase(); // calculate case-insensitive counts
for (final char c : s.toCharArray()) {
if (c < 'a' || c > 'z') continue; // skip anything outside a-z to avoid an out-of-bounds index
final int idx = c - 'a'; // convert from ASCII code to 0-based index
counts[idx]++;
if (counts[idx] > maxCount) {
maxSymbol = c; // we found most occurred symbol for the current moment
maxCount = counts[idx];
} else if (counts[idx] == maxCount) { // we found 2nd most occurred symbol for the current moment, need to check which one is minimal in lexicographical order
if (c < maxSymbol) {
maxSymbol = c;
}
}
}
}
if (maxSymbol > 0) {
System.out.println("Most frequent symbol " + maxSymbol + " occurred " + maxCount);
}
}
I've used a BufferedReader to get the data from stdin, but I'm not sure where a BufferedWriter fits in here; perhaps for printing the result?
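A minimal sketch of the full task as described in the question: count the letters, then order them by descending count with ties broken lexicographically, reading through a BufferedReader and writing through a BufferedWriter. The class name FrequencySort and the method sortByFrequency are hypothetical, and the output format (each distinct letter printed once, in sorted order) is an assumption, since the question does not spell it out.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only.
final class FrequencySort {

    // Returns the distinct letters a-z of word, most frequent first;
    // letters with equal counts are ordered alphabetically.
    static String sortByFrequency(String word) {
        int[] counts = new int[26];
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') { // ignore anything that is not a letter a-z
                counts[c - 'a']++;
            }
        }
        List<Character> letters = new ArrayList<>();
        for (char c = 'a'; c <= 'z'; c++) {
            if (counts[c - 'a'] > 0) {
                letters.add(c);
            }
        }
        letters.sort((x, y) -> {
            int byCount = Integer.compare(counts[y - 'a'], counts[x - 'a']); // higher count first
            return byCount != 0 ? byCount : Character.compare(x, y);        // then alphabetical
        });
        StringBuilder sb = new StringBuilder();
        for (char c : letters) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
             BufferedWriter out = new BufferedWriter(new OutputStreamWriter(System.out))) {
            String line = in.readLine();
            if (line != null) {
                out.write(sortByFrequency(line));
                out.newLine();
            }
        }
    }
}
```

For the input instructions, the letters i, n, s and t each occur twice and c, o, r and u once, so the sketch writes instcoru.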

Reverse a string without use of library function in java

I am new to programming and need to reverse a string without using a library function.
I am able to reverse it, but not as expected.
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
String s = br.readLine();
String rev = "";
ArrayList<Integer> list = new ArrayList<Integer>();
ArrayList<String> splitResult = new ArrayList<String>();
for (int i = 0; i < s.length(); i++)
if (s.charAt(i) == ' ')
list.add(i);
list.add(0, 0);
list.add(list.size(), s.length());
String[] words = new String[list.size()];
for (int j = 0; j <= words.length - 2; j++)
splitResult.add(s.substring(list.get(j), list.get(j + 1)).trim());
System.out.println(splitResult);
String[] str = new String[splitResult.size()];
str = splitResult.toArray(str);
for (int i = 0; i < str.length; i++) {
if (i == str.length - 1) {
rev = str[i] + rev;
} else {
rev = " " + str[i] + rev;
}
}
System.out.println(rev);
Expected:
Input: i am coder
output: redoc ma i
actual
input: i am coder
output: coder am i

You can start with an empty result variable and iterate over the characters of the given String with an enhanced for-loop (also known as a for-each loop), prepending each character to the result, like this:
public static void main(String[] args) {
// input
String s = "I am coder";
// result variable for reverse input
String reverseS = "";
// go through every single character of the input
for (char c : s.toCharArray()) {
// and concatenate it and the result variable
reverseS = c + reverseS;
}
// then print the result
System.out.println(reverseS);
}
You can of course do that in a slightly different way using a classic for-loop and the length of the input, see this example:
public static void main(String[] args) {
String s = "I am coder";
String reverseS = "";
for (int i = 0; i < s.length(); i++) {
reverseS = s.charAt(i) + reverseS;
}
System.out.println(reverseS);
}

String s = "I am coder";
String rev="";
for (int i = s.length()-1; i >=0; i--) {
rev+=s.charAt(i);
}
System.out.println(rev);
I have set the index i to the last character of the given string, and the loop runs down to 0 (i.e. the first character). Hence the loop goes from the last character to the first, appending each character to a new String. Hope it helps!

You can also copy the characters into a list, walk that list from last to first into a second list, and print the second list, like this:
BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
String s=br.readLine();
ArrayList<Character> working = new ArrayList<Character>();
ArrayList<Character> finished = new ArrayList<Character>();
for (char ch: s.toCharArray()) {
working.add(ch);
}
for(int i = working.size() - 1 ; i>=0 ; i--){
finished.add(working.get(i));
}
for(int j = 0 ; j < finished.size() ; j++){
System.out.print(finished.get(j));
}
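The answers above build a new string one character at a time. As an alternative sketch, the characters can be swapped in place inside a char array, which avoids repeated string concatenation. ManualReverse and reverse are hypothetical names; the sketch still uses toCharArray() and the String constructor, as the answers above do, but no ready-made reversing call.

```java
// Hypothetical helper for illustration only.
final class ManualReverse {

    // Reverses s by swapping chars from both ends towards the middle.
    // Note: this works on char units, so it would split surrogate pairs.
    static String reverse(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0, j = chars.length - 1; i < j; i++, j--) {
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(reverse("i am coder")); // redoc ma i
    }
}
```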

Splitting this string to get the max count to a corresponding character

I am currently implementing Run Length Encoding for text compression, and my algorithm returns strings of the following form:
Let's say we have a string as input
"AAAAABBBBCCCCCCCC"
then my algorithm returns
"1A2A3A4A5A1B2B3B4B1C2C3C4C5C6C7C8C"
Now I want to apply Java's String split to solve this, because I want to get the highest number corresponding to each character. For our example it would be
"5A4B8C"
My function can be seen below
public String getStrfinal(){
String result = "";
int counter = 1;
StringBuilder sb = new StringBuilder();
sb.append("");
for (int i=0;i<str.length()-1;i++) {
char c = str.charAt(i);
if (str.charAt(i)==str.charAt(i+1)) {
counter++;
sb.append(counter);
sb.append(c);
}
else {
counter = 1;
continue;
}
}
result = sb.toString();
return result;
}

public static String getStrfinal(){
StringBuilder sb = new StringBuilder();
char last = 0;
int count = 0;
for(int i = 0; i < str.length(); i++) {
if(i > 0 && last != str.charAt(i)) {
sb.append(count + "" + last);
last = 0;
count = 1;
}
else {
count++;
}
last = str.charAt(i);
}
sb.append(count + "" + last);
return sb.toString();
}

Here is one possible solution. It starts with the raw string and simply iterates through it.
public static void main(String[] args) {
String input = "AAAABBBCCCCCCCDDDEAAFBBCD";
int index = 0;
StringBuilder sb = new StringBuilder();
while (index < input.length()) {
int count = 0;
char c = input.charAt(index);
for (; index < input.length(); index++) {
if (c == input.charAt(index)) { // still inside the current run
count++;
}
else {
break;
}
}
sb.append(Integer.toString(count));
sb.append(c);
count = 0;
}
System.out.println(sb.toString());
}
One problem with this method (and the others) is what happens when the text contains digits. For example, the string AAABB999222AAA compresses to 3A2B39323A, which could also be read as AAABB followed by 39 3's and 23 A's.
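For reference, a minimal sketch of the run-length step itself, emitting only the final count of each run (the "5A4B8C" form the question asks for). RunLength and encode are hypothetical names. The sketch makes no attempt to resolve the digit problem; the second call in main reproduces it.

```java
// Hypothetical helper for illustration only.
final class RunLength {

    // Emits <count><char> for every run of identical consecutive chars.
    static String encode(String input) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int start = i;
            while (i < input.length() && input.charAt(i) == c) {
                i++; // advance to the end of the current run
            }
            sb.append(i - start).append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("AAAAABBBBCCCCCCCC")); // 5A4B8C
        System.out.println(encode("AAABB999222AAA"));    // 3A2B39323A (ambiguous to decode)
    }
}
```

One way around the ambiguity is to put a separator between the character and its count, as a later answer does with the A=5 form.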

Instead of a string buffer you can simply print each run as soon as it ends; that is simpler and cleaner.
public static void main(String[] args) {
String input = "AAAAABBBBCCCCCCCCAAABBBDDCCCC";
int counter=1;
for(int i=1; i<input.length(); i++) {
if(input.charAt(i-1)==input.charAt(i)) {
counter=counter+1;
}else if(input.charAt(i-1)!=input.charAt(i)){
System.out.print(counter+Character.toString(input.charAt(i-1)));
counter=1;
}if(i==input.length()-1){
System.out.print(counter+Character.toString(input.charAt(i)));
}
}
}
This gives
5A4B8C3A3B2D4C
UPDATE
I agree with @WJS: if the string contains digits, the output becomes ambiguous.
If the System.out calls in the code above are replaced with something like
System.out.print(Character.toString(input.charAt(i-1))+"="+counter+" ");
then for input like
AAAAABBBBCCCCCCCCAAABBBDD556677CCCCz
we get the output below
A=5 B=4 C=8 A=3 B=3 D=2 5=2 6=2 7=2 C=4 z=1

This is one possible solution to your question. We can use a LinkedHashMap, which is similar to a HashMap but also maintains insertion order. We traverse the string, store the number of occurrences of each character as a key-value pair in the map, and then read each character back together with its count. Note that this counts all occurrences of a character, so a character that appears in several separate runs is reported once with the combined total.
public String getStrFinal(String str){
if(str==null || str.length()==0) return str;
LinkedHashMap<Character,Integer> map = new LinkedHashMap<>();
StringBuilder sb=new StringBuilder(); // to store the final string
for(char ch:str.toCharArray()){
map.put(ch,map.getOrDefault(ch,0)+1); // put the count for each character
}
for(Map.Entry<Character,Integer> entry:map.entrySet()){ // iterate the map again and append each character's occurence into stringbuilder
sb.append(entry.getValue());
sb.append(entry.getKey());
}
System.out.println("String = " + sb.toString()); // here you go, we got the final string
return sb.toString();
}

Find common alphabets in two strings using for loop

I am trying to find the common characters in two strings using only for loops. The code below works fine if I provide two completely different strings, e.g. one and two, but if I provide the same string twice, e.g. teen and teen, it doesn't work as expected.
import java.util.Scanner;
public class CommonAlphabets {
public static void main(String[] args) {
try(Scanner input = new Scanner(System.in)){
System.out.println("Enter String one ");
String stringOne = input.nextLine();
System.out.println("Enter String two ");
String StringTwo = input.nextLine();
StringBuffer sb = new StringBuffer();
for(int i=0;i<stringOne.length();i++){
for(int j=0;j<StringTwo.length();j++){
if(stringOne.charAt(i)== StringTwo.charAt(j)){
sb.append(stringOne.charAt(i));
}
}
}
System.out.println("Common characters are " +sb.toString());
}
}
}
Should I create another nested for loop to find duplicates in the StringBuffer, or is there a better way to handle this scenario?

You do not need an inner for loop; use contains instead:
String stringOne = "one";
String stringTwo = "one";
StringBuilder sb = new StringBuilder();
for(int i=0;i<stringTwo.length();i++){ // walk every char of stringTwo
if(stringOne.contains(String.valueOf(stringTwo.charAt(i))) &&
!sb.toString().contains(String.valueOf(stringTwo.charAt(i)))){
// check already added
sb.append(stringTwo.charAt(i));
}
}
System.out.println (sb.toString());
Edit
The second condition makes sure the char to be added does not already exist in the StringBuilder.
You could use a Set instead. If using a Set
Set<Character> set = new HashSet<> ();
your logic could be simplified to
if(stringOne.contains(String.valueOf(stringTwo.charAt(i)))){
set.add(stringTwo.charAt(i));
}

You can use a Set for it.
Set<Character> set = new HashSet<>();
for(int i = 0; i<stringOne.length(); i++) {
for(int j = 0; j < StringTwo.length(); j++) {
if(stringOne.charAt(i) == StringTwo.charAt(j)){
set.add(stringOne.charAt(i));
}
}
}
StringBuilder sb = new StringBuilder();
for (Character c : set) {
sb.append(c);
}
System.out.println("Common characters are " + sb);
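As a variation on the Set idea, the nested loops can be replaced by a set intersection. This is only a sketch: CommonChars, common and toSet are hypothetical names, and a LinkedHashSet is used (an assumption, not something the answers above require) so that the result keeps the order in which characters first appear in the first string.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only.
final class CommonChars {

    // Returns each char that occurs in both strings exactly once,
    // in the order of its first appearance in a.
    static String common(String a, String b) {
        Set<Character> result = toSet(a);
        result.retainAll(toSet(b)); // keep only chars that are also in b
        StringBuilder sb = new StringBuilder();
        for (char c : result) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Collects the distinct chars of s, preserving first-seen order.
    private static Set<Character> toSet(String s) {
        Set<Character> set = new LinkedHashSet<>();
        for (char c : s.toCharArray()) {
            set.add(c);
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println("Common characters are " + common("teen", "teen")); // ten
    }
}
```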

Your approach is fine and the result contains what you expect; you just need to stop the duplication. Either remove the duplicates from the 'sb' variable afterwards, or change the loop so that it never appends a duplicate.
As your code is becoming hard to read, I would suggest moving the duplicate removal into its own method, like this:
static void removeDuplicate(StringBuffer s){ // StringBuffer to match the sb in the question
for(int i=0;i<s.length()-1;i++){
for(int j=i+1;j<s.length();){
if(s.charAt(i)==s.charAt(j)){
s.deleteCharAt(j); // do not advance j: the next char has moved into position j
} else {
j++;
}
}
}
}
Call this method before printing.

Another approach you could try is - combine the two input strings, iterate over the concatenated string and return the characters which exist in both the strings.
Using a Set will ensure you do not add characters which get repeated due to the concatenation of the strings.
Here's what I wrote -
import java.util.HashSet;
public class HelloWorld {
private static Character[] findCommonLetters(String combined, String w1, String w2) {
HashSet<Character> hash = new HashSet<>();
for(char c: combined.toCharArray()) {
if(w1.indexOf(c) != -1 && w2.indexOf(c) != -1) {
hash.add(c);
}
}
return hash.toArray(new Character[hash.size()]);
}
public static void main(String []args){
// System.out.println("Hello World");
String first = "flour";
String second = "four";
String combined = first.concat(second);
Character[] result = findCommonLetters(combined, first, second);
for(char c: result) {
System.out.print(c);
}
System.out.println();
}
}

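This approach runs in linear time, O(n) in the total length of the two strings, which is the best you can do here. It assumes the input contains only the letters A-Z and a-z; any other character would produce an index outside the array.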
import java.util.Scanner;
public class CommonAlphabets
{
public static void main(String[] args)
{
try (Scanner input = new Scanner(System.in))
{
System.out.println("Enter String one ");
String stringOne = input.nextLine();
System.out.println("Enter String two ");
String StringTwo = input.nextLine();
StringBuffer sb = new StringBuffer();
/**
* Assuming char as index of array where A-Z is from index 0 to 25 and a-z is index 26-51
*/
int[] alphabetArray1 = new int[52];
for(int i = 0, len = stringOne.length(); i < len; i++)
alphabetArray1[stringOne.charAt(i) > 94 ? stringOne.charAt(i) - 71 : stringOne.charAt(i) - 65] = 1;
int[] alphabetArray2 = new int[52];
for(int i = 0, len = StringTwo.length(); i < len; i++)
alphabetArray2[StringTwo.charAt(i) > 94 ? StringTwo.charAt(i) - 71 : StringTwo.charAt(i) - 65] = 1;
// System.out.println(Arrays.toString(alphabetArray1));
// System.out.println(Arrays.toString(alphabetArray2));
for (int i = 0; i < 52; i++)
if (alphabetArray1[i] == 1 && alphabetArray2[i] == 1)
sb.append((char) (i < 26 ? i + 65 : i + 71));
System.out.println("Common characters are " + sb.toString());
}
}
}

Java - make new string based on old one and lag

I need to get a new string based on an old one and a lag. Basically, I have a string with the alphabet (s = "abc...xyz") and, based on a lag (e.g. 3), the new string should replace each character in a string I type with the character placed that many positions forward. If, let's say, I type "cde" as my string, the output should be "fgh". Any other character in the string (apart from the space " ") should be removed. Here is what I tried, but it doesn't work:
String code = "abcdefghijklmnopqrstuvwxyzabcd"; // my lag is 4 and I added the first 4 characters to avoid OutOfRange issues
char old;
char nou;
for (int i = 0; i < code.length() - lag; ++i)
{
old = code.charAt(i);
//System.out.print(old + " ");
nou = code.charAt(i + lag);
//System.out.println(nou + " ");
// if (s.indexOf(old) != 0)
// {
s = s.replace(old, nou);
// }
}
I commented out the printouts of old and nou (I could not call it new, since that is a reserved word) because I only used them to check that the mapping from position i to i + lag works (and it does). But if I uncomment the if statement, nothing happens, and if I leave it as it is, the instructions inside the for statement run code.length() times, even though my string is not that long. I have also tried writing the for statement like below, but I got lost.
for (int i = 0; i < s.length(); ++i)
{
....
}
Could you help me with this? Or maybe give me some advice on how I should approach the algorithm?
Thanks!

It doesn't work because, as the javadoc of replace() says:
Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar.
(emphasis mine)
So, the first time you meet an 'a' in the string, you replace all the 'a's by 'd'. But then you go to the next char, and if it's a 'd' that was an 'a' before, you replace it once again, etc. etc.
You shouldn't use replace() at all. Instead, you should simply build a new string, using a StringBuilder, by appending each shifted character of the original string:
String dictionary = "abcdefghijklmnopqrstuvwxyz";
StringBuilder sb = new StringBuilder(input.length());
for (int i = 0; i < input.length(); i++) {
char oldChar = input.charAt(i);
int oldCharPositionInDictionary = dictionary.indexOf(oldChar);
if (oldCharPositionInDictionary >= 0) {
int newCharPositionInDictionary =
(oldCharPositionInDictionary + lag) % dictionary.length();
sb.append(dictionary.charAt(newCharPositionInDictionary));
}
else if (oldChar == ' ') {
sb.append(' ');
}
}
String result = sb.toString();
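The same idea wrapped in a reusable method, as a sketch. LagShift and shift are hypothetical names. Using Math.floorMod instead of % is an addition not in the answer above: it keeps the index non-negative, so a negative lag undoes a positive one.

```java
// Hypothetical helper for illustration only.
final class LagShift {

    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // Shifts each lowercase letter by lag positions (wrapping around),
    // keeps spaces, and drops every other character.
    static String shift(String input, int lag) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int pos = ALPHABET.indexOf(c);
            if (pos >= 0) {
                int shifted = Math.floorMod(pos + lag, ALPHABET.length());
                sb.append(ALPHABET.charAt(shifted));
            } else if (c == ' ') {
                sb.append(' ');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(shift("cde", 3));  // fgh
        System.out.println(shift("fgh", -3)); // cde
    }
}
```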

Try this:
Convert the string to a char array.
Iterate over the char array and replace each char with the one lag positions further along.
Create the new String just once (instead of inside the loop) by passing the char array to new String.
String code = "abcdefghijklmnopqrstuvwxyzabcd";
String s = "abcdef";
char[] ch = s.toCharArray();
char[] codes = code.toCharArray();
for (int i = 0; i < ch.length; ++i)
{
ch[i] = codes[ch[i] - 'a' + 3];
}
String str = new String(ch);
System.out.println(str);
}

My answer is something like this.
It shifts every character one position forward in the alphabet.
It then prints the result both as it is and reversed.
Have a good day!
package org.owls.sof;
import java.util.Scanner;
public class Main {
private static final String CODE = "abcdefghijklmnopqrstuvwxyz"; // the alphabet; wrap-around is handled in the loop below
@SuppressWarnings("resource")
public static void main(String[] args) {
System.out.print("insert alphabet >> ");
Scanner scanner = new Scanner(System.in);
String s = scanner.next();
char[] char_arr = s.toCharArray();
for(int i = 0; i < char_arr.length; i++){
int order = CODE.indexOf(char_arr[i]) + 1;
if(order%CODE.length() == 0){
char_arr[i] = CODE.charAt(0);
}else{
char_arr[i] = CODE.charAt(order);
}
}
System.out.println(new String(char_arr));
//reverse
System.out.println(reverse(new String(char_arr)));
}
private static String reverse (String str) {
char[] char_arr = str.toCharArray();
for(int i = 0; i < char_arr.length/2; i++){
char tmp = char_arr[i];
char_arr[i] = char_arr[char_arr.length - i - 1];
char_arr[char_arr.length - i - 1] = tmp;
}
return new String(char_arr);
}
}

String alpha = "abcdefghijklmnopqrstuvwxyz"; // alphabet; no padding needed because the modulo below wraps around
int N = alpha.length();
int lag = 3; // shift value
String s = "cde"; // input
StringBuilder sb = new StringBuilder();
for (int i = 0, index; i < s.length(); i++) {
index = s.charAt(i) - 'a';
sb.append(alpha.charAt((index + lag) % N));
}
String op = sb.toString(); // output
