How to NOT count control characters in a text file - java

I am having trouble understanding how to NOT count control characters in a text file. My program does everything except skip the control characters \n and \r.
Update: after further attempts I am closer. If I change:
while (input.hasNext()) {
String line = input.nextLine();
lineCount++;
wordCount += countWords(line);
charcount += line.length();
to
while (input.hasNext()) {
String line = input.next();
lineCount++;
wordCount += countWords(line);
charcount += line.replace("\n", "").replace("\r", "").length();
the characters are counted correctly, but the line count is wrong. If I go back to input.nextLine(), the character count is wrong.
contents of text file:
cat
sad dog
dog wag
import java.io.*;
import java.util.*;
public class Character_count {
public static void main(String args[]) throws Exception {
java.io.File file = new java.io.File("textFile.txt");
// Create a Scanner for the file
Scanner input = new Scanner(file);
int charcount = 0;
int wordCount = 0;
int lineCount = 0;
while (input.hasNext()) {
String line = input.nextLine();
lineCount++;
wordCount += countWords(line);
charcount += line.length();
}
System.out.println("The file " + file + " has ");
System.out.println(charcount + " characters");
System.out.println(wordCount + " words");
System.out.println(lineCount + " lines");
}
private static int countWords(String s) {
Scanner input = new Scanner(s);
int count = 0;
while (input.hasNext()) {
input.next();
count++;
}
return count;
}
}

You could replace every \n and \r with an empty String like this:
line = line.replaceAll("[\\r\\n]", "");
Now you can do the counts and they will not include any \n or \r.
Alternatively, without using a regex:
line = line.replace("\n", "").replace("\r", "");
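To make the effect of stripping the line breaks concrete, here is a minimal sketch. The class name and the helper lengthWithoutLineBreaks are made up for illustration, and the sample text simply mirrors the three-line file from the question with mixed line endings.

```java
class LineBreakStripDemo {
    // Hypothetical helper: length of the text once every CR and LF is removed.
    static int lengthWithoutLineBreaks(String text) {
        return text.replace("\r", "").replace("\n", "").length();
    }

    public static void main(String[] args) {
        // Same words as the sample file, with one Windows and one Unix line ending.
        String text = "cat\r\nsad dog\ndog wag";
        System.out.println(text.length());                 // 20, line breaks included
        System.out.println(lengthWithoutLineBreaks(text)); // 17, line breaks removed
    }
}
```

Note that the spaces inside "sad dog" and "dog wag" are still counted here; only \r and \n are dropped.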

You can achieve that with your Scanner by using the useDelimiter method:
Scanner input = new Scanner(new File("textFile.txt"));
input.useDelimiter("\r\n");
Then continue with your code as usual; it should work.
Also (and very important): if you check hasNext(), then use next(); if you check hasNextLine(), use nextLine(). Don't mix and match, as it will cause (or is already causing) issues down the line.
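The pairing rule can be sketched as follows. This is only one possible shape for the counting loop, not the asker's final program: the class name FileStatsSketch and the method count are hypothetical, the Scanner reads from a String instead of textFile.txt so the example is self-contained, and it assumes that spaces should not be counted as characters (the question does not say).

```java
import java.util.Scanner;

class FileStatsSketch {
    // Returns {lines, words, nonWhitespaceChars}. hasNextLine() is paired with nextLine().
    static int[] count(Scanner input) {
        int lines = 0, words = 0, chars = 0;
        while (input.hasNextLine()) {
            String line = input.nextLine(); // the line separator is already stripped here
            lines++;
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                words += trimmed.split("\\s+").length;
            }
            // Assumption: whitespace inside the line is not counted as a character.
            chars += line.replaceAll("\\s", "").length();
        }
        return new int[] {lines, words, chars};
    }

    public static void main(String[] args) {
        int[] r = count(new Scanner("cat\r\nsad dog\r\ndog wag"));
        System.out.println(r[0] + " lines, " + r[1] + " words, " + r[2] + " characters");
    }
}
```

If spaces should count, replacing the last statement in the loop with chars += line.length() is enough, since nextLine() never includes \r or \n in its result.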

You should use \s in the regular expression; it represents whitespace characters.
\s stands for "whitespace character". Again, which characters this actually includes, depends on the regex flavor. In all flavors discussed in this tutorial, it includes [ \t\r\n\f]. That is: \s matches a space, a tab, a line break, or a form feed. (http://www.regular-expressions.info/shorthand.html)
Here is how you use it:
Scanner scanner = new Scanner(path.toFile(),"UTF-8");
String content = scanner.useDelimiter("\\A").next();
System.out.println(content);
Pattern patternLine = Pattern.compile("\\r?\\n");
Matcher matcherLine = patternLine.matcher(content);
int numberLines = 1;
while (matcherLine.find())
numberLines++;
Pattern pattern = Pattern.compile("\\s");
Matcher matcherEliminateWhiteSpace = pattern.matcher(content);
String contentWithoutWhiteSpace=matcherEliminateWhiteSpace.replaceAll("");
// counts only ASCII word characters: a-z, A-Z, 0-9 and '_' (underscore)
Pattern patternCharachter=Pattern.compile("\\w");
Matcher matcherCharachterAscii= patternCharachter.matcher(contentWithoutWhiteSpace);
int numberCharachtersAscii = 0;
while (matcherCharachterAscii.find())
numberCharachtersAscii++;
// counts every remaining character regardless of script (e.g. français, عربي), including punctuation
Pattern patternUniversal= Pattern.compile(".");
Matcher matcherUniversal= patternUniversal.matcher(contentWithoutWhiteSpace);
int numberUniversalCharachter=0;
while(matcherUniversal.find())
numberUniversalCharachter++;
System.out
.println("******************************************************");
System.out.println(contentWithoutWhiteSpace);
System.out.println(numberLines);
System.out.println(numberCharachtersAscii);
System.out.println(numberUniversalCharachter);
EDIT
Here is a simple modification that will make it work:
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
lineCount++;
wordCount += countWords(line);
charcount += line.replaceAll("\\s", "").length();
System.out.println(charcount);
}
\\s stands for whitespace characters [tab, carriage return, line feed, space, form feed].

Related

How can I move the punctuation from the end of a string to the beginning?

I am attempting to write a program that reverses the word order of a string, including the punctuation. But when my backwards string prints, the punctuation mark at the end of the last word stays attached to that word instead of being treated as an individual character.
How can I split the end punctuation mark from the last word so I can move it around?
For example:
When I type in : Hello my name is jason!
I want: !jason is name my Hello
instead I get: jason! is name my Hello
import java.util.*;
class Ideone
{
public static void main(String[] args) {
Scanner userInput = new Scanner(System.in);
System.out.print("Enter a sentence: ");
String input = userInput.nextLine();
String[] sentence= input.split(" ");
String backwards = "";
for (int i = sentence.length - 1; i >= 0; i--) {
backwards += sentence[i] + " ";
}
System.out.print(input + "\n");
System.out.print(backwards);
}
}
Manually rearranging Strings tends to become complicated in no time. It's usually better (if possible) to code what you want to do, not how you want to do it.
String input = "Hello my name is jason! Nice to meet you. What's your name?";
// this is *what* you want to do, part 1:
// split the input at each ' ', '.', '?' and '!', keep delimiter tokens
StringTokenizer st = new StringTokenizer(input, " .?!", true);
StringBuilder sb = new StringBuilder();
while(st.hasMoreTokens()) {
String token = st.nextToken();
// *what* you want to do, part 2:
// add each token to the start of the string
sb.insert(0, token);
}
String backwards = sb.toString();
System.out.print(input + "\n");
System.out.print(backwards);
Output:
Hello my name is jason! Nice to meet you. What's your name?
?name your What's .you meet to Nice !jason is name my Hello
This will be a lot easier to understand for the next person working on that piece of code, or your future self.
This assumes that you want to move every punctuation char. If you only want the one at the end of the input string, you'd have to cut it off the input, do the reordering, and finally place it at the start of the string:
String punctuation = "";
String input = "Hello my name is jason! Nice to meet you. What's your name?";
System.out.print(input + "\n");
if(input.substring(input.length() -1).matches("[.!?]")) {
punctuation = input.substring(input.length() -1);
input = input.substring(0, input.length() -1);
}
StringTokenizer st = new StringTokenizer(input, " ", true);
StringBuilder sb = new StringBuilder();
while(st.hasMoreTokens()) {
sb.insert(0, st.nextToken());
}
sb.insert(0, punctuation);
System.out.print(sb);
Output:
Hello my name is jason! Nice to meet you. What's your name?
?name your What's you. meet to Nice jason! is name my Hello
As in the other answers, you need to separate out the punctuation first, then reorder the words, and finally place the punctuation at the beginning.
You could take advantage of String.join(), Collections.reverse() and String.endsWith() for a simpler answer:
String input = "Hello my name is jason!";
String punctuation = "";
if (input.endsWith("?") || input.endsWith("!")) {
punctuation = input.substring(input.length() - 1, input.length());
input = input.substring(0, input.length() - 1);
}
List<String> words = Arrays.asList(input.split(" "));
Collections.reverse(words);
String reordered = punctuation + String.join(" ", words);
System.out.println(reordered);
The below code should work for you
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class ReplaceSample {
public static void main(String[] args) {
String originalString = "TestStr?";
String updatedString = "";
String regex = "end\\p{Punct}+|\\p{Punct}+$";
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(originalString);
while (matcher.find()) {
int start = matcher.start();
updatedString = matcher.group() + originalString.substring(0, start);
}
System.out.println("Original -->" + originalString + "\nReplaced -->" + updatedString);
}
}
You need to follow the below steps:
(1) Check for the ! character in the input
(2) If input contains ! then prefix it to the empty output string variable
(3) If input does not contain ! then create empty output string variable
(4) Split the input string and iterate in reverse order (you are already doing this)
You can refer to the code below:
public static void main(String[] args) {
Scanner userInput = new Scanner(System.in);
System.out.print("Enter a sentence: ");
String originalInput = userInput.nextLine();
String backwards = "";
String input = originalInput;
// Define your punctuation chars in an array
char[] punctuationChars = {'!', '?' , '.'};
// Remove the trailing punctuation mark from the input and put it at the front of the output
for(int i=0;i<punctuationChars.length;i++) {
if(input.charAt(input.length()-1) == punctuationChars[i]) {
input = input.substring(0, input.length()-1);
backwards = punctuationChars[i]+"";
break;
}
}
String[] sentence= input.split(" ");
for (int i = sentence.length - 1; i >= 0; i--) {
backwards += sentence[i] + " ";
}
System.out.print(originalInput + "\n");
System.out.print(input + "\n");
System.out.print(backwards);
}
Don't split by spaces; split by word boundaries. Then you don't need to care about punctuation or even putting spaces back, because you just reverse them too!
And it's only 1 line:
Arrays.stream(input.split("\\b"))
.reduce((a, b) -> b + a)
.ifPresent(System.out::println);
See live demo.
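For readers who prefer a plain loop over the stream version, the same word-boundary idea can be sketched like this. The class and method names are hypothetical; only String.split and StringBuilder from the standard library are used.

```java
class BoundaryReverseSketch {
    // Splits at every word boundary, so spaces and punctuation become tokens
    // of their own, then concatenates the tokens in reverse order.
    static String reverseTokens(String input) {
        String[] tokens = input.split("\\b");
        StringBuilder sb = new StringBuilder();
        for (int i = tokens.length - 1; i >= 0; i--) {
            sb.append(tokens[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseTokens("Hello my name is jason!")); // !jason is name my Hello
    }
}
```

One caveat with this approach: an apostrophe also creates word boundaries, so a word such as "What's" is split into three tokens and comes out as "s'What".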

How to find new line character

I want to know how I can read the contents of a file character by character.
I tried this code
Scanner sc = new Scanner(new BufferedReader(new FileReader("C:\\saml.txt")));
while(sc.hasNext())
{
String s=sc.next();
char x[]=s.toCharArray();
for(int i=0;i<x.length;i++)
{
if(x[i]=='\n')
System.out.println("hello");
System.out.println(x[i]);
}
}
I want to give input in file as:
"pure
world
it
is"
I want the output
"pure
hello
world
hello
it
hello
is
hello"
Two options:
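Compare the string read against System#lineSeparator (on Java 6 or earlier, use System.getProperty("line.separator")). Note that this only works if the Scanner's delimiter has been changed so that line separators are returned as tokens; with the default whitespace delimiter, next() never returns them: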
String s = sc.next();
if (System.lineSeparator().equals(s)) {
//...
}
Use Scanner#nextLine, which returns the String up to the next line separator (it consumes the separator and strips it from the result), then split the string on whitespace and work with each piece:
String s = sc.nextLine();
String[] words = s.split("\\s+");
for (String word : words) {
//...
}
Note that in any of these cases you don't need to evaluate each character in the String.
I would suggest using Scanner.nextLine and either String.join (Java 8+) or the manual method:
Without a trailing "hello":
while (sc.hasNext()){
String[] words = sc.nextLine().split(" ");
String sentence = words[0];
for(int i = 0; ++i < words.length;)
sentence += " hello " + words[i];
System.out.println(sentence);
}
With a trailing "hello":
while (sc.hasNext()){
for(String w : sc.nextLine().split(System.getProperty("line.separator")))
System.out.print(w + " hello ");
System.out.println();
}
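If the goal is exactly the output shown in the question (each line of the file followed by a line containing "hello"), a line-based loop is enough. The sketch below is an assumption about what the asker wants; the class and method names are invented, and the Scanner reads from a String instead of C:\saml.txt so that it runs anywhere.

```java
import java.util.Scanner;

class HelloAfterEachLine {
    // Copies every input line to the result and appends a "hello" line after it.
    static String annotate(Scanner sc) {
        StringBuilder out = new StringBuilder();
        while (sc.hasNextLine()) {
            out.append(sc.nextLine()).append('\n');
            out.append("hello").append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(annotate(new Scanner("pure\nworld\nit\nis")));
    }
}
```

Because nextLine() consumes the separator, there is no need to look for '\n' character by character; reaching the end of a line is signalled by the method returning.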

For loop iterating through string and adding/replacing characters

I need to write a for loop that iterates through a String object (nested within a String[] array) and operates on each character of this string according to the following criteria.
first, add a hyphen to the string
if the character is not a vowel, add this character to the end of the string, and then remove it from the beginning of the string.
if the character is a vowel, then add "v" to the end of the string.
Every time I have attempted this with various loops and various strategies/implementations, I have somehow ended up with the StringIndexOutOfBoundsException error.
Any ideas?
Update: Here is all of the code. I did not need help with the rest of the program, simply this part. However, I understand that you have to see the system at work.
import java.util.Scanner;
import java.io.IOException;
import java.io.File;
public class plT
{
public static void main(String[] args) throws IOException
{
String file = "";
String line = "";
String[] tempString;
String transWord = ""; // final String for output
int wordTranslatedCount = 0;
int sentenceTranslatedCount = 0;
Scanner stdin = new Scanner(System.in);
System.out.println("Welcome to the Pig-Latin translator!");
System.out.println("Please enter the file name with the sentences you wish to translate");
file = stdin.nextLine();
Scanner fileScanner = new Scanner(new File(file));
fileScanner.nextLine();
while (fileScanner.hasNextLine())
{
line = fileScanner.nextLine();
tempString = line.split(" ");
for (String words : tempString)
{
if(isVowel(words.charAt(0)) || Character.isDigit(words.charAt(0)))
{
transWord += words + "-way ";
transWord.trim();
wordTranslatedCount++;
}
else
{
transWord += "-";
// for(int i = 0; i < words.length(); i++)
transWord += words.substring(1, words.length()) + "-" + words.charAt(0) + "ay ";
transWord.trim();
wordTranslatedCount++;
}
}
System.out.println("\'" + line + "\' in Pig-Latin is");
System.out.println("\t" + transWord);
transWord = "";
System.out.println();
sentenceTranslatedCount++;
}
System.out.println("Total number of sentences translated: " + sentenceTranslatedCount);
System.out.println("Total number of words translated: " + wordTranslatedCount);
fileScanner.close();
stdin.close();
}
public static boolean isVowel (char c)
{
return "AEIOUYaeiouy".indexOf(c) != -1;
}
}
Also, here is the example file from which text is being pulled (we are skipping the first line):
2
How are you today
This example has numbers 1234
Assuming that the issue is the StringIndexOutOfBoundsException, the only way it can occur here is when one of the words is an empty String, for example when a line is empty or contains two consecutive spaces. Knowing this also provides the solution: handle the zero-length case separately (if/else). This is one way to do it:
if (!"".equals(words)) {
// your logic goes here
}
Another way is to simply do this inside the loop (when you have a loop):
if ("".equals(words)) continue;
// Then rest of your logic goes here
If that is not the issue, then the clue is in the parts of the code you are not showing us (in that case you didn't give us the relevant code after all). It is better to provide a complete subset of the code that can be used to replicate the problem (a test case), together with the complete exception, so we don't even have to try it out ourselves.
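The guard against empty words can be sketched in isolation. The example below is not the asker's translator: firstLetters is a hypothetical helper that only performs the charAt(0) access that would otherwise throw, so the effect of the check is easy to see.

```java
import java.util.ArrayList;
import java.util.List;

class EmptyTokenGuard {
    // Collects the first character of every word, skipping empty tokens.
    static List<Character> firstLetters(String line) {
        List<Character> result = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.isEmpty()) {
                continue; // charAt(0) on "" would throw StringIndexOutOfBoundsException
            }
            result.add(word.charAt(0));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two consecutive spaces make split(" ") return an empty String between them.
        System.out.println(firstLetters("How  are you")); // [H, a, y]
        // An empty line splits into a single empty String, which the guard skips.
        System.out.println(firstLetters(""));             // []
    }
}
```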

How to make the output show how many times 2 specific letters come up in a string?

The following code is not correct. When I enter the string, "The rain in Spain" the output is 0 when it should be 2. Or when I put "in in in" the output is 1 when it should be 3. So please help me out and show me how to change this to make it work. Thanks!
import java.util.Scanner;
public class Assignment5b {
public static void main(String[] args){
Scanner keyboard = new Scanner(System.in);
System.out.print("Enter the string: ");
String str = keyboard.next();
String findStr = "in";
int lastIndex = 0;
int count =0;
while(lastIndex != -1){
lastIndex = str.indexOf(findStr,lastIndex);
if( lastIndex != -1){
count ++;
lastIndex+=findStr.length();
}
}
System.out.print("Pattern matches: " + count);
}}
Replace String str = keyboard.next(); with String str = keyboard.nextLine();
The next() method only scans the next token, whereas nextLine() reads the entire line, up to the point where you hit Enter.
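The difference is easy to see on the sentence from the question. In this small sketch the Scanner reads from a String rather than System.in, and the class and method names are made up for the demonstration.

```java
import java.util.Scanner;

class NextVersusNextLine {
    // next() stops at the first whitespace and returns a single token.
    static String firstToken(String text) {
        return new Scanner(text).next();
    }

    // nextLine() returns everything up to the line separator.
    static String firstLine(String text) {
        return new Scanner(text).nextLine();
    }

    public static void main(String[] args) {
        String typed = "The rain in Spain";
        System.out.println(firstToken(typed)); // The
        System.out.println(firstLine(typed));  // The rain in Spain
    }
}
```

Since "The" contains no "in", the search in the question reports 0 matches when the input is read with next().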
Another way to find the occurrence is simply:
System.out.print("Enter the string: ");
String str = keyboard.nextLine();
String findStr = "in";
System.out.println(str.split(findStr, -1).length - 1);
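Counting with split relies on two details: the limit of -1 keeps trailing empty strings, and the number of pieces is always one more than the number of separators found. A minimal sketch, with a hypothetical countOccurrences helper; Pattern.quote is added here (it is not in the answer above) because split interprets its argument as a regular expression.

```java
import java.util.regex.Pattern;

class SplitCountSketch {
    // Number of non-overlapping occurrences of target in text.
    // Assumes target is not empty.
    static int countOccurrences(String text, String target) {
        // Pattern.quote makes characters such as '.' match literally.
        String[] pieces = text.split(Pattern.quote(target), -1);
        return pieces.length - 1; // n separators produce n + 1 pieces
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("The rain in Spain", "in")); // 3
        System.out.println(countOccurrences("in in in", "in"));          // 3
        System.out.println(countOccurrences("a.b.c", "."));              // 2
    }
}
```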
In your case, keyboard.next() reads word by word, so it returns only the first word, "The". Your whole code therefore runs with String str = "The".
Change your keyboard.next() to keyboard.nextLine()
or
put your code in a while condition like
while(keyboard.hasNext()) {
}
import java.util.Scanner;
public class Temp {
public static void main(String[] args) {
Scanner keyboard = new Scanner(System.in);
System.out.print("Enter the string: ");
String str = keyboard.nextLine();
String findStr = "in";
int lastIndex = 0;
int count =0;
while(lastIndex != -1){
lastIndex = str.indexOf(findStr,lastIndex);
if( lastIndex != -1){
count ++;
lastIndex+=findStr.length();
}
}
System.out.print("Pattern matches: " + count);
}
}
NextLine Documentation says:
public String nextLine()
Advances this scanner past the current line and returns the input that was skipped. This method returns the rest of the current line, excluding any line separator at the end. The position is set to the beginning of the next line.
Since this method continues to search through the input looking for a line separator, it may buffer all of the input searching for the line to skip if no line separators are present.
Returns:
the line that was skipped
Throws:
NoSuchElementException - if no line was found
IllegalStateException - if this scanner is closed
import java.util.Scanner;
public class Assignment5b {
public static void main(String[] args){
Scanner keyboard = new Scanner(System.in);
System.out.print("Enter the string: ");
String str = keyboard.nextLine();
String findStr = "in";
int lastIndex = 0;
int count =0;
while(lastIndex != -1){
lastIndex = str.indexOf(findStr);
if( lastIndex != -1){
str = str.substring(lastIndex+findStr.length(),str.length());
count ++;
}
}
System.out.print("Pattern matches: " + count);
}}
A couple of things to note:
You need to use keyboard.nextLine() if you need the entire line; otherwise you only get the first word.
For the input "The rain in Spain" and the search string "in", the program prints 3, because "in" occurs in "rain", "in" and "Spain".

JAVA basic iterator

This was the solution to my homework; the purpose was to reverse each word in a sentence entered by the user. I completed this on my own, but I'm wondering how the iterator works in this piece of code. I don't understand the declaration tempWord = ""; or how each word is printed separated by spaces.
import java.util.Scanner;
public class StringReverser
{
public static void main(String args[])
{
String sentence;
String word;
String tempWord = "";
Scanner scan = new Scanner(System.in);
Scanner wordScan;
System.out.print("Enter a sentence: ");
sentence = scan.nextLine();
wordScan = new Scanner(sentence);
while(wordScan.hasNext())
{
word = wordScan.next();
for(int numLetters = word.length() - 1; numLetters >= 0; numLetters--)
tempWord += word.charAt(numLetters);
System.out.print(tempWord + " ");
tempWord = "";
}
System.out.println();
}
}
this bit adds in the spaces
System.out.print(tempWord + " ");
this bit reverses it
for(int numLetters = word.length() - 1; numLetters >= 0; numLetters--)
tempWord += word.charAt(numLetters);
this bit sets it up for the next word
tempWord = "";
The for loop counts backwards, from the index of the last character in the word to the first (zero-based).
The print call prints the reversed word plus a space (" "); it uses print rather than println because println would add a line break, putting each word on a separate line.
The tempWord = ""; at the end of each iteration resets the variable so it can be reused.
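The three steps described above can be put into one small method. This is a sketch rather than the homework solution itself: it swaps the String concatenation for a StringBuilder and returns the result instead of printing it, and the class and method names are hypothetical.

```java
import java.util.Scanner;

class WordReverserSketch {
    // Reverses the letters of every word and keeps the words in their original order.
    static String reverseEachWord(String sentence) {
        StringBuilder out = new StringBuilder();
        Scanner wordScan = new Scanner(sentence);
        while (wordScan.hasNext()) {
            String word = wordScan.next();
            // A fresh builder per word plays the role of tempWord = "".
            StringBuilder tempWord = new StringBuilder();
            for (int i = word.length() - 1; i >= 0; i--) {
                tempWord.append(word.charAt(i)); // last character first
            }
            out.append(tempWord).append(' '); // the space separates the words
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseEachWord("hello big world")); // olleh gib dlrow
    }
}
```

Like the original, this leaves a trailing space after the last word.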
