Same regex but giving different result with StringTokenizer and Scanner class delimiter

I'm trying to separate each word in a sentence using the StringTokenizer class, and it works fine for me. But I found another solution for my case using the Scanner class. I applied the same regular expression both ways but got different results. I would like to know why the output differs when the expression is the same.
Here is my code:
String sentence = "I have some problems with this section!"
+ " But I love to learn.";
StringTokenizer st = new StringTokenizer(sentence, "[! ]");
System.out.println("========Using StringTokenizer=========");
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
Scanner s = new Scanner(sentence);
s.useDelimiter("[! ]");
System.out.println("========Using Delimiter=========");
while (s.hasNext()) {
System.out.println(s.next());
}
Output from StringTokenizer:
========Using StringTokenizer=========
I
have
some
problems
with
this
section
But
I
love
to
learn.
Output using Scanner class:
========Using Delimiter=========
I
have
some
problems
with
this
section

But
I
love
to
learn.

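This is because Scanner can return an empty token, while StringTokenizer never does. In the part "section! But", the ! and the space after it each match the delimiter [! ] separately, so Scanner reports the empty string between them as a token. StringTokenizer treats a run of consecutive delimiter characters as a single separator and skips it. Note also that StringTokenizer does not take a regex at all: it reads "[! ]" as the four delimiter characters [, !, space and ].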

Scanner includes empty matches while StringTokenizer does not.
For this reason StringTokenizer can't properly parse delimited files whose columns/fields are identified by position, such as /etc/passwd or CSVs: it skips empty fields, so it will not return all of the columns/fields, while Scanner will.
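
To make the difference concrete, here is a minimal sketch that runs both classes over the "section! But" fragment from the question. The class and method names (DelimiterDemo, withTokenizer, withScanner) are made up for illustration. The last call assumes you want Scanner to behave like StringTokenizer, which can be done by adding + to the delimiter so that a run of delimiters counts as one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.StringTokenizer;

class DelimiterDemo {
    // StringTokenizer reads the second argument as a set of delimiter
    // characters (not a regex) and skips runs of them.
    static List<String> withTokenizer(String text, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiterChars);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Scanner reads the argument as a regex; each match is one delimiter,
    // so two adjacent matches have an empty token between them.
    static List<String> withScanner(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        try (Scanner s = new Scanner(text)) {
            s.useDelimiter(delimiterRegex);
            while (s.hasNext()) {
                tokens.add(s.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String part = "section! But";
        System.out.println(withTokenizer(part, "[! ]")); // [section, But]
        System.out.println(withScanner(part, "[! ]"));   // [section, , But]
        System.out.println(withScanner(part, "[! ]+"));  // [section, But]
    }
}
```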

Related

How to escape semicolon in Java Scanner

I'm trying to take an input using Scanner class in Java.
My code is:
Scanner scan = new Scanner( System.in);
String newline = scan.next();
My input is something like:
india gate;25;3
and I'm trying to replace the whole string above with a new string:
new delhi;23;2
using
.replace(str1, str2)
The problem is it's only replacing the first word in the string and the output is something like:
india delhi;25;3
How can I take it as a whole string using Scanner?

Use ; as the delimiter, like this:
while (scanner.hasNextLine()) {
Scanner lineScanner = new Scanner(scanner.nextLine());
lineScanner.useDelimiter(";");
String article = lineScanner.next();
// and so on...
}

Use .replaceAll("india gate;25;3", "new delhi;23;2");
Output:
new delhi;23;2

You should read up on how the Scanner class works. Basically by default, it uses whitespace as the delimiter for next(). This means that when you call next(), it reads until it finds whitespace, then it returns what it read. So when you call next() on "india gate;25;3", it reads "india" and then hits a space. So it returns you "india". If you want to read until a newline instead (which it looks like you do), you want to use nextLine().
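
A small sketch of the difference, assuming a String source stands in for System.in so it runs without typing anything. WholeLineDemo and readAndReplace are illustrative names, not part of the question's code.

```java
import java.util.Scanner;

class WholeLineDemo {
    // Reads one full line (spaces included) and replaces oldLine with newLine.
    static String readAndReplace(Scanner scan, String oldLine, String newLine) {
        String line = scan.nextLine();
        return line.replace(oldLine, newLine);
    }

    public static void main(String[] args) {
        // next() stops at the first whitespace, so only the first word comes back.
        Scanner byToken = new Scanner("india gate;25;3\n");
        System.out.println(byToken.next()); // india

        // nextLine() returns everything up to the line break.
        Scanner byLine = new Scanner("india gate;25;3\n");
        System.out.println(
                readAndReplace(byLine, "india gate;25;3", "new delhi;23;2")); // new delhi;23;2
    }
}
```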

Using Scanner to find letter

What I'm trying to do here is have a user input a sentence or some type of string, then have the scanner search through the string to determine if there is an "e" or "E" in there. If the scanner finds either one of those it will replace them with "xyz." If they are not present then it should return "All is good!" The code I have thus far is:
import java.util.Scanner;
public class Example {
public static void main(String[] args) {
String userInput;
Scanner in = new Scanner(System.in);
System.out.println("Please write a sentence.");
userInput = in.nextLine();
System.out.println(userInput.replace("e","xyz"));
System.out.println(userInput.replace("E","xyz"));
in.close();
}
}
As I think you can tell, this really just prints the same line twice, once replacing each lowercase e and once replacing each capital E. I was looking for a way to combine the two replacements and then have it replace and print if it finds an e, or just print "All is good!" if there are no "e"s.

This isn't related to Scanner at all really. Trivial approach:
String replaced = userInput.replace("e","xyz").replace("E","xyz");
String out = replaced.equals(userInput) ? "All is good!" : replaced;
System.out.println(out);
Or use replaceAll:
Pattern eE = Pattern.compile("[eE]");
String replaced = eE.matcher(userInput).replaceAll("xyz");

You're going to want to look into Pattern and its associated Matcher. These can give you precisely what you want.
While String#replaceAll gives you what you want, it's not entirely semantic; you can certainly replace all of the "e"s with "xyz" and then compare them to see if there was no change, but using Matcher gives you a bit more obvious control over the logic.
We first need to compile a regex, then build a matcher from the compiled regex. If we find a match, we can then replace the strings; otherwise, we can print out our value.
String userInput;
Scanner in = new Scanner(System.in);
System.out.println("Please write a sentence.");
userInput = in.nextLine();
Pattern pattern = Pattern.compile("[eE]");
Matcher matcher = pattern.matcher(userInput);
if(matcher.find()) {
System.out.println(matcher.replaceAll("xyz"));
} else {
System.out.println("All is good!");
}
System.out.println(userInput.replaceAll("[eE]","xyz"));
Also, don't close the System.in stream. It's not desirable to close that out in case some other part of your application wants to make use of it, or if you want to make use of it later.

Java strings are immutable, so you can't just call replace on userInput and expect it to be modified in place. You need to save the result of the function call.
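
A short sketch of what that means in practice. ReplaceDemo and replaceEs are hypothetical names, and the "All is good!" fallback is taken from the question.

```java
class ReplaceDemo {
    // Chains both replacements and keeps the result; if nothing changed,
    // there was no "e" or "E" in the input.
    static String replaceEs(String input) {
        String replaced = input.replace("e", "xyz").replace("E", "xyz");
        return replaced.equals(input) ? "All is good!" : replaced;
    }

    public static void main(String[] args) {
        String s = "Eel";
        s.replace("e", "xyz");                // result is discarded, s is untouched
        System.out.println(s);                // Eel
        System.out.println(replaceEs("Eel")); // xyzxyzl
        System.out.println(replaceEs("sky")); // All is good!
    }
}
```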

Read a csv-file value by value with Scanner, useDelimiter(";") not working

I am trying to read a CSV file value by value using Scanner.useDelimiter(";").
However, Scanner.nextLine() still returns the whole line instead of a single value.
The CSV-file looks like this:
0.00034;0.1;0.3;0.6;1,00E-13
My code:
Scanner iStream = new Scanner(new InputStreamReader(new FileInputStream("file.csv")));
iStream.useDelimiter(";");
String[] test = new String[5];
for (int i = 0; i < 5; i++) {
test[i] = iStream.nextLine();
}
Result:
"0.00034;0.1;0.3;0.6;1,00E-13"
Expected Result:
"0.00034", "0.1", "0.3", "0.6", "1,00E-13"
Is this possible, or should I use String.split()?
Am I missing something?

Apart from the fact that this problem is ready-made for a parsing library such as OpenCSV, nextLine doesn't take the delimiter pattern into account. Use next instead:
test[i] = iStream.next();

From the Java Scanner documentation:
public String next()
Finds and returns the next complete token from this scanner.
A complete token is preceded and followed by input that matches the delimiter pattern.
This literally answers your question. However, I am not sure about next's behaviour at the start and end of the input, because a token has to be "preceded and followed" by the delimiter. Maybe someone can fill in on this?
You could add extra characters to your delimiter, like \n, etc.
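
Here is a minimal sketch of that idea, assuming the CSV text is already in a String (the sample row is the one from the question, plus a made-up second row). CsvFieldsDemo and fields are illustrative names. The delimiter ";|\\R" treats either a semicolon or a line break as a separator; \R needs Java 8 or later. It also shows that the first and last values are returned even though the input does not start or end with a semicolon.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class CsvFieldsDemo {
    // Splits semicolon-separated text into single values, one per next() call.
    static List<String> fields(String csvText) {
        List<String> values = new ArrayList<>();
        try (Scanner in = new Scanner(csvText)) {
            in.useDelimiter(";|\\R"); // a semicolon or any line break
            while (in.hasNext()) {
                values.add(in.next());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String csv = "0.00034;0.1;0.3;0.6;1,00E-13\n1;2;3;4;5\n";
        for (String value : fields(csv)) {
            System.out.println(value); // one value per line, ten in total
        }
    }
}
```

With only ";" as the delimiter, the last value of one row and the first value of the next would come back as a single token with a line break in the middle, which is why the line break is part of the pattern here.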

Array from split String problems

Hello, I am using the following Java code to split user input into individual words:
String command = scanner.next();
command = command.toLowerCase();
String[] words = command.split(" ");
However, when I try to print words[1] for an input with two or more words, it throws an ArrayIndexOutOfBoundsException. It would seem that words[1] should simply be the second word in the sentence, but the array does not contain it.

Instead of scanner.next(), try
String command = scanner.nextLine();
This will make sure you read all the words.

From the Scanner API:
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
While the javadoc for Scanner#next() states:
Finds and returns the next complete token from this scanner. A complete token is preceded and followed by input that matches the delimiter pattern.
So in your case scanner.next() will return a word with no whitespace, as whitespace is how your scanner likely knows when to stop scanning.
You might want to use Scanner#nextLine() or something of the sort instead.

Try this one, and make sure you have read all input words
Scanner scanner = new Scanner(System.in);
String command = scanner.nextLine();
command = command.toLowerCase();
String[] words = command.split(" ");
Now you can print
words[1]
as long as the index is valid, that is, the input contained at least two words.
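
Putting the answers together, a sketch might look like this. SplitWordsDemo and words are hypothetical names, a String source stands in for System.in, and splitting on "\\s+" instead of " " is my own assumption so that repeated spaces do not produce empty words.

```java
import java.util.Scanner;

class SplitWordsDemo {
    // Lowercases a line and splits it on runs of whitespace.
    static String[] words(String line) {
        String trimmed = line.trim().toLowerCase();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner("Go North now\n");
        String[] w = words(scanner.nextLine()); // whole line, not just "Go"
        if (w.length > 1) {                     // guard before indexing
            System.out.println(w[1]);           // north
        }
    }
}
```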

How to scan for words in Java excluding punctuation

I'm trying to use the Scanner class to parse all the words in a file. The file contains ordinary text, but I only want the words, excluding all the punctuation.
The solution I have so far is not complete, but it is already giving me a problem:
Scanner fileScan= new Scanner(file);
String word;
while(fileScan.hasNext("[^ ,!?.]+")){
word= fileScan.next();
this.addToIndex(word, filename);
}
Now if I use this on a sentence like "hi my name is mario!" it returns just "hi", "my", "name" and "is". It's not matching "mario!" (obviously), but it's not matching "mario" either, like I think it should.
Can you explain why that is, and help me find a better solution if you have one?
Thank you

This works:
import java.util.*;
class S {
public static void main(String[] args) {
Scanner fileScan= new Scanner("hi my name is mario!").useDelimiter("[ ,!?.]+");
String word;
while(fileScan.hasNext()){
word= fileScan.next();
System.out.println(word);
}
} // end of main()
}
javac -g S.java && java S
hi
my
name
is
mario

Since you want to get rid of the punctuation, you can simply replace all punctuation marks before adding to the index:
word = word.replaceAll("\\p{Punct}", "");
In the case of hyphens, or other isolated punctuation marks, you just check if word.isEmpty() before adding.
Of course, you'd have to get rid of your custom delimiter.
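
For completeness, here is a sketch of the delimiter approach that does not depend on listing every punctuation mark. It assumes a "word" means a run of letters, so the delimiter is "anything that is not a letter" ("[^\\p{L}]+"); WordScanDemo and words are made-up names. The first check in main also shows why the original loop stopped early: hasNext(pattern) tests the whole whitespace-delimited token "mario!", which does not match "[^ ,!?.]+".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordScanDemo {
    // Returns the runs of letters in the text; everything else is a delimiter.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            scan.useDelimiter("[^\\p{L}]+");
            while (scan.hasNext()) {
                result.add(scan.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The default delimiter is whitespace, so the token here is "mario!".
        System.out.println(new Scanner("mario!").hasNext("[^ ,!?.]+")); // false
        System.out.println(words("hi, my name is mario!")); // [hi, my, name, is, mario]
    }
}
```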
