Split method is splitting into single String - java

I have a little problem:
I have a program that splits a String by whitespace (single spaces only), but when I assign the result to a String array, it contains only one element (I can print only index 0).
Here is the code:
public void mainLoop() {
Scanner sc = new Scanner(System.in);
String parse = "#start";
while (!parse.equals("#stop") || !parse.isEmpty()) {
parse = sc.next();
String[] line = parse.split("[ ]");
System.out.println(line[0]);
}
}
The 'mainLoop' method is called on an instance from the 'main' method.

By default, Scanner#next delimits input using whitespace. You can use nextLine to read an entire line from the Scanner without applying this delimiter pattern:
parse = sc.nextLine();
The points raised in the comments still apply:
while (!parse.equals("#stop") && !parse.isEmpty()) {
parse = sc.nextLine();
String[] line = parse.split("\\s");
System.out.println(Arrays.toString(line));
}

When you call next() on a Scanner, it returns the next token from the input. For instance, if the user types
foo bar
the first call to next() will return "foo" and the second call will return "bar".
Consider using nextLine() instead of next() if you want to get the entire line, "foo bar", as one string.
Also, you don't have to put the space in [], so instead of split("[ ]") you can use split(" "), or use the character class \s, which matches any whitespace character: split("\\s").
Another problem you seem to have is
while( !condition1 || !condition2 )
If you want the loop to continue until either of the conditions is fulfilled, then you should write it in the form
while( !(condition1 || condition2) )
or using De Morgan's laws
while( !condition1 && !condition2 )
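To make the difference between next() and nextLine() concrete, here is a minimal sketch that reads from a fixed string instead of System.in. The class name NextVersusNextLine, the method names, and the sample input are illustrative assumptions, not part of the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

class NextVersusNextLine {

    // Collects every token returned by next(); whitespace is the default delimiter.
    static List<String> readTokens(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNext()) {
                tokens.add(sc.next());
            }
        }
        return tokens;
    }

    // Reads whole lines until "#stop" or an empty line, splitting each line on a single space.
    static List<String[]> readLines(String input) {
        List<String[]> lines = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNextLine()) {
                String line = sc.nextLine();
                if (line.equals("#stop") || line.isEmpty()) {
                    break;
                }
                lines.add(line.split(" "));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String input = "foo bar\nbaz qux\n#stop\n";
        System.out.println(readTokens(input));
        for (String[] parts : readLines(input)) {
            System.out.println(Arrays.toString(parts));
        }
    }
}
```

With next(), the words arrive one at a time, so splitting a single token on a space can never yield more than one element. With nextLine(), each line arrives whole and split(" ") sees both words.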

If you want to split by a single whitespace, why don't you do it?
String[] line = parse.split(" ");

Related

Using Scanner hasNext() and hasNextLine() to retrieve 2 elements per line

In Java, if you were given a text file with two elements per line, how can we grab those elements separately?
For example say we have a 5 line text file with the following:
a morgan
b stewart
c david
d alfonso
e brittany
and let's say we want to store the single char in one variable and the name in another. How do we do this in Java?
I have implemented some code somewhat like this:
while(scanner.hasNextLine()){
for(int i = 0; i < 2; i++){
char character = scanner.hasNextChar(); // doesn't exist but idk how
String name = scanner.hasNext();
}
}
Basically, I have a while loop that reads the file line by line, and inside it a for loop that is supposed to store each of the two elements in a variable. I am just confused about how to extract each separate element in Java.
Since you're using scanner.hasNextLine() as your loop condition, you can read the whole line, split the String, and then collect the results as needed.
while (scanner.hasNextLine()){
String[] result = scanner.nextLine().split(" ");
char character = result[0].charAt(0);
String name = result[1];
}
You can split the line on the space character by using the String.split(String regex) method.
It will produce an array of two Strings.
If you use while(scanner.hasNextLine()){ as the loop condition, you should call String name = scanner.nextLine(); to retrieve the input.
The hasNext() method returns a boolean that indicates whether the scanner has another token in its input.
Checking both while(scanner.hasNextLine()){ and scanner.hasNext() is redundant.
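As a rough sketch of the line-by-line approach described above, the following reads from a String rather than a file so that it is self-contained. The class name TwoFieldsPerLine, the parse method, and the choice to skip blank or malformed lines are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

class TwoFieldsPerLine {

    // Maps the leading character of each line to the name that follows it.
    static Map<Character, String> parse(String text) {
        Map<Character, String> result = new LinkedHashMap<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine().trim();
                if (line.isEmpty()) {
                    continue; // ignore blank lines
                }
                // Limit of 2: everything after the first whitespace run stays in parts[1].
                String[] parts = line.split("\\s+", 2);
                if (parts.length < 2) {
                    continue; // assumption: silently skip lines without a second field
                }
                result.put(parts[0].charAt(0), parts[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a morgan\nb stewart\nc david\n";
        parse(text).forEach((c, name) -> System.out.println(c + " -> " + name));
    }
}
```

To read from a real file, new Scanner(new File(...)) could replace new Scanner(text); the loop itself would stay the same.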

How to use split function when input is new line?

The task is to split a string and report how many words it contains.
Scanner in = new Scanner(System.in);
String st = in.nextLine();
String[] tokens = st.split("[\\W]+");
When I gave an empty line as input and printed the number of tokens, I got one as the answer, but I want it to be zero. What should I do? Here the delimiters are all the symbols (non-word characters).
Short answer: To get the tokens in str (determined by whitespace separators), you can do the following:
String str = ... //some string
str = str.trim() + " "; //modify the string for the reasons described below
String[] tokens = str.split("\\s+");
Longer answer:
First of all, the argument to split() is the delimiter - in this case one or more whitespace characters, which is "\\s+".
If you look carefully at the Javadoc of String#split(String, int) (which is what String#split(String) calls), you will see why it behaves like this.
If the expression does not match any part of the input then the resulting array has just one element, namely this string.
This is why "".split("\\s+") would return an array with one empty string [""], so you need to append the space to avoid this. " ".split("\\s+") returns an empty array with 0 elements, as you want.
When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array.
This is why " a".split("\\s+") would return ["", "a"], so you need to trim() the string first to remove whitespace from the beginning.
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
Since String#split(String) calls String#split(String, int) with the limit argument of zero, you can add whitespace to the end of the string without changing the number of words (because trailing empty strings will be discarded).
UPDATE:
If the delimiter is "\\W+", it's slightly different because you can't use trim() for that:
String str = ...
str = str.replaceAll("^\\W+", "") + " ";
String[] tokens = str.split("\\W+");
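The split() behaviours quoted from the Javadoc can be checked directly. The sketch below also shows a variation on the answer's idea: instead of appending a space, it strips the leading delimiters and tests for an empty string explicitly. The class name WordCount and the method countWords are hypothetical names used only for this illustration, and the leading-empty-string behaviour shown assumes Java 8 or later.

```java
class WordCount {

    // Counts runs of word characters; every other character acts as a delimiter.
    static int countWords(String line) {
        // Remove leading delimiters so split() cannot produce an empty first element.
        String stripped = line.replaceAll("^\\W+", "");
        if (stripped.isEmpty()) {
            return 0; // nothing but delimiters, or an empty line
        }
        // Trailing empty strings are discarded because the implicit limit is zero.
        return stripped.split("\\W+").length;
    }

    public static void main(String[] args) {
        System.out.println("".split("\\s+").length);   // no match: one element, the empty string
        System.out.println(" ".split("\\s+").length);  // only trailing empties: zero elements
        System.out.println(" a".split("\\s+").length); // leading empty string is kept: two elements
        System.out.println(countWords(""));
        System.out.println(countWords("hello, world!"));
    }
}
```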
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String line = null;
while (!(line = in.nextLine()).isEmpty()) {
//logic
}
System.out.print("Empty Line");
}
output
Empty Line

How to get the string between double quotes in a string in Java [duplicate]

This question already has answers here:
Split string on spaces in Java, except if between quotes (i.e. treat \"hello world\" as one token) [duplicate]
(1 answer)
Java Regex for matching quoted string with escaped quotes
(1 answer)
Closed 8 years ago.
For example, input will be like:
AddItem rt456 4 12 BOOK “File Structures” “Addison-Wesley” “Michael Folk”
and I want to read it all using a Scanner and put it in an array,
like:
info[0] = rt456
info[1] = 4
..
..
info[4] = File Structures
info[5] = Addison-Wesley
So how can I get the string between quotes?
EDIT: a part of my code->
public static void main(String[] args) {
String command;
String[] line = new String[6];
Scanner read = new Scanner(System.in);
Library library = new Library();
command = read.next();
if(command.matches("AddItem"))
{
line[0] = read.next(); // Serial Number
line[1] = read.next(); // Shelf Number
line[2] = read.next(); // Shelf Index
command = read.next(); // Type of the item. "Book" - "CD" - "Magazine"
if(command.matches("BOOK"))
{
line[3] = read.next(); // Name
line[4] = read.next(); // Publisher
line[5] = read.next(); // Author
Book yeni = new Book(line[0],Integer.parseInt(line[1]),Integer.parseInt(line[2]),line[3],line[4],line[5]);
}
}
}
so I use read.next to read String without quotes.
SOLVED BY USING REGEX AS
read.next("([^\"]\\S*|\".+?\")\\s*");
You can use StreamTokenizer for this in a pinch. If operating on a String, wrap it with a StringReader. If operating on a file just pass your Reader to it.
// Replace “ and ” with " to make parsing easier; do this only if you truly are
// using pretty quotes (as you are in your post).
inputString = inputString.replaceAll("[“”]", "\"");
StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(inputString));
tokenizer.resetSyntax();
tokenizer.whitespaceChars(0, 32);
tokenizer.wordChars(33, 255);
tokenizer.quoteChar('\"');
while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
// tokenizer.sval will contain the token
System.out.println(tokenizer.sval);
}
You will have to use an appropriate configuration for non-ASCII text, the above is just an example.
If you want to pull numbers out separately, then the default StreamTokenizer configuration is fine, although it uses double and provides no int numeric tokens. Annoyingly, it is not possible to simply disable number parsing without resetting the syntax from scratch.
If you don't want to mess with all this, you could also consider changing the input format to something more convenient, as in Steve Sarcinella's good suggestion, if it is appropriate.
As a reference, take a look at this: Scanner Docs
How you read from the scanner is determined by how you will present the data to your user.
If they are typing it all on one line:
Scanner scanner = new Scanner(System.in);
String result = "";
System.out.println("Enter Data:");
result = scanner.nextLine();
Otherwise if you split it up into input fields you could do:
Scanner scanner = new Scanner(System.in);
System.out.println("Enter Identifier:");
info[0] = scanner.nextLine();
System.out.println("Enter Num:");
info[1] = scanner.nextLine();
...
If you want to validate anything before assigning the data to a variable, try using scanner.next(""); where the quotes contain a regex pattern to match
EDIT:
Check here for regex info.
As an example, say I have a string
String foo = "The cat in the hat";
Regex (regular expressions) can be used to manipulate this string quickly and concisely. If I take that string and do foo = foo.replaceAll("\\s+", "");, this will replace every run of whitespace with nothing, thereby eliminating the whitespace. (It has to be replaceAll; replace treats its first argument as literal text, not as a regex.)
Breaking down the argument \\s+, we have \s, which matches any whitespace character.
The extra \ before \s is an escape character, needed so that the Java string literal passes \s through to the regex engine.
The + means match the previous expression one or more times.
So foo, after running replaceAll, would be "Thecatinthehat".
The same regex logic applies to scanner.next(String regex);
Hopefully this helps a bit more, I'm not the best at explanation :)
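One detail worth checking when trying this out: String.replace treats its first argument as literal text, while String.replaceAll treats it as a regular expression. The following is a small sketch of the difference; the class and method names are made up for the example.

```java
class ReplaceVersusReplaceAll {

    // Regex-based: removes every run of whitespace.
    static String stripWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    // Literal: looks for the three characters backslash, 's', '+' and finds none in ordinary text.
    static String literalReplace(String s) {
        return s.replace("\\s+", "");
    }

    public static void main(String[] args) {
        String foo = "The cat in the hat";
        System.out.println(stripWhitespace(foo)); // Thecatinthehat
        System.out.println(literalReplace(foo));  // The cat in the hat
    }
}
```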
An alternative using a messy regular expression:
public static void main(String[] args) throws Exception {
Pattern p = Pattern.compile("^(\\w*)[\\s]+(\\w*)[\\s]+(\\w*)[\\s]+(\\w*)[\\s]+(\\w*)[\\s]+[“](.*)[”][\\s]+[“](.*)[”][\\s]+[“](.*)[”]");
Matcher m = p.matcher("AddItem rt456 4 12 BOOK “File Structures” “Addison-Wesley” “Michael Folk”");
if (m.find()) {
for (int i=1;i<=m.groupCount();i++) {
System.out.println(m.group(i));
}
}
}
That prints:
AddItem
rt456
4
12
BOOK
File Structures
Addison-Wesley
Michael Folk
I assumed the quotes are the curly quotes “ ” as you typed them in the question, and not straight quotes ", so they don't need to be escaped.
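If the number of fields is not fixed, a more general variant is to let a Matcher walk the line and pick up either a quoted run or a run of non-whitespace. This is only a sketch under a few assumptions: curly quotes are normalized to straight quotes first, quotes are not nested or escaped, and the names QuotedTokens and tokenize are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTokens {

    // Alternative 1: text between double quotes (captured in group 1).
    // Alternative 2: a run of non-whitespace characters (captured in group 2).
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    static List<String> tokenize(String input) {
        // Assumption: treat curly quotes like straight quotes.
        String normalized = input.replace('\u201C', '"').replace('\u201D', '"');
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(normalized);
        while (m.find()) {
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "AddItem rt456 4 12 BOOK \u201CFile Structures\u201D "
                + "\u201CAddison-Wesley\u201D \u201CMichael Folk\u201D";
        for (String token : tokenize(line)) {
            System.out.println(token);
        }
    }
}
```

A quote that appears in the middle of an unquoted word is not handled specially here; input like that would need a stricter pattern or a real tokenizer.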
You can try this. I have prepared the demo for your requirement
public static void main(String args[]) {
String str = "\"ABC DEF\"";
System.out.println(str);
String str1 = str.replaceAll("\"", "");
System.out.println(str1);
}
After reading, just replace the double quotes with an empty string.

Getting scanner to read text file

I am trying to use a scanner to read a text file pulled with JFileChooser. The wordCount is working correctly, so I know it is reading. However, I cannot get it to search for instances of the user inputted word.
public static void main(String[] args) throws FileNotFoundException {
String input = JOptionPane.showInputDialog("Enter a word");
JFileChooser fileChooser = new JFileChooser();
fileChooser.showOpenDialog(null);
File fileSelection = fileChooser.getSelectedFile();
int wordCount = 0;
int inputCount = 0;
Scanner s = new Scanner (fileSelection);
while (s.hasNext()) {
String word = s.next();
if (word.equals(input)) {
inputCount++;
}
wordCount++;
}
You'll have to look for
, ; . ! ? etc.
for each word. The next() method returns everything up to the next whitespace.
It will split "hi, how are you?" into the tokens "hi,", "how", "are", "you?".
You can use the method indexOf(String) to find these characters. You can also use replaceAll(String regex, String replacement) to replace characters. You can remove each character individually or use a single regex, although regexes are usually harder to understand.
// replace() treats its arguments literally, so each call removes one specific character.
// replaceAll(".", "") would delete every character, because an unescaped . is the regex wildcard.
word = word.replace(".", "");
word = word.replace(",", "");
word = word.replace("!", "");
//etc.
Read more about this method:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29
Here's a Regex example:
//NOTE: This example will not work for you. It's just a simple example for seeing a Regex.
//Removes whitespace between a word character and . or ,
String pattern = "(\\w)(\\s+)([\\.,])";
word = word.replaceAll(pattern, "$1$3");
Source:
http://www.vogella.com/articles/JavaRegularExpressions/article.html
Here is a good Regex example that may help you:
Regex for special characters in java
Parse and remove special characters in java regex
Remove all non-"word characters" from a String in Java, leaving accented characters?
If the text entered by the user may differ in case, you should try using equalsIgnoreCase().
In addition to blackpanthers answer, you should also use trim() to account for surrounding whitespace, because
"abc" is not equal to "abc "
You should take a look at matches().
equals will not help you, since next() doesn't return the file word by word,
but rather token by token, separated by whitespace only (not by comma, semicolon, etc.), as others mentioned.
Here is the Javadoc for String#matches(java.lang.String)
...and a little example.
input = ".*" + input + ".*";
...
boolean foundWord = word.matches(input);
. is the regex wildcard and stands for any character. .* stands for zero or more arbitrary characters. So you get a match if input occurs anywhere in word.
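One caveat with building a pattern from user input is that characters such as . or ? in the input are then interpreted as regex syntax. A hedged sketch that guards against this with Pattern.quote follows; the class WordSearch and the method countOccurrences are illustrative names, and the text is read from a String rather than from the chosen file.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

class WordSearch {

    // Counts whitespace-separated tokens that contain the input text anywhere.
    static int countOccurrences(String text, String input) {
        // Pattern.quote makes regex metacharacters in the user's input match literally.
        Pattern pattern = Pattern.compile(".*" + Pattern.quote(input) + ".*");
        int count = 0;
        try (Scanner s = new Scanner(text)) {
            while (s.hasNext()) {
                if (pattern.matcher(s.next()).matches()) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("hi, how are you? hi!", "hi"));
    }
}
```

Because this is a substring match, searching for "hi" would also count a token such as "this". Stripping punctuation and comparing with equals is the stricter alternative.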

Splitting strings based on a delimiter

I am trying to break apart a very simple collection of strings that come in the forms of
0|0
10|15
30|55
and so on. Essentially, numbers that are separated by pipes.
When I use Java's string split function with .split("|"), I get somewhat unpredictable results: whitespace in the first slot, and sometimes the number itself isn't where I thought it would be.
Can anybody please help and give me advice on how I can use a regexp to keep ONLY the integers?
I was asked to give the code trying to do the actual split. So allow me to do that in hopes to clarify further my problem :)
String temp = "0|0";
String splitString = temp.split("|");
results
\n
0
|
0
I am trying to get
0
0
only. Forever grateful for any help ahead of time :)
I still suggest using split(); by default it discards trailing empty strings. If you strip everything except pipes and digits from the string, you can then simply use split() to get what you want. Alternatively, you can pass multiple delimiters to split() in the form of a regex, and this should work:
String[] splited = yourString.split("[\\|\\s]+");
and the regex:
import java.util.regex.*;
Pattern pattern = Pattern.compile("\\d+(?=([\\|\\s\\r\\n]))");
Matcher matcher = pattern.matcher(yourString);
while (matcher.find()) {
System.out.println(matcher.group());
}
The pipe symbol is special in a regexp (it marks alternatives), you need to escape it. Depending on the java version you are using this could well explain your unpredictable results.
class t {
public static void main(String[] args)
{
String temp = "0|0";
String[] splitString = temp.split("\\|");
for (int i=0; i<splitString.length; i++)
System.out.println("splitString["+i+"] is " + splitString[i]);
}
}
outputs
splitString[0] is 0
splitString[1] is 0
Note that one backslash is the regexp escape character, but because a backslash is also the escape character in java source you need two of them to push the backslash into the regexp.
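For comparison, here is a small sketch that puts the unescaped and escaped forms side by side, plus Pattern.quote as a way to avoid writing the backslashes by hand. The class name PipeSplit and the helper splitOnPipe are assumptions for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplit {

    // Pattern.quote turns the pipe into a literal, so no manual escaping is needed.
    static String[] splitOnPipe(String s) {
        return s.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        // Unescaped: "|" is an alternation of two empty patterns, so it matches between
        // every character (the exact array also depends on the Java version).
        System.out.println(Arrays.toString("0|0".split("|")));
        // Escaped: the pipe is a literal delimiter.
        System.out.println(Arrays.toString("10|15".split("\\|")));
        System.out.println(Arrays.toString(splitOnPipe("30|55")));
    }
}
```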
You can replace the whitespace with pipes and then split on the pipe. The pipe has to be escaped, because split takes a regex.
String test = "0|0 10|15 30|55";
test = test.replace(" ", "|");
String[] result = test.split("\\|");
Hope this helps.
You can use StringTokenizer. Pass "|" as the delimiter, because the default delimiters are whitespace characters only.
String test = "0|0";
StringTokenizer st = new StringTokenizer(test, "|");
int firstNumber = Integer.parseInt(st.nextToken()); //will parse out the first number
int secondNumber = Integer.parseInt(st.nextToken()); //will parse out the second number
Of course, you can always put this inside a while loop if you have multiple strings.
Also, you need to import java.util.StringTokenizer (or java.util.*) for this to work.
The pipe ('|') is a special character in regular expressions. It needs to be "escaped" with a '\' character if you want to use it as a regular character, unfortunately '\' is a special character in Java so you need to do a kind of double escape maneuver e.g.
String temp = "0|0";
String[] splitStrings = temp.split("\\|");
The Guava library has a nice class, Splitter, which is a much more convenient alternative to String.split(). The advantages are that you can choose to split the string on specific characters (like '|'), on specific strings, or with regexps, and you can choose what to do with the resulting parts (trim them, throw away empty parts, etc.).
For example you can call
Iterable<String> parts = Splitter.on('|').trimResults().omitEmptyStrings().split("0|0");
This should work for you:
([0-9]+)
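To apply a pattern like this, one option is to loop over the matches with a Matcher and parse each one. The sketch below assumes every digit run fits in an int; the class name ExtractIntegers and the method extract are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractIntegers {

    private static final Pattern DIGITS = Pattern.compile("([0-9]+)");

    // Returns every run of digits in the input, in order, parsed as an int.
    static List<Integer> extract(String s) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = DIGITS.matcher(s);
        while (m.find()) {
            // Assumption: each run is short enough for Integer.parseInt.
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extract("0|0 10|15 30|55")); // [0, 0, 10, 15, 30, 55]
    }
}
```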
Consider a scenario in which we have read a line from a CSV or XLS file as a string and need to separate the columns into an array of strings based on a delimiter.
The code snippet below does this.
{ ...
....
String stringLine = new BufferedReader(new FileReader("your file")).readLine();
String[] splittedString = StringSplitToArray(stringLine,"\"");
...
....
}
public static String[] StringSplitToArray(String stringToSplit, String delimiter)
{
StringBuffer token = new StringBuffer();
Vector tokens = new Vector();
char[] chars = stringToSplit.toCharArray();
for (int i=0; i < chars.length; i++) {
if (delimiter.indexOf(chars[i]) != -1) {
if (token.length() > 0) {
tokens.addElement(token.toString());
token.setLength(0);
i++;
}
} else {
token.append(chars[i]);
}
}
if (token.length() > 0) {
tokens.addElement(token.toString());
}
// convert the vector into an array
String[] preparedArray = new String[tokens.size()];
for (int i=0; i < preparedArray.length; i++) {
preparedArray[i] = (String)tokens.elementAt(i);
}
return preparedArray;
}
The snippet above calls StringSplitToArray, which converts the line into a String array by splitting it on the delimiter passed to the method. The delimiter can be a comma (,) or a double quote (").
For more on this, follow this link : http://scrapillars.blogspot.in
