Java String Split with multiple delimiter using pipe '|' - java

I am trying to break a string b = "x+yi" into two integers x and y.
This is my original attempt.
Here I removed the trailing 'i' character with the substring method:
int integerPart = Integer.parseInt(b.split("\\+")[0]);
String imaginaryText = b.split("\\+")[1];
int imaginary = Integer.parseInt(imaginaryText.substring(0, imaginaryText.length() - 1));
But I found that the code below works the same way:
int x = Integer.parseInt(b.split("\\+|i")[0]);
int y = Integer.parseInt(b.split("\\+|i")[1]);
Is there something special about '|'? I looked through the documentation and many other questions, but I couldn't find the answer.

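The split() method takes a regular expression that controls the split. In a regular expression, '|' means alternation, so "\\+|i" matches either a "+" or an "i". You can also try
"[+i]". The square brackets mark a character class, in this case "+" and "i".
Either pattern splits "x+yi" into "x" and "y". The empty string that follows the final "i" is not returned, because split(String regex) discards trailing empty strings. Regular expressions also offer search and capture capabilities. Look at String.matches(String regex) and the java.util.regex package.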
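To make the behaviour concrete, here is a minimal sketch that splits a string of the form "x+yi" with both patterns and, as an alternative, captures the two parts with a regex. The class and method names are made up for illustration, and the capture pattern assumes non-negative integer parts.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ComplexSplitDemo {

    // Hypothetical helper: splits on either '+' or 'i' using alternation.
    static String[] splitWithPipe(String s) {
        return s.split("\\+|i");
    }

    // The same split expressed as a character class.
    static String[] splitWithClass(String s) {
        return s.split("[+i]");
    }

    // Alternative: capture both parts instead of splitting.
    // Assumes the input looks like "<digits>+<digits>i".
    static int[] parse(String s) {
        Matcher m = Pattern.compile("(\\d+)\\+(\\d+)i").matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not of the form x+yi: " + s);
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        String b = "3+4i";
        System.out.println(Arrays.toString(splitWithPipe(b)));  // [3, 4]
        System.out.println(Arrays.toString(splitWithClass(b))); // [3, 4]
        System.out.println(Arrays.toString(parse(b)));          // [3, 4]
    }
}
```

The capturing version rejects malformed input instead of silently returning fewer elements, which may be preferable when the string comes from a user.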

You can follow the link below to understand how delimiters work.
How do I use a delimiter in Java Scanner?
Another alternative
You can use the useDelimiter(String pattern) method of the Scanner class. In the example below, the slash ("/") is the delimiter used to tokenize the String passed to the Scanner constructor.
There are three tokens in the String "Anna Mills/Female/18": name, gender and age. The Scanner splits the String and prints the tokens to the console.
import java.util.Scanner;
/*
 * This is a Java example that shows how to use the useDelimiter(String pattern)
 * method of the Scanner class. We use the string "/" as the delimiter
 * for tokenizing the String passed to the Scanner constructor.
 */
public class ScannerUseDelimiterDemo {
    public static void main(String[] args) {
        // Initialize the Scanner object
        Scanner scan = new Scanner("Anna Mills/Female/18");
        // Set the string delimiter
        scan.useDelimiter("/");
        // Print the delimiter in use
        System.out.println("The delimiter used is " + scan.delimiter());
        // Print the tokenized Strings
        while (scan.hasNext()) {
            System.out.println(scan.next());
        }
        // Close the scanner
        scan.close();
    }
}

Related

How to input new line string using java-util-scanner in this situation?

I'm trying to create a simple String Calculator that allows an Add() method to handle new lines between numbers (instead of commas)-
The following input is ok: “1\n2,3” (will equal 6)
The following input is NOT ok: “1,\n”
How am I supposed to input string with nextline(\n) in it and split or tokenize on the basis of both the "\n" and ","?
I have given the piece of my Add() method below which is able to return the sum of the tokens-
public static int Add(String numbers1) {
int sum = 0;
StringTokenizer st = new StringTokenizer(numbers1, "\r?\n");
while(st.hasMoreTokens()) {
sum = sum + Integer.parseInt(st.nextToken());
}
return sum;
}
Simple idea: in your original input string, replace every '\n' with ',' (or the other way around), so that only one delimiter remains. Also note that StringTokenizer is a legacy class and treats its delimiter argument as a set of individual characters, not as a regular expression. String.split(...), which does take a regular expression, is recommended instead.
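As a rough sketch of the split-based approach, the method below sums numbers separated by commas or line breaks. The class name is made up, and passing -1 as the limit is a choice made here so that malformed input such as "1,\n" produces an empty token and fails in Integer.parseInt rather than being silently accepted.

```java
public class StringCalculator {

    // Sums integers separated by ',' or by a line break ("\n" or "\r\n").
    static int add(String numbers) {
        if (numbers.isEmpty()) {
            return 0;
        }
        int sum = 0;
        // The limit -1 keeps trailing empty strings, so "1,\n" yields
        // an empty token and parseInt throws NumberFormatException.
        for (String token : numbers.split(",|\\r?\\n", -1)) {
            sum += Integer.parseInt(token);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(add("1\n2,3")); // 6
    }
}
```

Without the -1 limit, split would drop the trailing empty strings and "1,\n" would be summed as 1.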

Passing multiple delimiters to StringTokenizer constructor

I have seen that the syntax for passing multiple delimiters (eg. '.' , '?', '!') to the StringTokenizer constructor is:
StringTokenizer obj=new StringTokenizer(str,".?!");
What I don't understand is this: I have enclosed all the delimiters together in double quotes, so doesn't that make them a single String rather than individual characters? How does the StringTokenizer class identify them as separate characters? Why is ".?!" not treated as a single delimiter?
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code.
So forget about it.
It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
So use String#split instead.
String[] elements = str.split("\\.\\?!"); // treats ".?!" as a single delimiter
String[] elements2 = str.split("[.?!]"); // three delimiters
If you miss StringTokenizer's Enumeration nature, get an Iterator.
Iterator<String> iterator = Arrays.asList(elements).iterator();
while (iterator.hasNext()) {
String next = iterator.next();
// ...
}
How does the StringTokenizer class identify them as separate characters?
It's an implementation detail and it shouldn't be your concern. There are a couple of ways to do that. They use String#charAt(int) and String#codePointAt(int).
Why is ".?!" not treated as a single delimiter?
That's the choice they've made: "We will take a String and we will be looking for delimiters there." The Javadoc makes it clear.
/**
 * ...
 *
 * @param str a string to be parsed.
 * @param delim the delimiters.
 * @param returnDelims flag indicating whether to return the delimiters
 * as tokens.
 * @exception NullPointerException if str is <CODE>null</CODE>
 */
public StringTokenizer(String str, String delim, boolean returnDelims) {
That's just how StringTokenizer is defined. Take a look at the Javadoc:
Constructs a string tokenizer for the specified string. All characters in the delim argument are the delimiters for separating tokens.
In the source code you will also find the delimiterCodePoints field, described as follows:
/**
* When hasSurrogates is true, delimiters are converted to code
* points and isDelimiter(int) is used to determine if the given
* codepoint is a delimiter.
*/
private int[] delimiterCodePoints;
So basically each delimiter character is converted to its int code point and stored in the array. The array is then used to decide whether a given character is a delimiter.
It's true that you pass a single string rather than individual characters, but what is done with that string is up to the StringTokenizer. The StringTokenizer takes each character from your delimiter string and uses each one as a delimiter. This way, you can split the string on multiple different delimiters without having to run the tokenizer more than once.
You can see the documentation for this function here where it states:
The characters in the delim argument are the delimiters for separating tokens.
If you don't pass anything in for this parameter, it defaults to " \t\n\r\f", which is basically just whitespace.
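A small sketch can make this visible: the helper below (hypothetical name) collects the tokens StringTokenizer produces, and the sample input shows that '.', '?' and '!' each end a token on their own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDelimsDemo {

    // Collects every token StringTokenizer yields for the given delimiter set.
    static List<String> tokens(String text, String delims) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Each character of ".?!" is a delimiter in its own right.
        System.out.println(tokens("Hi.How?Fine!Bye", ".?!")); // [Hi, How, Fine, Bye]
    }
}
```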
How does the StringTokenizer class identify them as separate characters?
String has two methods, charAt and codePointAt, which return the character or code point at an index:
"abc".charAt(0) // 'a'
The StringTokenizer implementation uses both of these methods on the delimiters at some point. In my version of the JDK, the code points of the delimiter string are extracted and added to an array, delimiterCodePoints, in a method called setMaxDelimCodePoint, which is called by the constructor:
private void setMaxDelimCodePoint() {
    // ...
    if (hasSurrogates) {
        delimiterCodePoints = new int[count];
        for (int i = 0, j = 0; i < count; i++, j += Character.charCount(c)) {
            c = delimiters.codePointAt(j); // <--- notice this line
            delimiterCodePoints[i] = c;
        }
    }
}
And then this array is accessed in the isDelimiter method, which decides whether a character is a delimiter:
private boolean isDelimiter(int codePoint) {
for (int i = 0; i < delimiterCodePoints.length; i++) {
if (delimiterCodePoints[i] == codePoint) {
return true;
}
}
return false;
}
Of course, this is not the only way that the API could be designed. The constructor could have accepted an array of char as delimiters instead, but I am not qualified to say why the designers did it this way.
Why is ".?!" not treated as a single delimiter?
StringTokenizer only supports single character delimiters. If you want a string as a delimiter, you can use Scanner or String.split instead. For both of these, the delimiter is represented as a regular expression, so you have to use "\\.\\?!" instead. You can learn more about regular expressions here
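For illustration, here is a minimal sketch of treating ".?!" as one multi-character delimiter with both String.split and Scanner. The class and method names are hypothetical. Instead of escaping each character by hand, it uses Pattern.quote, which turns an arbitrary string into a literal regex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class MultiCharDelimiterDemo {

    // Splits on the whole delimiter string, taken literally.
    static String[] viaSplit(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    // The same idea with Scanner, whose delimiter is also a regex.
    static List<String> viaScanner(String text, String delimiter) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter(Pattern.quote(delimiter));
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a.?!b.c.?!d";
        // The lone '.' inside "b.c" is not a delimiter here.
        System.out.println(Arrays.toString(viaSplit(text, ".?!"))); // [a, b.c, d]
        System.out.println(viaScanner(text, ".?!"));                // [a, b.c, d]
    }
}
```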

Special characters in a java string

I am trying to write a code to find the special characters in a java string.
Special characters are a-zA-Z.?#;'#~!£$%^&*()_+-=¬`,./<>
Please help me to understand and write how can I implement this.
Thank you
You can create a char array from a String.
String string = "test";
char[] charArray = string.toCharArray();
Then you can loop through all the characters:
for(char character: charArray){
//check if the character is special and do something with it, like adding it to a List.
}
You can use a Scanner to find the invalid characters in your String:
/* Regex character ranges for all characters considered non-special */
public static final String REGULAR_CHARACTERS = "0-9a-z";
public static String specialCharacters(String string) {
Scanner scanner = new Scanner(string);
String specialCharacters= scanner.findInLine("[^" + REGULAR_CHARACTERS + "]+");
scanner.close();
return specialCharacters;
}
findInLine returns the first run of consecutive characters in the current line that are not included in the constant (that is, the first group of special characters), or null if there is none. You need to set up the constant with all the characters that you consider non-special.
Alternatively, if you want to setup only the characters you want to find, you can modify the example above with:
public static final String SPECIAL_CHARACTERS = "a-zA-Z.?#;'#~!£$%^&*()_+\\-=¬`,./<>";
....
String specialCharacters= scanner.findInLine("[" + SPECIAL_CHARACTERS + "]+");
....
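The characters used in the constants need to be escaped as usual for regular expressions.
For example, to add the ] character, you need to use \\], and a literal - between two other characters must be written as \\- so that it is not read as a range.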
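If you want every special character rather than only the first run, a Matcher loop is one option. The sketch below is only an illustration: the class name is made up, and it assumes that anything other than an ASCII letter or digit counts as special, which you would adjust to your own definition.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpecialCharFinder {

    // Assumption for this sketch: any character that is not an
    // ASCII letter or digit is treated as "special".
    private static final Pattern SPECIAL = Pattern.compile("[^0-9a-zA-Z]");

    // Returns all special characters of text, in order of appearance.
    static String specialCharacters(String text) {
        StringBuilder found = new StringBuilder();
        Matcher matcher = SPECIAL.matcher(text);
        while (matcher.find()) {
            found.append(matcher.group());
        }
        return found.toString();
    }

    public static void main(String[] args) {
        System.out.println(specialCharacters("a.b?c!")); // .?!
    }
}
```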

scanner is not storing strings past a "space"

Firstly, I'm very much a beginner, but I like to think I mildly understand things.
I'm trying to write a method that will store the user's input into a string. It works just fine, except if the user puts in a space. Then the string stops storing.
public static String READSTRING() {
Scanner phrase = new Scanner(System.in);
String text = phrase.next();
return text;
}
I think the problem is that phrase.next() stops scanning once it detects a space, but I would like to store that space in the string and continue storing the phrase. Does this require some sort of loop to keep storing it?
Use .nextLine() instead of .next().
.nextLine() will take your input until a newline character has been found (when you press enter, a newline character is added). This essentially allows you to get one line of input.
From the Javadoc, this is what we have:
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
Either you can use phrase.nextLine() as suggested by others, or you can use Scanner#useDelimiter("\\n").
Try phrase.nextLine();. If I recall correctly, Scanner uses whitespace as its default delimiter.
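The difference is easy to see on a fixed string instead of System.in. This is a small sketch with made-up method names. It reads the same input once with next() and once with nextLine().

```java
import java.util.Scanner;

public class NextVsNextLineDemo {

    // next() stops at the first whitespace (the default delimiter).
    static String firstToken(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.next();
        }
    }

    // nextLine() returns everything up to the line separator.
    static String firstLine(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.nextLine();
        }
    }

    public static void main(String[] args) {
        String input = "hello big world\nsecond line";
        System.out.println(firstToken(input)); // hello
        System.out.println(firstLine(input));  // hello big world
    }
}
```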
Try
phrase.nextLine();
You can also pick a random word from a fixed array of words:
String[] words = {"word", "word2"};
Random random = new Random();
String word = words[random.nextInt(words.length)];
and you can convert an integer to a String:
int intValue = 6697;
String text = Integer.toString(intValue);

Getting scanner to read text file

I am trying to use a scanner to read a text file pulled with JFileChooser. The wordCount is working correctly, so I know it is reading. However, I cannot get it to search for instances of the user inputted word.
public static void main(String[] args) throws FileNotFoundException {
String input = JOptionPane.showInputDialog("Enter a word");
JFileChooser fileChooser = new JFileChooser();
fileChooser.showOpenDialog(null);
File fileSelection = fileChooser.getSelectedFile();
int wordCount = 0;
int inputCount = 0;
Scanner s = new Scanner (fileSelection);
while (s.hasNext()) {
String word = s.next();
if (word.equals(input)) {
inputCount++;
}
wordCount++;
}
You'll have to handle punctuation such as
, ; . ! ? etc.
for each word. The next() method returns everything up to the next whitespace.
It will split "hi, how are you?" into the tokens "hi,", "how", "are", "you?".
You can use the method indexOf(String) to find these characters. You can also use replaceAll(String regex, String replacement) to replace them. You can remove each character individually or use a single regex, although regexes are usually harder to understand.
//this removes each of these characters by replacing it with an empty string
word = word.replaceAll("\\.", ""); //an unescaped "." is a regex that matches every character
word = word.replaceAll(",", "");
word = word.replaceAll("!", "");
//etc.
Read more about this method:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29
Here's a Regex example:
//NOTE: This example will not work for you. It's just a simple example for seeing a Regex.
//Removes whitespace between a word character and . or ,
String pattern = "(\\w)(\\s+)([\\.,])";
word = word.replaceAll(pattern, "$1$3");
Source:
http://www.vogella.com/articles/JavaRegularExpressions/article.html
Here is a good Regex example that may help you:
Regex for special characters in java
Parse and remove special characters in java regex
Remove all non-"word characters" from a String in Java, leaving accented characters?
If the text entered by the user may differ in case, try using equalsIgnoreCase().
In addition to blackpanther's answer, you should also use trim() to account for whitespace, since
"abc" is not equal to "abc ".
You should take a look at matches().
equals() will not help you, since next() doesn't return the file word by word,
but rather token by token, separated by whitespace (not by comma, semicolon, etc.), as others mentioned.
Here is the Javadoc: String#matches(java.lang.String)
...and a little example.
input = ".*" + input + ".*";
...
boolean foundWord = word.matches(input);
. is the regex wildcard and stands for any character. .* stands for zero or more arbitrary characters. So you get a match if input occurs anywhere in word. If input may contain regex metacharacters, wrap it in Pattern.quote(input) before adding the wildcards.
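Putting the suggestions in this thread together, one possible sketch is to let the Scanner itself discard punctuation by changing its delimiter, and then compare tokens case-insensitively. The class and method names are hypothetical, and the sketch reads from a String rather than a File to stay self-contained. A Scanner built from a File accepts the same useDelimiter call.

```java
import java.util.Scanner;

public class WordCounter {

    // Counts tokens equal to target, ignoring case and surrounding punctuation.
    static int countOccurrences(String text, String target) {
        String wanted = target.trim();
        int count = 0;
        try (Scanner scanner = new Scanner(text)) {
            // Any run of characters that are not letters or digits separates words.
            scanner.useDelimiter("[^\\p{L}\\p{N}]+");
            while (scanner.hasNext()) {
                if (scanner.next().equalsIgnoreCase(wanted)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("Hi, how are you? hi!", "hi ")); // 2
    }
}
```

Unlike the matches(".*input.*") approach, this counts whole words only, so searching for "are" would not also count "care".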
