I have this string from a MySQL DB; it should be: 2100428169/2010
This is my code:
String str = rs.getString("str");
str = str.replaceAll("\\s+","");
str = str.trim();
char[] strCH = str.toCharArray();
and I get this:
[, 2, 1, 0, 0, 4, 2, 8, 1, 6, 9, /, 2, 0, 1, 0]
Why?
This is a problem because I need to call str1.equals(str), and it returns false. After
Object obj = (Object)str;
obj again holds the string with what looks like a space at the beginning, just as with toCharArray(), so equals() fails.
I finally found the solution:
The problem was the character with code 65279 (U+FEFF, the Unicode byte order mark, which is not an ASCII character); neither trim() nor \s matches it.
This helped: str = str.replace("\uFEFF", "");
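A minimal sketch of why the original code failed, assuming the value really does start with U+FEFF as described above. The class name BomDemo and the helper stripLeadingBom are hypothetical names used only for illustration.

```java
class BomDemo {
    // U+FEFF is the byte order mark, decimal 65279.
    static final char BOM = '\uFEFF';

    // Hypothetical helper: removes one leading BOM if present, otherwise returns s unchanged.
    static String stripLeadingBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == BOM) ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        String raw = BOM + "2100428169/2010";

        // trim() only strips characters up to U+0020 and \s does not match U+FEFF,
        // so the BOM survives both calls.
        String cleaned = raw.replaceAll("\\s+", "").trim();
        System.out.println(cleaned.length());        // 16
        System.out.println((int) cleaned.charAt(0)); // 65279

        // Removing the BOM explicitly makes equals() behave as expected.
        System.out.println(stripLeadingBom(cleaned).equals("2100428169/2010")); // true
    }
}
```

Unlike str.replace("\uFEFF", ""), this helper touches only the first character, which may be preferable if the mark can only appear at the start of the value.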
Neither replaceAll("\\s+", "") nor trim() removes every invisible character.
There are several characters these methods cannot remove. I have even seen source files containing characters that the Java compiler could not recognize, which leads to very confusing errors.
trim() strips only characters with code points up to U+0020 from both ends of the string, and replaceAll("\\s+", "") removes only what \s matches (space, tab, newline, vertical tab, form feed, carriage return), so a character such as U+FEFF survives both.
Instead, remove everything that is not a word character or the slash your value contains:
str = str.replaceAll("[^\\w/]+", "");
You don't need to call trim() after this.
Related
I want to split the area code and the number that follows it out of a telephone number, without the brackets, so I did this:
String pattern = "[\\(?=\\)]";
String b = "(079)25894029".trim();
String c[] = b.split(pattern,-1);
for (int a = 0; a < c.length; a++)
System.out.println("c[" + a + "]::->" + c[a] + "\nLength::->"+ c[a].length());
Output:
c[0]::-> Length::->0
c[1]::->079 Length::->3
c[2]::->25894029 Length::->8
Expected Output:
c[0]::->079 Length::->3
c[1]::->25894029 Length::->8
So my question is: why does split() produce an extra blank element at the start, e.g.
[, 079, 25894029]? Is this its normal behavior, or did I do something wrong here?
How can I get my expected outcome?
First, you have unnecessary escaping inside your character class. Your regex is the same as:
String pattern = "[(?=)]";
You are getting an empty first element because ( is the very first character in the string, and a split at position 0 leaves an empty string in front of the delimiter.
To avoid that result, use this code:
String str = "(079)25894029";
String[] toks = (Character.isDigit(str.charAt(0)) ? str : str.substring(1)).split("[(?=)]");
for (String tok: toks)
System.out.printf("<<%s>>%n", tok);
Output:
<<079>>
<<25894029>>
From the Java8 Oracle docs:
When there is a positive-width match at the beginning of this string
then an empty leading substring is included at the beginning of the
resulting array. A zero-width match at the beginning however never
produces such empty leading substring.
You can check whether the first element of the resulting array is an empty string and, if it is, drop that element.
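As a rough illustration of that check, the sketch below splits the phone number on the brackets and then discards the first element when it is empty. The class name LeadingEmptyDemo and the helper dropLeadingEmpty are hypothetical, and the pattern [()] is a simplified stand-in for the one in the question.

```java
import java.util.Arrays;

class LeadingEmptyDemo {
    // Hypothetical helper: returns the array without its first element when that element is "".
    static String[] dropLeadingEmpty(String[] parts) {
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        // "(" is a positive-width match at index 0, so split() reports an empty leading substring.
        String[] parts = "(079)25894029".split("[()]");
        System.out.println(Arrays.toString(parts));                   // [, 079, 25894029]
        System.out.println(Arrays.toString(dropLeadingEmpty(parts))); // [079, 25894029]
    }
}
```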
Your regex has problems, as does your approach - you can't solve it using your approach with any regex. The magic one-liner you seek is:
String[] c = b.replaceAll("^\\D+|\\D+$", "").split("\\D+");
This removes all leading/trailing non-digits, then splits on non-digits. This will handle many different formats and separators (try a few yourself).
See live demo of this:
String b = "(079)25894029".trim();
String[] c = b.replaceAll("^\\D+|\\D+$", "").split("\\D+");
System.out.println(Arrays.toString(c));
Producing this:
[079, 25894029]
I have a String variable that contains the values I need along with splitters. The problem is that both the length of the string and the type of splitter vary. They arrive through an XML file.
A string will look like this:
1+"."+20+"."+51+"."+2+"name.jpg"
but can also be:
1+"*"+20+"*"+51+"name.jpg"
The solid factors are:
the digits are id's which I need to retrieve.
the splitter values will be between "quotes".
the amount of id's is unknown, can be one, can be 200
the value used to split can be everything, but will always be between two quotes.
I was looking for a way to split the string on the "." but instead of the dot (.) give a wildcard, which can be 1 character or multiple.
Note: The value between the quotes can be anything! Doesn't even have to be a single character
Try to split by regular expression, i.e. like this:
String regex = "\\+?\"[^\"]*\"\\+?";
System.out.println(Arrays.toString( "1+\".\"+20+\".\"+51+\".\"+2+\"name.jpg\"".split( regex ) ));
System.out.println(Arrays.toString( "1+\"*\"+20+\"*\"+51+\"name.jpg\"".split( regex ) ));
Output:
[1, 20, 51, 2]
[1, 20, 51]
The regex matches any pair of double quotes with non-double-quote characters in between, preceded and followed by an optional plus. You could expand that to allow whitespace as well, e.g. "\\s*\\+?\\s*\"[^\"]*\"\\s*\\+?\\s*". The only thing that's not allowed in a splitter is a double quote.
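The whitespace-tolerant variant can be tried out with a small sketch like the following. The class name QuotedSplitterDemo and the method ids are hypothetical, and the spaced-out input is an assumed example rather than one from the question.

```java
import java.util.Arrays;

class QuotedSplitterDemo {
    // Optional whitespace and plus signs around a double-quoted splitter.
    static final String SPLITTER = "\\s*\\+?\\s*\"[^\"]*\"\\s*\\+?\\s*";

    // Hypothetical helper: returns the ids found between the quoted splitters.
    static String[] ids(String input) {
        return input.trim().split(SPLITTER);
    }

    public static void main(String[] args) {
        // Compact form, as in the question.
        System.out.println(Arrays.toString(ids("1+\".\"+20+\".\"+51+\".\"+2+\"name.jpg\"")));
        // [1, 20, 51, 2]

        // Same idea with spaces around the plus signs and different splitters.
        System.out.println(Arrays.toString(ids("1 + \".\" + 20 + \"*\" + 51 + \"name.jpg\"")));
        // [1, 20, 51]
    }
}
```

The quoted file name at the end is consumed as a final splitter, and split() drops the trailing empty string that results from it.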
If you need the name as well, you might try and define the potential splitters in the regex,
e.g. "(\\+?\"[\\.\\*]*\"\\+?)|\\+?\""
Note that in that case you'd have to account for the quotes around the name, i.e. to split 2+"name.jpg" you have to add the alternative \+?" (double quotes preceded by an optional plus).
Update:
Additional examples (input -> output)
5+".."+272+"..."+21+"splitter"+2+"name.jpg" --> [5, 272, 21, 2]
444+"()"+0+"abc"+51+"__"+2+"name.jpg" --> [444, 0, 51, 2]
1+"."+20+"."+51+"."+2+"name.jpg" --> [1, 20, 51, 2]
1+"*"+20+"*"+51+"name.jpg" --> [1, 20, 51]
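Hmm, can't you try something like this:
String oldStr = 1+"."+20+"."+51+"."+2+"name.jpg";
String newStr = oldStr.replace("name.jpg", ""); // or, with a regex: oldStr.replaceAll("[A-Za-z]+\\.\\w+$", "")
String[] array;
array = newStr.split("\\.");   // split() takes a regex, so the dot must be escaped
if (array.length <= 1) {       // no dot found: split() returned the whole string
array = newStr.split("\\*");   // an unescaped "*" would throw a PatternSyntaxException
}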
So, just to make sure I get it right, the possible filenames / string values are:
1.20.51.2name.jpg
1*20*51*name.jpg
Right?
So, more generally, you could say: an unknown number of digit groups, separated by non-digit characters?
You could run the regex \d+ against each string.
Applied globally, it gives you a list of every number. So for
1.20.51.2name.jpg
I got
1, 20, 51, 2
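In Java, a "global" match of \d+ could be sketched with a Matcher and a find() loop. The class name DigitRunsDemo and the method extract are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitRunsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: collects every run of consecutive digits, in order of appearance.
    static List<String> extract(String input) {
        List<String> runs = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(extract("1.20.51.2name.jpg")); // [1, 20, 51, 2]
        System.out.println(extract("1*20*51name.jpg"));   // [1, 20, 51]
    }
}
```

One caveat of this approach: digits inside the file name itself (for example "name2.jpg") would also be collected, so it only fits if the name never contains digits.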
Using this :
String x = 1+"."+20+"."+51+"."+2+"name.jpg";
String y = 1+"*"+20+"*"+51+"name.jpg";
System.out.println(Arrays.toString(x.split("\\.|\\*")));
System.out.println(Arrays.toString(y.split("\\.|\\*")));
Will give you the following output:
[1, 20, 51, 2name, jpg]
[1, 20, 51name, jpg]
I have this line of code: temp5.replaceAll("\\W", "");
The contents of temp5 at this point are: [1, 2, 3, 4] but the regex doesn't remove anything. And when I do a toCharArray() I end up with this: [[, 1, ,, , 2, ,, , 3, ,, , 4, ]]
Am I not using the regex correctly? I was under the impression that \W should remove all punctuation and white space.
Note: temp5 is a String
And I just tested using \w, \W, and various others. Same output for all of them
Strings are immutable. replaceAll() returns the string with the changes made, it does not modify temp5. So you might do something like this instead:
temp5 = temp5.replaceAll("\\W", "");
After that, temp5 will be "1234".
String temp5="1, 2, 3, 4";
temp5=temp5.replaceAll("\\W", "");
System.out.println(temp5.toCharArray());
This prints 1234, because the result of replaceAll() is assigned back to temp5.
I have a string. I split it and store the pieces in an array. When I try to convert them with Integer.parseInt, I get an error. How do I convert a String to an int?
Here's my code:
String tab[]= null;
tab=pesel.split("");
int temp=0;
for(String tab1 : tab)
{
temp=temp+3*Integer.parseInt(tab1); //error
}
Assuming you have a string of digits (e.g. "123"), you can use toCharArray() instead:
for (char c : pesel.toCharArray()) {
temp += 3 * (c - '0');
}
This avoids Integer.parseInt() altogether. If you want to ensure that each character is a digit, you can use Character.isDigit().
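A sketch of the same loop with validation added. It uses Character.digit(c, 10), which returns -1 for anything that is not a decimal digit, as a swapped-in alternative to Character.isDigit() followed by c - '0'. The class name DigitSumDemo and the method tripledDigitSum are hypothetical.

```java
class DigitSumDemo {
    // Hypothetical helper: adds up 3 * digit for every character, rejecting non-digits.
    static int tripledDigitSum(String s) {
        int total = 0;
        for (char c : s.toCharArray()) {
            int digit = Character.digit(c, 10); // -1 when c is not a decimal digit
            if (digit < 0) {
                throw new IllegalArgumentException("Not a digit: '" + c + "'");
            }
            total += 3 * digit;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(tripledDigitSum("123")); // 18
    }
}
```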
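Your error occurs because, before Java 8, str.split("") returns an array with a leading empty string, and Integer.parseInt("") throws a NumberFormatException:
System.out.println(Arrays.toString("123".split("")));
[, 1, 2, 3]
A negative lookahead avoids that (from Java 8 on, split("") already omits the leading empty string, because a zero-width match at the beginning no longer produces one):
System.out.println(Arrays.toString("123".split("(?!^)")));
[1, 2, 3]
In any case, I would prefer the toCharArray() approach shown above.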
If the digits in pesel are separated by spaces, you are missing a space in the split() argument:
tab=pesel.split(" ");
I have a string:
strArray= "-------9---------------";
I want to find 9 from the string. The string may be like this:
strArray= "---4-5-5-7-9---------------";
Now I want to extract only the digits from the string. I need the values 9, 4 and so on, and want to ignore the '-'. I tried the following:
strArray= strignId.split("-");
but I get an error, since there are multiple '-', and I don't get my output. Which Java function should I use?
My input and output should be as follows:
input="-------9---------------";
output="9";
input="---4-5-5-7-9---------------";
output="45579";
What should I do?
The + is a regex metacharacter of "one-or-more" repetition, so the pattern -+ is "one or more dash". This would allow you to use str.split("-+") instead, but you may get an empty string as first element.
If you just want to remove all -, then you can do str = str.replace("-", ""). This uses replace(CharSequence, CharSequence) method, which performs literal String replacement, i.e. not regex patterns.
If you want a String[] with each digit in its own element, then it's easiest to do in two steps: first remove all non-digits, then use zero-length assertion to split everywhere that's not the beginning of the string (?!^) (to prevent getting an empty string as a first element). If you want a char[], then you can just call String.toCharArray()
Lastly, if the string can be very long, it's better to use a java.util.regex.Matcher in a find() loop looking for a digit \d, or a java.util.Scanner with a delimiter \D*, i.e. a sequence (possibly empty) of non-digits. This will not give you an array, but you can use the loop to populate a List (see Effective Java 2nd Edition, Item 25: Prefer lists to arrays).
References
regular-expressions.info/Repetition with Star and Plus, Character Class, Lookaround
Snippets
Here are some examples to illustrate the above ideas:
System.out.println(java.util.Arrays.toString(
"---4--5-67--8-9---".split("-+")
));
// [, 4, 5, 67, 8, 9]
// note the empty string as first element
System.out.println(
"---4--5-67--8-9---".replace("-", "")
);
// 456789
System.out.println(java.util.Arrays.toString(
"abcdefg".toCharArray()
));
// [a, b, c, d, e, f, g]
The next example first deletes all non-digits \D, then splits everywhere except at the beginning of the string (?!^), to get a String[] in which each element contains one digit:
System.out.println(java.util.Arrays.toString(
"#*#^$4#!#5ajs67>?<{8_(9SKJDH"
.replaceAll("\\D", "")
.split("(?!^)")
));
// [4, 5, 6, 7, 8, 9]
This uses a Scanner, with \D* as delimiter, to get each digit as its own token, using it to populate a List<String>:
List<String> digits = new ArrayList<String>();
String text = "(&*!##123ask45{P:L6";
Scanner sc = new Scanner(text).useDelimiter("\\D*");
while (sc.hasNext()) {
digits.add(sc.next());
}
System.out.println(digits);
// [1, 2, 3, 4, 5, 6]
Common problems with split()
Here are some common beginner problems when dealing with String.split:
Lesson #1: split takes a regular expression pattern
This is probably the most common beginner mistake:
System.out.println(java.util.Arrays.toString(
"one|two|three".split("|")
));
// [, o, n, e, |, t, w, o, |, t, h, r, e, e] (Java 8 and later omit the leading empty string)
System.out.println(java.util.Arrays.toString(
"not.like.this".split(".")
));
// []
The problem here is that | and . are regex metacharacters, and since they are intended to be matched literally, they need to be escaped by preceding with a backslash, which as a Java string literal is "\\".
System.out.println(java.util.Arrays.toString(
"one|two|three".split("\\|")
));
// [one, two, three]
System.out.println(java.util.Arrays.toString(
"not.like.this".split("\\.")
));
// [not, like, this]
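When the delimiter is not a fixed literal in the source (for example, it arrives at runtime), an alternative to escaping by hand is Pattern.quote, which wraps a string so the regex engine treats every character literally. This is a sketch; the class name LiteralSplitDemo and the method splitLiteral are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralSplitDemo {
    // Hypothetical helper: splits on the delimiter taken literally, not as a regex.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("one|two|three", "|"))); // [one, two, three]
        System.out.println(Arrays.toString(splitLiteral("not.like.this", "."))); // [not, like, this]
    }
}
```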
Lesson #2: split discards trailing empty strings by default
Sometimes it's desired to keep trailing empty strings (which are discarded by default split):
System.out.println(java.util.Arrays.toString(
"a;b;;d;;;g;;".split(";")
));
// [a, b, , d, , , g]
Note that there are slots for the "missing" values for c, e, f, but not for h and i. To fix this, you can use a negative limit argument to String.split(String regex, int limit).
System.out.println(java.util.Arrays.toString(
"a;b;;d;;;g;;".split(";", -1)
));
// [a, b, , d, , , g, , ]
You can also use a positive limit of n to apply the pattern at most n - 1 times (i.e. resulting in no more than n elements in the array).
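A small sketch of the positive-limit case, using an assumed key=value input that is not from the original text. With a limit of 2 the pattern is applied at most once, so everything after the first "=" stays in the second element. The class name LimitDemo and the method keyValue are hypothetical.

```java
import java.util.Arrays;

class LimitDemo {
    // Hypothetical helper: splits at the first "=" only, yielding at most two elements.
    static String[] keyValue(String line) {
        return line.split("=", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(keyValue("key=value=with=equals")));
        // [key, value=with=equals]
    }
}
```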
Zero-width matching split examples
Here are more examples of splitting on zero-width matching constructs; this can be used to split a string but also keep "delimiters".
Simple sentence splitting, keeping punctuation marks:
String str = "Really?Wow!This.Is.Awesome!";
System.out.println(java.util.Arrays.toString(
str.split("(?<=[.!?])")
)); // prints "[Really?, Wow!, This., Is., Awesome!]"
Splitting a long string into fixed-length parts, using \G
String str = "012345678901234567890";
System.out.println(java.util.Arrays.toString(
str.split("(?<=\\G.{4})")
)); // prints "[0123, 4567, 8901, 2345, 6789, 0]"
Split before capital letters (except the first!)
System.out.println(java.util.Arrays.toString(
"OhMyGod".split("(?=(?!^)[A-Z])")
)); // prints "[Oh, My, God]"
A variety of examples is provided in related questions below.
References
regular-expressions.info/Lookarounds
Related questions
Can you use zero-width matching regex in String split?
"abc<def>ghi<x><x>" -> "abc", "<def>", "ghi", "<x>", "<x>"
How do I convert CamelCase into human-readable names in Java?
"AnXMLAndXSLT2.0Tool" -> "An XML And XSLT 2.0 Tool"
C# version: is there a elegant way to parse a word and add spaces before capital letters
Java split is eating my characters
Is there a way to split strings with String.split() and include the delimiters?
Regex split string but keep separators
Don't use split() here!
split() is for getting the things BETWEEN the separators.
Here you want to eliminate the unwanted characters, the '-'.
The solution is simple:
out=in.replaceAll("-","");
Use something like this to split out the individual values. I'd rather eliminate the unwanted characters first, to avoid getting empty strings in the result array.
import java.util.Vector;

// Assumed signature: the snippet started mid-method, so the header here is a guess.
public static String[] split(String original, String separator) {
    final Vector<String> nodes = new Vector<String>();
    // Cut off everything before each separator and collect it.
    int index = original.indexOf(separator);
    while (index >= 0) {
        nodes.addElement(original.substring(0, index));
        original = original.substring(index + separator.length());
        index = original.indexOf(separator);
    }
    // Whatever remains after the last separator is the final element.
    nodes.addElement(original);
    final String[] result = new String[nodes.size()];
    for (int loop = 0; loop < nodes.size(); loop++) {
        result[loop] = nodes.elementAt(loop);
    }
    return result;
}