Regex does not detect whitespace - java

I copied a subscriber count from a YouTube channel.
It's "4 372 236".
I'm testing the "\s+" regex on https://regex101.com against that number and it does not match. When I type the same number myself, the regex does match. Does anybody know what's wrong?
I'm trying to remove the whitespace characters from such numbers but I cannot. I also tried the .replaceAll(" ", "") method, but that does not work either.
screen from regex101.com
The JSON Youtube code:
JSON Youtube
Then I'm using a JSON library to get the subscriber count like this:
JSONObject jsonObject = new JSONObject(content);
JSONArray tabs = jsonObject.getJSONObject("contents")
.getJSONObject("twoColumnBrowseResultsRenderer")
.getJSONArray("tabs");
JSONObject tabRenderer = tabs.getJSONObject(5).getJSONObject("tabRenderer");
JSONObject sectionListRenderer = tabRenderer.getJSONObject("content").getJSONObject("sectionListRenderer");
JSONArray contents2 = sectionListRenderer.getJSONArray("contents");
JSONObject itemSectionRenderer = contents2.getJSONObject(0).getJSONObject("itemSectionRenderer").getJSONArray("contents").getJSONObject(0);
JSONObject channelAboutFullMetadataRenderer = itemSectionRenderer.getJSONObject("channelAboutFullMetadataRenderer");
String subs = channelAboutFullMetadataRenderer.getJSONObject("subscriberCountText").getJSONArray("runs").getJSONObject(0).getString("text");
And finally, I'm using the regex to delete the whitespace from the number:
subs = subs.replaceAll("\\s+", "");
System.out.println(subs);
I tried this too, but it does not work either. I think it's not a regular space, but I don't know how to identify it.
subs = subs.replaceAll(" ", "");

Okay guys, I figured it out.
It was not a duplicate of "Why does String.replace not work?". I had kept in mind that strings in Java are immutable.
The characters between the digits are not ordinary spaces. They are NO-BREAK SPACE (U+00A0) characters, which Java's \s does not match by default.
So the regex should look like this (it also covers NARROW NO-BREAK SPACE, U+202F):
subs = subs.replaceAll("[\\u202F\\u00A0]", "");
Maybe it will help somebody in the future :) Thanks #metters
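A minimal sketch of how one might confirm which characters are really in the string and strip them in one pass. The class and method names (InvisibleSpaces, describe, stripSpaces) are made up for illustration, and the idea of using \p{Z} instead of listing individual code points is a swapped-in alternative to the pattern above, not what the original poster used.

```java
import java.util.stream.Collectors;

final class InvisibleSpaces {

    // Lists every code point as U+XXXX so "invisible" characters become visible.
    static String describe(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    // \p{Z} matches any Unicode separator (space, NO-BREAK SPACE, NARROW NO-BREAK SPACE, ...);
    // \s adds tabs and line breaks, which are not in category Z.
    static String stripSpaces(String s) {
        return s.replaceAll("[\\p{Z}\\s]+", "");
    }
}
```

Calling InvisibleSpaces.describe(subs) on the copied value should show U+00A0 (or U+202F) between the digit groups, and InvisibleSpaces.stripSpaces(subs) leaves only the digits, ready for Long.parseLong.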

I suggest you copy that number into Notepad++ and use the "show all symbols" option.
Maybe there is more than plain whitespace in between.
EDIT: Sorry for not using the comment function; I need reputation for that and it sucks.

You need to escape the backslash:
System.out.println("4 372 236".replaceAll("\\s+", ""));
prints: 4372236

Your question doesn't really explain what you are trying to accomplish, and you have not provided any code other than a method call to go on. It really depends on what your end goal is.
Generally, what you are trying to do can be accomplished easily through the replaceAll method as mentioned.
String test = "4 372 236";
String reg = "\\s+";
String newLine = test.replaceAll(reg, "");
or simply
String test = "4 372 236";
String newLine = test.replaceAll(" ", "");

Related

How to separate 2 pieces of concatenated JSONObject in 1 string? in Java (Android)

Yes, as the question title says. I receive a single string in Android that consists of 2 JSON objects. I have to run 2 different processes in PHP, but both results are returned (echoed) in a single string, and I don't know how to separate them. I'm using:
JSONObject json = new JSONObject(result);
String success = json.optString("success");
// "success" here shows empty in logcat. I think it doesn't get the second json object
Example of Result string:
{"username":"xx","activated":"0"}{"multicast_id":xxx,"success":0,"failure":0,"canonical_ids":0,"results":[{"message_id":"xxx"}]}
How can I do this?
(By the way, the 2nd string is an example Firebase Cloud Messaging result. EDIT: I'm using PHP to send FCM, which is why the result is returned together with my other string result even though I do not echo it.)
Use String.split() with a proper regex.
Here is the working code:
String response = "{\"username\":\"xx\",\"activated\":\"0\"}{\"multicast_id\":xxx,\"success\":0,\"failure\":0,\"canonical_ids\":0,\"results\":[{\"message_id\":\"xxx\"}]}";
String[] separated = response.split("\\}\\{");
String str1 = separated[0] + "}";
String str2 = "{" + separated[1];
Log.d("STRING", "String1: " + str1 + "\nString2: " + str2);
OUTPUT:
D/STRING: String1: {"username":"xx","activated":"0"}
String2: {"multicast_id":xxx,"success":0,"failure":0,"canonical_ids":0,"results":[{"message_id":"xxx"}]}
Hope this will help~
Use String.split() with proper regex.
AND
Use this website to reveal the mystery of regexp
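Splitting on "}{" works for the sample above but would also split inside a string value that happens to contain "}{". As a hedged alternative, here is a small sketch that separates back-to-back objects by tracking brace depth and ignoring braces inside string literals. JsonSplitter and splitObjects are hypothetical names, and the code does not validate the JSON; it only finds the object boundaries.

```java
import java.util.ArrayList;
import java.util.List;

final class JsonSplitter {

    // Splits a string of concatenated JSON objects into one string per
    // top-level object. Braces inside string literals are skipped.
    static List<String> splitObjects(String input) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int start = -1;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;          // character after a backslash is taken literally
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;         // closing quote
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;                // a new top-level object begins here
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    parts.add(input.substring(start, i + 1));
                }
            }
        }
        return parts;
    }
}
```

Each element of the returned list can then be passed to new JSONObject(...) separately, so json.optString("success") would be read from the second object.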

Newline character in bing translation text

This is a very basic question.
I am using the Bing translate API method: Translate.execute(String to be translated,Target Language)
When there is no newline character in the source text, it all works fine. E.g.
String str = "I have seen some app. Educational and fun.";
But if my source text has multiple lines and looks like the following, how do I create a String variable for it?
I have seen some app.
Educational and fun.
I don't want to add \n or \r characters inside my string because the Bing API will try to translate these characters too.
Can you instead translate each sentence or line at a time and combine them after the fact?
String str1 = "I have seen some app.";
String str2 = "Educational and fun.";
String result = Translate.execute(str1) + "\n" + Translate.execute(str2);
Or translate it all at once and add the newline characters after you get the translation back? Maybe something like this (may be too simplistic):
String str = "I have seen some app. Educational and fun.";
String result = Translate.execute(str);
// replace() takes a literal string; in replaceAll(".", ...) the "." is a regex that matches every character
result = result.replace(". ", ".\n");
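The line-by-line idea can be sketched for any number of lines as follows. This is only an illustration: LineTranslator and translateLines are hypothetical names, and the translator is passed in as a plain function so the sketch does not depend on the exact signature of Translate.execute.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

final class LineTranslator {

    // Translates each line on its own and re-joins the results with "\n",
    // so the translator never sees the line-break characters themselves.
    static String translateLines(String text, UnaryOperator<String> translator) {
        return Arrays.stream(text.split("\\R", -1))   // \R matches any line break, including \r\n
                .map(line -> line.isEmpty() ? line : translator.apply(line))
                .collect(Collectors.joining("\n"));
    }
}
```

Usage might look like LineTranslator.translateLines(str, line -> Translate.execute(line, targetLanguage)), assuming the two-argument form from the question; if that method declares a checked exception, the lambda would need a try/catch around the call.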

Get URL from string with text

I have a bunch of strings like this:
Some text, bla-bla http://www.easypolls.net/poll.html?p=51e5a300e4b084575d8568bb#.UeWjBcCzaaA.twitter
And I need to split this String in two:
Some text, bla-bla
http://www.easypolls.net/poll.html?p=51e5a300e4b084575d8568bb#.UeWjBcCzaaA.twitter
I need to separate them, but of course it's enough to extract only the URL.
Can you help me? How can I extract the URL from a string like this?
By using split:
String str = "Some text, bla-bla http://www.easypolls.net/poll.html?p=51e5a300e4b084575d8568bb#.UeWjBcCzaaA.twitter";
String [] ar = str.split("(?=http://)", 2); // zero-width lookahead: split just before "http://" and keep it in the second part
System.out.println(ar[0]);
System.out.println(ar[1]);
This depends on how robust you want your parser to be. If you can reasonably expect every url to start with http://, then you can use
string.indexOf("http://");
This returns the index of the first character of the string you pass in (and -1 if the string does not appear).
Full code to return a substring with just the URL:
string.substring(string.indexOf("http://"));
Here's the documentation for Java's String class. Let this become your friend in programming! http://docs.oracle.com/javase/7/docs/api/java/lang/String.html
Try something like this:
String string = "sometext http://www.something.com";
String url = string.substring(string.indexOf("http"), string.length());
System.out.println(url);
or use split.
I know in PHP you'd be able to run the explode() (http://www.php.net/manual/en/function.explode.php) function. You'd choose which character you want to explode at. For instance, you could explode at "http://"
So running the code via PHP would look like:
$string = "Some text, bla-bla http://www.easypolls.net/poll.html?p=51e5a300e4b084575d8568bb#.UeWjBcCzaaA.twitter";
$pieces = explode("http://", $string);
echo $pieces[0]; // Would print "Some text, bla-bla"
echo $pieces[1]; // Would print "www.easypolls.net/poll.html?p=51e5a300e4b084575d8568bb#.UeWjBcCzaaA.twitter"
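Back in Java, the indexOf and split approaches assume the URL runs to the end of the string. A hedged sketch using java.util.regex instead, which stops at the next whitespace and also accepts https: UrlExtractor and its method names are invented for this example, and the pattern is deliberately simple rather than a full URL grammar.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlExtractor {

    // "http://" or "https://" followed by everything up to the next whitespace.
    private static final Pattern URL = Pattern.compile("https?://\\S+");

    // Returns the first URL in the text, or an empty Optional if there is none.
    static Optional<String> firstUrl(String text) {
        Matcher m = URL.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Returns the trimmed text before the first URL, or the whole trimmed text if there is no URL.
    static String textBeforeUrl(String text) {
        Matcher m = URL.matcher(text);
        return m.find() ? text.substring(0, m.start()).trim() : text.trim();
    }
}
```

Returning an Optional avoids the StringIndexOutOfBoundsException that substring(indexOf(...)) throws when indexOf returns -1.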

parsing a string using string tokenizer twice

I am getting an input string like the one below from some procedure:
service:jmx:t3://10.20.30.40:9031/jndi/weblogic.management.mbeanservers.runtime
I want to parse it in Java and get
t3
10.20.30.40
9031
as separate strings.
I think I can use a string tokenizer, but would I have to tokenize twice? Is there a better way to handle this?
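Use the JMXServiceURL class (in javax.management.remote). It will parse the URL for you. No need to battle with regex or String splits. Note that the constructor throws MalformedURLException if the string cannot be parsed.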
String url = "service:jmx:t3://10.20.30.40:9031/jndi/weblogic.management.mbeanservers.runtime";
JMXServiceURL jmxServiceURL = new JMXServiceURL(url);
System.out.println(jmxServiceURL.getHost());
System.out.println(jmxServiceURL.getPort());
System.out.println(jmxServiceURL.getProtocol());
Prints
10.20.30.40
9031
t3
If it's only a somehow composed String and you can ignore performance, I would prefer a readable solution (more readable than regex ;-)) like this:
int pos_1 = input.indexOf("//");
String s1 = input.substring(0, pos_1);
String input_2 = input.substring(pos_1 + 2);
int pos_2 = input_2.indexOf(":");
String s2 = input_2.substring(0, pos_2);
...
Regex is a good approach. You should find the pattern for your string and group with parentheses what you want. Maybe this could be enough for you:
service\\:jmx\\:(?<groupName01>[a-z0-9]+)\\://(?<groupName02>[0-9\\.]+)\\:(?<groupName03>[0-9]+)
See Java Regex
If you use a Java version earlier than 7, do not use ?<groupName> in the pattern; the groups are then referenced by number.
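A minimal sketch of the named-group approach as compilable code. The class name JmxUrlParts and the group names protocol, host and port are chosen here for readability and are not from the answer; the host part only accepts dotted numeric addresses, as in the example input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JmxUrlParts {

    // Protocol, host and port are captured as named groups (requires Java 7+).
    private static final Pattern JMX = Pattern.compile(
            "service:jmx:(?<protocol>[a-z0-9]+)://(?<host>[0-9.]+):(?<port>[0-9]+)");

    // Returns {protocol, host, port}; throws if the input does not match the pattern.
    static String[] parse(String url) {
        Matcher m = JMX.matcher(url);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a recognised JMX service URL: " + url);
        }
        return new String[] { m.group("protocol"), m.group("host"), m.group("port") };
    }
}
```

For the input from the question, parse(...) yields the three values "t3", "10.20.30.40" and "9031" in that order.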
Do a simple string split
String s = "service:jmx:t3://10.20.30.40:9031/jndi/weblogic.management.mbeanservers.runtime";
String tokens[] = s.split("[:/]");
System.out.println(tokens[2]);
System.out.println(tokens[5]);
System.out.println(tokens[6]);

parse string to remove spaces java

I need some help parsing an input string into one that is cleaned up and then output.
e.g.
String str = " tHis strIng is rEalLy mEssy ";
and what I need to do is have it parsed from that to look like this:
"ThisStringIsReallyMessy"
So I basically need to clean it up and then capitalize only the first letter of every word, without it breaking in case someone uses numbers.
Apache Commons to the rescue (again). As always, it's worth checking out the Commons libraries not just for this particular issue, but for a lot of functionality.
You can use Apache Commons WordUtils.capitalizeFully() to capitalise the first letter of each word and lowercase the rest (plain capitalize() leaves the remaining letters untouched). Then a replaceAll(" ", "") will bin your whitespace.
String result = WordUtils.capitalizeFully(str).replaceAll(" ", "");
Note (other) Brian's comments below re. the choices behind replace() vs replaceAll().
String[] tokens = " tHis strIng is rEalLy mEssy ".split(" ");
StringBuilder result = new StringBuilder();
for(String token : tokens) {
if(!token.isEmpty()) {
result.append(token.substring(0, 1).toUpperCase()).append(token.substring(1).toLowerCase());
}
}
System.out.println(result.toString()); // ThisStringIsReallyMessy
String str = " tHis strIng is rEalLy mEssy ";
str = str.replace(" ", "");
System.out.println(str);
output:
tHisstrIngisrEalLymEssy
For capitalizing the first letter of each word there is no built-in function available; this thread has possible solutions.
Do you mean this?
str = str.replaceAll(" ", "");
