Need some help getting some stuff off a string - java

I want to get some info out of my string, but there are two possible "expressions" for the string. I want to get "a" and "b" out of the string. This is how they look:
Format one:
http://default.com/default/a/b
Format two:
http://default.com/#!default|1|a|b|1
How can I do this?

If the strings always look like this, you could do the following:
Search for the '#' character to decide whether you have format one or format two.
For format one, split on the delimiter '/' and take the last two parts. For format two, also split on '/' first, then split the last part again on the delimiter '|' and take results[2] and results[3].
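The steps above can be sketched roughly as follows. The class and method names (UrlParts, extract) are made up for illustration, and the sketch assumes the input always has one of the two shapes from the question, so it does no validation.

```java
import java.util.Arrays;

public class UrlParts {
    // Returns the two values ("a" and "b") from either URL format.
    // Assumes the input really has one of the two shapes from the question.
    public static String[] extract(String url) {
        String[] segments = url.split("/");
        if (url.contains("#")) {
            // Format two: http://default.com/#!default|1|a|b|1
            String last = segments[segments.length - 1];
            // '|' is a regex metacharacter, so it must be escaped for split()
            String[] fields = last.split("\\|");
            return new String[] { fields[2], fields[3] };
        }
        // Format one: http://default.com/default/a/b
        int n = segments.length;
        return new String[] { segments[n - 2], segments[n - 1] };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(extract("http://default.com/default/a/b")));
        System.out.println(Arrays.toString(extract("http://default.com/#!default|1|a|b|1")));
    }
}
```

Both calls in main should print [a, b].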

Use a regex to split the string.
Split on "default"
Regex Split

There are many ways to do this; regular expressions are the most common.
In pseudocode:
if the string contains "/#!default" then:
Use the regular expression ^.*\|([^|]+)\|([^|]+)\|1$
otherwise, if the string contains "/default" then:
Use the regular expression ^.*/([^/]+)/([^/]+)$
Take the 1st and 2nd groups from the matcher
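A possible Java version of this regex approach is sketched below. It assumes the pipe pattern belongs to the "/#!default" format and the slash pattern to the plain "/default" format, that each group uses + so it can match more than one character, and that the literal | characters are escaped. The names UrlRegex, SLASH_FORM and PIPE_FORM are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlRegex {
    // Format one: the last two '/'-separated parts
    private static final Pattern SLASH_FORM = Pattern.compile("^.*/([^/]+)/([^/]+)$");
    // Format two: the two '|'-separated fields before the trailing "|1"
    private static final Pattern PIPE_FORM = Pattern.compile("^.*\\|([^|]+)\\|([^|]+)\\|1$");

    public static String[] extract(String url) {
        Pattern pattern = url.contains("/#!default") ? PIPE_FORM : SLASH_FORM;
        Matcher m = pattern.matcher(url);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised format: " + url);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] one = extract("http://default.com/default/a/b");
        String[] two = extract("http://default.com/#!default|1|a|b|1");
        System.out.println(one[0] + " " + one[1]);
        System.out.println(two[0] + " " + two[1]);
    }
}
```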

Related

Split strings based on single occurrence of delimiter but not double in Java

How to split a string using single occurrence of a delimiter disregarding multiple occurrences?
For example, if the string contains
aaa, bbb,,ccc, ddd
I would like to split the string as follows:
aaa
bbb,,ccc
ddd
Tried using Regex with split() but unable to acquire the desired result.
Came across the solution in Javascript here: Split string with a single occurence (not twice) of a delimiter in Javascript. Is it possible to achieve the same in Java, with or without Regex?
String.split() accepts regular expressions as delimiters so you could use the following pattern :
(?<!,),(?!,)
This regex matches a comma that is neither preceded nor followed by a comma.
You can see it in action here : https://ideone.com/CmtAzX
If you want to trim the leading spaces at the same time you can use (?<!,),(?!,) * as mentioned by Nicolas Filotto.
Regex allows you to specify that a given symbol is neither preceded nor followed by another specified symbol. In your case you should use (?<!,),(?!,). In general, (?<!x)y(?!z) finds 'y' only if it is neither preceded by 'x' nor followed by 'z'.
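For reference, here is a small self-contained sketch of the lookaround split, using the variant that also swallows the spaces after the delimiter. The class and method names are hypothetical.

```java
public class SingleCommaSplit {
    // Splits on a comma that is neither preceded nor followed by another comma,
    // and also consumes any spaces that follow the delimiter.
    public static String[] splitOnSingleComma(String s) {
        return s.split("(?<!,),(?!,) *");
    }

    public static void main(String[] args) {
        for (String part : splitOnSingleComma("aaa, bbb,,ccc, ddd")) {
            System.out.println(part);
        }
    }
}
```

For the sample input this should print aaa, bbb,,ccc and ddd on separate lines.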

Java String Regex replacement

Sample Input:
a:b
a.in:b
asds.sdsd:b
a:b___a.sds:bc___ab:bd
Sample Output:
a:replaced
a.in:replaced
asds.sdsd:replaced
a:replaced___a.sds:replaced___ab:replaced
The string that comes after : should be replaced using a custom function.
I have done this without regex, but I feel regex would fit, since we are extracting a string that follows a specific pattern.
For the first three cases it's simple enough to extract the string after :, but I couldn't find a way to deal with the fourth case, unless I split the string on ___, apply the approach for the first kind of pattern to each piece, and concatenate the pieces again.
Just replace the letters that appear right after : with the string replaced.
string.replaceAll("(?<=:)[A-Za-z]+", "replaced");
or
If you also want to deal with digits, then add \d inside the char class.
string.replaceAll("(?<=:)[A-Za-z\\d]+", "replaced");
(:)[a-zA-Z]+
You can simply do this with string.replaceAll. Replace by $1replaced. See demo.
https://regex101.com/r/fX3oF6/18
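Both suggestions can be tried side by side with the sketch below. Since the question mentions a custom function, the sketch also shows one way to compute each replacement in code using Matcher.appendReplacement. The transform method is only a stand-in (it upper-cases the value), not the asker's actual function, and the class name is made up.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceAfterColon {
    // Letters and digits that directly follow a ':'
    private static final Pattern AFTER_COLON = Pattern.compile("(?<=:)[A-Za-z\\d]+");

    // Hypothetical stand-in for the asker's custom function.
    static String transform(String value) {
        return value.toUpperCase(Locale.ROOT);
    }

    public static String replaceWithFunction(String input) {
        Matcher m = AFTER_COLON.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement stops '$' and '\' in the result from being
            // interpreted as group references
            m.appendReplacement(out, Matcher.quoteReplacement(transform(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "a:b___a.sds:bc___ab:bd";
        // Lookbehind version: the ':' is not part of the match
        System.out.println(input.replaceAll("(?<=:)[A-Za-z]+", "replaced"));
        // Capturing-group version: the ':' is matched and put back via $1
        System.out.println(input.replaceAll("(:)[a-zA-Z]+", "$1replaced"));
        // Replacement computed by a function
        System.out.println(replaceWithFunction(input));
    }
}
```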

Split on non arabic characters

I have a String like this
أصبح::ينال::أخذ::حصل (على)::أحضر
And I want to split it on non Arabic characters using java
And here's my code
String s = "أصبح::ينال::أخذ::حصل (على)::أحضر";
String[] arr = s.split("^\\p{InArabic}+");
System.out.println(Arrays.toString(arr));
And the output was
[, ::ينال::أخذ::حصل (على)::أحضر]
But I expect the output to be
[ينال,أخذ,حصل,على,أحضر]
So I don't know what's wrong with this?
You need a negated class, and to do that, you need square brackets [ ... ]. Try to split with this:
"[^\\p{InArabic}]+"
If \\p{InArabic} matches any arabic character, then [^\\p{InArabic}] will match any non-arabic character.
Another option you can consider is an equivalent syntax, using P instead of p to indicate the opposite of the \\p{InArabic} character class, as @Pshemo mentioned:
"\\P{InArabic}+"
This works just like \\W is the opposite of \\w.
The only possible advantage of the first syntax over the second (again, as @Pshemo mentioned) is flexibility: if you want to add other characters to the set that shouldn't be matched, for example to match all non-\\p{InArabic} characters except periods, you can simply add them inside the negated class:
"[^\\p{InArabic}.]+"
Otherwise, if you really want to use \\P{InArabic}, you'll need an intersection with a negated class:
"[\\P{InArabic}&&[^.]]+"
The expression you want is "\\P{InArabic}+"
This means match any (non-zero) number of characters that are not Arabic.
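A minimal sketch of the split with the negated property class follows. The class and method names are hypothetical, and the source file is assumed to be compiled as UTF-8. Note that the first word of the sample string is also returned, so the result should have six elements rather than the five listed in the question.

```java
import java.util.Arrays;

public class ArabicSplit {
    // Splits on every run of characters outside the Arabic Unicode block.
    public static String[] splitOnNonArabic(String s) {
        return s.split("\\P{InArabic}+");
    }

    public static void main(String[] args) {
        String s = "أصبح::ينال::أخذ::حصل (على)::أحضر";
        String[] parts = splitOnNonArabic(s);
        System.out.println(parts.length);
        System.out.println(Arrays.toString(parts));
    }
}
```

If the input could start with a non-Arabic character, the first element of the result would be an empty string, so it may need to be filtered out.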

Remove repeating set of characters in a string

I want to remove the sequence "-~-~-" if it repeats in a string, but only if the repetitions are adjacent.
I have tried to create a regex based on the removing of multiple white spaces regex:
test.replaceAll("\\s+", " ");
Unfortunately I was unsuccessful. Can someone please help me write the correct regex? thanks.
Example:
string test = "hello-~-~--~-~--~-~-"
output:
hello-~-~-
Another example
string test = "-~-~--~-~--~-~-hello-~-~--~-~--~-~-"
output:
-~-~-hello-~-~-
The regex is:
test.replaceAll("(-~-~-){2,}", "-~-~-")
replaceAll replaces every match of the regex (the first parameter) with the second parameter.
The () groups the expression -~-~- together, and {2,} means two or more occurrences.
EDIT
As @anubhava said, instead of repeating -~-~- in the replacement string, you could also use $1, which is a backreference to the first capturing group (i.e. the expression in the regex surrounded by ()).
test.replaceAll("(-~-~-)+", "-~-~-");
This is the regex you need:
(-~-~-){2,}
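The following sketch runs both examples from the question through the {2,} quantifier with a $1 backreference as the replacement. The class and method names are made up for illustration.

```java
public class CollapseRepeats {
    // Collapses two or more adjacent copies of "-~-~-" into a single copy.
    // $1 refers to the text captured by the group, i.e. one "-~-~-".
    public static String collapse(String s) {
        return s.replaceAll("(-~-~-){2,}", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("hello-~-~--~-~--~-~-"));
        System.out.println(collapse("-~-~--~-~--~-~-hello-~-~--~-~--~-~-"));
    }
}
```

This should print hello-~-~- and -~-~-hello-~-~-. Using + instead of {2,} gives the same result, since a single occurrence is then simply replaced by itself.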

Java, how to replace a sequence of numbers in a string

I am trying to replace any sequence of numbers in a string with the number itself within brackets.
So the input:
"i ee44 a1 1222"
Should have as an output:
"i ee(44) a(1) (1222)"
I am trying to implement it using String.replace(a,b) but with no success.
"i ee44 a1 1222".replaceAll("\\d+", "($0)");
Try this and see if it works.
Since you need to work with regular expressions, you may consider using replaceAll instead of replace.
You should use replaceAll. This method takes two arguments:
regex: the pattern for the substrings we want to find
replacement: what should replace each matched substring.
In the replacement you can refer to groups matched by the regex via $x, where x is the group index. For example
"ab cdef".replaceAll("[a-z]([a-z])","-$1")
replaces every pair of lowercase letters with - followed by the second letter of the pair (the second letter is wrapped in parentheses, so it is captured as group 1 and can be referenced in the replacement as $1), so the result will be -b -d-f.
Now try to use this to solve your problem.
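Putting the pieces together, one possible solution looks like the sketch below. It uses $0, which refers to the whole match, so no explicit capturing group is needed. The class and method names are illustrative.

```java
public class BracketNumbers {
    // Wraps every run of digits in parentheses; $0 is the entire match.
    public static String bracketNumbers(String s) {
        return s.replaceAll("\\d+", "($0)");
    }

    public static void main(String[] args) {
        System.out.println(bracketNumbers("i ee44 a1 1222"));
        // The group example from above: $1 is the second letter of each pair
        System.out.println("ab cdef".replaceAll("[a-z]([a-z])", "-$1"));
    }
}
```

The first line printed should be i ee(44) a(1) (1222) and the second -b -d-f.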
You can use String.replaceAll with regular expressions:
"i ee44 a1 1222".replaceAll("(\\d+)", "($1)");
