Splitting string by square brackets

Splitting string by square brackets - java

I have a string that comes out like this: 1.[Aagaard,Lindsay][SeniorPolicyAdvisor][TREASURYBOARDSECRETARIAT][DEPUTYPREMIERANDPRESIDENTOFTHETREASURYBOARD,Toronto][416-327-0948][lindsay.aagaard#ontario.ca]2.[Aalto,Margaret][ProbationOfficer][CHILDRENANDYOUTHSERVICES][THUNDERBAY,ThunderBay][807-475-1310][margaret.aalto#ontario.ca]
I want to split it into an arraylist like this:
1.
Aagaard,Lindsay
SeniorPolicyAdvisor
etc.
Any suggestions?

I read the JavaDoc and used Pattern and Matcher like so:
Pattern p = Pattern.compile("\\[(.*?)\\]");
Matcher m = p.matcher(tableContent);
while(m.find()) {
System.out.println(m.group(1));
}
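If the record numbers ("1.", "2.") should end up in the list as well, the same Pattern/Matcher idea can be extended with an alternation. This is only a sketch: the class name BracketTokens and the method tokenize are made up for illustration, and it assumes record numbers always appear outside the brackets as digits followed by a dot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketTokens {
    // Group 1: a record number such as "1." (assumed format).
    // Group 2: the text between one pair of square brackets.
    private static final Pattern TOKEN = Pattern.compile("(\\d+\\.)|\\[(.*?)\\]");

    // Hypothetical helper: collects record numbers and bracket contents in order.
    static List<String> tokenize(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups participates in each match.
            result.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String tableContent =
            "1.[Aagaard,Lindsay][SeniorPolicyAdvisor]2.[Aalto,Margaret][ProbationOfficer]";
        for (String token : tokenize(tableContent)) {
            System.out.println(token);
        }
    }
}
```

Because a bracketed field is matched as a whole starting at its opening bracket, digits and dots inside a field (phone numbers, e-mail addresses) are not mistaken for record numbers.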

First strip the first and the last brackets, then split on the literal "][". Because split() takes a regex, both brackets must be escaped:
String arr = "[Aalto,Margaret][ProbationOfficer][CHILDRENANDYOUTHSERVICES]";
String[] items = arr.substring(1, arr.length() - 1).split("\\]\\[");

Simply this:
String[] list = str.split("\\[");
for(int i = 0 ; i < list.length ; i++) {
list[i] = list[i].replace("]", ""); // replace() takes a literal, not a regex, so no escaping
}

Related

m.find() returns false when it should return true

m.find() returns false when it should return true.
solrQueries[i] contains the string:
"fl=trending:0,id,business_attr,newarrivals:0,bestprice:0,score,mostviewed:0,primarySortOrder,fastselling:0,modelNumber&defType=pedismax&pf=&mm=2<70%&bgids=1524&bgboost=0.1&shards.tolerant=true&stats=true"
The code is:
Pattern p = Pattern.compile("&mm=(\\d+)&");
for(int i=0; i<solrQueries.length; i++) {
Matcher m = p.matcher(solrQueries[i].toLowerCase());
System.out.println(p.matcher(solrQueries[i].toLowerCase()));
if (m.find()) {
System.out.println(m.group(1));
mmValues[i] = m.group(1);
}

Oh,
Pattern p = Pattern.compile("(?i)&mm=(\\d+)");
works fine now.
Thank you, @Wiktor Stribiżew

You executed m.find() twice (first, in System.out.println(m.find()); and then in if (m.find())). And since there is only 1 match - even if the regex matches - you would get nothing after the second run.
Use
public String[] fetchMmValue(String[] solrQueries) {
    String[] mmValues = new String[solrQueries.length];
    Pattern p = Pattern.compile("(?i)&mm=(\\d+)");
    for (int i = 0; i < solrQueries.length; i++) {
        Matcher m = p.matcher(solrQueries[i]);
        if (m.find()) {
            // System.out.println(m.group(1)); // this is just for debugging
            mmValues[i] = m.group(1);
        }
    }
    return mmValues;
}
If you want to get all chars other than & after &mm=, use another regex:
"&mm=([^&]+)"
where [^&]+ matches 1 or more chars other than &.
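As a rough illustration of the difference, the sketch below applies "&mm=([^&]+)" to a shortened version of the query from the question. The class name MmValueDemo and the helper extractMm are hypothetical, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MmValueDemo {
    // Hypothetical helper: returns everything after "&mm=" up to the next '&',
    // or null when the parameter is absent.
    static String extractMm(String query) {
        Matcher m = Pattern.compile("&mm=([^&]+)").matcher(query);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened version of the query string from the question.
        String query = "defType=pedismax&pf=&mm=2<70%&bgids=1524&stats=true";
        System.out.println(extractMm(query)); // prints 2<70%
    }
}
```

With (\\d+) in place of ([^&]+), the group would stop at the first non-digit and capture only 2.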

Get element starting with letter from List

I have a list and I want to get the position of the first string that starts with a specific letter.
I am trying this code, but it isn't working.
List<String> sp = Arrays.asList(splited);
int i2 = sp.indexOf("^w.*$");

The indexOf method doesn't accept a regex pattern. Instead you could do a method like this:
public static int indexOfPattern(List<String> list, String regex) {
Pattern pattern = Pattern.compile(regex);
for (int i = 0; i < list.size(); i++) {
String s = list.get(i);
if (s != null && pattern.matcher(s).matches()) {
return i;
}
}
return -1;
}
And then you simply could write:
int i2 = indexOfPattern(sp, "^w.*$");

indexOf doesn't accept a regex, you should iterate on the list and use Matcher and Pattern to achieve that:
Pattern pattern = Pattern.compile("^w.*$");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.print(matcher.start());
}
Maybe I misunderstood your question. If you want to find the index in the list of the first string that begins with "w", then my answer is irrelevant. You should iterate on the list, check if the string startsWith that string, and then return its index.
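The iteration described in the last sentences could look roughly like this. It is a sketch only: the class name StartsWithIndex, the method indexOfPrefix, and the sample list are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

class StartsWithIndex {
    // Hypothetical helper: index of the first element starting with the prefix, or -1.
    static int indexOfPrefix(List<String> list, String prefix) {
        for (int i = 0; i < list.size(); i++) {
            String s = list.get(i);
            if (s != null && s.startsWith(prefix)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Sample data, assumed for the example.
        List<String> sp = Arrays.asList("apple", "water", "wind");
        System.out.println(indexOfPrefix(sp, "w")); // prints 1
    }
}
```

No regex is involved here, so there is nothing to escape if the prefix contains characters such as "." or "[".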

Splitting strings by {} & []

I'm sort of new to Java.
I would like to know if there's an easier yet efficient way to split the following string. I've tried Pattern and Matcher, but the result doesn't come out the way I want.
"{1,24,5,[8,5,9],7,[0,1]}"
to be split into:
1
24
5
[8,5,9]
7
[0,1]
This is a completely wrong code but I'm posting it anyway:
String str = "{1,24,5,[8,5,9],7,[0,1]}";
str= str.replaceAll("\\{", "");
str= str.replaceAll("}", "");
Pattern pattern = Pattern.compile("\\[(.*?)\\]");
Matcher matcher = pattern.matcher(str);
String[] test = new String[10];
// String[] _test = new String[10];
int i = 0;
String[] split = str.split(",");
while (matcher.find()) {
test[i] = matcher.group(0);
String[] split1 = matcher.group(0).split(",");
// System.out.println(split1[i]);
for (int j = 0; j < split.length; j++) {
if(!split[j].equals(test[j])&&((!split[j].contains("\\["))||!split[j].contains("\\]"))){
System.out.println(split[j]);
}
}
i++;
}
}
Given a string in a format like {a,b,[c,d,e],...}, I want to list all the contents, but anything inside square brackets should be treated as a single element (like an array).

This works:
public static void main(String[] args)
{
customSplit("{1,24,5,[8,5,9],7,[0,1]}");
}
static void customSplit(String str){
Pattern pattern = Pattern.compile("[0-9]+|\\[.*?\\]");
Matcher matcher =
pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
Yields the output
1
24
5
[8,5,9]
7
[0,1]

Regular expression with & as separator

I was given a long text in which I need to find all the text that are embedded in a pair of & (For example, in a text "&hello&&bye&", I need to find the words "hello" and "bye").
I tried using the regex ".*&([^&])*&.*", but it doesn't work and I don't know what's wrong with it.
Any help?
Thanks

Try this way
String data = "&hello&&bye&";
Matcher m = Pattern.compile("&([^&]*)&").matcher(data);
while (m.find())
System.out.println(m.group(1));
output:
hello
bye

No regex needed. Just iterate!
boolean started = false;
List<String> list = new ArrayList<>();
int startIndex = 0;
for (int i = 0; i < string.length(); ++i) {
    if (string.charAt(i) != '&')
        continue;
    if (!started) {
        startIndex = i + 1; // content begins right after the opening &
    } else {
        list.add(string.substring(startIndex, i)); // content up to the closing &
    }
    started = !started; // toggle between "inside" and "outside" a pair
}
or use split!
String[] parts = string.split("&");
for(int i = 1; i < parts.length; i += 2) { // every second
list.add(parts[i]);
}

If you don't want to use regular expressions, here's a simple way.
String string = "xyz...."; // the string containing "hello", "bye" etc.
String[] tokens = string.split("&"); // this will split the string into an array
// containing tokens separated by "&"
for(int i=0; i<tokens.length; i++)
{
String token = tokens[i];
if(token.length() > 0)
{
// handle edge case
if(i==tokens.length-1)
{
if(string.charAt(string.length()-1) == '&')
System.out.println(token);
}
else
{
System.out.println(token);
}
}
}

Two problems:
You're repeating the capturing group. This means that you'll only catch the last letter between &s in the group.
You will only match the last word because the .*s will gobble up the rest of the string.
Use lookarounds instead:
(?<=&)[^&]+(?=&)
Now the entire match will be hello (and bye when you apply the regex for the second time) because the surrounding &s won't be part of the match any more:
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile("(?<=&)[^&]+(?=&)");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
matchList.add(regexMatcher.group());
}

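The surrounding .* don't make sense and are unproductive. Just &([^&]*)& is sufficient, with the * inside the group so that the group captures the whole word rather than only its last character.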

I would simplify it even further.
Check that the first char is &
Check that the last char is &
String.split("&&") on the substring between them
In code:
if (string.length() < 2)
    throw new IllegalArgumentException(string); // or return an empty array, whatever
if ((string.charAt(0) != '&') || (string.charAt(string.length() - 1) != '&'))
    throw new IllegalArgumentException(string); // handle this, too
String inner = string.substring(1, string.length() - 1);
return inner.split("&&");

java regex matching each group starting with specific string

I have a string like a1wwa1xxa1yya1zz.
I would like to get every group that starts with a1 and runs up to, but not including, the next a1.
(In my example, that would be: a1ww, a1xx, a1yy and a1zz.)
If I use:
Matcher m = Pattern.compile("(a1.*?)a1").matcher("a1wwa1xxa1yya1zz");
while(m.find()) {
String myGroup = m.group(1);
}
myGroup captures only every second group.
So in my example, I can only capture a1ww and a1yy.
Does anyone have a good idea?

Split is a good solution, but if you want to remain in the regex world, here is a solution:
Matcher m = Pattern.compile("(a1.*?)(?=a1|$)").matcher("a1wwa1xxa1yya1zz");
while (m.find()) {
String myGroup = m.group(1);
System.out.println("> " + myGroup);
}
I used a positive lookahead to ensure the capture is followed by a1, or alternatively by the end of the input.
Lookaheads are zero-width assertions, i.e. they verify a condition without advancing the match cursor, so the text they check remains available for the next match. That is why the original pattern skipped every second group: its trailing a1 was consumed as part of each match.
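The same zero-width property also lets split() keep the delimiter. The sketch below assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string. The class name LookaheadSplit and the method splitBeforeMarker are made up for illustration.

```java
import java.util.Arrays;

class LookaheadSplit {
    // Hypothetical helper: splits at every position that is followed by "a1".
    // The lookahead consumes nothing, so each piece keeps its "a1" prefix.
    static String[] splitBeforeMarker(String input) {
        return input.split("(?=a1)");
    }

    public static void main(String[] args) {
        String[] parts = splitBeforeMarker("a1wwa1xxa1yya1zz");
        System.out.println(Arrays.toString(parts)); // prints [a1ww, a1xx, a1yy, a1zz]
    }
}
```

On Java 7 and earlier the result would start with an extra empty string, so the first element would have to be skipped there.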

You can use the split() method, then prepend "a1" to each resulting element:
String str = "a1wwa1xxa1yya1zz";
String[] parts = str.split("a1");
String[] output = new String[parts.length - 1];
for (int i = 0; i < output.length; i++)
output[i] = "a1" + parts[i + 1];
for (String p : output)
System.out.println(p);
Output:
a1ww
a1xx
a1yy
a1zz

I would use an approach like this:
String str = "a1wwa1xxa1yya1zz";
String[] parts = str.split("a1");
for (int i = 1; i < parts.length; i++) {
String found = "a1" + parts[i];
}
