Splitting strings by {} & [] - java

I'm fairly new to Java.
I would like to know if there's an easier yet efficient way to implement the following string split. I've tried Pattern and Matcher, but the result doesn't come out the way I want.
"{1,24,5,[8,5,9],7,[0,1]}"
to be split into:
1
24
5
[8,5,9]
7
[0,1]
This code is completely wrong, but I'm posting it anyway:
String str = "{1,24,5,[8,5,9],7,[0,1]}";
str= str.replaceAll("\\{", "");
str= str.replaceAll("}", "");
Pattern pattern = Pattern.compile("\\[(.*?)\\]");
Matcher matcher = pattern.matcher(str);
String[] test = new String[10];
// String[] _test = new String[10];
int i = 0;
String[] split = str.split(",");
while (matcher.find()) {
test[i] = matcher.group(0);
String[] split1 = matcher.group(0).split(",");
// System.out.println(split1[i]);
for (int j = 0; j < split.length; j++) {
if(!split[j].equals(test[j])&&((!split[j].contains("\\["))||!split[j].contains("\\]"))){
System.out.println(split[j]);
}
}
i++;
}
}
Given a String in a format like {a,b,[c,d,e],...}, I want to list all the contents, but everything inside a pair of square brackets should be treated as one element (like an array).

This works:
public static void main(String[] args)
{
customSplit("{1,24,5,[8,5,9],7,[0,1]}");
}
static void customSplit(String str){
Pattern pattern = Pattern.compile("[0-9]+|\\[.*?\\]");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
This yields the output:
1
24
5
[8,5,9]
7
[0,1]
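The regex above only picks up runs of digits, so it would skip elements such as a or b in the {a,b,[c,d,e],...} form mentioned in the question. As a rough alternative sketch (not the answerer's method), the string can be split on commas that sit outside square brackets by tracking bracket depth. The class and method names here are made up for illustration, and the sketch assumes the brackets are balanced.

```java
import java.util.ArrayList;
import java.util.List;

final class TopLevelSplitter {
    // Splits "{a,b,[c,d],e}" on commas that are not inside square brackets.
    // Assumption: brackets are balanced; no validation is attempted.
    static List<String> split(String input) {
        String body = input.trim();
        // Drop the surrounding braces if both are present.
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of '[' currently open
        for (char c : body.toCharArray()) {
            if (c == '[') depth++;
            if (c == ']') depth--;
            if (c == ',' && depth == 0) {
                // A top-level comma ends the current element.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints the six elements listed in the question, one per line.
        split("{1,24,5,[8,5,9],7,[0,1]}").forEach(System.out::println);
    }
}
```

Returning a List instead of printing makes it easier to process the bracketed elements further, for example by calling split again on "[8,5,9]" after stripping its brackets.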

Related

How to find the words starting after line breaks, using regex, Java?

I have an input string, consisting of several lines, e.g.:
When I was younger
I never needed
And I was always OK
but it was a long Time Ago
The problem is to invert the case of the first letter of every word whose length is 3 or more. That is, the output must be the following:
when I Was Younger
I Never Needed
and I Was Always OK
But it Was a Long time ago
Here is my code:
import java.util.regex.*;
public class Part3_1 {
public static void main(String[] args) {
String str = "When I was younger\r\nI never needed\r\nAnd I was always OK\r\nbut it was a long Time Ago";
System.out.println(convert(str));
}
public static String convert(String str) {
String result = "";
String[] strings = str.split(" ");
String regexLowerCase = "\\b[a-z]{3,}\\b";
String regexLowerCaseInitial = "(\\r\\n)[a-z]{3,}\\b";
String regexUpperCase = "\\b([A-Z][a-z]{2,})+\\b";
String regexUpperCaseInitial = "(\\r\\n)([A-Z][a-z]{2,})\\b";
Pattern patternLowerCase = Pattern.compile(regexLowerCase, Pattern.MULTILINE);
Pattern patternUpperCase = Pattern.compile(regexUpperCase, Pattern.MULTILINE);
Pattern patternLowerCaseInitial = Pattern.compile(regexLowerCaseInitial, Pattern.MULTILINE);
Pattern patternUpperCaseInitial = Pattern.compile(regexUpperCaseInitial, Pattern.MULTILINE);
for (int i = 0; i < strings.length; i++) {
Matcher matcherLowerCase = patternLowerCase.matcher(strings[i]);
Matcher matcherUpperCase = patternUpperCase.matcher(strings[i]);
Matcher matcherLowerCaseInitial = patternLowerCaseInitial.matcher(strings[i]);
Matcher matcherUpperCaseInitial = patternUpperCaseInitial.matcher(strings[i]);
char[] words = strings[i].toCharArray();
if (matcherLowerCase.find() || matcherLowerCaseInitial.find()) {
char temp = Character.toUpperCase(words[0]);
words[0] = temp;
result += new String(words);
} else if (matcherUpperCase.find() || matcherUpperCaseInitial.find()) {
char temp = Character.toLowerCase(words[0]);
words[0] = temp;
result += new String(words);
} else {
result += new String(words);
}
if (i < strings.length - 1) {
result += " ";
}
}
return result;
}
}
Here:
"\\b[a-z]{3,}\\b" is a regular expression that selects all lowercase words with a length of 3 or more characters,
"\\b([A-Z][a-z]{2,})+\\b" is a regular expression that selects all words starting with a capital letter with a length of 3 or more characters.
Both regular expressions work properly, but they fail when the text contains line breaks. The output of my program is the following:
when I Was Younger
I Never Needed
And I Was Always OK
but it Was a Long Time ago
As I understand it, these regular expressions cannot select the words And and but from needed\r\nAnd and OK\r\nbut respectively.
To fix this bug, I tried adding the new regular expressions "(\\r\\n)[a-z]{3,}\\b" and "(\\r\\n)([A-Z][a-z]{2,})\\b", but they do not work.
How do I compose regular expressions that select words after line breaks?
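Splitting on " " leaves tokens such as needed\r\nAnd in one piece, and the code only ever flips the first character of each token. One option would be to split the string on word boundaries (\b) instead, so that whitespace and line breaks become separate tokens that pass through unchanged to the result. This removes the need for separate regexes for the different situations, and also the need to add the space characters back. This will give you the results you want: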
public static String convert(String str) {
String result = "";
String[] strings = str.split("\\b");
String regexLowerCase = "^[a-z]{3,}";
String regexUpperCase = "^[A-Z][a-z]{2,}+";
Pattern patternLowerCase = Pattern.compile(regexLowerCase, Pattern.MULTILINE);
Pattern patternUpperCase = Pattern.compile(regexUpperCase, Pattern.MULTILINE);
for (int i = 0; i < strings.length; i++) {
Matcher matcherLowerCase = patternLowerCase.matcher(strings[i]);
Matcher matcherUpperCase = patternUpperCase.matcher(strings[i]);
char[] words = strings[i].toCharArray();
if (matcherLowerCase.find()) {
char temp = Character.toUpperCase(words[0]);
words[0] = temp;
result += new String(words);
} else if (matcherUpperCase.find()) {
char temp = Character.toLowerCase(words[0]);
words[0] = temp;
result += new String(words);
} else {
result += new String(words);
}
}
return result;
}
Output:
when I Was Younger
I Never Needed
and I Was Always OK
But it Was a Long time ago
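Another way to approach this, sketched here as an alternative and not as the answerer's code, is to skip the split entirely and let a single Matcher walk over the words, rewriting each one with appendReplacement. The class name is hypothetical. One assumption to note: this pattern matches any word of three or more ASCII letters, so it would also flip an all-caps word such as USA, which the original regexes would leave alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CaseFlipper {
    // Any word of three or more ASCII letters; \b also matches right after \r\n.
    private static final Pattern WORD = Pattern.compile("\\b[A-Za-z]{3,}\\b");

    static String convert(String text) {
        Matcher m = WORD.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String word = m.group();
            char first = word.charAt(0);
            // Invert the case of the first letter only.
            char flipped = Character.isUpperCase(first)
                    ? Character.toLowerCase(first)
                    : Character.toUpperCase(first);
            // quoteReplacement guards against '$' or '\' in the replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement(flipped + word.substring(1)));
        }
        // Copy whatever follows the last match (spaces, line breaks, short words).
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String str = "When I was younger\r\nI never needed\r\nAnd I was always OK\r\nbut it was a long Time Ago";
        System.out.println(convert(str));
    }
}
```

Because the text between matches is copied through untouched, spaces and \r\n sequences need no special handling.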

Getting substrings that start with "abc" and end with "def"

I am using the StringUtils library (import org.apache.commons.lang3.StringUtils;) to split a string like:
String str = "ZXCVFMS2ZZ1012ZZ1012ZZ1000ZZ0923ZZ0990ZZ0990ZZ0990ZZ1020DEFZXCVFMS3ZZ1012ZZ1012ZZ1000ZZ0923ZZ0990ZZ0990ZZ0990ZZ1020DEFZXCVFMERRORDEF";
I need to extract the substrings that start with zxcv* and end with *def, such as:
String tmp1 = "ZXCVFMS2ZZ1012ZZ1012ZZ1000ZZ0923ZZ0990ZZ0990ZZ0990ZZ1020DEF";
String tmp2 = "ZXCVFMS3ZZ1012ZZ1012ZZ1000ZZ0923ZZ0990ZZ0990ZZ0990ZZ1020DEF";
Any help?
Solution, thanks to @assylias:
Pattern p = Pattern.compile("ZXCV.*?DEF");
Matcher m = p.matcher(str);
List<String> result = new ArrayList<> ();
while (m.find()) {
result.add(m.group());
}
How about using replaceAll?
String tmp = str.replaceAll(".*(zxcv.*def).*", "$1"); //zxcvVariableCanChancedef
UPDATE following your edit
If you have a repeating pattern, you could use a Matcher. To avoid matching the whole string, add ? after the quantifier to make the match lazy.
Pattern p = Pattern.compile("zxcv.*?def");
String input = "15684zxcvVariableCanChancedefABCDEND15684zxcvVariableCanChancedefABCDEND";
Matcher m = p.matcher(input);
List<String> result = new ArrayList<> ();
while (m.find()) {
result.add(m.group());
}
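The question describes the markers in lowercase (zxcv, def) while the sample data is uppercase, so a small variation on the lazy match may be useful. The sketch below is only an illustration with made-up class and method names: it quotes the markers with Pattern.quote so they are treated literally, and it assumes case-insensitive matching is what is wanted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DelimitedExtractor {
    // Returns every shortest substring that starts with 'start' and ends with 'end'.
    static List<String> extract(String input, String start, String end) {
        Pattern p = Pattern.compile(
                Pattern.quote(start) + ".*?" + Pattern.quote(end), // lazy: stop at the first 'end'
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);         // assumption: case does not matter
        Matcher m = p.matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "15684zxcvAAAAAAAncedefABCDEND15684zxcvBBBBBBBBBBdefABCDEND";
        // Prints zxcvAAAAAAAncedef and zxcvBBBBBBBBBBdef, one per line.
        extract(input, "zxcv", "def").forEach(System.out::println);
    }
}
```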
This can be done without any additional libraries using core java.util.regex functionality. For example:
String str = "15684zxcvVariableCanChancedefABCDEND";
Pattern pattern = Pattern.compile(".*(zxcv.*def).*");
Matcher matcher = pattern.matcher(str);
if (matcher.matches()) {
System.out.println(matcher.group(1)); // ==> zxcvVariableCanChancedef
}
String line = "15684zxcvAAAAAAAncedefABCDEND15684zxcvBBBBBBBBBBdefABCDEND";
Last occurrence :
Matcher matcher = Pattern.compile(".*(zxcv.*def).*").matcher(line);
String tmp = matcher.find() ? matcher.group(1) : null;
System.out.println(tmp);
First occurrence :
Matcher matcher = Pattern.compile(".*?(zxcv.*?def).*").matcher(line);
Biggest occurrence (from first zxcv to last def) :
Matcher matcher = Pattern.compile(".*?(zxcv.*def).*").matcher(line);
All occurrences
Matcher matcher = Pattern.compile(".*?(zxcv.*?def)").matcher(line);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
I'm not sure this is correct, since I wrote it in a text editor; I don't have a Java IDE on this computer. I hope it helps.
public String XXX(String tmp)
{
    int firstStorage = 0;
    int secondStorage = 0;
    // Find the index where the first "zxcv" starts.
    for (int i = 0; i + 4 <= tmp.length(); i++)
    {
        if (tmp.substring(i, i + 4).equals("zxcv"))
        {
            firstStorage = i;
            break;
        }
    }
    // From there, find the first "def" and remember the index of its last character.
    for (int i = firstStorage; i + 3 <= tmp.length(); i++)
    {
        if (tmp.substring(i, i + 3).equals("def"))
        {
            secondStorage = i + 2;
            break;
        }
    }
    return tmp.substring(firstStorage, secondStorage + 1);
}
Let me know if it is working or not. Have a nice day !!
String str = "15684zxcvVariableCanChancedefABCDEND15684zxcvVariableCanChancedefABCDEND";
List<string> strList = new List<string>();
while (str.IndexOf("zxc") >= 0 && str.IndexOf("def") >= 0)
{
var startIndex = str.IndexOf("zxc");
var stopIndex = str.IndexOf("def");
var item = str.Substring(startIndex, stopIndex - startIndex + 3);
strList.Add(item);
str = str.Substring(0, startIndex) + str.Substring(stopIndex+3);
}
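The snippet above uses C#-style calls (IndexOf, Substring, List<string>). A rough Java equivalent of the same indexOf idea might look like the sketch below. The class and method names are invented for illustration, and unlike the snippet above it advances a search position instead of cutting the found piece out of the string.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexOfExtractor {
    // Collects every substring that runs from 'start' to the next 'end' after it.
    static List<String> extractAll(String input, String start, String end) {
        List<String> items = new ArrayList<>();
        int from = 0;
        while (true) {
            int startIndex = input.indexOf(start, from);
            if (startIndex < 0) {
                break; // no further start marker
            }
            int stopIndex = input.indexOf(end, startIndex + start.length());
            if (stopIndex < 0) {
                break; // start marker without a matching end marker
            }
            items.add(input.substring(startIndex, stopIndex + end.length()));
            from = stopIndex + end.length(); // continue after this item
        }
        return items;
    }

    public static void main(String[] args) {
        String str = "15684zxcvVariableCanChancedefABCDEND15684zxcvVariableCanChancedefABCDEND";
        // Prints zxcvVariableCanChancedef twice.
        extractAll(str, "zxcv", "def").forEach(System.out::println);
    }
}
```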

Extract strings from a pattern in java using Matcher and pattern

If I have a String like this "Error. LineNumber = 2, originalLine = 'ABC', lineErrors = [Special chars found]", I would like to extract
the line number as '2',
originalLine as 'ABC' and
error as 'Special chars found'
I am very new to regex, so any pointers would be very helpful. I browsed through a few past questions but did not find what I wanted.
You can use capturing groups to capture the values. This is the sample code in Java. This works for the specified string but you can tweak and change it accordingly.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
public static void main(String[] args) {
String s = "Error. LineNumber = 2, originalLine = 'ABC', lineErrors = [Special chars found]";
String patternStr = "Error. LineNumber = ([\\S ]+), originalLine = ([\\S ]+), lineErrors = ([\\S ]+)";
Pattern p = Pattern.compile(patternStr);
Matcher m = p.matcher(s);
if (m.find()) {
int count = m.groupCount();
System.out.println("group count is " + count);
for (int i = 0; i < count; i++) {
System.out.println(m.group(i+1));
}
}
}
}
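Note that the groups in the pattern above capture the surrounding quotes and brackets as well ('ABC' and [Special chars found]). If the bare values are wanted, as the question suggests, a tighter pattern could be sketched as follows. The class and method names are hypothetical, and the sketch assumes the message always follows exactly this layout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ErrorLineParser {
    // Group 1: digits of the line number; group 2: text between the quotes;
    // group 3: text between the square brackets.
    private static final Pattern ERROR = Pattern.compile(
            "LineNumber = (\\d+), originalLine = '([^']*)', lineErrors = \\[([^\\]]*)\\]");

    // Returns {lineNumber, originalLine, error}, or an empty array if the text does not match.
    static String[] parse(String message) {
        Matcher m = ERROR.matcher(message);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String s = "Error. LineNumber = 2, originalLine = 'ABC', lineErrors = [Special chars found]";
        for (String value : parse(s)) {
            System.out.println(value);
        }
    }
}
```

Using negated character classes such as [^']* keeps each group from running past its closing delimiter, so no backtracking over the rest of the line is needed.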

what is the proper regular expression for this example?

Which regular expression in java can do these conversions?
"1.54.0.21" to "01540021"
or
"33.5.10.6" to "33051006"
I need to replace .# with 0# and .## with ##
You could try something like...
StringBuilder output = new StringBuilder(8);
String input = "1.54.0.21";
Pattern p = Pattern.compile("\\d+");
Matcher matcher = p.matcher(input);
while (matcher.find()) {
String group = matcher.group();
if (group.length() < 2) {
output.append("0");
}
output.append(group);
}
System.out.println(input);
System.out.println(output);
Which outputs...
1.54.0.21
01540021
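If a pure replace-based version is preferred, one possible sketch is to pad every lone digit with lookarounds and then drop the dots. The class and method names are made up here, and the sketch assumes each dot-separated part has at most two digits, as in the examples from the question.

```java
final class VersionPadder {
    static String pad(String version) {
        return version
                // A digit with no digit before or after it is a one-digit part: prefix it with 0.
                .replaceAll("(?<!\\d)(\\d)(?!\\d)", "0$1")
                // String.replace takes a literal, so the dot needs no escaping.
                .replace(".", "");
    }

    public static void main(String[] args) {
        System.out.println(pad("1.54.0.21")); // prints 01540021
        System.out.println(pad("33.5.10.6")); // prints 33051006
    }
}
```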
Without Regex :
http://rextester.com/LGXETU62790
public static void main(String args[])
{
String str1 = "33.5.9.6";
String str2 = "1.54.0.21";
System.out.println(transform(str1));
System.out.println(transform(str2));
}
private static String transform(String str){
String[] splitted = str.split("\\.");
StringBuilder build = new StringBuilder();
for(String s : splitted){
build.append(String.format("%02d", Integer.parseInt(s)));
}
return build.toString();
}
The only job of a regular expression is to match a certain pattern of characters inside a string (or a multiline string).
A regular expression can be used in a find-and-replace operation, but only to find the strings you are interested in. Once they are found, a Split(), Remove(), or Replace() function will better serve the purpose.
I recommend http://gskinner.com/RegExr/
This is an online tool for matching strings against regular expressions and for learning the patterns.
public String getToken(String elem) {
    // Left-pad single-character tokens with a zero.
    return (elem.length() == 1) ? ("0" + elem) : elem;
}
String[] a = "1.54.0.21".split("\\.");
String o = "";
int len = a.length;
for (int i = 0; i < len; i++) {
    o = o + getToken(a[i]);
}
System.out.println(o); //01540021

How can I count the number of matches for a regex?

Let's say I have a string which contains this:
HelloxxxHelloxxxHello
I compile a pattern to look for 'Hello'
Pattern pattern = Pattern.compile("Hello");
Matcher matcher = pattern.matcher("HelloxxxHelloxxxHello");
It should find three matches. How can I get a count of how many matches there were?
I've tried various loops and using the matcher.groupCount() but it didn't work.
matcher.find() does not find all matches, only the next match.
Solution for Java 9+
long matches = matcher.results().count();
Solution for Java 8 and older
You'll have to do the following. (Starting from Java 9, there is a nicer solution)
int count = 0;
while (matcher.find())
count++;
Btw, matcher.groupCount() is something completely different.
Complete example:
import java.util.regex.*;
class Test {
public static void main(String[] args) {
String hello = "HelloxxxHelloxxxHello";
Pattern pattern = Pattern.compile("Hello");
Matcher matcher = pattern.matcher(hello);
int count = 0;
while (matcher.find())
count++;
System.out.println(count); // prints 3
}
}
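To illustrate the remark about matcher.groupCount(): it reports how many capturing groups the pattern contains, not how many matches were found. The small sketch below uses a hypothetical pattern with two groups to show the difference; the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupCountDemo {
    // Counts non-overlapping matches by calling find() until it fails.
    static int countMatches(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(Hel)(lo)"); // two capturing groups
        String hello = "HelloxxxHelloxxxHello";
        System.out.println(pattern.matcher(hello).groupCount()); // prints 2
        System.out.println(countMatches(pattern, hello));        // prints 3
    }
}
```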
Handling overlapping matches
When counting matches of aa in aaaa the above snippet will give you 2.
aaaa
aa
  aa
To get 3 matches, i.e. this behavior:
aaaa
aa
 aa
  aa
You have to search for a match at index <start of last match> + 1 as follows:
String hello = "aaaa";
Pattern pattern = Pattern.compile("aa");
Matcher matcher = pattern.matcher(hello);
int count = 0;
int i = 0;
while (matcher.find(i)) {
count++;
i = matcher.start() + 1;
}
System.out.println(count); // prints 3
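A different way to count overlapping occurrences, offered only as a sketch, is to wrap the text in a lookahead. A lookahead match has zero length, so a plain find() loop moves forward one character at a time and reports every position where the text starts. The names below are made up, and the sketch assumes a non-empty literal search string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OverlapCounter {
    // Counts every position at which 'literal' starts, including overlapping ones.
    // Assumption: 'literal' is non-empty.
    static int countOverlapping(String input, String literal) {
        Matcher matcher = Pattern.compile("(?=" + Pattern.quote(literal) + ")").matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOverlapping("aaaa", "aa")); // prints 3
    }
}
```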
This should work for matches that might overlap:
public static void main(String[] args) {
String input = "aaaaaaaa";
String regex = "aa";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
int from = 0;
int count = 0;
while(matcher.find(from)) {
count++;
from = matcher.start() + 1;
}
System.out.println(count);
}
From Java 9, you can use the stream provided by Matcher.results()
long matches = matcher.results().count();
If you want to use Java 8 streams and are allergic to while loops, you could try this:
public static int countPattern(String references, Pattern referencePattern) {
Matcher matcher = referencePattern.matcher(references);
return Stream.iterate(0, i -> i + 1)
.filter(i -> !matcher.find())
.findFirst()
.get();
}
Disclaimer: this only works for disjoint matches.
Example:
public static void main(String[] args) throws ParseException {
Pattern referencePattern = Pattern.compile("PASSENGER:\\d+");
System.out.println(countPattern("[ \"PASSENGER:1\", \"PASSENGER:2\", \"AIR:1\", \"AIR:2\", \"FOP:2\" ]", referencePattern));
System.out.println(countPattern("[ \"AIR:1\", \"AIR:2\", \"FOP:2\" ]", referencePattern));
System.out.println(countPattern("[ \"AIR:1\", \"AIR:2\", \"FOP:2\", \"PASSENGER:1\" ]", referencePattern));
System.out.println(countPattern("[ ]", referencePattern));
}
This prints out:
2
0
1
0
This is a solution with streams that also counts overlapping (non-disjoint) matches:
public static int countPattern(String references, Pattern referencePattern) {
return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
new Iterator<Integer>() {
Matcher matcher = referencePattern.matcher(references);
int from = 0;
@Override
public boolean hasNext() {
return matcher.find(from);
}
@Override
public Integer next() {
from = matcher.start() + 1;
return 1;
}
},
Spliterator.IMMUTABLE), false).reduce(0, (a, c) -> a + c);
}
Use the code below to count the number of matches the regex finds in your input:
int count = 0;
Pattern p = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL); // "regex" is your predefined regex
Matcher m = p.matcher(pattern); // "pattern" is the string to match the regex against
while (m.find())
    count++;
This is generalized code, not specific to your case, so tailor it to suit your needs.
Please feel free to correct me if there is any mistake.
