Can't split a line in Java

I don't know how to split this line correctly. I only need the values RandomAdresas0, 100, 2018 and 1.
String line = Files.readAllLines(Paths.get(failas2)).get(userInp);
System.out.println(line);
arr = line.split("[\\s\\-\\.\\'\\?\\,\\_\\#]+");
Content of line:
[Pastatas{pastatoAdresas='RandomAdresas0',pastatoAukstuSkaicius=100,pastatoPastatymoData=2018, pastatoButuKiekis=1}]

You can try this code (basically extracting a string between two delimiters):
String ss = "[Pastatas{pastatoAdresas='RandomAdresas0',pastatoAukstuSkaicius=100,pastatoPastatymoData=2018, pastatoButuKiekis=1}]";
Pattern pattern = Pattern.compile("=(.*?)[,}]");
Matcher matcher = pattern.matcher(ss);
while (matcher.find()) {
    System.out.println(matcher.group(1).replace("'", ""));
}
This outputs:
RandomAdresas0
100
2018
1

Remove all the characters before '{', including '{' itself.
Remove all the characters after '}', including '}' itself.
You can do both using the indexOf and substring methods.
You will then be left with only the following:
pastatoAdresas='RandomAdresas0',pastatoAukstuSkaicius=100,pastatoPastatymoData=2018, pastatoButuKiekis=1
After this, read this thread: Parse a string with key=value pair in a map?
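The indexOf/substring steps above, followed by a simple key=value split, could be sketched roughly as follows. The class and method names (KeyValueParser, parse) are made up for illustration, and the sketch assumes that values never contain a comma.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KeyValueParser {

    // Hypothetical helper: parses "[Type{key=value,key='value'}]" into an ordered map.
    // Assumption: values contain no commas.
    static Map<String, String> parse(String line) {
        // Keep only the text between the first '{' and the last '}'.
        int start = line.indexOf('{');
        int end = line.lastIndexOf('}');
        if (start < 0 || end <= start) {
            throw new IllegalArgumentException("No {...} block in: " + line);
        }
        String body = line.substring(start + 1, end);

        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : body.split(",")) {
            // Split on the first '=' only, so a value may itself contain '='.
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                continue; // skip fragments without '='
            }
            // Trim whitespace and drop the single quotes around string values.
            result.put(kv[0].trim(), kv[1].trim().replace("'", ""));
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "[Pastatas{pastatoAdresas='RandomAdresas0',pastatoAukstuSkaicius=100,"
                + "pastatoPastatymoData=2018, pastatoButuKiekis=1}]";
        System.out.println(parse(line).values()); // [RandomAdresas0, 100, 2018, 1]
    }
}
```

A LinkedHashMap is used so the values come back in the order they appear in the line.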

Here is a solution using a regular expression and the Pattern & Matcher classes. The values you are after can be retrieved using the group() method and you get all values by looping as long as find() returns true.
String data = "[Pastatas{pastatoAdresas='RandomAdresas0',pastatoAukstuSkaicius=100,pastatoPastatymoData=2018, pastatoButuKiekis=1}]";
Pattern pattern = Pattern.compile("=([^, }]*)");
Matcher matcher = pattern.matcher(data);
while (matcher.find()) {
    System.out.printf("[%d:%d] %s%n", matcher.start(), matcher.end(), matcher.group(1));
}
The matched value is in group 1; group 0 is the whole match of the regex.

Related

How to truncate a string after 5 delimiter in java?

String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
I need to convert the above string into aaa-bbb-ccc-ddd-ee, that is, the output should contain only the text before the fifth delimiter. Could anyone help me solve this?
You could use a regex:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern p = Pattern.compile("^\\w+\\-\\w+\\-\\w+\\-\\w+\\-\\w+");
Matcher matcher = p.matcher(s);
if (matcher.find()) { // group() throws IllegalStateException if there was no match
    System.out.println(matcher.group(0));
}
The output is aaa-bbb-ccc-ddd-ee.
If the fields contain more than just word characters, you can replace \\w with [^\\-], which matches any character except the delimiter.
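As a rough sketch of that variant: the pattern below uses [^-]+ for each field and a repeated group instead of spelling out all five fields. The class and method names are hypothetical, and returning the input unchanged when it has fewer than five fields is an assumption about the desired behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstFiveFields {

    // Five runs of non-'-' characters joined by four '-' delimiters, anchored at the start.
    private static final Pattern FIRST_FIVE = Pattern.compile("^[^-]+(?:-[^-]+){4}");

    static String firstFive(String s) {
        Matcher m = FIRST_FIVE.matcher(s);
        // Assumption: return the input unchanged when it has fewer than five fields.
        return m.find() ? m.group() : s;
    }

    public static void main(String[] args) {
        System.out.println(firstFive("aaa-bbb-ccc-ddd-ee-23-xyz")); // aaa-bbb-ccc-ddd-ee
        System.out.println(firstFive("a.1-b 2-c_3-d!-e?-rest"));    // a.1-b 2-c_3-d!-e?
    }
}
```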
Use Pattern and Matcher like this:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern pattern = Pattern.compile("^((.+?-){4}[^-]+).*$");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    s = matcher.group(1);
}
.+ - matches one or more of any character; the trailing ? makes the match lazy
(.+?-) - matches the shortest character sequence that ends with '-'
{4} - repeats that group 4 times, so the result contains '-' 4 times
[^-]+ - then matches the fifth field: characters other than '-'
.* - matches whatever characters remain after the part you want
matcher.group(1) - returns the first group, which is ((.+?-){4}[^-]+)
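If the number of fields to keep varies, the same pattern can be built with the repeat count as a parameter. This is only a sketch: keepFields is a hypothetical helper, it assumes n is at least 1, and it returns the input unchanged when there are fewer than n fields.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TruncateFields {

    // Hypothetical helper: keeps the first n '-'-separated fields (assumes n >= 1).
    static String keepFields(String s, int n) {
        // n fields are separated by n - 1 delimiters, hence the {n-1} repeat count.
        Pattern p = Pattern.compile("^((.+?-){" + (n - 1) + "}[^-]+).*$");
        Matcher m = p.matcher(s);
        // Group 1 is the outer group: the kept fields without the trailing rest.
        return m.find() ? m.group(1) : s;
    }

    public static void main(String[] args) {
        String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
        System.out.println(keepFields(s, 5)); // aaa-bbb-ccc-ddd-ee
        System.out.println(keepFields(s, 2)); // aaa-bbb
    }
}
```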

Java regular expression to match parameters within a function

I would like to write a regular expression to extract parameter1 and parameter2 of func1(parameter1, parameter2), the length of parameter1 and parameter2 ranges from 1 to 64.
(func1) (\() (.{1,64}) (,\\s*) (.{1,64}) (\))
My version can not deal with the following case (nested function)
func2(func1(ef5b, 7dbdd))
I always get a "7dbdd)" for parameter2. How could I solve this?
Use "anything but closing parenthesis" ([^)]) instead of simply "anything" (.):
(func1) (\() (.{1,64}) (,\s*) ([^)]{1,64}) (\))
Demo: https://regex101.com/r/sP6eS1/1
Use [^)]{1,64} (match anything except ")") instead of .{1,64} (match anything) to stop right before the first ")":
(func1) (\() (.{1,64}) (,\\s*) (.{1,64}) (\))
                                ^
                                replace . with [^)]
Example:
// remove whitespace and escape backslash!
String regex = "(func1)(\\()(.{1,64})(,\\s*)([^)]{1,64})(\\))";
String input = "func2(func1(ef5b, 7dbdd))";
Pattern p = Pattern.compile(regex); // java.util.regex.Pattern
Matcher m = p.matcher(input); // java.util.regex.Matcher
if (m.find()) { // use a while loop for multiple occurrences
    String param1 = m.group(3);
    String param2 = m.group(5);
    // process the result...
}
If you want to tolerate optional whitespace around the tokens, use this one:
func1\s*\(\s*([^\s]{1,64})\s*,\s*([^\s\)]{1,64})\s*\)
Example:
// escape backslash!
String regex = "func1\\s*\\(\\s*([^\\s]{1,64})\\s*,\\s*([^\\s\\)]{1,64})\\s*\\)";
String input = "func2(func1 ( ef5b, 7dbdd ))";
Pattern p = Pattern.compile(regex); // java.util.regex.Pattern
Matcher m = p.matcher(input); // java.util.regex.Matcher
if (m.find()) { // use a while loop for multiple occurrences
    String param1 = m.group(1);
    String param2 = m.group(2);
    // process the result...
}
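The "use while loop for multiple occurrences" comment could look something like the sketch below, which reuses the whitespace-tolerant regex from this answer. The class name, the extract method and the second func1 call in the sample input are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Func1Params {

    // Same regex as in the answer: group 1 is parameter1, group 2 is parameter2.
    private static final Pattern FUNC1 = Pattern.compile(
            "func1\\s*\\(\\s*([^\\s]{1,64})\\s*,\\s*([^\\s\\)]{1,64})\\s*\\)");

    // Collects the parameter pair of every func1(...) call found in the input.
    static List<List<String>> extract(String input) {
        List<List<String>> calls = new ArrayList<>();
        Matcher m = FUNC1.matcher(input);
        while (m.find()) { // each find() resumes after the previous match
            calls.add(Arrays.asList(m.group(1), m.group(2)));
        }
        return calls;
    }

    public static void main(String[] args) {
        String input = "func2(func1(ef5b, 7dbdd)) + func1 ( a1 , b2 )";
        System.out.println(extract(input)); // [[ef5b, 7dbdd], [a1, b2]]
    }
}
```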
Hope this helps:
func1[^\(]*\(\s*([^,]{1,64}),\s*([^\)]{1,64})\s*\)
(func1) (\() (.{1,64}) (,\\s*) ([^)]{1,64}) (\))
^.*(func1)(\()(.{1,64})(,\s*)(.{1,64}[A-Za-z\d])(\))+
Working example: here

IndexOutOfBoundsException when using Matcher.find()

This Java program throws an IndexOutOfBoundsException when it invokes group(1). If I replace 1 with 0, the whole line is printed. What do I have to do?
Pattern pattern = Pattern.compile("<abhi> abhinesh </abhi>");
Matcher matcher = pattern.matcher("<abhi> abhinesh </abhi>");
if (matcher.find())
    System.out.println(matcher.group(1));
else
    System.out.println("Not found");
Group indexes start at 0, and group 0 is the whole match, so use matcher.group(0).
Edit: To match the text between the tags, use this regex: <abhi>(.*)<\\/abhi>
This post may shed more light on your question:
Confused about Matcher Group.
In short, your pattern defines no capturing groups, so there is no group 1 to reference; only group 0, the full match, exists.
If you add capturing groups to the regular expression to parse the XML, as below, you'll notice that group 0 holds the full string, group 1 the opening tag name, group 2 the value, and group 3 the closing tag name.
Pattern pattern = Pattern.compile("<([a-z]+)>([a-z ]+)</([a-z]+)>");
Matcher matcher = pattern.matcher("<abhi> abhinesh </abhi>");
if (matcher.find()) {
    System.out.println(matcher.group(0)); // <abhi> abhinesh </abhi>
    System.out.println(matcher.group(1)); // abhi
    System.out.println(matcher.group(2)); // " abhinesh " (including the surrounding spaces)
    System.out.println(matcher.group(3)); // abhi
} else {
    System.out.println("Not found");
}
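One way to guard against the exception from the question is to check Matcher.groupCount(), which reports how many capturing groups the pattern defines, before asking for group 1. The helper below is a hypothetical sketch of that check, not part of any answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {

    // Returns group 1 when the pattern defines at least one capturing group,
    // otherwise the whole match (group 0); returns null when nothing matches.
    static String firstGroupOrWhole(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return m.groupCount() >= 1 ? m.group(1) : m.group();
    }

    public static void main(String[] args) {
        String input = "<abhi> abhinesh </abhi>";
        // No capturing groups: groupCount() is 0, so the whole match is returned.
        System.out.println(firstGroupOrWhole("<abhi> abhinesh </abhi>", input));
        // One capturing group: the text between the tags, spaces included.
        System.out.println("[" + firstGroupOrWhole("<abhi>(.*)</abhi>", input) + "]"); // [ abhinesh ]
    }
}
```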
Try this regex:
<abhi>(.*)<\\/abhi>
The text you're after will be stored in the first capture group.
Example:
String regex = "<abhi>(.*)<\\/abhi>";
String input = "<abhi>foo</abhi>";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
if (m.find()) {
    System.out.println(m.group(1));
}

Pattern matching for character and end of line

I have a string which is in following format:
I am extracting this Hello:A;B;C, also Hello:D;E;F
How do I extract the strings A;B;C and D;E;F?
I have written the code snippet below, but it fails to extract the last match, D;E;F:
Pattern pattern = Pattern.compile("(?<=Hello:).*?(?=,)");
The $ matches the end of the input (or the end of a line in MULTILINE mode).
Thus this should work:
Pattern pattern = Pattern.compile("(?<=Hello:).*?(?=,|$)");
So you look ahead for either a comma or the end of the input.
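For completeness, here is a small sketch that runs this pattern against the string from the question and collects every match. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HelloValues {

    // Lazily matches the text after "Hello:" up to the next comma or the end of input.
    private static final Pattern VALUE = Pattern.compile("(?<=Hello:).*?(?=,|$)");

    static List<String> extract(String text) {
        List<String> values = new ArrayList<>();
        Matcher m = VALUE.matcher(text);
        while (m.find()) {
            values.add(m.group());
        }
        return values;
    }

    public static void main(String[] args) {
        String text = "I am extracting this Hello:A;B;C, also Hello:D;E;F";
        System.out.println(extract(text)); // [A;B;C, D;E;F]
    }
}
```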
Try this:
String test = "I am extracting this Hello:Word;AnotherWord;YetAnotherWord, also Hello:D;E;F";
// any word optionally followed by ";" three times, the whole thing followed by either two non-word characters or EOL
Pattern pattern = Pattern.compile("(\\w+;?){3}(?=\\W{2,}|$)");
Matcher matcher = pattern.matcher(test);
while (matcher.find()) {
    System.out.println(matcher.group());
}
Output:
Word;AnotherWord;YetAnotherWord
D;E;F
Assuming you mean omitting certain patterns in a string:
String s = "I am extracting this Hello:A;B;C, also Hello:D;E;F" ;
ArrayList<String> tokens = new ArrayList<String>();
tokens.add( "A;B;C" );
tokens.add( "D;E;F" );
for( String tok : tokens )
{
    if( s.contains( tok ) )
    {
        s = s.replace( tok, "");
    }
}
System.out.println( s );

Regex for matching pattern within quotes

I have some input data such as
some string with 'hello' inside 'and inside'
How can I write a regex so that all occurrences of the quoted text are returned, no matter how many times it is repeated?
I have code that returns a single quoted string, but I want it to return multiple occurrences:
String mydata = "some string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'(.*?)+'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
    System.out.println(matcher.group());
}
This finds all occurrences for me:
String mydata = "some '' string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'[^']*'");
Matcher matcher = pattern.matcher(mydata);
while(matcher.find())
{
    System.out.println(matcher.group());
}
Output:
''
'hello'
'and inside'
Pattern description:
' // opening quote
[^'] // any character that is not a single quote
* // zero or more of those non-quote characters
' // closing quote
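If only the text inside the quotes is wanted, without the quote characters themselves, a capturing group can be wrapped around the [^']* part and read with group(1). This is a sketch under that assumption; the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedText {

    // Group 1 captures the characters between a pair of single quotes.
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    static List<String> extract(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = QUOTED.matcher(s);
        while (m.find()) {
            found.add(m.group(1)); // group(1) excludes the surrounding quotes
        }
        return found;
    }

    public static void main(String[] args) {
        String mydata = "some string with 'hello' inside 'and inside'";
        System.out.println(extract(mydata)); // [hello, and inside]
    }
}
```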
I believe this should fit your requirements:
\'\w+\'
\'.*?' is the regex you are looking for.
