How to truncate a string after 5 delimiter in java? - java

String s = aaa-bbb-ccc-ddd-ee-23-xyz;
I need to convert the above string into aaa-bbb-ccc-ddd-ee, which means my output should only print words before fifth delimiter. could any help to solve this?

You could use a Regex:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern p = Pattern.compile("^\\w+\\-\\w+\\-\\w+\\-\\w+\\-\\w+");
Matcher matcher = p.matcher(s);
matcher.find();
System.out.println(matcher.group(0));
Output is aaa-bbb-ccc-ddd-ee
If you have more than just letters you can replace the \\w with [^\\-] which grabs all characters but the delemiter.

Use Pattern and Matcher like this:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern pattern = Pattern.compile("^((.+?-){4}[^-]+).*$");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
s = matcher.group(1);
}
.* - search all symbols. ? - for lazy work
(.*?-) - search character sequence which end with symbol '-'
{4} - in your result string '-' 4 times
[^-]+ - after you search characters without '-'
.* - another characters after you serch
matcher.group(1) - return first group. This is ((.+?-){4}[^-]+)

Related

How parse key-value with regex

i use Kotlin \ Java for parse some string.
My regex:
\[\'(.*?)[\]]=\'(.*?)(?!\,)[\']
text for parse:
someArray1['key1'] = 'value1', someArray2['key2'] = 'value2', ignoreText=ignore, some['key3'] = 'value3', ignoreMe['ignore']=ignore, some['key4'] = 'value4'..
i need result:
key1=value1
key2=value2
key3=value3
key4=value4
Thanks for help
Another regex for you
\['(\w+)'\]\s+(=)\s+'(\w+)'
Regex101 Demo Fiddle
Java test code
String str = "someArray1['key1'] = 'value1', someArray2['key2'] = 'value2', ignoreText=ignore, some['key3'] = 'value3', ignoreMe['ignore']=ignore, some['key4'] = 'value4'..";
String regex = "\\['(\\w+)'\\]\\s+(=)\\s+'(\\w+)'";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1) + matcher.group(2) + matcher.group(3));
}
Test result:
key1=value1
key2=value2
key3=value3
key4=value4
A few notes about the pattern that you tried
In your pattern you are not matching the spaces around the equals sign.
Also note that this part (?!\,)[\'] will always work as it says that it asserts not a comma to the right, and then matches a single quote.
You don't have to escape the \' and the single characters do not have to be in a character class.
You can use a pattern with a negated character class to capture the values between the single quotes to prevent .*? matching too much as the dot can match any character.
You might write the pattern as
\['([^']*)'\]\h+=\h+'([^']*)'
The pattern matches:
\[' Match ['
( Capture group 1
[^']* Match optional chars other than '
) Close group 1
'\] Match ']
\h+=\h+ Match an equals sign between 1 or more horizontal whitespace characters
'([^']*)' Capture group 2 which has the same pattern as group 1
Regex demo | Java demo
Example
String regex = "\\['([^']*)'\\]\\h+=\\h+'([^']*)'";
String string = "someArray1['key1'] = 'value1', someArray2['key2'] = 'value2', ignoreText=ignore, some['key3'] = 'value3', ignoreMe['ignore']=ignore, some['key4'] = 'value4'..";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(1) + "=" + matcher.group(2));
}
Output
key1=value1
key2=value2
key3=value3
key4=value4

extract a set of a characters between some characters

I have a string email = John.Mcgee.r2d2#hitachi.com
How can I write a java code using regex to bring just the r2d2?
I used this but got an error on eclipse
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = patter.matcher
for (Strimatcher.find()){
System.out.println(matcher.group(1));
}
To match after the last dot in a potential sequence of multiple dots request that the sequence that you capture does not contain a dot:
(?<=[.])([^.]*)(?=#)
(?<=[.]) means "preceded by a single dot"
(?=#) means "followed by # sign"
Note that since dot . is a metacharacter, it needs to be escaped either with \ (doubled for Java string literal) or with square brackets around it.
Demo.
Not sure if your posting the right code. I'll rewrite it based on what it should look like though:
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = pattern.matcher(email);
int count = 0;
while(matcher.find()) {
count++;
System.out.println(matcher.group(count));
}
but I think you just want something like this:
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = pattern.matcher(email);
if(matcher.find()){
System.out.println(matcher.group(1));
}
No need to Pattern you just need replaceAll with this regex .*\.([^\.]+)#.* which mean get the group ([^\.]+) (match one or more character except a dot) which is between dot \. and #
email = email.replaceAll(".*\\.([^\\.]+)#.*", "$1");
Output
r2d2
regex demo
If you want to go with Pattern then you have to use this regex \\.([^\\.]+)# :
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.([^\\.]+)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
System.out.println(matcher.group(1));// Output : r2d2
}
Another solution you can use split :
String[] split = email.replaceAll("#.*", "").split("\\.");
email = split[split.length - 1];// Output : r2d2
Note :
Strings in java should be between double quotes "John.Mcgee.r2d2#hitachi.com"
You don't need to escape # in Java, but you have to escape the dot with double slash \\.
There are no syntax for a for loop like you do for (Strimatcher.find()){, maybe you mean while

Regular expression for extracting instance ID, AMI ID, Volume ID

Given the following string
Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305
I want to be able to extract the following using a regular expression
i-b9b4ffaa
ami-dbcf88b1
vol-e97db305
This is the regular expression I came up with, which currently doesn't do what I need :
Pattern p = Pattern.compile("Created by CreateImage([a-z]+[0.9]+)([a-z]+[0.9]+)([a-z]+[0.9]+)",Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305");
System.out.println(m.matches()); --> false
You may match all words starting with letters, followed with a hyphen, and then having alphanumeric chars:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+");
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
System.out.println(matcher.group(0));
}
// => i-b9b4ffaa, ami-dbcf88b1, vol-e97db305
See the Java demo
Pattern details:
(?i) - a case insensitive modifier (embedded flag option)
\\b - a word boundary
[a-z]+ - 1 or more ASCII letters
- - a hyphen
[a-z0-9]+ - 1 or more alphanumerics.
To make sure these values appear on the same line after Created by CreateImage, use a \G-based regex:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)(?:Created by CreateImage|(?!\\A)\\G)(?:(?!\\b[a-z]+-[a-z0-9]+).)*\\b([a-z]+-[a-z0-9]+)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
System.out.println(matcher.group(1));
}
See this demo.
Note that the above pattern is based on the \G operator that matches the end of the last successful match (so we only match after a match or after Created...) and a tempered greedy token (?:(?!\\b[a-z]+-[a-z0-9]+).)* (matching any symbol other than a newline that does not start a sequence: word boundary+letters+-+letters|digits) that is very resource consuming.
You should consider using a two-step approach to first check if a string starts with Created... string, and then process it:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
if (s.startsWith("Created by CreateImage")) {
Matcher n = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+").matcher(s);
while(n.find()) {
System.out.println(n.group(0));
}
}
See another demo

Splitting a string composed of numbers and alphabets

I want to break a string like :
String s = "xyz213123kop234430099kpf4532";
into tokens where each token starts with an alphabet and ends with a number. So the above string can be broken down into 3 tokens :
xyz213123
kop234430099
kpf4532
This string s could be very big but the pattern will remain the same, i.e each token will start with 3 alphabets and end with a number.
How do I split them ?
Try this:
\w+?\d+
Java Matcher:
Pattern pattern = Pattern.compile("\\w+?\\d+"); //compiles the pattern we want to use
Matcher matcher = pattern.matcher("xyz213123kop234430099kpf4532"); //we create the matcher on certain string using our pattern
while(matcher.find()) //while the matcher can find the next match
{
System.out.println(matcher.group()); //print it
}
And then you could use Regex.Matches C#:
foreach(Match m in Regex.Matches("xyz213123kop234430099kpf4532", #"\w+?\d+"))
{
Console.WriteLine(m.Value);
}
And for the future this:
RegExr
Do it like this,
String s = "xyz213123kop234430099kpf4532";
Pattern p = Pattern.compile("\\w+?\\d+");
Matcher match = p.matcher(s);
while(match.find()){
System.out.println(match.group());
}
OUTPUT
xyz213123
kop234430099
kpf4532
You can start from such regexp: (\w+?\d+)
http://regexr.com?36utt

Regex for matching pattern within quotes

I have some input data such as
some string with 'hello' inside 'and inside'
How can I write a regex so that the quoted text (no matter how many times it is repeated) is returned (all of the occurrences).
I have a code that returns a single quotes, but I want to make it so that it returns multiple occurances:
String mydata = "some string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'(.*?)+'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
System.out.println(matcher.group());
}
Find all occurences for me:
String mydata = "some '' string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'[^']*'");
Matcher matcher = pattern.matcher(mydata);
while(matcher.find())
{
System.out.println(matcher.group());
}
Output:
''
'hello'
'and inside'
Pattern desciption:
' // start quoting text
[^'] // all characters not single quote
* // 0 or infinite count of not quote characters
' // end quote
I believe this should fit your requirements:
\'\w+\'
\'.*?' is the regex you are looking for.

Categories

Resources