Regular expression split string on colon

I have a string:
String l = "name: kumar age: 22 relationship: single ";
It comes from the UI dynamically. I need to split it into:
name: kumar
age: 22
relationship: single
My code is:
Pattern ptn = Pattern.compile("([^\\s]+( ?= ?[^\\s]*)?)");
Matcher mt = ptn.matcher(l);
while(mt.find())
{
String col_dat=mt.group(0);
if(col_dat !=null && col_dat.length()>0)
{
System.out.println("\t"+col_dat );
}
}
Any suggestions would be appreciated. Thank you.

You can use this regex:
\S+\s*:\s*\S+
Or this:
\w+\s*:\s*\w+
Demo: https://regex101.com/r/EgXlcD/6
Regex:
\S+ - 1 or more non-whitespace characters
\s* - 0 or more whitespace characters
\w+ - 1 or more \w characters, i.e. [A-Za-z0-9_]
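As a minimal sketch of how the first pattern could be applied in Java, the snippet below collects every match with Matcher.find(). The class name ColonPairs and the helper pairs are hypothetical names chosen for illustration; only the regex and the sample string come from the posts above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColonPairs {
    // Key, optional whitespace, a colon, optional whitespace, then a single-token value.
    private static final Pattern PAIR = Pattern.compile("\\S+\\s*:\\s*\\S+");

    // Hypothetical helper: returns each "key: value" match in order of appearance.
    static List<String> pairs(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String l = "name: kumar age: 22 relationship: single ";
        for (String pair : pairs(l)) {
            System.out.println(pair);
        }
    }
}
```

Because the value part is a single \S+ run, a value that itself contains spaces would be cut at the first space, so this only fits input shaped like the sample.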

Related

Regex for italic markdown

I've been trying for hours: I need a regex that selects everything between underscores.
Example:
\_italic\_
The only condition is that it must ignore \\_ (a backslash followed by an underscore).
So this would be a match (all the text inside the \_):
\_italic some text 123 \\_*%&$ _
So far I have this regex:
(\_.*?\_)(?!\\\_)
But it is not ignoring the \\_.
Which regex would work?

You can use
(?s)(?<!\\)(?:\\{2})*_((?:[^\\_]|\\.)+)_
See the regex demo. Details:
(?s) - an inline embedded flag option equal to Pattern.DOTALL
(?<!\\)(?:\\{2})* - a position that is not immediately preceded with a backslash and then zero or more sequences of double backslashes
_ - an underscore
((?:[^\\_]|\\.)+) - Capturing group 1: one or more occurrences of any char other than a \ and _, or any escaped char (a combination of a \ and any one char)
_ - an underscore
See the Java demo:
List<String> strs = Arrays.asList("xxx _italic some text 123 \\_*%&$ _ xxx",
"\\_test_test_");
String regex = "(?s)(?<!\\\\)(?:\\\\{2})*_((?:[^\\\\_]|\\\\.)+)_";
Pattern p = Pattern.compile(regex);
for (String str : strs) {
Matcher m = p.matcher(str);
List<String> result = new ArrayList<>();
while(m.find()) {
result.add(m.group(1));
}
System.out.println(str + " => " + String.join(", ", result));
}
Output:
xxx _italic some text 123 \_*%&$ _ xxx => italic some text 123 \_*%&$
\_test_test_ => test

How to extract a string till the end of the line with regular expression

I have strings (containing Portuguese characters) with the following structure: each contains Name: followed by some words.
Example:
String myStr1 = "aaad Name: bla and more blá\n gdgdf ppp";
String myStr2 = "bbbb Name: Á different blÁblÁ\n hhhh fjjj";
I need to extract the string from 'Name:' till the end of the line.
example:
extract(myStr1) = "Name: bla and more blá"
extract(myStr2) = "Name: Á different blÁblÁ"
Edit after @blue_note's answer:
Here is what I tried:
public static String extract(String myStr) {
Pattern p = compile("Name:(?m)^.*$");
Matcher m = p.matcher(myStr);
while (m.find()) {
String theGroup = m.group(0);
System.out.format("'%s'\n", theGroup);
return m.group(0);
}
return null;
}
It did not work.

The regex is "^\\w*\\s*((?m)Name.*$)"
where
(?m) enables multiline mode
^, $ denote the start and end of a line respectively (the leading ^ comes before (?m), so it anchors at the start of the input)
.* means any character except a line terminator, any number of times
Then take group(1), not group(0), of the match.
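A small sketch of how that regex might be wired into the extract method from the question follows. The class name ExtractNameLine is a hypothetical wrapper; the pattern and the two sample strings are the ones from the posts above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractNameLine {
    // Group 1 starts at "Name" and runs to the end of that line.
    private static final Pattern NAME_LINE =
            Pattern.compile("^\\w*\\s*((?m)Name.*$)");

    // Returns the "Name: ..." part up to the end of its line, or null if absent.
    static String extract(String myStr) {
        Matcher m = NAME_LINE.matcher(myStr);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String myStr1 = "aaad Name: bla and more blá\n gdgdf ppp";
        String myStr2 = "bbbb Name: Á different blÁblÁ\n hhhh fjjj";
        System.out.println(extract(myStr1));
        System.out.println(extract(myStr2));
    }
}
```

The prefix `^\\w*\\s*` allows at most one word before Name, which fits the sample strings; input with more text in front of Name would need a looser pattern.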

You could also use substring in this case, searching for the line break from the position of "Name:" onward:
int start = myStr1.indexOf("Name:");
String name = myStr1.substring(start, myStr1.indexOf("\n", start));

How to remove unneeded white spaces and last word of a string with Regex?

I have a string that contains words in a pattern like this:
2013-2014  XXX 29 
2011-2012  XXXX 44
Please note that there are 2 whitespaces before AND after the year.
I need to remove the first 2 whitespaces, the 1 whitespace after the year and the last word (29/44 etc).
So it will become like this:
2013-2014 XXX
2011-2012 XXXX
I'm really bad with regex, so any help would be appreciated. So far I can remove the last word with
str.replaceAll(" [^ ]+$", "");

Select only what you want and replace the rest (with a space in the middle) :)
This should work for you :
public static void main(String[] args) throws IOException {
String s1 = " 2013-2014 XXX 29 ";
System.out.println(s1.replaceAll("^\\s+([\\d-]+)\\s+(\\w+).*", "$1 $2"));
String s2 = " 2011-2012 XXXX 44 ";
System.out.println(s2.replaceAll("^\\s+([\\d-]+)\\s+(\\w+).*", "$1 $2"));
}
Output:
2013-2014 XXX
2011-2012 XXXX

You can use a single regex for this:
str = str.replaceAll("^ +|(?<=\\d{4} ) | [^ ]+ *$", "");
RegEx Breakup:
^ +             # 1 or more spaces at the start
|               # OR
(?<=\\d{4} )    # followed by one space: the space after a 4-digit year and a space
|               # OR
 [^ ]+ *$       # the last space, the text after it, and any trailing spaces
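To see the three alternatives working together, here is a minimal sketch. The class name TrimYearLine and the helper clean are hypothetical, and the inputs are an assumption rebuilt from the question's description (two spaces before and two after the year), since the spacing in the samples above may have been collapsed.

```java
public class TrimYearLine {
    // Removes leading spaces, the extra space after the year, and the last word.
    static String clean(String str) {
        return str.replaceAll("^ +|(?<=\\d{4} ) | [^ ]+ *$", "");
    }

    public static void main(String[] args) {
        // Brackets make any leftover leading or trailing spaces visible.
        System.out.println("[" + clean("  2013-2014  XXX 29 ") + "]");
        System.out.println("[" + clean("  2011-2012  XXXX 44") + "]");
    }
}
```

With those inputs the calls should yield 2013-2014 XXX and 2011-2012 XXXX, each alternative removing one of the unwanted pieces in a single replaceAll pass.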

You could also do it in several easier-to-understand steps, like this:
public static void main(String[]args){
String s = " 2011-2012 XXXX 44";
// Remove leading and trailing whitespace
s = s.trim();
System.out.println(s);
// replace two or more whitespaces with a single whitespace
s = s.replaceAll("\\s{2,}", " ");
System.out.println(s);
// remove the last word and the whitespace before it
s = s.replaceAll("\\s\\w*$", "");
System.out.println(s);
}
Output:
2011-2012 XXXX 44
2011-2012 XXXX 44
2011-2012 XXXX

You can also try this:
str = str.replaceAll("\\s{2}", " ").trim();
Example:
String str = " 2013-2014 XXX 29 ";
Now:
str.replaceAll("\\s{2}", " ");
Output: " 2013-2014 XXX 29 "
And with .trim() it looks like this: "2013-2014 XXX 29"

Java repeated character regex with condition

I have a large database and want to check it for capitalization errors. I use this pattern to find repeated characters. The pattern works, but I need to add start and end conditions on the string.
Pattern:
(\w)\1+
Target String:
Javaaa
result: aaa
I want to add a condition to the regex: the string must start with Ja and end with a*. The result must be only the repeated characters.
(I don't want to handle this programmatically; the regex alone should do it, if possible.
I'm doing this with String.replaceAll(regex, string), not with the
Pattern or Matcher classes.)

You may use a lookahead anchored at the leading word boundary:
\b(?=Ja\w*a\b)\w*?((\w)\2+)\w*\b
See the regex demo
Details:
\b - leading word boundary
(?=Ja\w*a\b) - a positive lookahead that requires the whole word to start with Ja, then it can have 0+ word characters and end with a
\w*? - 0+ word characters but as few as possible
((\w)\2+) - Group 1 matching identical consecutive characters
\w* - any remaining word characters (0 or more)
\b - trailing word boundary.
The result you are seeking is in Group 1.
String s = "Prooo\nJavaaa";
Pattern pattern = Pattern.compile("\\b(?=Ja\\w*a\\b)\\w*?((\\w)\\2+)\\w*\\b");
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
System.out.println(matcher.group(1));
}
See the Java demo.
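Since the question asks for String.replaceAll rather than Pattern and Matcher, the same pattern can be combined with a $1 replacement so that each matching word is replaced by its repeated run. This is only a sketch: the class name RepeatedRun and the helper repeatedRun are hypothetical, and words that do not match are returned unchanged.

```java
public class RepeatedRun {
    // Same pattern as above: word must start with "Ja" and end with "a";
    // group 1 holds the first run of identical consecutive characters.
    static final String REGEX = "\\b(?=Ja\\w*a\\b)\\w*?((\\w)\\2+)\\w*\\b";

    // Replaces each matching word with the content of group 1.
    static String repeatedRun(String text) {
        return text.replaceAll(REGEX, "$1");
    }

    public static void main(String[] args) {
        System.out.println(repeatedRun("Javaaa"));
        System.out.println(repeatedRun("Prooo"));
    }
}
```

Here "Javaaa" becomes "aaa", while "Prooo" is left as is because it fails the lookahead, so a caller would still need some way to tell matched rows from unmatched ones.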

Another code example (inspired by @Wiktor Stribizew's code), following your expected input and output format.
public static void main( String[] args )
{
String[] input =
{ "Javaaa", "Javaaaaaaaaa", "Javaaaaaaaaaaaaaaaaaa", "Paoooo", "Paoooooooo", "Paooooooooxxxxxxxxx" };
for ( String str : input )
{
System.out.println( "Target String :" + str );
Pattern pattern = Pattern.compile( "((.)\\2+)" );
Matcher matcher = pattern.matcher( str );
while ( matcher.find() )
{
System.out.println( "result: " + matcher.group() );
}
System.out.println( "---------------------" );
}
System.out.println( "Finish" );
}
Output:
Target String :Javaaa
result: aaa
---------------------
Target String :Javaaaaaaaaa
result: aaaaaaaaa
---------------------
Target String :Javaaaaaaaaaaaaaaaaaa
result: aaaaaaaaaaaaaaaaaa
---------------------
Target String :Paoooo
result: oooo
---------------------
Target String :Paoooooooo
result: oooooooo
---------------------
Target String :Paooooooooxxxxxxxxx
result: oooooooo
result: xxxxxxxxx
---------------------
Finish

Java pattern matching using regex

I am new to Java coding and to pattern matching. I am reading this string from a file. So, this will give a compilation error. I have a string as follows:
String str = "find(\"128.210.16.48\",\"Hello Everyone\")" ; // no compile error
I want to extract the "128.210.16.48" value and "Hello Everyone" from the above string. These values are not constant.
Can you please give me some suggestions?
Thanks

I suggest using the String#split() method, but if you still want a regex pattern, try this one and get the matched group at index 1.
("[^"][\d\.]+"|"[^)]*+)
Sample code:
String str = "find(\"128.210.16.48\",\"Hello Everyone\")";
String regex = "(\"[^\"][\\d\\.]+\"|\"[^)]*+)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
output:
"128.210.16.48"
"Hello Everyone"
Pattern explanation:
(            group and capture to \1:
"            '"'
[^"]         any character except: '"'
[\d\.]+      any character of: digits (0-9), '\.' (1 or more times (matching the most amount possible))
"            '"'
|            OR
"            '"'
[^)]*+       any character except: ')' (0 or more times (matching the most amount possible, without backtracking))
)            end of \1

Try with String.split()
String str = "find(\"128.210.16.48\",\"Hello Everyone\")" ;
System.out.println(str.split(",")[0].split("\"")[1]);
System.out.println(str.split(",")[1].split("\"")[1]);
Output:
128.210.16.48
Hello Everyone
Edit:
Explanation:
To get the first value, split the string on the comma (,) and take the first element, str.split(",")[0]. Split that again on the double quote (") and take the second element, split("\"")[1]. The second value is obtained the same way from str.split(",")[1].

The accepted answer is fine, but if for some reason you (or whoever finds this question) still want to use regex instead of String.split, here's something:
String str = "find(\"128.210.16.48\",\"Hello Everyone\")" ; // no compile error
String regex1 = "\".+?\"";
Pattern pattern1 = Pattern.compile(regex1);
Matcher matcher1 = pattern1.matcher(str);
while (matcher1.find()){
System.out.println("Matcher 1 found (trimmed): " + matcher1.group().replace("\"",""));
}
Output:
Matcher 1 found (trimmed): 128.210.16.48
Matcher 1 found (trimmed): Hello Everyone
Note: this will only work if " is only used as a separator character. See Braj's demo as an example from the comments here.
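A slight variation on the same idea, offered as a sketch rather than a replacement: putting the quoted content in a capturing group avoids the separate replace call. The class name QuotedArgs and the helper quoted are hypothetical, and the same caveat applies, namely that " must appear only as a delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedArgs {
    // A double quote, any run of non-quote characters (group 1), a closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Returns the text between each pair of double quotes, without the quotes.
    static List<String> quoted(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = QUOTED.matcher(s);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String str = "find(\"128.210.16.48\",\"Hello Everyone\")";
        for (String value : quoted(str)) {
            System.out.println(value);
        }
    }
}
```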

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.
