java extract digits from string - java

String str = "POLYGON((39.4189453125 37.418708616699824,42.0556640625 37.418708616699824,43.4619140625 34.79181436843146,38.84765625 33.84817790215085,39.4189453125 37.418708616699824))";
I have tried to get only 39.4189453125 37.418708616699824,42.0556640625 37.418708616699824,43.4619140625 34.79181436843146,38.84765625 33.84817790215085,39.4189453125 37.418708616699824
with
final String coordsOnly = str.replaceAll("\\D\\(\\(\\w\\)\\)", "$2");
but I still get the unchanged string: coordsOnly = "POLYGON((39.4189453125 37.418708616699824,42.0556640625 37.418708616699824,43.4619140625 34.79181436843146,38.84765625 33.84817790215085,39.4189453125 37.418708616699824))"
What am I missing?

One reasonable approach here would be to match the following pattern and then replace all with the first capture group:
POLYGON\(\((.*?)\)\)
Sample code:
final String coordsOnly = str.replaceAll("POLYGON\\(\\((.*?)\\)\\)", "$1");
Edit:
If you also need to isolate the pairs of numbers, you can just use String#split() on comma. Actually, this looks like output from a database query, and I suspect that the database may offer a better way of getting out the individual values. But the answers given here are an option for you in case you can't get the exact output you need already.
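As a rough sketch of the split-on-comma idea, the following strips the wrapper with the pattern above and then parses each pair into two doubles. The class and method names (PolygonCoords, parse) are made up for illustration, and the code assumes every pair is exactly "x y" separated by whitespace.

```java
import java.util.ArrayList;
import java.util.List;

class PolygonCoords {
    // Strips the POLYGON((...)) wrapper, then turns each "x y" pair into a double[2].
    // Assumes well-formed input; a malformed pair would throw at parse time.
    static List<double[]> parse(String wkt) {
        String coordsOnly = wkt.replaceAll("POLYGON\\(\\((.*?)\\)\\)", "$1");
        List<double[]> points = new ArrayList<>();
        for (String pair : coordsOnly.split(",")) {
            String[] xy = pair.trim().split("\\s+");
            points.add(new double[] { Double.parseDouble(xy[0]), Double.parseDouble(xy[1]) });
        }
        return points;
    }

    public static void main(String[] args) {
        String str = "POLYGON((39.4189453125 37.418708616699824,42.0556640625 37.418708616699824))";
        for (double[] p : parse(str)) {
            System.out.println(p[0] + " " + p[1]);
        }
    }
}
```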

Actually, a lot.
"\\D" // Matches a non-digit character, but it only matches one,
// while you need to match a word "POLYGON";
"\\(\\(" // Good. Matches the double left parentheses ((
"\\w" // One word character? Same issue, you need to match multiple chars. And what about '.'?
"\\)\\)" // Good. Matches the double right parentheses ))
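Also, escaped parentheses \\( and \\) match literal characters and do not create capturing groups, so there is no group for $2 to refer to. And \\w matches exactly one word character [a-zA-Z_0-9]; it will not match '.' at all.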
I believe you should try something like this:
String coords = str.replaceAll("POLYGON\\(\\(([^)]+)\\)\\)", "$1");
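To make the point about groups concrete, here is a small sketch comparing the two calls. The class and method names are hypothetical. It shows that the original pattern simply never matches the input, so replaceAll returns the string as-is, and that on an input where it does match, the reference to the non-existent group 2 raises an exception.

```java
class ReplaceAllGroups {
    // The pattern from the question: one non-digit, "((", one word character, "))".
    // Every parenthesis is escaped, so the pattern has no capturing group.
    static String questionAttempt(String s) {
        return s.replaceAll("\\D\\(\\(\\w\\)\\)", "$2");
    }

    // One real capturing group around everything between "((" and "))".
    static String withCaptureGroup(String s) {
        return s.replaceAll("POLYGON\\(\\(([^)]+)\\)\\)", "$1");
    }

    public static void main(String[] args) {
        String str = "POLYGON((39.4189453125 37.418708616699824,42.0556640625 37.418708616699824))";
        // No match, so the input comes back unchanged and "$2" is never evaluated.
        System.out.println(questionAttempt(str).equals(str));
        System.out.println(withCaptureGroup(str));
        try {
            // Here the pattern does match, and the missing group becomes an error.
            questionAttempt("N((7))");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("no such group: " + e.getMessage());
        }
    }
}
```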

Maybe this is what you want:
String rex = "[+-]?([0-9]*[.])?[0-9]+";
Pattern p = Pattern.compile(rex);
String input = "POLYGON((39.4189453125 37.418708616699824,42.0556640625 37.418708616699824,43.4619140625 34.79181436843146,38.84765625 33.84817790215085,39.4189453125 37.418708616699824))";
Matcher matcher = p.matcher(input);
while(matcher.find()) {
System.out.println(matcher.group());
}
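If the numbers are needed as values rather than printed, the same pattern can feed a list. This is only a sketch with made-up names (NumberExtractor, numbers). Note that the pattern knows nothing about exponent notation, so an input like 1e5 would come out as two numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberExtractor {
    // Optional sign, optional integer part followed by a dot, then at least one digit.
    private static final Pattern NUMBER = Pattern.compile("[+-]?([0-9]*[.])?[0-9]+");

    // Collects every match of NUMBER in the input as a Double, in order of appearance.
    static List<Double> numbers(String input) {
        List<Double> result = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            result.add(Double.parseDouble(m.group()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(numbers("POLYGON((39.4189453125 37.418708616699824,42.0556640625 37.418708616699824))"));
    }
}
```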

Try this version:
str = str.replaceAll("[^\\d\\.\\s,]", "");

String str = "POLYGON((39.4189453125 37.418708616699824,42.0556640625 37.418708616699824,43.4619140625 34.79181436843146,38.84765625 33.84817790215085,39.4189453125 37.418708616699824))";
String sp = "(([0-9]+[.])?[0-9]+[,]?\\s*)+";
Pattern p = Pattern.compile(sp);
Matcher matcher = p.matcher(str);
if (matcher.find()) {
System.out.println(matcher.group());
}
output:
39.4189453125 37.418708616699824,42.0556640625 37.418708616699824,43.4619140625 34.79181436843146,38.84765625 33.84817790215085,39.4189453125 37.418708616699824

Another alternative is to remove every character that is not a digit, whitespace, dot, or comma, using character class intersection:
str.replaceAll("[\\D&&\\S&&[^,\\.]]", "")
Output:
39.4189453125 37.418708616699824,42.0556640625 37.418708616699824,43.4619140625 34.79181436843146,38.84765625 33.84817790215085,39.4189453125 37.418708616699824

Related

Java Regular expression for exacted matched case

I have been struggling to find the matching substring(s) with a Java regular expression for the syntax {//<some string>/<some string>}
My regular expression should return matches such as: {//data/process_id}
Below is the string in which I want to find that syntax:
#process_id={//data/process_id}##history_id={//data/history_id}##Pdataxml={//data/dataxml}##Prules =_UNESCAPEXMLVALUE({//data/rules})##submitted_by={//data/submitted_by}##table_definition={//data/table_definition}
I tried the regex pattern below, but it did not work:
[a-zA-Z_/\\[\\]\\(\\)0-9|]+
Can someone please help me to solve this issue?
You can use the following regex:
\{\/\/[^\/{}\s]*\/[^\/{}\s]*\}
code:
String input = "#process_id={//data/process_id}##history_id={//data/history_id}##Pdataxml={//data/dataxml}##Prules =_UNESCAPEXMLVALUE({//data/rules})##submitted_by={//data/submitted_by}##table_definition={//data/table_definition}";
List<String> allMatches = new ArrayList<String>();
Matcher m = Pattern.compile("\\{\\/\\/[^\\/{}\\s]*\\/[^\\/{}\\s]*\\}").matcher(input);
while (m.find()) {
allMatches.add(m.group());
}
System.out.println(allMatches);
output:
[{//data/process_id}, {//data/history_id}, {//data/dataxml}, {//data/rules}, {//data/submitted_by}, {//data/table_definition}]
Try this regex with a Matcher:
"\\{//([^/]+)/([^/}]+)}"
The parts are captured in groups 1 and 2.
Like this:
Matcher m = Pattern.compile("\\{//([^/]+)/([^/}]+)}").matcher(str);
while (m.find()) {
String part1 = m.group(1);
String part2 = m.group(2);
// do something with the parts
}
To grab the whole path without the surrounding braces directly from m.group(), use lookarounds instead of capture groups:
"(?<=\\{)//[^/]+/[^/}]+(?=})"
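A minimal sketch of how the lookaround variant might be used to collect every path. The names BracePaths and paths are invented for this example, and the closing brace in the lookahead is escaped here purely for readability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracePaths {
    // The lookbehind for "{" and the lookahead for "}" keep the braces out of the match.
    private static final Pattern PATH = Pattern.compile("(?<=\\{)//[^/]+/[^/}]+(?=\\})");

    // Returns every "//first/second" path that is directly wrapped in braces.
    static List<String> paths(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PATH.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "#process_id={//data/process_id}##Prules =_UNESCAPEXMLVALUE({//data/rules})";
        System.out.println(paths(input));
    }
}
```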

Java regular expression to match parameters within a function

I would like to write a regular expression to extract parameter1 and parameter2 from func1(parameter1, parameter2). The length of parameter1 and parameter2 ranges from 1 to 64.
(func1) (\() (.{1,64}) (,\\s*) (.{1,64}) (\))
My version can not deal with the following case (nested function)
func2(func1(ef5b, 7dbdd))
I always get a "7dbdd)" for parameter2. How could I solve this?
Use "anything but closing parenthesis" ([^)]) instead of simply "anything" (.):
(func1) (\() (.{1,64}) (,\s*) ([^)]{1,64}) (\))
Demo: https://regex101.com/r/sP6eS1/1
Use [^)]{1,64} (match all except )) instead of .{1,64} (match any) to stop right before the first )
(func1) (\() (.{1,64}) (,\\s*) (.{1,64}) (\))
                                ^
                                replace . with [^)]
Example:
// remove whitespace and escape backslash!
String regex = "(func1)(\\()(.{1,64})(,\\s*)([^)]{1,64})(\\))";
String input = "func2(func1(ef5b, 7dbdd))";
Pattern p = Pattern.compile(regex); // java.util.regex.Pattern
Matcher m = p.matcher(input); // java.util.regex.Matcher
if(m.find()) { // use while loop for multiple occurrences
String param1 = m.group(3);
String param2 = m.group(5);
// process the result...
}
If you want to ignore whitespace tokens, use this one:
func1\s*\(\s*([^\s]{1,64})\s*,\s*([^\s\)]{1,64})\s*\)
Example:
// escape backslash!
String regex = "func1\\s*\\(\\s*([^\\s]{1,64})\\s*,\\s*([^\\s\\)]{1,64})\\s*\\)";
String input = "func2(func1 ( ef5b, 7dbdd ))";
Pattern p = Pattern.compile(regex); // java.util.regex.Pattern
Matcher m = p.matcher(input); // java.util.regex.Matcher
if(m.find()) { // use while loop for multiple occurrences
String param1 = m.group(1);
String param2 = m.group(2);
// process the result...
}
Hope this helps:
func1[^\(]*\(\s*([^,]{1,64}),\s*([^\)]{1,64})\s*\)
(func1) (\() (.{1,64}) (,\\s*) ([^)]{1,64}) (\))
^.*(func1)(\()(.{1,64})(,\s*)(.{1,64}[A-Za-z\d])(\))+

Creating regex to extract 4 digit number from string using java

Hi, I am trying to build a regex to extract a 4-digit number from a given string using Java. I tried it in the following ways:
String mydata = "get the 0025 data from string";
Pattern pattern = Pattern.compile("^[0-9]+$");
//Pattern pattern = Pattern.compile("^[0-90-90-90-9]+$");
//Pattern pattern = Pattern.compile("^[\\d]+$");
//Pattern pattern = Pattern.compile("^[\\d\\d\\d\\d]+$");
Matcher matcher = pattern.matcher(mydata);
String val = "";
if (matcher.find()) {
System.out.println(matcher.group(1));
val = matcher.group(1);
}
But it's not working properly. How can I do this? I need some help. Thank you.
Change your pattern to:
Pattern pattern = Pattern.compile("(\\d{4})");
\d is for a digit and the number in {} is the number of digits you want to have.
If you want to end up with 0025,
String mydata = "get the 0025 data from string";
mydata = mydata.replaceAll("\\D", ""); // Replace all non-digits
Pattern pattern = Pattern.compile("\\b[0-9]+\\b");
This should do it for you. ^ and $ anchor the pattern to the whole string, so the anchored pattern matches only a string that consists entirely of digits.
Remove the anchors, and add parentheses if you want the digits in group 1:
Pattern pattern = Pattern.compile("([0-9]+)"); //"[0-9]{4}" for 4 digit number
And extract out matcher.group(1)
There are many better answers, but if you still want to keep your code structured the same way:
String mydata = "get the 0025 data from string";
Pattern pattern = Pattern.compile("(?<![-.])\\b[0-9]+\\b(?!\\.[0-9])");
Matcher matcher = pattern.matcher(mydata);
String val = "";
if (matcher.find()) {
System.out.println(matcher.group(0));
val = matcher.group(0);
}
I changed matcher.group(1) to matcher.group(0), because the pattern contains no capturing group.
You can go with \d{4} or [0-9]{4} but note that by specifying the ^ at the beginning of regex and $ at the end you're limiting yourself to strings that contain only 4 digits.
My recommendation: Learn some regex basics.
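The effect of the anchors can be seen in a short sketch. The names here (FourDigits, anchoredFind, firstFourDigitNumber) are hypothetical. The \b word boundaries are an extra assumption on top of \d{4}, added so that four digits inside a longer number are not picked up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FourDigits {
    // Anchored pattern from the question: succeeds only if the whole input is digits.
    static boolean anchoredFind(String input) {
        return Pattern.compile("^[0-9]+$").matcher(input).find();
    }

    // Unanchored pattern; returns the first standalone 4-digit number, or null if none.
    static String firstFourDigitNumber(String input) {
        Matcher m = Pattern.compile("\\b\\d{4}\\b").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String mydata = "get the 0025 data from string";
        System.out.println(anchoredFind(mydata));
        System.out.println(firstFourDigitNumber(mydata));
    }
}
```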
Scanner sc=new Scanner(System.in);
HashMap<String,String> a=new HashMap<>();
ArrayList<String> b=new ArrayList<>();
String s=sc.nextLine();
Pattern p=Pattern.compile("\\d{4}");
Matcher m=p.matcher(s);
while(m.find())
{
String x="";
x=x+(m.group(0));
a.put(x,"0");
b.add(x);
}
System.out.println(a.size());
System.out.println(b);
This finds all matched 4-digit patterns (the list b) and the unique ones (the keys of the map a; for the unique set use Set<String> k = a.keySet();).
If you want to match any number of digits then use pattern like the following:
^\D*(\d+)\D*$
And for exactly 4 digits go for
^\D*(\d{4})\D*$
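Because these patterns describe the whole string, they fit matches() rather than find(). A small sketch, with invented names, assuming the input is expected to contain exactly one run of digits; any other digit elsewhere in the string makes the match fail.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeStringDigits {
    // Non-digits, exactly four digits, non-digits: nothing else may appear in the input.
    private static final Pattern ONLY_FOUR = Pattern.compile("^\\D*(\\d{4})\\D*$");

    // Returns the 4-digit run, or null if the input has other digits or no such run.
    static String onlyFourDigitRun(String input) {
        Matcher m = ONLY_FOUR.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(onlyFourDigitRun("get the 0025 data from string"));
        System.out.println(onlyFourDigitRun("room 12 code 0025"));
    }
}
```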

First and second token regex

How could I get the first and the second text in "" from the string?
I could do it with indexOf, but this is really tedious.
For example, I have a String to parse like: "aaa":"bbbbb"perhapsSomeOtherText
I'd like to get aaa and bbbbb with the help of a regex pattern. This would let me use them in a switch statement and would greatly simplify my app.
If all you have is a colon-delimited string, just split it:
String str = ...; // colon delimited
String[] parts = str.split(":");
Note that split() takes a regex and, in general, compiles it on every call. To improve the performance of your code, you can use a precompiled Pattern as follows:
private static Pattern pColonSplitter = Pattern.compile(":");
// now somewhere in your code:
String[] parts = pColonSplitter.split(str);
If, however, you want to use a pattern for matching and extracting string fragments in more complicated cases, do it as follows:
Pattern p = Pattern.compile("(\\w+):(\\w+):");
Matcher m = p.matcher(str);
if (m.find()) {
String a = m.group(1);
String b = m.group(2);
}
Note the parentheses, which define the capture groups.
Something like this?
Pattern pattern = Pattern.compile("\"([^\"]*)\"");
Matcher matcher = pattern.matcher("\"aaa\":\"bbbbb\"perhapsSomeOtherText");
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
aaa
bbbbb
String str = "\"aaa\":\"bbbbb\"perhapsSomeOtherText";
Pattern p = Pattern.compile("\"\\w+\""); // word between ""
Matcher m = p.matcher(str);
while(m.find()){
System.out.println(m.group().replace("\"", ""));
}
output:
aaa
bbbbb
There are several ways to do this.
Use StringTokenizer, or Scanner with its useDelimiter method.
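As a sketch of the StringTokenizer route (the class and method names are made up), both the quote and the colon can be passed as delimiter characters. This assumes the quoted values never contain a colon or a quote themselves, since those would split a value in two.

```java
import java.util.StringTokenizer;

class QuotedTokens {
    // Treats '"' and ':' as delimiters and returns the first two tokens.
    // nextToken() throws NoSuchElementException if fewer than two tokens exist.
    static String[] firstTwo(String input) {
        StringTokenizer st = new StringTokenizer(input, "\":");
        String first = st.nextToken();
        String second = st.nextToken();
        return new String[] { first, second };
    }

    public static void main(String[] args) {
        String[] parts = firstTwo("\"aaa\":\"bbbbb\"perhapsSomeOtherText");
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```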

Substring to remove everything before first period and after second

So I have a filename that looks like this:
myFile.12345.txt
If I wanted to end up with just the "12345", how would I go about extracting that from the filename if the 12345 could be anywhere between 1 and 5 digits in length?
If you are sure that there will always be exactly 2 periods:
String fileName = string.split("\\.")[1];
You can use this:
String s="ghgj.7657676.jklj";
String p = s.substring(s.indexOf(".")+1,s.lastIndexOf("."));
Assuming you want to extract all the digits, you could use a simple regex to remove all the non-digit characters:
String s = "myFile.12345.txt";
String numbers = s.replaceAll("[^\\d]","");
System.out.println(numbers); //12345
Note: This would not work with file12.12345.txt, for example, because the digits in the base name would be kept as well (giving 1212345).
static final Pattern P = Pattern.compile("^(.*?)\\.(.*?)\\.(.*?)$");
...
...
...
Matcher m = P.matcher(input);
if (m.matches()) {
//String first = m.group(1);
String middle = m.group(2);
//String last = m.group(3);
...
}
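If the requirement of 1 to 5 digits should be enforced by the pattern itself, one possible variation (not taken from the answers above, with hypothetical names) is to match the digits only when they sit directly between two periods:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiddleNumber {
    // A period, one to five digits captured in group 1, then another period.
    private static final Pattern BETWEEN_DOTS = Pattern.compile("\\.(\\d{1,5})\\.");

    // Returns the digits between two periods, or null if there is no such segment.
    static String middleDigits(String fileName) {
        Matcher m = BETWEEN_DOTS.matcher(fileName);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(middleDigits("myFile.12345.txt"));
        System.out.println(middleDigits("file12.12345.txt"));
    }
}
```

Unlike stripping all non-digits, this leaves digits in the base name alone, and it returns null for a segment longer than five digits.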
