java regex separate numbers from strings - java

I have got strings like:
BLAH00001
DIK-11
DIK-2
MAN5
So every string has the form (sequence of any characters) + (sequence of digits),
and I want something like this:
1
11
2
5
To get those integer values, I wanted to separate the character sequence from the number sequence and then do something like Integer.parseInt(number_sequence).
Is there something that does this job?
Greetings

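Try this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String source = "BLAH00001\n" +
                "\n" +
                "DIK-11\n" +
                "\n" +
                "DIK-2\n" +
                "\n" +
                "MAN5";
        // Find every run of digits in the input
        Matcher m = Pattern.compile("\\d+").matcher(source);
        while (m.find()) {
            // parseInt drops leading zeros: "00001" becomes 1
            int i = Integer.parseInt(m.group());
            System.out.println(i);
        }
    }
}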
which produces:
1
11
2
5

String[] a = {"BLAH00001", "DIK-11", "DIK-2", "MAN5"};
for (String g : a)
    System.out.println(Integer.valueOf(g.split("^[A-Z]+\\-?")[1]));
/*******************************
Regex explanation:
^      --> anchors the match at the start of the string
[A-Z]+ --> 1 or more uppercase letters
\\-?   --> 0 or 1 hyphen
The split yields an empty first element and the digits as element [1].
*********************************/

Pattern p = Pattern.compile("^[^0-9]*([0-9]+)$");
Matcher m = p.matcher("ASDFSA123");
if (m.matches()) {
    // group(1) is the trailing run of digits; resultInt is assumed to be declared elsewhere
    resultInt = Integer.parseInt(m.group(1));
}
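The capture-group approach can be wrapped in a small helper so that inputs without a trailing number are handled explicitly. This is only a sketch: the class name TrailingNumberDemo, the method trailingNumber and the extra test string "NODIGITS" are made up for illustration, and it assumes the digit run fits in an int (otherwise Integer.parseInt throws NumberFormatException).

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingNumberDemo {
    // Non-digits (possibly none), then a run of digits that ends the string.
    private static final Pattern TRAILING_DIGITS = Pattern.compile("^[^0-9]*([0-9]+)$");

    // Hypothetical helper: the trailing number, or empty if the input has another shape.
    static OptionalInt trailingNumber(String s) {
        Matcher m = TRAILING_DIGITS.matcher(s);
        return m.matches()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String[] inputs = {"BLAH00001", "DIK-11", "DIK-2", "MAN5", "NODIGITS"};
        for (String s : inputs) {
            System.out.println(s + " -> " + trailingNumber(s));
        }
    }
}
```

Compiling the pattern once in a static field avoids recompiling it for every string.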

Maybe you want to have a look at Java's Pattern and Matcher classes:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html
http://leepoint.net/notes-java/data/strings/40regular_expressions/26pattern-matcher.html
http://answers.yahoo.com/question/index?qid=20071106173149AA4TUON

Related

Regular expression in java always return true

Can you please help me with a regex for the following string on Android:
1.0.2 Build S6B5
The format should be:
{number}.{number}.{number}{space}Build{space}{S or D or T}{anything, 3 to 4 chars}
With the help of some kind people here, I've tried the following code:
if (name.matches("\\d+\\.\\d+\\.\\d+\\s+Build\\s[SDT].{3,4}"));
but it always returns true, even for:
1.0.1 4C0
1.0.1 B 4BD
1.0.4.52A
etc.
Try the following code:
public static void main(String[] args) {
    String name1 = "1.0.1 4C0";
    String name2 = "1.0.1 B 4BD";
    String name3 = "1.0.4.52A";
    String name4 = "1.0.2 Build S6B5";
    check(name1);
    check(name2);
    check(name3);
    check(name4);
}

private static void check(String name) {
    Pattern p = Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)\\s+Build\\s+([SDT]\\w{3,4})");
    Matcher m = p.matcher(name);
    if (m.find()) {
        System.out.println("num1: " + m.group(1));
        System.out.println("num2: " + m.group(2));
        System.out.println("num3: " + m.group(3));
        System.out.println("build: " + m.group(4));
    } else {
        System.out.println("not found");
    }
}
Use the Matcher.find() method to match part of the test string, then the Matcher.group() method to access the parts captured by the parentheses.
Resulting output:
not found
not found
not found
num1: 1
num2: 0
num3: 2
build: S6B5
Try this one instead (double each backslash inside a Java string literal):
(\d)\.(\d)\.(\d)\s(Build)\s([SDT])(\w{3,4})
or, if each number can have more than one digit:
(\d+)\.(\d+)\.(\d+)\s(Build)\s([SDT])(\w{3,4})
In your regex, the problem is the ending, ".{3,4}": it accepts ANY character 3 to 4 times.
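One thing worth checking separately: the if line in the question ends with a semicolon right after the condition. In Java that semicolon is an empty statement, so whatever block follows the if runs regardless of what matches() returns. Whether that is what happened here is an assumption, since the rest of the asker's code is not shown. The sketch below (class name BuildNameDemo and method isBuildName are made up) runs the question's own pattern through matches() so the actual results can be inspected.

```java
import java.util.regex.Pattern;

class BuildNameDemo {
    // Pattern copied from the question.
    private static final Pattern BUILD =
            Pattern.compile("\\d+\\.\\d+\\.\\d+\\s+Build\\s[SDT].{3,4}");

    // matches() succeeds only if the whole string fits the pattern.
    static boolean isBuildName(String name) {
        return BUILD.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {"1.0.1 4C0", "1.0.1 B 4BD", "1.0.4.52A", "1.0.2 Build S6B5"};
        for (String name : names) {
            // The block follows the condition directly: no ';' after the closing parenthesis.
            if (isBuildName(name)) {
                System.out.println(name + " matches");
            } else {
                System.out.println(name + " does not match");
            }
        }
    }
}
```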

Why is Java placing the string before the word and not after?

From the String value, I want to get the words before and after each <in>:
String ref = "application<in>rid and test<in>efd";
int result = ref.indexOf("<in>");
int result1 = ref.lastIndexOf("<in>");
String firstWord = ref.substring(0, result);
String[] wor = ref.split("<in>");
for (int i = 0; i < wor.length; i++) {
    System.out.println(wor[i]);
}
My expected output:
String[] output = {application, rid, test, efd}
I tried two options. The first uses indexOf, but if the string has more than two <in> I don't get my expected output.
The second uses split, which also doesn't give my expected output.
Please suggest the best way to get the words before and after <in>.
You could use an expression like so: \b([^ ]+?)<in>([^ ]+?)\b. This should match the strings before and after the <in> tag and place them in two groups.
Thus, given this:
String ref = "application<in>rid and test<in>efd";
Pattern p = Pattern.compile("\\b([^ ]+?)<in>([^ ]+?)\\b");
Matcher m = p.matcher(ref);
while (m.find())
    System.out.println("Prior: " + m.group(1) + " After: " + m.group(2));
Yields:
Prior: application After: rid
Prior: test After: efd
Alternatively using split:
String[] phrases = ref.split("\\s+");
for (String s : phrases)
    if (s.contains("<in>"))
    {
        String[] split = s.split("<in>");
        for (String t : split)
            System.out.println(t);
    }
Yields:
application
rid
test
efd
Regex is your friend :)
public static void main(String args[]) throws Exception {
    String ref = "application<in>rid and test<in>efd";
    Pattern p = Pattern.compile("\\w+(?=<in>)|(?<=<in>)\\w+");
    Matcher m = p.matcher(ref);
    while (m.find()) {
        System.out.println(m.group());
    }
}
Output:
application
rid
test
efd
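Since the question asks for a String[] rather than printed lines, the lookaround pattern can also collect its matches into a list first. A minimal sketch; the class name InTagWordsDemo and the method wordsAroundIn are hypothetical, and it assumes the words next to <in> consist only of \w characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InTagWordsDemo {
    // A word directly followed by <in>, or a word directly preceded by <in>.
    private static final Pattern WORD_NEXT_TO_IN =
            Pattern.compile("\\w+(?=<in>)|(?<=<in>)\\w+");

    static List<String> wordsAroundIn(String ref) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD_NEXT_TO_IN.matcher(ref);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String ref = "application<in>rid and test<in>efd";
        String[] output = wordsAroundIn(ref).toArray(new String[0]);
        System.out.println(String.join(",", output)); // prints application,rid,test,efd
    }
}
```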
No doubt matching what you need using the Pattern/Matcher API is simpler for this problem.
However, if you're looking for a short and quick String#split solution, you can consider:
String ref = "application<in>rid and test<in>efd";
String[] toks = ref.split("<in>|\\s+.*?(?=\\b\\w+<in>)");
Output:
application
rid
test
efd
This regex splits on <in>, or on whitespace followed by zero or more characters (matched lazily) up to a word that is immediately followed by <in>.
You can also try the code below; it is quite simple:
class StringReplace1
{
    public static void main(String args[])
    {
        String ref = "application<in>rid and test<in>efd";
        // Replaces each <in> and the literal " and " with a space, printing the words on one line
        System.out.println((ref.replaceAll("<in>", " ")).replaceAll(" and ", " "));
    }
}

Use RegEx to extract number from coordinates

I am a beginner in Java.
When I input (1,2) into the console (brackets included), how can I write code to extract the first and the second number using a regex?
If there is no expression that extracts the first/second number within the brackets, I will have to change the input format to x,y without the brackets, which should make the numbers a lot easier to extract.
Try this code:
public static void main(String[] args) {
    String searchString = "(7,32)";
    Pattern compile1 = Pattern.compile("\\(\\d+,");
    Pattern compile2 = Pattern.compile(",\\d+\\)");
    Matcher matcher1 = compile1.matcher(searchString);
    Matcher matcher2 = compile2.matcher(searchString);
    while (matcher1.find() && matcher2.find()) {
        String group1 = matcher1.group();
        String group2 = matcher2.group();
        System.out.println("value 1: " + group1.substring(1, group1.length() - 1) + " value 2: " + group2.substring(1, group2.length() - 1));
    }
}
Not that I think regex is the best tool to use here. If you know the input will be in the form (number, number), I would first get rid of the brackets:
String stringWithoutBrackets = searchString.substring(1, searchString.length() - 1);
and then tokenize it with split:
String[] coordinates = stringWithoutBrackets.split(",");
Looking through the regex API, you can also do something like this:
public static void main(String[] args) {
    String searchString = "(7,32)";
    Pattern compile1 = Pattern.compile("(?<=\\()\\d+(?=,)");
    Pattern compile2 = Pattern.compile("(?<=,)\\d+(?=\\))");
    Matcher matcher1 = compile1.matcher(searchString);
    Matcher matcher2 = compile2.matcher(searchString);
    while (matcher1.find() && matcher2.find()) {
        String group1 = matcher1.group();
        String group2 = matcher2.group();
        System.out.println("value 1: " + group1 + " value 2: " + group2);
    }
}
The main change is that I used (?<=\(), (?=,), (?<=,), and (?=\)) to look for the brackets and commas without capturing them. But I really think it's overkill for this task.
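Another option, not taken from the answers above, is a single pattern with two capture groups, which avoids keeping two matchers in step. This is a sketch under a few assumptions: coordinates are integers, a leading minus sign and spaces around the numbers are tolerated, and malformed input should raise an exception. The class name CoordinateDemo and the method parse are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CoordinateDemo {
    // "(" x "," y ")" with optional whitespace; group 1 is x, group 2 is y.
    private static final Pattern COORD =
            Pattern.compile("\\(\\s*(-?\\d+)\\s*,\\s*(-?\\d+)\\s*\\)");

    static int[] parse(String input) {
        Matcher m = COORD.matcher(input.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected (x,y) but got: " + input);
        }
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
    }

    public static void main(String[] args) {
        int[] xy = parse("(7,32)");
        System.out.println("value 1: " + xy[0] + " value 2: " + xy[1]); // value 1: 7 value 2: 32
    }
}
```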

REGEX : How to escape []?

I'm working on strings like "[ro.multiboot]: [1]". How do I select just the 1 (it can also be 0) out of this string?
I am looking for a regex in Java.
Usually, you would do something like (assuming 0 and 1 were the only options):
^.*\[([01])\].*$
If you only wanted the value for ro.multiboot, you could change it to something like:
^.*\[ro.multiboot\].*\[([01])\].*$
(depending on how complex any of the non-bracketed stuff is allowed to be).
These would both basically only extract the value between square brackets if it were zero or one, and capture it into a capture variable so you could use it.
Of course, regex is not a world-wide standard, nor are the environments in which you use it. That means it depends a lot on your actual environment how you will actually code this up.
For Java, the following sample program may help:
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        Pattern p = Pattern.compile("^.*\\[ro.multiboot\\].*\\[([01])\\].*$");
        String str;
        Matcher m;
        str = "[ro.multiboot]: [0]";
        m = p.matcher(str);
        if (m.find()) {
            System.out.println("str0 has " + m.group(1));
        }
        str = "[ro.multiboot]: [1]";
        m = p.matcher(str);
        if (m.find()) {
            System.out.println("str1 has " + m.group(1));
        }
        str = "[ro.multiboot]: [2]";
        m = p.matcher(str);
        if (m.find()) {
            System.out.println("str2 has " + m.group(1));
        }
    }
}
This results in (as expected):
str0 has 0
str1 has 1
@paxdiablo's regexes are correct, but the complete answer to "How do I just select 1 (it can also be 0) out of this string?" is:
1. A very simple solution:
String input = "[ro.multiboot]: [1]";
String matched = input.replaceFirst("^.*\\[ro.multiboot\\].*\\[([01])\\].*$", "$1");
2. The same functionality, more complicated but with better performance:
String input = "[ro.multiboot]: [1]";
Pattern p = Pattern.compile("^.*\\[ro.multiboot\\].*\\[([01])\\].*$");
Matcher m = p.matcher(input);
String matched = null;
if (m.matches()) matched = m.group(1);
Performance is better because the pattern can be compiled just once and reused (for example, when you are matching an array of such Strings).
Notes:
In both examples, the group is the part of the regex between ( and ) (when the parentheses are not escaped).
In Java source you have to write \\[, because \[ is a compile error: it is not a valid escape sequence in a string literal.
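As an alternative to escaping each bracket by hand, Pattern.quote can turn the whole key into a literal, which also stops the dot in ro.multiboot from matching any character. The sketch below is not from the answers above; the class name PropertyFlagDemo and the method flagFor are hypothetical, and it assumes the line has exactly the form "[key]: [0]" or "[key]: [1]".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PropertyFlagDemo {
    // Returns "0" or "1" for a line of the form "[key]: [0|1]", otherwise null.
    static String flagFor(String key, String line) {
        // Pattern.quote makes '[', ']' and '.' in the key match literally.
        Pattern p = Pattern.compile(
                "^" + Pattern.quote("[" + key + "]") + ":\\s*\\[([01])\\]$");
        Matcher m = p.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(flagFor("ro.multiboot", "[ro.multiboot]: [1]")); // 1
        System.out.println(flagFor("ro.multiboot", "[ro.multiboot]: [2]")); // null
    }
}
```

If the same key is looked up many times, the compiled Pattern could be cached instead of rebuilt on every call.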

Regular expression string search in Java

I know this can be done in many ways, but I'm curious what the regex would be to pick out all strings not containing a particular substring, say GDA, from
strings like GADSA, GDSARTCC, , THGDAERY.
You can use a negative lookahead:
"^((?!GDA).)*$"
You don't need a regex. Just use string.contains("GDA") to see if a string contains a particular substring. It will return false if it doesn't.
If your input is one long string then you have to decide how you define a substring. If it's separated by spaces then:
String[] split = mylongstr.split(" ");
for (String s : split) {
    if (!s.contains("GDA")) {
        // do whatever
    }
}
String regex = ".*GDA.*";
// populateStrings() stands in for however the test strings are obtained
List<String> testStrings = populateStrings();
for (String s : testStrings)
{
    if (!s.matches(regex))
        System.out.println("String " + s + " does not match " + regex);
}
Give this a shot:
java.util.regex.Pattern p = java.util.regex.Pattern.compile("(?!\\w*GDA\\w*)\\b\\w+\\b");
java.util.regex.Matcher m = p.matcher("GADSA, GDSARTCC, , THGDAERY");
while (m.find()) {
    System.out.println("Found: " + m.group());
}
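To see how the negative-lookahead pattern behaves on whole strings, here is a small sketch that filters a list with it, written for the substring GDA from the question. The class name WithoutSubstringDemo and the method withoutGda are made up for illustration, and the sample list treats the empty entry in the question's example as an empty string, which is an assumption. For single-line strings the result should agree with !s.contains("GDA").

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class WithoutSubstringDemo {
    // At each position, assert that "GDA" does not start here, then consume one character.
    private static final Pattern NO_GDA = Pattern.compile("^((?!GDA).)*$");

    static List<String> withoutGda(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String s : candidates) {
            if (NO_GDA.matcher(s).matches()) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("GADSA", "GDSARTCC", "", "THGDAERY");
        for (String s : withoutGda(input)) {
            System.out.println("[" + s + "] does not contain GDA");
        }
    }
}
```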
