Creating regex to extract 4 digit number from string using java - java

Hi, I am trying to build a regex to extract a 4-digit number from a given string using Java. I tried it in the following ways:
String mydata = "get the 0025 data from string";
Pattern pattern = Pattern.compile("^[0-9]+$");
//Pattern pattern = Pattern.compile("^[0-90-90-90-9]+$");
//Pattern pattern = Pattern.compile("^[\\d]+$");
//Pattern pattern = Pattern.compile("^[\\d\\d\\d\\d]+$");
Matcher matcher = pattern.matcher(mydata);
String val = "";
if (matcher.find()) {
System.out.println(matcher.group(1));
val = matcher.group(1);
}
But it's not working properly. How can I do this? I need some help. Thank you.

Change your pattern to:
Pattern pattern = Pattern.compile("(\\d{4})");
\d is for a digit and the number in {} is the number of digits you want to have.
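As a minimal sketch of this answer (the class and method names are made up for illustration), the pattern can be wrapped in a small helper. The lookarounds are an extra assumption, not part of the answer: they stop the pattern from matching the first four digits of a longer run such as 123456.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FourDigitFinder {
    // Exactly four digits, not preceded or followed by another digit (assumed requirement).
    private static final Pattern FOUR_DIGITS = Pattern.compile("(?<!\\d)(\\d{4})(?!\\d)");

    // Returns the first standalone 4-digit number, or null if there is none.
    static String firstFourDigits(String input) {
        Matcher matcher = FOUR_DIGITS.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstFourDigits("get the 0025 data from string")); // 0025
        System.out.println(firstFourDigits("order 123456 shipped"));          // null
    }
}
```

If a match inside a longer digit run is acceptable, the plain (\\d{4}) from the answer is enough.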

If you want to end up with 0025,
String mydata = "get the 0025 data from string";
mydata = mydata.replaceAll("\\D", ""); // Replace all non-digits

Pattern pattern = Pattern.compile("\\b[0-9]+\\b");
This should do it for you. ^ and $ anchor the pattern to the whole string, so your original pattern only matches a string made up entirely of digits.

Remove the anchors, and add parentheses if you want the digits in group 1:
Pattern pattern = Pattern.compile("([0-9]+)"); //"[0-9]{4}" for 4 digit number
And extract out matcher.group(1)

There are many better answers, but if you still have to do it the same way:
String mydata = "get the 0025 data from string";
Pattern pattern = Pattern.compile("(?<![-.])\\b[0-9]+\\b(?!\\.[0-9])");
Matcher matcher = pattern.matcher(mydata);
String val = "";
if (matcher.find()) {
System.out.println(matcher.group(0));
val = matcher.group(0);
}
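I changed matcher.group(1) to matcher.group(0), because this pattern has no capturing group, so group(1) would throw an IndexOutOfBoundsException.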

You can go with \d{4} or [0-9]{4}, but note that by putting ^ at the beginning of the regex and $ at the end you limit yourself to strings that consist of nothing but those 4 digits.
My recommendation: learn some regex basics.

Scanner sc=new Scanner(System.in);
HashMap<String,String> a=new HashMap<>();
ArrayList<String> b=new ArrayList<>();
String s=sc.nextLine();
Pattern p=Pattern.compile("\\d{4}");
Matcher m=p.matcher(s);
while(m.find())
{
String x="";
x=x+(m.group(0));
a.put(x,"0");
b.add(x);
}
System.out.println(a.size());
System.out.println(b);
This finds all matched digit patterns (the list b) and the unique ones (the keys of the map a; use Set<String> k=a.keySet();).
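A more compact sketch of the same idea, assuming you only need the matches and not the placeholder map values. The class and method names are hypothetical, and a LinkedHashSet is used here (my choice, not the answer's) so the unique matches keep their order of first appearance.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllFourDigitMatches {
    private static final Pattern FOUR_DIGITS = Pattern.compile("\\d{4}");

    // Collects every 4-digit match, duplicates included, in order of appearance.
    static List<String> findAll(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = FOUR_DIGITS.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    // Same matches with duplicates removed, keeping first-seen order.
    static Set<String> findUnique(String input) {
        return new LinkedHashSet<>(findAll(input));
    }

    public static void main(String[] args) {
        String input = "1234 and 5678, then 1234 again";
        System.out.println(findAll(input));    // [1234, 5678, 1234]
        System.out.println(findUnique(input)); // [1234, 5678]
    }
}
```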

If you want to match any number of digits then use pattern like the following:
^\D*(\d+)\D*$
And for exactly 4 digits go for
^\D*(\d{4})\D*$
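Here is a small sketch of how the second pattern behaves in Java (class and method names are hypothetical). Because \D cannot skip over digits, the pattern only matches when the 4-digit number is the only run of digits in the whole string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeStringFourDigits {
    // Non-digits, one run of exactly four digits, then non-digits to the end.
    private static final Pattern ONLY_ONE_NUMBER = Pattern.compile("^\\D*(\\d{4})\\D*$");

    // Returns the 4-digit number if it is the only digit run in the input, else null.
    static String extract(String input) {
        Matcher m = ONLY_ONE_NUMBER.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("get the 0025 data from string")); // 0025
        System.out.println(extract("get 0025 and 0026"));             // null
        System.out.println(extract("get the 00255 data"));            // null
    }
}
```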

Related

How would I search a certain word after a character in java?

So I am doing some coursework, and I want to search a string for the words that follow a hashtag, "#".
How would I go about this?
Say, for example, the string was 'Hello World #me'. How would I return the word "me"?
Kind regards
Use a regex and a Matcher to find the hashtags iteratively:
String input = "Hello #World! #Me";
Pattern pattern = Pattern.compile("#(\\S+)");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output :
World!
Me
Split the string on that character, for example:
String []splittedString=inputString.split("#");
System.out.println(splittedString[1]);
So for the input string
Hello World #me
the output is
me
Use this
example.substring(example.indexOf("#") + 1);
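One thing to watch with this one-liner: when there is no "#", indexOf returns -1, so the expression quietly returns the whole string. It also returns everything after the "#", not just the next word. A minimal guarded sketch (the class and method names are made up):

```java
class HashtagSubstring {
    // Returns the text after the first '#', or null when the input has no '#'.
    static String afterHash(String example) {
        int hash = example.indexOf('#');
        return hash == -1 ? null : example.substring(hash + 1);
    }

    public static void main(String[] args) {
        System.out.println(afterHash("Hello World #me")); // me
        System.out.println(afterHash("Hello #me again")); // me again
        System.out.println(afterHash("no tag here"));     // null
    }
}
```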
Using regex:
// Matches a string of word characters preceded by a '#'
Pattern p = Pattern.compile("(?<=#)\\w*");
Matcher m = p.matcher("Hello World #me");
String hashtag = "";
if(m.find())
{
hashtag = m.group(); //me
}
So then John, let me guess: you're a Computer Science student at the University of Warwick. Here you go:
String s = "hello #yolo blaaa";
if (s.contains("#")) {
    int hash = s.indexOf('#') + 1;    // index just after the '#'
    int space = s.indexOf(' ', hash); // first space after the hashtag, or -1
    s = (space == -1) ? s.substring(hash) : s.substring(hash, space);
}
Drop the + 1 if you want to keep the # in the result.
A simple way of doing it is to use indexOf to find the "#", then the overloaded indexOf(String, int) to find the first space after it, and pass both positions to substring.
E.g.:
String myString = originalString.substring(originalString.indexOf("#"), originalString.indexOf(" ", originalString.indexOf("#")));
Please note that this can throw an out-of-bounds exception if the characters are not found. Read the Javadoc for indexOf and substring to understand in detail what this is doing.

First and second token regex

How could I get the first and the second quoted text from the string?
I could do it with indexOf, but that is really tedious.
For example, I have a string to parse like: "aaa":"bbbbb"perhapsSomeOtherText
I'd like to get aaa and bbbbb with the help of a regex pattern. This would let me use them in a switch statement and would greatly simplify my app.
If all you have is a colon-delimited string, just split it:
String str = ...; // colon delimited
String[] parts = str.split(":");
Note that split() takes a regex and in general compiles it on every call (modern JDKs have a fast path for simple single-character delimiters such as ":"). To avoid the repeated compilation, you can precompile a Pattern as follows:
private static Pattern pColonSplitter = Pattern.compile(":");
// now somewhere in your code:
String[] parts = pColonSplitter.split(str);
If, however, you want to use a pattern for matching and extracting string fragments in more complicated cases, do it like this:
Pattern p = Pattern.compile("(\\w+):(\\w+):");
Matcher m = p.matcher(str);
if (m.find()) {
String a = m.group(1);
String b = m.group(2);
}
Pay attention to the parentheses, which define the capturing groups.
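Note that the pattern above expects unquoted words, each followed by a colon. For the quoted input from the question, a variant along the same lines might look like the sketch below. The pattern and the names are my assumptions, not the answerer's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedPair {
    // Two quoted words separated by a colon; each word is its own capturing group.
    private static final Pattern PAIR = Pattern.compile("\"(\\w+)\":\"(\\w+)\"");

    // Returns {first, second}, or null when the input has no such pair.
    static String[] firstPair(String str) {
        Matcher m = PAIR.matcher(str);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] pair = firstPair("\"aaa\":\"bbbbb\"perhapsSomeOtherText");
        System.out.println(pair[0]); // aaa
        System.out.println(pair[1]); // bbbbb
    }
}
```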
Something like this?
Pattern pattern = Pattern.compile("\"([^\"]*)\"");
Matcher matcher = pattern.matcher("\"aaa\":\"bbbbb\"perhapsSomeOtherText");
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
aaa
bbbbb
String str = "\"aaa\":\"bbbbb\"perhapsSomeOtherText";
Pattern p = Pattern.compile("\"\\w+\""); // word between ""
Matcher m = p.matcher(str);
while(m.find()){
System.out.println(m.group().replace("\"", ""));
}
output:
aaa
bbbbb
There are several ways to do this.
Use StringTokenizer, or Scanner with its useDelimiter method.
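A rough sketch of the Scanner variant, assuming that runs of double quotes and colons can be treated as the delimiter (that choice and the names below are mine). It only takes the first two tokens, so whatever follows the second quoted value is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterTokens {
    // Splits on runs of double quotes and colons and returns at most the first two tokens.
    static List<String> firstTwoTokens(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter("[\":]+");
            while (tokens.size() < 2 && scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(firstTwoTokens("\"aaa\":\"bbbbb\"perhapsSomeOtherText")); // [aaa, bbbbb]
    }
}
```

This is less strict than the regex answers: it does not check that the two values were actually quoted.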

Extracting Number from URL in Java via Regex

Take the URL http://www.abc.com/alpha/beta/33445566778899/gamma/delta
I need to return the number 33445566778899 (with the forward slashes removed; the number is of variable length, between 10 and 20 digits).
Simple enough (or so I thought), except nothing I've tried seems to work. Why?
Pattern pattern = Pattern.compile("\\/([0-9])\\d{10,20}\\/");
Matcher matcher = pattern.matcher(fullUrl);
if (matcher.find()) {
return matcher.group(1);
}
Try this one-liner:
String number = url.replaceAll(".*/(\\d{10,20})/.*", "$1");
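A caveat with the one-liner: replaceAll returns the input unchanged when the pattern does not match, so a URL without such a number comes back as-is. A hedged sketch that turns that case into null (the class and method names are hypothetical):

```java
class UrlNumber {
    // Returns the 10 to 20 digit path segment, or null when the URL has none.
    static String extract(String url) {
        String number = url.replaceAll(".*/(\\d{10,20})/.*", "$1");
        // No match means replaceAll handed back the original string.
        return number.equals(url) ? null : number;
    }

    public static void main(String[] args) {
        System.out.println(extract("http://www.abc.com/alpha/beta/33445566778899/gamma/delta")); // 33445566778899
        System.out.println(extract("http://www.abc.com/alpha/beta/gamma"));                      // null
    }
}
```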
This regex works -
"\\/(\\d{10,20})\\/"
Testing it-
String fullUrl = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
Pattern pattern = Pattern.compile("\\/(\\d{10,20})\\/");
Matcher matcher = pattern.matcher(fullUrl);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
OUTPUT - 33445566778899
Try,
String url = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
String digitStr = null;
for(String str : url.split("/")){
System.out.println(str);
if(str.matches("[0-9]{10,20}")){
digitStr = str;
break;
}
}
System.out.println(digitStr);
Output:
33445566778899
Instead of saying it "doesn't seem to work", you should have told us what it was returning. Testing it confirmed what I thought: your code would return 3 for this input.
This is simply because your regexp, as written, captures a single digit that follows a / and is itself followed by 10 to 20 more digits and then a /.
The regex you want is "/(\\d{10,20})/" (you don't need to escape the /). Below you'll find the code I tested this with.
public static void main(String[] args) {
String src = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
Pattern pattern = Pattern.compile("/(\\d{10,20})/");
Matcher matcher = pattern.matcher(src);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
}
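To see the two capture groups side by side, here is a small comparison sketch (the helper and class names are made up). The only difference is where the parentheses sit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {
    // Returns group 1 of the first match of regex in input, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
        // Group wraps only the first digit; the remaining digits are matched but not captured.
        System.out.println(firstGroup("\\/([0-9])\\d{10,20}\\/", url)); // 3
        // Group wraps the whole run of digits.
        System.out.println(firstGroup("/(\\d{10,20})/", url));          // 33445566778899
    }
}
```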

Splitting a string composed of numbers and alphabets

I want to break a string like :
String s = "xyz213123kop234430099kpf4532";
into tokens where each token starts with letters and ends with a number. So the above string can be broken down into 3 tokens:
xyz213123
kop234430099
kpf4532
The string s could be very long, but the pattern will remain the same, i.e. each token will start with 3 letters and end with a number.
How do I split them?
Try this:
\w+?\d+
Java Matcher:
Pattern pattern = Pattern.compile("\\w+?\\d+"); //compiles the pattern we want to use
Matcher matcher = pattern.matcher("xyz213123kop234430099kpf4532"); //we create the matcher on certain string using our pattern
while(matcher.find()) //while the matcher can find the next match
{
System.out.println(matcher.group()); //print it
}
In C#, you could use Regex.Matches:
foreach(Match m in Regex.Matches("xyz213123kop234430099kpf4532", @"\w+?\d+"))
{
Console.WriteLine(m.Value);
}
And for the future, try RegExr.
Do it like this,
String s = "xyz213123kop234430099kpf4532";
Pattern p = Pattern.compile("\\w+?\\d+");
Matcher match = p.matcher(s);
while(match.find()){
System.out.println(match.group());
}
OUTPUT
xyz213123
kop234430099
kpf4532
You can start from such regexp: (\w+?\d+)
http://regexr.com?36utt
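One caveat: \w also matches digits and underscores, so \w+?\d+ will happily match "12" in an input like "12ab34". If each token really is 3 letters followed by digits, as the question states, a stricter pattern may be safer. The sketch below is an assumption-based variant, not the pattern from the answers, and the names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LetterDigitTokens {
    // Three ASCII letters followed by one or more digits (assumed token shape).
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z]{3}\\d+");

    // Returns every token in order of appearance.
    static List<String> tokens(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("xyz213123kop234430099kpf4532")); // [xyz213123, kop234430099, kpf4532]
    }
}
```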

How split a string using regex pattern

How can I split a string on tokens like [0] using a regex pattern? The 0 can be any integer.
I used the regex pattern
private static final String REGEX = "[\\d]";
But it returns strings that still contain [.
Splitting code:
Pattern p=Pattern.compile(REGEX);
String items[] = p.split(lure_value_save[0]);
You have to escape the brackets:
String REGEX = "\\[\\d+\\]";
Java doesn't offer an elegant solution to extract the numbers. This is the way to go:
Pattern p = Pattern.compile(REGEX);
String test = "[0],[1],[2]";
Matcher m = p.matcher(test);
List<String> matches = new ArrayList<String>();
while (m.find()) {
matches.add(m.group());
}
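If you want just the numbers without the brackets, or the text between the bracketed indexes, a sketch along these lines could work. The capturing group and the names are my additions, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketIndexes {
    // Escaped brackets around one or more digits; group 1 is the number itself.
    private static final Pattern INDEX = Pattern.compile("\\[(\\d+)\\]");

    // Returns the numbers found inside [..], without the brackets.
    static List<String> indexes(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = INDEX.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    // Splits the input on every [number] token.
    static String[] splitOnIndexes(String input) {
        return INDEX.split(input);
    }

    public static void main(String[] args) {
        System.out.println(indexes("[0],[1],[2]"));                      // [0, 1, 2]
        System.out.println(String.join("|", splitOnIndexes("a[0]b[1]c"))); // a|b|c
    }
}
```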
