Pattern pattern = Pattern.compile("([^\\d.]|[\\d.]++)");
String[] equation = pattern.split("5+3--323");
System.out.println(equation.length);
I'm trying to split a string into numbers (which may be multi-digit groups) and non-numbers. For this example I was hoping for an array of size 6:
5, +, 3, -, -, 323
How can I do this?
Try using a Matcher, as in the example below. It returns exactly the tokens you are after.
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class MathSplitTest
{
public static void main(String[] args)
{
Pattern pattern = Pattern.compile("[0-9]+|[-+]");
String string = "5+3--323";
Matcher matcher = pattern.matcher(string);
while(matcher.find())
System.out.println("g0="+matcher.group(0));
}
}
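A note on why the original attempt printed 0: the pattern in the question matches every character of the input, so split() sees nothing but delimiters, is left with only empty strings, and drops them all as trailing empties. A minimal sketch of collecting the matches into a list instead follows. The class name EquationTokenizer and the decimal-aware pattern are assumptions made for illustration, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EquationTokenizer {
    // A run of digits and dots (a number), or any single other character.
    // Assumption: whitespace should be skipped, so \s is excluded from the second branch.
    private static final Pattern TOKEN = Pattern.compile("[\\d.]+|[^\\d.\\s]");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The question's pattern consumes the entire input as delimiters.
        String[] viaSplit = Pattern.compile("([^\\d.]|[\\d.]++)").split("5+3--323");
        System.out.println(viaSplit.length);       // 0
        System.out.println(tokenize("5+3--323"));  // [5, +, 3, -, -, 323]
    }
}
```

If an array is needed, tokenize(...).toArray(new String[0]) converts the list.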
What about using a
new java.util.Scanner(new java.io.StringReader("5+3--323"));
instead?
http://download.oracle.com/javase/6/docs/api/java/util/Scanner.html
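A minimal sketch of how the Scanner suggestion could work, assuming findInLine is called repeatedly with the token pattern from the Matcher answer (the class name ScannerTokens is made up for illustration). Scanner also accepts a String directly, so the StringReader wrapper is optional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTokens {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            String token;
            // findInLine ignores delimiters, returns the next match on the current
            // line and advances past it; it returns null when nothing is left.
            while ((token = scanner.findInLine("[0-9]+|[-+]")) != null) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("5+3--323"));
    }
}
```

Note that characters matching neither alternative (for example * or /) are silently skipped by this approach, so the pattern would need extending for other operators.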
If your tokens are comma-separated, first tokenize the string:
StringTokenizer tok = new StringTokenizer(string, ",");
Then try to create a number from each token. If it is not a number, then it's a symbol:
while (tok.hasMoreTokens()) {
    String token = tok.nextToken();
    try {
        Integer.parseInt(token); // succeeds only if the token is a number
    } catch (NumberFormatException e) {
        // the token is not a number, so treat it as a symbol
    }
}
If the token is not a number, a NumberFormatException is thrown.
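The classification idea above can be sketched as follows, assuming the input is a comma-separated list such as 5,+,3,-,-,323. The class name, method name, and the "number:"/"symbol:" labels are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenClassifier {
    // Returns one label per token: "number:<n>" or "symbol:<s>".
    static List<String> classify(String commaSeparated) {
        List<String> labels = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(commaSeparated, ",");
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken().trim();
            try {
                int value = Integer.parseInt(token);
                labels.add("number:" + value);
            } catch (NumberFormatException e) {
                // A lone "+" or "-" is not a valid integer, so it lands here.
                labels.add("symbol:" + token);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(classify("5,+,3,-,-,323"));
        // [number:5, symbol:+, number:3, symbol:-, symbol:-, number:323]
    }
}
```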
Related
I'm pretty rusty with regex, but I have the requirement to extract the first token of the following string:
Input: /token1/token2/token3
Required output: /token1
I have tried:
List<String> connectorPath = Splitter.on("^[/\\w+]+")
.trimResults()
.splitToList(actionPath);
This doesn't work for me. Any ideas?
Instead of split, you can match
^/\\w+
Or if the string has 3 parts, use a capture group for the first part.
^(/\\w+)/\\w+/\\w+$
Java example
Pattern pattern = Pattern.compile("^/\\w+");
Matcher matcher = pattern.matcher("/token1/token2/token3");
if (matcher.find()) {
System.out.println(matcher.group(0));
}
Output
/token1
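The capture-group variant mentioned above could be wrapped in a small helper like this. It is only a sketch: the names FirstSegment and firstSegment and the choice to return an Optional are assumptions, and the pattern is relaxed from exactly three parts to any number of trailing segments.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstSegment {
    // Group 1 is the first "/word" segment; the rest must be zero or more "/word" segments.
    private static final Pattern PATH = Pattern.compile("^(/\\w+)(?:/\\w+)*$");

    static Optional<String> firstSegment(String path) {
        Matcher matcher = PATH.matcher(path);
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstSegment("/token1/token2/token3")); // Optional[/token1]
        System.out.println(firstSegment("token1/token2"));         // Optional.empty
    }
}
```

Returning an empty Optional for malformed input makes the "no match" case explicit instead of relying on a null check.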
You can split on the / that is not at the string start using the (?!^)/ regex:
String[] res = "/token1/token2/token3".split("(?!^)/");
System.out.println(res[0]); // => /token1
See the Java code demo and the regex demo.
(?!^) - a negative lookahead that matches a location not at the start of string
/ - a / char.
Using Guava:
Splitter splitter = Splitter.onPattern("(?!^)/").trimResults();
Iterable<String> iterable = splitter.split(actionPath);
String first = Iterables.getFirst(iterable, "");
You are over-complicating it.
Try the following regular expression: ^(\/\w+)(.+)$
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PathSplitter {
public static void main(String args[]) {
String input = "/token1/token2/token3";
Pattern pattern = Pattern.compile("^(\\/\\w+)(.+)$");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
System.out.println(matcher.group(1)); // /token1
System.out.println(matcher.group(2)); // /token2/token3
} else {
System.out.println("NO MATCH");
}
}
}
I want to split the string (1,0) and get the results 1 and 0. I have tried this code:
String str ="(1,0)";
String parts[]= str.split("(,)");
System.out.println(parts[0]);
System.out.println(parts[1]);
But I got this:
(1
0)
Here's an efficient way to isolate all the digits using the java.util.regex classes and put them into an ArrayList for easy use. It doesn't use the .split() method, but it is efficient.
import java.util.ArrayList;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class DigitExtractor {
    public static void main(String[] args) {
        String str = "(1,0)";
        Pattern p = Pattern.compile("\\d+"); // one or more consecutive digits
        Matcher m = p.matcher(str);
        ArrayList<Integer> vals = new ArrayList<>();
        while (m.find())
            vals.add(Integer.parseInt(m.group()));
        System.out.println(vals); // [1, 0]
    }
}
A simple solution would be as follows:
public class Main {
public static void main(String[] args) {
String str = "(1,0)";
String parts[] = str.replace("(", "").replace(")", "").split(",");
System.out.println(parts[0]);
System.out.println(parts[1]);
}
}
Output:
1
0
If you split on (, the first value in the returned array will be "".
I'd recommend using regex directly, e.g.
String input = "(1,0)";
Matcher m = Pattern.compile("\\(([^,\\)]+),([^,\\)]+)\\)").matcher(input);
if (! m.matches())
throw new IllegalArgumentException("Invalid input: " + input);
System.out.println(m.group(1));
System.out.println(m.group(2));
Of course, if you insist on using split(), it can be done like this:
String input = "(1,0)";
String[] parts = input.split("[(,)]");
if (parts.length != 3 || ! parts[0].isEmpty())
throw new IllegalArgumentException("Invalid input: " + input);
System.out.println(parts[1]);
System.out.println(parts[2]);
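To illustrate why the two split() calls behave differently (a small sketch; the class name SplitDemo is arbitrary): in a regex, unescaped parentheses form a group, so "(,)" splits on the comma alone, whereas the character class "[(,)]" splits on any of the three characters and leaves an empty string in front.

```java
import java.util.Arrays;

class SplitDemo {
    public static void main(String[] args) {
        String input = "(1,0)";
        // "(,)" is a capture group around a comma, so only the comma is a delimiter.
        System.out.println(Arrays.toString(input.split("(,)")));    // [(1, 0)]
        // "[(,)]" matches '(', ',' or ')'; the leading empty string is kept,
        // while trailing empty strings are dropped by split().
        System.out.println(Arrays.toString(input.split("[(,)]")));  // [, 1, 0]
    }
}
```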
If you know how to use regex, go for that (personally, I prefer string manipulation here because it's easier). If not, learn how to use it or do something like this:
String input = "(64,128)";
String[] numbers = input.substring(1, input.length() - 1).split(",");
Try this, assuming your format is consistent.
String str = "(1,0)";
String[] tokens = str.substring(1,str.length()-1).split(",");
System.out.println(Arrays.toString(tokens));
Prints
[1, 0]
or if printed separately
1
0
I have written an OCR program in Java that scans documents and finds all the text in them. My primary task is to find the invoice number, which is a run of 6 or more digits.
I used substring, but that doesn't work well because the position of the number changes with every document. It is, however, always present in the first three lines of the OCR text.
I want to write Java 8 code that iterates through the first three lines and extracts these 6 consecutive digits.
I am using Tesseract for OCR.
Example:
,——— ————i_
g DAILYW RK SHE 278464
E C 0 mp] on THE POUJER Hello, Mumbai, Co. Maha
From this, I need to extract the number 278464.
Please help!!
Try the following code using regex.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Test
{
    public static void main(String[] args)
    {
        Pattern pattern = Pattern.compile("(?<=\\D)\\d{6}(?!\\d)");
        String str = "g DAILYW RK SHE 278464";
        Matcher matcher = pattern.matcher(str);
        if (matcher.find()) {
            String s = matcher.group();
            // prints 278464
            System.out.println(s);
        }
    }
}
(?<=\\D) - a lookbehind: the preceding character must be a non-digit (it is checked but not included in the match)
\\d{6} - matches exactly 6 digits
(?!\\d) - a negative lookahead: the next character must not be a digit (it is checked but not included in the match)
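One edge case worth noting: (?<=\\D) requires that a non-digit character actually exists before the number, so a number at the very start of the text is not matched. A hedged sketch of an alternative using the negative lookbehind (?<!\\d), which also accepts the start of the input, follows. The class and method names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SixDigitFinder {
    static final Pattern POSITIVE_LOOKBEHIND = Pattern.compile("(?<=\\D)\\d{6}(?!\\d)");
    static final Pattern NEGATIVE_LOOKBEHIND = Pattern.compile("(?<!\\d)\\d{6}(?!\\d)");

    // Returns the first match of the pattern, or null if there is none.
    static String find(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(find(POSITIVE_LOOKBEHIND, "g DAILYW RK SHE 278464")); // 278464
        System.out.println(find(POSITIVE_LOOKBEHIND, "278464 DAILYW"));          // null
        System.out.println(find(NEGATIVE_LOOKBEHIND, "278464 DAILYW"));          // 278464
        System.out.println(find(NEGATIVE_LOOKBEHIND, "no 1234567 here"));        // null
    }
}
```

Both patterns reject runs longer than six digits, which differs from the "6 or more" wording in the question; \\d{6,} would be needed for that.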
It can be solved simply with \\d{6,} as shown below:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String args[]) {
// Tests
String[] textArr1 = { ",——— ————i_", "g DAILYW RK SHE 2784647",
"E C 0 mp] on THE POUJER Hello, Mumbai, Co. Maha" };
String[] textArr2 = { ",——— ————i_", "g DAILYW RK SHE ——— ————",
"E C 0 mp] on THE 278464 POUJER Hello, Mumbai, Co. Maha" };
String[] textArr3 = { ",——— 278464————i_", "g DAILYW RK SHE POUJER",
"E C 0 mp] on THE POUJER Hello, Mumbai, Co. Maha" };
System.out.println(getInvoiceNumber(textArr1));
System.out.println(getInvoiceNumber(textArr2));
System.out.println(getInvoiceNumber(textArr3));
}
static String getInvoiceNumber(String[] textArr) {
String invoiceNumber = "";
Pattern pattern = Pattern.compile("\\d{6,}");
for (String text : textArr) {
Matcher matcher = pattern.matcher(text);
if (matcher.find()) {
invoiceNumber = matcher.group();
}
}
return invoiceNumber;
}
}
Output:
2784647
278464
278464
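Since the question asks for Java 8 and says the number always appears in the first three lines, here is one possible sketch that stops at the first match within those lines. It assumes the OCR output arrives as a single String with line breaks; the names InvoiceNumberFinder and findInvoiceNumber are made up for illustration, and the sample text is a simplified ASCII stand-in for the OCR output.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InvoiceNumberFinder {
    private static final Pattern INVOICE = Pattern.compile("\\d{6,}");

    static Optional<String> findInvoiceNumber(String ocrText) {
        return Arrays.stream(ocrText.split("\\R")) // \R matches any line break (Java 8+)
                .limit(3)                          // only the first three lines
                .map(INVOICE::matcher)
                .filter(Matcher::find)             // keep lines containing 6+ digits
                .map(Matcher::group)               // the digits located by find()
                .findFirst();
    }

    public static void main(String[] args) {
        String text = "-- header --\ng DAILYW RK SHE 278464\nE C 0 mp] on THE POUJER";
        System.out.println(findInvoiceNumber(text).orElse("not found")); // 278464
    }
}
```

Unlike the loop above, which keeps the last match it sees, this version returns the first one and ignores anything after line three.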
Check this code.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
    private static final Pattern p = Pattern.compile("(\\d{6,})");
    public static void main(String[] args) {
        try {
            Scanner scanner = new Scanner(new File("here put your file path"));
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                // create a matcher for pattern p and the given line
                Matcher m = p.matcher(line);
                // if an occurrence of the pattern was found in the line...
                if (m.find()) {
                    System.out.println(m.group(1)); // first capture group: the run of 6+ digits
                }
            }
            scanner.close();
            System.out.println("done");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}
String TextValue = "hello{MyVar} Discover {MyVar2} {MyVar3}";
String[] splitString = TextValue.split("\\{*\\}");
The output I'm getting in splitString is [{MyVar, {MyVar2, {MyVar3].
But my requirement is to preserve the delimiters {}, i.e. [{MyVar}, {MyVar2}, {MyVar3}].
I need a way to produce the output above.
Use something like so:
Pattern p = Pattern.compile("(\\{\\w+\\})");
String str = ...
Matcher m = p.matcher(str);
while(m.find())
System.out.println(m.group(1));
Note, the code above is untested, but it looks for words within curly brackets and places them in a group. It then goes over the string and outputs every substring that matches the expression above.
An example of the regular expression is available here.
Thanks kelvin & npinti.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class CreateMatcherExample {
public static void main(String[] args) {
String TextValue = "hello{MyVar} Discover {My_Var2} {My_Var3}";
String patternString = "\\{\\w+\\}";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(TextValue);
while(matcher.find()) {
System.out.println(matcher.group());
}
}
}
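For completeness, a sketch that returns the matches as a list, which is closer to the [{MyVar}, {MyVar2}, {MyVar3}] result asked for (the class and method names are assumptions). The original split("\\{*\\}") call treats } as the delimiter, which is why the closing braces disappeared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderExtractor {
    // An opening brace, one or more word characters, and a closing brace.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\w+\\}");

    static List<String> extract(String text) {
        List<String> placeholders = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(text);
        while (matcher.find()) {
            placeholders.add(matcher.group()); // braces are part of the match
        }
        return placeholders;
    }

    public static void main(String[] args) {
        System.out.println(extract("hello{MyVar} Discover {MyVar2} {MyVar3}"));
        // [{MyVar}, {MyVar2}, {MyVar3}]
    }
}
```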
I need to find the length of my string "பாரதீய ஜனதா இளைஞர் அணி தலைவர் அனுராக்சிங் தாகூர் எம்.பி. நேற்று தேர்தல் ஆணையர் வி.சம்பத்". I got a length of 45, but I expect it to be 59. I need to extend the regular expression to account for spaces and dots (.). My code:
import java.util.*;
import java.lang.*;
import java.util.regex.*;
class UnicodeLength
{
public static void main (String[] args)
{
String s="பாரதீய ஜனதா இளைஞர் அணி தலைவர் அனுராக்சிங் தாகூர் எம்பி நேற்று தேர்தல் ஆணையர் விசம்பத்";
List<String> characters=new ArrayList<String>();
Pattern pat = Pattern.compile("\\p{L}\\p{M}*");
Matcher matcher = pat.matcher(s);
while (matcher.find()) {
characters.add(matcher.group());
}
// Test if we have the right characters and length
System.out.println(characters);
System.out.println("String length: " + characters.size());
}
}
The code below worked for me. There were three issues that I fixed:
I added a check for spaces to your regular expression.
I added a check for punctuation to your regular expression.
I pasted the string from your comment into the string in your code. They weren't the same!
Here's the code:
public static void main(String[] args) {
String s = "பாரதீய ஜனதா இளைஞர் அணி தலைவர் அனுராக்சிங் தாகூர் எம்.பி. நேற்று தேர்தல் ஆணையர் வி.சம்பத்";
List<String> characters = new ArrayList<String>();
Pattern pat = Pattern.compile("\\p{P}|\\p{L}\\p{M}*| ");
Matcher matcher = pat.matcher(s);
while (matcher.find()) {
characters.add(matcher.group());
}
// Test if we have the right characters and length
int i = 1;
for (String character : characters) {
System.out.println(String.format("%d = [%s]", i++, character));
}
System.out.println("Characters Size: " + characters.size());
}
It's probably worth pointing out that your code is remarkably similar to the solution for this SO. One comment on that solution in particular led me to discover the missing check for punctuation in your code and allowed me to notice that the string from your comment didn't match the string in your code.
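The fix described above can be packaged as a small counting helper. This is a sketch under the same assumptions as the answer's pattern: only letters with their combining marks, punctuation, and plain spaces are counted, so digits and symbols would be skipped. The class name VisibleLength is made up, and the example uses a combining accent rather than the Tamil text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VisibleLength {
    // A punctuation character, a letter followed by its combining marks, or a space.
    private static final Pattern UNIT = Pattern.compile("\\p{P}|\\p{L}\\p{M}*| ");

    static int count(String text) {
        Matcher matcher = UNIT.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "e" + U+0301 (combining acute accent) is one unit but two chars.
        String text = "e\u0301 a.";
        System.out.println(text.length()); // 5
        System.out.println(count(text));   // 4
    }
}
```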