Let's say I have a string like this:
eXamPLestring>1.67>>ReSTOfString
My task is to extract only 1.67 from the string above.
I assume a regex will be useful, but I can't figure out how to write a proper expression.
If you want to extract all ints and floats from a String, you can follow my solution:
private ArrayList<String> parseIntsAndFloats(String raw) {
ArrayList<String> listBuffer = new ArrayList<String>();
Pattern p = Pattern.compile("[0-9]*\\.?[0-9]+");
Matcher m = p.matcher(raw);
while (m.find()) {
listBuffer.add(m.group());
}
return listBuffer;
}
If you also want to parse negative values, you can add [-]? to the pattern like this:
Pattern p = Pattern.compile("[-]?[0-9]*\\.?[0-9]+");
And if you also want to allow , as a separator, you can add ,? to the pattern like this:
Pattern p = Pattern.compile("[-]?[0-9]*\\.?,?[0-9]+");
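As a variation on the separator idea, here is a minimal sketch that accepts either . or , as the decimal separator and normalizes the comma before parsing. The class name DecimalSeparators and the stricter pattern are my own assumptions, not part of the answer above, and a comma-separated list such as 1,2,3 would be read as 1.2 and 3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DecimalSeparators {
    // Optional minus, digits, then at most one separator (. or ,) followed by digits.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:[.,]\\d+)?");

    static List<Double> parse(String raw) {
        List<Double> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(raw);
        while (m.find()) {
            // Double.parseDouble only understands '.', so replace a comma first.
            numbers.add(Double.parseDouble(m.group().replace(',', '.')));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(parse("a1,5b-2.25c7")); // [1.5, -2.25, 7.0]
    }
}
```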
To test the patterns you can use this online tool: http://gskinner.com/RegExr/
Note: when trying my examples in this tool, remember to unescape them (you just need to remove one of each pair of \).
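To illustrate the escaping point, this small sketch prints what the regex engine actually receives from a Java string literal. The class and constant names are made up for the example.

```java
import java.util.regex.Pattern;

class EscapingDemo {
    // In Java source every regex backslash is doubled; the regex engine sees only one.
    static final String JAVA_LITERAL = "\\d+\\.\\d+";

    public static void main(String[] args) {
        System.out.println(JAVA_LITERAL);                          // \d+\.\d+
        System.out.println(Pattern.matches(JAVA_LITERAL, "1.67")); // true
    }
}
```

The printed form, with single backslashes, is the one to paste into an online tester.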
You could try matching the digits using a regular expression
\\d+\\.\\d+
This could look something like
Pattern p = Pattern.compile("\\d+\\.\\d+");
Matcher m = p.matcher("eXamPLestring>1.67>>ReSTOfString");
while (m.find()) {
// Keep the parsed value instead of discarding it
float value = Float.parseFloat(m.group());
System.out.println(value);
}
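Here's how to do it in one line:
String f = input.replaceAll(".*?(-?[\\d.]+).*|.*", "$1");
This returns a blank String if no float is found. (The number group must not be optional: with an optional group the lazy .*? matches nothing, the group is skipped, and the result is always blank unless the input starts with the number.)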
If you actually want a float, you can do it in one line:
float f = Float.parseFloat(input.replaceAll(".*?(-?[\\d.]+).*", "$1"));
But since a blank string cannot be parsed as a float, if the input may contain no float you would have to do it in two steps: test whether the string is blank, then parse it.
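The two-step version can be sketched with a Matcher instead of replaceAll, so that "no float found" is an explicit result rather than a blank string. The class name, the method name and the use of OptionalDouble are my own choices here, not something from the answer above.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstFloat {
    // Optional minus, then either digits with an optional fraction, or a bare fraction like ".9".
    private static final Pattern NUMBER = Pattern.compile("-?(?:\\d+(?:\\.\\d+)?|\\.\\d+)");

    /** Returns the first number in the input, or an empty OptionalDouble if there is none. */
    static OptionalDouble firstNumber(String input) {
        Matcher m = NUMBER.matcher(input);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group()))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("eXamPLestring>1.67>>ReSTOfString")); // OptionalDouble[1.67]
        System.out.println(firstNumber("no digits here"));                   // OptionalDouble.empty
    }
}
```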
String s = "eXamPLestring>1.67>>ReSTOfString>>0.99>>ahgf>>.9>>>123>>>2323.12";
Pattern p = Pattern.compile("\\d*\\.\\d+");
Matcher m = p.matcher(s);
while(m.find()){
System.out.println(">> "+ m.group());
}
This prints only the floats (the integer 123 is skipped):
>> 1.67
>> 0.99
>> .9
>> 2323.12
You can use the regex \d*\.?,?\d*. This will work for floats like 1.0 and 1,0. Note that every part of it is optional, so it also matches the empty string.
Have a look at this link, they also explain a few things that you need to keep in mind when building such a regex.
[-+]?[0-9]*\.?[0-9]+
Example code:
String[] strings = new String[3];
strings[0] = "eXamPLestring>1.67>>ReSTOfString";
strings[1] = "eXamPLestring>0.57>>ReSTOfString";
strings[2] = "eXamPLestring>2547.758>>ReSTOfString";
Pattern pattern = Pattern.compile("[-+]?[0-9]*\\.?[0-9]+");
for (String string : strings)
{
Matcher matcher = pattern.matcher(string);
while(matcher.find()){
System.out.println("# float value: " + matcher.group());
}
}
Output:
# float value: 1.67
# float value: 0.57
# float value: 2547.758
/**
* Extracts the first number out of a text.
* Works for 1.000,1 and also for 1,000.1 returning 1000.1 (1000 plus 1 decimal).
* When only a , or a . is used it is assumed as the float separator.
*
* @param sample The sample text.
*
* @return A float representation of the number.
*/
static public Float extractFloat(String sample) {
Pattern pattern = Pattern.compile("[\\d.,]+");
Matcher matcher = pattern.matcher(sample);
if (!matcher.find()) {
return null;
}
String floatStr = matcher.group();
if (floatStr.matches("\\d+,+\\d+")) {
floatStr = floatStr.replaceAll(",+", ".");
} else if (floatStr.matches("\\d+\\.+\\d+")) {
floatStr = floatStr.replaceAll("\\.\\.+", ".");
} else if (floatStr.matches("(\\d+\\.+)+\\d+(,+\\d+)?")) {
floatStr = floatStr.replaceAll("\\.+", "").replaceAll(",+", ".");
} else if (floatStr.matches("(\\d+,+)+\\d+(\\.+\\d+)?")) {
floatStr = floatStr.replaceAll(",", "").replaceAll("\\.\\.+", ".");
}
try {
return Float.valueOf(floatStr);
} catch (NumberFormatException ex) {
throw new AssertionError("Unexpected non float text: " + floatStr);
}
}
I have a very long text and I'm extracting some specific values that are followed by some particular words. Here's an example of my long text:
.........
FPS(FramesPerSecond)[ValMin: 29.0000, ValMax: 35.000]
.........
TotalFrames[ValMin: 100000, ValMax:200000]
.........
MemoryUsage(In MB)[ValMin:190000MB, ValMax:360000MB]
.........
Here's my code:
File file = filePath.toFile();
JSONObject jsonObject = new JSONObject();
String FPSMin="";
String FPSMax="";
String TotalFramesMin="";
String TotalFramesMax="";
String MemUsageMin="";
String MemUsageMax="";
String log = "my//log//file";
final Matcher matcher = Pattern.compile("FPS/\(FramesPerSecond/\)/\[ValMin:");
if(matcher.find()){
FPSMin= matcher.end().trim();
}
But I can't make it work. Where am I wrong? Basically I need to select, for each String, the corresponding values (max and min) coming from that long text and store them into the variables. Like
FPSMin = 29.0000
FPSMax = 35.0000
FramesMin = 100000
Etc
Thank you
EDIT:
I tried the following code (in a test case) to see if the solution could work, but I'm experiencing issues because I can't print anything except an object. Here's the code:
@Test
public void whenReadLargeFileJava7_thenCorrect()
throws IOException, URISyntaxException {
Scanner txtScan = new Scanner("path//to//file//test.txt");
String[] FPSMin= new String[0];
String FPSMax= "";
//Read File Line By Line
while (txtScan.hasNextLine()) {
// Print the content on the console
String str = txtScan.nextLine();
Pattern FPSMin= Pattern.compile("^FPS\\(FramesPerSecond\\)\\[ValMin:");
Matcher matcher = FPSMin.matcher(str);
if(matcher.find()){
String MinMaxFPS= str.substring(matcher.end(), str.length()-1);
String[] splitted = MinMaxFPS.split(",");
FPSMin= splitted[0].split(": ");
FPSMax = splitted[1];
}
System.out.println(FPSMin);
System.out.println(FPSMax);
}
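Maybe your pattern should be ^FPS\\(FramesPerSecond\\)\\[ValMin: (the parentheses and the bracket are escaped with backslashes, not with forward slashes). You also need to create the Matcher from the Pattern with matcher(...). I've tried it and it works for me.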
String line = "FPS(FramesPerSecond)[ValMin: 29.0000, ValMax: 35.000]";
Pattern pattern = Pattern.compile("^FPS\\(FramesPerSecond\\)\\[ValMin:");
Matcher matcher = pattern.matcher(line);
if (matcher.find()) {
System.out.println(line.substring(matcher.end(), line.length()-1));
}
This way you get the offset at which the data you want begins, and with the substring function you can take all characters from that offset up to the line length minus 1 (because you don't want the closing ] character).
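Building on the offset and substring idea, a possible next step is to split the remainder into the min and max values. This is only a sketch: it assumes the line ends with ] and contains exactly one comma, and the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MinMaxFromLine {
    // Prefix up to and including "[ValMin:"; parentheses and bracket are escaped.
    private static final Pattern PREFIX =
            Pattern.compile("^FPS\\(FramesPerSecond\\)\\[ValMin:");

    /** Returns {min, max} as trimmed strings, or null if the line does not match. */
    static String[] minMax(String line) {
        Matcher m = PREFIX.matcher(line);
        if (!m.find()) {
            return null;
        }
        // Everything after the prefix, without the trailing ']': " 29.0000, ValMax: 35.000"
        String rest = line.substring(m.end(), line.length() - 1);
        String[] parts = rest.split(",");
        String min = parts[0].trim();
        // Drop the "ValMax:" label from the second part.
        String max = parts[1].substring(parts[1].indexOf(':') + 1).trim();
        return new String[] { min, max };
    }

    public static void main(String[] args) {
        String[] r = minMax("FPS(FramesPerSecond)[ValMin: 29.0000, ValMax: 35.000]");
        System.out.println(r[0] + " / " + r[1]); // 29.0000 / 35.000
    }
}
```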
The following regular expression will match and capture the name, min and max:
Pattern pattern = Pattern.compile("(.*)\\[.+:\\s*(\\d+(?:\\.\\d+)?)[A-Z]*,.+:\\s*(\\d+(?:\\.\\d+)?)[A-Z]*\\]");
Usage (extracting the captured groups):
String input = (".........\n" +
"FPS(FramesPerSecond)[ValMin: 29.0000, ValMax: 35.000]\n" +
".........\n" +
"TotalFrames[ValMin: 100000, ValMax:200000]\n" +
".........\n" +
"MemoryUsage(In MB)[ValMin:190000MB, ValMax:360000MB]\n" +
".........");
for (String s : input.split("\n")) {
Matcher matcher = pattern.matcher(s);
if (matcher.matches()) {
System.out.println(matcher.group(1) + ", " + matcher.group(2) + ", " + matcher.group(3));
}
}
Output:
FPS(FramesPerSecond), 29.0000, 35.000
TotalFrames, 100000, 200000
MemoryUsage(In MB), 190000, 360000
I receive a string and I have to retrieve the values from it.
I think we need to use ".split"
if (stringReceived.contains("ID")&& stringReceived.contains("Value")) {
Here is my character string:
I/RECEIVER: [1/1/0 3
I/RECEIVER: :32:11]
I/RECEIVER: Timestam
I/RECEIVER: p=946697
I/RECEIVER: 531 ID=4
I/RECEIVER: 3 Value=
I/RECEIVER: 18
I receive the value one byte at a time.
I would like to recover the values of Timestamp, ID and Value.
You can also use regex for that. Something like:
String example="[11/2/19 9:48:25] Timestamp=1549878505 ID=4 Value=2475";
Pattern pattern=Pattern.compile(".*Timestamp=(\\d+).*ID=(\\d+).*Value=(\\d+)");
Matcher matcher = pattern.matcher(example);
while(matcher.find()) {
System.out.println("Timestamp is:" + matcher.group(1));
System.out.println("Id is:" + matcher.group(2));
System.out.println("Value is:" + matcher.group(3));
}
If the order of the tokens can vary (for example, ID coming before Timestamp), that can be handled too. But since this looks like a log, which is probably structured, I doubt you will need to.
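For the case where the order is not fixed, one possible approach is to collect every key=value pair into a map and look the keys up afterwards. This is a sketch under the assumption that all values are non-negative integers; the class name KeyValueLog is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValueLog {
    // A word, '=', then digits: matches "Timestamp=1549878505", "ID=4", "Value=2475".
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(\\d+)");

    static Map<String, Long> parse(String line) {
        Map<String, Long> values = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            values.put(m.group(1), Long.parseLong(m.group(2)));
        }
        return values;
    }

    public static void main(String[] args) {
        // The keys appear in a different order than in the original example.
        System.out.println(parse("[11/2/19 9:48:25] ID=4 Value=2475 Timestamp=1549878505"));
        // {ID=4, Value=2475, Timestamp=1549878505}
    }
}
```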
First, [11/2/19 9:48:25] seems unnecessary, so let's remove it by jumping right to "Timestamp".
Using indexOf(), we can find where Timestamp starts.
// "Timestamp=1549878505 ID=4 Value=2475"
line = line.substring(line.indexOf("Timestamp"));
Since the tokens are separated by a space, we can split the line.
// ["Timestamp=1549878505", "ID=4" ,"Value=2475"]
line.split(" ");
Now, for each token, we can take the substring after the '=' and parse it into an integer.
for(String token: line.split(" ")) {
int v = Integer.parseInt(token.substring(token.indexOf('=') + 1));
System.out.println(v);
}
Hope that helps :)
String text = "Timestamp=1549878505 ID=4 Value=2475";
Pattern p = Pattern.compile("ID=(\\d+)");
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println(m.group(1));
}
Output:
4
A simple regex is also an option:
private int fromString(String data, String key) {
Pattern pattern = Pattern.compile(key + "=(\\d+)");
Matcher matcher = pattern.matcher(data);
if (matcher.find()) {
return Integer.parseInt(matcher.group(1));
}
return -1;
}
private void test(String data, String key) {
System.out.println(key + " = " + fromString(data, key));
}
private void test() {
String test = "[11/2/19 9:48:25] Timestamp=1549878505 ID=4 Value=2475";
test(test, "Timestamp");
test(test, "ID");
test(test, "Value");
}
prints:
Timestamp = 1549878505
ID = 4
Value = 2475
You can try that:
String txt= "[11/2/19 9:48:25] Timestamp=1549878505 ID=4 Value=2475";
String re1= ".*?\\d+.*?\\d+.*?\\d+.*?\\d+.*?\\d+.*?\\d+.*?(\\d+).*?(\\d+).*?(\\d+)";
Pattern p = Pattern.compile(re1,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Matcher m = p.matcher(txt);
if (m.find())
{
String int1=m.group(1);
String int2=m.group(2);
String int3=m.group(3);
System.out.print("("+int1+")"+"("+int2+")"+"("+int3+")"+"\n");
}
Use the code below. You will find the timestamp at index 0, the ID at index 1 and the value at index 2 in the list.
Pattern pattern = Pattern.compile("=\\d+");
Matcher matcher = pattern.matcher(stringToMatch);
final List<String> matches = new ArrayList<>();
while (matcher.find()) {
String ans = matcher.group(0);
matches.add(ans.substring(1, ans.length()));
}
Explaining the regex:
= matches the character = literally
\d matches a digit (equal to [0-9])
+ Quantifier: matches between one and unlimited times, as many times as possible
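As a possible variation, a capture group can return just the digits, which avoids stripping the leading = with substring afterwards. The class and method names below are only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ValuesAfterEquals {
    static List<String> values(String text) {
        // '=' is matched literally; the parentheses capture only the digits after it.
        Matcher matcher = Pattern.compile("=(\\d+)").matcher(text);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(values("[11/2/19 9:48:25] Timestamp=1549878505 ID=4 Value=2475"));
        // [1549878505, 4, 2475]
    }
}
```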
I am trying to extract the 00 and 02 from the line below into Strings.
invokestatic:indexbyte1=00 indexbyte2=02
I am using this code, but it's not working correctly:
String parse = "invokestatic:indexbyte1=00 indexbyte2=02";
String first = parse.substring(check.indexOf("=") + 1);
String second= parse.substring(check.lastIndexOf("=") + 1);
This seems to work for the second string, but the first string's value is
00 indexbyte2=02
I want to catch just the two digits and not the rest of the string.
If you don't specify the second parameter of the substring method, the result runs from the starting index to the end of the string. That's why you get "00 indexbyte2=02" for first.
To extract only the two digits for first, specify the end index as well:
String first = parse.substring(check.indexOf("=") + 1, check.indexOf("=") + 3);
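If the value is not always exactly two characters long, the end index can be taken from the next space instead of a hard-coded + 3. This is a sketch that assumes the input contains at least one = sign; the class and method names are hypothetical.

```java
class SubstringBounds {
    /** Returns the text between the first '=' and the next space (or the end of the string). */
    static String firstValue(String s) {
        int start = s.indexOf('=') + 1;
        int end = s.indexOf(' ', start);
        // indexOf returns -1 when there is no space after the value.
        return end == -1 ? s.substring(start) : s.substring(start, end);
    }

    public static void main(String[] args) {
        String parse = "invokestatic:indexbyte1=00 indexbyte2=02";
        System.out.println(firstValue(parse));                           // 00
        System.out.println(parse.substring(parse.lastIndexOf('=') + 1)); // 02
    }
}
```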
You can use a regex pattern with groups, like this:
public static void main(String[] args) {
String input = "invokestatic:indexbyte1=00 indexbyte2=02";
Pattern pattern = Pattern.compile(".*indexbyte1=(\\d*) indexbyte2=(\\d*)");
Matcher m = pattern.matcher(input);
if (m.matches()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
}
}
Try this:
String first = parse.substring(check.indexOf("=") + 1, check.indexOf("=") + 3);
check.indexOf("=") + 3 is the end index for the substring, so first contains just the 00. At present you are not specifying the end index, so indexbyte2=02 is included as well: substring does not know where to stop and runs to the end of the string.
String parse = "invokestatic:indexbyte1=00 indexbyte2=02";
String first = parse.substring(parse.indexOf("=") + 1,
parse.indexOf("=") + 3);
String second = parse.substring(parse.lastIndexOf("=") + 1);
System.out.println(first + ", " + second);
You could use the Pattern and Matcher classes.
Matcher m = Pattern.compile("(?<==)\\d+").matcher(string);
while(m.find())
{
System.out.println(m.group());
}
substring also has an endIndex. See the docs: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#substring(int,%20int)
If the input has the basic form invokestatic:indexbyte1=00 indexbyte2=02 ... indexbyte99=99 you could use a regex:
Pattern p = Pattern.compile("indexbyte\\d+=([a-fA-F0-9]{2})");
Matcher m = p.matcher(input);
while( m.find() ) {
String idxByte = m.group(1);
//handle the byte here
}
This assumes that the identifier for those bytes is indexbyteN but this can be replaced with another identifier. Further this assumes the bytes are provided in hex, i.e. 2 hex characters (case insensitive here).
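If the captured hex strings are needed as numbers, they can be converted with radix 16. A minimal sketch, assuming the same indexbyteN naming as above; the class name IndexBytes is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndexBytes {
    private static final Pattern BYTE = Pattern.compile("indexbyte\\d+=([a-fA-F0-9]{2})");

    static List<Integer> parse(String input) {
        List<Integer> bytes = new ArrayList<>();
        Matcher m = BYTE.matcher(input);
        while (m.find()) {
            // Radix 16 turns "0a" into 10 and "ff" into 255.
            bytes.add(Integer.parseInt(m.group(1), 16));
        }
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(parse("invokestatic:indexbyte1=00 indexbyte2=02")); // [0, 2]
        System.out.println(parse("invokestatic:indexbyte1=0a indexbyte2=ff")); // [10, 255]
    }
}
```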
This is the string that I have:
KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007
This is a weather report. I need to extract the following numbers from the report: 10/M13. It is temperature and dewpoint, where M means minus. So, the place in the String may differ and the temperature may be presented as M10/M13 or 10/13 or M10/13.
I have done the following code:
public String getTemperature (String metarIn){
Pattern regex = Pattern.compile(".*(\\d+)\\D+(\\d+)");
Matcher matcher = regex.matcher(metarIn);
if (matcher.matches() && matcher.groupCount() == 1) {
temperature = matcher.group(1);
System.out.println(temperature);
}
return temperature;
}
Obviously, the regex is wrong, since the method always returns null. I have tried tens of variations but to no avail. Thanks a lot if someone can help!
This will extract the String you seek, and it's only one line of code:
String tempAndDP = input.replaceAll(".*(?<![M\\d])(M?\\d+/M?\\d+).*", "$1");
Here's some test code:
public static void main(String[] args) throws Exception {
String input = "KLAS 282356Z 32010KT 10SM FEW090 M01/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007";
String tempAndDP = input.replaceAll(".*(?<![M\\d])(M?\\d+/M?\\d+).*", "$1");
System.out.println(tempAndDP);
}
Output:
M01/M13
The regex should look like:
M?\d+/M?\d+
For Java this will look like:
"M?\\d+/M?\\d+"
You might want to add a check for white space on the front and end:
"\\sM?\\d+/M?\\d+\\s"
But this depends on where you expect to find the pattern: it will not match if the pattern is at the very start or end of the string. So instead we should use:
"(^|\\s)M?\\d+/M?\\d+($|\\s)"
This specifies that if there isn't any whitespace at the end or front we must match the end of the string or the start of the string instead.
Example code used to test:
Pattern p = Pattern.compile("(^|\\s)M?\\d+/M?\\d+($|\\s)");
String test = "gibberish M130/13 here";
Matcher m = p.matcher(test);
if (m.find())
System.out.println(m.group().trim());
This returns: M130/13
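Once the group is found, the M prefix still has to be turned into a minus sign. The following sketch does that with two capture groups; the class and method names are my own, and it assumes both values are whole numbers as in the examples above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetarTemperature {
    // The whole token must be bounded by whitespace or the string edges.
    private static final Pattern TEMP =
            Pattern.compile("(?:^|\\s)(M?\\d+)/(M?\\d+)(?:$|\\s)");

    /** Converts "M13" to -13 and "10" to 10. */
    static int toSigned(String token) {
        return token.startsWith("M")
                ? -Integer.parseInt(token.substring(1))
                : Integer.parseInt(token);
    }

    /** Returns {temperature, dewpoint}, or null when no such group is present. */
    static int[] temperatureAndDewpoint(String metar) {
        Matcher m = TEMP.matcher(metar);
        if (!m.find()) {
            return null;
        }
        return new int[] { toSigned(m.group(1)), toSigned(m.group(2)) };
    }

    public static void main(String[] args) {
        int[] t = temperatureAndDewpoint(
                "KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007");
        System.out.println(t[0] + " " + t[1]); // 10 -13
    }
}
```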
Try:
Pattern regex = Pattern.compile(".*\\sM?(\\d+)/M?(\\d+)\\s.*");
Matcher matcher = regex.matcher(metarIn);
if (matcher.matches() && matcher.groupCount() == 2) {
temperature = matcher.group(1);
System.out.println(temperature);
}
An alternative to regex.
Sometimes a regex is not the only solution. It seems that in your case you need the 6th block of text. The blocks are separated by a space character, so what you need to do is count the blocks.
Considering that each block of text does NOT have a fixed length
Example:
String s = "KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007";
int spaces = 5;
int begin = 0;
while(spaces-- > 0){
begin = s.indexOf(' ', begin)+1;
}
int end = s.indexOf(' ', begin+1);
String result = s.substring(begin, end);
System.out.println(result);
Considering that each block of text DOES have a fixed length
String s = "KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007";
String result = s.substring(33, s.indexOf(' ', 33));
System.out.println(result);
A prettier alternative, as pointed out by Adrian:
String result = rawString.split(" ")[5];
Note that split actually takes a regex pattern as its parameter.
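Because split takes a regex, it can also be made tolerant of repeated spaces, which a plain " " separator is not. A small sketch, with an invented class name and a doubled space added to the sample on purpose; it assumes a non-negative index.

```java
class SixthToken {
    /** Returns the token at the given zero-based index, or null if there are too few tokens. */
    static String token(String line, int index) {
        // "\\s+" treats any run of whitespace as a single separator.
        String[] parts = line.trim().split("\\s+");
        return index < parts.length ? parts[index] : null;
    }

    public static void main(String[] args) {
        // Note the two spaces after 282356Z.
        String s = "KLAS 282356Z  32010KT 10SM FEW090 10/M13 A2997";
        System.out.println(token(s, 5));    // 10/M13
        System.out.println(s.split(" ")[5]); // FEW090 (an empty token shifts the index)
    }
}
```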
I am a beginner in the Java programming language.
When I input (1,2) into the console (brackets included), how can I write code to extract the first and the second number using a regex?
If there is no such expression to extract the first and second number within the brackets, I will have to change the way of inputting coordinates to x,y without the brackets, which should make it a lot easier to extract the numbers.
Try this code:
public static void main(String[] args) {
String searchString = "(7,32)";
Pattern compile1 = Pattern.compile("\\(\\d+,");
Pattern compile2 = Pattern.compile(",\\d+\\)");
Matcher matcher1 = compile1.matcher(searchString);
Matcher matcher2 = compile2.matcher(searchString);
while (matcher1.find() && matcher2.find()) {
String group1 = matcher1.group();
String group2 = matcher2.group();
System.out.println("value 1: " + group1.substring(1, group1.length() - 1 ) + " value 2: " + group2.substring(1, group2.length() - 1 ));
}
}
Not that I think regex is the best tool here. If you know the input will be in the form (number,number), I would first get rid of the brackets:
String stringWithoutBrackets = searchString.substring(1, searchString.length() - 1);
and then tokenize it with split:
String[] coordinates = stringWithoutBrackets.split(",");
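Putting the two steps together and converting the tokens to numbers could look like the sketch below. It assumes the input really has the form (number,number), possibly with spaces; the class name Coordinates is made up, and malformed input will throw an exception.

```java
class Coordinates {
    /** Parses "(x,y)" into {x, y}. */
    static int[] parse(String input) {
        String trimmed = input.trim();
        // Drop the opening and closing bracket.
        String inner = trimmed.substring(1, trimmed.length() - 1);
        String[] parts = inner.split(",");
        // trim() tolerates input such as "(1, 2)".
        return new int[] {
            Integer.parseInt(parts[0].trim()),
            Integer.parseInt(parts[1].trim())
        };
    }

    public static void main(String[] args) {
        int[] c = parse("(7,32)");
        System.out.println(c[0] + " " + c[1]); // 7 32
    }
}
```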
Looking through the regex API, you can also do something like this:
public static void main(String[] args) {
String searchString = "(7,32)";
Pattern compile1 = Pattern.compile("(?<=\\()\\d+(?=,)");
Pattern compile2 = Pattern.compile("(?<=,)\\d+(?=\\))");
Matcher matcher1 = compile1.matcher(searchString);
Matcher matcher2 = compile2.matcher(searchString);
while (matcher1.find() && matcher2.find()) {
String group1 = matcher1.group();
String group2 = matcher2.group();
System.out.println("value 1: " + group1 + " value 2: " + group2);
}
}
The main change is that I used (?<=\(), (?=,), (?<=,) and (?=\)) to search for the brackets and commas without capturing them. But I really think it's overkill for this task.
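A simpler regex option is a single pattern with two capture groups, so that only one Matcher is needed. This is a sketch rather than a drop-in replacement; the class name is invented and the optional whitespace handling is an assumption about the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CoordinatePattern {
    // Literal brackets are escaped; each (\\d+) group captures one number.
    private static final Pattern COORD =
            Pattern.compile("\\(\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\)");

    /** Returns {x, y}, or null if the input is not of the form (x,y). */
    static int[] parse(String input) {
        Matcher m = COORD.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        int[] c = parse("(7,32)");
        System.out.println("value 1: " + c[0] + " value 2: " + c[1]); // value 1: 7 value 2: 32
    }
}
```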