Smart parsing string java - java

Is there some kind of rule engine or some smart way to do this?
I have a string like this:
test 1-2-22
and I want to parse it so that I get these values:
name = "test"
part_id = 1
brand_id = 2
count = 22
I have more of these so called rules from which I know the format of string.
I was thinking I can do this with regex, but is there a better way of doing this instead?
Edit:
I see some very good answers. Maybe I should have been more clear.
This is not the only string type that I might have; I could also have a string like this:
test 3-brand 15 – 2
Where after parsing it should be :
name = "test"
part_id = 2
brand_id = 3
count = 15
So I can have different strings and I need to define a rule/pattern for each of them. What would be a good way to do this? Regex is one option for now.

You can split around both spaces and dashes using the following expression:
[ -]
Then you will find the different components at indexes starting from 0.
In Java:
String input = "test 1-2-22";
String[] results = input.split("[ -]");

You can use this Pattern regex:
Pattern pattern = Pattern.compile("^([a-zA-Z]+)\\s*([^-]+)-([^-]+)-([^-]+)$");
Then this code should work:
String line = "test 1-2-22";
Pattern pattern = Pattern.compile("^([a-zA-Z]+)\\s*([^-]+)-([^-]+)-([^-]+)$");
Matcher matcher = pattern.matcher(line);
if (matcher.find()) {
    System.out.printf("name:%s, part_id:%s, brand_id:%s, count:%s%n",
        matcher.group(1), matcher.group(2), matcher.group(3), matcher.group(4));
}

In this particular case, suitable split operations (or other manual string processing) are probably going to be easiest, as you have the whitespace and the dashes to look for explicitly.
For more complex patterns you can look into antlr for tokenising this into (for example) one identifier and three number tokens and then parsing it, but that seems to be overkill here. (This would give you a 'rule engine', though.)
In general: you may want to read up on parsing and context-free grammars for this.
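Short of a full grammar, a lightweight middle ground for the "one rule per string format" requirement in the question is a list of regular expressions with named groups, tried in order. The following is only a sketch: the class name RuleParser, the group names (Java group names cannot contain underscores, hence partId rather than part_id) and the exact shape of the second format are assumptions based on the two sample strings. It needs Java 9+ for List.of.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RuleParser {
    // Fields every rule is expected to capture (hypothetical names).
    private static final List<String> FIELDS = List.of("name", "partId", "brandId", "count");

    // One pattern per known string format; the first rule that matches wins.
    private static final List<Pattern> RULES = List.of(
        // e.g. "test 1-2-22"
        Pattern.compile("^(?<name>[a-zA-Z]+)\\s+(?<partId>\\d+)-(?<brandId>\\d+)-(?<count>\\d+)$"),
        // e.g. "test 3-brand 15 \u2013 2" (accepts a hyphen or an en dash before the part id)
        Pattern.compile("^(?<name>[a-zA-Z]+)\\s+(?<brandId>\\d+)-brand\\s+(?<count>\\d+)\\s*[-\u2013]\\s*(?<partId>\\d+)$")
    );

    static Optional<Map<String, String>> parse(String input) {
        for (Pattern rule : RULES) {
            Matcher m = rule.matcher(input);
            if (m.matches()) {
                Map<String, String> values = new LinkedHashMap<>();
                for (String field : FIELDS) {
                    values.put(field, m.group(field));
                }
                return Optional.of(values);
            }
        }
        return Optional.empty(); // no rule matched
    }

    public static void main(String[] args) {
        System.out.println(parse("test 1-2-22"));
        System.out.println(parse("test 3-brand 15 \u2013 2"));
    }
}
```

Supporting a new format then means adding one more pattern to RULES; the rest of the code stays unchanged.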

Something like this:
String s = "test 1-2-22";
String[] vars = s.split("[ -]");
String name = vars[0];
String part_id = vars[1];
String brand_id = vars[2];
String count = vars[3];
This splits the string wherever a space or "-" occurs.
You can then convert the ids and count to int if required.
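A minimal sketch of that conversion, assuming the input always has one name followed by three numeric parts. The helper name parseNumbers is made up for illustration.

```java
import java.util.Arrays;

class SplitToInts {
    // Hypothetical helper: returns {part_id, brand_id, count} as ints.
    static int[] parseNumbers(String s) {
        String[] vars = s.split("[ -]");
        if (vars.length != 4) {
            throw new IllegalArgumentException("Unexpected format: " + s);
        }
        int[] numbers = new int[3];
        for (int i = 0; i < 3; i++) {
            // Integer.parseInt throws NumberFormatException for non-numeric parts
            numbers[i] = Integer.parseInt(vars[i + 1]);
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseNumbers("test 1-2-22"))); // prints [1, 2, 22]
    }
}
```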

Related

Java - extract JSON values from string using multi regex

I am trying to use this multi regex java library to extract JSON field value from a string.
My JSON looks like this:
{
"field1": "something",
"field2": 13,
"field3": "some"
}
I have created a regex pattern to fit each field, and it is working with Java Regex Pattern by simply doing something like this for each pattern:
Matcher matcher = patternToSearch.matcher(receiveData);
if (matcher.find()) {
return matcher.group(1);
}
I decided to try and improve the code and use multi regex so instead of scanning the string 3 times, it will scan it only one time and extract all needed values.
So I came up with something like this:
String[] patterns = new String[]{
"\"field1\":\\s*\"(.*?)\"",
"\"field2\":\\s*(\\d+)(\\.\\d)?",
"\"field3\":\\s*\"(.*?)\"",
};
this.matcher = MultiPattern.of(patterns).matcher();
The matcher has only one method, match, used like this:
int[] match = this.matcher.match(jsonStringToScan);
So I ended up with an array of integers, but I have no idea how to get the JSON values from the string or how those integers help me. The multi regex matcher does not support the group method I used before to get the value.
Any idea of how I can extract multiple json values from string using multi regex? (Scanning string only once)
As mentioned on the GitHub page you linked, match returns the indexes of the patterns that matched. Another point from that page:
The library does not handle groups.
Consider matching the key as a group too. Look at this simple example:
final Pattern p = Pattern.compile("\"(field.)\":((?:\".*?\")|(?:\\d+(?:\\.\\d+)?))");
final Matcher m = p.matcher("{\"field3\":\"hi\",\"field2\":100.0,\"field1\":\"hi\"}");
while (m.find()) {
    for (int i = 1; i <= m.groupCount(); i++) {
        System.out.print(m.group(i) + " ");
    }
    System.out.println();
}
It prints:
field3 "hi"
field2 100.0
field1 "hi"
If you want to avoid quotes in value group, you need more complicated logic. I've stopped at:
final Pattern p = Pattern.compile("\"(field.)\":(?:(?:\"(.*?(?=\"))\")|(\\d+(?:\\.\\d+)?))");
resulting in
field3 hi null
field2 null 100.0
field1 hi null
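The null entries in that output come from the alternation: for each match only one of the two value groups participates. A sketch of how the values could be collected into a map in a single scan with plain java.util.regex is shown below. The class and method names are made up, and the pattern additionally allows whitespace after the colon, as in the JSON from the question. Regular expressions are fragile on JSON in general (escaped quotes and nesting, for example), so this only suits flat, well-known input.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonFieldScan {
    // Group 1: key, group 2: quoted string value, group 3: numeric value.
    private static final Pattern FIELD = Pattern.compile(
        "\"(field.)\":\\s*(?:\"(.*?)\"|(\\d+(?:\\.\\d+)?))");

    static Map<String, String> extract(String json) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(json);
        while (m.find()) {
            // Exactly one of the two value groups is non-null per match.
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            values.put(m.group(1), value);
        }
        return values;
    }

    public static void main(String[] args) {
        String json = "{\"field1\": \"something\", \"field2\": 13, \"field3\": \"some\"}";
        System.out.println(extract(json));
    }
}
```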

Replace a string using a regular expression

I have a string that I would like to replace using a regular expression in java but I am not quite sure how to do this.
Let's say I have the code below:
String globalID="60DC6285-1E71-4C30-AE36-043B3F7A4CA6";
String regExpr="^([A-Z0-9]{3})[A-Z0-9]*|-([A-Z0-9]{3})[A-Z0-9]*$|-([A-Z0-9]{2})[A-Z0-9]*";
What I would like to do is apply regExpr to globalID so that the new string is 60D1E4CAE043. I did it with str.substring(0,3)+.... but I was wondering whether I can do it with the regular expression in Java. I tried replaceAll, but the output was not the one described above.
To be more specific, I would like to turn globalID into a newGlobalID using the regular expression above. The newGlobalID would be: 60D1E4CAE043.
Thanks
This is definitely not the best code ever, but you could do something like this:
String globalID = "60DC6285-1E71-4C30-AE36-043B3F7A4CA6";
String regExpr = "^([A-Z0-9]{3})[A-Z0-9]*|-([A-Z0-9]{3})[A-Z0-9]*$|-([A-Z0-9]{2})[A-Z0-9]*";
Pattern pattern = Pattern.compile(regExpr);
Matcher matcher = pattern.matcher(globalID);
String newGlobalID = "";
while (matcher.find()) {
    for (int i = 1; i <= matcher.groupCount(); i++) {
        newGlobalID += matcher.group(i) != null ? matcher.group(i) : "";
    }
}
System.out.println(newGlobalID);
You will need to use a Matcher to iterate over all matches in your input, because your regular expression only matches subsequences of the input string. Depending on which alternative matched, a different capturing group will be non-null. You could also use named capturing groups or keep track of where in the input you currently are, but the above code should work as an example.
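The same loop can be written with a StringBuilder instead of repeated string concatenation. This is only a sketch of the approach described above; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GlobalIdShortener {
    // The regular expression from the question, compiled once.
    private static final Pattern PARTS = Pattern.compile(
        "^([A-Z0-9]{3})[A-Z0-9]*|-([A-Z0-9]{3})[A-Z0-9]*$|-([A-Z0-9]{2})[A-Z0-9]*");

    static String shorten(String globalID) {
        StringBuilder result = new StringBuilder();
        Matcher matcher = PARTS.matcher(globalID);
        while (matcher.find()) {
            // Only the group belonging to the alternative that matched is non-null.
            for (int i = 1; i <= matcher.groupCount(); i++) {
                if (matcher.group(i) != null) {
                    result.append(matcher.group(i));
                }
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(shorten("60DC6285-1E71-4C30-AE36-043B3F7A4CA6")); // prints 60D1E4CAE043
    }
}
```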
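Your regexp must match the whole string if you want to use it with replaceAll. Your version tries to match the parts as alternatives, which does not work there.
Try this (the group lengths 3, 2, 2, 2, 3 correspond to the desired result 60D1E4CAE043):
String regExpr="^([A-Z0-9]{3})[^-]*"+
"-([A-Z0-9]{2})[^-]*"+
"-([A-Z0-9]{2})[^-]*"+
"-([A-Z0-9]{2})[^-]*"+
"-([A-Z0-9]{3}).*";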
The complete code would look like this:
String globalID = "60DC6285-1E71-4C30-AE36-043B3F7A4CA6";
String regExpr = "^(\\w{3}).*?-"
+ "(\\w{2}).*?-"
+ "(\\w{2}).*?-"
+ "(\\w{2}).*?-"
+ "(\\w{3}).*";
System.out.println(globalID.replaceAll(regExpr, "$1$2$3$4$5"));
The output of println function is
60D1E4CAE043

Parse out specific characters from java string

I have been trying to drop specific values from a String holding JDBC query results and column metadata. The format of the output is:
[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]
I am trying to get it into the following format:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
I have tried just dropping everything before the "=", but some of the "someVal" data has "=" in them. Is there any efficient way to solve this issue?
Below is the code I used:
for(int i = 0; i < finalResult.size(); i+=modval) {
    String resulttemp = finalResult.get(i).toString();
    String [] parts = resulttemp.split(",");
    //below is only for
    for(int z = 0; z < columnHeaders.size(); z++) {
        String replaced ="";
        replaced = parts[z].replace("*=", "");
        System.out.println("Replaced: " + replaced);
    }
}
You don't need any splitting here!
You can use replaceAll() and the power of regular expressions to simply replace all occurrences of those unwanted characters, like in:
someString.replaceAll("[\\[\\]\\{\\}]", "")
When you apply that to your strings, the resulting string should look exactly as required.
You could use a regular expression to remove the square and curly brackets, like this: [\[\]{}]
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
System.out.println(s.replaceAll("[\\[\\]{}]", ""));
That would produce the following output:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
which is what you expect in your post.
A better approach, however, might be to match instead of replace if you know the character set that will be in the position of 'someValue'. Then you can design a regex that matches this particular string in such a way that, no matter what separates I_Col1=someValue1 from the rest of the String, you will be able to extract it :-)
EDIT:
With regard to the matching approach, given that the value following I_Col1= consists of word characters (letters, digits and _), you could use this pattern: (I_Col\d=\w+),?
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
Matcher m = Pattern.compile("(I_Col\\d=\\w+),?").matcher(s);
while (m.find())
    System.out.println(m.group(1));
This will produce:
I_Col1=someValue1
I_Col2=someVal2
I_Col3=someVal3
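Since the question mentions that some values contain "=", the \w+ part of that pattern would cut such values short. A possible variant, sketched below, takes everything after the first "=" up to the next comma or closing brace. It assumes that values never contain "," or "}"; the class name and the sample value a=b are made up for illustration. It needs Java 9+ for List.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColumnPairs {
    // Column name, "=", then any characters except ',' and '}' (so '=' is allowed in values).
    private static final Pattern PAIR = Pattern.compile("I_Col\\d+=[^,}]*");

    static List<String> extract(String s) {
        List<String> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(s);
        while (m.find()) {
            pairs.add(m.group());
        }
        return pairs;
    }

    public static void main(String[] args) {
        String s = "[{I_Col1=a=b, I_Col2=someVal2}, {I_Col3=someVal3}]";
        System.out.println(String.join(", ", extract(s)));
    }
}
```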
You could do four calls to replaceAll on the string, one per bracket character.
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\{", "").replaceAll("\\}", "").replaceAll("\\]", "").replaceAll("\\[", "");
Or you could use a single regexp with alternation, which is easier to read.
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\[|\\]|\\{|\\}", "");

How to find a String of last 2 items in colon separated string

I have a string = ab:cd:ef:gh. On this input, I want to return the string ef:gh (third colon intact).
The string apple:orange:cat:dog should return cat:dog (there's always 4 items and 3 colons).
I could have a loop that counts colons and makes a string of characters after the second colon, but I was wondering if there exists some easier way to solve it.
You can use the split() method for your string.
String example = "ab:cd:ef:gh";
String[] parts = example.split(":");
System.out.println(parts[parts.length-2] + ":" + parts[parts.length-1]);
String example = "ab:cd:ef:gh";
String[] parts = example.split(":",3); // create at most 3 Array entries
System.out.println(parts[2]);
The split function might be what you're looking for here. Use the colon, like in the documentation as your delimiter. You can then obtain the last two indexes, like in an array.
Yes, there is an easier way.
The first is to use the split method of the String class:
String txt = "ab:cd:ef:gh";
String[] arr = txt.split(":");
System.out.println(arr[arr.length-2] + ":" + arr[arr.length-1]);
The second is to use the Matcher class.
Use the overloaded version of lastIndexOf(), which takes the starting index as its second parameter:
str.substring(str.lastIndexOf(":", str.lastIndexOf(":") - 1) + 1)
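Spelled out step by step, the lastIndexOf() approach could look like the sketch below. The class and method names are made up for illustration.

```java
class LastTwoItems {
    static String lastTwo(String str) {
        int last = str.lastIndexOf(':');                 // index of the final colon
        int secondLast = str.lastIndexOf(':', last - 1); // search backwards from just before it
        // If there are fewer than two colons, secondLast is -1 and the whole string is returned.
        return str.substring(secondLast + 1);
    }

    public static void main(String[] args) {
        System.out.println(lastTwo("ab:cd:ef:gh"));          // prints ef:gh
        System.out.println(lastTwo("apple:orange:cat:dog")); // prints cat:dog
    }
}
```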
Another solution would be using a Pattern to match your input, something like [^:]+:[^:]+$. Using a pattern would probably be easier to maintain as you can easily change it to handle for example other separators, without changing the rest of the method.
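Using a pattern may also be more efficient than String.split(): split() treats its argument as a regular expression (although JDK implementations typically have a fast path for simple single-character delimiters such as ":") and it does more work than you actually need, since it builds an array of all the parts.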
This would give something like this:
String example = "ab:cd:ef:gh";
Pattern regex = Pattern.compile("[^:]+:[^:]+$");
final Matcher matcher = regex.matcher(example);
if (matcher.find()) {
    // extract the matching group, which is what we are looking for
    System.out.println(matcher.group()); // prints ef:gh
} else {
    // handle invalid input
    System.out.println("no match");
}
Note that you would typically extract regex as a reusable constant to avoid compiling the pattern every time. Using a constant would also make the pattern easier to change without looking at the actual code.
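A sketch of what extracting the pattern into a constant could look like. The class name, the constant name and the use of Optional for the "no match" case are illustrative choices, not part of the answer above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastTwoPattern {
    // Compiled once and reused; changing the separator only requires touching this line.
    private static final Pattern LAST_TWO = Pattern.compile("[^:]+:[^:]+$");

    static Optional<String> lastTwo(String input) {
        Matcher matcher = LAST_TWO.matcher(input);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(lastTwo("ab:cd:ef:gh").orElse("no match")); // prints ef:gh
        System.out.println(lastTwo("abcd").orElse("no match"));        // prints no match
    }
}
```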

Get specific value from string using split function

I have a String like this:
APIKey testapikey=mysecretkey
I want to get mysecretkey into a String variable.
What I tried is below:
String[] couple = string.split(" ");
String[] values=couple[1].split("=");
String mykey= values[1];
Is this the right way?
You could use the String.replaceAll(...) method.
String string = "APIKey testapikey=mysecretkey";
// [.*key=] - match the substring ending with "key="
// [(.*)] - match everything after the "key=" and group the matched characters
// [$1] - replace the matched string by the value of capturing group number 1
string = string.replaceAll(".*key=(.*)", "$1");
System.out.println(string);
Don't use split(); you would be unnecessarily creating an array of Strings.
Use String myString = originalString.replaceAll(".*=","");
I think using split here is pretty error prone. A small change in the format of the incoming string (such as a space being added) could result in a bug that's hard to diagnose. My recommendation would be to play it safe and use a regular expression to ensure the text is exactly as you expect:
Pattern pattern = Pattern.compile("APIKey testapikey=(\\w*)");
Matcher matcher = pattern.matcher(apiKeyText);
if (!matcher.matches())
    throw new IllegalArgumentException("apiKey does not match pattern");
// group(1) is the captured key; group() would return the whole matched text
String apiKey = matcher.group(1);
That code documents your intentions much better than split does and picks up unexpected changes in format. The only possible downside is performance, but assuming you make pattern a static final (so that it is compiled only once), I very much doubt it will be an issue unless you are calling this millions of times.
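Put together as a self-contained sketch with the pattern held in a static final field. The class and method names are made up for illustration, and the pattern is the one used above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ApiKeyParser {
    // Compiled once for the lifetime of the class.
    private static final Pattern API_KEY = Pattern.compile("APIKey testapikey=(\\w*)");

    static String extract(String apiKeyText) {
        Matcher matcher = API_KEY.matcher(apiKeyText);
        if (!matcher.matches()) { // the whole input must have the expected shape
            throw new IllegalArgumentException("apiKey does not match pattern");
        }
        return matcher.group(1); // only the captured key, not the whole match
    }

    public static void main(String[] args) {
        System.out.println(extract("APIKey testapikey=mysecretkey")); // prints mysecretkey
    }
}
```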
