Stuck in regular expression - java

I have 3 strings that contain 2 fields and 2 values per string. I need a regular expression for the strings so I can get the data. Here are the 3 strings:
TTextRecordByLanguage{Text=Enter here the amount to transfer from your compulsory book saving account to your compulsory checking account; Id=55; }
TTextRecordByLanguage{Text=Hello World, CaribPayActivity!; Id=2; }
TTextRecordByLanguage{Text=(iphone); Id=4; }
The 2 fields are Text and Id, so I need an expression that gets the data between the Text field and the semi-colon (;). Make sure special symbols and any data are included.
Update: here is what I have tried:
Pattern pinPattern = Pattern.compile("Text=([a-zA-Z0-9 \\E]*);");
ArrayList<String> pins = new ArrayList<String>();
Matcher m = pinPattern.matcher(soapObject.toString());
while (m.find()) {
pins.add(m.group(1));
s[i] = m.group(1);
}
Log.i("TAG", "ARRAY=>"+ s[i]);

I suggest a regex like this:
Text=.*?;
For example, the match returned for the last string would be
Text=(iphone);
You can then strip the leading Text= and the trailing ; from the match, since you want only the content.
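A capturing group can do the stripping in the same step. The following is a minimal sketch, assuming the value never contains a semicolon itself; the class name TextFieldExtractor and the method extractTexts are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TextFieldExtractor {
    // Lazy match: capture everything after "Text=" up to the first semicolon.
    private static final Pattern TEXT = Pattern.compile("Text=(.*?);");

    // Hypothetical helper: returns the captured Text values in order of appearance.
    static List<String> extractTexts(String input) {
        List<String> texts = new ArrayList<>();
        Matcher m = TEXT.matcher(input);
        while (m.find()) {
            texts.add(m.group(1)); // group(1) excludes "Text=" and ";"
        }
        return texts;
    }

    public static void main(String[] args) {
        String input = "TTextRecordByLanguage{Text=Hello World, CaribPayActivity!; Id=2; }"
                + "TTextRecordByLanguage{Text=(iphone); Id=4; }";
        for (String text : extractTexts(input)) {
            System.out.println(text);
        }
    }
}
```

Because .*? accepts any character except a line terminator, special symbols such as ( ) , and ! are kept; a value spanning several lines would additionally need Pattern.DOTALL.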

Related

Extracting digits in the middle of a string using delimiters

String ccToken = "";
String result = "ssl_transaction_type=CCGETTOKENssl_result=0ssl_token=4366738602809990ssl_card_number=41**********9990ssl_token_response=SUCCESS";
String[] elavonResponse = result.split("=|ssl");
for (String t : elavonResponse) {
System.out.println(t);
}
ccToken = (elavonResponse[6]);
System.out.println(ccToken);
I want to be able to grab a specific part of a string and store it in a variable. The way I'm currently doing it is by splitting the string and then storing the value of the array element in my variable. Is there a way to specify that I want to store the digits after "ssl_token="?
I want my code to obtain the value of ssl_token without having to worry about changes in the string that are unrelated to the token, since I won't have control over the string. I have searched online but can't find answers for my specific problem, or I may be using the wrong search terms.
You can use replaceAll with this regex .*ssl_token=(\\d+).* :
String number = result.replaceAll(".*ssl_token=(\\d+).*", "$1");
Outputs
4366738602809990
You can do it with regex. It would probably be better to change the specifications of the input string so that each key/value pair is separated by an ampersand (&) so you could split it (similar to HTTP POST parameters).
Pattern p = Pattern.compile(".*ssl_token=([0-9]+).*");
Matcher m = p.matcher(result);
if(m.matches()) {
long token = Long.parseLong(m.group(1));
System.out.println(String.format("token: [%d]", token));
} else {
System.out.println("token not found");
}
Find the index of ssl_token= in the string and take the substring that starts right after it. Then convert the leading digits of that substring to a number; this works because the number sits at the very beginning of the substring.
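The index-and-substring idea can be sketched without any regex. This is only an illustration under the assumption that the token consists of decimal digits; the class name TokenFinder and the method findToken are hypothetical.

```java
class TokenFinder {
    // Hypothetical helper: returns the digits that follow "ssl_token=",
    // or null when the key is not present in the response.
    static String findToken(String response) {
        String key = "ssl_token=";
        int start = response.indexOf(key);
        if (start < 0) {
            return null;
        }
        start += key.length();
        int end = start;
        // Advance while the characters are digits; stop at the next field name.
        while (end < response.length() && Character.isDigit(response.charAt(end))) {
            end++;
        }
        return response.substring(start, end);
    }

    public static void main(String[] args) {
        String result = "ssl_transaction_type=CCGETTOKENssl_result=0ssl_token=4366738602809990"
                + "ssl_card_number=41**********9990ssl_token_response=SUCCESS";
        System.out.println(findToken(result));
    }
}
```

Searching for the key together with its = sign keeps ssl_token_response from being mistaken for the token field.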

regex for splitting key value-pair containing comma

I need a regex to split key-value pairs. Key and value are separated by =.
Values can contain a comma (,), but if they do, they must be enclosed in double quotes (""). A quoted value can itself contain multiple inner quoted strings ("") with commas in them, so multiple levels of nesting with (" , ") are possible.
A key can be anything except a comma (,), an equals sign (=), or a double quote (").
Example- abc="hi my name is "ayush,nigam"",def="i live at "bangalore",ghi=bangalore is in karnataka,jkl="i am from UP"
Another example - "ayush="piyush="abc,def",bce="asb,dsa"",aman=nigam"
I expect output as ayush="piyush="abc,def",bce="asb,dsa"" and aman=nigam
I am using the following regex code in java.
Pattern abc=Pattern.compile("([^=,]*)=((?:\"[^\"]*\"|[^,\"])*)");
String text2="AssemblyName=(foo.dll),ClassName=\"SomeClassanotherClass=\"a,b\"\"";
Matcher m=abc.matcher(text2);
while(m.find()) {
String kvPair = m.group();
System.out.println(kvPair);
}
I am getting the following kvPair values:
AssemblyName=(foo.dll)
ClassName="SomeClassanotherClass="a
Whereas I need to get:
AssemblyName=(foo.dll)
ClassName="SomeClassanotherClass="a,b"
Hence the commas (,) inside inner double quotes ("") are not being parsed properly. Please help.

Parse out specific characters from java string

I have been trying to drop specific values from a String holding JDBC query results and column metadata. The format of the output is:
[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]
I am trying to get it into the following format:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
I have tried just dropping everything before the "=", but some of the "someVal" data has "=" in them. Is there any efficient way to solve this issue?
Below is the code I used:
for(int i = 0; i < finalResult.size(); i+=modval) {
String resulttemp = finalResult.get(i).toString();
String [] parts = resulttemp.split(",");
//below is only for
for(int z = 0; z < columnHeaders.size(); z++) {
String replaced ="";
replaced = parts[z].replace("*=", "");
System.out.println("Replaced: " + replaced);
}
}
You don't need any splitting here!
You can use replaceAll() and the power of regular expressions to simply replace all occurrences of those unwanted characters, like in:
someString.replaceAll("[\\[\\]\\{\\}]", "")
When you apply that to your strings, the resulting string should exactly look like required.
You could use a regular expression to replace the square and curly brackets like this [\[\]{}]
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
System.out.println(s.replaceAll("[\\[\\]{}]", ""));
That would produce the following output:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
which is what you expect in your post.
A better approach, however, might be to match instead of replace, if you know the character set that will appear in the position of 'someValue'. Then you can design a regex that matches this particular string in such a way that, no matter what separates I_Col1=someValue1 from the rest of the String, you will be able to extract it :-)
EDIT:
With regard to the matching approach, given that the value following I_Col1= consists only of word characters (letters in either case, digits, and _), you could use this pattern: (I_Col\d=\w+),?
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
Matcher m = Pattern.compile("(I_Col\\d=\\w+),?").matcher(s);
while (m.find())
System.out.println(m.group(1));
This will produce:
I_Col1=someValue1
I_Col2=someVal2
I_Col3=someVal3
You could do four calls to replaceAll on the string.
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\{", "").replaceAll("\\}", "").replaceAll("\\]", "").replaceAll("\\[", "");
Or you could use a single regexp if you want the code to be more understandable.
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\[|\\]|\\{|\\}", "");

Regular expression matching issue with the following scenario

I am developing an application. The user enters some setting values on the server. When I request the values from the server through the built-in API, I get them as a single string,
for example:
name={abc};display={xyz};addressname={123}
Here the properties are name, display and addressname, and their respective values are abc, xyz and 123.
I used to split with ; as the first delimiter and = as the second delimiter.
String[] propertyValues=iPropertiesStrings.split(";");
for(int i=0;i<propertyValues.length;i++)
{
if(isNullEmpty(propertyValues[i]))
continue;
String[] propertyValue=propertyValues[i].split("=");
if(propertyValue.length!=2)
mPropertyValues.put(propertyValue[0], "");
else
mPropertyValues.put(propertyValue[0], propertyValue[1]);
}
}
Here mPropertyValues is a hash map used to keep each property name and its value.
The problem is that the string can also look like this:
case 1: name={abc};display={ xyz=deno; demo2=pol };addressname={123}
case 2: name=;display={ xyz=deno; demo2=pol };addressname={123}
I want hashmap to be filled with :
case 1:
name ="abc"
display = "xyz= demo; demo2 =pol"
addressname = "123"
for case 2:
name =""
display = "xyz= demo; demo2 =pol"
addressname = "123"
I am looking for a regular expression to split these strings.
Assuming that there can't be nested {} this should do what you need
String data = "name=;display={ xyz=deno; demo2=pol };addressname={123}";
Pattern p = Pattern.compile("(?<name>\\w+)=(\\{(?<value>[^}]*)\\})?(;|$)");
Matcher m = p.matcher(data);
while (m.find()){
System.out.println(m.group("name")+"->"+(m.group("value")==null?"":m.group("value").trim()));
}
Output:
name->
display->xyz=deno; demo2=pol
addressname->123
Explanation
(?<name>\\w+)=(\\{(?<value>[^}]*)\\})?(;|$) can be split into parts where
(?<name>\\w+)= represents XXXX= and places XXXX in the group named name (the property name).
(\\{(?<value>[^}]*)\\})? is the optional {XXXX} part, where X can't be }. It also places the XXXX part in the group named value.
(;|$) represents ; or the end of the data (the $ anchor), since each pair has the form name=value; or, for the pair at the end of the data, name=value.
The following regex matches input in which all three properties have a braced value, and uses named capturing groups to get the three values you need.
name=\{(?<name>[^}]+)\};display=\{(?<display>[^}]+)\};addressname=\{(?<address>[^}]+)\}
Assuming your dataset can change, a better parser may be more dynamic, building a Map from whatever is found in that return type.
The regex for this is pretty simple, given the cases you list above (and no nesting of {}, as others have mentioned):
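Matcher m = Pattern.compile("(\\w+)=(?:\\{(.*?)\\})?").matcher(source_string);
while (m.find()) {
// groupCount() is always 2 for this pattern, so no check is needed here;
// group(2) is null when the optional {...} part is absent
hashMap.put(m.group(1), m.group(2));
}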
There are, however, considerations to this:
If the optional {...} part is absent, m.group(2) returns null, so null will be stored as the value (you can adjust that to be what you want with a tiny amount of logic).
This will account for varying data-sets - in case your data in the future changes.
What that regex does:
(\\w+) - This looks for one or more word characters in a row ([A-Za-z0-9_]) and puts them into a "capture group" (group(1)).
= - The literal equals sign.
(?:...)? - This makes the grouping a non-capturing group (it will not be a .group(n)), and the trailing ? makes it an optional grouping.
\\{(.*?)\\} - This looks for anything between the literals { and } (note: if a stray } is in there, this will break). If this section exists, the contents between {} will be in the second "capture group" (.group(2)).
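The "tiny amount of logic" for the missing value can be sketched as follows, assuming an empty string is wanted instead of null and that surrounding whitespace should be trimmed. The class name PropertyParser and the method parse are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PropertyParser {
    // Same pattern as above: a word key, "=", and an optional {...} value.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(?:\\{(.*?)\\})?");

    // Hypothetical helper: builds a map from whatever key/value pairs are found.
    static Map<String, String> parse(String source) {
        Map<String, String> result = new LinkedHashMap<>(); // keeps insertion order
        Matcher m = PAIR.matcher(source);
        while (m.find()) {
            String value = m.group(2); // null when the {...} part is absent
            result.put(m.group(1), value == null ? "" : value.trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String data = "name=;display={ xyz=deno; demo2=pol };addressname={123}";
        for (Map.Entry<String, String> e : parse(data).entrySet()) {
            System.out.println(e.getKey() + "->" + e.getValue());
        }
    }
}
```

The inner xyz=deno is not reported as a separate pair because the match for display already consumes everything up to the closing brace.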

I want to perform a split() on a string using a regex in Java, but would like to keep the delimited tokens in the array [duplicate]

How can I format my regex to allow this?
Here's the regular expression:
"\\b[(\\w'\\-)&&[^0-9]]{4,}\\b"
It's looking for any word that is 4 letters or longer.
If I want to split, say, an article, I want an array that includes all the delimited values, plus all the values between them, all in the order that they originally appeared in. So, for example, if I want to split the following sentence: "I need to purchase a new vehicle. I would prefer a BMW.", my desired result from the split would be the following, where the words of 4 or more letters are the delimiters.
"I ", "need", " to ", "purchase", " a new ", "vehicle", ". I ", "would", " ", "prefer", " a BMW."
So, all words with 4 or more characters are one token each, while everything in between each delimited value is also a single token (even if it is multiple words with whitespace). I will only be modifying the delimited values and would like to keep everything else the same, including whitespace, new lines, etc.
I read in a different thread that I could use a lookaround to get this to work, but I can't seem to format it correctly. Is it even possible to get this to work the way I'd like?
I am not sure what you are trying to do, but in case you want to modify words that have at least four letters, you can use something like this (it changes words with 4 or more letters to their upper-case version):
String data = "I need to purchase a new vehicle. I would prefer a BMW.";
Pattern patter = Pattern.compile("(?<![a-z\\-_'])[a-z\\-_']{4,}(?![a-z\\-_'])",
Pattern.CASE_INSENSITIVE);
Matcher matcher = patter.matcher(data);
StringBuffer sb = new StringBuffer(); // holds the new version of our data
while (matcher.find()) { // find every word and replace it with its upper-case version
matcher.appendReplacement(sb, matcher.group().toUpperCase());
}
matcher.appendTail(sb); // don't forget the part after the last match
System.out.println(sb);
Output:
I NEED to PURCHASE a new VEHICLE. I WOULD PREFER a BMW.
OR if you change replacing code to something like
matcher.appendReplacement(sb, "["+matcher.group()+"]");
you will get
I [need] to [purchase] a new [vehicle]. I [would] [prefer] a BMW.
Now you can split that string on every [ and ] to get your desired array.
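The intermediate bracketed string can also be skipped by collecting the tokens directly from the match positions. This is a sketch of that variation, not part of the answer's original code; it reuses the same pattern, and the class name TokenSplit and the method splitKeepingWords are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenSplit {
    // Same word pattern as above: 4 or more letters, hyphens, underscores or apostrophes.
    private static final Pattern WORD = Pattern.compile(
            "(?<![a-z\\-_'])[a-z\\-_']{4,}(?![a-z\\-_'])", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the matched words and the text between them, in order.
    static List<String> splitKeepingWords(String data) {
        List<String> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(data);
        int last = 0; // end of the previous match
        while (m.find()) {
            if (m.start() > last) {
                tokens.add(data.substring(last, m.start())); // text between two words
            }
            tokens.add(m.group()); // the word itself
            last = m.end();
        }
        if (last < data.length()) {
            tokens.add(data.substring(last)); // part after the last match
        }
        return tokens;
    }

    public static void main(String[] args) {
        String data = "I need to purchase a new vehicle. I would prefer a BMW.";
        for (String token : splitKeepingWords(data)) {
            System.out.print("[" + token + "]");
        }
        System.out.println();
    }
}
```

Working from start() and end() avoids a problem with the bracket approach when the text itself already contains [ or ].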
Assuming that "word" is defined as [A-Za-z], you can use this regex:
(?<=(\\b[A-Za-z]{4,50}\\b))|(?=(\\b[A-Za-z]{4,50}\\b))
Full code:
class RegexSplit{
public static void main(String[] args){
String str = "I need to purchase a new vehicle. I would prefer a BMW.";
String[] tokens = str.split("(?<=(\\b[A-Za-z]{4,50}\\b))|(?=(\\b[A-Za-z]{4,50}\\b))");
for(String token: tokens){
System.out.print("["+token+"]");
}
System.out.println();
}
}
to get this output:
[I ][need][ to ][purchase][ a new ][vehicle][. I ][would][ ][prefer][ a BMW.]
