Extracting digits in the middle of a string using delimiters - java

String ccToken = "";
String result = "ssl_transaction_type=CCGETTOKENssl_result=0ssl_token=4366738602809990ssl_card_number=41**********9990ssl_token_response=SUCCESS";
String[] elavonResponse = result.split("=|ssl");
for (String t : elavonResponse) {
System.out.println(t);
}
ccToken = (elavonResponse[6]);
System.out.println(ccToken);
I want to grab a specific part of a string and store it in a variable. Currently I split the string and then store the value of one array cell in my variable. Is there a way to specify that I want the digits after "ssl_token="?
I want my code to obtain the value of ssl_token without depending on unrelated changes elsewhere in the string, since I won't have control over the string. I have searched online but can't find answers for my specific problem, or I may be using the wrong search terms.

You can use replaceAll with this regex .*ssl_token=(\\d+).* :
String number = result.replaceAll(".*ssl_token=(\\d+).*", "$1");
Outputs
4366738602809990

You can do it with regex. It would probably be better to change the specifications of the input string so that each key/value pair is separated by an ampersand (&) so you could split it (similar to HTTP POST parameters).
Pattern p = Pattern.compile(".*ssl_token=([0-9]+).*");
Matcher m = p.matcher(result);
if(m.matches()) {
long token = Long.parseLong(m.group(1));
System.out.println(String.format("token: [%d]", token));
} else {
System.out.println("token not found");
}

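Find the index of "ssl_token=" with indexOf. Take the substring that starts right after it. Then read the run of digits at the beginning of that substring and convert it to a number. Note that Long.parseLong does not stop at the first non-digit character, so the digits have to be cut out before parsing.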
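A minimal sketch of those steps, assuming the token is always a run of digits directly after "ssl_token=". The class name TokenExtractor and the method extractToken are hypothetical names chosen for illustration. The token is returned as text and only converted with Long.parseLong at the call site, which assumes it fits in a long.

```java
import java.util.Optional;

class TokenExtractor {
    // Hypothetical helper: returns the digits that directly follow "ssl_token=", if any.
    static Optional<String> extractToken(String response) {
        String key = "ssl_token=";
        int start = response.indexOf(key);
        if (start < 0) {
            return Optional.empty();          // key not present
        }
        start += key.length();                // first character after the key
        int end = start;
        while (end < response.length() && Character.isDigit(response.charAt(end))) {
            end++;                            // advance over the run of digits
        }
        if (end == start) {
            return Optional.empty();          // key present, but no digits follow it
        }
        return Optional.of(response.substring(start, end));
    }

    public static void main(String[] args) {
        String result = "ssl_transaction_type=CCGETTOKENssl_result=0ssl_token=4366738602809990"
                + "ssl_card_number=41**********9990ssl_token_response=SUCCESS";
        // Assumes the token fits in a long; a longer token would throw NumberFormatException.
        long token = Long.parseLong(extractToken(result).orElseThrow(IllegalStateException::new));
        System.out.println(token);
    }
}
```

Searching for the key including its "=" keeps "ssl_token_response=" from being mistaken for the token field.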

Related

Java String to parse with different parameters

Need to parse a string having format like this -
"xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes#123 ProxyPort=1809".
I need to split or parse it so that I get "prod.loacl.com", "test", "tes#123" and "1809" as separate strings; if a parameter such as ProxyPas is not defined, its value should be null.
The IP address xxx.xxx.xxx.xxx should be ignored; it is always prepended.
Should I use split, or some kind of list? Which is the best way to extract this information, and why?
Note: the input string can change except for the ProxyHost parameter; the user may omit ProxyPas, for example.
If you assume that format of the input string will not change, you can do something like this:
string inputString = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes#123 ProxyPort=1809";
string[] eachPart = inputString.Split(' ');
for (int i = 1; i < eachPart.Length; i++) // start at 1 to skip the IP address
{
// Split on the first '=' only, so a value that contains '=' stays intact
string[] partData = eachPart[i].Split(new[] { '=' }, 2);
if (partData.Length < 2) continue; // not a name=value pair
string dataName = partData[0];
string dataValue = partData[1];
// do something with dataName and dataValue
}
However, if input string can change its format you should add some additional logic to this code.
Use regex with groups for this, sample:
var myString = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes123 ProxyPort=1809";
var regex = new Regex(@"ProxyHost=([^\s]+) ProxyUser=([^\s]+) ProxyPas=([^\s]+) ProxyPort=(\d+)");
var match = regex.Match(myString);
while(match != null && match.Success)
{
int i = 0;
foreach(var group in match.Groups)
{
Console.WriteLine($"Group {i}: Value:'{group}'");
i++;
}
match = match.NextMatch();
}
Now you can map the groups to your properties.
One of the possible approaches is to do this Regular Expression:
([^=]+?)\=((\"[^"]+?\")|([^ ]+))
on the whole string. This allows variable input like this:
variable="this has spaces but still is recognized as one"
The drawback is that, according to online regex testers, the value ends up in either the 3rd or the 4th group of each match, depending on whether it is quoted or a single unquoted token. There is probably a more elegant way to do this, but I can't come up with one right now.
You can check this document to understand more about C#'s regexp groups:
Match.Groups
You will have to deal with null inputs accordingly, when you are putting the content into your C# variable.
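Since the question title mentions Java, here is a rough Java sketch of the same group-based idea. It assumes every parameter has the form name=value with no spaces inside the value; the class name ProxySettings and the method parse are hypothetical. Collecting the pairs in a Map means a parameter that was not supplied simply comes back as null from get.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ProxySettings {
    // A name made of word characters, '=', then a value made of non-space characters.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(\\S+)");

    // Hypothetical helper: collects every name=value pair found in the input.
    static Map<String, String> parse(String input) {
        Map<String, String> values = new HashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            values.put(m.group(1), m.group(2));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> v =
                parse("xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPort=1809");
        System.out.println(v.get("ProxyHost"));
        System.out.println(v.get("ProxyPas")); // null, because ProxyPas was not supplied
    }
}
```

The leading IP address contains no "=", so it never matches and does not need to be skipped explicitly.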

Parse out specific characters from java string

I have been trying to drop specific values from a String holding JDBC query results and column metadata. The format of the output is:
[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]
I am trying to get it into the following format:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
I have tried just dropping everything before the "=", but some of the "someVal" data has "=" in them. Is there any efficient way to solve this issue?
Below is the code I used:
for(int i = 0; i < finalResult.size(); i+=modval) {
String resulttemp = finalResult.get(i).toString();
String [] parts = resulttemp.split(",");
//below is only for
for(int z = 0; z < columnHeaders.size(); z++) {
String replaced ="";
replaced = parts[z].replace("*=", "");
System.out.println("Replaced: " + replaced);
}
}
You don't need any splitting here!
You can use replaceAll() and the power of regular expressions to simply replace all occurrences of those unwanted characters, like in:
someString.replaceAll("[\\[\\]\\{\\}]", "")
When you apply that to your strings, the result should look exactly as required.
You could use a regular expression to replace the square and curly brackets like this [\[\]{}]
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
System.out.println(s.replaceAll("[\\[\\]{}]", ""));
That would produce the following output:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
which is what you expect in your post.
A better approach, however, might be to match instead of replace, if you know the character set that can appear in the position of 'someValue'. Then you can design a regex that matches this particular part of the string, so that no matter what separates I_Col1=someValue1 from the rest of the String, you will be able to extract it :-)
EDIT:
With regard to the matching approach, given that the value following I_Col1= consists only of word characters (letters, digits and _, which is what \w matches), you could use this pattern: (I_Col\d=\w+),?
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
Matcher m = Pattern.compile("(I_Col\\d=\\w+),?").matcher(s);
while (m.find())
System.out.println(m.group(1));
This will produce:
I_Col1=someValue1
I_Col2=someVal2
I_Col3=someVal3
You could do four calls to replaceAll on the string.
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\{", "").replaceAll("\\}", "").replaceAll("\\]", "").replaceAll("\\[", "");
Or you could use a regexp if you want the code to be more understandable.
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\[|\\]|\\{|\\}", "");

Extracting Twitter username from a given text (JAVA, Regex)

I believe the code is OK, the problem is the regex.
Basically I want to find a username mention (it starts with #), and then I want to extract the allowed username part from the given word.
For example if the text contains "#FOO!!" I want to extract only "foo", but I believe the problem is with my "split("[a-z0-9-_]+")[0]" part.
Btw, allowed symbols are numbers, letters, - and _
public static Set<String> getMentionedUsers(List<Tweet> tweets) {
Set<String> mentioned = new HashSet<>();
for (Tweet tweet : tweets) {
String tweetToAnal = null;
if (tweet.getText().contains("#")) tweetToAnal = tweet.getText();
if (tweetToAnal == null) continue;
String[] splited = tweetToAnal.split("\\s+");
for (String elem : splited) {
String newElem = "";
if (elem.startsWith("#")) {
newElem = elem.substring(1).toLowerCase().split("[a-z0-9-_]+")[0];
}
if (newElem.length() > 0) mentioned.add(newElem);
}
}
return mentioned;
}
The problem is not your regex but your logic.
You are using the lines below to analyze usernames:
if (elem.startsWith("#")) {
newElem = elem.substring(1).toLowerCase().split("[a-z0-9-_]+")[0];
}
If you step through your code in a debugger, you will notice that substring(1) consumes the # and that split then uses your regex as the delimiter, so it consumes exactly the characters you wanted to keep. You don't want split to consume those characters; you want to capture them.
So you can keep using split, but with the negated version of your character class:
split("[^a-z0-9-_]+")
Note the ^ at the start of the character class: it negates the class, so split now cuts at the first character that is not allowed in a username.
On the other hand, instead of splitting the whole text in multiple tokens to further be analyzed, you can use a regex with capturing group and then grab the username you want. So, instead of having this code:
String[] splited = tweetToAnal.split("\\s+");
for (String elem : splited) {
String newElem = "";
if (elem.startsWith("#")) {
newElem = elem.substring(1).toLowerCase().split("[a-z0-9-_]+")[0];
}
if (newElem.length() > 0) mentioned.add(newElem);
You can use much simpler code like this:
// Capture usernames preceded by #; \w covers letters, digits and _, and - is added explicitly
Matcher m = Pattern.compile("(?<=#)([\\w-]+)").matcher(tweetToAnal);
while (m.find()) {
// Store each username without the #, lower-cased as in the original code
mentioned.add(m.group(1).toLowerCase());
}
By the way, I didn't test the code, so it may contain a typo, but the idea should be clear.
I'm not a Java person, but you can easily match Twitter usernames without the "#" using the following regex:
(?<=#)[\w-]+
Of course you would need to escape special characters properly; since I don't know Java, you would have to do that and the actual matching yourself.
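For reference, a sketch of how that regex might be written and applied in Java, where each backslash in a string literal has to be doubled. The class name MentionExtractor and the method mentions are hypothetical, and lower-casing the result is an assumption taken from the question's "#FOO!!" to "foo" example.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MentionExtractor {
    // (?<=#) requires a preceding '#'; [\w-]+ then matches letters, digits, '_' and '-'.
    private static final Pattern MENTION = Pattern.compile("(?<=#)[\\w-]+");

    // Hypothetical helper: returns the lower-cased usernames mentioned in the text.
    static Set<String> mentions(String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = MENTION.matcher(text);
        while (m.find()) {
            found.add(m.group().toLowerCase(Locale.ROOT));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(mentions("hello #FOO!! and #bar_1-x."));
    }
}
```

This sketch also matches a "#" in the middle of a word, such as "a#b"; whether that should count as a mention depends on the rules you want to enforce.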

Filter and find integers in a String with Regex

I have this long string:
String responseData = "fker.phone.bash,0,0,0"
+ "fker.phone.bash,0,0,0"
+ "fker.phone.bash,2,0,0";
What I want to do is to extract the integers in this string. I have successfully done that with this code:
String pattern = "(\\d+)";
// this pattern finds EVERY integer. I only want the integers after the comma
Pattern pr = Pattern.compile(pattern);
Matcher match = pr.matcher(responseData);
while (match.find()) {
System.out.println(match.group());
}
So far it is working, but I want to make my regex more robust because the responseData I get is dynamic. Sometimes I might get an integer in the middle of the string, but I only want the trailing integers, meaning the ones after a comma.
I know the regex for "starts with" is ^ and that I have to include the comma character in the pattern, but I don't know how to piece it all together, which is why I am asking for help. Thank you.
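String pattern = "(,)(\\d+)";
Then get the second group. The + has to be inside the parentheses; with (\\d)+ the group would hold only the last digit of each number.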
You can use positive lookbehind for that:
String pattern = "(?<=,)\\d+";
You don't need to extract any groups with that solution, because a lookbehind is a zero-length assertion: the comma is checked but is not part of the match.
You can simply use the following and find by match.group(1):
String pattern = ",(\\d+)";
You can also use word boundaries to get independent numbers:
String pattern = "\\b(\\d+)\\b";
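A small sketch applying the lookbehind pattern to the sample data from the question. The class name CommaIntegers and the method afterCommas are hypothetical, and Integer.parseInt assumes each number fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaIntegers {
    // Digits that are directly preceded by a comma; the comma itself is not part of the match.
    private static final Pattern AFTER_COMMA = Pattern.compile("(?<=,)\\d+");

    // Hypothetical helper: returns every integer that follows a comma.
    static List<Integer> afterCommas(String data) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = AFTER_COMMA.matcher(data);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String responseData = "fker.phone.bash,0,0,0"
                + "fker.phone.bash,0,0,0"
                + "fker.phone.bash,2,0,0";
        System.out.println(afterCommas(responseData)); // [0, 0, 0, 0, 0, 0, 2, 0, 0]
    }
}
```

One thing to watch with the word-boundary variant on this particular sample: the records are concatenated without a separator, so the last digit of a record touches the "f" of the next one, and there is no word boundary between them. A \b-based pattern would skip those digits.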

Get specific value from string using split function

I have a String like this:
APIKey testapikey=mysecretkey
I want to get mysecretkey into a String variable.
What I tried is below:
String[] couple = string.split(" ");
String[] values=couple[1].split("=");
String mykey= values[1];
Is this the right way?
You could use the String.replaceAll(...) method.
String string = "APIKey testapikey=mysecretkey";
// [.*key=] - match the substring ending with "key="
// [(.*)] - match everything after the "key=" and group the matched characters
// [$1] - replace the matched string by the value of capturing group number 1
string = string.replaceAll(".*key=(.*)", "$1");
System.out.println(string);
Don't use split(); you would be creating an array of Strings unnecessarily.
Use String myString = originalString.replaceAll(".*=","");
I think using split here is pretty error prone. A small change in the format of the incoming string (such as a space being added) could result in a bug that's hard to diagnose. My recommendation would be to play it safe and use a regular expression to ensure the text is exactly as you expect:
Pattern pattern = Pattern.compile("APIKey testapikey=(\\w*)");
Matcher matcher = pattern.matcher(apiKeyText);
if (!matcher.matches())
throw new IllegalArgumentException("apiKey does not match pattern");
String apiKey = matcher.group(1); // group(1) is the captured key; group() would return the whole match
That code documents your intentions much better than split does, and it picks up unexpected changes in format. The only possible downside is performance, but assuming you make the pattern a static final (so it is compiled only once), I very much doubt it will be an issue unless you are calling this millions of times.
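A sketch of the static final arrangement mentioned above, so the pattern is compiled once when the class is loaded. The class name ApiKeyParser and the method parse are hypothetical; the sketch uses \w+ rather than \w* on the assumption that an empty key should be rejected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ApiKeyParser {
    // Compiled once and reused for every call.
    private static final Pattern API_KEY = Pattern.compile("APIKey testapikey=(\\w+)");

    // Hypothetical helper: returns the key, or throws if the text has an unexpected format.
    static String parse(String apiKeyText) {
        Matcher matcher = API_KEY.matcher(apiKeyText);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("apiKey does not match pattern");
        }
        return matcher.group(1); // the captured key, not the whole match
    }

    public static void main(String[] args) {
        System.out.println(parse("APIKey testapikey=mysecretkey")); // mysecretkey
    }
}
```

Because matches() requires the whole input to fit the pattern, an extra space or a trailing character fails immediately with an exception instead of yielding a wrong key.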
