Java String to parse with different parameters

I need to parse a string with the following format:
"xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes#123 ProxyPort=1809"
I need to split or parse it so that I get "prod.loacl.com", "test", "tes#123" and "1809" as separate strings. If any parameter is not defined (for example ProxyPas), its value should be null.
The IP address xxx.xxx.xxx.xxx should be ignored; it is always present at the start of the string.
Should I use split, or some kind of list? What is the best way to extract this information, and why?
Note: apart from the ProxyHost parameter, the input string can change; the user may omit ProxyPas and so on.

If you assume that the format of the input string will not change, you can do something like this:
string inputString = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes#123 ProxyPort=1809";
string[] eachPart = inputString.Split(' ');
for (int i = 1; i < eachPart.Length; i++) // Skip the IP address
{
    string[] partData = eachPart[i].Split('=');
    string dataName = partData[0];
    string dataValue = partData[1];
    // do something with dataName and dataValue
}
However, if the format of the input string can change, you will need some additional logic, for example a check for parts that contain no "=".
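The snippet above is C#. Since the question is about Java, here is a minimal Java sketch of the same split-based idea. It collects the pairs into a map, so a parameter that is missing from the input simply comes back as null from get(). The class name ProxyParams and the method name parse are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProxyParams {

    // Parses "ip key=value key=value ..." into a map; the leading IP token is skipped.
    static Map<String, String> parse(String input) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] tokens = input.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i++) { // index 0 is the IP address
            int eq = tokens[i].indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value token, ignore it
            }
            // Split on the first '=' only, so values may contain '=' themselves.
            params.put(tokens[i].substring(0, eq), tokens[i].substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> p =
                parse("xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPort=1809");
        System.out.println(p.get("ProxyHost")); // prod.loacl.com
        System.out.println(p.get("ProxyPas"));  // null, because it was not supplied
    }
}
```

A map keeps the code independent of the order and number of parameters, which seems to be the main requirement here.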

Use a regex with groups for this. Sample:
var myString = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes123 ProxyPort=1809";
var regex = new Regex(@"ProxyHost=([^\s]+) ProxyUser=([^\s]+) ProxyPas=([^\s]+) ProxyPort=(\d+)");
var match = regex.Match(myString);
while (match != null && match.Success)
{
    int i = 0;
    foreach (var group in match.Groups)
    {
        Console.WriteLine($"Group {i}: Value:'{group}'");
        i++;
    }
    match = match.NextMatch();
}
Now you can map the groups to your properties.
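Note that a single pattern listing all four parameters only matches when every parameter is present and in that order. As a rough Java sketch of a regex approach that tolerates missing parameters, each key can be looked up on its own. The helper valueOf and the class ProxyRegex are hypothetical names, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProxyRegex {

    // Returns the value that follows "key=" or null when the key is absent.
    static String valueOf(String input, String key) {
        // (?<!\S) makes sure the key starts a token, so "Host" does not match inside "ProxyHost".
        Pattern p = Pattern.compile("(?<!\\S)" + Pattern.quote(key) + "=(\\S+)");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPort=1809";
        System.out.println(valueOf(s, "ProxyPort")); // 1809
        System.out.println(valueOf(s, "ProxyPas"));  // null
    }
}
```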

One possible approach is to run this regular expression over the whole string:
([^=]+?)\=((\"[^"]+?\")|([^ ]+))
This allows variable input such as:
variable="this has spaces but still is recognized as one"
The drawback is that, according to online regex testers, the value ends up in either the 3rd or the 4th group of each match, depending on whether it is quoted or a single word. There is probably a more elegant way to do this, but I can't come up with one right now.
To understand more about C#'s regex groups, see the documentation for Match.Groups.
You will have to handle missing (null) values accordingly when you copy the content into your C# variables.
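For comparison, a Java sketch of the quoted-value idea might look like the following. It uses a slightly different pattern than the one above (an assumption on my part, not the answer's exact regex): the key is restricted to word characters and the outer group is non-capturing, so the value is always in group 2 (quoted) or group 3 (unquoted). The Note parameter in the sample input is invented to show a quoted value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedParams {

    // key=value where value is either "quoted text" or a run of non-space characters
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=(?:\"([^\"]*)\"|(\\S+))");

    static Map<String, String> parse(String input) {
        Map<String, String> params = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            // Exactly one of the two value groups participates in each match.
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            params.put(m.group(1), value);
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> p = parse(
                "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com Note=\"this has spaces\" ProxyPort=1809");
        System.out.println(p.get("Note"));      // this has spaces
        System.out.println(p.get("ProxyUser")); // null
    }
}
```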

Related

Extracting digits in the middle of a string using delimiters

String ccToken = "";
String result = "ssl_transaction_type=CCGETTOKENssl_result=0ssl_token=4366738602809990ssl_card_number=41**********9990ssl_token_response=SUCCESS";
String[] elavonResponse = result.split("=|ssl");
for (String t : elavonResponse) {
    System.out.println(t);
}
ccToken = elavonResponse[6];
System.out.println(ccToken);
I want to be able to grab a specific part of a string and store it in a variable. Currently I split the string and then store the value of one array element in my variable. Is there a way to specify that I want to store the digits after "ssl_token="?
I want my code to obtain the value of ssl_token without having to worry about changes in the string that are unrelated to the token, since I won't have control over the string. I have searched online but can't find answers for my specific problem, or I may be using the wrong search terms.

You can use replaceAll with this regex .*ssl_token=(\\d+).* :
String number = result.replaceAll(".*ssl_token=(\\d+).*", "$1");
Outputs
4366738602809990

You can do it with regex. It would probably be better to change the specifications of the input string so that each key/value pair is separated by an ampersand (&) so you could split it (similar to HTTP POST parameters).
Pattern p = Pattern.compile(".*ssl_token=([0-9]+).*");
Matcher m = p.matcher(result);
if (m.matches()) {
    long token = Long.parseLong(m.group(1));
    System.out.println(String.format("token: [%d]", token));
} else {
    System.out.println("token not found");
}

Find the index of "ssl_token=" in the string, take the substring that starts right after it, and convert the leading run of digits in that substring to a number.
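A minimal sketch of those steps, without any regex, could look like this. The names TokenExtractor and extractToken are made up. The digits are collected by hand because Long.parseLong rejects input with trailing non-digit characters.

```java
public class TokenExtractor {

    // Returns the digits that follow "ssl_token=", or null if the marker or digits are missing.
    static String extractToken(String response) {
        String marker = "ssl_token=";
        int start = response.indexOf(marker);
        if (start < 0) {
            return null; // marker not present
        }
        start += marker.length();
        int end = start;
        // Advance over the leading run of ASCII digits.
        while (end < response.length()
                && response.charAt(end) >= '0' && response.charAt(end) <= '9') {
            end++;
        }
        return end > start ? response.substring(start, end) : null;
    }

    public static void main(String[] args) {
        String result = "ssl_transaction_type=CCGETTOKENssl_result=0ssl_token=4366738602809990"
                + "ssl_card_number=41**********9990ssl_token_response=SUCCESS";
        System.out.println(extractToken(result)); // 4366738602809990
    }
}
```

Keeping the token as a String avoids overflow concerns; it can still be passed to Long.parseLong afterwards if a numeric type is really needed.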

Parse out specific characters from java string

I have been trying to drop specific values from a String holding JDBC query results and column metadata. The format of the output is:
[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]
I am trying to get it into the following format:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
I have tried just dropping everything before the "=", but some of the "someVal" data has "=" in them. Is there any efficient way to solve this issue?
Below is the code I used:
for (int i = 0; i < finalResult.size(); i += modval) {
    String resulttemp = finalResult.get(i).toString();
    String[] parts = resulttemp.split(",");
    //below is only for
    for (int z = 0; z < columnHeaders.size(); z++) {
        String replaced = "";
        replaced = parts[z].replace("*=", "");
        System.out.println("Replaced: " + replaced);
    }
}

You don't need any splitting here!
You can use replaceAll() and the power of regular expressions to simply remove all occurrences of those unwanted characters:
someString.replaceAll("[\\[\\]\\{\\}]", "")
When you apply that to your strings, the result should look exactly as required.

You could use a regular expression to remove the square and curly brackets, like this: [\[\]{}]
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
System.out.println(s.replaceAll("[\\[\\]{}]", ""));
That would produce the following output:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
which is what you expect in your post.
A better approach, however, might be to match instead of replace, if you know which characters can appear in the position of 'someValue'. You can then design a regex that matches this particular string in such a way that, no matter what separates I_Col1=someValue1 from the rest of the String, you will be able to extract it :-)
EDIT:
With regard to the matching approach: given that the value following I_Col1= consists only of word characters (letters, digits and _), you could use this pattern: (I_Col\d=\w+),?
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
Matcher m = Pattern.compile("(I_Col\\d=\\w+),?").matcher(s);
while (m.find())
    System.out.println(m.group(1));
This will produce:
I_Col1=someValue1
I_Col2=someVal2
I_Col3=someVal3
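Since the question mentions that some values contain "=", the \w+ pattern would cut such values short. A hedged alternative sketch: strip the brackets, split on commas, and then split each item on the first "=" only. This assumes that values contain no commas, brackets or braces; RowFlattener and toPairs are invented names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowFlattener {

    // Turns "[{a=1, b=2}, {c=3}]" into an ordered map of column name to value.
    static Map<String, String> toPairs(String raw) {
        Map<String, String> pairs = new LinkedHashMap<>();
        String flat = raw.replaceAll("[\\[\\]{}]", ""); // drop [, ], { and }
        for (String item : flat.split(",\\s*")) {
            String[] kv = item.split("=", 2); // limit 2: split on the first '=' only
            if (kv.length == 2) {
                pairs.put(kv[0].trim(), kv[1]);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, String> p = toPairs("[{I_Col1=a=b, I_Col2=someVal2}, {I_Col3=someVal3}]");
        System.out.println(p.get("I_Col1")); // a=b
    }
}
```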

You could do four calls to replaceAll on the string:
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\{", "").replaceAll("\\}", "").replaceAll("\\]", "").replaceAll("\\[", "");
Or you could use a single regexp with alternation if you want the code to be more understandable:
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\[|\\]|\\{|\\}", "");

Replace parts of a string in Java

I need to replace parts of a string by looking up the System properties.
For example, consider the string "It was {var1} beauty killed {var2}".
I need to parse the string and replace every word enclosed in curly braces with its value from the System properties. If System.getProperty() returns null, the placeholder should simply be replaced with an empty string. This is straightforward when I know the variables ahead of time, but the string I need to parse is not known in advance: I don't know how many variables it contains or what their names are. Assuming a simple, well-formed string (no nested braces, every opening brace has a matching close), what is the simplest or most elegant way to replace all the character sequences enclosed in braces?
The only solution I could come up with is to traverse the string from the first character, note the start and end positions of each pair of braces, replace the text between them, and continue until the end of the string. Is there a simpler way to do this?

You can use the braces to break the initial string into substrings, and then replace every other substring.
String[] substituteValues = {"the", "str", "other", "another"};
int substituteValuesIndex = 0;
String test = "Here is {var1} string called {var2}";
// split the string up into substrings
test = test.replaceAll("\\}", "\\{");
String[] splitString = test.split("\\{");
// now sub in your values
for (int k = 1; k < splitString.length; k = k + 2) {
    splitString[k] = substituteValues[substituteValuesIndex];
    substituteValuesIndex++;
}
String result = "";
for (String s : splitString) {
    result = result + s;
}
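The code above substitutes values by position rather than by name. To look each name up in the System properties, as the question asks, one option is a regex with Matcher.appendReplacement. The following is a minimal sketch under the question's assumptions (no nested braces); PropertyTemplate and expand are made-up names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropertyTemplate {

    // Matches {name}, capturing the name; names may not contain braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)\\}");

    static String expand(String template) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = System.getProperty(m.group(1));
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value));
        }
        m.appendTail(out); // copy whatever follows the last placeholder
        return out.toString();
    }

    public static void main(String[] args) {
        System.setProperty("var1", "the");
        System.out.println(expand("It was {var1} beauty killed {var2}"));
    }
}
```

Undefined properties expand to an empty string, so the surrounding spaces are left as they are.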

Extracting Twitter username from a given text (JAVA, Regex)

I believe the code is OK and the problem is the regex.
Basically I want to find a username mention (it starts with #), and then extract the allowed username part from that word.
For example, if the text contains "#FOO!!" I want to extract only "foo". I believe the problem is with the split("[a-z0-9-_]+")[0] part.
By the way, the allowed characters are letters, digits, - and _.
public static Set<String> getMentionedUsers(List<Tweet> tweets) {
    Set<String> mentioned = new HashSet<>();
    for (Tweet tweet : tweets) {
        String tweetToAnal = null;
        if (tweet.getText().contains("#")) tweetToAnal = tweet.getText();
        if (tweetToAnal == null) continue;
        String[] splited = tweetToAnal.split("\\s+");
        for (String elem : splited) {
            String newElem = "";
            if (elem.startsWith("#")) {
                newElem = elem.substring(1).toLowerCase().split("[a-z0-9-_]+")[0];
            }
            if (newElem.length() > 0) mentioned.add(newElem);
        }
    }
    return mentioned;
}

The problem is not in your regex but in your logic.
You are using the lines below to analyze usernames:
if (elem.startsWith("#")) {
    newElem = elem.substring(1).toLowerCase().split("[a-z0-9-_]+")[0];
}
If you step through your code in a debugger, you will notice that substring(1) drops the #, and that split then treats every allowed character as a delimiter, so the username itself is consumed instead of being returned. You don't want split to consume those characters; you want to capture them.
So you can keep using split, but with the negated character class:
split("[^a-z0-9-_]+")
        ^---- Notice the negated character class indicator
On the other hand, instead of splitting the whole text into tokens and analyzing each one, you can use a regex with a capturing group and grab the username directly. So, instead of this code:
String[] splited = tweetToAnal.split("\\s+");
for (String elem : splited) {
    String newElem = "";
    if (elem.startsWith("#")) {
        newElem = elem.substring(1).toLowerCase().split("[a-z0-9-_]+")[0];
    }
    if (newElem.length() > 0) mentioned.add(newElem);
}
you can use much simpler code like this:
Matcher m = Pattern.compile("(?<=#)([\\w-]+)").matcher(tweetToAnal); // Capture usernames preceded by #
while (m.find()) { // Store every username (without the #)
    mentioned.add(m.group(1));
}
By the way, I didn't test the code, so it may contain a typo, but it should convey the idea.
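For reference, a self-contained version of that idea might look like the sketch below. It takes plain strings instead of the question's Tweet objects (a simplification, since the Tweet class is not shown) and lower-cases each match, because the question expects "foo" for "#FOO!!". Mentions and mentionedUsers are made-up names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Mentions {

    // One or more letters, digits, '_' or '-' directly preceded by '#'.
    private static final Pattern MENTION = Pattern.compile("(?<=#)[\\w-]+");

    static Set<String> mentionedUsers(List<String> texts) {
        Set<String> mentioned = new LinkedHashSet<>();
        for (String text : texts) {
            Matcher m = MENTION.matcher(text);
            while (m.find()) {
                mentioned.add(m.group().toLowerCase(Locale.ROOT));
            }
        }
        return mentioned;
    }

    public static void main(String[] args) {
        System.out.println(mentionedUsers(Arrays.asList("hi #FOO!! and #bar_1-x.", "no mention")));
        // [foo, bar_1-x]
    }
}
```

Note that this pattern also matches a # in the middle of a word; if that matters, the lookbehind would need to be tightened.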

I'm not a Java person, but you can easily match Twitter usernames without the "#" using the following regex:
(?<=#)[\w-]+
Of course you would need to escape special characters properly for Java; since I don't know Java, I leave that and the actual matching to you.

string split to string array

I have a string variable Result that contains a string like:
"<field1>text</field1><field2>text</field2> etc.."
I use this code to try to split it:
Result = Result.replace("><", ">|<");
String[] Fields = Result.split("|");
According to many websites, including this one, this should give me an array like this:
Fields[0] = "<field1>text</field1>"
Fields[1] = "<field2>text</field2>"
etc...
But it gives me an array like:
Fields[0] = ""
Fields[1] = "<"
Fields[2] = "f"
Fields[3] = "i"
Fields[4] = "e"
etc..
So, what am I doing wrong?

Your call to split("|") treats | as the regular-expression alternation ("or") operator. On its own it matches the empty string, so the input is split between every character.
You can regex-escape the character to prevent this, or use a different temporary split character altogether.
String[] fields = result.split("\\|");
or
result = result.replace("><", ">~<");
String[] fields = result.split("~");
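Two further variations, sketched here as possibilities rather than as the recommended fix: Pattern.quote builds the escaped pattern for you, and a zero-width lookaround split avoids the temporary separator altogether by splitting exactly between a ">" and the following "<". FieldSplitter and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class FieldSplitter {

    // Splits on a literal '|' without hand-written escaping.
    static String[] splitOnPipe(String s) {
        return s.split(Pattern.quote("|"));
    }

    // Splits at every position that has '>' before it and '<' after it; nothing is consumed.
    static String[] splitFields(String xml) {
        return xml.split("(?<=>)(?=<)");
    }

    public static void main(String[] args) {
        String result = "<field1>text</field1><field2>text</field2>";
        System.out.println(Arrays.toString(splitFields(result)));
        // [<field1>text</field1>, <field2>text</field2>]
    }
}
```

The lookaround version only works as intended for flat markup like the sample; nested elements would be split at every inner boundary as well.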

Try doing
String[] fields = result.split("\\|");
Note that I've used more conventional variable names (they shouldn't start with capital letters).
Remember that the split method takes a regular expression as its argument, and | has a specific meaning in regular expressions, which is why you're not getting what you expected.
Relevant documentation: String.split
