string split to string array - java

I have a string variable Result that contains a string like:
"<field1>text</field1><field2>text</field2> etc.."
I use this code to try to split it:
Result = Result.replace("><", ">|<");
String[] Fields = Result.split("|");
According to many websites, including this one, this should give me an array like this:
Fields[0] = "<field1>text</field1>"
Fields[1] = "<field2>text</field2>"
etc...
But it gives me an array like:
Fields[0] = ""
Fields[1] = "<"
Fields[2] = "f"
Fields[3] = "i"
Fields[4] = "e"
etc...
So, what am I doing wrong?

Your call to split("|") treats | as the regular-expression alternation ("or") operator. On its own it matches the empty string, so the input is split between every character.
You can regex-escape the character to prevent this, or use a different temporary delimiter that has no special meaning in a regex.
String[] fields = result.split("\\|");
or
result = result.replace("><", ">~<");
String[] fields = result.split("~");

Try doing
String[] fields = result.split("\\|");
Note that I've used more conventional variable names (they shouldn't start with capital letters).
Remember that the split method takes a regular expression as its argument, and | has a special meaning in regular expressions, which is why you're not getting the result you expected.
Relevant documentation:
split
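As a minimal sketch of the escaping fix described above: the class below splits on a literal pipe using Pattern.quote, which has the same effect as writing "\\|" by hand. The class name SplitFields and the helper splitFields are made up for illustration, and the sketch assumes the field text itself never contains a | character.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the question.
class SplitFields {

    // Inserts a temporary '|' between adjacent tags, then splits on it literally.
    static String[] splitFields(String result) {
        String marked = result.replace("><", ">|<");
        // Pattern.quote makes "|" a literal, so it no longer acts as regex alternation.
        return marked.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String result = "<field1>text</field1><field2>text</field2>";
        // prints [<field1>text</field1>, <field2>text</field2>]
        System.out.println(Arrays.toString(splitFields(result)));
    }
}
```

Pattern.quote is convenient when the delimiter comes from a variable, because you don't have to know which characters are regex metacharacters.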

Related

Java String to parse with different parameters

I need to parse a string in a format like this:
"xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes#123 ProxyPort=1809".
I need to split or parse it so that I get "prod.loacl.com", "test", "tes#123" and "1809" as separate strings; if a parameter such as ProxyPas is not defined, its value should be null.
The IP address xxx.xxx.xxx.xxx should be ignored; it will always be concatenated at the front.
Should we use split, or some kind of list, to get this done? Which is the best way to extract this information, and why?
Note: the input string can change, except for the ProxyHost parameter; for example, the user may not supply ProxyPass.
If you assume that the format of the input string will not change, you can do something like this:
string inputString = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes#123 ProxyPort=1809";
string[] eachPart = inputString.Split(' ');
for (int i = 1; i < eachPart.Length; i++) // skip the IP address at index 0
{
    // Split on the first '=' only, so a value that contains '=' stays intact
    string[] partData = eachPart[i].Split(new[] { '=' }, 2);
    if (partData.Length < 2) continue; // not a name=value pair
    string dataName = partData[0];
    string dataValue = partData[1];
    // do something with dataName and dataValue
}
However, if the format of the input string can change, you will need to add some additional logic to this code.
Use a regex with groups for this. Sample:
// requires: using System; using System.Text.RegularExpressions;
var myString = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPas=tes123 ProxyPort=1809";
var regex = new Regex(@"ProxyHost=([^\s]+) ProxyUser=([^\s]+) ProxyPas=([^\s]+) ProxyPort=(\d+)");
var match = regex.Match(myString);
while (match != null && match.Success)
{
    int i = 0;
    foreach (var group in match.Groups)
    {
        Console.WriteLine($"Group {i}: Value:'{group}'");
        i++;
    }
    match = match.NextMatch();
}
Now you can map the groups to your properties. Note that this pattern only matches when all four parameters are present, in this order.
One possible approach is to apply this regular expression:
([^=]+?)\=((\"[^"]+?\")|([^ ]+))
to the whole string. It allows variable input like this:
variable="this has spaces but still is recognized as one"
The drawback is that, according to online regex testers, the value ends up in either the 3rd or the 4th group of each match, depending on whether it is quoted or a single unquoted token. There must be a more elegant way to do this, but I can't come up with one right now.
You can check this document to understand more about C#'s regexp groups:
Match.Groups
You will have to deal with null inputs accordingly, when you are putting the content into your C# variable.
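Since the question title mentions Java while the answers above are written in C#, here is a rough Java sketch of the same split-based idea. It pre-fills every expected parameter with null so that missing ones stay null, as the question asks. The class name ProxySettings, the KEYS array, and the parse helper are hypothetical; the sketch assumes values contain no whitespace and that the first token is always the IP address.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative only.
class ProxySettings {

    // Parameter names as they appear in the sample input from the question.
    static final String[] KEYS = {"ProxyHost", "ProxyUser", "ProxyPas", "ProxyPort"};

    static Map<String, String> parse(String input) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String key : KEYS) {
            values.put(key, null); // stays null unless the key appears in the input
        }
        String[] tokens = input.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i++) { // index 0 is the IP address
            String[] pair = tokens[i].split("=", 2); // split on the first '=' only
            if (pair.length == 2 && values.containsKey(pair[0])) {
                values.put(pair[0], pair[1]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String input = "xxx.xxx.xxx.xxx ProxyHost=prod.loacl.com ProxyUser=test ProxyPort=1809";
        // prints {ProxyHost=prod.loacl.com, ProxyUser=test, ProxyPas=null, ProxyPort=1809}
        System.out.println(parse(input));
    }
}
```

Looking values up by name, rather than by position, means the parameters may appear in any order and any of them may be omitted.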

Replace parts of a string in Java

I need to replace parts of a string by looking up the System properties.
For example, consider the string It was {var1} beauty killed {var2}
I need to parse the string and replace all the words enclosed in braces by looking up their values in the System properties. If System.getProperty() returns null, the placeholder should simply be replaced with an empty string. This is pretty straightforward when I know the variables ahead of time, but the string that I need to parse is not defined in advance: I don't know how many variables are in the string or what their names are. Assuming a simple, well-formed string (no nested braces, every opening brace has a matching closing brace), what is the simplest or most elegant way to parse the string and replace all the character sequences that are enclosed in braces?
The only solution I could come up with is to traverse the string from the first character, note the start and end positions of each pair of braces, replace the string between them, and then continue until reaching the end of the string. Is there a simpler way to do this?
You can use the braces to break the initial string into substrings, and then replace every other substring.
String[] substituteValues = {"the", "str", "other", "another"};
int substituteValuesIndex = 0;
String test = "Here is {var1} string called {var2}";
// split the string up into substrings
test = test.replaceAll("\\}", "\\{");
String[] splitString = test.split("\\{");
// now sub in your values
for (int k = 1; k < splitString.length; k = k + 2) {
    splitString[k] = substituteValues[substituteValuesIndex];
    substituteValuesIndex++;
}
String result = "";
for (String s : splitString) {
    result = result + s;
}
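The snippet above substitutes values from a fixed array. As a sketch of how the System-property lookup from the question might be wired in, the class below uses a different technique, a regex Matcher with appendReplacement, rather than splitting. The class name PropertyTemplate and the method expand are invented for illustration; the sketch assumes placeholders are never nested and never empty.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class PropertyTemplate {

    // Matches {name}, where name is one or more characters other than braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)\\}");

    static String expand(String template) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Falls back to "" when the property is not set.
            String value = System.getProperty(m.group(1), "");
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out); // copy whatever follows the last placeholder
        return out.toString();
    }

    public static void main(String[] args) {
        System.setProperty("var1", "the");
        System.out.println(expand("It was {var1} beauty killed {var2}"));
    }
}
```

Because the matcher finds each placeholder wherever it occurs, the number of variables and their names do not need to be known in advance.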

How to get the desired character from the variable sized strings?

I need to extract the name that is attached to the end of each string.
For example
pot-1_Sam
pot-22_Daniel
pot_444_Jack
pot_5434_Bill
I need to get the names from the above strings, i.e. Sam, Daniel, Jack and Bill.
The problem is that if I use substring, the position keeps changing because the length of the number varies. How can I achieve this using a regex?
Update:
Some strings have two underscores, like
pot_US-1_Sam
pot_RUS_444_Jack
Assuming the strings always follow the formats above, you don't need a regex; you can use the lastIndexOf and substring methods.
String result = yourString.substring(yourString.lastIndexOf("_")+1, yourString.length());
Your answer is:
String[] s = new String[4];
s[0] = "pot-1_Sam";
s[1] = "pot-22_Daniel";
s[2] = "pot_444_Jack";
s[3] = "pot_5434_Bill";
ArrayList<String> result = new ArrayList<String>();
for (String value : s) {
    String[] splitedArray = value.split("_");
    result.add(splitedArray[splitedArray.length - 1]);
}
for (String resultingValue : result) {
    System.out.println(resultingValue);
}
You have 2 options:
Use the lastIndexOf method to get the index of the last _ (this assumes that there is no _ in the names you are after). Once you have that index, you can use the substring method to get the part you are after.
Use a regular expression. The strings you have shown essentially follow a pattern in which digits are followed by an underscore, which is in turn followed by the word you are after. You can use a regular expression such as \\d+_ (which matches one or more digits followed by an underscore) in combination with the split method. The string you are after will be in the last array position.
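The second option could look roughly like the sketch below. The class name NameExtractor and the method nameOf are hypothetical, and the sketch assumes every input ends in digits, an underscore, and a non-empty name.

```java
// Hypothetical helper; names are illustrative only.
class NameExtractor {

    // Splits on "digits followed by underscore" and keeps the last piece.
    static String nameOf(String s) {
        String[] parts = s.split("\\d+_");
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        String[] inputs = {"pot-1_Sam", "pot-22_Daniel", "pot_444_Jack", "pot_RUS_444_Jack"};
        for (String input : inputs) {
            System.out.println(nameOf(input));
        }
    }
}
```

Unlike a plain split on "_", this keeps working if the name itself contains an underscore, as long as that underscore is not preceded by a digit.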
Use a string tokenizer based on '_' and get the last element. No need for REGEX.
Or use the split method on the string object like so :
String[] strArray = strValue.split("_");
String lastToken = strArray[strArray.length -1];
String[] s = {
"pot-1_Sam",
"pot-22_Daniel",
"pot_444_Jack",
"pot_5434_Bill"
};
for (String e : s)
    System.out.println(e.replaceAll(".*_", ""));

How to split string in two parts

I'm retrieving strings from the database and storing each one in a String variable inside a for loop. Some of the strings I'm retrieving are in the form of:
https://www.ppltalent.com/test/en/soln-computers-ltd
and some are in the form of
https://www.ppltalent.com/test/ja/aman-computers-ltd
I want to split each string into two substrings, i.e. split
https://www.ppltalent.com/test/en/soln-computers-ltd into https://www.ppltalent.com/test/en and /soln-computers-ltd.
It could easily be separated if I only had /en:
String[] parts = stringPart.split("/en");
System.out.println("Divided String : "+ parts[1]);
But many of the strings contain /jr, /ch, etc. instead.
So how can I split them in two sub-strings?
You could perhaps use the fact that /en and /ja are both preceded by /test/. So, something like indexOf("/test/") and then substring.
In your examples, it seems like you're interested in the very last part, which could be retrieved by lastIndexOf('/') for instance.
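A minimal sketch of the indexOf("/test/") idea: find the marker, then cut at the next slash after the language code. The class name UrlSplitter and the method splitAfterLanguage are made up for illustration, and the sketch assumes every URL contains /test/ followed by a language code and another path segment.

```java
// Hypothetical helper; names are illustrative only.
class UrlSplitter {

    // Returns {everything up to and including the language code, the rest starting with '/'}.
    static String[] splitAfterLanguage(String url) {
        String marker = "/test/";
        int start = url.indexOf(marker);
        if (start < 0) {
            throw new IllegalArgumentException("no /test/ segment: " + url);
        }
        // The first '/' after the marker ends the language code, whatever its length.
        int cut = url.indexOf('/', start + marker.length());
        if (cut < 0) {
            throw new IllegalArgumentException("nothing after the language code: " + url);
        }
        return new String[] { url.substring(0, cut), url.substring(cut) };
    }

    public static void main(String[] args) {
        String[] parts = splitAfterLanguage("https://www.ppltalent.com/test/ja/aman-computers-ltd");
        System.out.println(parts[0]); // https://www.ppltalent.com/test/ja
        System.out.println(parts[1]); // /aman-computers-ltd
    }
}
```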
Or, using look-arounds you could do
String s1 = "https://www.ppltalent.com/test/en/soln-computers-ltd";
String[] parts = s1.split("(?<=/test/../)");
System.out.println(parts[0]); // https://www.ppltalent.com/test/en/
System.out.println(parts[1]); // soln-computers-ltd
Split on the last /:
String fullUrl = "https://www.ppltalent.com/test/en/soln-computers-ltd";
String baseUrl = fullUrl.substring(0, fullUrl.lastIndexOf("/")); // https://www.ppltalent.com/test/en
String manufacturer = fullUrl.substring(fullUrl.lastIndexOf("/")); // /soln-computers-ltd

Error when splitting a string in java

I am trying to split a string according to a certain set of delimiters.
My delimiters are: ,"():;.!? single spaces or multiple spaces.
This is the code I'm currently using:
String[] arrayOfWords= inputString.split("[\\s{2,}\\,\"\\(\\)\\:\\;\\.\\!\\?-]+");
which works fine for most cases, but I have a problem when the first word is surrounded by quotation marks. For example,
String inputString = "\"Word\" some more text.";
gives me this output:
arrayOfWords[0] = ""
arrayOfWords[1] = "Word"
arrayOfWords[2] = "some"
arrayOfWords[3] = "more"
arrayOfWords[4] = "text"
I want the output to give me an array with
arrayOfWords[0] = "Word"
arrayOfWords[1] = "some"
arrayOfWords[2] = "more"
arrayOfWords[3] = "text"
This code has been working fine when quotation marks are used in the middle of the sentence, I'm not sure what the trouble is when it's at the beginning.
EDIT: I just realized I have same problem when any of the delimiters are used as the first character of the string
Unfortunately, you won't be able to remove this empty first element using split alone. You should first strip any leading delimiters from the string and then split what remains. Your regex also seems to be incorrect, because
by adding {2,} inside [...] you are making the characters {, 2, the comma and } delimiters,
you don't need to escape the rest of your delimiters (note that - needs no escaping only because it is at the end of the character class [], where it can't act as a range operator).
Try it this way:
String regexDelimiters = "[\\s,\"():;.!?\\-]+";
String inputString = "\"Word\" some more text.";
String[] arrayOfWords = inputString.replaceAll(
        "^" + regexDelimiters, "").split(regexDelimiters);
for (String s : arrayOfWords)
    System.out.println("'" + s + "'");
output:
'Word'
'some'
'more'
'text'
A delimiter is interpreted as separating the strings on either side of it, thus the empty string on its left is added to the result as well as the string to its right ("Word"). To prevent this, you should first strip any leading delimiters, as described here:
How to prevent java.lang.String.split() from creating a leading empty string?
So in short form you would have:
String delim = "[\\s,\"():;.!?\\-]+";
String[] arrayOfWords = inputString.replaceFirst("^" + delim, "").split(delim);
Edit: Looking at Pshemo's answer, I realize he is correct regarding your regex. Inside the brackets it's unnecessary to specify the number of space characters, as repeated spaces will be caught by the + operator.
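Another way to avoid the leading empty string, offered here only as a sketch of a different technique from the answers above, is to match the words themselves instead of splitting on the delimiters. The pattern below is the negation of the delimiter class used above; the class name WordFinder and the method words are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class WordFinder {

    // One or more characters that are NOT whitespace or one of the delimiters.
    private static final Pattern WORD = Pattern.compile("[^\\s,\"():;.!?\\-]+");

    static List<String> words(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(input);
        while (m.find()) {
            result.add(m.group()); // each match is a complete word
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [Word, some, more, text]
        System.out.println(words("\"Word\" some more text."));
    }
}
```

Since only non-empty runs of word characters are matched, leading or trailing delimiters never produce empty elements.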
