Difficulty splitting string at delimiter and keeping it - java

I have a string that is read in as pairs separated by a comma. However, I do not always want to split at a comma, because the input may contain more than one. For example, the string
(http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b,http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6,file:///tmp/foo/bar/p,d,f.pdf)
is read in all on one line. In this case, I only want to split at the ",h" and nowhere else in the string. After the split, the strings should be:
http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b
http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6
file:///tmp/foo/bar/p,d,f.pdf
The comma inside the first string must be preserved. (I will get rid of the parentheses.) I have looked at this Stack Overflow question, and while it was helpful, it does not split this string correctly. This is in Java. Any help is appreciated.

You can use a regex to do the split. Please see the code snippet below.
String str = "(http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b,http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6)";
String[] strArr = str.split("(,(?=http))");
strArr will then contain one element per URL, because the regex splits only at a comma that is immediately followed by http. The surrounding parentheses still need to be stripped.
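The sample string in the question also contains a file: URL, which a lookahead for http alone will not split off. A minimal sketch that widens the lookahead to several schemes, assuming every entry starts with http:, https:, or file: (the class and method names here are hypothetical):

```java
class UrlListSplitter {
    // Split only at commas that are immediately followed by a URL scheme.
    // Assumption: every entry starts with http:, https:, or file:.
    static String[] splitUrls(String input) {
        String trimmed = input.trim();
        // Drop the surrounding parentheses if both are present.
        if (trimmed.startsWith("(") && trimmed.endsWith(")")) {
            trimmed = trimmed.substring(1, trimmed.length() - 1);
        }
        return trimmed.split(",(?=(?:https?|file):)");
    }

    public static void main(String[] args) {
        String str = "(http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b"
                + ",http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6,file:///tmp/foo/bar/p,d,f.pdf)";
        for (String url : splitUrls(str)) {
            System.out.println(url);
        }
    }
}
```

Commas inside an entry (such as the ones in p,d,f.pdf) are left alone because they are not followed by one of the listed schemes.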

Split on 'http' then re-add it.
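Pseudo-code, made concrete as Java:
String input = "http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b,http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6";
List<String> finalList = new ArrayList<String>();
for (String fixup : input.split("http"))
{
    if (fixup.isEmpty()) continue; // split yields an empty first piece when the input starts with "http"
    if (fixup.endsWith(",")) fixup = fixup.substring(0, fixup.length() - 1); // drop the separating comma
    finalList.add("http" + fixup);
}
finalList should then contain the two URLs.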

Related

Any alternative to String[] when splitting a String?

In making a scanner, I want to split apart the input twice:
First remove any spaces with String.split("\\s+");
Then split the remaining String into chars with String.split("(?!^)");
After removing the spaces, I can't figure out how to build a String that holds the resulting array of parts.
I tried String = String.split(), and that didn't work.
Google didn't help either.
You seem to be overcomplicating this. Why not something as simple as:
// remove spaces
String a = "abc".replace(" ", "");
// to array of chars
char[] chars = a.toCharArray();
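If the input may contain tabs or newlines as well as spaces, a regex-based variant of the same idea is one option. A small sketch, assuming that every whitespace character should be discarded (the class and method names are made up for illustration):

```java
class CharScanner {
    // Remove all whitespace, then expose the remaining characters as an array.
    static char[] toChars(String input) {
        return input.replaceAll("\\s+", "").toCharArray();
    }

    public static void main(String[] args) {
        char[] chars = toChars("a b\tc");
        System.out.println(new String(chars)); // prints "abc"
    }
}
```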

How to get the desired character from the variable sized strings?

I need to extract the name that is attached to the end of each string.
For example
pot-1_Sam
pot-22_Daniel
pot_444_Jack
pot_5434_Bill
I need to get the names from the above strings, i.e. Sam, Daniel, Jack and Bill.
The problem is that if I use substring, the position keeps changing with the length of the number. How can I achieve this using a regex?
Update:
Some strings have 2 underscores, like
pot_US-1_Sam
pot_RUS_444_Jack
Assuming your strings always follow the formats above, you don't need a regex; you can use the lastIndexOf and substring methods.
String result = yourString.substring(yourString.lastIndexOf("_")+1, yourString.length());
Your answer is:
String[] s = new String[4];
s[0] = "pot-1_Sam";
s[1] = "pot-22_Daniel";
s[2] = "pot_444_Jack";
s[3] = "pot_5434_Bill";
ArrayList<String> result = new ArrayList<String>();
for (String value : s) {
String[] splitedArray = value.split("_");
result.add(splitedArray[splitedArray.length-1]);
}
for(String resultingValue : result){
System.out.println(resultingValue);
}
You have 2 options:
1. Use the lastIndexOf method to get the index of the last _ (this assumes that there is no _ in the names you are after). Once you have that index, you can use the substring method to get the part you are after.
2. Use a regular expression. The strings you have shown essentially follow a pattern in which digits are followed by an underscore, which is in turn followed by the word you are after. You can use a regular expression such as \\d+_ (which matches one or more digits followed by an underscore) in combination with the split method. The string you are after will be in the last array position.
Use a string tokenizer based on '_' and get the last element. No need for REGEX.
Or use the split method on the string object like so:
String[] strArray = strValue.split("_");
String lastToken = strArray[strArray.length -1];
String[] s = {
"pot-1_Sam",
"pot-22_Daniel",
"pot_444_Jack",
"pot_5434_Bill"
};
for (String e : s)
System.out.println(e.replaceAll(".*_", ""));
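Since the question asks for a regex, the same "everything after the last underscore" idea can also be sketched with Pattern and Matcher. This assumes, as the answers above do, that the name itself contains no underscore; the class and method names are illustrative only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameExtractor {
    // Capture the run of non-underscore characters at the end of the string.
    private static final Pattern LAST_PART = Pattern.compile("_([^_]+)$");

    // Returns the text after the last underscore, or null if nothing matches.
    static String extractName(String value) {
        Matcher m = LAST_PART.matcher(value);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] samples = {"pot-1_Sam", "pot-22_Daniel", "pot_444_Jack", "pot_US-1_Sam", "pot_RUS_444_Jack"};
        for (String s : samples) {
            System.out.println(extractName(s));
        }
    }
}
```

Returning null for a string without an underscore is a design choice of this sketch; throwing an exception or returning the input unchanged would work just as well.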

Remove characters before a comma in a string

I was wondering what would be the best way to go about removing characters before a comma in a string, as well as removing the comma itself, leaving just the characters after the comma in the string, if the string is represented as 'city,country'.
Thanks in advance
So you want
city,country
to become
country
An easy way to do this is this:
public static void main(String[] args) {
System.out.println("city,country".replaceAll(".*,", ""));
}
This is "greedy" though, meaning it will change
city,state,country
into
country
In your case, you might want it to become
state,country
I couldn't tell from your question.
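If you want "non-greedy" matching that removes only the part up to the first comma, use replaceFirst (replaceAll with the same pattern would strip every comma-terminated piece and still leave just country):
System.out.println("city,state,country".replaceFirst(".*?,", ""));
This will output
state,country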
Check this:
String s="city,country";
System.out.println(s.substring(s.lastIndexOf(',')+1));
I found it faster than .replaceAll(".*,", "")
If you are interested in extracting the data while leaving the original string intact, you should use the split(String regex) method.
String foo = "city,country";
String[] data = foo.split(",");
The data array will now contain strings "city" and "country".
More info is available here: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29
This can be done with a combination of substring and indexOf, using indexOf to determine the position of the (first) comma, and substring to extract a portion of the string relative to that position.
String s = "city,country";
String s2 = s.substring(s.indexOf(",") + 1);
You could also write your own substring-style helper that finds the index of the comma and removes every character up to and including it.
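The answers above differ mainly in whether they cut at the first or the last comma, which only matters when the string holds more than one. A short sketch of both variants using indexOf and lastIndexOf (the helper names are hypothetical):

```java
class CommaTrim {
    // Keep everything after the first comma.
    static String afterFirstComma(String s) {
        return s.substring(s.indexOf(',') + 1);
    }

    // Keep everything after the last comma.
    static String afterLastComma(String s) {
        return s.substring(s.lastIndexOf(',') + 1);
    }

    public static void main(String[] args) {
        System.out.println(afterFirstComma("city,state,country")); // prints "state,country"
        System.out.println(afterLastComma("city,state,country"));  // prints "country"
    }
}
```

When the string has no comma, indexOf and lastIndexOf return -1, so both helpers return the input unchanged.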

split method leaving space in array

{
ArrayList<String> node_array = new ArrayList<String>();
String allValues[] = node.split("[(,)]");
for(String value : allValues){
node_array.add(value);
}
node is a string, for example: (3,4,5,6,3)
For some reason, when I check the contents of the ArrayList, the split seems to leave empty elements behind, specifically where ( and ) are supposed to be. What am I doing wrong?
You're asking split() to split at parentheses and commas. In your string, there is an empty substring right before the first separator, the opening parenthesis. split() keeps that empty substring and returns it as the zeroth element of the resulting array.
There are plenty of examples in the documentation that illustrate how the function works.
To work around this, you can either ignore the empty strings, or flip the regex on its head and match the numbers instead of splitting at the punctuation characters.
You have defined the separator to be any one of a set of characters, and one of them is the first character in your String, so an empty string "" will show up in your ArrayList, because that is what occurs before the first separator. However, for your application you can easily fix it like this:
ArrayList<String> node_array = new ArrayList<String>();
String allValues[] = node.split("[(,)]");
for(String value : allValues){
if(!value.isEmpty()) node_array.add(value);
}
return node_array;
node.replace("(","").replace(")","").split(",");
or
node.substring(1,node.length()-1).split(",");
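The suggestion above to match the numbers rather than split at the punctuation could look roughly like this. It is a sketch that assumes the values are non-negative integers; the class and method names are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NodeParser {
    // Collect every run of digits, ignoring parentheses and commas entirely.
    static List<String> parseNode(String node) {
        List<String> values = new ArrayList<String>();
        Matcher m = Pattern.compile("\\d+").matcher(node);
        while (m.find()) {
            values.add(m.group());
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseNode("(3,4,5,6,3)")); // prints [3, 4, 5, 6, 3]
    }
}
```

Because only matches are collected, no empty strings can end up in the list.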

using regular expression as delimiter with StringTokenizer

I am new to Java programming and came across the StringTokenizer class. The constructor accepts the string to be split and an optional delimiter string, each character of which is treated as an individual delimiter while splitting the original string. I was wondering if there is any way to split the string by passing a regex as the delimiter. For example:
String s="34.5xy32.6y45.7x36xy";
StringTokenizer t=new StringTokenizer(s,"xy");
System.out.println(t.nextToken());
System.out.println(t.nextToken());
The actual output is:
34.5
32.6
However, the desired output is:
34.5
32.6y45.7x36
Hope you guys can help. Also, please suggest some way around if it is not possible with StringTokenizer class.
Thanks in advance.
p.s. Is there any way to know which character the StringTokenizer is currently using as delimiter out of the provided set?
Here you would want to use String.split(), which will give you an array with your desired output.
It takes a regular expression and splits your input around each match; the plain string xy matches only the exact sequence xy. StringTokenizer, by contrast, splits at any one of the characters in the set you give it and does not accept a regular expression.
So you change your code to:
String s="34.5xy32.6y45.7x36xy";
String[] splitString = s.split("xy");
System.out.println(splitString [0]);
System.out.println(splitString [1]);
For more complex examples you will probably also want bounds checking on the array, to make sure you don't go off the end of it.
Try this:
final String SPLIT_STR = "xy";
final String mainStr = "34.5xy32.6y45.7x36xy";
final String[] splitStr = mainStr.split(SPLIT_STR);
System.out.println("First Index Of xy : " +
mainStr.indexOf(SPLIT_STR));
for(int index=0; index < splitStr.length; index++) {
System.out.println("Split : " + splitStr[index]);
}
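On the p.s. in the question: StringTokenizer has a three-argument constructor whose boolean flag makes it return the delimiter characters as tokens too, which is one way to see which character ended each token. Separately, since split takes a regex, wrapping a literal delimiter in Pattern.quote avoids surprises if it contains regex metacharacters. A sketch of both (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class DelimiterDemo {
    // Split around a literal (non-regex) delimiter string.
    static String[] splitLiteral(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter));
    }

    // Tokenize and include each single-character delimiter as its own token.
    static List<String> tokensWithDelims(String s, String delimChars) {
        List<String> tokens = new ArrayList<String>();
        StringTokenizer t = new StringTokenizer(s, delimChars, true);
        while (t.hasMoreTokens()) {
            tokens.add(t.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String part : splitLiteral("34.5xy32.6y45.7x36xy", "xy")) {
            System.out.println(part);
        }
        System.out.println(tokensWithDelims("34.5xy32.6", "xy"));
    }
}
```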
