How to split the string in java? - java

String str = "AlwinX-road-9:00pm-kanchana travels-25365445421";
String[] names = str.split("-");
I want output like following:
AlwinX-road
9:00pm
kanchana travels
25365445421

Use pattern matching to capture each part you need:
String str = "AlwinX-road-9:00pm-kanchana travels-25365445421";
String regex = "(^[A-Z-a-z ]+)[-]+(\\d+:\\d+pm)[-]([a-z]+\\s+[a-z]+)[-](\\d+)";
Matcher matcher = Pattern.compile( regex ).matcher( str);
while (matcher.find( ))
{
String roadname = matcher.group(1);
String time = matcher.group(2);
String travels = matcher.group(3);
String digits= matcher.group(4);
System.out.println("roadname="+roadname);
System.out.println("time="+time);
System.out.println("travels="+travels);
System.out.println("digits="+digits);
}

Since you want to include the delimiter in your first output line, you can do the split and then merge the first two elements with a "-":
String[] names = str.split("-");
System.out.println(names[0] + "-" + names[1]);
for (int i = 2; i < names.length; i++) {
System.out.println(names[i]);
}

The split() method can't distinguish the dash in AlwinX-road and the other dashes in the string, it treats all the dashes the same. You will need to do some sort of post processing on the resulting array. If you will always need the first two strings in the array joined you can just do that. If your strings are more complex you will need to add additional logic to join the strings in the array.
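As a minimal sketch of that post-processing, assuming the first dash always belongs to a two-part identifier, one could locate the second dash with indexOf and split only what follows it. The names FirstDashKeeper and splitKeepingFirstDash are made up for illustration and do not come from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FirstDashKeeper {
    // Hypothetical helper: keeps the first dash and splits on the remaining ones.
    static List<String> splitKeepingFirstDash(String input) {
        int first = input.indexOf('-');
        int second = (first < 0) ? -1 : input.indexOf('-', first + 1);
        List<String> result = new ArrayList<>();
        if (second < 0) {
            // Fewer than two dashes: there is nothing to split off.
            result.add(input);
            return result;
        }
        // Everything before the second dash is the first element, dash included.
        result.add(input.substring(0, second));
        // The remainder is split on every dash as usual.
        result.addAll(Arrays.asList(input.substring(second + 1).split("-")));
        return result;
    }

    public static void main(String[] args) {
        String str = "AlwinX-road-9:00pm-kanchana travels-25365445421";
        for (String part : splitKeepingFirstDash(str)) {
            System.out.println(part);
        }
    }
}
```

For the sample string this prints AlwinX-road, 9:00pm, kanchana travels and 25365445421 on separate lines. If the identifier can contain more than one dash, this assumption no longer holds and a pattern-based approach is probably a better fit.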

One way you could do it, assuming the first '-' is always part of a two part identifier.
String str = "AlwinX-road-9:00pm-kanchana travels-25365445421";
String[] tokens = str.split("-");
String[] output = new String[tokens.length - 1];
output[0] = tokens[0] + '-' + tokens[1];
System.out.println(output[0]);
for(int i = 1; i < output.length; i++){
output[i] = tokens[i+1];
System.out.println(output[i]);
}

Looks like you want to split (with removal of all dashes but the first one).
String str = "AlwinX-road-9:00pm-kanchana travels-25365445421";
String[] names = str.split("-");
for (String value : names)
{
System.out.println(value);
}
This produces:
AlwinX
road
9:00pm
kanchana travels
25365445421
Notice that "AlwinX" and "road" were split as well, since they had a dash in between, so you will need custom logic to handle this case. Here is an example of how to do it (I used StringTokenizer):
StringTokenizer tk = new StringTokenizer(str, "-", true);
String firstString = null;
String secondString = null;
while (tk.hasMoreTokens())
{
final String token = tk.nextToken();
if (firstString == null)
{
firstString = token;
continue;
}
if (secondString == null && firstString != null && !token.equals("-"))
{
secondString = token;
System.out.println(firstString + "-" + secondString);
continue;
}
if (!token.equals("-"))
{
System.out.println(token);
}
}
This will produce:
AlwinX-road
9:00pm
kanchana travels
25365445421

From your format, I think you want to split off the first token just before the time part. You can do it this way:
String str =yourString;
String beforetime=str.split("-\\d+:\\d+[ap]m")[0]; //this is your first token,
//AlwinX-road in your example
String rest=str.substring(beforetime.length()+1);
String[] restNames=rest.split("-");
If you really need it all together in one array then see the code below:
String[] allTogether=new String[restNames.length+1];//the string with all your tokens
allTogether[0]=beforetime;
System.arraycopy(restNames, 0, allTogether, 1, restNames.length);

If you use "_" as the separator instead of "-", the input becomes: AlwinX-road_9:00pm_kanchana travels_25365445421
New code:
String str = "AlwinX-road_9:00pm_kanchana travels_25365445421";
String separator = "_";
String[] names = str.split(separator);
for(int i=0; i<names.length; i++){
System.out.println(names[i]);
}

Related

How to split a String by a comma, but from the second comma

I have a string as:
"model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1"
How can I convert this into:
model=iPhone12,3
os_version=13.6.1
os_update_exist=1
status=1
Split the string on commas, then re-join the first two elements of the resulting string array.
I doubt there's a "clean" way to do this but this would work for your case:
String str = "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1";
String[] sp = str.split(",");
sp[0] += "," + sp[1];
sp[1] = sp[2];
sp[2] = sp[3];
sp[3] = sp[4];
sp[4] = "";
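For comparison, here is a sketch that avoids the manual shifting by splitting only on commas that are immediately followed by a key and an equals sign. It assumes that keys consist of lowercase letters and underscores, which holds for the sample string but is an assumption about the real data. KeyValueSplitter and splitOnKeyCommas are made-up names.

```java
import java.util.Arrays;

class KeyValueSplitter {
    // Hypothetical helper: the lookahead (?=[a-z_]+=) only accepts a comma
    // when a key such as "os_version=" follows it, so the comma inside
    // "iPhone12,3" is left alone.
    static String[] splitOnKeyCommas(String source) {
        return source.split(",(?=[a-z_]+=)");
    }

    public static void main(String[] args) {
        String str = "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1";
        System.out.println(Arrays.toString(splitOnKeyCommas(str)));
    }
}
```

With the sample input the array holds model=iPhone12,3, os_version=13.6.1, os_update_exist=1 and status=1, with no empty trailing slot.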
You can try this:
public String[] splitString(String source) {
// Split the source string based on a comma followed by letters and numbers.
// Basically "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1" will be split
// like this:
// model=iPhone12,3
// ,os_version=13.6.1
// ,os_update_exist=1
// ,status=1"
String[] result = source.split("(?=,[a-z]+\\d*)");
for (int i = 0; i < result.length; i++) {
// Removes the comma at the beginning of the string if present
if (result[i].matches(",.*")) {
result[i] = result[i].substring(1);
}
}
return result;
}
If you are always parsing the same kind of String, a regex like this will do the job:
String str = "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1";
Matcher m = Pattern.compile("model=(.*),os_version=(.*),os_update_exist=(.*),status=(.*)").matcher(str);
if (m.find()) {
String model = m.group(1); // iPhone12,3
String os = m.group(2); // 13.6.1
String update = m.group(3); // 1
String status = m.group(4); // 1
}
If you really want to use split, you can still use this kind of trick:
String[] split = str.replaceAll(".*?=(.*?)(,[a-z]|$)", "$1#")
.split("#");
split[0] // iPhone12,3
split[1] // 13.6.1
split[2] // 1
split[3] // 1

Split and pair substrings by condition

I have a string like this: "aa-bb,ccdd,eeff,gg-gg,cc-gg". I need to split the string by '-' signs and create two strings from it, but if the comma-delimited part of the original string doesn't contain '-', some placeholder character needs to be used instead of substring. In case of above example output should be:
String 1:
"{aa,ccdd,eeff,gg,cc}"
String 2:
"{bb,0,0,gg,gg}"
I can't use the lastIndexOf() method because the input is in one string. I am not sure how to match the parts.
if(rawIndication.contains("-")){
String[] parts = rawIndication.split("-");
String part1 = parts[0];
String part2 = parts[1];
}
Here is a Java 8 solution, using streams. The logic is to first split the input string on commas, generating an array of terms. Then each term is split again on the dash, keeping the entry at index pos (0 for the first part, 1 for the second). If a term has no entry at that index, the placeholder "0" is used instead. Finally, we join the results back into an output string.
String input = "aa-bb,ccdd,eeff,gg-gg,cc-gg";
String[] parts = input.split(",");
int pos = 1;
String output = String.join(",", Arrays.stream(parts)
.map(e -> e.split("-").length >= (pos+1) ? e.split("-")[pos] : "0")
.toArray(String[]::new));
System.out.println(output);
This outputs:
bb,0,0,gg,gg
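To get both requested strings, braces included, the same idea can be wrapped in a small method and called once per position. This is only a sketch: PairSplitter, column and the placeholder parameter are illustrative names, and it assumes each term contains at most one meaningful dash.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class PairSplitter {
    // Hypothetical helper: builds "{...}" from the pos-th half of every term,
    // using the placeholder where a term has no such half.
    static String column(String input, int pos, String placeholder) {
        return Arrays.stream(input.split(","))
                // Limit 2 keeps anything after the first dash in one piece.
                .map(term -> term.split("-", 2))
                .map(halves -> pos < halves.length ? halves[pos] : placeholder)
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        String input = "aa-bb,ccdd,eeff,gg-gg,cc-gg";
        System.out.println(column(input, 0, "0")); // {aa,ccdd,eeff,gg,cc}
        System.out.println(column(input, 1, "0")); // {bb,0,0,gg,gg}
    }
}
```

Splitting each term only once, instead of twice inside the lambda, keeps the mapping step easier to read.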
List<String> list1 = new ArrayList<>();
List<String> list2 = new ArrayList<>();
// The example input from the question
String sourceStr = "aa-bb,ccdd,eeff,gg-gg,cc-gg";
// First split the source String by comma to separate main parts
String[] mainParts = sourceStr.split(",");
for (String mainPart: mainParts) {
// Check if each part contains '-' character
if (mainPart.contains("-")) {
// If contains '-', split and add the 2 parts to 2 arrays
String[] subParts = mainPart.split("-");
list1.add(subParts[0]);
list2.add(subParts[1]);
} else {
// If does not contain '-', add complete part to 1st array and add placeholder to 2nd array
list1.add(mainPart);
list2.add("0");
}
}
// Build the final Strings by joining String parts by commas and enclosing them in braces
String str1 = "{" + String.join(",", list1) + "}";
String str2 = "{" + String.join(",", list2) + "}";
System.out.println(str1);
System.out.println(str2);
With the way you structured the problem, you should actually be splitting by commas first. Then, you should iterate through the result of the call to split and split each string in the outputted array by hyphen if there exists one. If there isn't a hyphen, then you can add a 0 to string 2 and the string itself to string 1. If there is a hyphen, then add the left side to string 1 and the right side to string 2. Here's one way you can do this,
if(rawIndication.contains(",")){
String s1 = "{";
String s2 = "{";
String[] parts = rawIndication.split(",");
for(int i = 0; i < parts.length; i++) {
if(parts[i].contains("-")) {
String[] moreParts = parts[i].split("-");
s1 = s1 + moreParts[0] + ",";
s2 = s2 + moreParts[1] + ",";
}
else{
s1 = s1 + parts[i] + ",";
s2 = s2 + "0,";
}
}
s1 = s1.substring(0, s1.length() - 1); //remove last extra comma
s2 = s2.substring(0, s2.length() - 1); //remove last extra comma
s1 = s1 + "}";
s2 = s2 + "}";
}
I think this solves your problem.
private static void splitStrings() {
List<String> list = Arrays.asList("aa-bb", "ccdd", "eeff", "gg-gg", "cc-gg");
List<String> firstPartList = new ArrayList<>();
List<String> secondPartList = new ArrayList<>();
for (String undividedString : list){
if(undividedString.contains("-")){
String[] dividedParts = undividedString.split("-");
String firstPart = dividedParts[0];
String secondPart = dividedParts[1];
firstPartList.add(firstPart);
secondPartList.add(secondPart);
} else{
firstPartList.add(undividedString);
secondPartList.add("0");
}
}
System.out.println(firstPartList);
System.out.println(secondPartList);
}
Output is -
[aa, ccdd, eeff, gg, cc]
[bb, 0, 0, gg, gg]

Split starting alphabetical characters from numeric characters

I want to split a string so that I get the leading alphabetical part (up to the first numeric digit) and the remaining alphanumeric part.
E.g.:
I have a string, for example: Nesc123abc456
I want to get the following two strings by splitting it: Nesc, 123abc456
What I have tried:
String s = "Abc1234avc";
String[] ss = s.split("(\\D)", 2);
System.out.println(Arrays.toString(ss));
But this just removes the first letter from the string.
You could maybe use lookarounds so that you don't consume the delimiting part:
String s = "Abc1234avc";
String[] ss = s.split("(?<=\\D)(?=\\d)", 2);
System.out.println(Arrays.toString(ss));
ideone demo
(?<=\\D) makes sure there's a non-digit before the part to be split at,
(?=\\d) makes sure there's a digit after the part to be split at.
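A small runnable sketch of the lookaround split, applied to both strings from the question. The class and method names (LeadingLettersSplitter, splitAtFirstDigit) are only illustrative.

```java
import java.util.Arrays;

class LeadingLettersSplitter {
    // Splits at the first position that has a non-digit before it and a digit
    // after it. The match is zero-width, so no character is consumed, and the
    // limit of 2 stops any further splitting.
    static String[] splitAtFirstDigit(String s) {
        return s.split("(?<=\\D)(?=\\d)", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtFirstDigit("Abc1234avc")));    // [Abc, 1234avc]
        System.out.println(Arrays.toString(splitAtFirstDigit("Nesc123abc456"))); // [Nesc, 123abc456]
    }
}
```

A string with no letter-to-digit transition, such as one made only of letters, comes back as a single-element array, so callers may want to check the length before reading index 1.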
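You need the quantifier.
Try
String[] ss = s.split("(\\D)*", 2);
Note that this consumes the leading letters as the delimiter, so for "Abc1234avc" the result is ["", "1234avc"] rather than the two parts asked for.
More information here: http://docs.oracle.com/javase/tutorial/essential/regex/quant.html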
Didn't you try replaceAll?
String s = ...;
String firstPart = s.replaceAll("[0-9].*", "");
String secondPart = s.substring(firstPart.length());
You can use:
String[] arr = "Nesc123abc456".split("(?<=[a-zA-Z])(?![a-zA-Z])", 2);
//=> [Nesc, 123abc456]
split is a destructive process so you would need to find the index of the first numeric digit and use substrings to get your result. This would also probably be faster than using a regex since those have a lot more heuristics behind them
int split = string.length();
for(int i = 0; i < string.length(); i ++) {
if (Character.isDigit(string.charAt(i))) {
split = i;
break;
}
}
String[] parts = new String[2];
parts[0] = string.substring(0, split);
parts[1] = string.substring(split);
I think this is what you asked:
String s = "Abc1234avc";
String numbers = "";
String chars = "";
for(int i = 0; i < s.length(); i++){
char c = s.charAt(i);
if(Character.isDigit(c)){
numbers += c + "";
}
else {
chars += c + "";
}
}
System.out.println("Numbers: " + numbers + "; Chars: " + chars);

How to get a string between two characters?

I have a string,
String s = "test string (67)";
I want to get the no 67 which is the string between ( and ).
Can anyone please tell me how to do this?
There's probably a really neat RegExp, but I'm noob in that area, so instead...
String s = "test string (67)";
s = s.substring(s.indexOf("(") + 1);
s = s.substring(0, s.indexOf(")"));
System.out.println(s);
A very useful solution that doesn't require you to call indexOf yourself is to use the Apache Commons Lang library:
StringUtils.substringBetween(s, "(", ")");
This method also copes with multiple occurrences of the closing string, which is not easy to handle by searching for the closing string with indexOf.
You can download this library from here:
https://mvnrepository.com/artifact/org.apache.commons/commons-lang3/3.4
Try it like this
String s="test string(67)";
String requiredString = s.substring(s.indexOf("(") + 1, s.indexOf(")"));
The method's signature for substring is:
s.substring(int start, int end);
Using a regular expression:
String s = "test string (67)";
Pattern p = Pattern.compile("\\(.*?\\)");
Matcher m = p.matcher(s);
if(m.find())
System.out.println(m.group().subSequence(1, m.group().length()-1));
Java supports Regular Expressions, but they're kind of cumbersome if you actually want to use them to extract matches. I think the easiest way to get at the string you want in your example is to just use the Regular Expression support in the String class's replaceAll method:
String x = "test string (67)".replaceAll(".*\\(|\\).*", "");
// x is now the String "67"
This simply deletes everything up to and including the (, and likewise the ) and everything after it, which leaves just the text between the parentheses.
However, the result of this is still a String. If you want an integer result instead then you need to do another conversion:
int n = Integer.parseInt(x);
// n is now the integer 67
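If the input may contain several parenthesised numbers, a capturing group with Matcher.find() is one alternative to replaceAll. The sketch below assumes that only runs of digits inside parentheses are of interest; ParenNumbers and extractAll are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenNumbers {
    // Group 1 captures the digits; the escaped parentheses are matched but not captured.
    private static final Pattern IN_PARENS = Pattern.compile("\\((\\d+)\\)");

    // Hypothetical helper: returns every integer that appears as "(digits)".
    static List<Integer> extractAll(String text) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = IN_PARENS.matcher(text);
        while (m.find()) {
            // parseInt throws NumberFormatException if the digits overflow an int.
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractAll("test string (67)"));          // [67]
        System.out.println(extractAll("test string (67) and (77)")); // [67, 77]
    }
}
```

Reading group(1) directly avoids having to trim the parentheses off the match afterwards.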
In a single line, I suggest:
String input = "test string (67)";
input = input.substring(input.indexOf("(")+1, input.lastIndexOf(")"));
System.out.println(input);
You could use apache common library's StringUtils to do this.
import org.apache.commons.lang3.StringUtils;
...
String s = "test string (67)";
s = StringUtils.substringBetween(s, "(", ")");
....
Test String: test string (67), from which you need to get the String that is nested between two Strings.
String str = "test string (67) and (77)", open = "(", close = ")";
Some possible ways are listed below.
Simple generic solution:
String subStr = str.substring(str.indexOf( open ) + 1, str.indexOf( close ));
System.out.format("String[%s] Parsed IntValue[%d]\n", subStr, Integer.parseInt( subStr ));
Apache Software Foundation commons.lang3:
The StringUtils class's substringBetween() function gets the String that is nested between two Strings. Only the first match is returned.
String substringBetween = StringUtils.substringBetween(str, open, close);
System.out.println("Commons Lang3 : "+ substringBetween);
Regular expressions: replace the given String with the String that is nested between two Strings.
Pattern: (\()(.*?)(\)).*
The dot matches (almost) any character:
.? = .{0,1}, .* = .{0,}, .+ = .{1,}
String patternMatch = patternMatch(generateRegex(open, close), str);
System.out.println("Regular expression Value : "+ patternMatch);
Regular expression with the utility class RegexUtils and some functions.
Pattern.DOTALL: the dot matches any character, including a line terminator.
Pattern.MULTILINE: ^ and $ match at the start and end of each line, not only at the start and end of the entire input sequence.
public static String generateRegex(String open, String close) {
return "(" + RegexUtils.escapeQuotes(open) + ")(.*?)(" + RegexUtils.escapeQuotes(close) + ").*";
}
public static String patternMatch(String regex, CharSequence string) {
final Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
final Matcher matcher = pattern .matcher(string);
String returnGroupValue = null;
if (matcher.find()) { // while() { Pattern.MULTILINE }
System.out.println("Full match: " + matcher.group(0));
System.out.format("Character Index [Start:End]«[%d:%d]\n",matcher.start(),matcher.end());
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
if( i == 2 ) returnGroupValue = matcher.group( 2 );
}
}
return returnGroupValue;
}
String s = "test string (67)";
int start = 0; // '(' position in string
int end = 0; // ')' position in string
for(int i = 0; i < s.length(); i++) {
if(s.charAt(i) == '(') // Looking for '(' position in string
start = i;
else if(s.charAt(i) == ')') // Looking for ')' position in string
end = i;
}
String number = s.substring(start+1, end); // you take value between start and end
String result = s.substring(s.indexOf("(") + 1, s.indexOf(")"));
public String getStringBetweenTwoChars(String input, String startChar, String endChar) {
try {
int start = input.indexOf(startChar);
if (start != -1) {
int end = input.indexOf(endChar, start + startChar.length());
if (end != -1) {
return input.substring(start + startChar.length(), end);
}
}
} catch (Exception e) {
e.printStackTrace();
}
return input; // return null; || return "" ;
}
Usage :
String input = "test string (67)";
String startChar = "(";
String endChar = ")";
String output = getStringBetweenTwoChars(input, startChar, endChar);
System.out.println(output);
// Output: "67"
Another way of doing it, using the split method:
public static void main(String[] args) {
String s = "test string (67)";
String[] ss;
ss= s.split("\\(");
ss = ss[1].split("\\)");
System.out.println(ss[0]);
}
Use Pattern and Matcher
public class Chk {
public static void main(String[] args) {
String s = "test string (67)";
ArrayList<String> arL = new ArrayList<String>();
ArrayList<String> inL = new ArrayList<String>();
Pattern pat = Pattern.compile("\\(\\w+\\)");
Matcher mat = pat.matcher(s);
while (mat.find()) {
arL.add(mat.group());
System.out.println(mat.group());
}
for (String sx : arL) {
Pattern p = Pattern.compile("(\\w+)");
Matcher m = p.matcher(sx);
while (m.find()) {
inL.add(m.group());
System.out.println(m.group());
}
}
System.out.println(inL);
}
}
The "generic" way of doing this is to parse the string from the start, throwing away all the characters before the first bracket, recording the characters after the first bracket, and throwing away the characters after the second bracket.
I'm sure there's a regex library or something to do it though.
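The single-pass idea described above could look roughly like this. It is a sketch rather than a definitive implementation: BracketScanner and between are made-up names, nesting is not handled, and null is returned when no complete pair is found.

```java
class BracketScanner {
    // Hypothetical helper: skips characters until the opening bracket,
    // records characters until the closing bracket, and ignores the rest.
    static String between(String text, char open, char close) {
        StringBuilder recorded = new StringBuilder();
        boolean recording = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!recording) {
                // Still before the first opening bracket: throw the character away.
                if (c == open) {
                    recording = true;
                }
            } else if (c == close) {
                // Closing bracket reached: everything after it is ignored.
                return recorded.toString();
            } else {
                recorded.append(c);
            }
        }
        return null; // No opening bracket, or no closing bracket after it.
    }

    public static void main(String[] args) {
        System.out.println(between("test string (67)", '(', ')')); // 67
    }
}
```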
The least generic way I found to do this with Regex and Pattern / Matcher classes:
String text = "test string (67)";
String START = "\\("; // A literal "(" character in regex
String END = "\\)"; // A literal ")" character in regex
// Captures the word characters between the above two characters
String regex = START + "(\\w+)" + END;
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(text);
while(matcher.find()) {
// group(1) is the captured text without the surrounding parentheses
System.out.println(matcher.group(1));
}
This may help for more complex regex problems where you want to get the text between two sets of characters.
Another possible solution is to use lastIndexOf, which looks for a character or String starting from the end.
In my scenario, I had the following String and had to extract <<UserName>>:
1QAJK-WKJSH_MyApplication_Extract_<<UserName>>.arc
So indexOf and StringUtils.substringBetween were not helpful, as they start looking for the character from the beginning.
So I used lastIndexOf:
String str = "1QAJK-WKJSH_MyApplication_Extract_<<UserName>>.arc";
String userName = str.substring(str.lastIndexOf("_") + 1, str.lastIndexOf("."));
And, it gives me
<<UserName>>
String s = "test string (67)";
System.out.println(s.substring(s.indexOf("(")+1,s.indexOf(")")));
Something like this:
public static String innerSubString(String txt, char prefix, char suffix) {
if(txt != null && txt.length() > 1) {
int start = 0, end = 0;
char token;
for(int i = 0; i < txt.length(); i++) {
token = txt.charAt(i);
if(token == prefix)
start = i;
else if(token == suffix)
end = i;
}
if(start + 1 < end)
return txt.substring(start+1, end);
}
return null;
}
A simple \D+ regex does the job.
It selects all characters except digits, so there is no need to complicate things:
/\D+/
This returns the original string if the regex does not match:
var iAm67 = "test string (67)".replaceFirst("test string \\((.*)\\)", "$1");
To guard against that, add a matches check to the code:
String str = "test string (67)";
String regx = "test string \\((.*)\\)";
if (str.matches(regx)) {
var iAm67 = str.replaceFirst(regx, "$1");
}
---EDIT---
I use https://www.freeformatter.com/java-regex-tester.html#ad-output to test regexes.
It turns out it's better to add ? after * so that it matches as little as possible. Something like this:
String str = "test string (67)(69)";
String regx1 = "test string \\((.*)\\).*";
String regx2 = "test string \\((.*?)\\).*";
String ans1 = str.replaceFirst(regx1, "$1");
String ans2 = str.replaceFirst(regx2, "$1");
System.out.println("ans1:"+ans1+"\nans2:"+ans2);
// ans1:67)(69
// ans2:67
String s = "(69)";
System.out.println(s.substring(s.lastIndexOf('(')+1,s.lastIndexOf(')')));
A small extension to the top (MadProgrammer) answer:
public static String getTextBetween(final String wholeString, final String str1, String str2){
String s = wholeString.substring(wholeString.indexOf(str1) + str1.length());
s = s.substring(0, s.indexOf(str2));
return s;
}

How to Split a string in java based on limit

I have the following String and I want to split it into a number of substrings (taking ',' as the delimiter) when its length reaches 36. It does not have to split exactly at the 36th position.
String message = "This is some(sampletext), and has to be splited properly";
I want to get the output as the following two substrings:
1. 'This is some (sampletext)'
2. 'and has to be splited properly'
Thanks in advance.
A solution based on regex:
String s = "This is some sample text and has to be splited properly";
Pattern splitPattern = Pattern.compile(".{1,15}\\b");
Matcher m = splitPattern.matcher(s);
List<String> stringList = new ArrayList<String>();
while (m.find()) {
stringList.add(m.group(0).trim());
}
Update:
trim() can be dropped by changing the pattern to end in a space or the end of the string:
String s = "This is some sample text and has to be splited properly";
Pattern splitPattern = Pattern.compile("(.{1,15})\\b( |$)");
Matcher m = splitPattern.matcher(s);
List<String> stringList = new ArrayList<String>();
while (m.find()) {
stringList.add(m.group(1));
}
group(1) means that I only need the first part of the pattern (.{1,15}) as output.
.{1,15} - a sequence of any characters (".") with any length between 1 and 15 ({1,15})
\b - a word boundary (the position between a word character and a non-word character, before or after any word)
( |$) - space or end of string
In addition I've added () surrounding .{1,15} so I can use it as a whole group (m.group(1)).
Depending on the desired result, this expression can be tweaked.
Update:
If you want to split message by comma only if its length would be over 36, try the following expression:
Pattern splitPattern = Pattern.compile("(.{1,36})\\b(,|$)");
The best solution I can think of is to make a function that iterates through the string. In the function you could keep track of whitespace characters, and for each 16th position you could add a substring to a list based on the position of the last encountered whitespace. After it has found a substring, you start anew from the last encountered whitespace. Then you simply return the list of substrings.
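A rough sketch of that idea follows. Instead of tracking the last whitespace by hand, it asks lastIndexOf for the last space inside the current window, which amounts to the same thing. WhitespaceWrapper and wrap are made-up names, the limit of 16 is taken from the description above, and a word longer than the limit is simply cut, which is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.List;

class WhitespaceWrapper {
    // Hypothetical helper: breaks text into pieces of at most `limit`
    // characters, preferring to break at the last space inside each window.
    static List<String> wrap(String text, int limit) {
        List<String> pieces = new ArrayList<>();
        int start = 0;
        while (text.length() - start > limit) {
            // Last space at or before the end of the current window.
            int breakAt = text.lastIndexOf(' ', start + limit);
            if (breakAt <= start) {
                // No usable space in the window: cut the long word.
                breakAt = start + limit;
                pieces.add(text.substring(start, breakAt));
                start = breakAt;
            } else {
                pieces.add(text.substring(start, breakAt));
                start = breakAt + 1; // Start anew just after the space.
            }
        }
        if (start < text.length()) {
            pieces.add(text.substring(start));
        }
        return pieces;
    }

    public static void main(String[] args) {
        String message = "This is some sample text and has to be splited properly";
        System.out.println(wrap(message, 16));
        // [This is some, sample text and, has to be, splited properly]
    }
}
```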
Here's a tidy answer:
String message = "This is some sample text and has to be splited properly";
String[] temp = message.split("(?<=^.{1,16}) ");
String part1 = message.substring(0, message.length() - temp[temp.length - 1].length() - 1);
String part2 = message.substring(message.length() - temp[temp.length - 1].length());
This should work on all inputs, except when there are sequences of chars without whitespace longer than 16. It also creates the minimum amount of extra Strings by indexing into the original one.
public static void main(String[] args) throws IOException
{
String message = "This is some sample text and has to be splited properly";
List<String> result = new ArrayList<String>();
int start = 0;
while (start + 16 < message.length())
{
int end = start + 16;
while (!Character.isWhitespace(message.charAt(end--)));
result.add(message.substring(start, end + 1));
start = end + 2;
}
result.add(message.substring(start));
System.out.println(result);
}
If you have a simple text as the one you showed above (words separated by blank spaces) you can always think of StringTokenizer. Here's some simple code working for your case:
public static void main(String[] args) {
String message = "This is some sample text and has to be splited properly";
while (message.length() > 0) {
String token = "";
StringTokenizer st = new StringTokenizer(message);
while (st.hasMoreTokens()) {
String nt = st.nextToken();
String foo = "";
if (token.length()==0) {
foo = nt;
}
else {
foo = token + " " + nt;
}
if (foo.length() < 16)
token = foo;
else {
System.out.print("'" + token + "' ");
message = message.substring(token.length() + 1, message.length());
break;
}
if (!st.hasMoreTokens()) {
System.out.print("'" + token + "' ");
message = message.substring(token.length(), message.length());
}
}
}
}
