Java split on - and + basis

I have a list of strings
0-30
31-60
61-90
91-120
365+
I want a regex that can be passed to Java's split method to get the first number, i.e.
0
31
61
91
365
Currently I am using this logic:
if(str.endsWith("+") ){
str= str.substring(0, str.length()-1);
}
String Num = str.split("-")[0];
Is there a better way?
Thanks

String[] splitArray = subjectString.split("[+-]\\d*\\s*");

String pattern = "[+-]\\d*\\s*";
String digits = "0-30 31-60 61-90 91-120 365+";
Pattern splitter = Pattern.compile(pattern);
String[] result = splitter.split(digits );
for (String digit: result ) {
System.out.println("digits = \"" + digit + "\"");
}
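If each range arrives as its own string, as in the question, a smaller sketch is to split on either sign and keep the first piece. The class and method names here are made up for illustration, and the code assumes every input starts with at least one digit (an input that is only "+" or "-" would yield an empty array).

```java
final class RangeStart {
    // Hypothetical helper: returns the part before the first '-' or '+'.
    // Assumes the input starts with a digit, e.g. "0-30" or "365+".
    static String firstNumber(String range) {
        // Inside a character class, '+' is literal and a trailing '-' needs no escape.
        return range.split("[+-]")[0];
    }

    public static void main(String[] args) {
        String[] ranges = {"0-30", "31-60", "61-90", "91-120", "365+"};
        for (String range : ranges) {
            System.out.println(firstNumber(range));
        }
    }
}
```

This replaces both the endsWith check and the second split from the question, because split drops the trailing empty string that "365+" would otherwise produce.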

Related

How to split a String by a comma, but from the second comma

I have a string as:
"model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1"
How can I convert this into:
model=iPhone12,3
os_version=13.6.1
os_update_exist=1
status=1
Split the string on commas, then re-join the first two elements of the resulting array.
I doubt there's a "clean" way to do this but this would work for your case:
String str = "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1";
String[] sp = str.split(",");
sp[0] += "," + sp[1];
sp[1] = sp[2];
sp[2] = sp[3];
sp[3] = sp[4];
sp[4] = "";
You can try this:
public String[] splitString(String source) {
// Split the source string based on a comma followed by letters and numbers.
// Basically "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1" will be split
// like this:
// model=iPhone12,3
// ,os_version=13.6.1
// ,os_update_exist=1
// ,status=1
String[] result = source.split("(?=,[a-z]+\\d*)");
for (int i = 0; i < result.length; i++) {
// Removes the comma at the beginning of the string if present
if (result[i].matches(",.*")) {
result[i] = result[i].substring(1);
}
}
return result;
}
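A possible variant of the look-ahead idea is to let the split consume the comma itself, so no clean-up loop is needed. This is only a sketch: the class name is invented, and it assumes every key consists of lowercase letters or underscores followed by "=".

```java
import java.util.Arrays;

final class KeyValueSplitter {
    // Hypothetical helper: splits only on commas that are directly followed by "key=".
    // Assumes keys contain only lowercase letters and underscores.
    static String[] splitPairs(String source) {
        return source.split(",(?=[a-z_]+=)");
    }

    public static void main(String[] args) {
        String str = "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1";
        System.out.println(Arrays.toString(splitPairs(str)));
    }
}
```

The comma inside "iPhone12,3" is left alone because it is followed by a digit rather than a key.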
If you are always parsing the same kind of string, a regex like this will do the job:
String str = "model=iPhone12,3,os_version=13.6.1,os_update_exist=1,status=1";
Matcher m = Pattern.compile("model=(.*),os_version=(.*),os_update_exist=(.*),status=(.*)").matcher(str);
if (m.find()) {
    String model = m.group(1); // iPhone12,3
    String os = m.group(2); // 13.6.1
    String update = m.group(3); // 1
    String status = m.group(4); // 1
}
If you really want to use split, you can still use this kind of trick:
String[] split = str.replaceAll(".*?=(.*?)(,[a-z]|$)", "$1#")
.split("#");
split[0] // iPhone12,3
split[1] // 13.6.1
split[2] // 1
split[3] // 1

Convert a numeric string that starts with `00` to start with `+` in Java

I want to convert a numeric string that starts with 00 so that it starts with + instead, e.g. 0046760963101 to +46760963101. Is there a way to handle this with a regex?
If not, what solution do you recommend?
Addendum:
If it starts with 000 or more zeros, I do not want to replace them with a + sign.
With a regex, assuming the input is a numeric string:
s.replaceFirst("^00", "+")
Or with a regex if you aren't sure of the input format:
s.replaceFirst("^00([0-9]+)$", "+$1")
Or with a simple prefix check:
s.startsWith("00") ? "+"+s.substring(2) : s
Including the added requirement (if it starts with 000 or more, do not replace with a + sign):
String normalized = phone;
if ( !phone.matches("000+([0-9]+)") && phone.startsWith("00")) {
normalized = "+"+phone.substring(2);
}
Check your input in a regex tester such as: https://www.freeformatter.com/java-regex-tester.html#ad-output
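The "not 000" requirement can probably also be folded into a single regex with a negative look-ahead. The sketch below uses an invented class and method name and does not validate that the rest of the string is numeric.

```java
final class PhoneNormalizer {
    // Hypothetical helper: replaces a leading "00" with "+",
    // but only when the third character is not another zero.
    static String normalize(String phone) {
        // (?!0) is a negative look-ahead: it checks the next character without consuming it.
        return phone.replaceFirst("^00(?!0)", "+");
    }

    public static void main(String[] args) {
        System.out.println(normalize("0046760963101"));
        System.out.println(normalize("00046760963101"));
    }
}
```

Strings that do not start with exactly two zeros are returned unchanged, so no separate startsWith check is required.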
You can try something like this
public static void main(String[] args) {
String number="0046760963101";
if(number.startsWith("00")) {
number=number.replaceFirst("00", "+");
}
System.out.println(number);
}
You could replace the 00 with a + like so:
String str = "0046760963101";
String newStr = "+";
for (int i = 2; i < str.length(); i++)
{
newStr += str.charAt(i);
}
Without regex:
String str = "0046760963101";
String replaced = str.charAt(2) == '0' ? str : str.substring(0, 2).replace("00", "+") + str.substring(2);
System.out.println(replaced);
will print:
+46760963101

Get substring in a string with multiple occurring string

I have a string something like
(D#01)5(D#02)14100319530033M(D#03)1336009-A-A(D#04)141002A171(D#05)1(D#06)
Now I want to get the substring between (D#01) and (D#02).
If I have something like
(D#01)5(D#02)
I can get the value with
quantity = content.substring(content.indexOf("(D#01)") + 6, content.indexOf("(D#02)"));
But sometimes D#02 can be different, like #05. How can I use a plain (D# to get the string in between, given that there are multiple repetitions of (D#?
Basically, this is what I want to do:
content.substring(content.indexOf("(D#01)") + 6, content.nextOccurringIndexOf("(D#"));
I suppose you can do
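int markerIndex = content.indexOf("(D#01)");
int fromIndex = markerIndex + 6;
int toIndex = content.indexOf("(D#", fromIndex); // next occurrence
// Test markerIndex rather than fromIndex: fromIndex is 5, never -1, when the marker is missing
if (markerIndex != -1 && toIndex != -1)
    str = content.substring(fromIndex, toIndex);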
Output
5
See http://ideone.com/RrUtBy demo.
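The same indexOf approach could be wrapped in a small reusable method. This is a sketch under assumptions: the class and method names are invented, and returning null for a missing marker is just one possible convention.

```java
final class Between {
    // Hypothetical helper: returns the text between startMarker and the next
    // occurrence of nextPrefix after it, or null if either one is not found.
    static String between(String content, String startMarker, String nextPrefix) {
        int markerIndex = content.indexOf(startMarker);
        if (markerIndex == -1) {
            return null;
        }
        int from = markerIndex + startMarker.length();
        // The two-argument indexOf starts searching at 'from', so it skips the start marker itself.
        int to = content.indexOf(nextPrefix, from);
        return to == -1 ? null : content.substring(from, to);
    }

    public static void main(String[] args) {
        String content = "(D#01)5(D#02)14100319530033M(D#03)1336009-A-A(D#04)141002A171(D#05)1(D#06)";
        System.out.println(between(content, "(D#01)", "(D#"));
    }
}
```

Using startMarker.length() instead of the literal 6 keeps the offset correct if the marker format ever changes.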
Assuming that the marker and value are somehow linked and you want to know each pairing ((D#01) == 5), you can make use of the Pattern/Matcher API, for example:
String text = "(D#01)5(D#02)14100319530033M(D#03)1336009-A-A(D#04)141002A171(D#05)1(D#06)";
Pattern p = Pattern.compile("\\(D#[0-9]+\\)");
Matcher m = p.matcher(text);
while (m.find()) {
String name = m.group();
if (m.end() < text.length()) {
String content = text.substring(m.end());
content = content.substring(0, content.indexOf("("));
System.out.println(name + " = " + content);
}
}
Which outputs
(D#01) = 5
(D#02) = 14100319530033M
(D#03) = 1336009-A-A
(D#04) = 141002A171
(D#05) = 1
Now, this is a little heavy-handed. I'd create some kind of "marker" object that contains the key (D#01) and its start and end indices. I'd then keep these in a List and cut out each value based on the end of one key and the start of the next... but that's just me ;)
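One way the "marker" object idea might look is sketched below. Everything here is an assumption made for illustration: the Marker and MarkerParser names, the choice of a LinkedHashMap to keep the keys in order, and the decision to map the final marker to whatever text follows it (an empty string for the sample input).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MarkerParser {
    // Hypothetical value class: a key such as "(D#01)" plus its position in the text.
    static final class Marker {
        final String key;
        final int start;
        final int end;

        Marker(String key, int start, int end) {
            this.key = key;
            this.start = start;
            this.end = end;
        }
    }

    private static final Pattern MARKER = Pattern.compile("\\(D#[0-9]+\\)");

    static Map<String, String> parse(String text) {
        // First pass: record every marker and where it sits.
        List<Marker> markers = new ArrayList<>();
        Matcher m = MARKER.matcher(text);
        while (m.find()) {
            markers.add(new Marker(m.group(), m.start(), m.end()));
        }
        // Second pass: each value runs from the end of one marker to the start of the next.
        Map<String, String> values = new LinkedHashMap<>();
        for (int i = 0; i < markers.size(); i++) {
            Marker current = markers.get(i);
            int valueEnd = (i + 1 < markers.size()) ? markers.get(i + 1).start : text.length();
            values.put(current.key, text.substring(current.end, valueEnd));
        }
        return values;
    }
}
```

Calling MarkerParser.parse(text).get("(D#02)") on the sample string would then give 14100319530033M.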
You can use regex capture groups if you want the content between the (D#nn) markers:
Pattern p = Pattern.compile("(\\(D#\\d+\\))(.*?)(?=\\(D#\\d+\\))");
Matcher matcher = p.matcher("(D#01)5(D#02)14100319530033M(D#03)1336009-A-A(D#04)141002A171(D#05)1(D#06)");
while(matcher.find()) {
System.out.println(String.format("%s start: %2s end: %2s matched: %s ",
matcher.group(1), matcher.start(2), matcher.end(2), matcher.group(2)));
}
(D#01) start: 6 end: 7 matched: 5
(D#02) start: 13 end: 28 matched: 14100319530033M
(D#03) start: 34 end: 45 matched: 1336009-A-A
(D#04) start: 51 end: 61 matched: 141002A171
(D#05) start: 67 end: 68 matched: 1
You can use a regex to split the input, as suggested by @MadProgrammer. The split() method produces an array of Strings, so the order of the values in the array is exactly the order in which they occur in the input. For example:
String input = "(D#01)5(D#02)14100319530033M(D#03)1336009-A-A(D#04)141002A171(D#05)1(D#06)";
// Backslashes must be doubled inside a Java string literal.
// table[0] is an empty string because the input starts with a delimiter.
String[] table = input.split("\\(D#[0-9]+\\)");
Try this:
public static void main(String[] args) {
String input = "(D#01)5(D#02)14100319530033M(D#03)1336009-A-A(D#04)141002A171(D#05)1(D#06)";
Pattern p = Pattern.compile("\\(D#\\d+\\)(.*?)(?=\\(D#\\d+\\))");
Matcher matches = p.matcher(input);
while(matches.find()) {
int number = getNum(matches.group(0)); // parses the number
System.out.printf("%d. %s\n", number, matches.group(1)); // print the string
}
}
public static int getNum(String str) {
int start = str.indexOf('#') + 1;
int end = str.indexOf(')', start);
return Integer.parseInt(str.substring(start,end));
}
Result:
1. 5
2. 14100319530033M
3. 1336009-A-A
4. 141002A171
5. 1

How to extract multiple quoted substrings in Java

I have a string containing multiple substrings that have to be extracted. The substrings to extract are enclosed in ' characters.
I could only extract the first or the last one when I used indexOf or a regex.
How can I extract them all and put them into an array or list without parsing the same string repeatedly?
resultData = "Error 205: 'x' data is not crawled yet. Check 'y' and 'z' data and update dataset 't'";
I have tried the following:
protected static String errorsTPrinted(String errStr, int errCode) {
if (errCode== 202 ) {
ArrayList<String> ar = new ArrayList<String>();
Pattern p = Pattern.compile("'(.*?)'");
Matcher m = p.matcher(errStr);
String text;
for (int i = 0; i < errStr.length(); i++) {
m.find();
text = m.group(1);
ar.add(text);
}
return errStr = "Err 202: " + ar.get(0) + " ... " + ar.get(1) + " ..." + ar.get(2) + " ... " + ar.get(3);
}
Edit
I used @MinecraftShamrock's approach.
if (errCode== 202 ) {
List<String> getQuotet = getQuotet(errStr, '\'');
return errStr = "Err 202: " + getQuotet.get(0) + " ... " + getQuotet.get(1) + " ..." + getQuotet.get(2) + " ... " + getQuotet.get(3);
}
You could use this very straightforward algorithm to do so and avoid regex (as one can't be 100% sure about its complexity):
public List<String> getQuotet(final String input, final char quote) {
final ArrayList<String> result = new ArrayList<>();
int n = -1;
for(int i = 0; i < input.length(); i++) {
if(input.charAt(i) == quote) {
if(n == -1) { //not currently inside quote -> start new quote
n = i + 1;
} else { //close current quote
result.add(input.substring(n, i));
n = -1;
}
}
}
return result;
}
This works with any desired quote-character and has a runtime complexity of O(n). If the string ends with an open quote, it will not be included. However, this can be added quite easily.
I think this is preferable to a regex, as you can be absolutely sure about its complexity. It also needs only a minimum of library classes. If you care about efficiency for big inputs, use this.
Last but not least, it does not care at all what is between two quote characters, so it works with any input string.
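For the case mentioned above, where the string ends inside an open quote, one possible extension is to add the remaining text after the loop. This is a sketch with an invented class name; whether an unterminated quote should count as a result at all is an assumption that depends on the data.

```java
import java.util.ArrayList;
import java.util.List;

final class QuoteExtractor {
    // Hypothetical variant of the scanner above: an unterminated trailing quote
    // contributes everything from the opening quote to the end of the input.
    static List<String> getQuoted(String input, char quote) {
        List<String> result = new ArrayList<>();
        int open = -1; // index just after the opening quote, or -1 when outside a quote
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) != quote) {
                continue;
            }
            if (open == -1) {
                open = i + 1;
            } else {
                result.add(input.substring(open, i));
                open = -1;
            }
        }
        if (open != -1) {
            // Still inside a quote at the end of the input: keep the remainder.
            result.add(input.substring(open));
        }
        return result;
    }
}
```

The scan is still a single pass over the input, so the O(n) running time is unchanged.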
Simply use the pattern:
'([^']++)'
And a Matcher like so:
final Pattern pattern = Pattern.compile("'([^']++)'");
final Matcher matcher = pattern.matcher(resultData);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
This loops through each match in the String and prints it.
Output:
x
y
z
t
Here is a simple approach (assuming there are no escaping characters etc.):
// Compile a pattern to find the wanted strings
Pattern p = Pattern.compile("'([^']+)'");
// Create a matcher for given input
Matcher m = p.matcher(resultData);
// A list to put the found strings into
List<String> list = new ArrayList<String>();
// Loop over all occurrences
while(m.find()) {
// Retrieve the matched text
String text = m.group(1);
// Do something with the text, e.g. add it to a List
list.add(text);
}

How to split a string in Java based on a limit

I have the following String and I want to split it into a number of substrings (taking ',' as the delimiter) when its length reaches 36. The split is not exactly at the 36th position.
String message = "This is some(sampletext), and has to be splited properly";
I want to get the output as the following two substrings:
1. 'This is some (sampletext)'
2. 'and has to be splited properly'
Thanks in advance.
A solution based on regex:
String s = "This is some sample text and has to be splited properly";
Pattern splitPattern = Pattern.compile(".{1,15}\\b");
Matcher m = splitPattern.matcher(s);
List<String> stringList = new ArrayList<String>();
while (m.find()) {
stringList.add(m.group(0).trim());
}
Update:
trim() can be dropped by changing the pattern to end in a space or the end of the string:
String s = "This is some sample text and has to be splited properly";
Pattern splitPattern = Pattern.compile("(.{1,15})\\b( |$)");
Matcher m = splitPattern.matcher(s);
List<String> stringList = new ArrayList<String>();
while (m.find()) {
stringList.add(m.group(1));
}
group(1) means that I only need the first part of the pattern (.{1,15}) as output.
.{1,15} - a sequence of any characters (".") with any length between 1 and 15 ({1,15})
\b - a word boundary (the position between a word character and a non-word character, or at the start or end of the string next to a word character)
( |$) - space or end of string
In addition I've added () surrounding .{1,15} so I can use it as a whole group (m.group(1)).
Depending on the desired result, this expression can be tweaked.
Update:
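If you want to split message by comma only if its length would be over 36, try the following expression:
// No \\b here: there is no word boundary between ")" and ",", so \\b would prevent a match after "(sampletext)"
Pattern splitPattern = Pattern.compile("(.{1,36})(,|$)");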
The best solution I can think of is to make a function that iterates through the string. In the function you could keep track of whitespace characters, and for each 16th position you could add a substring to a list based on the position of the last encountered whitespace. After it has found a substring, you start anew from the last encountered whitespace. Then you simply return the list of substrings.
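A function along those lines might look like the sketch below. The class and method names are invented, and cutting an over-long word at the limit is an assumption, since the description does not say what should happen when a window contains no whitespace.

```java
import java.util.ArrayList;
import java.util.List;

final class WhitespaceChunker {
    // Hypothetical helper: breaks text into pieces of at most 'limit' characters,
    // cutting at the last whitespace found inside each window.
    static List<String> chunk(String text, int limit) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        while (text.length() - start > limit) {
            int cut = -1;
            // Scan backwards from the end of the window for the last whitespace.
            for (int i = start + limit; i > start; i--) {
                if (Character.isWhitespace(text.charAt(i))) {
                    cut = i;
                    break;
                }
            }
            if (cut == -1) {
                // Assumption: a word longer than the limit is cut hard at the limit.
                parts.add(text.substring(start, start + limit));
                start += limit;
            } else {
                parts.add(text.substring(start, cut));
                start = cut + 1; // continue after the whitespace
            }
        }
        if (start < text.length()) {
            parts.add(text.substring(start));
        }
        return parts;
    }
}
```

With the sample sentence and a limit of 16, this yields four pieces, each ending on a word boundary.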
Here's a tidy answer:
String message = "This is some sample text and has to be splited properly";
String[] temp = message.split("(?<=^.{1,16}) ");
String part1 = message.substring(0, message.length() - temp[temp.length - 1].length() - 1);
String part2 = message.substring(message.length() - temp[temp.length - 1].length());
This should work on all inputs, except when there are sequences of chars without whitespace longer than 16. It also creates the minimum amount of extra Strings by indexing into the original one.
public static void main(String[] args) throws IOException
{
String message = "This is some sample text and has to be splited properly";
List<String> result = new ArrayList<String>();
int start = 0;
while (start + 16 < message.length())
{
int end = start + 16;
while (!Character.isWhitespace(message.charAt(end--)));
result.add(message.substring(start, end + 1));
start = end + 2;
}
result.add(message.substring(start));
System.out.println(result);
}
If you have a simple text as the one you showed above (words separated by blank spaces) you can always think of StringTokenizer. Here's some simple code working for your case:
public static void main(String[] args) {
String message = "This is some sample text and has to be splited properly";
while (message.length() > 0) {
String token = "";
StringTokenizer st = new StringTokenizer(message);
while (st.hasMoreTokens()) {
String nt = st.nextToken();
String foo = "";
if (token.length()==0) {
foo = nt;
}
else {
foo = token + " " + nt;
}
if (foo.length() < 16)
token = foo;
else {
System.out.print("'" + token + "' ");
message = message.substring(token.length() + 1, message.length());
break;
}
if (!st.hasMoreTokens()) {
System.out.print("'" + token + "' ");
message = message.substring(token.length(), message.length());
}
}
}
}
