How to find the delimiter encountered in a string in JAVA - java

I have written a simple Java program that manipulates a given string.
The input string contains delimiters, which are non-alphabetic characters. I have used StringTokenizer to read and manipulate the individual words in the string.
Now I need to reconstruct the manipulated string with the same delimiters. I would appreciate any suggestions on how to identify the delimiter that was encountered.
In other words, this is what input is:
Text1 Delimiter1 Text2 Delimiter2 Text3 Delimiter3 Text4 Delimiter4
This is what my code does:
NewText1 NewText2 NewText3 NewText4
I made use of string tokenizer to identify the next token in this manner:
StringTokenizer st = new StringTokenizer(str, ", 0123456789(*&^%$##!-_)");
But now I would like to identify the delimiter that was encountered so that I can build my new string.
This is what I actually want:
NewText1 Delimiter1 NewText2 Delimiter2 NewText3 Delimiter3 NewText4 Delimiter4

You can proceed according to this:
String dels = "-, 0123456789(*&^%$##!_)";
String specs = "[" + dels + "]+";
String letts = "[^" + dels + "]+";
String text = "one, two - three! four";
String[] words = text.split( specs );
String[] delim = text.split( letts );
Note that in dels the hyphen must be up front. If you ever add [ or ] or ^ more care must be taken - check the javadoc in java.util.regex.Pattern.
Recomposing the string from the two arrays is then straightforward.
The disadvantage of using StringTokenizer with its third argument (returnDelims) set to true is that it returns each delimiter as a separate token of length 1.
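As a minimal sketch of that recomposition, assuming the manipulation is simply upper-casing each word: the class name DelimiterRebuild and the method rebuild are hypothetical and not part of the question. Which of the two arrays starts with an empty string depends on whether the text begins with a word or with a delimiter, so the sketch checks for that before interleaving.

```java
class DelimiterRebuild {
    // Hypothetical helper: upper-cases every word and keeps the delimiters.
    static String rebuild(String text, String dels) {
        String[] words = text.split("[" + dels + "]+");   // runs of non-delimiters
        String[] delim = text.split("[^" + dels + "]+");  // runs of delimiters

        // split() puts an empty string at the front of an array when its
        // separator matches at position 0. If words[0] is empty, the text
        // starts with a delimiter and the pairs are (word, delimiter);
        // otherwise the pairs are (delimiter, word).
        boolean startsWithDelimiter = words.length > 0 && words[0].isEmpty();

        StringBuilder sb = new StringBuilder();
        int n = Math.max(words.length, delim.length);
        for (int i = 0; i < n; i++) {
            String w = i < words.length ? words[i].toUpperCase() : "";
            String d = i < delim.length ? delim[i] : "";
            sb.append(startsWithDelimiter ? w + d : d + w);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dels = "-, 0123456789(*&^%$##!_)";
        System.out.println(rebuild("one, two - three! four", dels));
        // prints: ONE, TWO - THREE! FOUR
    }
}
```

The bounds checks inside the loop are there because split() drops trailing empty strings, so the two arrays can differ in length.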

Related

Remove parts of String? [duplicate]

I want to remove a part of string from one character, that is:
Source string:
manchester united (with nice players)
Target string:
manchester united
There are multiple ways to do it. If you have the string which you want to replace you can use the replace or replaceAll methods of the String class. If you are looking to replace a substring you can get the substring using the substring API.
For example
String str = "manchester united (with nice players)";
System.out.println(str.replace("(with nice players)", ""));
int index = str.indexOf("(");
System.out.println(str.substring(0, index));
To replace content within "()" you can use:
int startIndex = str.indexOf("(");
int endIndex = str.indexOf(")");
String replacement = "I AM JUST A REPLACEMENT";
String toBeReplaced = str.substring(startIndex + 1, endIndex);
System.out.println(str.replace(toBeReplaced, replacement));
String Replace
String s = "manchester united (with nice players)";
s = s.replace(" (with nice players)", "");
Edit:
By Index
s = s.substring(0, s.indexOf("(") - 1);
Use String.replace():
http://www.daniweb.com/software-development/java/threads/73139
Example:
String original = "manchester united (with nice players)";
String newString = original.replace(" (with nice players)","");
String result = originalString.replaceFirst("[(].*?[)]", "");
https://ideone.com/jsZhSC
replaceFirst() can be replaced by replaceAll()
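The index-based and regex-based approaches above can be combined into a small guarded sketch. The class ParenStripper and its method names are hypothetical. The guard matters because indexOf returns -1 when there is no "(", and passing that to substring would throw.

```java
class ParenStripper {
    // Returns everything before the first "(", trimmed.
    // If there is no "(", the (trimmed) input is returned unchanged.
    static String beforeParen(String s) {
        int idx = s.indexOf('(');
        return idx < 0 ? s.trim() : s.substring(0, idx).trim();
    }

    // Removes the first parenthesised group together with the
    // whitespace in front of it, keeping any text that follows.
    static String withoutFirstParenGroup(String s) {
        return s.replaceFirst("\\s*\\([^)]*\\)", "");
    }

    public static void main(String[] args) {
        String str = "manchester united (with nice players)";
        System.out.println(beforeParen(str));            // manchester united
        System.out.println(withoutFirstParenGroup(str)); // manchester united
    }
}
```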
Using StringBuilder, you can remove the text between the parentheses in the following way (the parentheses themselves are kept).
StringBuilder str = new StringBuilder("manchester united (with nice players)");
int startIdx = str.indexOf("(");
int endIdx = str.indexOf(")");
str.replace(++startIdx, endIdx, "");
You should use the substring() method of String object.
Here is an example code:
Assumption: I am assuming here that you want to retrieve the string till the first parenthesis
String strTest = "manchester united(with nice players)";
/* Get the substring from the original string, starting at index 0 and ending just before the first parenthesis (the end index of substring() is exclusive) */
String strSub = strTest.substring(0, strTest.indexOf("("));
I would first split the original string into an array of String on the token " ("; the String at position 0 of the output array is what you want. Because split() takes a regular expression, the parenthesis must be escaped:
String[] output = originalString.split(" \\(");
String result = output[0];
Using StringUtils from commons lang
A null source string will return null. An empty ("") source string will return the empty string. A null remove string will return the source string. An empty ("") remove string will return the source string.
String str = StringUtils.remove("Test remove", "remove");
System.out.println(str);
// result will be "Test " (note the trailing space)
If you just need to remove everything after the "(", try this. Does nothing if no parentheses.
StringUtils.substringBefore(str, "(");
If there may be content after the end parentheses, try this.
String toRemove = StringUtils.substringBetween(str, "(", ")");
String result = StringUtils.remove(str, "(" + toRemove + ")");
To remove end spaces, use str.trim()
Apache StringUtils functions are null-, empty-, and no match- safe
Kotlin Solution
If you are removing a specific string from the end, use removeSuffix:
var text = "one(two"
text = text.removeSuffix("(two") // "one"
If the suffix does not exist in the string, it just returns the original
var text = "one(three"
text = text.removeSuffix("(two") // "one(three"
If you want to remove after a character, use
// Each results in "one"
text = text.replaceAfter("(", "").dropLast(1) // You should check char is present before `dropLast`
// or
text = text.removeRange(text.indexOf("("), text.length)
// or
text = text.replaceRange(text.indexOf("("), text.length, "")
You can also check out removePrefix, removeRange, removeSurrounding, and replaceAfterLast, which are similar.
The full list is in the Kotlin standard library documentation for String.
// Java program to remove a substring from a string
public class RemoveSubString {
    public static void main(String[] args) {
        String master = "1,2,3,4,5";
        String to_remove = "3,";
        // replace every occurrence of to_remove in master with an empty string
        String new_string = master.replace(to_remove, "");
        System.out.println(master);      // the original is unchanged: 1,2,3,4,5
        System.out.println(new_string);  // 1,2,4,5
    }
}
You could use replace to fix your string. The following Ruby snippet returns everything before a "(" and also strips all leading and trailing whitespace. If the string starts with a "(" it is left as is.
str = "manchester united (with nice players)"
matched = str.match(/.*(?=\()/)
str.replace(matched[0].strip) if matched

How do I expand this replace expression to add weird alpha characters like "ü"?

I have this replacement expression here:
String firstName = mFirstName.getText().toString().trim().replace(" ", "");
String lastName = mLastName.getText().toString().trim().replace(" ", "");
firstName = firstName.replaceAll("[^A-Za-z'-]", "");
lastName = lastName.replaceAll("[^A-Za-z'-]", "");
It works really well and quickly. However, it doesn't allow the international (extended ASCII) characters 128-165, such as umlauts. I also don't want the characters after that range, like "()|-", to be included. Is there a way to do all of this in one replaceAll, or do I have to split it into multiple expressions?
Here's what I've tried (unsuccessfully) :
firstName = firstName.replaceAll("[^A-Za-zÀ-Ÿ'-]", "");
lastName = lastName.replaceAll("[^Alpha'-]", "");
It still replaces the characters.
You can use \p{L}, which matches any Unicode character that is a letter.
String strUmlaut = "ÀèŸ";
System.out.println(strUmlaut.matches("\\p{L}+"));
OUTPUT
true
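Applied to the replacement from the question, this could look like the following sketch. The class NameSanitizer and the method sanitize are hypothetical. The character class keeps any Unicode letter plus the apostrophe and hyphen and removes everything else. The sample name is written with a \u00fc escape (ü) only to keep the source file ASCII.

```java
class NameSanitizer {
    // Keeps Unicode letters, apostrophes and hyphens; drops everything else.
    static String sanitize(String raw) {
        return raw.trim().replaceAll("[^\\p{L}'-]", "");
    }

    public static void main(String[] args) {
        // The umlaut survives; the space, parentheses and pipe are removed.
        System.out.println(sanitize(" J\u00fcrgen (x)|O'Neil-Smith "));
    }
}
```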
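[^A-Za-z\\x80-\\xa5'-] will additionally keep characters with code points 128-165 (80-A5 in hex). Note that Java regexes work on Unicode code points, where this range covers control characters and symbols rather than accented letters (ü, for example, is U+00FC), so \p{L} is the safer choice.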

How to split a string into two parts on specific delimiter

I have a string "Rush to ER/F07^e80c801e-ee37-4af8-9f12-af2d0e58e341".
I want to split it into two strings on the delimiter ^. For example, String str1 = "Rush to ER/F07" and String str2 = "e80c801e-ee37-4af8-9f12-af2d0e58e341".
To do this I am splitting the string. I followed a tutorial on Stack Overflow, but it is not working for me. Here is the code:
String[] str_array = message.split("^");
String stringa = str_array[0];
String stringb = str_array[1];
When I print these two strings, stringa is empty and stringb contains the whole original string, including the part before the delimiter.
Please help me.
You have to escape the special regex character with \\. Try this:
String[] str_array = message.split("\\^");
It is because the .split() method requires a regex pattern. Escape the ^:
String[] str_array = message.split("\\^");
You can get more information on this at http://docs.oracle.com/javase/8/docs/api/java/lang/String.html#split-java.lang.String-.
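A short sketch of the fix, using the string from the question. Pattern.quote is an alternative to escaping by hand, and the limit argument 2 is an addition not found in the answers above: it keeps any later ^ inside the second part. The class name CaretSplit is hypothetical.

```java
import java.util.regex.Pattern;

class CaretSplit {
    // Splits on the first literal '^' only; Pattern.quote makes the
    // caret a plain character instead of the start-of-input anchor.
    static String[] splitOnCaret(String message) {
        return message.split(Pattern.quote("^"), 2);
    }

    public static void main(String[] args) {
        String message = "Rush to ER/F07^e80c801e-ee37-4af8-9f12-af2d0e58e341";
        String[] parts = splitOnCaret(message);
        System.out.println(parts[0]); // Rush to ER/F07
        System.out.println(parts[1]); // e80c801e-ee37-4af8-9f12-af2d0e58e341
    }
}
```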

Java Regex ReplaceAll with grouping

I want to surround all tokens in a text with tags in the following manner:
Input: " abc fg asd "
Output:" <token>abc</token> <token>fg</token> <token>asd</token> "
This is the code I tried so far:
String regex = "(\\s)([a-zA-Z]+)(\\s)";
String text = " abc fg asd ";
text = text.replaceAll(regex, "$1<token>$2</token>$3");
System.out.println(text);
Actual output:" <token>abc</token> fg <token>asd</token> " (the "fg" token is skipped)
Note: for simplicity we can assume that the input starts and ends with whitespaces
Use lookaround:
String regex = "(?<=\\s)([a-zA-Z]+)(?=\\s)";
...
text = text.replaceAll(regex, "<token>$1</token>");
If your tokens are defined only by a character class, you don't need to describe the characters around them. This should suffice, since the regex engine walks from left to right and the quantifier is greedy:
String regex = "[a-zA-Z]+";
text = text.replaceAll(regex, "<token>$0</token>");
// [^\\s]+ means: not a whitespace character, one or more times
String result = input.replaceAll("([^\\s]+)", "<token>$1</token>");
This matches every run of characters that isn't whitespace, which is probably the best fit for what you need. The quantifier is also greedy, so it never leaves out a character it should include (it will never match just "as" in "asd" while there is another character it can match).
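To see why the original attempt skips "fg", here is a small side-by-side sketch. The class TokenTagger and its method names are hypothetical. The first method is the pattern from the question: it consumes the whitespace after "abc", so that space is no longer available as the leading \s for "fg". The lookaround and plain character-class versions do not consume the surrounding whitespace.

```java
class TokenTagger {
    // Pattern from the question: the whitespace on both sides is consumed.
    static String tagConsuming(String text) {
        return text.replaceAll("(\\s)([a-zA-Z]+)(\\s)", "$1<token>$2</token>$3");
    }

    // Lookbehind/lookahead only assert the whitespace, they do not consume it.
    static String tagLookaround(String text) {
        return text.replaceAll("(?<=\\s)([a-zA-Z]+)(?=\\s)", "<token>$1</token>");
    }

    // Plain character class; $0 refers to the whole match.
    static String tagPlain(String text) {
        return text.replaceAll("[a-zA-Z]+", "<token>$0</token>");
    }

    public static void main(String[] args) {
        String text = " abc fg asd ";
        System.out.println(tagConsuming(text));
        System.out.println(tagLookaround(text));
        System.out.println(tagPlain(text));
    }
}
```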

In Java how to extract string from the phrase with split() function

Can I extract a string from a phrase using the split() function with subphrases as delimiters? For example, I have the phrase "Mandatory string - Any string1 - Any string2". How can I extract "Any string1" using "Mandatory string" and "[a-zA-Z]" as delimiters?
This is how I'm trying to extract:
String str="Mandatory string - Any string1 - Any string2";
String[] result= str.split("Mandatory\\string\\s-\\s|\\s-\\s[a-zA-Z]+");
Result of this code is
result = ["Mandatory string","ny string1","ny string2"]
But desired is:
result = ["Any string1"]
I would appreciate some help, thanks.
String[] result= str.split("Mandatory\\s(1)string\\s-\\s|\\s-\\s[a-zA-Z\\s(2)]+");
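You just forgot an "s" at position (1), and there should be a "\\s" at position (2).
Try this line (note that the resulting array still has an empty first element and a trailing "2", so the string you want is at index 1):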
String[] result= str.split("Mandatory\\sstring\\s-\\s|\\s-\\s[a-zA-Z\\s]+");
First of all, there's a typo right here:
Mandatory\\string
This should probably read
Mandatory\\sstring
Anyway, I would either use " - " as the delimiter and get the second token:
str.split(" - ")[1] // TODO: prod version should do bounds checking etc
or use a different tool entirely, probably a regex match with the following regular expression:
"Mandatory string - (.*) - .*"
The parenthesised capture group will give you the string you're after.
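A minimal sketch of the regex-match approach, using the expression given above. The class PhraseExtractor and the method extract are hypothetical, and returning an empty string when the input does not match is an assumption made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhraseExtractor {
    private static final Pattern PHRASE =
            Pattern.compile("Mandatory string - (.*) - .*");

    // Returns the text captured by group 1, or "" if the input does not
    // have the expected "Mandatory string - X - Y" shape.
    static String extract(String input) {
        Matcher m = PHRASE.matcher(input);
        return m.matches() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String str = "Mandatory string - Any string1 - Any string2";
        System.out.println(extract(str)); // Any string1
    }
}
```

Because (.*) is greedy, an input with more than two " - " separators would put everything up to the last separator into the group. A reluctant (.*?) would stop at the first one instead.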
Why not
String[] result = str.split(" - ");
return result.length < 2 ? "" : result[1];
If there is a definite format to your input string, just split it and then use the parts that are needed:
String[] resultArray = str.split(" - ");
String whatYouWant = resultArray[1];
