How to remove part of each word in a sentence in Java? - java

Assume the string is:
The/at Fulton/np-tl County/nn-tl Grand/jj-tl
How can I remove the characters after each / (including the / itself) so that the output is as below?
The Fulton County Grand

It looks like a simple regex-based replace could work fine here:
text = text.replaceAll("/\\S*", "");
Here the \\S* means "0 or more non-whitespace characters". There are other options you could use too, of course.
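As a minimal sketch of the replacement described above, wrapped in a helper so it can be reused (the names TagStripper and stripTags are made up for illustration and do not come from the answer):

```java
final class TagStripper {
    // Removes every "/tag" suffix: a slash plus the run of
    // non-whitespace characters that follows it.
    static String stripTags(String text) {
        return text.replaceAll("/\\S*", "");
    }

    public static void main(String[] args) {
        String input = "The/at Fulton/np-tl County/nn-tl Grand/jj-tl";
        System.out.println(stripTags(input)); // prints: The Fulton County Grand
    }
}
```

Because \\S* stops at the first whitespace character, the spaces between the words are left untouched and no trim() is needed.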

String input = "The/at Fulton/np-tl County/nn-tl Grand/jj-tl";
String clean = input.replaceAll("/.*?(?= |$)", "");
Here's a test:
public static void main( String[] args ) {
String input = "The/at Fulton/np-tl County/nn-tl Grand/jj-tl";
String clean = input.replaceAll("/.*?(?= |$)", "");
System.out.println( clean);
}
Output:
The Fulton County Grand

String text = "The/at Fulton/np-tl County/nn-tl Grand/jj-tl";
String newText = text.replaceAll("/.*?\\S*", "");
From Java API:
String replace(char oldChar, char newChar)
Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar.
String replace(CharSequence target, CharSequence replacement)
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.
String replaceAll(String regex, String replacement)
Replaces each substring of this string that matches the given regular expression with the given replacement.
String replaceFirst(String regex, String replacement)
Replaces the first substring of this string that matches the given regular expression with the given replacement.
If you need to replace a literal substring or a character, use the first two methods.
If you need to replace matches of a pattern (a regex), use the last two methods.
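The difference between the literal and the regex methods is easy to miss when the target contains a regex metacharacter such as a dot. A small sketch (the class and method names are hypothetical):

```java
final class ReplaceDemo {
    // replace: "." is taken literally, every dot is replaced.
    static String literal(String s) {
        return s.replace(".", "-");
    }

    // replaceAll: "." is a regex that matches any character.
    static String regex(String s) {
        return s.replaceAll(".", "-");
    }

    // replaceFirst: escaped dot, only the first match is replaced.
    static String first(String s) {
        return s.replaceFirst("\\.", "-");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c")); // prints: a-b-c
        System.out.println(regex("a.b.c"));   // prints: -----
        System.out.println(first("a.b.c"));   // prints: a-b.c
    }
}
```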

This worked for me:
String text = "The/at Fulton/np-tl County/nn-tl Grand/jj-tl";
String newText = text.replaceAll("/.*?(\\s|$)", " ").trim();
Yields:
The Fulton County Grand
This replaces each / together with the characters after it, up to and including the next whitespace character or the end of the string, with a single space. The trim() at the end removes the extra trailing space that the replaceAll call adds.

Do as follows:
startchar: the character from which you want to start replacing.
endchar: the character up to which you want to replace.
" ": the replacement; since you just want to delete the match, replace it with a single space.
string.replaceAll(startchar+".*"+endchar, " ")
See http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29
also see greedy quantifier examples
see working example
public static void main( String[] args ) {
String startchar ="/";
String endchar="?(\\s|$)";
String input = "The/at Fulton/np-tl County/nn-tl Grand/jj-tl";
String clean = input.replaceAll(startchar+".*"+endchar, " ");
System.out.println( clean);
}
output
The Fulton County Grand
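The ? that follows .* in the example is what makes the quantifier lazy; without it the greedy .* runs to the end of the line and swallows every later word. A short sketch of the difference (class and method names are hypothetical, and trim() is added here to drop the trailing space):

```java
final class GreedyVsLazy {
    // Greedy: ".*" grabs as much as possible, so the first "/" matches
    // everything up to the end of the string.
    static String greedy(String s) {
        return s.replaceAll("/.*(\\s|$)", " ").trim();
    }

    // Lazy: ".*?" stops at the first whitespace (or the end of the string).
    static String lazy(String s) {
        return s.replaceAll("/.*?(\\s|$)", " ").trim();
    }

    public static void main(String[] args) {
        String input = "The/at Fulton/np-tl County/nn-tl Grand/jj-tl";
        System.out.println(greedy(input)); // prints: The
        System.out.println(lazy(input));   // prints: The Fulton County Grand
    }
}
```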

Related

Replace multiple whitespace in string with multiple special char in Java

String str = "hello  there, what are   you doing";
System.out.println(str.replaceAll("\\s{2,}", "?"));
output: hello?there, what are?you doing
expected output: hello??there, what are???you doing
Try this.
Pattern pat = Pattern.compile("\\s{2,}");
String str = "hello  there, what are   you doing";
System.out.println(pat.matcher(str).replaceAll(m -> "?".repeat(m.group().length())));
output:
hello??there, what are???you doing
A one-line regex that replaces the whitespace using a lookahead (?=\s) and a lookbehind (?<=\s) would be:
String str = "hello  there, what are   you doing      here";
System.out.println(str.replaceAll("(\\s(?=\\s)|(?<=\\s)\\s)", "?"));
=>
hello??there, what are???you doing??????here
So the above matches whitespace followed by whitespace, or whitespace preceded by whitespace, and replaces each such character with "?".
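The lookaround idea can be checked with a small sketch; the helper name markRuns is made up for illustration. Lookarounds inspect the original input without consuming it, so every character of a run of two or more whitespace characters matches on its own, while a single space between words does not:

```java
final class WhitespaceRuns {
    // A whitespace character is replaced when the next character is
    // whitespace (lookahead) or the previous one is (lookbehind).
    static String markRuns(String s) {
        return s.replaceAll("\\s(?=\\s)|(?<=\\s)\\s", "?");
    }

    public static void main(String[] args) {
        // Two spaces after "hello", three after "are".
        System.out.println(markRuns("hello  there, what are   you"));
        // prints: hello??there, what are???you
    }
}
```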

How to extract a string till the end of the line with regular expression

I have strings (containing Portuguese characters) with the following structure: each contains Name: followed by some words.
Example:
String myStr1 = "aaad Name: bla and more blá\n gdgdf ppp";
String myStr2 = "bbbb Name: Á different blÁblÁ\n hhhh fjjj";
I need to extract the string from 'Name:' till the end of the line.
example:
extract(myStr1) = "Name: bla and more blá"
extract(myStr2) = "Name: Á different blÁblÁ"
Edit after @blue_note's answer:
here is what I tried:
public static String extract(String myStr) {
Pattern p = compile("Name:(?m)^.*$");
Matcher m = p.matcher(myStr);
while (m.find()) {
String theGroup = m.group(0);
System.out.format("'%s'\n", theGroup);
return m.group(0);
}
return null;
}
This did not work.
The regex is "^\\w*\\s*((?m)Name.*$)"
where
(?m) enables multiline mode
^ and $ denote the start and the end of a line, respectively (in multiline mode)
.* means any character, any number of times
Then take group(1), not group(0), of the match.
You could also use substring in this case:
String name = myStr1.substring(myStr1.indexOf("Name:"), myStr1.indexOf("\n"));
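Another possible sketch, not taken from the answers above: by default (without the DOTALL flag) the dot does not match line terminators, so "Name:.*" already stops at the end of the line and no multiline flag or group is needed. The class name NameExtractor is hypothetical; extract mirrors the method name used in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameExtractor {
    // "." excludes line terminators unless DOTALL is set,
    // so ".*" ends where the line ends.
    private static final Pattern NAME_LINE = Pattern.compile("Name:.*");

    // Returns the text from "Name:" to the end of its line,
    // or null when "Name:" does not occur.
    static String extract(String s) {
        Matcher m = NAME_LINE.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("aaad Name: bla and more\n gdgdf ppp"));
        // prints: Name: bla and more
    }
}
```

Unlike the substring approach, this also works when the string has no newline after Name: or has a newline before it.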

Java Regex ReplaceAll with grouping

I want to surround all tokens in a text with tags in the following manner:
Input: " abc fg asd "
Output:" <token>abc</token> <token>fg</token> <token>asd</token> "
This is the code I tried so far:
String regex = "(\\s)([a-zA-Z]+)(\\s)";
String text = " abc fg asd ";
text = text.replaceAll(regex, "$1<token>$2</token>$3");
System.out.println(text);
Output:" <token>abc</token> fg <token>asd</token> "
Note: for simplicity we can assume that the input starts and ends with whitespaces
Use lookaround:
String regex = "(?<=\\s)([a-zA-Z]+)(?=\\s)";
...
text = text.replaceAll(regex, "<token>$1</token>");
If your tokens are only defined with a character class you don't need to describe what characters are around. So this should suffice since the regex engine walks from left to right and since the quantifier is greedy:
String regex = "[a-zA-Z]+";
text = text.replaceAll(regex, "<token>$0</token>");
// [^\\s]+ means: one or more characters that are not whitespace
String result = input.replaceAll("([^\\s]+)", "<token>$1</token>");
This matches every run of characters that isn't whitespace, which is probably the best fit for what you need. The quantifier is also greedy, meaning it will never leave out a character that it could match (it will never match just "as" in the string "asd", because the following "d" also matches).
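To see why the attempt in the question skips every other token, it helps to run both variants side by side: the original pattern consumes the whitespace after a token, so that whitespace is no longer available as the leading whitespace of the next token, whereas lookarounds only assert it. A sketch with hypothetical class and method names:

```java
final class TokenTagger {
    // Pattern from the question: the surrounding whitespace is consumed.
    static String consuming(String text) {
        return text.replaceAll("(\\s)([a-zA-Z]+)(\\s)", "$1<token>$2</token>$3");
    }

    // Lookaround variant: the whitespace is checked but not consumed.
    static String lookaround(String text) {
        return text.replaceAll("(?<=\\s)([a-zA-Z]+)(?=\\s)", "<token>$1</token>");
    }

    public static void main(String[] args) {
        String text = " abc fg asd ";
        System.out.println("[" + consuming(text) + "]");
        // prints: [ <token>abc</token> fg <token>asd</token> ]
        System.out.println("[" + lookaround(text) + "]");
        // prints: [ <token>abc</token> <token>fg</token> <token>asd</token> ]
    }
}
```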

find two consecutive words/strings with regex expression java (including punctuation)

I want to check whether a string contains two words/strings directly following each other in a specific order.
The punctuation should count as part of the word/string (i.e. "word" and "word." should be handled as different words).
As an example:
String word1 = "is";
String word2 = "a";
String text = "This is a sample";
Pattern p = Pattern.compile("someregex" + word1 + "someregex" + word2 + "someregex");
System.out.println(p.matcher(text).matches());
This should print out true.
With the following variables, it should also print true.
String word1 = "sample.";
String word2 = "0END";
String text = "This is a sample. 0END0";
But the latter should return false when setting word1 = "sample" (without punctuation).
Does anyone have an idea what the regex string should look like (i.e. what I should write instead of "someregex")?
Thank you!
Looks like you're just splitting on whitespace, try:
Pattern p = Pattern.compile("(\\s|^)" + Pattern.quote(word1) + "\\s+" + Pattern.quote(word2) + "(\\s|$)");
Explanation
(\\s|^) matches any whitespace before the first word, or the start of the string
\\s+ matches the whitespace between the words
(\\s|$) matches any whitespace after the second word, or the end of the string
Pattern.quote(...) ensures that any regex special characters in your input strings are properly escaped.
You also need to call find(), not matches(). matches() will only return true if the whole string matches the pattern.
Complete example
String word1 = "is";
String word2 = "a";
String text = "This is a sample";
String regex =
"(\\s|^)" +
Pattern.quote(word1) +
"\\s+" +
Pattern.quote(word2) +
"(\\s|$)";
Pattern p = Pattern.compile(regex);
System.out.println(p.matcher(text).find());
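The same pattern can be wrapped in a small helper to try several inputs; this is only a sketch, and the names ConsecutiveWords and containsPair are made up for illustration. The test strings are chosen here and are slightly different from the ones in the question.

```java
import java.util.regex.Pattern;

final class ConsecutiveWords {
    // True when word1 and word2 occur next to each other, in that order,
    // each delimited by whitespace or by the start/end of the text.
    static boolean containsPair(String text, String word1, String word2) {
        String regex = "(\\s|^)" + Pattern.quote(word1)
                + "\\s+" + Pattern.quote(word2) + "(\\s|$)";
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsPair("This is a sample", "is", "a"));             // true
        System.out.println(containsPair("This is a sample. 0END", "sample.", "0END")); // true
        System.out.println(containsPair("This is a sample. 0END", "sample", "0END"));  // false
    }
}
```

Note that with this pattern a text ending in "0END0" would not match word2 = "0END", because the second word must be followed by whitespace or the end of the string.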
You can concatenate the two words with whitespace and use that as the regexp.
The only thing you have to do is replace "." with "\." so that the dot does not match any character.
String regexp = " " + word1 + " " + word2 + " ";
regexp = regexp.replaceAll("\\.", "\\\\.");

how to check if a space is followed by a certain character?

I am really confused by regexes. I have tried to understand them but got nowhere.
Basically, I am trying to replace every space that is followed by a character other than a space with "PM".
" sd"
" sd"
however
" sd"
" sd"
This will replace the space and the following character with "PM":
String s = "123 axy cq23 dasd"; //your string
String newString = s.replaceAll(" [^ ]","PM");
Since I'm not sure if you want to replace only the space or the space and the following character, too, here is a slightly modified version that replaces only the space:
String s = "123 axy cq23 dasd"; //your string
String newString = s.replaceAll(" ([^ ])", "PM$1");
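You need a lookahead so that the character after the space is not consumed (a non-capturing group such as (?:[^ ]) would still consume it):
String res = oldString.replaceAll(" (?=[^ ])", "PM");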
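The variants above differ only in whether the character after the space is consumed by the match. A sketch comparing them (the class and method names are hypothetical; the lookahead form (?=[^ ]) is included for comparison):

```java
final class SpaceMarker {
    // Space and the following character are both replaced.
    static String consuming(String s) {
        return s.replaceAll(" [^ ]", "PM");
    }

    // A non-capturing group still consumes the following character.
    static String nonCapturing(String s) {
        return s.replaceAll(" (?:[^ ])", "PM");
    }

    // The following character is captured and put back with $1.
    static String backReference(String s) {
        return s.replaceAll(" ([^ ])", "PM$1");
    }

    // The following character is only asserted, never consumed.
    static String lookahead(String s) {
        return s.replaceAll(" (?=[^ ])", "PM");
    }

    public static void main(String[] args) {
        String s = "123 axy";
        System.out.println(consuming(s));     // prints: 123PMxy
        System.out.println(nonCapturing(s));  // prints: 123PMxy
        System.out.println(backReference(s)); // prints: 123PMaxy
        System.out.println(lookahead(s));     // prints: 123PMaxy
    }
}
```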
