Find two consecutive words/strings with a regular expression in Java (including punctuation)

I want to check whether a string contains two words/strings that directly follow each other in a specific order.
The punctuation should be treated as part of the word/string (i.e. "word" and "word." should be handled as different words).
As an example:
String word1 = "is";
String word2 = "a";
String text = "This is a sample";
Pattern p = Pattern.compile("someregex" + word1 + "someregex" + word2 + "someregex");
System.out.println(p.matcher(text).matches());
This should print out true.
With the following variables, it should also print true.
String word1 = "sample.";
String word2 = "0END";
String text = "This is a sample. 0END0";
But the latter should return false when word1 = "sample" (without the punctuation).
Does anyone have an idea what the regex string should look like (i.e. what should I write instead of "someregex")?
Thank you!

Looks like you're just splitting on whitespace, try:
Pattern p = Pattern.compile("(\\s|^)" + Pattern.quote(word1) + "\\s+" + Pattern.quote(word2) + "(\\s|$)");
Explanation
(\\s|^) matches any whitespace before the first word, or the start of the string
\\s+ matches the whitespace between the words
(\\s|$) matches any whitespace after the second word, or the end of the string
Pattern.quote(...) ensures that any regex special characters in your input strings are properly escaped.
You also need to call find(), not matches(). matches() only returns true if the whole string matches the pattern.
Complete example
String word1 = "is";
String word2 = "a";
String text = "This is a sample";
String regex =
"(\\s|^)" +
Pattern.quote(word1) +
"\\s+" +
Pattern.quote(word2) +
"(\\s|$)";
Pattern p = Pattern.compile(regex);
System.out.println(p.matcher(text).find());
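The pieces above can be wrapped into a small helper for testing. This is only a minimal sketch: the class name ConsecutiveWords and the method containsConsecutive are hypothetical names, and the second sample text is an assumed variant that ends in "0END".

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class ConsecutiveWords {

    // Returns true if word1 is directly followed by word2 in text,
    // with both delimited by whitespace or the string boundaries.
    static boolean containsConsecutive(String text, String word1, String word2) {
        String regex = "(\\s|^)" + Pattern.quote(word1)
                + "\\s+" + Pattern.quote(word2) + "(\\s|$)";
        // find() looks for the pattern anywhere in the text;
        // matches() would require the whole text to match.
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // prints true
        System.out.println(containsConsecutive("This is a sample", "is", "a"));
        // prints true: "sample." includes the punctuation
        System.out.println(containsConsecutive("This is a sample. 0END", "sample.", "0END"));
        // prints false: "sample" without the dot is a different word
        System.out.println(containsConsecutive("This is a sample. 0END", "sample", "0END"));
    }
}
```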

You can concatenate the two words with whitespace and use that as the regexp.
The only thing you have to do is replace "." with "\." so that the dot does not match any character. Other regex metacharacters in the words would need the same treatment.
String regexp = " " + word1 + " " + word2 + " ";
regexp = regexp.replaceAll("\\.", "\\\\.");
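If the words are always separated by whitespace, an alternative is to avoid building a regex from the input words at all and compare whitespace-separated tokens instead. The following is a sketch of that swapped-in technique, not the approach above; the class name TokenPairs is hypothetical.

```java
// Hypothetical helper; compares neighbouring whitespace-separated tokens.
class TokenPairs {

    // Returns true if some token equals word1 and the next token equals word2.
    // Punctuation stays attached to its token, so "sample." != "sample".
    static boolean containsConsecutive(String text, String word1, String word2) {
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            if (tokens[i].equals(word1) && tokens[i + 1].equals(word2)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // prints true
        System.out.println(containsConsecutive("This is a sample", "is", "a"));
        // prints false
        System.out.println(containsConsecutive("This is a sample. 0END", "sample", "0END"));
    }
}
```

Because the words are compared with equals, no escaping of "." or other special characters is needed.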

Related

How to search for a word in a String when the word ends with "." or "," in Java

Can someone help me with the code? How can I search a String for a word that may end with "." or ","?
I don't want to search like this to find it:
String word = "test.";
String wordSearch = "I trying to tasting the Artestem test.";
String word1 = "test,"; // here with ","
String word2 = "test."; // here with "."
String word3 = "test"; //here without
// after that I make a String array, etc...
if ((wordSearch.equalsIgnoreCase(word1)) ||
(wordSearch.equalsIgnoreCase(word2)) ||
(wordSearch.equalsIgnoreCase(word3))) {
}
if (wordSearch.contains(gramer))
//it's not working because the word Artestem will contain test too, and I don't need it
You can use the matches(regex) method on a String:
String word = "test.";
boolean check = false;
// \\w* matches the word characters, [.,] requires a trailing dot or comma
if (word.matches("\\w*[.,]")) {
check = true;
}
You can use regex for this
Matcher matcher = Pattern.compile("\\btest\\b").matcher(wordSearch);
if (matcher.find()) {
}
The \\b on each side is a word boundary, so the pattern matches "test" only as a whole word. "Artestem" will not match in this case.
matcher.find() will return true if there is a word test in your sentence and false otherwise.
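A small sketch of the word-boundary approach, generalised to an arbitrary search word. The class name WordBoundaryDemo and the method containsWord are hypothetical, and the sketch assumes the search word itself starts and ends with a letter or digit, since \\b only works next to word characters.

```java
import java.util.regex.Pattern;

// Hypothetical helper; assumes the word starts and ends with a word character.
class WordBoundaryDemo {

    // Returns true if word occurs in text as a whole word.
    static boolean containsWord(String text, String word) {
        String regex = "\\b" + Pattern.quote(word) + "\\b";
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // prints false: "test" only occurs inside "Artestem"
        System.out.println(containsWord("I trying to tasting the Artestem", "test"));
        // prints true: "test." contains the whole word "test"
        System.out.println(containsWord("I trying to tasting the Artestem test.", "test"));
    }
}
```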
String stringToSearch = "I trying to tasting the Artestem test. test,";
Pattern p1 = Pattern.compile("test[.,]");
Matcher m = p1.matcher(stringToSearch);
while (m.find())
{
System.out.println(m.group());
}
You can split your String into an array of words (with split) and search that array, comparing the last character of each word (charAt) with the character that you want to find.
String stringtoSearch = "This is a test.";
String whatIwantToFind = ",";
String[] words = stringtoSearch.split("\\s+");
for (String word : words) {
// compare the last character of the word with the character being searched for
if (!word.isEmpty() && word.charAt(word.length() - 1) == whatIwantToFind.charAt(0)) {
System.out.println("FIND");
}
}
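Building on the split idea, the following sketch collects every token that equals the search word, optionally followed by "." or ",". The class name PunctuatedWordSearch and the method findTokens are hypothetical, and the case-insensitive comparison is an assumption taken from the equalsIgnoreCase calls in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
class PunctuatedWordSearch {

    // Returns the tokens that equal word, optionally followed by '.' or ','.
    static List<String> findTokens(String text, String word) {
        List<String> hits = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String core = token;
            // Strip one trailing dot or comma before comparing.
            if (token.endsWith(".") || token.endsWith(",")) {
                core = token.substring(0, token.length() - 1);
            }
            if (core.equalsIgnoreCase(word)) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints [test., test,]
        System.out.println(findTokens("I trying to tasting the Artestem test. test,", "test"));
    }
}
```

Because whole tokens are compared, "Artestem" is not reported even though it contains "test".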
What is a word? E.g.:
Is '5' a word?
Is '漢語' a word, or two words?
Is 'New York' a word, or two words?
Is 'Kraftfahrzeughaftpflichtversicherung' (meaning "automobile liability insurance") a word, or 3 words?
For some languages you can use Pattern.compile("[^\\p{Alnum}\u0301-]+") to split a text into words. Use Pattern#split for this.
I think you can find a word with this pattern:
String notWord = "[^\\p{Alnum}\u0301-]{0,}";
Pattern.compile(notWord + "test" + notWord);
See also: https://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
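As an illustration of the Pattern#split suggestion, here is a minimal sketch. The class name WordSplitter is hypothetical. Note that \\p{Alnum} covers only ASCII letters and digits by default, so this particular pattern would not treat '漢語' as a word.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper; splits on runs of "non-word" characters.
class WordSplitter {

    // A separator is any run of characters that are not ASCII letters or
    // digits, not the combining acute accent U+0301, and not a hyphen.
    private static final Pattern NOT_WORD = Pattern.compile("[^\\p{Alnum}\u0301-]+");

    static String[] words(String text) {
        return NOT_WORD.split(text);
    }

    public static void main(String[] args) {
        // prints [Hello, world-wide, web]
        System.out.println(Arrays.toString(words("Hello, world-wide web!")));
    }
}
```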

Java Regex ReplaceAll with grouping

I want to surround all tokens in a text with tags in the following manner:
Input: " abc fg asd "
Output:" <token>abc</token> <token>fg</token> <token>asd</token> "
This is the code I tried so far:
String regex = "(\\s)([a-zA-Z]+)(\\s)";
String text = " abc fg asd ";
text = text.replaceAll(regex, "$1<token>$2</token>$3");
System.out.println(text);
Output:" <token>abc</token> fg <token>asd</token> "
Note: for simplicity we can assume that the input starts and ends with whitespaces
Use lookaround:
String regex = "(?<=\\s)([a-zA-Z]+)(?=\\s)";
...
text = text.replaceAll(regex, "<token>$1</token>");
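The difference between the two patterns can be seen side by side in the sketch below. The original pattern consumes the whitespace after "abc", so that whitespace is no longer available as the leading \\s for "fg"; the lookaround version only checks the surrounding whitespace without consuming it. The class and method names are hypothetical.

```java
// Hypothetical demo class comparing the two patterns from this thread.
class TokenTagger {

    // Original attempt: the surrounding whitespace is part of each match.
    static String tagConsuming(String text) {
        return text.replaceAll("(\\s)([a-zA-Z]+)(\\s)", "$1<token>$2</token>$3");
    }

    // Lookaround version: whitespace is asserted but not consumed.
    static String tagWithLookaround(String text) {
        return text.replaceAll("(?<=\\s)([a-zA-Z]+)(?=\\s)", "<token>$1</token>");
    }

    public static void main(String[] args) {
        String text = " abc fg asd ";
        // prints " <token>abc</token> fg <token>asd</token> " (without the quotes)
        System.out.println(tagConsuming(text));
        // prints " <token>abc</token> <token>fg</token> <token>asd</token> " (without the quotes)
        System.out.println(tagWithLookaround(text));
    }
}
```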
If your tokens are only defined with a character class you don't need to describe what characters are around. So this should suffice since the regex engine walks from left to right and since the quantifier is greedy:
String regex = "[a-zA-Z]+";
text = text.replaceAll(regex, "<token>$0</token>");
// meaning not a space, 1+ times
String result = input.replaceAll("([^\\s]+)", "<token>$1</token>");
This matches everything that isn't whitespace, which is probably the best fit for what you need. It is also greedy, meaning it will never leave out a character that belongs to the match (it will never match just "as" in "asd" while another matching character follows).

Java RegEx replace all characters in string except for a word

I am using the code in Java:
String word = "hithere";
String str = "123hithere12345hi";
output(str.replaceAll("(?!"+word+")", "x"));
However, rather than outputting xxxhitherexxxxxxx like I want it to, it outputs x1x2x3hxixtxhxexrxex1x2x3x4x5xhxix x. I've tried a load of different regex patterns, but I can't seem to figure out how to do this :(
Any help would be much appreciated.
Well, this technically works. It uses only replaceAll on a single line, and it assumes your string does not contain the placeholder character (char)string.length(), which here is the rarely used ASCII control character BEL (7).
String string = "hithere";
String string2 = "asdfasdfasdfasdfhithereasasdf";
System.out.println(string2.replaceAll(string,"" + (char)string.length()).replaceAll("[^" + (char)string.length() + "]", "x").replaceAll("" + (char)string.length(), string));
I think this is what you're looking for, if I'm not mistaken:
String pattern = "(\\d)|(hi$)";
System.out.println("123hithere12345hi".replaceAll(pattern, "X"));
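The pattern replaces every digit and a trailing "hi" (the $ anchors it to the end of the string) with "X". Note that the two-character "hi" becomes a single "X".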
This lookaround based code will work for you:
String word = "hithere";
String string = "123hithere12345hi";
System.out.println(string.replaceAll(
".(?=.*?\\Q" + word + "\\E)|(?<=\\Q" + word + "\\E(.){0,99}).", "x"));
//=> xxxhitherexxxxxxx
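For comparison, the same result can be produced without any regex by scanning the string once. This is a sketch of an alternative technique, not the lookaround approach above; the class name WordMask and the method maskExcept are hypothetical.

```java
// Hypothetical helper; replaces every character outside occurrences of word.
class WordMask {

    // Keeps each occurrence of word and replaces every other character with 'x'.
    static String maskExcept(String text, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith(word, i)) {
                // Copy the whole word and jump past it.
                out.append(word);
                i += word.length();
            } else {
                out.append('x');
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints xxxhitherexxxxxxx
        System.out.println(maskExcept("123hithere12345hi", "hithere"));
    }
}
```

Unlike the {0,99} lookbehind, this version has no limit on how far a character may be from the preceding word.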

How to remove spaces in between the String

I have the String below:
str = "Book Your Domain And Get\n \n\n \n \n \n Online Today."
string = str.replace("\\s","").trim();
which returns
str = "Book Your Domain And Get Online Today."
But what I want is
str = "Book Your Domain And Get Online Today."
I have tried many regular expressions and also googled, but had no luck, and I didn't find a related question. Please help, many thanks in advance.
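Use \\s+ instead of \\s, as there are two or more consecutive whitespace characters in your input, and call replaceAll, because replace treats its argument as literal text rather than as a regex.
string = str.replaceAll("\\s+"," ");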
You can use replaceAll which takes a regex as parameter. And it seems like you want to replace multiple spaces with a single space. You can do it like this:
string = str.replaceAll("\\s{2,}"," ");
It will replace 2 or more consecutive whitespaces with a single whitespace.
First get rid of multiple spaces:
String after = before.trim().replaceAll(" +", " ");
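Putting the suggestions together, a minimal sketch might look like the following. The class name WhitespaceNormalizer and the method collapse are hypothetical; the main method also shows why the original replace call had no effect.

```java
// Hypothetical helper; collapses whitespace runs into single spaces.
class WhitespaceNormalizer {

    // Trims the ends and replaces each run of whitespace
    // (spaces, tabs, newlines) with one space.
    static String collapse(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String str = "Book Your Domain And Get\n \n\n \n \n \n Online Today.";
        // prints Book Your Domain And Get Online Today.
        System.out.println(collapse(str));
        // replace() looks for the literal two characters \s, which do not
        // occur in the input, so this prints true.
        System.out.println(str.replace("\\s", "").equals(str));
    }
}
```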
If you just want to remove the whitespace between two words or characters, and not at the ends of the string, here is the regex that I have used:
String s = " N OR 15 2 ";
Pattern pattern = Pattern.compile("[a-zA-Z0-9]\\s+[a-zA-Z0-9]", Pattern.CASE_INSENSITIVE);
Matcher m = pattern.matcher(s);
while (m.find()) {
// copy the matched text: character, whitespace, character
String replacestr = "";
int i = m.start();
while (i < m.end()) {
replacestr = replacestr + s.charAt(i);
i++;
}
// strip the whitespace from the match, then rescan the updated string
s = s.replace(replacestr, replacestr.replaceAll("\\s+", ""));
m = pattern.matcher(s);
}
System.out.println(s);
It only removes the whitespace between characters or words, not the whitespace at the ends,
and the output is
NOR152
Eg. to remove space between words in a string:
String example = "Interactive Resource";
System.out.println("Without space string: "+ example.replaceAll("\\s",""));
Output:
Without space string: InteractiveResource
If you want to print a String without spaces in Python, just add the argument sep='' to the print function, since this argument's default value is " ". (Java's System.out.println has no such argument.)
// use this to remove all the whitespace from a given string, for example a = " 1 2 3 4"
// output: 1234
a = a.replaceAll("\\s", "");
String s2=" 1 2 3 4 5 ";
String after=s2.replace(" ", "");
This works for me.
String string_a = "AAAA BBB";
String actualTooltip_3 = string_a.replaceAll("\\s{2,}"," ");
System.out.println(actualTooltip_3);
OUTPUT will be: AAAA BBB

how to check if a space is followed by a certain character?

I am really confused by this regex stuff. I have tried to understand it, but got nowhere.
Basically, I am trying to replace every space that is followed by any character other than a space with "PM".
" sd"
" sd"
however
" sd"
" sd"
This will replace the space and the following character with "PM":
String s = "123 axy cq23 dasd"; //your string
String newString = s.replaceAll(" [^ ]","PM");
Since I'm not sure if you want to replace only the space or the space and the following character, too, here is a slightly modified version that replaces only the space:
String s = "123 axy cq23 dasd"; //your string
String newString = s.replaceAll(" ([^ ])", "PM$1");
You need to use a lookahead (a non-capturing group such as (?:[^ ]) would still consume the following character):
String res = oldString.replaceAll(" (?=[^ ])", "PM");
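A lookahead is one way to replace only the space while leaving the following character untouched. The sketch below assumes that is the intended behaviour; the class name SpaceMarker and the method markSpaces are hypothetical.

```java
// Hypothetical helper; replaces each space that is followed by a non-space.
class SpaceMarker {

    // " (?=[^ ])" matches a single space only if the next character exists
    // and is not a space; the lookahead does not consume that character.
    static String markSpaces(String text) {
        return text.replaceAll(" (?=[^ ])", "PM");
    }

    public static void main(String[] args) {
        // prints 123PMaxyPMcq23PMdasd
        System.out.println(markSpaces("123 axy cq23 dasd"));
        // With two leading spaces only the second one is replaced:
        // prints " PMsd" (without the quotes)
        System.out.println(markSpaces("  sd"));
    }
}
```

A trailing space at the very end of the string is left alone, because no character follows it.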
