I want to remove all non-letter characters from a string, but I also need to check whether the word has a hyphen in it, and the replace deletes the hyphen.
Is there a way to do that after I replace everything that is not a letter, or do I have to check before replacing?
This is my code:
word = word.replaceAll("[^a-zA-Z]", "").toLowerCase();
Use the regex [^\w-], which means NOT (a word character or -).
public class Main {
public static void main(String[] args) {
// Test
String word = "Hello :) Hi, How are you doing? The Co-operative bank is open 2day!";
word = word.replaceAll("[^\\w-]", "").toLowerCase();
System.out.println(word);
}
}
Output:
hellohihowareyoudoingtheco-operativebankisopen2day
Note that a word character (i.e. \w) includes A-Za-z0-9_. If you want to keep only letters and hyphens, use [^A-Za-z\-] instead:
public class Main {
public static void main(String[] args) {
// Test
String word = "Hello :) Hi, How are you doing? The Co-operative bank is open 2day!";
word = word.replaceAll("[^A-Za-z\\-]", "").toLowerCase();
System.out.println(word);
}
}
Output:
hellohihowareyoudoingtheco-operativebankisopenday
I need to check if the word has a hyphen in it but the replace will delete the hyphen
So check if there is a hyphen before you strip non-alpha characters.
if(word.contains("-")) {
//do whatever
}
//remove non-alpha chars
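A minimal sketch of that order of operations. The class name HyphenCheck and its two helper methods are hypothetical, introduced only for illustration; the regex is the one from the question.

```java
public class HyphenCheck {
    // Hypothetical helper: true if the raw word still contains a hyphen.
    static boolean hasHyphen(String word) {
        return word.contains("-");
    }

    // Strips everything that is not an ASCII letter, then lower-cases the result.
    static String lettersOnly(String word) {
        return word.replaceAll("[^a-zA-Z]", "").toLowerCase();
    }

    public static void main(String[] args) {
        String word = "Co-operative!";
        boolean hyphenated = hasHyphen(word); // check first, while the hyphen is still there
        String cleaned = lettersOnly(word);   // then strip
        System.out.println(hyphenated + " " + cleaned); // prints "true cooperative"
    }
}
```

Storing the result of the check in a variable lets you strip the word right away and still branch on the hyphen later.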
Related
I want to replace a word starting with # in a string which contains set of words with the same word (# omitted)
example
"word1 word2 #user" should be replaced with "word1 word2 user"
Can someone help me?
You can use regex. Let's start with
yourText = yourText.replaceAll("#(\\S+)", "$1");
in regex:
\S represents any non-whitespace characters
+ represents one or more
\S+ represents one or more non-whitespace characters
(\S+) parentheses create a group containing one or more non-whitespace characters; this group will be indexed as 1
in replacement
$1 in replacement allows us to use content of group 1.
In other words, it will try to find # followed by non-whitespace characters and replace the whole match with just the non-whitespace part.
But this solution doesn't require # to be at the start of a word. To enforce that, we can check whether # is preceded by either
a whitespace character \s,
or the start of the string ^.
To test if something is before our element without actually including it in our match we can use look-behind (?<=...).
So our final solution can look like
yourText = yourText.replaceAll("(?<=^|\\s)#(\\S+)", "$1");
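To see the difference the look-behind makes, here is a small sketch that runs the final pattern on a few inputs. The class name HashStrip and the method stripLeadingHash are made up for this example.

```java
public class HashStrip {
    // Removes '#' only when it begins a word: at the start of the string or right after whitespace.
    static String stripLeadingHash(String text) {
        return text.replaceAll("(?<=^|\\s)#(\\S+)", "$1");
    }

    public static void main(String[] args) {
        System.out.println(stripLeadingHash("word1 word2 #user")); // word1 word2 user
        System.out.println(stripLeadingHash("#user word"));        // user word
        System.out.println(stripLeadingHash("mid#dle stays"));     // mid#dle stays
    }
}
```

The last input is left untouched because the # is preceded by a letter, so the look-behind fails.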
Yes, use String.replaceAll(). Note that this removes every # in the string, not only a leading one:
String foo = "#user";
foo = foo.replaceAll("#", "");
Your use case is not very clear, but here are my assumptions, with a code example:
remove every # with replaceAll
remove only the first character with substring
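import java.util.Arrays;
import java.util.List;
public class TestRegex {
public static void main(String[] args) {
// '#' at the start, in the middle, at the end, and more than once
String omitInStart = "#user";
String omitInMiddle = "us#er";
String omitInEnd = "user#";
String omitFewSymbols = "#us#er";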
List<String> listForOmit = Arrays.asList(omitInStart, omitInMiddle, omitInEnd, omitFewSymbols);
listForOmit.forEach(e -> System.out.println(omitWithReplace(e)));
listForOmit.forEach(e -> System.out.println(omitFirstSymbol(e)));
}
private static String omitFirstSymbol(String stringForOmit) {
return stringForOmit.substring(1);
}
private static String omitWithReplace(String stringForOmit) {
String symbolForOmit = "#";
return stringForOmit.replaceAll(symbolForOmit, "");
}
}
I am trying to split a string using the regular expression "[ .,?!]+'", which is meant to cover all of these characters, including a single space, but the split is not happening. Why?
Here's my class:
public class splitStr {
public static void main(String[] args) {
String S="He is a very very good boy, isn't he?";
S = S.trim(); // trim() returns a new String, so the result must be assigned
if(1<=S.length() && S.length()<=400000){
String delim ="[ .,?!]+'";
String []s=S.split(delim);
System.out.println(s.length);
for(String d:s)
{
System.out.println(d);
}
}
}
}
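The reason it's not working is that not all the delimiters are inside the square brackets. As written, the pattern only matches a run of those characters followed by an apostrophe, which never occurs in the input.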
String delim ="[ .,?!]+'"; // you wrote this
change to this:
String delim ="[ .,?!']";
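With a single-character class, adjacent delimiters such as ", " produce empty strings in the result. A sketch of one way around that, assuming you want a run of delimiters treated as one separator: keep the + quantifier, but place it after the closing bracket. The class name SplitDemo and the method tokens are hypothetical.

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits on one or more consecutive delimiter characters.
    static String[] tokens(String s) {
        return s.trim().split("[ .,?!']+");
    }

    public static void main(String[] args) {
        String[] parts = tokens("He is a very very good boy, isn't he?");
        System.out.println(parts.length);           // 10
        System.out.println(Arrays.toString(parts)); // [He, is, a, very, very, good, boy, isn, t, he]
    }
}
```

If the input starts with a delimiter, split still returns a leading empty string, so that case needs separate handling.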
Must the characters +, ', [ and ] be part of the split?
I'm asking because the plus sign and brackets have special meaning in regular expressions, and if you want them to be matched literally, they must be escaped with \
So, if you want an expression that includes all these characters, it should be:
delim = "[\\[ .,\\?!\\]\\+']";
Note that I had to write \\ because the backslash itself needs to be escaped inside Java strings. Inside square brackets, ? and + lose their special meaning, so the backslashes before them are optional.
I'm not in front of a computer right now, so I haven't tested it, but I believe it should work.
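A quick way to check the escaping question is to split the same input with and without the backslashes. This is only a sketch; the class name CharClassEscapes and its methods are invented for the test.

```java
public class CharClassEscapes {
    // '?' and '+' written unescaped inside a character class.
    static String[] splitPlain(String s) {
        return s.split("[?+]");
    }

    // The same class with both characters escaped.
    static String[] splitEscaped(String s) {
        return s.split("[\\?\\+]");
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", splitPlain("a?b+c")));   // a|b|c
        System.out.println(String.join("|", splitEscaped("a?b+c"))); // a|b|c
    }
}
```

Both variants behave the same, so the escapes inside the brackets are a matter of taste. The [ and ] characters are the ones worth escaping there.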
import java.util.*;
import java.util.stream.Collectors;
public class StringToken {
public static void main(String[] args) {
String S="He is a very very good boy, isn't he?";
S = S.trim(); // trim() returns a new String, so the result must be assigned
if(1<=S.length() && S.length()<=400000){
String delim = "[ .,?!']";
String []s=S.split(delim);
List<String> d = Arrays.asList(s);
d= d.stream().filter(item-> (item.length() > 0)).collect(Collectors.toList());
System.out.println(d.size());
for(String m:d)
{
System.out.println(m);
}
}
}
}
I want to check whether a word contains a special character and remove it. Let's say I have String word = "hello-there". I want to loop through it, check whether each character is a letter, and if it isn't, remove that character and join the rest of the word back together. So I want to turn hello-there into hellothere using regex. I have tried this, but I can't figure out how to test individual characters of a string against a regex.
public static void main(String[] args){
String word = "hello-there";
for(int i = 0; i < word.length(); i++)
{
if(word.charAt(i).matches("^[a-zA-Z]+"))
But the last if statement doesn't work. Does anybody know how to fix this?
You may use the following regex, which matches any run of characters that are not lower-case or upper-case ASCII letters.
[^a-zA-Z]+
In Java:
class RegEx {
public static void main(String[] args) {
String s = "hello-there";
String r = "[^a-zA-Z]+";
String o = s.replaceAll(r, "");
System.out.println(o); //-> hellothere
}
}
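The loop in the question fails to compile because charAt returns a char, and a primitive char has no matches method. If you do want the character-by-character version, one possible sketch uses Character.isLetter instead of a regex. The class name LettersOnly and the method keepLetters are hypothetical. Note that isLetter also accepts non-ASCII letters, so it is slightly broader than [a-zA-Z].

```java
public class LettersOnly {
    // Keeps only the characters that Character.isLetter accepts.
    static String keepLetters(String word) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (Character.isLetter(c)) {
                sb.append(c); // letters are kept, everything else is skipped
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepLetters("hello-there")); // hellothere
    }
}
```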
I have the String value below. How can I find only these four kinds of special characters in it: [] (square brackets), {} (curly brackets), - (hyphen) and : (colon)?
String str = "[1-10],{10-20},dhoni:kholi";
Kindly help me as I am new to Java.
I think you can use a regular expression like this.
class MyRegex
{
public static void main (String[] args) throws java.lang.Exception
{
String str = "[1-10],{10-20},dhoni:kholi";
String text = str.replaceAll("[a-zA-Z0-9]",""); // remove all digits and letters; other characters such as commas remain
System.out.print(text); // prints [-],{-},:
}
}
Hope this will help you.
If you only want to keep those specific characters, you can use the String.replaceAll method with a negated character class:
System.out.println("[Hello {}:-,World]".replaceAll("[^\\]\\[:\\-{}]", ""));
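If the goal is to find out which of those characters occur, instead of producing a stripped string, a Matcher loop is one option. The following is a sketch under that assumption; the class name SpecialFinder and the method findSpecial are illustrative, not from the answers above.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpecialFinder {
    // Matches exactly one of: [ ] { } : -
    private static final Pattern SPECIAL = Pattern.compile("[\\[\\]{}:\\-]");

    // Returns the distinct special characters in order of first appearance.
    static Set<Character> findSpecial(String s) {
        Set<Character> found = new LinkedHashSet<>();
        Matcher m = SPECIAL.matcher(s);
        while (m.find()) {
            found.add(m.group().charAt(0));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findSpecial("[1-10],{10-20},dhoni:kholi")); // [[, -, ], {, }, :]
    }
}
```

Calling m.start() inside the loop would give the position of each match as well, if you need it.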
I have strings like:
Alian 12WE
and
ANI1451
Is there any way to replace all the numbers (and everything after the numbers) with an empty string in Java?
I want the output to look like this:
Alian
ANI
With a regex, it's pretty simple:
public class Test {
public static String replaceAll(String string) {
return string.replaceAll("\\d+.*", "");
}
public static void main(String[] args) {
System.out.println(replaceAll("Alian 12WE"));
System.out.println(replaceAll("ANI1451"));
}
}
You could use a regex to remove everything from the first digit onwards, something like:
String s = "Alian 12WE";
s = s.replaceAll("\\d+.*", "");
\\d+ finds one or more consecutive digits
.* matches any characters after the digits
Use a regex:
"Alian 12WE".split("\\d")[0] // Splits the string at digits and takes the first part.
Or replace "\\d.*$" with "" (using .* rather than .+ so that a string ending in a single digit is also handled)
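One detail the patterns above leave open: for "Alian 12WE" they return "Alian " with a trailing space, because the space before the digits is not part of the match. A small sketch that also swallows that whitespace, assuming the trimmed form is what you want (the class name DigitCut and the method cutAtDigits are made up):

```java
public class DigitCut {
    // Removes the first digit, everything after it, and any whitespace directly before it.
    static String cutAtDigits(String s) {
        return s.replaceAll("\\s*\\d.*", "");
    }

    public static void main(String[] args) {
        System.out.println("[" + cutAtDigits("Alian 12WE") + "]"); // [Alian]
        System.out.println("[" + cutAtDigits("ANI1451") + "]");    // [ANI]
        System.out.println("[" + cutAtDigits("NoDigits") + "]");   // [NoDigits]
    }
}
```

Calling trim() on the result of the original pattern gives the same output for these inputs.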