Checking for permutations of a space-containing string in Java

How would I check to ensure that a string does not contain any permutations of the space character in Java?
i.e., I want to check if my string equals "", " ", "  ", etc.
I do not want to just check if my string contains a space, because if I had a string "My code works", it should not be parsed. I want to check only for strings that contain exclusively spaces, and no letters or other characters whatsoever.
Is this possible to do in a single if-statement?

Use String#matches() with the pattern \s*. This pattern matches the empty string or any amount of contiguous whitespace.
String input = " ";
if (input.matches("\\s*")) {
System.out.println("match");
}

If possible, you can also use StringUtils.isBlank(aString) provided by Apache Commons Lang.
Checks if a CharSequence is whitespace, empty ("") or null.
StringUtils.isBlank(null) = true
StringUtils.isBlank("") = true
StringUtils.isBlank(" ") = true
StringUtils.isBlank(" ") = true
StringUtils.isBlank("bob") = false
StringUtils.isBlank(" bob ") = false

One more way:
if(input.trim().length() == 0) {
...
}
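The checks suggested so far can be put side by side. This is only a sketch: the class and method names (BlankChecks, isBlankRegex, isBlankTrim, isBlankLoop) are made up for illustration. All three return false for null here, unlike StringUtils.isBlank, which returns true for null.

```java
public class BlankChecks {
    // Regex check: the empty string, or only characters matched by \s.
    public static boolean isBlankRegex(String s) {
        return s != null && s.matches("\\s*");
    }

    // trim() strips leading and trailing characters with code point <= U+0020.
    public static boolean isBlankTrim(String s) {
        return s != null && s.trim().isEmpty();
    }

    // Plain loop: no regex and no temporary string.
    public static boolean isBlankLoop(String s) {
        if (s == null) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", " ", "   ", "My code works"}) {
            System.out.println("[" + s + "] regex=" + isBlankRegex(s)
                    + " trim=" + isBlankTrim(s) + " loop=" + isBlankLoop(s));
        }
    }
}
```

Any of the three fits in a single if-statement, which is what the question asked for.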

This worked best for me:
// Remove every space character; if nothing is left, the string held only spaces.
string = string.replace(" ", "");
if (string.isEmpty()) {
//do something
}

Related

Java regex: Replace all characters with `+` except instances of a given string

I have the following problem which states
Replace all characters in a string with + symbol except instances of the given string in the method
so for example if the string given was abc123efg and they want me to replace every character except every instance of 123 then it would become +++123+++.
I figured a regular expression is probably the best for this and I came up with this.
str.replaceAll("[^str]","+")
where str is a variable, but it's not letting me use the method without putting it in quotation marks. If I just want to replace the variable string str, how can I do that? I ran it with the string manually typed and it worked, but can I just pass in a variable?
As of right now, I believe it's looking for the string "str" and not the variable string.
Here is the output; it's right for so many cases except for two :(
List of open test cases:
plusOut("12xy34", "xy") → "++xy++"
plusOut("12xy34", "1") → "1+++++"
plusOut("12xy34xyabcxy", "xy") → "++xy++xy+++xy"
plusOut("abXYabcXYZ", "ab") → "ab++ab++++"
plusOut("abXYabcXYZ", "abc") → "++++abc+++"
plusOut("abXYabcXYZ", "XY") → "++XY+++XY+"
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
plusOut("--++ab", "++") → "++++++"
plusOut("aaxxxxbb", "xx") → "++xxxx++"
plusOut("123123", "3") → "++3++3"
Looks like this is the plusOut problem on CodingBat.
I had 3 solutions to this problem, and wrote a new streaming solution just for fun.
Solution 1: Loop and check
Create a StringBuilder out of the input string, and check for the word at every position. Replace the character if it doesn't match, and skip the length of the word if found.
public String plusOut(String str, String word) {
StringBuilder out = new StringBuilder(str);
for (int i = 0; i < out.length(); ) {
if (!str.startsWith(word, i))
out.setCharAt(i++, '+');
else
i += word.length();
}
return out.toString();
}
This is probably the expected answer for a beginner programmer, though there is an assumption that the string doesn't contain any astral plane character, which would be represented by 2 char instead of 1.
Solution 2: Replace the word with a marker, replace the rest, then restore the word
public String plusOut(String str, String word) {
return str.replaceAll(java.util.regex.Pattern.quote(word), "#").replaceAll("[^#]", "+").replaceAll("#", word);
}
Not a proper solution, since it assumes that a certain character or sequence of characters doesn't appear in the string.
Note the use of Pattern.quote to prevent the word being interpreted as regex syntax by replaceAll method.
Solution 3: Regex with \G
public String plusOut(String str, String word) {
word = java.util.regex.Pattern.quote(word);
return str.replaceAll("\\G((?:" + word + ")*+).", "$1+");
}
Construct regex \G((?:word)*+)., which does more or less what solution 1 is doing:
\G makes sure the match starts from where the previous match leaves off
((?:word)*+) picks out 0 or more instances of word, if any, so that we can keep them in the replacement with $1. The key here is the possessive quantifier *+, which forces the regex to keep any instance of the word it finds. Otherwise, the regex will not work correctly when the word appears at the end of the string, as the regex backtracks to match .
. will not be part of any word, since the previous part already picks out all consecutive appearances of word and disallows backtracking. We will replace this with +
Solution 4: Streaming
public String plusOut(String str, String word) {
return String.join(word,
Arrays.stream(str.split(java.util.regex.Pattern.quote(word), -1))
.map((String s) -> s.replaceAll("(?s:.)", "+"))
.collect(Collectors.toList()));
}
The idea is to split the string by word, do the replacement on the rest, and join them back with word using String.join method.
Same as above, we need Pattern.quote to avoid split interpreting the word as regex. Since split by default removes empty string at the end of the array, we need to use -1 in the second parameter to make split leave those empty strings alone.
Then we create a stream out of the array and replace the rest as strings of +. In Java 11, we can use s -> "+".repeat(s.length()) instead.
The rest is just converting the Stream to an Iterable (List in this case) and joining them for the result
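For reference, a self-contained variant of the split-and-join idea with its imports. It is a sketch rather than the code above verbatim: it collects with Collectors.joining instead of String.join, the class name PlusOutSplit is made up, and it assumes word is non-empty.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PlusOutSplit {
    public static String plusOut(String str, String word) {
        // Quote the word so characters such as '+' are matched literally;
        // the -1 limit keeps trailing empty pieces.
        return Arrays.stream(str.split(Pattern.quote(word), -1))
                // Turn every character of the in-between pieces into '+'.
                .map(piece -> piece.replaceAll("(?s).", "+"))
                // Put the word back between the pieces.
                .collect(Collectors.joining(word));
    }

    public static void main(String[] args) {
        System.out.println(plusOut("12xy34xyabcxy", "xy"));
        System.out.println(plusOut("--++ab", "++"));
        System.out.println(plusOut("aaxxxxbb", "xx"));
    }
}
```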
This is a bit trickier than you might initially think because you don't just need to match characters, but the absence of specific phrase - a negated character set is not enough. If the string is 123, you would need:
(?<=^|123)(?!123).*?(?=123|$)
https://regex101.com/r/EZWMqM/1/
That is - lookbehind for the start of the string or "123", make sure the current position is not followed by 123, then lazy-repeat any character until lookahead matches "123" or the end of the string. This will match all characters which are not in a "123" substring. Then, you need to replace each character with a +, after which you can use appendReplacement and a StringBuffer to create the result string:
String inputPhrase = "123";
String inputStr = "abc123efg123123hij";
StringBuffer resultString = new StringBuffer();
Pattern regex = Pattern.compile("(?<=^|" + inputPhrase + ")(?!" + inputPhrase + ").*?(?=" + inputPhrase + "|$)");
Matcher m = regex.matcher(inputStr);
while (m.find()) {
String replacement = m.group(0).replaceAll(".", "+");
m.appendReplacement(resultString, replacement);
}
m.appendTail(resultString);
System.out.println(resultString.toString());
Output:
+++123+++123123+++
Note that if the inputPhrase can contain characters with a special meaning in a regular expression, you'll have to escape them (for example with Pattern.quote) before concatenating them into the pattern.
You can do it in one line:
input = input.replaceAll("((?:" + str + ")+)?(?!" + str + ").((?:" + str + ")+)?", "$1+$2");
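This optionally captures "123" on either side of each character and puts them back (a blank if there's no "123"). Note that str is concatenated into the pattern unescaped, so wrap it in Pattern.quote if it can contain regex metacharacters.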
Instead of coming up with a regular expression that matches the absence of a string, we might as well match the selected phrase and append a + for each skipped character.
StringBuilder sb = new StringBuilder();
Matcher m = Pattern.compile(Pattern.quote(str)).matcher(input);
while (m.find()) {
// pad with '+' from the end of what has been built so far up to this match
for (int i = sb.length(); i < m.start(); i++) sb.append('+');
sb.append(str);
}
// pad whatever follows the last match
int remaining = input.length() - sb.length();
for (int i = 0; i < remaining; i++) {
sb.append('+');
}
Absolutely just for the fun of it, a solution using CharBuffer (unexpectedly, it took a lot more than I initially hoped for):
private static String plusOutCharBuffer(String input, String match) {
int size = match.length();
CharBuffer cb = CharBuffer.wrap(input.toCharArray());
CharBuffer word = CharBuffer.wrap(match);
int x = 0;
for (; cb.remaining() > 0;) {
if (!cb.subSequence(0, size < cb.remaining() ? size : cb.remaining()).equals(word)) {
cb.put(x, '+');
cb.clear().position(++x);
} else {
cb.clear().position(x = x + size);
}
}
return cb.clear().toString();
}
To make this work you need a beast of a pattern. Let's say you are operating on the following test case as an example:
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
What you need to do is build a series of clauses in your pattern to match a single character at a time:
Any character that is NOT "X", "Y" or "Z" -- [^XYZ]
Any "X" not followed by "YZ" -- X(?!YZ)
Any "Y" not preceded by "X" -- (?<!X)Y
Any "Y" not followed by "Z" -- Y(?!Z)
Any "Z" not preceded by "XY" -- (?<!XY)Z
An example of this replacement can be found here: https://regex101.com/r/jK5wU3/4
Here is an example of how this might work (most certainly not optimized, but it works):
import java.util.regex.Pattern;
public class Test {
public static void plusOut(String text, String exclude) {
StringBuilder pattern = new StringBuilder("");
for (int i=0; i<exclude.length(); i++) {
Character target = exclude.charAt(i);
String prefix = (i > 0) ? exclude.substring(0, i) : "";
String postfix = (i < exclude.length() - 1) ? exclude.substring(i+1) : "";
// add the look-behind (?<!X)Y
if (!prefix.isEmpty()) {
pattern.append("(?<!").append(Pattern.quote(prefix)).append(")")
.append(Pattern.quote(target.toString())).append("|");
}
// add the look-ahead X(?!YZ)
if (!postfix.isEmpty()) {
pattern.append(Pattern.quote(target.toString()))
.append("(?!").append(Pattern.quote(postfix)).append(")|");
}
}
// add in the other character exclusion
pattern.append("[^" + Pattern.quote(exclude) + "]");
System.out.println(text.replaceAll(pattern.toString(), "+"));
}
public static void main(String [] args) {
plusOut("12xy34", "xy");
plusOut("12xy34", "1");
plusOut("12xy34xyabcxy", "xy");
plusOut("abXYabcXYZ", "ab");
plusOut("abXYabcXYZ", "abc");
plusOut("abXYabcXYZ", "XY");
plusOut("abXYxyzXYZ", "XYZ");
plusOut("--++ab", "++");
plusOut("aaxxxxbb", "xx");
plusOut("123123", "3");
}
}
UPDATE: Even this doesn't quite work because it can't deal with exclusions that are just repeated characters, like "xx". Regular expressions are most definitely not the right tool for this, but I thought it might be possible. After poking around, I'm not so sure a pattern even exists that might make this work.
The problem with your solution is that str.replaceAll("[^str]","+") uses a character class: it excludes every individual character listed in the class, not the sequence, so it will not solve your problem.
For example, with the input "abXYxyzXYZ", str.replaceAll("[^XYZ]","+") leaves every X, Y and Z character untouched, so you get "++XY+++XYZ".
What you actually need is to exclude a sequence of characters in str.replaceAll.
You can do that by grouping the characters like (XYZ) and then using a negative lookahead to match a string which does not contain the character sequence: ^((?!XYZ).)*$
Check this solution for more info about this problem, but be aware that it may be complicated to find a regular expression that does this directly.
I have found two simple solutions for this problem :
Solution 1:
You can implement a method to replace all characters with '+' except the instance of given string:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
for(int i = 0; i < str.length(); i++){
// exclude any instance string of exWord from replacing process in str
if(str.substring(i, str.length()).indexOf(exWord) + i == i){
i = i + exWord.length()-1;
}
else{
str = str.substring(0,i) + "+" + str.substring(i+1);//replace each character with '+' symbol
}
}
Note: the condition str.substring(i, str.length()).indexOf(exWord) + i == i is true exactly when exWord starts at position i, so every instance of exWord is excluded from the replacement in str.
Output:
+++++++XYZ
Solution 2:
You can try this approach using the replaceAll method; it doesn't need any complex regular expression:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
str = str.replaceAll(exWord,"*"); // replace instance string with * symbol
str = str.replaceAll("[^*]","+"); // replace all characters with + symbol except *
str = str.replaceAll("\\*",exWord); // replace * symbol with instance string
Note: This solution works only if your input string str doesn't contain any * symbol.
You should also escape any character with a special meaning in a regular expression in the word exWord, for example when exWord = "++".
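Both caveats can be softened a little. The sketch below is one possible variation, not the answer's own code: it quotes the word with Pattern.quote so "++" is matched literally, and it uses a private-use character as the marker instead of *. The class name PlusOutMarker and the choice of marker are assumptions for illustration; the input is still assumed not to contain the marker, which the method checks.

```java
import java.util.regex.Pattern;

public class PlusOutMarker {
    // Hypothetical marker: a private-use character assumed absent from the input.
    private static final String MARKER = "\uE000";

    public static String plusOut(String str, String exWord) {
        if (str.contains(MARKER)) {
            throw new IllegalArgumentException("input already contains the marker");
        }
        return str
                .replaceAll(Pattern.quote(exWord), MARKER) // word -> marker, matched literally
                .replaceAll("[^" + MARKER + "]", "+")      // everything else -> '+'
                .replace(MARKER, exWord);                  // marker -> word, no regex involved
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abXYxyzXYZ", "XYZ"));
        System.out.println(plusOut("--++ab", "++"));
    }
}
```

The last step uses String.replace rather than replaceAll so that the word is inserted literally, without any replacement-string escaping.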

Replace characters and keep only one of these characters

Can someone help me here? I don't understand where the problem is...
I need to check whether a String has more than one occurrence of a char like 'a'; if so, I need to replace all 'a' with an empty space, but I still want to keep exactly one 'a'.
String text = "aaaasomethingsomethingaaaa";
for (char c: text.toCharArray()) {
if (c == 'a') {
count_A++;//8
if (count_A > 1) {//yes
//app crash at this point
do {
text.replace("a", "");
} while (count_A != 1);
}
}
}
The application stops working when it enters the while loop. Any suggestions? Thank you very much!
If you want to replace every a in the string except for the last one then you may try the following regex option:
String text = "aaaasomethingsomethingaaaa";
text = text.replaceAll("a(?=.*a)", " ");
somethingsomething a
Edit:
If you really want to remove every a except for the last one, then use this:
String text = "aaaasomethingsomethingaaaa";
text = text.replaceAll("a(?=.*a)", "");
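As for why the original loop hangs: Strings are immutable, so text.replace("a", "") returns a new string that the loop throws away, and count_A never changes inside the do/while, so its condition stays true forever. A regex-free sketch of the same "keep only the last one" behaviour follows; the class and method names (KeepLastOccurrence, keepLast) are made up for illustration.

```java
public class KeepLastOccurrence {
    // Removes every occurrence of c except the last one.
    public static String keepLast(String text, char c) {
        int last = text.lastIndexOf(c);
        if (last < 0) {
            return text; // c does not occur, nothing to remove
        }
        // replace() returns a new string; the result has to be used.
        String head = text.substring(0, last).replace(String.valueOf(c), "");
        return head + text.substring(last);
    }

    public static void main(String[] args) {
        System.out.println(keepLast("aaaasomethingsomethingaaaa", 'a')); // somethingsomethinga
    }
}
```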
You can also do it like
String str = new String ("asomethingsomethingaaaa");
int firstIndex = str.indexOf("a");
firstIndex++;
String firstPart = str.substring(0, firstIndex);
String secondPart = str.substring(firstIndex);
System.out.println(firstPart + secondPart.replace("a", ""));
Maybe I'm wrong here, but I have a feeling you're talking about runs of any single character within a string. If this is the case, then you can just use a little method like this:
public String removeCharacterRuns(String inputString) {
return inputString.replaceAll("([a-zA-Z])\\1{2,}", "$1");
}
To use this method:
String text = "aaaasomethingsomethingaaaa";
System.out.println(removeCharacterRuns(text));
The console output is:
asomethingsomethinga
Or perhaps even:
String text = "FFFFFFFourrrrrrrrrrrty TTTTTwwwwwwooo --> is the answer to: "
+ "The Meeeeeaniiiing of liiiiife, The UUUniveeeerse and "
+ "Evvvvverything.";
System.out.println(removeCharacterRuns(text));
The console output is:
Fourty Two --> is the answer to: The Meaning of life, The Universe and Everything.
The Regular Expression used within the provided removeCharacterRuns() method was actually borrowed from the answers provided within this SO Post.
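Regular Expression Explanation: ([a-zA-Z]) captures a single letter, \1{2,} then requires that same letter to repeat at least two more times, and the whole run is replaced with the one captured letter ($1). Runs of fewer than three identical letters are left alone.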
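Because \1{2,} asks for at least two repeats after the captured letter, the method above only collapses runs of three or more, which is why a word like "book" survives. If the threshold should be adjustable, a variation could look like the sketch below; the minRun parameter and the names CharacterRuns and collapseRuns are assumptions for illustration, not part of the borrowed pattern.

```java
public class CharacterRuns {
    // Collapses runs of at least minRun identical letters into a single letter.
    public static String collapseRuns(String input, int minRun) {
        if (minRun < 2) {
            throw new IllegalArgumentException("minRun must be 2 or more");
        }
        // One captured letter followed by at least (minRun - 1) repeats of it.
        String regex = "([a-zA-Z])\\1{" + (minRun - 1) + ",}";
        return input.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseRuns("aaaasomethingsomethingaaaa", 3));
        System.out.println(collapseRuns("book", 3));
        System.out.println(collapseRuns("book", 2));
    }
}
```

With minRun set to 3 this behaves like removeCharacterRuns(); with 2 it also collapses doubled letters, which is rarely what you want for ordinary English text.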

How to check if all characters in a String are all letters?

I'm able to separate the words in the sentence but I do not know how to check if a word contains a character other than a letter. You don't have to post an answer just some material I could read to help me.
public static void main(String args [])
{
String sentance;
String word;
int index = 1;
System.out.println("Enter sentance please");
sentance = EasyIn.getString();
String[] words = sentance.split(" ");
for ( String ss : words )
{
System.out.println("Word " + index + " is " + ss);
index++;
}
}
What I would do is use String#matches and use the regex [a-zA-Z]+.
String hello = "Hello!";
String hello1 = "Hello";
System.out.println(hello.matches("[a-zA-Z]+")); // false
System.out.println(hello1.matches("[a-zA-Z]+")); // true
Another solution is if (Character.isLetter(str.charAt(i))) inside a loop.
Another solution is something like this
String set = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
String word = "Hello!";
boolean notLetterFound = false;
for (char c : word.toCharArray()){ // loop through string as character array
if (set.indexOf(c) < 0){ // if a character is not found in the set
notLetterFound = true; // make notLetterFound true and break the loop
break;
}
}
if (notLetterFound){ // notLetterFound is true, do something
// do something
}
I prefer the first answer though, using String#matches
For more reference, see: How to determine if a String has non-alphanumeric characters?
Change the pattern there to "[^a-zA-Z]" so that it finds any character that is not a letter.
Not sure if I understand your question, but there is
Character.isLetter(c);
You would iterate over all characters in your string and check whether each one is a letter (there are other "isXxxxx" methods in the Character class, such as Character.isAlphabetic).
You could loop through the characters in the word calling Character.isLetter(), or check whether it matches a regular expression such as [a-zA-Z]* (this matches the word only if it consists entirely of letters; note that \w would also accept digits and underscores).
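A compact sketch of the Character.isLetter() loop using a stream over the characters. The names LetterCheck and isAllLetters are made up for illustration, and the method treats an empty word as "not all letters", which is an assumption about what you want.

```java
public class LetterCheck {
    // True only if the word is non-empty and every char is a letter.
    public static boolean isAllLetters(String word) {
        return !word.isEmpty() && word.chars().allMatch(Character::isLetter);
    }

    public static void main(String[] args) {
        for (String w : "Hello Hello! abc_1 word".split(" ")) {
            System.out.println(w + " -> " + isAllLetters(w));
        }
        // \w also accepts digits and underscores, so it is not a letters-only test.
        System.out.println("abc_1".matches("\\w*"));
    }
}
```

Unlike [a-zA-Z]+, Character.isLetter also accepts non-ASCII letters such as accented characters.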
You can use a character array to do this, like:
char[] a=ss.toCharArray();
Now you can get the character at a particular index,
with "word "+index+" is "+a[index];

Regex for specifying an empty string

I use a validator that requires a regex to be specified. In the case of validating against an empty string, I don't know how to generate such a regex. What regex can I use to match the empty string?
The regex ^$ matches only empty strings (i.e. strings of length 0). Here ^ and $ are the beginning and end of the string anchors, respectively.
If you need to check if a string contains only whitespaces, you can use ^\s*$. Note that \s is the shorthand for the whitespace character class.
Finally, in Java, matches attempts to match against the entire string, so you can omit the anchors should you choose to.
References
regular-expressions.info/Character classes and Anchors
API references
String.matches, Pattern.matches and Matcher.matches
Non-regex solution
You can also use String.isEmpty() to check if a string has length 0. If you want to see if a string contains only whitespace characters, then you can trim() it first and then check if it's isEmpty().
I don't know about Java specifically, but ^$ usually works (^ matches only at the start of the string, $ only at the end).
If you have to use regexp in Java for checking empty string you can simply use
testString.matches("")
please see examples:
String testString = "";
System.out.println(testString.matches(""));
or for checking if only white-spaces:
String testString = " ";
testString.trim().matches("");
but anyway using
testString.isEmpty();
testString.trim().isEmpty();
should be better from performance perspective.
public static void main(String[] args) {
String testString = "";
long startTime = System.currentTimeMillis();
for (int i =1; i <100000000; i++) {
// 50% of testStrings are empty.
if ((int)Math.round( Math.random()) == 0) {
testString = "";
} else {
testString = "abcd";
}
if (!testString.isEmpty()){
testString.matches("");
}
}
long endTime = System.currentTimeMillis();
System.out.println("Total testString.empty() execution time: " + (endTime-startTime) + "ms");
startTime = System.currentTimeMillis();
for (int i =1; i <100000000; i++) {
// 50% of testStrings are empty.
if ((int)Math.round( Math.random()) == 0) {
testString = "";
} else {
testString = "abcd";
}
testString.matches("");
}
endTime = System.currentTimeMillis();
System.out.println("Total testString.matches execution time: " + (endTime-startTime) + "ms");
}
Output:
C:\Java\jdk1.8.0_221\bin\java.exe
Total testString.empty() execution time: 11023ms
Total testString.matches execution time: 17831ms
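Part of the cost of String.matches is that it compiles its regex on every call. If a regex is required anyway (as with the validator in the question), compiling it once and reusing the Pattern avoids that repeated work. A minimal sketch, with made-up names (EmptyChecks, isBlankPrecompiled, isBlankNoRegex) and no timing claims:

```java
import java.util.regex.Pattern;

public class EmptyChecks {
    // Compiled once and reused for every check.
    private static final Pattern BLANK = Pattern.compile("\\s*");

    // Regex version: empty or whitespace only.
    public static boolean isBlankPrecompiled(String s) {
        return BLANK.matcher(s).matches();
    }

    // Non-regex version from the answer above.
    public static boolean isBlankNoRegex(String s) {
        return s.trim().isEmpty();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "   ", "abcd"}) {
            System.out.println("[" + s + "] " + isBlankPrecompiled(s) + " " + isBlankNoRegex(s));
        }
    }
}
```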
For checking an empty string, I guess there is no need for a regex at all.
You can check the length of the string directly.
In many cases the empty string and null are checked together for extra safety,
like string != null && string.length() > 0 (the null check must come first, otherwise length() throws a NullPointerException).

How can I trim beginning and ending double quotes from a string?

I would like to trim a beginning and ending double quote (") from a string.
How can I achieve that in Java? Thanks!
You can use String#replaceAll() with a pattern of ^\"|\"$ for this.
E.g.
string = string.replaceAll("^\"|\"$", "");
To learn more about regular expressions, have a look at http://regular-expression.info.
That said, this smells a bit like that you're trying to invent a CSV parser. If so, I'd suggest to look around for existing libraries, such as OpenCSV.
To remove the first character and last character from the string, use:
myString = myString.substring(1, myString.length()-1);
Also with Apache StringUtils.strip():
StringUtils.strip(null, *) = null
StringUtils.strip("", *) = ""
StringUtils.strip("abc", null) = "abc"
StringUtils.strip(" abc", null) = "abc"
StringUtils.strip("abc ", null) = "abc"
StringUtils.strip(" abc ", null) = "abc"
StringUtils.strip(" abcyx", "xyz") = " abc"
So,
final String SchrodingersQuotedString = "may or may not be quoted";
StringUtils.strip(SchrodingersQuotedString, "\""); //quoted no more
This method works both with quoted and unquoted strings as shown in my example. The only downside is, it will not look for strictly matched quotes, only leading and trailing quote characters (ie. no distinction between "partially and "fully" quoted strings).
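The difference between "loose" stripping and strictly matched quotes can be seen side by side. This is a sketch with made-up names (Unquote, stripLoose, unquoteStrict); the loose variant uses the ^\"|\"$ regex from the first answer rather than StringUtils.strip, so no library is needed.

```java
public class Unquote {
    // Loose: removes a leading quote and/or a trailing quote independently.
    public static String stripLoose(String s) {
        return s.replaceAll("^\"|\"$", "");
    }

    // Strict: removes the quotes only when the string both starts and ends with one.
    public static String unquoteStrict(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"\"abc\"", "\"abc", "abc"}) {
            System.out.println(s + " | loose: " + stripLoose(s) + " | strict: " + unquoteStrict(s));
        }
    }
}
```

Which one is right depends on whether a half-quoted value such as "abc should count as quoted in your data.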
If the double quotes only exist at the beginning and the end, code as simple as this would work:
string = string.replace("\"", "");
Kotlin
In Kotlin you can use String.removeSurrounding(delimiter: CharSequence)
E.g.
string.removeSurrounding("\"")
Removes the given delimiter string from both the start and the end of this string if and only if it starts with and ends with the delimiter.
Otherwise returns this string unchanged.
The source code looks like this:
public fun String.removeSurrounding(delimiter: CharSequence): String = removeSurrounding(delimiter, delimiter)
public fun String.removeSurrounding(prefix: CharSequence, suffix: CharSequence): String {
if ((length >= prefix.length + suffix.length) && startsWith(prefix) && endsWith(suffix)) {
return substring(prefix.length, length - suffix.length)
}
return this
}
This is the best way I found to strip double quotes from the beginning and end of a string (note that this is JavaScript syntax, not Java):
someString.replace (/(^")|("$)/g, '')
First, we check to see if the String is double quoted, and if so, remove the quotes. You can skip the conditional if you already know it's double quoted.
if (string.length() >= 2 && string.charAt(0) == '"' && string.charAt(string.length() - 1) == '"')
{
string = string.substring(1, string.length() - 1);
}
Using Guava you can write more elegantly CharMatcher.is('\"').trimFrom(mystring);
I am using something as simple as this :
if(str.startsWith("\"") && str.endsWith("\""))
{
str = str.substring(1, str.length()-1);
}
To remove one or more double quotes from the start and end of a string in Java, you can use a regex-based solution:
String result = input_str.replaceAll("^\"+|\"+$", "");
If you need to also remove single quotes:
String result = input_str.replaceAll("^[\"']+|[\"']+$", "");
NOTE: If your string contains " inside, this approach might lead to issues (e.g. "Name": "John" => Name": "John).
See a Java demo here:
String input_str = "\"'some string'\"";
String result = input_str.replaceAll("^[\"']+|[\"']+$", "");
System.out.println(result); // => some string
Edited: Just realized that I should specify that this works only if both of them exists. Otherwise the string is not considered quoted. Such scenario appeared for me when working with CSV files.
org.apache.commons.lang3.StringUtils.unwrap("\"abc\"", "\"") = "abc"
org.apache.commons.lang3.StringUtils.unwrap("\"abc", "\"") = "\"abc"
org.apache.commons.lang3.StringUtils.unwrap("abc\"", "\"") = "abc\""
The pattern below, when used with java.util.regex.Matcher, will match any string between double quotes without affecting occurrences of double quotes inside the string:
"[^\"][\\p{Print}]*[^\"]"
Matcher m = Pattern.compile("^\"(.*)\"$").matcher(value);
String strUnquoted = value;
if (m.find()) {
strUnquoted = m.group(1);
}
Modifying @brcolow's answer a bit:
if (string != null && string.length() >= 2 && string.startsWith("\"") && string.endsWith("\"")) {
string = string.substring(1, string.length() - 1);
}
private static String removeQuotesFromStartAndEndOfString(String inputStr) {
String result = inputStr;
int firstQuote = inputStr.indexOf('\"');
int lastQuote = result.lastIndexOf('\"');
int strLength = inputStr.length();
if (firstQuote == 0 && lastQuote == strLength - 1) {
result = result.substring(1, strLength - 1);
}
return result;
}
Find every double quote and replace it with an empty string. Note that this removes quotes inside the string as well, not just the leading and trailing ones.
public String removeDoubleQuotes(String request) {
return request.replace("\"", "");
}
Groovy
You can subtract a substring from a string using a regular expression in groovy:
String unquotedString = theString - ~/^"/ - ~/"$/
Scala
s.stripPrefix("\"").stripSuffix("\"")
This works regardless of whether the string has or does not have quotes at the start and / or end.
Edit: Sorry, Scala only
