CodingBat - Java - Warmup-2 - "stringYak" algorithm


I'm presently trying to understand a particular algorithm at the CodingBat platform.
Here's the problem presented by CodingBat:
*Suppose the string "yak" is unlucky. Given a string, return a version where all the "yak" are removed, but the "a" can be any char. The "yak" strings will not overlap.
Example outputs:
stringYak("yakpak") → "pak"
stringYak("pakyak") → "pak"
stringYak("yak123ya") → "123ya"*
Here's the official code solution:
public String stringYak(String str) {
String result = "";
for (int i=0; i<str.length(); i++) {
// Look for i starting a "yak" -- advance i in that case
if (i+2<str.length() && str.charAt(i)=='y' && str.charAt(i+2)=='k') {
i = i + 2;
} else { // Otherwise do the normal append
result = result + str.charAt(i);
}
}
return result;
}
I can't make sense of this line of code below. Following the logic, result would only return the character at the index, not the remaining string.
result = result + str.charAt(i);
To me it would make better sense if the code was presented like this below, where the substring function would return the letter of the index and the remaining string afterwards:
result = result + str.substring(i);
What am I missing? Any feedback from anyone would be greatly helpful and thank you for your valuable time.

String concatenation
In order to be on the same page, let's recap how string concatenation works.
When at least one operand of the + operator is a String, + is interpreted as the string concatenation operator. The expression evaluates to a new string formed by appending the right operand (or its string representation) to the left operand (or its string representation).
String str = "allow";
char ch = 'h';
Object obj = new Object();
System.out.println(ch + str); // prints "hallow"
System.out.println("test " + obj); // prints something like "test java.lang.Object@16b98e56" (the hash varies)
Explanation of the code-logic
That said, I guess you will agree that this statement concatenates a character at position i in the str to the resulting string and assigns the result of concatenation to the same variable result:
result = result + str.charAt(i);
The condition in the code provided by CodingBat first ensures that the index i+2 is valid, and then checks whether the characters at indices i and i+2 are y and k respectively. If they are not, the character at index i is appended to the resulting string. Otherwise it is discarded, and the index is incremented by 2 in order to skip the whole group of characters that constitute "yak" (where a can be an arbitrary symbol); the loop's own i++ then moves past the k.
So the resulting string is constructed in the loop character by character.
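To see the difference between the two statements from the question, here is a small sketch that runs the official loop next to a variant that uses substring(i). The class name StringYakTrace and the method withSubstring are made up for this illustration; stringYak is the official solution with only formatting and comments changed.

```java
class StringYakTrace {

    // The official solution: appends exactly one character per iteration.
    static String stringYak(String str) {
        String result = "";
        for (int i = 0; i < str.length(); i++) {
            if (i + 2 < str.length() && str.charAt(i) == 'y' && str.charAt(i + 2) == 'k') {
                i = i + 2; // skip "y?k"; the loop's i++ then moves past the 'k'
            } else {
                result = result + str.charAt(i);
            }
        }
        return result;
    }

    // Hypothetical variant from the question: appends the whole remaining tail each time.
    static String withSubstring(String str) {
        String result = "";
        for (int i = 0; i < str.length(); i++) {
            if (i + 2 < str.length() && str.charAt(i) == 'y' && str.charAt(i + 2) == 'k') {
                i = i + 2;
            } else {
                result = result + str.substring(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stringYak("yakpak"));     // pak
        System.out.println(withSubstring("yakpak")); // pakakk
    }
}
```

With substring(i), every iteration appends everything from i to the end again, so the tail characters show up repeatedly in the result.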
Flavors of substring()
The substring() method is overloaded; there are two flavors of it.
One version expects two arguments, the starting index (inclusive) and the ending index (exclusive): substring(int, int).
And you can use it to achieve the same result:
// an equivalent of result = result + str.charAt(i);
result = result + str.substring(i, i + 1);
The other version of this method, which expects one argument, is not useful here, because the result returned by str.substring(i) is not a string containing a single character but a substring starting from the given index, i.e. encompassing all the characters until the end of the string, as the documentation of substring(int) states:
public String substring(int beginIndex)
Returns a string that is a substring of this string. The substring
begins with the character at the specified index and extends to the
end of this string.
Examples:
"unhappy".substring(2) returns "happy"
"Harbison".substring(3) returns "bison"
"emptiness".substring(9) returns "" (an empty string)
Side note:
This coding problem was designed for practicing the basics of loops and string operations. But the simplest way to solve it is with the method replaceAll(), which expects a regular expression and a replacement string:
return str.replaceAll("y.k", "");
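A minimal sketch of the regex approach, checked against the CodingBat examples. The class name YakRegex and the stringYakDotAll variant are assumptions added for illustration. One difference from the loop is worth knowing: by default . does not match line terminators, so the (?s) flag is needed if "any char" should include a line break.

```java
class YakRegex {

    // "y.k" matches 'y', then any single character except a line terminator, then 'k'.
    static String stringYak(String str) {
        return str.replaceAll("y.k", "");
    }

    // Hypothetical variant: (?s) lets '.' match line terminators too,
    // which is closer to the loop, where the middle character is never inspected.
    static String stringYakDotAll(String str) {
        return str.replaceAll("(?s)y.k", "");
    }

    public static void main(String[] args) {
        System.out.println(stringYak("yakpak"));   // pak
        System.out.println(stringYak("pakyak"));   // pak
        System.out.println(stringYak("yak123ya")); // 123ya
    }
}
```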

Related

Java regex: Replace all characters with `+` except instances of a given string

I have the following problem which states
Replace all characters in a string with + symbol except instances of the given string in the method
so for example if the string given was abc123efg and they want me to replace every character except every instance of 123 then it would become +++123+++.
I figured a regular expression is probably the best for this and I came up with this.
str.replaceAll("[^str]","+")
where str is a variable, but it's not letting me use the method without putting it in quotation marks. If I just want to use the variable string str, how can I do that? I ran it with the string typed in manually and it worked, but can I just pass a variable?
As of right now I believe it's looking for the string "str" and not the variable string.
Here is the output; it's right for so many cases, except for two :(
List of open test cases:
plusOut("12xy34", "xy") → "++xy++"
plusOut("12xy34", "1") → "1+++++"
plusOut("12xy34xyabcxy", "xy") → "++xy++xy+++xy"
plusOut("abXYabcXYZ", "ab") → "ab++ab++++"
plusOut("abXYabcXYZ", "abc") → "++++abc+++"
plusOut("abXYabcXYZ", "XY") → "++XY+++XY+"
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
plusOut("--++ab", "++") → "++++++"
plusOut("aaxxxxbb", "xx") → "++xxxx++"
plusOut("123123", "3") → "++3++3"

Looks like this is the plusOut problem on CodingBat.
I had 3 solutions to this problem, and wrote a new streaming solution just for fun.
Solution 1: Loop and check
Create a StringBuilder out of the input string, and check for the word at every position. Replace the character if it doesn't match, and skip the length of the word if it is found.
public String plusOut(String str, String word) {
StringBuilder out = new StringBuilder(str);
for (int i = 0; i < out.length(); ) {
if (!str.startsWith(word, i))
out.setCharAt(i++, '+');
else
i += word.length();
}
return out.toString();
}
This is probably the expected answer for a beginner programmer, though there is an assumption that the string doesn't contain any astral plane character, which would be represented by 2 char instead of 1.
Solution 2: Replace the word with a marker, replace the rest, then restore the word
public String plusOut(String str, String word) {
return str.replaceAll(java.util.regex.Pattern.quote(word), "#").replaceAll("[^#]", "+").replaceAll("#", word);
}
Not a proper solution since it assumes that a certain character or sequence of character doesn't appear in the string.
Note the use of Pattern.quote to prevent the word being interpreted as regex syntax by replaceAll method.
Solution 3: Regex with \G
public String plusOut(String str, String word) {
word = java.util.regex.Pattern.quote(word);
return str.replaceAll("\\G((?:" + word + ")*+).", "$1+");
}
Construct regex \G((?:word)*+)., which does more or less what solution 1 is doing:
\G makes sure the match starts from where the previous match leaves off
((?:word)*+) picks out 0 or more instance of word - if any, so that we can keep them in the replacement with $1. The key here is the possessive quantifier *+, which forces the regex to keep any instance of the word it finds. Otherwise, the regex will not work correctly when the word appear at the end of the string, as the regex backtracks to match .
. will not be part of any word, since the previous part already picks out all consecutive appearances of word and disallow backtrack. We will replace this with +
Solution 4: Streaming
public String plusOut(String str, String word) {
return String.join(word,
Arrays.stream(str.split(java.util.regex.Pattern.quote(word), -1))
.map((String s) -> s.replaceAll("(?s:.)", "+"))
.collect(Collectors.toList()));
}
The idea is to split the string by word, do the replacement on the rest, and join them back with word using String.join method.
Same as above, we need Pattern.quote to avoid split interpreting the word as regex. Since split by default removes empty string at the end of the array, we need to use -1 in the second parameter to make split leave those empty strings alone.
Then we create a stream out of the array and replace the rest as strings of +. In Java 11, we can use s -> "+".repeat(s.length()) instead.
The rest is just converting the Stream to an Iterable (List in this case) and joining them for the result
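For reference, a sketch of the Java 11 variant of the streaming idea. It assumes Java 11 or later (for the instance method String.repeat) and a non-empty word, and it swaps the String.join call for Collectors.joining, which joins the stream directly. The class name PlusOutStream is made up for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PlusOutStream {

    static String plusOut(String str, String word) {
        // Split on the literal word; limit -1 keeps trailing empty pieces,
        // so a word at the very end of the string is restored by the join.
        return Arrays.stream(str.split(Pattern.quote(word), -1))
                // Each piece between two words becomes a run of '+' of equal length.
                .map(piece -> "+".repeat(piece.length()))
                // Put the word back between the pieces.
                .collect(Collectors.joining(word));
    }

    public static void main(String[] args) {
        System.out.println(plusOut("12xy34xyabcxy", "xy")); // ++xy++xy+++xy
        System.out.println(plusOut("aaxxxxbb", "xx"));      // ++xxxx++
    }
}
```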

This is a bit trickier than you might initially think because you don't just need to match characters, but the absence of specific phrase - a negated character set is not enough. If the string is 123, you would need:
(?<=^|123)(?!123).*?(?=123|$)
https://regex101.com/r/EZWMqM/1/
That is - lookbehind for the start of the string or "123", make sure the current position is not followed by 123, then lazy-repeat any character until lookahead matches "123" or the end of the string. This will match all characters which are not in a "123" substring. Then, you need to replace each character with a +, after which you can use appendReplacement and a StringBuffer to create the result string:
String inputPhrase = "123";
String inputStr = "abc123efg123123hij";
StringBuffer resultString = new StringBuffer();
Pattern regex = Pattern.compile("(?<=^|" + inputPhrase + ")(?!" + inputPhrase + ").*?(?=" + inputPhrase + "|$)");
Matcher m = regex.matcher(inputStr);
while (m.find()) {
String replacement = m.group(0).replaceAll(".", "+");
m.appendReplacement(resultString, replacement);
}
m.appendTail(resultString);
System.out.println(resultString.toString());
Output:
+++123+++123123+++
Note that if the inputPhrase can contain character with a special meaning in a regular expression, you'll have to escape them first before concatenating into the pattern.

You can do it in one line:
input = input.replaceAll("((?:" + str + ")+)?(?!" + str + ").((?:" + str + ")+)?", "$1+$2");
This optionally captures "123" on either side of each character and puts them back (a blank if there's no "123").

So instead of coming up with a regular expression that matches the absence of a string, we might as well just match the selected phrase and append as many + as there are skipped characters.
StringBuilder sb = new StringBuilder();
Matcher m = Pattern.compile(Pattern.quote(str)).matcher(input);
while (m.find()) {
// pad with '+' for the characters skipped since the previous match
int skipped = m.start() - sb.length();
for (int i = 0; i < skipped; i++) sb.append('+');
sb.append(str);
}
// pad with '+' for the characters after the last match
int remaining = input.length() - sb.length();
for (int i = 0; i < remaining; i++) {
sb.append('+');
}
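A self-contained sketch of this match-and-fill idea, wrapped in a method so it can be tried on the test cases. The class name PlusOutMatcher and the variable last (the end of the previous match) are assumptions for this illustration; only the gap since the previous match is filled with +.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlusOutMatcher {

    static String plusOut(String input, String word) {
        StringBuilder sb = new StringBuilder();
        // Pattern.quote makes the word match literally, even if it is "++" or ".".
        Matcher m = Pattern.compile(Pattern.quote(word)).matcher(input);
        int last = 0; // index just past the previous match
        while (m.find()) {
            // One '+' for every character between the previous match and this one.
            for (int i = last; i < m.start(); i++) {
                sb.append('+');
            }
            sb.append(word);
            last = m.end();
        }
        // One '+' for every character after the final match.
        for (int i = last; i < input.length(); i++) {
            sb.append('+');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abc123efg123123hij", "123")); // +++123+++123123+++
    }
}
```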

Absolutely just for the fun of it, a solution using CharBuffer (unexpectedly, it took a lot more than I initially hoped for):
private static String plusOutCharBuffer(String input, String match) {
int size = match.length();
CharBuffer cb = CharBuffer.wrap(input.toCharArray());
CharBuffer word = CharBuffer.wrap(match);
int x = 0;
for (; cb.remaining() > 0;) {
if (!cb.subSequence(0, size < cb.remaining() ? size : cb.remaining()).equals(word)) {
cb.put(x, '+');
cb.clear().position(++x);
} else {
cb.clear().position(x = x + size);
}
}
return cb.clear().toString();
}

To make this work you need a beast of a pattern. Let's say you are operating on the following test case as an example:
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
What you need to do is build a series of clauses in your pattern to match a single character at a time:
Any character that is NOT "X", "Y" or "Z" -- [^XYZ]
Any "X" not followed by "YZ" -- X(?!YZ)
Any "Y" not preceded by "X" -- (?<!X)Y
Any "Y" not followed by "Z" -- Y(?!Z)
Any "Z" not preceded by "XY" -- (?<!XY)Z
An example of this replacement can be found here: https://regex101.com/r/jK5wU3/4
Here is an example of how this might work (most certainly not optimized, but it works):
import java.util.regex.Pattern;
public class Test {
public static void plusOut(String text, String exclude) {
StringBuilder pattern = new StringBuilder("");
for (int i=0; i<exclude.length(); i++) {
Character target = exclude.charAt(i);
String prefix = (i > 0) ? exclude.substring(0, i) : "";
String postfix = (i < exclude.length() - 1) ? exclude.substring(i+1) : "";
// add the look-behind (?<!X)Y
if (!prefix.isEmpty()) {
pattern.append("(?<!").append(Pattern.quote(prefix)).append(")")
.append(Pattern.quote(target.toString())).append("|");
}
// add the look-ahead X(?!YZ)
if (!postfix.isEmpty()) {
pattern.append(Pattern.quote(target.toString()))
.append("(?!").append(Pattern.quote(postfix)).append(")|");
}
}
// add in the other character exclusion
pattern.append("[^" + Pattern.quote(exclude) + "]");
System.out.println(text.replaceAll(pattern.toString(), "+"));
}
public static void main(String [] args) {
plusOut("12xy34", "xy");
plusOut("12xy34", "1");
plusOut("12xy34xyabcxy", "xy");
plusOut("abXYabcXYZ", "ab");
plusOut("abXYabcXYZ", "abc");
plusOut("abXYabcXYZ", "XY");
plusOut("abXYxyzXYZ", "XYZ");
plusOut("--++ab", "++");
plusOut("aaxxxxbb", "xx");
plusOut("123123", "3");
}
}
UPDATE: Even this doesn't quite work because it can't deal with exclusions that are just repeated characters, like "xx". Regular expressions are most definitely not the right tool for this, but I thought it might be possible. After poking around, I'm not so sure a pattern even exists that might make this work.

The problem in your solution is that str.replaceAll("[^str]","+") uses a character set, so it excludes the individual characters s, t and r from the replacement rather than the contents of the variable str, and that will not solve your problem.
EX: when you try str.replaceAll("[^XYZ]","+"), it excludes every occurrence of the character X, the character Y and the character Z from the replacement, so you will get "++XY+++XYZ".
Actually, you should exclude a sequence of characters in str.replaceAll instead.
You can do that with a group of characters like (XYZ) combined with a negative lookahead, which matches a string that does not contain the character sequence: ^((?!XYZ).)*$
Check this solution for more info about this problem, but you should know that it may be complicated to find a regular expression that does this directly.
I have found two simple solutions for this problem :
Solution 1:
You can implement a method to replace all characters with '+' except the instance of given string:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
for(int i = 0; i < str.length(); i++){
// exclude any instance string of exWord from replacing process in str
if(str.substring(i, str.length()).indexOf(exWord) + i == i){
i = i + exWord.length()-1;
}
else{
str = str.substring(0,i) + "+" + str.substring(i+1);//replace each character with '+' symbol
}
}
Note : str.substring(i, str.length()).indexOf(exWord) + i this if statement will exclude any instance string of exWord from replacing process in str.
Output:
+++++++XYZ
Solution 2:
You can try this Approach using ReplaceAll method and it doesn't need any complex regular expression:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
str = str.replaceAll(exWord,"*"); // replace instance string with * symbol
str = str.replaceAll("[^*]","+"); // replace all characters with + symbol except *
str = str.replaceAll("\\*",exWord); // replace * symbol with instance string
Note : This solution will work only if your input string str doesn't contain any * symbol.
Also you should escape any character with a special meaning in a regular expression in phrase instance string exWord like : exWord = "++".
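One way to sidestep the escaping issue, sketched under the same assumption that the input never contains the marker character *, is to use String.replace for the two literal steps, since replace (unlike replaceAll) does not interpret its arguments as regular expressions. The class name PlusOutMarker is made up for this example, and word is assumed to be non-empty.

```java
class PlusOutMarker {

    static String plusOut(String str, String word) {
        // Assumption: '*' never occurs in str; otherwise the marker is ambiguous.
        return str.replace(word, "*")       // literal replacement, no regex escaping needed
                  .replaceAll("[^*]", "+")  // everything that is not the marker becomes '+'
                  .replace("*", word);      // literal replacement back to the word
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abXYxyzXYZ", "XYZ")); // +++++++XYZ
        System.out.println(plusOut("--++ab", "++"));      // ++++++
    }
}
```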

why does this for loop wordcount method not work in java

Can anyone let me know why this wordsearch method doesn't work - the returned value of count is 0 everytime I run it.
public int wordcount(){
String spaceString = " ";
int count = 0;
for(int i = 0; i < this.getString().length(); i++){
if (this.getString().substring(i).equals(spaceString)){
count++;
}
}
return count;
}
The value of getString = my search string.
Much appreciated if anyone can help - I'm sure I'm prob doing something dumb.
Dylan

Read the docs:
The substring begins with the character at the specified index and extends to the end of this string.
Your if condition is only true once, if the last character of the string is a space. Perhaps you wanted charAt? (And even this won't properly handle double spaces; splitting on whitespace might be a better option.)

Because substring with only one argument returns the sub string starting from that index till the end of the string. So you're not comparing just one character.
Instead of substring define spaceString as a char, and use charAt(i)

this.getString().substring(i) -> this returns a sub string from the index i to the end of the String
So for example if your string was Test the above would return Test, est, st and finally t
For what you're trying to do there are alternative methods, but you could simply replace
this.getString().substring(i)
with
String.valueOf(this.getString().charAt(i))
(charAt returns a char, so it has to be turned into a String before it can be compared with equals.)
An alternative way of doing what you're trying to do is:
this.getString().split(spaceString)
This would return an array of Strings - the original string broken up by spaces.

Read the documentation of the method you are using:
http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#substring(int)
I.e. the count will be non zero only if you have a space on the end of your string

Using substring as you are will not work. If the value of getString() is "my search string", every iteration through the loop will have substring(i) return:
my search string
y search string
search string
search string
earch string
arch string
rch string
ch string
h string
string
string
tring
ring
ing
ng
g
Notice none of those equals " ".
Try using split.
public int countWords(String s){
return s.split("\\s+").length;
}

Change
if (this.getString().substring(i).equals(spaceString))
to
if (this.getString().charAt(i) == ' ')
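A sketch of the corrected space counter next to a split-based word count. The class name WordCount and both method names are assumptions for this example; the input is passed as a parameter instead of coming from getString(). Trimming first and treating an empty result as zero words is one possible way to handle leading, trailing, and repeated spaces.

```java
class WordCount {

    // Counts single space characters, comparing one char at a time.
    static int countSpaces(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ' ') {
                count++;
            }
        }
        return count;
    }

    // Counts words separated by runs of whitespace.
    static int countWords(String s) {
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return 0; // split would otherwise return one empty element
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(countSpaces("my search string")); // 2
        System.out.println(countWords("my search string"));  // 3
        System.out.println(countWords(" the cow "));          // 2
    }
}
```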

this.getString().substring(i) returns a string from the index of (i) to the end of the string.
Example: for i=5, it will return "rown cow" from the string "the brown cow". This functionality isn't what you need.
If you pepper System.out.println() throughout your code (or use the debugger), you will see this.
I think it would be better to use something like String.split() or charAt(i).
By the way, even if you fix your code by counting spaces, it will not return the correct value for these conditions: "my dog" (word count=2) and "cow" (word count=1). There is also a problem if there is more than one space between words. Also, this will produce a word count of three:
" the cow ".

How to use recursion to reverse a String?

public String reverse(String word) {
if ((word == null) || (word.length() <= 1)) {
return word;
}
return reverse(word.substring(1)) + word.charAt(0);
}
I have this code that professor sent me but I don't get it. I know what recursion is but I'm still a newbie at Java Programming so if anybody would care to explain to me the part
return reverse(word.substring(1)) + word.charAt(0);
What do substring(1) and charAt(0) do?

This is recursion. Here is the documentation for substring() and charAt().
Coming to how this works:
public static String reverse(String word) {
if ((word == null) || (word.length() <= 1)) {
return word;
}
return reverse(word.substring(1)) + word.charAt(0);
}
Pass1: reverse("user") : return reverse("ser")+'u';
Pass2: reverse("ser")+'u' : return reverse("er")+'s'+'u';
Pass3: reverse("er")+'s'+'u' : return reverse("r")+'e'+'s'+'u';
Pass4: reverse("r")+'e'+'s'+'u' : return 'r'+'e'+'s'+'u'; // because here "r".length()==1
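The passes above can be reproduced with a small tracing sketch. The class name ReverseTrace and the extra depth parameter are assumptions added only to label the output; the recursion itself is the same as in the professor's code.

```java
class ReverseTrace {

    static String reverse(String word, int depth) {
        if (word == null || word.length() <= 1) {
            System.out.println("depth " + depth + ": base case, returns \"" + word + "\"");
            return word;
        }
        String rest = word.substring(1); // everything after the first character
        char first = word.charAt(0);     // the first character
        String result = reverse(rest, depth + 1) + first;
        System.out.println("depth " + depth + ": reverse(\"" + rest + "\") + '"
                + first + "' = \"" + result + "\"");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(reverse("user", 0)); // last line printed: resu
    }
}
```

The trace lines appear from the deepest call outward, because each call can only finish its concatenation after the inner call has returned.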

The way the recursive part of this works is that to reverse a string, you remove the first character, reverse what's left, and then append the first character to the result. That's what the prof's code is doing.
word.substring(1) returns the substring starting at index 1 and going to the end
word.charAt(0) returns the character at index 0
There's a bit more going on when the two pieces are appended using +. The issue is that word.charAt(0) has a return type of char. Since the left-hand part of the + is a String, the Java language rules say that the right-hand side must be converted to a String if it isn't one. Conceptually, the char value is first converted to a Character and then the toString() method of the Character class is called, which returns a String consisting of the single character; an implementation may perform the conversion and the concatenation in one step.
It might have been more efficient code to write that line like:
return reverse(word.substring(1)) + word.substring(0, 1);
The two-argument version of substring returns the substring between the two indexes. That would eliminate the autoboxing and conversion to String.

return reverse(word.substring(1)) + word.charAt(0);
you should read it this way:
remove the first letter away from the word
reverse the rest (recursive call)
put the first letter at the end
if you assume this function reverses the strings of length N, you can easily see that it must reverse the strings of length N+1. If you realize that the word with at most one letter is the same if reversed (the first three lines of code), you have a complete very simple proof using Mathematical Induction that this function really reverses the string.

Java - removing first character of a string

In Java, I have a String:
Jamaica
I would like to remove the first character of the string and then return amaica
How would I do this?

String str = "Jamaica".substring(1);
System.out.println(str); // prints "amaica"
Use the substring() function with an argument of 1 to get the substring from position 1 (after the first character) to the end of the string (leaving the second argument out defaults to the full length of the string).

public String removeFirstChar(String s){
return s.substring(1);
}

In Java, remove leading character only if it is a certain character
Use the Java ternary operator to quickly check if your character is there before removing it. This strips the leading character only if it is present; if passed an empty string, it returns an empty string.
String header = "";
header = header.startsWith("#") ? header.substring(1) : header;
System.out.println(header);
header = "foobar";
header = header.startsWith("#") ? header.substring(1) : header;
System.out.println(header);
header = "#moobar";
header = header.startsWith("#") ? header.substring(1) : header;
System.out.println(header);
Prints:
(an empty line)
foobar
moobar
Java, remove all the instances of a character anywhere in a string:
String a = "Cool";
a = a.replace("o","");
//variable 'a' contains the string "Cl"
Java, remove the first instance of a character anywhere in a string:
String b = "Cool";
b = b.replaceFirst("o","");
//variable 'b' contains the string "Col"

Use substring() and give the number of characters that you want to trim from front.
String value = "Jamaica";
value = value.substring(1);
Answer: "amaica"

You can use the substring method of the String class that takes only the beginning index and returns the substring that begins with the character at the specified index and extending to the end of the string.
String str = "Jamaica";
str = str.substring(1);

substring() method returns a new String that contains a subsequence of characters currently contained in this sequence.
The substring begins at the specified start and extends to the character at index end - 1.
It has two forms. The first is
String substring(int FirstIndex)
Here, FirstIndex specifies the index at which the substring will
begin. This form returns a copy of the substring that begins at
FirstIndex and runs to the end of the invoking string.
String substring(int FirstIndex, int endIndex)
Here, FirstIndex specifies the beginning index, and endIndex specifies
the stopping point. The string returned contains all the characters
from the beginning index, up to, but not including, the ending index.
Example
String str = "Amiyo";
// prints substring from index 3
System.out.println("substring is = " + str.substring(3)); // Output 'yo'

you can do like this:
String str = "Jamaica";
str = str.substring(1, str.length());
return str;
or in general:
public String removeFirstChar(String str){
return str.substring(1, str.length());
}

public String removeFirst(String input)
{
return input.substring(1);
}

The key thing to understand in Java is that Strings are immutable -- you can't change them. So it makes no sense to speak of 'removing a character from a string'. Instead, you make a NEW string with just the characters you want. The other posts in this question give you a variety of ways of doing that, but its important to understand that these don't change the original string in any way. Any references you have to the old string will continue to refer to the old string (unless you change them to refer to a different string) and will not be affected by the newly created string.
This has a number of implications for performance. Each time you are 'modifying' a string, you are actually creating a new string with all the overhead implied (memory allocation and garbage collection). So if you want to make a series of modifications to a string and care only about the final result (the intermediate strings will be dead as soon as you 'modify' them), it may make more sense to use a StringBuilder or StringBuffer instead.
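A short sketch of both points: substring leaves the original string untouched, and a StringBuilder can absorb a series of modifications before a single final String is created. The class name ImmutabilityDemo and the helper dropLeading are made up for this illustration.

```java
class ImmutabilityDemo {

    // Removes all leading occurrences of ch using one mutable buffer.
    static String dropLeading(String s, char ch) {
        StringBuilder sb = new StringBuilder(s);
        int n = 0;
        while (n < sb.length() && sb.charAt(n) == ch) {
            n++; // count the leading characters to drop
        }
        sb.delete(0, n); // modifies the buffer in place, no intermediate Strings
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "Jamaica";
        String shorter = original.substring(1); // a new String; original is unchanged
        System.out.println(original); // Jamaica
        System.out.println(shorter);  // amaica
        System.out.println(dropLeading("###Hello World", '#')); // Hello World
    }
}
```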

I came across a situation where I had to remove not only the first character (if it was a #), but the first set of characters.
String myString = "###Hello World" could be the starting point, but I would only want to keep the Hello World. This could be done as follows.
while (!myString.isEmpty() && myString.charAt(0) == '#') { // Remove all the # chars in front of the real string
myString = myString.substring(1, myString.length());
}
For OP's case, replace while with if and it works as well.

You can simply use substring().
String myString = "Jamaica";
String myStringWithoutJ = myString.substring(1);
The index in the method indicates where the resulting string starts; in this case we start after the first position because we don't want that "J" in "Jamaica".

Another solution, you can solve your problem using replaceAll with some regex ^.{1} (regex demo) for example :
String str = "Jamaica";
int nbr = 1;
str = str.replaceAll("^.{" + nbr + "}", "");//Output = amaica

My version of removing leading chars, one or multiple. For example, for String str1 = "01234", removing the leading '0' gives "1234". For String str2 = "000123" the result is "123". And for String str3 = "000" the result is the empty string: "". Such functionality is often useful when converting numeric strings into numbers. The advantage of this solution compared with regex (replaceAll(...)) is that this one is much faster. This is important when processing a large number of Strings.
public static String removeLeadingChar(String str, char ch) {
int idx = 0;
while ((idx < str.length()) && (str.charAt(idx) == ch))
idx++;
return str.substring(idx);
}

##KOTLIN
#Its working fine.
tv.doOnTextChanged { text: CharSequence?, start, count, after ->
val length = text.toString().length
if (length==1 && text!!.startsWith(" ")) {
tv?.setText("")
}
}

How do I check that a Java String is not all whitespaces?

I want to check that Java String or character array is not just made up of whitespaces, using Java?
This is a very similar question except it's Javascript:
How can I check if string contains characters & whitespace, not just whitespace?
EDIT: I removed the bit about alphanumeric characters, so it makes more sense.

Shortest solution I can think of:
if (string.trim().length() > 0) ...
This only checks for (non) white space. If you want to check for particular character classes, you need to use the mighty matches() with a regexp such as:
if (string.matches(".*\\w.*")) ...
...which checks for at least one (ASCII) alphanumeric character.

I would use the Apache Commons Lang library. It has a class called StringUtils that is useful for all sorts of String operations. For checking if a String is not all whitespaces, you can use the following:
StringUtils.isBlank(<your string>)
Here is the reference: StringUtils.isBlank

Slightly shorter than what was mentioned by Carl Smotricz:
!string.trim().isEmpty();

StringUtils.isBlank(CharSequence)
https://commons.apache.org/proper/commons-lang/javadocs/api-release/org/apache/commons/lang3/StringUtils.html#isBlank-java.lang.CharSequence-

If you are using Java 11 or more recent, the new isBlank string method will come in handy:
!s.isBlank();
If you are using Java 8, 9 or 10, you could build a simple stream to check that a string is not whitespace only:
!s.chars().allMatch(Character::isWhitespace);
In addition to not requiring any third-party libraries such as Apache Commons Lang, these solutions have the advantage of handling any white space character, and not just plain ' ' spaces as would a trim-based solution suggested in many other answers. You can refer to the Javadocs for an exhaustive list of all supported white space types. Note that empty strings are also covered in both cases.

This answer focusses more on the sidenote "i.e. has at least one alphanumeric character". Besides that, it doesn't add too much to the other (earlier) solution, except that it doesn't hurt you with NPE in case the String is null.
We want false if (1) s is null or (2) s is empty or (3) s only contains whitechars.
public static boolean containsNonWhitespaceChar(String s) {
return !((s == null) || "".equals(s.trim()));
}

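if (target.matches("(?s).*\\S.*"))
// then string contains at least one non-whitespace character
Note the use of backslash capital S, meaning "non-whitespace char". Because matches() must match the entire string, the \S is wrapped in .* on both sides, and (?s) lets . match line breaks as well.
I'd wager this is the simplest (and perhaps the fastest?) solution.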
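Since String.matches tests the whole string rather than searching inside it, an alternative is to search with Matcher.find and a precompiled pattern. The following is a minimal sketch; the class name NonWhitespaceCheck and the method hasNonWhitespace are assumptions for this example.

```java
import java.util.regex.Pattern;

class NonWhitespaceCheck {

    // Compiled once and reused; \S matches a single non-whitespace character.
    private static final Pattern NON_WS = Pattern.compile("\\S");

    // find() succeeds if the pattern occurs anywhere in the input.
    static boolean hasNonWhitespace(String s) {
        return NON_WS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println("ab".matches("\\S"));      // false: the whole string must match
        System.out.println(hasNonWhitespace(" ab ")); // true
        System.out.println(hasNonWhitespace(" \t\n")); // false
    }
}
```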

If you are only checking for whitespace and don't care about null then you can use org.apache.commons.lang.StringUtils.isWhitespace(String str),
StringUtils.isWhitespace(String str);
(Checks if the String contains only whitespace.)
If you also want to check for null(including whitespace) then
StringUtils.isBlank(String str);

Just a performance comparison on OpenJDK 13, Windows 10. For each of these texts:
"abcd"
" "
" \r\n\t"
" ab "
" \n\n\r\t \n\r\t\t\t \r\n\r\n\r\t \t\t\t\r\n\n"
"lorem ipsum dolor sit amet consectetur adipisici elit"
"1234657891234567891324569871234567891326987132654798"
one of the following tests was executed:
// trim + empty
input.trim().isEmpty()
// simple match
input.matches("\\S")
// match with precompiled pattern
final Pattern PATTERN = Pattern.compile("\\S");
PATTERN.matcher(input).matches()
// java 11's isBlank
input.isBlank()
each 10.000.000 times.
The results:
METHOD min max note
trim: 18 313 much slower if text not trimmed
match: 1799 2010
pattern: 571 662
isBlank: 60 338 faster the earlier hits the first non-whitespace character
Quite surprisingly, trim+empty is the fastest, even though it needs to construct the trimmed text. It is still faster than a simple for loop looking for a single non-whitespace character...
EDIT:
The longer the text, the more the numbers differ. Trimming a long text takes longer than a simple loop. However, the regexes are still the slowest solution.

With Java-11+, you can make use of the String.isBlank API to check if the given string is not all made up of whitespace -
String str1 = " ";
System.out.println(str1.isBlank()); // made up of all whitespaces, prints true
String str2 = " a";
System.out.println(str2.isBlank()); // prints false
The javadoc for the same is :
/**
* Returns {@code true} if the string is empty or contains only
* {@link Character#isWhitespace(int) white space} codepoints,
* otherwise {@code false}.
*
* @return {@code true} if the string is empty or contains only
* {@link Character#isWhitespace(int) white space} codepoints,
* otherwise {@code false}
*
* @since 11
*/
public boolean isBlank()

The trim method should work great for you.
http://download.oracle.com/docs/cd/E17476_01/javase/1.4.2/docs/api/java/lang/String.html#trim()
Returns a copy of the string, with
leading and trailing whitespace
omitted. If this String object
represents an empty character
sequence, or the first and last
characters of character sequence
represented by this String object both
have codes greater than '\u0020' (the
space character), then a reference to
this String object is returned.
Otherwise, if there is no character
with a code greater than '\u0020' in
the string, then a new String object
representing an empty string is
created and returned.
Otherwise, let k be the index of the
first character in the string whose
code is greater than '\u0020', and let
m be the index of the last character
in the string whose code is greater
than '\u0020'. A new String object is
created, representing the substring of
this string that begins with the
character at index k and ends with the
character at index m-that is, the
result of this.substring(k, m+1).
This method may be used to trim
whitespace from the beginning and end
of a string; in fact, it trims all
ASCII control characters as well.
Returns: A copy of this string with
leading and trailing white space
removed, or this string if it has no
leading or trailing white space.
You could trim and then compare to an empty string or possibly check the length for 0.

Alternative:
boolean isWhiteSpaces( String s ) {
return s != null && s.matches("\\s+");
}

trim() and the regular expressions mentioned above do not work for all types of whitespace,
e.g. the Unicode character 'LINE SEPARATOR' http://www.fileformat.info/info/unicode/char/2028/index.htm
The Java method Character.isWhitespace() covers all situations.
That is why the already mentioned solution
StringUtils.isWhitespace( String ) /or StringUtils.isBlank(String)
should be used.
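The difference can be checked with plain JDK calls. This is a sketch using U+2028 (LINE SEPARATOR) as the test character; the class name WhitespaceKinds and both method names are assumptions for this example, and the second method uses the JDK's Character.isWhitespace directly rather than StringUtils.

```java
class WhitespaceKinds {

    // trim() only strips characters with code <= U+0020.
    static boolean isBlankByTrim(String s) {
        return s.trim().isEmpty();
    }

    // Character.isWhitespace also accepts Unicode line and paragraph separators.
    static boolean isBlankByIsWhitespace(String s) {
        return s.chars().allMatch(Character::isWhitespace);
    }

    public static void main(String[] args) {
        String lineSeparator = String.valueOf((char) 0x2028);
        System.out.println(isBlankByTrim(lineSeparator));         // false
        System.out.println(lineSeparator.matches("\\s+"));        // false
        System.out.println(isBlankByIsWhitespace(lineSeparator)); // true
    }
}
```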

StringUtils.isEmptyOrWhitespaceOnly(<your string>)
will check :
- is it null
- is it only space
- is it empty string ""
https://www.programcreek.com/java-api-examples/?class=com.mysql.jdbc.StringUtils&method=isEmptyOrWhitespaceOnly

While personally I would be preferring !str.isBlank(), as others already suggested (or str -> !str.isBlank() as a Predicate), a more modern and efficient version of the str.trim() approach mentioned above, would be using str.strip() - considering nulls as "whitespace":
if (str != null && str.strip().length() > 0) {...}
For example as Predicate, for use with streams, e. g. in a unit test:
@Test
public void anyNonEmptyStrippedTest() {
String[] strings = null;
Predicate<String> isNonEmptyStripped = str -> str != null && str.strip().length() > 0;
assertTrue(Optional.ofNullable(strings).map(arr -> Stream.of(arr).noneMatch(isNonEmptyStripped)).orElse(true));
strings = new String[] { null, "", " ", "\\n", "\\t", "\\r" };
assertTrue(Optional.ofNullable(strings).map(arr -> Stream.of(arr).anyMatch(isNonEmptyStripped)).orElse(true));
strings = new String[] { null, "", " ", "\\n", "\\t", "\\r", "test" };
}

public static boolean isStringBlank(final CharSequence cs) {
int strLen;
if (cs == null || (strLen = cs.length()) == 0) {
return true;
}
for (int i = 0; i < strLen; i++) {
if (!Character.isWhitespace(cs.charAt(i))) {
return false;
}
}
return true;
}
