There are several ways to convert a value to a String in Java, and some people (including myself) prefer to concatenate an empty string to do the conversion:
Example:
char ch = 'A';
String str = "" + ch; //gets character value and append to str
However, the position of the empty String has always been a mystery to me. The following will successfully perform a String conversion:
str = ch + "";
str = ch + "" + ch;
but not the following:
str = ch + ch + ""; //if (ch + "") gives us "A", shouldn't this be "65A"?
Question: To be safe, we can always place the empty String in front, but I want to know how Java interprets the concatenation when the (empty) string is placed in other locations (such as in between or at the back).
The + operator is left-associative, which means that it is grouped from left-to-right.
str = ch + ch + "";
This is equivalent to
str = (ch + ch) + "";
// = ('A' + 'A') + "";
// = 130 + "";
// = "130";
not
str = ch + (ch + "");
// = 'A' + ('A' + "");
// = 'A' + "A";
// = "AA";
char + String and String + char both result in a String. But char + char returns an int. Do you see now why a second + ch doesn't work?
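The grouping rule above can be checked with a small sketch. The class and method names here are illustrative only (they are not from the question); each method evaluates one ordering of the same operands:

```java
class ConcatOrderDemo {
    // (ch + ch) is evaluated first as int arithmetic, then converted to a String.
    public static String charsFirst(char ch) { return ch + ch + ""; }

    // The first + already involves a String, so every later + is concatenation.
    public static String emptyFirst(char ch) { return "" + ch + ch; }

    // (ch + "") is a String, so the trailing + ch is concatenation too.
    public static String emptyMiddle(char ch) { return ch + "" + ch; }

    // An explicit conversion that does not depend on operand order.
    public static String viaValueOf(char ch) { return String.valueOf(ch) + ch; }

    public static void main(String[] args) {
        char ch = 'A'; // code point 65
        System.out.println(charsFirst(ch));  // 130
        System.out.println(emptyFirst(ch));  // AA
        System.out.println(emptyMiddle(ch)); // AA
        System.out.println(viaValueOf(ch));  // AA
    }
}
```

If the ordering feels fragile, String.valueOf(ch) or Character.toString(ch) makes the conversion explicit.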
My string is BByTTheWay. I want to split it as B By T The Way BByTTheWay. That is, I want to split the string at every capital letter and append the original string unchanged at the end. Here is what I have tried so far in Java:
public String breakWord(String fileAsString) throws FileNotFoundException, IOException {
String allWord = "";
String allmethod = "";
String[] splitString = fileAsString.split(" ");
for (int i = 0; i < splitString.length; i++) {
String k = splitString[i].replaceAll("([A-Z])(?![A-Z])", " $1").trim();
allWord = k.concat(" " + splitString[i]);
allWord = Arrays.stream(allWord.split("\\s+")).distinct().collect(Collectors.joining(" "));
allmethod = allmethod + " " + allWord;
// System.out.print(allmethod);
}
return allmethod;
}
It gives me the output: B ByT The Way BByTTheWay. I hope the Stack Overflow community can help me solve this.
You may use this code:
Code 1
String s = "BByTTheWay";
Pattern p = Pattern.compile("\\p{Lu}\\p{Ll}*");
String out = p.matcher(s)
.results()
.map(MatchResult::group)
.collect(Collectors.joining(" "))
+ " " + s;
//=> "B By T The Way BByTTheWay"
RegEx \\p{Lu}\\p{Ll}* matches any unicode upper case letter followed by 0 or more lowercase letters.
Or use String.split with a lookahead for an uppercase letter and join the pieces back together:
Code 2
String out = Arrays.stream(s.split("(?=\\p{Lu})"))
.collect(Collectors.joining(" ")) + " " + s;
//=> "B By T The Way BByTTheWay"
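To see what the lookahead split produces before joining, here is a minimal sketch. The class and method names are made up for illustration, and it assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string:

```java
import java.util.Arrays;

class LookaheadSplitDemo {
    // Split before every uppercase letter; the lookahead consumes no characters.
    public static String[] pieces(String s) {
        return s.split("(?=\\p{Lu})");
    }

    // Join the pieces with spaces and append the original string.
    public static String spaced(String s) {
        return String.join(" ", pieces(s)) + " " + s;
    }

    public static void main(String[] args) {
        String s = "BByTTheWay";
        System.out.println(Arrays.toString(pieces(s))); // [B, By, T, The, Way]
        System.out.println(spaced(s));                  // B By T The Way BByTTheWay
    }
}
```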
Use
String s = "BByTTheWay";
Pattern p = Pattern.compile("[A-Z][a-z]*");
Matcher m = p.matcher(s);
String r = "";
while (m.find()) {
r = r + m.group(0) + " ";
}
System.out.println(r + s);
Results: B By T The Way BByTTheWay
EXPLANATION
[A-Z]     any character from 'A' to 'Z'
[a-z]*    any character from 'a' to 'z', 0 or more times (matching as many as possible)
As per the requirements, you can also write it this way, checking whether each character is an uppercase letter:
char[] chars = fileAsString.toCharArray();
StringBuilder fragment = new StringBuilder();
for (char ch : chars) {
if (Character.isLetter(ch) && Character.isUpperCase(ch)) { // it works as internationalized check
fragment.append(" ");
}
fragment.append(ch);
}
String result = fragment.toString().trim() + " " + fileAsString; // B By T The Way BByTTheWay
I want to erase the word "makbet" from my string, but my method deleteAllStopWords() behaves strangely: if my string is " makbet makbet ", the method returns the string "makbet".
When I call deleteAllStopWords() twice on the same string, it works as expected for " makbet makbet ", but then the problem appears with " makbet makbet makbet makbet " (the string "makbet" is returned). When I invoke the method three times, the problem appears with " makbet makbet makbet makbet makbet makbet makbet makbet ".
The stopWords variable is an ArrayList that contains "makbet".
private String removeSpecialChars(String word) {
if (word.matches(".*\\[.*\\]"))
word = deleteAnnotation(word);
if (word.isEmpty())
return word;
char firstChar = word.charAt(0);
char lastChar = word.charAt(word.length() - 1);
while (lastChar == '.' || lastChar == ','
|| lastChar == ';' || lastChar == ')'
|| lastChar == ']' || lastChar == '}'
|| lastChar == '-' || lastChar == '?'
|| lastChar == '\"' || lastChar == '!'
|| lastChar == ',' || lastChar == ':'
|| lastChar == '|') {
word = removeCharAt(word, word.length() - 1);
if (!word.isEmpty())
lastChar = word.charAt(word.length() - 1);
}
if (firstChar == '{' || firstChar == '[' || firstChar == '(' || firstChar == '\"') {
word = removeCharAt(word, 0);
}
return word;
}
private String deleteAllStopWords(String txt) {
String ret = " ";
for (String word : txt.split("\\s")) {
if (word.isEmpty())
continue;
word = removeSpecialChars(word);
ret += word + " ";
}
for (String word : stopWords) {
ret = ret.replaceAll(" (?i)" + word + " ", " ");
}
return ret;
}
public static void main()
{
String txt = " makbet makbet ";
txt = deleteAllStopWords(txt);
System.out.println(txt); //prints "makbet"
txt = deleteAllStopWords(txt);
System.out.println(txt); //prints ""
}
Of course, these two methods are inside my class; I removed unnecessary code for better readability.
If I understand correctly, "makbet" is in your stopWords list and you want it deleted from the string.
It doesn't work because the pattern requires a space on both sides of the word, and matches cannot overlap. In " makbet makbet ", the first match " makbet " consumes the space between the two words, so the remaining "makbet " has no leading space left to match and survives. On the second call you build a new string with a leading space, so that last occurrence is finally removed.
To remove every makbet in one pass, I'd make the spaces optional in the regex (\\s?), or remove all makbet occurrences without the spaces and collapse the double spaces afterwards.
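One way to sketch the second suggestion is to match on word boundaries (\b) instead of literal spaces, since a boundary is zero-width and is not consumed by the previous match. This is only an illustration under assumptions: the class and method names are hypothetical, the punctuation stripping done by removeSpecialChars is left out, and Pattern.quote is added so that a stop word containing regex metacharacters is treated literally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class StopWordRemover {
    // Removes every whole-word, case-insensitive occurrence of each stop word,
    // then collapses the leftover runs of whitespace.
    public static String removeStopWords(String text, List<String> stopWords) {
        String result = text;
        for (String stopWord : stopWords) {
            String regex = "(?i)\\b" + Pattern.quote(stopWord) + "\\b";
            result = result.replaceAll(regex, "");
        }
        return result.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        List<String> stopWords = Arrays.asList("makbet");
        System.out.println("[" + removeStopWords(" makbet makbet makbet makbet ", stopWords) + "]"); // []
        System.out.println(removeStopWords("Lady Makbet saw makbet again", stopWords)); // Lady saw again
    }
}
```

Because the boundaries are not part of the match, adjacent occurrences are all removed in a single call.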
I wonder why the double quotation marks are not shown in the actual output, just after the equals sign:
String word = "" + c1 + c2 + "ll";
The full code as follows:
public class InstantMethodsIndexOf
{
public void start()
{
String greeting = new String ("Hello World");
int position = greeting.indexOf('r');
char c1 = greeting.charAt(position + 2);
char c2 = greeting.charAt(position - 1);
String word = "" + c1 + c2 + "ll";
System.out.println(word);
}
}
When you pass "" to a String you are passing an empty String. You need to escape the quotation with a back slash if you want to print them.
Example:
String word = "\"" + c1 + c2 + "ll\"";
then System.out.println(word) will print:
"Hell"
As you can see I am escaping one double quotation at the beginning and another at the end
(Assuming c1 == 'H' and c2 == 'e')
The quotation mark does not appear because you have none being printed. What you have is an empty string being concatenated with other contents.
If you need the quotation mark, then you should do the following:
String word = "\"" + c1 + c2 + "ll";
It's a way to let Java know that it will be a string straight from the beginning, since "" is a String object of an empty string.
In your code, it doesn't really look useful. But following is an example where it would be:
int a=10, b=20;
String word = a + b + "he"; // word = "30he"
String word2 = "" + a + b + "he"; // word2 = "1020he"
I wonder why the double quotation marks is not shown in the actual
output - just after the equal sign:
String word = "" + c1 + c2 + "ll";
You are declaring a String that concatenates:
The empty String ""
c1
c2
The String literal "ll"
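To show the quote using its Unicode value, try:
String word = "" + '\u0022' + c1 + c2 + "ll";
which uses the Unicode escape for the double quote. The leading "" is still needed: without it, '\u0022' + c1 + c2 would be added together as char values (int arithmetic) before the String is reached.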
I wonder why the double quotation marks is not shown in the actual
output - just after the equal sign:
In Java, a String literal is delimited by double quotes: the data between the quotes is the String value. If you want the value itself to include a double quote, you have to use the escape sequence \".
Moreover, I suggest you use a StringBuilder, append your characters and Strings to it, and call toString to print the result.
String str="ABC";//So value of String literal is ABC not "ABC"
String empty="";//is just empty but NOT Null
String quote="\"";//Here String has value " (One Double Quote)
Here is your code, annotated:
String greeting = "Hello World"; // <-- no need for new String()
int position = greeting.indexOf('r'); // <-- 8
char c1 = greeting.charAt(position + 2); // <-- 'd'
char c2 = greeting.charAt(position - 1); // <-- 'o'
String word = "" + c1 + c2 + "ll"; // <-- "" + 'd' + 'o' + "ll"
The empty String "" is used to coerce the arithmetic to a String, so it could also be written as
StringBuilder sb = new StringBuilder();
sb.append(c1).append(c2).append("ll");
String word = sb.toString();
or
StringBuilder sb = new StringBuilder("ll");
sb.insert(0, c2);
sb.insert(0, c1);
String word = sb.toString();
If you wanted to include double quotes in your word, you could escape them with a \ or use a char literal:
char c1 = greeting.charAt(position + 2); // <-- 'd'
char c2 = greeting.charAt(position - 1); // <-- 'o'
String word = "\"" + c1 + c2 + "ll\""; // <-- "\"" + 'd' + 'o' + "ll\""
or
String word = "" + '"' + c1 + c2 + "ll" + '"';
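As a quick check of the variants above, the following sketch prints both the plain and the quoted result for the question's "Hello World" input. The class and method names are illustrative and not part of the original code:

```java
class QuoteDemo {
    // The empty literal contributes no characters; it only forces concatenation.
    public static String plain(char c1, char c2) {
        return "" + c1 + c2 + "ll";
    }

    // \" inside a String literal is an actual double-quote character.
    public static String quoted(char c1, char c2) {
        return "\"" + c1 + c2 + "ll\"";
    }

    public static void main(String[] args) {
        String greeting = "Hello World";
        int position = greeting.indexOf('r');    // 8
        char c1 = greeting.charAt(position + 2); // 'd'
        char c2 = greeting.charAt(position - 1); // 'o'
        System.out.println(plain(c1, c2));       // doll
        System.out.println(quoted(c1, c2));      // "doll"
    }
}
```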
I have a String:
String str = "(a AND b) OR (c AND d)";
I tokenise with the help of code below
String delims = "AND|OR|NOT|[!&|()]+"; // Regular expression syntax
String newstr = str.replaceAll(delims, " ");
String[] tokens = newstr.trim().split("[ ]+");
and get String[] below
[a, b, c, d]
To each element of the array I add "=1" so it becomes
[a=1, b=1, c=1, d=1]
NOW I need to replace these values to the initial string making it
(a=1 AND b=1) OR (c=1 AND d=1)
Can someone help or guide me ? The initial String str is arbitrary!
This answer is based on @Michael's idea (BIG +1 for him) of searching for words containing only lowercase characters and appending =1 to them :)
String addstr = "=1";
String str = "(a AND b) OR (c AND d) ";
StringBuffer sb = new StringBuffer();
Pattern pattern = Pattern.compile("[a-z]+");
Matcher m = pattern.matcher(str);
while (m.find()) {
m.appendReplacement(sb, m.group() + addstr);
}
m.appendTail(sb);
System.out.println(sb);
output
(a=1 AND b=1) OR (c=1 AND d=1)
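The same idea can also be sketched in a single call, because $0 in a Java replacement string refers to the whole match. This is an alternative under the same assumption as above (operands are runs of lowercase letters); the class and method names are hypothetical:

```java
import java.util.regex.Matcher;

class AppendSuffixDemo {
    // Appends the suffix to every run of lowercase letters.
    // quoteReplacement keeps a suffix containing '$' or '\' literal.
    public static String addSuffix(String expr, String suffix) {
        return expr.replaceAll("[a-z]+", "$0" + Matcher.quoteReplacement(suffix));
    }

    public static void main(String[] args) {
        System.out.println(addSuffix("(a AND b) OR (c AND d)", "=1"));
        // (a=1 AND b=1) OR (c=1 AND d=1)
    }
}
```

The explicit appendReplacement loop remains the better fit when each match needs a different replacement.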
Given:
String str = "(a AND b) OR (c AND d)";
String[] tokened = {"a", "b", "c", "d"};
String[] edited = {"a=1", "b=1", "c=1", "d=1"};
Simply:
for (int i = 0; i < tokened.length; i++)
    str = str.replaceAll(tokened[i], edited[i]); // Strings are immutable, so keep the result
Edit:
String addstr = "=1";
String str = "(a AND b) OR (c AND d) ";
String delims = "AND|OR|NOT|[!&|()]+"; // Regular expression syntax
String[] tokens = str.replaceAll(delims, " ").trim().split(" +"); // blank out the operators first so split() yields no empty tokens
String[] delimiters = str.trim().split( "[a-z]+"); //remove all lower case (these are the characters you wish to edit)
String newstr = "";
for (int i = 0; i < delimiters.length-1; i++)
newstr += delimiters[i] + tokens[i] + addstr;
newstr += delimiters[delimiters.length-1];
OK now the explanation:
tokens = [a, b, c, d]
delimiters = [ "(" , " AND " , ") OR (" , " AND " , ")" ]
When iterating through delimiters, we take "(" + "a" + "=1".
From there we have "(a=1" += " AND " + "b" + "=1".
And on: "(a=1 AND b=1" += ") OR (" + "c" + "=1".
Again : "(a=1 AND b=1) OR (c=1" += " AND " + "d" + "=1"
Finally (outside the for loop): "(a=1 AND b=1) OR (c=1 AND d=1" += ")"
There we have: "(a=1 AND b=1) OR (c=1 AND d=1)"
How long is str allowed to be? If the answer is "relatively short", you could simply do a "replace all" for every element in the array. This is obviously not the most performance-friendly solution, so if performance is an issue, a different solution would be desirable.
I am using Java and I created an obscenity filter that works well, except that when the text ends in the letter a, that a gets replaced.
eg. "I want a banana" --> "I want a bananbleep"
HOWEVER...
If you add punctuation after the 'a' it will show up correctly.
eg. "Would you like a banana?" --> "Would you like a banana?"
Here is what I did:
public String rInString(String theDisplay) {
String a = theDisplay;
String b = word;
String c = Matcher.quoteReplacement(replacement);
if(matchWholeWord != null && matchWholeWord){
b = "([^\\p{Alpha}\\p{Lower}\\p{Space}])" + b +
"([^\\p{Alpha}\\p{Lower}\\p{Space}])";
a = " " + a + " ";
c = "$1" + c + "$2";
}
return message.replaceAll("(?i:" + b + " )", c).trim();
}
public String rInString(String theDisplay, String theB, String replace) {
String c = Matcher.quoteReplacement(replace);
String a = theDisplay;
String b = theB;
if (matchWholeWord != null && matchWholeWord) {
b = "([^\\p{Alpha}\\p{Space}])" + b + "([^\\p{Alpha}\\p{Space}])";
a = " " + a + " ";
c = "$1" + c + "$2";
}
return message.replaceAll("(?i:" + b + ")", c).trim();
}
The issue was that a single letter (a) was being blocked because $ marks the end of the input in a regular expression. This meant that the a at the end of the text was being counted as a bad element. Each $ needed to be escaped in the SQL statement.
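The effect can be reproduced in isolation. The stored word is not shown in the question, so "a$" below is a hypothetical value chosen only because it reproduces the symptom; the class and method names are likewise made up. Pattern.quote is one Java-side way to make the $ literal, as an alternative to escaping it where the word is stored:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarAnchorDemo {
    // The word is used as a raw regex: an unescaped '$' acts as an end anchor.
    public static String unescaped(String text, String word, String replacement) {
        return text.replaceAll("(?i:" + word + ")", Matcher.quoteReplacement(replacement));
    }

    // The word is quoted first, so '$' only matches a literal dollar sign.
    public static String escaped(String text, String word, String replacement) {
        return text.replaceAll("(?i:" + Pattern.quote(word) + ")", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(unescaped("I want a banana", "a$", "bleep"));          // I want a bananbleep
        System.out.println(unescaped("Would you like a banana?", "a$", "bleep")); // Would you like a banana?
        System.out.println(escaped("I want a banana", "a$", "bleep"));            // I want a banana
    }
}
```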