How do I build these words into one sentence after looping? - java

I am looping through a sentence, splitting it into words so that I can capitalize each one. But it is hard to build the sentence back after getting the individual words.
String str = "Not the answer you're looking for.";
StringBuilder stringBuilder = new StringBuilder();
String oneWord =" ";
for (String word : str.toLowerCase().split(" ")){
char firstLetter = word.substring(0,1).toUpperCase().charAt(0);
oneWord = firstLetter + word.substring(1);
System.out.println(stringBuilder.append(oneWord + " "));
}
}
I expect to get only one fully built String "Not The Answer You're Looking For."

String str = "Not the answer you're looking for.";
StringBuilder stringBuilder = new StringBuilder();
String oneWord =" ";
for (String word : str.toLowerCase().split(" ")){
char firstLetter = word.substring(0,1).toUpperCase().charAt(0);
oneWord = firstLetter + word.substring(1);
stringBuilder.append(oneWord + " ");
}
System.out.println(stringBuilder.toString());
You are not getting just one string because you call System.out.println inside the for loop, so the partially built sentence is printed on every iteration.
Consider my example above.

oneWord += firstLetter + word.substring(1) + " ";
Then, after the loop:
oneWord = oneWord.trim();
System.out.println(oneWord);
So the solution is:
String str = "Not the answer you're looking for.";
StringBuilder sb = new StringBuilder();
for (String word : str.toLowerCase().split(" ")) {
sb.append(word.substring(0, 1).toUpperCase());
sb.append(word.substring(1));
sb.append(" ");
}
System.out.println(sb.toString().trim());
Also, your solution is not optimal.
Check String.join(), or use something like this:
Arrays.stream(str.toLowerCase().split(" "))
.map(word -> word.substring(0, 1).toUpperCase() + word.substring(1))
.collect(Collectors.joining(" "));
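A self-contained sketch of the stream approach, for anyone who wants to run it. The class and method names are hypothetical, and the filter for empty words is an added assumption so that repeated spaces do not cause a substring error.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class CapitalizeWords {
    // Lower-cases the sentence, then upper-cases the first letter of each
    // space-separated word and joins the words back with single spaces.
    static String capitalizeEachWord(String sentence) {
        return Arrays.stream(sentence.toLowerCase().split(" "))
                .filter(word -> !word.isEmpty()) // skip empty pieces from repeated spaces
                .map(word -> word.substring(0, 1).toUpperCase() + word.substring(1))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // prints: Not The Answer You're Looking For.
        System.out.println(capitalizeEachWord("Not the answer you're looking for."));
    }
}
```

The joining collector inserts the separator only between elements, so no trailing space has to be trimmed afterwards.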

Related

split String If get any capital letters

My String:
BByTTheWay. I want to split the string as B By T The Way BByTTheWay. That is, I want to split the string at every capital letter and put the original string, unchanged, at the end. This is what I have tried so far in Java:
public String breakWord(String fileAsString) throws FileNotFoundException, IOException {
String allWord = "";
String allmethod = "";
String[] splitString = fileAsString.split(" ");
for (int i = 0; i < splitString.length; i++) {
String k = splitString[i].replaceAll("([A-Z])(?![A-Z])", " $1").trim();
allWord = k.concat(" " + splitString[i]);
allWord = Arrays.stream(allWord.split("\\s+")).distinct().collect(Collectors.joining(" "));
allmethod = allmethod + " " + allWord;
// System.out.print(allmethod);
}
return allmethod;
}
It gives me the output: B ByT The Way BByTTheWay. I hope the Stack Overflow community can help me solve this.
You may use this code:
Code 1
String s = "BByTTheWay";
Pattern p = Pattern.compile("\\p{Lu}\\p{Ll}*");
String out = p.matcher(s)
.results()
.map(MatchResult::group)
.collect(Collectors.joining(" "))
+ " " + s;
//=> "B By T The Way BByTTheWay"
The regex \\p{Lu}\\p{Ll}* matches any Unicode uppercase letter followed by 0 or more lowercase letters.
Or use String.split with a lookahead for an uppercase letter and join the pieces back later:
Code 2
String out = Arrays.stream(s.split("(?=\\p{Lu})"))
.collect(Collectors.joining(" ")) + " " + s;
//=> "B By T The Way BByTTheWay"
Use
String s = "BByTTheWay";
Pattern p = Pattern.compile("[A-Z][a-z]*");
Matcher m = p.matcher(s);
String r = "";
while (m.find()) {
r = r + m.group(0) + " ";
}
System.out.println(r + s);
Results: B By T The Way BByTTheWay
EXPLANATION
--------------------------------------------------------------------------------
[A-Z]     any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
[a-z]*    any character of: 'a' to 'z' (0 or more times (matching the most amount possible))
As per the requirements, you can also write it this way, checking whether each character is an uppercase letter:
char[] chars = fileAsString.toCharArray();
StringBuilder fragment = new StringBuilder();
for (char ch : chars) {
if (Character.isLetter(ch) && Character.isUpperCase(ch)) { // it works as internationalized check
fragment.append(" ");
}
fragment.append(ch);
}
String result = String.join(" ", fragment).concat(" " + fileAsString).trim(); // B By T The Way BByTTheWay

Stop words not being correctly removed from string

I have a function which reads stop words from a file and saves them in a HashSet.
HashSet<String> hset = readFile();
This is my string
String words = "the plan crash is invisible";
I am trying to remove all the stop words from the string, but it is not working correctly.
The output I am getting: plan crash invible
The output I want: plan crash invisible
Code:
HashSet<String> hset = readFile();
String words = "the plan crash is invisible";
String s = words.toLowerCase();
String[] split = s.split(" ");
for(String str: split){
if (hset.contains(str)) {
s = s.replace(str, "");
} else {
}
}
System.out.println("\n" + "\n" + s);
While hset.contains(str) matches full words, s.replace(str, ""); can replace occurrences of the "stop" words which are part of words of the input String. Hence "invisible" becomes "invible".
Since you are iterating over all the words of s anyway, you can construct a String that contains all the words not contained in the Set:
StringBuilder sb = new StringBuilder();
for(String str: split){
if (!hset.contains(str)) {
if (sb.length() > 0) {
sb.append(' ');
}
sb.append(str);
}
}
System.out.println("\n" + "\n" + sb.toString());
There is no need to check whether your string contains the stop word or to split your string; you can use replaceAll, which takes a regex, like this:
for (String str : hset) {
s = s.replaceAll("\\s" + str + "|" + str + "\\s", " ");
}
Example:
HashSet<String> hset = new HashSet<>();
hset.add("is");
hset.add("the");
String words = "the plan crash is invisible";
String s = words.toLowerCase();
for (String str : hset) {
s = s.replaceAll("\\s" + str + "|" + str + "\\s", " ");
}
s = s.replaceAll("\\s+", " ").trim(); // idea from a comment by @davidxxx
System.out.println(s);
This gives you:
plan crash invisible
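One caveat with the whitespace-based pattern: it also matches a stop word that merely ends a longer word when a space follows (for example "is " at the end of "this "). A possible variation, sketched below, anchors the match with word boundaries instead. The class and method names are hypothetical, and Pattern.quote is added on the assumption that stop words might contain regex metacharacters.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

final class StopWordRegex {
    // Removes whole-word occurrences of each stop word, then collapses
    // the leftover whitespace into single spaces.
    static String removeStopWords(String text, Set<String> stopWords) {
        String s = text.toLowerCase();
        for (String stop : stopWords) {
            // \b matches only at a word boundary, so "is" inside "this" or "invisible" is left alone
            s = s.replaceAll("\\b" + Pattern.quote(stop) + "\\b", " ");
        }
        return s.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        Set<String> hset = new HashSet<>(Arrays.asList("is", "the"));
        // prints: plan crash invisible
        System.out.println(removeStopWords("the plan crash is invisible", hset));
    }
}
```

Note that \b treats an apostrophe as a non-word character, so a stop word such as "don" would still match inside "don't"; the StringBuilder approach that compares whole tokens does not have this problem.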

Can StringTokenizer countTokens ever be zero?

I just found a piece of Java code inside a method:
if (param.contains("|")) {
StringTokenizer st = new StringTokenizer(param.toLowerCase().replace(" ", ""), "|");
if (st.countTokens() > 0) {
...
}
} else {
return myString.contains(param);
}
Can countTokens in the above case ever be less than 1?
It can: if the string you're trying to tokenize is empty (or consists only of delimiters), countTokens() is 0. Otherwise it will always be at least 1.
Example 1:
String myStr = "abcdefg";
StringTokenizer st = new StringTokenizer(myStr, ";");
int tokens = st.countTokens();
System.out.println("Number of tokens: " + tokens);
> "Number of tokens: 1"
Example 2:
String myStr = "";
StringTokenizer st = new StringTokenizer(myStr, ";");
int tokens = st.countTokens();
System.out.println("Number of tokens: " + tokens);
> "Number of tokens: 0"
Example 3:
String myStr = "abc;defg";
StringTokenizer st = new StringTokenizer(myStr, ";");
int tokens = st.countTokens();
System.out.println("Number of tokens: " + tokens);
> "Number of tokens: 2"
The following all return 0:
new StringTokenizer("", "|").countTokens()
new StringTokenizer("|", "|").countTokens()
new StringTokenizer("||||", "|").countTokens()
So countTokens() returns 0 when:
the String is empty
the String contains only the delimiter
Look at this
String param="";
StringTokenizer st = new StringTokenizer(param.toLowerCase().replace(" ", ""), "|");
System.out.println(st.countTokens());
The answer is 0 (zero).
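The cases discussed in these answers can be checked with a small program. This is only a sketch with hypothetical class and method names; the last case mirrors the code in the question, where spaces are removed before tokenizing.

```java
import java.util.StringTokenizer;

final class CountTokensDemo {
    // Returns how many tokens StringTokenizer finds for the given delimiters.
    static int count(String text, String delimiters) {
        return new StringTokenizer(text, delimiters).countTokens();
    }

    public static void main(String[] args) {
        System.out.println(count("", "|"));      // 0: empty string
        System.out.println(count("||||", "|"));  // 0: only delimiters
        System.out.println(count("abc", "|"));   // 1: no delimiter present
        System.out.println(count("a||b", "|"));  // 2: consecutive delimiters yield no empty tokens
        // As in the question: the parameter contains "|" but nothing else once spaces are removed.
        System.out.println(count(" | ".replace(" ", ""), "|")); // 0
    }
}
```

So inside the branch guarded by param.contains("|"), the count can still be 0 when the parameter holds nothing but delimiters and spaces.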

Replacing sub-string of a String with another String

I have a String
String str = "(a AND b) OR (c AND d)";
I tokenise with the help of code below
String delims = "AND|OR|NOT|[!&|()]+"; // Regular expression syntax
String newstr = str.replaceAll(delims, " ");
String[] tokens = newstr.trim().split("[ ]+");
and get String[] below
[a, b, c, d]
To each element of the array I add "=1" so it becomes
[a=1, b=1, c=1, d=1]
Now I need to put these values back into the initial string, making it
(a=1 AND b=1) OR (c=1 AND d=1)
Can someone help or guide me ? The initial String str is arbitrary!
This answer is based on @Michael's idea (BIG +1 for him) of searching for words containing only lowercase characters and adding =1 to them :)
String addstr = "=1";
String str = "(a AND b) OR (c AND d) ";
StringBuffer sb = new StringBuffer();
Pattern pattern = Pattern.compile("[a-z]+");
Matcher m = pattern.matcher(str);
while (m.find()) {
m.appendReplacement(sb, m.group() + addstr);
}
m.appendTail(sb);
System.out.println(sb);
output
(a=1 AND b=1) OR (c=1 AND d=1)
Given:
String str = "(a AND b) OR (c AND d)";
String[] tokened = {"a", "b", "c", "d"};
String[] edited = {"a=1", "b=1", "c=1", "d=1"};
Simply:
for (int i = 0; i < tokened.length; i++)
str = str.replaceAll(tokened[i], edited[i]); // Strings are immutable, so keep the returned value
Edit:
String addstr = "=1";
String str = "(a AND b) OR (c AND d) ";
String delims = "AND|OR|NOT|[!&|() ]+"; // Regular expression syntax
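String[] tokens = str.replaceAll(delims, " ").trim().split("[ ]+"); // [a, b, c, d]; splitting on delims directly would leave empty strings between adjacent delimiters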
String[] delimiters = str.trim().split( "[a-z]+"); //remove all lower case (these are the characters you wish to edit)
String newstr = "";
for (int i = 0; i < delimiters.length-1; i++)
newstr += delimiters[i] + tokens[i] + addstr;
newstr += delimiters[delimiters.length-1];
OK, now the explanation:
tokens = [a, b, c, d]
delimiters = [ "(" , " AND " , ") OR (" , " AND " , ")" ]
When iterating through delimiters, we take "(" + "a" + "=1".
From there we have "(a=1" += " AND " + "b" + "=1".
And on: "(a=1 AND b=1" += ") OR (" + "c" + "=1".
Again : "(a=1 AND b=1) OR (c=1" += " AND " + "d" + "=1"
Finally (outside the for loop): "(a=1 AND b=1) OR (c=1 AND d=1" += ")"
There we have: "(a=1 AND b=1) OR (c=1 AND d=1)"
How long is str allowed to be? If the answer is "relatively short", you could simply do a "replace all" for every element in the array. This is obviously not the most performance-friendly solution, so if performance is an issue, a different solution would be desirable.
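If the operands are always lowercase words and the operators are uppercase, as the answers here assume, the whole substitution can probably be done in one call by referring to the match itself in the replacement string. A minimal sketch with hypothetical class and method names:

```java
final class AppendToOperands {
    // Appends "=1" to every run of lowercase letters.
    // In a Java replacement string, $0 stands for the entire match.
    static String appendEqualsOne(String expression) {
        return expression.replaceAll("[a-z]+", "$0=1");
    }

    public static void main(String[] args) {
        // prints: (a=1 AND b=1) OR (c=1 AND d=1)
        System.out.println(appendEqualsOne("(a AND b) OR (c AND d)"));
    }
}
```

This makes a single pass over the string, so it avoids both the intermediate token array and the repeated replaceAll calls. If the suffix were not a fixed literal, Matcher.quoteReplacement would be needed to escape any $ or backslash it contains.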

removing comma from string array

I want to execute a query like
select ID from "xyz_DB"."test" where user in ('a','b')
so the corresponding code is like
String s="(";
for(String user:selUsers){
s+= " ' " + user + " ', ";
}
s+=")";
Select ID from test where userId in s;
This code produces the value ('a','b',) for s.
I want to remove the comma after the last element. How can I do this?
Here is one way to do this:
String s = "(";
boolean first = true;
for(String user : selUsers){
if (first) {
first = false;
} else {
s += ", ";
}
s += " ' " + user + " '";
}
s += ")";
But it is more efficient to use a StringBuilder to assemble a String if there is looping involved.
StringBuilder sb = new StringBuilder("(");
boolean first = true;
for(String user : selUsers){
if (first) {
first = false;
} else {
sb.append(", ");
}
sb.append(" ' ").append(user).append(" '");
}
sb.append(")");
String s = sb.toString();
This does the trick.
String s = "";
for(String user : selUsers)
s += ", '" + user + "'";
if (selUsers.size() > 0)
s = s.substring(2);
s = "(" + s + ")";
But, a few pointers:
When concatenating strings like this, you are advised to work with StringBuilder and append.
If this is part of an SQL-query, you probably want to sanitize the user-names. See xkcd: Exploits of a Mom for an explanation.
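One way to follow both pointers at once is to let the string contain only placeholders and bind the user names through a PreparedStatement. The sketch below is an illustration under assumptions: the class and method names are hypothetical, the table and column names are taken from the question, and the ID column is assumed to be an integer.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InClauseQuery {
    // Builds "(?, ?, ?)" with one placeholder per value; String.join puts the
    // separator only between elements, so there is no trailing comma to remove.
    static String placeholders(int count) {
        return "(" + String.join(", ", Collections.nCopies(count, "?")) + ")";
    }

    // Runs the query with the user names bound as parameters instead of
    // being concatenated into the SQL text.
    static List<Integer> findIds(Connection conn, List<String> users) throws SQLException {
        List<Integer> ids = new ArrayList<>();
        if (users.isEmpty()) {
            return ids; // an empty "in ()" list is rejected by many databases
        }
        String sql = "select ID from test where userId in " + placeholders(users.size());
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < users.size(); i++) {
                ps.setString(i + 1, users.get(i)); // JDBC parameter indexes start at 1
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getInt("ID")); // assumption: ID is an integer column
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(placeholders(3)); // prints: (?, ?, ?)
    }
}
```

Only the number of placeholders depends on the input, so the user names never become part of the SQL text.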
For fun, a variation of Stephen C's answer:
StringBuilder sb = new StringBuilder("(");
boolean first = true;
for(String user : selUsers){
if (!first || (first = false))
sb.append(", ");
sb.append('\'').append(user).append('\'');
}
sb.append(')');
You could even write the loop like this :-)
for(String user : selUsers)
sb.append(!first || (first=false) ? ", \'" : "\'").append(user).append('\'');
Use the 'old style' of loop where you have the index, then you add the comma on every username except the last:
String[] selUsers = {"a", "b", "c"};
String s="(";
for(int i = 0; i < selUsers.length; i++){
s+= " ' " + selUsers[i] + " ' ";
if(i < selUsers.length -1){
s +=" , ";
}
}
s+=")";
But as others already mentioned, use StringBuffer when concatenating strings:
String[] selUsers = {"a", "b", "c"};
StringBuffer s = new StringBuffer("(");
for(int i = 0; i < selUsers.length; i++){
s.append(" ' " + selUsers[i] + " ' ");
if(i < selUsers.length -1){
s.append(" , ");
}
}
s.append(")");
Use StringUtils.join from Apache Commons.
Prior to adding the trailing ')' I'd strip off the last character of the string if it's a comma, or perhaps just replace the trailing comma with a right parenthesis - in pseudo-code, something like
if s.last == ',' then
s = s.left(s.length() - 1);
end if;
s = s + ')';
or
if s.last == ',' then
s.last = ')';
else
s = s + ')';
end if;
Share and enjoy.
I would do s += " ,'" + user + "'"; (placing the comma before the value) and add a counter to the loop, so that for the first element (counter 1 or 0, depending on where you start counting) I just do s = "'" + user + "'"; instead.
(N.B. - I'm not a Java guy, so the syntax may be wrong here - apologies if it is).
If selUsers is an array, why not do:
selUsers.join(',');
This should do what you want.
EDIT:
Looks like I was wrong - I figured Java had this functionality built-in. Looks like the Apache project has something that does what I meant, though. Check out this SO answer: Java: function for arrays like PHP's join()?
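Since Java 8 the standard library does have this built in: String.join for plain joining, and Collectors.joining when a prefix and suffix are wanted too. A minimal sketch with hypothetical class and method names; it performs no escaping of the values, so it is meant only to show the joining itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class JoinUsers {
    // Wraps each value in single quotes and joins with commas inside
    // parentheses. The separator appears only between elements.
    static String quotedList(List<String> users) {
        return users.stream()
                .map(user -> "'" + user + "'")
                .collect(Collectors.joining(",", "(", ")"));
    }

    public static void main(String[] args) {
        List<String> selUsers = Arrays.asList("a", "b");
        System.out.println(String.join(",", selUsers)); // prints: a,b
        System.out.println(quotedList(selUsers));       // prints: ('a','b')
    }
}
```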
I fully support Stephen C's answer - that's the one I wanted to suggest as well: simply avoid adding the additional comma.
But in your special case, the following should work too:
s = s.replace(", )", ")"); // String.replace takes a literal, not a regex, so no escaping is needed
It simply replaces the substring ", )" with a single bracket ")".
Java 1.4+
s = s.replaceFirst("\\, \\)$", ")");
Edited: I forgot the last space before the parenthesis
StringBuilder has a perfectly good append(int) overload.
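String [] array = {"1","2","3" ...};
StringBuilder builder = new StringBuilder();
for(String i : array)
{
// the builder is still empty for the first element, so no leading comma is added
if(builder.length() != 0)
builder.append(",");
builder.append(i);
}
// s is the text that precedes the list; it is not defined in this snippet
String result = s + "( " + builder + " )";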
Answer shamelessly copied from here
