replace ". " with "\n\n" within a string in Android - java

I am trying to replace ". " with "\n\n" within a string, but it doesn't work. I use the following code:
text=text.replace(". ","\n\n");
The result is each word on its own line, with the last letter of the word missing. I read that the dot means "any character" in this case, but how can I refer to a literal dot?
Input Example: "Hello world"
Example of the output:
Hell
world
Thank you

There is something fishy here: either text is not a String, or you are not using .replace() but something else (.replaceAll()?), or Android's .replace() is buggy.
And I frankly doubt that the Android developers would have made such an oversight.
The Javadoc for String#replace() says:
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence. [emphasis mine]
Unlike its sibling methods (.replaceFirst() and .replaceAll()), which do use regexes, .replace() doesn't (and the fact that internally it does use Pattern, at least in Oracle's JDK, is not the problem).
Therefore, if you actually use .replace() and get the result you describe, this is a bug in Android. If this is the case, try an alternative, like so (UNTESTED):
public static String realStringReplace(final String victim, final String target,
    final String replacement)
{
    final int skip = target.length();
    final StringBuilder sb = new StringBuilder(victim.length());
    String tmp = victim;
    int index;
    while (!tmp.isEmpty()) {
        index = tmp.indexOf(target);
        if (index == -1)
            break;
        // Copy everything before the match, followed by the replacement
        sb.append(tmp.substring(0, index)).append(replacement);
        // Continue searching after the matched target
        tmp = tmp.substring(index + skip);
    }
    // Append whatever is left after the last match
    return sb.append(tmp).toString();
}

The dot means "any character" if you use
text=text.replaceAll(". ", "\n\n");
Perhaps you have posted the wrong code. In that case, try this one:
text=text.replaceAll("\\. ", "\n\n");
The strange thing is that this line is equivalent to the line that you posted.
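To make the difference concrete, here is a minimal sketch (the class and method names are made up for illustration) that runs the same kind of input through the literal replace(), through replaceAll() with an unescaped dot, and through replaceAll() with the dot escaped or quoted via Pattern.quote():

```java
import java.util.regex.Pattern;

public class DotReplaceDemo {
    // Literal replacement: "." is an ordinary character here.
    static String literal(String text) {
        return text.replace(". ", "\n\n");
    }

    // Regex replacement with an unescaped dot: "." matches any character,
    // so the character before each space is swallowed as well.
    static String unescapedRegex(String text) {
        return text.replaceAll(". ", "\n\n");
    }

    // Regex replacement with the dot escaped: only a real "." matches.
    static String escapedRegex(String text) {
        return text.replaceAll("\\. ", "\n\n");
    }

    // Pattern.quote turns the whole target into a literal pattern.
    static String quotedRegex(String text) {
        return text.replaceAll(Pattern.quote(". "), "\n\n");
    }

    public static void main(String[] args) {
        System.out.println(literal("Hello world"));        // unchanged, no ". " present
        System.out.println(unescapedRegex("Hello world")); // loses the "o" before the space
        System.out.println(escapedRegex("One. Two"));
        System.out.println(quotedRegex("One. Two"));
    }
}
```

If the output matches what the question describes (a letter missing before each break), the code that actually runs is most likely the unescapedRegex variant.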

Related

How can I modify just the first letter of a String?

I have a String which reads "You cannot sit on that " + entity.getEntityType().toLowerCase() + "!";. getEntityType() returns an upper-case value, which is why I convert it to lower case.
However, how can I take ONLY the first letter and turn it into upper case?
A one-liner solution would be preferred; however, beggars can't be choosers.
In Java, any manipulation of a String creates a new String in memory, because Strings are immutable.
Here's how to do it:
String output = input.substring(0, 1).toUpperCase() + input.substring(1);
It's more efficient to treat the first character as what it is, a character and not a String:
String output = Character.toUpperCase(input.charAt(0)) + input.substring(1);
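Both one-liners above throw an exception on an empty string, since there is no character at index 0. A small helper that guards against that could look like this (a sketch; the class and method names are hypothetical):

```java
public class CapitalizeDemo {
    // Returns the input with its first character upper-cased.
    // null and empty strings are returned unchanged.
    static String capitalize(String input) {
        if (input == null || input.isEmpty()) {
            return input;
        }
        return Character.toUpperCase(input.charAt(0)) + input.substring(1);
    }

    public static void main(String[] args) {
        System.out.println("You cannot sit on that " + capitalize("zombie") + "!");
    }
}
```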
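The first thing that came to my mind was this (note that it only works for lowercase ASCII letters, and that replace(char, char) replaces every occurrence of that character, not just the first):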
int x = a.codePointAt(0)-32;
a=a.replace(a.charAt(0), (char) x);
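A quick way to see the limits of the code-point arithmetic is to wrap it in a method and feed it a few inputs. This is only an illustrative sketch; the class and method names are invented here:

```java
public class FirstLetterPitfall {
    // Same logic as the snippet above, wrapped in a method.
    static String viaReplace(String a) {
        int x = a.codePointAt(0) - 32;            // 'b' (98) becomes 'B' (66)
        return a.replace(a.charAt(0), (char) x);  // replaces EVERY occurrence of the first char
    }

    public static void main(String[] args) {
        System.out.println(viaReplace("zombie")); // fine: the 'z' occurs only once
        System.out.println(viaReplace("bob"));    // both 'b' characters are changed
        System.out.println(viaReplace("Bob"));    // 'B' (66) - 32 = 34, which is a double quote
    }
}
```

The substring-based versions avoid both problems, because they touch only index 0 and rely on toUpperCase() rather than on ASCII offsets.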
You can just use something like this:
String output = input.substring(0, 1).toUpperCase() + input.substring(1);
On the other hand, it isn't a bad idea to take a look at the Guava libraries:
String output = CaseFormat.LOWER_CAMEL.to(CaseFormat.UPPER_CAMEL, entity.getEntityType());
For this case alone it would be overkill, but Guava has a lot of other useful utilities (not only for String manipulation).

Making only the first letter of a word uppercase

I have a method that converts all the first letters of the words in a sentence into uppercase.
public static String toTitleCase(String s)
{
    String result = "";
    String[] words = s.split(" ");
    for (int i = 0; i < words.length; i++)
    {
        result += words[i].replace(words[i].charAt(0)+"", Character.toUpperCase(words[i].charAt(0))+"") + " ";
    }
    return result;
}
The problem is that the method also converts every other letter in a word that is the same as the first letter to uppercase. For example, the string title comes out as TiTle.
For the input "this is a title", the output is "This Is A TiTle".
I've tried lots of things. A nested loop that checks every letter in each word, and if there is a recurrence, the second is ignored. I used counters, booleans, etc. Nothing works and I keep getting the same result.
What can I do? I only want the first letter in upper case.
Instead of using the replace() method, try replaceFirst().
result += words[i].replaceFirst(words[i].charAt(0)+"", Character.toUpperCase(words[i].charAt(0))+"") + " ";
Will output:
This Is A Title
The problem is that you are using the replace method, which replaces all occurrences of the given character. To solve this problem you can either:
use replaceFirst instead;
take the first letter, create its uppercase version, and concatenate it with the rest of the string, which can be obtained with the substring method;
switch to a regex-based method such as replaceFirst(String, String) or replaceAll(String, String) and add ^ before the character you want to replace, like replaceAll("^a","A"). ^ means start of input, so it will only replace an a that is placed at the start of the input. (Note that replace(String, String) itself does not use regex.)
I would probably use the second approach.
Also, in each loop iteration your code currently creates a new StringBuilder with the data stored in result, appends the new word to it, and reassigns result to the output of toString().
This is an inefficient approach. Instead, create one StringBuilder before the loop to represent your result, append the new words created inside the loop to it, and after the loop ends get its String version with the toString() method.
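Putting the second approach and the StringBuilder advice together might look like the sketch below. It is not a drop-in replacement for the method in the question: it skips empty words (so repeated spaces collapse) and does not leave a trailing space, which are choices made here for illustration.

```java
public class TitleCaseDemo {
    static String toTitleCase(String s) {
        // One builder for the whole result instead of repeated String concatenation.
        StringBuilder result = new StringBuilder(s.length());
        for (String word : s.split(" ")) {
            if (word.isEmpty()) {
                continue; // produced by leading or repeated spaces
            }
            if (result.length() > 0) {
                result.append(' ');
            }
            // Upper-case only the character at index 0, keep the rest untouched.
            result.append(Character.toUpperCase(word.charAt(0)))
                  .append(word.substring(1));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTitleCase("this is a title"));
    }
}
```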
Doing some Regex-Magic can simplify your task:
// requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args) {
    final String test = "this is a Test";
    final StringBuffer buffer = new StringBuffer(test);
    final Pattern pattern = Pattern.compile("\\b(\\p{javaLowerCase})");
    final Matcher matcher = pattern.matcher(buffer);
    while (matcher.find()) {
        // Replace the matched lower-case letter in place with its upper-case form
        buffer.replace(matcher.start(), matcher.end(), matcher.group().toUpperCase());
    }
    System.out.println(buffer);
}
The expression \\b(\\p{javaLowerCase}) matches "the beginning of a word followed by a lower-case letter", while matcher.group() is equal to what's inside the () in the part that matches. Example: applying it to "test" matches "t", so start is 0, end is 1 and group is "t". This can easily run through even a huge amount of text and replace all the letters that need replacement.
In addition: it is always a good idea to use a StringBuffer (or similar) for String manipulation, because Strings in Java are immutable. That is, if you do something like result += stringPart, you actually create a new String (equal to result + stringPart) each time this is called. So if you do this with, say, 10 parts, you will end up with at least 10 different Strings in memory, while you only need one: the final one.
StringBuffer instead uses something like a char[] internally, so changing a single character does not require allocating extra memory.
Note that a pattern only needs to be compiled once, so you can keep it as a class variable somewhere.
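As a variation on the same idea, the pattern can be held in a static final field and the replacement done with Matcher.appendReplacement / appendTail, which avoids editing the buffer that is being matched against. This is a sketch with hypothetical names, not the code from the answer above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTitleCase {
    // Compiled once and reused for every call.
    private static final Pattern WORD_START =
            Pattern.compile("\\b(\\p{javaLowerCase})");

    static String capitalizeWords(String text) {
        Matcher m = WORD_START.matcher(text);
        StringBuffer out = new StringBuffer(text.length());
        while (m.find()) {
            // quoteReplacement guards against '$' or '\' in the replacement text
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1).toUpperCase()));
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeWords("this is a Test"));
    }
}
```

One caveat that applies to any \b-based approach: an apostrophe counts as a word boundary, so "it's" becomes "It'S".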

Remove Punctuation issue

I'm trying to find a word in a string. However, because of a period, one occurrence of the word is not recognized. I'm trying to remove punctuation, but it seems to have no effect. Am I missing something here? This is the line of code I am using: s.replaceAll("([a-z] +) [?:!.,;]*","$1");
String test = "This is a line about testing tests. Tests are used to examine stuff";
String key = "tests";
int counter = 0;
String[] testArray = test.toLowerCase().split(" ");
for(String s : testArray)
{
    s.replaceAll("([a-z] +) [?:!.,;]*","$1");
    System.out.println(s);
    if(s.equals(key))
    {
        System.out.println(key + " FOUND");
        counter++;
    }
}
System.out.println(key + " has been found " + counter + " times.");
I managed to find a solution (though it may not be ideal) by using s = s.replaceAll("\\W",""); Thanks for everyone's guidance on how to solve this problem.
You could also take advantage of the regex in the split operation. Try this:
String[] testArray = test.toLowerCase().split("\\W+");
Note that this will also split on apostrophes, so you may need to tweak it with a specific list of characters.
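Applied to the counting loop from the question, the split-based idea could be sketched like this (the class and method names are made up; Locale.ROOT is used here only to keep the lower-casing independent of the default locale):

```java
import java.util.Locale;

public class WordCountDemo {
    // Counts how often key appears as a whole word, ignoring case and punctuation.
    static int countWord(String text, String key) {
        int counter = 0;
        // "\\W+" splits on any run of non-word characters: spaces, periods, commas, ...
        for (String s : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (s.equals(key)) {
                counter++;
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        String test = "This is a line about testing tests. Tests are used to examine stuff";
        System.out.println("tests has been found " + countWord(test, "tests") + " times.");
    }
}
```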
Strings are immutable. You need to assign the result of replaceAll back to the String:
s = s.replaceAll("([a-z] +)*[?:!.,;]*", "$1");
^
Also, your regex requires that a space exist between the word and the punctuation. In the case of tests., this isn't true. You can adjust your regex with an optional (zero or more) quantifier to account for this.
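The immutability point is easy to verify in isolation. In the sketch below the character class [?:!.,;]+ is a simplified stand-in chosen for illustration, not the regex from the question, and the class and method names are hypothetical:

```java
public class ImmutabilityDemo {
    // The return value of replaceAll is thrown away, so s is unchanged.
    static String ignoredResult(String s) {
        s.replaceAll("[?:!.,;]+", "");
        return s;
    }

    // The return value is assigned back, so the cleaned string is used.
    static String assignedResult(String s) {
        s = s.replaceAll("[?:!.,;]+", "");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(ignoredResult("tests."));  // still has the period
        System.out.println(assignedResult("tests.")); // period removed
    }
}
```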
Your regex doesn't seem to work the way you want.
If you want to find a word that is followed by a period, this will work:
([a-z]*) [?(:!.,;)*]
It returns "tests." when it's run on your given string.
Also,
[?(:!.,;)*]
matches just the punctuation, which can then be replaced.
However, I am not sure why you are not using the substring() function.

Help building a regex

I need to build a regular expression that finds the word "int" only if it's not part of some string.
I want to find whether int is used in the code. (not in some string, only in regular code)
Example:
int i; // the regex should find this one.
String example = "int i"; // the regex should ignore this line.
logger.i("int"); // the regex should ignore this line.
logger.i("int") + int.toString(); // the regex should find this one (because of the second int)
thanks!
It's not going to be bullet-proof, but this works for all your test cases:
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
It uses a lookbehind and a lookahead to assert that there are either zero or two preceding/following quotes (").
Here's the code in java with the output:
String regex = "(?<=^([^\"]*|[^\"]*\"[^\"]*\"[^\"]*))\\bint\\b(?=([^\"]*|[^\"]*\"[^\"]*\"[^\"]*)$)";
System.out.println(regex);
String[] tests = new String[] {
    "int i;",
    "String example = \"int i\";",
    "logger.i(\"int\");",
    "logger.i(\"int\") + int.toString();" };
for (String test : tests) {
    System.out.println(test.matches("^.*" + regex + ".*$") + ": " + test);
}
Output (included regex so you can read it without all those \ escapes):
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
true: int i;
false: String example = "int i";
false: logger.i("int");
true: logger.i("int") + int.toString();
Using a regex is never going to be 100% accurate - you need a language parser. Consider escaped quotes in Strings "foo\"bar", in-line comments /* foo " bar */, etc.
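A different technique from the lookaround regex above, offered only as a sketch: first blank out double-quoted string literals (including ones with escaped quotes), then search what is left for the word int. The class and method names are invented for this example, and it still ignores comments and char literals, so it is not a substitute for a real parser.

```java
import java.util.regex.Pattern;

public class IntFinder {
    // A double-quoted literal; \\\\. lets escaped characters such as \" stay inside it.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:\\\\.|[^\"\\\\])*\"");
    private static final Pattern INT_WORD = Pattern.compile("\\bint\\b");

    static boolean usesInt(String line) {
        // Replace every string literal with an empty literal, then look for "int".
        String withoutStrings = STRING_LITERAL.matcher(line).replaceAll("\"\"");
        return INT_WORD.matcher(withoutStrings).find();
    }

    public static void main(String[] args) {
        String[] tests = {
            "int i;",
            "String example = \"int i\";",
            "logger.i(\"int\");",
            "logger.i(\"int\") + int.toString();"
        };
        for (String test : tests) {
            System.out.println(usesInt(test) + ": " + test);
        }
    }
}
```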
I'm not exactly sure what your complete requirements are, but perhaps:
^\s*\bint\b
Assuming the input is processed line by line:
^int\s[\$_a-zA-Z\;]*$
It follows basic variable naming rules :)
If you want to parse code and search for an isolated int word, this works:
(^int|[\(\ \;,]int)
You can use it to find an int that, in code, can only be preceded by a space, comma, semicolon or left parenthesis, or be the first word of the line.
You can try it here and enhance it http://www.regextester.com/
PS: this works in all your test cases.
^[^"]*\bint\b
should work. I can't think of a situation where you can use a valid int identifier after the character '"'.
Of course this only applies if the code is limited to one statement per line.

ReplaceAll and &quot; doesn't replace

Can anyone explain why the first if works and the second doesn't? I'm puzzled why the second if-clause isn't working. I'd like to get a hint, thanks.
String msg = o.getTweet();
if (msg.indexOf("&amp;") > 0) {
    msg = msg.replaceAll("&amp;", "&"); // replaces &amp; with &
}
if (msg.indexOf("&quot;") > 0) {
    msg = msg.replaceAll("&quot;", "aa"); // replaces &quot; with "
}
Because zero is a perfectly valid index. Try this:
String msg = o.getTweet();
if (msg.indexOf("&amp;") != -1) {
    msg = msg.replaceAll("&amp;", "&"); // replaces &amp; with &
}
if (msg.indexOf("&quot;") != -1) {
    msg = msg.replaceAll("&quot;", "aa"); // replaces &quot; with "
}
Explanation:
The documentation of String.indexOf(String str) explains that, "if the string argument occurs as a substring within this object, then the index of the first character of the first such substring is returned; if it does not occur as a substring, -1 is returned." - [link to docs]
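The effect of the two comparisons is easy to check on a message that starts with the search string. This sketch assumes the strings being searched for are HTML entities such as &quot;; the class and method names are hypothetical:

```java
public class IndexOfDemo {
    // Misses a match at index 0, because 0 > 0 is false.
    static boolean foundWithGreaterThanZero(String msg, String needle) {
        return msg.indexOf(needle) > 0;
    }

    // Treats every index, including 0, as a match; only -1 means "not found".
    static boolean foundWithNotMinusOne(String msg, String needle) {
        return msg.indexOf(needle) != -1;
    }

    public static void main(String[] args) {
        String msg = "&quot;quoted tweet&quot;";
        System.out.println(msg.indexOf("&quot;"));                       // index of first match
        System.out.println(foundWithGreaterThanZero(msg, "&quot;"));
        System.out.println(foundWithNotMinusOne(msg, "&quot;"));
    }
}
```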
This can be done as simply as below, as OpenSauce pointed out:
msg = msg.replace("&amp;", "&").replace("&quot;", "\"");
Useful links:
String indexOf() docs
String replace() docs
String replaceAll() docs
You don't need to check that the substring exists; the replace and replaceAll methods are no-ops if the substring is not found. Since you're not looking for regexes, you can also use replace instead of replaceAll. It will be somewhat more efficient, and won't surprise you if you also want to check for other strings which happen to contain regex special characters.
msg = msg.replace("&amp;", "&").replace("&quot;", "\"");
Note that replace does indeed replace all matches, as you want. The difference between replace and replaceAll is whether the argument is interpreted as a regex or not.
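As a self-contained sketch of the chained replace (assuming the goal is to turn the HTML entities &amp; and &quot; back into & and "; the class and method names are made up):

```java
public class EntityDemo {
    // Unescapes two basic HTML entities with literal (non-regex) replacement.
    // &amp; is handled last so that "&amp;quot;" becomes "&quot;" rather than a quote.
    static String unescapeBasic(String msg) {
        return msg.replace("&quot;", "\"").replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(unescapeBasic("&quot;Tom &amp; Jerry&quot;"));
        System.out.println(unescapeBasic("no entities here")); // returned unchanged
    }
}
```

The order of the two calls is a design choice made in this sketch: replacing &amp; first, as in the one-liner above, would unescape a doubly escaped "&amp;quot;" twice. For anything beyond a couple of entities, a dedicated HTML-unescaping utility is probably the safer route.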
