Formatting String Array efficiently in Java

Formatting String Array efficiently in Java - java

I was working on some string formatting, and I was curious if I was doing it the most efficient way.
Assume I have a String Array:
String ArrayOne[] = {"/test/" , "/this/is/test" , "/that/is/" "/random/words" }
I want the result Array to be
String resultArray[] = {"test", "this_is_test" , "that_is" , "random_words" }
It's quite messy and brute-force-like.
for(char c : ArrayOne[i].toCharArray()) {
if(c == '/'){
occurances[i]++;
}
}
First I count the number of "/" in each String like above and then using these counts, I find the indexOf("/") for each string and add "_" accordingly.
As you can see though, it gets very messy.
Is there a more efficient way to do this besides the brute-force way I'm doing?
Thanks!

You could use replaceAll and replace, as follows:
String resultArray[] = new String[ArrayOne.length];
for (int i = 0; i < ArrayOne.length; ++i) {
resultArray[i] = ArrayOne[i].replaceAll("^/|/$", "").replace('/', '_');
}
The replaceAll method searches the string for a match to the regex given in the first argument, and replaces each match with the text in the second argument.
Here, we use it first to remove leading and trailing slashes. We search for slashes at the start of the string (^/) or the end of the string (/$), and replace them with nothing.
Then, we replace all remaining slashes with underscores using replace.

Related

Java regex: Replace all characters with `+` except instances of a given string

I have the following problem which states
Replace all characters in a string with + symbol except instances of the given string in the method
so for example if the string given was abc123efg and they want me to replace every character except every instance of 123 then it would become +++123+++.
I figured a regular expression is probably the best for this and I came up with this.
str.replaceAll("[^str]","+")
where str is a variable, but its not letting me use the method without putting it in quotations. If I just want to replace the variable string str how can I do that? I ran it with the string manually typed and it worked on the method, but can I just input a variable?
as of right now I believe its looking for the string "str" and not the variable string.
Here is the output its right for so many cases except for two :(
List of open test cases:
plusOut("12xy34", "xy") → "++xy++"
plusOut("12xy34", "1") → "1+++++"
plusOut("12xy34xyabcxy", "xy") → "++xy++xy+++xy"
plusOut("abXYabcXYZ", "ab") → "ab++ab++++"
plusOut("abXYabcXYZ", "abc") → "++++abc+++"
plusOut("abXYabcXYZ", "XY") → "++XY+++XY+"
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
plusOut("--++ab", "++") → "++++++"
plusOut("aaxxxxbb", "xx") → "++xxxx++"
plusOut("123123", "3") → "++3++3"

Looks like this is the plusOut problem on CodingBat.
I had 3 solutions to this problem, and wrote a new streaming solution just for fun.
Solution 1: Loop and check
Create a StringBuilder out of the input string, and check for the word at every position. Replace the character if doesn't match, and skip the length of the word if found.
public String plusOut(String str, String word) {
StringBuilder out = new StringBuilder(str);
for (int i = 0; i < out.length(); ) {
if (!str.startsWith(word, i))
out.setCharAt(i++, '+');
else
i += word.length();
}
return out.toString();
}
This is probably the expected answer for a beginner programmer, though there is an assumption that the string doesn't contain any astral plane character, which would be represented by 2 char instead of 1.
Solution 2: Replace the word with a marker, replace the rest, then restore the word
public String plusOut(String str, String word) {
return str.replaceAll(java.util.regex.Pattern.quote(word), "#").replaceAll("[^#]", "+").replaceAll("#", word);
}
Not a proper solution since it assumes that a certain character or sequence of character doesn't appear in the string.
Note the use of Pattern.quote to prevent the word being interpreted as regex syntax by replaceAll method.
Solution 3: Regex with \G
public String plusOut(String str, String word) {
word = java.util.regex.Pattern.quote(word);
return str.replaceAll("\\G((?:" + word + ")*+).", "$1+");
}
Construct regex \G((?:word)*+)., which does more or less what solution 1 is doing:
\G makes sure the match starts from where the previous match leaves off
((?:word)*+) picks out 0 or more instance of word - if any, so that we can keep them in the replacement with $1. The key here is the possessive quantifier *+, which forces the regex to keep any instance of the word it finds. Otherwise, the regex will not work correctly when the word appear at the end of the string, as the regex backtracks to match .
. will not be part of any word, since the previous part already picks out all consecutive appearances of word and disallow backtrack. We will replace this with +
Solution 4: Streaming
public String plusOut(String str, String word) {
return String.join(word,
Arrays.stream(str.split(java.util.regex.Pattern.quote(word), -1))
.map((String s) -> s.replaceAll("(?s:.)", "+"))
.collect(Collectors.toList()));
}
The idea is to split the string by word, do the replacement on the rest, and join them back with word using String.join method.
Same as above, we need Pattern.quote to avoid split interpreting the word as regex. Since split by default removes empty string at the end of the array, we need to use -1 in the second parameter to make split leave those empty strings alone.
Then we create a stream out of the array and replace the rest as strings of +. In Java 11, we can use s -> String.repeat(s.length()) instead.
The rest is just converting the Stream to an Iterable (List in this case) and joining them for the result

This is a bit trickier than you might initially think because you don't just need to match characters, but the absence of specific phrase - a negated character set is not enough. If the string is 123, you would need:
(?<=^|123)(?!123).*?(?=123|$)
https://regex101.com/r/EZWMqM/1/
That is - lookbehind for the start of the string or "123", make sure the current position is not followed by 123, then lazy-repeat any character until lookahead matches "123" or the end of the string. This will match all characters which are not in a "123" substring. Then, you need to replace each character with a +, after which you can use appendReplacement and a StringBuffer to create the result string:
String inputPhrase = "123";
String inputStr = "abc123efg123123hij";
StringBuffer resultString = new StringBuffer();
Pattern regex = Pattern.compile("(?<=^|" + inputPhrase + ")(?!" + inputPhrase + ").*?(?=" + inputPhrase + "|$)");
Matcher m = regex.matcher(inputStr);
while (m.find()) {
String replacement = m.group(0).replaceAll(".", "+");
m.appendReplacement(resultString, replacement);
}
m.appendTail(resultString);
System.out.println(resultString.toString());
Output:
+++123+++123123+++
Note that if the inputPhrase can contain character with a special meaning in a regular expression, you'll have to escape them first before concatenating into the pattern.

You can do it in one line:
input = input.replaceAll("((?:" + str + ")+)?(?!" + str + ").((?:" + str + ")+)?", "$1+$2");
This optionally captures "123" either side of each character and puts them back (a blank if there's no "123"):

So instead of coming up with a regular expression that matches the absence of a string. We might as well just match the selected phrase and append + the number of skipped characters.
StringBuilder sb = new StringBuilder();
Matcher m = Pattern.compile(Pattern.quote(str)).matcher(input);
while (m.find()) {
for (int i = 0; i < m.start(); i++) sb.append('+');
sb.append(str);
}
int remaining = input.length() - sb.length();
for (int i = 0; i < remaining; i++) {
sb.append('+');
}

Absolutely just for the fun of it, a solution using CharBuffer (unexpectedly it took a lot more that I initially hoped for):
private static String plusOutCharBuffer(String input, String match) {
int size = match.length();
CharBuffer cb = CharBuffer.wrap(input.toCharArray());
CharBuffer word = CharBuffer.wrap(match);
int x = 0;
for (; cb.remaining() > 0;) {
if (!cb.subSequence(0, size < cb.remaining() ? size : cb.remaining()).equals(word)) {
cb.put(x, '+');
cb.clear().position(++x);
} else {
cb.clear().position(x = x + size);
}
}
return cb.clear().toString();
}

To make this work you need a beast of a pattern. Let's say you you are operating on the following test case as an example:
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
What you need to do is build a series of clauses in your pattern to match a single character at a time:
Any character that is NOT "X", "Y" or "Z" -- [^XYZ]
Any "X" not followed by "YZ" -- X(?!YZ)
Any "Y" not preceded by "X" -- (?<!X)Y
Any "Y" not followed by "Z" -- Y(?!Z)
Any "Z" not preceded by "XY" -- (?<!XY)Z
An example of this replacement can be found here: https://regex101.com/r/jK5wU3/4
Here is an example of how this might work (most certainly not optimized, but it works):
import java.util.regex.Pattern;
public class Test {
public static void plusOut(String text, String exclude) {
StringBuilder pattern = new StringBuilder("");
for (int i=0; i<exclude.length(); i++) {
Character target = exclude.charAt(i);
String prefix = (i > 0) ? exclude.substring(0, i) : "";
String postfix = (i < exclude.length() - 1) ? exclude.substring(i+1) : "";
// add the look-behind (?<!X)Y
if (!prefix.isEmpty()) {
pattern.append("(?<!").append(Pattern.quote(prefix)).append(")")
.append(Pattern.quote(target.toString())).append("|");
}
// add the look-ahead X(?!YZ)
if (!postfix.isEmpty()) {
pattern.append(Pattern.quote(target.toString()))
.append("(?!").append(Pattern.quote(postfix)).append(")|");
}
}
// add in the other character exclusion
pattern.append("[^" + Pattern.quote(exclude) + "]");
System.out.println(text.replaceAll(pattern.toString(), "+"));
}
public static void main(String [] args) {
plusOut("12xy34", "xy");
plusOut("12xy34", "1");
plusOut("12xy34xyabcxy", "xy");
plusOut("abXYabcXYZ", "ab");
plusOut("abXYabcXYZ", "abc");
plusOut("abXYabcXYZ", "XY");
plusOut("abXYxyzXYZ", "XYZ");
plusOut("--++ab", "++");
plusOut("aaxxxxbb", "xx");
plusOut("123123", "3");
}
}
UPDATE: Even this doesn't quite work because it can't deal with exclusions that are just repeated characters, like "xx". Regular expressions are most definitely not the right tool for this, but I thought it might be possible. After poking around, I'm not so sure a pattern even exists that might make this work.

The problem in your solution that you put a set of instance string str.replaceAll("[^str]","+") which it will exclude any character from the variable str and that will not solve your problem
EX: when you try str.replaceAll("[^XYZ]","+") it will exclude any combination of character X , character Y and character Z from your replacing method so you will get "++XY+++XYZ".
Actually you should exclude a sequence of characters instead in str.replaceAll.
You can do it by using capture group of characters like (XYZ) then use a negative lookahead to match a string which does not contain characters sequence : ^((?!XYZ).)*$
Check this solution for more info about this problem but you should know that it may be complicated to find regular expression to do that directly.
I have found two simple solutions for this problem :
Solution 1:
You can implement a method to replace all characters with '+' except the instance of given string:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
for(int i = 0; i < str.length(); i++){
// exclude any instance string of exWord from replacing process in str
if(str.substring(i, str.length()).indexOf(exWord) + i == i){
i = i + exWord.length()-1;
}
else{
str = str.substring(0,i) + "+" + str.substring(i+1);//replace each character with '+' symbol
}
}
Note : str.substring(i, str.length()).indexOf(exWord) + i this if statement will exclude any instance string of exWord from replacing process in str.
Output:
+++++++XYZ
Solution 2:
You can try this Approach using ReplaceAll method and it doesn't need any complex regular expression:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
str = str.replaceAll(exWord,"*"); // replace instance string with * symbol
str = str.replaceAll("[^*]","+"); // replace all characters with + symbol except *
str = str.replaceAll("\\*",exWord); // replace * symbol with instance string
Note : This solution will work only if your input string str doesn't contain any * symbol.
Also you should escape any character with a special meaning in a regular expression in phrase instance string exWord like : exWord = "++".

How do I get the last substring starting with +,-,*,/?

If I have expression in a string variable like this 20+567-321, so how can I extract last number 321 from it where operator can be +,-,*,/
If the string expression is just 321, I have to get 321, here there is no operator in the expression

You can do this by splitting your string based on your operators as following:
String[] result = myString.split("[-+*/]");
[+|-|*|/] is Regex that specifies the points from where your string should be split. Here, result[result.length-1] is your required string.
EDIT
As suggested by #ElliotFrisch we need to escape - in regex while specifying it. So following pattern should also work:
String[] result = myString.split("[+|\\-|*|/]");
Here is the list of characters they need to be escaped.
Link.

This seems to be an assignment for learning programming and algo, and also I doubt splitting using Regex would be efficient in a case where only last substring is required.
Start from end, and iterate until the length of the string times.
Declare a empty string say Result
While looping, if any of those operator is found, return Result, else prepend the traversed character to the string Result.
Return Result

String[] output = s.split("[+-/*]");
String ans = output[output.length-1];
Assumption here that there will be no spaces and the string contains only numbers and arithmetic operators.
[+-/*] is a regular expression that matches only the characters we provide inside the square brackets. We are splitting based on those characters.

If you wanna do it with StringTokenizer:
public static void main(String args[])
{
String expression = "20+567-321";
StringTokenizer tokenizer = new StringTokenizer(expression, "+-*/");
int count = tokenizer.countTokens();
if( count > 0){
for(int i=0; i< count; i++){
if(i == count - 1 ){
System.out.println(tokenizer.nextToken());
}else{
tokenizer.nextToken();
}
}
}
}
Recall you can specify multiple delimiters in StringTokenizer.

Replace parts of a string in Java

I need to replace parts of a string by looking up the System properties.
For example, consider the string It was {var1} beauty killed {var2}
I need to parse the string, and replace all the words contained within the parenthesis by looking up their value in System properties. If System.getProperty() returns null, then simply replace with empty character. This is pretty straightforward when I know the variables well ahead. But the string that I need to parse is not defined ahead. I wouldn't know how many number of variables are in the string and what the variable names are. Assuming a simple, well formatted string (no nested parenthesis, open - close matches), what is the simplest or the most elegant way to parse through the string and replace all the character sequences that are enclosed in the parenthesis?
Only solution I could come up with is to traverse the string from the first character, note down the positions of the start and end positions of the parenthesis, replace the string between them, and then continue until reaching the end of the string. Is there simpler way to do this?

You can use the parentheses to break the initial string into substrings, and then replace every other substring.
String[] substituteValues = {"the", "str", "other", "another"};
int substituteValuesIndex = 0;
String test = "Here is {var1} string called {var2}";
// split the string up into substrings
test = test.replaceAll("\\}", "\\{");
String[] splitString = test.split("\\{");
// now sub in your values
for (int k=1; k < splitString.length; k = k+2) {
splitString[k] = substituteValues[substituteValuesIndex];
substituteValuesIndex++;
}
String result = "";
for (String s : splitString) {
result = result + s;
}

String manipulation of function names

For this Kata, i am given random function names in the PEP8 format and i am to convert them to camelCase.
(input)get_speed == (output)getSpeed ....
(input)set_distance == (output)setDistance
I have a understanding on one way of doing this written in pseudo-code:
loop through the word,
if the letter is an underscore
then delete the underscore
then get the next letter and change to a uppercase
endIf
endLoop
return the resultant word
But im unsure the best way of doing this, would it be more efficient to create a char array and loop through the element and then when it comes to finding an underscore delete that element and get the next index and change to uppercase.
Or would it be better to use recursion:
function camelCase takes a string
if the length of the string is 0,
then return the string
endIf
if the character is a underscore
then change to nothing,
then find next character and change to uppercase
return the string taking away the character
endIf
finally return the function taking the first character away
Any thoughts please, looking for a good efficient way of handing this problem. Thanks :)

I would go with this:
divide given String by underscore to array
from second word until end take first letter and convert it to uppercase
join to one word
This will work in O(n) (go through all names 3 time). For first case, use this function:
str.split("_");
for uppercase use this:
String newName = substring(0, 1).toUpperCase() + stre.substring(1);
But make sure you check size of the string first...
Edited - added implementation
It would look like this:
public String camelCase(String str) {
if (str == null ||str.trim().length() == 0) return str;
String[] split = str.split("_");
String newStr = split[0];
for (int i = 1; i < split.length; i++) {
newStr += split[i].substring(0, 1).toUpperCase() + split[i].substring(1);
}
return newStr;
}
for inputs:
"test"
"test_me"
"test_me_twice"
it returns:
"test"
"testMe"
"testMeTwice"

It would be simpler to iterate over the string instead of recursing.
String pep8 = "do_it_again";
StringBuilder camelCase = new StringBuilder();
for(int i = 0, l = pep8.length(); i < l; ++i) {
if(pep8.charAt(i) == '_' && (i + 1) < l) {
camelCase.append(Character.toUpperCase(pep8.charAt(++i)));
} else {
camelCase.append(pep8.charAt(i));
}
}
System.out.println(camelCase.toString()); // prints doItAgain

The question you pose is whether to use an iterative or a recursive approach. For this case I'd go for the recursive approach because it's straightforward, easy to understand doesn't require much resources (only one array, no new stackframe etc), though that doesn't really matter for this example.
Recursion is good for divide-and-conquer problems, but I don't see that fitting the case well, although it's possible.
An iterative implementation of the algorithm you described could look like the following:
StringBuilder buf = new StringBuilder(input);
for(int i = 0; i < buf.length(); i++){
if(buf.charAt(i) == '_'){
buf.deleteCharAt(i);
if(i != buf.length()){ //check fo EOL
buf.setCharAt(i, Character.toUpperCase(buf.charAt(i)));
}
}
}
return buf.toString();
The check for the EOL is not part of the given algorithm and could be ommitted, if the input string never ends with '_'

Java - removing first character of a string

In Java, I have a String:
Jamaica
I would like to remove the first character of the string and then return amaica
How would I do this?

const str = "Jamaica".substring(1)
console.log(str)
Use the substring() function with an argument of 1 to get the substring from position 1 (after the first character) to the end of the string (leaving the second argument out defaults to the full length of the string).

public String removeFirstChar(String s){
return s.substring(1);
}

In Java, remove leading character only if it is a certain character
Use the Java ternary operator to quickly check if your character is there before removing it. This strips the leading character only if it exists, if passed a blank string, return blankstring.
String header = "";
header = header.startsWith("#") ? header.substring(1) : header;
System.out.println(header);
header = "foobar";
header = header.startsWith("#") ? header.substring(1) : header;
System.out.println(header);
header = "#moobar";
header = header.startsWith("#") ? header.substring(1) : header;
System.out.println(header);
Prints:
blankstring
foobar
moobar
Java, remove all the instances of a character anywhere in a string:
String a = "Cool";
a = a.replace("o","");
//variable 'a' contains the string "Cl"
Java, remove the first instance of a character anywhere in a string:
String b = "Cool";
b = b.replaceFirst("o","");
//variable 'b' contains the string "Col"

Use substring() and give the number of characters that you want to trim from front.
String value = "Jamaica";
value = value.substring(1);
Answer: "amaica"

You can use the substring method of the String class that takes only the beginning index and returns the substring that begins with the character at the specified index and extending to the end of the string.
String str = "Jamaica";
str = str.substring(1);

substring() method returns a new String that contains a subsequence of characters currently contained in this sequence.
The substring begins at the specified start and extends to the character at index end - 1.
It has two forms. The first is
String substring(int FirstIndex)
Here, FirstIndex specifies the index at which the substring will
begin. This form returns a copy of the substring that begins at
FirstIndex and runs to the end of the invoking string.
String substring(int FirstIndex, int endIndex)
Here, FirstIndex specifies the beginning index, and endIndex specifies
the stopping point. The string returned contains all the characters
from the beginning index, up to, but not including, the ending index.
Example
String str = "Amiyo";
// prints substring from index 3
System.out.println("substring is = " + str.substring(3)); // Output 'yo'

you can do like this:
String str = "Jamaica";
str = str.substring(1, title.length());
return str;
or in general:
public String removeFirstChar(String str){
return str.substring(1, title.length());
}

public String removeFirst(String input)
{
return input.substring(1);
}

The key thing to understand in Java is that Strings are immutable -- you can't change them. So it makes no sense to speak of 'removing a character from a string'. Instead, you make a NEW string with just the characters you want. The other posts in this question give you a variety of ways of doing that, but its important to understand that these don't change the original string in any way. Any references you have to the old string will continue to refer to the old string (unless you change them to refer to a different string) and will not be affected by the newly created string.
This has a number of implications for performance. Each time you are 'modifying' a string, you are actually creating a new string with all the overhead implied (memory allocation and garbage collection). So if you want to make a series of modifications to a string and care only about the final result (the intermediate strings will be dead as soon as you 'modify' them), it may make more sense to use a StringBuilder or StringBuffer instead.

I came across a situation where I had to remove not only the first character (if it was a #, but the first set of characters.
String myString = ###Hello World could be the starting point, but I would only want to keep the Hello World. this could be done as following.
while (myString.charAt(0) == '#') { // Remove all the # chars in front of the real string
myString = myString.substring(1, myString.length());
}
For OP's case, replace while with if and it works aswell.

You can simply use substring().
String myString = "Jamaica"
String myStringWithoutJ = myString.substring(1)
The index in the method indicates from where we are getting the result string, in this case we are getting it after the first position because we dont want that "J" in "Jamaica".

Another solution, you can solve your problem using replaceAll with some regex ^.{1} (regex demo) for example :
String str = "Jamaica";
int nbr = 1;
str = str.replaceAll("^.{" + nbr + "}", "");//Output = amaica

My version of removing leading chars, one or multiple. For example, String str1 = "01234", when removing leading '0', result will be "1234". For a String str2 = "000123" result will be again "123". And for String str3 = "000" result will be empty string: "". Such functionality is often useful when converting numeric strings into numbers.The advantage of this solution compared with regex (replaceAll(...)) is that this one is much faster. This is important when processing large number of Strings.
public static String removeLeadingChar(String str, char ch) {
int idx = 0;
while ((idx < str.length()) && (str.charAt(idx) == ch))
idx++;
return str.substring(idx);
}

##KOTLIN
#Its working fine.
tv.doOnTextChanged { text: CharSequence?, start, count, after ->
val length = text.toString().length
if (length==1 && text!!.startsWith(" ")) {
tv?.setText("")
}
}

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Formatting String Array efficiently in Java - java

Related

Java regex: Replace all characters with `+` except instances of a given string

How do I get the last substring starting with +,-,*,/?

Replace parts of a string in Java

String manipulation of function names

Java - removing first character of a string

Categories

Resources