I have the following code which takes more than 5 seconds to run with the argument -Xmx<1024M>.
I am aware that the for loop runs O(q) times, and that reverse() and toString() each take O(n) time.
Is there a way to reverse the string in less than O(n) time? Or is something else slowing the code down? Any help would be welcome!
class Main {
public static void main(String[] args){
String s = "a";
String qa = "200000";
int q = Integer.parseInt(qa);
String[] t = new String[q];
for(int i = 0; i < q; i++) {
if(i%2==0) {t[i] = "2 1 x";}
if(i%2==1) {t[i] = "1";}
if(t[i].toCharArray()[0] == '1') {
StringBuilder rev = new StringBuilder(s).reverse();
s = rev.toString();
} else {
char letter = t[i].toCharArray()[4];
if(t[i].toCharArray()[2] == '1') {
s = letter + s;
} else {
s = s + letter;
}
}
}
System.out.println(s);
}
}
Regardless of what it is supposed to do (I have no idea), I found the following problems:
Multiple instantiations of StringBuilder, one in each iteration.
String concatenation using the + operator.
Repetitive usage of String::toCharArray (see the char[][] solution below)
You will achieve a faster result using directly only one instance of StringBuilder:
String s = "a";
String qa = "200000";
int q = Integer.parseInt(qa);
String[] t = new String[q];
StringBuilder sb = new StringBuilder(s); // Instantiate before the loop
for (int i = 0; i < q; i++) {
if(i%2==0) {t[i] = "2 1 x";}
if(i%2==1) {t[i] = "1";}
if(t[i].toCharArray()[0] == '1') {
sb.reverse(); // all you did here is just reversing 's'
} else {
char letter = t[i].toCharArray()[4];
if(t[i].toCharArray()[2] == '1') {
sb.insert(0, letter); // prepend a letter
} else {
sb.append(letter); // append a letter
}
}
}
Another thing is that you repeatedly define a String such as t[i] = "2 1 x"; and then compare with t[i].toCharArray()[0]. Predefining these immutable values and using char[][] should help too:
String s = "a";
String qa = "200000";
int q = Integer.parseInt(qa);
char[][] t = new char[q][]; // char[][] instead of String[]
char[] char21x = new char[]{'2', '1', 'x'}; // predefined array
char[] char1 = new char[]{'1'}; // another predefined array
StringBuilder sb = new StringBuilder(s); // Instantiate before the loop
for (int i = 0; i < q; i++) {
if(i%2==0) {t[i] = char21x;} // first reuse
if(i%2==1) {t[i] = char1;} // second reuse
if(t[i][0] == '1') { // instead of String::toCharArray, mind the indices
sb.reverse(); // all you did here is just reversing 's'
} else {
char letter = t[i][2]; // instead of String::toCharArray, mind the indices
if(t[i][1] == '1') {
sb.insert(0, letter); // prepend a letter
} else {
sb.append(letter); // append a letter
}
}
}
Edit: I have tested the solution with the simplest way possible using a difference of System.currentTimeMillis() on my laptop:
Original solution: 7.658, 6.899 and 7.046 seconds
2nd solution: 3.288, 3.691 and 3.158 seconds
3rd solution: 2.717, 2.966 and 2.717 seconds
Conclusion: I see no way to improve the algorithm itself in terms of computational complexity; however, treating Strings the correct way reduces the running time by a factor of 2-3 (in my case).
General advice: What you can instantiate and define before the loop, do it before the loop.
Is there a way to reverse the string in less than O(n) time? Or is something else slowing the code down?
No, there is no way to reverse a string in less than O(n) time: a program that produces an output of size n necessarily takes Ω(n) time, that is, at least time proportional to n.
Your code has lots of unnecessary operations that slow the program down. The program produces 50000 letters x, followed by one letter a, followed by another 50000 letters x. Here is a much faster (and easier to understand) implementation of the same program.
class Faster {
public static void main(String[] args) {
String hundredXs = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
for (int i = 0; i < 500; i++)
System.out.print(hundredXs);
System.out.print("a");
for (int i = 0; i < 500; i++)
System.out.print(hundredXs);
System.out.println();
}
}
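Although a single reversal cannot beat O(n), the loop does not have to perform a real reversal for every "1" operation. The following is a minimal sketch, assuming the operation format from the question ("1" reverses, "2 1 c" prepends c, "2 2 c" appends c). The class name LazyReverse and the method apply are hypothetical. It keeps the characters in a deque and only remembers the current orientation in a flag, so each operation costs O(1) and the string is built once at the end.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

class LazyReverse {
    // Applies operations of the form "1" (reverse) or "2 F c"
    // (F = 1: prepend c, F = 2: append c) to the start string.
    static String apply(String start, String[] ops) {
        Deque<Character> deque = new ArrayDeque<>();
        for (char c : start.toCharArray()) {
            deque.addLast(c);
        }
        // true when the logical string is the deque read back to front
        boolean reversed = false;
        for (String op : ops) {
            if (op.charAt(0) == '1') {
                reversed = !reversed; // O(1): only remember the orientation
            } else {
                char letter = op.charAt(4);
                boolean prepend = op.charAt(2) == '1';
                // Prepending to a reversed string means appending to the stored one.
                if (prepend != reversed) {
                    deque.addFirst(letter);
                } else {
                    deque.addLast(letter);
                }
            }
        }
        // Materialize the string once, in the current orientation.
        StringBuilder sb = new StringBuilder(deque.size());
        Iterator<Character> it = reversed ? deque.descendingIterator() : deque.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
        }
        return sb.toString();
    }
}
```

For example, LazyReverse.apply("a", new String[]{"2 1 x", "1"}) prepends x and then reverses, giving "ax". The trade-off is boxing every char into a Character, which costs memory but keeps the sketch short.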
So i'm making a program that removes duplicate letters in a string. The last step of it is updating the old string to the new string, and looping through the new string. I believe everything works besides the looping through the new string part. Any ideas what might be causing it to not work? It will work as intended for one pass through, and then after that it won't step through the new loop
public class homework20_5 {
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
int i;
for (i = 0; i < kb.length(); i++) {
char temp = kb.charAt(i);
if(temp == kb.charAt(i+1)) {
kb = kb.replace(""+temp, "");
i = kb.length() + i;
}
}
System.out.println(kb);
}
}
Instead of using complex algorithms and loops like this, you can just use a LinkedHashSet, which keeps its elements in insertion order like a list but does not allow duplicate elements.
private static String removeDuplicateWords(String str) {
HashSet<Character> xChars = new LinkedHashSet<>();
for(char c: str.toCharArray()) {
xChars.add(c);
}
StringBuilder sb = new StringBuilder();
for (char c: xChars) {
sb.append(c);
}
return sb.toString();
}
So you actually want to remove all occurrences that appear more than once entirely and not just the duplicate appearances (while preserving one instance)?
"Yea that’s exactly right "
In that case your idea won't cut it because your duplicate letter detection can only detect continuous sequences of duplicates. A very simple way would be to use 2 sets in order to identify unique letters in one pass.
public class RemoveLettersSeenMultipleTimes {
public static void main(String []args){
String input = "abcabdgag";
Set<Character> lettersSeenOnce = lettersSeenOnceIn(input);
StringBuilder output = new StringBuilder();
for (Character c : lettersSeenOnce) {
output.append(c);
}
System.out.println(output);
}
private static Set<Character> lettersSeenOnceIn(String input) {
Set<Character> seenOnce = new LinkedHashSet<>();
Set<Character> seenMany = new HashSet<>();
for (Character c : input.toCharArray()) {
if (seenOnce.contains(c)) {
seenMany.add(c);
seenOnce.remove(c);
continue;
}
if (!seenMany.contains(c)) {
seenOnce.add(c);
}
}
return seenOnce;
}
}
There are a few problems here:
Problem 1
for (i = 0; i < kb.length(); i++) {
should be
for (i = 0; i < kb.length() - 1; i++) {
Because this
if (temp == kb.charAt(i+1))
will explode with a StringIndexOutOfBoundsException otherwise.
Problem 2
Delete this line:
i = kb.length() + i;
I don't understand what the intention is there, but nevertheless it must be deleted.
Problem 3
Rather than lots of code, there's a one-line solution:
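String deduped = kb.replaceAll("[" + kb.replaceAll("(.)(?=.*\\1)|.", "$1") + "]", "");
This works by:
finding all dupe chars via kb.replaceAll("(.)(?=.*\\1)|.", "$1"), which in turn works by consuming every character, either capturing it as group 1 if the same character occurs again later or just consuming it otherwise
building a regex character class from the dupes, which is used to delete them all (replace with a blank)
Note that if kb contains no duplicates, the character class is empty ("[]"), which is not a valid regex and throws a PatternSyntaxException, so guard against that case. Characters that are special inside a character class (such as ] or ^) would also need escaping.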
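The two steps above can also be sketched without building a character class at all. This is an illustrative variant, not the answer's exact one-liner: the class name RegexDedupe and the method removeRepeated are hypothetical, and the (?s) flag is an addition so that . also matches line breaks. The first regex still collects the duplicated characters, and a plain loop then drops them.

```java
class RegexDedupe {
    // Removes every character that occurs more than once in kb.
    static String removeRepeated(String kb) {
        // Keeps a character only if the same character occurs again later;
        // for all other characters group 1 is unset and "$1" inserts nothing.
        String dupes = kb.replaceAll("(?s)(.)(?=.*\\1)|.", "$1");
        StringBuilder out = new StringBuilder(kb.length());
        for (char c : kb.toCharArray()) {
            if (dupes.indexOf(c) < 0) { // c was never seen twice
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With the input "abcabdgag" the intermediate string of dupes is "abag", so a, b and g are removed and the result is "cd". Because no regex is built from the input, an input without duplicates or with regex-special characters needs no special handling.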
Say you feed the program with the input "AAABBC", then the expected output should be "ABC".
Now in the for-loop, i gets incremented from 0 to 5.
After 1st iteration:
kb becomes BBC (replace removes every A, not just one), and i becomes 3 + 0 = 3 and gets incremented to 4.
And now the condition for the for-loop is i < kb.length(), which evaluates to 4 < 3 and is false. Hence the for-loop ends after just one iteration.
So the problematic line of code is i = kb.length() + i; and also the loop condition keeps changing as the size of kb changes.
I would suggest using a while loop like the following example if you don't worry too much about the efficiency.
public static void main(String[] arg) {
String kb = "XYYYXAC";
int i = 0;
while (i < kb.length()) {
char temp = kb.charAt(i);
for (int j = i + 1; j < kb.length(); j++) {
char dup = kb.charAt(j);
if (temp == dup) {
kb = removeCharByIndex(kb, j);
j--;
}
}
i++;
}
System.out.println(kb);
}
private static String removeCharByIndex(String str, int index) {
return new StringBuilder(str).deleteCharAt(index).toString();
}
Output: XYAC
EDIT: I misunderstood your requirements. So looking at the above comments, you want all the duplicates and the target character removed. So the above code can be changed like this.
public static void main(String[] arg) {
String kb = "XYYYXAC";
int i = 0;
while (i < kb.length()) {
char temp = kb.charAt(i);
boolean hasDup = false;
for (int j = i + 1; j < kb.length(); j++) {
if (temp == kb.charAt(j)) {
hasDup = true;
kb = removeCharByIndex(kb, j);
j--;
}
}
if (hasDup) {
kb = removeCharByIndex(kb, i);
i--;
}
i++;
}
System.out.println(kb);
}
private static String removeCharByIndex(String str, int index) {
return new StringBuilder(str).deleteCharAt(index).toString();
}
Output: AC
Although, this is not the best and definitely not an efficient solution to this, I think you can get the idea of iterating the input string character by character and removing it if it has duplicates.
The following answer concerns only the transformation of XYYYXACX to ACX. If we wanted to have AC, it's a whole different answer. The other answers already speak about it, and I'll invite you to consult the contains method of String too.
We should consider avoiding -most of the time- modifying the things we iterate. Using a temporary variable could be a kind of solution. To use it, we could change our mindset. Instead of erasing the undesired letters, we can save the ones we want.
To identify the desired character, we need to test if all surrounding letters are different from the tested one. It'll be the opposite of what you did with if(temp == kb.charAt(i+1)) { like if(temp != kb.charAt(i+1)) {. But considering that the tested string will not change anymore, we will need to test the previous letter too as if(temp != kb.charAt(i-1) && temp != kb.charAt(i+1)) {.
As previously said, once we have identified the letter, we will keep the value with a temporary variable. That will lead to replace kb = kb.replace(""+temp, ""); by buffer = buffer + temp; if buffer is our temporary variable initialized with an empty string (Aka. String buffer = "";). In the end, we could override our base value with the temporary one.
At this step, we will have:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
String buffer = "";
int i;
for (i = 1; i < kb.length(); i++) {
char temp = kb.charAt(i);
if(temp != kb.charAt(i-1) && temp != kb.charAt(i+1)) {
buffer = buffer + temp;
}
}
kb = buffer;
System.out.println(kb);
}
Sadly, that will not work, because it tries to access invalid indexes of our string. We need to treat the first and the last letter specially, because each of them has only one neighbor. For these letters, we will make only one comparison. We can do that inside or outside the loop; for clarity, we will do it outside.
For the first one, it will look like if (kb.charAt(0) != kb.charAt(1)) { and for the last one like if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {. The body of the condition will remain the same as the one in the loop.
Once done, we will reduce the scope of our loop to exclude these characters with for (i = 1; i < (kb.length() - 1); i++) {.
Now we will have something working, but only for one iteration:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
String buffer = "";
int i;
if (kb.charAt(0) != kb.charAt(1)) {
buffer = buffer + kb.charAt(0);
}
for (i = 1; i < (kb.length() - 1); i++) {
char temp = kb.charAt(i);
if(temp != kb.charAt(i-1) && temp != kb.charAt(i+1)) {
buffer = buffer + temp;
}
}
if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {
buffer = buffer + kb.charAt(kb.length() - 1);
}
kb = buffer;
System.out.println(kb);
}
XYYYXACX will become XXACX.
That said, our index problem can occur again if the string has only one letter. In that case none of this is needed, because a single letter obviously cannot have a duplicate. So we should wrap the whole thing in a check that ensures we have at least two letters:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
if (kb.length() >= 2) {
String buffer = "";
int i;
if (kb.charAt(0) != kb.charAt(1)) {
buffer = buffer + kb.charAt(0);
}
for (i = 1; i < (kb.length() - 1); i++) {
char temp = kb.charAt(i);
if (temp != kb.charAt(i - 1) && temp != kb.charAt(i + 1)) {
buffer = buffer + temp;
}
}
if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {
buffer = buffer + kb.charAt(kb.length() - 1);
}
kb = buffer;
}
System.out.println(kb);
}
The last thing to do is to repeat this treatment until no undesired letters remain. For this task, do { ... } while ( ... ) seems perfect. For the loop condition we can compare string sizes: when the size of the string from the previous iteration equals the size of the temporary variable, we know that we have finished.
We need to perform this comparison before assigning the value of our temporary variable to the base one. Otherwise, the two sizes will always be the same.
In the end, the following thing should be a potential solution:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
boolean modified;
do {
modified = false;
if (kb.length() >= 2) {
String buffer = "";
int i;
if (kb.charAt(0) != kb.charAt(1)) {
buffer = buffer + kb.charAt(0);
}
for (i = 1; i < (kb.length() - 1); i++) {
char temp = kb.charAt(i);
if (temp != kb.charAt(i - 1) && temp != kb.charAt(i + 1)) {
buffer = buffer + temp;
}
}
if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {
buffer = buffer + kb.charAt(kb.length() - 1);
}
modified = (kb.length() != buffer.length());
kb = buffer;
}
} while (modified);
System.out.println(kb);
}
Take note that this code is ugly for the sole purpose of the explanation. We should refactor this code. We can improve it a lot for the sake of brevity and, why not, performance.
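One possible refactoring, as a sketch only: the class name AdjacentRuns and its two methods are hypothetical. It relies on the observation that a letter whose neighbors both differ from it is exactly a letter whose run of equal adjacent letters has length 1, so a single pass can skip whole runs, and a StringBuilder replaces the repeated String concatenation.

```java
class AdjacentRuns {
    // One pass: keeps only the characters whose run of equal
    // adjacent characters has length 1.
    static String dropRuns(String s) {
        StringBuilder kept = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            int j = i + 1;
            // advance j to the end of the run that starts at i
            while (j < s.length() && s.charAt(j) == s.charAt(i)) {
                j++;
            }
            if (j - i == 1) {
                kept.append(s.charAt(i));
            }
            i = j;
        }
        return kept.toString();
    }

    // Repeats the pass until the length stops changing.
    static String reduce(String s) {
        String next = dropRuns(s);
        while (next.length() != s.length()) {
            s = next;
            next = dropRuns(s);
        }
        return next;
    }
}
```

AdjacentRuns.reduce("XYYYXACX") first yields XXACX and then ACX, matching the walkthrough above. Empty and one-letter inputs need no special case because the run scan never reads past the end of the string.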
How would I modify this for an O(n) time complexity? Basically, my program just finds the palindrome string and outputs a statement based on the findings.
class ChkPalindrome
{
public static void main(String args[])
{
String str, rev = "";
Scanner sc = new Scanner(System.in);
System.out.println("Enter a string:");
str = sc.nextLine();
int length = str.length();
for ( int i = length - 1; i >= 0; i-- )
rev = rev + str.charAt(i);
if (str.equals(rev))
System.out.println(str+" is a palindrome");
else
System.out.println(str+" is not a palindrome");
}
}
rev = rev + str.charAt(i) is the culprit. This statement by itself takes O(N) time, N being the size of rev.
Given that this line is inside a for loop that runs N times, your code is O(N^2) time.
The fix is the same fix for when you want to append to a string inside loops: Don't do that, use StringBuilder instead:
StringBuilder reverse = new StringBuilder();
for (int i = length - 1; i >= 0; i--) {
reverse.append(str.charAt(i));
}
if (str.contentEquals(reverse)) {
// palindrome
} else {
// nope
}
Or, even faster (by a constant factor plus an early exit, both of which make a large difference in real life but don't change the O(N) bound): Simply compare the first and last character, then the second and second-to-last, and so on, returning false immediately when you see a mismatch. Also break up your methods (you should have one method that determines whether a string is a palindrome, and a separate one that prints the result). Something like:
boolean isPalindrome(String str) {
int len = str.length(), mid = len / 2;
for (int i = 0; i < mid; i++) {
char a = str.charAt(i);
char b = str.charAt(len - 1 - i); // the last valid index is len - 1
if (a != b) return false;
}
return true;
}
// and then the code you have now simply becomes...
if (isPalindrome(str)) {
System.out.println(str + " is a palindrome");
} else {
System.out.println(str + " is not a palindrome");
}
Separate methods make them shorter, easier to read and understand, more re-usable, and easier to test.
The StringBuilder reverse operation takes O(n) and the String equals method takes O(n). Therefore the code below has O(n) complexity.
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
System.out.println("Enter a string:");
String s = in.nextLine();
String reverse = new StringBuilder(s).reverse().toString();
if(s.equals(reverse)){
System.out.println(s+" is a palindrome");
}
else {
System.out.println(s+" is not a palindrome");
}
}
public static boolean isPalindrome(String str) {
for(int i = 0, j = str.length() - 1; i < j; i++, j--)
if(str.charAt(i) != str.charAt(j))
return false;
return true;
}
So I'm creating a program that will output the first character of a string and then the first character of another string. Then the second character of the first string and the second character of the second string, and so on.
I created what is below, I was just wondering if there is an alternative to this using a loop or something rather than substring
public class Whatever
{
public static void main(String[] args)
{
System.out.println (interleave ("abcdefg", "1234"));
}
public static String interleave(String you, String me)
{
if (you.length() == 0) return me;
else if (me.length() == 0) return you;
return you.substring(0,1) + interleave(me, you.substring(1));
}
}
OUTPUT: a1b2c3d4efg
Well, if you really don't want to use substrings, you can use String's toCharArray() method, then you can use a StringBuilder to append the chars. With this you can loop through each of the array's indices.
Doing so, this would be the outcome:
public static String interleave(String you, String me) {
char[] a = you.toCharArray();
char[] b = me.toCharArray();
StringBuilder out = new StringBuilder();
int maxLength = Math.max(a.length, b.length);
for( int i = 0; i < maxLength; i++ ) {
if( i < a.length ) out.append(a[i]);
if( i < b.length ) out.append(b[i]);
}
return out.toString();
}
Your code is efficient enough as it is, though. This can be an alternative, if you really want to avoid substrings.
This is a loop implementation (not handling null value, just to show the logic):
public static String interleave(String you, String me) {
StringBuilder result = new StringBuilder();
for (int i = 0 ; i < Math.max(you.length(), me.length()) ; i++) {
if (i < you.length()) {
result.append(you.charAt(i));
}
if (i < me.length()) {
result.append(me.charAt(i));
}
}
return result.toString();
}
The solution I am proposing is based on the expected output. In your particular case, consider using the split method of String, since you are interleaving one character at a time.
So do something like this,
String[] xs = "abcdefg".split("");
String[] ys = "1234".split("");
Now loop over the larger array and interleave, making sure that you perform a length check on the smaller one before accessing it.
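The loop described above could look roughly like this. It is a sketch that assumes Java 8 or later, where split("") yields one single-character string per character; the class name SplitInterleave is hypothetical.

```java
class SplitInterleave {
    static String interleave(String you, String me) {
        String[] xs = you.split(""); // one element per character
        String[] ys = me.split("");
        StringBuilder out = new StringBuilder(you.length() + me.length());
        int longer = Math.max(xs.length, ys.length);
        for (int i = 0; i < longer; i++) {
            // length checks protect the shorter array
            if (i < xs.length) out.append(xs[i]);
            if (i < ys.length) out.append(ys[i]);
        }
        return out.toString();
    }
}
```

SplitInterleave.interleave("abcdefg", "1234") returns "a1b2c3d4efg". Splitting allocates one String per character, so for long inputs charAt on the original strings is likely to be cheaper.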
To implement this as a loop you would have to maintain the current position and keep adding characters until one string runs out, then tack the rest of the other one on. Larger strings should use a StringBuilder. Something like this:
int i = 0;
String result = "";
while(i < you.length() && i < me.length())
{
// the leading "" forces string concatenation; char + char alone would add the two character codes
result += "" + you.charAt(i) + me.charAt(i);
i++;
}
if(i == you.length())
result += me.substring(i);
else
result += you.substring(i);
An improved (in some sense) version of @BenjaminBoutier's answer.
StringBuilder is the most efficient way to concatenate Strings.
public static String interleave(String you, String me) {
StringBuilder result = new StringBuilder();
int min = Math.min(you.length(), me.length());
String longest = you.length() > me.length() ? you : me;
int i = 0;
while (i < min) { // mix characters
result.append(you.charAt(i));
result.append(me.charAt(i));
i++;
}
while (i < longest.length()) { // add the remaining characters of the longest string
result.append(longest.charAt(i));
i++;
}
return result.toString();
}
I have no idea how to start my assignment.
We got to make a Run-length encoding program,
for example, the users enters this string:
aaaaPPPrrrrr
is replaced with
4a3P5r
Can someone help me get started with it?
Hopefully this will get you started on your assignment:
The fundamental idea behind run-length encoding is that consecutively occurring tokens like aaaa can be replaced by a shorter form 4a (meaning "the following four characters are an 'a'"). This type of encoding was used in the early days of computer graphics to save space when storing an image. Back then, video cards supported a small number of colors and images commonly had the same color all in a row for significant portions of the image.
You can read up on it in detail on Wikipedia
http://en.wikipedia.org/wiki/Run-length_encoding
In order to run-length encode a string, you can loop through the characters in the input string. Have a counter that counts how many times you have seen the same character in a row. When you then see a different character, output the value of the counter and then the character you have been counting. If the value of the counter is 1 (meaning you only saw one of those characters in a row) skip outputting the counter.
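A minimal sketch of exactly the procedure described in the paragraph above, including the rule that a count of 1 is not written. The class name RunLength is hypothetical, and the sketch assumes the input contains no digits, since a digit in the text could not be told apart from a count when decoding.

```java
class RunLength {
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int count = 1;
            // count how many times c repeats in a row starting at i
            while (i + count < text.length() && text.charAt(i + count) == c) {
                count++;
            }
            if (count > 1) {
                out.append(count); // a count of 1 is skipped
            }
            out.append(c);
            i += count; // jump to the first character of the next run
        }
        return out.toString();
    }
}
```

RunLength.encode("aaaaPPPrrrrr") returns "4a3P5r", and RunLength.encode("abbc") returns "a2bc".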
public String runLengthEncoding(String text) {
String encodedString = "";
for (int i = 0, count = 1; i < text.length(); i++) {
if (i + 1 < text.length() && text.charAt(i) == text.charAt(i + 1))
count++;
else {
encodedString = encodedString.concat(Integer.toString(count))
.concat(Character.toString(text.charAt(i)));
count = 1;
}
}
return encodedString;
}
Try this one out.
This can easily and simply be done using a StringBuilder and a few helper variables to keep track of how many of each letter you've seen. Then just build as you go.
For example:
static String encode(String s) {
StringBuilder sb = new StringBuilder();
char[] word = s.toCharArray();
char current = word[0]; // We initialize to compare vs. first letter (assumes s is not empty)
// our helper variables
int index = 0; // tracks how far along we are
int count = 0; // how many of the same letter we've seen
for (char c : word) {
if (c != current) {
// the run ended: emit it as count + letter (e.g. 4a) and start the next run
sb.append(Integer.toString(count) + current);
count = 0;
current = c;
}
count++;
index++;
if (index == word.length)
sb.append(Integer.toString(count) + current); // emit the final run
}
return sb.toString();
}
Since this is clearly a homework assignment, I challenge you to learn the approach and not just simply use the answer as the solution to your homework. StringBuilders are very useful for building things as you go, thus keeping your runtime O(n) in many cases. Here using a couple of helper variables to track where we are in the iteration "index" and another to keep count of how many of a particular letter we've seen "count", we keep all necessary info for building our encoded string as we go.
Try this out:
private static String encode(String sampleInput) {
String encodedString = null;
//get the input to a character array.
// String sampleInput = "aabbcccd";
char[] charArr = sampleInput.toCharArray();
char prev=(char)0;
int counter =1;
//compare each element with its next element and
//if same increment the counter
StringBuilder sb = new StringBuilder();
for (int i = 0; i < charArr.length; i++) {
if(i+1 < charArr.length && charArr[i] == charArr[i+1]){
counter ++;
}else {
//System.out.print(counter + Character.toString(charArr[i]));
sb.append(counter + Character.toString(charArr[i]));
counter = 1;
}
}
return sb.toString();
}
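Here is my solution in Java. Note that it counts how often each character occurs in the whole string rather than the length of each consecutive run, and the HashSet does not keep the characters in input order, so its output differs from run-length encoding for inputs such as aabaa.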
public String encodingString(String s){
StringBuilder encodedString = new StringBuilder();
List<Character> listOfChars = new ArrayList<Character>();
Set<String> removeRepeated = new HashSet<String>();
//Adding characters of string to list
for(int i=0;i<s.length();i++){
listOfChars.add(s.charAt(i));
}
//Getting the occurance of each character and adding it to set to avoid repeated strings
for(char j:listOfChars){
String temp = Integer.toString(Collections.frequency(listOfChars,j))+Character.toString(j);
removeRepeated.add(temp);
}
//Constructing the encodingString.
for(String k:removeRepeated){
encodedString.append(k);
}
return encodedString.toString();
}
import java.util.Scanner;
/**
* @author jyotiv
*
*/
public class RunLengthEncoding {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
System.out.println("Enter line to encode:");
Scanner s=new Scanner(System.in);
String input=s.nextLine();
int len = input.length();
int i = 0;
int noOfOccurencesForEachChar = 0;
char storeChar = input.charAt(0);
String outputString = "";
for(;i<len;i++)
{
if(i+1<len)
{
if(input.charAt(i) == input.charAt(i+1))
{
noOfOccurencesForEachChar++;
}
else
{
outputString = outputString +
Integer.toString(noOfOccurencesForEachChar+1) + storeChar; // decimal count; toHexString would print a run of 10 as "a"
noOfOccurencesForEachChar = 0;
storeChar = input.charAt(i+1);
}
}
else
{
outputString = outputString +
Integer.toString(noOfOccurencesForEachChar+1) + storeChar;
}
}
System.out.println("Encoded line is: " + outputString);
}
}
I have tried this one. It will work for sure.
String handling in Java is something I'm trying to learn to do well. Currently I want to take in a string and replace any characters I find.
Here is my current inefficient (and kinda silly IMO) function. It was written to just work.
public String convertWord(String word)
{
return word.toLowerCase().replace('á', 'a')
.replace('é', 'e')
.replace('í', 'i')
.replace('ú', 'u')
.replace('ý', 'y')
.replace('ð', 'd')
.replace('ó', 'o')
.replace('ö', 'o')
.replaceAll("[-]", "")
.replaceAll("[.]", "")
.replaceAll("[/]", "")
.replaceAll("[æ]", "ae")
.replaceAll("[þ]", "th");
}
I ran 1.000.000 runs of it and it took 8182ms. So how should I proceed in changing this function to make it more efficient?
Solution found:
Converting the function to this
public String convertWord(String word)
{
StringBuilder sb = new StringBuilder();
char[] charArr = word.toLowerCase().toCharArray();
for(int i = 0; i < charArr.length; i++)
{
// Single character case
if(charArr[i] == 'á')
{
sb.append('a');
}
// Char to two characters
else if(charArr[i] == 'þ')
{
sb.append("th");
}
// Remove
else if(charArr[i] == '-')
{
}
// Base case
else
{
sb.append(charArr[i]); // use the lower-cased copy, not the original word
}
}
return sb.toString();
}
Running this function 1.000.000 times takes 518ms. So I think that is efficient enough. Thanks for the help guys :)
You could create a table of String[] which is Character.MAX_VALUE in length. (Including the mapping to lower case)
As the replacements got more complex, the time to perform them would remain the same.
private static final String[] REPLACEMENT = new String[Character.MAX_VALUE+1];
static {
for(int i=Character.MIN_VALUE;i<=Character.MAX_VALUE;i++)
REPLACEMENT[i] = Character.toString(Character.toLowerCase((char) i));
// substitute
REPLACEMENT['á'] = "a";
// remove
REPLACEMENT['-'] = "";
// expand
REPLACEMENT['æ'] = "ae";
}
public String convertWord(String word) {
StringBuilder sb = new StringBuilder(word.length());
for(int i=0;i<word.length();i++)
sb.append(REPLACEMENT[word.charAt(i)]);
return sb.toString();
}
My suggestion would be:
Convert the String to a char[] array
Run through the array, testing each character one by one (e.g. with a switch statement) and replacing it if needed
Convert the char[] array back to a String
I think this is probably the fastest performance you will get in pure Java.
EDIT: I notice you are doing some changes that change the length of the string. In this case, the same principle applies, however you need to keep two arrays and increment both a source index and a destination index separately. You might also need to resize the destination array if you run out of target space (i.e. reallocate a larger array and arraycopy the existing destination array into it)
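A sketch of the two-array idea with separate source and destination indices, using the replacements from the question. The class name CharArrayReplace is hypothetical. Two choices here are assumptions rather than part of the answer: the destination array is sized at twice the source length (no replacement produces more than two characters, so it never needs to grow), and the accented letters are written as Unicode escapes so the code compiles regardless of the source file encoding.

```java
class CharArrayReplace {
    static String convertWord(String word) {
        char[] src = word.toLowerCase().toCharArray();
        // Worst case every character expands to two, so no resizing is needed.
        char[] dst = new char[src.length * 2];
        int d = 0; // destination index, advanced independently of the source index
        for (int s = 0; s < src.length; s++) {
            switch (src[s]) {
                case '\u00E1': dst[d++] = 'a'; break; // a with acute
                case '\u00E9': dst[d++] = 'e'; break; // e with acute
                case '\u00ED': dst[d++] = 'i'; break; // i with acute
                case '\u00FA': dst[d++] = 'u'; break; // u with acute
                case '\u00FD': dst[d++] = 'y'; break; // y with acute
                case '\u00F0': dst[d++] = 'd'; break; // eth
                case '\u00F3':                        // o with acute
                case '\u00F6': dst[d++] = 'o'; break; // o with diaeresis
                case '\u00E6': dst[d++] = 'a'; dst[d++] = 'e'; break; // ae ligature
                case '\u00FE': dst[d++] = 't'; dst[d++] = 'h'; break; // thorn
                case '-':
                case '.':
                case '/': break; // removed: nothing is written
                default: dst[d++] = src[s];
            }
        }
        return new String(dst, 0, d); // copy only the part that was filled
    }
}
```

If a replacement could be longer than two characters, the fixed factor would no longer hold and the reallocation step mentioned above (or a StringBuilder) would be needed.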
My implementation is based on look up table.
public static String convertWord(String str) {
char[] words = str.toCharArray();
char[] find = {'á','é','ú','ý','ð','ó','ö','æ','þ','-','.',
'/'};
String[] replace = {"a","e","u","y","d","o","o","ae","th"};
StringBuilder out = new StringBuilder(str.length());
for (int i = 0; i < words.length; i++) {
boolean matchFailed = true;
for(int w = 0; w < find.length; w++) {
if(words[i] == find[w]) {
if(w < replace.length) {
out.append(replace[w]);
}
matchFailed = false;
break;
}
}
if(matchFailed) out.append(words[i]);
}
return out.toString();
}
My first choice would be to use a StringBuilder because you need to remove some chars from the string.
Second choice would be to iterate through the array of chars and add each treated char to another array with the initial size of the string. Then you would need to copy the array to trim the possibly unused positions.
After that, I would run some performance tests to see which one is better.
I doubt, that you can speed up the 'character replacement' at all really. As for the case of regular expression replacement, you may compile the regexs beforehand
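Compiling the regexes beforehand could look like the sketch below. It covers only the regex-based replacements from the question; the class name PrecompiledPatterns and the method stripAndExpand are hypothetical. String.replaceAll compiles its pattern on every call, whereas a static Pattern is compiled once. The non-ASCII letters are written as Unicode escapes so the sketch does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

class PrecompiledPatterns {
    // Compiled once when the class is initialized.
    private static final Pattern REMOVE = Pattern.compile("[-./]");
    private static final Pattern AE = Pattern.compile("\u00E6");    // ae ligature
    private static final Pattern THORN = Pattern.compile("\u00FE"); // thorn

    // Removes '-', '.' and '/', then expands the two special letters.
    static String stripAndExpand(String word) {
        String s = REMOVE.matcher(word).replaceAll("");
        s = AE.matcher(s).replaceAll("ae");
        return THORN.matcher(s).replaceAll("th");
    }
}
```

This still makes three passes over the string, so whether it beats a single character-by-character loop would have to be measured.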
Use the function String.replaceAll.
Nice article similar with what you want: link
Any time we have problems like this we use regular expressions, as they are by far the fastest way to deal with what you are trying to do.
Have you already tried regular expressions?
What I see being inefficient is that you check again characters that have already been replaced, which is useless.
I would get the char array of the String instance, iterate over it, and for each character run through a series of if-else checks like this:
char[] array = word.toCharArray();
for(int i=0; i<array.length; ++i){
char currentChar = array[i];
// char is a primitive, so compare with == rather than equals()
if(currentChar == 'é')
array[i] = 'e';
else if(currentChar == 'ö')
array[i] = 'o';
// ... further else-if branches for the remaining characters
}
I just implemented this utility class that replaces a char or a group of chars of a String. It is equivalent to bash tr and perl tr///, aka, transliterate. I hope it helps someone!
package your.pkg.name; // placeholder; "package" itself is a reserved word and cannot appear in a package name
/**
* Utility class that replaces chars of a String, aka, transliterate.
*
* It's equivalent to bash 'tr' and perl 'tr///'.
*
*/
public class ReplaceChars {
public static String replace(String string, String from, String to) {
return new String(replace(string.toCharArray(), from.toCharArray(), to.toCharArray()));
}
public static char[] replace(char[] chars, char[] from, char[] to) {
char[] output = chars.clone();
for (int i = 0; i < output.length; i++) {
for (int j = 0; j < from.length; j++) {
if (output[i] == from[j]) {
output[i] = to[j];
break;
}
}
}
return output;
}
/**
* For tests!
*/
public static void main(String[] args) {
// Example from: https://en.wikipedia.org/wiki/Caesar_cipher
String string = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG";
String from = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
String to = "XYZABCDEFGHIJKLMNOPQRSTUVW";
System.out.println();
System.out.println("Cesar cypher: " + string);
System.out.println("Result: " + ReplaceChars.replace(string, from, to));
}
}
This is the output:
Cesar cypher: THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
Result: QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD