Suppose you have a sorted List containing server names. You'd like to collapse them as tightly as possible.
Example:
abcd01c, abcd02c, abcd04c, abcd05, z1x
should become
abcd0[1-4]c,abcd05,z1x
What is the simplest algorithm to take care of something like this?
I would store all strings in a prefix map, which makes it very easy to decide whether a String exists, and also allows fast iteration over a subset of Strings.
Store the Strings as:
(0)abcd01c
(5) 2c,
(5) 4c,
(4) 05,
(0)z1x
The number is the count of characters which have to be taken from the previous String. This is a common implementation for dictionaries like phonebooks, where you have to store many similar Strings.
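The storage scheme above can be sketched in Java as follows. This is only an illustration under assumptions: the class name FrontCoding, the "(n)suffix" text format, and both method names are made up for this example. The sketch always reuses the longest shared prefix, so abcd05 is stored as (5)5 rather than with the shorter shared prefix of 4 shown in the listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: stores each string as "(n)suffix", where n is the number
// of leading characters to reuse from the previous string in the sorted list.
class FrontCoding {
    static List<String> encode(List<String> sorted) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String s : sorted) {
            int shared = 0;
            int max = Math.min(prev.length(), s.length());
            // Count how many leading characters match the previous string.
            while (shared < max && prev.charAt(shared) == s.charAt(shared)) {
                shared++;
            }
            out.add("(" + shared + ")" + s.substring(shared));
            prev = s;
        }
        return out;
    }

    static List<String> decode(List<String> encoded) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String e : encoded) {
            int close = e.indexOf(')');
            int shared = Integer.parseInt(e.substring(1, close));
            // Rebuild the string from the reused prefix plus the stored suffix.
            String s = prev.substring(0, shared) + e.substring(close + 1);
            out.add(s);
            prev = s;
        }
        return out;
    }
}
```

Decoding has to walk the list from the start, which is the usual trade-off of this kind of compression.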
A Trie is a similar structure, as Brian Roach noticed in the comments.
I'm a little shaky on what your actual need is, but an approach to this would be in a custom Trie (Wikipedia Entry)
When you reached the point in your key where your next character isn't an alpha character, you'd know that you had a prefix. Inside that node in the Trie you could then have another map (not pointing at additional Trie nodes) that was keyed by the suffix and contained the ranges for each.
You still have the problem, however, of the specific rules around your data. If you have abcd01c as the key, is the prefix abcd or abcd0?
I think dynamic programming can help. The shortest compressed length can be computed for every prefix of the given array, i.e. {1}, {1,2}, {1,2,3}, ... These values are computed in order, so earlier ones are used to calculate the current one. If we want to calculate A[i], A[j] is known (j < i), and the elements from j+1 to i can be compressed into one entry, then a candidate for A[i] is A[j] + length of that compressed entry; A[i] is the minimum over all such j.
Update:
I don't really see how to compress when the range spans more than one character, so here is a simple implementation for the single-character case.
// C#. Assumes list is a sorted string[] and answer is a List<string>.
// Collapses runs of neighbours that differ in exactly one character position.
int start = 0;     // index of the first string in the current run
int diffIdx = -1;  // position at which the strings of the current run differ
for (int i = 1; i <= list.Length; i++) {
    // Find the single position where list[i] differs from list[i - 1], if any.
    int curIdx = -1;
    if (i < list.Length && list[i].Length == list[i - 1].Length) {
        int diffs = 0;
        for (int j = 0; j < list[i].Length; j++) {
            if (list[i][j] != list[i - 1][j]) {
                curIdx = j;
                diffs++;
            }
        }
        if (diffs != 1)
            curIdx = -1;
    }
    // list[i] extends the run if it differs at the same position as the rest of the run.
    if (curIdx != -1 && (diffIdx == -1 || diffIdx == curIdx)) {
        diffIdx = curIdx;
        continue;
    }
    // The run list[start..i - 1] ends here: emit it.
    // The range goes from the first to the last character and ignores gaps in between.
    string first = list[start], last = list[i - 1];
    if (diffIdx == -1)
        answer.Add(first);
    else
        answer.Add(first.Substring(0, diffIdx) +
            '[' + first[diffIdx] + '-' + last[diffIdx] + ']' + first.Substring(diffIdx + 1));
    start = i;
    diffIdx = -1;
}
I am solving a LeetCode question: Minimum Number of Operations to Move All Balls to Each Box.
You have n boxes. You are given a binary string boxes of length n, where boxes[i] is '0' if the ith box is empty, and '1' if it contains one ball. In one operation, you can move one ball from a box to an adjacent box. Return an array answer of size n, where answer[i] is the minimum number of operations needed to move all the balls to the ith box. For input boxes = "001011", the output is: [11,8,5,4,3,4].
Doing it in O(n^2) is trivial. I could only solve it that way. I am trying to understand this O(n) solution, but having a hard time:
class Solution {
public int[] minOperations(String boxes) {
int n = boxes.length();
int[] left = new int[n];
int[] right = new int[n];
int[] ans = new int[n];
int count = boxes.charAt(0) - '0';
for(int i = 1 ; i < n ; i++){
left[i] = left[i - 1] + count;
count += boxes.charAt(i) - '0';
// System.out.println("i: "+i+" left[i]: "+left[i]+" left[i-1] : "+left[i-1]+" count: " + count);
}
count = boxes.charAt(n - 1) - '0';
for(int i = n - 2 ; i >=0 ; i--){
right[i] = right[i + 1] + count;
count += boxes.charAt(i) - '0';
// System.out.println("i: "+i+" right[i]: "+right[i]+" right[i+1] : "+right[i+1]+" count: " + count);
}
for(int i = 0 ; i < n ; i++) {
ans[i] = left[i] + right[i];
}
return ans;
}
}
Could someone please elaborate the logic behind:
left[i] = left[i - 1] + count;
count += boxes.charAt(i) - '0';
I understand we increment count whenever we encounter a ball, but how does left[i] = left[i - 1] + count; help us count the number of operations needed so far to move all the balls on the left to i (and vice versa in case of right)?
Thank you!
Think of left[i] as the cost of moving all balls from indices 0 to i - 1 into box i.
So,
left[i] =
left[i - 1] (the cost of gathering all 1's to the left of i - 1 at index i - 1)
+ count (the total number of 1's to the left of i; each of them has to move one more step, from i - 1 to i, so this step costs count)
This comment from #dunkypie helped:
"I finally build the intuition for the problem using DP. Here is goes. When we say calculating the number of operations for moving all the balls to the left of a box to that bax, say we are at the i th position(or box). This consists of two parts, first dp[i - 1] will give us the number of operations to move all the balls so far till (i - 1) th position and now we have all the balls till the (i - 1) th position in the (i - 1) th position(or box). Then the next part involves moving all those balls in (i - 1) th position to the i th position. Also note the cost of moving a single ball by 1 position is 1. So the recurrence relation becomes:
dp[i] = dp[i - 1] + (1 * balls) where 1 here is the cost of moving a single ball."
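One way to convince yourself of the recurrence is to compare it against the obvious O(n^2) sum. The sketch below does that for the left-hand pass only; the class and method names are made up for illustration and are not part of the original solution.

```java
// Hypothetical helper class comparing the recurrence with a brute-force sum.
class BallMoves {
    // Direct definition: every ball at j < i costs (i - j) moves to reach box i.
    static int[] bruteForceLeft(String boxes) {
        int n = boxes.length();
        int[] left = new int[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                if (boxes.charAt(j) == '1') {
                    left[i] += i - j;
                }
            }
        }
        return left;
    }

    // Recurrence: all balls left of i are first gathered at i - 1 (cost left[i - 1]),
    // then each of those count balls takes one more step to reach i.
    static int[] recurrenceLeft(String boxes) {
        int n = boxes.length();
        int[] left = new int[n];
        int count = 0; // number of balls in boxes[0..i-1]
        for (int i = 1; i < n; i++) {
            count += boxes.charAt(i - 1) - '0';
            left[i] = left[i - 1] + count;
        }
        return left;
    }
}
```

For boxes = "001011" both methods give left = [0, 0, 0, 1, 2, 4]. The mirrored pass from the right gives [11, 8, 5, 3, 1, 0], and adding the two arrays element by element yields the expected [11, 8, 5, 4, 3, 4].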
I have a small doubt while sorting arrays and yeah I am new to programming. Take a look at this code for example:
public void bubbleSort(int[] array) {
boolean swapped = true;
int j = 0;
int tmp;
while (swapped) {
swapped = false;
j++;
for (int i = 0; i < array.length - j; i++) {
if (array[i] > array[i + 1]) {
tmp = array[i];
array[i] = array[i + 1];
array[i + 1] = tmp;
swapped = true;
}
}
}
}
In the above code, why do we have to use j++ and i < (array.length-j) as the test expression? We could have rather used i < (array.length) as the test expression while omitting the variable j. Any answers?
"Why do we have to use j++ and i < (array.length-j) as the test expression?"
The reason is that after j passes, the elements array[array.length - j] to array[array.length - 1] are already in their final sorted positions.
Example: Say you have array of length n.
So after the first iteration the biggest element will be placed at array[n - 1].
So because the largest element is already sorted on the next iteration we will only sort array of length n - 1.
After the second iteration the second biggest element will be placed at array[ n - 2].
So because the 1st largest and 2nd largest elements are already sorted on the next iteration we will only sort array of length n - 2, and so on...
The running time of the algorithm will be (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons in the worst case, which is O(n^2).
As you said, we could have used i < (array.length - 1), but we would just be doing a lot of work for nothing. If we do this, the running time will be roughly n * n comparisons in the worst case, which is also O(n^2). Although both are O(n^2), n(n - 1)/2 is clearly smaller than n * n, hence the first version is more efficient.
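To make the difference concrete, here is a small sketch that counts comparisons for both loop bounds. The class name, method name, and the shrink flag are assumptions introduced for this illustration; the sorting logic itself follows the code in the question.

```java
// Hypothetical helper: bubble sort that returns the number of comparisons made.
class BubbleSortComparisons {
    // shrink == true  uses i < array.length - j (skips the already sorted tail)
    // shrink == false uses i < array.length - 1 (rescans the whole array each pass)
    static long sortCounting(int[] array, boolean shrink) {
        long comparisons = 0;
        boolean swapped = true;
        int j = 0;
        while (swapped) {
            swapped = false;
            j++;
            int limit = shrink ? array.length - j : array.length - 1;
            for (int i = 0; i < limit; i++) {
                comparisons++;
                if (array[i] > array[i + 1]) {
                    int tmp = array[i];
                    array[i] = array[i + 1];
                    array[i + 1] = tmp;
                    swapped = true;
                }
            }
        }
        return comparisons;
    }
}
```

On the reversed array {5, 4, 3, 2, 1}, the shrinking bound needs 4 + 3 + 2 + 1 = 10 comparisons, while the fixed bound needs 5 passes of 4 comparisons each, i.e. 20. Both leave the array sorted.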
I'm trying to solve the edit distance problem. the code I've been using is below.
public static int minDistance(String word1, String word2) {
int len1 = word1.length();
int len2 = word2.length();
// len1+1, len2+1, because finally return dp[len1][len2]
int[][] dp = new int[len1 + 1][len2 + 1];
for (int i = 0; i <= len1; i++) {
dp[i][0] = i;
}
for (int j = 0; j <= len2; j++) {
dp[0][j] = j;
}
//iterate though, and check last char
for (int i = 0; i < len1; i++) {
char c1 = word1.charAt(i);
for (int j = 0; j < len2; j++) {
char c2 = word2.charAt(j);
//if last two chars equal
if (c1 == c2) {
//update dp value for +1 length
dp[i + 1][j + 1] = dp[i][j];
} else {
int replace = dp[i][j] + 1 ;
int insert = dp[i][j + 1] + 1 ;
int delete = dp[i + 1][j] + 1 ;
int min = replace > insert ? insert : replace;
min = delete > min ? min : delete;
dp[i + 1][j + 1] = min;
}
}
}
return dp[len1][len2];
}
It's a DP approach. The problem is that, since it uses a 2D array, we can't use this method for large strings, e.g. string length > 100000.
So is there any way to modify this algorithm to overcome that difficulty?
NOTE:
The above code will accurately solve the Edit Distance problem for small strings. (which has length below 1000 or near)
As you can see in the code it uses a Java 2D Array "dp[][]" . So we can't initialize a 2D array for large rows and columns.
Ex : If i need to check 2 strings whose lengths are more than 100000
int[][] dp = new int[len1 + 1][len2 + 1];
the above will be
int[][] dp = new int[100000][100000];
So it will give a stackOverflow error.
So the above program is only good for short Strings.
What I'm asking is: is there any way to solve this problem efficiently for large strings (length > 100000) in Java?
First of all, there's no problem in allocating a 100k x 100k int array in Java, you just have to do it in the Heap, not the Stack (and on a machine with around 80GB of memory :))
Secondly, as a (very direct) hint:
Note that in your loop, you are only ever using 2 rows at a time - row i and row i+1. In fact, you calculate row i+1 from row i. Once you get i+1 you don't need to store row i anymore.
This neat trick allows you to store only 2 rows at the same time, bringing down the space complexity from n^2 to n. Since you stated that this is not homework (even though you're a CS undergrad by your profile...), I'll trust you to come up with the code yourself.
Come to think of it I recall having this exact problem when I was doing a class in my CS degree...
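For readers who want to see the two-row idea spelled out, here is a minimal sketch. It assumes the same recurrence as the code in the question; the class name and the prev/cur variable names are made up for this example. Note that it only reduces memory: the running time is still proportional to len1 * len2, so two strings of length 100000 still mean about 10^10 cell updates.

```java
// Hypothetical sketch: edit distance keeping only two rows of the DP table.
class EditDistanceTwoRows {
    static int minDistance(String word1, String word2) {
        int len1 = word1.length();
        int len2 = word2.length();
        int[] prev = new int[len2 + 1]; // plays the role of dp[i]
        int[] cur = new int[len2 + 1];  // plays the role of dp[i + 1]
        for (int j = 0; j <= len2; j++) {
            prev[j] = j; // distance from the empty prefix of word1
        }
        for (int i = 0; i < len1; i++) {
            cur[0] = i + 1; // distance to the empty prefix of word2
            char c1 = word1.charAt(i);
            for (int j = 0; j < len2; j++) {
                if (c1 == word2.charAt(j)) {
                    cur[j + 1] = prev[j];
                } else {
                    int replace = prev[j] + 1;
                    int insert = prev[j + 1] + 1;
                    int delete = cur[j] + 1;
                    cur[j + 1] = Math.min(replace, Math.min(insert, delete));
                }
            }
            // The finished row becomes the previous row for the next iteration.
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[len2];
    }
}
```

Swapping the two array references avoids copying a row on every iteration.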
I've got an ArrayIndexOutOfBoundsException when trying to determine whether the data is a heap, using the child-parent relationship in which the children of index i are at 2i + 1 and 2i + 2. Do you know how I can resolve the issue?
for (int i = 0; i < array.length; i++)
{
if ((array[i] < array[2*i + 1]) || (array[i] < array[2*i + 2]))
{
bool = false;
....
}
}
I would suggest checking the boundaries of the array first, and comparing only if it contains enough elements. Note that index 2*i + 2 is valid only when array.length > 2*i + 2:
if ((array.length > 2*i + 2)
&& ((array[i] < array[2*i + 1]) || (array[i] < array[2*i + 2])))
{
bool = false;
....
}
Your for-loop guard allows i == array.length - 1. But inside the loop, you look up element 2 * i + 2. That's 2 * array.length. That's way beyond the end of the array, so the loop will always throw an exception on a non-empty array.
Change your for-loop guard to i < (array.length - 1) / 2, so that the maximal value of 2 * i + 2 is at most array.length - 1. Also, ensure you test this for arrays of both even and odd length: with an even length, the last parent has only a left child (at index array.length - 1), which this guard skips, so it needs a separate check.
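An alternative that avoids the even/odd distinction is to bound-check each child separately. The following is a sketch, assuming a max-heap (parent not smaller than its children, as the comparisons in the question suggest); the class and method names are hypothetical.

```java
// Hypothetical sketch: checks the max-heap property without reading past the array.
class HeapCheck {
    static boolean isMaxHeap(int[] array) {
        // Only indices that have at least a left child need to be visited.
        for (int i = 0; 2 * i + 1 < array.length; i++) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            if (array[i] < array[left]) {
                return false;
            }
            // The right child may not exist for the last parent.
            if (right < array.length && array[i] < array[right]) {
                return false;
            }
        }
        return true;
    }
}
```

Because the loop condition itself guarantees that the left child exists, only the right child needs an explicit bounds check.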
I don't know how to do something specific on the last iteration of this for loop:
String msg = "---Player List, Count:" + users.size() + "---" + brln;
for (int i = 0; i < users.size(); i++) {
if((i - 1) == users.size()){
msg += "--::" + users.get(i).name; //Do this at the last loop
return; //returns the void
}
msg += "--::" + users.get(i).name + brln; // do this by default
}
Can you help me get this to work?
Your condition is wrong: instead of (i - 1) == users.size() use (i + 1) == users.size() or i == users.size() - 1.
Basically (i - 1) == users.size() would match the element after the last (which clearly doesn't exist), i.e. for a list of size 5 you'd get (i - 1) == 5 or i == 6.
In the example above (i + 1) == users.size() and i == users.size() - 1 would resolve to (i + 1) == 5 and i == 5 - 1 which both result in i == 4, which is the last index in the list.
Edit: Btw, your loop is still quite odd. You basically seem to add a line break after each but the last element. Why don't you change it to something like this:
String msg = "---Player List, Count:" + users.size() + "---" + brln;
for (int i = 0; i < users.size(); i++) {
if( i > 0){
msg += brln;
}
msg += "--::" + users.get(i).name;
}
This would add a line break before every line except the first. Note how the condition is much easier.
Change this
from:
if((i - 1) == users.size()){
to:
if((i + 1) == users.size()){
What you want in effect is a string of entries separated by a delimiter, where the delimiter is a line break. Use this idiom:
String delimiter = "", result = "";
for (...loop init...) {
result += delimiter;
...append one entry...
delimiter = brln;
}
Other than that, building a large string by creating a new string in each iteration is bad for performance, as it is an O(n^2) operation overall. You should prefer a StringBuilder.
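Put together, the delimiter idiom with a StringBuilder might look like the sketch below. To keep it self-contained, it takes a plain list of names instead of the User objects from the question; the class name, method name, and parameters are assumptions for illustration.

```java
import java.util.List;

// Hypothetical sketch of the delimiter idiom using a StringBuilder.
class PlayerLines {
    static String joinEntries(List<String> names, String brln) {
        StringBuilder result = new StringBuilder();
        String delimiter = ""; // nothing is written before the first entry
        for (String name : names) {
            result.append(delimiter).append("--::").append(name);
            delimiter = brln;  // every later entry is preceded by a line break
        }
        return result.toString();
    }
}
```

The loop body is the same for every element, so no check for the first or last index is needed, and an empty list simply produces an empty string.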
If you need last index just use if( i == users.size()-1)
In the example you provide, you should use a StringBuilder for string concatenation. Then you can simply loop without any special-case check, and you can use a for-each loop.
StringBuilder msg = new StringBuilder().append("---Player List, Count:").append(users.size()).append("---");
for (User user : users) {
msg.append(brln)
.append("--::").append(user.name);
}
If you change your code a little, you can avoid a special case:
StringBuilder msg = new StringBuilder("---Player List, Count:" + users.size()
+ "---"); // Note a lack of brln
for (int i = 0; i < users.size(); i++) {
msg.append(brln + "--::" + users.get(i).name);
}
return msg.toString();
Ideally you should use a StringBuilder for concatenating strings in a loop, as I've done above.
(This solution works in your case, since you had a header including a line break. See other solutions for a generic way to insert string X between every occurrence but the last in a loop).
Your condition is wrong. Change this line:
if((i - 1) == users.size())
to this one:
if(i == (users.size()-1))
Change your condition to
i == users.size() - 1
Why? In Java (and other languages) the first element in a list is at index 0, and the last element at N-1 (if N is the number of elements currently in the list), so users.size() - 1 is the index of the last element. For example, if there are 10 elements in users list, the last will be at index 9.
Full code:
String msg = "---Player List, Count:" + users.size() + "---" + brln;
for (int i = 0; i < users.size(); i++) {
//if i equals to the last index, do your special handling of the loop
if(i == (users.size() - 1)) {
msg += "--::" + users.get(i).name; //Do this at the last loop
break;
}
msg += "--::" + users.get(i).name + brln; // do this by default
}
Although this might not help you right now (if you are not using Java 8 yet), Java 8 puts an end to all this messy code:
// requires Java 8; name is accessed as a field here, so a lambda is used rather than a method reference
String msg = users.stream().map(user -> user.name).collect(Collectors.joining(brln));