Text Justification Algorithm


This is a very famous DP problem. Can somebody help me visualize the recursion part of it? How are the permutations or combinations generated?
Problem reference: https://www.geeksforgeeks.org/dynamic-programming-set-18-word-wrap/

Given the maximum line width L, the idea for justifying the text T is to consider all suffixes of the text (suffixes of the word sequence rather than of the characters, to be precise).
Dynamic programming is nothing but "careful brute force".
If you consider the brute-force approach, you need to do the following:
1. Consider putting 1, 2, ..., n words on the first line.
2. For each case in step 1 (say i words are put on line 1), consider putting 1, 2, ..., n - i words on the second line, then do the same for the remaining words on the third line, and so on.
Instead, let's just consider the cost of putting a given word at the beginning of a line.
In general, we can define DP(i) to be the minimum cost of laying out words[i:] when the word at index i begins a line.
How can we form a recurrence relation for DP(i)?
If the jth word is the beginning of the next line, then the current line contains words[i:j) (j exclusive), and the cost of the jth word being the beginning of the next line is DP(j).
Hence DP(i) = DP(j) + cost of putting words[i:j) in the current line.
As we want to minimise the total cost, DP(i) can be defined as follows.
Recurrence relation:
DP(i) = min { DP(j) + cost of putting words[i:j) in the current line }
for all j in [i+1, n]
Note that j = n signifies that no words are left to be put on the next line.
Base case: DP(n) = 0, since at this point there are no words left to write.
To summarise:
Subproblems: suffixes, words[i:]
Guess: where to start the next line; the number of choices is n - i, which is O(n)
Recurrence: DP(i) = min { DP(j) + cost of putting words[i:j) in the current line }
If we use memoization, each DP(j) lookup inside the curly braces takes O(1) time (assuming the line cost is also available in O(1)), and the loop runs O(n) times (once per choice).
i varies from n down to 0, so the total complexity is brought down to O(n^2).
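To help visualize the recursion, the recurrence can also be written top-down with memoization. The following is a minimal sketch, not the implementation from this answer: the class name WordWrapRecursionSketch, the method names, and the prefix-sum array are assumptions made for illustration, and the line cost is taken to be the squared leftover space.

```java
import java.util.Arrays;

// Hypothetical names; only the recurrence DP(i) = min { DP(j) + cost(i, j) } comes from the text.
final class WordWrapRecursionSketch {
    private static final long INF = Long.MAX_VALUE / 4;

    // Minimum total badness of laying out all words within the given line width.
    static long minCost(String[] words, int width) {
        int n = words.length;
        // prefix[k] = total length of words[0..k), so a line length is an O(1) lookup.
        int[] prefix = new int[n + 1];
        for (int k = 0; k < n; k++) prefix[k + 1] = prefix[k] + words[k].length();
        long[] memo = new long[n + 1];
        Arrays.fill(memo, -1);
        return dp(0, n, width, prefix, memo);
    }

    // DP(i): minimum cost of laying out the suffix words[i:], with word i starting a line.
    private static long dp(int i, int n, int width, int[] prefix, long[] memo) {
        if (i == n) return 0;                 // base case: no words left
        if (memo[i] >= 0) return memo[i];     // each suffix is solved only once
        long best = INF;
        for (int j = i + 1; j <= n; j++) {    // guess: word j starts the next line
            // characters of words[i:j) plus one space between neighbouring words
            int lineLength = prefix[j] - prefix[i] + (j - i - 1);
            if (lineLength > width) break;    // adding more words only makes it longer
            long slack = width - lineLength;
            long rest = dp(j, n, width, prefix, memo);
            if (rest < INF) best = Math.min(best, slack * slack + rest);
        }
        memo[i] = best;                       // stays INF if some word cannot fit at all
        return best;
    }
}
```

For the words aaa, bb, cc, ddddd and width 6, minCost returns 11: the lines "aaa", "bb cc" and "ddddd" leave 3, 1 and 1 spare columns, and 9 + 1 + 1 = 11.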
Although this gives the minimum cost of justifying the text, to solve the original problem we also need to keep track of the j value chosen as the minimum in the expression above, so that we can use it later to print the justified text. The idea is to keep a parent pointer.
Hope this helps you understand the solution. Below is a simple implementation of the above idea.
import java.util.ArrayList;
import java.util.List;

public class TextJustify {
    private static final class IntPair {
        // The cost or badness
        final int x;
        // The index of the word at the beginning of the next line
        final int y;
        IntPair(int x, int y) { this.x = x; this.y = y; }
    }

    public List<String> fullJustify(String[] words, int L) {
        IntPair[] memo = new IntPair[words.length + 1];
        // Base case: no words left
        memo[words.length] = new IntPair(0, 0);
        for (int i = words.length - 1; i >= 0; i--) {
            int score = Integer.MAX_VALUE;
            int nextLineIndex = i + 1;
            for (int j = i + 1; j <= words.length; j++) {
                int badness = calcBadness(words, i, j, L);
                // Stop once the line no longer fits (or the value overflowed)
                if (badness < 0 || badness == Integer.MAX_VALUE) break;
                int currScore = badness + memo[j].x;
                if (currScore < 0 || currScore == Integer.MAX_VALUE) break;
                if (score > currScore) {
                    score = currScore;
                    nextLineIndex = j;
                }
            }
            memo[i] = new IntPair(score, nextLineIndex);
        }
        // Follow the parent pointers to rebuild the lines
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < words.length) {
            String line = getLine(words, i, memo[i].y);
            result.add(line);
            i = memo[i].y;
        }
        return result;
    }

    // Squared leftover space of a line holding words[start:end)
    private int calcBadness(String[] words, int start, int end, int width) {
        int length = 0;
        for (int i = start; i < end; i++) {
            length += words[i].length();
            if (length > width) return Integer.MAX_VALUE;
            length++; // space after the word
        }
        length--; // no space after the last word
        int temp = width - length;
        return temp * temp;
    }

    private String getLine(String[] words, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < end - 1; i++) {
            sb.append(words[i] + " ");
        }
        sb.append(words[end - 1]);
        return sb.toString();
    }
}

Related

Recursive method to replace all occurrences of a value in a 2D array

I have created a recursive method that replaces all occurrences of an element in a two dimensional double array. The issue is that I cannot seem to get this working without encountering a stack overflow error. Could someone please look at my code below and show me how to fix this? I have tried setting this up several times over the past few days. Thank you. Note that my arrays are 2 x 3, so the first if means that if you are at column 1 row 2, you are at the end of the array, and in that case you are done searching.
private static int replaceAll(double number, double replacementTerm) {
    int i = 0;
    int j = 0;
    double searchFor = number;
    double replace = replacementTerm;
    if (i == 1 && j == 2) {
        System.out.println("Search complete!");
    }
    if (twoDimArray2[i][j] == searchFor) {
        System.out.println("Replaced An Element!");
        twoDimArray2[i][j] = replace;
        System.out.println(twoDimArray2[i][j]);
        j++;
        return replaceAll(searchFor, replace);
    }
    if (j == twoDimArray2.length) {
        i++;
        return replaceAll(searchFor, replace);
    } else {
        j++;
        return replaceAll(searchFor, replace);
    }
}

i and j should be method parameters instead of local variables, so that changes to their values carry over between calls. Try to move right and down recursively as long as this does not exceed the bounds of the array. Note that this is much less efficient than iterating with two nested for loops, as it checks many positions in the array more than once; to mitigate this, one can use a visited array that stores all previously visited positions so they are not checked again. See the code below.
private static void replaceAll(double number, double replacementTerm, int i, int j) {
    double searchFor = number;
    double replace = replacementTerm;
    if (twoDimArray2[i][j] == searchFor) {
        System.out.println("Replaced An Element!");
        twoDimArray2[i][j] = replace;
        System.out.println(twoDimArray2[i][j]);
    }
    if (i == twoDimArray2.length - 1 && j == twoDimArray2[0].length - 1) {
        System.out.println("Reached the end!");
        return;
    }
    if (i + 1 < twoDimArray2.length) {
        replaceAll(number, replacementTerm, i + 1, j);
    }
    if (j + 1 < twoDimArray2[0].length) {
        replaceAll(number, replacementTerm, i, j + 1);
    }
}
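If each cell should be visited exactly once without a visited array, one option is to recurse in row-major order instead of branching right and down. This is only a sketch under assumptions: the class name ReplaceAllSketch is made up, and the array is passed as a parameter rather than read from the twoDimArray2 field used above.

```java
// Hypothetical variant: the grid is a parameter instead of the static field twoDimArray2.
final class ReplaceAllSketch {
    // Replaces every occurrence of target from cell (i, j) onwards, walking row by row.
    // Returns the number of replaced cells.
    static int replaceAll(double[][] grid, double target, double replacement, int i, int j) {
        if (i == grid.length) return 0;            // past the last row: done
        if (j == grid[i].length) {                 // past the end of this row: next row
            return replaceAll(grid, target, replacement, i + 1, 0);
        }
        int replaced = 0;
        if (grid[i][j] == target) {
            grid[i][j] = replacement;
            replaced = 1;
        }
        return replaced + replaceAll(grid, target, replacement, i, j + 1);
    }
}
```

A call such as replaceAll(grid, 1.0, 9.0, 0, 0) starts at the top-left cell. The recursion depth grows with the number of cells, so for large arrays the two nested for loops remain the safer choice.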

Given a number n, list all n-digit numbers such that each number does not have repeating digits

I'm trying to solve the following problem. Given an integer n, list all n-digit numbers such that no number has repeating digits.
For example, if n is 4, then the output is as follows:
0123
0124
0125
...
9875
9876
Total number of 4-digit numbers is 5040
My present approach is by brute-force. I can generate all n-digit numbers, then, using a Set, list all numbers with no repeating digits. However, I'm pretty sure there is a faster, better and more elegant way of doing this.
I'm programming in Java, but I can read source code in C.
Thanks

Mathematically, you have 10 options for the first digit, 9 for the second, 8 for the third, and 7 for the fourth. So, 10 * 9 * 8 * 7 = 5040.
Programmatically, you can generate these with some combinations logic. A functional approach usually keeps the code cleaner: build up a new string recursively rather than repeatedly modifying an existing string through a StringBuilder or array.
Example Code
The following code generates the permutations without reusing digits and without any extra set, map, etc.
public class LockerNumberNoRepeats {
    public static void main(String[] args) {
        System.out.println("Total combinations = " + permutations(4));
    }

    public static int permutations(int targetLength) {
        return permutations("", "0123456789", targetLength);
    }

    private static int permutations(String c, String r, int targetLength) {
        if (c.length() == targetLength) {
            System.out.println(c);
            return 1;
        }
        int sum = 0;
        for (int i = 0; i < r.length(); ++i) {
            sum += permutations(c + r.charAt(i), r.substring(0, i) + r.substring(i + 1), targetLength);
        }
        return sum;
    }
}
Output:
...
9875
9876
Total combinations = 5040
Explanation
Pulling this from a comment by @Rick as it was very well said and helps to clarify the solution.
So to explain what is happening here - it's recursing a function which takes three parameters: a list of digits we've already used (the string we're building - c), a list of digits we haven't used yet (the string r) and the target depth or length. Then when a digit is used, it is added to c and removed from r for subsequent recursive calls, so you don't need to check if it is already used, because you only pass in those which haven't already been used.

It's easy to find a formula:
if n=1 there are 10 variants.
if n=2 there are 9*10 variants.
if n=3 there are 8*9*10 variants.
if n=4 there are 7*8*9*10 variants.
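The pattern above is the product 10 * 9 * ... * (10 - n + 1). A small sketch of that product follows; the class name NoRepeatCountSketch and the choice to return 0 outside 1..10 are assumptions for illustration.

```java
// Hypothetical helper that only counts the strings; it does not generate them.
final class NoRepeatCountSketch {
    // Number of n-digit strings over 0-9 with no repeated digit.
    static long count(int n) {
        if (n < 1 || n > 10) return 0;   // with more than 10 digits, one must repeat
        long total = 1;
        for (int k = 0; k < n; k++) {
            total *= 10 - k;             // 10 choices, then 9, then 8, ...
        }
        return total;
    }
}
```

count(4) gives 5040, matching the total in the question.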

Note the symmetry here:
0123
0124
...
9875
9876
9876 = 9999 - 123
9875 = 9999 - 124
So for starters you can chop the work in half.
It's possible that you might be able to find a regex which covers scenarios such that if a digit occurs twice in the same string then it matches/fails.
Whether the regex will be faster or not, who knows?
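One possible form of such a regex uses a backreference: capture a digit, then require the same digit again later in the string. This is a sketch only; the class and method names are made up, and it still has to be applied to every candidate, so it does not avoid the brute-force enumeration.

```java
import java.util.regex.Pattern;

// Hypothetical filter for candidates that contain a repeated digit.
final class RepeatedDigitRegexSketch {
    // (\d) captures one digit, .* skips anything, \1 requires the same digit again.
    private static final Pattern REPEATED = Pattern.compile("(\\d).*\\1");

    static boolean hasRepeatedDigit(String s) {
        return REPEATED.matcher(s).find();
    }
}
```

With this, hasRepeatedDigit("0120") is true and hasRepeatedDigit("0123") is false.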
Specifically for four digits you could have nested For loops:
for (int i = 0; i < 10; i++) {
    for (int j = 0; j < 10; j++) {
        if (j != i) {
            for (int k = 0; k < 10; k++) {
                if ((k != j) && (k != i)) {
                    for (int m = 0; m < 10; m++) {
                        if ((m != k) && (m != j) && (m != i)) {
                            someStringCollection.add((((("" + i) + j) + k) + m));
(etc)
Alternatively, for a more generalised solution, this is a good example of the handy-dandy nature of recursion. E.g. you have a function which takes the list of previous digits and the required depth. If the number of digits so far is less than the depth, loop over ten iterations (one for each value of the digit you're adding); if the digit isn't in the list already, add it to the list and recurse. If you're at the correct depth, just concatenate all the digits in the list and add the result to the collection of valid strings.
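The recursive idea described above could look roughly like this. It is a sketch under assumptions: the names NoRepeatDigitsSketch, generate and build are invented, and a boolean[] of used digits stands in for searching the list of previous digits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical implementation of "previous digits plus required depth".
final class NoRepeatDigitsSketch {
    static List<String> generate(int depth) {
        List<String> out = new ArrayList<>();
        build(new StringBuilder(), new boolean[10], depth, out);
        return out;
    }

    private static void build(StringBuilder prefix, boolean[] used, int depth, List<String> out) {
        if (prefix.length() == depth) {      // correct depth: record the digits
            out.add(prefix.toString());
            return;
        }
        for (int d = 0; d < 10; d++) {       // try each value for the next digit
            if (used[d]) continue;           // already among the previous digits
            used[d] = true;
            prefix.append((char) ('0' + d));
            build(prefix, used, depth, out);
            prefix.setLength(prefix.length() - 1);  // undo before trying the next digit
            used[d] = false;
        }
    }
}
```

generate(4) produces 5040 strings, from 0123 through 9876, in increasing order.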

The backtracking method is also a brute-force method.
private static int pickAndSet(byte[] used, int last) {
    if (last >= 0) used[last] = 0;
    int start = (last < 0) ? 0 : last + 1;
    for (int i = start; i < used.length; i++) {
        if (used[i] == 0) {
            used[i] = 1;
            return i;
        }
    }
    return -1;
}

public static int get_series(int n) {
    if (n < 1 || n > 10) return 0;
    byte[] used = new byte[10];
    int[] result = new int[n];
    char[] output = new char[n];
    int idx = 0;
    boolean dirForward = true;
    int count = 0;
    while (true) {
        result[idx] = pickAndSet(used, dirForward ? -1 : result[idx]);
        if (result[idx] < 0) { // fail, should rewind.
            if (idx == 0) break; // the rewind at index zero failed, so everything has been tried.
            dirForward = false;
            idx--;
            continue;
        } else { // forward.
            dirForward = true;
        }
        idx++;
        if (n == idx) {
            for (int k = 0; k < result.length; k++) output[k] = (char) ('0' + result[k]);
            System.out.println(output);
            count++;
            dirForward = false;
            idx--;
        }
    }
    return count;
}

How can I get Space complexity O(n) while looking for the longest common substring [DP]?

Hello!
I'm trying to find the longest common substring of two strings with good time and space complexity, using dynamic programming. I could find a solution with O(n^2) time and space complexity:
public static String LCS(String s1, String s2){
    int maxlen = 0; // stores the max length of LCS
    int m = s1.length();
    int n = s2.length();
    int endingIndex = m; // stores the ending index of LCS in s1
    // lookup[i][j] stores the length of the longest common suffix of
    // s1[0..i-1] and s2[0..j-1]
    int[][] lookup = new int[m + 1][n + 1];
    // fill the lookup table in bottom-up manner
    for (int i = 1; i <= m; i++)
    {
        for (int j = 1; j <= n; j++)
        {
            // if current character of s1 and s2 matches
            if (s1.charAt(i - 1) == s2.charAt(j - 1))
            {
                lookup[i][j] = lookup[i - 1][j - 1] + 1;
                // update the maximum length and ending index
                if (lookup[i][j] > maxlen)
                {
                    maxlen = lookup[i][j];
                    endingIndex = i;
                }
            }
        }
    }
    // return longest common substring having length maxlen
    return s1.substring(endingIndex - maxlen, endingIndex);
}
My question is: How can I get better space complexity?
Thanks in advance!

The best time complexity you can get for finding the LCS of two strings with dynamic programming is O(n^2). I tried to find another algorithm for the problem, since it was one of my university projects, but the best I could find was an algorithm with O(n^3) complexity. The basic solution to this problem uses a recurrence relation, which uses less space but does far more work. As with the Fibonacci series, dynamic programming is used to reduce the time complexity.
The recurrence relation code:
#include <string>
using std::string;

// lengthFrstInp and lengthSecInp are the indexes of the last characters still to compare.
// Note: this recursion allows gaps between matched characters (the longest common
// subsequence recurrence), and lcs is built from the end, so it comes out reversed.
void calculateLCS(string &lcs, char frstInp[], char secInp[], int lengthFrstInp, int lengthSecInp) {
    if (lengthFrstInp == -1 || lengthSecInp == -1)
        return;
    if (frstInp[lengthFrstInp] == secInp[lengthSecInp]) {
        lcs += frstInp[lengthFrstInp];
        lengthFrstInp--;
        lengthSecInp--;
        calculateLCS(lcs, frstInp, secInp, lengthFrstInp, lengthSecInp);
    }
    else {
        string lcs1 = "";
        string lcs2 = "";
        lcs1 = lcs;
        lcs2 = lcs;
        calculateLCS(lcs1, frstInp, secInp, lengthFrstInp, lengthSecInp - 1);
        calculateLCS(lcs2, frstInp, secInp, lengthFrstInp - 1, lengthSecInp);
        if (lcs1.size() >= lcs2.size())
            lcs = lcs1;
        else
            lcs = lcs2;
    }
}
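On the space question itself: each cell of the table in the question only depends on the cell diagonally above and to the left, so a single row is enough. The sketch below is one way to do that in Java, assuming the same table definition as in the question; the class and method names are made up. Time stays O(m * n), while extra space drops to O(n).

```java
// Hypothetical single-row version of the lookup table from the question.
final class LongestCommonSubstringSketch {
    static String longestCommonSubstring(String s1, String s2) {
        int m = s1.length();
        int n = s2.length();
        // row[j] = length of the longest common suffix of s1[0..i-1] and s2[0..j-1]
        int[] row = new int[n + 1];
        int maxLen = 0;
        int endIndex = 0;   // end (exclusive) of the best match inside s1
        for (int i = 1; i <= m; i++) {
            // Walk right to left so row[j - 1] still holds the value from row i - 1.
            for (int j = n; j >= 1; j--) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    row[j] = row[j - 1] + 1;
                    if (row[j] > maxLen) {
                        maxLen = row[j];
                        endIndex = i;
                    }
                } else {
                    row[j] = 0;   // must be reset, since the row is reused
                }
            }
        }
        return s1.substring(endIndex - maxLen, endIndex);
    }
}
```

For example, longestCommonSubstring("ABABC", "BABCA") returns "BABC". Passing the shorter string as s2 keeps the row as small as possible.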

Improving Program Efficiency

public class cowcode {
    public static void main(String[] args) {
        long index = 1000000;
        String line = "HELLO";
        boolean found = false;
        if (index <= line.length())
            found = true;
        while (!found) {
            line += buildString(line);
            if (index <= line.length())
                found = true;
        }
        if (found)
            System.out.println("" + charAt(line, index - 1));
    }

    // Returns str rotated right by one character
    public static String buildString(String str){
        String temp = "" + str.charAt(str.length() - 1);
        for (int i = 0; i < str.length() - 1; i++){
            temp += str.charAt(i);
        }
        return temp;
    }

    public static String charAt(String line, long index){
        for (int i = 0; i < line.length(); i++){
            if (i == index)
                return line.charAt(i) + "";
        }
        return "";
    }
}
Hey! The code above works perfectly fine. However the only problem is runtime.
The objective of this program is to build a string from "HELLO" that eventually has a length of at least index. This is done by rotating the string to the right and concatenating the original string and the rotated version ("HELLO" --> "HELLOOHELL"). This process does not stop until the index that the program is looking for is found in the string (so in this example, the string becomes "HELLOOHELLLHELLOOHEL" after going through the loop twice).
Do you guys see anything that could be eliminated/shortened to improve runtime?

What I guess is killing you is all of the String concatenations you're doing in buildString. You can cut it down to this:
public static String buildString(String str){
    return str.charAt(str.length() - 1) + str.substring(0, str.length() - 1);
}

You need to calculate the index without actually building the string. The right half of the composite string is rotated; the left half is not. If the index is in the left half of the string, you can just throw away the right half, which simplifies the situation. If the index is in the right half, you can transform it into an index in the left half: you just need to undo the rotation of the right half. So you rotate the index left by one character, then subtract the length of half of the string, and you have an index in the left half. That situation is already described above, so you shorten the string and start again from the beginning. In the end you are left with a string that is not composed: it is the original string. Now you can address the character directly, because the index is within range of the string.
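long index = 1000000 - 1;
String line = "HELLO";
int len = line.length();
// Find the length of the smallest composite string that contains index
long len2 = len;
while (len2 <= index) {
    len2 *= 2;
}
// Halve the string until only the original remains
while (len2 > len) {
    long lenhalf = len2 / 2;
    if (index >= lenhalf) {
        // Move into the left half and undo the rotation
        index -= lenhalf;
        index -= 1;
        if (index < 0) {
            index += lenhalf;
        }
    }
    len2 = lenhalf;
}
System.out.println(line.charAt((int) index));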
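To gain some confidence in the index arithmetic, it can be wrapped in a method and compared with the string-building approach for small indexes. This is a sketch only; the class RotatedDoublingSketch and both method names are assumptions, and the base string is assumed to be non-empty.

```java
// Hypothetical wrapper around the halving idea, plus a brute-force reference.
final class RotatedDoublingSketch {
    // Character at 0-based index of the string obtained by repeatedly appending
    // a right-rotated copy of the string to itself. base must not be empty.
    static char charAtIndex(String base, long index) {
        long len = base.length();
        while (len <= index) len *= 2;           // smallest composite length containing index
        while (len > base.length()) {
            long half = len / 2;
            if (index >= half) {
                long pos = index - half;         // position inside the rotated right half
                index = (pos - 1 + half) % half; // undo the rotation by one character
            }
            len = half;
        }
        return base.charAt((int) index);
    }

    // Builds the string explicitly; only suitable for small indexes.
    static char bruteForce(String base, int index) {
        String s = base;
        while (s.length() <= index) {
            s = s + s.charAt(s.length() - 1) + s.substring(0, s.length() - 1);
        }
        return s.charAt(index);
    }
}
```

For "HELLO", index 10 lands on the first character of the second rotated copy in "HELLOOHELLLHELLOOHEL", and both methods return 'L'.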

O(log n) Programming

I am trying to prepare for a contest but my program speed is always dreadfully slow as I use O(n). First of all, I don't even know how to make it O(log n), or I've never heard about this paradigm. Where can I learn about this?
For example,
If you had an integer array of zeroes and ones, such as [ 0, 0, 0, 1, 0, 1 ], and you wanted to replace every 0 with 1 only if one of its neighbors has the value 1, what is the most efficient way to do this if it must be repeated t times?
EDIT:
Here's my inefficient solution:
import java.util.Scanner;

public class Main {
    static Scanner input = new Scanner(System.in);

    public static void main(String[] args) {
        int n;
        long t;
        n = input.nextInt();
        t = input.nextLong();
        input.nextLine();
        int[] units = new int[n + 2];
        String inputted = input.nextLine();
        input.close();
        for (int i = 1; i <= n; i++) {
            units[i] = Integer.parseInt(("" + inputted.charAt(i - 1)));
        }
        int[] original;
        for (int j = 0; j <= t - 1; j++) {
            units[0] = units[n];
            units[n + 1] = units[1];
            original = units.clone();
            for (int i = 1; i <= n; i++) {
                if (((original[i - 1] == 0) && (original[i + 1] == 1)) || ((original[i - 1] == 1) && (original[i + 1] == 0))) {
                    units[i] = 1;
                } else {
                    units[i] = 0;
                }
            }
        }
        for (int i = 1; i <= n; i++) {
            System.out.print(units[i]);
        }
    }
}

This is an elementary cellular automaton. Such a dynamical system has properties that you can use to your advantage. In your case, for example, you can set to 1 every cell at distance at most t from any initial 1 (the light-cone property). Then you may do something like:
Take a 1 in the original sequence, say at position p.
Set every position from p-t to p+t to 1.
In the next step you can take advantage of the fact that positions p-t to p+t have already been set. This lets you compute the state at step t without computing the intermediate steps (a good speed-up, isn't it?).
You can also use tricks such as HashLife.
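The light-cone idea can be sketched as follows, under two loud assumptions: the rule is the one stated in the question's text (a 0 becomes 1 when a neighbor is 1, and a 1 stays 1), not the XOR rule in the posted code, and the array is not circular. The class and method names are made up. A cell ends up as 1 exactly when some original 1 lies within distance t, so two passes over the array are enough.

```java
// Hypothetical O(n) computation for the spreading rule on a non-circular array.
final class SpreadOnesSketch {
    static int[] afterSteps(int[] cells, long t) {
        int n = cells.length;
        final long far = Long.MAX_VALUE / 2;      // stands for "no 1 seen yet"; t is assumed smaller
        long[] dist = new long[n];                // distance to the nearest original 1
        long d = far;
        for (int i = 0; i < n; i++) {             // nearest 1 on the left
            d = (cells[i] == 1) ? 0 : Math.min(far, d + 1);
            dist[i] = d;
        }
        d = far;
        for (int i = n - 1; i >= 0; i--) {        // nearest 1 on the right
            d = (cells[i] == 1) ? 0 : Math.min(far, d + 1);
            dist[i] = Math.min(dist[i], d);
        }
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = dist[i] <= t ? 1 : 0;     // inside some light cone of width t
        }
        return result;
    }
}
```

For [0, 0, 0, 1, 0, 1] and t = 1 this gives [0, 0, 1, 1, 1, 1], regardless of how large t is the cost stays O(n).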

As I was saying in the comments, I'm fairly sure you can leave out the array and clone operations.
You can modify a StringBuilder in place, so there is no need to convert back and forth between int[] and String.
For example (note: each generation is an O(N) pass, so T generations take on the order of O(N * T)):
public static void main(String[] args) {
    System.out.println(conway1d("0000001", 7, 1));
    System.out.println(conway1d("01011", 5, 3));
}

private static String conway1d(CharSequence input, int N, long T) {
    System.out.println("Generation 0: " + input);
    StringBuilder sb = new StringBuilder(input); // Will update this for all generations
    StringBuilder copy = new StringBuilder(); // store a copy to reference current generation
    for (int gen = 1; gen <= T; gen++) {
        // Copy over next generation string
        copy.setLength(0);
        copy.append(input);
        for (int i = 0; i < N; i++) {
            conwayUpdate(sb, copy, i, N);
        }
        input = sb.toString(); // next generation string
        System.out.printf("Generation %d: %s\n", gen, input);
    }
    return input.toString();
}

private static void conwayUpdate(StringBuilder nextGen, final StringBuilder currentGen, int charPos, int N) {
    int prev = (N + (charPos - 1)) % N;
    int next = (charPos + 1) % N;
    // **Exactly one** adjacent '1'
    boolean adjacent = currentGen.charAt(prev) == '1' ^ currentGen.charAt(next) == '1';
    nextGen.setCharAt(charPos, adjacent ? '1' : '0'); // set cell as alive or dead
}
For the two samples in the problem you posted in the comments, this code generates this output.
Generation 0: 0000001
Generation 1: 1000010
1000010
Generation 0: 01011
Generation 1: 00011
Generation 2: 10111
Generation 3: 10100
10100

Big-O notation is a simplification that helps in understanding the complexity of an algorithm. Basically, two O(n) algorithms can have very different execution times. Why? Let's unroll your example:
You have two nested loops. The outer loop runs t times.
The inner loop runs n times.
Each iteration of the inner loop takes a constant time k.
So, in essence, your algorithm is O(k * t * n). If t is of the same order of magnitude as n, you can consider the complexity to be O(k * n^2).
There are two approaches to optimizing this algorithm:
Reduce the constant time k. For example, do not clone the whole array on each iteration, because it is very time consuming (clone has to loop over the full array).
The second optimization in this case is to use dynamic programming (https://en.wikipedia.org/wiki/Dynamic_programming), which can cache information between iterations and optimize the execution; that can lower k or even lower the complexity from O(n^2) to O(n * log n).
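One concrete way to get below O(n * t) for the rule in the question's code, where each cell becomes the XOR of its two circular neighbors, is to jump by powers of two. Note that this is a different technique from the caching described above, offered as a sketch: for this XOR rule, 2^k generations are equivalent to XOR-ing the cells 2^k positions to the left and right, so t generations need only one pass per set bit of t, or O(n * log t) in total. The class and method names are assumptions, and t is assumed to be non-negative.

```java
// Hypothetical power-of-two jump for the circular "left XOR right" rule.
final class XorNeighbourJumpSketch {
    static int[] afterSteps(int[] cells, long t) {
        int n = cells.length;
        int[] current = cells.clone();
        if (n == 0) return current;
        for (int bit = 0; bit < 63 && (t >> bit) != 0; bit++) {
            if (((t >> bit) & 1L) == 0) continue;      // this power of two is not part of t
            int shift = (int) ((1L << bit) % n);       // 2^bit generations look 2^bit cells away
            int[] next = new int[n];
            for (int i = 0; i < n; i++) {
                int left = current[(i - shift + n) % n];
                int right = current[(i + shift) % n];
                next[i] = left ^ right;
            }
            current = next;
        }
        return current;
    }
}
```

For 01011 with t = 3 this yields 10100, and for 0000001 with t = 1 it yields 1000010, the same as the generation-by-generation output shown earlier.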
