Implement inner join using simple nested loop in Java - java

I was asked in an interview how to implement an inner join using nested for loops in Java. I found a hash join implementation at https://rosettacode.org/wiki/Hash_join, but I couldn't find anything explaining a simple nested-loop implementation of an inner join. I tried implementing it myself but got stuck in a few places, as noted in the code comments.
/**
*
* @param R
* @param index1 Join column for table R.
* @param S
* @param index2 Join column for table S.
* @return
*/
public String[][] innerJoin(String[][] R, int index1, String[][] S, int index2) {
// How should the result array be defined? What should its size be? Is the code below correct?
String[][] result = new String[R.length + S.length][R[0].length + S[0].length];
// loop through both the tables to find out when the join column have common values.
// output those common values.
for (int i = 0; i < R.length; i++) {
for (int j = 0; j < S.length; j++) {
if (R[i][index1] == S[j][index2]) {
// How to combine both tables here ???
}
}
}
return result;
}

You have correctly identified 3 important issues in the question code:
how do you calculate the size of the result table?
how do you find matches?
when you find a match, how do you add it to your result table?
The easy way to size the result is to store matches somewhere else and count them before returning. For this, ArrayList<String[]> is a better fit than String[][], because you can append to an ArrayList but cannot change the size of an array.
Finding matches with a double loop is indeed very inefficient, O(nm), but hey, if that is what they want, it can certainly be done. It would be more efficient to sort on the join columns first and then work on the sorted data (O(n log n + m log m + n log m), with O(n+m) extra memory), or to build hash tables and use them (O(n + m + n) = O(n + m), not counting the size of the output).
Choosing what to return depends on what the columns represent, and if there are any duplicates. You could, for example, decide on the following format:
as 1st column, the contents of index1
all columns (except index1 one) from the 1st table
all the columns (except index2) from the second table.
Note that the choice of format is somewhat arbitrary; you could have left index1 in its place and just omitted the join column from the columns of table 2. In any case, with the previous answers, you would get:
import java.util.ArrayList;

public String[][] innerJoin(String[][] R, int index1, String[][] S, int index2) {
// temporary storage for matches
ArrayList<String[]> matches = new ArrayList<>();
// loop through both tables to find rows whose join columns hold equal values,
// and collect the combined rows.
for (int i = 0; i < R.length; i++) {
for (int j = 0; j < S.length; j++) {
// compare string contents with equals(); == would only compare object references
if (R[i][index1].equals(S[j][index2])) {
matches.add(combine(R[i], S[j], index1, index2));
}
}
}
// convert matches to expected output array
return matches.toArray(new String[matches.size()][]);
}
private String[] combine(String[] one, String[] two, int index1, int index2) {
String[] r = new String[one.length + two.length - 1];
int pos = 0;
r[pos ++] = one[index1];
for (int i=0; i<one.length; i++) if (i != index1) r[pos ++] = one[i];
for (int i=0; i<two.length; i++) if (i != index2) r[pos ++] = two[i];
return r;
}
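For comparison, the hash-table alternative mentioned above could look roughly like the following. This is only a minimal sketch: the class name HashJoinSketch and the method name hashJoin are made up for illustration, it assumes the same row format as the combine() method above, and it assumes no join value in R is null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HashJoinSketch {

    // Hypothetical helper: same parameters and output format as innerJoin above.
    static String[][] hashJoin(String[][] R, int index1, String[][] S, int index2) {
        // Build phase: group the rows of S by their join-column value.
        Map<String, List<String[]>> byKey = new HashMap<>();
        for (String[] s : S) {
            byKey.computeIfAbsent(s[index2], k -> new ArrayList<>()).add(s);
        }
        // Probe phase: look up each row of R instead of scanning all of S.
        List<String[]> matches = new ArrayList<>();
        for (String[] r : R) {
            List<String[]> partners = byKey.getOrDefault(r[index1], Collections.emptyList());
            for (String[] s : partners) {
                matches.add(combine(r, s, index1, index2));
            }
        }
        return matches.toArray(new String[0][]);
    }

    // Same layout as combine above: join value first, then the remaining columns.
    static String[] combine(String[] one, String[] two, int index1, int index2) {
        String[] r = new String[one.length + two.length - 1];
        int pos = 0;
        r[pos++] = one[index1];
        for (int i = 0; i < one.length; i++) if (i != index1) r[pos++] = one[i];
        for (int i = 0; i < two.length; i++) if (i != index2) r[pos++] = two[i];
        return r;
    }
}
```

The build and probe phases each touch every input row once, so the extra cost beyond that is only the number of matching pairs that have to be written out.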

I will try to give you some hints:
The length of the result array is not the sum of the lengths of the R and S tables. Depending on the contents of the tables, it could be anywhere from 0 up to R.length * S.length.
The number of "columns" in the result array is indeed R[0].length + S[0].length (as long as the arrays are "real" tables and do not have a variable number of "columns" per "row").
In your loop (in the if block), you should:
In the current "output" line of the result array (starting with 0), set the first rl = R[0].length columns (0 ... rl - 1) to the contents of the R[i][0] ... R[i][rl - 1] columns
Then, with sl = S[0].length, set the next sl columns (rl ... rl + sl - 1) to the contents of the S[j][0] ... S[j][sl - 1] columns
Increment a counter for the current "output" line in the result array
In the end, it is just some array offset arithmetic ;-)
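Put together, the hints above could be sketched as follows. This is one possible reading, not the only one: the class name ArrayJoinSketch is made up, both tables are assumed to be rectangular, and the oversized result array is trimmed with Arrays.copyOf at the end so the caller does not see empty rows.

```java
import java.util.Arrays;

final class ArrayJoinSketch {

    static String[][] innerJoin(String[][] R, int index1, String[][] S, int index2) {
        if (R.length == 0 || S.length == 0) {
            return new String[0][]; // nothing can match
        }
        int rl = R[0].length;
        int sl = S[0].length;
        // Worst case: every row of R matches every row of S.
        String[][] result = new String[R.length * S.length][];
        int count = 0; // index of the current "output" line
        for (int i = 0; i < R.length; i++) {
            for (int j = 0; j < S.length; j++) {
                if (R[i][index1].equals(S[j][index2])) {
                    String[] row = new String[rl + sl];
                    System.arraycopy(R[i], 0, row, 0, rl);  // columns 0 .. rl - 1
                    System.arraycopy(S[j], 0, row, rl, sl); // columns rl .. rl + sl - 1
                    result[count++] = row;
                }
            }
        }
        // Keep only the rows that were actually filled.
        return Arrays.copyOf(result, count);
    }
}
```

Unlike the format in the other answer, this keeps the join column twice (once from each table), which is what the offsets in the hints describe.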

Related

How to find most profitable Path in 2-Dimensional Array

I'm trying to implement a game where the viable moves are down-left and down-right.
The parameter for the function is for the size of the array, so if you pass 4 it will be a 4 by 4 array.
The starting position is the top row from any column. Every element in the array is a number in the range 1-100, taken from a file. I need to find the resulting value for the most profitable route from any starting column.
My current implementation will compare the right position and left position and move to whichever is higher. The problem is, for example, if the left position is lower in value than the right, but the left position will provide more profit in the long run since it can access higher value elements, my algorithm fails.
Here is a demo:
84 (53) 40 62
*42* 14 [41] 57
76 *47* 80 [95]
Suppose we start at 53. The numbers enclosed in * are the moves my algorithm takes, but the numbers enclosed in [] are the moves it should take.
This is my code:
import java.util.ArrayList;
import java.util.Scanner;
public class bestPathGame{
private int[][] grid;
private int n;
public bestPathGame(int num){
Scanner input = new Scanner(System.in);
n = num;
grid = new int[n][n];
for(int i = 0; i < n; i++){
for(int j = 0; j < n; j++){
grid[i][j] = input.nextInt();
}
}
}
public static void main(String[] args){
bestPathGame obj = new bestPathGame(Integer.parseInt(args[0]));
obj.bestPath();
}
private boolean moveLeftBetter(int r,int c){
if(c <= 0){
return false;
} else if (c >= n -1 ){
return true;
}
return grid[r][c-1] > grid[r][c+1];
}
public void bestPath(){
ArrayList<Integer> allOptions = new ArrayList<>();
for(int k = 0; k < n; k++){
int row = 0;
int col = k;
int collection = grid[row][col];
while(row < n - 1){
row += 1;
if(moveLeftBetter(row,col)){
col-=1;
} else{
col+=1;
}
collection += grid[row][col];
}
allOptions.add(collection);
}
System.out.println(allOptions.stream().reduce((a,b)->Integer.max(a,b)).get());
}
}
Greedy algorithm vs Dynamic programming
There's an issue with the logic of your solution.
Basically, what you have implemented is called a greedy algorithm. At each step of the iteration, you pick the result that is optimal locally, assuming that this choice will lead to the optimal global result. I.e. your code is based on the assumption that by choosing a local maximum between the two columns, you will get the correct global maximum.
As a consequence, at almost every iteration the code in your bestPath() method discards a whole branch of paths based on only one next value. This approach can lead to incorrect results, especially with large matrices.
Greedy algorithms are guaranteed to be optimal only for problems with a suitable structure; for many problems, including this one, their result may be close to the optimum but not exact. On the upside, they run fast, typically in O(n) time.
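To make the difference concrete, here is a small sketch that runs both strategies on the demo grid from the question, starting at 53. The class and method names are made up for illustration, and the grid is assumed to have at least two columns so that every cell has at least one valid move.

```java
final class GreedyVsExhaustive {

    // The 3 x 4 demo grid from the question.
    static final int[][] GRID = {
        {84, 53, 40, 62},
        {42, 14, 41, 57},
        {76, 47, 80, 95}
    };

    // Greedy: always step to the larger of the two reachable cells in the next row.
    static int greedyFrom(int[][] grid, int col) {
        int sum = grid[0][col];
        for (int row = 1; row < grid.length; row++) {
            boolean canLeft = col > 0;
            boolean canRight = col < grid[row].length - 1;
            if (canLeft && (!canRight || grid[row][col - 1] > grid[row][col + 1])) {
                col--;
            } else {
                col++;
            }
            sum += grid[row][col];
        }
        return sum;
    }

    // Exhaustive: try both moves at every step and keep the better total.
    static int bestFrom(int[][] grid, int row, int col) {
        if (col < 0 || col >= grid[row].length) {
            return Integer.MIN_VALUE; // off the grid: not a valid path
        }
        if (row == grid.length - 1) {
            return grid[row][col];
        }
        int below = Math.max(bestFrom(grid, row + 1, col - 1),
                             bestFrom(grid, row + 1, col + 1));
        return grid[row][col] + below;
    }

    public static void main(String[] args) {
        System.out.println("greedy: " + greedyFrom(GRID, 1));  // 53 + 42 + 47 = 142
        System.out.println("best:   " + bestFrom(GRID, 0, 1)); // 53 + 41 + 95 = 189
    }
}
```

The greedy walk takes 42 over 41 in the second row and ends up unable to reach 95, which is exactly the failure described in the question.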
For this problem, you need to use dynamic programming (DP).
In short, DP is an enhanced brute-force approach that caches results and reuses them instead of recalculating the same values multiple times. Like a regular brute-force algorithm, a DP algorithm still accounts for all possible combinations.
There are two major approaches in dynamic programming: tabulation and memoization (take a look at this post for more information).
Tabulation
When implementing tabulation, you first create an array that is prepopulated (completely or partially). Tabulation is also called the bottom-up approach because the calculation starts from the elementary edge cases. Every possible outcome is computed from the previously obtained values while iterating over this array. The final result is usually stored in the last cell (in this case, in the last row).
To implement the tabulation, we create a matrix of the same size and copy all the values from the given matrix into it. Then, row by row, every cell is populated with the maximum possible profit that can be obtained by reaching this cell from the first row.
I.e. every iteration produces a solution for a 2D array that grows by one row at each step. It starts from the array that consists of only the first row (no changes are needed); then, to get the profit for every cell in the second row, its value is combined with the best reachable value from the first row (that is a valid solution for a 2D array of size 2 * n), and so on. That way the solution develops gradually, and the last row will contain the maximum result for every cell.
This is how the code will look:
// requires: import java.util.Arrays;
public static int getMaxProfitTabulation(int[][] matrix) {
int[][] tab = new int[matrix.length][]; // rows are filled in by the copy loop below
for (int row = 0; row < tab.length; row++) { // populating the tab to preserve the matrix intact
tab[row] = Arrays.copyOf(matrix[row], matrix[row].length);
}
for (int row = 1; row < tab.length; row++) {
for (int col = 0; col < tab[row].length; col++) {
if (col == 0) { // index on the left is invalid
tab[row][col] += tab[row - 1][col + 1];
} else if (col == matrix[row].length - 1) { // index on the right is invalid
tab[row][col] += tab[row - 1][col - 1];
} else {
tab[row][col] += Math.max(tab[row - 1][col - 1], tab[row - 1][col + 1]); // max between left and right
}
}
}
return getMax(tab);
}
Helper method responsible for extracting the maximum value from the last row (if you want to utilize streams for that, use IntStream.of(tab[tab.length - 1]).max().orElse(-1);).
public static int getMax(int[][] tab) {
int result = -1;
for (int col = 0; col < tab[tab.length - 1].length; col++) {
result = Math.max(tab[tab.length - 1][col], result);
}
return result;
}
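Since each row of the table depends only on the row directly above it, the full copy of the matrix is not strictly needed. As a sketch of that idea (not part of the original solution; the class name is made up, and the matrix is assumed to be rectangular with at least one row and at least two columns), the same result can be computed while keeping only one previous row:

```java
final class RollingRowTabulation {

    static int maxProfit(int[][] matrix) {
        int[] prev = matrix[0].clone(); // best totals for paths ending in row 0
        for (int row = 1; row < matrix.length; row++) {
            int width = matrix[row].length;
            int[] curr = new int[width];
            for (int col = 0; col < width; col++) {
                int best = Integer.MIN_VALUE;
                if (col > 0) {
                    best = Math.max(best, prev[col - 1]); // arrived from the upper left
                }
                if (col < width - 1) {
                    best = Math.max(best, prev[col + 1]); // arrived from the upper right
                }
                curr[col] = matrix[row][col] + best;
            }
            prev = curr;
        }
        int result = prev[0];
        for (int value : prev) {
            result = Math.max(result, value);
        }
        return result;
    }
}
```

For the 3 x 4 demo grid from the question this yields 198 (62, then 41, then 95), using O(n) extra memory instead of O(n * n).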
Memoization
The second option is to use Memoization, also called the top-down approach.
As I said, DP is an improved brute-force algorithm. Memoization is based on the recursive solution that generates all possible outcomes, enhanced by a HashMap that stores the previously calculated result for every cell (i.e. for every previously encountered unique combination of row and column).
Recursion starts with the first row. The base case of the recursion (the condition that terminates it, represented by a simple edge case for which the output is known in advance) for this task is when the recursive call hits the last row: row == matrix.length - 1.
Otherwise, the HashMap is checked to see whether it already contains a result. If it does not, all possible combinations are evaluated and the best result is placed into the HashMap so that it can be reused, and only then does the method return.
Note that tabulation is usually preferred over memoization, because recursion has significant limitations, especially in Java (deep recursion can end in a StackOverflowError). But recursive solutions are sometimes easier to come up with, so it's completely OK to use one when you need to test an idea or to prove that an iterative solution is working correctly.
The implementation will look like this:
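// requires: import java.util.HashMap; import java.util.Map;
public static int getMaxProfitMemoization(int[][] matrix) {
int result = 0;
// one map shared by all starting columns, so results computed for one start are reused by the next
Map<String, Integer> memo = new HashMap<>();
for (int i = 0; i < matrix[0].length; i++) {
result = Math.max(result, maxProfitHelper(matrix, 0, i, memo));
}
return result;
}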
public static int maxProfitHelper(int[][] matrix, int row, int col,
Map<String, Integer> memo) {
if (row == matrix.length - 1) { // base case
return matrix[row][col];
}
String key = getKey(row, col);
if (memo.containsKey(key)) { // if cell was already encountered result will be reused
return memo.get(key);
}
int result = matrix[row][col]; // otherwise result needs to be calculated
if (col == matrix[row].length - 1) { // index on the right is invalid
result += maxProfitHelper(matrix, row + 1, col - 1, memo);
} else if (col == 0) { // index on the left is invalid
result += maxProfitHelper(matrix, row + 1, col + 1, memo);
} else {
result += Math.max(maxProfitHelper(matrix, row + 1, col - 1, memo),
maxProfitHelper(matrix, row + 1, col + 1, memo));
}
memo.put(key, result); // placing result in the map
return memo.get(key);
}
public static String getKey(int row, int col) {
return row + " " + col;
}
Method main() and a matrix-generator used for testing purposes.
public static void main(String[] args) {
int[][] matrix = generateMatrix(100, new Random());
System.out.println("Tabulation: " + getMaxProfitTabulation(matrix));
System.out.println("Memoization: " + getMaxProfitMemoization(matrix));
}
public static int[][] generateMatrix(int size, Random random) {
int[][] result = new int[size][size];
for (int row = 0; row < result.length; row++) {
for (int col = 0; col < result[row].length; col++) {
result[row][col] = random.nextInt(1, 101);
}
}
return result;
}

Smallest element in largest row

I came across this problem in class and I'm stuck on it. I did plenty of research, but I haven't been able to fix my code.
I need to create a matrix and find the smallest value in the row that contains the largest value (I believe this element is called minimax). I'm trying to do it with a simple 3 x 3 matrix. What I have so far:
Scanner val = new Scanner(System.in);
int matrizVal[][] = new int[3][3];
for (int a = 0; a < matrizVal.length; a++) {
for (int b = 0; b < matrizVal.length; b++) {
System.out.print("(" + a + ", " + b + "): ");
matrizVal[a][b] = val.nextInt();
}
}
int largest = matrizVal[0][0];
int largestrow = 0;
int arr[] = new int[2];
for (int row = 0; row < matrizVal.length; row++){
for (int col = 0; col < matrizVal.length; col++){
if (largest < matrizVal[row][col]){
largest = matrizVal[row][col];
largestrow = row;
}
}
}
To find the so called minimax element I decided to create a for each loop and get all the values of largestrow except the largest one.
for (int i : matrizVal[largestrow]){
if (i != largest){
System.out.print(i);
}
}
Here's where I'm stuck! I'd simply like to 'sort' this integer and take the first value and that'd be the minimax. I'm thinking about creating an array of size [matrizVal.length - 1], but not sure if it's gonna work.
I did a lot of research on the subject but nothing seems to help. Any tips are welcome.
(I don't think it is but I apologize if it's a duplicate)
Given the code you have provided, matrizVal[largestrow] should be the row of the matrix that contains the highest valued element.
Given that your task is to extract the smallest value in this array, there are a number of options.
If you want to simply extract the minimum value, a naive approach would go similarly to how you determined the maximum value, just with one less dimension.
For example:
int min = matrizVal[largestrow][0];
for (int i = 0; i < matrizVal[largestrow].length; i++) { // iterate over the columns of that row
if (matrizVal[largestrow][i] < min) {
min = matrizVal[largestrow][i];
}
}
// min will be the target value
Alternatively, if you want to sort the array such that the first element of the array is always the smallest, first ensure that you're making a copy of the array so as to avoid mutating the original matrix. Then feel free to use any sorting algorithm of your choice. Arrays.sort() should probably suffice.
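A minimal sketch of the copy-then-sort variant, assuming the row is non-empty (the class and method names here are only illustrative):

```java
import java.util.Arrays;

final class SortedRowMin {

    // Returns the smallest value of the given row without reordering the matrix.
    static int smallestInRow(int[][] matrix, int rowIndex) {
        int[] copy = matrix[rowIndex].clone(); // sort a copy, not the matrix row itself
        Arrays.sort(copy);                     // ascending order
        return copy[0];
    }
}
```

With the code from the question, this would be called as smallestInRow(matrizVal, largestrow).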
You can simplify your approach by scanning each row for the maximum and minimum values in that row and then deciding what to do with those values based on the maximum value found in previous rows. Something like this (untested) should work:
int largestValue = Integer.MIN_VALUE;
int smallestValue = 0; // anything, really
for (int[] row : matrizVal) {
// First find the largest and smallest value for this row
int largestRowValue = Integer.MIN_VALUE;
int smallestRowValue = Integer.MAX_VALUE;
for (int val : row) {
smallestRowValue = Math.min(smallestRowValue, val);
largestRowValue = Math.max(largestRowValue, val);
}
// now check whether we found a new highest value
if (largestRowValue > largestValue) {
largestValue = largestRowValue;
smallestValue = smallestRowValue;
}
}
This doesn't record the row index, since it didn't sound like you needed to find that. If you do, then replace the outer enhanced for loop with a loop that uses an explicit index (as in your current code) and record the index as well.
I wouldn't bother with any sorting, since that (1) destroys the order of the original data (or introduces the expense of making a copy) and (2) has higher complexity than a one-time scan through the data.
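If the row index is needed as well, the single scan could be adapted roughly like this. It is a sketch under a few assumptions: every row is non-empty, the first row wins when two rows share the largest value, and the class and method names are made up for illustration.

```java
final class MinimaxScan {

    // Returns {rowIndex, smallestValueInThatRow} for the row holding the largest value.
    static int[] minOfRowWithMax(int[][] matrix) {
        int bestRow = -1;
        int largest = Integer.MIN_VALUE;
        int smallest = 0; // overwritten as soon as the first row is scanned
        for (int r = 0; r < matrix.length; r++) {
            int rowMax = Integer.MIN_VALUE;
            int rowMin = Integer.MAX_VALUE;
            for (int v : matrix[r]) {
                rowMin = Math.min(rowMin, v);
                rowMax = Math.max(rowMax, v);
            }
            // Strictly greater, so on ties the earlier row is kept.
            if (bestRow == -1 || rowMax > largest) {
                bestRow = r;
                largest = rowMax;
                smallest = rowMin;
            }
        }
        return new int[] {bestRow, smallest};
    }
}
```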
You may want to consider an alternative using Java 8 streams:
int[] maxRow = Arrays.stream(matrizVal).max(getCompertator()).get();
int minValue = Arrays.stream(maxRow).min().getAsInt();
where getCompertator() is defined by:
private static Comparator<? super int[]> getCompertator() {
return (a1, a2)->
Integer.compare(Arrays.stream(a1).max().getAsInt(),
Arrays.stream(a2).max().getAsInt()) ;
}
Note that the desired output is undefined when two rows contain the same highest value, so this may not return the row you expect in that case.

Iterating a 2D Array using Java 8

This is an implementation of the 0-1 Knapsack Problem. The problem statement goes like this,
You're given two arrays, one containing the weights of a set of items and the other containing values for the respective weights. You're provided a max weight. Within the constraint of staying under the max weight, determine the maximum value that you can obtain by selecting or not selecting the set of items.
The values and the weight list will always be of the same size.
This is my solution, which generally works fine (though not on edge cases).
public static int getCombination(int[] weights, int[] values, int maxWeight){
int[][] memo = new int[weights.length][maxWeight + 1];
for (int i = 0; i < memo.length; i++) {
for(int j=0; j < memo[i].length; j++){
if(j == 0){
memo[i][j] = 0;
}else if(weights[i] > j){
if(i == 0) memo[i][j] = 0;
else memo[i][j] = memo[i-1][j];
}else{
if(i == 0){
memo[i][j] = values[i];
}
else{
memo[i][j] = Integer.max((values[i] + memo[i-1][j- weights[i]]), memo[i-1][j]);
}
}
}
}
return memo[weights.length -1][maxWeight];
}
Now I want to rewrite this complete logic in a declarative manner using Java 8 streams and lambdas. Can someone help me with that?
Since your for loop based solution is completely fine, streams do not add much value here if you only convert the for loops to forEach streams.
You get more out of using streams, if you use IntStream and the toArray method, because you can concentrate on calculating the value based on row and column index, and not care about filling it into the array.
int[][] memo = IntStream.range(0, rows)
.mapToObj(r -> IntStream.range(0, cols)
.map(c -> computeValue(r, c))
.toArray())
.toArray(int[][]::new); // an array constructor reference takes no dimensions
Here, we create an array for each row, and then put those into a 2D-array at the end. As you can see, the toArray() methods take care of filling the arrays.
Actually, now that I looked at your method to calculate the values more closely, I realize that streams might be difficult if not impossible to use in this case. The problem is that you need values from previous columns and rows to calculate the current value. This is not possible in my solution precisely because we only create the arrays at the end. More specifically, my approach is stateless, i.e. you do not remember the result of previous iterations.
You could see if you can use Stream.reduce() to achieve your goal instead.
BTW, your approach is fine. If you don't want to parallelize this, you are good to go.
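One middle ground, offered only as a sketch and not as the declarative rewrite the question asks for: each row of the table depends only on the row above it, so the outer loop over items can stay a plain loop while each new row is produced by an IntStream. The class name KnapsackRows is made up; the method keeps the name and parameters from the question, and item weights are assumed to be non-negative.

```java
import java.util.stream.IntStream;

final class KnapsackRows {

    static int getCombination(int[] weights, int[] values, int maxWeight) {
        int[] prev = new int[maxWeight + 1]; // best values when no item is available
        for (int i = 0; i < weights.length; i++) {
            final int w = weights[i];
            final int v = values[i];
            final int[] above = prev; // final reference to the previous row for the lambda
            // Build the row for item i from the previous row only.
            prev = IntStream.rangeClosed(0, maxWeight)
                    .map(j -> w > j ? above[j] : Math.max(above[j], v + above[j - w]))
                    .toArray();
        }
        return prev[maxWeight];
    }
}
```

Because every cell in a row reads only from the previous row, the inner stream has no shared mutable state; the sequential dependency between rows is what the outer loop preserves.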
Here's a possible starting point to create the indices into your array:
int rows = 3;
int cols = 4;
int[][] memo = new int[rows][cols];
IntStream.range(0, rows * cols).forEach(n -> {
int i = n / cols;
int j = n % cols;
System.out.println("(" + i + "," + j + ")");
});

Print characters as a Matrix

The problem below takes a list of characters and a number of columns as input. The number of columns is not constant and can vary with every input.
The output should have all rows fully occupied except for the last one.
list: a b c d e f g
columns: 3
Wrong:
a b c
d e f
g
Wrong:
a d g
b e
c f
Correct:
a d f
b e g
c
I have tried below:
public static void printPatern(List<Character> list, int cols) {
for (int i = 0; i < cols; i++) {
for (int j = i; j < list.size(); j += cols) {
System.out.print(list.get(j));
}
System.out.println();
}
}
It gives output as (which is wrong):
a d g
b e
c f
I am trying to come up with an algorithm to print the correct output. I want to know the different ways to solve this problem. Time and space complexity don't matter. Also, the method I tried is wrong because it takes the number of columns as its parameter but actually uses it as the number of rows.
FYI: This is not a HOMEWORK problem.
I was finally able to design an algorithm for this problem.
Please refer to the Java code below:
public class puzzle{
public static void main(String[] args){
String list[] = { "a", "b", "c","d","e","f","g","h","i","j" };
int column = 3;
int rows = list.length/column; //Calculate total full rows
int lastRowElement = list.length%column;//identify number of elements in last row
if(lastRowElement >0){
rows++; // add the incomplete row to the number of full rows
}
//Iterate over rows
for (int i = 0; i < rows; i++) {
int j=i;
int columnIndex = 1;
while(j < list.length && columnIndex <=column ){
System.out.print("\t"+list[j]);
if(columnIndex<=lastRowElement){
if(i==rows-1 && columnIndex==lastRowElement){
j=list.length; //for last row display nothing after column index reaches to number of elements in last row
}else{
j += rows; //for other rows if columnIndex is less than or equal to number of elements in last row then add j value by number of rows
}
}else {
if(lastRowElement==0){
j += rows;
}else{
j += rows-1; //for column greater than number of element in last row add j = row-1 as last row will not having the column for this column index.
}
}
columnIndex++;//Increase column Index by 1;
}
System.out.println();
}
}
}
This is probably homework; so I am not going to do it for you, but give you some hints to get going. There are two points here:
computing the correct number of rows
computing the "pattern" that you need when looping your list so that you print the expected result
For the first part, you can look into the modulo operation; and for the second part: start iterating your list "on paper" and observe how you are printing the correct result manually.
Obviously, that second part is the more complicated one. It might help if you realize that printing "column by column" is straightforward. So when we take your correct example and print the indexes instead of the values, you get:
0 3 5
1 4 6
2
Do that repeatedly for different input; and you will soon discover the pattern of indexes that you need to print "row by row".
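For readers who want to check their result afterwards, the index pattern can be sketched as follows. This is just one way to express it, assuming cols is positive; the class and method names are hypothetical, and the method returns the lines instead of printing them so they are easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnMajorLayout {

    static List<String> layout(List<Character> list, int cols) {
        int n = list.size();
        int rows = (n + cols - 1) / cols;          // number of rows, rounded up
        int lastRowCount = n - (rows - 1) * cols;  // elements in the last row
        List<String> lines = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            int width = (r == rows - 1) ? lastRowCount : cols;
            StringBuilder line = new StringBuilder();
            for (int c = 0; c < width; c++) {
                // The first lastRowCount columns hold `rows` elements each,
                // every later column holds rows - 1 elements.
                int start = Math.min(c, lastRowCount) * rows
                        + Math.max(0, c - lastRowCount) * (rows - 1);
                if (c > 0) {
                    line.append(' ');
                }
                line.append(list.get(start + r));
            }
            lines.add(line.toString());
        }
        return lines;
    }
}
```

For the list a b c d e f g with 3 columns this produces the lines "a d f", "b e g" and "c".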

Java - How to join multiple values in a multidimensional array?

I'm a beginner programmer, and I have fallen into a rabbit hole trying to understand how to use arrays. I'm trying to create a table with 7 rows and 5 columns using a multidimensional array.
column 1 = will take values from the user input. This input is stored in an array.
column 2 = will print the highest input in that array.
column 3 = will print the lowest input in that array.
column 4 = will print the incremental total, i.e. current input + previous input.
column 5 = will print the incremental average, i.e. Total/Index.
Complete Code below
import java.util.Scanner;
public class Numbers {
public static void main(String[] args)
{
//----User Input - Add values to array ----//
Scanner keyboard = new Scanner(System.in);
int places = 7;
int [] values = new int [places];
int sum = 0;
System.out.println("Enter a Numbers:");
for (int count = 0; count < places; count++)
{
values[count] = keyboard.nextInt();
System.out.println(values[count]);
}
//----------------Total---------------------//
for (int numb : values)
{
sum = sum + numb;
}
System.out.println("\n Total:" + sum);
//----------Average------------------------/
double avg = 0;
if (values.length > 0)
{
avg = sum / values.length;
}
System.out.printf("\n Average:"+ avg);
//---------Table Start---------------------//
int [] [] table = new int [7][5];
for (int row =0; row < 7; row++)
for (int column = 0; column < 5; column++)
table [row][column] = getTable(column, column, column, column, avg);
System.out.println("\n\nIndex\tInput\tHigest\tLowest\tTotal\tAverage");
for (int row = 0; row < 7; row++)
{
System.out.print((row + 1) + " ");
for (int column = 0; column < 5; column++)
System.out.print("\t " + table[row][column] + " ");
System.out.println();
}
}
public static int getTable(int input, int highest, int lowest, int total, double average) {
/* TO DO:
* - add each user input from values array into Input column
*
* - add highest/lowest values in the Highest/Lowest column
*
* - add each of the array element total in Total column - this column should take previous Total plus current total.
* i.e Total = Total + Input
*
* - add Average - Current average value.
*/
return 0;
}
}
What I don't know is how to get my code to fill each of the rows using different values each time.
For Example:
Index 1
1st column take first value of the values array
2nd column: take highest value of the values array
3rd column: take lowest value of the values array
4th column: take previous element of values array plus the current value
5th column: take total/index
I know that I may need to create a method to get my program to loop through the values, but I just don't know how to do it. I've tried a few different ways, but I'm just getting confused. The left corner of the screenshot below shows what the columns look like. Notice that they all contain 0, which I know comes from the getTable method I created, since returning 0 is all it does.
Basically, in this code, you're looping over all the columns:
for (int row =0; row < 7; row++)
for (int column = 0; column < 5; column++)
table [row][column] = getTable(column, column, column, column, avg);
You don't want to do this. Looping over all the columns would make sense if you were doing pretty much the same thing with each column. But you're not. You want each column to have the result of a very different computation. So it would make more sense to say something like
for (int row = 0; row < table.length; row++) {
table[row][0] = getFirstValue(values);
table[row][1] = getHighestValue(values);
table[row][2] = getLowestValue(values);
...
and so on. (However, I don't really understand how "values" is supposed to be used. You're inputting one set of values, but you're creating a table with 7 rows based on that one set of values. Perhaps there are more things wrong with your code.)
Note a couple of things: (1) I replaced 7 with table.length in the loop. table.length is the number of rows, and will be 7. But if you change things to use a different number of rows, then using table.length means you don't have to change the for loop. (2) My code passes values as a parameter to the different methods, which is necessary because the methods will be making computations on the input values. Your code didn't pass values to getTable(), so there's no way getTable() could have performed any computations, since it didn't have the data.
The code could be improved further. One way would be to define constants in the class like
private static final int FIRST_VALUE_COLUMN = 0;
private static final int HIGHEST_VALUE_COLUMN = 1;
...
table[row][FIRST_VALUE_COLUMN] = getFirstValue(values);
table[row][HIGHEST_VALUE_COLUMN] = getHighestValue(values);
which would be more readable.
A more significant improvement would be not to use a 2-D array at all. Since you have five values with different meanings, the normal approach in Java would be to create a class with five instance variables to hold the computed data:
public class ComputedData {
private int firstValue;
private int highestValue;
private int lowestValue;
public void setFirstValue(int firstValue) {
this.firstValue = firstValue;
}
public int getFirstValue() {
return firstValue;
}
// similarly for other fields
}
table would then be a 1-dimensional array of ComputedData.
This is better because now you don't have to assign meaningless column numbers to different computed values. Instead, the names tell you just what each computed value is. Also, it means you can add new values later that don't have to be int. With an array, all elements in the array have to be the same type (you can use Object which can then hold a value of any type, but that can make the code messier). In fact, you may decide later that the "average" should be a double instead of an int since averages of integers aren't always integers. You can make this change pretty easily by changing the type of an instance variable in the ComputedData class. If you use an array, though, this kind of change gets pretty complicated.
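As a rough illustration of the class-based approach, here is a sketch that fills one object per input. It rests on an assumption about what the question wants, namely that row i should hold the i-th input together with the highest, lowest, total and average of all inputs up to and including that one. The class name RunningStats and its field names are made up for this example.

```java
final class RunningStats {
    final int input;
    final int highest;
    final int lowest;
    final int total;
    final double average; // a double, since averages of integers aren't always integers

    RunningStats(int input, int highest, int lowest, int total, double average) {
        this.input = input;
        this.highest = highest;
        this.lowest = lowest;
        this.total = total;
        this.average = average;
    }

    // Assumption: each row summarizes the inputs seen so far (running values).
    static RunningStats[] build(int[] values) {
        RunningStats[] table = new RunningStats[values.length];
        int highest = Integer.MIN_VALUE;
        int lowest = Integer.MAX_VALUE;
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            highest = Math.max(highest, values[i]);
            lowest = Math.min(lowest, values[i]);
            total += values[i];
            double average = (double) total / (i + 1); // cast avoids integer division
            table[i] = new RunningStats(values[i], highest, lowest, total, average);
        }
        return table;
    }
}
```

Printing the table is then a single loop over the array that reads the named fields, with no column numbers involved.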
