Check if two strings contain a similar substring - java

I am trying to write a program, and part of that program is finding a similarity between two strings. The program asks the user how many matching letters in a row it should look for within the strings.
For example:
string1 = aghilfamjijasrnlklk;
string2 = dfdfkkjhklkfnajnvfo;
user types 3,
program prints:
klk is similar in both
starts at index 16, ends at 19 in string 1
starts at index 8, ends at 11 in string 2
What I have tried:
for (int i = 0; i <= search; i++) {
if (string1.regionMatches(i, string2, 0, substringlength)) {
found = true;
System.out.print("Match found!");
break;
}
}
I would have done more, but I am at a complete standstill and do not know what to do; I am fairly new at coding.

I am guessing the way to do this is to use the LCS algorithm (longest common subsequence) and at the end just take the first X chars that the algorithm gives you. (Of course this isn't the computationally fastest approach, but the algorithm has been almost completely written by others.)
As an example, LCS(computer, uouthgr) will give you the string "outr"; if the user wanted only 3 chars, give them the substring "out".
Here is the LCS code I found on the net (it needs modifications to serve your purpose):
public class LCS {
public static void main(String[] args) {
// StdIn is a helper class from Princeton's standard library, not part of the JDK
String x = StdIn.readString();
String y = StdIn.readString();
int M = x.length();
int N = y.length();
// opt[i][j] = length of LCS of x[i..M] and y[j..N]
int[][] opt = new int[M+1][N+1];
// compute length of LCS and all subproblems via dynamic programming
for (int i = M-1; i >= 0; i--) {
for (int j = N-1; j >= 0; j--) {
if (x.charAt(i) == y.charAt(j))
opt[i][j] = opt[i+1][j+1] + 1;
else
opt[i][j] = Math.max(opt[i+1][j], opt[i][j+1]);
}
}
// recover LCS itself and print it to standard output
int i = 0, j = 0;
while(i < M && j < N) {
if (x.charAt(i) == y.charAt(j)) {
System.out.print(x.charAt(i));
i++;
j++;
}
else if (opt[i+1][j] >= opt[i][j+1]) i++;
else j++;
}
System.out.println();
}
}
For more detail I recommend reading:
https://en.wikipedia.org/wiki/Longest_common_subsequence_problem
but if you are like me, I would recommend even more looking for a video on the subject on YouTube!
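Since a subsequence need not be contiguous ("outr" is not a contiguous run in "computer"), a direct search for a shared contiguous run of the requested length may be closer to what the question asks. The following is only a minimal sketch under that assumption; the class name CommonSubstring and the method findCommon are made up for illustration, and it reuses the regionMatches call from the question's own attempt while varying the start offset in both strings.

```java
public class CommonSubstring {
    // Returns {startInA, startInB} for the first run of `length` characters
    // that appears in both strings, or null if there is none.
    // (Hypothetical helper, brute force: O(a.length * b.length * length).)
    public static int[] findCommon(String a, String b, int length) {
        for (int i = 0; i + length <= a.length(); i++) {
            for (int j = 0; j + length <= b.length(); j++) {
                if (a.regionMatches(i, b, j, length)) {
                    return new int[] {i, j};
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String string1 = "aghilfamjijasrnlklk";
        String string2 = "dfdfkkjhklkfnajnvfo";
        int n = 3;
        int[] hit = findCommon(string1, string2, n);
        if (hit == null) {
            System.out.println("No common substring of length " + n);
        } else {
            System.out.println(string1.substring(hit[0], hit[0] + n) + " is similar in both");
            System.out.println("starts at index " + hit[0] + ", ends at " + (hit[0] + n) + " in string 1");
            System.out.println("starts at index " + hit[1] + ", ends at " + (hit[1] + n) + " in string 2");
        }
    }
}
```

The end index is exclusive here, which matches the 16/19 and 8/11 figures in the question's example.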

Related

How to find frequency of characters in a string without using array in java

Given a String, I want to create a frequency distribution of characters in the String. That is, for each distinct character in the string, I want to count how many times it occurs.
Output is a String that consists of zero or more occurrences of the pattern xd, where x is a character from the source String, and d is the number of occurrences of x within the String. Each x in the output should occur once.
The challenge is to do this without using an array or Collection.
Examples:
Source: "aasdddr" Result: "a2s1d3r1"
Source: "aabacc" Result: "a3b1c2"
Source: "aasdddraabcdaa" Result: "a6s1d4r1b1c1"
I tried this way:
String str = "aasdddr", result = "";
int counter = 0;
for(int i = 0; i < str.length(); i++){
result += "" + str.charAt(i);
for(int j = 1; j < str.length(); j++){
if(str.charAt(i) == str.charAt(j)){
counter++;
}
}
result += counter;
}
System.out.println(result);
My output is a1a2s3d6d9d12r13
Finally, I found the solution. But I think any question has more than one solution.
First, we declare an empty string to hold the result. We use a nested loop: the outer loop keeps one character fixed while the inner loop compares it against every character in the string. The count variable is declared inside the outer loop so that it starts at zero for each new character and increases by one on every match. After the inner loop, a condition checks whether the result string already contains that character. If it does not, the character is appended to the result, followed by its frequency (count). After the outer loop finishes, we print the result.
public class FrequenciesOfChar {
public static void main(String[] args) {
String str = "aabcccd"; // be sure that you don't have any digit in your string
String result = ""; // this will hold new string
for (int i = 0; i < str.length(); i++) { // this will hold a character till be checked by inner loop
int count = 0; // put here so that it can be zero after each cycle for new character
for (int j = 0; j < str.length(); j++) { // this will change
if(str.charAt(i) == str.charAt(j)){ // this will check whether there is a same character
count++; // if there is a same character, count will increase
}
}
if( !(result.contains(""+str.charAt(i))) ){ // this checks if result doesn't contain the checked character
result += ""+str.charAt(i); // first if result doesn't contain the checked character, character will be added
result += count; // then the character's frequency will be added
}
}
System.out.println(result);
}
}
Run Result:
aabcccd - a2b1c3d1
First, counter needs to be reset inside the for loop. Each time you encounter a character in the source String, you want to restart the counter. Otherwise, as you have seen, the value of the counter is strictly increasing.
Now, think about what happens if a character occurs in more than one place in the source String, as in the "aasdddraabcdaa" example. A sequence of 1 or more a appears in 3 places. Because, at the time you get to the 2nd occurrence of a, a has been previously counted, you want to skip over it.
Because the source String cannot contain digits, the result String can be used to check if a particular character value has already been processed. So, after fixing the problem with counter, the code can be fixed by adding these two lines:
if (result.indexOf (source.charAt(i)) >= 0) {
continue; }
Here is the complete result:
package stackoverflowmisc;
public class StackOverflowMisc {
public static String freqDist(String source) {
String result = "";
int counter ;
for (int i = 0; i < source.length(); i++) {
if (result.indexOf (source.charAt(i)) >= 0) { continue; }
counter = 1;
result += source.charAt(i);
for (int j = i + 1; j < source.length(); j++) { // start after i: position i is already counted by counter = 1
if (source.charAt(i) == source.charAt(j)) {
counter++;
}
}
result += counter;
}
return result;
}
public static void main(String[] args) {
String [] test = {"aasdddr", "aabacc", "aasdddraabcdaa"};
for (int i = 0; i < test.length; ++i) {
System.out.println (test[i] + " - " + freqDist (test[i]));
}
System.out.println ("End of Program");
}
}
Run results:
aasdddr - a2s1d3r1
aabacc - a3b1c2
aasdddraabcdaa - a6s1d4r1b1c1
End of Program
In one of the Q&A comments, you said the source string can contain only letters. How would the program work if it were allowed to contain digits? You can't use the result String, because the processing inserts digits there. Again, this is an easy fix: Add a 3rd String to record which values have already been found:
public static String freqDist2(String source) {
String result = "", found = "";
int counter ;
for (int i = 0; i < source.length(); i++) {
if (found.indexOf (source.charAt(i)) >= 0) { continue; }
counter = 1;
result += source.charAt(i);
found += source.charAt(i);
for (int j = i + 1; j < source.length(); j++) { // start after i: position i is already counted by counter = 1
if (source.charAt(i) == source.charAt(j)) {
counter++;
}
}
result += counter;
}
return result;
}
Another possibility is to delete the corresponding characters from the source String as they are counted. If you are not allowed to modify the Source String, make a copy and use the copy.
Comment: I don't know if this is what your professor or whomever had in mind by placing the "No array" restriction, because a String is essentially built on a char array.
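The deletion idea mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the class name FreqByDeletion is invented, and it relies on String.replace to strip every occurrence of the current character from a working copy, so the count is simply the drop in length. Because the working copy shrinks, no "already seen" check is needed, and digits in the source are handled too.

```java
public class FreqByDeletion {
    // Builds the "xd" frequency string by repeatedly taking the first
    // remaining character, counting it, and deleting all its occurrences.
    public static String freqDist(String source) {
        String remaining = source;   // working copy; the caller's String is not modified
        String result = "";
        while (!remaining.isEmpty()) {
            char c = remaining.charAt(0);
            String without = remaining.replace(String.valueOf(c), "");
            int count = remaining.length() - without.length();
            result = result + c + count;   // String + char + int is plain concatenation
            remaining = without;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(freqDist("aasdddr"));
        System.out.println(freqDist("aasdddraabcdaa"));
    }
}
```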

How to rotate 2-D Array in Java

[SOLVED]
The title of this question is vague but hopefully this will clear things up.
Basically, what I am looking for is a solution to rotating this set of data. This data is set up in a specific way.
Here is an example of what the input and output would look like:
Input:
3
987
654
321
Output:
123
456
789
The '3' represents the number of columns and rows that will be used. If you input the number '4', you will be allowed to input 4 sets of 4 integers.
Input:
4
4567
3456
2345
1234
Output:
1234
2345
3456
4567
The goal is to find a way to rotate the data only if needed. You have to make sure the smallest corner number is at the top left. For example, for the code above, you rotated it so 1 is at the top left.
The problem I have is that I don't know how to rotate the data. I am only able to rotate the corners but not the sides. This is what my code does so far:
take the input of each line and turn them into strings
split those strings into separate characters
store those characters in an array
I just do not know how to compare those characters and in the end rotate the data.
Any help would be appreciated! Any questions will be answered.
A detailed description of the problem is here(problem J4).
This is just a challenge I assigned myself for practice for next year's contest, so giving me the answer won't "spoil" the question, but actually help me learn.
Here is my code so far:
import java.util.Scanner;
public class Main {
public static void main(String[] args) {
Scanner kb = new Scanner(System.in);
int max = kb.nextInt();
int maxSqrt = (max * max);
int num[] = new int[max];
String num_string[] = new String[max];
char num_char[] = new char[maxSqrt];
int counter = 0;
int counter_char = 0;
for (counter = 0; counter < max; counter++) {
num[counter] = kb.nextInt();
}
for (counter = 0; counter < max; counter++) {
num_string[counter] = Integer.toString(num[counter]);
}
int varPos = 0, rowPos = 0, charPos = 0, i = 0;
for (counter = 0; counter < maxSqrt; counter++) {
num_char[varPos] = num_string[rowPos].charAt(charPos);
i++;
if (i == max) {
rowPos++;
i = 0;
}
varPos++;
if (charPos == (max - 1)) {
charPos = 0;
} else {
charPos++;
}
}
//
for(int a = 0 ; a < max ; a++){
for(int b = 0 ; b < max ; b++)
{
num_char[counter_char] = num_string[a].charAt(b);
counter_char++;
}
}
//here is where the code should rotate the data
}
}
This is a standard 90 degree clockwise rotation for a 2D array.
I have provided the solution below, but first a few comments.
You said that you're doing this:
take the input of each line and turn them into strings
split those strings into separate characters
store those characters in an array
Firstly, you're essentially turning an int matrix into a character matrix. I do not think you need to do this: even if you do want to compare values, you can use the ints provided.
Secondly, there is no need to compare any 2 data elements in the matrix, since the rotation does not depend on any value.
Here is an adapted solution for java, originally written in C# by Nick Berardi on this question
private int[][] rotateClockWise(int[][] matrix) {
int size = matrix.length;
int[][] ret = new int[size][size];
for (int i = 0; i < size; ++i)
for (int j = 0; j < size; ++j)
ret[i][j] = matrix[size - j - 1][i]; //***
return ret;
}
If you wanted to do a counterclockwise rotation, replace the starred line with
ret[i][j] = matrix[j][size - i - 1];
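To connect the rotation back to the question's goal (smallest corner number at the top left), one possible sketch is to keep rotating clockwise until the top-left cell equals the minimum of the four corners. The class name RotateToSmallestCorner and the method normalize are invented for this illustration, and the sketch assumes a square matrix whose smallest corner value is unique, as in the question's examples.

```java
public class RotateToSmallestCorner {
    // 90-degree clockwise rotation, same formula as the answer above.
    public static int[][] rotateClockwise(int[][] matrix) {
        int size = matrix.length;
        int[][] ret = new int[size][size];
        for (int i = 0; i < size; ++i)
            for (int j = 0; j < size; ++j)
                ret[i][j] = matrix[size - j - 1][i];
        return ret;
    }

    // Rotates until the smallest of the four corner values is at the top left.
    public static int[][] normalize(int[][] matrix) {
        int n = matrix.length;
        if (n == 0) return matrix;
        int minCorner = Math.min(
                Math.min(matrix[0][0], matrix[0][n - 1]),
                Math.min(matrix[n - 1][0], matrix[n - 1][n - 1]));
        int[][] cur = matrix;
        // Each rotation cycles the corners, so this loop runs at most 3 times.
        while (cur[0][0] != minCorner) {
            cur = rotateClockwise(cur);
        }
        return cur;
    }

    public static void main(String[] args) {
        int[][] out = normalize(new int[][] {{9, 8, 7}, {6, 5, 4}, {3, 2, 1}});
        for (int[] row : out) {
            StringBuilder sb = new StringBuilder();
            for (int v : row) sb.append(v);
            System.out.println(sb);
        }
    }
}
```

No comparison between interior cells is needed; only the four corners decide how many quarter turns to apply.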

Algorithm for combinatorics (6 choose 2)

Following this question, I now want to code "6 choose 2" times "4 choose 2." By that I mean, let's say I have 6 characters "A B C D E F." The first time, I choose any two characters to delete. The second time, I want to choose 2 different letters to delete, and then I append the results of these two trials. Hence, I will get 90 ("6 choose 2" times "4 choose 2") eight-character strings. The characters in both patterns come from the same set {1,2,3,4,5,6}. All the characters are unique, with no repetition.
Here is what I have so far.
public String[] genDelPatterns(String design){
char[] data = design.toCharArray();
String[] deletionPatterns = new String[15];
int x = 0;
StringBuilder sb = new StringBuilder("");
int index = 0;
for(int i = 0; i < (6-1); i++){
for(int j = i+1; j < 6; j++){
for(int k= 0; k < 6; k++){
if((k != j) && (k != i))
sb.append(String.valueOf(data[k]));
}
deletionPatterns[x++] = sb.toString();
sb = new StringBuilder("");
}
}
return deletionPatterns;
}
public String[] gen8String(String[] pattern1, String[] pattern2){
String[] combinedPatterns = new String[225];
int k = 0;
for(int i = 0; i < 15; i++)
{
for(int j = 0; j < 15; j++)
combinedPatterns[k++] = pattern1[i] + pattern2[j];
}
return combinedPatterns;
}
I will be calling the methods like this:
gen8String(genDelPatterns("143256"), genDelPatterns("254316"));
Currently, I am generating all the possible 8 letter strings. But I want to only generate the 8 character strings according to the aforementioned specifications. I am really stuck on how I can elegantly do this multiplication. The only way I can think of is to make another method that does "4 choose 2" and then combine the 2 string arrays. But this seems very roundabout.
EDIT: An example of an 8 character string would be something like "14322516", given the inputs I have already entered when calling gen8String (143256, 254316). Note that the first 4 characters are derived from 143256 with the 5 and 6 deleted. But since I deleted 5 and 6 in the first trial, I am no longer allowed to delete the same things in the 2nd pattern. Hence, I deleted the 3 and 4 from the 2nd pattern.
You have a chain of methods, each one a variation of the one before.
So my advice is to use a recursive method!
To achieve your goal this way, you need a little experience with recursion.
A simple example of a method that exploits recursion:
public static long factorial(int n) {
if (n <= 1) return 1; // base case; also covers n == 0, which would otherwise recurse forever
return n * factorial(n-1);
}
I can also suggest passing (carefully constructed) objects as the method parameters if passing simple variables becomes too complex.
This is the heart of this solution, in my opinion.
While what you tried definitely works, it seems you are looking for another way to implement it. Here is the skeleton of what I would do, given the small constraints.
// Very pseudo code
// FOR(x,y,z) := for(int x=y; x<z;x++)
string removeCharacter(string s, int banA, int banB){
string ret = "";
FOR(i,1,7){
if(i != banA && i != banB){
ret += s[i];
}
}
return ret;
}
List<string> Generate(s1,s2){
List<string> ret = new List<string>();
FOR(i,1,7) FOR(j,i+1,7) FOR(m,1,7) FOR(n,m+1,7){
if(m != i && m != j && n != i && n != j){
string firstHalf = removeCharacter(s1,i,j);
string secondHalf = removeCharacter(s2,m,n);
ret.Add(firstHalf + secondHalf);
}
}
return ret;
}
This should generate all the possible 8-character strings.
Here is the solution I came up with. It doesn't really take a "mathematical" approach, I guess. But it does the job.
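A runnable Java version of the nested-loop idea might look like the sketch below. One deliberate difference, stated as an assumption: the pseudo code deletes by position, whereas this sketch deletes by character value, because the question's EDIT (5 and 6 removed from the first pattern, 3 and 4 from the second) describes deletion by value. The class name DeletionPatterns, the method names, and the extra alphabet parameter are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DeletionPatterns {
    // Returns s with every occurrence of the characters a and b removed.
    public static String removeChars(String s, char a, char b) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != a && c != b) sb.append(c);
        }
        return sb.toString();
    }

    // For every pair {i, j} of alphabet characters deleted from s1 and every
    // disjoint pair {m, k} deleted from s2, appends the two remainders.
    public static List<String> generate(String s1, String s2, String alphabet) {
        List<String> out = new ArrayList<>();
        int n = alphabet.length();
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int m = 0; m < n; m++)
                    for (int k = m + 1; k < n; k++) {
                        if (m == i || m == j || k == i || k == j) continue; // pairs must not overlap
                        String first = removeChars(s1, alphabet.charAt(i), alphabet.charAt(j));
                        String second = removeChars(s2, alphabet.charAt(m), alphabet.charAt(k));
                        out.add(first + second);
                    }
        return out;
    }

    public static void main(String[] args) {
        List<String> all = generate("143256", "254316", "123456");
        System.out.println(all.size());
        System.out.println(all.contains("14322516"));
    }
}
```

With a six-character alphabet the loops visit 15 first pairs and, for each, 6 disjoint second pairs, which is the "6 choose 2" times "4 choose 2" count from the question.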
//generating a subset of 90 eight character strings (unique deletion patterns)
public static String[] gen8String(String[] pattern1, String[] pattern2){
String[] combinedSubset = new String[90]; //empty array for the subset of 90 strings
String combinedString = ""; //string holder for each combined string
int index = 0; //used for combinedSubset array
int present = 0; //used to check if all 6 characters are present
for(int i = 0; i < 15; i++){
for(int j = 0; j < 15; j++){
combinedString = pattern1[i] + pattern2[j]; //combine both 4 letter strings into 8 char length string
char[] parsedString = combinedString.toCharArray(); //parse into array
//check if all 6 characters are present
for(int k = 1; k <= 6; k++)
{
if(new String(parsedString).contains(k+"")) {
present++;
}
else
break;
//if all 6 are present, then add it to combined subset
if(present == 6)
combinedSubset[index++] = combinedString;
}
present = 0;
}
}
return combinedSubset;
}

very simple sign frequency

I want to get the frequency of all 128 signs (ASCII) with the simplest code possible. No imports.
I am writing in Java (Eclipse), starting off like this:
public class Text {
public static void main (String[] args) {
then I want to calculate the frequency of each sign with a loop (preferably for loop). I know how to do this for a specific sign, e.g. the sign 'a' which is 97:
int a = 0;
for (int i = 0; i < s.length(); i++) { // s is a String
if (s.charAt(i) == 'a') {
a += 1;
}
}
System.out.println("a: " + a);
I need to create a table of all the signs (e.g. int[] p = new int p[1,2,3] - only for a string (or char?)) assign each index its number and then let a loop write out all the sign frequencies.
All this should be done only with loops and commands: .length, charAt().
Simply:
final String s = "Hello World!";
final int frequencies[] = new int[128];
for (int i = 0; i < s.length(); i++) {
final int ascii = (int) s.charAt(i);
frequencies[ascii]++;
}
(in response to user2974951's "answer")
That's the String representation of the array. Try printing with a loop instead:
for(int i = 0; i < frequencies.length; i++) {
System.out.println(frequencies[i]);
}
You can also try System.out.println(Arrays.toString(frequencies)); but that might look a bit ugly given the large amount of ASCII characters you are considering.
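Putting the counting loop and the printing loop together, a complete program could look like the sketch below. It uses no imports, as the question requests. The class name SignFrequency and the method countAscii are invented for illustration, and skipping characters outside the 0..127 range is an added assumption so that non-ASCII input does not cause an ArrayIndexOutOfBoundsException.

```java
public class SignFrequency {
    // Counts how often each of the 128 ASCII codes occurs in s.
    public static int[] countAscii(String s) {
        int[] frequencies = new int[128];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 128) {           // assumption: silently ignore non-ASCII characters
                frequencies[c]++;
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        int[] frequencies = countAscii("Hello World!");
        for (int i = 0; i < frequencies.length; i++) {
            if (frequencies[i] > 0) { // print only the signs that actually occur
                System.out.println((char) i + " (" + i + "): " + frequencies[i]);
            }
        }
    }
}
```

Dropping the `frequencies[i] > 0` check prints all 128 entries, at the cost of many zero lines and some unprintable control characters.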

Java function needed for finding the longest duplicated substring in a string?

I need a Java function to find the longest duplicated substring in a string.
For instance, if the input is "banana", the output should be "ana", and we also have to count the number of times it appears; in this case it is 2.
The solution is as below:
import java.util.Arrays;
import java.util.Vector;
public class Test{
public static void main(String[] args){
System.out.println(findLongestSubstring("i like ike"));
System.out.println(findLongestSubstring("madam i'm adam"));
System.out.println(findLongestSubstring("When life hands you lemonade, make lemons"));
System.out.println(findLongestSubstring("banana"));
}
public static String findLongestSubstring(String value) {
String[] strings = new String[value.length()];
String longestSub = "";
//strip off a character, add new string to array
for(int i = 0; i < value.length(); i++){
strings[i] = new String(value.substring(i));
}
//debug/visualization
//before sort
for(int i = 0; i < strings.length; i++){
System.out.println(strings[i]);
}
Arrays.sort(strings);
System.out.println();
//debug/visualization
//after sort
for(int i = 0; i < strings.length; i++){
System.out.println(strings[i]);
}
Vector<String> possibles = new Vector<String>();
String temp = "";
int curLength = 0, longestSoFar = 0;
/*
* now that the array is sorted compare the letters
* of the current index to those above, continue until
* you no longer have a match, check length and add
* it to the vector of possibilities
*/
for(int i = 1; i < strings.length; i++){
for(int j = 0; j < strings[i-1].length(); j++){
if (strings[i-1].charAt(j) != strings[i].charAt(j)){
break;
}
else{
temp += strings[i-1].charAt(j);
curLength++;
}
}
//this could alleviate the need for a vector
//since only the first and subsequent longest
//would be added; vector kept for simplicity
if (curLength >= longestSoFar){
longestSoFar = curLength;
possibles.add(temp);
}
temp = "";
curLength = 0;
}
System.out.println("Longest string length from possibles: " + longestSoFar);
//iterate through the vector to find the longest one
int max = 0;
for(int i = 0; i < possibles.size();i++){
//debug/visualization
System.out.println(possibles.elementAt(i));
if (possibles.elementAt(i).length() > max){
max = possibles.elementAt(i).length();
longestSub = possibles.elementAt(i);
}
}
System.out.println();
//concerned with whitespace up until this point
// "lemon" not " lemon" for example
return longestSub.trim();
}
}
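The code above returns the substring but not the number of times it appears, which the question also asks for. A small sketch of that counting step follows; the class name OccurrenceCount and the method countOccurrences are invented for illustration. It assumes overlapping matches should be counted, since the two occurrences of "ana" in "banana" overlap.

```java
public class OccurrenceCount {
    // Counts occurrences of needle in haystack, including overlapping ones.
    public static int countOccurrences(String haystack, String needle) {
        if (needle.isEmpty()) return 0;   // avoid counting the empty string at every position
        int count = 0;
        int from = haystack.indexOf(needle);
        while (from >= 0) {
            count++;
            // Advance by one character only, so overlapping matches are found too.
            from = haystack.indexOf(needle, from + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("banana", "ana"));
    }
}
```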
This is a common CS problem with a dynamic programming solution.
Edit (for lijie):
You are technically correct -- this is not the exact same problem. However this does not make the link above irrelevant and the same approach (w.r.t. dynamic programming in particular) can be used if both strings provided are the same -- only one modification needs to be made: don't consider the case along the diagonal. Or, as others have pointed out (e.g. LaGrandMere), use a suffix tree (also found in the above link).
Edit (for Deepak):
A Java implementation of the Longest Common Substring (using dynamic programming) can be found here. Note that you will need to modify it to ignore "the diagonal" (look at the Wikipedia diagram) or the longest common string will be itself!
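As a rough illustration of the "ignore the diagonal" modification, the sketch below runs the usual longest-common-substring table on a string against itself but only fills cells with j > i, so a position is never matched with itself. This is not the linked implementation; the class name LongestRepeated and the method name are invented, and the sketch allows the two occurrences to overlap (as "ana" does in "banana"). It uses O(n^2) time and memory.

```java
public class LongestRepeated {
    // dp[i][j] = length of the longest common suffix of s[0..i) and s[0..j), for i < j.
    public static String longestRepeatedSubstring(String s) {
        int n = s.length();
        int[][] dp = new int[n + 1][n + 1];
        int best = 0;       // length of the best match so far
        int endIndex = 0;   // exclusive end of the first occurrence of that match
        for (int i = 1; i <= n; i++) {
            for (int j = i + 1; j <= n; j++) {   // j > i skips the diagonal (and the mirrored half)
                if (s.charAt(i - 1) == s.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                    if (dp[i][j] > best) {
                        best = dp[i][j];
                        endIndex = i;
                    }
                }
            }
        }
        return s.substring(endIndex - best, endIndex);
    }

    public static void main(String[] args) {
        System.out.println(longestRepeatedSubstring("banana"));
    }
}
```

If no character repeats, the method returns the empty string.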
In Java : Suffix Tree.
Thanks to those who found out how to solve it; I didn't know.
