Intersection of two strings in Java

I need a Java function that finds the intersection of two strings, i.e. the characters common to both strings.
Example:
String s1 = new String("Sychelless");
String s2 = new String("Sydney");

Using HashSet<Character>:
HashSet<Character> h1 = new HashSet<Character>(), h2 = new HashSet<Character>();
for(int i = 0; i < s1.length(); i++)
{
h1.add(s1.charAt(i));
}
for(int i = 0; i < s2.length(); i++)
{
h2.add(s2.charAt(i));
}
h1.retainAll(h2);
Character[] res = h1.toArray(new Character[0]);
This is O(m + n), which is asymptotically optimal.

1. Extract the characters: String.toCharArray
2. Put them in a Set
3. Find the intersection: Set.retainAll

Most basic approach:
String wordA = "Sychelless";
String wordB = "Sydney";
String common = "";
for(int i=0;i<wordA.length();i++){
for(int j=0;j<wordB.length();j++){
if(wordA.charAt(i)==wordB.charAt(j)){
common += wordA.charAt(i)+" ";
break;
}
}
}
System.out.println("common is: "+common);

More detail on saugata's response (appeared while I was writing this): -
public static void main(String[] args) {
String s1 = "Seychelles";
String s2 = "Sydney";
Set<Character> ss1 = toSet(s1);
ss1.retainAll(toSet(s2));
System.out.println(ss1);
}
public static Set<Character> toSet(String s) {
Set<Character> ss = new HashSet<Character>(s.length());
for (char c : s.toCharArray())
ss.add(Character.valueOf(c));
return ss;
}

I think the algorithm you are looking for is the problem of the longest common subsequence
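For comparison, here is a minimal sketch of the longest common subsequence idea mentioned above. It is not a set intersection: it respects character order and repetition, so it only fits if that is what the question really needs. The class and method names (LcsSketch, lcs) are hypothetical and chosen for illustration.

```java
// Hypothetical helper class, for illustration only.
class LcsSketch {
    // Returns one longest common subsequence of a and b
    // using the classic O(m*n) dynamic-programming table.
    static String lcs(String a, String b) {
        int m = a.length(), n = b.length();
        // len[i][j] = LCS length of a.substring(i) and b.substring(j)
        int[][] len = new int[m + 1][n + 1];
        for (int i = m - 1; i >= 0; i--) {
            for (int j = n - 1; j >= 0; j--) {
                if (a.charAt(i) == b.charAt(j)) {
                    len[i][j] = len[i + 1][j + 1] + 1;
                } else {
                    len[i][j] = Math.max(len[i + 1][j], len[i][j + 1]);
                }
            }
        }
        // Walk the table from the top-left corner to rebuild one subsequence.
        StringBuilder sb = new StringBuilder();
        int i = 0, j = 0;
        while (i < m && j < n) {
            if (a.charAt(i) == b.charAt(j)) {
                sb.append(a.charAt(i));
                i++;
                j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++;
            } else {
                j++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lcs("Sychelless", "Sydney")); // prints "Sye"
    }
}
```

For these two inputs the result happens to contain the same characters as the set-based answers, but in general the two notions differ.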

The same question has been asked before; see:
Implementing an efficent algorithm to find the intersection of two strings

By means of Guava this task seems much easier:
String s1 = new String("Sychelless");
String s2 = new String("Sydney");
Set<String> setA = Sets.newHashSet(Splitter.fixedLength(1).split(s1));
Set<String> setB = Sets.newHashSet(Splitter.fixedLength(1).split(s2));
Sets.intersection(setA, setB);

Optimized solution:
public static String twoStrings(String s1, String s2){
HashSet<Character> stringOne = new HashSet<Character>(), stringTwo = new HashSet<Character>();
int stringOneLength = s1.length();
int stringTwoLength = s2.length();
for(int i=0; i<stringOneLength || i<stringTwoLength; i++) {
if(i < stringOneLength)
stringOne.add(s1.charAt(i));
if(i < stringTwoLength)
stringTwo.add(s2.charAt(i));
}
stringOne.retainAll(stringTwo);
return stringOne.toString();
}

I used a TreeSet and its retainAll() method to get the matching elements.
Oracle Doc:
retainAll(Collection<?> c)
Retains only the elements in this set that are contained in the
specified collection (optional operation).
String s1 = new String("Sychelless");
String s2 = new String("Sydney");
Set<Character> firstSet = new TreeSet<Character>();
for(int i = 0; i < s1.length(); i++) {
firstSet.add(s1.charAt(i));
}
Set<Character> anotherSet = new TreeSet<Character>();
for(int i = 0; i < s2.length(); i++) {
anotherSet.add(s2.charAt(i));
}
firstSet.retainAll(anotherSet);
System.out.println("Matched characters are " + firstSet.toString()); // print the common characters
//output > Matched characters are [S, e, y]

s1.contains(s2) returns true;
s1.indexOf(s2) returns 0.
s1.indexOf("foo") returns -1
For more sophisticated cases use class Pattern.

Related

How can I compare two strings and print their common letters without repeating a letter more than once?

I am comparing two strings and trying to print their common letters, but I cannot avoid repeating a letter more than once.
Here is my code:
public static String getCommonCharacters ( final String a, final String b){
String result="";
for(int i = 0; i < a.length(); i++){
for(int j = 0; j < b.length(); j++)
if(a.charAt(i)==b.charAt(j)){
result +=a.charAt(i);
}
}
return result;
}
The problem is that when a = "baac" and b = " fdeabac ", I get "aabaac" instead of "abc" or "bca", etc.

change the if condition to:
if (a.charAt(i) == b.charAt(j) &&
!result.contains(String.valueOf(a.charAt(i)))) { ... }
Thus, you only perform the statement:
result +=a.charAt(i);
if the accumulating string doesn't already contain the character.
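Put together, the question's method with the changed condition might look like the sketch below. The class name CommonChars is an assumption for illustration; the method body is the question's own loop plus the extra check.

```java
// Hypothetical wrapper class, for illustration only.
class CommonChars {
    // Same nested loops as in the question, plus the duplicate check:
    // a character is appended only if result does not contain it yet.
    static String getCommonCharacters(String a, String b) {
        String result = "";
        for (int i = 0; i < a.length(); i++) {
            for (int j = 0; j < b.length(); j++) {
                if (a.charAt(i) == b.charAt(j)
                        && !result.contains(String.valueOf(a.charAt(i)))) {
                    result += a.charAt(i);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(getCommonCharacters("baac", "fdeabac")); // prints "bac"
    }
}
```

The contains() call scans the accumulated string each time, which is fine for short inputs; a Set would be the usual choice for long ones.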

Working code with minor modification to yours:
public class StringCompare {
public static String getCommonCharacters(final String a, final String b) {
String result = "";
for (int i = 0; i < a.length(); i++) {
for (int j = 0; j < b.length(); j++)
if (a.charAt(i) == b.charAt(j)) {
result += a.charAt(i);
}
}
return result;
}
public static void main(String[] args) {
System.out.println(getCommonCharacters("baac", "fdeabac ").replaceAll(
"(.)\\1{1,}", "$1")); // You could use regular expressions for
// that. Removing repeated characters.
}
}
Output:
bac
Pattern explanation:
"(.)\1{1,}" means any character (added to group 1) followed by itself at least once
"$1" references contents of group 1
More about Regular Expressions Oracle Docs
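One caveat worth checking: the pattern only collapses repeats that are adjacent, so it removes all duplicates only when equal characters end up next to each other. A small sketch (the class and method names are made up for illustration):

```java
// Hypothetical demo class, for illustration only.
class AdjacentRepeatsDemo {
    // Collapses every run of the same character into a single character.
    static String collapseAdjacent(String s) {
        return s.replaceAll("(.)\\1{1,}", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseAdjacent("baaaac")); // prints "bac"
        // The two runs of 'a' are separated by 'b', so both survive:
        System.out.println(collapseAdjacent("aabaac")); // prints "abac"
    }
}
```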

Here is another solution: convert each String to a char array and add its characters to a HashSet with a for loop.
retainAll() then removes from the first set every element that is not contained in the specified collection (see the Oracle Javadoc).
The last for loop concatenates the remaining chars into a String.
String str ="";
Set<Character> s1 = new HashSet<Character>();
Set<Character> s2 = new HashSet<Character>();
for(char c:a.toCharArray()) s1.add(c);
for(char c:b.toCharArray()) s2.add(c);
s1.retainAll(s2);
for(char s:s1) str +=s;
return str;

Removing all the characters from a given string

How can I modify my code so that it removes all the characters of a given string (not just a single character) from another string in O(n)? If other data structures would help, please give a hint as well.
public static String removeChar(String s, char ch){
StringBuilder sb= new StringBuilder();
char[] charArray= s.toCharArray();
for (int i=0; i<charArray.length; i++){
if (charArray[i]!=ch) {
sb.append(charArray[i]);
}
}
return sb.toString();
}
Is there a faster way for this?
UPDATE: I want to write a new function like removeAllCharsInSecondStringFromFirstString(String S1, String S2)

Rather than iterating over each character of the String, you could use String.indexOf(int) in a loop and append each substring that lies between occurrences of ch. Something like,
public static String removeChar(String s, char ch) {
StringBuilder sb = new StringBuilder();
int p1 = 0, p2 = s.indexOf(ch);
while (p2 > -1) {
sb.append(s.substring(p1, p2));
p1 = p2 + 1;
p2 = s.indexOf(ch, p1);
}
if (p1 < s.length()) {
sb.append(s.substring(p1, s.length()));
}
return sb.toString();
}

With hints and help of dimo I wrote this solution:
public static String removeAllChars(String src, String dst){
HashSet<Character> chars = new HashSet<>();
char[] dstCharArray=dst.toCharArray();
for (int i=0; i<dstCharArray.length; i++){
chars.add(dstCharArray[i]);
}
StringBuilder sb = new StringBuilder();
char[] srcCharArray = src.toCharArray();
for (int i=0; i<srcCharArray.length; i++){
if (!chars.contains(srcCharArray[i])){
sb.append(srcCharArray[i]);
}
}
return sb.toString();
}

If you really want to implement this yourself you can use a Set to contain the collection of characters that you want to strip out. Here's a template to get you started:
public static String removeAllChars(String source, String charsString) {
HashSet<Character> chars = new HashSet<>();
for (int i = 0; i < charsString.length(); i++) {
chars.add(charsString.charAt(i));
}
StringBuilder sb = new StringBuilder();
for (int i = 0; i < source.length(); i++) {
// chars.contains(source.charAt(i)) is O(1)
// use this to determine which chars to exclude
}
return sb.toString();
}

Try this.
To remove everything except digits and the dot:
String str = "343.dfsdgdffsggdfg333";
str = str.replaceAll("[^\\d.]", "");
The result is "343.333".
If you later need to remove special characters (everything except letters and digits), try this:
String str2 = "343.dfsdgdffsggdfg333";
str2 = str2.replaceAll("[^a-zA-Z0-9]", "");

How to find characters common to two stringBuffers?

I am trying this method to find the common characters in two StringBuffers and return them as a StringBuffer, without using arrays. Please tell me what errors I am making.
private StringBuffer stringBuffer1;
private StringBuffer stringBuffer2;
public commonCharacters(String s1, String s2) {
stringBuffer1 = new StringBuffer(s1);
stringBuffer2 = new StringBuffer(word2);
}
String commonChars = "";
for (i = 0; i < stringBuffer1.length; i++) {
char c = stringBuffer1.charAt(i);
if (s2.indexOf(c) != -1) {
commonChars = commonChars + c;
}
}

I think you should consider that each character can appear multiple times in each String. A typical solution is to do that in two steps:
1) Build a map char -> number of occurrences for each of the two strings
2) Compare the maps to find common characters
Map<Character, Integer> charCount(String s) {
    Map<Character, Integer> mapCount = new HashMap<>();
    for (char c : s.toCharArray()) {
        // Add 1 to the count of c, starting from 0 if c has not been seen yet
        mapCount.merge(c, 1, Integer::sum);
    }
    return mapCount;
}
void findCommonCharacter(String s1, String s2) {
    Map<Character, Integer> m1 = charCount(s1);
    Map<Character, Integer> m2 = charCount(s2);
    for (char c : m1.keySet()) {
        // A character is shared as often as it occurs in the string with fewer copies
        int occurrences = m2.containsKey(c) ? Math.min(m1.get(c), m2.get(c)) : 0;
        if (occurrences > 0) {
            System.out.println("The two strings share " + occurrences
                    + " occurrences of " + c);
        }
    }
}

Here is a method using arrays. We fill two arrays of length 26 with the number of occurrences of each letter in each word, then take the minimum of the two counts for every letter. Note that this only works for the letters a to z (case is ignored).
Solution
public static void main(String[] args) {
StringBuffer sb1 = new StringBuffer("dad");
StringBuffer sb2 = new StringBuffer("daddy");
System.out.println(findCommons(sb1, sb2)); // Prints "add"
sb1 = new StringBuffer("Encapsulation");
sb2 = new StringBuffer("Programmation");
System.out.println(findCommons(sb1, sb2)); // Prints "aainopt"
}
public static StringBuffer findCommons(StringBuffer b1, StringBuffer b2){
int[] array1 = new int[26];
int[] array2 = new int[26];
int[] common = new int[26];
StringBuffer sb = new StringBuffer("");
for (int i = 0 ; i < (b1.length() < b2.length() ? b2:b1).length() ; i++){
if (i < b1.length()) array1[Character.toLowerCase(b1.charAt(i)) - 'a'] += 1;
if (i < b2.length()) array2[Character.toLowerCase(b2.charAt(i)) - 'a'] += 1;
}
for (int i = 0 ; i < 26 ; i++){
common[i] = Math.min(array1[i], array2[i]);
for (int j = 0 ; j < common[i] ; j++) sb.append((char)('a' + i));
}
return sb;
}

Use a Set intersection - you only need 3 lines of code:
Set<Character> common = stringBuffer1.toString().chars()
.mapToObj(c -> Character.valueOf((char)c)).collect(Collectors.toSet());
common.retainAll(stringBuffer2.toString().chars()
.mapToObj(c -> Character.valueOf((char)c)).collect(Collectors.toSet()));
StringBuffer result = new StringBuffer(common.stream().map(String::valueOf).collect(Collectors.joining("")));
But you shouldn't be using StringBuffer for this; just working with Strings will do it more easily:
public static String commonCharacters(String s1, String s2) {
Set<String> chars = Arrays.stream(s1.split("")).collect(Collectors.toSet());
chars.retainAll(Arrays.stream(s2.split("")).collect(Collectors.toSet()));
return String.join("", chars); // unlike reduce(...).get(), this does not throw when nothing is common
}

Sorting characters alphabetically in a String

Could somebody please explain how to sort the characters of a String alphabetically? For example, for the String "hello" the output should be "ehllo", but my code gets it wrong.
public static void main(String[] args)
{
String result = "";
Scanner kbd = new Scanner(System.in);
String input = kbd.nextLine();
for(int i = 1; i < input.length(); i++)
{
if(input.charAt(i-1) < input.charAt(i))
result += input.charAt(i-1);
//else
// result += input.charAt(i);
}
System.out.println(result);
}
}

You may do the following thing -
1. Convert your String to char[] array.
2. Using Arrays.sort() sort your char array
Code snippet:
String input = "hello";
char[] charArray = input.toCharArray();
Arrays.sort(charArray);
String sortedString = new String(charArray);
System.out.println(sortedString);
Or, if you want to sort the array with for loops (for learning purposes), you can use the following snippet, although I think the first option is the better one:
String input = "hello";
char[] charArray = input.toCharArray();
int length = charArray.length; // length is a field on arrays, not a method
for (int i = 0; i < length; i++) {
    for (int j = i + 1; j < length; j++) {
        if (charArray[j] < charArray[i]) {
            // Swap so that the smaller character moves to position i
            char temp = charArray[i];
            charArray[i] = charArray[j];
            charArray[j] = temp;
        }
    }
}

You can sort a String in Java 8 using Stream as below:
String sortedString =
Stream.of("hello".split(""))
.sorted()
.collect(Collectors.joining());

Procedure :
At first convert the string to char array
Then sort the array of character
Convert the character array to string
Print the string
Code snippet:
String input = "world";
char[] arr = input.toCharArray();
Arrays.sort(arr);
String sorted = new String(arr);
System.out.println(sorted);

Comparison-based sorting has a lower bound of n*log(n) comparisons in the worst case, with n being the number of elements to sort. What this means is that a single loop with simple operations cannot be guaranteed to sort correctly.
A key element in sorting is deciding what you are sorting by. In this case it is alphabetical order, which, if you treat each character as a char, equates to sorting in ascending order, since a char is actually just a number that the machine maps to the character, with 'a' < 'b'. The only gotcha to look out for is mixed case, since every uppercase letter sorts before every lowercase one ('Z' < 'a'). To get around this, you can use str.toLowerCase(). I'd recommend you look up some basic sorting algorithms too.
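To make the mixed-case point concrete, here is a small sketch. In Java, uppercase letters have smaller char values than lowercase ones, so a plain Arrays.sort puts them first. The class and method names are assumptions for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class CaseOrderDemo {
    // Sorts the characters of s by their numeric char value.
    static String sortChars(String s) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(sortChars("baCA"));               // prints "ACab"
        System.out.println(sortChars("baCA".toLowerCase())); // prints "aabc"
    }
}
```

Lowercasing first loses the original case, which may or may not be acceptable depending on what the output is used for.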

Your for loop is starting at 1 and it should be starting at zero:
for(int i = 0; i < input.length(); i++){...}

You can do this using Arrays.sort, if you put the characters into an array first.
Character[] chars = new Character[str.length()];
for (int i = 0; i < chars.length; i++)
chars[i] = str.charAt(i);
// sort the array
Arrays.sort(chars, new Comparator<Character>() {
public int compare(Character c1, Character c2) {
int cmp = Character.compare(
Character.toLowerCase(c1.charValue()),
Character.toLowerCase(c2.charValue())
);
if (cmp != 0) return cmp;
return Character.compare(c1.charValue(), c2.charValue());
}
});
Now build a string from it using StringBuilder.
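A possible way to finish that last step is sketched below. It repeats the sort with an equivalent lambda comparator so that the example is self-contained; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class SortedStringSketch {
    static String sortIgnoringCaseFirst(String str) {
        Character[] chars = new Character[str.length()];
        for (int i = 0; i < chars.length; i++) {
            chars[i] = str.charAt(i);
        }
        // Case-insensitive order first, plain char order as the tie-breaker.
        Arrays.sort(chars, (c1, c2) -> {
            char a = c1, b = c2; // unbox once
            int cmp = Character.compare(Character.toLowerCase(a), Character.toLowerCase(b));
            return cmp != 0 ? cmp : Character.compare(a, b);
        });
        // Build the result string from the sorted array.
        StringBuilder sb = new StringBuilder(chars.length);
        for (Character c : chars) {
            sb.append(c.charValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortIgnoringCaseFirst("baCA")); // prints "AabC"
    }
}
```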

The most basic, brute-force approach uses two nested for loops.
It sorts the string, but at the cost of O(n^2) time complexity.
public void stringSort(String str){
char[] token = str.toCharArray();
for(int i = 0; i<token.length; i++){
for(int j = i+1; j<token.length; j++){
if(token[i] > token[j]){
char temp = token[i];
token[i] = token[j];
token[j] = temp;
}
}
}
System.out.print(Arrays.toString(token));
}

public class SortCharcterInString {
    public static void main(String[] args) {
        String str = "Hello World";
        // Lowercase and convert once, outside the loop
        char[] arr = str.toLowerCase().toCharArray();
        List<Character> L = new ArrayList<Character>();
        for (int i = 0; i < arr.length; i++) {
            L.add(arr[i]);
        }
        Collections.sort(L);
        // List.toString() yields "[ , d, e, ...]"; strip brackets, commas and spaces
        str = L.toString();
        str = str.replaceAll("\\[", "").replaceAll("\\]", "")
                .replaceAll("[,]", "").replaceAll(" ", "");
        System.out.println(str);
    }
}

Collection Framework & Data Structures

I recently faced some technical interviews. The questions were:
Q.1 Two Strings are given, "Hello" & "World". Print the unique
characters present in the first string and not in the second.
OUTPUT: He
My answer: compare each character of the first string with every character of the second. Not optimal at all (wrong, obviously).
Q.2 Input: ABCABBABCAB, OUTPUT: 4A5B2C (basically, count the occurrences of each character).
Do this in one pass, not with multiple traversals of the string.
Again, do this in an optimal way.
Similarly, there were a few other questions too.
The core questions this raises for me are:
Which data structure from the Collections Framework will help me handle such scenarios in the most optimal way; and
Which particular data structure from the Java Collections Framework should be used when, and why?
Also, if there are books on such topics, do tell.
Any help (books, references and links) will be of great use in learning and understanding.
IMPORTANT: I need real-world scenarios in which each data structure is used.
I have studied the Collections API, not thoroughly, but I have a summarised idea of the hierarchy and the major data structure classes. I know how to use them, but where and why exactly to use them eludes me.

public class G {
public static void main(String[] args) {
new G().printCharacterCount("ABCABBABCAB");
System.out.println();
new G().printUniqueCharacters("Hello", "world");
}
void printUniqueCharacters(String a, String b) {
Set<Character> set = new HashSet<Character>();
for (int i = 0; i < a.length(); i++)
set.add(a.charAt(i));
for (int i = 0; i < b.length(); i++)
set.remove(b.charAt(i));
for (Character c : set)
System.out.print(c);
}
void printCharacterCount(String a) {
Map<Character, Integer> map = new TreeMap<Character, Integer>();
for(int i = 0; i < a.length(); i++) {
char c = a.charAt(i);
if(!map.containsKey(c))
map.put(c, 0);
map.put(c, map.get(c) +1);
}
for(char c : map.keySet()) {
System.out.print(map.get(c) + "" + c);
}
}
}

Example of algorithm you could use.
Q1.
put all the letters of String1 in a set (which only keeps unique entries)
remove all the letters of String2 from the set
your set now contains the unique letters of String1 which were not in String2
Q2.
store the number of occurrence of the letters in a Map<Character, Integer>
if a letter is not in the map, the count is 1
if a letter is already in the map, the count needs to be incremented
I know how to use them, but where and why exactly use them eludes me?
By trying to solve that kind of puzzle on your own ;-)
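As a rough illustration of how the choice of collection affects the result, the sketch below applies the two recipes above with a LinkedHashSet (keeps insertion order, so Q1 prints the letters in the order they appear) and a TreeMap (keeps keys sorted, so Q2 prints the counts alphabetically). The class and method names are assumptions, not part of any answer above.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper class, for illustration only.
class InterviewSketch {
    // Q1: characters of a that do not occur in b, in first-seen order.
    static String uniqueTo(String a, String b) {
        Set<Character> set = new LinkedHashSet<>();
        for (char c : a.toCharArray()) set.add(c);
        for (char c : b.toCharArray()) set.remove(c);
        StringBuilder sb = new StringBuilder();
        for (char c : set) sb.append(c);
        return sb.toString();
    }

    // Q2: a single pass over the string; the TreeMap keeps keys sorted for output.
    static String counts(String s) {
        Map<Character, Integer> map = new TreeMap<>();
        for (char c : s.toCharArray()) map.merge(c, 1, Integer::sum);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Character, Integer> e : map.entrySet()) {
            sb.append(e.getValue()).append(e.getKey());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(uniqueTo("Hello", "World")); // prints "He"
        System.out.println(counts("ABCABBABCAB"));      // prints "4A5B2C"
    }
}
```

With a plain HashSet or HashMap the same code still finds the right characters and counts, but the iteration order is not guaranteed.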

Set<Character> set1=new HashSet<Character>(Arrays.asList(ArrayUtils.toObject("Hello".toCharArray())));
Set<Character> set2=new HashSet<Character>(Arrays.asList(ArrayUtils.toObject("World".toCharArray())));
set1.removeAll(set2);
System.out.println(set1);
This uses ArrayUtils.toObject(char[] array) from Apache Commons Lang. You could write a utility method instead.

For #1 :
String one = "Hello";
String two = "World";
Set<Character> set = new HashSet<Character>();
for (int i = 0; i < one.length(); i++) {
set.add(one.charAt(i));
}
for (int i = 0; i < two.length(); i++) {
set.remove(two.charAt(i));
}
for (char ch : set) {
System.out.println(ch);
}
For #2 :
String str = YourInput;
int[] array = new int[26];
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
array[ch - 'A']++;
}
for (int i = 0; i < array.length; i++) {
if (array[i] != 0) {
System.out.println(array[i] + "" + (char) (i + 'A')); // "" forces string concatenation instead of int + char addition
}
}

public static void main(String[] args) {
String s1 = "Hello";
String s2 = "World";
List<Character> list1 = new ArrayList<Character>();
List<Character> list2 = new ArrayList<Character>();
for(char c : s1.toCharArray()){
if(!list1.contains(c)){
list1.add(c);
}
}
for(char c : s2.toCharArray()){
if(!list2.contains(c)){
list2.add(c);
}
}
List<Character> uniqueList = new ArrayList<Character>();
for (Character character1 : list1) {
boolean unique = true;
for (Character character2 : list2) {
if(character1.equals(character2)){
unique = false;
}
}
if(unique){
uniqueList.add(character1);
}
}
for (Character character : uniqueList) {
System.out.print(character);
}
}
