I want to create a method that takes two String objects as input and returns true if both strings are the same, ignoring word spacing and capitalization. My idea was to split each String into an array of single-character strings, add each element to a List, remove every element that equals a space, and finally compare the results with compareToIgnoreCase(). I got stuck on removing the spaces from the List for string2: it works for string1List but not for string2List, and I can't see why.
I would be grateful for help. I have spent a lot of time on this and I'm stuck. Maybe someone knows a better solution.
import java.util.ArrayList;
import java.util.List;
public class Strings {
public static void main(String[] args) {
String string1 = "This is a first string";
String string2 = "this is a first string";
String[] arrayOfString1 = string1.split("");
List<String> string1List = new ArrayList<>();
for (int i = 0; i < arrayOfString1.length; ++i) {
string1List.add(arrayOfString1[0 + i]);
}
String[] arrayOfString2 = string2.split("");
List<String> string2List = new ArrayList<>();
for (int i = 0; i < arrayOfString2.length; ++i) {
string2List.add(arrayOfString2[0 + i]);
}
for (int i = 0; i < string1List.size(); ++i) {
String character = string1List.get(0 + i);
if (character.equals(" ")) {
string1List.remove(character);
}
}
for (int i = 0; i < string2List.size(); ++i) {
String character = string2List.get(0 + i);
if (character.equals(" ")) {
string2List.remove(character);
}
}
System.out.println(string2List.size());
}
}
You can try the solution below. As you mentioned, word spacing and capitalization do not matter, so:
1. Remove capitalization using toLowerCase().
2. Remove all word spacing using replaceAll() with the regex pattern "\\s+", which matches every run of whitespace.
3. Compare the two processed strings.
public class StringChecker {
public static void main(String[] args) {
System.out.println(checkString("This is a first string", "this is a first string"));
}
public static boolean checkString(String string1, String string2){
String processedStr1 = string1.toLowerCase().replaceAll("\\s+", "");
String processedStr2 = string2.toLowerCase().replaceAll("\\s+", "");
System.out.println(" s1 : " + processedStr1);
System.out.println(" s2 : " + processedStr2);
return processedStr1.equals(processedStr2);
}
}
Your problem is not specific to spaces. You can replace them with any other character (for example "a") to test this, so changing which character is removed will not fix your code.
The source of the problem is removing items from the list while iterating over it with an index-based for loop. When you remove the i-th element, every later element shifts down by one, so the element that used to be at position i + 1 is now at position i.
On the next repetition of the loop, i is incremented by one, so that shifted element is never examined and you "lose" (at least) one item. It is therefore a bad idea to remove elements inside an index-based for loop without adjusting the index.
You can use other mechanisms available for collections instead, for instance an Iterator, and your program will work fine.
Iterator <String> it = string1List.iterator();
while(it.hasNext())
{
if(it.next().equals("a")) it.remove();
}
Of course there is no need at all to use Lists to compare these two strings.
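The skipping effect described above can be reproduced in a few lines. The following is only a minimal sketch: the class and method names are made up for illustration, and Collection.removeIf (available since Java 8) is shown as one more alternative to the Iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveWhileIterating {
    // Index-based loop: after remove(i) the following element slides into slot i,
    // and the loop's ++i steps over it without testing it.
    static List<String> removeWithIndexLoop(List<String> input, String target) {
        List<String> copy = new ArrayList<>(input);
        for (int i = 0; i < copy.size(); ++i) {
            if (copy.get(i).equals(target)) {
                copy.remove(i);
            }
        }
        return copy;
    }

    // removeIf lets the collection do the index bookkeeping itself.
    static List<String> removeWithRemoveIf(List<String> input, String target) {
        List<String> copy = new ArrayList<>(input);
        copy.removeIf(target::equals);
        return copy;
    }

    public static void main(String[] args) {
        // Two adjacent spaces are enough to trigger the skip.
        List<String> chars = Arrays.asList("a", " ", " ", "b");
        System.out.println(removeWithIndexLoop(chars, " ").size()); // 3: one space survives
        System.out.println(removeWithRemoveIf(chars, " ").size());  // 2: both spaces removed
    }
}
```

With single, non-adjacent matches the index loop happens to give the right answer, which is why the bug can stay hidden for one input and show up for another.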
Currently I am having a hard time trying to figure out if there is a better way to refactor the following code.
Given the following:
String detail = "POTATORANDOMFOOD";
Let's say I want to assign variables with different parts of detail; the end result would look something like this.
String title = detail.substring(0, 6); // POTATO
String label = detail.substring(6, 12); // RANDOM
String tag = detail.substring(12, 16); // FOOD
Now let's say the length of the string detail constantly changes: sometimes it only contains "POTATORANDOM" and no "FOOD", sometimes it contains even more characters, such as "POTATORANDOMFOODTODAY", so another variable would be used.
String title = detail.substring(0, 6); // POTATO
String label = detail.substring(6, 12); // RANDOM
String tag = detail.substring(12, 16); // FOOD
...
String etc = detail.substring(30, 40); // etc value from detail string
The issue with this is that, since the string is sometimes shorter or longer, we would run into a StringIndexOutOfBoundsException, which is not good.
So currently I have a naive way to handle this:
if (detail != null && !detail.isEmpty()) {
if (detail.length() >= 6) {
title = detail.substring(0, 6);
if (detail.length() >= 12) {
label = detail.substring(6, 12);
if (detail.length() >= 16) {
tag = detail.substring(12, 16);
.
.
.
}
}
}
}
This can get really messy, especially if, let's say, the string were to grow even more.
So my question is: what would be a good design pattern for this type of problem? I have tried the chain of responsibility design pattern, but the issue is that it only returns a single value, while I am trying to return multiple ones if possible, so that I can assign multiple variables depending on the length of the string.
Any help/hints is greatly appreciated!
Edited:
The order and length are always the same. So title will always be first and it will always contain 6 characters. label will always be second and it will always contain 6 characters. tag will always be third and it will always contain 4 characters, etc.
If I were you, I would do the following:
Define a class to hold a Word definition
public class Word {
private final String name;
private final int startIndex;
private final int endIndex;
public Word(String name, int startIndex, int endIndex) {
this.name = name;
this.startIndex = startIndex;
this.endIndex = endIndex;
}
public String getName() { return name; }
public int getStartIndex() { return startIndex; }
public int getEndIndex() { return endIndex; }
}
Create a static list which holds all the possible words
public static final List<Word> WORDS = List.of(
new Word("title", 0, 6),
new Word("label", 6, 12),
new Word("tag", 12, 16),
...
);
Create a function that parses the String detail by walking this list until the string is exhausted
... and of course stores the elements in a Map<String, String> so that you can access them later.
public Map<String, String> parseDetail(String detail) {
Map<String, String> receivedWords = new LinkedHashMap<>(); //<-- map respecting insertion order
if (detail.isEmpty()) {
return receivedWords;
}
int parsedLength = 0;
for (Word word : WORDS) {
receivedWords.put(word.getName(), detail.substring(word.getStartIndex(), word.getEndIndex())); //<-- store the current word
parsedLength += word.getEndIndex() - word.getStartIndex(); //increase the parsedLength by the length of your word
if (parsedLength >= detail.length()) {
break; //<-- exit the loop when you're done with the parsing
}
}
return receivedWords;
}
To sum up:
Map<String, String> receivedWords = parseDetail(detail);
receivedWords.forEach((k, v) -> {
System.out.println("Key: " + k + ", value: " + v);
});
Output:
Key: title, value: POTATO
Key: label, value: RANDOM
Key: tag, value: FOOD
...
Tip 1: The input you receive looks pretty weird. I understand that you cannot change it but I would try to negotiate with the caller (if possible) a better way to send you their input (ideally a structured object, if not possible at least a string with some separator so that you can simply split by that character).
Tip 2: I have defined the list of words statically in the code. But I would instead define an external file (e.g. a Json file, or an Xml, or even a simple text file) that you parse dynamically to create the list. That will allow someone else to configure this file with the words/start index/end index without you having to do it in the code each time there is a change.
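As a rough illustration of Tip 2, the field definitions could live in a plain text file with one field per line. The sketch below assumes a made-up "name,start,end" line format and hypothetical names (FieldConfigSketch, Field, parseConfig, extract); in real use the lines might come from something like Files.readAllLines, while here they are given inline so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldConfigSketch {
    // One fixed-width field: a name plus [start, end) indexes into the detail string.
    static final class Field {
        final String name;
        final int start;
        final int end;

        Field(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    // Parses lines of the assumed form "name,start,end"; blank lines are skipped.
    static List<Field> parseConfig(List<String> lines) {
        List<Field> fields = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] parts = line.split(",");
            fields.add(new Field(parts[0].trim(),
                    Integer.parseInt(parts[1].trim()),
                    Integer.parseInt(parts[2].trim())));
        }
        return fields;
    }

    // Extracts every field that fits completely inside detail, in config order.
    static Map<String, String> extract(String detail, List<Field> fields) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Field f : fields) {
            if (detail.length() < f.end) {
                break; // this field and all later ones are absent
            }
            result.put(f.name, detail.substring(f.start, f.end));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> config = List.of("title,0,6", "label,6,12", "tag,12,16");
        System.out.println(extract("POTATORANDOM", parseConfig(config)));
        // prints {title=POTATO, label=RANDOM}
    }
}
```

Checking the length before each substring call also means a detail string that ends in the middle of a field is handled without an exception.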
You could simply check the length of the total string to see if it has the RANDOM and the FOOD attributes before using substring()
String title = "", label = "", tag = "";
if (detail.length() >= 6)
title = detail.substring(0, 6);
if (detail.length() >= 12)
label = detail.substring(6, 12);
if (detail.length() >= 16)
tag = detail.substring(12,16);
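The repeated length checks could also be folded into one small helper. This is just a sketch; sliceOrEmpty is a hypothetical name, and it assumes that a field which is only partly present should be treated as missing.

```java
class SafeSlice {
    // Hypothetical helper: returns detail[start, end) when the whole range exists, else "".
    static String sliceOrEmpty(String detail, int start, int end) {
        if (detail == null || detail.length() < end) {
            return "";
        }
        return detail.substring(start, end);
    }

    public static void main(String[] args) {
        String detail = "POTATORANDOM"; // no FOOD part
        String title = sliceOrEmpty(detail, 0, 6);
        String label = sliceOrEmpty(detail, 6, 12);
        String tag = sliceOrEmpty(detail, 12, 16);
        System.out.println(title + "|" + label + "|" + tag); // prints POTATO|RANDOM|
    }
}
```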
I would suggest a regex approach:
public static void main(String[] args) {
String detail = "POTATORANDOMFOODTODAY";
Pattern p = Pattern.compile("(.{0,6})(.{0,6})(.{0,4})(.{0,5})");
Matcher m = p.matcher(detail);
m.find();
String title = m.group(1);
String label = m.group(2);
String tag = m.group(3);
String day = m.group(4);
System.out.println("title: " + title + ", label: " + label + ", tag: " + tag + ", day: " + day);
}
//output: title: POTATO, label: RANDOM, tag: FOOD, day: TODAY
If you have a lot of groups, I would suggest using named capturing groups. The approach above can be particularly difficult to maintain, because adding or removing a group in the middle of the regex shifts the numbering used with Matcher#group(int groupNumber). Using named capturing groups:
public static void main(String[] args) {
String detail = "POTATORANDOMFOODTODAY";
Pattern p = Pattern.compile("(?<title>.{0,6})(?<label>.{0,6})(?<tag>.{0,4})(?<day>.{0,5})");
Matcher m = p.matcher(detail);
m.find();
String title = m.group("title");
String label = m.group("label");
String tag = m.group("tag");
String day = m.group("day");
System.out.println("title: " + title + ", label: " + label + ", tag: " + tag + ", day: " + day);
}
//output: title: POTATO, label: RANDOM, tag: FOOD, day: TODAY
If the string is dynamic, it can contain essentially anything, and since there may be no whitespace in it, the only way to know what a specific word (substring) might be is to check the string against a word list. You quickly come to realize how pivotal even a single whitespace (or other separator character) can be within a string. Using the String#substring() method only works if you already know what all the words within the detail string are.
The simple solution would be to set rules for how a string must be formatted when it is received. After all, why accept a string that contains multiple words without any separator character to begin with? If the string uses whitespace to separate the words it contains, a mere:
String[] words = string.split("\\s+");
line of code would do the trick. Bottom line: if you can, stop accepting strings that contain multiple words with no separation mechanism, even if that mechanism is just the underscore ( _ ) character or some other character.
I suppose sometimes we just can't modify how we're dealt things (something like taxes), and how we receive specific strings is simply out of our control. If this is the case, then one way to deal with the dilemma is to work against an established word list. This word list can range in size from a few words to hundreds of thousands of words; the situation you need to deal with will determine its size. If it is small enough, the word list can be held in a String array or a collection such as an ArrayList. If it is really large, the word list would most likely be kept in a text file. The word list I most commonly use contains well over 370,000 individual words.
Here is an example that uses a small word list held in a List:
String detail = "POTATORANDOMFOODTODAY";
List<String> wordList = Arrays.asList(new String[] {
"pumpkin", "carrot", "potato", "tomato", "lettus", "radish", "bean",
"pea", "food", "random", "today", "yesterday", "tomorrow",
});
// See if the detail string 'contains' any word-list words...
List<String> found = new ArrayList<>();
for (int i = 0; i < wordList.size(); i++) {
String word = wordList.get(i);
if (detail.toLowerCase().contains(word.toLowerCase())) {
found.add(word.toUpperCase());
}
}
/* Ensure the words within the list are in proper order.
That is, the same order as they are received within the
detail String. This is necessary since words from the
word-List can be found anywhere within the detail string. */
int startIndex = 0;
List<String> foundWords = new ArrayList<>();
String tmpStrg = "";
while (!tmpStrg.equals(detail)) {
for (int i = 0; i < found.size(); i++) {
String word = found.get(i);
if (detail.indexOf(word) == startIndex) {
foundWords.add(word);
startIndex = startIndex + word.length();
String procStrg = foundWords.toString().replace(", ", "");
tmpStrg = procStrg.substring(1, procStrg.length() - 1);
}
}
}
//Format and Display the required data
if (foundWords.isEmpty()) {
System.err.println("Couldn't find any required words!");
return; // or whatever...
}
String title = foundWords.get(0);
String label = foundWords.size() > 1 ? foundWords.get(1) : "N/A";
String[] tag = new String[1];
if (foundWords.size() > 2) {
tag = new String[foundWords.size()-2];
for (int i = 0; i < foundWords.size() - 2; i++) {
tag[i] = foundWords.get(i + 2);
}
}
else {
tag[0] = "N/A";
}
System.out.println("Title:\t" + title);
System.out.println("Label:\t" + label);
System.out.println("Tags:\t"
+ Arrays.toString(tag).substring(1, Arrays.toString(tag).length() - 1));
When the above code is run the console window would display:
Title: POTATO
Label: RANDOM
Tags: FOOD, TODAY
You can use the Stream API: use the filter() method to keep only the fields that fit within the string,
then use map() to apply your existing substring logic. That should do the trick.
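The Stream suggestion comes without code, so here is one possible reading of it, offered as a sketch rather than as what the answerer had in mind. The field layout is taken from the question; the class name StreamFields and the parallel arrays are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamFields {
    // Field layout from the question, kept in parallel arrays for brevity.
    private static final String[] NAMES = {"title", "label", "tag"};
    private static final int[] STARTS = {0, 6, 12};
    private static final int[] ENDS = {6, 12, 16};

    static Map<String, String> parse(String detail) {
        return IntStream.range(0, NAMES.length)
                // filter(): keep only the fields that fit entirely inside detail
                .filter(i -> detail.length() >= ENDS[i])
                // map step: turn each surviving field index into a {name, value} pair
                .mapToObj(i -> new String[] {NAMES[i], detail.substring(STARTS[i], ENDS[i])})
                // collect into an insertion-ordered map
                .collect(Collectors.toMap(p -> p[0], p -> p[1], (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(parse("POTATORANDOM"));     // prints {title=POTATO, label=RANDOM}
        System.out.println(parse("POTATORANDOMFOOD")); // prints {title=POTATO, label=RANDOM, tag=FOOD}
    }
}
```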
Switch-cases could be an alternative: they add more lines of code but reduce the arrow-shaped nesting of all the ifs.
I am very new to Java, and as a starter exercise I have been asked to try this at home.
Write a program that finds the number of occurrences of a smaller string in a bigger string, both as a part of it and as an individual word.
For example,
Bigger string = "I AM IN AMSTERDAM", smaller string = "AM".
Output: As part of string: 3, as a part of word: 1.
While I did nail the second part (as a part of word), and even had a go at the first one (searching for the word as a part of the string), I just can't figure out how to crack the first part. It keeps displaying 1 for the example input, where it should be 3.
I have definitely made an error. I'll be really grateful if you could point it out and correct it. I am a curious learner, so if possible, please also explain why it is an error.
import java.util.Scanner;
public class Program {
static Scanner sc = new Scanner(System.in);
static String search,searchstring;
static int n;
void input(){
System.out.println("What do you want to do?");
System.out.println("1. Search as part of string?");
System.out.println("2. Search as part of word?");
int n = sc.nextInt();
System.out.println("Enter the main string");
searchstring = sc.nextLine();
sc.nextLine(); //Clear buffer
System.out.println("Enter the search string");
search = sc.nextLine();
}
static int asPartOfWord(String main,String search){
int count = 0;
char c; String w = "";
for (int i = 0; i<main.length();i++){
c = main.charAt(i);
if (!(c==' ')){
w += c;
}
else {
if (w.equals(search)){
count++;
}
w = ""; // Flush old value of w
}
}
return count;
}
static int asPartOfString(String main,String search){
int count = 0;
char c; String w = ""; //Stores the word
for (int i = 0; i<main.length();i++){
c = main.charAt(i);
if (!(c==' ')){
w += c;
}
else {
if (w.length()==search.length()){
if (w.equals(search)){
count++;
}
}
w = ""; // Replace with new value, no string
}
}
return count;
}
public static void main(String[] args){
Program a = new Program();
a.input();
switch(n){
case 1: System.out.println("Total occurences: " +
asPartOfString(searchstring,search));
case 2: System.out.println("Total occurences: " +
asPartOfWord(searchstring,search));
default: System.out.println("ERROR: No valid number entered");
}
}
}
EDIT: I will be using the loop structure.
A simpler way would be to use regular expressions (that probably defeats the idea of writing it yourself, although learning regexes is a good idea because they are very powerful: as you can see the core of my code is 4 lines long in the countMatches method).
public static void main(String... args) {
String bigger = "I AM IN AMSTERDAM";
String smaller = "AM";
System.out.println("Output: As part of string: " + countMatches(bigger, smaller) +
", as a part of word: " + countMatches(bigger, "\\b" + smaller + "\\b"));
}
private static int countMatches(String in, String regex) {
Matcher m = Pattern.compile(regex).matcher(in);
int count = 0;
while (m.find()) count++;
return count;
}
How does it work?
we create a Matcher that will find a specific pattern in your string, then iterate to the next match until there is none left, incrementing a counter each time
the patterns themselves: "AM" will find any occurrence of AM in the string, in any position. "\\bAM\\b" will only match whole words (\\b is a word boundary).
That may not be what you were looking for, but I thought it'd be interesting to see another approach. And technically, I am using a loop :-)
Although writing your own code with lots of loops to work things out may execute faster (debatable), it's better to use the JDK if you can, because there's less code to write, less debugging and you can focus on the high-level stuff instead of the low level implementation of character iteration and comparison.
It so happens, the tools you need to solve this already exist, and although using them requires knowledge you don't have, they are elegant to the point of being a single line of code for each method.
Here's how I would solve it:
static int asPartOfString(String main,String search){
return main.split(search, -1).length - 1;
}
static int asPartOfWord(String main,String search){
return main.split("\\b" + search + "\\b", -1).length - 1;
}
See live demo of this code running with your sample input, which (probably deliberately) contains an edge case (see below).
Performance? Probably a few microseconds - fast enough. But the real benefit is there is so little code that it's completely clear what's going on, and almost nothing to get wrong or that needs debugging.
The stuff you need to know to use this solution:
regex term for "word boundary" is \b
split() takes a regex as its search term
the 2nd parameter of split() controls behaviour at the end of the string: a negative number means "retain blanks at end of split", which handles the edge case of the main string ending with the smaller string. Without the -1, a call to split would throw away the trailing blank in this edge case.
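One consequence of the search term being treated as a regex is that characters such as "." or "*" in it are interpreted as metacharacters. If the search string should be taken literally, Pattern.quote can wrap it. The sketch below uses made-up names (QuotedCount, countLiteral, countWholeWord) and a Matcher loop instead of split, purely for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedCount {
    // Counts non-overlapping occurrences of search, treated as a literal string.
    static int countLiteral(String main, String search) {
        return count(main, Pattern.quote(search));
    }

    // Same, but only where the literal is delimited by word boundaries.
    static int countWholeWord(String main, String search) {
        return count(main, "\\b" + Pattern.quote(search) + "\\b");
    }

    private static int count(String main, String regex) {
        Matcher m = Pattern.compile(regex).matcher(main);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLiteral("I AM IN AMSTERDAM", "AM"));   // 3
        System.out.println(countWholeWord("I AM IN AMSTERDAM", "AM")); // 1
        // Unquoted, "a.b" would also match "axb"; quoted, only the literal "a.b" counts.
        System.out.println(countLiteral("a.b axb a.b", "a.b"));        // 2
    }
}
```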
You could use regular expressions. A pattern like ".*<target string>.*" (replace target string with what you are searching for) only tells you whether the target occurs at all; to count occurrences, compile the target string itself as the pattern.
Have a look at the Javadoc for Pattern and regular expressions.
To count the occurrences in a string, this could be helpful:
Matcher matcher = Pattern.compile("AM").matcher("I AM IN AMSTERDAM");
int count = 0;
while (matcher.find()) {
count++;
}
Here's an alternative (and much shorter) way to get it to work using Pattern and Matcher, more commonly known as regex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class CountOccurances {
public static void main(String[] args) {
String main = "I AM IN AMSTERDAM";
String search = "AM";
System.out.printf("As part of string: %d%n",
asPartOfString(main, search));
System.out.printf("As part of word: %d%n",
asPartOfWord(main, search));
}
private static int asPartOfString(String main, String search) {
Matcher m = Pattern.compile(search).matcher(main);
int count = 0;
while (m.find()) {
count++;
}
return count;
}
private static int asPartOfWord(String main, String search) {
// \b - A word boundary
return asPartOfString(main, "\\b" + search + "\\b");
}
}
Output:
As part of string: 3
As part of word: 1
For the first part of your exercise (counting occurrences as part of the string), this should work:
static int asPartOfString(String main, String search) {
int count = 0;
while (main.length() >= search.length()) { // while main is at least as long as search
if (main.substring(0, search.length()).equals(search)) { // if main starts with search, count is incremented
count++;
}
main = main.substring(1); // main is shortened by cutting off the first character
}
return count;
}
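A variant of the same loop idea can use String.indexOf with a start position instead of repeatedly shortening the string. This is a sketch with hypothetical names (IndexOfCount, countOccurrences); like the loop above, it counts overlapping occurrences.

```java
class IndexOfCount {
    // Counts occurrences of search in main, including overlapping ones.
    static int countOccurrences(String main, String search) {
        if (search.isEmpty()) {
            return 0; // an empty search string would otherwise match at every position
        }
        int count = 0;
        int from = main.indexOf(search);
        while (from != -1) {
            count++;
            // continue searching one character after the start of the last match
            from = main.indexOf(search, from + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("I AM IN AMSTERDAM", "AM")); // 3
        System.out.println(countOccurrences("AAAA", "AA"));              // 3 (overlapping)
    }
}
```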
You may also want to think about how you name variables:
static String search,searchstring;
static int n;
While search and searchstring tell us roughly what is meant, the Java convention is camelCase: write the first word in lower case and start every following word with an upper-case letter. This improves readability.
static int n won't give you much of a clue about what it is used for when you read your code again after a few days; you might use something more meaningful here.
static String search, searchString;
static int command;
As programmers we love getting involved in building logic, but sometimes we draw a blank on a certain type of puzzle, like the one below. Let me state that this is not homework or job-related; it is simply a logic and performance practice puzzle. The puzzle gives a String S of comma-separated words, like
String S = "peas,sugar,rice,soup";
The crux is to find the longest chain of words in which the last character of each word is the first character of the next word, and finally to calculate the length of that chain.
I have tried to figure out some sort of solution, like:
split the string with comma
add them in list
sort that list
etc
but I don't know how to develop the logic further, as I am a little weak at logic development. Help is appreciated, and if the half-finished approach above is not suitable, what would be a simple and correct way to get the length of the longest chain of words?
Summary
input: String S = "peas,sugar,rice,soup".
output: 4, the number of words in the chain (soup->peas->sugar->rice).
Once you have a list (or array), you can iterate over it checking your condition (the last letter of the n-th word equals the first letter of the next word) and increase a counter each time it holds. As soon as the condition is false, exit the loop. Your counter will then hold the value you need.
OK friends, here is the logic and core part I wrote, which solved my puzzle:
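import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Stack;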
public class CandidateCode
{
public static int chainLength=0;
public static void main(String[] args) {
String s= "peas,sugar,rice,soup";
int chainLengthfinal=wordChain(s);
System.out.println("final length:"+chainLengthfinal);
}
public static int wordChain(String input1)
{
List<String> stringList = new ArrayList<String>();
stringList= Arrays.asList(input1.split(","));
boolean ischain = new CandidateCode().hasChain(stringList);
if (ischain) {
return chainLength;
}
return 0;
}
Map<Character, List<String>> startsWith = new HashMap<Character, List<String>>();
Map<Character, List<String>> endsWith = new HashMap<Character, List<String>>();
private Character getFirstChar(String str) {
return str.charAt(0);
}
private Character getLastChar(String str) {
return str.charAt(str.length() - 1);
}
boolean hasChain(List<String> stringList) {
for (String str : stringList) {
Character start = getFirstChar(str);
Character end = getLastChar(str);
List<String> startsWithList;
List<String> endsWithList;
if (startsWith.containsKey(start)) {
startsWithList = startsWith.get(start);
} else {
startsWithList = new ArrayList<String>();
startsWith.put(start, startsWithList);
}
if (endsWith.containsKey(end)) {
endsWithList = endsWith.get(end);
} else {
endsWithList = new ArrayList<String>();
endsWith.put(end, endsWithList);
}
startsWithList.add(str);
endsWithList.add(str);
}
Stack<String> stringStack = new Stack<String>();
for (String str : stringList) {
if (hasChain(stringList.size(), str, stringStack)) {
System.out.println(stringStack);
System.out.println("size "+stringStack.size());
chainLength= stringStack.size();
return true;
}
}
return false;
}
private boolean hasChain(int size, String startString, Stack<String> stringStack) {
if (size == stringStack.size()) return true;
Character last = getLastChar(startString);
if (startsWith.containsKey(last)) {
List<String> stringList = startsWith.get(last);
for (int i = 0; i < stringList.size(); i++) {
String candidate = stringList.remove(i--);
stringStack.push(candidate);
if (hasChain(size, candidate, stringStack)) {
return true;
}
stringStack.pop();
stringList.add(++i, candidate);
}
}
return false;
}
}
output of the above program will be
[soup, peas, sugar, rice]
size 4
final length:4
initialize a " " string named last(String last=" ")
get the first string by splitting with comma
substring the last char of the string and store it to last
boolean brokenchain=false;
length=0;
while(more string to split with comma)&&(!brokenchain){
split string with comma
substring to get first char
if(first char!=last){
brokenchain=true;
}else{
length++;
get last char of this string with substring and store it to last
}
}
If the input contains a sequence of length 5 that then breaks, followed by a sequence of length 6 that you also want to count, you have to store each count, for example in a map keyed by the count with the sequence so far as its value. Then set brokenchain back to false and continue the loop until the input ends. Finally, take the largest key from the map and print it with its associated value (the longest sequence).
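The pseudocode and the note about broken sequences could be sketched in Java roughly as follows. Note the assumptions: the words are kept in their input order (no reordering, unlike the backtracking solution earlier in this thread), every word is non-empty, and only the best length is tracked instead of a map of sequences. The names LongestRun and longestRun are made up for the example.

```java
class LongestRun {
    // Length of the longest run of consecutive words (in input order) in which
    // each word starts with the last letter of the word before it.
    static int longestRun(String csv) {
        if (csv.isEmpty()) {
            return 0;
        }
        String[] words = csv.split(",");
        int best = 1;
        int current = 1;
        for (int i = 1; i < words.length; i++) {
            String previous = words[i - 1];
            String word = words[i];
            if (previous.charAt(previous.length() - 1) == word.charAt(0)) {
                current++;   // chain continues
            } else {
                current = 1; // chain broke: start a new one at this word
            }
            best = Math.max(best, current);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestRun("peas,sugar,rice,soup")); // 3 (peas, sugar, rice)
        System.out.println(longestRun("soup,peas,sugar,rice")); // 4
    }
}
```

For the question's input this gives 3 rather than 4, because reaching 4 requires moving soup to the front, which an in-order scan does not do.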
I think you need to find the largest and smallest number.
split the string with comma
add them as list_item
compare list_item1 and list_item2, the largest value becomes list_item_X
compare list_item3 and list_item4, the largest value becomes list_item_Y
Now compare list_item_X and list_item_Y, the largest value becomes list_item_Z
So the largest value is list_item_Z. Here is an implementation in code (PHP):
$s = 'peas,sugar,rice,soup';
$list_items = explode(',', $s);
$lengths = array_map('strlen', $list_items);
echo "The shortest is " . min($lengths) .
". The longest is " . max($lengths);
Suppose I have two long strings. They are almost the same.
String a = "this is a example";
String b = "this is a examp";
The code above is just an example. The actual strings are quite long.
The problem is that one string has 2 more characters than the other.
How can I check which two characters those are?
You can use StringUtils.difference(String first, String second) from Apache Commons Lang.
This is how they implemented it:
public static String difference(String str1, String str2) {
if (str1 == null) {
return str2;
}
if (str2 == null) {
return str1;
}
int at = indexOfDifference(str1, str2);
if (at == INDEX_NOT_FOUND) {
return EMPTY;
}
return str2.substring(at);
}
public static int indexOfDifference(CharSequence cs1, CharSequence cs2) {
if (cs1 == cs2) {
return INDEX_NOT_FOUND;
}
if (cs1 == null || cs2 == null) {
return 0;
}
int i;
for (i = 0; i < cs1.length() && i < cs2.length(); ++i) {
if (cs1.charAt(i) != cs2.charAt(i)) {
break;
}
}
if (i < cs2.length() || i < cs1.length()) {
return i;
}
return INDEX_NOT_FOUND;
}
To find the difference between 2 Strings you can use the StringUtils class and the difference method. It compares the two Strings, and returns the portion where they differ.
StringUtils.difference(null, null) = null
StringUtils.difference("", "") = ""
StringUtils.difference("", "abc") = "abc"
StringUtils.difference("abc", "") = ""
StringUtils.difference("abc", "abc") = ""
StringUtils.difference("ab", "abxyz") = "xyz"
StringUtils.difference("abcde", "abxyz") = "xyz"
StringUtils.difference("abcde", "xyz") = "xyz"
Without iterating through the strings you can only know that they are different, not where - and that only if they are of different length. If you really need to know what the different characters are, you must step through both strings in tandem and compare characters at the corresponding places.
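Stepping through both strings in tandem might look like the sketch below. The names (TandemCompare, differingPositions) are hypothetical, and the comparison is purely positional: if characters were inserted in the middle rather than at the end, every later position would be reported as different.

```java
import java.util.ArrayList;
import java.util.List;

class TandemCompare {
    // Returns the indexes at which a and b hold different characters.
    // Positions that exist in only one of the strings count as different.
    static List<Integer> differingPositions(String a, String b) {
        List<Integer> positions = new ArrayList<>();
        int longest = Math.max(a.length(), b.length());
        for (int i = 0; i < longest; i++) {
            boolean inBoth = i < a.length() && i < b.length();
            if (!inBoth || a.charAt(i) != b.charAt(i)) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(differingPositions("this is a example", "this is a examp"));
        // prints [15, 16]
    }
}
```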
The following Java snippet efficiently computes a minimal set of characters that have to be removed from (or added to) the respective strings in order to make the strings equal. It's an example of dynamic programming.
import java.util.HashMap;
import java.util.Map;
public class StringUtils {
/**
* Examples
*/
public static void main(String[] args) {
System.out.println(diff("this is a example", "this is a examp")); // prints (le,)
System.out.println(diff("Honda", "Hyundai")); // prints (o,yui)
System.out.println(diff("Toyota", "Coyote")); // prints (Ta,Ce)
System.out.println(diff("Flomax", "Volmax")); // prints (Fo,Vo)
}
/**
* Returns a minimal set of characters that have to be removed from (or added to) the respective
* strings to make the strings equal.
*/
public static Pair<String> diff(String a, String b) {
return diffHelper(a, b, new HashMap<>());
}
/**
* Recursively compute a minimal set of characters while remembering already computed substrings.
* Runs in O(n^2).
*/
private static Pair<String> diffHelper(String a, String b, Map<Long, Pair<String>> lookup) {
long key = ((long) a.length()) << 32 | b.length();
if (!lookup.containsKey(key)) {
Pair<String> value;
if (a.isEmpty() || b.isEmpty()) {
value = new Pair<>(a, b);
} else if (a.charAt(0) == b.charAt(0)) {
value = diffHelper(a.substring(1), b.substring(1), lookup);
} else {
Pair<String> aa = diffHelper(a.substring(1), b, lookup);
Pair<String> bb = diffHelper(a, b.substring(1), lookup);
if (aa.first.length() + aa.second.length() < bb.first.length() + bb.second.length()) {
value = new Pair<>(a.charAt(0) + aa.first, aa.second);
} else {
value = new Pair<>(bb.first, b.charAt(0) + bb.second);
}
}
lookup.put(key, value);
}
return lookup.get(key);
}
public static class Pair<T> {
public Pair(T first, T second) {
this.first = first;
this.second = second;
}
public final T first, second;
public String toString() {
return "(" + first + "," + second + ")";
}
}
}
To directly get only the changed section, and not just the end, you can use Google's Diff Match Patch.
List<Diff> diffs = new DiffMatchPatch().diffMain("stringend", "stringdiffend");
for (Diff diff : diffs) {
if (diff.operation == Operation.INSERT) {
return diff.text; // Return only single diff, can also find multiple based on use case
}
}
For Android, add: implementation 'org.bitbucket.cowwoc:diff-match-patch:1.2'
This package is far more powerful than just this feature, it is mainly used for creating diff related tools.
String strDiffChop(String s1, String s2) {
// Assumes the shorter string is a prefix of the longer one; returns the extra tail.
if (s1.length() > s2.length()) {
return s1.substring(s2.length());
} else if (s2.length() > s1.length()) {
return s2.substring(s1.length());
} else {
return null;
}
}
Google's Diff Match Patch is good, but it was a pain to install into my Java Maven project. Just adding a Maven dependency did not work; Eclipse just created the directory and added the lastUpdated info files. Finally, on the third try, I added the following to my pom:
<dependency>
<groupId>fun.mike</groupId>
<artifactId>diff-match-patch</artifactId>
<version>0.0.2</version>
</dependency>
Then I manually placed the jar and source jar files into my .m2 repo from https://search.maven.org/search?q=g:fun.mike%20AND%20a:diff-match-patch%20AND%20v:0.0.2
After all that, the following code worked:
import fun.mike.dmp.Diff;
import fun.mike.dmp.DiffMatchPatch;
DiffMatchPatch dmp = new DiffMatchPatch();
LinkedList<Diff> diffs = dmp.diff_main("Hello World.", "Goodbye World.");
System.out.println(diffs);
The result:
[Diff(DELETE,"Hell"), Diff(INSERT,"G"), Diff(EQUAL,"o"), Diff(INSERT,"odbye"), Diff(EQUAL," World.")]
Obviously, this was not originally written (or even ported fully) into Java. (diff_main? I can feel the C burning into my eyes :-) )
Still, it works. And for people working with long and complex strings, it can be a valuable tool.
To find the words that are different in the two lines, one can use the following code.
String[] strList1 = str1.split(" ");
String[] strList2 = str2.split(" ");
List<String> list1 = Arrays.asList(strList1);
List<String> list2 = Arrays.asList(strList2);
// Prepare a union
List<String> union = new ArrayList<>(list1);
union.addAll(list2);
// Prepare an intersection
List<String> intersection = new ArrayList<>(list1);
intersection.retainAll(list2);
// Subtract the intersection from the union
union.removeAll(intersection);
for (String s : union) {
System.out.println(s);
}
In the end, you will have a list of the words that appear in only one of the two lists. It is easy to modify this to keep only the words unique to the first list, or only those unique to the second: remove the intersection from list1 or list2 instead of from the union.
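As a rough sketch of that one-sided variant (the class and method names here are my own, not from any library): copy one word list and remove every word that also occurs in the other. Note that removeAll works on membership only, so word order and duplicates are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordDifference {
    // Words of `from` that do not occur anywhere in `other`.
    // Both strings are split on single spaces, as in the snippet above.
    static List<String> onlyIn(String from, String other) {
        List<String> result = new ArrayList<>(Arrays.asList(from.split(" ")));
        result.removeAll(Arrays.asList(other.split(" ")));
        return result;
    }

    public static void main(String[] args) {
        String str1 = "this is a first example";
        String str2 = "this is a second example";
        System.out.println(onlyIn(str1, str2)); // [first]
        System.out.println(onlyIn(str2, str1)); // [second]
    }
}
```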
Computing the exact location can be done by adding up the lengths of each word in the split list (along with the splitting regex) or by simply doing String.indexOf("subStr").
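The "adding up the lengths" idea could look roughly like this. It is only a sketch with made-up names, and it assumes the separator is always a single space (the same assumption split(" ") makes), so each word advances the position by its length plus one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WordOffsets {
    // Start offsets (within `text`) of the words that do not occur in `other`.
    static List<Integer> offsetsOfWordsMissingFrom(String text, String other) {
        Set<String> otherWords = new HashSet<>(Arrays.asList(other.split(" ")));
        List<Integer> offsets = new ArrayList<>();
        int position = 0;
        for (String word : text.split(" ")) {
            if (!otherWords.contains(word)) {
                offsets.add(position);
            }
            position += word.length() + 1; // +1 for the single-space separator
        }
        return offsets;
    }

    public static void main(String[] args) {
        String str1 = "this is a first example";
        String str2 = "this is a second example";
        System.out.println(offsetsOfWordsMissingFrom(str1, str2)); // [10]
    }
}
```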
On top of using StringUtils.difference(String first, String second) from Apache Commons Lang, as seen in other answers, you can also use StringUtils.indexOfDifference(String first, String second) to get the index at which the strings start to differ. For example:
StringUtils.indexOfDifference("abc", "dabc") = 0
StringUtils.indexOfDifference("abc", "abcd") = 3
where 0 is used as the starting index.
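If you do not want to pull in Commons Lang just for this, a hand-written stand-in is short. This is my own sketch, not the library's implementation: it assumes both arguments are non-null and returns -1 when the strings are equal.

```java
class FirstDifference {
    // Index of the first differing character, or -1 when the strings are equal.
    // If one string is a prefix of the other, the shorter length is returned.
    static int indexOfDifference(String first, String second) {
        int limit = Math.min(first.length(), second.length());
        for (int i = 0; i < limit; i++) {
            if (first.charAt(i) != second.charAt(i)) {
                return i;
            }
        }
        return first.length() == second.length() ? -1 : limit;
    }

    public static void main(String[] args) {
        System.out.println(indexOfDifference("abc", "dabc")); // 0
        System.out.println(indexOfDifference("abc", "abcd")); // 3
        System.out.println(indexOfDifference("abc", "abc"));  // -1
    }
}
```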
Another great library for discovering the difference between strings is DiffUtils at https://github.com/java-diff-utils. I used Dmitry Naumenko's fork:
public void testDiffChange() {
final List<String> changeTestFrom = Arrays.asList("aaa", "bbb", "ccc");
final List<String> changeTestTo = Arrays.asList("aaa", "zzz", "ccc");
System.out.println("changeTestFrom=" + changeTestFrom);
System.out.println("changeTestTo=" + changeTestTo);
final Patch<String> patch0 = DiffUtils.diff(changeTestFrom, changeTestTo);
System.out.println("patch=" + Arrays.toString(patch0.getDeltas().toArray()));
String original = "abcdefghijk";
String badCopy = "abmdefghink";
List<Character> originalList = original
.chars() // Convert to an IntStream
.mapToObj(i -> (char) i) // Convert int to char, which gets boxed to Character
.collect(Collectors.toList()); // Collect in a List<Character>
List<Character> badCopyList = badCopy.chars().mapToObj(i -> (char) i).collect(Collectors.toList());
System.out.println("original=" + original);
System.out.println("badCopy=" + badCopy);
final Patch<Character> patch = DiffUtils.diff(originalList, badCopyList);
System.out.println("patch=" + Arrays.toString(patch.getDeltas().toArray()));
}
The results show exactly what changed where (zero-based counting):
changeTestFrom=[aaa, bbb, ccc]
changeTestTo=[aaa, zzz, ccc]
patch=[[ChangeDelta, position: 1, lines: [bbb] to [zzz]]]
original=abcdefghijk
badCopy=abmdefghink
patch=[[ChangeDelta, position: 2, lines: [c] to [m]], [ChangeDelta, position: 9, lines: [j] to [n]]]
For a simple use case like this, you can check the lengths of the strings and use the split function. For your example:
a.split(b)[1]
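One caveat: String.split treats its argument as a regular expression, so if b contains characters such as ( or . the plain call may not match the literal text. A small sketch of two ways around that, with names of my own choosing: quote the pattern, or skip regex entirely with startsWith and substring.

```java
import java.util.regex.Pattern;

class SuffixAfterPrefix {
    // Returns what remains of `a` after the leading `b`,
    // or null if `a` does not start with `b`.
    static String remainder(String a, String b) {
        return a.startsWith(b) ? a.substring(b.length()) : null;
    }

    public static void main(String[] args) {
        String a = "cost (USD) total";
        String b = "cost (USD)";
        // Pattern.quote makes split treat b as literal text, not as a pattern.
        System.out.println(a.split(Pattern.quote(b))[1]); // prints " total"
        System.out.println(remainder(a, b));              // prints " total"
    }
}
```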
I think the Levenshtein algorithm and the 3rd party libraries brought out for this very simple (and perhaps poorly stated?) test case are WAY overblown.
Assuming your example does not suggest the two strings always differ at the end, I'd suggest the JDK's Arrays.mismatch( char[], char[] ) (available since Java 9) to find the first index where the two character arrays differ.
String longer = "this is a example";
String shorter = "this is a examp";
int differencePoint = Arrays.mismatch( longer.toCharArray(), shorter.toCharArray() );
System.out.println( differencePoint );
You could now repeat the process if you suspect the second character is further along in the String.
Or, if as you suggest in your example the two characters are together, there is nothing further to do. Your answer then would be:
System.out.println( longer.charAt( differencePoint ) );
System.out.println( longer.charAt( differencePoint + 1 ) );
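"Repeating the process" can be done with the ranged overload of Arrays.mismatch, restarting just past each hit. The sketch below uses my own names and only compares the part the two strings have in common; any extra tail on the longer string is not reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AllMismatches {
    // Collects every index at which the two strings differ, within their common length.
    static List<Integer> mismatchIndices(String first, String second) {
        char[] a = first.toCharArray();
        char[] b = second.toCharArray();
        int common = Math.min(a.length, b.length);
        List<Integer> indices = new ArrayList<>();
        int from = 0;
        while (from < common) {
            // The ranged overload returns an index relative to `from`,
            // or -1 when the two ranges are identical.
            int relative = Arrays.mismatch(a, from, common, b, from, common);
            if (relative < 0) {
                break;
            }
            indices.add(from + relative);
            from += relative + 1; // continue right after the mismatch
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(mismatchIndices("abcdefghijk", "abmdefghink")); // [2, 9]
    }
}
```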
If your string contains characters outside of the Basic Multilingual Plane - for example emoji - then you have to use a different technique. For example,
String a = "a 🐣 is cuter than a 🐇.";
String b = "a 🐣 is cuter than a 🐹.";
int firstDifferentChar = Arrays.mismatch( a.toCharArray(), b.toCharArray() );
int firstDifferentCodepoint = Arrays.mismatch( a.codePoints().toArray(), b.codePoints().toArray() );
System.out.println( firstDifferentChar ); // prints 22!
System.out.println( firstDifferentCodepoint ); // prints 20, which is correct.
System.out.println( a.codePoints().toArray()[ firstDifferentCodepoint ] ); // prints out 128007
System.out.println( new String( Character.toChars( 128007 ) ) ); // this prints the rabbit glyph.
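A code point index is not directly usable with charAt or substring, which count chars. If you need that, String.offsetByCodePoints converts it back. The helper below is a sketch with a name I made up; the strings are the same as above, written with escapes so the file encoding does not matter.

```java
import java.util.Arrays;

class CodePointDifference {
    // Char index at which the first differing code point starts, or -1 if the strings are equal.
    static int charIndexOfFirstDifferentCodePoint(String a, String b) {
        int codePointIndex = Arrays.mismatch(a.codePoints().toArray(), b.codePoints().toArray());
        return codePointIndex < 0 ? -1 : a.offsetByCodePoints(0, codePointIndex);
    }

    public static void main(String[] args) {
        String a = "a \uD83D\uDC23 is cuter than a \uD83D\uDC07."; // ends with the rabbit
        String b = "a \uD83D\uDC23 is cuter than a \uD83D\uDC39."; // ends with the hamster
        int index = charIndexOfFirstDifferentCodePoint(a, b);
        System.out.println(index);                // 21, the start of the surrogate pair
        System.out.println(a.codePointAt(index)); // 128007
    }
}
```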
You may try this
String a = "this is a example";
String b = "this is a examp";
String ans = a.replace(b, "");
System.out.print(ans);
// prints: le