Get the Smallest Matching String from a Long String

Suppose I have a String:
interpreter, interprete, interpret
What I want is to get the smallest matching string from the string above, which should be:
interpret
Is this possible in Java? If it is, can somebody help me work through this problem? Thanks.

Try this:
// Requires java.util.List and java.util.LinkedList.
public static void main(String[] ar)
{
    List<String> all = new LinkedList<String>();
    all.add("interpreter");
    all.add("interprete");
    all.add("interpret");
    // Start from the first word; switch to any word the current candidate contains.
    String small = all.get(0);
    for (String string : all) {
        if (small.contains(string)) {
            small = string;
        }
    }
    System.out.println(small);
}
Let me know whether this satisfies your requirement.
Edited version:
public static void main(String[] ar)
{
List<String> all=new LinkedList<String>();
Set<String> result=new LinkedHashSet<String>();
all.add("interpreter");
all.add("interprete");
all.add("interpret");
all.add("developed");
all.add("develops");
String small="";
for(int i=0;i<all.size();i++)
{
small=all.get(i);
for(int j=i;j<all.size();j++)
{
if(small.contains(all.get(j)))
{
small=all.get(j);
}
}
result.add(small);
}
for (String string : result) {
System.out.println(string);
}
}

If I get you correctly, you want the shortest word in an input string s which includes a target string t="interpret".
So first, split the string into words w, e.g., using s.split("\\s*,\\s*"), then use w.contains(t) on each string w to check if it contains the word you look for. Choose the shortest string for which the contains method returns true.
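A minimal sketch of the approach just described, assuming the input is a comma-separated string. The class and method names (ShortestMatch, shortestContaining) are made up for illustration and do not come from the answer.

```java
import java.util.Optional;

class ShortestMatch {
    // Returns the shortest comma-separated word in s that contains t, if any.
    static Optional<String> shortestContaining(String s, String t) {
        String shortest = null;
        for (String w : s.trim().split("\\s*,\\s*")) {
            if (w.contains(t) && (shortest == null || w.length() < shortest.length())) {
                shortest = w;
            }
        }
        return Optional.ofNullable(shortest);
    }

    public static void main(String[] args) {
        String s = "interpreter, interprete, interpret";
        System.out.println(shortestContaining(s, "interpret").orElse("(no match)"));
    }
}
```

Returning an Optional makes the "no word contains t" case explicit instead of returning null or an empty string.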

You need to compare the characters of all the strings one by one, maintaining an array of boolean flags
for every pair of strings. Then check the length over which all the boolean arrays agree, and take the substring
of that length from any of the strings.
I hope this helps.
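One way to read this answer is as a longest-common-prefix computation. The sketch below is an interpretation, not the answerer's code: it skips the boolean arrays and simply tracks how many leading characters all strings share. The names CommonPrefix and longestCommonPrefix are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class CommonPrefix {
    // Longest prefix shared by every string in the list ("" if the list is empty).
    static String longestCommonPrefix(List<String> words) {
        if (words.isEmpty()) {
            return "";
        }
        String first = words.get(0);
        int length = first.length();
        for (String w : words) {
            int i = 0;
            // Walk both strings while the characters agree.
            while (i < length && i < w.length() && first.charAt(i) == w.charAt(i)) {
                i++;
            }
            length = i; // the shared prefix can only shrink
        }
        return first.substring(0, length);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("interpreter", "interprete", "interpret");
        System.out.println(longestCommonPrefix(words)); // interpret
    }
}
```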

What you are looking for is called a lemmatizer/stemmer for Java.
There are a few of them (I have not used any), but you may want to search for and try some:
Snowball
Lemmatization
You should test each of them, because some (Snowball, for example) will produce:
Community
Communities --> Communiti // this is obviously wrong

Related

Return the first index from arraylist where string was found logic confusion

I have this method that I am trying to write, and I am having a hard time understanding the logic. This is the specification of the method:
public int search(String str) – search the list for parameter str.
Searches should work regardless of case. For example, “TOMATO” is
equivalent to “tomato.”
Hint: the String class has a method called
equalsIgnoreCase. If the string str appears more than once in the
ArrayList, return the first index where the string str was found or
return -1 if the string str was not found in the ArrayList.
This is what I have so far; I am not sure it is the right way to do it. My ArrayList is named words.
To solve this, I am thinking of using a for-each statement to iterate through the ArrayList, then an if to check whether the words match, and then returning the index of the match, but I am getting an error. My other confusion is how to return only the first index. Maybe I am doing this wrong. Any help or direction is appreciated.
public int search(String str)
{
for(String s : words)
if(s.contains(s.equalsIgnoreCase(str)))
return s.get(s.equalsIgnoreCase(str));
}

The first answer unnecessarily has to search through the list of words to find the index once it has determined that the word is in the list. The code should be able to already know the index. This is the more efficient approach:
public int search(String str) {
int i = 0;
for (String s : words) {
if (s.equalsIgnoreCase(str))
return i;
i++;
}
return -1;
}
There is also the more classic approach, the way it might have been done before the enhanced for loop was added to the Java language:
public int search(String str) {
for (int i = 0; i < words.size(); i++)
if (words.get(i).equalsIgnoreCase(str))
return i;
return -1;
}

You actually overcomplicated it a little bit:
public int search(String str) {
for(String s : words) {
if(s.equalsIgnoreCase(str)) {
return words.indexOf(s);
}
}
return -1;
}
Since the return statement exits the method immediately, it will always return the index of the first matching word.

You can also use a stream to solve this problem (note that this version reports whether a match exists rather than returning its index):
public boolean search(List<String> words, String wordToMatch)
{
Predicate<String> equalityPred = s -> s.equalsIgnoreCase(wordToMatch);
return words.stream().anyMatch(equalityPred);
}
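The stream version above returns a boolean, while the assignment asks for the first index or -1. A possible stream-based sketch that returns the index, assuming Java 8 or later; the class name FirstIndexSearch is made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class FirstIndexSearch {
    // Index of the first element equal to wordToMatch ignoring case, or -1 if none.
    static int search(List<String> words, String wordToMatch) {
        return IntStream.range(0, words.size())
                .filter(i -> words.get(i).equalsIgnoreCase(wordToMatch))
                .findFirst()
                .orElse(-1);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("potato", "Tomato", "TOMATO");
        System.out.println(search(words, "tomato")); // 1
        System.out.println(search(words, "carrot")); // -1
    }
}
```

Streaming over indices rather than elements keeps the position available, and findFirst stops at the earliest match.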

Split method creates empty elements in Java Array

I have the following String: "Make Me A SandWich"
Someone decided to troll me and replaced each space with a random number of LOLs,
so now the string is "LOLMakeLOLLOLLOLMELOLALOLSandWich".
My goal is to revert this change.
I tried to create a string array with the split method, but this produced "empty" elements in the array that have a value, yet when I try to log one it shows nothing. It also does not compare equal to "".
public class MyClass{
public static void main(String[] args) {
String trollText = "MakeLOLLOLLOLMELOLALOLSandWich";
String[] array = trollText.split("LOL");
if (array[1]=="")System.out.print("it's an empty string");
if (array[1]==" ")System.out.print("it's a space sign");
if (array[1]==null)System.out.print("it's equal to nothing");
if (array[1]==' '+"")System.out.print("I don't know what's that");
else System.out.print(array[1]+"<-- This is an element and it has a value");
}
}
I consider the problem solved if someone tells me what array[1] equals to.
Knowing the value will give me something to compare to when copying the elements into a new array.

When comparing two strings in Java, do not use the == operator, which compares object references. Use array[1].equals("") instead.
Also, if you simply want to replace all occurrences of a string, you can do the following:
trollText.replaceAll("LOL", " ")
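Because each space was replaced by a random number of LOLs, replacing every single "LOL" with a space leaves runs of spaces behind. A small sketch that collapses each run into one space instead; the class and method names (Untroll, untroll) are hypothetical, and it assumes the original text contained no genuine "LOL":

```java
class Untroll {
    // Replaces each run of one or more "LOL" with a single space, then trims the ends.
    static String untroll(String trollText) {
        return trollText.replaceAll("(LOL)+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(untroll("LOLMakeLOLLOLLOLMELOLALOLSandWich")); // Make ME A SandWich
    }
}
```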

Here is my solution: skip the empty strings, append the non-empty values to a new StringBuilder, and finally print the result.
import java.util.Arrays;
public class LOL_problem {
public static void main(String[] args) {
String trollText = "MakeLOLLOLLOLMELOLALOLSandWich";
StringBuilder sb = new StringBuilder();
String[] array = trollText.split("LOL");
//System.out.println(Arrays.toString(array));
for (String str : array) {
if (!str.equals("")) sb.append(str+" ");
}
System.out.println(sb.toString().trim());
}
}

We should use the equals method to check whether strings are equal, instead of '==', which checks object references.
To replace all occurrences, you can use the trollText.replaceAll method as below.
public class MyClass{
public static void main(String[] args) {
String trollText = "MakeLOLLOLLOLMELOLALOLSandWich";
String result = trollText.replaceAll("LOL", " ");
System.out.println(result);
}
}

To compare Strings in Java, call equals on one of them:
someString.equals("text");
This returns true if the two Strings have the same contents and false otherwise.
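To see what split actually yields for the string in the question, the sketch below prints the whole array and checks array[1] with equals and isEmpty. SplitDemo and splitOnLol are illustrative names only.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on the literal "LOL"; adjacent separators produce empty strings in between.
    static String[] splitOnLol(String text) {
        return text.split("LOL");
    }

    public static void main(String[] args) {
        String[] array = splitOnLol("MakeLOLLOLLOLMELOLALOLSandWich");
        System.out.println(Arrays.toString(array));   // [Make, , , ME, A, SandWich]
        System.out.println(array[1].equals(""));      // true
        System.out.println(array[1].isEmpty());       // true
    }
}
```

So array[1] is an empty string; the == checks in the question compare references rather than contents, which is why they cannot be relied on to detect it.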

Search a string against another using Regex

I need to check whether a String is contained in another String.
For example, "abc" is contained in "abc/def/gh","def/abc/gh" but not in "abcd/xyz/gh","def/abcd/gh".
So, I have split the input String by "/". Then iterated the generated String array to check against the input.
Is it possible to avoid the creation of the array using something like Regex?
Also, could anybody confirm whether using Regex will be faster than the creation & iteration of array as I have used?
Thanks in advance
public class RegexTest {
public static void main(String[] args) {
System.out.println(contains("abc/def/gh", "abc"));
System.out.println(contains("def/abc/gh", "abc"));
System.out.println(contains("def/abcd/gh", "abc"));
System.out.println(contains("abcd/xyz/gh", "abc"));
}
private static boolean contains(String input, String searchString) {
String[] strings = input.split("/");
for (String string : strings) {
if (string.equals(searchString))
return true;
}
return false;
}
}
The console output is:
true
true
false
false

Something like this:
String pattern = "(.*/)?abc(/.*)?";
System.out.println("abc/def/gh".matches(pattern));
System.out.println("def/abc/gh".matches(pattern));
System.out.println("def/abcd/gh".matches(pattern));
System.out.println("abcd/xyz/gh".matches(pattern));
prints
true
true
false
false

Using a regex may be more convenient, but time it yourself to see whether it is faster:
if (!searchString.contains("/")) {
return input.matches("(.*/)?" + Pattern.quote(searchString) + "(/.*)?");
} else {
return false;
}
I made sure that the searchString does not contain /, before inserting it as literal with Pattern.quote. The regex will make sure that there is a / before and after the search string in the input, either that or the search string is the first or last token in the input.
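If the same search string is checked against many inputs, compiling the pattern once avoids recompiling it on every call to matches. The sketch below is a variation, not the answer's code: it uses lookarounds so that the match must be bounded by "/" or by the ends of the input, and it assumes the search string itself contains no "/". SegmentMatcher and its methods are hypothetical names, and no performance claim is made; measure before relying on it.

```java
import java.util.regex.Pattern;

class SegmentMatcher {
    // Matches searchString only when it is a whole "/"-delimited segment:
    // not preceded by a non-slash character and not followed by one.
    static Pattern segmentPattern(String searchString) {
        return Pattern.compile("(?<![^/])" + Pattern.quote(searchString) + "(?![^/])");
    }

    static boolean contains(String input, Pattern segment) {
        return segment.matcher(input).find();
    }

    public static void main(String[] args) {
        Pattern abc = segmentPattern("abc"); // compiled once, reused below
        System.out.println(contains("abc/def/gh", abc));  // true
        System.out.println(contains("def/abc/gh", abc));  // true
        System.out.println(contains("def/abcd/gh", abc)); // false
        System.out.println(contains("abcd/xyz/gh", abc)); // false
    }
}
```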

Try this regex:
s.matches("^abc/.+|.+/abc/.+|.+/abc$")
or
s.startsWith("abc/") || s.contains("/abc/") || s.endsWith("/abc")

Ordering a string alphabetically - did I miss something obvious?

public class Anagram {
public static void main(String[] args) {
String a = "Despera tion-".toLowerCase();
String b = "A Rope Ends It".toLowerCase();
String aSorted = sortStringAlphabetically(a);
String bSorted = sortStringAlphabetically(b);
if(aSorted.equals(bSorted)){
System.out.println("Anagram Found!");
}else{
System.out.println("No anagram was found");
}
}
public static String sortStringAlphabetically(String s) {
char[] ca = s.toCharArray();
int cnt = 0;
ArrayList al = new ArrayList();
for (int i = 0; i < ca.length; i++) {
if (Character.isLetter(ca[cnt]))
al.add(ca[cnt]);
cnt++;
}
Collections.sort(al);
return al.toString();
}
}
As a learner, I hacked up this boolean anagram checker. My chosen solution was to create a sortStringAlphabetically method, which seems to do too much type-juggling (String -> char[] -> ArrayList -> String). Given that I just want to compare two strings to test whether one phrase is an anagram of the other, could I have done it with less type-juggling?
P.S. The tutor's solution was a mile away from my attempt, and probably much better for a lot of reasons, but I am really trying to get a handle on all the different Collection types.
http://www.home.hs-karlsruhe.de/~pach0003/informatik_1/aufgaben/en/doc/src-html/de/hska/java/exercises/arrays/Anagram.html#line.18
EDIT
FWIW, here is the original challenge; I realise I wandered away from the solution.
http://www.home.hs-karlsruhe.de/~pach0003/informatik_1/aufgaben/en/arrays.html
My initial knee-jerk reaction was to simply work through array a, knocking out those chars which matched with array b, but that seemingly required me to rebuild the array at every iteration. Many thanks for all your efforts to educate me.

There are different ways to improve this, if you go with this algorithm.
First, you don't necessarily need to create a character array. You can use String.charAt() to access a specific character of your string.
Second, you don't need a list. If you used a SortedMultiSet or a SortedBag, you could just add things in sorted order. If you write a function that creates the SortedMultiSet from your string, you could just compare the sets without rebuilding the string.
Note: I don't know what libraries you're allowed to use (Google and Apache have these types), but you can always 'brew your own'.
Also, make sure to use generics for your types. Just defining ArrayLists is pretty risky, IMHO.
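As a rough "brew your own" illustration of the multiset idea, the sketch below uses a plain HashMap from letter to count in place of a library SortedMultiSet or SortedBag. That is a substitution: ordering is not needed to test two multisets for equality. It reads characters with charAt and uses generics, as suggested above; AnagramCounts and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class AnagramCounts {
    // Counts each letter of s, ignoring case and skipping non-letter characters.
    static Map<Character, Integer> letterCounts(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = 0; i < s.length(); i++) {
            char c = Character.toLowerCase(s.charAt(i));
            if (Character.isLetter(c)) {
                counts.merge(c, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Two phrases are anagrams when their letter counts are identical.
    static boolean isAnagram(String a, String b) {
        return letterCounts(a).equals(letterCounts(b));
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("Despera tion-", "A Rope Ends It")); // true
        System.out.println(isAnagram("hello", "world"));                  // false
    }
}
```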

You could just sort the string without using a list:
public static String sortStringAlphabetically(String s) {
String lettersOnly = s.replaceAll("\\W", "");
char[] chars = lettersOnly.toCharArray();
Arrays.sort(chars);
return new String(chars);
}
N.B. I haven't actually tried running the code.

Your algorithm, but shorter (and yet, slower). The "type-juggling" is done "implicitly" in Java's various library classes:
public static boolean isAnagram(String a, String b) {
List<String> listA = new ArrayList<String>(Arrays.asList(
a.toLowerCase().replaceAll("\\W", "").split("")));
List<String> listB = new ArrayList<String>(Arrays.asList(
b.toLowerCase().replaceAll("\\W", "").split("")));
Collections.sort(listA);
Collections.sort(listB);
return listA.equals(listB);
}
Optionally, replace the \W regular expression to exclude those letters that you don't want to consider for the anagram

import java.util.Arrays;

public class Anagram {
public static void main(String[] args) throws Exception {
String s1 = "Despera tion-";
String s2 = "A Rope Ends It";
anagramCheck(s1, s2);
}
private static void anagramCheck(String s1, String s2) {
if (isAnagram(s1, s2)) {
System.out.println("Anagram Found!");
} else {
System.out.println("No anagram was found");
}
}
private static boolean isAnagram(String s1, String s2) {
return sort(s1).equals(sort(s2));
}
private static String sort(String s) {
char[] array = s.replaceAll("\\W", "").toLowerCase().toCharArray();
Arrays.sort(array);
return new String(array);
}
}

Check that word all words from one string exist in the other

How can I check that all the words from String #2 exist in String #1? It should be case insensitive, and I want to exclude all punctuation and special characters when comparing words.
Any help?
Thanks.

Algorithm
Iterate through words in String #1 and insert them as keys into a dictionary/hash/associative array.
Iterate through words in String #2 and check if each word is a key in the dictionary created in step 1.
If one is not found, return false.
After the iteration has finished, return true.
Running time: O(n)
I'll let someone else implement this in Java.
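The steps above could be sketched in Java roughly as follows. This is one possible implementation under stated assumptions: a "word" is taken to be a run of Unicode letters or digits (so "don't" splits into "don" and "t"), case is folded with Locale.ROOT, and the names WordContainment, words, and containsAllWords are invented for the example.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class WordContainment {
    // Lower-cases s and splits it on runs of characters that are neither letters nor digits.
    static Set<String> words(String s) {
        Set<String> result = new HashSet<>();
        for (String w : s.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) { // a leading separator yields one empty token
                result.add(w);
            }
        }
        return result;
    }

    // True when every word of s2 also occurs in s1.
    static boolean containsAllWords(String s1, String s2) {
        return words(s1).containsAll(words(s2));
    }

    public static void main(String[] args) {
        System.out.println(containsAllWords("This is the source string!", "source, STRING; is this")); // true
        System.out.println(containsAllWords("This is the source string", "from Pangea"));              // false
    }
}
```

Each hash lookup takes expected constant time, which is where the linear running time comes from.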

To find the words in a String while ignoring punctuation and the like, you can use the StringTokenizer class.
StringTokenizer st = new StringTokenizer("Your sentence;with whatever. punctuations? might exists", " :?.,-+=[]");
This breaks the String into tokens using the delimiters provided in the second argument. You can then use the hasMoreTokens() and nextToken() methods to iterate over the tokens.
Then you can use the algorithm suggested by @MattDiPasquale.
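For illustration, iterating the tokens might look like the sketch below, which collects them into a list. TokenizerDemo and tokens are made-up names; the sentence and delimiter string are the ones from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects every token of sentence, treating each character of delimiters as a separator.
    static List<String> tokens(String sentence, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(sentence, delimiters);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Your sentence;with whatever. punctuations? might exists", " :?.,-+=[]"));
        // [Your, sentence;with, whatever, punctuations, might, exists]
    }
}
```

Note that ';' is not in the delimiter string, so "sentence;with" comes back as a single token; every punctuation character to be ignored has to be listed explicitly.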

You can try String's built-in split method.
It looks like
public String[] split(String regex)
and it returns an array of Strings based on the regular expression you use.
You can easily generate two arrays this way (one for String #1 and one for String #2).
Sort the arrays and then check if the arrays are equal. (size and order)
You can simplify array sorting if you utilize java.util.Arrays
Arrays in Java have a lot of library methods and you should learn about them because they are incredibly useful sometimes:
http://leepoint.net/notes-java/data/arrays/arrays-library.html
This is slightly less efficient than building a dictionary/hash table/ADT with your selected delimiters (like in MattDiPasquale's answer), but it might be easier to understand if you are not very familiar with hash functions or dictionaries (as a datatype).

isContainsAll(s1, s2)
1. Split s2 by " ": s2.split(" ")
2. Check whether s1 contains all the elements of s2
public static boolean isContainsAll(String s1, String s2){
String[] split = s2.split(" ");
for(int i=0; i<split.length; i++){
if(!s1.contains(split[i])){
return false;
}
}
return true;
}
public static void main(String... args){
System.out.println(isContainsAll("asd dsasda das asd; asds asd;/ ", "asd;/"));
}

While the algorithm to do this is simple, the implementation is more involved if you want to support multiple locales. Below is sample code that supports multiple locales. I've verified it with English as well as Chinese (but I am not sure whether it passes the Turkey Test ;-)). Anyway, the code below needs some refactoring, but it will get you started.
NOTE: Even if you don't want to support languages other than English, I would still use the code below, because word boundaries, punctuation, grammar, etc. are locale/language dependent, which might not be well addressed by StringTokenizer, String.split(...) and other basic APIs.
import java.text.BreakIterator;
import java.text.Collator;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import org.apache.commons.lang.StringEscapeUtils;
public class UnicodeWordCount
{
public static void main(final String[] args)
{
testEnglish();
testChinese();
}
public static void testEnglish()
{
BreakIterator wordIterator = BreakIterator.getWordInstance(Locale.ENGLISH);
String str = "This is the source string";
String match = "source string is this";
String doesntMatch = "from Pangea";
Set<String> uniqueWords = extractWords(str, wordIterator, Locale.ENGLISH);
printWords(uniqueWords);
System.out.println("Should print true: " + contains(match, wordIterator, uniqueWords));
System.out.println("Should print false: " + contains(doesntMatch, wordIterator, uniqueWords));
}
public static void testChinese()
{
BreakIterator wordIterator = BreakIterator.getWordInstance(Locale.CHINESE);
String str = "\u4E0D\u70BA\u6307\u800C\u8B02\u4E4B\u6307\uFF0C\u662F[\u7121\u90E8]\u70BA\u6307\u3002\u201D\u5176\u539F\u6587\u70BA";
String match = "\u5176\u539F\u6587\u70BA\uFF0C\u70BA\u6307";
String doesntMatch = "\u4E0D\u70BA\u6307\u800C\u8B02\u4E4B\u6307\uFF0C\u662F[\u517C\u4E0D]\u70BA\u6307\u3002";
Set<String> uniqueWords = extractWords(str, wordIterator, Locale.CHINESE);
printWords(uniqueWords);
System.out.println("Should print true: " + contains(match, wordIterator, uniqueWords));
System.out.println("Should print false: " + contains(doesntMatch, wordIterator, uniqueWords));
}
public static Set<String> extractWords(final String input, final BreakIterator wordIterator, final Locale desiredLocale)
{
Collator collator = Collator.getInstance(desiredLocale);
collator.setStrength(Collator.PRIMARY);
Set<String> uniqueWords = new TreeSet<String>(collator);
wordIterator.setText(input);
int start = wordIterator.first();
int end = wordIterator.next();
while (end != BreakIterator.DONE)
{
String word = input.substring(start, end);
if (Character.isLetterOrDigit(word.charAt(0)))
{
uniqueWords.add(word);
}
start = end;
end = wordIterator.next();
}
return uniqueWords;
}
public static boolean contains(final String target, final BreakIterator wordIterator, final Set<String> uniqueWords)
{
wordIterator.setText(target);
int start = wordIterator.first();
int end = wordIterator.next();
while (end != BreakIterator.DONE)
{
String word = target.substring(start, end);
if (Character.isLetterOrDigit(word.charAt(0)))
{
if (!uniqueWords.contains(word))
{
return false;
}
}
start = end;
end = wordIterator.next();
}
return true;
}
private static void printWords(final Set<String> uniqueWords)
{
for (String word : uniqueWords)
{
System.out.println(StringEscapeUtils.escapeJava(word));
}
}
}
