Partially match strings in case of List.contains(String)

I have a List<String>
List<String> list = new ArrayList<String>();
list.add("ABCD");
list.add("EFGH");
list.add("IJ KL");
list.add("M NOP");
list.add("UVW X");
If I do list.contains("EFGH"), it returns true.
Can I get a true in case of list.contains("IJ")? I mean, can I partially match strings to find if they exist in the list?
I have a list of 15000 strings. And I have to check about 10000 strings if they exist in the list. What could be some other (faster) way to do this?
Thanks.

If the suggestion from Roadrunner-EX does not suffice, then I believe you are looking for the Knuth–Morris–Pratt (KMP) algorithm.
Time complexity:
Building the failure table (the preprocessing step) takes O(k).
Searching one string takes O(n).
So, the complexity of the overall algorithm is O(n + k).
n = length of the string being searched
k = length of the pattern you are searching for
A naive brute-force search takes O(nk) in the worst case.
Moreover, the table depends only on the pattern, so when you search many strings for the same pattern you build it once and reuse it; brute force, on the other hand, pays the worst-case O(nk) for every string.
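A minimal sketch of the KMP idea described above, assuming the goal is a plain substring test of one pattern against each list entry. The class and method names (KmpContains, failureTable, kmpContains, containsPartial) are made up for illustration and do not come from any library.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: class and method names are hypothetical.
class KmpContains {

    // Builds the KMP failure table: table[i] is the length of the longest
    // proper prefix of pattern[0..i] that is also a suffix of it.
    static int[] failureTable(String pattern) {
        int[] table = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = table[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            table[i] = k;
        }
        return table;
    }

    // Returns true if pattern occurs anywhere in text.
    static boolean kmpContains(String text, String pattern, int[] table) {
        if (pattern.isEmpty()) {
            return true;
        }
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = table[k - 1];
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                return true;
            }
        }
        return false;
    }

    // Builds the table once, then reuses it for every list entry.
    static boolean containsPartial(List<String> list, String pattern) {
        int[] table = failureTable(pattern);
        for (String entry : list) {
            if (kmpContains(entry, pattern, table)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("ABCD", "EFGH", "IJ KL", "M NOP", "UVW X");
        System.out.println(containsPartial(list, "IJ"));  // true
        System.out.println(containsPartial(list, "XYZ")); // false
    }
}
```

For short strings like the ones in the question, a plain String.contains() per entry may well be fast enough, so it is worth measuring before committing to this.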

Perhaps you want to put each String group into a HashSet, and by fragment, I mean don't add "IJ KL" but rather add "IJ" and "KL" separately. If you need both the list and this search capabilities, you may need to maintain two collections.
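A rough sketch of that idea, assuming "fragment" means a whitespace-separated piece of each entry. The class name FragmentIndex and the method buildIndex are hypothetical. Note that this only finds whole fragments: "IJ" is found, but "NO" (part of "NOP") is not.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: splits each entry on whitespace and indexes the pieces.
class FragmentIndex {

    // Collects every whitespace-separated fragment of every entry into one set.
    static Set<String> buildIndex(List<String> entries) {
        Set<String> fragments = new HashSet<>();
        for (String entry : entries) {
            for (String fragment : entry.trim().split("\\s+")) {
                if (!fragment.isEmpty()) {
                    fragments.add(fragment);
                }
            }
        }
        return fragments;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("ABCD", "EFGH", "IJ KL", "M NOP", "UVW X");
        Set<String> index = buildIndex(list);
        System.out.println(index.contains("IJ")); // true: "IJ" is a whole fragment
        System.out.println(index.contains("NO")); // false: only part of "NOP"
    }
}
```

The index is built once; each of the 10000 lookups is then a single hash lookup instead of a scan over the 15000 entries.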

As a second answer, upon rereading your question, you could also extend ArrayList<String> (so the list is specialized for Strings only) and override the contains() method.
public class PartialStringList extends ArrayList<String>
{
public boolean contains(Object o)
{
if(!(o instanceof String))
{
return false;
}
String s = (String)o;
Iterator<String> iter = iterator();
while(iter.hasNext())
{
String iStr = iter.next();
if (iStr.contains(s))
{
return true;
}
}
return false;
}
}
Judging by your earlier comments, this is maybe not the speed you're looking for, but is this more similar to what you were asking for?

You could use IterableUtils from Apache Commons Collections.
List<String> list = new ArrayList<String>();
list.add("ABCD");
list.add("EFGH");
list.add("IJ KL");
list.add("M NOP");
list.add("UVW X");
boolean hasString = IterableUtils.contains(list, "IJ", new Equator<String>() {
@Override
public boolean equate(String o1, String o2) {
return o2.contains(o1);
}
@Override
public int hash(String o) {
return o.hashCode();
}
});
System.out.println(hasString); // true

You can iterate over the list, and then call contains() on each String.
public boolean listContainsString(List<String> list, String checkStr)
{
Iterator<String> iter = list.iterator();
while(iter.hasNext())
{
String s = iter.next();
if (s.contains(checkStr))
{
return true;
}
}
return false;
}
Something like that should work, I think.

How about:
java.util.List<String> list = new java.util.ArrayList<String>();
list.add("ABCD");
list.add("EFGH");
list.add("IJ KL");
list.add("M NOP");
list.add("UVW X");
java.util.regex.Pattern p = java.util.regex.Pattern.compile("IJ");
java.util.regex.Matcher m = p.matcher("");
for(String s : list)
{
m.reset(s);
if(m.find()) System.out.println("Partially Matched");
}

Here's some code that uses a regex to shortcut the inner loop if none of the test Strings are found in the target String.
public static void main(String[] args) throws Exception {
List<String> haystack = Arrays.asList(new String[] { "ABCD", "EFGH", "IJ KL", "M NOP", "UVW X" });
List<String> needles = Arrays.asList(new String[] { "IJ", "NOP" });
// To cut down on iterations, create one big regex to check the whole haystack
StringBuilder sb = new StringBuilder();
sb.append(".*(");
for (String needle : needles) {
sb.append(needle).append('|');
}
sb.replace(sb.length() - 1, sb.length(), ").*");
String regex = sb.toString();
for (String target : haystack) {
if (!target.matches(regex)) {
System.out.println("Skipping " + target);
continue;
}
for (String needle : needles) {
if (target.contains(needle)) {
System.out.println(target + " contains " + needle);
}
}
}
}
Output:
Skipping ABCD
Skipping EFGH
IJ KL contains IJ
M NOP contains NOP
Skipping UVW X
If you really want to get cute, you could use a binary search to identify which segments of the target list match, but it might not be worth it.
It depends on how likely it is that you'll find a hit. Low hit rates will give a good result. High hit rates will perform not much better than the simple nested loop version. Consider inverting the loops if some needles hit many targets and others hit none.
It's all about aborting a search path ASAP.
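One caveat with assembling the regex by hand: a needle that contains regex metacharacters (for example "N.P") would be treated as a pattern rather than as literal text. A minimal sketch, assuming the needles are meant literally and the needle list is non-empty, that quotes each needle with Pattern.quote and uses Matcher.find() so the leading and trailing ".*" are not needed. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustrative sketch: names are hypothetical; assumes a non-empty needle list.
class QuotedNeedleRegex {

    // Joins the needles into one alternation, quoting each so that
    // characters such as '.' or '|' are matched literally.
    static Pattern anyNeedle(List<String> needles) {
        String alternation = needles.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation);
    }

    // find() looks for the pattern anywhere in the target.
    static boolean containsAny(String target, Pattern anyNeedle) {
        return anyNeedle.matcher(target).find();
    }

    public static void main(String[] args) {
        Pattern p = anyNeedle(Arrays.asList("IJ", "N.P"));
        for (String target : Arrays.asList("ABCD", "IJ KL", "M NOP", "M N.P")) {
            System.out.println(target + " -> " + containsAny(target, p));
        }
    }
}
```

Compiling the Pattern once also avoids recompiling the regex for every target, which is what String.matches(regex) does on each call.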

Yes, you can! Sort of.
What you are looking for, is often called fuzzy searching or approximate string matching and there are several solutions to this problem.
With the FuzzyWuzzy lib, for example, you can have all your strings assigned a score based on how similar they are to a particular search term. The actual values seem to be integer percentages of the number of characters matching with regards to the search string length.
After invoking FuzzySearch.extractAll, it is up to you to decide what the minimum score would be for a string to be considered a match.
There are also other, similar libraries worth checking out, like google-diff-match-patch or the Apache Commons Text Similarity API, and so on.
If you need something really heavy-duty, your best bet would probably be Lucene (as also mentioned by Ryan Shillington)

This is not a direct answer to the given problem, but I guess it will help someone who needs to partially compare a given string with the elements of a list using Apache Commons Collections.
final Equator<String> equator = new Equator<String>() {
@Override
public boolean equate(String o1, String o2) {
final int i1 = o1.lastIndexOf(":");
final int i2 = o2.lastIndexOf(":");
return o1.substring(0, i1).equals(o2.substring(0, i2));
}
@Override
public int hash(String o) {
final int i1 = o.lastIndexOf(":");
return o.substring(0, i1).hashCode();
}
};
final List<String> list = Lists.newArrayList("a1:v1", "a2:v2");
System.out.println(IteratorUtils.matchesAny(list.iterator(), new EqualPredicate("a2:v1", equator)));

Related

How to check if two objects in a ArrayList are the same? [duplicate]

How could I go about detecting (returning true/false) whether an ArrayList contains more than one of the same element in Java?
Many thanks,
Terry
Edit
Forgot to mention that I am not looking to compare "Blocks" with each other but their integer values. Each "block" has an int and this is what makes them different.
I find the int of a particular Block by calling a method named "getNum" (e.g. table1[0][2].getNum();

Simplest: dump the whole collection into a Set (using the Set(Collection) constructor or Set.addAll), then see if the Set has the same size as the ArrayList.
List<Integer> list = ...;
Set<Integer> set = new HashSet<Integer>(list);
if(set.size() < list.size()){
/* There are duplicates */
}
Update: If I'm understanding your question correctly, you have a 2d array of Block, as in
Block table[][];
and you want to detect if any row of them has duplicates?
In that case, I could do the following, assuming that Block implements "equals" and "hashCode" correctly:
for (Block[] row : table) {
Set<Block> set = new HashSet<Block>();
for (Block cell : row) {
set.add(cell);
}
if (set.size() < 6) { //has duplicate
}
}
I'm not 100% sure of that for syntax, so it might be safer to write it as
for (int i = 0; i < 6; i++) {
Set<Block> set = new HashSet<Block>();
for (int j = 0; j < 6; j++)
set.add(table[i][j]);
...
Set.add returns false if the item being added is already in the set, so you could even short-circuit and bail out on the first add that returns false if all you want to know is whether there are any duplicates.

Improved code, using return value of Set#add instead of comparing the size of list and set.
public static <T> boolean hasDuplicate(Iterable<T> all) {
Set<T> set = new HashSet<T>();
// Set#add returns false if the set does not change, which
// indicates that a duplicate element has been added.
for (T each: all) if (!set.add(each)) return true;
return false;
}

With Java 8+ you can use Stream API:
boolean areAllDistinct(List<Block> blocksList) {
return blocksList.stream().map(Block::getNum).distinct().count() == blocksList.size();
}

If you are looking to avoid having duplicates at all, then you should just cut out the middle process of detecting duplicates and use a Set.

Improved code to return the duplicate elements
Can find duplicates in a Collection
return the set of duplicates
Unique Elements can be obtained from the Set
public static <T> List<T> getDuplicate(Collection<T> list) {
final List<T> duplicatedObjects = new ArrayList<T>();
Set<T> set = new HashSet<T>() {
@Override
public boolean add(T e) {
if (contains(e)) {
duplicatedObjects.add(e);
}
return super.add(e);
}
};
for (T t : list) {
set.add(t);
}
return duplicatedObjects;
}
public static <T> boolean hasDuplicate(Collection<T> list) {
if (getDuplicate(list).isEmpty())
return false;
return true;
}

I needed to do a similar operation for a Stream, but couldn't find a good example. Here's what I came up with.
public static <T> boolean areUnique(final Stream<T> stream) {
final Set<T> seen = new HashSet<>();
return stream.allMatch(seen::add);
}
This has the advantage of short-circuiting when duplicates are found early rather than having to process the whole stream and isn't much more complicated than just putting everything in a Set and checking the size. So this case would roughly be:
List<T> list = ...
boolean allDistinct = areUnique(list.stream());

If your elements are somehow Comparable (whether the order has any real meaning is irrelevant; it just needs to be consistent with your definition of equality), the fastest duplicate removal solution is to sort the list (O(n log(n))) and then do a single pass looking for repeated elements, that is, equal elements that follow each other (this is O(n)).
The overall complexity is going to be O(n log(n)), which is roughly the same as what you would get with a Set (n times log(n)), but with a much smaller constant. This is because the constant in sort/dedup results from the cost of comparing elements, whereas the cost from the set is most likely to result from a hash computation, plus one (possibly several) hash comparisons. That is if you are using a hash-based Set implementation, because a tree-based one is going to give you O(n log²(n)), which is even worse.
As I understand it, however, you do not need to remove duplicates, but merely test for their existence. So you should hand-code a merge or heap sort algorithm on your array, that simply exits returning true (i.e. "there is a dup") if your comparator returns 0, and otherwise completes the sort, and traverse the sorted array testing for repeats. In a merge or heap sort, indeed, when the sort is completed, you will have compared every duplicate pair unless both elements were already in their final positions (which is unlikely). Thus, a tweaked sort algorithm should yield a huge performance improvement (I would have to prove that, but I guess the tweaked algorithm should be in the O(log(n)) on uniformly random data)

If you want the set of duplicate values:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class FindDuplicateInArrayList {
public static void main(String[] args) {
Set<String> uniqueSet = new HashSet<String>();
List<String> dupesList = new ArrayList<String>();
for (String a : args) {
if (uniqueSet.contains(a))
dupesList.add(a);
else
uniqueSet.add(a);
}
System.out.println(uniqueSet.size() + " distinct words: " + uniqueSet);
System.out.println(dupesList.size() + " dupesList words: " + dupesList);
}
}
And probably also think about trimming values or using lowercase ... depending on your case.

Simply put:
1) make sure all items are comparable
2) sort the array
3) iterate over the array and find duplicates
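The three steps above can be sketched as follows, assuming the elements are Comparable, contain no nulls, and that compareTo is consistent with equals. The class and method names are hypothetical. The input is copied first so the caller's list is not reordered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch: names are hypothetical; assumes non-null elements.
class SortedDuplicateCheck {

    // Sorts a copy of the input, then compares each element with its neighbour.
    static <T extends Comparable<? super T>> boolean hasDuplicate(List<T> items) {
        List<T> sorted = new ArrayList<>(items);
        Collections.sort(sorted);
        for (int i = 1; i < sorted.size(); i++) {
            if (sorted.get(i - 1).compareTo(sorted.get(i)) == 0) {
                return true; // equal neighbours mean a duplicate
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 2, 1))); // true
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 2)));    // false
    }
}
```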

To find the duplicates in a List, use the following code. It will give you the set which contains the duplicates.
public Set<?> findDuplicatesInList(List<?> beanList) {
System.out.println("findDuplicatesInList::"+beanList);
Set<Object> duplicateRowSet=null;
duplicateRowSet=new LinkedHashSet<Object>();
for(int i=0;i<beanList.size();i++){
Object superString=beanList.get(i);
System.out.println("findDuplicatesInList::superString::"+superString);
for(int j=0;j<beanList.size();j++){
if(i!=j){
Object subString=beanList.get(j);
System.out.println("findDuplicatesInList::subString::"+subString);
if(superString.equals(subString)){
duplicateRowSet.add(beanList.get(j));
}
}
}
}
System.out.println("findDuplicatesInList::duplicationSet::"+duplicateRowSet);
return duplicateRowSet;
}

The best way to handle this issue is to use a HashSet:
ArrayList<String> listGroupCode = new ArrayList<>();
listGroupCode.add("A");
listGroupCode.add("A");
listGroupCode.add("B");
listGroupCode.add("C");
HashSet<String> set = new HashSet<>(listGroupCode);
ArrayList<String> result = new ArrayList<>(set);
Just print the result ArrayList to see the list without duplicates :)

This answer is written in Kotlin, but can easily be translated to Java.
If your arraylist's size is within a fixed small range, then this is a great solution.
var duplicateDetected = false
if(arrList.size > 1){
for(i in 0 until arrList.size){
for(j in 0 until arrList.size){
if(i != j && arrList.get(i) == arrList.get(j)){
duplicateDetected = true
}
}
}
}

private boolean isDuplicate() {
for (int i = 0; i < arrayList.size(); i++) {
for (int j = i + 1; j < arrayList.size(); j++) {
if (arrayList.get(i).getName().trim().equalsIgnoreCase(arrayList.get(j).getName().trim())) {
return true;
}
}
}
return false;
}

String tempVal = null;
for (int i = 0; i < l.size(); i++) {
tempVal = l.get(i); //take the ith object out of list
while (l.contains(tempVal)) {
l.remove(tempVal); //remove all matching entries
}
l.add(tempVal); //at last add one entry
}
Note: this will have major performance hit though as items are removed from start of the list.
To address this, we have two options. 1) iterate in reverse order and remove elements. 2) Use LinkedList instead of ArrayList. Due to biased questions asked in interviews to remove duplicates from List without using any other collection, above example is the answer. In real world though, if I have to achieve this, I will put elements from List to Set, simple!
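For the "put elements from List to Set" route mentioned above, a minimal sketch, assuming the first-seen order of the elements should be kept (hence LinkedHashSet rather than HashSet). The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: the class and method names are hypothetical.
class DedupeWithSet {

    // Returns a new list with repeats dropped, keeping first-seen order.
    static <T> List<T> withoutDuplicates(List<T> input) {
        Set<T> unique = new LinkedHashSet<>(input);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> l = Arrays.asList("A", "A", "B", "C", "B");
        System.out.println(withoutDuplicates(l)); // [A, B, C]
    }
}
```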

/**
* Method to detect presence of duplicates in a generic list.
* Depends on the equals method of the concrete type. make sure to override it as required.
*/
public static <T> boolean hasDuplicates(List<T> list){
int count = list.size();
T t1,t2;
for(int i=0;i<count;i++){
t1 = list.get(i);
for(int j=i+1;j<count;j++){
t2 = list.get(j);
if(t2.equals(t1)){
return true;
}
}
}
return false;
}
An example of a concrete class that has overridden equals() :
public class Reminder{
private long id;
private int hour;
private int minute;
public Reminder(long id, int hour, int minute){
this.id = id;
this.hour = hour;
this.minute = minute;
}
@Override
public boolean equals(Object other){
if(other == null) return false;
if(this.getClass() != other.getClass()) return false;
Reminder otherReminder = (Reminder) other;
if(this.hour != otherReminder.hour) return false;
if(this.minute != otherReminder.minute) return false;
return true;
}
}

ArrayList<String> withDuplicates = new ArrayList<>();
withDuplicates.add("1");
withDuplicates.add("2");
withDuplicates.add("1");
withDuplicates.add("3");
// Set.add returns false when the value was already seen, which marks a duplicate.
HashSet<String> seen = new HashSet<>();
ArrayList<String> withoutDuplicates = new ArrayList<>();
ArrayList<String> duplicates = new ArrayList<>();
for (String word : withDuplicates) {
    if (seen.add(word)) {
        withoutDuplicates.add(word);
    } else {
        duplicates.add(word);
    }
}
System.out.println(duplicates);
System.out.println(withoutDuplicates);

A simple solution for learners.
//Method to find the duplicates.
public static List<Integer> findDublicate(List<Integer> numList){
List<Integer> dupLst = new ArrayList<Integer>();
//Compare one number against all the other number except the self.
for(int i =0;i<numList.size();i++) {
for(int j=0 ; j<numList.size();j++) {
if(i!=j && numList.get(i).equals(numList.get(j))) { // equals, not ==: these are Integer objects
boolean isNumExist = false;
//The below for loop is used for avoid the duplicate again in the result list
for(Integer aNum: dupLst) {
if(aNum.equals(numList.get(i))) {
isNumExist = true;
break;
}
}
if(!isNumExist) {
dupLst.add(numList.get(i));
}
}
}
}
return dupLst;
}

How to match the exact string value in the list of comma separated string [duplicate]

I have a String[] with values like so:
public static final String[] VALUES = new String[] {"AB","BC","CD","AE"};
Given String s, is there a good way of testing whether VALUES contains s?

Arrays.asList(yourArray).contains(yourValue)
Warning: this doesn't work for arrays of primitives (see the comments).
Since Java 8 you can use Streams.
String[] values = {"AB","BC","CD","AE"};
boolean contains = Arrays.stream(values).anyMatch(s::equals);
To check whether an array of int, double or long contains a value use IntStream, DoubleStream or LongStream respectively.
Example
int[] a = {1,2,3,4};
boolean contains = IntStream.of(a).anyMatch(x -> x == 4);

Concise update for Java SE 9
Reference arrays are bad. For this case we are after a set. Since Java SE 9 we have Set.of.
private static final Set<String> VALUES = Set.of(
"AB","BC","CD","AE"
);
"Given String s, is there a good way of testing whether VALUES contains s?"
VALUES.contains(s)
O(1).
The right type, immutable, O(1) and concise. Beautiful.*
Original answer details
Just to clear the code up to start with. We have (corrected):
public static final String[] VALUES = new String[] {"AB","BC","CD","AE"};
This is a mutable static which FindBugs will tell you is very naughty. Do not modify statics and do not allow other code to do so also. At an absolute minimum, the field should be private:
private static final String[] VALUES = new String[] {"AB","BC","CD","AE"};
(Note, you can actually drop the new String[]; bit.)
Reference arrays are still bad and we want a set:
private static final Set<String> VALUES = new HashSet<String>(Arrays.asList(
new String[] {"AB","BC","CD","AE"}
));
(Paranoid people, such as myself, may feel more at ease if this was wrapped in Collections.unmodifiableSet - it could then even be made public.)
(*To be a little more on brand, the collections API is predictably still missing immutable collection types and the syntax is still far too verbose, for my tastes.)

You can use ArrayUtils.contains from Apache Commons Lang
public static boolean contains(Object[] array, Object objectToFind)
Note that this method returns false if the passed array is null.
There are also methods available for primitive arrays of all kinds.
Example:
String[] fieldsToInclude = { "id", "name", "location" };
if ( ArrayUtils.contains( fieldsToInclude, "id" ) ) {
// Do some stuff.
}

Just simply implement it by hand:
public static <T> boolean contains(final T[] array, final T v) {
for (final T e : array)
if (e == v || v != null && v.equals(e))
return true;
return false;
}
Improvement:
The v != null condition is constant inside the method. It always evaluates to the same Boolean value during the method call. So if the input array is big, it is more efficient to evaluate this condition only once, and we can use a simplified/faster condition inside the for loop based on the result. The improved contains() method:
public static <T> boolean contains2(final T[] array, final T v) {
if (v == null) {
for (final T e : array)
if (e == null)
return true;
}
else {
for (final T e : array)
if (e == v || v.equals(e))
return true;
}
return false;
}

Four Different Ways to Check If an Array Contains a Value
Using List:
public static boolean useList(String[] arr, String targetValue) {
return Arrays.asList(arr).contains(targetValue);
}
Using Set:
public static boolean useSet(String[] arr, String targetValue) {
Set<String> set = new HashSet<String>(Arrays.asList(arr));
return set.contains(targetValue);
}
Using a simple loop:
public static boolean useLoop(String[] arr, String targetValue) {
for (String s: arr) {
if (s.equals(targetValue))
return true;
}
return false;
}
Using Arrays.binarySearch():
binarySearch() can ONLY be used on sorted arrays. If the array is not sorted, the result is undefined, so the code below is wrong for an unsorted array and is listed here only for completeness. When the array is already sorted, this is the best option.
public static boolean binarySearch(String[] arr, String targetValue) {
return Arrays.binarySearch(arr, targetValue) >= 0;
}
Quick Example:
String testValue="test";
String newValueNotInList="newValue";
String[] valueArray = { "this", "is", "java" , "test" };
Arrays.asList(valueArray).contains(testValue); // returns true
Arrays.asList(valueArray).contains(newValueNotInList); // returns false

If the array is not sorted, you will have to iterate over everything and make a call to equals on each.
If the array is sorted, you can do a binary search, there's one in the Arrays class.
Generally speaking, if you are going to do a lot of membership checks, you may want to store everything in a Set, not in an array.

For what it's worth I ran a test comparing the 3 suggestions for speed. I generated random integers, converted them to a String and added them to an array. I then searched for the highest possible number/string, which would be a worst case scenario for the asList().contains().
When using a 10K array size the results were:
Sort & Search : 15
Binary Search : 0
asList.contains : 0
When using a 100K array the results were:
Sort & Search : 156
Binary Search : 0
asList.contains : 32
So if the array is created in sorted order the binary search is the fastest, otherwise the asList().contains would be the way to go. If you have many searches, then it may be worthwhile to sort the array so you can use the binary search. It all depends on your application.
I would think those are the results most people would expect. Here is the test code:
import java.util.*;
public class Test {
public static void main(String args[]) {
long start = 0;
int size = 100000;
String[] strings = new String[size];
Random random = new Random();
for (int i = 0; i < size; i++)
strings[i] = "" + random.nextInt(size);
start = System.currentTimeMillis();
Arrays.sort(strings);
System.out.println(Arrays.binarySearch(strings, "" + (size - 1)));
System.out.println("Sort & Search : "
+ (System.currentTimeMillis() - start));
start = System.currentTimeMillis();
System.out.println(Arrays.binarySearch(strings, "" + (size - 1)));
System.out.println("Search : "
+ (System.currentTimeMillis() - start));
start = System.currentTimeMillis();
System.out.println(Arrays.asList(strings).contains("" + (size - 1)));
System.out.println("Contains : "
+ (System.currentTimeMillis() - start));
}
}

Instead of using the quick array initialisation syntax too, you could just initialise it as a List straight away in a similar manner using the Arrays.asList method, e.g.:
public static final List<String> STRINGS = Arrays.asList("firstString", "secondString" ...., "lastString");
Then you can do (like above):
STRINGS.contains("the string you want to find");

With Java 8 you can create a stream and check whether any entry in the stream matches s:
String[] values = {"AB","BC","CD","AE"};
boolean sInArray = Arrays.stream(values).anyMatch(s::equals);
Or as a generic method:
public static <T> boolean arrayContains(T[] array, T value) {
return Arrays.stream(array).anyMatch(value::equals);
}

You can use the Arrays class to perform a binary search for the value. If your array is not sorted, you will have to use the sort functions in the same class to sort the array, then search through it.
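A minimal sketch of the sort-then-search approach, assuming a String array and that sorting a copy (rather than the original array) is acceptable. The class and method names are hypothetical; Arrays.sort and Arrays.binarySearch are the standard java.util.Arrays methods.

```java
import java.util.Arrays;

// Illustrative sketch: the class and method names are hypothetical.
class SortedArrayLookup {

    // Returns a sorted copy so the caller's array keeps its original order.
    static String[] sortedCopy(String[] values) {
        String[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // binarySearch returns a non-negative index only when the target is found.
    // The array passed in must already be sorted.
    static boolean containsSorted(String[] sortedValues, String target) {
        return Arrays.binarySearch(sortedValues, target) >= 0;
    }

    public static void main(String[] args) {
        String[] sorted = sortedCopy(new String[] {"CD", "AB", "BC", "AE"});
        System.out.println(containsSorted(sorted, "BC")); // true
        System.out.println(containsSorted(sorted, "ZZ")); // false
    }
}
```

Sorting costs O(n log n), so this only pays off when the same array is searched many times.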

ObStupidAnswer (but I think there's a lesson in here somewhere):
enum Values {
AB, BC, CD, AE
}
try {
Values.valueOf(s);
return true;
} catch (IllegalArgumentException exc) {
return false;
}

Actually, if you use HashSet<String> as Tom Hawtin proposed you don't need to worry about sorting, and your speed is the same as with binary search on a presorted array, probably even faster.
It all depends on how your code is set up, obviously, but from where I stand, the order would be:
On an unsorted array:
HashSet
asList
sort & binary
On a sorted array:
HashSet
Binary
asList
So either way, HashSet for the win.

Developers often do:
Set<String> set = new HashSet<String>(Arrays.asList(arr));
return set.contains(targetValue);
The above code works, but there is no need to convert a list to a set first. Converting a list to a set requires extra time. It can be as simple as:
Arrays.asList(arr).contains(targetValue);
or
for (String s : arr) {
if (s.equals(targetValue))
return true;
}
return false;
The first one is more readable than the second one.

If you have the google collections library, Tom's answer can be simplified a lot by using ImmutableSet (http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ImmutableSet.html)
This really removes a lot of clutter from the initialization proposed
private static final Set<String> VALUES = ImmutableSet.of("AB","BC","CD","AE");

In Java 8 use Streams.
List<String> myList =
Arrays.asList("a1", "a2", "b1", "c2", "c1");
myList.stream()
.filter(s -> s.startsWith("c"))
.map(String::toUpperCase)
.sorted()
.forEach(System.out::println);

One possible solution:
import java.util.Arrays;
import java.util.List;
public class ArrayContainsElement {
public static final List<String> VALUES = Arrays.asList("AB", "BC", "CD", "AE");
public static void main(String args[]) {
if (VALUES.contains("AB")) {
System.out.println("Contains");
} else {
System.out.println("Not contains");
}
}
}

Using a simple loop is the most efficient way of doing this.
boolean useLoop(String[] arr, String targetValue) {
for(String s: arr){
if(s.equals(targetValue))
return true;
}
return false;
}
Courtesy to Programcreek

the shortest solution
the array VALUES may contain duplicates
since Java 9
List.of(VALUES).contains(s);

Use the following (the contains() method is ArrayUtils.in() in this code):
ObjectUtils.java
public class ObjectUtils {
/**
* A null safe method to detect if two objects are equal.
* @param object1
* @param object2
* @return true if either both objects are null, or equal, else returns false.
*/
public static boolean equals(Object object1, Object object2) {
return object1 == null ? object2 == null : object1.equals(object2);
}
}
ArrayUtils.java
public class ArrayUtils {
/**
* Find the index of an object in the given array,
* starting from the given inclusive index.
* @param ts Array to be searched in.
* @param t Object to be searched.
* @param start The index from where the search must start.
* @return Index of the given object in the array if it is there, else -1.
*/
public static <T> int indexOf(final T[] ts, final T t, int start) {
for (int i = start; i < ts.length; ++i)
if (ObjectUtils.equals(ts[i], t))
return i;
return -1;
}
/**
* Find the index of an object in the given array, starting from 0.
* @param ts Array to be searched in.
* @param t Object to be searched.
* @return indexOf(ts, t, 0)
*/
public static <T> int indexOf(final T[] ts, final T t) {
return indexOf(ts, t, 0);
}
/**
* Detect if the given object is in the given array.
* @param ts Array to be searched in.
* @param t Object to be searched.
* @return If indexOf(ts, t) is greater than -1.
*/
public static <T> boolean in(final T[] ts, final T t) {
return indexOf(ts, t) > -1;
}
}
As you can see in the code above, there are other utility methods, ObjectUtils.equals() and ArrayUtils.indexOf(), that are used in other places as well.

For arrays of limited length use the following (as given by camickr). This is slow for repeated checks, especially for longer arrays (linear search).
Arrays.asList(...).contains(...)
For fast performance if you repeatedly check against a larger set of elements
An array is the wrong structure. Use a TreeSet and add each element to it. It keeps the elements sorted and has a fast contains() method (a tree lookup in O(log n)).
If the elements implement Comparable and you want the TreeSet sorted accordingly:
The ElementClass.compareTo() method must be compatible with ElementClass.equals(): see Triads not showing up to fight? (Java Set missing an item)
TreeSet myElements = new TreeSet();
// Do this for each element (implementing *Comparable*)
myElements.add(nextElement);
// *Alternatively*, if an array is forcibly provided from other code:
myElements.addAll(Arrays.asList(myArray));
Otherwise, use your own Comparator:
class MyComparator implements Comparator<ElementClass> {
public int compare(ElementClass element1, ElementClass element2) {
// Your comparison of elements
// Should be consistent with object equality
}
boolean equals(Object otherComparator) {
// Your equality of comparators
}
}
// construct TreeSet with the comparator
TreeSet myElements = new TreeSet(new MyComparator());
// Do this for each element (implementing *Comparable*)
myElements.add(nextElement);
The payoff: check existence of some element:
// Fast binary search through sorted elements (performance ~ log(size)):
boolean containsElement = myElements.contains(someElement);

If you don't want it to be case sensitive
Arrays.stream(VALUES).anyMatch(s::equalsIgnoreCase);

Try this:
ArrayList<Integer> arrlist = new ArrayList<Integer>(8);
// use add() method to add elements in the list
arrlist.add(20);
arrlist.add(25);
arrlist.add(10);
arrlist.add(15);
boolean retval = arrlist.contains(10);
if (retval == true) {
System.out.println("10 is contained in the list");
}
else {
System.out.println("10 is not contained in the list");
}

Check this
String[] VALUES = new String[]{"AB", "BC", "CD", "AE"};
String s = "AB"; // the value to look for; must be initialized before use
for (int i = 0; i < VALUES.length; i++) {
if (VALUES[i].equals(s)) {
// do your stuff
} else {
//do your stuff
}
}

Arrays.asList() -> then calling the contains() method will always work, but a search algorithm is much better since you don't need to create a lightweight list wrapper around the array, which is what Arrays.asList() does.
public boolean findString(String[] strings, String desired){
for (String str : strings){
if (desired.equals(str)) {
return true;
}
}
return false; //if we get here… there is no desired String, return false.
}

Use below -
String[] values = {"AB","BC","CD","AE"};
String s = "A";
boolean contains = Arrays.stream(values).anyMatch(v -> v.contains(s));

Use Arrays.binarySearch(array, obj) to check whether the given object is in the array. The array must be sorted first.
Example:
if (Arrays.binarySearch(str, i) >= 0) → true (exists)
otherwise → false (does not exist)

Try using Java 8 predicate test method
Here is a full example of it.
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
public class Test {
public static final List<String> VALUES =
Arrays.asList("AA", "AB", "BC", "CD", "AE");
public static void main(String args[]) {
Predicate<String> containsLetterA = VALUES -> VALUES.contains("AB");
for (String i : VALUES) {
System.out.println(containsLetterA.test(i));
}
}
}
http://mytechnologythought.blogspot.com/2019/10/java-8-predicate-test-method-example.html
https://github.com/VipulGulhane1/java8/blob/master/Test.java

Create a boolean initially set to false. Run a loop to check every value in the array and compare to the value you are checking against. If you ever get a match, set boolean to true and stop the looping. Then assert that the boolean is true.
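The description above could be sketched like this, assuming a String array; the class and method names are made up for illustration.

```java
// Illustrative sketch: the class and method names are hypothetical.
class FlagSearch {

    static boolean contains(String[] values, String target) {
        boolean found = false;          // start out assuming no match
        for (String value : values) {
            if (value != null && value.equals(target)) {
                found = true;           // remember the match
                break;                  // and stop looping
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String[] values = {"AB", "BC", "CD", "AE"};
        System.out.println(contains(values, "CD")); // true
        System.out.println(contains(values, "XY")); // false
    }
}
```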

As I'm dealing with low-level Java using the primitive types byte and byte[], the best I have found so far is bytes-java (https://github.com/patrickfav/bytes-java), which seems a fine piece of work.

You can check it by two methods
A) By converting the array into string and then check the required string by .contains method
String a = Arrays.toString(VALUES);
System.out.println(a.contains("AB"));
System.out.println(a.contains("BC"));
System.out.println(a.contains("CD"));
System.out.println(a.contains("AE"));
B) This is a more efficient method
Scanner s = new Scanner(System.in);
String u = s.next();
for (int i = 0; i < VALUES.length; i++) {
// print only the entries that equal the user's input
if (VALUES[i].equals(u))
System.out.println(VALUES[i] + " " + u + VALUES[i].equals(u));
}

How can I check if a string has a substring from a List?

I am looking for the best way to check if a string contains a substring from a list of keywords.
For example, I create a list like this:
List<String> keywords = new ArrayList<>();
keywords.add("mary");
keywords.add("lamb");
String s1 = "mary is a good girl";
String s2 = "she likes travelling";
String s1 has "mary" from the keywords, but string s2 does not have it. So, I would like to define a method:
boolean containsAKeyword(String str, List<String> keywords)
Where containsAKeyword(s1, keywords) would return true but containsAKeyword(s2, keywords) would return false. I can return true even if there is a single substring match.
I know I can just iterate over the keywords list and call str.contains() on each item in the list, but I was wondering if there is a better way to iterate over the complete list (avoid O(n) complexity) or if Java provides any built-in methods for this.

I would recommend iterating over the entire list. Thankfully, you can use an enhanced for loop:
for(String listItem : myArrayList){
if(myString.contains(listItem)){
// do something.
}
}
EDIT To the best of my knowledge, you have to iterate the list somehow. Think about it, how will you know which elements are contained in the list without going through it?
EDIT 2
The only way I can see the iteration running quickly is to do the above. The way this is designed, it will break early once you've found a match, without searching any further. You can put your return false statement at the end of looping, because if you have checked the entire list without finding a match, clearly there is none. Here is some more detailed code:
public boolean containsAKeyword(String myString, List<String> keywords){
for(String keyword : keywords){
if(myString.contains(keyword)){
return true;
}
}
return false; // Never found match.
}
EDIT 3
If you're using Kotlin, you can do this with the any method:
val containsKeyword = myArrayList.any { it.contains("keyword") }

In JDK8 you can do this like:
public static boolean hasKey(String key) {
return keywords.stream().filter(k -> key.contains(k)).collect(Collectors.toList()).size() > 0;
}
hasKey(s1); // returns true
hasKey(s2); // returns false

Now you can use Java 8 stream for this purpose:
keywords.stream().anyMatch(keyword -> str.contains(keyword));

Here is the solution
List<String> keywords = new ArrayList<>();
keywords.add("mary");
keywords.add("lamb");
String s1 = "mary is a good girl";
String s2 = "she likes travelling";
// The function
boolean check(String str, List<String> keywords) {
Iterator<String> it = keywords.iterator();
while(it.hasNext()){
if(str.contains(it.next()))
return true;
}
return false;
}

Iterate over the keyword list and return true if the string contains your keyword. Return false otherwise.
public boolean containsAKeyword(String str, List<String> keywords){
for(String k : keywords){
if(str.contains(k))
return true;
}
return false;
}

You can add all the words in keywords in a hashmap. Then you can use str.contains for string 1 and string 2 to check if keywords are available.
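
One possible reading of this suggestion, sketched under the assumption that keywords should match whole whitespace-separated words rather than arbitrary substrings: keep the keywords in a HashSet and look up each word of the string. The class name KeywordSetLookup is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class KeywordSetLookup {
    // Returns true if any whitespace-separated word of `str` is a keyword.
    // Assumption: whole-word matching is acceptable ("rosemary" does not match "mary").
    static boolean containsAKeyword(String str, Set<String> keywords) {
        for (String word : str.split("\\s+")) {
            if (keywords.contains(word)) {
                return true; // stop at the first hit
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>(Arrays.asList("mary", "lamb"));
        System.out.println(containsAKeyword("mary is a good girl", keywords));  // true
        System.out.println(containsAKeyword("she likes travelling", keywords)); // false
    }
}
```

With this layout the cost depends on the number of words in the string rather than the number of keywords, which may help when the keyword list is large. It is not equivalent to str.contains(), which also matches inside longer words.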

Depending on the size of the list, I would suggest using the matches() method of String. String.matches takes a regex argument, so with smaller lists you could simply build a regular expression and evaluate it:
String Str = new String("This is a test string");
System.out.println(Str.matches("(.*)test(.*)"));
This should print out "true."
Or you could use java.util.regex.Pattern.
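
A rough sketch of the java.util.regex.Pattern variant: join the keywords into one alternation, compile it once, and use find() so no leading or trailing (.*) is needed. The class name KeywordPattern and its helpers are hypothetical; Pattern.quote is used on the assumption that keywords are literal text, not regex fragments.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class KeywordPattern {
    // Builds a pattern like \Qmary\E|\Qlamb\E from the keyword list.
    // Caveat: an empty list yields an empty pattern, which matches everything.
    static Pattern compile(List<String> keywords) {
        String alternation = keywords.stream()
                .map(Pattern::quote) // treat each keyword as a literal
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation);
    }

    // find() looks for the pattern anywhere in the string.
    static boolean containsAKeyword(String str, Pattern pattern) {
        return pattern.matcher(str).find();
    }

    public static void main(String[] args) {
        Pattern pattern = compile(Arrays.asList("mary", "lamb"));
        System.out.println(containsAKeyword("mary is a good girl", pattern));  // true
        System.out.println(containsAKeyword("she likes travelling", pattern)); // false
    }
}
```

Compiling once and reusing the Pattern avoids rebuilding the regex for every string that is checked.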

Removing a String from an ArrayList

So I have a problem that takes the names of people from a user and stores them in an ArrayList(personalNames). After that I need to take that list and remove any name that has anything besides letters a-z (anything with numbers or symbols) in it and put them into a separate ArrayList(errorProneNames) that holds the errors. Could someone help me with the removal part?
public class NameList {
public static void main(String[] args) {
ArrayList<String> personalNames = new ArrayList<String>();
Scanner input = new Scanner(System.in);
String answer;
do{
System.out.println("Enter the personal Names: ");
String names = input.next();
personalNames.add(names);
System.out.println("would you like to enter another name (yes/no)?");
answer = input.next();
} while (answer.equalsIgnoreCase("yes"));
ArrayList<String> errorProneNames = new ArrayList<String>();
}
}

If it's the "how do I remove an element from an ArrayList<>" part which is causing problems, and you want to check all the values, you probably want to use an Iterator and call remove on that:
for (Iterator<String> iterator = personalNames.iterator(); iterator.hasNext(); ) {
String name = iterator.next();
if (isErrorProne(name)) {
iterator.remove();
}
}
Note that you mustn't remove an element from a collection while you're iterating over it in an enhanced-for loop except with the iterator. So this would be wrong:
// BAD CODE: DO NOT USE
for (String name : personalNames) {
if (isErrorProne(name)) {
personalNames.remove(name);
}
}
That will throw a ConcurrentModificationException.
Another option would be to create a new list of good names:
List<String> goodNames = new ArrayList<>();
for (String name : personalNames) {
if (!isErrorProne(name)) {
goodNames.add(name);
}
}
Now, if your real problem is that you don't know how to write the isErrorProne method, that's a different matter. I suspect that you want to use a regular expression to check that the name only contains letters, spaces, hyphens, and perhaps apostrophes - but you should think carefully about exactly what you want here. So you might want:
private static boolean isErrorProne(String name) {
return !name.matches("^[a-zA-Z \\-']+$");
}
Note that that won't cope with accented characters, for example. Maybe that's okay for your situation - maybe it's not. You need to consider exactly what you want to allow, and adjust the regular expression accordingly.
You may also want to consider expressing it in terms of whether something is a good name rather than whether it's a bad name - particularly if you use the last approach of building up a new list of good names.

Here is your solution :
String regex = "[a-zA-Z]*";
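// Use an explicit Iterator: calling personalNames.remove() inside an
// enhanced-for loop would throw a ConcurrentModificationException.
for (Iterator<String> it = personalNames.iterator(); it.hasNext(); ) {
String temp = it.next();
if (!temp.matches(regex)) {
errorProneNames.add(temp);
it.remove();
}
}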

You can use the remove() method of ArrayList
personalNames.remove("stringToBeRemoved");
Overloaded versions are available: you can remove by index or by object (the String itself). See the Javadocs for more info.
Also to remove all String having anything but a-z letters you can use regex. Logic is as follows
String regex = "[a-zA-Z]*";
String testString = "abc1";
if(!testString.matches(regex)){
System.out.println("Remove this");
}
As Jon pointed out, while iterating over the List do not use the List's remove() method but the iterator's remove() method.

There are two ways you can do this:
The first is to iterate backwards through the list, remove them, then add them into the second list. I say to do it backwards, because it will change the index.
for (int i = personalNames.size()-1; i >= 0; i--) {
if (isBadName(personalNames.get(i))) {
errorProneNames.add(personalNames.get(i));
personalNames.remove(i);
}
}
The second way is to use the Iterator provided by ArrayList (personalNames.iterator()). This will allow you to go forward.

I would probably do this
// Check that the string contains only letters.
private static boolean onlyLetters(String in) {
if (in == null) {
return false;
}
for (char c : in.toCharArray()) {
if (!Character.isLetter(c)) {
return false;
}
}
return true;
}
public static void main(String[] args) {
ArrayList<String> personalNames = new ArrayList<String>();
ArrayList<String> errorProneNames = new ArrayList<String>(); // keep this list here.
Scanner input = new Scanner(System.in);
String answer;
do {
System.out.println("Enter the personal Names: ");
String names = input.next();
if (onlyLetters(names)) { // test on input.
personalNames.add(names); // good.
} else {
errorProneNames.add(names); // bad.
}
System.out
.println("would you like to enter another name (yes/no)?");
answer = input.next();
} while (answer.equalsIgnoreCase("yes"));
}

Get an iterator from the list. While the iterator has a next element, pass that element to a method, for example isNotProneName, which takes a String and returns false if the String does not meet your requirements. If it returns false, remove the String through the iterator and add it to the other list.
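
The steps described above might look roughly like this. The class name MoveBadNames, the helper names, and the letters-only rule are assumptions made for the sketch, not part of the original question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class MoveBadNames {
    // Assumed rule for illustration: a valid name has only the letters a-z / A-Z.
    static boolean isValidName(String name) {
        return name.matches("[a-zA-Z]+");
    }

    // Removes invalid names from `names` (in place) and returns them as a new list.
    static List<String> extractInvalid(List<String> names) {
        List<String> invalid = new ArrayList<>();
        Iterator<String> it = names.iterator();
        while (it.hasNext()) {
            String name = it.next();
            if (!isValidName(name)) {
                it.remove();       // safe removal during iteration
                invalid.add(name);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        List<String> personalNames = new ArrayList<>(Arrays.asList("Alice", "B0b", "Carol", "D@ve"));
        List<String> errorProneNames = extractInvalid(personalNames);
        System.out.println(personalNames);   // [Alice, Carol]
        System.out.println(errorProneNames); // [B0b, D@ve]
    }
}
```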

Use regex [a-zA-Z ]+ with String.matches to test error-prone name and Iterator to remove.
Iterator<String> it=personalNames.iterator();
while(it.hasNext()){
String name=it.next();
if(!name.matches("[a-zA-Z ]+")){
it.remove();
}
}

Java: Detect duplicates in ArrayList?

How could I go about detecting (returning true/false) whether an ArrayList contains more than one of the same element in Java?
Many thanks,
Terry
Edit
Forgot to mention that I am not looking to compare "Blocks" with each other but their integer values. Each "block" has an int and this is what makes them different.
I find the int of a particular Block by calling a method named "getNum" (e.g. table1[0][2].getNum();

Simplest: dump the whole collection into a Set (using the HashSet(Collection) constructor or Set.addAll), then see if the Set has the same size as the ArrayList.
List<Integer> list = ...;
Set<Integer> set = new HashSet<Integer>(list);
if(set.size() < list.size()){
/* There are duplicates */
}
Update: If I'm understanding your question correctly, you have a 2d array of Block, as in
Block table[][];
and you want to detect if any row of them has duplicates?
In that case, I could do the following, assuming that Block implements "equals" and "hashCode" correctly:
for (Block[] row : table) {
Set<Block> set = new HashSet<Block>();
for (Block cell : row) {
set.add(cell);
}
if (set.size() < 6) { //has duplicate
}
}
I'm not 100% sure of that for syntax, so it might be safer to write it as
for (int i = 0; i < 6; i++) {
Set<Block> set = new HashSet<Block>();
for (int j = 0; j < 6; j++)
set.add(table[i][j]);
...
Set.add returns false if the item being added is already in the set, so you could even short-circuit and bail out on any add that returns false if all you want to know is whether there are any duplicates.

Improved code, using return value of Set#add instead of comparing the size of list and set.
public static <T> boolean hasDuplicate(Iterable<T> all) {
Set<T> set = new HashSet<T>();
// Set#add returns false if the set does not change, which
// indicates that a duplicate element has been added.
for (T each: all) if (!set.add(each)) return true;
return false;
}

With Java 8+ you can use Stream API:
boolean areAllDistinct(List<Block> blocksList) {
return blocksList.stream().map(Block::getNum).distinct().count() == blocksList.size();
}

If you are looking to avoid having duplicates at all, then you should just cut out the middle process of detecting duplicates and use a Set.
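
A small sketch of that idea, assuming insertion order should be kept (hence LinkedHashSet; a plain HashSet would also reject duplicates but gives no ordering guarantee). The class name NoDuplicatesSet and the helper collect are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class NoDuplicatesSet {
    // Collects values into a Set; repeated values are silently ignored.
    static Set<Integer> collect(int... values) {
        Set<Integer> unique = new LinkedHashSet<>();
        for (int value : values) {
            unique.add(value); // add returns false and changes nothing for a repeat
        }
        return unique;
    }

    public static void main(String[] args) {
        System.out.println(collect(3, 1, 3, 2, 1)); // [3, 1, 2]
    }
}
```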

Improved code to return the duplicate elements
Can find duplicates in a Collection
return the set of duplicates
Unique Elements can be obtained from the Set
public static <T> List<T> getDuplicate(Collection<T> list) {
final List<T> duplicatedObjects = new ArrayList<T>();
Set<T> set = new HashSet<T>() {
@Override
public boolean add(T e) {
if (contains(e)) {
duplicatedObjects.add(e);
}
return super.add(e);
}
};
for (T t : list) {
set.add(t);
}
return duplicatedObjects;
}
public static <T> boolean hasDuplicate(Collection<T> list) {
if (getDuplicate(list).isEmpty())
return false;
return true;
}

I needed to do a similar operation for a Stream, but couldn't find a good example. Here's what I came up with.
public static <T> boolean areUnique(final Stream<T> stream) {
final Set<T> seen = new HashSet<>();
return stream.allMatch(seen::add);
}
This has the advantage of short-circuiting when duplicates are found early rather than having to process the whole stream and isn't much more complicated than just putting everything in a Set and checking the size. So this case would roughly be:
List<T> list = ...
boolean allDistinct = areUnique(list.stream());

If your elements are somehow Comparable (whether the order has any real meaning is irrelevant; it just needs to be consistent with your definition of equality), the fastest duplicate removal solution is going to sort the list (O(n log(n))) and then do a single pass looking for repeated elements (that is, equal elements that follow each other), which is O(n).
The overall complexity is going to be O(n log(n)), which is roughly the same as what you would get with a Set (n times log(n)), but with a much smaller constant. This is because the constant in sort/dedup results from the cost of comparing elements, whereas the cost from the set is most likely to result from a hash computation, plus one (possibly several) hash comparisons. If you are using a hash-based Set implementation, that is, because a Tree based is going to give you a O( n log²(n) ), which is even worse.
As I understand it, however, you do not need to remove duplicates, but merely test for their existence. So you should hand-code a merge or heap sort algorithm on your array, that simply exits returning true (i.e. "there is a dup") if your comparator returns 0, and otherwise completes the sort, and traverse the sorted array testing for repeats. In a merge or heap sort, indeed, when the sort is completed, you will have compared every duplicate pair unless both elements were already in their final positions (which is unlikely). Thus, a tweaked sort algorithm should yield a huge performance improvement (I would have to prove that, but I guess the tweaked algorithm should be in the O(log(n)) on uniformly random data)

If you want the set of duplicate values:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class FindDuplicateInArrayList {
public static void main(String[] args) {
Set<String> uniqueSet = new HashSet<String>();
List<String> dupesList = new ArrayList<String>();
for (String a : args) {
if (uniqueSet.contains(a))
dupesList.add(a);
else
uniqueSet.add(a);
}
System.out.println(uniqueSet.size() + " distinct words: " + uniqueSet);
System.out.println(dupesList.size() + " dupesList words: " + dupesList);
}
}
And probably also think about trimming values or using lowercase ... depending on your case.

Simply put:
1) make sure all items are comparable
2) sort the array
3) iterate over the array and find duplicates
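
The three steps above can be sketched as follows, assuming the elements implement Comparable consistently with equality. The class name SortAndScan is hypothetical, and the sketch sorts a copy so the input list is not reordered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortAndScan {
    // Sorts a copy, then checks neighbouring elements for equality.
    static <T extends Comparable<? super T>> boolean hasDuplicate(List<T> items) {
        List<T> sorted = new ArrayList<>(items);
        Collections.sort(sorted); // O(n log n)
        for (int i = 1; i < sorted.size(); i++) { // O(n) scan
            if (sorted.get(i - 1).compareTo(sorted.get(i)) == 0) {
                return true; // equal elements end up adjacent after sorting
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 2))); // false
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 3))); // true
    }
}
```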

To find the duplicates in a List, use the following code. It returns a set containing the duplicates.
public Set<?> findDuplicatesInList(List<?> beanList) {
System.out.println("findDuplicatesInList::"+beanList);
Set<Object> duplicateRowSet=null;
duplicateRowSet=new LinkedHashSet<Object>();
for(int i=0;i<beanList.size();i++){
Object superString=beanList.get(i);
System.out.println("findDuplicatesInList::superString::"+superString);
for(int j=0;j<beanList.size();j++){
if(i!=j){
Object subString=beanList.get(j);
System.out.println("findDuplicatesInList::subString::"+subString);
if(superString.equals(subString)){
duplicateRowSet.add(beanList.get(j));
}
}
}
}
System.out.println("findDuplicatesInList::duplicationSet::"+duplicateRowSet);
return duplicateRowSet;
}

The best way to handle this issue is to use a HashSet:
ArrayList<String> listGroupCode = new ArrayList<>();
listGroupCode.add("A");
listGroupCode.add("A");
listGroupCode.add("B");
listGroupCode.add("C");
HashSet<String> set = new HashSet<>(listGroupCode);
ArrayList<String> result = new ArrayList<>(set);
Just print result arraylist and see the result without duplicates :)

This answer is written in Kotlin, but can easily be translated to Java.
If your arraylist's size is within a fixed small range, then this is a great solution.
var duplicateDetected = false
if(arrList.size > 1){
for(i in 0 until arrList.size){
for(j in 0 until arrList.size){
if(i != j && arrList.get(i) == arrList.get(j)){
duplicateDetected = true
}
}
}
}

private boolean isDuplicate() {
for (int i = 0; i < arrayList.size(); i++) {
for (int j = i + 1; j < arrayList.size(); j++) {
if (arrayList.get(i).getName().trim().equalsIgnoreCase(arrayList.get(j).getName().trim())) {
return true;
}
}
}
return false;
}

String tempVal = null;
for (int i = 0; i < l.size(); i++) {
tempVal = l.get(i); //take the ith object out of list
while (l.contains(tempVal)) {
l.remove(tempVal); //remove all matching entries
}
l.add(tempVal); //at last add one entry
}
Note: this will have major performance hit though as items are removed from start of the list.
To address this, we have two options. 1) iterate in reverse order and remove elements. 2) Use LinkedList instead of ArrayList. Due to biased questions asked in interviews to remove duplicates from List without using any other collection, above example is the answer. In real world though, if I have to achieve this, I will put elements from List to Set, simple!

/**
* Method to detect presence of duplicates in a generic list.
* Depends on the equals method of the concrete type. make sure to override it as required.
*/
public static <T> boolean hasDuplicates(List<T> list){
int count = list.size();
T t1,t2;
for(int i=0;i<count;i++){
t1 = list.get(i);
for(int j=i+1;j<count;j++){
t2 = list.get(j);
if(t2.equals(t1)){
return true;
}
}
}
return false;
}
An example of a concrete class that has overridden equals() :
public class Reminder{
private long id;
private int hour;
private int minute;
public Reminder(long id, int hour, int minute){
this.id = id;
this.hour = hour;
this.minute = minute;
}
@Override
public boolean equals(Object other){
if(other == null) return false;
if(this.getClass() != other.getClass()) return false;
Reminder otherReminder = (Reminder) other;
if(this.hour != otherReminder.hour) return false;
if(this.minute != otherReminder.minute) return false;
return true;
}
}

ArrayList<String> withDuplicates = new ArrayList<>();
withDuplicates.add("1");
withDuplicates.add("2");
withDuplicates.add("1");
withDuplicates.add("3");
HashSet<String> seen = new HashSet<>();
ArrayList<String> withoutDuplicates = new ArrayList<>();
ArrayList<String> duplicates = new ArrayList<String>();
Iterator<String> dupIter = withDuplicates.iterator();
while(dupIter.hasNext())
{
String dupWord = dupIter.next();
// Set.add returns false when the word has been seen before
if(!seen.add(dupWord))
{
duplicates.add(dupWord);
}else{
withoutDuplicates.add(dupWord);
}
}
System.out.println(duplicates);
System.out.println(withoutDuplicates);

A simple solution for learners.
//Method to find the duplicates.
public static List<Integer> findDublicate(List<Integer> numList){
List<Integer> dupLst = new ArrayList<Integer>();
//Compare one number against all the other number except the self.
for(int i =0;i<numList.size();i++) {
for(int j=0 ; j<numList.size();j++) {
if(i!=j && numList.get(i).equals(numList.get(j))) { // equals, not ==: Integer objects outside the cache range compare unequal by reference
boolean isNumExist = false;
//The below for loop is used for avoid the duplicate again in the result list
for(Integer aNum: dupLst) {
if(aNum.equals(numList.get(i))) {
isNumExist = true;
break;
}
}
if(!isNumExist) {
dupLst.add(numList.get(i));
}
}
}
}
return dupLst;
}
