Java using contains function to match string object ignore capital case? - java

I want the contains function to return true even if the extension being checked is in capital letters:
List<String> pformats= Arrays.asList("odt","ott","oth","odm","sxw","stw","sxg","doc","dot","xml","docx","docm","dotx","dotm","doc","wpd","wps","rtf","txt","csv","sdw","sgl","vor","uot","uof","jtd","jtt","hwp","602","pdb","psw","ods","ots","sxc","stc","xls","xlw","xlt","xlsx","xlsm","xltx","xltm","xlsb","wk1","wks","123","dif","sdc","vor","dbf","slk","uos","pxl","wb2","odp","odg","otp","sxi","sti","ppt","pps","pot","pptx","pptm","potx","potm","sda","sdd","sdp","vor","uop","cgm","bmp","dxf","emf","eps","met","pbm","pct","pcd","pcx","pgm","plt","ppm","psd","ras","sda","sdd","sgf","sgv","svm","tgs","tif","tiff","vor","wmf","xbm","xpm","jpg","jpeg","gif","png","pdf","log");
if(pformats.contains(extension)){
// do stuff
}

A Set is a better choice for a lookup.
private static final Set<String> P_FORMATS = new HashSet<String>(Arrays.asList(
"odt,ott,oth,odm,sxw,stw,sxg,doc,dot,xml,docx,docm,dotx,dotm,doc,wpd,wps,rtf,txt,csv,sdw,sgl,vor,uot,uof,jtd,jtt,hwp,602,pdb,psw,ods,ots,sxc,stc,xls,xlw,xlt,xlsx,xlsm,xltx,xltm,xlsb,wk1,wks,123,dif,sdc,vor,dbf,slk,uos,pxl,wb2,odp,odg,otp,sxi,sti,ppt,pps,pot,pptx,pptm,potx,potm,sda,sdd,sdp,vor,uop,cgm,bmp,dxf,emf,eps,met,pbm,pct,pcd,pcx,pgm,plt,ppm,psd,ras,sda,sdd,sgf,sgv,svm,tgs,tif,tiff,vor,wmf,xbm,xpm,jpg,jpeg,gif,png,pdf,log".split(",")));
if(P_FORMATS.contains(extension.toLowerCase())){
// do stuff
}

Short answer: it will not work as is. You can't override contains, but you can use the following code:
List<String> pformats= Arrays.asList("odt","ott","oth","odm","sxw","stw","sxg","doc","dot","xml","docx","docm","dotx","dotm","doc","wpd","wps","rtf","txt","csv","sdw","sgl","vor","uot","uof","jtd","jtt","hwp","602","pdb","psw","ods","ots","sxc","stc","xls","xlw","xlt","xlsx","xlsm","xltx","xltm","xlsb","wk1","wks","123","dif","sdc","vor","dbf","slk","uos","pxl","wb2","odp","odg","otp","sxi","sti","ppt","pps","pot","pptx","pptm","potx","potm","sda","sdd","sdp","vor","uop","cgm","bmp","dxf","emf","eps","met","pbm","pct","pcd","pcx","pgm","plt","ppm","psd","ras","sda","sdd","sgf","sgv","svm","tgs","tif","tiff","vor","wmf","xbm","xpm","jpg","jpeg","gif","png","pdf","log");
if(pformats.contains(extension.toLowerCase())){
}
This converts your extension to lowercase, and if all the extensions in your list are already lowercase, then it will work.

Convert your List of extensions into a regular expression, compile it with the CASE_INSENSITIVE flag, and use that.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public final class Foo {
public static void main(final String... args) {
final Pattern p = Pattern.compile("odt|ott|oth|odm|sxw|stw|sxg|doc|dot|xml|docx|docm|dotx|dotm|doc|wpd|wps|rtf|txt|csv|sdw|sgl|vor|uot|uof|jtd|jtt|hwp|602|pdb|psw|ods|ots|sxc|stc|xls|xlw|xlt|xlsx|xlsm|xltx|xltm|xlsb|wk1|wks|123|dif|sdc|vor|dbf|slk|uos|pxl|wb2|odp|odg|otp|sxi|sti|ppt|pps|pot|pptx|pptm|potx|potm|sda|sdd|sdp|vor|uop|cgm|bmp|dxf|emf|eps|met|pbm|pct|pcd|pcx|pgm|plt|ppm|psd|ras|sda|sdd|sgf|sgv|svm|tgs|tif|tiff|vor|wmf|xbm|xpm|jpg|jpeg|gif|png|pdf|log", Pattern.CASE_INSENSITIVE);
// Will be true
System.out.println(p.matcher("bmp").matches());
// Will be false
System.out.println(p.matcher("quasar").matches());
}
}
This would probably be easier to read and maintain if you built the regex programmatically, but I've left that as an exercise for the reader.
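One way that exercise might be approached is sketched below. It assumes Java 8 or later for the stream API; the class name ExtensionPattern and the method build are illustrative and not part of the original answer. Pattern.quote is used so that an extension containing a regex metacharacter would still be matched literally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: turns a list of extensions into one case-insensitive pattern.
final class ExtensionPattern {

    static Pattern build(List<String> extensions) {
        String alternation = extensions.stream()
                .map(Pattern::quote)               // treat every extension as a literal
                .collect(Collectors.joining("|")); // join as alternatives: a|b|c
        return Pattern.compile(alternation, Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern p = build(Arrays.asList("odt", "doc", "docx", "jpg"));
        System.out.println(p.matcher("DOCX").matches());
        System.out.println(p.matcher("quasar").matches());
    }
}
```

Compile the pattern once (for example into a static final field) and reuse it, since compilation is the expensive part.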

How about:
extension.toLowerCase()
?
Although I'm not 100% sure what the contains() method will do in this example. You might need to put your extensions into a Set.
Edit: No, it won't work, as the contains method checks for the existence of a particular Object. Your string, even with the same value, is a different Object. So either a) override the contains method, e.g. loop through the array and do a string comparison, or b) simpler, use a Set.
Edit 2: Apparently it will work, per the comments below, as ArrayList.contains() checks for equality via equals() (so you will get a string match), but this seems to disagree with the top-voted answer, which says it won't.
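The point settled in Edit 2 can be checked with a small sketch. List.contains is specified in terms of equals(), so a distinct String object with the same characters is found. The class and method names here are illustrative, and Locale.ROOT is an addition of this sketch to keep lowercasing independent of the default locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class: shows that contains() compares with equals(), not identity.
final class ContainsUsesEquals {

    static boolean isKnownFormat(List<String> formats, String extension) {
        // Lowercase the input so it can match the all-lowercase list entries.
        return formats.contains(extension.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        List<String> formats = Arrays.asList("odt", "jpg", "pdf");
        // new String(...) forces a separate object with the same characters.
        String distinctObject = new String("JPG");
        System.out.println(isKnownFormat(formats, distinctObject));
    }
}
```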

If all your formats are lower case, then toLowerCase combined with a HashSet is the preferred solution.
If your formats are in mixed case (and shall stay this way, as you are using them for other things, too) you need a real case-insensitive comparison.
Then a TreeSet (or other SortedSet) with a case-insensitive collator as the comparator will do. (It is not as fast as a HashSet, but will still be faster than the ArrayList, except for really small lists.)
Alternatively a HashSet variant using a custom hashCode and equals (or simply a normal HashSet on wrapper objects with a case insensitive implementation of equals and hashCode) would do fine.
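A minimal sketch of the TreeSet variant follows. Note that it swaps in the JDK's built-in String.CASE_INSENSITIVE_ORDER comparator rather than a locale-aware Collator, which is a simplification that should be adequate for ASCII file extensions. The class name and factory method are illustrative.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: a set whose lookups ignore case while the stored
// entries keep their original (possibly mixed-case) spelling.
final class CaseInsensitiveFormats {

    static Set<String> of(Collection<String> formats) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(formats);
        return set;
    }

    public static void main(String[] args) {
        Set<String> formats = of(Arrays.asList("Odt", "DOCX", "pdf"));
        System.out.println(formats.contains("docx"));
        System.out.println(formats.contains("exe"));
    }
}
```

A TreeSet consults its comparator instead of equals(), which is why contains("docx") can find the entry stored as "DOCX".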

Add this extended List class:
private static class ListIgnoreCase extends java.util.LinkedList<String> {
public ListIgnoreCase(Collection<String> c) {
// copy all elements of the given collection
super(c);
}
public boolean containsIgnoreCase(String toSearch) {
for (String element : this) {
if (element != null && element.equalsIgnoreCase(toSearch)) {
return true;
}
}
return false;
}
}
Now you can wrap the result of asList like this:
if (new ListIgnoreCase(Arrays.asList("odt","ott","oth","odm"))
.containsIgnoreCase(extension)) {
...

You can use IterableUtils and Predicate from Apache Commons Collections 4 (StringUtils comes from Apache Commons Lang).
List<String> pformats= Arrays.asList("odt","ott","oth","odm","sxw","stw","sxg","doc","dot","xml","docx","docm","dotx","dotm","doc","wpd","wps","rtf","txt","csv","sdw","sgl","vor","uot","uof","jtd","jtt","hwp","602","pdb","psw","ods","ots","sxc","stc","xls","xlw","xlt","xlsx","xlsm","xltx","xltm","xlsb","wk1","wks","123","dif","sdc","vor","dbf","slk","uos","pxl","wb2","odp","odg","otp","sxi","sti","ppt","pps","pot","pptx","pptm","potx","potm","sda","sdd","sdp","vor","uop","cgm","bmp","dxf","emf","eps","met","pbm","pct","pcd","pcx","pgm","plt","ppm","psd","ras","sda","sdd","sgf","sgv","svm","tgs","tif","tiff","vor","wmf","xbm","xpm","jpg","jpeg","gif","png","pdf","log");
Predicate<String> predicate = (s) -> StringUtils.equalsIgnoreCase(s, "JPG");
if (IterableUtils.matchesAny(pformats, predicate)) {
// do stuff
}
See org.apache.commons.collections4.IterableUtils.
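If pulling in a library is not desirable, roughly the same check can be written with the JDK alone. This is a sketch assuming Java 8 or later; the class and method names are illustrative, and the extension argument is assumed to be non-null.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical JDK-only counterpart of the matchesAny approach.
final class AnyMatchIgnoreCase {

    static boolean containsIgnoreCase(List<String> formats, String extension) {
        // True as soon as one list entry equals the extension, ignoring case.
        return formats.stream().anyMatch(extension::equalsIgnoreCase);
    }

    public static void main(String[] args) {
        List<String> formats = Arrays.asList("odt", "jpg", "pdf");
        System.out.println(containsIgnoreCase(formats, "JPG"));
    }
}
```

Like the list-based answers, this is a linear scan, so a Set remains the better choice for frequent lookups.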

Related

How to typesafe check equality of booleans?

public class ComplexObject {
private boolean isA, isB;
}
//custom comparator
public boolean checkComplexObject(ComplexObject o1, ComplexObject o2) {
return o1.getIsA() == o2.getIsB();
}
Now when I change the data type in ComplexObject from boolean to String for example, the comparator will not break, nor will I notice that in future I would compare Strings instead of booleans and thus get different results.
Question: how could I compare the boolean properties typesafe, so that I get compilation error when I change the datatype of the fields?
One very simple thing you can do is put in a redundant cast:
return (boolean)o1.getIsA() == (boolean)o2.getIsB();
You can also define a method that only accepts boolean:
static boolean booleanEquals(boolean a, boolean b) {
return a == b;
}
Then call booleanEquals instead of using ==.
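To see the helper in context, here is a minimal sketch. The constructor and getters are assumptions, since the question only shows the two fields, and the outer class name is illustrative. If the getters were later changed to return String, the call to booleanEquals would stop compiling, which is the effect the question asks for.

```java
// Hypothetical wrapper class so the sketch is self-contained.
final class ComplexObjectDemo {

    static final class ComplexObject {
        private final boolean isA;
        private final boolean isB;

        ComplexObject(boolean isA, boolean isB) {
            this.isA = isA;
            this.isB = isB;
        }

        boolean getIsA() { return isA; }
        boolean getIsB() { return isB; }
    }

    // Only boolean arguments (or auto-unboxed Boolean) are accepted here.
    static boolean booleanEquals(boolean a, boolean b) {
        return a == b;
    }

    // Same comparison as in the question, routed through the typed helper.
    static boolean checkComplexObject(ComplexObject o1, ComplexObject o2) {
        return booleanEquals(o1.getIsA(), o2.getIsB());
    }

    public static void main(String[] args) {
        ComplexObject first = new ComplexObject(true, false);
        ComplexObject second = new ComplexObject(false, true);
        System.out.println(checkComplexObject(first, second));
    }
}
```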
As a side note, this programming seems a bit overly defensive to me.
There are a few things you can do, but all of them will make your code less readable, and therefore I would advise against them.
For example :
return (o1.getIsA() ^ o2.getIsB()) == false;
or
return (o1.getIsA() && o2.getIsB()) || (!o1.getIsA() && !o2.getIsB());
You can use XOR for that (negated, because XOR is true when the two values differ):
return !(o1.getIsA() ^ o2.getIsB());
The better question is, why would you do that?
If you refactor (well, change heavily) your attributes from boolean to String, you should always check the affected code. If you want code workarounds for a common practice (double checking), you may introduce overly complicated code throughout your application.
If you're aware of the problem, why don't you put a comment directly on the affected class attributes, saying that they may be compared with ==? If you or another developer want to change the type later on, you will be warned.

How to tell if an Array or List is also a set?

I have a char[]. I would like to be able to tell if it is a Set and if so create a new Set with the array values. I know I can use a try-catch block but is there any built in method for Java which I could use to test this without throwing an error. It is not imperative that I use a char[]. I could also use a List or something else.
"I have a char[]. I would like to be able to tell if it is a Set"
It won't be. It can't be. It may have distinct values, but it won't be a Set.
If you actually want to check whether the array contains distinct values, the simplest way would probably be to create a Set<Character> and check whether any add operation returns false:
public static boolean uniqueValues(char[] values) {
Set<Character> set = new HashSet<Character>();
for (char c : values) {
if (!set.add(c)) {
return false;
}
}
return true;
}
(That gives you an early out as soon as you find a duplicate, rather than continuing to construct the whole set.)
An alternative would be to create a boolean[] of size 65536 to see which characters you've got:
public static boolean uniqueValues(char[] values) {
boolean[] seen = new boolean[65536];
for (char c : values) {
int index = c;
if (seen[index]) {
return false;
}
seen[index] = true;
}
return true;
}
For small arrays, this will be hugely wasteful of memory - for larger arrays (of distinct elements, or where the duplicate occurs late) it's more space efficient than the HashSet approach.
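Since the question also asks to end up with a Set when the values are distinct, the first approach can be extended slightly. The sketch below returns the set itself instead of a boolean; the class and method names are illustrative, and using Optional (Java 8 or later) to signal "had duplicates" is a design choice of this sketch, not something from the answer above.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper: builds the set and reports duplicates in one pass.
final class DistinctChars {

    static Optional<Set<Character>> toSetIfDistinct(char[] values) {
        Set<Character> set = new LinkedHashSet<>(); // keeps the array's order
        for (char c : values) {
            if (!set.add(c)) {
                return Optional.empty(); // duplicate found, stop early
            }
        }
        return Optional.of(set);
    }

    public static void main(String[] args) {
        System.out.println(toSetIfDistinct("abc".toCharArray()).isPresent());
        System.out.println(toSetIfDistinct("abca".toCharArray()).isPresent());
    }
}
```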
You can test whether a variable is of a certain type using the instanceof operator:
if (myVar instanceof Set) {
System.out.println("It's a Set.");
//do what you want/need
}
Still, needing the instanceof operator often points to a problem in your design. Moreover, you cannot use the instanceof operator on an array to check whether it is a Collection: an array is never a Set.
EDIT: Based on your last comment on the question, you want to check whether there is a duplicated element in your array. You can do this using a Set as explained in JonSkeet's answer (no need to rewrite the logic and explanation he has already provided).
If you are thinking of using generics and your list may hold elements of different types, you can use the instanceof operator to test whether an element is a Set.
List<Object> alist = new ArrayList<Object>();
//set up the list of elements
if(alist.get(0) instanceof Set){
// do what you do with a set
}else{
// do what you require
}

In Java how can I check if an object is in a linked list?

Below is my class. The insertSymbol method is supposed to add an object to the linked list, which is then added to a hash table. But when I print the contents of the hash table, it has duplicate entries. I tried to correct this by using "if(temp.contains(value)){return;}" but it isn't working. I read that I need to use @Override in a couple of places. Could anyone help me understand how and where to use the overrides? Thank you!
import java.util.*;
public class Semantic {
String currentScope;
Stack theStack = new Stack();
HashMap<String, LinkedList> SymbolTable= new HashMap<String, LinkedList>();
public void insertSymbol(String key, SymbolTableItem value){
LinkedList<SymbolTableItem> temp = new LinkedList<SymbolTableItem>();
if(SymbolTable.get(key) == null){
temp.addLast(value);
SymbolTable.put(key, temp);
}else{
temp = SymbolTable.get(key);
if(temp.contains(value)){
return;
}else{
temp.addLast(value);
SymbolTable.put(key, temp);
}
}
}
public String printValues(){
return SymbolTable.toString();
}
public boolean isBoolean(){
return true;
}
public boolean isTypeMatching(){
return true;
}
public void stackPush(String theString){
theStack.add(theString);
}
}
You have multiple options here. You'll need at least to add an equals method (and therefore also a hashCode method) to your class.
However, if you want your collection to contain only unique items, why not use a Set instead?
If you still want to use a List, you can keep your current approach; it's just that all items in a Set are unique by definition, so a Set might make sense here.
Adding an equals method can be done quite easily. Apache Commons Lang's EqualsBuilder is a good approach for this.
You don't need the 2nd line when you add a new value with the same key:
temp.addLast(value);
SymbolTable.put(key, temp); // <-- Not needed. It's already in there.
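Combining that remark with the duplicate check, the insert logic might be condensed as in the sketch below. The class SymbolTableSketch, its type parameter V and the count method are illustrative names, and Map.computeIfAbsent assumes Java 8 or later. The duplicate check still depends on equals() being meaningful for the stored type.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical, simplified version of the symbol table from the question.
final class SymbolTableSketch<V> {

    private final Map<String, LinkedList<V>> table = new HashMap<>();

    // Returns true if the value was added, false if an equal value was already stored.
    boolean insertSymbol(String key, V value) {
        // Fetches the existing list or creates and registers a new one; no second put needed.
        LinkedList<V> bucket = table.computeIfAbsent(key, k -> new LinkedList<>());
        if (bucket.contains(value)) {
            return false;
        }
        bucket.addLast(value);
        return true;
    }

    // Number of values stored under the key (0 if the key is unknown).
    int count(String key) {
        LinkedList<V> bucket = table.get(key);
        return bucket == null ? 0 : bucket.size();
    }
}
```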
Let me explain something that @ErikPragt alludes to regarding this code:
if(temp.contains(value)){
What do you suppose that means?
If you look in the javadocs for LinkedList you will find that if a value in the list is non-null, it uses the equals() method on the value object to see if the list element is the same.
What that means, in your case, is that your class SymbolTableItem needs an equals() method that will compare two of these objects to see if they are the same, whatever that means in your case.
Let's assume the instances will be considered the same if the names are the same. You will need a method like this in the SymbolTableItem class:
@Override
public boolean equals(Object that) {
// also covers that == null
if (!(that instanceof SymbolTableItem)) {
return false;
}
SymbolTableItem other = (SymbolTableItem) that;
if (this.getName() == null) {
return other.getName() == null;
}
return this.getName().equals(other.getName());
}
If it depends on more fields, the equals method will be correspondingly more complex.
NOTE: One more thing. If you add an equals method to a class, it is good programming practice to add a hashCode() method too. The rule is that if two instances are equal, they must have the same hash code; if they are not equal, their hash codes don't have to differ, but it would be very nice if they did.
If you use your existing code, where only equals is used, you don't strictly need a hashCode method. But if you don't add one, it could be a problem someday. Maybe today.
In the case where the name is all that matters, your hashCode method could just return this.getName().hashCode() (or Objects.hashCode(this.getName()) if the name may be null).
Again, if there are more things to compare to tell if they are equal, the hashcode method will be more complex.
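Put together, an item class with matching equals and hashCode might look like the sketch below. The question never shows SymbolTableItem, so the class name SymbolTableItemSketch and its fields (name, type) are assumptions made for illustration. Equality is based on the name only, as assumed in the answer above.

```java
import java.util.LinkedList;
import java.util.Objects;

// Hypothetical stand-in for SymbolTableItem; the real fields are unknown.
final class SymbolTableItemSketch {

    private final String name;
    private final String type;

    SymbolTableItemSketch(String name, String type) {
        this.name = name;
        this.type = type;
    }

    String getName() { return name; }
    String getType() { return type; }

    @Override
    public boolean equals(Object that) {
        if (this == that) {
            return true;
        }
        if (!(that instanceof SymbolTableItemSketch)) {
            return false; // also covers null
        }
        SymbolTableItemSketch other = (SymbolTableItemSketch) that;
        return Objects.equals(name, other.name); // null-safe comparison
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name); // consistent with equals: same name, same hash
    }

    public static void main(String[] args) {
        LinkedList<SymbolTableItemSketch> list = new LinkedList<>();
        list.add(new SymbolTableItemSketch("x", "int"));
        // A different object with the same name is now found by contains().
        System.out.println(list.contains(new SymbolTableItemSketch("x", "int")));
    }
}
```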

How can I check two Object-Arrays for Equality in JUnit?

I have a Java class NoName whose objects have the method getProperties(). This method returns an array of Property.
Given two instances of NoName, how can I use assertEquals to check whether both instances' Property arrays are the same?
assertEquals(inst.getProperties(), ance.getProperties()) won't do the job, because it's deprecated.
And since the NoName class is a library class, I cannot override equals() (which seems to be the usual solution for this kind of problem, as far as I have read so far).
Thanks in advance.
assertThat(obj1.getProperties(),
IsArrayContainingInOrder.contains(obj2.getProperties()));
This is using a Hamcrest Matcher which I believe to be a preferable method to doing asserts since the output on failure is much more descriptive.
There is also an IsArrayContainingInAnyOrder if order does not matter.
Since you're talking about comparing an internal structure of an object, you can either override the equals method for the NoName class, and inside that compare the array of properties for both objects. But then, you'll need to take care of the hashCode method too.
Or, you can simply create a helper method hasSameProperties(NoName obj) in the NoName class, and make that method return a boolean flag after comparing the property arrays of both the objects. Then in JUnit, you can simply use the assertTrue method.
Property[] one = inst.getProperties(), two = ance.getProperties();
assertEquals(one.length, two.length);
for (int i=0; i<one.length; i++) assertEquals("index #" + i, one[i], two[i]);
That provides a basic scrub. If you're worried about other edge conditions (e.g., the array might be null, or the arrays are allowed to differ in order as long as the same elements are in each), then some extra work will be required (e.g., a null check, or a sort if Property implements Comparable).
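The extra work mentioned above can be sketched with the JDK alone. java.util.Arrays.equals compares two arrays element by element using equals() and treats two null arrays as equal; the order-insensitive variant sorts copies first and therefore assumes the elements implement Comparable. The class and method names are illustrative. In a JUnit 4 test, the ordered comparison is also available directly as org.junit.Assert.assertArrayEquals. All of this depends on Property having a meaningful equals(); otherwise elements are compared by identity.

```java
import java.util.Arrays;

// Hypothetical helpers for comparing two arrays of the same element type.
final class ArrayEqualityDemo {

    // Same length and pairwise-equal elements in the same positions.
    static boolean sameInOrder(Object[] one, Object[] two) {
        return Arrays.equals(one, two);
    }

    // Same elements regardless of position; works on copies so the inputs stay untouched.
    static <T extends Comparable<T>> boolean sameAnyOrder(T[] one, T[] two) {
        if (one == null || two == null) {
            return one == two;
        }
        T[] a = one.clone();
        T[] b = two.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        String[] first = {"foo", "bar"};
        String[] second = {"bar", "foo"};
        System.out.println(sameInOrder(first, second));
        System.out.println(sameAnyOrder(first, second));
    }
}
```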
Hamcrest 1.3 uses a slightly different syntax, some examples:
import static org.hamcrest.collection.IsArrayContainingInOrder.arrayContaining;
...
@Test
public void inOrder1() {
assertThat(new String[]{"foo", "bar"}, arrayContaining(equalTo("foo"), equalTo("bar")));
}
@Test(expected = AssertionError.class)
public void inOrder2() {
assertThat(new String[]{"bar", "foo"}, arrayContaining(equalTo("foo"), equalTo("bar")));
// Expected: ["foo", "bar"]
// but: item 0: was "bar"
}
@Test(expected = AssertionError.class)
public void inOrder3() {
assertThat(new String[]{"foo", "bar", "lux"}, arrayContaining(equalTo("foo"), equalTo("bar")));
// Expected: ["foo", "bar"] in any order
// but: Not matched: "lux"
}
@Test(expected = AssertionError.class)
public void inOrder4() {
assertThat(new String[]{"foo", "bar"}, arrayContaining(equalTo("foo"), equalTo("bar"), equalTo("lux")));
// Expected: ["foo", "bar", "lux"] in any order
// but: No item matched: "lux" in ["foo", "bar"]
}

How to remove duplicate words using java

I have a text file from which I want to remove duplicate words. My text file contains words like
அந்தப்
சத்தம்
அந்த
இந்தத்
பாப்பா
இந்த
கனவுத்
அந்த
கனவு
I can remove exact duplicates, but words ending in 'ப்' or 'த்' are treated as separate words and are not removed as duplicates. If I strip 'ப்' and 'த்' everywhere, they are also removed from inside other words such as பாப்பா and சத்தம். Please suggest any ideas to solve this problem using Java. Thanks in advance.
I think I would use a Set with a custom comparator (such as a TreeSet). That way you can define equality any way you like.
I don't understand the given language (Google Translate's guess is Tamil), but from your question I gather that there are special rules for 'equality' of words written in that language: words can be equal even if they are written differently (e.g. with different endings).
So you may want to wrap the strings containing words of that language in a special object where you can define a custom 'equals' method, like this:
public class TamilWord {
String writtenWord = null;
public TamilWord(String writtenWord) {
this.writtenWord = writtenWord;
}
public String getWrittenWord() {
return writtenWord;
}
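@Override
public boolean equals(Object other) {
// Define your custom rules here, so that two words that
// are written differently may be considered as equal.
// Placeholder: exact match. Override hashCode() consistently as well.
return other instanceof TamilWord
&& writtenWord.equals(((TamilWord) other).writtenWord);
}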
}
Then you can create TamilWord objects for all parsed Strings and drop them into a Set. So if we have the words abcd and abcD, which are written differently but considered equal according to the rules, only one of them will be added to the set.
Use a scanner to scan in each line as a string into a set then write the strings in the set to a file.
First you should explain to us how you parse your file, as it seems that your tokenization is not working appropriately. Then, to my mind, the obvious suggestion for deduplication is to use a Set (or even a TreeSet), which ensures uniqueness of your elements according to the Set's equality rules.
My way to solve this:
Read word by word and put it to java.util.Set<TheWord>. Finally, you will have the Set with no duplicates. You also should define TheWord class:
class TheWord {
String word;
public TheWord() {}
public String getWord() {
return word;
}
public void setWord(String word) {
this.word = word;
}
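@Override
public boolean equals(Object o) {
// put here your specific way to compare words
// taking into account your language rules and considerations
// (and override hashCode() to match, or a HashSet will not detect duplicates)
return o instanceof TheWord && java.util.Objects.equals(word, ((TheWord) o).word); // placeholder: exact match
}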
}
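An alternative to a wrapper class is to normalize each word to a key before adding it to a plain Set of Strings. The sketch below assumes, purely for illustration, that the only rule needed is to drop a trailing 'ப்' or 'த்' (the two endings named in the question); real Tamil word-joining rules are likely richer than this. Because the suffix is removed only at the very end of a word, words such as பாப்பா and சத்தம் are left untouched. The class and method names are hypothetical, and the suffixes are written as Unicode escapes to keep the source ASCII-only.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: deduplicate words after stripping an assumed set of trailing suffixes.
final class WordDeduplicator {

    // Assumed suffixes: pa + virama (U+0BAA U+0BCD) and ta + virama (U+0BA4 U+0BCD).
    private static final List<String> TRAILING_SUFFIXES =
            Arrays.asList("\u0BAA\u0BCD", "\u0BA4\u0BCD");

    // Removes one assumed suffix from the end of the word, never from the middle.
    static String normalize(String word) {
        for (String suffix : TRAILING_SUFFIXES) {
            if (word.length() > suffix.length() && word.endsWith(suffix)) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Returns the normalized words without duplicates, in first-seen order.
    static Set<String> deduplicate(List<String> words) {
        Set<String> unique = new LinkedHashSet<>();
        for (String word : words) {
            unique.add(normalize(word.trim()));
        }
        return unique;
    }
}
```

The words could be read with Files.readAllLines (one word per line, as in the question) and the resulting set written back with Files.write, both using an explicit UTF-8 charset.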
