How to find the missing elements in a sequence? - java

I have a string arraylist under that i need to pass 22184 elements from ["AA00001", "AA00005" ,"AA00003" ----- "ZZ00678"] and i need to generate the sequence elements which are not present in the list. I have written code for that and for less inputs it is generating the required output. But when i am adding 22184 elements and want to generate 200 unique ids which are not present in the arraylist i am getting error as
The code of method main(String[]) is exceeding the 65535 bytes limit
Can someone please help ?
import java.util.ArrayList;
public class GenerateIds
{
private static ArrayList<String> ids = new ArrayList<>();
static int n=50; //no of Ids u want to generate
static int completed =0;
static char ID[] = new char[7];
public static void main(String[] args)
{
ids.add("AA00001");
ids.add("AA00004");
ids.add("AA00007");
generateIds(0);
for(String id : ids)
{
System.out.println(id);
}
}
private static void generateIds(int i)
{
if(n!=completed)
{
if(i<2)
{
for(char c ='A';c<'Z';c++)
{
ID[i]=c;
generateIds(i+1);
}
}
else if(i>=2 && i<7)
{
for(char c ='0';c<='9';c++)
{
ID[i]=c;
generateIds(i+1);
}
}else if(i==7)
{
String id = String.valueOf(ID);
if(!ids.contains(id))
{
ids.add(id);
completed++;
}
}
}
}
}

You can put your id's in a text file. Then use something like.
List<String> ids = Files.readAllLines(Paths.get("ids.txt"));

In java a methods can't have more than 65535 bytes.
The main method is becoming too large since you are doing all the adds inline:
ids.add("AA00001");
ids.add("AA00004");
ids.add("AA00007");
...
This will make the main method too long. What you can do to solve this (and to find the missing elements) is putting all the String values in a List and loop over it to find the missing elements:
public void findMissingElements() {
List<String> missingIds = allPossibleIds.stream()
.filter(isMissingIn(existingIds))
.collect(toList());
//do something with the missingIds...
}
As other readers such as matt suggested, you can e.g. put all the Strings in a file and read the file.
I wrote a small example to show how it would all work together. I rewrote your generateIds method with jOOλ to generate all the possible ids and renamed it to allPossibleIds (however your recursive method would work too). I limited the ids to a 3 size digit number to limit the search time as an example.
public class FindMissingIdsTest {
private List<String> allPossibleIds;
private List<String> existingIds;
#Before
public void setup() throws IOException {
allPossibleIds = allPossibleIds();
existingIds = retrieveIdsFromSubSystem();
}
#Test
public void findMissingElements() {
List<String> missingIds = allPossibleIds.stream()
.filter(isMissingIn(existingIds))
.collect(toList());
}
private Predicate<String> isMissingIn(List<String> existingIds) {
return possibleId -> !existingIds.contains(possibleId);
}
public List<String> allPossibleIds(){
List<String> alphabet = Seq.rangeClosed('A', 'Z').map(Object::toString).toList();
List<String> letterCombinations = Seq.seq(alphabet).crossJoin(Seq.seq(alphabet)).map(t -> t.v1 + t.v2).toList();
List<String> numbericParts = IntStream.range(0, 1000)
.mapToObj(i -> String.format("%03d", i))
.collect(toList());
return Seq.seq(letterCombinations).crossJoin(Seq.seq(numbericParts)).map(t -> t.v1 + t.v2).toList();
}
public List<String> retrieveIdsFromSubSystem() throws IOException {
return Files.readAllLines(Paths.get("ids.txt"));
}
}
To change to 5 digits again you can just change the 1000 to 100000 and the %03d to %05d.
If you can order the list, you could probably find a faster and better algorithm. It all depends on the situation. e.g. if you have an ordered list, you could build up the stream of all the ids, iterate over it and follow in the existing list with a pointer instead of always doing a resource consuming contains().

Related

Java: Identify common paths in ArrayList<String> using Lambdas

I got an array of elements like :
ArrayList<String> t = new ArrayList();
t.add("/folder1/sub-folder1");
t.add("/folder2/sub-folder2");
t.add("/folder1/sub-folder1/data");
I need to get output as /folder1/sub-folder1 which is mostly repeated path.
In python this can be achieved using the below function:
def getRepeatedPath(self, L):
""" Returns the highest repeated path/string in a provided list """
try:
pkgname = max(g(sorted(L)), key=lambda(x, v): (len(list(v)), -L.index(x)))[0]
return pkgname.replace("/", ".")
except:
return "UNKNOWN"
I am trying to work on equivalent lambda function in Java. I got struck and need some help in the lambda implementation.
public String mostRepeatedSubString(ArrayList<String> pathArray) {
Collections.sort(pathArray);
String mostRepeatedString = null;
Map<String,Integer> x = pathArray.stream.map(s->s.split("/")).collect(Collectors.toMap());
return mostRepeatedString;
}
Lots of tweaking, but I finally got it!
public static void main(String[] args) {
ArrayList<String> t = new ArrayList<String>();
t.add("folder1/sub-folder1");
t.add("folder2/sub-folder2");
t.add("folder1/sub-folder1/data");
System.out.println(mostRepeatedSubString(t));
}
public static String mostRepeatedSubString(List<String> pathArray) {
return pathArray
.stream()
// Split to lists of strings
.map(s -> Arrays.asList(s.split("/")))
// Group by first folder
.collect(Collectors.groupingBy(lst -> lst.get(0)))
// Find the key with the largest list value
.entrySet()
.stream()
.max((e1, e2) -> e1.getValue().size() - e2.getValue().size())
// Extract that largest list
.map(Entry::getValue)
.orElse(Arrays.asList())
// Intersect the lists in that list to find maximal matching
.stream()
.reduce(YourClassName::commonPrefix)
// Change back to a string
.map(lst -> String.join("/", lst))
.orElse("");
}
private static List<String> commonPrefix(List<String> lst1, List<String> lst2) {
int maxIndex = 0;
while(maxIndex < Math.min(lst1.size(), lst2.size())&& lst1.get(maxIndex).equals(lst2.get(maxIndex))) {
maxIndex++;
}
return lst1.subList(0, maxIndex);
}
Note that I had to remove the initial / from the paths, otherwise that character would have been used in the split, resulting in the first string in every path list being the empty string, which would always be the most common prefix. Shouldn't be too hard to do this in pre-processing though.

Create Test case for sorted list of strings

I hava a list of strings and in my code I order this list. I want to write a unit test to ensure that the list has been orderer properly. my code
#Test
public void test() {
List<String> orderedList = new ArrayList<String>();
orderedList.add("a");
orderedList.add("b");
orderedList.add("a");
assertThat(orderedList, isInDescendingOrdering());
}
private Matcher<? super List<String>> isInDescendingOrdering()
{
return new TypeSafeMatcher<List<String>>()
{
#Override
public void describeTo (Description description)
{
description.appendText("ignored");
}
#Override
protected boolean matchesSafely (List<String> item)
{
for(int i = 0 ; i < item.size() -1; i++) {
if(item.get(i).equals(item.get(i+1))) return false;
}
return true;
}
};
}
somehow it success al the times.
You are absolutely overcomplicating things here. Writing a custom matcher is a nice exercise, but it does not add any real value to your tests.
Instead I would suggest that you simply create some
String[] expectedArray =....
value and give that to your call to assertThat. That is less sexy, but much easier to read and understand. And that is what counts for unit tests!
You can do it simply by copying the array, then sorting it and finally compare it with original array.
The code is given below:
#Test
public void test() {
List<String> orderedList = new ArrayList<String>();
orderedList.add("a");
orderedList.add("b");
orderedList.add("a");
//Copy the array to sort and then to compare with original
List<String> orderedList2 = new ArrayList<String>(orderedList);
orderedList2.sort((String s1, String s2) -> s1.compareTo(s2));
Assert.assertEquals(orderedList, orderedList2);
}

Iterating through sets

I am writing a program that will receive a list of words. After that, it will store the repeated words and the non repeated into two different lists. My code is the following:
public class Runner
{
public static void run (Set<String> words)
{
Set<String> uniques= new HashSet<String>();
Set<String> dupes= new HashSet<String>();
Iterator<String> w = words.iterator();
while (w.hasNext())
{
if (!(uniques.add(w.next())))
{
dupes.add(w.next());
}
else
{
uniques.add(w.next());
}
}
System.out.println ("Uniques: "+ uniques);
System.out.println ("Dupes: "+ dupes);
}
}
However, the output for the following:
right, left, up, left, down
is:
Uniques: [left, right, up, down]
Dupes: []
and my desired would be:
Uniques: [right, left, up, down]
Dupes: [ left ]
I want to achieve this using sets. I know it would be way easier to just an ArrayList but I am trying to understand sets.
The reason for your problem is that the argument words is a Set<String>. A set by definition will not contain duplicates. The argument words should be a List<String>. The code also makes the mistake of calling w.next() twice. A call to the next() will cause the iterator to advance.
public static void run(List<String> words) {
Set<String> uniques= new HashSet<String>();
Set<String> dupes= new HashSet<String>();
Iterator<String> w = words.iterator();
while(w.hasNext()) {
String word = w.next();
if(!uniques.add(word)) {
dupes.add(word);
}
}
}
You are doing uniques.add(w.next()) twice. Why?
Also, don't keep calling w.next() - this makes the iteration happen. Call it once and keep a local reference.
Use:
String next = w.next();
if(uniques.contains(next)) {
// it's a dupe
} else {
// it's not a dupe
}

How to match the exact string value in the list of comma separated string [duplicate]

I have a String[] with values like so:
public static final String[] VALUES = new String[] {"AB","BC","CD","AE"};
Given String s, is there a good way of testing whether VALUES contains s?
Arrays.asList(yourArray).contains(yourValue)
Warning: this doesn't work for arrays of primitives (see the comments).
Since java-8 you can now use Streams.
String[] values = {"AB","BC","CD","AE"};
boolean contains = Arrays.stream(values).anyMatch("s"::equals);
To check whether an array of int, double or long contains a value use IntStream, DoubleStream or LongStream respectively.
Example
int[] a = {1,2,3,4};
boolean contains = IntStream.of(a).anyMatch(x -> x == 4);
Concise update for Java SE 9
Reference arrays are bad. For this case we are after a set. Since Java SE 9 we have Set.of.
private static final Set<String> VALUES = Set.of(
"AB","BC","CD","AE"
);
"Given String s, is there a good way of testing whether VALUES contains s?"
VALUES.contains(s)
O(1).
The right type, immutable, O(1) and concise. Beautiful.*
Original answer details
Just to clear the code up to start with. We have (corrected):
public static final String[] VALUES = new String[] {"AB","BC","CD","AE"};
This is a mutable static which FindBugs will tell you is very naughty. Do not modify statics and do not allow other code to do so also. At an absolute minimum, the field should be private:
private static final String[] VALUES = new String[] {"AB","BC","CD","AE"};
(Note, you can actually drop the new String[]; bit.)
Reference arrays are still bad and we want a set:
private static final Set<String> VALUES = new HashSet<String>(Arrays.asList(
new String[] {"AB","BC","CD","AE"}
));
(Paranoid people, such as myself, may feel more at ease if this was wrapped in Collections.unmodifiableSet - it could then even be made public.)
(*To be a little more on brand, the collections API is predictably still missing immutable collection types and the syntax is still far too verbose, for my tastes.)
You can use ArrayUtils.contains from Apache Commons Lang
public static boolean contains(Object[] array, Object objectToFind)
Note that this method returns false if the passed array is null.
There are also methods available for primitive arrays of all kinds.
Example:
String[] fieldsToInclude = { "id", "name", "location" };
if ( ArrayUtils.contains( fieldsToInclude, "id" ) ) {
// Do some stuff.
}
Just simply implement it by hand:
public static <T> boolean contains(final T[] array, final T v) {
for (final T e : array)
if (e == v || v != null && v.equals(e))
return true;
return false;
}
Improvement:
The v != null condition is constant inside the method. It always evaluates to the same Boolean value during the method call. So if the input array is big, it is more efficient to evaluate this condition only once, and we can use a simplified/faster condition inside the for loop based on the result. The improved contains() method:
public static <T> boolean contains2(final T[] array, final T v) {
if (v == null) {
for (final T e : array)
if (e == null)
return true;
}
else {
for (final T e : array)
if (e == v || v.equals(e))
return true;
}
return false;
}
Four Different Ways to Check If an Array Contains a Value
Using List:
public static boolean useList(String[] arr, String targetValue) {
return Arrays.asList(arr).contains(targetValue);
}
Using Set:
public static boolean useSet(String[] arr, String targetValue) {
Set<String> set = new HashSet<String>(Arrays.asList(arr));
return set.contains(targetValue);
}
Using a simple loop:
public static boolean useLoop(String[] arr, String targetValue) {
for (String s: arr) {
if (s.equals(targetValue))
return true;
}
return false;
}
Using Arrays.binarySearch():
The code below is wrong, it is listed here for completeness. binarySearch() can ONLY be used on sorted arrays. You will find the result is weird below. This is the best option when array is sorted.
public static boolean binarySearch(String[] arr, String targetValue) {
return Arrays.binarySearch(arr, targetValue) >= 0;
}
Quick Example:
String testValue="test";
String newValueNotInList="newValue";
String[] valueArray = { "this", "is", "java" , "test" };
Arrays.asList(valueArray).contains(testValue); // returns true
Arrays.asList(valueArray).contains(newValueNotInList); // returns false
If the array is not sorted, you will have to iterate over everything and make a call to equals on each.
If the array is sorted, you can do a binary search, there's one in the Arrays class.
Generally speaking, if you are going to do a lot of membership checks, you may want to store everything in a Set, not in an array.
For what it's worth I ran a test comparing the 3 suggestions for speed. I generated random integers, converted them to a String and added them to an array. I then searched for the highest possible number/string, which would be a worst case scenario for the asList().contains().
When using a 10K array size the results were:
Sort & Search : 15
Binary Search : 0
asList.contains : 0
When using a 100K array the results were:
Sort & Search : 156
Binary Search : 0
asList.contains : 32
So if the array is created in sorted order the binary search is the fastest, otherwise the asList().contains would be the way to go. If you have many searches, then it may be worthwhile to sort the array so you can use the binary search. It all depends on your application.
I would think those are the results most people would expect. Here is the test code:
import java.util.*;
public class Test {
public static void main(String args[]) {
long start = 0;
int size = 100000;
String[] strings = new String[size];
Random random = new Random();
for (int i = 0; i < size; i++)
strings[i] = "" + random.nextInt(size);
start = System.currentTimeMillis();
Arrays.sort(strings);
System.out.println(Arrays.binarySearch(strings, "" + (size - 1)));
System.out.println("Sort & Search : "
+ (System.currentTimeMillis() - start));
start = System.currentTimeMillis();
System.out.println(Arrays.binarySearch(strings, "" + (size - 1)));
System.out.println("Search : "
+ (System.currentTimeMillis() - start));
start = System.currentTimeMillis();
System.out.println(Arrays.asList(strings).contains("" + (size - 1)));
System.out.println("Contains : "
+ (System.currentTimeMillis() - start));
}
}
Instead of using the quick array initialisation syntax too, you could just initialise it as a List straight away in a similar manner using the Arrays.asList method, e.g.:
public static final List<String> STRINGS = Arrays.asList("firstString", "secondString" ...., "lastString");
Then you can do (like above):
STRINGS.contains("the string you want to find");
With Java 8 you can create a stream and check if any entries in the stream matches "s":
String[] values = {"AB","BC","CD","AE"};
boolean sInArray = Arrays.stream(values).anyMatch("s"::equals);
Or as a generic method:
public static <T> boolean arrayContains(T[] array, T value) {
return Arrays.stream(array).anyMatch(value::equals);
}
You can use the Arrays class to perform a binary search for the value. If your array is not sorted, you will have to use the sort functions in the same class to sort the array, then search through it.
ObStupidAnswer (but I think there's a lesson in here somewhere):
enum Values {
AB, BC, CD, AE
}
try {
Values.valueOf(s);
return true;
} catch (IllegalArgumentException exc) {
return false;
}
Actually, if you use HashSet<String> as Tom Hawtin proposed you don't need to worry about sorting, and your speed is the same as with binary search on a presorted array, probably even faster.
It all depends on how your code is set up, obviously, but from where I stand, the order would be:
On an unsorted array:
HashSet
asList
sort & binary
On a sorted array:
HashSet
Binary
asList
So either way, HashSet for the win.
Developers often do:
Set<String> set = new HashSet<String>(Arrays.asList(arr));
return set.contains(targetValue);
The above code works, but there is no need to convert a list to set first. Converting a list to a set requires extra time. It can as simple as:
Arrays.asList(arr).contains(targetValue);
or
for (String s : arr) {
if (s.equals(targetValue))
return true;
}
return false;
The first one is more readable than the second one.
If you have the google collections library, Tom's answer can be simplified a lot by using ImmutableSet (http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ImmutableSet.html)
This really removes a lot of clutter from the initialization proposed
private static final Set<String> VALUES = ImmutableSet.of("AB","BC","CD","AE");
In Java 8 use Streams.
List<String> myList =
Arrays.asList("a1", "a2", "b1", "c2", "c1");
myList.stream()
.filter(s -> s.startsWith("c"))
.map(String::toUpperCase)
.sorted()
.forEach(System.out::println);
One possible solution:
import java.util.Arrays;
import java.util.List;
public class ArrayContainsElement {
public static final List<String> VALUES = Arrays.asList("AB", "BC", "CD", "AE");
public static void main(String args[]) {
if (VALUES.contains("AB")) {
System.out.println("Contains");
} else {
System.out.println("Not contains");
}
}
}
Using a simple loop is the most efficient way of doing this.
boolean useLoop(String[] arr, String targetValue) {
for(String s: arr){
if(s.equals(targetValue))
return true;
}
return false;
}
Courtesy to Programcreek
the shortest solution
the array VALUES may contain duplicates
since Java 9
List.of(VALUES).contains(s);
Use the following (the contains() method is ArrayUtils.in() in this code):
ObjectUtils.java
public class ObjectUtils {
/**
* A null safe method to detect if two objects are equal.
* #param object1
* #param object2
* #return true if either both objects are null, or equal, else returns false.
*/
public static boolean equals(Object object1, Object object2) {
return object1 == null ? object2 == null : object1.equals(object2);
}
}
ArrayUtils.java
public class ArrayUtils {
/**
* Find the index of of an object is in given array,
* starting from given inclusive index.
* #param ts Array to be searched in.
* #param t Object to be searched.
* #param start The index from where the search must start.
* #return Index of the given object in the array if it is there, else -1.
*/
public static <T> int indexOf(final T[] ts, final T t, int start) {
for (int i = start; i < ts.length; ++i)
if (ObjectUtils.equals(ts[i], t))
return i;
return -1;
}
/**
* Find the index of of an object is in given array, starting from 0;
* #param ts Array to be searched in.
* #param t Object to be searched.
* #return indexOf(ts, t, 0)
*/
public static <T> int indexOf(final T[] ts, final T t) {
return indexOf(ts, t, 0);
}
/**
* Detect if the given object is in the given array.
* #param ts Array to be searched in.
* #param t Object to be searched.
* #return If indexOf(ts, t) is greater than -1.
*/
public static <T> boolean in(final T[] ts, final T t) {
return indexOf(ts, t) > -1;
}
}
As you can see in the code above, that there are other utility methods ObjectUtils.equals() and ArrayUtils.indexOf(), that were used at other places as well.
For arrays of limited length use the following (as given by camickr). This is slow for repeated checks, especially for longer arrays (linear search).
Arrays.asList(...).contains(...)
For fast performance if you repeatedly check against a larger set of elements
An array is the wrong structure. Use a TreeSet and add each element to it. It sorts elements and has a fast exist() method (binary search).
If the elements implement Comparable & you want the TreeSet sorted accordingly:
ElementClass.compareTo() method must be compatable with ElementClass.equals(): see Triads not showing up to fight? (Java Set missing an item)
TreeSet myElements = new TreeSet();
// Do this for each element (implementing *Comparable*)
myElements.add(nextElement);
// *Alternatively*, if an array is forceably provided from other code:
myElements.addAll(Arrays.asList(myArray));
Otherwise, use your own Comparator:
class MyComparator implements Comparator<ElementClass> {
int compareTo(ElementClass element1; ElementClass element2) {
// Your comparison of elements
// Should be consistent with object equality
}
boolean equals(Object otherComparator) {
// Your equality of comparators
}
}
// construct TreeSet with the comparator
TreeSet myElements = new TreeSet(new MyComparator());
// Do this for each element (implementing *Comparable*)
myElements.add(nextElement);
The payoff: check existence of some element:
// Fast binary search through sorted elements (performance ~ log(size)):
boolean containsElement = myElements.exists(someElement);
If you don't want it to be case sensitive
Arrays.stream(VALUES).anyMatch(s::equalsIgnoreCase);
Try this:
ArrayList<Integer> arrlist = new ArrayList<Integer>(8);
// use add() method to add elements in the list
arrlist.add(20);
arrlist.add(25);
arrlist.add(10);
arrlist.add(15);
boolean retval = arrlist.contains(10);
if (retval == true) {
System.out.println("10 is contained in the list");
}
else {
System.out.println("10 is not contained in the list");
}
Check this
String[] VALUES = new String[]{"AB", "BC", "CD", "AE"};
String s;
for (int i = 0; i < VALUES.length; i++) {
if (VALUES[i].equals(s)) {
// do your stuff
} else {
//do your stuff
}
}
Arrays.asList() -> then calling the contains() method will always work, but a search algorithm is much better since you don't need to create a lightweight list wrapper around the array, which is what Arrays.asList() does.
public boolean findString(String[] strings, String desired){
for (String str : strings){
if (desired.equals(str)) {
return true;
}
}
return false; //if we get here… there is no desired String, return false.
}
Use below -
String[] values = {"AB","BC","CD","AE"};
String s = "A";
boolean contains = Arrays.stream(values).anyMatch(v -> v.contains(s));
Use Array.BinarySearch(array,obj) for finding the given object in array or not.
Example:
if (Array.BinarySearch(str, i) > -1)` → true --exists
false --not exists
Try using Java 8 predicate test method
Here is a full example of it.
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
public class Test {
public static final List<String> VALUES =
Arrays.asList("AA", "AB", "BC", "CD", "AE");
public static void main(String args[]) {
Predicate<String> containsLetterA = VALUES -> VALUES.contains("AB");
for (String i : VALUES) {
System.out.println(containsLetterA.test(i));
}
}
}
http://mytechnologythought.blogspot.com/2019/10/java-8-predicate-test-method-example.html
https://github.com/VipulGulhane1/java8/blob/master/Test.java
Create a boolean initially set to false. Run a loop to check every value in the array and compare to the value you are checking against. If you ever get a match, set boolean to true and stop the looping. Then assert that the boolean is true.
As I'm dealing with low level Java using primitive types byte and byte[], the best so far I got is from bytes-java https://github.com/patrickfav/bytes-java seems a fine piece of work
You can check it by two methods
A) By converting the array into string and then check the required string by .contains method
String a = Arrays.toString(VALUES);
System.out.println(a.contains("AB"));
System.out.println(a.contains("BC"));
System.out.println(a.contains("CD"));
System.out.println(a.contains("AE"));
B) This is a more efficent method
Scanner s = new Scanner(System.in);
String u = s.next();
boolean d = true;
for (int i = 0; i < VAL.length; i++) {
if (VAL[i].equals(u) == d)
System.out.println(VAL[i] + " " + u + VAL[i].equals(u));
}

List of string with occurrences count and sort

I'm developing a Java Application that reads a lot of strings data likes this:
1 cat (first read)
2 dog
3 fish
4 dog
5 fish
6 dog
7 dog
8 cat
9 horse
...(last read)
I need a way to keep all couple [string, occurrences] in order from last read to first read.
string occurrences
horse 1 (first print)
cat 2
dog 4
fish 2 (last print)
Actually i use two list:
1) List<string> input; where i add all data
In my example:
input.add("cat");
input.add("dog");
input.add("fish");
...
2)List<string> possibilities; where I insert the strings once in this way:
if(possibilities.contains("cat")){
possibilities.remove("cat");
}
possibilities.add("cat");
In this way I've got a sorted list where all possibilities.
I use it like that:
int occurrence;
for(String possible:possibilities){
occurrence = Collections.frequency(input, possible);
System.out.println(possible + " " + occurrence);
}
That trick works good but it's too slow(i've got millions of input)... any help?
(English isn’t my first language, so please excuse any mistakes.)
Use a Map<String, Integer>, as #radoslaw pointed, to keep the insertion sorting use LinkedHashMap and not a TreeMap as described here:
LinkedHashMap keeps the keys in the order they were inserted, while a TreeMap is kept sorted via a Comparator or the natural Comparable ordering of the elements.
Imagine you have all the strings in some array, call it listOfAllStrings, iterate over this array and use the string as key in your map, if it does not exists, put in the map, if it exists, sum 1 to actual result...
Map<String, Integer> results = new LinkedHashMap<String, Integer>();
for (String s : listOfAllStrings) {
if (results.get(s) != null) {
results.put(s, results.get(s) + 1);
} else {
results.put(s, 1);
}
}
Make use of a TreeMap, which will keep ordering on the keys as specified by the compare of your MyStringComparator class handling MyString class which wraps String adding insertion indexes, like this:
// this better be immutable
class MyString {
private MyString() {}
public static MyString valueOf(String s, Long l) { ... }
private String string;
private Long index;
public hashcode(){ return string.hashcode(); }
public boolean equals() { // return rely on string.equals() }
}
class MyStringComparator implements Comparator<MyString> {
public int compare(MyString s1, MyString s2) {
return -s1.getIndex().compareTo(s2.gtIndex());
}
}
Pass the comparator while constructing the map:
Map<MyString,Integer> map = new TreeMap<>(new MyStringComparator());
Then, while parsing your input, do
Long counter = 0;
while (...) {
MyString item = MyString.valueOf(readString, counter++);
if (map.contains(item)) {
map.put(map.get(item)+1);
} else {
map.put(item,1);
}
}
There will be a lot of instantiation because of the immutable class, and the comparator will not be consistent with equals, but it should work.
Disclaimer: this is untested code just to show what I'd do, I'll come back and recheck it when I get my hands on a compiler.
Here is the complete solution for your problem,
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class DataDto implements Comparable<DataDto>{
public int count = 0;
public String string;
public long lastSeenTime;
public DataDto(String string) {
this.string = string;
this.lastSeenTime = System.currentTimeMillis();
}
public boolean equals(Object object) {
if(object != null && object instanceof DataDto) {
DataDto temp = (DataDto) object;
if(temp.string != null && temp.string.equals(this.string)) {
return true;
}
}
return false;
}
public int hashcode() {
return string.hashCode();
}
public int compareTo(DataDto o) {
if(o != null) {
return o.lastSeenTime < this.lastSeenTime ? -1 : 1;
}
return 0;
}
public String toString() {
return this.string + " : " + this.count;
}
public static final void main(String[] args) {
String[] listOfAllStrings = {"horse", "cat", "dog", "fish", "cat", "fish", "dog", "cat", "horse", "fish"};
Map<String, DataDto> results = new HashMap<String, DataDto>();
for (String s : listOfAllStrings) {
DataDto dataDto = results.get(s);
if(dataDto != null) {
dataDto.count = dataDto.count + 1;
dataDto.lastSeenTime = System.nanoTime();
} else {
dataDto = new DataDto(s);
results.put(s, dataDto);
}
}
List<DataDto> finalResults = new ArrayList<DataDto>(results.values());
System.out.println(finalResults);
Collections.sort(finalResults);
System.out.println(finalResults);
}
}
Ans
[horse : 1, cat : 2, fish : 2, dog : 1]
[fish : 2, horse : 1, cat : 2, dog : 1]
I think this solution will be suitable for your requirement.
If you know that your data is not going to exceed your memory capacity when you read it all into memory, then the solution is simple - using a LinkedList or a and a LinkedHashMap.
For example, if you use a Linked list:
LinkedList<String> input = new LinkedList();
You then proceed to use input.add() as you did originally. But when the input list is full, you basically use Jordi Castilla's solution - but put the entries in the linked list in reverse order. To do that, you do:
Iterator<String> iter = list.descendingIterator();
LinkedHashMap<String,Integer> map = new LinkedHashMap<>();
while (iter.hasNext()) {
String s = iter.next();
if ( map.containsKey(s)) {
map.put( s, map.get(s) + 1);
} else {
map.put(s, 1);
}
}
Now, the only real difference between his solution and mine is that I'm using list.descendingIterator() which is a method in LinkedList that gives you the entries in backwards order, from "horse" to "cat".
The LinkedHashMap will keep the proper order - whatever was entered first will be printed first, and because we entered things in reverse order, then whatever was read last will be printed first. So if you print your map the result will be:
{horse=1, cat=2, dog=4, fish=2}
If you have a very long file, and you can't load the entire list of strings into memory, you had better keep just the map of frequencies. In this case, in order to keep the order of entry, we'll use an object such as this:
private static class Entry implements Comparable<Entry> {
private static long nextOrder = Long.MIN_VALUE;
private String str;
private int frequency = 1;
private long order = nextOrder++;
public Entry(String str) {
this.str = str;
}
public String getString() {
return str;
}
public int getFrequency() {
return frequency;
}
public void updateEntry() {
frequency++;
order = nextOrder++;
}
#Override
public int compareTo(Entry e) {
if ( order > e.order )
return -1;
if ( order < e.order )
return 1;
return 0;
}
#Override
public String toString() {
return String.format( "%s: %d", str, frequency );
}
}
The trick here is that every time you update the entry (add one to the frequency), it also updates the order. But the compareTo() method orders Entry objects from high order (updated/inserted later) to low order (updated/inserted earlier).
Now you can use a simple HashMap<String,Entry> to store the information as you read it (I'm assuming you are reading from some sort of scanner):
Map<String,Entry> m = new HashMap<>();
while ( scanner.hasNextLine() ) {
String str = scanner.nextLine();
Entry entry = m.get(str);
if ( entry == null ) {
entry = new Entry(str);
m.put(str, entry);
} else {
entry.updateEntry();
}
}
Scanner.close();
Now you can sort the values of the entries:
List<Entry> orderedList = new ArrayList<Entry>(m.values());
m = null;
Collections.sort(orderedList);
Running System.out.println(orderedList) will give you:
[horse: 1, cat: 2, dog: 4, fish: 2]
In principle, you could use a TreeMap whose keys contained the "order" stuff, rather than a plain HashMap like this followed by sorting, but I prefer not having either mutable keys in a map, nor changing the keys constantly. Here we are only changing the values as we fill the map, and each key is inserted into the map only once.
What you could do:
Reverse the order of the list using
Collections.reverse(input). This runs in linear time - O(n);
Create a Set from the input list. A Set garantees uniqueness.
To preserve insertion order, you'll need a LinkedHashSet;
Iterate over this set, just as you did above.
Code:
/* I don't know what logic you use to create the input list,
* so I'm using your input example. */
List<String> input = Arrays.asList("cat", "dog", "fish", "dog",
"fish", "dog", "dog", "cat", "horse");
/* by the way, this changes the input list!
* Copy it in case you need to preserve the original input. */
Collections.reverse(input);
Set<String> possibilities = new LinkedHashSet<String>(strings);
for (String s : possibilities) {
System.out.println(s + " " + Collections.frequency(strings, s));
}
Output:
horse 1
cat 2
dog 4
fish 2

Categories

Resources