finding vals from table with variable keys - java

There is a table:
The key consists of 3 parts:
region+s1+s2
The region, like US, is always specified, but the other two may be left unspecified, in which case * is used for "all".
for example:
for key = "US_A_U" value = 2, because:
trying to find full match: find in the table key ("US_A_U") - not found
1 step less strict find: find key ("US_A_*") - found = 2
for key = "US_Q_Q" value = 3, because:
trying to find full match: find in the table key ("US_Q_Q") - not found
1 step less strict find: find key ("US_Q_*") - not found
find key ("US_*_Q") - not found
1 step less strict find: find key ("US_*_*") - found = 3
for key = "US_O_P" value = 3, because:
trying to find full match: find in the table key ("US_O_P") - not found
1 step less strict find: find key ("US_O_*") - not found
find key ("US_*_P") - not found
1 step less strict find: find key ("US_*_*") - found = 3
So with the HashMap approach I need up to 4 calls to map.get() to find a value, which is too many because this code will run very often.
Are there any nicer or faster solutions?
package test;
import java.util.HashMap;
public class MainCLass {
public static void main(String[] args) {
// init map (assuming this code will be run only once)
HashMap<String, String> map = new HashMap<>();
map.put("US_A_B", "1");
map.put("US_A_*", "2");
map.put("US_*_*", "3");
map.put("US_O_O", "4");
map.put("US_*_W", "5");
map.put("ASIA_*_*", "6");
// now often called logic
// incoming params, for this example hardcoded
String reg = "US";
String s1 = "O";
String s2 = "P";
String val = null;
val = map.get(reg+"_"+s1+"_"+s2);
if (val == null){
val = map.get(reg+"_"+s1+"_*");
if (val == null){
val = map.get(reg+"_"+"*_"+s2);
if (val == null){
val = map.get(reg+"_*_*");
}
}
}
System.out.println(val);
}
}
upd: I need to add that there are always 3 incoming params (region, s1, s2). None of these params will ever equal "*" or be empty, so the full key will always be like US_J_K (and not US_*_K etc.),
so by these 3 params I need to find the right value from the init table.

You could try creating a tier of maps such as
Map<String, Map<String, Map<String, String>>> map;
In this map the first key is region, the second key is s1, and the third key is s2. This will allow you to easily search for region, s1, and s2 independently.
EDIT:
Example usage with searching for "US_O_P"
public static void main(String[] args) {
RegionMap map = new RegionMap();
String region = "US";
String s1 = "O";
String s2 = "P";
String val = map.search(region, s1, s2);
System.out.println(val);
}
public class RegionMap {
private Map<String, Map<String, Map<String, String>>> regionMap;
public RegionMap() {
init();
}
public String search(String region, String s1, String s2) {
String val = searchS1(regionMap.get(region), s1, s2);
if (val == null) {
val = searchS1(regionMap.get("*"), s1, s2);
}
return val;
}
private String searchS1(Map<String, Map<String, String>> s1Map, String s1, String s2) {
if (s1Map == null) {
return null;
}
String val = searchS2(s1Map.get(s1), s2);
if (val == null) {
val = searchS2(s1Map.get("*"), s2);
}
return val;
}
private String searchS2(Map<String, String> s2Map, String s2) {
if (s2Map == null) {
return null;
}
String val = s2Map.get(s2);
if (val == null) {
val = s2Map.get("*");
}
return val;
}
private void init() {
regionMap = new HashMap<>();
addEntry("US", "A", "B", "1");
addEntry("US", "A", "*", "2");
addEntry("US", "*", "*", "3");
addEntry("US", "O", "O", "4");
addEntry("US", "*", "W", "5");
addEntry("ASIA", "*", "*", "6");
}
private void addEntry(String region, String s1, String s2, String value) {
Map<String, Map<String, String>> s1Map = regionMap.get(region);
if (s1Map == null) {
s1Map = new HashMap<>();
regionMap.put(region, s1Map);
}
Map<String, String> s2Map = s1Map.get(s1);
if (s2Map == null) {
s2Map = new HashMap<>();
s1Map.put(s1, s2Map);
}
s2Map.put(s2, value);
}
}
EDIT:
Benchmark results
I ran tests for searching for "US_O_P" multiple times and found the following results for 1,000,000,000 searches
Original: 9.7334702479 seconds
Tiered: 2.471287074 seconds
The following is the benchmark code
public class RegionMapOrig {
private Map<String, String> map;
public RegionMapOrig() {
init();
}
private void init() {
map = new HashMap<>();
map.put("US_A_B", "1");
map.put("US_A_*", "2");
map.put("US_*_*", "3");
map.put("US_O_O", "4");
map.put("US_*_W", "5");
map.put("ASIA_*_*", "6");
}
public String search(String reg, String s1, String s2) {
String val = null;
val = map.get(reg + "_" + s1 + "_" + s2);
if (val == null) {
val = map.get(reg + "_" + s1 + "_*");
if (val == null) {
val = map.get(reg + "_" + "*_" + s2);
if (val == null) {
val = map.get(reg + "_*_*");
}
}
}
return val;
}
}
private static final int N = 1000000000;
public static void main(String[] args) {
String region = "US";
String s1 = "O";
String s2 = "P";
testOrig(region, s1, s2);
test(region, s1, s2);
}
private static void testOrig(String region, String s1, String s2) {
RegionMapOrig map = new RegionMapOrig();
long start = System.nanoTime();
for (int i = 0; i < N; ++i) {
String val = map.search(region, s1, s2);
}
long end = System.nanoTime();
System.out.println((end - start) / 10E9);
}
private static void test(String region, String s1, String s2) {
RegionMap map = new RegionMap();
long start = System.nanoTime();
for (int i = 0; i < N; ++i) {
String val = map.search(region, s1, s2);
}
long end = System.nanoTime();
System.out.println((end - start) / 10E9);
}
Running this code multiple times has yielded the same results. However, this benchmark is simple and may not be definitive. To truly test your results you will need to analyze the performance with a real data set that represents your typical values. I believe your performance issue may lie in your string concatenation and not in how many calls you make to the map. The other reason why mine may have performed better is that my internal maps may be cached, making multiple retrievals faster.
EDIT: Benchmark update
After further investigation, removing the string concatenation improved your original code, with these results:
Original (no concatenation): 1.2068575417 seconds
Tiered: 2.2982665873 seconds
The code changes are:
public String searchNoCat(String cache1, String cache2, String cache3, String cache4) {
String val = null;
val = map.get(cache1);
if (val == null) {
val = map.get(cache2);
if (val == null) {
val = map.get(cache3);
if (val == null) {
val = map.get(cache4);
}
}
}
return val;
}
private static void testOrigNoCat(String region, String s1, String s2) {
RegionMapOrig map = new RegionMapOrig();
String cache1 = region + "_" + s1 + "_" + s2;
String cache2 = region + "_" + s1 + "_*";
String cache3 = region + "_" + "*_" + s2;
String cache4 = region + "_*_*";
long start = System.nanoTime();
for (int i = 0; i < N; ++i) {
String val = map.searchNoCat(cache1, cache2, cache3, cache4);
}
long end = System.nanoTime();
System.out.println((end - start) / 10E9);
}
However, the issue remains of how to efficiently cache such values or reduce the number of concatenations for generic input. I do not know of an efficient way to do this. Therefore, I think that the tiered map is an efficient solution that avoids the concatenation problem.
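One possible way to keep a single flat map while skipping the per-lookup String concatenation is a small composite key object. The following is only a sketch and has not been benchmarked; RegionKey and KeyedRegionMap are hypothetical names, and it still allocates one short-lived key object per probe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical composite key: three fields instead of one concatenated String.
final class RegionKey {
    private final String region;
    private final String s1;
    private final String s2;
    private final int hash; // computed once, the fields never change

    RegionKey(String region, String s1, String s2) {
        this.region = region;
        this.s1 = s1;
        this.s2 = s2;
        // String caches its own hashCode, so this is cheap for reused inputs.
        this.hash = 31 * (31 * region.hashCode() + s1.hashCode()) + s2.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RegionKey)) return false;
        RegionKey k = (RegionKey) o;
        return region.equals(k.region) && s1.equals(k.s1) && s2.equals(k.s2);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

final class KeyedRegionMap {
    private final Map<RegionKey, String> map = new HashMap<>();

    void put(String region, String s1, String s2, String value) {
        map.put(new RegionKey(region, s1, s2), value);
    }

    // Same fallback order as the question: exact, s1_*, *_s2, *_*.
    String search(String region, String s1, String s2) {
        String val = map.get(new RegionKey(region, s1, s2));
        if (val == null) val = map.get(new RegionKey(region, s1, "*"));
        if (val == null) val = map.get(new RegionKey(region, "*", s2));
        if (val == null) val = map.get(new RegionKey(region, "*", "*"));
        return val;
    }
}
```

Whether this beats the tiered map would have to be measured with realistic data.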

It looks like you need a tree structure to encapsulate the wildcard ("*") replacement logic when searching for a value.
First I wrote some unit tests to describe the expected behaviour:
import static org.junit.Assert.*;
import org.junit.Before;
import org.junit.Test;
public class WildcardSearchSpec {
private Node root;
@Before
public void before() {
root = new WildcardSearch();
root.add("US_A_B", "1");
root.add("US_A_*", "2");
root.add("US_*_*", "3");
root.add("US_O_O", "4");
root.add("US_*_W", "5");
root.add("ASIA_*_*", "6");
}
@Test
public void itShouldReturnFullWildcardCorrespondingValue() {
String key = "US_Q_Q";
String value = root.value(key);
assertEquals("3", value);
}
@Test
public void itShouldReturnNoWildcardCorrespondingValue() {
String key = "US_A_B";
String value = root.value(key);
assertEquals("1", value);
}
@Test
public void itShouldReturnS2WildcardCorrespondingValue() {
String key = "US_A_U";
String value = root.value(key);
assertEquals("2", value);
}
@Test
public void itShouldReturnS1WidlcardCorrespondingValue() {
String key = "US_W_W";
String value = root.value(key);
assertEquals("5", value);
}
@Test(expected=NoValueException.class)
public void itShouldThrowWhenNoCorrespondingValue() {
String key = "EU_A_B";
root.value(key);
fail();
}
}
The interface one can extract from these tests is the following
public interface Node {
void add(String key, String value);
String value(String key);
}
Which is implemented by WildcardSearch
import java.util.HashMap;
import java.util.Map;
public final class WildcardSearch implements Node {
private final Map<String, CountrySearch> children = new HashMap<>();
@Override
public void add(String key, String value) {
String country = key.split("_")[0];
String rest = key.substring(country.length() + 1);
children.putIfAbsent(country, new CountrySearch());
children.get(country).add(rest, value);
}
@Override
public String value(String key) {
String country = key.split("_")[0];
String rest = key.substring(country.length() + 1);
// delegate only when the country is known; otherwise there is no value
if (children.containsKey(country)) {
return children.get(country).value(rest);
} else {
throw new NoValueException();
}
}
}
WildcardSearch uses CountrySearch to delegate the search in each country.
import java.util.HashMap;
import java.util.Map;
final class CountrySearch implements Node {
private final Map<String, SuffixeSearch> children = new HashMap<>();
@Override
public void add(String key, String value) {
String[] splittedKey = key.split("_");
String s1 = splittedKey[0];
String s2 = splittedKey[1];
children.putIfAbsent(s1, new SuffixeSearch());
children.get(s1).add(s2, value);
}
@Override
public String value(String key) {
String[] splittedKey = key.split("_");
String s1 = splittedKey[0];
String s2 = splittedKey[1];
if (children.containsKey(s1)) {
try {
return children.get(s1).value(s2);
} catch (NoValueException e) {
// nothing under the exact s1 (e.g. "O_P"); fall through to the "*" branch
}
}
if (children.containsKey("*")) {
return children.get("*").value(s2);
}
throw new NoValueException();
}
}
CountrySearch uses SuffixeSearch to delegate the search in the suffixes.
import java.util.HashMap;
import java.util.Map;
final class SuffixeSearch implements Node {
private final Map<String, String> children = new HashMap<>();
public void add(String key, String value) {
children.put(key, value);
}
@Override
public String value(String key) {
if (children.containsKey(key)) {
return children.get(key);
} else if (children.containsKey("*")) {
return children.get("*");
} else {
throw new NoValueException();
}
}
}
Note: NoValueException is a custom RuntimeException.
The point is that each responsibility is clearly separated.
SuffixeSearch is only able to return the value for the corresponding key or the value corresponding to "*". It doesn't know anything about how the overall key is structured, or that the values are clustered by country, etc.
CountrySearch only knows about its own level, delegating the rest to SuffixeSearch and ignoring what is above.
WildcardSearch only knows about splitting off the country and delegates to CountrySearch the responsibility to do the wildcard magic.

The best and most general solution would be to use a search tree, which you could implement yourself fairly easily and which is a good programming exercise as well. There are also lots of tutorials and examples around on how to implement it.
For your special use case you could make use of cascading Maps, as DragonAssassin already posted, which leverages what Java already offers.
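As a rough illustration of the search-tree idea, here is a minimal sketch that works for any number of key parts. WildcardTree is a hypothetical name, "*" is stored as an ordinary child, and the lookup backtracks to the "*" child whenever the exact branch yields nothing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical search tree: one level per key part.
final class WildcardTree {
    private final Map<String, WildcardTree> children = new HashMap<>();
    private String value; // set only on nodes that end a stored key

    // Stores a value under a key such as "US_A_*".
    void put(String key, String value) {
        WildcardTree node = this;
        for (String part : key.split("_")) {
            node = node.children.computeIfAbsent(part, p -> new WildcardTree());
        }
        node.value = value;
    }

    // Looks up concrete parts, e.g. get("US", "O", "P"); returns null if nothing matches.
    String get(String... parts) {
        return get(parts, 0);
    }

    private String get(String[] parts, int depth) {
        if (depth == parts.length) {
            return value;
        }
        // Try the exact child first, then fall back to the wildcard child.
        WildcardTree exact = children.get(parts[depth]);
        if (exact != null) {
            String found = exact.get(parts, depth + 1);
            if (found != null) {
                return found;
            }
        }
        WildcardTree star = children.get("*");
        return star == null ? null : star.get(parts, depth + 1);
    }
}
```

With the table from the question, the probing order is the same as the four map.get() calls: exact, s1 with "*", "*" with s2, then "*" with "*".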

One possible optimization is to expand the map to all possible values. It needs more memory and has some initialization cost, but it might be worth it.
I made a few assumptions; if they don't apply to your problem, this approach is useless for you.
The region data does not change, (a partial restart is acceptable in the case the data changes).
It is always one char instead of the "star". So "US_A_B" not "US_AA_BB".
Only uppercase letters instead of the "star". So no "US_a_b" or "US_/_/"
This approach creates an int[] for every region. The array holds all the possible values, precalculated for 'A''A' -> 'Z''Z' including the '*'. So for a request you only need to find the correct int[] and calculate the index in the array based on the chars supplied.
I ran it with the benchmarks from @DragonAssassin and got about 1/10 of the time of his approach. The cost is about 3 KB of memory for each region (27 * 27 = 729 ints).
Here is the code:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
static class AreaMapBuilder {
private List<String> areas = new ArrayList<>();
private Map<String, Integer> codes = new HashMap<>();
public void put(String area, char a, char b, int value) {
areas.add(area);
// '@' comes right before 'A' in ASCII, so it takes slot 0 for the wildcard
if (a == '*')
a = '@';
if (b == '*')
b = '@';
codes.put(area + "_" + a + "_" + b, value);
}
public AreaMap build() {
Map<String, int[]> codes = new HashMap<>();
for (String area : areas) {
codes.put(area, forArea(area));
}
return new AreaMap(codes);
}
// precalculates the value for every combination of '@', 'A'..'Z'
private int[] forArea(String area) {
int[] forArea = new int[27 * 27];
for (int indexA = 0; indexA < 27; indexA++) {
for (int indexB = 0; indexB < 27; indexB++) {
forArea[indexA * 27 + indexB] = slowGet(area, (char) (indexA + '@'), (char) (indexB + '@'));
}
}
return forArea;
}
// assumes every area has a *_* entry; otherwise unboxing null throws
private int slowGet(String area, char a, char b) {
Integer val = codes.get(area + "_" + a + "_" + b);
if (val == null) {
val = codes.get(area + "_" + a + "_@");
if (val == null) {
val = codes.get(area + "_" + "@_" + b);
if (val == null) {
val = codes.get(area + "_@_@");
}
}
}
return val;
}
}
static class AreaMap {
private Map<String, int[]> codes;
public AreaMap(Map<String, int[]> codes) {
this.codes = codes;
}
public int get(String area, char a, char b) {
// map '*' to slot 0 and 'A'..'Z' to slots 1..26
if (a == '*')
a = 0;
else
a -= '@';
if (b == '*')
b = 0;
else
b -= '@';
return codes.get(area)[a * 27 + b];
}
}
static AreaMap getMap(){
AreaMapBuilder areaBuilder = new AreaMapBuilder();
areaBuilder.put("US", 'A', 'B', 1);
areaBuilder.put("US", 'A', '*', 2);
areaBuilder.put("US", '*', '*', 3);
areaBuilder.put("US", 'O', 'O', 4);
areaBuilder.put("US", '*', 'W', 5);
areaBuilder.put("ASIA", '*', '*', 6);
return areaBuilder.build();
}
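The index calculation is the core of this approach, so here it is in isolation. This is only a sketch under the assumptions listed above (single uppercase letters, slot 0 reserved for the wildcard); SlotIndex and its methods are made-up names.

```java
// Hypothetical helper showing how two chars map to one array index.
final class SlotIndex {
    static final int SLOTS = 27; // wildcard + 'A'..'Z'

    // '@' has code 64 and 'A' has code 65, so 'A'..'Z' become 1..26.
    static int slot(char c) {
        return c == '*' ? 0 : c - '@';
    }

    // Row-major position of (a, b) in an int[SLOTS * SLOTS] array.
    static int index(char a, char b) {
        return slot(a) * SLOTS + slot(b);
    }
}
```

For example, index('A', 'B') is 1 * 27 + 2 = 29, index('*', '*') is 0, and index('Z', 'Z') is 26 * 27 + 26 = 728, the last element of the array.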

If prepared right you could nest three maps and mark an entry star for the generic cases (in fact * would just be another key into the maps). To get the desired number you would then need three "Indexes". Assuming there will always be a *-Map:
Map<String, Map<String, Map<String, Integer>>> map = new HashMap<>();
Map<String, Map<String, Integer>> us_map = new HashMap<>();
Map<String, Map<String, Integer>> asia_map = new HashMap<>();
Map<String, Integer> us_a_map = new HashMap<>();
us_a_map.put("B", 1);
us_a_map.put("*", 2);
Map<String, Integer> us_star_map = new HashMap<>();
us_star_map.put("*", 3);
us_star_map.put("W", 5);
map.put( "US", us_map);
us_map.put( "A", us_a_map );
us_map.put( "*", us_star_map );
map.put( "ASIA", asia_map);
In this map the performance will be better than in your proposed case, since the maps are smaller. For example to get element US_A_B you would
Integer value = map.get( "US" ).get( "A" ).get( "B" );
To deal with missing elements (in this case the * elements have to be considered) one also can find the Map entry "in each level": With following input:
String l0 = "US";
String l1 = "A";
String l2 = "unknown";
And assuming there is always an entry for "*" in each of the Maps:
Map<String, Map<String, Integer>> level_0;
Map<String, Integer> level_1;
Integer level_2; // This will be the desired result
level_0 = map.get(l0);
if (level_0 == null) {
level_0 = map.get("*"); // the top-level "*" map, assumed to exist
}
level_1 = level_0.get(l1);
if (level_1 == null) {
level_1 = level_0.get("*");
}
level_2 = level_1.get(l2);
if (level_2 == null) {
level_2 = level_1.get("*");
}
The Result will then be the value of level_2.

Related

Converting alpha numeric string to integer?

I have a hashMap that contains key and value as 'String'. I am getting these values from a web page in my selenium automation script.
my hashmap has following
<Italy, 3.3 millions>
<Venezuela, 30.69 millions>
<Japan, 127.1 millions>
How can I convert all the string alphanumeric values to integers so that I can apply sorting on the hashmap?
I have to display the word 'millions'.
As far as I understand from your question what you need to do is to be able to sort those values, so what you need is a Comparator.
Here is the Comparator that could do the trick:
Comparator<String> comparator = new Comparator<String>() {
@Override
public int compare(final String value1, final String value2) {
return Double.compare(
Double.parseDouble(value1.substring(0, value1.length() - 9)),
Double.parseDouble(value2.substring(0, value2.length() - 9))
);
}
};
System.out.println(comparator.compare("3.3 millions", "30.69 millions"));
System.out.println(comparator.compare("30.69 millions", "30.69 millions"));
System.out.println(comparator.compare("127.1 millions", "30.69 millions"));
Output:
-1
0
1
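To order the map's entries with this kind of comparison, one option is to copy the entries into a list and sort them by the parsed number. The following is a minimal sketch, assuming every value starts with a number followed by a space; PopulationSort and its methods are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class PopulationSort {
    // Parses the numeric part before the first space, e.g. "30.69 millions" -> 30.69.
    static double leadingNumber(String value) {
        return Double.parseDouble(value.substring(0, value.indexOf(' ')));
    }

    // Returns the entries ordered by their numeric value, smallest first.
    // The original strings are kept, so "millions" can still be displayed.
    static List<Map.Entry<String, String>> sortedByValue(Map<String, String> map) {
        List<Map.Entry<String, String>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Comparator.comparingDouble(
                (Map.Entry<String, String> e) -> leadingNumber(e.getValue())));
        return entries;
    }
}
```

A HashMap itself has no order, which is why the sorted result is returned as a list rather than written back into the map.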
If you have only millions you can try something like this
String str = "3.3 Millions";
String[] splitted = str.split(" ");
double i = Double.valueOf(splitted[0])*1000000;
System.out.println(i);
or do your calculation depending on the substring
Not sure if this is what you are looking for. If I understand correctly, you have to change your map from
<String, String> to <String, Double>.
See my example below:
import java.text.ParseException;
import java.util.HashMap;
import java.util.Map;
public class NewClass9 {
public static void main(String[] args) throws ParseException{
Map<String,String> oldMap = new HashMap<>();
oldMap.put("Italy", "3.3 millions");
oldMap.put("Venezuela", "30.69 millions");
oldMap.put("Japan", "127.1 millions");
Map<String,Double> newMap = new HashMap<>();
for(String key : oldMap.keySet()){
newMap.put(key, convert(oldMap.get(key)));
}
for(String key : newMap.keySet()){
// values were scaled by 1,000,000 in convert(), so scale back for display
System.out.printf("%.2f millions%n", newMap.get(key) / 1000000);
}
}
private static double convert(String str) {
String[] splitted = str.split(" ");
return Double.valueOf(splitted[0])*1000000;
}
}
A bit overkill but this should be extensible.
NB: I've only covered the multiplier lookup.
/**
* Possible units and their multipliers.
*/
enum Unit {
Unit(1),
Hundred(100),
Thousand(1000),
Million(1000000),
Billion(1000000000),
Squillion(Integer.MAX_VALUE);
private final int multiplier;
Unit(int multiplier) {
this.multiplier = multiplier;
}
}
/**
* Comparator that matches caseless and plurals
*
* NB: Not certain if this is consistent.
*/
private static final Comparator<String> COMPARECASELESSANDPLURALS
= (String o1, String o2) -> {
// Allow case difference AND plurals.
o1 = o1.toLowerCase();
o2 = o2.toLowerCase();
int diff = o1.compareTo(o2);
if (diff != 0) {
// One character different in length?
if (Math.abs(o1.length() - o2.length()) == 1) {
// Which may be plural?
if (o1.length() > o2.length()) {
// o1 might be plural.
if (o1.endsWith("s")) {
diff = o1.substring(0, o1.length() - 1).compareTo(o2);
}
} else if (o2.endsWith("s")) {
// o2 might be plural.
diff = -o2.substring(0, o2.length() - 1).compareTo(o1);
}
}
}
return diff;
};
// Build my lookup.
static final Map<String, Integer> MULTIPLIERS
= Arrays.stream(Unit.values())
// Collect into a Map
.collect(Collectors.toMap(
// From name of the enum.
u -> u.name(),
// To its multiplier.
u -> u.multiplier,
// Runtime exception in case of duplicates.
(k, v) -> {
throw new RuntimeException(String.format("Duplicate key %s", k));
},
// Use a TreeMap that ignores case and plural.
() -> new TreeMap<String, Integer>(COMPARECASELESSANDPLURALS)));
// Gives the multiplier for a word.
public Optional<Integer> getMultiplier(String word) {
return Optional.ofNullable(MULTIPLIERS.get(word));
}
public void test() {
String[] tests = {"Million", "Millions", "Thousand", "Aardvark", "billion", "billions", "squillion"};
for (String s : tests) {
System.out.println("multiplier(" + s + ") = " + getMultiplier(s).orElse(1));
}
}

How to join items of list, but use a different delimiter for the last item?

Given list like:
List<String> names = Lists.newArrayList("George", "John", "Paul", "Ringo");
I'd like to transform it to a string like this:
George, John, Paul and Ringo
I can do it with a rather clumsy StringBuilder approach like so:
String nameList = names.stream().collect(joining(", "));
if (nameList.contains(",")) {
StringBuilder builder = new StringBuilder(nameList);
builder.replace(nameList.lastIndexOf(','), nameList.lastIndexOf(',') + 1, " and");
return builder.toString();
}
Is there a bit more elegant approach? I don't mind using a library if needed.
NOTES:
I could use an old for loop with an index, but I am not looking for such a solution
There are no commas within the values (names)
I'm not sure how elegant this is, but it works. The annoying part is that you have to reverse the List.
List<String> list = Arrays.asList("George", "John", "Paul", "Ringo");
String andStr = " and ";
String commaStr = ", ";
int n = list.size();
String result = list.size() == 0 ? "" :
IntStream.range(0, n)
.mapToObj(i -> list.get(n - 1 - i))
.reduce((s, t) -> t + (s.contains(andStr) ? commaStr : andStr) + s)
.get();
System.out.println(result);
However, I think the best solution is this.
StringBuilder sb = new StringBuilder();
int n = list.size();
for (String string : list) {
sb.append(string);
if (--n > 0)
sb.append(n == 1 ? " and " : ", ");
}
System.out.println(sb);
It's clear, efficient, and obviously works. I don't think Streams are a good fit for this problem.
As you already did most of it I would introduce a second method "replaceLast" which is not in the JDK for java.lang.String so far:
import java.util.List;
import java.util.stream.Collectors;
public final class StringUtils {
private static final String AND = " and ";
private static final String COMMA = ", ";
// your initial call wrapped with a replaceLast call
public static String asLiteralNumeration(List<String> strings) {
return replaceLast(strings.stream().collect(Collectors.joining(COMMA)), COMMA, AND);
}
public static String replaceLast(String text, String regex, String replacement) {
return text.replaceFirst("(?s)" + regex + "(?!.*?" + regex + ")", replacement);
}
}
You might change the delimiters and params as well. Here the test for your requirements so far:
@org.junit.Test
public void test() {
List<String> names = Arrays.asList("George", "John", "Paul", "Ringo");
assertEquals("George, John, Paul and Ringo", StringUtils.asLiteralNumeration(names));
List<String> oneItemList = Arrays.asList("Paul");
assertEquals("Paul", StringUtils.asLiteralNumeration(oneItemList));
List<String> emptyList = Arrays.asList("");
assertEquals("", StringUtils.asLiteralNumeration(emptyList));
}
You may join all the elements except the last by using a sublist:
String nameList =
names.isEmpty() ? "" :
names.subList(0, names.size() - 1)
.stream()
.collect(Collectors.joining(firstDelimiter))
+ ((names.size() > 1) ? secondDelimiter + names.get(names.size() - 1) : names.get(0))
;
Personally, I don't like this approach because for non-array backed list implementations, the time of List#get may be O(n).
Tested here:
static String joinList(List<String> names) {
return joinList(names, ", ", " and ");
}
static String joinList(List<String> names, String firstDelimiter, String secondDelimiter) {
return names.isEmpty() ? "" :
names.subList(0, names.size() - 1)
.stream()
.collect(Collectors.joining(firstDelimiter))
+ ((names.size() > 1) ? secondDelimiter+ names.get(names.size() - 1) : names.get(0))
;
}
public static void main(String[] args) {
System.out.println(joinList(Arrays.asList("George", "John", "Paul", "Ringo")));
System.out.println(joinList(Arrays.asList("Ringo")));
System.out.println(joinList(Arrays.asList()));
System.out.println(joinList(null)); //this one throws NPE as OP oddly requested
}
Here's an alternative implementation of joinList, which may make it clearer what actually happens:
static String joinList(List<String> names, String firstDelimiter, String secondDelimiter) {
if (names.isEmpty()) {
return "";
} else if (names.size() == 1) {
return names.get(0);
} else {
return names.subList(0, names.size() - 1)
.stream().collect(Collectors.joining(firstDelimiter))
+ secondDelimiter + names.get(names.size() - 1);
}
}
If you don't mind using an Iterator, this works:
private static String specialJoin(Iterable<?> list, String sep, String lastSep) {
StringBuilder result = new StringBuilder();
final Iterator<?> i = list.iterator();
if (i.hasNext()) {
result.append(i.next());
while (i.hasNext()) {
final Object next = i.next();
result.append(i.hasNext() ? sep : lastSep);
result.append(next);
}
}
return result.toString();
}
It can probably be rewritten as a collector easily enough by someone who is familiar with that api.
Here's an elegant solution using the streams api:
String nameList = names.stream().collect(naturalCollector(", ", " and "));
Unfortunately, it depends on this function, which could be stashed away in some utility class:
public static Collector<Object, Ack, String> naturalCollector(String sep, String lastSep) {
return new Collector<Object, Ack, String>() {
@Override public BiConsumer<Ack, Object> accumulator() {
return (Ack a, Object o) -> a.add(o, sep);
}
@Override public Set<java.util.stream.Collector.Characteristics> characteristics() {
return Collections.emptySet();
}
@Override public BinaryOperator<Ack> combiner() {
return (Ack one, Ack other) -> one.merge(other, sep);
}
@Override public Function<Ack, String> finisher() {
return (Ack a) -> a.toString(lastSep);
}
@Override public Supplier<Ack> supplier() {
return Ack::new;
}
};
}
... and also on this class, which is an internal stateholder in the above function, but which the Collector API wants exposed:
class Ack {
private StringBuilder result = null;
private Object last;
public void add(Object u, String sep) {
if (last != null) {
doAppend(sep, last);
}
last = u;
}
private void doAppend(String sep, Object t) {
if (result == null) {
result = new StringBuilder();
} else {
result.append(sep);
}
result.append(t);
}
public Ack merge(Ack other, String sep) {
if (other.last != null) {
// move our own pending element into the result, unless this side is empty
if (last != null) {
doAppend(sep, last);
}
if (other.result != null) {
doAppend(sep, other.result);
}
last = other.last;
}
return this;
}
public String toString(String lastSep) {
if (result == null) {
return last == null ? "" : String.valueOf(last);
}
result.append(lastSep).append(last);
return result.toString();
}
}
If commas are never in the values, it's a one-liner:
String all = names.toString().replaceAll("^.|.$", "").replaceAll(",(?!.*,)", " and");
You can write a custom function to add the last delimiter, but for the delimiters in between you can use StringUtils.join() to accomplish your task. Check its API documentation for details.

implementing comparator to sort list of strings

I have a list of strings which I would like to sort not by their lexicographic order but by their weight (number of times the word appears in the specified URL / number of words in this URL).
The problem is with the method "searchPrefix": when I create a new Comparator, it obviously doesn't recognize the fields of the class that I use to calculate the weight.
Things I've tried:
1. Using SortedMap, so that there is no need to implement the Comparator, except that the instructions specifically say to implement the Comparator.
2. Using getters (also didn't work because I'm working within the class and the method).
3. Implementing the list as List> urlList = new ArrayList... also didn't work.
(The implementation of Comparator is what I would like to do)
How do I change it to work?
package il.ac.tau.cs.sw1.searchengine;
import java.util.*
public class MyWordIndex implements WordIndex {
public SortedMap<String, HashMap<String, Integer>> words;
public HashMap<String, Integer> urls;
public MyWordIndex() {
this.words = new TreeMap<String, HashMap<String, Integer>>();;
this.urls = new HashMap<String, Integer>();
}
@Override
public void index(Collection<String> words, String strURL) {
this.urls.put(strURL, words.size()); // to every page- how many words in it.
String subPrefix = "";
HashMap<String, Integer> help1; // how many times a word appears on that page
for (String word : words) {
if (word == null || word == "") // not a valid word
continue;
word.toLowerCase();
help1 = new HashMap<String, Integer>();
for (int i = 0; i < word.length(); i++) {
subPrefix = word.substring(0, i);
if (this.words.get(subPrefix) == null) { // new prefix
help1.put(strURL, 1);
this.words.put(subPrefix, help1);
}
else { // prefix exists
if (this.words.get(subPrefix).get(strURL) == null)//new URL with old prefix
this.words.get(subPrefix).put(strURL, 1);
else // both url and prefix exists
this.words.get(subPrefix).put(strURL, help1.get(strURL) + 1);
}
}
}
}
@Override
public List<String> searchPrefix(String prefix) {
prefix.toLowerCase();
List<String> urlList = new ArrayList<String>();
for (String word : this.words.keySet()) {
if (word.startsWith(prefix)) {
for (String strUrl : this.words.get(word).keySet()) {
urlList.add(strUrl);
}
}
}
Collections.sort(urlList, new Comparator<String>() {
@Override
public int compare(String strUrl1, String strUrl2) {
Double d1 = this.words.get(word).get(strUrl1) / this.urls.get(strUrl1);
Double d2 = this.words.get(word).get(strUrl2) / this.urls.get(strUrl2);
return Double.compare(d1, d2);
}
});
........
}
These changes take you closer to a solution.
Double d1 = MyWordIndex.this.words.get(word).get(strUrl1) / (double) MyWordIndex.this.urls.get(strUrl1);
Double d2 = MyWordIndex.this.words.get(word).get(strUrl2) / (double) MyWordIndex.this.urls.get(strUrl2);
I don't know what word is supposed to be though as there is no variable with that name in scope.
Suggestion for the for-loop in your index method:
for (int i = 1; i < word.length(); i++) { // no point starting at 0 - empty string
subPrefix = word.substring(0, i);
if (this.words.get(subPrefix) == null) { // new prefix
help1.put(strURL, 1);
this.words.put(subPrefix, help1);
}
else { // prefix exists
Integer count = this.words.get(subPrefix).get(strURL);
if (count == null)//new URL with old prefix
count = 0;
this.words.get(subPrefix).put(strURL, count + 1);
}
}
While we are on this, may I suggest Guava multiset which does this sort of counting for you automatically:
import com.google.common.collect.Multiset;
import com.google.common.collect.HashMultiset;
public class MultiTest{
public final Multiset<String> words;
public MultiTest() {
words = HashMultiset.create();
}
public static void main(String []args) {
MultiTest test = new MultiTest();
test.words.add("Mandible");
test.words.add("Incredible");
test.words.add("Commendable");
test.words.add("Mandible");
System.out.println(test.words.count("Mandible")); // 2
}
}
Finally to solve your problem, this should work, haven't tested:
@Override
public List<String> searchPrefix(String prefix) {
prefix = prefix.toLowerCase(); // Strings are immutable so this returns a new String
Map<String, Double> urlList = new HashMap<String, Double>();
for (String word : this.words.keySet()) {
if (word.startsWith(prefix)) {
for (String strUrl : this.words.get(word).keySet()) {
Double v = urlList.get(strUrl);
if (v == null) v = 0.0; // an int literal cannot be assigned to a Double
urlList.put(strUrl, v + this.words.get(word).get(strUrl));
}
}
}
List<String> myUrls = new ArrayList<String>(urlList.keySet());
Collections.sort(myUrls, new Comparator<String>() {
@Override
public int compare(String strUrl1, String strUrl2) {
return Double.compare(urlList.get(strUrl1) / MyWordIndex.this.urls.get(strUrl1),
urlList.get(strUrl2) / MyWordIndex.this.urls.get(strUrl2));
}
});
return myUrls;
}
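On Java 8 or later, the same ordering can be written without an anonymous class. This is a minimal sketch, not the assignment's required form: WeightSort, hits and totals are hypothetical names, where hits maps each URL to the number of matching words and totals maps each URL to its total word count.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class WeightSort {
    // Sorts URLs by weight = hits / total words, smallest weight first.
    // Assumes every URL in hits also has a non-zero entry in totals.
    static List<String> byWeight(Map<String, Integer> hits, Map<String, Integer> totals) {
        List<String> urls = new ArrayList<>(hits.keySet());
        // The cast to double avoids integer division, which would truncate to 0.
        urls.sort(Comparator.comparingDouble(
                (String url) -> hits.get(url) / (double) totals.get(url)));
        return urls;
    }
}
```

Because the lambda is not an inner class, there is no this to qualify, so the MyWordIndex.this prefix from the anonymous Comparator is not needed.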

Need a Fresh pair of Eyes to Work Out the Logic Behind Comparing Values in a Map of Maps

Problem
Data is in the format:
Map<String, Map<String, Integer>>
Which looks something like this:
{"MilkyWay": {"FirstStar" : 3, "SecondStar" : 9 .... }, "Andromeda": {"FirstStar" : 10, "SecondStar" : 9 .... } }
I want to compare the Star values in a quick loop, so I'd like to compare the integer value of FirstStar in MilkyWay and Andromeda and have it return true or false depending on whether the values are the same. It has to be quick, since this Map of Maps is huge.
My Attempt
I'd like to do it something like:
// GalaxyMap is a Map<String, Map<String, Integer>>
for (Map<String, Integer> _starMap : GalaxyMap.values())
{
for (String _starKey : _starMap.keySet()){
//Can't quite think of the correct logic... and I'm tired...
}
}
I'd like to keep it as short as possible... I've been staring at this for a while and I'm going in circles with it.
EDIT
Outer keys differ; inner keys are the same.
Also, since this is a response from a server, I don't know what size it's going to be.
Why does this need to be a map? If you're always using "FirstStar", "SecondStar", etc. as your keys, then why not make it a list instead:
Map<String, List<Integer>>
Then you can do something like:
public boolean compareGalaxyStar(String galaxyName, String otherGalaxyName, int star) {
List<Integer> galaxyStars = galaxyMap.get(galaxyName);
List<Integer> otherGalaxyStars = galaxyMap.get(otherGalaxyName);
// Use equals: == on Integer objects compares references, not values.
return galaxyStars.get(star).equals(otherGalaxyStars.get(star));
}
NOTE: You need to do some validation to make sure the input is correct.
To implement this for all stars, it is not much different.
if(galaxyStars.size() == otherGalaxyStars.size()) {
for(int x = 0; x < galaxyStars.size(); x++) {
// Perform your comparisons.
if(!galaxyStars.get(x).equals(otherGalaxyStars.get(x))) {
// Uh oh, unequal.
return false;
}
}
}
If the structure of the inner maps could also differ, you should do something like this:
static boolean allStarValuesEqual(Map<String, Map<String, Integer>> galaxies) {
Map<String, Integer> refStars = null;
for (Map<String, Integer> galaxy : galaxies.values()) {
if (refStars == null) {
refStars = galaxy;
} else {
for (Entry<String, Integer> currentStar : galaxy.entrySet()) {
if (!currentStar.getValue().equals(refStars.get(currentStar.getKey()))) {
return false;
}
}
}
}
return true;
}
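For the case stated in the question's edit (outer keys differ, inner keys are the same), a per-star check can also work directly on the nested map without converting it to lists. The following is only a minimal sketch: the class name StarCompare and the helpers sameStarValue and sample are hypothetical names introduced for illustration, and it assumes Java 7 or later for Objects.equals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class StarCompare {
    // Hypothetical helper (not from the original posts): true when the named
    // star is present in both galaxies and holds the same value in each.
    static boolean sameStarValue(Map<String, Map<String, Integer>> galaxies,
                                 String galaxyA, String galaxyB, String star) {
        Map<String, Integer> a = galaxies.get(galaxyA);
        Map<String, Integer> b = galaxies.get(galaxyB);
        if (a == null || b == null) {
            return false; // at least one galaxy is unknown
        }
        Integer valueA = a.get(star);
        // Objects.equals compares the Integer values, not the references.
        return valueA != null && Objects.equals(valueA, b.get(star));
    }

    // Sample data taken from the question.
    static Map<String, Map<String, Integer>> sample() {
        Map<String, Integer> milkyWay = new HashMap<>();
        milkyWay.put("FirstStar", 3);
        milkyWay.put("SecondStar", 9);
        Map<String, Integer> andromeda = new HashMap<>();
        andromeda.put("FirstStar", 10);
        andromeda.put("SecondStar", 9);
        Map<String, Map<String, Integer>> galaxies = new HashMap<>();
        galaxies.put("MilkyWay", milkyWay);
        galaxies.put("Andromeda", andromeda);
        return galaxies;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> galaxies = sample();
        System.out.println(sameStarValue(galaxies, "MilkyWay", "Andromeda", "FirstStar"));  // false
        System.out.println(sameStarValue(galaxies, "MilkyWay", "Andromeda", "SecondStar")); // true
    }
}
```

Each call costs three hash lookups regardless of how large the outer map is, so the size of the server response does not affect a single comparison.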
Please check the program below along with its output:
package com.test;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
public class CompareMapValues {
private final static String FS = "FirstStar";
private final static String SS = "SecondStar";
private final static String MW = "MilkyWay";
private final static String A = "Andromeda";
public static void main(String[] args) {
Map<String, Map<String, Integer>> map = new HashMap<String, Map<String, Integer>>();
Map<String, Integer> innerMap1 = new HashMap<String, Integer>();
innerMap1.put(FS, 3);
innerMap1.put(SS, 9);
Map<String, Integer> innerMap2 = new HashMap<String, Integer>();
innerMap2.put(FS, 10);
innerMap2.put(SS, 9);
map.put(MW, innerMap1);
map.put(A, innerMap2);
Set<String> set = map.keySet();
for(String s: set) {
Map<String, Integer> outerMap = map.get(s);
Set<String> set2 = map.keySet();
for(String s2: set2) {
Map<String, Integer> innerMap = map.get(s2);
if(!s2.equals(s)) {
Set<String> set3 = outerMap.keySet();
for(String s3: set3) {
int i1 = outerMap.get(s3);
Set<String> set4 = innerMap.keySet();
for(String s4: set4) {
int i2 = innerMap.get(s3);
if(s3.equals(s4) && i1==i2) {
System.out.println("For parent " + s + " for " + s3 + " value is " + i1);
}
}
}
}
}
}
}
}
//Output:
//For parent Andromeda for SecondStar value is 9
//For parent MilkyWay for SecondStar value is 9
Hope this helps.

Fastest way to get all values from a Map where the key starts with a certain expression

Suppose you have a Map<String, Object> myMap.
Given the expression "some.string.*", I have to retrieve all the values from myMap whose keys start with this expression.
I am trying to avoid for loops because myMap will be queried with a set of expressions, not just one, and running a for loop over the whole map for each expression is costly performance-wise.
What is the fastest way to do this?
If you use a NavigableMap (e.g. TreeMap), you can take advantage of the underlying tree data structure and do something like this (locating the range is O(log N)):
public SortedMap<String, Object> getByPrefix(
NavigableMap<String, Object> myMap,
String prefix ) {
return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
}
More expanded example:
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;
public class Test {
public static void main( String[] args ) {
TreeMap<String, Object> myMap = new TreeMap<String, Object>();
myMap.put( "111-hello", null );
myMap.put( "111-world", null );
myMap.put( "111-test", null );
myMap.put( "111-java", null );
myMap.put( "123-one", null );
myMap.put( "123-two", null );
myMap.put( "123--three", null );
myMap.put( "123--four", null );
myMap.put( "125-hello", null );
myMap.put( "125--world", null );
System.out.println( "111 \t" + getByPrefix( myMap, "111" ) );
System.out.println( "123 \t" + getByPrefix( myMap, "123" ) );
System.out.println( "123-- \t" + getByPrefix( myMap, "123--" ) );
System.out.println( "12 \t" + getByPrefix( myMap, "12" ) );
}
private static SortedMap<String, Object> getByPrefix(
NavigableMap<String, Object> myMap,
String prefix ) {
return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
}
}
Output is:
111 {111-hello=null, 111-java=null, 111-test=null, 111-world=null}
123 {123--four=null, 123--three=null, 123-one=null, 123-two=null}
123-- {123--four=null, 123--three=null}
12 {123--four=null, 123--three=null, 123-one=null, 123-two=null, 125--world=null, 125-hello=null}
I wrote a MapFilter recently for just such a need. You can also filter filtered maps, which makes them really useful.
If your expressions have common roots like "some.byte" and "some.string" then filtering by the common root first ("some." in this case) will save you a great deal of time. See main for some trivial examples.
Note that making changes to the filtered map changes the underlying map.
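import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
public class MapFilter<T> implements Map<String, T> {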
// The enclosed map -- could also be a MapFilter.
final private Map<String, T> map;
// Use a TreeMap for predictable iteration order.
// Store Map.Entry to reflect changes down into the underlying map.
// The Key is the shortened string. The entry.key is the full string.
final private Map<String, Map.Entry<String, T>> entries = new TreeMap<>();
// The prefix they are looking for in this map.
final private String prefix;
public MapFilter(Map<String, T> map, String prefix) {
// Store my backing map.
this.map = map;
// Record my prefix.
this.prefix = prefix;
// Build my entries.
rebuildEntries();
}
public MapFilter(Map<String, T> map) {
this(map, "");
}
private synchronized void rebuildEntries() {
// Start empty.
entries.clear();
// Build my entry set.
for (Map.Entry<String, T> e : map.entrySet()) {
String key = e.getKey();
// Retain each one that starts with the specified prefix.
if (key.startsWith(prefix)) {
// Key it on the remainder.
String k = key.substring(prefix.length());
// Entries k always contains the LAST occurrence if there are multiples.
entries.put(k, e);
}
}
}
@Override
public String toString() {
return "MapFilter(" + prefix + ") of " + map + " containing " + entrySet();
}
// Constructor from a properties file.
public MapFilter(Properties p, String prefix) {
// Properties extends Hashtable<Object,Object> so it implements Map.
// I need Map<String,T> so I wrap it in a HashMap for simplicity.
// Java-8 breaks if we use diamond inference.
this(new HashMap<>((Map) p), prefix);
}
// Helper to fast filter the map.
public MapFilter<T> filter(String prefix) {
// Wrap me in a new filter.
return new MapFilter<>(this, prefix);
}
// Count my entries.
@Override
public int size() {
return entries.size();
}
// Are we empty.
@Override
public boolean isEmpty() {
return entries.isEmpty();
}
// Is this key in me?
@Override
public boolean containsKey(Object key) {
return entries.containsKey(key);
}
// Is this value in me.
@Override
public boolean containsValue(Object value) {
// Walk the values.
for (Map.Entry<String, T> e : entries.values()) {
if (value.equals(e.getValue())) {
// Its there!
return true;
}
}
return false;
}
// Get the referenced value - if present.
@Override
public T get(Object key) {
return get(key, null);
}
// Get the referenced value - if present.
public T get(Object key, T dflt) {
Map.Entry<String, T> e = entries.get((String) key);
return e != null ? e.getValue() : dflt;
}
// Add to the underlying map.
@Override
public T put(String key, T value) {
T old = null;
// Do I have an entry for it already?
Map.Entry<String, T> entry = entries.get(key);
// Was it already there?
if (entry != null) {
// Yes. Just update it.
old = entry.setValue(value);
} else {
// Add it to the map.
map.put(prefix + key, value);
// Rebuild.
rebuildEntries();
}
return old;
}
// Get rid of that one.
@Override
public T remove(Object key) {
// Do I have an entry for it?
Map.Entry<String, T> entry = entries.get((String) key);
if (entry != null) {
entries.remove(key);
// Change the underlying map.
return map.remove(prefix + key);
}
return null;
}
// Add all of them.
@Override
public void putAll(Map<? extends String, ? extends T> m) {
for (Map.Entry<? extends String, ? extends T> e : m.entrySet()) {
put(e.getKey(), e.getValue());
}
}
// Clear everything out.
@Override
public void clear() {
// Just remove mine.
// This does not clear the whole underlying map; it only removes the filtered entries from it.
for (String key : entries.keySet()) {
map.remove(prefix + key);
}
entries.clear();
}
@Override
public Set<String> keySet() {
return entries.keySet();
}
@Override
public Collection<T> values() {
// Roll them all out into a new ArrayList.
List<T> values = new ArrayList<>();
for (Map.Entry<String, T> v : entries.values()) {
values.add(v.getValue());
}
return values;
}
@Override
public Set<Map.Entry<String, T>> entrySet() {
// Roll them all out into a new TreeSet.
Set<Map.Entry<String, T>> entrySet = new TreeSet<>();
for (Map.Entry<String, Map.Entry<String, T>> v : entries.entrySet()) {
entrySet.add(new Entry<>(v));
}
return entrySet;
}
/**
* An entry.
*
* @param <T> The type of the value.
*/
private static class Entry<T> implements Map.Entry<String, T>, Comparable<Entry<T>> {
// Note that entry in the entry is an entry in the underlying map.
private final Map.Entry<String, Map.Entry<String, T>> entry;
Entry(Map.Entry<String, Map.Entry<String, T>> entry) {
this.entry = entry;
}
@Override
public String getKey() {
return entry.getKey();
}
@Override
public T getValue() {
// Remember that the value is the entry in the underlying map.
return entry.getValue().getValue();
}
@Override
public T setValue(T newValue) {
// Remember that the value is the entry in the underlying map.
return entry.getValue().setValue(newValue);
}
@Override
public boolean equals(Object o) {
if (!(o instanceof Entry)) {
return false;
}
Entry e = (Entry) o;
return getKey().equals(e.getKey()) && getValue().equals(e.getValue());
}
@Override
public int hashCode() {
return getKey().hashCode() ^ getValue().hashCode();
}
@Override
public String toString() {
return getKey() + "=" + getValue();
}
@Override
public int compareTo(Entry<T> o) {
return getKey().compareTo(o.getKey());
}
}
// Simple tests.
public static void main(String[] args) {
String[] samples = {
"Some.For.Me",
"Some.For.You",
"Some.More",
"Yet.More"};
Map map = new HashMap();
for (String s : samples) {
map.put(s, s);
}
Map all = new MapFilter(map);
Map some = new MapFilter(map, "Some.");
Map someFor = new MapFilter(some, "For.");
System.out.println("All: " + all);
System.out.println("Some: " + some);
System.out.println("Some.For: " + someFor);
Properties props = new Properties();
props.setProperty("namespace.prop1", "value1");
props.setProperty("namespace.prop2", "value2");
props.setProperty("namespace.iDontKnowThisNameAtCompileTime", "anothervalue");
props.setProperty("someStuff.morestuff", "stuff");
Map<String, String> filtered = new MapFilter(props, "namespace.");
System.out.println("namespace props " + filtered);
}
}
The accepted answer works in 99% of all the cases, but the devil is in the details.
Specifically, the accepted answer does not work when the map has a key that begins with the prefix, followed by Character.MAX_VALUE, followed by anything else. The comments posted on the accepted answer yield small improvements, but still do not cover all cases.
The following solution also uses NavigableMap to pick out a sub map for a given key prefix. The solution is the subMapFrom() method. The trick is not to bump the last char of the prefix, but rather the last char that is not MAX_VALUE, while cutting off all trailing MAX_VALUEs. For example, if the prefix is "abc", we increment it to "abd". But if the prefix is "ab" + MAX_VALUE, we drop the last char and bump the preceding char instead, resulting in "ac".
import static java.lang.Character.MAX_VALUE;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Stream;
public class App
{
public static void main(String[] args) {
NavigableMap<String, String> map = new TreeMap<>();
String[] keys = {
"a",
"b",
"b" + MAX_VALUE,
"b" + MAX_VALUE + "any",
"c"
};
// Populate map
Stream.of(keys).forEach(k -> map.put(k, ""));
// For each key that starts with 'b', find the sub map
Stream.of(keys).filter(s -> s.startsWith("b")).forEach(p -> {
System.out.println("Looking for sub map using prefix \"" + p + "\".");
// Always returns expected sub maps with no misses
// [b, b￿, b￿any], [b￿, b￿any] and [b￿any]
System.out.println("My solution: " +
subMapFrom(map, p).keySet());
// WRONG! Prefix "b" misses "b￿any"
System.out.println("SO answer: " +
map.subMap(p, true, p + MAX_VALUE, true).keySet());
// WRONG! Prefix "b￿" misses "b￿" and "b￿any"
System.out.println("SO comment: " +
map.subMap(p, true, tryIncrementLastChar(p), false).keySet());
System.out.println();
});
}
private static <V> NavigableMap<String, V> subMapFrom(
NavigableMap<String, V> map, String keyPrefix)
{
final String fromKey = keyPrefix, toKey; // undefined
// Alias
String p = keyPrefix;
if (p.isEmpty()) {
// No need for a sub map
return map;
}
// ("ab" + MAX_VALUE + MAX_VALUE + ...) returns index 1
final int i = lastIndexOfNonMaxChar(p);
if (i == -1) {
// Prefix is all MAX_VALUE through and through, so grab rest of map
return map.tailMap(p, true);
}
if (i < p.length() - 1) {
// Target char for bumping is not last char; cut out the residue
// ("ab" + MAX_VALUE + MAX_VALUE + ...) becomes "ab"
p = p.substring(0, i + 1);
}
toKey = bumpChar(p, i);
return map.subMap(fromKey, true, toKey, false);
}
private static int lastIndexOfNonMaxChar(String str) {
int i = str.length();
// Walk backwards, while we have a valid index
while (--i >= 0) {
if (str.charAt(i) < MAX_VALUE) {
return i;
}
}
return -1;
}
private static String bumpChar(String str, int pos) {
assert !str.isEmpty();
assert pos >= 0 && pos < str.length();
final char c = str.charAt(pos);
assert c < MAX_VALUE;
StringBuilder b = new StringBuilder(str);
b.setCharAt(pos, (char) (c + 1));
return b.toString();
}
private static String tryIncrementLastChar(String p) {
char l = p.charAt(p.length() - 1);
return l == MAX_VALUE ?
// Last character already max, do nothing
p :
// Bump last character
p.substring(0, p.length() - 1) + ++l;
}
}
Output:
Looking for sub map using prefix "b".
My solution: [b, b￿, b￿any]
SO answer: [b, b￿]
SO comment: [b, b￿, b￿any]
Looking for sub map using prefix "b￿".
My solution: [b￿, b￿any]
SO answer: [b￿, b￿any]
SO comment: []
Looking for sub map using prefix "b￿any".
My solution: [b￿any]
SO answer: [b￿any]
SO comment: [b￿any]
I should perhaps add that I also tried various other approaches, including code I found elsewhere on the internet. All of them either yielded an incorrect result or crashed outright with various exceptions.
Remove all keys that do not start with your desired prefix:
yourMap.keySet().removeIf(key -> !key.startsWith(keyPrefix));
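Note that removeIf on the key set modifies the map itself. If the original map must stay intact, a filtered copy is one alternative. The following is only a sketch, assuming Java 8 streams; the class name PrefixFilter and the helpers withPrefix and sample are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class PrefixFilter {
    // Hypothetical helper: returns a new map holding only the entries whose
    // key starts with keyPrefix. The source map is left untouched.
    static <V> Map<String, V> withPrefix(Map<String, V> source, String keyPrefix) {
        return source.entrySet().stream()
                .filter(e -> e.getKey().startsWith(keyPrefix))
                // Collectors.toMap throws NullPointerException on null values,
                // so this sketch assumes the map holds no null values.
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Made-up sample data for the demonstration.
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put("some.string.a", "1");
        map.put("some.string.b", "2");
        map.put("some.byte.c", "3");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> source = sample();
        Map<String, String> filtered = withPrefix(source, "some.string.");
        System.out.println(filtered.size()); // 2
        System.out.println(source.size());   // 3, the source is unchanged
    }
}
```

This still visits every key once per prefix, so it does not change the linear cost of the scan; it only avoids the destructive update.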
A plain map's key set has no special structure, so I think you have to check each of the keys anyway. That means you can't find a way that is faster than a single loop over the keys.
I used this code to do a speed trial:
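import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;
public class KeyFinder {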
private static Random random = new Random();
private interface Receiver {
void receive(String value);
}
public static void main(String[] args) {
for (int trials = 0; trials < 10; trials++) {
doTrial();
}
}
private static void doTrial() {
final Map<String, String> map = new HashMap<String, String>();
giveRandomElements(new Receiver() {
public void receive(String value) {
map.put(value, null);
}
}, 10000);
final Set<String> expressions = new HashSet<String>();
giveRandomElements(new Receiver() {
public void receive(String value) {
expressions.add(value);
}
}, 1000);
int hits = 0;
long start = System.currentTimeMillis();
for (String expression : expressions) {
for (String key : map.keySet()) {
if (key.startsWith(expression)) {
hits++;
}
}
}
long stop = System.currentTimeMillis();
System.out.printf("Found %s hits in %s ms\n", hits, stop - start);
}
private static void giveRandomElements(Receiver receiver, int count) {
for (int i = 0; i < count; i++) {
String value = String.valueOf(random.nextLong());
receiver.receive(value);
}
}
}
The output was:
Found 0 hits in 1649 ms
Found 0 hits in 1626 ms
Found 0 hits in 1389 ms
Found 0 hits in 1396 ms
Found 0 hits in 1417 ms
Found 0 hits in 1388 ms
Found 0 hits in 1377 ms
Found 0 hits in 1395 ms
Found 0 hits in 1399 ms
Found 0 hits in 1357 ms
This counts how many of 10000 random keys start with any one of 1000 random String values (10M checks).
So about 1.4 seconds on a simple dual core laptop; is that too slow for you?
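For comparison, the same kind of count can be sketched against a sorted key set, reusing the range idea from the NavigableMap answer above. This is an illustration rather than a measured benchmark: the class name SortedPrefixCount and the method countHits are hypothetical, and the upper bound prefix + Character.MAX_VALUE is only safe under the assumption that no key contains Character.MAX_VALUE (the edge case discussed in the earlier answer).

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class SortedPrefixCount {
    // Hypothetical helper: counts (prefix, key) pairs where the key starts
    // with the prefix, using range views on a sorted set instead of a full scan.
    // Assumption: no key contains Character.MAX_VALUE.
    static int countHits(TreeSet<String> keys, List<String> prefixes) {
        int hits = 0;
        for (String prefix : prefixes) {
            // All keys in [prefix, prefix + MAX_VALUE] share the prefix.
            hits += keys.subSet(prefix, true, prefix + Character.MAX_VALUE, true).size();
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>(
                Arrays.asList("111-java", "123-one", "123-two", "125-hello"));
        // "12" matches three keys and "111" matches one.
        System.out.println(countHits(keys, Arrays.asList("12", "111"))); // 4
    }
}
```

With sorted keys, each prefix only touches the keys inside its range, whereas the nested loop in the trial above inspects every key for every expression.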
