Finding combinations of a HashSet of Strings

I have a HashMap<String, HashSet>. The String stores the name of the person, and the HashSet stores the list of people that are friends with the person.
KEY<STRING> VALUE<HASHSET>
Dave Steve
Steve Dave
Bob Dalton
Dalton Bob, Sue
Anne Sue
Sue Dalton, Anne
In the above data, Dave is friends with Steve (lines 1 and 2). From line 4, Dalton is friends with Bob and Sue. However, Bob and Sue are not necessarily friends yet. The program needs to record Bob and Sue as friends: Bob should be added to Sue's friend list and Sue should be added to Bob's friend list. Dalton's friend list may contain an arbitrarily large number of people. I am also not allowed to store the friend list data in an array or an ArrayList.
One solution I was considering (but haven't tried) is to edit my read(String name1, String name2) method. (Note: in my runner class, whenever this method is called, it is called as both read(name1, name2) and read(name2, name1).) In short, this method takes the two names of a friendship and adds the friendship to the map. In the else block (where name1 is already a key in the HashMap), I was thinking of adding code that takes the existing friend list of name1 (which will only have one value) and calls read again.
Here's the read method, if you need it:
private Map<String, Set> friends;
// Adds two friends, name1 and name2, to the HashMap of friendships
public void read(String name1, String name2)
{
// Temporary HashSet in case a person has more than one friend
Set<String> strSet = new HashSet<String>();
if (!friends.containsKey(name1))
{
strSet.add(name2);
friends.put(name1, strSet);
}
else
{
strSet.clear();
// Set strSet to the current friendlist of name1
strSet = friends.get(name1);
strSet.add(name2);
// Make a new entry in the HashMap with name1 and the updated friend list
friends.put(name1, strSet);
}
}
Another solution (going off of the title of this thread) is to find all the possible combinations of the friendlist. e.g. if Dalton has Bob, Sue, and Dave in his friend list, I could have a method that finds all possible combinations of two way friendships (remember, order doesn't matter):
Bob Sue
Bob Dave
Sue Dave
However, I don't know how to code this. Any suggestions?

The second solution you described is equivalent to a disjoint-set data structure. Your friends end up being in sets, where everyone in each set is friends with everyone else in that set and no one else.
The tricky part of implementing this data structure is merging two sets when you discover that two people in different sets are friends.
This is a naive implementation:
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
public class DisjointFriendSet {
    private final Map<String, Set<String>> personToFriends = new HashMap<>();
    /**
     * Includes the person themselves in their group of friends.
     *
     * If no friendships have been registered for this person, then returns a set
     * containing just themselves.
     *
     * @param person the person to look up
     * @return the group of friends that contains the person
     */
    public Set<String> getFriends(String person) {
        if (personToFriends.containsKey(person)) {
            return personToFriends.get(person);
        } else {
            final Set<String> result = new HashSet<>();
            result.add(person);
            return result;
        }
    }
    public void addFriendship(String person1, String person2) {
        final Set<String> friends1 = getFriends(person1);
        final Set<String> friends2 = getFriends(person2);
        if (friends1 == friends2) {
            // Already in the same group, nothing to merge.
            return;
        } else {
            // Merge group 2 into group 1, then repoint every member of group 2.
            personToFriends.put(person1, friends1);
            friends1.addAll(friends2);
            for (String person : friends2) {
                personToFriends.put(person, friends1);
            }
        }
    }
    /**
     *
     * @return All unique friendship groups
     */
    public Collection<Set<String>> getAllFriendshipGroups() {
        return new HashSet<>(personToFriends.values());
    }
    public static void main(String[] args) {
        final DisjointFriendSet disjointFriendSet = new DisjointFriendSet();
        disjointFriendSet.addFriendship("Alice", "Beowulf");
        disjointFriendSet.addFriendship("Charity", "Donald");
        disjointFriendSet.addFriendship("Eduardo", "Frank");
        disjointFriendSet.addFriendship("Grendel", "Harriet");
        System.out.println("Friendship groups: " + disjointFriendSet.getAllFriendshipGroups());
        System.out.println("Adding friendship between Grendel and Beowulf");
        disjointFriendSet.addFriendship("Grendel", "Beowulf");
        System.out.println("Friendship groups: " + disjointFriendSet.getAllFriendshipGroups());
        System.out.println();
        for (String person : new String[]{"Alice", "Beowulf", "Charity", "Donald", "Eduardo", "Frank", "Grendel", "Harriet", "Zod"}) {
            System.out.println(person + "'s friends: " + disjointFriendSet.getFriends(person));
        }
    }
}

Here's a pseudo code approach (I can look up the Java later)
Assume at each addition that all friends of friends are properly matched.
Take the two new inputs, and create a temporary collection of all of their friends as well as the input values.
For every value in the temporary collection, add every other value as a friend. (A set should only maintain unique values, but you can explicitly check if need be).
This may not be the most efficient solution (at every step, half of the additions would be duplicates), but it should be a starting point.
Func (friend1, friend2)
tempSet = masterSet(friend1).Hashset
UNION
masterSet(friend2).Hashset
UNION
friend1, friend2
foreach (friend in tempSet)
foreach(otherFriend in tempSet - friend)
friend.AddFriend(otherFriend)
otherFriend.AddFriend(friend)
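For what it's worth, the pseudocode above could be sketched in Java roughly like this. The class name FriendClosure and its method names are made up for illustration and are not the asker's code. It only uses sets, since arrays and ArrayList were ruled out, and it assumes nobody should appear in their own friend list.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical class; mirrors the pseudocode, not the asker's actual program.
class FriendClosure {
    private final Map<String, Set<String>> friends = new HashMap<>();

    // Returns the live friend set for a person, creating an empty one on first use.
    private Set<String> friendsOf(String person) {
        return friends.computeIfAbsent(person, k -> new HashSet<>());
    }

    // Records a friendship and makes everyone in both circles friends with each other.
    void addFriendship(String friend1, String friend2) {
        // tempSet = friends(friend1) UNION friends(friend2) UNION {friend1, friend2}
        Set<String> tempSet = new HashSet<>(friendsOf(friend1));
        tempSet.addAll(friendsOf(friend2));
        tempSet.add(friend1);
        tempSet.add(friend2);

        for (String friend : tempSet) {
            Set<String> own = friendsOf(friend);
            own.addAll(tempSet);   // duplicates are ignored by the set
            own.remove(friend);    // assumption: nobody is their own friend
        }
    }

    // Returns a copy, so callers cannot change the internal sets.
    Set<String> getFriends(String person) {
        return new HashSet<>(friends.getOrDefault(person, new HashSet<>()));
    }

    public static void main(String[] args) {
        FriendClosure fc = new FriendClosure();
        fc.addFriendship("Dalton", "Bob");
        fc.addFriendship("Dalton", "Sue");
        System.out.println("Bob's friends: " + fc.getFriends("Bob"));
        System.out.println("Sue's friends: " + fc.getFriends("Sue"));
    }
}
```

After the two calls in main, Bob's set contains Sue and Sue's set contains Bob, which is the behaviour asked for in the question. The print order of a HashSet is not guaranteed.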

Related

Shuffling groups of elements in an arraylist

I am working on a group generator and currently I am making an ArrayList from this txt file.
So that, the ArrayList is in the form of [PedroA, Brazil, Male, 10G, Saadia...]
I want to shuffle 4 elements at a time, to randomize this arraylist.
I am storing the info in
ArrayList<String> studentInfo = info.readEachWord(className);

This is very hard to do. It's possible, of course, but difficult.
It is being made difficult because what you want to do is bizarre.
The normal way to do this would be to:
Make a class representing a single entry, let's call it class Person.
Read this data by parsing each line into a single Person instance, and add them all to a list.
Just call Collections.shuffle(list); to shuffle them.
If we have the above, we could do what you want, by then converting your List<Person> back into a List<String>. In many ways this is the simplest way to do the task you ask for, but then you start wondering why you want this data in the form of a list of strings in the first place.
enum Gender {
    MALE, FEMALE, OTHER;
    public static Gender parse(String in) {
        switch (in.toLowerCase()) {
            case "male": return MALE;
            case "female": return FEMALE;
            default: return OTHER;
        }
    }
}
class Person {
    String name;
    String location;
    Gender gender;
    // [some type that properly represents whatever 10G and 10W means];
    public static Person readLine(String line) {
        String[] parts = line.split("\\s+", 4);
        Person p = new Person();
        p.name = parts[0];
        p.location = parts[1];
        p.gender = Gender.parse(parts[2]);
        // ...;
        return p;
    }
}
You get the idea.
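If the flat list of strings really has to stay, a more direct route than the Person class is to cut the list into blocks of four, shuffle the blocks, and flatten them again. This is a different technique from the one described above, sketched under the assumption that every student contributes exactly four consecutive entries. GroupShuffler, shuffleInGroups and the sample values in main are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper; assumes the list length is a multiple of groupSize.
class GroupShuffler {
    // Shuffles whole blocks of groupSize entries; the order inside a block is kept.
    static List<String> shuffleInGroups(List<String> flat, int groupSize, Random rnd) {
        if (groupSize <= 0 || flat.size() % groupSize != 0) {
            throw new IllegalArgumentException(
                    "list size must be a multiple of a positive group size");
        }
        // Cut the flat list into blocks.
        List<List<String>> groups = new ArrayList<>();
        for (int i = 0; i < flat.size(); i += groupSize) {
            groups.add(new ArrayList<>(flat.subList(i, i + groupSize)));
        }
        // Shuffle the blocks, not the individual entries.
        Collections.shuffle(groups, rnd);
        // Flatten back into a single list.
        List<String> result = new ArrayList<>(flat.size());
        for (List<String> group : groups) {
            result.addAll(group);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data in the name, country, gender, class layout.
        List<String> studentInfo = Arrays.asList(
                "StudentA", "Brazil", "Male", "10G",
                "StudentB", "Canada", "Female", "10W",
                "StudentC", "Japan", "Male", "10G");
        System.out.println(shuffleInGroups(studentInfo, 4, new Random()));
    }
}
```

Passing the Random in makes the shuffle repeatable in tests (use a fixed seed there).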

Optimizing Stream with lambda

What would be the best way of optimizing the below code further
public List<GroupDTOv2> getAllGroups(String xTenantId, CourseType courseType, String courseId, ContextType contextType, String contextId) throws AuthenticationException {
final List<GroupV2> groups = groupV2Repository.findByTenantIdAndCourseTypeAndCourseIdAndContextTypeAndContextId(xTenantId, courseType, courseId, contextType, contextId);
final RosterDTOv2 roster = rosterServiceFacade.getRoster(xTenantId, courseType, courseId, contextType, contextId);
final ArrayList<GroupDTOv2> groupDtoList=new ArrayList<>();
groups.stream().forEach(group -> {
final GroupDTOv2 groupDTO=new GroupDTOv2();
BeanUtils.copyProperties(group,groupDTO);
roster.getUsers().forEach(userDTOv2 -> {
if(userDTOv2.getUserId().equalsIgnoreCase(group.getTeamLeadId())){
groupDTO.setTeamLead(userDTOv2);
}
if(group.getTeamMemberIds().contains(userDTOv2.getUserId())){
groupDTO.getTeamMembers().add(userDTOv2);
}
});
groupDtoList.add(groupDTO);
});
return groupDtoList;
}
If we use a stream twice to set the team-lead object and the team members, I think the cost would be high. In that case, what would be the most appropriate way?

You seem to have quadratic complexity (see note 1 below) for finding the matching leaders and team members. Consider putting the users into a Map, mapping user IDs to actual users:
Map<String, UserDTOv2> userMap = roster.getUsers().stream()
.collect(Collectors.toMap(user -> user.getUserId().toLowerCase(),
user -> user));
Then, you do not need the inner loops and can instead just look up the leader and members. Also, instead of forEach and then groupDtoList.add, you could just use map and collect.
List<GroupDTOv2> groupDtoList = groups.stream().map(group -> {
GroupDTOv2 groupDTO = new GroupDTOv2();
BeanUtils.copyProperties(group, groupDTO);
groupDTO.setTeamLead(userMap.get(group.getTeamLeadId().toLowerCase()));
group.getTeamMemberIds().forEach(id -> {
groupDTO.getTeamMembers().add(userMap.get(id.toLowerCase()));
});
return groupDTO;
}).collect(Collectors.toList());
Note, however, that the behaviour is not exactly the same as in your code. This assumes that (a) no two users have the same ID, and (b) that the roster will actually contain a matching user for the group's leader and each of its members. Yours would allow for duplicate IDs or no matching users and would just pick the last matching leader or omit members if no matching user can be found.
1) Not really quadratic, but O(n*m), with n being the number of groups and m the number of users.
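If assumptions (a) and (b) might not hold, the lookup can be made tolerant of both. The sketch below uses a stand-in User class rather than the real UserDTOv2, and the names RosterLookup, indexById and resolveMembers are invented. It keeps the later user on duplicate IDs and silently skips member IDs that have no matching user. Note that it compares member IDs case-insensitively, which the original contains() check did not, and it returns members in ID order rather than roster order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical sketch; User stands in for UserDTOv2.
class RosterLookup {
    static final class User {
        private final String userId;

        User(String userId) {
            this.userId = userId;
        }

        String getUserId() {
            return userId;
        }
    }

    // Builds the ID -> user index; on duplicate IDs the later user wins.
    static Map<String, User> indexById(List<User> users) {
        return users.stream().collect(Collectors.toMap(
                user -> user.getUserId().toLowerCase(Locale.ROOT),
                user -> user,
                (first, second) -> second));
    }

    // Looks up each member ID and drops the ones without a matching user.
    static List<User> resolveMembers(Map<String, User> byId, List<String> memberIds) {
        return memberIds.stream()
                .map(id -> byId.get(id.toLowerCase(Locale.ROOT)))
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, User> byId = indexById(Arrays.asList(new User("U1"), new User("u2")));
        List<User> members = resolveMembers(byId, Arrays.asList("u1", "missing", "U2"));
        System.out.println(members.size() + " members resolved");
    }
}
```

Without the third argument, Collectors.toMap throws an IllegalStateException on a duplicate key, so the merge function is what makes duplicate IDs safe here.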

Create a HashMap with a fixed Key corresponding to a HashSet. point of departure

My aim is to create a hashmap with a String as the key, and the entry values as a HashSet of Strings.
OUTPUT
This is what the output looks like now:
Hudson+(surname)=[Q2720681], Hudson,+Quebec=[Q141445], Hudson+(given+name)=[Q5928530], Hudson,+Colorado=[Q2272323], Hudson,+Illinois=[Q2672022], Hudson,+Indiana=[Q2710584], Hudson,+Ontario=[Q5928505], Hudson,+Buenos+Aires+Province=[Q10298710], Hudson,+Florida=[Q768903]]
According to my idea, it should look like this:
[Hudson+(surname)=[Q2720681,Q141445,Q5928530,Q2272323,Q2672022]]
The purpose is to store a particular name in Wikidata and then all of the Q values associated with its disambiguation, so for example:
This is the page for "Bush".
I want Bush to be the Key, and then for all of the different points of departure, all of the different ways that Bush could be associated with a terminal page of Wikidata, I want to store the corresponding "Q value", or unique alpha-numeric identifier.
What I'm actually doing is trying to scrape the different names, values, from the wikipedia disambiguation and then look up the unique alpha-numeric identifier associated with that value in wikidata.
For example, with Bush we have:
George H. W. Bush
George W. Bush
Jeb Bush
Bush family
Bush (surname)
Accordingly the Q values are:
George H. W. Bush (Q23505)
George W. Bush (Q207)
Jeb Bush (Q221997)
Bush family (Q2743830)
Bush (Q1484464)
My idea is that the data structure should be construed in the following way
Key:Bush
Entry Set: Q23505, Q207, Q221997, Q2743830, Q1484464
But the code I have now doesn't do that.
It creates a separate entry for each name and Q value, i.e.
Key:Jeb Bush
Entry Set: Q221997
Key:George W. Bush
Entry Set: Q207
and so on.
The full code in all its glory can be seen on my GitHub page, but I'll summarize it below as well.
This is what I'm using to add values to my data structure:
// add Q values to their arrayList in the hash map at the index of the appropriate entity
public static HashSet<String> put_to_hash(String key, String value)
{
if (!q_valMap.containsKey(key))
{
return q_valMap.put(key, new HashSet<String>() );
}
HashSet<String> list = q_valMap.get(key);
list.add(value);
return q_valMap.put(key, list);
}
This is how I fetch the content:
while ((line_by_line = wiki_data_pagecontent.readLine()) != null)
{
// if we can determine it's a disambig page we need to send it off to get all
// the possible senses in which it can be used.
Pattern disambig_pattern = Pattern.compile("<div class=\"wikibase-entitytermsview-heading-description \">Wikipedia disambiguation page</div>");
Matcher disambig_indicator = disambig_pattern.matcher(line_by_line);
if (disambig_indicator.matches())
{
//off to get the different usages
Wikipedia_Disambig_Fetcher.all_possibilities( variable_entity );
}
else
{
//get the Q value off the page by matching
Pattern q_page_pattern = Pattern.compile("<!-- wikibase-toolbar --><span class=\"wikibase-toolbar-container\"><span class=\"wikibase-toolbar-item " +
"wikibase-toolbar \">\\[<span class=\"wikibase-toolbar-item wikibase-toolbar-button wikibase-toolbar-button-edit\"><a " +
"href=\"/wiki/Special:SetSiteLink/(.*?)\">edit</a></span>\\]</span></span>");
Matcher match_Q_component = q_page_pattern.matcher(line_by_line);
if ( match_Q_component.matches() )
{
String Q = match_Q_component.group(1);
// 'Q' should be appended to an array, since each entity can hold multiple
// Q values on that basis of disambig
put_to_hash( variable_entity, Q );
}
}
}
and this is how I deal with a disambiguation page:
public static void all_possibilities( String variable_entity ) throws Exception
{
System.out.println("this is a disambig page");
//if it's a disambig page we know we can go right to the wikipedia
//get it's normal wiki disambig page
Document docx = Jsoup.connect( "https://en.wikipedia.org/wiki/" + variable_entity ).get();
//this can handle the less structured ones.
Elements linx = docx.select( "p:contains(" + variable_entity + ") ~ ul a:eq(0)" );
for (Element linq : linx)
{
System.out.println(linq.text());
String linq_nospace = linq.text().replace(' ', '+');
Wikidata_Q_Reader.getQ( linq_nospace );
}
}
I was thinking maybe I could pass the Key value around, but I really don't know. I'm kind of stuck. Maybe someone can see how I can implement this functionality.

I'm not clear from your question what isn't working, or if you're seeing actual errors. But, while your basic data structure idea (HashMap of String to Set<String>) is sound, there's a bug in the "add" function.
public static HashSet<String> put_to_hash(String key, String value)
{
if (!q_valMap.containsKey(key))
{
return q_valMap.put(key, new HashSet<String>() );
}
HashSet<String> list = q_valMap.get(key);
list.add(value);
return q_valMap.put(key, list);
}
In the case where a key is seen for the first time (if (!q_valMap.containsKey(key))), it vivifies a new HashSet for that key, but it doesn't add value to it before returning. (And the returned value is the old value for that key, so it'll be null.) So you're going to be losing one of the Q-values for every term.
For multi-layered data structures like this, I usually special-case just the vivification of the intermediate structure, and then do the adding and return in a single code path. I think this would fix it. (I'm also going to call it valSet because it's a set and not a list. And there's no need to re-add the set to the map each time; it's a reference type and gets added the first time you encounter that key.)
public static HashSet<String> put_to_hash(String key, String value)
{
if (!q_valMap.containsKey(key)) {
q_valMap.put(key, new HashSet<String>());
}
HashSet<String> valSet = q_valMap.get(key);
valSet.add(value);
return valSet;
}
Also be aware that the Set you return is a reference to the live Set for that key, so you need to be careful about modifying it in callers, and if you're doing multithreading you're going to have concurrent access issues.
Or just use a Guava Multimap so you don't have to worry about writing the implementation yourself.
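If you are on Java 8 or later and would rather stay with the plain JDK, Map.computeIfAbsent does the "create the set on first sight" step for you. A minimal sketch, with a made-up QValueIndex class wrapped around the same map idea (the class and method names are not from the question's code):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical wrapper; only the map-of-sets idea comes from the question.
class QValueIndex {
    private final Map<String, Set<String>> qValMap = new HashMap<>();

    // Adds value under key; returns true if the value was not already stored.
    boolean put(String key, String value) {
        return qValMap.computeIfAbsent(key, k -> new HashSet<>()).add(value);
    }

    // Returns a read-only view, so callers cannot modify the live set.
    Set<String> get(String key) {
        return Collections.unmodifiableSet(
                qValMap.getOrDefault(key, Collections.emptySet()));
    }

    public static void main(String[] args) {
        QValueIndex index = new QValueIndex();
        index.put("Bush", "Q23505");
        index.put("Bush", "Q207");
        index.put("Bush", "Q207"); // duplicate, ignored by the set
        System.out.println(index.get("Bush").size()); // prints 2
    }
}
```

Returning an unmodifiable view also addresses the point above about callers changing the live set. It does not make the class thread-safe.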

Which collections to use?

Suppose I want to store phone numbers of persons. Which kind of collection should I use for key value pairs? And it should be helpful for searching. The name may get repeated, so there may be the same name having different phone numbers.

If you want to store key-value pairs, a good choice is a Map rather than a plain Collection.
So what should that map store?
Start with the key. The first thing to ensure is that your key is unique, so that two different persons never share a key.
class Person {
long uniqueID;
String name;
String lastname;
}
So we will use the uniqueID of Person as the key.
What about the value?
This case is harder, as a single Person can have many phone numbers. But for a simple start, let's assume that a person can have only one phone number. Then what you are looking for is
class PhoneNumberRegistry {
Map<Long,String> phoneRegistry = new HashMap<>();
}
where the Long key is taken from the person's uniqueID. If you use Person itself as the key instead, you should implement the hashCode and equals methods in Person.
Then your registry could look like
class PhoneNumberRegistry {
Map<Person,String> phoneRegistry = new HashMap<>();
}
If you want to store more than one number per person, you will need to change the type of the value in the map.
You can use a Set<String> to store multiple numbers without duplicates. But to have full control, you should introduce a new type that stores not only the number but also what kind of number it is.
class PhoneNumberRegistry {
Map<Person,HashSet<String>> phoneRegistry = new HashMap<>();
}
But then you will have to solve various problems, such as: which phone number should I return?
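To make the equals/hashCode point concrete, here is a rough sketch of the last registry variant. It assumes that two Person objects with the same uniqueID are the same person, whatever their names. The PhoneBook wrapper and its method names are invented for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; equality is based on uniqueID only (an assumption).
class PhoneBook {
    static final class Person {
        final long uniqueID;
        final String name;
        final String lastname;

        Person(long uniqueID, String name, String lastname) {
            this.uniqueID = uniqueID;
            this.name = name;
            this.lastname = lastname;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Person && ((Person) o).uniqueID == uniqueID;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(uniqueID);
        }
    }

    private final Map<Person, Set<String>> phoneRegistry = new HashMap<>();

    // Adds a number for the person; duplicates are ignored by the set.
    void addNumber(Person person, String number) {
        phoneRegistry.computeIfAbsent(person, k -> new LinkedHashSet<>()).add(number);
    }

    // Returns all numbers of the person in insertion order, or an empty set.
    Set<String> numbersOf(Person person) {
        return Collections.unmodifiableSet(
                phoneRegistry.getOrDefault(person, Collections.emptySet()));
    }

    public static void main(String[] args) {
        PhoneBook book = new PhoneBook();
        book.addNumber(new Person(1L, "Tom", "Butterfly"), "555-0100");
        book.addNumber(new Person(1L, "Tom", "Butterfly"), "555-0101");
        System.out.println(book.numbersOf(new Person(1L, "Tom", "Butterfly")));
    }
}
```

Without the two overrides, each new Person(1L, ...) would be a different key and the lookup in main would come back empty.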

Your problem has different solutions. For example, I'll go with a LIST: List<Person>, where Person is a class like this:
public class Person{
private String name;
private List<String> phoneNumbers;
// ...
}
For collections searching/filtering I suggest Guava Collections2.filter method.

You should use this:
Hashtable<String, ArrayList<String>> addressbook = new Hashtable<>();
ArrayList<String> persons = new ArrayList<String>();
persons.add("Tom Butterfly");
persons.add("Maria Wanderlust");
addressbook.put("+0490301234567", persons);
addressbook.put("+0490301234560", persons);
A Hashtable does not allow null keys or values, and an ArrayList is fast for collecting a small number of elements. Note that several persons with different names may share the same number.
Note also that two persons can have the same number and the same name!
String name = "Tom Butterfly";
// Linear scan over the entries. Collections.binarySearch is not reliable here:
// it needs a sorted list and a comparator that defines a consistent ordering.
for (Map.Entry<String, ArrayList<String>> entry : addressbook.entrySet()) {
    if (entry.getValue().contains(name)) {
        System.out.println("Number is " + entry.getKey());
    }
}

Maybe
List<Pair<String, String>> (for one number per person)
or
List<Pair<String, String[]>> (for multiple numbers per person)
will fit your needs.

how to manipulate list in java

Edit: My list is sorted as it is coming from a DB
I have an ArrayList that has objects of class People. People has two properties: ssn and terminationReason. So my list looks like this
ArrayList:
ssn        TerminationReason
123456789  Reason1
123456789  Reason2
123456789  Reason3
568956899  Reason2
000000001  Reason3
000000001  Reason2
I want to change this list so that there are no duplicate SSNs and the termination reasons are separated by commas.
So the above list would become
New ArrayList:
ssn        TerminationReason
123456789  Reason1, Reason2, Reason3
568956899  Reason2
000000001  Reason3, Reason2
I have something going where I am looping through the original list and matching ssn's but it does not seem to work.
Can someone help?
Code I was using was:
String ssn = "";
Iterator it = results.iterator();
ArrayList newList = new ArrayList();
People ob;
while (it.hasNext())
{
ob = (People) it.next();
if (ssn.equalsIgnoreCase(""))
{
newList.add(ob);
ssn = ob.getSSN();
}
else if (ssn.equalsIgnoreCase(ob.getSSN()))
{
//should I get last object from new list and append this termination reason?
ob.getTerminationReason()
}
}

To me, this seems like a good case to use a Multimap, which would allow storing multiple values for a single key.
Guava (formerly Google Collections) has a Multimap implementation.
This may mean that the Person object's ssn and terminationReason fields may have to be taken out to be a key and value, respectively. (And those fields will be assumed to be String.)
Basically, it can be used as follows:
Multimap<String, String> m = HashMultimap.create();
// In reality, the following would probably be iterating over the
// Person objects returned from the database, and calling the
// getSSN and getTerminationReasons methods.
m.put("0000001", "Reason1");
m.put("0000001", "Reason2");
m.put("0000001", "Reason3");
m.put("0000002", "Reason1");
m.put("0000002", "Reason2");
m.put("0000002", "Reason3");
for (String ssn : m.keySet())
{
// For each SSN, the termination reasons can be retrieved.
Collection<String> termReasonsList = m.get(ssn);
// Do something with the list of reasons.
}
If necessary, a comma-separated list of a Collection can be produced:
StringBuilder sb = new StringBuilder();
for (String reason : termReasonsList)
{
sb.append(reason);
sb.append(", ");
}
sb.delete(sb.length() - 2, sb.length());
String commaSepList = sb.toString();
This could once again be set to the terminationReason field.
An alternative, as Jonik mentioned in the comments, is to use the StringUtils.join method from Apache Commons Lang to create a comma-separated list.
It should also be noted that the Multimap doesn't specify whether an implementation should or should not allow duplicate key/value pairs, so one should look at which type of Multimap to use.
In this example, the HashMultimap is a good choice, as it does not allow duplicate key/value pairs. This would automatically eliminate any duplicate reasons given for one specific person.
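For comparison, if adding a library is not an option and Java 8 streams are available, the same grouping can be sketched with Collectors.groupingBy. This is a plain-JDK alternative, not the Multimap approach above. The People class below is a stand-in built from the question's description, and TerminationReasons and joinReasonsBySsn are made-up names. Unlike HashMultimap, it keeps duplicate reasons.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; People is a stand-in with String fields (an assumption).
class TerminationReasons {
    static final class People {
        private final String ssn;
        private final String terminationReason;

        People(String ssn, String terminationReason) {
            this.ssn = ssn;
            this.terminationReason = terminationReason;
        }

        String getSSN() {
            return ssn;
        }

        String getTerminationReason() {
            return terminationReason;
        }
    }

    // Groups rows by SSN (first-seen order) and joins the reasons with ", ".
    static Map<String, String> joinReasonsBySsn(List<People> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                People::getSSN,
                LinkedHashMap::new,
                Collectors.mapping(People::getTerminationReason,
                        Collectors.joining(", "))));
    }

    public static void main(String[] args) {
        List<People> rows = Arrays.asList(
                new People("123456789", "Reason1"),
                new People("123456789", "Reason2"),
                new People("123456789", "Reason3"),
                new People("568956899", "Reason2"),
                new People("000000001", "Reason3"),
                new People("000000001", "Reason2"));
        joinReasonsBySsn(rows).forEach((ssn, reasons) ->
                System.out.println(ssn + " " + reasons));
    }
}
```

The LinkedHashMap keeps the SSNs in the order they first appear, so the output rows follow the order of the input list. This version also works when the input is not sorted.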

What you might need is a hash-based collection; a HashMap may be usable.
Override equals() and hashCode() inside your People class.
Make hashCode() return a value derived from the person's SSN. This way, all People objects with the same SSN end up in the same "bucket".
Keep in mind that the Map implementation classes hold your objects as key/value pairs, so you will have something like myHashMap.put("ssn", peopleobject);

List<People> newlst = new ArrayList<People>();
People last = null;
for (People p : listFromDB) {
if (last == null || !last.ssn.equals(p.ssn)) {
last = new People();
last.ssn = p.ssn;
last.terminationReason = "";
newlst.add(last);
}
if (last.terminationReason.length() > 0) {
last.terminationReason += ", ";
}
last.terminationReason += p.terminationReason;
}
And you get the aggregated list in newlst.
Update: If you are using MySQL, you can use the GROUP_CONCAT function to extract data in your required format. I don't know whether other DB engines have similar function or not.
Update 2: Removed the unnecessary sorting.

Two possible problems:
This won't work if your list isn't sorted
You aren't doing anything with ob.getTerminationReason(). I think you mean to add it to the previous object.

EDIT: Now that I see you've edited your question.
As your list is sorted (by ssn, I presume):
String currentSSN = null;
List<People> peopleList = getSortedList(); // gets sorted list from DB.
/* Uses the for-each construct instead of iterators */
for (People person : peopleList) {
    if (currentSSN == null || !person.getSSN().equals(currentSSN)) {
        // Person has changed. New row.
        currentSSN = person.getSSN();
        System.out.println();
        System.out.print(person.getSSN() + " "); // writes row header.
    }
    System.out.print(person.getTerminationReason() + " "); // writes termination reason
}
If you don't want to display the contents of your list, you could use it to create a Map and then use it as shown below.
If your list is not sorted
Maybe you should try a different approach, using a Map. Here, ssn would be the key of the map, and the values could be a list of People:
// String keys keep leading zeros such as 000000001.
Map<String, List<People>> mymap = getMap(); // loads a Map from input data.
for (String ssn : mymap.keySet()) {
    dorow(ssn, mymap.get(ssn));
}
public void dorow(String ssn, List<People> reasons) {
    System.out.print(ssn + " ");
    for (People people : reasons) {
        System.out.print(people.getTerminationReason() + " ");
    }
    System.out.println("-----"); // row separator.
}
Last but not least, you should override your hashCode() and equals() method on People class.
for example
@Override
public int hashCode() {
    return 3 * this.terminationReason.hashCode();
}
