Algorithm for Graph/Data Structure in Java

I have been working on the following problem: I have a CSV file with two columns; say both field names are "Friends". Both columns contain letters from A to Z.
e.g.
A B
B C
A E
D F
E F
Each row has two different letters (no duplication within a row). A is a friend of B, C is a friend of D, etc. If person A talks to person B and person B talks to person C, then A and C become acquaintances. Acquaintances are people who share a common friend. I need to find out who has the most acquaintances.
I have been trying two different methods: one using different data structures such as HashMap, ArrayList, Stack etc., and another using graph theory (the JGraphT library).
But I am stuck on the logic if I use data structures, and I am stuck on traversal if I use the graph approach.
I have the following questions:
Which is the better approach: data structures or a graph? Or is there any other, better approach/logic/algorithm than these?
Does anyone know how to traverse a graph in the JGraphT library? I am not able to do this; the library has very limited documentation.
Any help would really be appreciated.

Generally, HashMaps are among the fastest and easiest structures to use. I would recommend using them rather than a third-party library, unless you are sure the library makes something you need easy that would otherwise take you a long time to code.
In your case, you can just use each person as a key and the list of that person's friends as the value. Parsing your .csv file and filling the HashMap accordingly will solve your issue, as a previous comment pointed out.

You can have a hash table first that maps every letter to the set of its friends, e.g. A maps to { B }, B maps to { C }, and if Z has two friends Y and W then Z maps to { Y, W }. This is a hash map from letters to sets-of-letters.
To calculate the acquaintances, iterate through the hash map. When you are at entry A -> { B, C, F } then iterate in an inner loop through the set B, C, F. Collect their friends (three sets) into a single set (just insert the elements into a temp set) and remove A from that set if A was found. The size of that set is then the number of acquaintances for A. Rinse and repeat.
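The two steps above (build the letter-to-friends map, then collect friends of friends) might look roughly like the following in Java. This is only a sketch: the class and method names are made up for illustration, the CSV is assumed to be already parsed into two-element rows, and friendship is assumed to be symmetric, so each row is stored in both directions. As in the description above, only the person themselves is removed from the collected set, so someone who is both a direct friend and a friend of a friend is still counted.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class Acquaintances {
    // Builds person -> set of friends. Assumption: friendship is symmetric,
    // so every row {x, y} is recorded for both x and y.
    static Map<String, Set<String>> buildFriends(List<String[]> rows) {
        Map<String, Set<String>> friends = new TreeMap<>();
        for (String[] row : rows) {
            friends.computeIfAbsent(row[0], k -> new HashSet<>()).add(row[1]);
            friends.computeIfAbsent(row[1], k -> new HashSet<>()).add(row[0]);
        }
        return friends;
    }

    // For each person, unions the friend sets of all their friends,
    // removes the person themselves, and records the size of the result.
    static Map<String, Integer> countAcquaintances(Map<String, Set<String>> friends) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : friends.entrySet()) {
            Set<String> reached = new HashSet<>();
            for (String friend : entry.getValue()) {
                reached.addAll(friends.get(friend));
            }
            reached.remove(entry.getKey());
            counts.put(entry.getKey(), reached.size());
        }
        return counts;
    }
}
```

Finding the person with the most acquaintances is then a single pass over the returned counts.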

Related

Creating a derivation tree from a set of Grammar rules

I have 2 arrays defining a set of rules in a context-free grammar. Array 1 holds the left side of each rule and array 2 holds the right side, for example:
A = B | C would translate to array1[0] = A, array2[0] = B C
From this, I want to construct all the possible derivations, given an integer that defines how many steps can occur. So for example, A ---> C would constitute 1 step. If the integer were 3, the program would print out all the possible derivations that occur in 3 steps.
Any advice on how to tackle this program would be appreciated. I've been trying to think of a way around the problem for hours with no success. I'm using Java.
Thanks.
Since you are using String objects to express derivations, you could take your starting symbol and split its derivation using the delimiter " ". Then you could look up derivations for the non-terminals in the right order and try to derive them recursively.
If there are only terminals left, you could print out the complete derivation, which you have stored in a structure (e.g. a list) that contains every derivation step.
Then you backtrack to the last point where other derivation options are still available.
But I think your modelling is not that good, since it is not really efficient. The problem you are dealing with is typically solved by parser implementations, which are also used e.g. in compilers. Have a look at this resource (Wikipedia): Parsing.
There are two main approaches: Top-down- and bottom-up-parsing.
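As a rough illustration of the split-and-recurse idea described in this answer, here is a minimal Java sketch. It makes several assumptions that are not in the question: the rules are held in a Map from a non-terminal to its alternatives rather than in two parallel arrays, symbols are separated by single spaces, only the leftmost non-terminal is rewritten at each step, and "n steps" is read as exactly n rewrites. The names Derivations, derive and expand are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class Derivations {
    // Returns every sentential form reachable from `start` in exactly
    // `steps` leftmost rewrite steps.
    static List<String> derive(String start, int steps, Map<String, List<String>> rules) {
        List<String> results = new ArrayList<>();
        expand(start, steps, rules, results);
        return results;
    }

    private static void expand(String form, int stepsLeft,
                               Map<String, List<String>> rules, List<String> out) {
        if (stepsLeft == 0) {
            out.add(form);
            return;
        }
        String[] symbols = form.split(" ");
        for (int i = 0; i < symbols.length; i++) {
            List<String> alternatives = rules.get(symbols[i]);
            if (alternatives == null) {
                continue; // no rule for this symbol, so treat it as a terminal
            }
            for (String alternative : alternatives) {
                String[] rewritten = symbols.clone();
                rewritten[i] = alternative;
                expand(String.join(" ", rewritten), stepsLeft - 1, rules, out);
            }
            return; // only the leftmost non-terminal is expanded
        }
        // Only terminals remain but steps are left over: this branch finished
        // early and is dropped under the "exactly n steps" reading.
    }
}
```

Returning from each recursive call is what provides the backtracking to the last point where other alternatives were available.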

Data structure used to perform the union operation on two disjoint sets

What basic data structure would be best to use for the union operation on two disjoint sets?
Are there any algorithms that would run in O(1) time?
I'm thinking some variety of Hash Table, but I'm kind of stuck.
This is for a study guide in Algorithms and Data Structures.
The full question:
The set operation UNION takes two disjoint sets S1 and S2 as input, and returns a
set S = S1 ∪ S2 consisting of all the elements of S1 and S2 (the sets S1 and S2 are
usually destroyed by this operation). Explain how you can support UNION operation
in O(1) time using a suitable data structure. Discuss what data structure you would
use and describe the algorithm for the UNION operation.
If the sets are disjoint, a linked list (with a head and tail) will be enough. The union in this case is only a concatenation of the lists. In C++:
struct LL {
    Value *val;   // payload; Value is whatever element type the set stores
    LL *next;
};
struct LList {
    LL *head;
    LL *tail;
};
and the union operation will be:
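void unify(LList* list1, LList* list2) {
    // Appends list2 to list1 in O(1); list2 is left empty afterwards.
    if (list2->head == nullptr) return;   // nothing to append
    if (list1->head == nullptr) {         // list1 is empty: adopt list2's nodes
        list1->head = list2->head;
    } else {
        list1->tail->next = list2->head;
    }
    list1->tail = list2->tail;
    list2->head = list2->tail = nullptr;
}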
An interesting technique that sometimes applies to that problem (not always though, as you will see), is to use an array of "cycles", each cycle storing a set. The cycles are stored as a bunch of "next element" links, so next[i] will give an integer that represents the next item. In the end the links loop back, so the sets are necessarily disjoint.
The nice thing there is that you can union two sets together by swapping two items. If you have indexes s1 and s2, then the sets they are in (s1 and s2 are not special representatives, you can refer to a set by any of its elements) can be unioned by swapping those positions:
    int temp = next[s1];
    next[s1] = next[s2];
    next[s2] = temp;
Or however you can swap in your language. Java doesn't have a nice equivalent of std::swap(next[s1], next[s2]) as far as I know.
This is obviously related to cyclic linked lists, but more compact. The downside is that you have to prepare your "universe" in advance. With linked lists you can arbitrarily add items. Also if your items are not the integers 0 to n then you will have an array on the side to do the mapping, but that's not really a pure downside or upside, it depends on what you need to do with it.
A bonus upside is that, because you can refer to an item by index, it combines more easily with other data structures. For example, it cooperates well with the Union Find structure (which is also an array of integers, well, two of them): the combination inherits the O(1) Union that both structures offer, keeps the amortized O(α(n)) Find of Union Find, and also (from the cycles structure) keeps the O(m) enumeration of a set of size m. So you mostly get the best of both worlds.
In case it wasn't obvious, you can initialize the "universe" with "all singletons" like this:
    for (int i = 0; i < next.length; i++)
        next[i] = i;
The same as in Union Find.
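Putting the pieces of this answer together, a small Java sketch of the "array of cycles" idea could look like this. The class name CycleSets and the method names are invented for illustration. One caveat is made explicit here: the swap only merges when s1 and s2 are in different sets; if they are already in the same cycle, the same swap splits that cycle in two, which is presumably one reason this structure is often paired with Union Find to check membership first.

```java
import java.util.ArrayList;
import java.util.List;

class CycleSets {
    private final int[] next;

    // Starts with every element of the universe 0..n-1 as a singleton cycle.
    CycleSets(int universeSize) {
        next = new int[universeSize];
        for (int i = 0; i < next.length; i++) {
            next[i] = i;
        }
    }

    // Merges the sets containing s1 and s2 in O(1) by swapping their links.
    // Precondition (not checked): s1 and s2 are currently in different sets.
    void union(int s1, int s2) {
        int temp = next[s1];
        next[s1] = next[s2];
        next[s2] = temp;
    }

    // Enumerates the set containing `start` by following links until the
    // cycle closes; takes O(m) time for a set of size m.
    List<Integer> members(int start) {
        List<Integer> result = new ArrayList<>();
        int i = start;
        do {
            result.add(i);
            i = next[i];
        } while (i != start);
        return result;
    }
}
```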

Record Matching - Efficient Iteration

I have to perform record matching on 70K records in Java. Each record is about 200 bytes. The matching process compares every record against every other record. My question is: how can I iterate and perform the comparisons efficiently?
First of all, you don't need to compare every pair in both directions. Since comparing A with B is the same as comparing B with A, you only need to compare each record with its successors. For example, given { A, B, C, D }, you compare A with B, C and D; compare B with C and D; and compare C with D. This cuts the number of comparisons from n^2 to n(n-1)/2, which for 70K records is roughly 2.45 billion instead of 4.9 billion.
You can optimize the algorithm further by building search blocks. Put everyone with the same first and last name in one block, everyone with the same email in another block, and so on. Afterwards, process each block, comparing its records as described above. Depending on the number of records you have, this can reduce the processing time dramatically.
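The blocking idea plus the "compare only with successors" loop can be sketched in Java as below. This is an illustrative sketch, not a tuned implementation: BlockedMatcher, findMatches, blockKey and isMatch are hypothetical names, and for brevity it uses a single blocking key per record, whereas the answer suggests several kinds of blocks (name, email, and so on).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

class BlockedMatcher {
    // Groups records by a blocking key, then compares each record only with
    // its successors inside the same block. Returns the matching pairs.
    static <R> List<List<R>> findMatches(List<R> records,
                                         Function<R, String> blockKey,
                                         BiPredicate<R, R> isMatch) {
        Map<String, List<R>> blocks = new LinkedHashMap<>();
        for (R record : records) {
            blocks.computeIfAbsent(blockKey.apply(record), k -> new ArrayList<>()).add(record);
        }
        List<List<R>> matches = new ArrayList<>();
        for (List<R> block : blocks.values()) {
            for (int i = 0; i < block.size(); i++) {
                for (int j = i + 1; j < block.size(); j++) { // successors only
                    if (isMatch.test(block.get(i), block.get(j))) {
                        matches.add(Arrays.asList(block.get(i), block.get(j)));
                    }
                }
            }
        }
        return matches;
    }
}
```

Records that land in different blocks are never compared, so the choice of key trades speed against the risk of missing matches whose keys differ.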
Use Duke [https://github.com/larsga/Duke].
It's not perfect, but it's free and written in Java.
We have a .NET version that is better and faster, but it's an in-house thing, not OSS yet.

is there any DSL for streams/iterators?

I wonder (and am nearly getting desperate) whether there is any worked-out DSL for streams/iterators on ordered series of objects.
The sources are ordered streams of id,time,key,value instances, and the requirement is to join and analyse those streams. This has to be done by collecting combinations of keys and applying metrics to values within certain (definable) time constraints (count distinct keys or sum values within a day, within the same second, ...). There are some DSLs that work on time series (ESP), but they mostly use relatively simple time windows and do not seem able to handle the order/join by id,time etc. (and, in consequence, the computation of combinations by id).
What I have to do is something like "compute the combinations of A and (B or C), count distinct D within same second, sum E with same id"
The results should contain all available combinations of A, (B or C) with the count of distinct values for key D that are in the same second as A, (B or C) for each distinct id, and the sum of the values for key E for each id (which is the sum over all values of E for ids having A, (B or C)).
Not an easy question. I'm just looking for a possibly helpful, already thought-out DSL for such problems. I do not think SQL will do it.
Thanks a lot!
I think you can't find such methods because streams and iterators are not intended to contain ordered data (although they can). As a result, if you can't rely on the data inside being sorted, there is no point in such methods: you would need to read all the data from the stream/iterator, so they would lose their main purpose as a data structure. So why not use a list?

Need algorithm for Sequence calculation

I am trying to find the solution for a problem where I have something like
A > B
B > C
B > D
C > D
And I should get the answer as A > B > C > D.
Conditions for this problem:
The output will involve all the elements.
The problem will not have any bogus inputs. For example, (A>B) (C>D) is a bogus input, since we cannot determine the output.
The inputs can be of any size but are never bogus, and there will always be a solution to the problem.
I need to find a solution for this optimally using Java Collections. Any tips/hints are welcome.
Thanks in advance!
It's called a Topological Sort. http://en.wikipedia.org/wiki/Topological_sorting
Given that, you should be able to complete your homework on your own.
I'm betting you recently covered graphs in this class...
How do you think a graph could be applied here ?
Can you think of a structure which one would build on the basis of the problem inputs (A>B, A>D, C>A etc.)? Maybe some kind of directed graph...
Once the problem is expressed in such a graph, the solution would involve navigating this graph...
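For reference, one common way to "navigate" such a directed graph is Kahn's algorithm for topological sorting. The Java sketch below is one possible implementation, not necessarily what the answers here had in mind. It assumes each input pair {a, b} means a > b (a must come before b), and the names TopoSort and sort are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopoSort {
    // Kahn's algorithm: repeatedly output a node with no remaining
    // predecessors, then remove its outgoing edges.
    static List<String> sort(List<String[]> pairs) {
        Map<String, List<String>> successors = new LinkedHashMap<>();
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            successors.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
            successors.computeIfAbsent(pair[1], k -> new ArrayList<>());
            inDegree.merge(pair[1], 1, Integer::sum);
            inDegree.putIfAbsent(pair[0], 0);
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            for (String successor : successors.get(node)) {
                if (inDegree.merge(successor, -1, Integer::sum) == 0) {
                    ready.add(successor);
                }
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalArgumentException("input contains a cycle");
        }
        return order;
    }
}
```

With the pairs from the question (A>B, B>C, B>D, C>D) this yields A, B, C, D. If the input were "bogus" in the question's sense, more than one node would be ready at some point and the order would not be unique.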
You start by putting them in a List. The list is kept sorted, so for the nth pair (a, b) you look up a using a binary search. If it already exists, skip it; if not, insert it at the proper point. Since a > b, you then do the same with b in the remaining part of the List. Hope this helps.
You can do this with a Map of the inputs and a recursive method that adds its answer to a returned List (or just prints each node as it descends the tree). If you are returning the answer, then prepending to the returned list will prevent the answer from coming out reversed (D->C->B->A) when complete (or you can just .reverse() the list at the end). Don't forget to test for a break condition when recursing (hint: key not found).
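A minimal sketch of this Map-plus-recursion idea in Java follows. Some details are assumptions rather than part of the answer: since one element can be greater than several others (B > C and B > D), the map goes from an element to a list of smaller elements, a visited set prevents adding an element twice, and the input is assumed to be cycle-free as the question promises. RecursiveOrder, order and visit are illustrative names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RecursiveOrder {
    // Each pair {a, b} means a > b. Returns the elements from largest to smallest.
    static List<String> order(List<String[]> pairs) {
        Map<String, List<String>> smaller = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            smaller.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        LinkedList<String> result = new LinkedList<>();
        Set<String> visited = new HashSet<>();
        for (String start : smaller.keySet()) {
            visit(start, smaller, visited, result);
        }
        return result;
    }

    private static void visit(String node, Map<String, List<String>> smaller,
                              Set<String> visited, LinkedList<String> result) {
        if (!visited.add(node)) {
            return; // already placed in the result
        }
        List<String> children = smaller.get(node);
        if (children != null) { // break condition: key not found means nothing is smaller
            for (String child : children) {
                visit(child, smaller, visited, result);
            }
        }
        // Prepend after all smaller elements are placed, so the list is not reversed.
        result.addFirst(node);
    }
}
```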
