Understanding multidimensional, associative arrays in Groovy - Java

First time posting, and I am having some difficulty understanding Groovy data structures (I'm not sure whether they are lists, arrays, or maps). I have typically coded in PHP and am used to thinking of multidimensional arrays as (key => value) associations. I am not sure if I am overlooking that flexibility in Groovy; it seems like you have to pick either a map/array combo or a list.
What I am trying to accomplish: I have one static associative array that I would like to associate as key -> value (e.g. 1 - Tim, 2 - Greg, 3 - Bob, etc.).
I have another associative array that is totally dynamic. It needs to be nested within the static one above, because it will contain task information that the current user has worked on. For example, under Tim there might be 3 unrelated tasks worked on at different times, with varying statuses, so his entry should correlate to something like [Task 1, 3/6/19, Completed], [Task 2, 3/5/19, Completed], [Task 3, 2/5/19, In Progress]. Someone named Greg might instead have 4 tasks.
So my question is: what is the best data structure to use for this, and how do I add data to it effectively?
I'm sorry if these seem like bare-bones basic questions. Again, I'm new to Groovy.

Map model = [:]
List names = ['Tim', 'Greg', 'Bob']
names?.each { name ->
    // dynamically call something that returns a list
    // model."${name}" = getSomeList(name)
    // or get a list first and assign it, maybe something like this
    // List someTasks = ['task1', 'task2']
    // model."${name}" = someTasks
    // or shorter
    // model."${name}" = ['task1', 'task2']
    // one-element vs. multi-element list
    if (name == 'Bob') {
        model."${name}" = ['task1']
    } else {
        model."${name}" = ['task1', 'task2']
    }
}
// This iterates through the map, each value being a list that is iterated in turn
model?.each { key, value ->
    println "working on $key"
    value?.each { v ->
        println "$key has task ${v}"
    }
}
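With the three names above, the nested iteration prints:

working on Tim
Tim has task task1
Tim has task task2
working on Greg
Greg has task task1
Greg has task task2
working on Bob
Bob has task task1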
Try some of the above; it may help you understand things better. And yes, you can use << to append entries to a map:
Map model=[:]
model << ['Bob': ['task1']]
model << ['Greg': ['task1', 'task2']]
You can either add entries like the latter, or build the map through an iteration as above. You can also nest further lists/maps within a value, for example:
model << ['Greg': [
    'task1': ['do thing1', 'do thing2'],
    'task2': ['do xyz', 'do abc']
]]
// This iterates through the map, each value being another map that is iterated in turn
model?.each { key, value ->
    println "working on $key"
    value?.each { k, v ->
        println "$key has task ${k}"
        v?.each { vv ->
            println "$key has task ${k} which needs to do ${vv}"
        }
    }
}
Using collect you could really simplify all the each iterations, which are a lot more verbose; with collect you can make it one line. Note the parentheses around (it): without them Groovy would use the literal string 'it' as the map key instead of the variable's value:
names?.collect { [(it): getSomeList(it)] }
// sometimes you need to flatten; in this case I don't think you would
names?.collect { [(it): getSomeList(it)] }?.flatten()
List getSomeList(String name) {
    return ['task1', 'task2']
}

The basic data structures for key/value lookups are just Java Maps (usually the LinkedHashMap implementation in Groovy). Your first-level association seems to be something like a Map<Integer, Employee>. The nested one that you are calling "totally dynamic" really seems to be a structured class, and you should definitely learn how Java/Groovy classes work. This seems to be something like what you're looking for:
class Employee {
    int employeeId
    String name
    List<Task> tasks
}

enum TaskStatus {
    PENDING,
    IN_PROGRESS,
    COMPLETED
}

class Task {
    int taskNumber
    LocalDate date // java.time.LocalDate
    TaskStatus status
}
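To tie this back to the question's sample data, here is a minimal sketch of wiring these classes into the first-level map. It is plain Java calling the getters/setters Groovy generates for the properties above; the names and dates come from the question, and anotherTask is a hypothetical placeholder:

import java.time.LocalDate;
import java.util.*;

Map<Integer, Employee> employees = new LinkedHashMap<>();

Task t1 = new Task();
t1.setTaskNumber(1);
t1.setDate(LocalDate.of(2019, 3, 6));
t1.setStatus(TaskStatus.COMPLETED);

Employee tim = new Employee();
tim.setEmployeeId(1);
tim.setName("Tim");
tim.setTasks(new ArrayList<>(Arrays.asList(t1))); // mutable so tasks can grow

employees.put(tim.getEmployeeId(), tim);
// adding more work later is just a list append:
// employees.get(1).getTasks().add(anotherTask);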
By the way, Groovy is a great language and my preferred JVM language, but it's better to make sure you understand the basics first. I recommend using @CompileStatic on all of your classes whenever possible and making sure you understand any cases where you can't use it. This will help prevent errors and missteps as you learn.

Related

How to recall some values that are printed in Anylogic Console and store them?

I am creating an agent based model in Anylogic 8.7. I created a collection with ArrayList class and Agent elements using this code to separate some agents meeting a specific condition:
collection.addAll(findAll(population, p -> p.counter == variable));
for (AgentType p : collection) { traceln(p.probability); }
The above code will store the probability attribute of the separated agents in the console. Is there a way to define a loop to retrieve the printed probability attributes from the console one by one and store them in a variable to operate on them? Or if there is a more efficient and optimized way of doing this I would be glad if you share this with me. Thank you all.
I am not sure why you are following this methodology... Agent-Based Modeling already "stores" the parameters you are looking for; you do not need the console as an intermediate. I believe what you are trying to do is the following:
for (AgentType p : agentTypes) {
    if (p.track == 1) {
        sum = sum + p.probability * p.impact;
    }
}
I recommend you read:
https://help.anylogic.com/topic/com.anylogic.help/html/code/for.html?resultof=%22%66%6f%72%22%20%22%6c%6f%6f%70%22%20
and
https://help.anylogic.com/topic/com.anylogic.help/html/agentbased/statistics.html?resultof=%22%73%74%61%74%69%73%74%69%63%73%22%20%22%73%74%61%74%69%73%74%22%20
The latter will give you a better idea of how to collect agent statistics based on certain criteria.
Depending on the operations you want to perform, you can use the following:
https://help.anylogic.com/index.jsp?topic=%2Fcom.anylogic.help%2Fhtml%2Fjavadoc%2Fcom%2Fanylogic%2Fengine%2FUtilitiesCollection.html&resultof=%22%75%74%69%6c%69%74%69%65%73%22%20%22%75%74%69%6c%22%20
You can use something like this to collect the values of your probabilities one by one:
collection.addAll(findAll(population, p -> p.counter == variable));
LinkedList<Double> probabilities = new LinkedList<>();
for (AgentType p : collection) {
    probabilities.add(p.probability);
}
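From there, operating on the stored values is plain Java. For example, a minimal sketch that averages them, assuming the probabilities list built above (traceln is AnyLogic's console print):

double sum = 0;
for (double prob : probabilities) {
    sum += prob; // accumulate each stored probability
}
double average = probabilities.isEmpty() ? 0 : sum / probabilities.size();
traceln("average probability = " + average);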

Is it good to use a big set or an equivalent map with sets in it?

I have some data points collected from different companies, identified by companyId, and the name property of a data point can be duplicated within one company or across companies. The problem is to group all the data points by their name property across companies, ignoring a data point if its company already exists in that name's group.
For example, the data points are:

companyId   data point name
1           A
1           A
1           B
2           A
3           B

The results would be:

data point name   group
A                 (1,A) (2,A)
B                 (1,B) (3,B)
We can see that the second data point A from company 1 was ignored.
There are two ways, as far as I know, to do the deduplication work.
1. Build a Map<String (data point name), Set<Long (companyId)>>
Map<String, Set<Long>> dedup = new HashMap<>();
for (DataPoint dp : datapoints) {
    String key = dp.getName();
    if (!dedup.containsKey(key)) {
        dedup.put(key, new HashSet<Long>());
    }
    if (dedup.get(key).contains(dp.getCompanyId())) {
        continue;
    }
    dedup.get(key).add(dp.getCompanyId());
}
2. Build a big Set<String>
Set<String> dedup = new HashSet<>();
for (DataPoint dp : datapoints) {
    String key = dp.getName() + dp.getCompanyId();
    if (dedup.contains(key)) {
        continue;
    }
    dedup.add(key);
}
So which one is better or more appropriate?
Method (1) is way better, because method (2) kind of destroys the type information.
There are ready-made collections already available for such cases if you want a well-tested robust implementation, with many additional features.
Guava: https://google.github.io/guava/releases/21.0/api/docs/com/google/common/collect/HashMultimap.html
Eclipse collections:
https://www.eclipse.org/collections/
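For illustration, a minimal sketch with Guava's HashMultimap; its put stores a given key/value pair at most once, which is exactly the dedup semantics wanted here (DataPoint and datapoints are from the question):

import com.google.common.collect.HashMultimap;
import com.google.common.collect.SetMultimap;

SetMultimap<String, Long> dedup = HashMultimap.create();
for (DataPoint dp : datapoints) {
    // a duplicate (name, companyId) pair is silently ignored
    dedup.put(dp.getName(), dp.getCompanyId());
}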
If you just want a simple implementation, you can follow your method (1) and do it yourself.
Result would be something like this:
{
    "A": [1, 2],
    "B": [1, 3]
}
A few reasons why I don't prefer method (2):
The method is not reliable. If a company name ends with a number, you might get false deduplication. So you may need to add a separator character, like so: <name>~<id>.
If you need to consider one more parameter later, it gets messier still; you may have to do <name>~<id>~<pincode>, etc.
In method (1), you have the added convenience that you can put the company bean directly, if you implement hashCode and equals based on the companyId field alone (see the sketch below).
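To illustrate that last point, a hypothetical Company bean (not from the question) whose equality is based on companyId alone, so a Set<Company> deduplicates per company automatically:

class Company {
    final long companyId;
    final String name;

    Company(long companyId, String name) {
        this.companyId = companyId;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        // equality deliberately ignores every field except companyId
        return o instanceof Company && ((Company) o).companyId == companyId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(companyId);
    }
}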
The easiest way to do (1) would be:
// static imports from java.util.stream.Collectors: groupingBy, mapping, toSet
Map<String, Set<Long>> dedup =
    datapoints.stream().collect(
        groupingBy(
            DataPoint::getName,
            mapping(DataPoint::getCompanyId, toSet())));
The easiest way to do (2) would be:
Set<String> dedup =
datapoints.stream()
.map(d -> d.getName() + d.getCompanyId())
.collect(toSet());
The one you choose depends upon what you're trying to do, since they yield different types of data, as well as potentially different results.

Broadcasting a HashMap in Flink

I am using Flink v.1.4.0.
I am working with the DataSet API and one of the things I want to try is very similar to how broadcast variables are used in Apache Spark.
Practically, I want to apply a map function on a DataSet, go through each of the elements in the DataSet and search for it in a HashMap; if the search element is present in the Map then retrieve the respective value.
The HashMap is very big and I don't know if (since I haven't even built my solution) it needs to be Serializable to be transmitted and used by all workers concurrently.
In general, the solution I have in mind would look like this:
Map<String, T> hashMap = new ...;
DataSet<Point> points = env.readCsv(...);
points
    .map(point -> hashMap.getOrDefault(point.getId(), 0))
    ...
but I don't know if this would work or if it would be efficient in any way. After doing a bit of searching, I found a much better example here, according to which one can use broadcast variables in Flink to broadcast a List as follows:
DataSet<Point> points = env.readCsv(...);
DataSet<Centroid> centroids = ...; // some computation
points.map(new RichMapFunction<Point, Integer>() {
    private List<Centroid> centroids;

    @Override
    public void open(Configuration parameters) {
        this.centroids = getRuntimeContext().getBroadcastVariable("centroids");
    }

    @Override
    public Integer map(Point p) {
        return selectCentroid(centroids, p);
    }
}).withBroadcastSet(centroids, "centroids"); // note: the DataSet comes first, then the name
However, .getBroadcastVariable() seems to only work with a List.
Can someone provide an alternative solution with a HashMap?
How would that solution work?
What is the most efficient way to go about solving this?
Could one use a Flink Managed State to do something similar to how broadcast variables are used? How?
Finally, can I attempt multiple mappings with multiple broadcast variables in a pipeline?
Where do the values of hashMap come from? Two other possible solutions:
Reinitialise/recreate/regenerate the hashMap in each instance of your filtering/mapping operator separately, in the open method. This is probably more efficient per record, but it duplicates the initialisation logic.
Create two DataSets, one for the hashMap values and a second for the points, and join the two using the desired join strategy. As an analogy, what you are trying to do could be expressed by the SQL query SELECT * FROM points p, hashMap h WHERE h.key = p.id.
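To answer the direct question: getBroadcastVariable() always hands back a List, so you can broadcast the map's entries as a DataSet of Tuple2 and rebuild a HashMap once per task in open(). A hedged sketch (entries and points are placeholders for your datasets):

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

DataSet<Tuple2<String, Integer>> entries = ...; // the hashMap contents as tuples

points.map(new RichMapFunction<Point, Integer>() {
    private transient Map<String, Integer> lookup;

    @Override
    public void open(Configuration parameters) {
        // turn the broadcast List back into a map, once per parallel task
        List<Tuple2<String, Integer>> broadcast =
                getRuntimeContext().getBroadcastVariable("entries");
        lookup = new HashMap<>();
        for (Tuple2<String, Integer> t : broadcast) {
            lookup.put(t.f0, t.f1);
        }
    }

    @Override
    public Integer map(Point p) {
        return lookup.getOrDefault(p.getId(), 0);
    }
}).withBroadcastSet(entries, "entries");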

Map<K,V> to List for Dynamic Test in Java

Background:
I'm trying to write a dynamic Test like:
@TestFactory
Collection<DynamicTest> dynamicTestsFromCollection() {
    return Arrays.asList(
        dynamicTest("1st dynamic test", () -> assertTrue(true)),
        dynamicTest("2nd dynamic test", () -> assertEquals(4, 2 * 2))
    );
}
My Map is of type Map<String, List<Tuple4>>.
A key can have multiple "test files", which should be seen as ONE test, but the map can also have multiple keys for different "test cases".
What would be the best approach for this?
I thought of converting it to a List and then constructing my test cases from it.
IMPORTANT: The key-value relationship may not be destroyed; the order is not important.
Like:
A {FooForA, Foo2ForA, Foo3ForA}
B {FooForB, Foo2ForB, ...}
The values from key B may not get mixed with the values for A.
Is this possible when converting to a List? Since you then only have one element, how do you keep the relationship? If this isn't safe, I'm open to other approaches, for example Streams or something else.
Here is some reference:
http://junit.org/junit5/docs/current/user-guide/#writing-tests-dynamic-tests
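For what it's worth, one possible approach (a sketch, not from the question) is to emit one DynamicContainer per key, so each key's values stay together and B's values can never mix with A's. String stands in for the question's Tuple4 to keep the sketch short:

import static org.junit.jupiter.api.DynamicTest.dynamicTest;

import java.util.*;
import java.util.stream.Stream;
import org.junit.jupiter.api.DynamicContainer;
import org.junit.jupiter.api.DynamicNode;
import org.junit.jupiter.api.TestFactory;

class MapDrivenTests {
    Map<String, List<String>> testData = new LinkedHashMap<>();
    {
        testData.put("A", Arrays.asList("FooForA", "Foo2ForA", "Foo3ForA"));
        testData.put("B", Arrays.asList("FooForB", "Foo2ForB"));
    }

    @TestFactory
    Stream<DynamicNode> testsPerKey() {
        // one container per key; each value becomes a test inside its key's container
        return testData.entrySet().stream()
                .map(e -> DynamicContainer.dynamicContainer(
                        e.getKey(),
                        e.getValue().stream()
                                .map(v -> dynamicTest(v, () -> {
                                    // run the real assertion for this test file here
                                }))));
    }
}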

What do I use to perform SQL-like lookups on a Java table?

I have a 2D array
public static class Status {
    public static String[][] Data = {
        { "FriendlyName", "Value", "Units", "Serial", "Min", "Max", "Mode", "TestID", "notes" },
        { "PIDs supported [01 – 20]:", null, "Binary", "0", null, null, "1", "0", null },
        { "Online Monitors since DTCs cleared:", null, "Binary", "1", null, null, "1", "1", null },
        { "Freeze DTC:", null, "NONE IN MODE 1", "2", null, null, "1", "2", null },
        // ... more rows ...
    };
}
I want to
SELECT "FriendlyName","Value" FROM Data WHERE "Mode" = "1" and "TestID" = "2"
How do I do it? The fastest execution time is important because there could be hundreds of these per minute.
Think about how general it needs to be. The solution for something truly as general as SQL probably doesn't look much like the solution for a few very specific queries.
As you present it, I'd be inclined to avoid the 2D array of strings and instead create a collection - probably an ArrayList, but if you're doing frequent insertions & deletions maybe a LinkedList would be more appropriate - of some struct-like class. So
List<MyThing> list = new ArrayList<MyThing>();
and index the fields on which you want to search using a HashMap:
Map<Integer, MyThing> modeIndex = new HashMap<Integer, MyThing>();
for (MyThing thing : list)
    modeIndex.put(thing.mode, thing);
Writing it down makes me realize that won't do in and of itself, because multiple things could have the same mode. So probably a multimap instead - or roll your own by making the value type of the map not MyThing but rather List<MyThing>, as sketched below. Google Collections (now Guava) has a fine multimap implementation.
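A minimal sketch of that roll-your-own multimap index, using the (newer, Java 8) computeIfAbsent for brevity; MyThing and its mode field are the placeholders from above:

import java.util.*;

Map<Integer, List<MyThing>> modeIndex = new HashMap<>();
for (MyThing thing : list) {
    // computeIfAbsent creates the per-mode list on first use
    modeIndex.computeIfAbsent(thing.mode, k -> new ArrayList<>()).add(thing);
}
List<MyThing> mode1 = modeIndex.getOrDefault(1, Collections.emptyList());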
This doesn't exactly answer your question, but it is possible to run some Java RDBMs with their tables entirely in your JVM's memory. For example, HSQLDB. This will give you the full power of SQL selects without the overheads of disc access. The only catch is that you won't be able to query a raw Java data structure like you are asking. You'll first have to insert the data into the DB's in-memory tables.
(I've not tried this ... perhaps someone could comment if this approach is really viable.)
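To make that concrete, a minimal sketch of an in-memory HSQLDB over plain JDBC; the mem: URL keeps everything inside the JVM, the table and row are illustrative, and exception handling is elided (the hsqldb jar must be on the classpath):

import java.sql.*;

Connection c = DriverManager.getConnection("jdbc:hsqldb:mem:pids", "SA", "");
try (Statement s = c.createStatement()) {
    s.execute("CREATE TABLE data (friendly_name VARCHAR(64), value VARCHAR(32), mode INT, test_id INT)");
    s.execute("INSERT INTO data VALUES ('Freeze DTC:', NULL, 1, 2)");
    ResultSet rs = s.executeQuery(
            "SELECT friendly_name, value FROM data WHERE mode = 1 AND test_id = 2");
    while (rs.next()) {
        System.out.println(rs.getString(1) + " = " + rs.getString(2));
    }
}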
As to your actual question: in C#, you would use LINQ (Language Integrated Query) for this, which takes advantage of the language's support for closures. Right now, with Java 6 as the latest official release, Java doesn't support closures, but they are coming in the shortly upcoming Java 7. The Java 7-based equivalent of LINQ is likely going to be JaQue.
As to your actual problem, you're definitely using the wrong data structure for the job. Your best bet will be to convert the String[][] into a List<Entity> and use the convenient searching/filtering APIs provided by Guava, as suggested by Carl Manaster. Iterables#filter() would be a good start.
EDIT: I took a look at your array, and I think this is definitely a job for an RDBMS. If you want in-memory-data-structure-like features (fast, no need for a DB server), embedded in-memory databases like HSQLDB and H2 can provide them.
If you want good execution time, you MUST have a good data structure. If the data is just stored unordered in a 2D array, you'll mostly be stuck with O(n).
What you need are indexes, just like in other RDBMSs. For example, if you use a WHERE clause like WHERE name='Brian' AND last_name='Smith' a lot, you could do something like this (kind of pseudocode):
Set<Entry> everyEntry = // the set that contains all the data
Map<String, Set<Entry>> indexedSet = new HashMap<>();
for (String name : unionSetOfNames) {
    Set<Entry> subset = Iterables.collect(new HasName(name), everyEntry);
    indexedSet.put(name, subset);
}
// and later...
Set<Entry> brians = indexedSet.get("Brian");
Entry target = Iterables.find(new HasLastName("Smith"), brians);
(Please forgive me if the Guava API usage is wrong in the example code; it's pseudocode, but you get the idea.)
In the above code, you'll be doing one O(1) lookup, and then an O(n) lookup, but on a much, much smaller subset. So this can be more effective than doing an O(n) lookup on the entire set. If you use an ordered Set, ordered by last_name, and use binary search, that lookup becomes O(log n). Things like that. There are a bunch of data structures out there, and this is only a very simple example.
So in conclusion, if I were you, I'd define my own classes and create a data structure using the standard data structures available in the JDK. If that doesn't suffice, I might look at some other data structures out there, but if it gets really complex, I think I'd just use an in-memory RDBMS like HSQLDB or H2. They are easy to embed, so they are quite close to having your own in-memory data structure. And as you do more and more complex stuff, chances are that that option provides better performance.
Note also that I used the Google Guava library in my sample code. It's excellent, and I highly recommend using it because it's so much nicer. Of course, don't forget to look at the java.util collections, too.
I ended up using a lookup table. 90% of the data is referenced from near the top.
public static int lookupReferenceInTable(String instanceMode, String instanceTID) {
    int[] modeMatches = getReferencesToMode(Integer.parseInt(instanceMode));
    int lineLookup = getReferenceFromPossibleMatches(modeMatches, instanceTID);
    return lineLookup;
}

private static int getReferenceFromPossibleMatches(int[] modeMatches, String instanceTID) {
    int counter = 0;
    int match = 0;
    instanceTID = instanceTID.trim();
    while (counter < modeMatches.length) {
        int x = modeMatches[counter];
        if (Data[x][DataTestID].equals(instanceTID)) {
            return modeMatches[counter];
        }
        counter++;
    }
    return match;
}
It can be further optimized so that instead of looping through all of the rows, it loops on one column until it finds a match, then the next, and so on. The data is laid out in a flowing, well-organized manner, so a lookup based on 3 criteria should only take a number of checks equal to the number of rows.
