Apache Drill: Write general-purpose array_agg UDF - java

I would like to create an array_agg UDF for Apache Drill so that I can aggregate all values of a group into a list of values.
This should work with any major type (required, optional) and minor type (varchar, dict, map, int, etc.).
However, I get the impression that Apache Drill's UDF API does not really make use of inheritance and generics. Each type has its own writer and holder, and they cannot be abstracted to handle any type. For example, the ValueHolder interface seems to be purely cosmetic and cannot be used to hook a UDF to any type in a type-agnostic way.
My current implementation
I tried to solve this by using Java's reflection so I could use the ListHolder's write function independent of the holder of the original value.
However, I then ran into the limitations of the @FunctionTemplate annotation.
I cannot create a general UDF annotation for any value (I tried it with the interface ValueHolder: @Param ValueHolder input).
So to me it seems like the only way to support different types is to have separate classes for each type. But I can't even abstract much and work on any @Param input, because input is only visible in the class where it's defined (i.e. it is type specific).
I based my implementation on https://issues.apache.org/jira/browse/DRILL-6963
and created the following two classes for required and optional varchars (how can this be unified in the first place?)
@FunctionTemplate(
    name = "array_agg",
    scope = FunctionScope.POINT_AGGREGATE,
    nulls = NullHandling.INTERNAL
)
public static class VarChar_Agg implements DrillAggFunc {
    @Param org.apache.drill.exec.expr.holders.VarCharHolder input;
    @Workspace ObjectHolder agg;
    @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;

    @Override
    public void setup() {
        agg = new ObjectHolder();
    }

    @Override
    public void reset() {
        agg = new ObjectHolder();
    }

    @Override
    public void add() {
        if (agg.obj == null) {
            // Initialise list object for output
            agg.obj = out.rootAsList();
        }
        org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter =
            (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
        listWriter.varChar().write(input);
    }

    @Override
    public void output() {
        ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
    }
}
@FunctionTemplate(
    name = "array_agg",
    scope = FunctionScope.POINT_AGGREGATE,
    nulls = NullHandling.INTERNAL
)
public static class NullableVarChar_Agg implements DrillAggFunc {
    @Param NullableVarCharHolder input;
    @Workspace ObjectHolder agg;
    @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;

    @Override
    public void setup() {
        agg = new ObjectHolder();
    }

    @Override
    public void reset() {
        agg = new ObjectHolder();
    }

    @Override
    public void add() {
        if (agg.obj == null) {
            // Initialise list object for output
            agg.obj = out.rootAsList();
        }
        if (input.isSet != 1) {
            return;
        }
        org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter =
            (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
        org.apache.drill.exec.expr.holders.VarCharHolder outHolder = new org.apache.drill.exec.expr.holders.VarCharHolder();
        outHolder.start = input.start;
        outHolder.end = input.end;
        outHolder.buffer = input.buffer;
        listWriter.varChar().write(outHolder);
    }

    @Override
    public void output() {
        ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
    }
}
Interestingly, I can't import org.apache.drill.exec.vector.complex.writer.BaseWriter to make the whole thing easier, because then Apache Drill would not find it.
So I have to spell out the entire package path for everything in org.apache.drill.exec.vector.complex.writer in the code.
Furthermore, I'm using the deprecated ObjectHolder. Is there a better solution?
Anyway, these work so far, e.g. with this query:
SELECT
MIN(tbl.`timestamp`) AS start_view,
MAX(tbl.`timestamp`) AS end_view,
array_agg(tbl.eventLabel) AS label_agg
FROM `dfs.root`.`/path/to/avro/folder` AS tbl
WHERE tbl.data.slug IS NOT NULL
GROUP BY tbl.data.slug
However, when I use ORDER BY, I get this:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException: NULL
Fragment 0:0
Additionally, I tried more complex types, namely maps/dicts.
Interestingly, when I call SELECT sqlTypeOf(tbl.data) FROM tbl, I get MAP.
But when I write UDFs, the query planner complains about having no array_agg UDF for type dict.
Anyway, I wrote versions for both maps and dicts:
@FunctionTemplate(
    name = "array_agg",
    scope = FunctionScope.POINT_AGGREGATE,
    nulls = NullHandling.INTERNAL
)
public static class Map_Agg implements DrillAggFunc {
    @Param MapHolder input;
    @Workspace ObjectHolder agg;
    @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;

    @Override
    public void setup() {
        agg = new ObjectHolder();
    }

    @Override
    public void reset() {
        agg = new ObjectHolder();
    }

    @Override
    public void add() {
        if (agg.obj == null) {
            // Initialise list object for output
            agg.obj = out.rootAsList();
        }
        org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter =
            (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
        //listWriter.copyReader(input.reader);
        input.reader.copyAsValue(listWriter);
    }

    @Override
    public void output() {
        ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
    }
}
@FunctionTemplate(
    name = "array_agg",
    scope = FunctionScope.POINT_AGGREGATE,
    nulls = NullHandling.INTERNAL
)
public static class Dict_agg implements DrillAggFunc {
    @Param DictHolder input;
    @Workspace ObjectHolder agg;
    @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;

    @Override
    public void setup() {
        agg = new ObjectHolder();
    }

    @Override
    public void reset() {
        agg = new ObjectHolder();
    }

    @Override
    public void add() {
        if (agg.obj == null) {
            // Initialise list object for output
            agg.obj = out.rootAsList();
        }
        org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter =
            (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
        //listWriter.copyReader(input.reader);
        input.reader.copyAsValue(listWriter);
    }

    @Override
    public void output() {
        ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
    }
}
But here, I get an empty list in the field data_agg for my query:
SELECT
MIN(tbl.`timestamp`) AS start_view,
MAX(tbl.`timestamp`) AS end_view,
array_agg(tbl.data) AS data_agg
FROM `dfs.root`.`/path/to/avro/folder` AS tbl
GROUP BY tbl.data.viewSlag
Summary of questions
Most importantly: How do I create an array_agg UDF for Apache Drill?
How do I make UDFs type-agnostic/general-purpose? Do I really have to implement an entire class for each nullable, required, and repeated version of every type? That's a lot of work and quite tedious. Isn't there a way to handle values in a UDF independently of the underlying types?
I wish Apache Drill would just use what Java offers here: generic methods, specialised overloading, and inheritance within its own type system. Am I missing something on how to do that?
How can I fix the NULL problem when I use ORDER BY on my varchar version of the aggregate?
How can I fix the problem where my aggregate of maps/dicts is an empty list?
Is there an alternative to using the deprecated ObjectHolder?

To answer your question, unfortunately you've run into one of the limits of the Drill aggregate UDF API, which is that it can only return simple data types. It would be a great improvement to Drill to fix this, but that is the current status. If you're interested in discussing that further, please start a thread on the Drill user group and/or Slack channel. I don't think it is impossible, but it would require some modification to the Drill internals. IMHO it would be well worth it because there are a few other UDFs I'd like to implement that need this feature.
The second part of your question is how to make UDFs type-agnostic, and once again... you've found yet another bit of ugliness in the UDF API. :-) If you do some digging in the codebase, you'll see that most of the math functions have separate versions that accept FLOAT, INT, etc.
Regarding the aggregate of null or empty lists, I actually have some good news here: the current way of doing that is to provide two versions of the function, one which accepts regular holders and a second which accepts nullable holders and returns an empty list or map if the inputs are null. Yes, this sucks, but the additional good news is that I'm working on cleaning this up and hopefully will have a PR submitted soon that eliminates the need to do this.
Regarding the ObjectHolder, I wrote a median function that uses a few Stacks to compute a streaming median and I used the ObjectHolder for that. I think it will be with us for some time as there is no alternative at the moment.
I hope this answers your questions.

Related

Spring Batch : Write a List to a database table using a custom batch size

Background
I have a Spring Batch job where :
FlatFileItemReader - Reads one row at a time from the file
ItemProcessor - Transforms the row from the file into a List<MyObject> and returns the list. That is, each row in the file is broken down into a List<MyObject> (1 row in the file is transformed into many output rows).
ItemWriter - Writes the List<MyObject> to a database table. (I used this implementation to unpack the list received from the processor and delegate to a JdbcBatchItemWriter.)
Question
At point 2), the processor can return a List of 100,000 MyObject instances.
At point 3), the delegate JdbcBatchItemWriter will end up writing the entire List of 100,000 objects to the database.
My question is: the JdbcBatchItemWriter does not allow a custom batch size. For all practical purposes, the batch size equals the commit interval for the step. With this in mind, is there another ItemWriter implementation available in Spring Batch that writes to the database and allows a configurable batch size? If not, how do I go about writing a custom writer myself to achieve this?
I see no obvious way to set the batch size on the JdbcBatchItemWriter. However, you can extend the writer and use a custom BatchPreparedStatementSetter to specify the batch size. Here is a quick example:
public class MyCustomWriter<T> extends JdbcBatchItemWriter<T> {

    @Override
    public void write(List<? extends T> items) throws Exception {
        namedParameterJdbcTemplate.getJdbcOperations().batchUpdate("your sql", new BatchPreparedStatementSetter() {
            @Override
            public void setValues(PreparedStatement ps, int i) throws SQLException {
                // set values on your sql
            }

            @Override
            public int getBatchSize() {
                return items.size(); // or any other value you want
            }
        });
    }
}
The StagingItemWriter in the samples is an example of how to use a custom BatchPreparedStatementSetter as well.
The answer from Mahmoud Ben Hassine and the comments pretty much cover all aspects of the solution; that is the accepted answer.
Here is the implementation I used if anyone is interested :
public class JdbcCustomBatchSizeItemWriter<W> extends JdbcDaoSupport implements ItemWriter<W> {

    private int batchSize;
    private ParameterizedPreparedStatementSetter<W> preparedStatementSetter;
    private String sqlFileLocation;
    private String sql;

    public void initReader() {
        this.setSql(FileUtilities.getFileContent(sqlFileLocation));
    }

    public void write(List<? extends W> arg0) throws Exception {
        getJdbcTemplate().batchUpdate(sql, Collections.unmodifiableList(arg0), batchSize, preparedStatementSetter);
    }

    public void setBatchSize(int batchSize) {
        this.batchSize = batchSize;
    }

    public void setPreparedStatementSetter(ParameterizedPreparedStatementSetter<W> preparedStatementSetter) {
        this.preparedStatementSetter = preparedStatementSetter;
    }

    public void setSqlFileLocation(String sqlFileLocation) {
        this.sqlFileLocation = sqlFileLocation;
    }

    public void setSql(String sql) {
        this.sql = sql;
    }
}
Note:
The use of Collections.unmodifiableList prevents the need for any explicit casting.
I use sqlFileLocation to specify an external file that contains the SQL, and FileUtilities.getFileContent simply returns the contents of this SQL file. This can be skipped, and the SQL can be passed to the class directly while creating the bean.
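For illustration, here is a minimal, hypothetical sketch of how the writer above could be configured (MyObject, the SQL, and the column getters are placeholders; setDataSource is inherited from JdbcDaoSupport, and ParameterizedPreparedStatementSetter can be given as a lambda on Java 8+):
JdbcCustomBatchSizeItemWriter<MyObject> writer = new JdbcCustomBatchSizeItemWriter<>();
writer.setDataSource(dataSource);   // inherited from JdbcDaoSupport
writer.setBatchSize(1000);          // JDBC batch size, independent of the step's commit interval
writer.setSql("INSERT INTO my_table (col_a, col_b) VALUES (?, ?)"); // or setSqlFileLocation(...) + initReader()
writer.setPreparedStatementSetter((ps, item) -> {
    ps.setString(1, item.getColA());
    ps.setString(2, item.getColB());
});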
I wouldn't do this. It presents issues for restartability. Instead, modify your reader to produce individual items rather than having your processor take in an object and return a list.

Flatten processing result in spring batch

Does anyone know how, in Spring Batch (3.0.7), I can flatten the result of a processor that returns a list of entities?
Example:
I have a processor that returns a List:
public class MyProcessor implements ItemProcessor<Long, List<Entity>> {
    public List<Entity> process(Long id) { ... }
}
Now all following processors/writers need to work on List<Entity>. Is there any way to flatten the result to single Entity items so the further processors in the given step can work on single entities?
The only way is to persist the list somehow with a writer and then create a separate step that would read from the persisted data.
Thanks in advance!
As you know, processors in Spring Batch can be chained with a composite processor. Within the chain, you can change the processing type from processor to processor, but of course the input and output types of two "neighbouring" processors have to match.
However, the input and output are always treated as one item. Therefore, if the output type of a processor is a List, this list is regarded as one item. Hence, the following processor needs to have an input type of "List", or, if a writer follows, the writer's write method needs to take a list of lists.
Moreover, a processor cannot multiply its elements: there can be only one output item for every input item.
Basically, there is nothing wrong with having a chain like
Reader<Integer>
ProcessorA<Integer,List<Integer>>
ProcessorB<List<Integer>,List<Integer>>
Writer<List<Integer>> (which leads to a write method write(List<List<Integer>> items))
Depending on the context, there could be a better solution.
You could mitigate the impact (for instance on reusability) by using wrapper processors and a wrapper writer like the following code examples:
public class ListWrapperProcessor<I, O> implements ItemProcessor<List<I>, List<O>> {

    private ItemProcessor<I, O> delegate;

    public void setDelegate(ItemProcessor<I, O> delegate) {
        this.delegate = delegate;
    }

    public List<O> process(List<I> itemList) throws Exception {
        List<O> outputList = new ArrayList<>();
        for (I item : itemList) {
            O outputItem = delegate.process(item);
            if (outputItem != null) {
                outputList.add(outputItem);
            }
        }
        if (outputList.isEmpty()) {
            return null;
        }
        return outputList;
    }
}
public class ListOfListItemWriter<T> implements InitializingBean, ItemStreamWriter<List<T>> {

    private ItemStreamWriter<T> itemWriter;

    @Override
    public void write(List<? extends List<T>> listOfLists) throws Exception {
        if (listOfLists.isEmpty()) {
            return;
        }
        List<T> all = listOfLists.stream().flatMap(Collection::stream).collect(Collectors.toList());
        itemWriter.write(all);
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(itemWriter, "The 'itemWriter' may not be null");
    }

    public void setItemWriter(ItemStreamWriter<T> itemWriter) {
        this.itemWriter = itemWriter;
    }

    @Override
    public void close() {
        this.itemWriter.close();
    }

    @Override
    public void open(ExecutionContext executionContext) {
        this.itemWriter.open(executionContext);
    }

    @Override
    public void update(ExecutionContext executionContext) {
        this.itemWriter.update(executionContext);
    }
}
Using such wrappers, you could still implement "normal" processor and writers and then use such wrappers in order to move the "List"-handling out of them.
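For illustration, a minimal (hypothetical) wiring of these wrappers around existing per-item components might look like this; entityEnricher and entityWriter are assumed names for an ItemProcessor<Entity, Entity> and an ItemStreamWriter<Entity>:
// wrap a per-item processor so it can consume the List<Entity> produced upstream
ListWrapperProcessor<Entity, Entity> listProcessor = new ListWrapperProcessor<>();
listProcessor.setDelegate(entityEnricher);

// wrap a per-item writer so it can consume the List<List<Entity>> chunk
ListOfListItemWriter<Entity> listWriter = new ListOfListItemWriter<>();
listWriter.setItemWriter(entityWriter);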
Unless you can provide a compelling reason, there's no point in sending a List of Lists to your ItemWriter. This is not the way the ItemProcessor was intended to be used. Instead, you should create/configure an ItemReader that returns one object together with its relevant child objects.
For example, if you're reading from the database, you could use the HibernateCursorItemReader and a query that looks something like this:
"from ParentEntity parent left join fetch parent.childrenEntities"
Your data model SHOULD have a parent table with the Long id that you're currently passing to your ItemProcessor, so leverage that to your advantage. The reader would then pass back ParentEntity objects, each with a collection of ChildEntity objects that go along with it.
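For example, a rough sketch of such a reader (sessionFactory and the entity/association names are assumptions taken from the query above):
HibernateCursorItemReader<ParentEntity> reader = new HibernateCursorItemReader<>();
reader.setSessionFactory(sessionFactory);
reader.setQueryString("from ParentEntity parent left join fetch parent.childrenEntities");
reader.setUseStatelessSession(false); // collection fetch joins generally need a stateful session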

Understanding best use of Java Generics in this example case

Let's say I have a manufacturing scheduling system, which is made up of four parts:
There are factories that can manufacture a certain type of product and know if they are busy:
interface Factory<ProductType> {
    void buildProduct(ProductType product);
    boolean isBusy();
}
There is a set of different products, which (among other things) know in which factory they are built:
interface Product<ActualProductType extends Product<ActualProductType>> {
    Factory<ActualProductType> getFactory();
}
Then there is an ordering system that can generate requests for products to be built:
interface OrderSystem {
    Product<?> getNextProduct();
}
Finally, there's a dispatcher that grabs the orders and maintains a work-queue for each factory:
class Dispatcher {
    Map<Factory<?>, Queue<Product<?>>> workQueues
            = new HashMap<Factory<?>, Queue<Product<?>>>();

    public void addNextOrder(OrderSystem orderSystem) {
        Product<?> nextProduct = orderSystem.getNextProduct();
        workQueues.get(nextProduct.getFactory()).add(nextProduct);
    }

    public void assignWork() {
        for (Factory<?> factory : workQueues.keySet())
            if (!factory.isBusy())
                factory.buildProduct(workQueues.get(factory).poll());
    }
}
Disclaimer: This code is merely an example and has several bugs (no check whether the factory exists as a key in workQueues, ...) and is highly non-optimal (it could iterate over the entry set instead of the key set, ...).
Now the question:
The last line in the Dispatcher (factory.buildProduct(workQueues.get(factory).poll());) throws this compile error:
The method buildProduct(capture#5-of ?) in the type Factory<capture#5-of ?> is not applicable for the arguments (Product<capture#7-of ?>)
I've been racking my brain over how to fix this in a type-safe way, but my Generics-skills have failed me here...
Changing it to the following, for example, doesn't help either:
public void assignWork() {
    for (Factory<?> factory : workQueues.keySet())
        if (!factory.isBusy()) {
            Product<?> product = workQueues.get(factory).poll();
            product.getFactory().buildProduct(product);
        }
}
Even though in this case it should be clear that this is ok...
I guess I could add a "buildMe()" function to every Product that calls factory.buildProduct(this), but I have a hard time believing that this should be my most elegant solution.
Any ideas?
EDIT:
A quick example for an implementation of Product and Factory:
class Widget implements Product<Widget> {
    public String color;

    @Override
    public Factory<Widget> getFactory() {
        return WidgetFactory.INSTANCE;
    }
}

class WidgetFactory implements Factory<Widget> {
    static final WidgetFactory INSTANCE = new WidgetFactory();

    @Override
    public void buildProduct(Widget product) {
        // Build the widget of the given color (product.color)
    }

    @Override
    public boolean isBusy() {
        return false; // It's really quick to make this widget
    }
}
Your code is weird.
Your problem is that you are passing a Product<?> to a method which expects a ProductType, which is actually T.
Also, I have no idea what Product is, as you don't mention its definition in the OP.
You need to pass a Product<?> to work. I don't know where you will get it, as I cannot understand what you are trying to do with your code:
Map<Factory<?>, Queue<Product<?>>> workQueues = new HashMap<Factory<?>, Queue<Product<?>>>();

// factory has the type "Factory of ?"
for (Factory<?> factory : workQueues.keySet()) {
    // the queue is of type "Queue of Product of ?"
    Queue<Product<?>> q = workQueues.get(factory);
    // thus you put a "Product of ?" into a method that expects a "?"
    // the compiler can't do anything with that.
    factory.buildProduct(q.poll());
}
Got it! Thanks to meriton who answered this version of the question:
How to replace run-time instanceof check with compile-time generics validation
I need to baby-step the compiler through the product.getFactory().buildProduct(product) part by doing it in a separate generic function. Here are the changes I needed to make to get the code to work (what a mess):
Be more specific about the OrderSystem:
interface OrderSystem {
    <ProductType extends Product<ProductType>> ProductType getNextProduct();
}
Define my own, more strongly typed queue to hold the products:
#SuppressWarnings("serial")
class MyQueue<T extends Product<T>> extends LinkedList<T> {};
And finally, changing the Dispatcher to this beast:
class Dispatcher {
    Map<Factory<?>, MyQueue<?>> workQueues = new HashMap<Factory<?>, MyQueue<?>>();

    @SuppressWarnings("unchecked")
    public <ProductType extends Product<ProductType>> void addNextOrder(OrderSystem orderSystem) {
        ProductType nextProduct = orderSystem.getNextProduct();
        MyQueue<ProductType> myQueue = (MyQueue<ProductType>) workQueues.get(nextProduct.getFactory());
        myQueue.add(nextProduct);
    }

    public void assignWork() {
        for (Factory<?> factory : workQueues.keySet())
            if (!factory.isBusy())
                buildProduct(workQueues.get(factory).poll());
    }

    public <ProductType extends Product<ProductType>> void buildProduct(ProductType product) {
        product.getFactory().buildProduct(product);
    }
}
Notice all the generic functions, especially the last one. Also notice that I can NOT inline this function back into my for loop as I did in the original question.
Also note that the @SuppressWarnings("unchecked") annotation on the addNextOrder() function is needed for the typecast of the queue, not some Product object. Since I only call add on this queue, which, after compilation and type erasure, stores all elements simply as Objects, this should not result in any run-time casting exceptions, ever. (Please do correct me if this is wrong!)

Multiple leaf methods problem in composite pattern

At work, we are developing a PHP application that will later be re-programmed in Java. With some basic knowledge of Java, we are trying to design everything so it can be rewritten easily, without headaches. An interesting problem came up when we tried to implement the composite pattern with a huge number of methods in the leaves.
What we are trying to achieve (not using interfaces, it's just a quick example):
class Composite {
    ...
}

class LeafOne {
    public function Foo( );
    public function Moo( );
}

class LeafTwo {
    public function Bar( );
    public function Baz( );
}
$c = new Composite( Array( new LeafOne( ), new LeafTwo( ) ) );
// will call method Foo in all classes in composite that contain this method
$c->Foo( );
// same with Bar
$c->Bar( );
It seems like a pretty classic Composite pattern, but the problem is that we will have quite a lot of leaf classes, and each of them might have ~5 methods (of which a few might differ from the others). One of our solutions, which seems to be the best one so far and might actually work, is using the __call magic method to call methods in the leaves.
Unfortunately, we don't know if there is an equivalent of it in Java.
So the actual question is: Is there a better solution for this, using code that would be eventually easily re-coded into Java? Or do you recommend any other solution? Perhaps there's some different, better pattern I could use here.
In case there's something unclear, just ask and I'll edit this post.
Edit:
The actual problem is that not every leaf class contains, for example, a method Baz. If we used a simple foreach to call Baz on every class, it'd give us a bunch of errors, as there are certain classes that don't contain this method. The classic solution would be to have every single method from every single leaf class implemented in the Composite class, each with a different implementation. But this would make our Composite class huge and messy with the number of methods we use.
So the usual solution would look like this (Composite class):
class Composite implements Fooable, Bazable {
    ...
    public function Foo( ) {
        foreach( $this->classes as $class ) {
            $class->Foo( );
        }
    }
    public function Baz( ) {
        ...
    }
}
To prevent our code from becoming a real mess, we were thinking about something like:
class Composite {
    ...
    public function __call( ) {
        // implementation
    }
}
But we aren't really sure if it's a good solution, and whether there's something similar in Java (as already asked before the edit).
Within Java you could consider using the visitor pattern whereby you pass a visitor object to each node in the tree and the node makes a callback to the visitor class to determine which behaviour should be performed.
This avoids any casting or explicitly checking the type of each node.
/**
 * Visitor capable of visiting each node within a document.
 * The visitor contains a callback method for each node type
 * within the document.
 */
public interface DocumentVisitor {
    void visitWord(Word word);
    void visitImage(Image img);
}

/**
 * Base interface for each node in a document.
 */
public interface DocumentNode {
    void applyVisitor(DocumentVisitor v);
}

/**
 * Concrete node implementation representing a word.
 */
public class Word implements DocumentNode {
    private final String s;

    public Word(String s) { this.s = s; }

    public String getValue() { return this.s; }

    public void applyVisitor(DocumentVisitor v) {
        // Make appropriate callback to visitor.
        v.visitWord(this);
    }
}

/**
 * Concrete node implementation representing an image.
 */
public class Image implements DocumentNode {
    public void applyVisitor(DocumentVisitor v) {
        // Make appropriate callback to visitor.
        v.visitImage(this);
    }
}

public class Paragraph implements DocumentNode {
    private final List<DocumentNode> children;

    public Paragraph() {
        this.children = new LinkedList<DocumentNode>();
    }

    public void addChild(DocumentNode child) {
        // Technically a Paragraph should not contain other Paragraphs but
        // we allow it for this simple example.
        this.children.add(child);
    }

    // Unlike leaf nodes a Paragraph doesn't callback to
    // the visitor but rather passes the visitor to each
    // child node.
    public void applyVisitor(DocumentVisitor v) {
        for (DocumentNode child : children) {
            child.applyVisitor(v);
        }
    }
}

/**
 * Concrete DocumentVisitor responsible for spell-checking.
 */
public class SpellChecker implements DocumentVisitor {
    public void visitImage(Image i) {
        // Do nothing, as obviously we can't spellcheck an image.
    }

    public void visitWord(Word word) {
        if (!dictionary.contains(word.getValue())) {
            // TODO: Raise warning.
        }
    }
}
The Visitor design pattern is quite a good solution, but you have to consider possible changes to the structure: e.g. a new leaf class will make you implement applyVisitor and add a visit* method to every other Visitor you have created. So Visitor really helps you add behaviour to structured objects, at the price of that structure not changing too often. If the structure changes often and the algorithms do not, you might consider having different composites for objects with the same interfaces. If you want to do it the dirty way, as you currently do in PHP, look at the Java reflection API. A nice solution would, IMHO, be dynamic calls (as in Ruby or Python). You can simulate those, but that would be a lot of work... So my answer is: use the Visitor with care, or consider different composites for objects with different behaviour.
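If you do want to emulate PHP's __call via reflection, a minimal sketch of the composite's dispatch could look like the following (the method of holding the leaves and the no-argument restriction are assumptions for illustration):
import java.lang.reflect.Method;
import java.util.List;

class ReflectiveComposite {
    private final List<Object> leaves;

    ReflectiveComposite(List<Object> leaves) {
        this.leaves = leaves;
    }

    // Call the named no-argument method on every leaf that declares it,
    // silently skipping leaves that don't have it (mirroring the __call idea).
    void call(String methodName) {
        for (Object leaf : leaves) {
            try {
                Method m = leaf.getClass().getMethod(methodName);
                m.invoke(leaf);
            } catch (NoSuchMethodException ignored) {
                // this leaf doesn't support the operation
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
            }
        }
    }
}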

What's the most practical way to have "functions in a dictionary" in Java?

When programming in C/C++ or Python I sometimes used to have a dictionary with references to functions keyed by specified keys. However, I don't really know how to achieve the same, or at least similar, behaviour in Java, allowing dynamic key-to-function (or method, in Java slang) association.
Also, I did find the HashMap technique somebody suggested, but is that seriously the best and most elegant way? I mean, it seems like a lot to create a new class for every method I want to use.
I'd really appreciate every input on this.
You don't need to create a full, named class for each action. You can use anonymous inner classes:
public interface Action<T>
{
    void execute(T item);
}

private static Map<String, Action<Foo>> getActions()
{
    Action<Foo> firstAction = new Action<Foo>() {
        @Override public void execute(Foo item) {
            // Insert implementation here
        }
    };
    Action<Foo> secondAction = new Action<Foo>() {
        @Override public void execute(Foo item) {
            // Insert implementation here
        }
    };
    Action<Foo> thirdAction = new Action<Foo>() {
        @Override public void execute(Foo item) {
            // Insert implementation here
        }
    };

    Map<String, Action<Foo>> actions = new HashMap<String, Action<Foo>>();
    actions.put("first", firstAction);
    actions.put("second", secondAction);
    actions.put("third", thirdAction);
    return actions;
}
(Then store it in a static variable.)
Okay, so it's not nearly as convenient as a lambda expression, but it's not too bad.
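For comparison, a quick sketch of what this looks like on Java 8 and later, where the single-method Action interface can be implemented with lambdas (Foo, someFoo, and the action bodies are placeholders):
Map<String, Action<Foo>> actions = new HashMap<>();
actions.put("first",  item -> { /* implementation */ });
actions.put("second", item -> { /* implementation */ });
actions.put("third",  item -> { /* implementation */ });

actions.get("first").execute(someFoo); // look up by key and invoke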
The short answer is you need to wrap each method in a class - called a functor.
