My Java program takes a long list of inputs (parameters), churns a bit and spits out some output.
I need to organize these parameters in a sane way, so in the input txt file I want to write them like this:
parameter1 = 12
parameter2 = 10
strategy1.parameter1 = "goofy"
strategy2.parameter4 = 100.0
Then I want to read this txt file and turn it into a Java object that I can pass around to other objects when I instantiate them.
I know pyqtgraph has ParameterTree, which is handy to use; is there something similar in Java? I am sure others must have had the same need, so I don't want to reinvent the wheel.
(other tree structures would also be fine, of course, I just wanted something easy to read)
One way is to turn input.txt into input.json:
{
"parameter1": 12,
"parameter2": 10,
"strategy1": {
"parameter1": "goofy"
},
"strategy2": {
"parameter4": 100.0
}
}
Then use Jackson to deserialize input.json into one of these:
1. A Map<String, Object> instance, which you could navigate in depth to get all your parameters
2. An instance of some class of your own that mimics input.json's structure, where your parameters would reside
3. A JsonNode instance that would be the root of the tree
(1) has the advantage that it's easy and you don't have to create any class to read the parameters; however, you'd need to traverse the map, downcast the values you get from it, and know the keys in advance (the keys match the JSON object's attribute names).
(2) has the advantage that everything would be correctly typed upon deserialization; no need to downcast anything, since every type would be a field of your own classes which represent the structure of the parameters. However, if the structure of your input.json file changed, you would need to change the structure of your classes as well.
(3) is the most flexible of all, and I believe it's the option closest to what you have in mind; however, it is also the most tedious to work with, since it's so low-level. Please refer to this article for further details.
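To make the trade-off of option (1) concrete, here is a minimal sketch of traversing and downcasting such a map. It does not call Jackson: the nested map is built by hand as a stand-in for what deserializing input.json into a Map<String, Object> would give you (nested JSON objects become nested maps, numbers become Integer or Double). The class name NestedParameters and its helper methods are made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical wrapper around a nested Map<String, Object>, as in option (1).
class NestedParameters {
    private final Map<String, Object> root;

    NestedParameters(Map<String, Object> root) {
        this.root = root;
    }

    // Resolves a dotted path such as "strategy1.parameter1" by walking nested maps.
    @SuppressWarnings("unchecked")
    Object get(String dottedPath) {
        Object current = root;
        for (String key : dottedPath.split("\\.")) {
            if (!(current instanceof Map) || !((Map<String, Object>) current).containsKey(key)) {
                throw new IllegalArgumentException("No such parameter: " + dottedPath);
            }
            current = ((Map<String, Object>) current).get(key);
        }
        return current;
    }

    // The downcasts that option (1) forces on the caller live in one place here.
    int getInt(String path) { return ((Number) get(path)).intValue(); }
    double getDouble(String path) { return ((Number) get(path)).doubleValue(); }
    String getString(String path) { return (String) get(path); }

    // Hand-built stand-in for the deserialized input.json (assumption, not Jackson output).
    static NestedParameters sample() {
        Map<String, Object> strategy1 = new LinkedHashMap<>();
        strategy1.put("parameter1", "goofy");
        Map<String, Object> strategy2 = new LinkedHashMap<>();
        strategy2.put("parameter4", 100.0);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("parameter1", 12);
        root.put("parameter2", 10);
        root.put("strategy1", strategy1);
        root.put("strategy2", strategy2);
        return new NestedParameters(root);
    }

    public static void main(String[] args) {
        NestedParameters params = sample();
        System.out.println(params.getInt("parameter1"));
        System.out.println(params.getString("strategy1.parameter1"));
    }
}
```

A wrapper like this can be passed to the objects that need parameters, which keeps the key lookups and casts out of the rest of the code.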
Related
I've got loads of the following to implement.
validateParameter(field_name, field_type, field_validationMessage, visibleBoolean);
Instead of having 50-60 of these in a row, is there some form of nested hashmap/4d array I can use to build it up and loop through them?
What's the best approach for doing something like that?
Thanks!
EDIT: Was 4 items.
What you could do is create a new class that holds three values (the type, the boolean, and the name, or the fourth value, which you didn't list). Then, when creating the HashMap, all you have to do is look up the entry to get your three values. It may seem like more work, but after that you only need a simple loop to go through all of the values. Since I don't know exactly what you're trying to do, all I can do is provide an example of what I mean. I hope it applies to your problem.
Anyway, first create the class to hold the three (or four) values you need.
For example,
class Field {
    String field_name;
    Integer field_type;
    Boolean validationMessageVisible;

    Field(String name, Integer type, Boolean mv) {
        this.field_name = name;
        this.field_type = type;
        this.validationMessageVisible = mv;
    }
}
Then put them in a HashMap, somewhat like this:
// Requires: import java.util.HashMap; import java.util.Map;
Map<String, Field> map = new HashMap<String, Field>();
map.put(LOCAL STRING FOR NAME OF FIELD, new Field(LOCAL STRING FOR NAME OF FIELD, YOUR INTEGER, YOUR BOOLEAN));
NOTE: This only works as long as these three or four values can all be stored together. If you need the values to be stored separately for whatever reason, this won't work. It only works if they can be grouped together without affecting the function of the program.
This was a quick brainstorm. I'm not sure it will work as written, but think along these lines and I believe it should work out for you.
You may have to make a few edits, but this should point you in the right direction.
P.S. Sorry for it being so wordy, just tried to get as many details out as possible.
The other answer is close, but you don't need a key in this case.
Just define a class to contain your three fields. Create a List or array of that class. Loop over the list or array calling the method for each combination.
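A minimal sketch of that idea follows. The parameter types of validateParameter are not given in the question, so the signature below (three strings and a boolean) is an assumption, and the method body is only a stub that records its calls. FieldSpec, ValidationLoop and the sample field definitions are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the four arguments of one validateParameter call.
class FieldSpec {
    final String name;
    final String type;
    final String validationMessage;
    final boolean visible;

    FieldSpec(String name, String type, String validationMessage, boolean visible) {
        this.name = name;
        this.type = type;
        this.validationMessage = validationMessage;
        this.visible = visible;
    }
}

class ValidationLoop {
    // Records each call so the effect of the loop can be inspected.
    static final List<String> calls = new ArrayList<>();

    // Stub standing in for the real validateParameter; its signature is assumed.
    static void validateParameter(String name, String type, String message, boolean visible) {
        calls.add(name + ":" + type + ":" + visible);
    }

    // The 50-60 repeated calls become rows in one list (sample rows are invented).
    static final List<FieldSpec> FIELDS = Arrays.asList(
            new FieldSpec("email", "string", "Please enter an e-mail address", true),
            new FieldSpec("age", "integer", "Please enter a whole number", false));

    // One loop replaces the repeated calls.
    static void validateAll(List<FieldSpec> fields) {
        for (FieldSpec f : fields) {
            validateParameter(f.name, f.type, f.validationMessage, f.visible);
        }
    }

    public static void main(String[] args) {
        validateAll(FIELDS);
        System.out.println(calls);
    }
}
```

Adding a field then means adding one row to the list rather than another call.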
The approach I'd use is to create a POJO (or some POJOs) to store the values as attributes and validate attribute by attribute.
Since you will often have the same validation per attribute type (e.g. dates and numbers can be validated by range, strings can be validated to ensure they're not null or empty, etc.), you could just iterate over these attributes using reflection (or, even better, using annotations).
If you need to validate at the POJO level, you can still reuse these attribute-level validators via composition, while adding more specific validations as you go up in abstraction level (going up means basic attributes -> POJOs -> POJOs that contain other POJOs -> etc.).
Passing several basic types as parameters of the same method is not good because the parameters themselves don't tell much and you can easily exchange two parameters of the same type by accident in the method call.
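As a rough sketch of the reflection-plus-annotations idea, the code below defines two hypothetical annotations (NotEmpty and Range), marks the fields of an example POJO with them, and validates any object by iterating over its declared fields. All names here are invented for the illustration; existing validation frameworks offer similar annotations, but none is used here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker: the field must not be null or an empty string.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface NotEmpty {}

// Hypothetical marker: a numeric field must lie within [min, max].
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Range {
    long min();
    long max();
}

// Example POJO; the validation rules are declared next to the attributes.
class Customer {
    @NotEmpty String name;
    @Range(min = 0, max = 150) int age;

    Customer(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class AnnotationValidator {
    // Applies the same rule to every field that carries the same annotation.
    static List<String> validate(Object pojo) {
        List<String> errors = new ArrayList<>();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(pojo);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (field.isAnnotationPresent(NotEmpty.class)
                    && (value == null || value.toString().isEmpty())) {
                errors.add(field.getName() + " must not be empty");
            }
            Range range = field.getAnnotation(Range.class);
            if (range != null && value instanceof Number) {
                long n = ((Number) value).longValue();
                if (n < range.min() || n > range.max()) {
                    errors.add(field.getName() + " is out of range");
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(new Customer("", 200)));
    }
}
```

Because the rules travel with the POJO, there is no long argument list in which two parameters of the same type could be swapped by accident.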
Currently I have a simple pig script which reads from a file on a hadoop fs, as
my_input = load 'input_file' as (A, B, C)
and then I have another line of code which needs to manipulate the fields, like for instance convert them to uppercase (as in the Pig UDF tutorial).
I do something like,
manipulated = FOREACH my_input GENERATE myudf.Upper(A, B, C)
Now in my Upper.java file I know that I can get the value of A, B, C as (assuming they are all Strings)
public String exec(Tuple input) throws IOException
{
//yada yada yada
....
String A = (String) input.get(0);
String B = (String) input.get(1);
String C = (String) input.get(2);
//yada yada yada
....
}
Is there any way I can get the value of a field by its name? For instance, if I need to get 10 fields, is there no other way than to call input.get(i) for i from 0 to 9?
I am new to Pig, so I am interested in knowing why this is the case. Is there something like a tuple.getByFieldName('Field Name')?
This is not possible, nor would it be very good design to allow it. Pig field names are like variable names. They allow you to give a memorable name to something that gives you insight into what it means. If you use those names in your UDF, you are forcing every Pig script which uses the UDF to adhere to the same naming scheme. If you decide later that you want to think of your variables a little differently, you can't reflect that in their names because the UDF would not function anymore.
The code that reads data from the input tuple in your UDF is like a function declaration. It establishes how to treat each argument to the function.
If you really want to be able to do this, you can build a map easily enough using the TOMAP builtin function, and have your UDF read from the map. This greatly hurts the reusability of your UDF for the reasons mentioned above, but it is nevertheless a fairly simple workaround.
While I agree that function flexibility would be affected if you use field names, technically it is possible to access fields by names.
The trick is to use inputSchema available through getInputSchema() and get the mapping between field indexes and names from there. You can also override outputSchema and build the mapping there, using inputSchema parameter. Then you would be able to use this mapping in your exec method.
I don't think you can access a field by name. You need a map-like structure to achieve that. In Pig's context, even though you cannot do it by name, you can still rely on position if the schema of the input (load) is properly defined and consistent.
The most you can do is validate the types of the fields you are ingesting in the UDF.
On the other hand, you can implement "outputSchema" in your UDF to publish its output by name.
UDF Manual
I have a block of static data that I need to organize into an array containing hash maps. Specifically, I want to have a static object in my app that contains the time zone information like this: https://gist.github.com/pamelafox/986163
Seeing how clean the definition looks in Python, and knowing that a similarly clean definition can be written in some of the other languages I know, I was hoping there is a cleaner approach in Java than just calling map.put(...) repeatedly. I have seen this question: How to give the static value to HashMap? but I was wondering if there is a better way to do it.
One solution would be to store the data as a normal string in whatever format you can think of and then convert the string representation into the map (static, non-static or as a one-time initialized instance).
An improvement on this method would be to store the data in a file and load it (the file can be included in the .jar package, if you use one). This has the advantage that the data can be updated easily.
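A minimal sketch of the string-based variant, assuming a simple one-entry-per-line "key=value" format. The format, the class name StaticMapFromString and the three sample entries are choices made for this illustration, not the data from the gist.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

class StaticMapFromString {
    // Illustrative sample data; the real data could equally be read from a file.
    private static final String RAW =
            "UTC=Coordinated Universal Time\n"
          + "CET=Central European Time\n"
          + "JST=Japan Standard Time";

    // Parsed once when the class is loaded, then shared read-only.
    static final Map<String, String> ZONES = parse(RAW);

    // Converts "key=value" lines into an unmodifiable, insertion-ordered map.
    static Map<String, String> parse(String raw) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : raw.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate blank lines
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + line);
            }
            result.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return Collections.unmodifiableMap(result);
    }

    public static void main(String[] args) {
        System.out.println(ZONES.get("CET"));
    }
}
```

The same parse method works on text loaded from a file, so moving the data out of the source code later only changes where RAW comes from.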
I am quite new to Java and am currently working on a not-so-simple web browser application in which I would like to record a permanent history file, using a 2D array with 3 columns containing "Date Viewed", "URL", and "How many times this URL has been viewed before".
Currently I have a temporary solution that only saves "URL", which is also used for the "Back" and "Forward" features, using an ArrayList.
private List tempHistory = new ArrayList();
I am reading through the Java documentation but I cannot put together a solution. Unless I am missing the obvious, is there no 2D array as flexible as an ArrayList, like in Python?
From your description it doesn't sound like you need a 2D array. You just have one dimension -- but of complex data types, right?
So define a HistoryItem class or something with a Date property for date viewed, URL for URL, int for view count.
Then you just want a List<HistoryItem> history = new ArrayList<HistoryItem>().
The reason I don't think you really want a 2D array-like thing is that it could only hold one data type, and you clearly have several data types at work here, like a date and a count. But if you really want a table-like abstraction, try Guava's Table.
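A minimal sketch of that HistoryItem approach. The URL is kept as a plain String to keep the example short, and the BrowserHistory wrapper with its record method is a made-up helper that fills in the "viewed before" count.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// One row of the history: three columns of three different types.
class HistoryItem {
    final Date dateViewed;
    final String url;      // a String here for brevity (assumption)
    final int viewCount;   // how many times this URL had been viewed before

    HistoryItem(Date dateViewed, String url, int viewCount) {
        this.dateViewed = dateViewed;
        this.url = url;
        this.viewCount = viewCount;
    }
}

// Hypothetical wrapper around the one-dimensional list of items.
class BrowserHistory {
    private final List<HistoryItem> history = new ArrayList<HistoryItem>();

    // Appends a visit, counting earlier visits to the same URL.
    void record(String url, Date when) {
        int previousViews = 0;
        for (HistoryItem item : history) {
            if (item.url.equals(url)) {
                previousViews++;
            }
        }
        history.add(new HistoryItem(when, url, previousViews));
    }

    List<HistoryItem> items() {
        return history;
    }

    public static void main(String[] args) {
        BrowserHistory h = new BrowserHistory();
        h.record("http://example.com", new Date());
        h.record("http://example.com", new Date());
        System.out.println(h.items().get(1).viewCount);
    }
}
```

The list keeps visits in order, so it can still back the "Back" and "Forward" features.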
No, there is no built-in 2D array type in Java (unless you use primitive arrays).
You could just use a list of lists (List<List>) - however, I think it is almost always better to use a custom type that you put into the list. In your case, you'd create a class HistoryEntry (with fields for "Date viewed", URL etc.), and use List<HistoryEntry>. That way, you get all the benefits a proper type gives you (typechecking, completion in an IDE, ability to put methods into the class etc.).
How do you plan to browse the history afterwards? If you want to search the history by URL later on, the ArrayList approach might not be efficient.
I would prefer a Map with the URL as key.
Map<Url,UrlHistory> browseHistory = new HashMap<Url,UrlHistory>();
UrlHistory will contain all the fields you want to associate with a URL, such as the number of times the page was accessed.
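A rough sketch of the map-based variant. The key is a plain String here rather than a dedicated Url type, which is an assumption made to keep the example self-contained; HistoryIndex and its methods are invented names.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Everything associated with one URL.
class UrlHistory {
    int timesViewed;
    Date lastViewed;
}

// Hypothetical wrapper giving direct lookup by URL.
class HistoryIndex {
    private final Map<String, UrlHistory> browseHistory = new HashMap<String, UrlHistory>();

    // Creates the entry on the first visit, then updates it in place.
    void recordVisit(String url, Date when) {
        UrlHistory entry = browseHistory.get(url);
        if (entry == null) {
            entry = new UrlHistory();
            browseHistory.put(url, entry);
        }
        entry.timesViewed++;
        entry.lastViewed = when;
    }

    // Returns 0 for URLs that were never visited.
    int timesViewed(String url) {
        UrlHistory entry = browseHistory.get(url);
        return entry == null ? 0 : entry.timesViewed;
    }

    public static void main(String[] args) {
        HistoryIndex index = new HistoryIndex();
        index.recordVisit("http://example.com", new Date());
        index.recordVisit("http://example.com", new Date());
        System.out.println(index.timesViewed("http://example.com"));
    }
}
```

Note that a map does not preserve visit order, so the back/forward list would still be kept separately.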
The Facts
I have the following datastructure consisting of a table and a list of attributes (simplified):
class Table {
List<Attribute> m_attributes;
}
abstract class Attribute {}
class LongAttribute extends Attribute {}
class StringAttribute extends Attribute {}
class DateAttribute extends Attribute {}
...
Now I want to do different actions with this datastructure:
print it in XML notation
print it in textual form
create an SQL insert statement
create an SQL update statement
initialize it from a SQL result set
First Try
My first attempt was to put all this functionality inside the Attribute class, but then Attribute was overloaded with very different responsibilities.
Alternative
It feels like the visitor pattern could do the job very well instead, but on the other hand it looks like overkill for this simple structure.
Question
What's the most elegant way to solve this?
I would look at using a combination of JAXB and Hibernate.
JAXB will let you marshal and unmarshal from XML. By default, properties are converted to elements with the same name as the property, but that can be controlled via @XmlElement and @XmlAttribute annotations.
Hibernate (or JPA) are the standard ways of moving data objects to and from a database.
The Command pattern comes to mind, or a small variation of it.
You have a bunch of classes, each of which is specialized to do a certain thing with your data class. You can keep these classes in a hashmap or some other structure where an external choice can pick one for execution. To do your thing, you call the selected Command's execute() method with your data as an argument.
Edit: Elaboration.
At the bottom level, you need to do something with each attribute of a data row.
This indeed sounds like a case for the Visitor pattern: Visitor simulates a double
dispatch operation, insofar as you are able to combine a variable "victim" object
with a variable "operation" encapsulated in a method.
Your attributes all want to be xml-ed, text-ed, insert-ed updat-ed and initializ-ed.
So you end up with a matrix of 5 x 3 classes to do each of these 5 operations
to each of 3 attribute types. The rest of the machinery of the visitor pattern
will traverse your list of attributes for you and apply the correct visitor for
the operation you chose in the right way for each attribute.
Writing 15 classes plus interface(s) does sound a little heavy. You can do this
and have a very general and flexible solution. On the other hand, in the time
you've spent thinking about a solution, you could have hacked together the code
to it for the currently known structure and crossed your fingers that the shape
of your classes won't change too much too often.
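For a sense of scale, here is a minimal sketch of the visitor shape for this structure, reduced to two attribute
types and two operations. It uses the common Java layout of one visitor class per operation with one method per
attribute type, rather than one class per cell of the matrix. The fields name and value, the visitor names and the
render method are assumptions added for the illustration, and no XML escaping is attempted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// One method per attribute type; R is the result type of the operation.
interface AttributeVisitor<R> {
    R visitLong(LongAttribute a);
    R visitString(StringAttribute a);
}

abstract class Attribute {
    final String name;
    Attribute(String name) { this.name = name; }
    // Each subclass dispatches to the visitor method for its own type.
    abstract <R> R accept(AttributeVisitor<R> visitor);
}

class LongAttribute extends Attribute {
    final long value;
    LongAttribute(String name, long value) { super(name); this.value = value; }
    <R> R accept(AttributeVisitor<R> visitor) { return visitor.visitLong(this); }
}

class StringAttribute extends Attribute {
    final String value;
    StringAttribute(String name, String value) { super(name); this.value = value; }
    <R> R accept(AttributeVisitor<R> visitor) { return visitor.visitString(this); }
}

// Operation 1: XML notation (no escaping, to keep the sketch short).
class XmlVisitor implements AttributeVisitor<String> {
    public String visitLong(LongAttribute a) {
        return "<" + a.name + ">" + a.value + "</" + a.name + ">";
    }
    public String visitString(StringAttribute a) {
        return "<" + a.name + ">" + a.value + "</" + a.name + ">";
    }
}

// Operation 2: SQL literal for an insert statement; quotes are doubled.
class SqlLiteralVisitor implements AttributeVisitor<String> {
    public String visitLong(LongAttribute a) {
        return String.valueOf(a.value);
    }
    public String visitString(StringAttribute a) {
        return "'" + a.value.replace("'", "''") + "'";
    }
}

class Table {
    final List<Attribute> m_attributes;
    Table(List<Attribute> attributes) { this.m_attributes = attributes; }

    // Applies one operation to every attribute in order.
    List<String> render(AttributeVisitor<String> visitor) {
        List<String> out = new ArrayList<String>();
        for (Attribute a : m_attributes) {
            out.add(a.accept(visitor));
        }
        return out;
    }

    public static void main(String[] args) {
        Table t = new Table(Arrays.<Attribute>asList(
                new LongAttribute("id", 7L), new StringAttribute("name", "Ann")));
        System.out.println(t.render(new XmlVisitor()));
    }
}
```

Adding an operation means adding one visitor class; adding an attribute type means adding one method to every
visitor, which the compiler will then enforce.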
Where I thought of the command pattern was for choosing among a variety of similar
operations. If the operation to be performed came in as a String, perhaps in a
script or configuration file or such, you could then have a mapping from
"xml" -> XmlifierCommand
"text" -> TextPrinterCommand
"serial" -> SerializerCommand
...where each of those Commands would then fire up the appropriate Visitor to do
the job. But as the operation is more likely to be determined in code, you probably
don't need this.
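If the operation name did arrive as a String, the mapping could look roughly like the sketch below. The Command
interface, the CommandRegistry class and the placeholder command bodies are assumptions for illustration; in a
fuller version each command would start the matching visitor instead of formatting a string directly.

```java
import java.util.HashMap;
import java.util.Map;

// A command takes the data and performs one operation on it.
interface Command {
    String execute(String data);
}

// Placeholder bodies; the data is a plain String to keep the sketch short.
class XmlifierCommand implements Command {
    public String execute(String data) {
        return "<value>" + data + "</value>";
    }
}

class TextPrinterCommand implements Command {
    public String execute(String data) {
        return "value: " + data;
    }
}

// Hypothetical registry holding the name -> command mapping.
class CommandRegistry {
    private final Map<String, Command> commands = new HashMap<String, Command>();

    void register(String name, Command command) {
        commands.put(name, command);
    }

    // Picks the command by name, e.g. a name read from a configuration file.
    String run(String name, String data) {
        Command command = commands.get(name);
        if (command == null) {
            throw new IllegalArgumentException("Unknown operation: " + name);
        }
        return command.execute(data);
    }

    static CommandRegistry defaults() {
        CommandRegistry registry = new CommandRegistry();
        registry.register("xml", new XmlifierCommand());
        registry.register("text", new TextPrinterCommand());
        return registry;
    }

    public static void main(String[] args) {
        System.out.println(defaults().run("xml", "42"));
    }
}
```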
I dunno why you'd store stuff in a database yourself these days instead of just using hibernate, but here's my call:
LongAttribute, DateAttribute, StringAttribute, ... all have different internals (i.e. fields specific to them that are not present in the Attribute class), so you cannot create one generic method to serialize them all. XML, SQL and plain text also each have different properties as serialization targets. There's really no way you can avoid writing O(#subclasses of Attribute * #output formats) different serialization methods.
Visitor is not a bad pattern for serializing. True, it's a bit overkill if used on non-recursive structures, but a random programmer reading your code will immediately grasp what it is doing.
Now for deserialization (from XML to object, from SQL to object) you need a Factory.
One more hint: for the SQL update you probably want something that takes the old version of the object and the new version of the object, and creates an update query only for the differences between them.
In the end, I used the visitor pattern. Now looking back, it was a good choice.