Indexing a simple Java Record - java

I have a Java object, Record. It represents a single record returned by an SQL query. Can CQEngine index a collection of Record objects?
My class is of the form
public class Record {
private List<String> columnNames;
private List<Object> values;
... Other getters
}
I have looked through some examples, but had no luck there.
I want to index only specific columns, each by its name and corresponding value. Can this be achieved using CQEngine, or are there any alternatives that achieve the same?
Thanks.

That seems to be a strange way to model data, but you can use CQEngine with that model if you wish.
(First off, CQEngine will have no use for your column names so you can remove that field.)
To do this, you will need to define a CQEngine virtual attribute for each of the indexes in your list of values.
Each attribute will need to be declared with the data type which will be stored in that column/index, and will need to be able to cast the object at that index in your list of values, to the appropriate data type (String, Double, Integer etc.).
So let's say your Record has a column called 'price', which is of type Double, and is stored at index 5 in the list of values. You could define an attribute which reads it as follows:
public static final Attribute<Record, Double> PRICE =
attribute("PRICE", record -> (Double) record.values.get(5));
If this sounds complicated, it's because that way of modelling data makes things a bit complicated :) It's usually easier to work with a data model which leverages the Java type system (which your model does not). As such, you will need to keep track of the data types etc. of each field programmatically yourself.
CQEngine itself will work fine with that model though, because at the end of the day CQEngine attributes don't need to read fields, the attributes are just functions which are programmed to fetch values.
There's a bunch of stuff not covered above. For example, can your values be null? (If so, you should use the nullable variety of attributes as discussed in the CQEngine docs.) Or, might each of your Record objects have different sets of columns? (If so, you can create attributes on-the-fly when you encounter a new column, but you should probably cache the attributes you have created somewhere.)
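The two open points above (null values, and records whose column sets differ) can be sketched in plain Java. This is not CQEngine API: the simplified Record, the ColumnAttributes helper and its forColumn method are hypothetical names used for illustration. The sketch only shows that an attribute is a function from a record to a value, and that such functions can be created on demand and cached per column name.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Simplified stand-in for the Record class in the question (assumed shape).
class Record {
    final List<String> columnNames;
    final List<Object> values;

    Record(List<String> columnNames, List<Object> values) {
        this.columnNames = columnNames;
        this.values = values;
    }
}

// Hypothetical helper: builds and caches one reader function per column name.
class ColumnAttributes {
    private final Map<String, Function<Record, Optional<Object>>> cache = new HashMap<>();

    // The returned function looks the column up by name in each record, so
    // records with different column sets are handled. A missing column or a
    // null value both come back as Optional.empty().
    Function<Record, Optional<Object>> forColumn(String columnName) {
        return cache.computeIfAbsent(columnName, name -> record -> {
            int i = record.columnNames.indexOf(name);
            return i < 0 ? Optional.empty() : Optional.ofNullable(record.values.get(i));
        });
    }
}

class ColumnAttributesDemo {
    public static void main(String[] args) {
        ColumnAttributes attributes = new ColumnAttributes();
        Record withPrice = new Record(Arrays.asList("id", "price"), Arrays.<Object>asList(1, 9.5));
        Record withoutPrice = new Record(Arrays.asList("id"), Arrays.<Object>asList(2));

        Function<Record, Optional<Object>> price = attributes.forColumn("price");
        System.out.println(price.apply(withPrice));
        System.out.println(price.apply(withoutPrice));
    }
}
```

In real CQEngine code the cached objects would be CQEngine attributes rather than plain Function instances; the caching idea is the same.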
Hope that helps,
Niall (CQEngine author)

Related

Best way to represent set of string values in DB and UI

I'm looking for opinions, so I guess this is a 'which is better' question. I have a webapp built in JavaScript/jQuery and Struts that uses Hibernate to access data in a relational DB (MySQL). When an object/database field has a limited set of strings for values, is it better to use the full string in the object/DB or a 'code' for that string, like a single CHAR instead of the entire string?
class User {
int id;
String userName;
String type; // Values of 'Administrator', 'Regular'
// OR
char type; // Values of 'A', 'R'
// OR
char type; // Values of 'A', 'R'
String typeString; // Can be returned on the fly based on 'type' or by DB in SQL CASE statement
}
If the database has the full text string, then it's easy coding all the way around, but it wastes space (in the DB and in data transfer) on something that only has a few values.
If the database has just a 'code', then when presenting this field to a user (like in a grid of existing users, or a dropdown selection list when creating a new user) the char value must be converted to the full string. Then the question is: where should that conversion be done? It could be at the DB level, where Hibernate can fill in the full string value from a CASE statement. This saves DB space, but not data transfer or memory. It could be at the object level, where it's done in the getter/setter for the 'type' field. Or it could be all the way in the GUI, where JavaScript converts the 'char' to the appropriate string for the user to see.
Also... if either method is OK to use, what might influence the choice you make? The number of different values? The max length of the strings? How many rows are expected in the table?
I'm sure every DB/programmer has come across this situation many times and probably has a preference.
If you only have a fixed set of user types like Admin and Regular, I think it will be easier to use a static HashMap in your code and just store A and R in the database. Something like:
static HashMap<Character, String> userRoles = new HashMap<>();
static {
// Keys are char literals so they match the Character key type.
userRoles.put('A', "Admin");
userRoles.put('R', "Regular");
}
Whenever you get a result from the DB, you can just call userRoles.get(type) to look up the actual type. This saves space and is also readable.
I would put the full name in the database alongside an associated short code or ID in some kind of lookup table. Use the shortcode/ID as the primary key for the lookup table, and as a foreign key from other tables. If someone needs to investigate the database layer, or someone needs to use the database for reporting, data warehousing, or analytics this will simplify things greatly.
It's commonly seen as bad practice to name variables, database tables, database columns, functions, etc. with unclear names or abbreviations that not everyone will understand - short codes like this should be seen the same way.
I think it's better to do the conversion from the type code to the type (and vice versa) as close to the database interaction as possible, in this case in Hibernate. This is because your application logic becomes more readable and intuitive if it uses the explicit types.
In my opinion, if(BMW.equals(carTypeCode)) {} is a lot more readable than if("X".equals(carTypeCode)) {}.
I am not very familiar with Hibernate, but it would be awesome if you could leverage Hibernate for the mapping of String to DB representation and vice versa (maybe using CASE as you mentioned). Personally, I would probably have modeled these Strings as enums and used something like Hibernate Enum Type mapping. Also, you should think about making these type codes a little bit readable by making them at least few chars because these may come in handy when you are debugging some issue by looking at DB dump and you don't have to consult your type-code to type conversion chart.
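A minimal sketch of the enum idea, assuming the two user types from the question. The UserType enum, its one-character codes and the fromCode method are hypothetical names; how the enum is wired into Hibernate (for example through an enum type mapping or a JPA AttributeConverter) is left out here and would depend on your setup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical enum: the short code is what the database column would hold,
// the label is what the UI would show.
enum UserType {
    ADMINISTRATOR('A', "Administrator"),
    REGULAR('R', "Regular");

    // Lookup table from database code to enum constant, built once.
    private static final Map<Character, UserType> BY_CODE = new HashMap<>();
    static {
        for (UserType type : values()) {
            BY_CODE.put(type.code, type);
        }
    }

    private final char code;
    private final String label;

    UserType(char code, String label) {
        this.code = code;
        this.label = label;
    }

    char getCode() {
        return code;
    }

    String getLabel() {
        return label;
    }

    // Converts a code read from the database back into the enum constant.
    static UserType fromCode(char code) {
        UserType type = BY_CODE.get(code);
        if (type == null) {
            throw new IllegalArgumentException("Unknown user type code: " + code);
        }
        return type;
    }
}

class UserTypeDemo {
    public static void main(String[] args) {
        UserType type = UserType.fromCode('A');
        System.out.println(type + " -> " + type.getLabel());
    }
}
```

Application code then compares against UserType.ADMINISTRATOR instead of a bare 'A', while the database still stores a single character.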
Performance-wise, I don't think either approach would have much impact in the average case.

4 Key Value HashMap? Array? Best Approach?

I've got loads of the following to implement.
validateParameter(field_name, field_type, field_validationMessage, visibleBoolean);
Instead of having 50-60 of these in a row, is there some form of nested hashmap/4d array I can use to build it up and loop through them?
What's the best approach for doing something like that?
Thanks!
EDIT: Was 4 items.
What you could do is create a new Class that holds three values. (The type, the boolean, and name, or the fourth value (you didn't list it)). Then, when creating the HashMap, all you have to do is call the method to get your three values. It may seem like more work, but all you would have to do is create a simple loop to go through all of the values you need. Since I don't know exactly what it is that you're trying to do, all I can do is provide an example of what I'm trying to do. Hope it applies to your problem.
Anyways, creating the Class to hold the three(or four) values you need.
For example,
class Field {
String field_name;
Integer field_type;
Boolean validationMessageVisible;
Field(String name, Integer type, Boolean mv) {
this.field_name = name;
this.field_type = type;
this.validationMessageVisible = mv;
}
}
Then put them in a HashMap somewhat like this:
Map<String, Field> map = new HashMap<>();
map.put(LOCAL STRING FOR NAME OF FIELD, new Field(LOCAL STRING FOR NAME OF FIELD, YOUR INTEGER, YOUR BOOLEAN));
NOTE: This is only going to work as long as these three or four values can all be stored together. For example if you need all of the values to be stored separately for whatever reason it may be, then this won't work. Only if they can be grouped together without it affecting the function of the program, that this will work.
This was a quick brainstorm. Not sure if it will work, but think along these lines and I believe it should work out for you.
You may have to make a few edits, but this should get you in the right direction
P.S. Sorry for it being so wordy, just tried to get as many details out as possible.
The other answer is close but you don't need a key in this case.
Just define a class to contain your three fields. Create a List or array of that class. Loop over the list or array calling the method for each combination.
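A minimal sketch of that list-and-loop approach. The ParameterSpec class is a hypothetical name, the parameter types are assumptions (the question does not state them), and validateParameter here is only a stand-in that records each call so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the four arguments from the question.
class ParameterSpec {
    final String fieldName;
    final String fieldType;          // assumed to be a String
    final String validationMessage;
    final boolean visible;

    ParameterSpec(String fieldName, String fieldType, String validationMessage, boolean visible) {
        this.fieldName = fieldName;
        this.fieldType = fieldType;
        this.validationMessage = validationMessage;
        this.visible = visible;
    }
}

class ValidationDemo {
    // Records each call so the effect of the loop can be inspected.
    static final List<String> calls = new ArrayList<>();

    // Stand-in for the asker's real validateParameter method.
    static void validateParameter(String name, String type, String message, boolean visible) {
        calls.add(name + ":" + type + ":" + visible);
    }

    // One loop replaces the 50-60 hand-written calls.
    static void validateAll(List<ParameterSpec> specs) {
        for (ParameterSpec spec : specs) {
            validateParameter(spec.fieldName, spec.fieldType, spec.validationMessage, spec.visible);
        }
    }

    public static void main(String[] args) {
        List<ParameterSpec> specs = Arrays.asList(
                new ParameterSpec("email", "string", "Email is required", true),
                new ParameterSpec("age", "int", "Age must be a number", false));
        validateAll(specs);
        System.out.println(calls);
    }
}
```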
The approach I'd use is to create a POJO (or some POJOs) to store the values as attributes and validate attribute by attribute.
Since you will often have the same validation per attribute type (e.g. dates and numbers can be validated by range, strings can be validated to ensure they're not null or empty, etc.), you could just iterate over these attributes using reflection (or even better, using annotations).
If you need to validate at the POJO level, you can still reuse these attribute-level validators via composition, adding more specific validations as you go up in the abstraction level (going up means basic attributes -> POJOs -> POJOs that contain other POJOs -> etc.).
Passing several basic types as parameters of the same method is not good because the parameters themselves don't tell much and you can easily exchange two parameters of the same type by accident in the method call.

Storing various object types in single column of database

I am working in Java. I have a class called Command. This class stores a variable List of parameters that are primitives (mostly int and double). The type, number, and order of parameters are specific to each command, so the List is of type Object. I won't ever query the table based on what these parameter values are, so I figured I would concatenate them into a single String or serialize them in some way. I think this may be a better approach than normalizing the table, because I would have to join every time and that table would grow huge pretty quickly. (Edit: The Command object also stores some other members that won't be serialized, such as a String to identify the type of command and a Timestamp for when it was issued.)
So I have 2 questions:
Should I turn them into a delimited String? If so, how do I get each object as a String without knowing which type to cast them to? I attempted to loop through and use the .toString method, but that is not working. It seems to be returning null.
Or is there some way to just serialize that data of the array into a column of the DB? I read about serialization and it seems to be for the context of serializing whole classes.
I would use a JSON serializer/deserializer like Jackson to store and retrieve those command objects in the DB without losing the specific type information. On a side note, I would have these commands implement a common interface and store them in a list of commands, not in a list of objects.
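If adding a JSON library is not an option, the delimited-string idea from the question can also be made to keep type information by prefixing each value with a type tag. The following is only a sketch under that assumption, not the Jackson approach recommended above: ParameterCodec and the "I:" / "D:" tags are hypothetical, and only Integer and Double parameters are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical codec: writes each parameter as <tag>:<value>, comma-separated.
class ParameterCodec {

    static String encode(List<Object> params) {
        StringBuilder sb = new StringBuilder();
        for (Object p : params) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            if (p instanceof Integer) {
                sb.append("I:").append(p);
            } else if (p instanceof Double) {
                sb.append("D:").append(p);
            } else {
                // Null or any other type is rejected rather than silently lost.
                throw new IllegalArgumentException("Unsupported parameter: " + p);
            }
        }
        return sb.toString();
    }

    static List<Object> decode(String encoded) {
        List<Object> params = new ArrayList<>();
        if (encoded.isEmpty()) {
            return params;
        }
        for (String token : encoded.split(",")) {
            String value = token.substring(2); // skip the tag and the colon
            switch (token.charAt(0)) {
                case 'I':
                    params.add(Integer.valueOf(value));
                    break;
                case 'D':
                    params.add(Double.valueOf(value));
                    break;
                default:
                    throw new IllegalArgumentException("Unknown type tag in: " + token);
            }
        }
        return params;
    }
}

class ParameterCodecDemo {
    public static void main(String[] args) {
        String stored = ParameterCodec.encode(Arrays.<Object>asList(3, 2.5, -7));
        System.out.println(stored);
        System.out.println(ParameterCodec.decode(stored));
    }
}
```

The trade-off is that every new parameter type needs a new tag and branch, which is the bookkeeping a JSON library with type information would otherwise do for you.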

java dynamic attributes table

I am developing a java application which needs a special component for dynamic attributes. The arguments are serialized (using JSON) and stored in a database and then deserialized at runtime. All attributes are displayed in a JTable with 3 columns (attribute name, attribute type and attribute value) and stored in a hashmap.
I have currently two problems to solve:
The hashmap can also store objects, and the objects can be set to null. If an object is set to null, I don't know which class it belongs to. How could I store objects even when they are null and still know which class they belong to? Do I need to wrap each object in a class that holds the class of the stored object?
The objects are deserialized from JSON at runtime. The problem with this is that there are many different types of objects, and I don't actually know all the object types that will be stored in the hashmap. So I am looking for a way to dynamically deserialize objects. Is there such a way? Would I have to store the class of the object in the serialized JSON string?
Thanks!
Take a look at the Null Object Pattern. You can use an extra class to represent a null instance of your type, and it can still carry information about itself.
There is also something called a class token, which is the use of Class objects as keys for heterogeneous containers. Take a look at Effective Java by Joshua Bloch, Item 29. I'm not sure how well this approach would work for you, since you may have many instances of the same type, but I leave it as a reference.
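A minimal sketch of the wrapper idea raised in the question, combined with a class token: each entry stores its Class alongside a possibly-null value. TypedValue and AttributeTable are hypothetical names, and the map is keyed by attribute name rather than by Class, since several attributes may share a type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: remembers the declared class even when the value is null.
final class TypedValue<T> {
    private final Class<T> type;
    private final T value; // may be null

    TypedValue(Class<T> type, T value) {
        if (type == null) {
            throw new NullPointerException("type must not be null");
        }
        this.type = type;
        this.value = value;
    }

    Class<T> getType() {
        return type;
    }

    T getValue() {
        return value;
    }
}

// Hypothetical attribute store keyed by attribute name.
class AttributeTable {
    private final Map<String, TypedValue<?>> attributes = new HashMap<>();

    <T> void put(String name, Class<T> type, T value) {
        attributes.put(name, new TypedValue<>(type, value));
    }

    // Returns the declared class of the attribute, or null if it is unknown.
    Class<?> typeOf(String name) {
        TypedValue<?> entry = attributes.get(name);
        return entry == null ? null : entry.getType();
    }

    // Class.cast keeps the read type-safe; it returns null for a null value.
    <T> T get(String name, Class<T> type) {
        TypedValue<?> entry = attributes.get(name);
        return entry == null ? null : type.cast(entry.getValue());
    }
}

class AttributeTableDemo {
    public static void main(String[] args) {
        AttributeTable table = new AttributeTable();
        table.put("createdAt", java.util.Date.class, null);
        table.put("retries", Integer.class, 5);
        System.out.println(table.typeOf("createdAt") + " / " + table.get("retries", Integer.class));
    }
}
```

For the JSON side, the stored class name could be written next to the value so the right type can be chosen when deserializing; that part is not shown here.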
First of all, can you explain why you use JSON serialization for your attributes?
In my opinion this method is disadvantageous in many ways: it can cause problems with database search and indexing, make viewing the database painful, and cause unnecessary code in your application. These problems may not be an issue; it depends on how you want to use your attributes.
My solution for situations like these is a simple table containing columns like:
id - int
attribute_name - varchar
And then add columns for each supported data type:
string_value - varchar
integer_value - int
date_value - date
... and any other types you want.
This design allows for good performance using a simple and typesafe ORM mapping, without any serialization or other boilerplate. It can store values of any supported type: you just set the column that matches the attribute's type and leave all the others null. You can represent a null value by using null in all data columns. Indexing and searching also become a piece of cake.
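On the Java side, a row of such a table could be modelled roughly as follows. This is a sketch only: AttributeRow, its field names and the choice of LocalDate for the date column are assumptions, and ORM annotations are left out so the example stays self-contained.

```java
import java.time.LocalDate;

// Hypothetical row class mirroring the table: one nullable field per typed column.
class AttributeRow {
    int id;
    String attributeName;
    String stringValue;    // string_value column
    Integer integerValue;  // integer_value column
    LocalDate dateValue;   // date_value column

    // Fills the column matching the runtime type; a null value leaves all columns null.
    static AttributeRow of(int id, String attributeName, Object value) {
        AttributeRow row = new AttributeRow();
        row.id = id;
        row.attributeName = attributeName;
        if (value instanceof String) {
            row.stringValue = (String) value;
        } else if (value instanceof Integer) {
            row.integerValue = (Integer) value;
        } else if (value instanceof LocalDate) {
            row.dateValue = (LocalDate) value;
        } else if (value != null) {
            throw new IllegalArgumentException("Unsupported type: " + value.getClass().getName());
        }
        return row;
    }

    // Returns whichever typed column is set, or null if none is.
    Object getValue() {
        if (stringValue != null) {
            return stringValue;
        }
        if (integerValue != null) {
            return integerValue;
        }
        return dateValue;
    }
}

class AttributeRowDemo {
    public static void main(String[] args) {
        AttributeRow age = AttributeRow.of(1, "age", 42);
        AttributeRow unset = AttributeRow.of(2, "nickname", null);
        System.out.println(age.getValue() + " / " + unset.getValue());
    }
}
```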

solrj: how to store and retrieve List<POJO> via multivalued field in index

My use case is an index which holds titles of online media. The provider of the data associates a list of categories with each title. I am using SolrJ to populate the index via an annotated POJO class
e.g.
@Field("title")
private String title;
@Field("categories")
private List<Category> categoryList;
The associated POJO is
public class Category {
private Long id;
private String name;
...
}
My question has two parts:
a) is this possible via SolrJ - the docs only contain an example of @Field using a List of String, so I assume the serialization/marshalling only supports simple types?
b) how would I set up the schema to hold this. I have a naive assumption I just need to set
multiValued=true on the required field & it will all work by magic.
I'm just starting to implement this so any response would be highly appreciated.
The answer is as you thought:
a) You have only simple types available, so you will have a List of a single simple type, e.g. String. The point is that you can't represent complex types inside the Lucene document, so you won't be able to deserialize them either.
b) The problem is that you are trying to apply relational thinking to a "document store". That will probably work only up to a point. If you want to represent categories inside a Lucene document, just use the name string; it is not necessary to store an id as well.
The only reason to store an id as well is if you want to do a lookup in an RDBMS alongside the search. If you want to do this, you need to make sure that the id and the category name stay linked. This does not work for every 1:n relation. (It is possible for every 1:n relation where the related table consists only of required fields. If you have an optional field, you need to put something like an empty filler constant in the field, if possible.)
However, if these 1:n relations are not sparse, it is actually possible, provided you maintain the order in which you add fields to the document. So the category relation can probably be represented if you don't sort the lists.
You may implement a method which returns a Category instantiated from the values at a given position 0...n. So if you want the first category, its values will be at position 0 of every list related to this category.
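The position-based idea can be sketched as two parallel lists that are always appended to together. TitleDocument, categoryIds and categoryNames are hypothetical names; in a real SolrJ bean each list would carry its own @Field annotation on a multiValued field, which is omitted here so the sketch compiles without SolrJ. It also relies on the assumption stated above that the order of values is preserved.

```java
import java.util.ArrayList;
import java.util.List;

class Category {
    final Long id;
    final String name;

    Category(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical document bean: one simple-typed list per Category field.
class TitleDocument {
    String title;
    final List<Long> categoryIds = new ArrayList<>();
    final List<String> categoryNames = new ArrayList<>();

    // Always append to both lists together so positions stay aligned.
    void addCategory(Category category) {
        categoryIds.add(category.id);
        categoryNames.add(category.name);
    }

    // Rebuilds the Category stored at the given position in both lists.
    Category getCategory(int position) {
        return new Category(categoryIds.get(position), categoryNames.get(position));
    }

    List<Category> getCategories() {
        if (categoryIds.size() != categoryNames.size()) {
            throw new IllegalStateException("Category lists are out of sync");
        }
        List<Category> categories = new ArrayList<>();
        for (int i = 0; i < categoryIds.size(); i++) {
            categories.add(getCategory(i));
        }
        return categories;
    }
}

class TitleDocumentDemo {
    public static void main(String[] args) {
        TitleDocument doc = new TitleDocument();
        doc.title = "Some title";
        doc.addCategory(new Category(10L, "Drama"));
        doc.addCategory(new Category(20L, "Comedy"));
        System.out.println(doc.getCategory(1).name);
    }
}
```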
