Non-serializing in-memory database

I have the following problem:
There is a Set<C> s of objects of class C. C is defined as follows:
class C {
A a;
B b;
...
}
Given A e, B f, ..., I want to find from s all objects o such that o.a = e, o.b = f, ....
Simplest solution: stream over s, filter, collect, return. But that takes a long time.
Half-assed solution: create a Map<A, Set<C>> indexA, which splits the set by a's value. Stream over indexA.get(e), filter for the other conditions, collect, return.
More-assed solution: create index maps for all fields, select for all criteria from the maps, stream over the shortest list, filter for other criteria, collect, return.
You see where this is going: we're accidentally building a database. The thing is that I don't want to serialize my objects. Sure I could grab H2 or HSQLDB and stick my objects in there, but I don't want to persist them. I basically just want indices on my regular old on-the-heap Java objects.
Surely there must be something out there that I can reuse.
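For reference, the index-map idea described above could be sketched roughly like this. It is only an illustration under assumptions: Item is a hypothetical stand-in for C with two String fields, IndexedItems is an invented name, and the fields are assumed to be non-null and unchanged after an object has been added.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for class C, reduced to two String fields.
class Item {
    final String a;
    final String b;

    Item(String a, String b) {
        this.a = a;
        this.b = b;
    }
}

// Keeps one hash index per field; the objects themselves stay on the heap.
class IndexedItems {
    private final Map<String, Set<Item>> indexA = new HashMap<>();
    private final Map<String, Set<Item>> indexB = new HashMap<>();

    void add(Item item) {
        indexA.computeIfAbsent(item.a, k -> new HashSet<>()).add(item);
        indexB.computeIfAbsent(item.b, k -> new HashSet<>()).add(item);
    }

    // Returns every item with a equal to e and b equal to f.
    Set<Item> find(String e, String f) {
        Set<Item> byA = indexA.getOrDefault(e, Collections.emptySet());
        Set<Item> byB = indexB.getOrDefault(f, Collections.emptySet());
        // Scan the smaller candidate set and re-check both criteria.
        Set<Item> candidates = byA.size() <= byB.size() ? byA : byB;
        Set<Item> result = new HashSet<>();
        for (Item item : candidates) {
            if (item.a.equals(e) && item.b.equals(f)) {
                result.add(item);
            }
        }
        return result;
    }
}
```

Every additional field needs another map plus matching bookkeeping in add (and in a remove method, which is omitted here), which is exactly the maintenance burden that makes a reusable library attractive.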

Eventually, I found a couple of projects which tackle this problem including CQEngine, which seems like the most complete and mature library for this purpose.

HSQLDB provides the option of storing Java objects directly in an in-memory database without serializing them.
The property sql.live_object=true is used as a property on the connection URL to a mem: database, for example jdbc:hsqldb:mem:test;sql.live_object=true. A table is created with a column of type OTHER to store the object. Extra columns in this table duplicate any fields in the object that need indexing.
For example:
CREATE TABLE OBJECTLIST (ID INTEGER IDENTITY, OBJ OTHER, TS_FIELD TIMESTAMP, INT_FIELD INTEGER)
CREATE INDEX IDX1 ON OBJECTLIST(TS_FIELD)
CREATE INDEX IDX2 ON OBJECTLIST(INT_FIELD)
The object is stored in the OBJ column, and the timestamp and integer values for the indexed fields are stored in the extra columns. SQL queries such as SELECT * FROM OBJECTLIST WHERE INT_FIELD = 1234 return the rows containing the relevant objects.
http://hsqldb.org/doc/2.0/guide/dbproperties-chapt.html#dpc_sql_conformance

Related

Indexing a simple Java Record

I have a Java object, Record. It represents a single record returned by a SQL query. Can CQEngine index a collection of Record objects?
My class is of the form
public class Record {
private List<String> columnNames;
private List<Object> values;
... Other getters
}
I have looked through some examples, but I have no luck there.
I want to index only specific columns, by name and corresponding value. Can this be achieved using CQEngine, or are there any alternatives that achieve the same?
Thanks.

That seems to be a strange way to model data, but you can use CQEngine with that model if you wish.
(First off, CQEngine will have no use for your column names so you can remove that field.)
To do this, you will need to define a CQEngine virtual attribute for each of the indexes in your list of values.
Each attribute will need to be declared with the data type which will be stored in that column/index, and will need to be able to cast the object at that index in your list of values, to the appropriate data type (String, Double, Integer etc.).
So let's say your Record has a column called 'price', which is of type Double, and is stored at index 5 in the list of values. You could define an attribute which reads it as follows:
public static final Attribute<Record, Double> PRICE =
attribute("PRICE", record -> (Double) record.values.get(5));
If this sounds complicated, it's because that way of modelling data makes things a bit complicated :) It's usually easier to work with a data model which leverages the Java type system (which your model does not). As such, you will need to keep track of the data types etc. of each field programmatically yourself.
CQEngine itself will work fine with that model though, because at the end of the day CQEngine attributes don't need to read fields, the attributes are just functions which are programmed to fetch values.
There's a bunch of stuff not covered above. For example, can your values be null? (If so, you should use the nullable variety of attributes as discussed in the CQEngine docs.) Or, might each of your Record objects have different sets of columns? (If so, you can create attributes on-the-fly when you encounter a new column, but you should probably cache the attributes you have created somewhere.)
Hope that helps,
Niall (CQEngine author)

Joining postgres tables from different servers

I want to extract data by joining tables from two different PostgreSQL databases hosted on different servers, using Java.
ResultSet resA = statement_A.executeQuery("select issue_id from Server_A.table_name");
ResultSet resB = statement_B.executeQuery("select issue_id from Server_B.table_name");
How can I execute a join query to get a single result set in this case? Any pointers would be highly appreciated.

You can't do it in any automatic/magical way. What you can do is define a class that will have the union of properties of the two tables like:
public class JoinedResult{
private int id;
private String name;
// all other common properties to both
...
// properties exclusive to first table
...
// properties exclusive to second table
...
}
and construct a list of these object that will contain the joined result of both tables.
To make the actual construction you have a few options:
The first one and the easiest one (but not efficient) is to iterate both results with nested loops, and once the ids (or whatever key is used) match you should construct a JoinedResult.
The second one is a bit more complex but also more efficient:
Iterate first result set and construct a map that will map the id to the object.
Iterate second result set and construct a map that will map the id to the object.
Run a loop over the keys of one of the maps you constructed and use that key to access the matching values in both maps, finally construct the joined object.
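A rough sketch of the second option, with the result sets already read into lists so it can run without a database. RowA, RowB, JoinedRow and their fields are hypothetical stand-ins for whatever the two tables contain, and the sketch assumes the id is unique within each table (a duplicate id would overwrite the earlier row in the map). It produces an inner join: ids present on only one side are dropped.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row from the table on server A.
class RowA {
    final int id;
    final String title;
    RowA(int id, String title) { this.id = id; this.title = title; }
}

// Hypothetical row from the table on server B.
class RowB {
    final int id;
    final String status;
    RowB(int id, String status) { this.id = id; this.status = status; }
}

// Union of the properties of both rows.
class JoinedRow {
    final int id;
    final String title;
    final String status;
    JoinedRow(int id, String title, String status) {
        this.id = id;
        this.title = title;
        this.status = status;
    }
}

class HashJoin {
    static List<JoinedRow> join(List<RowA> rowsA, List<RowB> rowsB) {
        // Step 1: map id -> row for the first result.
        Map<Integer, RowA> byIdA = new LinkedHashMap<>();
        for (RowA a : rowsA) {
            byIdA.put(a.id, a);
        }
        // Step 2: map id -> row for the second result.
        Map<Integer, RowB> byIdB = new LinkedHashMap<>();
        for (RowB b : rowsB) {
            byIdB.put(b.id, b);
        }
        // Step 3: walk the keys of one map and look each one up in the other.
        List<JoinedRow> joined = new ArrayList<>();
        for (Map.Entry<Integer, RowA> entry : byIdA.entrySet()) {
            RowB b = byIdB.get(entry.getKey());
            if (b != null) {
                RowA a = entry.getValue();
                joined.add(new JoinedRow(a.id, a.title, b.status));
            }
        }
        return joined;
    }
}
```

Each row is visited once and each lookup is a hash lookup, so the work grows with the sum of the two row counts rather than their product as in the nested-loop version.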

Fetching a record from an arraylist that is made up of arrays of objects

I have a java arraylist that is made like this:
{[{},{}], [{},{}], [{},{}], [{},{}]} of around four thousand records.
I have a particular key with which I want to search one of the objects in this list and fetch the array in which that record matches. The search key is a string.
Is there a solution to this without traversing the entire list?
It is basically a list that is constructed like this:
List<Object[]> list = new ArrayList<>();
I am using this to fetch the data from two tables using a join. Individual records of each table map to these objects.
Say table1: {a:1,b:2,c:3} and table2: {x:1,y:2,z:3}
the data returned would be
{[{a:1,b:2,c:3}, {x:1,y:2,z:3}],[{a:2,b:3,c:4}, {x:2,y:3,z:4}]}
How would I find, say, which array in the list has a=2?
Thanks

If you do not want to be a victim of the linear search, you should consider using another type of data structure than List.
The use case you described seems like a good match for a Map in general. If you want constant time key lookup, consider using HashMap instead.
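As a sketch of that suggestion: keep the List<Object[]> if other code needs it, and build a HashMap next to it in one pass. FirstRow and SecondRow are hypothetical stand-ins for the two mapped table classes, the search key is assumed to be the String field a of the first element in each array, and that key is assumed to be unique (otherwise later arrays replace earlier ones in the map).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity for table1; only the search key is modelled.
class FirstRow {
    final String a;
    FirstRow(String a) { this.a = a; }
}

// Hypothetical entity for table2.
class SecondRow {
    final String x;
    SecondRow(String x) { this.x = x; }
}

class PairIndex {
    // One linear pass to build the index; lookups afterwards do not scan the list.
    static Map<String, Object[]> indexByA(List<Object[]> pairs) {
        Map<String, Object[]> index = new HashMap<>();
        for (Object[] pair : pairs) {
            FirstRow first = (FirstRow) pair[0];
            index.put(first.a, pair);
        }
        return index;
    }
}
```

After that, index.get("2") returns the array whose first element has a equal to "2", or null if there is none. The list still has to be traversed once to build the map, so this only pays off when there is more than one lookup.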

How to store multiple values (more than 2) in a map from a ResultSet, similar to HashMap?

I am new to Java and to this website and have never asked a question on here before.
My question is, I am trying to write a method that will query the database and this query returns 7 columns of data with 'X' amount of rows and one primary key. I know that with HashMap you can store two values, key and value, but is it possible to take a ResultSet with more than two values per row and store it in a HashMap or something similar. I have been struggling to find an answer to this on the internet.
I am trying to use this method as a cache so that the application does not have to query the database every time it needs information. But I want to store the information so that when the user searches for the key value (ex. customerId), I get the 6 columns of information associated with that 'customerId' and the application simply gets it from the "map" rather than querying the database every time.
Any suggestions would be helpful and I am sorry if this question is a repeat. It seems to happen often on here but I was unable to find an answer.

You can encapsulate them in a class, for example:
public class EncapsulatedVariables {
String s;
int i;
boolean b;
public EncapsulatedVariables(String s, int i, boolean b) {
this.s = s;
this.i = i;
this.b = b;
}
}
Then you can use an instance of this class as the value in a HashMap, and have methods to retrieve and/or process the data as you deem fit.

Create a Customer class to store your values. As you iterate over your ResultSet, initialize a new Customer for every record and place the object into a HashMap using the customerId as the key. Your map would be initialized as Map<String,Customer> cache = new HashMap<String,Customer>();
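Put together, that might look something like the sketch below. The column names (customer_id, name, email) and the reduced set of fields are assumptions for illustration only; the real query has seven columns and its own names.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

// Holds the columns of one row; only three hypothetical columns are shown.
class Customer {
    final String customerId;
    final String name;
    final String email;

    Customer(String customerId, String name, String email) {
        this.customerId = customerId;
        this.name = name;
        this.email = email;
    }
}

class CustomerCache {
    private final Map<String, Customer> cache = new HashMap<>();

    // Reads every row of the result set into the cache.
    // The column labels are hypothetical and must match the actual query.
    void load(ResultSet rs) throws SQLException {
        while (rs.next()) {
            put(new Customer(
                    rs.getString("customer_id"),
                    rs.getString("name"),
                    rs.getString("email")));
        }
    }

    void put(Customer customer) {
        cache.put(customer.customerId, customer);
    }

    // Returns the cached customer, or null if the id is unknown.
    Customer get(String customerId) {
        return cache.get(customerId);
    }
}
```

A lookup is then cache.get(someCustomerId), with no round trip to the database. The sketch does nothing about refreshing stale entries or about concurrent access, both of which a real cache would probably need.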

Flatten a collection as part of an HQL query?

I have a requirement to use a reporting-friendly query from HQL. That is, each column is represented by a standard Java type (Long, Integer, String, List) rather than a mapped class. The values will be used by a third party library, so I have very limited control over post-processing.
My example object tree looks like this:
a.x
a.y (a collection of z)
a.y.z[0].v
a.y.z[1].v
a.y.z[2].v
I would like query to retrieve two columns. The first column is the plain "a.x" field, and the second is a String of a comma separated list of all of the a.y.z.v values. If this is not possible, then having the second column return as a Java list of the a.y.z.v values would be satisfactory.
In short, I would like to flatten the a.y.z.v collection to a csv String or to a List object from inside the query.
I have already attempted the following:
Using the "new" keyword in a list subselect. ie, "select a.x, (select new list(a.y.z.v)) from a". If necessary I could have transformed the contents of the list into a csv, but this caused a syntax error.
Using the "new" keyword with a custom object in a subselect. ie, "select a.x, (select new custom.package.ListToCsvObject(a.y.z)) from a". This caused the same error as the first attempt
Using the "elements()" keyword in the select. Unfortunately, this keyword only seems to work inside "in", "exists" clauses (etc), not as the actual returned value.
The only solution we've been able to find was to create a stored procedure in the database and use that, but such a solution is painfully slow through HQL (it turns a sub-second query into a 30 second query) and therefore is not something we want to continue to do.
I am able to make some limited changes to the Hibernate mapping (so I can add @Formula, etc.) but I would prefer not to have to make major changes to the database schema to support it. (So no, I don't want to create a denormalized "csv_value" column in the database!)
Could anyone suggest some code or, failing that, an alternative approach to solving this problem?

Something like this should work. The list is flattened to a comma-separated string in the constructor of your VO class. You can also take a look at ResultTransformer: you can create a custom ResultTransformer and attach it to the query.
class ResultVO {
String x;
String y;
public ResultVO(String x,List<Z> y) {
this.x = x;
this.y = createCSV(y);
}
}
then in HQL
select new ResultVO(a.x,a.y) from a
A warning: this is not a good way to use JPA. If most of your use cases are like this, you should seriously reconsider and use some other persistence approach (iBATIS, Spring JdbcTemplate + SQL, etc.).
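The createCSV helper used in the constructor is not shown above. One possible version is sketched here; Z with a String field v is a hypothetical stand-in for the mapped element class, and CsvFlattener is an invented holder name.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical element type of the a.y collection.
class Z {
    final String v;
    Z(String v) { this.v = v; }
}

class CsvFlattener {
    // Joins the v values with commas; an empty list yields an empty string.
    // Values are not quoted or escaped, so a comma inside v would be ambiguous.
    static String createCSV(List<Z> zs) {
        return zs.stream()
                .map(z -> z.v)
                .collect(Collectors.joining(","));
    }
}
```

With elements whose v values are "a", "b" and "c", createCSV returns "a,b,c".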
