Is there a framework to quickly build a JDBC-like interface for an internal data structure?
We have a complex internal data structure for which we have to generate reports. The data itself is persisted in a database, but there is a large amount of Java code which manages the dependencies, access rights, data aggregation, etc. While it is theoretically possible to write all this code again in SQL, it would be much simpler if we could add a JDBC-like API to our application and point the reporting framework to that.
Especially since "JDBC" doesn't mean "SQL"; we could use something like commons jxpath to query our model or write our own simple query language.
[EDIT] What I'm looking for is something that implements most of the necessary boilerplate code, so you can write:
// Get column names and types from "Foo" by reflection
ReflectionResultSet rs = new ReflectionResultSet( Foo.class );
List<Foo> results = ...;
rs.setData( results );
and ReflectionResultSet takes care of cursor management, all the getters, etc.
It sounds like JoSQL (SQL for Java Objects) is exactly what you want.
Try googling "jdbc driver framework". The first result (for me) looks like a fit for you: http://jxdbcon.sourceforge.net/
Another option that might work (also in the Google results from the search above) is the Spring JDBC Template. Here is a writeup: http://www.zabada.com/tutorials/simplifying-jdbc-with-the-spring-jdbc-abstraction-framework.php
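For reference, typical usage of Spring's JdbcTemplate looks roughly like the sketch below (the table, columns and the Foo class are made up for illustration); the template removes the connection/statement/result-set boilerplate:
import java.util.List;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

public class FooDao {

    // hypothetical value class for the example query
    public static class Foo {
        final int id;
        final String name;
        Foo(int id, String name) { this.id = id; this.name = name; }
    }

    private final JdbcTemplate jdbcTemplate;

    public FooDao(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    public List<Foo> findAll() {
        // the RowMapper callback is given as a lambda: one Foo per row
        return jdbcTemplate.query("select id, name from foo",
                (rs, rowNum) -> new Foo(rs.getInt("id"), rs.getString("name")));
    }
}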
I think you'll have to create a new Driver implementation for your data structure. Usually, frameworks using JDBC just need to be provided a URL and a driver, so if you define your custom driver (and all the things that go with it, for example Connection), you'll be able to add the JDBC API you want.
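To give an idea of the shape of this, here is a rough sketch of a custom driver skeleton. All class names and the URL prefix are made up; a real implementation would also need Connection, Statement and ResultSet classes backed by your Java model:
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.util.Properties;
import java.util.logging.Logger;

public class InternalModelDriver implements Driver {

    static {
        try {
            // register so that DriverManager.getConnection("jdbc:internalmodel:...") finds us
            DriverManager.registerDriver(new InternalModelDriver());
        } catch (SQLException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith("jdbc:internalmodel:");
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        if (!acceptsURL(url)) {
            return null; // per the JDBC contract, return null for URLs this driver does not handle
        }
        // here you would build and return your own Connection implementation
        // (e.g. a hypothetical InternalModelConnection wrapping your object model)
        throw new SQLException("Connection implementation not written yet");
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
        return new DriverPropertyInfo[0];
    }

    @Override
    public int getMajorVersion() { return 1; }

    @Override
    public int getMinorVersion() { return 0; }

    @Override
    public boolean jdbcCompliant() { return false; }

    @Override // only required by the Driver interface on Java 7+
    public Logger getParentLogger() {
        return Logger.getLogger(InternalModelDriver.class.getName());
    }
}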
I.e., Filters.search("eq", "name", "Smith") would do the same thing as Filters.eq("name", "Smith").
I'm writing some code that will search MongoDb depending on parameters passed in...
So currently, my ugly code looks like
if (param.equals("eq")) Filters.eq("name", name);
if (param.equals("gt")) Filters.gt("name", name);
etc..
I'm hoping there is something like
Filters.search("gt", "name", name) ;
Filters.search("eq", "name", name) ;
etc...
Or perhaps there are other methods in the MongoDB Java driver that can help.
I've looked through com.mongodb.client.model.Filters and com.mongodb.client.model.* but haven't seen anything that seems promising.
I haven't seen anything promising either. The mechanisms behind the Filters API seem to have been declared private, which effectively closes off any possibility of calling them directly or via subclassing.
If you're desperate to do this, you could fork the MongoDB Java client code and modify the API so that you can inject operations directly. But I think your "clunky" approach is probably better.
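That said, if the goal is just to avoid the chain of if-statements, one option that stays within the public API is to build your own dispatch table over the existing static factory methods. A rough sketch (the FilterDispatch class and the set of operator names are mine, not part of the driver):
import com.mongodb.client.model.Filters;
import org.bson.conversions.Bson;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class FilterDispatch {

    // operator name -> existing Filters factory method
    private static final Map<String, BiFunction<String, Object, Bson>> OPS = new HashMap<>();
    static {
        OPS.put("eq", Filters::eq);
        OPS.put("ne", Filters::ne);
        OPS.put("gt", Filters::gt);
        OPS.put("gte", Filters::gte);
        OPS.put("lt", Filters::lt);
        OPS.put("lte", Filters::lte);
    }

    // e.g. search("eq", "name", "Smith") behaves like Filters.eq("name", "Smith")
    public static Bson search(String op, String field, Object value) {
        BiFunction<String, Object, Bson> factory = OPS.get(op);
        if (factory == null) {
            throw new IllegalArgumentException("Unsupported operator: " + op);
        }
        return factory.apply(field, value);
    }
}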
I've created a custom aggregate function in Stardog that calculates the standard deviation. This works great when you post SPARQL queries to the endpoint or via the query panel in the admin console.
So far, so good, but we're facing a couple of problems. First of all, when we execute a query like the following, it will execute perfectly via Stardog, but will fail in the SPARQL validator (and with the Jena API as well):
PREFIX : <http://our/namespace#>
PREFIX agg: <urn:aggregate:>
SELECT (agg:stardog:stdev(?age) AS ?stdLMD) (AVG(?age) AS ?avg)
WHERE {
?pat a :Person .
?pat :age ?age .
}
Stardog gives the correct results for standard deviation and average age, but the SPARQL validator throws an exception:
Non-group key variable in SELECT: ?age in expression (?age)
Does Stardog interpret the specification differently or is this a feature I'm unaware of?
Another problem: we're using a custom aggregate function (stdev) in a CONSTRUCT query, and again that seems to be working fine via the Stardog APIs. Most of our code, though, is based on Jena, and it doesn't seem to recognize the custom stdev function. I guess because this extension is only Stardog-related and unavailable for Jena? Let me show an example. At the moment, we're executing CONSTRUCT queries via the following Jena code:
final Query dbQuery = QueryFactory.create(query.getContent());
final QueryExecution queryExec = QueryExecutionFactory.create(dbQuery, model);
queryExec.execConstruct(infModel);
As long as we're not using the aggregate function, this works like a charm. As we're constructing triples in multiple named graphs, it's very convenient to have a model available as well (which represents a named graph).
I would like to do something similar with the Stardog Java API. I've only gotten as far as:
UpdateQuery dbQuery;
try {
dbQuery = connection.update(query.getContent());
dbQuery.execute();
} catch (final StardogException e) {
LOGGER.error("Cannot execute CONSTRUCT query", e);
}
The problem is that you explicitly need to specify which named graph you want to manipulate in the CONSTRUCT query. There's nothing like a Jena model that represents a part of the database so that we can avoid specifying it in the query. What would be a good approach here?
So my question is twofold: why are queries parsed differently in Stardog and is it possible to have Jena detect the custom Stardog aggregate functions? Thanks!
UPDATE
In the end, what we're trying to accomplish is to execute a CONSTRUCT query over a given named graph, but write the newly constructed triples to a different graph. In my Jena example, you can see that I'm working with two Jena models to accomplish that. How would you do this with the SNARL API? I've gotten as far as the following code snippet, but this only defines the dataset this query will be executed against, not where the triples will be written to. Any help on this is still appreciated!
UpdateQuery dbQuery;
try {
dbQuery = connection.update(query.getContent());
final DatasetImpl ds = new DatasetImpl();
ds.addNamedGraph(new URIImpl(infDatasource));
dbQuery.dataset(ds);
dbQuery.execute();
} catch (final StardogException e) {
LOGGER.error("Cannot execute CONSTRUCT query", e);
}
The likely reason for the error
Non-group key variable in SELECT: ?age in expression (?age)
is that the SPARQL validator and ARQ have no idea that agg:stardog:stdev is an aggregate, and so do not interpret it that way. The syntax is no different from a standard projection expression such as (?x + ?y AS ?sum), as AndyS noted.
While the SPARQL spec doesn't quite preclude custom aggregates, they're not accounted for in the grammar itself. Both Stardog and Jena allow custom aggregates, albeit in different ways.
Another problem: we're using a custom aggregate function (stdev) in a CONSTRUCT query, and again that seems to be working fine via the Stardog APIs. Most of our code, though, is based on Jena, and it doesn't seem to recognize the custom stdev function. I guess because this extension is only Stardog-related and unavailable for Jena?
Yes, Jena and Stardog are distinct. Anything custom you've defined in Stardog, such as a custom aggregate, won't be available directly in Jena.
You might be constructing the model in such a way that Jena, via ARQ, is the query engine as opposed to Stardog. That would explain the exceptions: Jena doesn't know about the custom aggregate you've defined within Stardog.
There's nothing like a Jena model that represents a part of the database so that we can avoid specifying it in the query. What would be a good approach here?
You can specify the active graph of a query programmatically via the SNARL API using dataset
So my question is twofold: why are queries parsed differently in Stardog and is it possible to have Jena detect the custom Stardog aggregate functions? Thanks!
They're parsed differently because there's no standard way of defining a custom aggregate and Stardog & Jena choose to implement it differently. Further, Jena would not be aware of Stardog's custom aggregates and vice versa.
Non-group key variable in SELECT: ?age in expression (?age)
Does Stardog interpret the specification differently or is this a feature I'm unaware of?
I think that you're reading the spec correctly, and that maybe the validator just doesn't recognize non-built-in aggregates. The spec says:
19.8 Grammar
… Aggregate functions can be one of the built-in keywords for aggregates or a custom aggregate, which is syntactically a function call. Aggregate functions may only be used in SELECT, HAVING and ORDER BY clauses.
As to the construct query:
Another problem: we're using a custom aggregate function (stdev) in a CONSTRUCT query, and again that seems to be working fine via the Stardog APIs. Most of our code, though, is based on Jena, and it doesn't seem to recognize the custom stdev function.
You didn't mention how you're using this. To use an aggregate within a construct pattern, you'd need to use a subquery. E.g., something like:
construct { ?s :hasStandardDeviation ?stddev }
where {{
select ?s (agg:stddev(?value) as ?stddev) {
?s :hasSampleValue ?value
}
group by ?s
}}
There are some examples of this in SPARQL functions in CONSTRUCT/WHERE. Of course, if the validator rejects the first, it probably rejects the second as well, but it looks like it should actually be legal. With Jena, you may need to make sure that you select a query language that allows extensions, but since the spec allows the custom functions (when identified by IRIs), I'd think you should be able to use the standard SPARQL 1.1 language. You are using SPARQL 1.1 and not the earlier SPARQL spec, right?
Unless a custom aggregate is installed, the parser does not know it's an aggregate. Apache Jena ARQ does not have custom aggregates by default.
An aggregate by URI looks like a plain custom function. So if you have not installed that aggregate, the parser considers it to be a custom function.
The AVG forces an implicit grouping, so the custom function is then applied to a non-group-key variable, which is illegal.
I have a web service layer that is written in Java/Jersey, and it serves JSON.
For the front-end of the application, I want to use Rails.
How should I go about building my models?
Should I do something like this?
response = api_client.get_user(123)
user = User.new(response)
What is the best approach to mapping the JSON to the Ruby object?
What options do I have? Since this is a critical part, I want to know my options, because performance is a factor. This, along with mapping JSON to a Ruby object and going from Ruby object => JSON, is a common occurrence in the application.
Would I still be able to make use of validations? Or wouldn't it make sense since I would have validation duplicated on the front-end and the service layer?
Models in Rails do not have to do database operations; they are just normal classes. Normally they are imbued with ActiveRecord magic when you subclass them from ActiveRecord::Base.
You can use a gem such as Virtus that will give you models with attributes. And for validations you can go with Vanguard. If you want something close to ActiveRecord but without the database and are running Rails 3+ you can also include ActiveModel into your model to get attributes and validations as well as have them working in forms. See Yehuda Katz's post for details on that.
In your case it will depend on the data you consume. If all the data sources have the same basic format, for example, you could create your own base class to keep all the logic that you want to share across the individual classes (inheritance).
If you have a few different types of data coming in, you could create modules to encapsulate behavior for the different types and include the modules you need in the appropriate classes (composition).
Generally, though, you probably want to end up with one class per resource in the remote API that maps 1-to-1 to whatever domain logic you have. You can do this in many different ways, but following the method naming used by ActiveRecord might be a good idea, both because you learn ActiveRecord while building your class structure and because it will help other Rails developers later if your API looks and works like ActiveRecord's.
Think about it in terms of what you want to be able to do to an object (this is where TDD comes in). You want to be able to fetch a collection (Model.all), fetch a specific element (Model.find(identifier)), push a changed element to the remote service (updated_model.save), and so on.
What the actual logic inside these methods will be depends on the remote service. But you will probably want each model class to hold a URL to its resource endpoint, and you will definitely want to keep the logic in your models. So instead of:
response = api_client.get_user(123)
user = User.new(response)
you will do
class User
  ...
  def self.find(id)
    # api_client.get_user(id)
  end
  ...
end
User.find(123)
or more probably
class ApiClient
  ...
  def self.uri(resource_uri)
    @uri = resource_uri
  end
  def self.get(id)
    # basically whatever code you envisioned for api_client.get_user
  end
  ...
end
class User < ApiClient
  uri 'http://path.to.remote/resource.json'
  ...
  def self.find(id)
    get(id)
  end
  ...
end
User.find(123)
Basic principles: collect all the shared logic in a base class (ApiClient) and subclass it on a per-resource basis (User). Keep all the logic in your models; no other part of your system should have to know whether it's a DB-backed app or whether you are using an external REST API. Best of all is if you can keep the integration logic completely in the base class, so you have only one place to update if the external data source changes.
As for going the other way, Rails has several good ways to convert objects to JSON, from the to_json method to using a gem such as RABL to have actual views for your JSON objects.
You can get validations by using parts of the ActiveRecord modules. As of Rails 4 this is a module called ActiveModel, but you can do it in Rails 3 as well, and there are several tutorials for it online, not least of all a RailsCast.
Performance will not be a problem except for what you incur when calling a remote service: if the network is slow, you will be too. Some of that could probably be helped with caching (see another answer by me for details), but that also depends on the data you are using.
Hope that puts you on the right track. If you want a more thorough grounding in how to design these kinds of structures, you should pick up a book on the subject, for example Practical Object-Oriented Design in Ruby: An Agile Primer by Sandi Metz.
I want to create a class file dynamically. Here it goes...
Given a ResultSet, I want to extract the metadata and build a class file dynamically, with getter and setter methods for all the columns that exist in the ResultSet. I should also be able to use this generated class wherever I want later.
Can anybody suggest a good way to implement this? Also, if there are any existing jar files that implement this, that would be helpful.
Perhaps Apache Beanutils might suit your requirements?
See the section on Dynabeans
In particular:
3.3 ResultSetDynaClass (Wraps ResultSet in DynaBeans)
A very common use case for DynaBean APIs is to wrap other collections of "stuff" that do not normally present themselves as JavaBeans. One of the most common collections that would be nice to wrap is the java.sql.ResultSet that is returned when you ask a JDBC driver to perform a SQL SELECT statement. Commons BeanUtils offers a standard mechanism for making each row of the result set visible as a DynaBean, which you can utilize as shown in this example:
Connection conn = ...;
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery
("select account_id, name from customers");
Iterator rows = (new ResultSetDynaClass(rs)).iterator();
while (rows.hasNext()) {
DynaBean row = (DynaBean) rows.next();
System.out.println("Account number is " +
row.get("account_id") +
" and name is " + row.get("name"));
}
rs.close();
stmt.close();
3.4 RowSetDynaClass (Disconnected ResultSet as DynaBeans)
Although ResultSetDynaClass is a very useful technique for representing the results of an SQL query as a series of DynaBeans, an important problem is that the underlying ResultSet must remain open throughout the period of time that the rows are being processed by your application. This hinders the ability to use ResultSetDynaClass as a means of communicating information from the model layer to the view layer in a model-view-controller architecture such as that provided by the Struts Framework, because there is no easy mechanism to assure that the result set is finally closed (and the underlying Connection returned to its connection pool, if you are using one).
The RowSetDynaClass class represents a different approach to this problem. When you construct such an instance, the underlying data is copied into a set of in-memory DynaBeans that represent the result. The advantage of this technique, of course, is that you can immediately close the ResultSet (and the corresponding Statement), normally before you even process the actual data that was returned. The disadvantage, of course, is that you must pay the performance and memory costs of copying the result data, and the result data must fit entirely into available heap memory. For many environments (particularly in web applications), this tradeoff is usually quite beneficial.
As an additional benefit, the RowSetDynaClass class is defined to implement java.io.Serializable, so that it (and the DynaBeans that correspond to each row of the result) can be conveniently serialized and deserialized (as long as the underlying column values are also Serializable). Thus, RowSetDynaClass represents a very convenient way to transmit the results of an SQL query to a remote Java-based client application (such as an applet).
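For completeness, a disconnected version of the earlier snippet might look roughly like this (same hypothetical customers query; check the BeanUtils docs for the exact constructor overloads):
Connection conn = ...;
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select account_id, name from customers");
RowSetDynaClass rows = new RowSetDynaClass(rs); // copies all rows into memory
rs.close();   // safe to close immediately, the data has already been copied
stmt.close();
for (Object o : rows.getRows()) {
    DynaBean row = (DynaBean) o;
    System.out.println("Account number is " + row.get("account_id")
            + " and name is " + row.get("name"));
}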
The thing is though - from the sounds of your situation, I understand that you want to create this class at runtime, based on the contents of a ResultSet that you just got back from a database query. This is all well and good, and can be done with bytecode manipulation.
However, what benefit do you perceive you will get from this? Your other code will not be able to call any methods on this class (because it did not exist when they were compiled), and consequently the only way to actually use this generated class would be either via reflection or via methods on its parent class or implemented interfaces (I'm going to assume it would extend ResultSet). You can do the latter without bytecode weaving (look at dynamic proxies for arbitrary runtime implementations of an interface), and if you're doing the former, I don't see how having a class and mechanically calling the getFoo method through reflection is better than just calling resultSet.getString("foo") - it will be slower, more clunky and less type-safe.
So - are you sure you really want to create a class to achieve your goal?
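For the dynamic-proxy route mentioned above, a rough sketch could look like the following; the RowView interface and the column-to-getter naming rule are illustrative assumptions, not an existing API:
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;

public class RowProxyExample {

    // hypothetical interface the rest of your code compiles against
    public interface RowView {
        String getName();
        int getAccountId();
    }

    // wraps one row (column name -> value) in the interface above; only simple
    // getFoo() -> "foo" getters are handled, so this is a sketch, not a library
    public static RowView asRowView(final Map<String, Object> row) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();                                   // e.g. "getAccountId"
            String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            return row.get(property);                                         // e.g. row.get("accountId")
        };
        return (RowView) Proxy.newProxyInstance(
                RowView.class.getClassLoader(), new Class<?>[] { RowView.class }, handler);
    }
}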
You might want to look at BCEL, although I believe there are other bytecode manipulation libraries available too.
If you're using Java 6, you can write your code and directly call the Java compiler:
// requires javax.tools.* plus java.io.File and java.util.Arrays
File[] files1 = ... ; // input for first compilation task
File[] files2 = ... ; // input for second compilation task
JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
StandardJavaFileManager fileManager = compiler.getStandardFileManager(null, null, null);
Iterable<? extends JavaFileObject> compilationUnits1 =
    fileManager.getJavaFileObjectsFromFiles(Arrays.asList(files1));
compiler.getTask(null, fileManager, null, null, null, compilationUnits1).call();
Iterable<? extends JavaFileObject> compilationUnits2 =
    fileManager.getJavaFileObjects(files2); // use alternative method
// reuse the same file manager to allow caching of jar files
compiler.getTask(null, fileManager, null, null, null, compilationUnits2).call();
fileManager.close();
You will then have to load said class but you can do that easily enough with a class loader.
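Loading the compiled class could then look roughly like this (the directory and class name are made up for illustration; it needs java.io.File, java.net.URL and java.net.URLClassLoader):
File outputDir = new File("generated-classes");             // where the compiler wrote the .class file
URLClassLoader loader = URLClassLoader.newInstance(new URL[] { outputDir.toURI().toURL() });
Class<?> generated = loader.loadClass("GeneratedRowClass");  // hypothetical generated class name
Object instance = generated.getDeclaredConstructor().newInstance();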
Sadly this is what you have to do in Java.
In C# you just use the 'var' type.
I'm confused about the way it's supposed to work, and I don't think it's possible.
Here's why:
If you want to use the generated class in the rest of your application, you need an interface (or heavy use of reflection), and that would mean you know the column types beforehand - defeating the purpose of a generated class.
A generated class might clash at runtime with another one.
If you create a new class for each SQL call, you will end up with different classes for the same purpose, and these would probably not even pass a regular call to "equals".
So you have to look up classes from previously executed statements, and you lose flexibility and/or fill your heap with classes.
I've done something that is probably similar, but I wouldn't create dynamic classes.
I had an object called Schema that would load the data of each table I'd need.
I had a Table object that would have a Schema. Each Schema object had a columns attribute, while the Table object had attributes with values and references to the Schema's column attributes.
The Schema had everything you'd need to insert, select, delete, and update data in the database.
And I had a mediator that would handle the connection between the database and the Table object.
Table t = new Table("Dog");
t.randomValue(); // needed for the purpose of my project
t.save();
Table u = Table.get(t);
u.delete();
It could also have something to get the value of a certain column by name easily.
Anyway, the principle is simple: my code would load the data contained in the table information_data; it could probably work with a DESCRIBE too.
I was able to load any table dynamically, since tables had dynamic attributes and the structure wasn't hardcoded. But there is no real need to create new classes for each table.
There is also something important to note: each table schema was loaded only once. Tables only had references to schemas, schemas had references to columns, and columns had references to column types, etc.
It could have been interesting to find a better use for it than it had. I made it for unit tests on database replication; I had no real interest in coding a class for each of the 30 tables just to do inserts, deletes, updates and selects. That's the only case where I can see it being useful to create something dynamic around SQL: when you don't need to know anything about the tables and only want to insert/delete junk into them.
If I had to redo my code, I'd use more associative arrays.
Anyway, good luck.
I second the comments made by dtsazza and Stroboskop; generating a new class at run time is probably not what you want to do in this case.
You haven't really gotten into why you want to do this, but it sounds like you are trying to roll your own Object-Relational mapper. That is a problem that's much harder to get right than it first seems.
Instead of building your own system from the ground up, you might want to look into existing solutions like Hibernate (a high-level system that manages most of your objects and queries for you) or iBatis (a bit more low-level; it handles object mapping, but you still get to write your own SQL).
I have found that in JSF, beans and maps can be used interchangeably. Hence, for handling results where you don't want to build a complete set of getters/setters but just create an h:table, it is much easier to create a list with a map for each row, where the key is the column name (or number) and the value is the column content.
If you later find it relevant to make it more type-safe, you can rework the backend code with beans and keep your JSF code unchanged.
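A small sketch of what building such a list from a ResultSet might look like (rs is assumed to be an open java.sql.ResultSet; the variable names are illustrative):
List<Map<String, Object>> rows = new ArrayList<Map<String, Object>>();
ResultSetMetaData meta = rs.getMetaData();
while (rs.next()) {
    Map<String, Object> row = new LinkedHashMap<String, Object>();
    for (int i = 1; i <= meta.getColumnCount(); i++) {
        row.put(meta.getColumnLabel(i), rs.getObject(i)); // column name -> column content
    }
    rows.add(row);
}
// expose "rows" as a managed-bean property and iterate over it in the page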
To implement data access code in our application, we need some framework to wrap around JDBC (ORM is not our choice, because of scalability).
The coolest framework I've worked with is Spring-Jdbc. However, my company's policy is to avoid external dependencies, especially Spring, J2EE, etc.
So we are thinking about writing our own hand-made JDBC framework, with functionality similar to Spring-Jdbc: row mapping, error handling, support for Java 5 features, but without transaction support.
Does anyone have experience writing such a JDBC wrapper framework?
If anyone has experience using other JDBC wrapper frameworks, please share it.
Thanks in advance.
We wrote our own wrapper. This topic is worthy of a paper but I doubt I'll ever have time to write it, so here are some key points:
We embraced SQL and made no attempt to hide it. The only tweak was to add support for named parameters. Parameters are important because we do not encourage the use of on-the-fly SQL (for security reasons) and we always use PreparedStatements.
For connection management, we used Apache DBCP. This was convenient at the time, but it's unclear how much of this is needed with modern JDBC implementations (the documentation on this is lacking). DBCP also pools PreparedStatements.
We didn't bother with row mapping. Instead (for queries) we used something similar to Apache DbUtils' ResultSetHandler, which allows you to "feed" the result set into a method which can then dump the information wherever you'd like. This is more flexible, and in fact it wouldn't be hard to implement a ResultSetHandler for row mapping. For inserts/updates we created a generic record class (basically a hashmap with some extra bells and whistles). The biggest problem with row mapping (for us) is that you're stuck as soon as you do an "interesting" query, because you may have fields that map to different classes; because you may have a hierarchical class structure but a flat result set; or because the mapping is complex and data dependent.
We built in error logging. For exception handling: on a query we trap and log, but for an update we trap, log, and rethrow an unchecked exception.
We provided transaction support using a wrapper approach: the caller provides the code that performs the transaction, and we make sure that the transaction is properly managed, with no chance of forgetting to finish it, and with rollback and error handling built in (roughly as sketched after this list).
Later on, we added a very simplistic relationship scheme that allows a single update/insert to apply to a record and all its dependencies. To keep things simple, we did not use this on queries, and we specifically decided not to support this with deletes, because it is more reliable to use cascaded deletes.
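The transaction wrapper mentioned in the list above could be sketched roughly like this; it is an illustration rather than the project's actual code, and all names are made up:
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public class TransactionRunner {

    public interface TransactionCallback<T> {
        T doInTransaction(Connection con) throws SQLException;
    }

    private final DataSource dataSource;

    public TransactionRunner(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // the caller supplies only the transactional work; commit/rollback cannot be forgotten
    public <T> T execute(TransactionCallback<T> work) {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false);
            try {
                T result = work.doInTransaction(con);
                con.commit();
                return result;
            } catch (SQLException | RuntimeException e) {
                con.rollback();
                throw new RuntimeException("Transaction rolled back", e);
            }
        } catch (SQLException e) {
            throw new RuntimeException("Could not run transaction", e);
        }
    }
}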
This wrapper has been successfully used in two projects to date. It is, of course, lightweight, but these days everyone says their code is lightweight. More importantly, it increases programmer productivity, decreases the number of bugs (and makes problems easier to track down), and it's relatively easy to trace through if need be because we don't believe in adding lots of layers just to provide beautiful architecture.
Spring-JDBC is fantastic. Consider that for an open source project like Spring, the downside of an external dependency is minimized. You can adopt the most stable version of Spring that satisfies your JDBC abstraction requirements, and you know that you'll always be able to modify the source code yourselves if you ever run into an issue -- without depending on an external party. You can also examine the implementation for any security concerns that your organization might have with code written by an external party.
The one I prefer: Dalesbred. It's MIT licensed.
A simple example of getting all rows for a custom class (Department).
List<Department> departments = db.findAll(Department.class,
"select id, name from department");
when the custom class is defined as:
public final class Department {
private final int id;
private final String name;
public Department(int id, String name) {
this.id = id;
this.name = name;
}
}
Disclaimer: it's by a company I work for.
This sounds like a very short-sighted decision. Consider the cost of developing/maintaining such a framework, especially when you can get it, and its source code, for free. Not only do you not have to do the development yourself, you can modify it at will if need be.
That being said, what you really need to duplicate is the notion of JdbcTemplate and its callbacks (PreparedStatementCreator, PreparedStatementCallback), as well as RowMapper/RowCallbackHandler. It shouldn't be too complicated to write something like this (especially considering you don't have to do transaction management).
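For illustration, a stripped-down template with a row-mapper callback could be sketched roughly like this (not Spring's actual classes, just the general shape):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

public class SimpleJdbcTemplate {

    public interface RowMapper<T> {
        T mapRow(ResultSet rs, int rowNum) throws SQLException;
    }

    private final DataSource dataSource;

    public SimpleJdbcTemplate(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public <T> List<T> query(String sql, RowMapper<T> mapper, Object... params) {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            for (int i = 0; i < params.length; i++) {
                ps.setObject(i + 1, params[i]);   // bind positional parameters
            }
            try (ResultSet rs = ps.executeQuery()) {
                List<T> results = new ArrayList<>();
                int rowNum = 0;
                while (rs.next()) {
                    results.add(mapper.mapRow(rs, rowNum++));
                }
                return results;
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);        // translate checked into unchecked, as Spring does
        }
    }
}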
However, as I've said, why write it when you can get it for free and modify the source code as you see fit?
Try JdbcSession from jcabi-jdbc. It's as simple as JDBC should be, for example:
String name = new JdbcSession(source)
.sql("SELECT name FROM foo WHERE id = ?")
.set(123)
.select(new SingleOutcome<String>(String.class));
That's it.
Try my library as an alternative:
<dependency>
<groupId>com.github.buckelieg</groupId>
<artifactId>jdbc-fn</artifactId>
<version>0.2</version>
</dependency>
More info here
Jedoo
There is a wrapper class called Jedoo out there that uses database connection pooling and a singleton pattern to access it as a shared variable. It has plenty of functions to run queries fast.
Usage
To use it, you should add it to your project and load its singleton in a Java class:
import static com.pwwiur.util.database.Jedoo.database;
And using it is pretty easy as well:
if(database.count("users") < 100) {
long id = database.insert("users", new Object[][]{
{"name", "Amir"},
{"username", "amirfo"}
});
database.setString("users", "name", "Amir Forsati", id);
try(ResultSetHandler rsh = database.all("users")) {
while(rsh.next()) {
System.out.println("User ID:" + rsh.getLong("id"));
System.out.println("User Name:" + rsh.getString("name"));
}
}
}
There are also some useful functions that you can find in the documentation linked above.
mJDBC: https://mjdbc.github.io/
I have used it for years and find it very useful (I'm the author of this library).
It is inspired by the JDBI library but has no dependencies, adds transaction support, provides performance counters, and allows you to switch easily to the lowest possible SQL level in Java (plain old JDBC API) in case you really need it.