I'm writing a Java EE program that manages client data. The original specification was for it to talk to a hosted database that the client would set up, but now (just before deadline) they're saying they can't get that together and want it to talk to local files instead. (The program will be used in contexts where internet access is infeasible. Also, the user will want to email or Dropbox the files around and expresses some hostility to the idea of my starting a MySQL server on their local machine.)
The persistence layer is implemented with javax.persistence, configured to talk to a MySQL database. What I'd love is to keep the program as similar as possible and switch to configuring my EntityManagerFactory to talk to a flat file if it can't connect to MySQL. Is there a JDBC driver that will talk to a flat file?
(Searching on this subject turns up a number of options, but none of them look obviously supported/tested/likely to be robust. If there's a standard solution, it's the one I'm looking for.)
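One possible direction, sketched here under assumptions rather than offered as the standard answer: keep the same persistence unit and override only the standard javax.persistence.jdbc.* properties when MySQL is unreachable, pointing them at a file-based embedded database. The sketch assumes the H2 jar is on the classpath; the class name FallbackJpaConfig and the persistence unit name "clientsPU" are hypothetical, and a provider-specific SQL dialect setting may also need overriding.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds property overrides that point an existing
// persistence unit at a file-based H2 database instead of MySQL.
final class FallbackJpaConfig {

    private FallbackJpaConfig() {}

    // dbFilePath is the database file location without extension, e.g. "./data/clients".
    static Map<String, String> embeddedOverrides(String dbFilePath) {
        Map<String, String> props = new HashMap<>();
        // Standard JPA 2.0 property names; the values assume H2 is the embedded database.
        props.put("javax.persistence.jdbc.driver", "org.h2.Driver");
        props.put("javax.persistence.jdbc.url", "jdbc:h2:file:" + dbFilePath);
        props.put("javax.persistence.jdbc.user", "sa");
        props.put("javax.persistence.jdbc.password", "");
        return props;
    }
}

// Usage (requires a JPA provider on the classpath; "clientsPU" is a made-up unit name):
//   EntityManagerFactory emf = Persistence.createEntityManagerFactory(
//           "clientsPU", FallbackJpaConfig.embeddedOverrides("./data/clients"));
```

The properties in the map take precedence over those declared in persistence.xml, so the rest of the program can stay unchanged.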
Related
I am looking how to properly architect a system I am building.
I'll try to give a high level view of the system's requirements:
The system uses various data providers (2 for now, but more to come). The data can be retrieved in one of two ways:
either directly from a remote database (hosted on the provider's server, direct MySQL access)
or directly from files on the local server (the provider pushes files by SCP). Here there would need to be additional logic to check whether the files have arrived and to purge or move them afterwards
The method to use depends on the provider.
The system then needs to regularly import data from these providers to a local database. There is a special importation logic which differs for every provider.
Putting all the code in a monolithic application obviously seems like a bad idea, so I'm looking into splitting it into multiple services.
I'm really not sure how to architect this, though, and would appreciate some advice.
Details:
Java + Spring if needed to expose an API between services
Scalability / speed not really important (I hate saying that - but in this case it doesn't matter whether the imports take 1 ms or 4 hours)
Everything needs to be hosted on a single server and no data can leave it (sensitive data)
Any help highly appreciated!
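To make the requirements above concrete, here is a minimal sketch of one way to isolate the per-provider differences behind a common interface, so that a single import job can run on a schedule regardless of whether a provider is read from a remote database or from pushed files. All names (ProviderSource, ImportJob, afterImport) are hypothetical, and records are simplified to strings.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical abstraction: one implementation per provider. A database-backed
// provider would query the remote MySQL server in fetch(); a file-backed one
// would read the pushed files and purge or move them in afterImport().
interface ProviderSource {
    String name();
    List<String> fetch();
    void afterImport();
}

// Runs every provider in turn and hands each record to the local store.
final class ImportJob {
    private final List<ProviderSource> sources;
    private final Consumer<String> localStore; // stand-in for the local database

    ImportJob(List<ProviderSource> sources, Consumer<String> localStore) {
        this.sources = sources;
        this.localStore = localStore;
    }

    // Returns the number of records imported in this run.
    int runOnce() {
        int imported = 0;
        for (ProviderSource source : sources) {
            List<String> records = source.fetch();
            for (String record : records) {
                localStore.accept(record);
            }
            source.afterImport(); // clean up only after the records were stored
            imported += records.size();
        }
        return imported;
    }
}
```

With this shape, adding a provider means adding one class, whether the code ends up in one deployable or is later split into services.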
I am writing a RESTful service which consumes application/octet-stream and accepts binary files to write them to disk (Tomcat 8, Windows Server 2012R2, JAX-RS). I then need to insert the file contents into an Oracle table.
The service itself runs fine, accepts files and writes them to disk.
My problem (or call it a best-practice question) is how to transfer the data to the Oracle DB. Of course I can open a connection in the service itself, which gets called every time the service accepts a file, but is this really the "correct" way? We're talking about MANY small files (let's say 100 per minute, each about 300 bytes in size).
Should I create a connection pool? Or even a standalone program which keeps the Oracle connection open permanently? Unfortunately, I can't really benchmark at the moment because I am on an isolated test server.
So, tl;dr: How to transfer the content of many small files accepted by a RESTful service to an Oracle DB?
As you are deploying on Tomcat, using the Tomcat managed connection pool is the most generic way. We use this and get very good performance out of it. You could roll your own and benchmark it, but I am not sure about the merits of this. I would try the approach that is best integrated with Tomcat first, and only move to libraries like c3p0 if it does not perform.
Depending on your use-case you could do without writing the files to disk and instead just insert them into the DB. Since your files are small, there would not even be a reason to go async or fork threads for insertion.
Connection pools are the most generic way to go here. There are many reasons: separating the concerns of connection management from connection use, scalability control through configuration, overcoming the latency associated with connection setup, and so on.
There are many implementations of simple connection pools out there, and they can be found in application servers or libraries. c3p0 is a nice and easy one for a standalone, self-contained webapp.
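As a rough illustration of what using a pool looks like from the service's side, the sketch below borrows a connection from a javax.sql.DataSource for each upload and returns it when done. The class name UploadStore and the table and column names are assumptions, not taken from the question. Under Tomcat the DataSource would typically be obtained once through a JNDI lookup of a configured resource.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper: stores one uploaded file per call using a pooled connection.
final class UploadStore {
    // Table and column names are made up for this sketch.
    private static final String INSERT_SQL =
            "INSERT INTO uploaded_files (file_name, content) VALUES (?, ?)";

    private final DataSource pool;

    UploadStore(DataSource pool) {
        this.pool = pool;
    }

    // Returns the number of rows inserted (expected to be 1).
    int save(String fileName, byte[] content) throws SQLException {
        // try-with-resources closes the statement and hands the connection back
        // to the pool; with a pooled DataSource, close() does not tear down
        // the physical connection.
        try (Connection con = pool.getConnection();
             PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            ps.setString(1, fileName);
            ps.setBytes(2, content);
            return ps.executeUpdate();
        }
    }
}
```

Because the method works directly on the received bytes, the same call could be made from the JAX-RS resource without first writing the file to disk.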
My aim is to create a local database that can be read and written to with Java. I have some experience with manipulating a local sqlite database with Python, and with interacting with existing networked databases on Microsoft Azure via VB.Net, but the Java formulation for creating a database is escaping me.
Most sources (like the JDBC Docs) seem to assume that you are accessing a database through a network protocol, or a database hosted on localhost. My desired implementation is to create and store the database in a file (or collection of files), so that it can be stored and accessed locally, without network connectivity (presumably through the "file:" protocol).
The JDBC Tutorial looks like it will be very useful once I am up and running, but is currently beyond my scope, since I don't even have an existing database yet.
Many sources have suggested solutions like H2, MySQL, Derby, or Hypersonic DB. However, I'm loath to install extensions (if that's the right term) for a number of reasons:
This project is initially intended to help me learn my way around Java - widening the scope of the project will dilute my experience with the "base" language and, probably, increase the temptation to engage in "cargo cult programming"
If this project does ever get distributed to other users (admittedly unlikely, but still!), I don't want to force them into installing more than the core of Java.
I simply don't know how to install extensions (add-ons? modules?) in Java - one baby-step at a time!
For similar reasons, installing Microsoft SQL Server would not be productive.
This answer looks close to what I'm aiming for; however, it gives the error:
java.sql.SQLException: No suitable driver found for jdbc:mysql://localhost/?user=root&password=rootpassword
and trying "jdbc:file://targetFile.sql" gives a similar error.
I've seen the term "embedded" database, which I think is a subset of "local database" (i.e. a local database is stored on the same system - an embedded database is a local database that is only used by a single application) - if I've got those definitions wrong, please feel free to correct me!
Most likely you are getting the error because the driver has not been registered (using reflection) before you use it to establish a connection.
Presumably you will want to do something along the lines of Class.forName("driver"), cast the result if necessary, and register it with the DriverManager before calling the getConnection() method.
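A small sketch of how this can be checked from code, using only java.sql. The class name DriverCheck is made up. Note that Class.forName can only succeed if the driver's .jar is on the classpath, and that JDBC 4.0 and later drivers normally register themselves once they are on the classpath, so a missing .jar is a common cause of "No suitable driver found".

```java
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical diagnostic helper for "No suitable driver found" errors.
final class DriverCheck {

    private DriverCheck() {}

    // Tries to load the driver class by name; false means it is not on the classpath.
    static boolean loadDriver(String driverClassName) {
        try {
            Class.forName(driverClassName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // True if some registered driver accepts the given JDBC URL.
    static boolean hasDriverFor(String jdbcUrl) {
        try {
            return DriverManager.getDriver(jdbcUrl) != null;
        } catch (SQLException e) {
            // DriverManager reports "No suitable driver" through an SQLException.
            return false;
        }
    }
}
```

Calling DriverCheck.hasDriverFor with the URL from the error message, before calling getConnection(), shows whether the problem is the driver or the database itself.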
Here is a very useful link that might help you out in solving the issue:
http://www.kfu.com/~nsayer/Java/dyn-jdbc.html
However, if you really want to use a local database file, you might want to have a look at SQLite. That said, I recommend the MySQL approach, as it is a lot easier to configure and makes it easier to learn how things work with JDBC.
If you are still considering SQLite check this out:
Java and SQLite
I see you need some guidance in importing external .jar files into your code (i.e. 3rd party libraries like the ones you will be using for a JDBC driver). Are you using an IDE (e.g. Eclipse, Netbeans, etc.) or are you writing in a text editor and compiling manually?
A number of embedded pure Java databases appeared recently, which have a really simple interface, usually just java.util.Map, don't involve using JDBC or other SQL artifacts, and store their data in a single file or directory:
Chronicle Map
JetBrains Xodus
MapDB
The main downside is that most such databases provide only the simplest key-value model.
JDBC can be used with any database that has a JDBC driver. That database isn't necessarily running in "network mode"; JDBC can be used with embedded databases as well.
Here are some Java and embeddable databases:
http://www.h2database.com/html/main.html
http://db.apache.org/derby/
http://hsqldb.org/
Java's JDK does not include any implementation of a database, nor drivers to access one. It only provides JDBC as an abstraction for connecting to a "database". It is up to you to include all the needed libraries in your code.
If you want self-contained code, you can simply include the .jar file of an embeddable database in your classpath. That way you can create the database instance in your code and minimize the external dependencies.
You can find here a list of java embeddable databases
You can find here an example of how to embed HSQLDB in your code.
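As a hedged illustration, the sketch below shows the file-based JDBC URL forms commonly used by the three embeddable databases linked above and opens a connection through plain JDBC. It assumes the matching .jar is on the classpath; the class name EmbeddedDb and the example path are made up.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper: builds file-based JDBC URLs for embeddable databases.
final class EmbeddedDb {

    private EmbeddedDb() {}

    // H2: database files are created next to the given path, e.g. "./data/app".
    static String h2Url(String path) {
        return "jdbc:h2:file:" + path;
    }

    // Derby: ";create=true" creates the database directory if it does not exist.
    static String derbyUrl(String path) {
        return "jdbc:derby:" + path + ";create=true";
    }

    // HSQLDB: "file:" selects a database stored in local files.
    static String hsqldbUrl(String path) {
        return "jdbc:hsqldb:file:" + path;
    }

    // Opens the database in-process; no server is involved. Throws SQLException
    // ("No suitable driver") if the database .jar is missing from the classpath.
    static Connection open(String jdbcUrl) throws SQLException {
        return DriverManager.getConnection(jdbcUrl);
    }
}
```

From there the connection is used exactly like a connection to a networked database, for example EmbeddedDb.open(EmbeddedDb.h2Url("./data/app")).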
I'm working on a school project where the client needs to have multiple users querying and writing to a single data source. The users have access to shared network drives, and all functionality has to be in the client application: the IT department won't allow a service to run from one of their servers, and external server hosting isn't an option.
The amount of data that needs to be stored is very small, about 144 rows maximum.
I've looked into using embedded databases (SQLite, HSQLDB, ObjectDB, etc.), but they seem like overkill for how little data needs to be saved. It also seemed that with HSQLDB, if anyone accessed the database it would be completely locked to any other user. Concurrency wouldn't be much of an issue: there will be 5-7 people using the system, and only rarely, a few times a year.
Would using something like XQuery and serializing everything in XML be a viable option, or just simply using the Java serialization API?
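For a sense of scale, the plain serialization option could look like the minimal sketch below. It assumes rows are simplified to strings and that the class name RowStore is free to choose; it writes to a temporary file first and then replaces the shared file so readers never see a half-written file. It does not coordinate two users saving at the same moment.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps a small list of rows in one serialized file.
final class RowStore {

    private RowStore() {}

    // Writes all rows to a sibling temp file, then swaps it in place of the target.
    static void save(Path file, List<String> rows) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(new ArrayList<>(rows));
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    // Reads all rows back; a missing file is treated as an empty data set.
    @SuppressWarnings("unchecked")
    static List<String> load(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new ArrayList<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (List<String>) in.readObject();
        }
    }
}
```

With around 144 rows, loading and rewriting the whole file on each change is cheap; the open question is concurrent writers, which this sketch leaves unsolved.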
A distributed, client-side database writing files to the shared network drive could be a good solution for this use case. Take a look at Cloud DB; it might be what you're looking for.
Does the term 'embedded database' carry different meaning from 'database'?
There are two definitions of embedded databases I've seen:
Embedded database as in a database system particularly designed for the "embedded" space (mobile devices and so on.) This means they perform reasonably in tight environments (memory/CPU wise.)
Embedded database as in databases that do not need a server, and are embedded in an application (like SQLite.) This means everything is managed by the application.
I've personally never seen the term used exactly as Wikipedia defines it, but that's probably my fault, although it resembles quite a bit my number 2 above.
The word 'embedded' does add meaning, basically that the database is dedicated to a specific application rather than shared among multiple applications, to a degree hidden from the user of the application, and completely controlled by the application.
An embedded database is conceptually just a part of the application rather than a separate thing.
Just see the usage of ... for example an H2 embedded database. You don't need a server running on your machine; your whole database is stored in one local file (originally these were two). It is opened and locked when you connect to your DB, and it is unlocked when you disconnect.
When a developer embeds a database library inside an application and no administrator is needed, it is called an embedded database. The database is hidden, but data management via SQL (e.g. ITTIA DB SQL) or NoSQL (e.g. Berkeley DB) is accessible through APIs. Embedded databases are common in web development and device applications.