I am evaluating databases for a new application in which I will have to browse and index millions of XML files and then run analyses on that data.
I would like to use SnappyData in this project. However, I do not know how it works.
Is it recommended for this type of application?
Is it possible to use it with Spring-Data-JPA?
In addition to storing the XML files themselves, I would like to store the application's other data (users and system settings) in the same database instead of PostgreSQL. Is that recommended?
SnappyData is a hybrid distributed database designed primarily to manage data in memory. So, the simple answer is yes.
Do you have specific criteria? Postgres should work too.
To load XML you can use the spark-xml project from Databricks.
Honestly, I do not enjoy working with SQLite on Android beyond trivial apps. It is a real pain to keep the database structure up to date between app versions and actually writing data access code is not much fun either when one is used to working with Hibernate and Entity Framework.
I am hoping there are alternative ways for me to store persistent data that will be reliable and robust. E.g. would serializing a collection of objects to external storage be an option? I expect my data to be around 5MB at most at any time.
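For a data set of a few megabytes, the "serialize a collection of objects" idea can be sketched with plain Java serialization. This is only an illustration under assumptions: `StockRecord`, its fields, and `RecordStore` are hypothetical names, and the target `File` is assumed to point at app-private or external storage chosen by the caller.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type; the field names are illustrative only.
final class StockRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    final String code;
    boolean processed;

    StockRecord(String code, boolean processed) {
        this.code = code;
        this.processed = processed;
    }
}

final class RecordStore {
    // Writes the whole list to the given file, replacing any previous content.
    static void save(List<StockRecord> records, File file) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(new ArrayList<>(records));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the list back; a missing file is treated as "no records yet".
    @SuppressWarnings("unchecked")
    static List<StockRecord> load(File file) {
        if (!file.exists()) {
            return new ArrayList<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (List<StockRecord>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Stored data does not match current classes", e);
        }
    }
}
```

The trade-off is that the whole file is rewritten on every save, and changing the class layout between app versions can make old files unreadable, so this mostly moves the versioning problem rather than removing it.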
Are there any other options? Specifically, I am downloading e.g. stock lists and contact details from a server, then allowing the user to mark records as processed, etc. I was thinking of an XML file, but that creates another problem: how to robustly handle XML in Java using the Android API.
Obviously first prize would have been a NoSQL database, but I know that's not going to be practical even if a stable mobile version existed.
Have you looked at SQLite frameworks for Android that give you DAOs and generate the database from your POJOs for you (as Hibernate does)?
For example: http://greenrobot.org/greendao/
Then you can easily update and version your database structure.
I have a desktop java application that can be run by different users. It makes use of JPA to access a database that is stored as a file. For this purpose I do not want to run a separate database server.
The purpose of the database is to store actions that are performed in the program; each one is a simple "store record" operation. All users must be able to read these stored records.
How can I make sure the different application instances can save their actions without overwriting each other's? So I need a way to:
Open the database (file)
Lock it
Write the added action
Close the database (file)
Is there a way in JPA to do/enforce this? I'm currently using Hibernate, but that's not a strict requirement.
Please do not answer with "you can do this with technology XXX". Please note that I'm concerned about the concurrency issues and how to enforce that the file is opened and closed again. How can this be done with technology XXX?
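The open / lock / write / close sequence listed above can be sketched at the plain-file level with `java.nio` file locks. This deliberately bypasses JPA and is not something a JPA provider does for you; `ActionLog` and the one-action-per-line text format are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

final class ActionLog {
    // Appends one action as a line of text while holding an exclusive lock on the file.
    static void append(Path logFile, String action) {
        byte[] line = (action + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        // 1. Open the file (creating it if needed).
        try (FileChannel channel = FileChannel.open(logFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             // 2. Lock it: blocks until no other process holds the lock.
             FileLock lock = channel.lock()) {
            // 3. Write the action at the current end of the file and flush it to disk.
            channel.position(channel.size());
            ByteBuffer buffer = ByteBuffer.wrap(line);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // 4. try-with-resources has released the lock and closed the file by this point.
    }

    // Returns every action stored so far, one per line.
    static List<String> readAll(Path logFile) {
        try {
            return Files.readAllLines(logFile, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that a `FileLock` is held on behalf of the whole JVM, so it coordinates separate application processes but not threads inside one process, and how strictly the lock is enforced depends on the operating system and file system (network shares in particular may behave differently).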
You can try using an SQLite database file. In that case you get concurrent access to the same file (i.e. the database), and at the same time you can use the SQLite JDBC driver along with your JPA provider (e.g. Hibernate).
The only disadvantage is that, strictly speaking, it is not a pure Java approach, since the proposed JDBC driver bundles native libraries, but I wouldn't consider this an issue.
JPA is an ORM (Object-Relational Mapping) specification; the R in ORM means the same thing as the R in RDBMS. JPA is therefore not suitable for flat-file persistence systems.
I'm currently working my way into JPA 2.0 and I'm starting to like how easy it is to maintain persistent data.
What I'm currently trying to accomplish is using JPA in a basic desktop application. The application should allow me to open embedded databases that are on my file system. I chose H2 databases for now, but I could live with switching to JavaDB or anything else.
What I'm trying to achieve is that one can open the database file without first defining a persistence-unit in the persistence.xml file.
I can easily define a unit and persist objects, but it needs to be configured first.
I want to write some sort of database browser which allows opening without preconfiguration and recompiling.
http://www.objectdb.com/java/jpa/start/connection
I saw that ObjectDB allows access for this type of PersistenceFactory creation, but I was not able to transfer this example to other databases.
Am I totally wrong in the way I approach this problem? Is JPA not designed for on-the-fly database access?
Thank you for your help,
Johannes
This is not part of the JPA standard, but some implementations may offer their own API for it. For example, with DataNucleus, the end of this page http://www.datanucleus.org/products/accessplatform_3_0/jpa/persistence_unit.html describes how to create dynamic persistence-units (and hence EMFs), and that implementation allows persistence to a very wide range of datastores.
You can pass a Map of properties to the createEntityManagerFactory() call that defines the database connection info, etc. The property names are the same as in persistence.xml. I assume most JPA providers support this; EclipseLink does.
You will still need to define the set of classes for the database and map them.
If you do not have any classes either, then you could look into EclipseLink's dynamic support:
http://wiki.eclipse.org/EclipseLink/Examples/JPA/Dynamic
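As a rough sketch of the "Map of properties" approach, the connection settings for a database file picked at runtime could be assembled as below. The property keys are the standard JPA 2.0 `javax.persistence.jdbc.*` names; `RuntimeConnection`, the H2 URL form, and the persistence-unit name `"browser"` are assumptions for illustration, and the unit is still assumed to list the mapped classes.

```java
import java.util.HashMap;
import java.util.Map;

final class RuntimeConnection {
    // Builds JPA 2.0 connection properties for an H2 database file chosen at runtime.
    static Map<String, String> h2Properties(String databasePath, String user, String password) {
        Map<String, String> props = new HashMap<>();
        props.put("javax.persistence.jdbc.driver", "org.h2.Driver");
        props.put("javax.persistence.jdbc.url", "jdbc:h2:" + databasePath);
        props.put("javax.persistence.jdbc.user", user);
        props.put("javax.persistence.jdbc.password", password);
        return props;
    }

    // With a JPA provider and the H2 driver on the classpath, the map would be used like this
    // ("browser" is a hypothetical persistence-unit name that lists the entity classes):
    //
    //   EntityManagerFactory emf =
    //       javax.persistence.Persistence.createEntityManagerFactory("browser", props);
}
```

Properties passed this way override the values with the same names in persistence.xml, so one generic unit can be pointed at different database files without recompiling.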
If you want to make a database browser accessing different databases, you can't use a PU/Entity Manager (imo).
You'll need a dialogue asking a user for the IP/Port of the database, the username/password, the database name to access, and the type of database.
Then all you need to do is create a socket, send requests over the socket, and parse the response into a view.
Since both the request and the response are database specific, the user has to select the proper database driver.
I have completed my Address Book project in core Java; its data is stored in a database (MySQL).
The problem I am facing is that when I run my program on another computer, I have to create the whole database again.
So please suggest an alternative for storing my data without using database software such as MySQL, SQL, etc.
You can use an embedded (optionally in-memory) database such as HSQLDB, Derby (a.k.a. JavaDB), or H2.
All of those can run without any additional software installation and can be made to act like just another library.
I would suggest using an embeddable, lightweight database such as SQLite. Check it out.
From the features page (under the section Suggested Uses For SQLite):
Application File Format. Rather than using fopen() to write XML or some proprietary format into disk files used by your application, use an SQLite database instead. You'll avoid having to write and troubleshoot a parser, your data will be more easily accessible and cross-platform, and your updates will be transactional.
The whole point of StackOverflow was so that you would not have to email around questions/answers :)
You could store the data in the filesystem or in memory (using serialisation etc.), which are simple alternatives to a DB. You can even use HSQLDB, which can run completely in memory.
If your data is not too big, you can use a simple text file, store everything in it, and load it into memory. But this will change the way you modify and query the data.
Database software like MySQL saves you implementation effort by providing an abstraction. If you wish to avoid it, you can think of keeping your own data store in XML or flat files. XML is the better choice of the two, as XML parsers and handlers are readily available. Even so, keeping your data in a customised store or flat files will not be manageable in the long run.
Why don't you explore SQLite? It is file based, which means you don't need to install it separately, and you still have standard SQL to retrieve and interact with the data. I think SQLite will be a better choice.
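A minimal sketch of the text-file approach for an address book might look like this. The tab-separated "name, phone" line format and the `TextAddressBook` class are assumptions for illustration; names containing tabs or line breaks would need escaping, which is not handled here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TextAddressBook {
    // Loads "name<TAB>phone" lines into memory; a missing file means an empty book.
    static Map<String, String> load(Path file) {
        Map<String, String> entries = new LinkedHashMap<>();
        if (!Files.exists(file)) {
            return entries;
        }
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String[] parts = line.split("\t", 2);
                if (parts.length == 2) {
                    entries.put(parts[0], parts[1]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    // Rewrites the whole file from the in-memory map.
    static void save(Map<String, String> entries, Path file) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            lines.add(entry.getKey() + "\t" + entry.getValue());
        }
        try {
            Files.write(file, lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

All querying then happens on the in-memory map, which is the change in how data is modified and queried that this approach implies.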
Just use Prevayler (prevayler.org). Faster and simpler than using a database.
I assume from your question that you want some form of persistent storage on the local file system of the machine your application runs on. In addition to that, you need to decide how the data in your application is to be used, and the volume of it. Do you need a database? Are you going to be searching the data on different fields? Do you need a query language? Is the data small enough to fit into a simple data structure in memory? How resilient does it need to be? The answers to these types of questions will help lead to the correct choice of storage. It could be that all you need is a simple CSV file, XML or similar. There are a host of lightweight databases such as SQLite, Berkeley DB, JavaDB etc., but whether or not you need the power of a database is up to your requirements.
A store that I'm using a lot these days is Neo4j. It's a graph database and is not only easy to use but also is completely in Java and is embedded. I much prefer it to a SQL alternative.
In addition to the other answers about embedded databases: I have been working on an object database that serializes Java objects directly, without the need for an ORM. Its name is Sofof and I use it in my projects. It has many features, which are described on its website.
I need to create a storage file format for some simple data in a tabular format, was trying to use HDF5 but have just about given up due to some issues, and I'd like to reexamine the use of embedded databases to see if they are fast enough for my application.
Is there a reputable embedded Java database out there that has the option to store data in one file? The only one I'm aware of is SQLite (Java bindings available). I tried H2 and HSQLDB but out of the box they seem to create several files, and it is highly desirable for me to have a database in one file.
edit: reasonably fast performance is important. Object storage is not; for performance concerns I only need to store integers and BLOBs. (+ some strings but nothing performance critical)
edit 2: storage data efficiency is important for larger datasets, so XML is out.
Nitrite Database http://www.dizitart.org/nitrite-database.html
NOsql Object (NO2 a.k.a Nitrite) database is an open source nosql embedded document store written in Java with MongoDB like API. It supports both in-memory and single file based persistent store.
H2 uses only one file, if you use the latest H2 build with the PAGE_STORE option. It's a new feature, so it might not be solid.
If you only need read access then H2 is able to read the database files from a zip file.
Likewise if you don't need persistence it's possible to have an in-memory only version of H2.
If you need both read/write access and persistence, then you may be out of luck with standard SQL-type databases, as these pretty much all uniformly maintain the index and data files separately.
I once used an object database that saved its data to a file. It has a Java and a .NET interface. You might want to check it out. It's called db4o.
Chronicle Map is an embedded pure Java database.
It stores data in one file, e.g.:
    // databaseFile is a java.io.File supplied by the caller
    ChronicleMap<Integer, String> map = ChronicleMap
        .of(Integer.class, String.class)
        .averageValue("my-value")
        .entries(10_000)
        .createPersistedTo(databaseFile);
Chronicle Map is mature (no severe storage bugs reported for months now, while it's in active use).
Independent benchmarks show that Chronicle Map is the fastest and the most memory efficient key-value store for Java.
The major disadvantage for your use case is that Chronicle Map supports only a simple key-value model; however, more complex solutions could be built on top of it.
Disclaimer: I'm the developer of Chronicle Map.
If you are looking for a small and fast database, perhaps to ship with another program, I would check out Apache Derby. I don't know how you would define an embedded database, but I have used Derby in some projects as a debugging database that can be checked in with the source and is instantly available on every developer machine.
This isn't an SQL engine, but if you use Prevayler with XStream, you can easily create a single XML file with all your data. (Prevayler calls it a snapshot file.)
Although it isn't SQL-based, and so requires a little elbow grease, its self-contained nature makes development (and especially good testing) much easier. Plus, it's incredibly fast and reliable.
You may want to check out jdbm - we use it on several projects, and it is quite fast. It does use 2 files (a database file and a log file) if you are using it for ACID type apps, but you can drop directly to direct database access (no log file) if you don't need solid ACID.
JDBM will easily support integers and blobs (anything you want), and is quite fast. It isn't really designed for concurrency, so you have to manage the locking yourself if you have multiple threads, but if you are looking for a simple, solid embedded database, it's a good option.
Since you mentioned SQLite, I assume that you don't mind a native DB (as long as good Java bindings are available). Firebird works well with Java, and does single-file storage by default.
Both H2 and HSQLDB would be excellent choices, if you didn't have the single file requirement.
I think for now I'm just going to continue to use HDF5 for the persistent data storage, in conjunction with H2 or some other database for in-memory indexing. I can't get SQLite to use BLOBs with the Java driver I have, and I can't get embedded Firebird up and running, and I don't trust H2 with PAGE_STORE yet.