I have a Java web project that handles several objects (which in turn contain n objects of type A (e.g. time and value) and m objects of type B (e.g. time and String array)). The web project itself contains several servlets/JSPs for visualization as well as some logic for data manipulation, and it currently runs on Apache Tomcat.
Is it possible to keep all the data in the server's (most of the time: local) memory while the server is running? When Tomcat is shut down, the data could be stored in a simple file, no restrictions there. On server startup, I just want to read the files and load the objects into memory. How can I get Tomcat to do this?
The reason I do not want to use a separate database is that I want to deliver a zip file containing Tomcat together with the deployed *.war file (I don't want my prof getting stuck on Tomcat server setup etc.).
Thanks, ChrisH
You could implement ServletContextListener and write the load-from-file and save-to-file logic in the contextInitialized() and contextDestroyed() methods which are invoked during webapp's startup and shutdown respectively.
You can read and write objects to disk, but they all need to implement java.io.Serializable first. Here is a Serialization tutorial with code examples.
That said, have you considered an embedded database so that you don't need to install a database server? You could use JDK 6's built-in JavaDB for this, or its competitor HSQLDB. Alternatively, if the data consists purely of key-value pairs, you could just use the java.util.Properties API (tutorial here). Place the properties file somewhere in the classpath and use ClassLoader#getResourceAsStream() to get an InputStream of it, or place it somewhere in WEB-INF and use ServletContext#getResourceAsStream().
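A minimal sketch of the java.util.Properties approach, assuming the data really is plain string key-value pairs. The class name PropertiesStore and its methods are made up for illustration. It reads and writes an ordinary file path rather than a classpath resource, because getResourceAsStream() only covers reading and the data here also has to be written back.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical helper: loads and stores simple key-value data with java.util.Properties. */
class PropertiesStore {

    /** Reads all key-value pairs from the file; returns an empty Properties if the file is missing. */
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    /** Writes all key-value pairs to the file, replacing its previous content. */
    static void save(Properties props, Path file) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "application data");
        }
    }
}
```

load() could be called from contextInitialized() and save() from contextDestroyed(), with the file location passed in from configuration.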
I think HSQLDB is exactly what you need: a small database that can run embedded inside your web application on Apache Tomcat. It keeps data in memory and can also write its contents to a file and read them back.
If the app shuts down unexpectedly, you'll lose all your data, because it won't have time to write it to disk.
You could use a database like SQLite, Derby, or HSQLDB, which store their data on the filesystem.
If you don't want to mess with a DB, then you could store everything in memory and flush it to disk every time it's modified. A couple of tips here:
Serialization can make this really easy. Make all your objects implement Serializable, and give them a serialVersionUID.
Use a BufferedOutputStream when writing to disk; this is faster than a straight FileOutputStream.
DO NOT overwrite your old data file directly! Write to a new file, and when done writing, move the completed file on top of your old file. That way, if the server shuts down while you're in the middle of writing your data file, you still have the good file which was written before.
You should acquire a read lock on your data while writing it to disk, since saving only reads the in-memory objects. Any other code which modifies the data should get a write lock on the data.
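The tips above can be sketched as follows. This is only an illustration under assumptions: Sample and SampleStore are hypothetical names, the record layout (a time and a value) is borrowed from the question, and the fallback to a non-atomic move is a choice made here for file systems that do not support atomic moves.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical record: a timestamp and a value. */
class Sample implements Serializable {
    private static final long serialVersionUID = 1L;
    final long time;
    final double value;

    Sample(long time, double value) {
        this.time = time;
        this.value = value;
    }
}

/** Hypothetical in-memory store that is flushed to disk after every modification. */
class SampleStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Sample> samples = new ArrayList<>();
    private final Path dataFile;

    SampleStore(Path dataFile) {
        this.dataFile = dataFile;
    }

    /** Modifies the data under the write lock, then flushes it to disk. */
    void add(Sample s) throws IOException {
        lock.writeLock().lock();
        try {
            samples.add(s);
        } finally {
            lock.writeLock().unlock();
        }
        save();
    }

    /** Number of samples currently held in memory. */
    int size() {
        lock.readLock().lock();
        try {
            return samples.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Writes to a temp file under the read lock, then moves it over the old file. */
    synchronized void save() throws IOException {
        // synchronized: only one thread at a time may use the temp file.
        Path tmp = dataFile.resolveSibling(dataFile.getFileName() + ".tmp");
        lock.readLock().lock();
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(tmp)))) {
            out.writeObject(new ArrayList<>(samples));
        } finally {
            lock.readLock().unlock();
        }
        try {
            Files.move(tmp, dataFile, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Assumption: a plain replace is acceptable where atomic moves are unavailable.
            Files.move(tmp, dataFile, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** Replaces the in-memory data with the file's content, if the file exists. */
    @SuppressWarnings("unchecked")
    void load() throws IOException, ClassNotFoundException {
        if (!Files.exists(dataFile)) {
            return;
        }
        List<Sample> loaded;
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(dataFile)))) {
            loaded = (List<Sample>) in.readObject();
        }
        lock.writeLock().lock();
        try {
            samples.clear();
            samples.addAll(loaded);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Rewriting the whole file on every change is simple but gets expensive as the data grows, which is one of the reasons the database suggestions in this thread exist.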
If you don't care about the possibility that your application may scribble all over your data files, your Tomcat / JVM may crash, or your machine may die losing all in-memory objects, then managing persistence as you suggest is an option. But you'll have quite a bit of infrastructure to build, test and maintain. And you'll miss out on the "value add" tools that most RDBMSs provide: backup, a query tool, optimizers, replication, etc.
But if catastrophic data loss is not an option for you, you should use an RDBMS, an ODBMS, or whatever else to do your persistence.
My question is about a bukkit plugin.
I want to save data when the server closes, but I can't find the best way to save it. All the data I want to save are strings. What is the best way?
Using a YAML file saved in the server files, using a MySQL database, or something else?
The majority of Bukkit developers prefer YAML because of its availability, which has made it the standard: SnakeYAML is included in Bukkit. If you write code that will be shared, such as open source or code for a team of developers, YAML is almost a necessity.
MySQL should only be used when the data needs to be shared between multiple servers, such as in a network. If you join any network, for instance a minigame network, your player data is most likely stored in a database so that you have the same points on every one of its servers. Why not always use MySQL? It requires a connection to be opened, which may fail; this means the server depends on another source, which you usually want to avoid. MySQL is usually also slower than the alternatives.
What about other files/methods? I've seen developers store data using JSON or even pure text files, claiming it's faster, but this should only really be considered if you have performance issues or generally prefer that file type.
I am currently working on a project that migrates data from MySQL to NoSQL, using Java as the programming language. The process involves the following steps:
Read the MySQL data and write it to a file in JSON format
Read the JSON file and write it into NoSQL
Write an error log if any error occurs in either of the steps above
However, the migration could be done without using the file as an intermediate layer. I found that many tools and theses use the design above, so I just followed it. Is there any benefit to using a file as a middle layer instead of migrating directly?
To answer the question outright: yes, there are benefits, but it depends on your overall implementation.
Here are a couple things to consider (as to why it could be an asset).
Integrity in case of failure. Depending on how the process runs, if something terrible happens during the transfer, having the files shows you where/why a problem occurred.
If your databases are physically separated the files would save you a lot of overhead traffic between servers.
It is generally easier to debug a file than a process. It is easy to see the problem when exporting to a somewhat readable file, versus tracking down the same bug at runtime.
Reasons against:
Files take up extra space you may not want to use.
Slower overall (since this in effect requires reading from the database, saving to file, loading from file, and writing to the database).
It adds an extra point of failure. You have to read and write to the database, and convert into a usable format regardless of file implementation. However the added layer of using a file increases the risk of failure (such as missing files, corrupted, too large, etc).
Since storage and bandwidth are a concern in your situation, here would be my recommendation. If you have enough storage to accommodate the files during the transfer (e.g. they are temporary) then transfer using files as it will save you bandwidth. Deleting the files afterwards makes storage less of an issue.
I have a Java utility for database imports. I'd like to be able to use sqlldr for performance on oracle. I could create the control and data files, but that doesn't seem like The Right Thing™ to do. I should be able to stream the data by providing INFILE "-" in the control file (q1 - how? from command line, I can pipe "echo <data...>" to the sqlldr, but there must be a way to just stream the string into the input stream for the process? never used Java for this before). I can't see how to stream the control file itself (q2 - or am I missing something obvious?). I could use named pipes, but I have no idea how to instantiate and use them from Java in windows (q3 - would that work and how?).
<moan>why must oracle be so complicated? it was trivial in mysql...<moan>
"why must oracle be so complicated? it was trivial in mysql"
What you must remember is, Oracle is a venerable product. SQL Loader as a utility must be twenty years old, maybe more. So naturally it is harder to work with than some newer tools.
And that is why you should stop trying to fit SQL Loader into your new-fangled Java app :-) Look at external tables instead. Because these are database objects we can use SQL SELECTs against them, so it's a whole lot easier to automate load processes with them. I wrote a bit more about external tables in my answer to another question. Check it out.
Fundamentally SQLLDR is about getting data from one or more files into a database table. It is powerful in that role, especially when dealing with multiple files or parallel loads from a single file (it can have multiple threads/processes reading from the same file at the same time).
Not all of these fit well with reading from something that isn't a real file. If your data stream is coming from a web service, then I'd pull it using UTL_HTTP. If it is coming from FTP, then I'd FTP straight into the database as a CLOB/BLOB and process it from there.
Depending on your version, also look at the preprocessor capabilities of external tables
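On q1 from the question (writing data straight into a child process's standard input from Java), a minimal sketch with ProcessBuilder might look like this. StdinFeeder is a hypothetical name, and the command to run is left as a parameter. Whether sqlldr actually reads its data from standard input with INFILE "-" is taken from the question and has not been verified here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper: starts a command and streams lines into its standard input. */
class StdinFeeder {

    /** Runs the command, writes each line to its stdin, and returns its exit code. */
    static int feed(List<String> command, Iterable<String> lines)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Pass the child's output through to this JVM's console so that
        // a full pipe buffer cannot block the child.
        pb.redirectOutput(ProcessBuilder.Redirect.INHERIT);
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = pb.start();
        // getOutputStream() is the stream connected to the child's standard input.
        try (Writer stdin = new BufferedWriter(
                new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                stdin.write(line);
                stdin.write('\n');
            }
        } // closing the stream signals end of input to the child
        return process.waitFor();
    }
}
```

The character encoding (UTF-8 here) and the line terminator are assumptions and would have to match what the loader expects.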
We have a utility spring-mvc application that doesn't use a database, it is just a soap/rest wrapper. We would like to store an arbitrary message for display to users that persists between deployments. The application must be able to both read and write this data. Are there any best practices for this?
Multiple options.
Write something to the file system - Great for persistence. A little slow. Primary drawback is that it would probably have to be a shared file system, as any type of clustering wouldn't deal well with this. Then you get into file locking issues. Very easy implementation
Embedded DB - Similar benefits and pitfalls as just writing to the file system, but probably deals better with locking/transactional issues. Somewhat more difficult implementation.
Distributed Cache - Like Memcached - A bit faster than file, though not much. Deals with the clustering and locking issues. However, it's not persistent. Fairly reliable for a short webapp restart, but definitely not 100%. More difficult implementation, plus you need another server.
Why not use an embedded database? Options are:
H2
HSQL
Derby
Just include the jar file in the webapp's classpath (for example in WEB-INF/lib) and configure the JDBC URL as normal.
Perfect for demos, and easy to substitute when you want to switch to a bigger database server.
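As a rough illustration of "configure the JDBC URL as normal", the embedded, file-based URL forms for the three databases look roughly like this. EmbeddedDb is a hypothetical helper and the path data/appdb is an assumption. The user and password to pass depend on the database and on how it was created.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper showing embedded-mode JDBC URLs; the path "data/appdb" is an assumption. */
class EmbeddedDb {
    static final String H2_URL = "jdbc:h2:file:./data/appdb";
    static final String HSQLDB_URL = "jdbc:hsqldb:file:data/appdb";
    // ";create=true" asks Derby to create the database if it does not exist yet.
    static final String DERBY_URL = "jdbc:derby:data/appdb;create=true";

    /** Opens a connection; the matching driver jar must be on the classpath. */
    static Connection open(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }
}
```

Switching to a bigger database server later then mostly means changing the URL and the driver jar, as long as the SQL used stays portable.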
I would simply store that in a file on a filesystem. It's possible to use an embedded database, or something like that, but for one message, a file will be fine.
I'd recommend you store the file outside of the application directory.
It might be alongside (next to) it, but don't go storing it inside your "webapps/" directory, or anything like that.
You'll probably also need to manage concurrency. A global (static) read/write lock should do fine.
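A small sketch of the file-plus-global-lock idea, assuming a single JVM. MessageFile is a hypothetical name and the caller supplies the file location, which should be outside the webapp directory as recommended above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical holder for one persistent message kept in a file outside the webapp directory. */
class MessageFile {
    // One lock for the whole JVM: many concurrent readers, one writer at a time.
    private static final ReadWriteLock LOCK = new ReentrantReadWriteLock();

    /** Returns the stored message, or an empty string if nothing has been stored yet. */
    static String read(Path file) throws IOException {
        LOCK.readLock().lock();
        try {
            return Files.exists(file)
                    ? new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
                    : "";
        } finally {
            LOCK.readLock().unlock();
        }
    }

    /** Replaces the stored message. */
    static void write(Path file, String message) throws IOException {
        LOCK.writeLock().lock();
        try {
            Files.write(file, message.getBytes(StandardCharsets.UTF_8));
        } finally {
            LOCK.writeLock().unlock();
        }
    }
}
```

The static lock only coordinates threads inside one JVM, so it does not help if several instances share the same file.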
I would use JNDI. Why over-complicate?
I need to create a storage file format for some simple data in a tabular format, was trying to use HDF5 but have just about given up due to some issues, and I'd like to reexamine the use of embedded databases to see if they are fast enough for my application.
Is there a reputable embedded Java database out there that has the option to store data in one file? The only one I'm aware of is SQLite (Java bindings available). I tried H2 and HSQLDB but out of the box they seem to create several files, and it is highly desirable for me to have a database in one file.
edit: reasonably fast performance is important. Object storage is not; for performance concerns I only need to store integers and BLOBs. (+ some strings but nothing performance critical)
edit 2: storage data efficiency is important for larger datasets, so XML is out.
Nitrite Database http://www.dizitart.org/nitrite-database.html
NOsql Object (NO2 a.k.a Nitrite) database is an open source nosql
embedded document store written in Java with MongoDB like API. It
supports both in-memory and single file based persistent store.
H2 uses only one file, if you use the latest H2 build with the PAGE_STORE option. It's a new feature, so it might not be solid.
If you only need read access then H2 is able to read the database files from a zip file.
Likewise if you don't need persistence it's possible to have an in-memory only version of H2.
If you need both read/write access and persistence, then you may be out of luck with standard SQL-type databases, as these pretty much all uniformly maintain the index and data files separately.
I once used an object database that saved its data to a file. It has a Java and a .NET interface. You might want to check it out. It's called db4o.
Chronicle Map is an embedded pure Java database.
It stores data in one file, for example:
    ChronicleMap<Integer, String> map = ChronicleMap
        .of(Integer.class, String.class)
        .averageValue("my-value")
        .entries(10_000)
        .createPersistedTo(databaseFile);
Chronicle Map is mature (no severe storage bugs reported for months now, while it's in active use).
Independent benchmarks show that Chronicle Map is the fastest and the most memory efficient key-value store for Java.
The major disadvantage for your use case is that Chronicle Map supports only a simple key-value model; however, a more complex solution could be built on top of it.
Disclaimer: I'm the developer of Chronicle Map.
If you are looking for a small and fast database to ship with another program, I would check out Apache Derby. I don't know how you would define an embedded database, but I have used it in some projects as a debugging database that can be checked in with the source and is instantly available on every developer machine.
This isn't an SQL engine, but if you use Prevayler with XStream, you can easily create a single XML file with all your data. (Prevayler calls it a snapshot file.)
Although it isn't SQL-based, and so requires a little elbow grease, its self-contained nature makes development (and especially good testing) much easier. Plus, it's incredibly fast and reliable.
You may want to check out jdbm - we use it on several projects, and it is quite fast. It does use 2 files (a database file and a log file) if you are using it for ACID type apps, but you can drop directly to direct database access (no log file) if you don't need solid ACID.
JDBM will easily support integers and blobs (anything you want), and is quite fast. It isn't really designed for concurrency, so you have to manage the locking yourself if you have multiple threads, but if you are looking for a simple, solid embedded database, it's a good option.
Since you mentioned SQLite, I assume that you don't mind a native DB (as long as good Java bindings are available). Firebird works well with Java, and does single file storage by default.
Both H2 and HSQLDB would be excellent choices, if you didn't have the single file requirement.
I think for now I'm just going to continue to use HDF5 for the persistent data storage, in conjunction with H2 or some other database for in-memory indexing. I can't get SQLite to use BLOBs with the Java driver I have, and I can't get embedded Firebird up and running, and I don't trust H2 with PAGE_STORE yet.