How to store data for specific dates - java

I am writing an Android application in Java and have the following problem.
I want to store some data that I log on different days of the week. I then want to show this data in a diagram, for example, and to see the data that was logged on a given date. My question is: what is the best way to solve this problem? Should I use an SQLite database, or can I save my data in a List? It should be fast and easy to handle when I use the data in my statistics (e.g. a diagram) or filter for specific dates.

You will want to use some method that will be persistent across executions of your program, and of course a database will provide persistent storage. If you use a list, you'll have to save it to storage somehow (perhaps via serialising to a file).

To add to the answers above, and since you asked about it specifically: you should definitely consider SQLite over serializing your own file.
The 2013 PostgreSQL Conference Keynote presented some insightful statistics on the benefits of using SQLite over flat files. SQLite is, according to its creator (who gave the keynote), "a replacement for fopen()" and uses a mature, familiar SQL API, so it would seem perfectly suited to your needs.

The question is too vague and lacking in detail to provide a specific suggestion. But here are some rough ideas.
Little data, simple data
For small amounts of data in simple lists that can fit into memory, write values to text files. I would use the Apache Commons CSV library to assist with the chore of actually writing the files in Comma-separated or Tab-delimited formats.
Little data, slightly complicated data
For storing slightly more complicated objects in a collection that can fit into memory, use the Simple XML Serialization library.
Much data, and/or very complicated data
If you have large amounts of data that do not fit comfortably into memory, or you have many interrelated lists that should be stored as related tables, use a relational database. SQLite is indeed very lite, intended as an alternative to writing to files, not intended to compete against full-fledged databases. For more serious database work, I suggest the H2 Database Engine, built in pure Java.
Be sure to learn about:
java.time classes (especially LocalDate & DayOfWeek)
ISO 8601 formats
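As a rough illustration of the java.time suggestion, here is a minimal in-memory sketch that groups logged values by LocalDate and filters them by date, date range, or DayOfWeek. The class name DailyLog and its methods are made up for this example, and it assumes the logged values are plain doubles. On Android, java.time is only available from API level 26 (or earlier via core library desugaring), so treat this as a sketch, not a drop-in solution. Persistence is left out on purpose.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical helper: keeps logged values grouped by the date they were logged on.
public class DailyLog {
    // TreeMap keeps the dates sorted, which makes range queries cheap.
    private final NavigableMap<LocalDate, List<Double>> byDate = new TreeMap<>();

    public void log(LocalDate date, double value) {
        byDate.computeIfAbsent(date, d -> new ArrayList<>()).add(value);
    }

    // All values logged on one specific date (empty list if none).
    public List<Double> on(LocalDate date) {
        return byDate.getOrDefault(date, Collections.emptyList());
    }

    // All values logged between two dates, both inclusive, in date order.
    public List<Double> between(LocalDate from, LocalDate to) {
        List<Double> out = new ArrayList<>();
        byDate.subMap(from, true, to, true).values().forEach(out::addAll);
        return out;
    }

    // All values logged on a given day of the week, in date order.
    public List<Double> on(DayOfWeek day) {
        List<Double> out = new ArrayList<>();
        for (Map.Entry<LocalDate, List<Double>> e : byDate.entrySet()) {
            if (e.getKey().getDayOfWeek() == day) {
                out.addAll(e.getValue());
            }
        }
        return out;
    }
}
```

Dates can be created from ISO 8601 strings with LocalDate.parse("2024-03-04"), which is also the format LocalDate.toString() produces, so the same text form works for storage and display.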

Related

How to store and access large and frequently used data in XML?

I'm creating a forum website for my college project. Just like the format of stackoverflow ( ;) ), each question and answer will have a comment facility. I'm expecting the size of the questions and answers to be large, and thus I don't want to use a database like MySQL. So what I'm planning to do is create an XML file for each question inside 'question' tags and, when someone answers, append an 'answer' tag to it. Can anyone help me with how to access the data from such XML files? I'm also open to suggestions on how to store the questions, answers and comments in some other way than XML files.
Thanks in advance.
I would strongly suggest using a relational database like MySQL. If you are not talking about vast amounts of data, relational databases perform great. Since you are dealing with a college project, I suppose there will be no need to pay the overhead of moving to another solution.
Anyway, if you want to store your data in a non-relational format, I would suggest moving to a NoSQL solution rather than a simple file-based solution. I would also suggest using the JSON format, which has less overhead than XML. MongoDB is a NoSQL database that is well suited to storing JSON data. Actually, it uses BSON, which is a binary JSON format.
Hope I helped!

Tools to do data processing from Java

I've got a legacy system that uses SAS to ingest raw data from the database, cleanse and consolidate it, and then score the outputted documents.
I'm wanting to move to a Java or similar object oriented solution, so I can implement unit testing, and otherwise general better code control. (I'm not talking about overhauling the whole system, but injecting java where I can).
In terms of data size, we're talking about around 1 TB of data being both ingested and created. In terms of scaling, this might increase by a factor of around 10, but isn't likely to increase on massive scale like a worldwide web project might.
The question is - what tools would be most appropriate for this kind of project?
Where would I find this information - what search terms should be used?
Is doing processing on an SQL database (creating and dropping tables, adding columns, as needed) an appropriate, or awful, solution?
I've had a quick look at Hadoop - but due to the small scale of this project, would Hadoop be an unnecessary complication?
Are there any Java packages that do similar functionality as SAS or SQL in terms of merging, joining, sorting, grouping datasets, as well as modifying data?
It's hard for me to prescribe exactly what you need given your problem statement.
It sounds like a good database API might be all you need (i.e. native JDBC with a good open source database backend).
However, I think you should take some time to check out Lucene. It's a fantastic tool and may meet your scoring needs very well. Taking a search engine indexing approach to your problem may be fruitful.
I think the questions you need to ask yourself are:
What is the nature of your data set, and how often will it be updated?
What workload will you have on this 1 TB or more of data in the future? Will there be mainly offline read and analysis operations, or will there also be a lot of random write operations?
Here is an article discussing whether or not to choose Hadoop, which I think is worth reading.
Hadoop is a better choice if you only have daily or weekly updates of your data set and the major operations on the data are read-only, along with further data analysis. For the merging, joining, sorting and grouping operations you mentioned, Cascading is a Java library running on top of Hadoop that supports these operations well.

Java Messenger : save message archives on the computer

I am writing a Java Messenger for people to chat, and I am looking for a way to record the message archives on the user's computer.
I have 2 possibilities in mind:
Save the conversations in XML files that I store in the user's documents folder.
Use SQLite, but the problem is that I don't know how to integrate it into my setup package, and I don't know if it is very useful.
What would be the best solution for you ?
Thank you
Another option is using JavaDb, which comes for free with Java 6 (and later versions)
Before you make a choice, you should think about questions such as:
presumably you want this transparent to the user (i.e. no admin involved)
is performance an issue ?
what happens if the storage schema needs migration
do you need transactionality (unlikely, I suspect)
etc. It's quite possible that even a simple text file would suffice. Perhaps your best bet is to choose a simple solution (e.g. a text file) and implement that, and see how far it takes you. However, provide a suitable persistence level abstraction such that you can slot in a different solution in the future with minimal disruption.
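The "persistence level abstraction" idea above could look something like the following sketch. The names MessageArchive and TextFileArchive are invented for illustration, and the one-message-per-line text format is an assumption, not something the answer prescribes. The point is only that the rest of the messenger depends on the interface, so an XML, JavaDB or SQLite implementation can be swapped in later.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;

// Hypothetical abstraction: the rest of the application only sees this interface.
interface MessageArchive {
    void append(String message) throws IOException;

    List<String> readAll() throws IOException;
}

// Simplest implementation: one message per line in a UTF-8 text file.
public class TextFileArchive implements MessageArchive {
    private final Path file;

    public TextFileArchive(Path file) {
        this.file = file;
    }

    @Override
    public void append(String message) throws IOException {
        // Replace any line break with a space so each message stays on one line.
        String line = message.replaceAll("\\R", " ") + System.lineSeparator();
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    @Override
    public List<String> readAll() throws IOException {
        if (!Files.exists(file)) {
            return Collections.emptyList();
        }
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }
}
```

Calling code would hold a MessageArchive reference and never mention TextFileArchive outside the place where it is constructed, which keeps a later switch to another storage format local to that one spot.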
I would go for the XML files, as they are more generic and can be opened outside your messenger in a more or less human-readable format. I use Pidgin for instant messaging and it saves chat history in XML. Also, to read the history from your application, you can easily transform them into HTML to display it nicely.
If you use JAXB, converting Java objects to/from XML is very easy. You just put a few annotations on your classes, and run them through a JAXB marshaller/unmarshaller. See http://docs.oracle.com/javaee/5/tutorial/doc/bnbay.html
Use Google's Protocol Buffers or 10gen's BSON. They are much smaller and faster.
http://code.google.com/apis/protocolbuffers/docs/javatutorial.html
http://bsonspec.org/
One issue is that these are binary representations, and you might want to make the archive transparent/readable to users.

Basic Java application data storage

I'm working on (essentially) a calendar application written in Java, and I need a way to store calendar events. This is the first "real" application I've written, as opposed to simple projects (usually for classes) that either don't store information between program sessions or store it as text or .dat files in the same directory as the program, so I have a few very basic questions about data storage.
How should the event objects and other data be stored? (.dat files, database of some type, etc)
Where should they be stored?
I'm guessing it's not good to load all the objects into memory when the program starts and not update them on the hard drive until the program closes. So what do I do instead?
If there's some sort of tutorial (or multiple tutorials) that covers the answers to my questions, links to those would be perfectly acceptable answers.
(I know there are somewhat similar questions already asked, but none of them I could find address a complete beginner perspective.)
EDIT: Like I said in one of the comments, in general with this, I'm interested in using it as an opportunity to learn how to do things the "right" (reasonably scalable, reasonably standard) way, even if there are simpler solutions that would work in this basic case.
For a quick solution, if your data structures (and of course the way you access them) are sufficiently simple, reading and writing the data to files, using your own format (e.g. binary, XML, ...), or perhaps standard formats such as iCalendar might be more suited to your problem. Libraries such as iCal4J might help you with that.
Taking into account the more general aspects of your question, this is a broader topic, but you may want to read about databases (relational or not). Whether you want to use them or not will depend on the overall complexity of your application.
A number of relational databases can be used in Java using JDBC. This should allow you to connect to the relational database (SQL) of your choice. Some of them run within their own server application (e.g. MS SQL, Oracle, MySQL, PostgreSQL), but some of them can be embedded within your Java application, for example: JavaDB (a variant of Apache Derby DB), Apache Derby DB, HSQLDB, H2 or SQLite.
These embeddable SQL databases will essentially store the data on files on the same machine the application is running on (in a format specific to them), but allow you to use the data using SQL queries.
The benefits include a certain structure to your data (which you build when designing your tables and possible constraints) and (when supported by the engine) the ability to handle concurrent access via transactions. Even in a desktop application, this may be useful.
This may imply a learning curve if you have to learn SQL, but it should save you the trouble of handling the details of defining your own file format. Giving structure to your data via SQL (often known by other developers) can be better than defining your own data structures that you would have to save into and read from your own files anyway.
In addition, if you want to deal with objects directly, without knowing much about SQL, you may be interested in Object-Relational Mapping frameworks such as Hibernate. Their aim is to hide the SQL details from you by being able to store/load objects directly. Not everyone likes them and they also come with their own learning curve (which may entail learning some details of how SQL works too). Their pros and cons could be discussed at length (there are certainly questions about this on StackOverflow or even DBA.StackExchange).
There are also other forms of databases, for example XML databases or Semantic-Web/RDF databases, which may or may not suit your needs.
How should the event objects and other data be stored? (.dat files,
database of some type, etc)
It depends on the size of the data to be stored (and loaded), and if you want to be able to perform queries on your data or not.
Where should they be stored?
A file in the user directory (or in a subdirectory of the user directory) is a good choice. Use System.getProperty("user.home") to get it.
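A minimal sketch of building such a path follows. The directory name ".mycalendar" and the file name "events.dat" are placeholders chosen for this example, not names from the answer.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DataLocation {
    // Placeholder names for illustration only.
    private static final String APP_DIR = ".mycalendar";
    private static final String EVENTS_FILE = "events.dat";

    // Resolves <user.home>/.mycalendar/events.dat without touching the disk.
    public static Path eventsFile() {
        return Paths.get(System.getProperty("user.home"), APP_DIR, EVENTS_FILE);
    }
}
```

Before the first write, the subdirectory still has to be created, for example with Files.createDirectories(DataLocation.eventsFile().getParent()).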
I'm guessing it's not good to load all the objects into memory when
the program starts and not update them on the hard drive until the
program closes. So what do I do instead?
It might be a perfectly valid thing to do, unless the amount of data is so great that it would eat far too much memory. I don't think it would be a problem for a simple calendar application. If you don't want to do that, then store the events in a database and perform queries to only load the events that must be displayed.
A simple sequential file should suffice. Basically, each line in your file represents a record, or in your case an event. Separate the fields in each record with a field delimiter; something like the pipe (|) symbol works nicely. Remember to store every record in the same format, for example:
date|description|etc
This way you can read back each line in the file as a record, extract the fields by splitting the string on your delimiter (|) symbol, and use the data.
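The read-back step described above might be sketched like this. EventRecord is a hypothetical class with only two fields (date and description), whereas a real record would carry whatever the "etc" fields are. One detail worth noting is that "|" is a regex metacharacter, so it has to be quoted when passed to String.split.

```java
import java.time.LocalDate;
import java.util.regex.Pattern;

// Hypothetical two-field record stored as "date|description".
public class EventRecord {
    private static final String DELIMITER = "|";

    public final LocalDate date;
    public final String description;

    public EventRecord(LocalDate date, String description) {
        this.date = date;
        this.description = description;
    }

    // LocalDate.toString() yields the ISO 8601 form, e.g. 2024-03-04.
    public String toLine() {
        if (description.contains(DELIMITER) || description.contains("\n")) {
            throw new IllegalArgumentException("Description must not contain '|' or line breaks");
        }
        return date + DELIMITER + description;
    }

    public static EventRecord fromLine(String line) {
        // Quote the delimiter because split() takes a regex; -1 keeps trailing empty fields.
        String[] fields = line.split(Pattern.quote(DELIMITER), -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException("Expected 2 fields but got " + fields.length + ": " + line);
        }
        return new EventRecord(LocalDate.parse(fields[0]), fields[1]);
    }
}
```

This sketch simply rejects descriptions that contain the delimiter. A format that has to allow arbitrary text would need an escaping rule or a proper CSV library instead.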
Storing the data in the same folder as your application should be fine.
The best way I find to handle the objects (for the most part), is to determine whether or not the amount of data you are storing is going to be large enough to have consequences on the user's memory. Based on your description, it should be fine in this program.
The right answer depends on details, but probably you want to write your events to a database. There are several good free databases out there, like MySQL and Postgres, so you can (relatively) easily grab one and play with it.
Learning to use a database well is a big subject, bigger than I'm going to answer in a forum post. (I could recommend that you read my book, "A Sane Approach to Database Design", but making such a shameless plug on a forum would be tacky!)
Basically, though, you want to read the data from the database when you need it, and update it when it changes. Don't read everything at start up and write it all back at shut-down.
If the amount of data is small and rarely changes, keeping it all in memory and writing it to a flat file is simpler and faster. But most applications don't fit that description.

Easy way to store and retrieve objects in Java without using a relational DB? [closed]

Closed 9 years ago.
Do you know of an "easy" way to store and retrieve objects in Java without using a relational DB / ORM like Hibernate?
[Note that I am not considering serialization as-is for this purpose, as it won't allow to retrieve arbitrary objects in the middle of an object graph. Neither am I considering DB4O because of its restrictive license. Thanks.]
"Easy" meaning: not having to handle low-level details such as key/value pairs to rebuild an object graph (as with BerkeleyDB or traditional caches). The same applies for rebuilding objects from a document- or column-oriented DB (CouchDB, HBase, ..., even Lucene).
Perhaps there are interesting projects out there that provide a layer of integration between the mentioned storage systems and the object model (like ORM would be for RDBMSs) that I am not aware of.
Anyone successfully using those in production, or experimenting with persistence strategies other than relational DBs? How about RDF stores?
Update: I came across a very interesting article: A list of distributed key-value stores
Object Serialization (aka storing things to a file)
Hibernate (uses a relational database but it is fairly transparent to the developer)
I would suggest Hibernate because it will deal with most of the ugly details that bog developers down when using a database while still allowing for the optimizations that have been made to database software over the years.
NeoDatis looks interesting. It is licensed under the LGPL, so not quite as restrictive as the GPL proper.
Check out their 1 minute tutorial to see if it will work for your needs.
I would like to recommend XStream which simply takes your POJOs and creates XML out of them so you can store it on disk. It is very easy to use and is also open source.
I'd recommend Hibernate (or, more general, OR-mapping) like Matt, but there is also a RDBMS at the backend and I'm not so sure about what you mean by
...without using a relational DB?...
It also would be interesting to know more about the application, because OR-mapping is not always a good idea (development performance vs. runtime performance).
Edit: I recently learned about Terracotta, and there is a good stackoverflow discussion here about replacing DBs with that tool. Still experimental, but worth reading.
I still think you should consider paying for db4o.
If you want something else, add "with an MIT-style license" to the title.
Check out comments on Prevayler on this question. Prevayler is a transactional wrapper around object serialization - roughly, use objects in plain java and persist to disk through java API w/o sql, a bit neater than writing your own serialization.
Caveats: with serialization as a persistence mechanism, you run the risk of invalidating your saved data when you update the class. Even with a wrapper library you'll probably want to customize the serialization/deserialization handling. It also helps to include the serialVersionUID in the class so you override the JVM's idea of when the class is updated (and therefore can't reload your saved serialized data).
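To make the serialVersionUID point concrete, here is a small sketch using plain Java serialization. Note is an invented example class, and the save/load helpers are just one way to wrap ObjectOutputStream and ObjectInputStream. This is standard JDK serialization, not Prevayler's API.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class persisted with standard Java serialization.
public class Note implements Serializable {
    // Declared explicitly so that a compatible change (such as adding a method)
    // does not alter the computed default and make old files unreadable.
    private static final long serialVersionUID = 1L;

    private final String text;

    public Note(String text) {
        this.text = text;
    }

    public String getText() {
        return text;
    }

    // Writes the object to the given file, replacing any existing content.
    public static void save(Note note, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(note);
        }
    }

    // Reads back an object previously written by save().
    public static Note load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Note) in.readObject();
        }
    }
}
```

Without the explicit field, the JVM derives the ID from the class structure, so loading data saved by an older version of the class can fail with an InvalidClassException even when the change was harmless.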
Hmm... without serialization, and without an ORM solution, I would fall back to some sort of XML based implementation? You'd still have to design it carefully if you want to pull out only some of the objects from the object graph - perhaps a different file for each object, where object relationships are referenced by a URI to another file?
I would have said that wasn't "easy" because I've always found designing the mapping of XML to objects to be somewhat time consuming, but I was really inspired by a conversation on Apache Betwixt that has me feeling hopeful that I'm just out of date, and easier solutions are now available.
Terracotta provides a highly available, highly scalable, persistent-to-disk object store. You can use it for just this feature alone, or you can use its breadth of features to implement a fully clustered application. Your choice.
Terracotta:
does not break object identity giving you the most natural programming interface
does not require Serialization
clusters (and persists) nearly all Java classes (Maps, Locks, Queues, FutureTask, CyclicBarrier, and more)
persists objects to disk at memory speeds
moves only object deltas, giving very high performance
Here's a case study about how gnip uses Terracotta for in-memory persistence - no database. Gnip takes in all of the events on Facebook, Twitter, and the like and produces them for consumers in a normalized fashion. Their current solution is processing in excess of 50,000 messages / second.
It's OSS and has a high degree of integration with many other 3rd party frameworks including Spring and Hibernate.
I guess I have found a sort of answer to my question.
Getting into the document-oriented mindset is no easy task when you have always thought of your data in terms of relationships, normalization and joins.
CouchDB seems to fit the bill. It can still act as a key-value store, but its great querying capabilities (map/reduce, view collations), concurrency readiness and language-agnostic HTTP access make it my choice.
The only glitch is having to correctly define and map JSON structures to objects, but I'm confident I will come up with a simple solution for usage with relational models from Java and Scala (and worry about caching later on, as contention is moved away from the database). Terracotta could still be useful, but certainly not as much as in an RDBMS scenario.
Thank you all for your input.
