Basic Java application data storage


I'm working on (essentially) a calendar application written in Java, and I need a way to store calendar events. This is the first "real" application I've written, as opposed to simple projects (usually for classes) that either don't store information between program sessions or store it as text or .dat files in the same directory as the program, so I have a few very basic questions about data storage.
How should the event objects and other data be stored? (.dat files, database of some type, etc)
Where should they be stored?
I'm guessing it's not good to load all the objects into memory when the program starts and not update them on the hard drive until the program closes. So what do I do instead?
If there's some sort of tutorial (or multiple tutorials) that covers the answers to my questions, links to those would be perfectly acceptable answers.
(I know there are somewhat similar questions already asked, but none of them I could find address a complete beginner perspective.)
EDIT: Like I said in one of the comments, in general with this, I'm interested in using it as an opportunity to learn how to do things the "right" (reasonably scalable, reasonably standard) way, even if there are simpler solutions that would work in this basic case.

For a quick solution, if your data structures (and of course the way you access them) are sufficiently simple, reading and writing the data to files, using your own format (e.g. binary, XML, ...), or perhaps standard formats such as iCalendar might be more suited to your problem. Libraries such as iCal4J might help you with that.
Taking into account the more general aspects of your question, this is a broader topic, but you may want to read about databases (relational or not). Whether you want to use them or not will depend on the overall complexity of your application.
A number of relational databases can be used from Java through JDBC. This should allow you to connect to the relational (SQL) database of your choice. Some of them run within their own server application (e.g. MS SQL, Oracle, MySQL, PostgreSQL), but some of them can be embedded within your Java application, for example: JavaDB (a variant of Apache Derby DB), Apache Derby DB, HSQLDB, H2 or SQLite.
These embeddable SQL databases will essentially store the data on files on the same machine the application is running on (in a format specific to them), but allow you to use the data using SQL queries.
The benefits include a certain structure to your data (which you build when designing your tables and possible constraints) and (when supported by the engine) the ability to handle concurrent access via transactions. Even in a desktop application, this may be useful.
This may imply a learning curve if you have to learn SQL, but it should save you the trouble of handling the details of defining your own file format. Giving structure to your data via SQL (often known by other developers) can be better than defining your own data structures that you would have to save into and read from your own files anyway.
In addition, if you want to deal with objects directly, without knowing much about SQL, you may be interested in Object-Relational Mapping frameworks such as Hibernate. Their aim is to hide the SQL details from you by being able to store/load objects directly. Not everyone likes them and they also come with their own learning curve (which may entail learning some details of how SQL works too). Their pros and cons could be discussed at length (there are certainly questions about this on StackOverflow or even DBA.StackExchange).
There are also other forms of databases, for example XML databases or Semantic-Web/RDF databases, which may or may not suit your needs.

How should the event objects and other data be stored? (.dat files,
database of some type, etc)
It depends on the size of the data to be stored (and loaded), and if you want to be able to perform queries on your data or not.
Where should they be stored?
A file in the user directory (or in a subdirectory of the user directory) is a good choice. Use System.getProperty("user.home") to get it.
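As a minimal sketch of that suggestion: the directory name `.mycalendar` and the file name `events.dat` below are made up for illustration, and only `System.getProperty("user.home")` comes from the answer itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StorageLocation {
    // Hypothetical directory and file names, chosen only for this example.
    static final String APP_DIR = ".mycalendar";
    static final String DATA_FILE = "events.dat";

    /** Resolves the data file inside a per-user application directory. */
    static Path dataFile() {
        String home = System.getProperty("user.home");
        return Paths.get(home, APP_DIR, DATA_FILE);
    }

    /** Creates the application directory if it does not exist yet. */
    static Path ensureDataDir() throws IOException {
        return Files.createDirectories(dataFile().getParent());
    }

    public static void main(String[] args) throws IOException {
        ensureDataDir();
        System.out.println("Events would be stored in: " + dataFile());
    }
}
```

Keeping the path logic in one place makes it easy to change the location later without touching the code that reads and writes events.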
I'm guessing it's not good to load all the objects into memory when
the program starts and not update them on the hard drive until the
program closes. So what do I do instead?
It might be a perfectly valid thing to do, unless the amount of data is so great that it would eat far too much memory. I don't think it would be a problem for a simple calendar application. If you don't want to do that, then store the events in a database and perform queries to only load the events that must be displayed.
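If you do keep everything in memory, a sorted map keyed by date is one possible way to fetch only the events for the range being displayed. This is a rough sketch, not a recommendation of a particular design; the class and method names are invented for the example.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class InMemoryCalendar {
    // Events keyed by date; TreeMap keeps the keys sorted, so range lookups are cheap.
    private final NavigableMap<LocalDate, List<String>> eventsByDate = new TreeMap<>();

    void add(LocalDate date, String description) {
        eventsByDate.computeIfAbsent(date, d -> new ArrayList<>()).add(description);
    }

    /**
     * Returns the descriptions of all events between from and to, both inclusive,
     * in date order. Throws IllegalArgumentException if from is after to.
     */
    List<String> between(LocalDate from, LocalDate to) {
        List<String> result = new ArrayList<>();
        for (List<String> sameDay : eventsByDate.subMap(from, true, to, true).values()) {
            result.addAll(sameDay);
        }
        return result;
    }
}
```

For example, a month view would call `between` with the first and last day of the month and ignore everything else held in memory.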

A simple sequential file should suffice. Basically, each line in your file represents a record, or in your case an event. Separate the fields in each record with a field delimiter; something like the pipe (|) symbol works nicely. Remember to store each record in the same format, for example:
date|description|etc
This way you can read back each line in the file as a record, extract the fields by splitting the string on your delimiter (|) symbol, and use the data.
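A minimal sketch of that format in Java might look like the following. The `Event` class with only a date and a description is an assumption made for the example, and the code does not handle descriptions that contain line breaks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class EventFile {
    /** Minimal event with two fields; a real calendar event would have more. */
    static class Event {
        final LocalDate date;
        final String description;

        Event(LocalDate date, String description) {
            this.date = date;
            this.description = description;
        }
    }

    /** Formats one event as a single "date|description" record. */
    static String toLine(Event e) {
        return e.date + "|" + e.description;
    }

    /** Parses one "date|description" record back into an event. */
    static Event fromLine(String line) {
        // "|" is a regex metacharacter, so it must be escaped. The limit of 2
        // keeps any further pipes as part of the description.
        String[] fields = line.split("\\|", 2);
        if (fields.length != 2) {
            throw new IllegalArgumentException("Malformed record: " + line);
        }
        return new Event(LocalDate.parse(fields[0]), fields[1]);
    }

    /** Writes all events to the file, one record per line. */
    static void save(Path file, List<Event> events) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Event e : events) {
            lines.add(toLine(e));
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    /** Reads all records from the file, skipping empty lines. */
    static List<Event> load(Path file) throws IOException {
        List<Event> events = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (!line.isEmpty()) {
                events.add(fromLine(line));
            }
        }
        return events;
    }
}
```

Note that with more than two fields, a delimiter appearing inside a field would need escaping or a proper CSV library.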
Storing the data in the same folder as your application should be fine.
The best way I find to handle the objects (for the most part), is to determine whether or not the amount of data you are storing is going to be large enough to have consequences on the user's memory. Based on your description, it should be fine in this program.

The right answer depends on details, but probably you want to write your events to a database. There are several good free databases out there, like MySQL and Postgres, so you can (relatively) easily grab one and play with it.
Learning to use a database well is a big subject, bigger than I'm going to answer in a forum post. (I could recommend that you read my book, "A Sane Approach to Database Design", but making such a shameless plug on a forum would be tacky!)
Basically, though, you want to read the data from the database when you need it, and update it when it changes. Don't read everything at start up and write it all back at shut-down.
If the amount of data is small and rarely changes, keeping it all in memory and writing it to a flat file is simpler and faster. But most applications don't fit that description.

Related

How to store data for specific dates

I am writing an Android application in Java and have the following problems.
I want to store some data that I log on different days of the week. I want to show this data in a diagram, for example, and to see the data that was logged on a given date. My question is: what is the best way to solve this problem? Should I use an SQLite database, or can I save my data in a List? It should be fast and easy to handle when I use the data to show it in my statistics (e.g. a diagram) or to filter for specific dates.

You will want to use some method that will be persistent across executions of your program, and of course a database will provide persistent storage. If you use a list, you'll have to save it to storage somehow (perhaps via serialising to a file).
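For the list option, a rough sketch of serialising to a file with standard Java serialization could look like this. The `LogEntry` class and its fields are hypothetical, and the file location is left to the caller.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class LogStore {
    /** Hypothetical logged value for one date. */
    static class LogEntry implements Serializable {
        private static final long serialVersionUID = 1L;
        final LocalDate date;
        final double value;

        LogEntry(LocalDate date, double value) {
            this.date = date;
            this.value = value;
        }
    }

    /** Writes the whole list to the file, replacing any previous content. */
    static void save(Path file, List<LogEntry> entries) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            // Copy into an ArrayList so the written object is known to be serializable.
            out.writeObject(new ArrayList<>(entries));
        }
    }

    /** Reads back a list previously written by save. */
    @SuppressWarnings("unchecked")
    static List<LogEntry> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (List<LogEntry>) in.readObject();
        }
    }
}
```

This rewrites the entire file on every save and ties the file format to the class definition, which is part of why a database tends to be the better fit once the data needs filtering.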

To add to the answers above — and since you asked about it specifically — you should definitely consider sqlite over serializing your own file.
The 2013 PostgreSQL Conference Keynote presented some insightful statistics into the benefits of using sqlite over flat files. Sqlite is, according to its creator (who gave the keynote) "a replacement for fopen()" and uses a mature, familiar SQL API, so it would seem perfectly suited to your needs.

The question is too vague and lacking in detail to provide specific suggestion. But here are some rough ideas.
Little data, simple data
For small amounts of data in simple lists that can fit into memory, write values to text files. I would use the Apache Commons CSV library to assist with the chore of actually writing the files in Comma-separated or Tab-delimited formats.
Little data, slightly complicated data
For storing slightly more complicated objects in a collection that can fit into memory, use the Simple XML Serialization library.
Much data, and/or very complicated data
If you have large amounts of data that do not fit comfortably into memory, or you have many interrelated lists that should be stored as related tables, use a relational database. SQLite is indeed very lite, intended as an alternative to writing to files, not intended to compete against full-fledged databases. For more serious database work, I suggest the H2 Database Engine, built in pure Java.
Be sure to learn about:
java.time classes (especially LocalDate & DayOfWeek)
ISO 8601 formats
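To illustrate how those two pieces fit together, here is a small sketch that parses ISO 8601 date strings with `LocalDate` and groups them by `DayOfWeek`. The class and method names are made up for the example.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class DateGrouping {
    /** Groups ISO 8601 date strings (yyyy-MM-dd) by their day of the week. */
    static Map<DayOfWeek, List<LocalDate>> byDayOfWeek(List<String> isoDates) {
        Map<DayOfWeek, List<LocalDate>> result = new EnumMap<>(DayOfWeek.class);
        for (String s : isoDates) {
            // LocalDate.parse expects the ISO 8601 calendar date format by default.
            LocalDate date = LocalDate.parse(s);
            result.computeIfAbsent(date.getDayOfWeek(), k -> new ArrayList<>()).add(date);
        }
        return result;
    }
}
```

`LocalDate.toString()` produces the same ISO 8601 format, so dates written to a text file this way can be read back with `LocalDate.parse` unchanged.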

MySQL Commands into .DBF Tables

I am currently writing a database system in Java that writes and reads to a MySQL database hosted on XAMPP. The system is fully up and running using MySQL commands to select, update, add, delete etc.
The issue is that we are currently using an old database written in Visual FoxPro that has its tables stored as .DBF files. Rather than taking a few years to get a fully working system and then moving everybody over to the MySQL system at once, we would like to have both systems working concurrently with gradually more people beginning to move over and use the MySQL system.
This is where I am having issues. Is there a way to both update a MySQL table and .DBF file when a job is added through the Java program? Is it as simple as using a MySQL command to directly modify the file? I understand there could be possibilities in Python or PHP however I have never learnt either of these languages and would prefer an easier solution.

Trying to keep two databases in synch is a bad idea. If you miss something and the data no longer matches, how will you know which is right? (A man with one clock always knows what time it is; a man with two is never sure.)
If the existing system is complex enough, maybe consider moving one module at a time.
If the existing system is well-written, it might be possible to switch it to work with MySQL without too much effort. That said, my experience is that very few applications are well enough designed and written to make that a simple task.
Alternatively, you might set up your new system with some kind of wrapper that talks to the existing DBFs until you're done and then can easily switch to MySQL.
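The wrapper idea can be sketched roughly as an interface that the application codes against, with the backend swapped behind it. All names below are hypothetical, and a map stands in for the real DBF access, which would need a DBF library or driver not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical storage abstraction; the application talks only to this. */
interface JobStore {
    void addJob(String id, String description);

    Optional<String> findJob(String id);
}

/** Placeholder for a DBF-backed implementation; a map stands in for the file access. */
class LegacyDbfJobStore implements JobStore {
    private final Map<String, String> rows = new HashMap<>();

    @Override
    public void addJob(String id, String description) {
        rows.put(id, description);
    }

    @Override
    public Optional<String> findJob(String id) {
        return Optional.ofNullable(rows.get(id));
    }
}

/** Application code depends on the interface, not on a concrete backend. */
class JobService {
    private final JobStore store;

    JobService(JobStore store) {
        this.store = store;
    }

    void register(String id, String description) {
        store.addJob(id, description);
    }

    String describe(String id) {
        return store.findJob(id).orElse("(no such job)");
    }
}
```

When the migration is finished, a MySQL-backed `JobStore` could be passed to `JobService` instead, without changing the service code.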

Creating java applet or windows application that has internal database

I want to make a form or applet that will serve as GUI to a "database". The basic outline of it will be just a menu where people can put in a date and then all of the items related to that date will show up. I also want to have an insert option where people can add information for a date.
However, I was thinking if this could be possible with creating a database. I want to have this be pretty portable and give it to anyone so don't want to have to deal with connecting to database and installing DB server and all that.
Is it possible to keep all the data within the program and the filesize of the program just grows dynamically as more information is put in?

There are several "stand alone" or "single user" database engines around.
H2
HSQLDB
As I understand it, Apache Derby and even Java DB can be configured for "single user" operation, but you would need to verify this.
If you don't care about having a Java-based database, you could also look at SQLite.
Applets have very restrictive security constraints, generally meaning that they can't read or write files on a local disk. You can run an in-memory database, but once the applet/database is closed, the data is lost.

There is a very lightweight file DBMS called Derby, created by Apache. Take a look at it here: http://db.apache.org/derby/
It's free and simple to use. It is not a very powerful solution, but it does sound like something you need.

Tools to do data processing from Java

I've got a legacy system that uses SAS to ingest raw data from the database, cleanse and consolidate it, and then score the outputted documents.
I'm wanting to move to a Java or similar object oriented solution, so I can implement unit testing, and otherwise general better code control. (I'm not talking about overhauling the whole system, but injecting java where I can).
In terms of data size, we're talking about around 1 TB of data being both ingested and created. In terms of scaling, this might increase by a factor of around 10, but isn't likely to increase on massive scale like a worldwide web project might.
The question is - what tools would be most appropriate for this kind of project?
Where would I find this information - what search terms should be used?
Is doing processing on an SQL database (creating and dropping tables, adding columns, as needed) an appropriate, or awful, solution?
I've had a quick look at Hadoop - but due to the small scale of this project, would Hadoop be an unnecessary complication?
Are there any Java packages that do similar functionality as SAS or SQL in terms of merging, joining, sorting, grouping datasets, as well as modifying data?

It's hard for me to prescribe exactly what you need given your problem statement.
It sounds like a good database API might be all you need (i.e. native JDBC with a good open source database backend).
However, I think you should take some time to check out Lucene. It's a fantastic tool and may meet your scoring needs very well. Taking a search engine indexing approach to your problem may be fruitful.

I think the questions you need to ask yourself are:
What is the nature of your data set, and how often will it be updated?
What workload will you have on this 1 TB or more of data in the future? Will there be mainly offline read and analysis operations, or will there also be a lot of random write operations?
Here is an article on whether or not to choose Hadoop, which I think is worth reading.
Hadoop is a better choice if your data set is only updated daily or weekly and the major operations on the data are read-only, along with further data analysis. For the merging, joining, sorting and grouping operations you mentioned, Cascading is a Java library running on top of Hadoop that supports them well.

Should we drop stored procedures and run database calls from java programs [closed]

I am fighting to keep the use of stored procedures in our company. There are a few people who say they are bad and we should not use them. We are using DB2 on the i-series.
Please help in my argument to keep stored procedures alive in my company.

You're not going to like this, and I'm probably going to get downvoted into oblivion, but I'm with the rest of your company.
Stored Procedures used to offer many benefits (security, performance, etc.) but with parameterized queries and better query optimization, stored procedures really just add another layer of overhead to your application and give you another place you need to update/modify code.
I prefer to keep everything in a single spot so that when I need to edit code, I can go to one place and make my changes there.
If you want more details about the arguments for moving away from Stored Procedures, check out this CodingHorror article:
Coding Horror: Who Needs Stored Procedures Anyway?
...and I just noticed that the article is from 2004. I have to believe that databases have gotten better since then which means this would ring even more true today than it did then.

Doing everything over JDBC essentially means that you are inserting a network layer between you and the database. All in all it means that data is more "remote" and come to you slower. Stored procedures can work directly on the data inside the database, and the resulting difference in speed may astonish you.
Please note that you can write stored procedures in any IBM i language, including Java, in case it is a matter of programming skills. Also, you have access to the FULL machine, not just some database internals. Here the AS/400 is so vastly different from any other database product that experiences from other databases simply do not, in my opinion, apply.
I would recommend the Midrange mailing lists as they have the largest concentration of AS/400 programming skills I know of.

This is one of those Marmite issues. if you are primarily a database programmer you will think that stored procedures should be used extensively. If you are an application programmer - say a Java or a .Net coder - the chances are you will say that they should be avoided completely.
Not that this means application programmers want to write their own SQL statements. No, these days they tend to want to abstract everything behind convoluted ORM services. These are not easier to understand than stored procedures but are available within the same IDE, so they require less context switching.
There are two big things in favour of stored procedures. The first is that people who know PL/SQL are likely to be familiar with Oracle databases (T-SQL & SQL Server, etc), and so will tend to write better programs for that database (defined as programs which take advantage of the platform's features and are fitted to its functionality) than people who don't.
The second thing is that data persists. Application developers are fond of talking about "database independence" but what really matters is application independence. Front-ends come and go but the data model endures forever. In the last ten years Java applications have been written as Applets, Servlets, JSPs, Tiles and Faces, with add-ons in JavaScript, Groovy, AJAX and JSON, connecting to the database through hand-rolled JDBC, EJB (v1,2,3), TopLink, Hibernate and IBatis... forgive me if I've missed a few. Applications whose UI is a skin over a layer of stored procedures are easier to upgrade to the latest and greatest than applications where the business logic has to be re-written every time. And they will perform better too.
However, in the long run applications which interact directly with the database are probably going to die away. Everything is going to talk to the service bus, and that will decide from where to get the data. Of course, shops where the database is exposed through a well-designed API of stored procedures may find it easier to move to this brave new world than those places which are going to have to extract everything out of their ORM logic.

OK I'll come out in favor of stored procs.
First, if you use them exclusively, they make refactoring the database much simpler, as you can use the dependencies stored in the database to find out what would be affected by a change (well, in SQL Server anyway; I can't speak for other databases).
Second, if you need to change just the query, they are far simpler to deploy.
They are also easier to performance tune as they can easily be called without firing up the application.
If you have complex logic then you save some performance by not having to send all that over the network to the database server. May not seem like a big gain, but if the complex query is run thousands of times a day, it can add up.
Security is also extremely important. If you do not use stored procedures, you must set rights at the table or view level. This opens up the database to internal fraud. Yes, parameterized queries reduce the risk of SQL injection, but that is not the only threat you need to guard against. If you have personal or financial data and you do not use stored procs (and ones with NO dynamic SQL) and limit your users to only being able to do things through the procs, then your system is in extreme danger from internal employees who can steal data or bypass internal controls to steal money. Read about internal controls in the accounting standards to see why this is a problem.
ORMs also tend to write just downright bad SQL code especially if the query is complex. Further as people start to use them instead of stored procs, I have found that the people who have never used stored procs have a poorer understanding of how to get data out of the database and frequently get the wrong data. Using an ORM is fine if you already understand SQL and can determine when to rewrite the autogenerated code into something that works better. But too many users don't have the skill to write complex code because they never learned the basics.
Finally, since you already have stored procs for your application, getting rid of them altogether is a way to introduce new bugs, because you would have to write new code.

They're useful when you have a layered set of apps. For example, a single core DB with web services offering the atomic operations (which happen to be stored procedures) and an ESB or a set of applications consuming those WSs.
In a single-app/single-db case, the idea is to keep the code in one place as others suggested.
But well, that's just me.

I am a long-time Java developer who has recently come across several projects that made heavy use of stored procedures that have put the use of stored procedures in a really bad light for me.
Having said that, I am reluctant to make a blanket statement that stored procedures are bad as a system design option, because really it depends on the project in question and what the particular stored procedures are trying to accomplish.
My preference is to avoid any kind of stored procedure for simple CRUD operations (it may sound laughable to some to have stored procedures handle these operations, but I've encountered several systems that were doing this) -- This ends up resulting in a lot of code having to be written (and tested and maintained) on the Java side to manage these procedure calls from what I've observed. It's better to just use Hibernate (or some other ORM library) to handle these kinds of operations...if for no other reason than it tends to reduce the amount of code needing to be maintained. It also can cause problems when trying to refactor or make any significant changes to the system, as you're not just having to concern yourself with class/table changes, but stored procedures that handle CRUD ops as well. And this can be exacerbated further if you're in a situation where developers cannot make changes to the database themselves, or there is some formal process in place to coordinate changes between the two parts of the system.
On the other hand, having stored procedures that require limited interaction with the Java code (basically, you just fire off a call to one with a few arguments), and run in a semi-autonomous fashion is not a terrible thing either. I've encountered a few situations (particularly where we were migrating or importing data into a system) where using a stored procedure was a much better route than writing a bunch of Java code to handle the functionality.
I guess the real answer here would be that you should examine what each stored procedure in the system is currently doing and evaluate them on a case-by-case basis to determine whether it's easier to handle the operation in Java or in the database. Some may very well work better in Java (either by ORM library, or actual hand-written code), some may not. In either case, the goal should always be to make sure the system is understandable and easy to maintain for everyone, not just whether stored procedures are good or bad in and of themselves.
