Should I use a text file or Database? - java

So I'm putting together an RSS parser which will process an RSS feed, filter it, and then download the matched items. Assume that the files being downloaded are legal torrent files.
Now I need to keep a record of the files that I have already downloaded, so they aren't done again.
I've already got it working with SQLite (create database if not exists, insert row if a select statement returns nothing), but the resulting jar file is 2.5MB+ (due to the sqlite libs).
I'm thinking that if I use a text file, I could cut down the jar file to a few hundred kilobytes.
I could keep a list of the names of downloaded files, one per line, read the whole file into memory, and search it to see whether a file has already been downloaded.
A few questions occur to me now:
If, say, 10 files are downloaded a day, would the text file method end up taking too many resources?
Overall, which one is faster?
Anyway, what do you guys think? I could use some advice here, as I'm still new to programming and doing this as a hobby thing :)

If you only need to keep track of a few pieces of information (like the name of the file), you can certainly use a simple text file.
Reading it with a BufferedReader should give you good performance.
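To make the text-file approach concrete, here is a minimal sketch, not a definitive implementation. The class name DownloadLog and its methods are made up for illustration, and it assumes file names contain no line breaks. The file is read once with a BufferedReader into a HashSet, so each "already downloaded?" check is an in-memory lookup. At 10 names a day the file grows by roughly 3,650 lines a year, which is a small amount to hold in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: remembers downloaded file names, one per line.
public class DownloadLog {
    private final Path file;
    private final Set<String> names = new HashSet<>();

    // Loads every recorded name into memory; a missing file means "nothing downloaded yet".
    public DownloadLog(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!line.isEmpty()) {
                        names.add(line);
                    }
                }
            }
        }
    }

    public boolean contains(String name) {
        return names.contains(name);
    }

    // Records a name and appends it to the file.
    // Returns false (and writes nothing) if the name was already recorded.
    public boolean markDownloaded(String name) throws IOException {
        if (!names.add(name)) {
            return false;
        }
        Files.write(file,
                (name + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }
}
```

Appending one line per download avoids rewriting the whole file each time.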

Theoretically a DB (either relational or NoSQL) is better. But if the distribution size is critical for you, using the file system can be preferable.
The only problem here is the performance of data access (for both writes and reads). Consider the following approach: do not use one single file. Use a directory that contains several files instead. Each file name contains the key (or keys) that identifies the data, just like a key in a map. That way you can access the data relatively easily and quickly.
Also take a look at XStream. It has a Map implementation that works as described above: it stores entries on disk, each entry in a separate file.
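The directory-as-map idea can be sketched in plain Java without any library. This is only an illustration under assumptions: the class DirectoryStore is hypothetical, it is not XStream's implementation, and the hex encoding of keys is just one way to get file names that are safe on any file system (very long keys could still exceed file name limits).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: a tiny key/value store with one file per key.
public class DirectoryStore {
    private final Path dir;

    public DirectoryStore(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
    }

    // Assumed encoding: "k" followed by the key's UTF-8 bytes in hex,
    // so characters like '/' or ':' in a key cannot break the path.
    private Path pathFor(String key) {
        StringBuilder name = new StringBuilder("k");
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            name.append(String.format("%02x", b));
        }
        return dir.resolve(name.toString());
    }

    public boolean containsKey(String key) {
        return Files.exists(pathFor(key));
    }

    // Creates or overwrites the file that holds this key's value.
    public void put(String key, String value) throws IOException {
        Files.write(pathFor(key), value.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the stored value, or null if the key has no file.
    public String get(String key) throws IOException {
        Path p = pathFor(key);
        if (!Files.exists(p)) {
            return null;
        }
        return new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
    }
}
```

For the downloaded-files use case, containsKey alone would be enough: the existence of the file is the record.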

Related

Where and how to store text files for reading/writing

I have an app that accesses words from CSV text files. Since they usually do not change, I have placed them inside a .jar file and read them using a .getResourceAsStream call. I really like this approach since I do not have to place a bunch of files onto a user's computer - I just have one .jar file.
The problem is that I wanted to allow "admin" to add or delete the words within the application and then send the new version of the app to other users. This would happen very rarely (99.9% only read operations and 0.1% write). However, I found out that it is not possible to write to text files inside the .jar file. Is there any solution that would be appropriate for what I want and if so please explain it in detail as I'm still new to Java.
It is not possible because you can't change the contents of a jar that is currently in use by a JVM.
A better alternative is to keep the text file outside the jar, for example in the same folder as the jar file.
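One way to combine both ideas is sketched below, as an assumption-laden illustration rather than the only way to do it: read an editable copy from an external file if it exists, and otherwise fall back to the read-only copy bundled in the jar. The class WordStore and the resource name passed in (something like "/words.csv") are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: external file overrides the copy bundled in the jar.
public class WordStore {

    // Returns the lines of the external file if it exists; otherwise the lines
    // of the classpath resource; otherwise an empty list.
    public static List<String> load(Path external, String resourceName) throws IOException {
        if (Files.exists(external)) {
            return Files.readAllLines(external, StandardCharsets.UTF_8);
        }
        List<String> lines = new ArrayList<>();
        InputStream in = WordStore.class.getResourceAsStream(resourceName);
        if (in == null) {
            return lines; // no bundled copy either
        }
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // The "admin" edit path: always writes to the external file, never to the jar.
    public static void save(Path external, List<String> words) throws IOException {
        Files.write(external, words, StandardCharsets.UTF_8);
    }
}
```

The admin would then ship the edited external file alongside the jar, or rebuild the jar with the new file as the bundled default.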

Save all data in one file or in multiple files?

Is it better to save all users data in one file or create a file for each user with his data? Which one is faster?
EDIT: To explain how the file is used: it is managed by my UserManager class which, when asked to load a user, searches for the line [id] where the id matches and then reads all the following lines that belong to that id. When it saves user data, it reads the entire file, applies the changes, then writes the file back with the changes.
I would not advocate using files to store data. Use a database (NoSQL or Relational).
If you are forced to use a file (again, bad idea!), then the more performant of the two would be to read from a single file if you are reading more than one user's info at a time, as you would only have to open a single stream as opposed to multiple streams. The same goes for writing.
EDIT:
As pointed out by @BackSlash, if you only read/write one user at a time, then performance will NOT always be the same. See comment below.
Saving (user) data to files can be slow, whether you use multiple files or one. It doesn't have to be if you cache your data.
People generally turn to solutions that do this for you, e.g. a database ;)
They are fast and built for this. Also there are a lot of examples out there.
eg: https://github.com/saintedlama/passport-local-mongoose
Uses node.js (with express, passport, mongoose (mongoDB for node.js), ...)
There is an example in the project.
Simply install node.js, mongodb and run the example
If you have to use files, it is better to use one file. Otherwise you have to open and close a file for every user.
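If you stay with a single file, the caching suggested above could look roughly like this: parse the file once into a map keyed by user id, then serve lookups from memory. This is a sketch under assumptions. The class name UserCache is made up, and the format (a line "[id]" followed by that user's data lines) is inferred from the question's description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: parse the whole user file once and keep it in memory.
public class UserCache {

    // Assumed format: a header line "[id]" starts a user; the lines that follow
    // belong to that user until the next header. Lines before the first header are ignored.
    public static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> users = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.startsWith("[") && line.endsWith("]")) {
                current = new ArrayList<>();
                users.put(line.substring(1, line.length() - 1), current);
            } else if (current != null) {
                current.add(line);
            }
        }
        return users;
    }
}
```

The lines could come from Files.readAllLines at startup. Loading a user is then a map lookup, and the file only needs rewriting when something actually changes.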

Storing Data in Java

I'm currently trying to write a simple journal-like program in Java that allows me to add "entries" and browse all the "entries" I have added since the very beginning. The problem is, if I run the program, add two entries, exit the program, and then run the program again, I want to be able to have access to the two entries I previously added. I guess my question is then, how am I able to "save" (if that's the right word) the entries that I add so that they won't be wiped out every time the program terminates?
I did some looking around, and it appears there's a tool I can use called the Java Cache System, but I'm not entirely sure if that's what I need for my situation. I'd appreciate if somebody could point me in the right direction.
When you run the program and create the entries, you're storing them in primary storage, aka RAM. As you have discovered, these entries will not persist across different executions of your program.
You need to store the entries in secondary storage aka the hard drive. This can be done by writing the entries to a file saved on disk and then reading those entries upon startup of the program. Java provides several mechanisms to read and write files to the file system on a machine.
Some applications use a database to store information in a relational manner so that it is available via a SQL request, however I would recommend using a simple file to store your entries.
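As a rough sketch of the write-on-add, read-on-startup approach described above: the class name Journal is made up for illustration, and the code assumes each entry fits on a single line (multi-line entries would need some escaping or a different format).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: journal entries that survive program restarts.
public class Journal {
    private final Path file;
    private final List<String> entries = new ArrayList<>();

    // On startup, load whatever earlier runs saved (if anything).
    public Journal(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            entries.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
    }

    // Keeps the entry in memory and appends it to the file as one line.
    public void add(String entry) throws IOException {
        entries.add(entry);
        Files.write(file, Collections.singletonList(entry), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public List<String> entries() {
        return Collections.unmodifiableList(entries);
    }
}
```

Creating a new Journal on the same path in a later run gives back the entries added earlier.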
The simplest way would be to store this data in a file and then read it from the file when the application starts. Here are a few simple examples of how to write to and read from a file:
http://www2.cs.uic.edu/~sloan/CLASSES/java/MyFileReader.java
http://www2.cs.uic.edu/~sloan/CLASSES/java/MyFileReader.txt
http://www2.cs.uic.edu/~sloan/CLASSES/java/MyFileWriter.java
http://www2.cs.uic.edu/~sloan/CLASSES/java/MyFileWriter.txt
Right now you store your objects in memory. Instead, you can try to serialize them to some format like XML, and then load them from the XML on the next run. Or you can try using a database to store the objects.
I faced a similar problem in the past. My solution is to save whatever journal entries you enter to a particular location on your disk, such as "C:\Your_Directory\Journal_folder\".
When you first enter a journal entry, it is stored in that location. If you exit and reopen the application, just retrieve the data from the same location.
So every time you start the application, it retrieves the data from that location, and if there is none it displays an empty journal.

Sorting a large number of files into a hierarchical tree structure in java

I have a large number of files (a couple thousand XML files), and I need to write a GUI in Java which sorts these files into a tree structure based on "Category" elements within the XML data of each file. This program may be run multiple times a day, and small changes/additions may be made to these files daily as well.
How can I save this sorted structure in a way that will minimize load time during subsequent executions of the application? This program will - unfortunately - be working with files on a USB hard drive, so I am trying to avoid parsing each XML document every time the application is run in order to build this tree.
For example, each XML file may have multiple attributes (e.g. "Person" with a value of "Fred", and "Organization" with a value of "Google"), and I would like to allow the user to select groups of files based on these category values within the GUI.
Thank you in advance for any and all assistance =)
Ok, here's what you need to do.
1. Create a SQL database that will store BOTH the file names and the relevant XML tree structure data.
   MySQL is a good, free option.
2. When the application starts up, have it scan the directory for file names and compare them with the database's list of file names.
   Any names that are not indexed should be parsed and added to the database.
   Spawn a new thread to go through these unindexed files and process them, so the user doesn't see any lag.
3. Include a button in the application called "Recreate Cache".
   Leave a warning such as "Only press this button when a file has changed".
   Let the user tell your application when an old file has changed, since it almost never happens.
As an alternative to options 2/3, you could do this:
Create a daemon task.
   This would be a separate program that keeps the database maintained.
   Have it watch for changes to the XML directory and update the database appropriately.
   It could also periodically check for changes to the other files, once a day at 2 AM maybe.
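The startup scan that compares the directory against the index could be sketched as follows. Note the assumptions: the index is shown as a plain in-memory map rather than a database table, the class name IndexScanner is hypothetical, and the last-modified comparison is an addition that goes beyond the name-only comparison described above, so that edited files are picked up without a manual "Recreate Cache" button.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: decide which XML files need to be (re)parsed.
public class IndexScanner {

    // indexed maps a file name to the last-modified time (in milliseconds)
    // that was recorded when the file was last parsed.
    // Returns files that are missing from the index or whose timestamp differs.
    public static List<Path> findStale(Path dir, Map<String, Long> indexed) throws IOException {
        List<Path> stale = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : files) {
                long modified = Files.getLastModifiedTime(file).toMillis();
                Long recorded = indexed.get(file.getFileName().toString());
                if (recorded == null || recorded != modified) {
                    stale.add(file);
                }
            }
        }
        return stale;
    }
}
```

Only the returned files need parsing, which could happen on a background thread as suggested in step 2. Reading a timestamp is much cheaper than parsing the XML, which matters on a slow USB drive.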
Don't read and parse every file again each time it must be displayed. You can store the data from the XML files in some other format that allows fast and efficient reads. A format well suited to that is a relational database.
So here is what you need to do:
Install a SQL engine. I am no expert in licensing, but MySQL should do what you need and it is free. Create a table with columns that match the structure of your XML files.
Write a system service that watches for changes on the file system (you can use FileSystemWatcher from .NET). You can use Java instead of C#; java.nio.file.WatchService (Java 7 and later) provides a similar facility, while on older versions you would have to implement it with periodic polling.
Each time a change occurs, the service takes the file and sends it to the SQL database. There you can extract values from it with MySQL's ExtractValue(xml, xpath) function. Once you have the data, you commit it to the table as an insert (new files) or an update (edited files).
Each time you need to load the files into the tree, you run a simple SELECT statement on the database, returning the data in the structure you need.

Edit files single line in java

I'm trying to edit a configuration file in Java. What I really need to do is change a single line, so reading the whole file and writing it back would be a waste of time, since the configuration file can be big.
Is there a more efficient way to do this, other than reading in, editing, and writing out the file? I thought of converting the entire file to a string, replacing the line I want, and writing it back.
I don't know how efficient that would be. Can someone give me some other suggestions, or is the one I mentioned OK? Execution time is important.
I would recommend using the Preferences API instead. On the Windows platform your preferences are then stored in the registry. On other platforms the corresponding way to save application preferences is used. See also Preferences API Overview.
How big of a configuration file are we talking here? 1k lines? 10k? 1m lines? If the line you want to edit is the last line, just seek to the start of the line, truncate the file there and write the new one. If it's not... you will need to read it whole and write it again.
Oh, and the 2 options you mention are actually the same (read/edit/write).
On the third hand, I think it's irrelevant (unless you have weird constraints, like a flash storage device which takes too long to write, and has limited write cycles), given the sizes of most config files.
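For the special case mentioned above, where the line to change is the last one, the seek/truncate/write idea could be sketched like this. The class LastLineEditor is hypothetical, and the code assumes the file is UTF-8 (or another encoding in which the newline byte never appears inside other characters). It scans backwards one byte at a time, which is fine for a sketch but not tuned for speed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical sketch: replace only the last line without rewriting the rest of the file.
public class LastLineEditor {

    public static void replaceLastLine(Path file, String newLine) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            long pos = raf.length();
            // Ignore a single trailing newline so it is not mistaken for the line start.
            if (pos > 0) {
                raf.seek(pos - 1);
                if (raf.read() == '\n') {
                    pos--;
                }
            }
            // Walk backwards until just after the previous newline (or the file start).
            while (pos > 0) {
                raf.seek(pos - 1);
                if (raf.read() == '\n') {
                    break;
                }
                pos--;
            }
            // Cut the old last line off and write the new one in its place.
            raf.setLength(pos);
            raf.seek(pos);
            raf.write((newLine + "\n").getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

For a line in the middle of the file, the bytes after it would have to be shifted whenever the new line has a different length, which is why the general case comes back to reading and rewriting.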
