Storing a file seems like a basic thing, but I have stumbled on a problem. Here is what I have understood about storing files on a server while programming in Java.
You shouldn't store files in MySQL, because it is not recommended when the files are large. So I decided against this option.
It is also not recommended to store files in containers such as Tomcat or WildFly, maybe because the application has to be redeployed, or something like that?
You can definitely store files on an Apache file server? I am confused about this. Can we store files there and reference them from the database? Is this similar to how websites store their images or files?
I also came across NoSQL databases, but I didn't go into much depth, fearing it might turn out to be the wrong approach and that my time would have been better invested elsewhere.
That said, what is a good way to store a file on the server, reference it from a Java web application, and record it in a database?
We don't usually put our files in the database (NoSQL or RDBMS) because they are not file systems. If someone uploads a file, you store it in a file system and probably record the name and other metadata in the db for future use. You can technically put the contents of a file in a database and it has its own merits and drawbacks - https://softwareengineering.stackexchange.com/a/150787/156860
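To make the "file on disk, metadata in the db" idea concrete, here is a minimal sketch. The class name UploadStore, the StoredFile fields and the idea of a "files" table are all made up for illustration; the database insert itself is left out, and root would typically point at whatever storage directory your servers can reach.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper, not part of any framework: writes an upload to a
// directory and hands back the metadata you would record in the database.
final class UploadStore {

    // The values you would insert as one row of a (hypothetical) "files" table.
    static final class StoredFile {
        final String id;
        final String originalName;
        final long sizeBytes;
        final Path location;

        StoredFile(String id, String originalName, long sizeBytes, Path location) {
            this.id = id;
            this.originalName = originalName;
            this.sizeBytes = sizeBytes;
            this.location = location;
        }
    }

    private final Path root;

    UploadStore(Path root) {
        this.root = root;
    }

    StoredFile save(String originalName, InputStream content) throws IOException {
        Files.createDirectories(root);
        // Name the file on disk after a generated id, so the client-supplied
        // name is only ever stored as metadata and never used as a path.
        String id = UUID.randomUUID().toString();
        Path target = root.resolve(id);
        long size = Files.copy(content, target);
        return new StoredFile(id, originalName, size, target);
    }
}
```

The returned StoredFile is what goes into the database; to serve the file later you look up the id and read root/id from disk.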
Uploaded files are not recommended to reside in web/app servers because you might have more than one server to handle the load and if you put your file in just one server, the other server might face problems in accessing the content. And if one server goes down, the files in that server are not available for other servers to use. You'd probably need a shared drive/disk which all the servers can connect to and read the file.
Java EE handles this by providing a Resource Adapter, which abstracts how the application interacts with 'resources' in general (a file or some other resource). But without more detail on what kind of files these are, how big they are, and how they are being used, it'll be hard to get to 'the' solution for the question.
It really depends on what your use-case is. If you give more insight into what you're trying to solve for you should get better responses.
I think generally speaking that cloud solutions like Amazon's S3 are very common, good, and easy to work with these days. You also get the benefit of your servers not necessarily having to serve the files themselves which means more bandwidth for non-file based requests.
You shouldn't store files in MySQL, because it is not recommended when the files are large. So I decided against this option.
It really depends on what you're doing. If you have high traffic and large files, this isn't going to scale well. If you have low traffic and small files, there's no reason this isn't a viable solution.
It is also not recommended to store files in containers such as Tomcat or WildFly, maybe because the application has to be redeployed, or something like that?
Not quite sure what you mean by "in" containers. Do you mean in your war file? Or on the local file system? It's generally not recommended, but again, it really depends on traffic, file sizes, etc. The local file system can work, but you should use something like NFS so that if the server goes down the files don't go with it.
I've seen people use the local file system and git to store text files that are saved through the server. This gave them redundancy and versions.
You can definitely store files on an Apache file server? I am confused about this. Can we store files there and reference them from the database? Is this similar to how websites store their images or files?
I've never personally used Apache File Server, but what's confusing you about it? After a quick Google search, it looks like it's basically just an Apache server, which means you make HTTP requests to store and retrieve files. This would be similar to how S3 works, so I'd say it's probably a good general solution.
I also came across NoSQL databases, but I didn't go into much depth, fearing it might turn out to be the wrong approach and that my time would have been better invested elsewhere.
The actual NoSQL solution you choose will have its own requirements, so I can't speak to how good the solution will be, but again, it really depends on your traffic and file sizes. The NoSQL solutions I've used all have HTTP interfaces, so working with them won't be all that different from Apache or S3. The biggest difference is that in a NoSQL solution you have to query for the files as you would in a database, but those queries are usually sent over HTTP, so it's not really that different.
I'm trying to sort out a way to access an NFS share (ideally all privileges, but I'll settle for read only for now) from our application in Java. I've spent most of the day researching and the closest I came was the yanfs project (nee WebNFS) but it doesn't seem to have been updated since the aughties and it doesn't have any documentation either. I ran some low grade experiments with it but those were unsuccessful.
Because of the nature of our application, I can't pre-mount the volumes (there could be zero to many) and I would like to avoid calling sudo mount inside the program if at all possible. Unfortunately this approach is the only semi-viable solution I can come up with. Any suggestions would be welcome.
Also: No modern NFS java client libraries? Really? That can't possibly be right.
Have you checked out this library https://github.com/dCache/nfs4j ?
It has a pure Java server and client implementation of NFSv3, NFSv4 and NFSv4.1.
It is a bit low level and it does NOT provide simple usage like XFile in yanfs.
So you have to do a bit of work to read and write files, but at least it gets the job done: accessing NFS exports without mounting.
You can find some file access examples at the project repo.
Since time is of the essence, we're going to cheat a bit for now. So this is the solution I worked out in case anyone comes along later.
I looked into autofs like #dsh suggested. With Autofs I set up the /etc/auto.master file to have the following line:
/mnt/fromNFS /usr/local/etc/auto.fromNFS --timeout=60
I then touched the /usr/local/etc/auto.fromNFS and changed its ownership to the user and group that is to run the app.
Now I can programmatically modify the auto.fromNFS file to include lines for the given NFS share. When I then go to access that directory, it gets mounted with no need for sudo.
It's not ideal, but it looks like it will get the job done for now.
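For anyone wanting to do the same, modifying the map file from Java could look roughly like the sketch below. The class name AutofsMapWriter and the mount options -fstype=nfs,ro are my own assumptions, not what we actually run; the line layout follows the usual "key options location" form of an autofs indirect map, so check it against your own autofs setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

// Hypothetical helper: appends one entry per NFS share to an autofs map file
// that the application user is allowed to write to.
final class AutofsMapWriter {

    // Builds a map line such as "projects -fstype=nfs,ro fileserver:/export/projects".
    // The options are an assumption (read-only NFS); adjust them to your needs.
    static String entry(String key, String server, String exportPath) {
        return key + " -fstype=nfs,ro " + server + ":" + exportPath;
    }

    // Appends the entry as a new line, creating the map file if it is missing.
    static void addShare(Path mapFile, String key, String server, String exportPath)
            throws IOException {
        Files.write(mapFile,
                Collections.singletonList(entry(key, server, exportPath)),
                StandardCharsets.UTF_8,
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND);
    }
}
```

With the auto.master line above, an entry with the key projects would then be reachable as /mnt/fromNFS/projects. This sketch does not check whether a key is already present, so a real version should do that before appending.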
Thanks to everyone for their suggestions.
I'm working on (essentially) a calendar application written in Java, and I need a way to store calendar events. This is the first "real" application I've written, as opposed to simple projects (usually for classes) that either don't store information between program sessions or store it as text or .dat files in the same directory as the program, so I have a few very basic questions about data storage.
How should the event objects and other data be stored? (.dat files, database of some type, etc)
Where should they be stored?
I'm guessing it's not good to load all the objects into memory when the program starts and not update them on the hard drive until the program closes. So what do I do instead?
If there's some sort of tutorial (or multiple tutorials) that covers the answers to my questions, links to those would be perfectly acceptable answers.
(I know there are somewhat similar questions already asked, but none of them I could find address a complete beginner perspective.)
EDIT: Like I said in one of the comments, in general with this, I'm interested in using it as an opportunity to learn how to do things the "right" (reasonably scalable, reasonably standard) way, even if there are simpler solutions that would work in this basic case.
For a quick solution, if your data structures (and of course the way you access them) are sufficiently simple, reading and writing the data to files, using your own format (e.g. binary, XML, ...), or perhaps standard formats such as iCalendar might be more suited to your problem. Libraries such as iCal4J might help you with that.
Taking into account the more general aspects of your question, this is a broader topic, but you may want to read about databases (relational or not). Whether you want to use them or not will depend on the overall complexity of your application.
A number of relational databases can be used in Java through JDBC. This should allow you to connect to the relational (SQL) database of your choice. Some of them run within their own server application (e.g. MS SQL, Oracle, MySQL, PostgreSQL), but some of them can be embedded within your Java application, for example: JavaDB (a variant of Apache Derby DB), Apache Derby DB, HSQLDB, H2 or SQLite.
These embeddable SQL databases will essentially store the data on files on the same machine the application is running on (in a format specific to them), but allow you to use the data using SQL queries.
The benefits include a certain structure to your data (which you build when designing your tables and possible constraints) and (when supported by the engine) the ability to handle concurrent access via transactions. Even in a desktop application, this may be useful.
This may imply a learning curve if you have to learn SQL, but it should save you the trouble of handling the details of defining your own file format. Giving structure to your data via SQL (often known by other developers) can be better than defining your own data structures that you would have to save into and read from your own files anyway.
In addition, if you want to deal with objects directly, without knowing much about SQL, you may be interested in Object-Relational Mapping frameworks such as Hibernate. Their aim is to hide the SQL details from you by being able to store/load objects directly. Not everyone likes them and they also come with their own learning curve (which may entail learning some details of how SQL works too). Their pros and cons could be discussed at length (there are certainly questions about this on StackOverflow or even DBA.StackExchange).
There are also other forms of databases, for example XML databases or Semantic-Web/RDF databases, which may or may not suit your needs.
How should the event objects and other data be stored? (.dat files, database of some type, etc)
It depends on the size of the data to be stored (and loaded), and if you want to be able to perform queries on your data or not.
Where should they be stored?
A file in the user directory (or in a subdirectory of the user directory) is a good choice. Use System.getProperty("user.home") to get it.
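As a small sketch of that, assuming a made-up subdirectory name .mycalendar and file name events.dat (pick whatever fits your application):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that decides where the calendar data file lives.
final class AppDataLocation {

    // Resolves the data file under the current user's home directory.
    static Path eventsFile() {
        return eventsFile(System.getProperty("user.home"));
    }

    // Same, but with the home directory passed in, which is easier to test.
    static Path eventsFile(String userHome) {
        return Paths.get(userHome, ".mycalendar", "events.dat");
    }
}
```

Remember to create the subdirectory (for example with Files.createDirectories on the parent path) before writing the file for the first time.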
I'm guessing it's not good to load all the objects into memory when the program starts and not update them on the hard drive until the program closes. So what do I do instead?
It might be a perfectly valid thing to do, unless the amount of data is so great that it would eat far too much memory. I don't think it would be a problem for a simple calendar application. If you don't want to do that, then store the events in a database and perform queries to only load the events that must be displayed.
A simple sequential file should suffice. Basically, each line in your file represents a record, or in your case an event. Separate the fields in each record with a field delimiter; something like the pipe (|) symbol works nicely. Remember to store each record in the same format, for example:
date|description|etc
This way you can read back each line in the file as a record, extract the fields by splitting the string on your delimiter (|) symbol, and use the data.
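A minimal sketch of such a record in Java might look like this. The EventRecord class and its two fields are just an illustration (your events will have more fields), and it assumes the description never contains a pipe character.

```java
import java.time.LocalDate;
import java.util.regex.Pattern;

// Hypothetical event record stored as one "date|description" line.
final class EventRecord {
    final LocalDate date;
    final String description;

    EventRecord(LocalDate date, String description) {
        this.date = date;
        this.description = description;
    }

    // Formats the record as a single line, e.g. "2024-03-15|Dentist".
    String toLine() {
        return date + "|" + description;
    }

    // Parses a line written by toLine().
    static EventRecord parse(String line) {
        // String.split takes a regular expression and "|" means "or" there,
        // so the delimiter has to be quoted to be treated literally.
        String[] fields = line.split(Pattern.quote("|"), -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException("Expected 2 fields but got: " + line);
        }
        return new EventRecord(LocalDate.parse(fields[0]), fields[1]);
    }
}
```

If a description can contain the delimiter, you need an escaping rule or a different format, which is one reason people move to something like XML, iCalendar or a database as the data grows.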
Storing the data in the same folder as your application should be fine.
The best way I have found to decide how to handle the objects is (for the most part) to determine whether the amount of data you are storing is large enough to strain the user's memory. Based on your description, it should be fine in this program.
The right answer depends on details, but probably you want to write your events to a database. There are several good free databases out there, like MySQL and Postgres, so you can (relatively) easily grab one and play with it.
Learning to use a database well is a big subject, bigger than I'm going to answer in a forum post. (I could recommend that you read my book, "A Sane Approach to Database Design", but making such a shameless plug on a forum would be tacky!)
Basically, though, you want to read the data from the database when you need it, and update it when it changes. Don't read everything at start up and write it all back at shut-down.
If the amount of data is small and rarely changes, keeping it all in memory and writing it to a flat file is simpler and faster. But most applications don't fit that description.
For the sake of brevity, consider a Facebook-style image content serving app. Users can upload content as well as access content shared by other people. I am looking for the best ways of handling this kind of file serving application through Java servlets. There is surprisingly little information available on the topic. I'd appreciate it if someone could tell me about their personal experiences with a small setup (a few hundred users).
So far I am tempted to use the database as a file system (using MongoDB), but the approach seems cumbersome and tedious and would mean replicating part of the functionality already provided by OS native filesystems. I don't want to use commercial software, nor do I have the bandwidth to write my own like Facebook. All I want is to be able to do this through free software on a small server with a RAID or something similar. A solution that scales well to multiple servers would be a plus. The important thing is to serve it through Java servlets (I am willing to look into alternatives, but they have to be usable through Java).
I'd appreciate any help. Any references to first hand experiences would be helpful as well. Thanks.
Guru -
I set up something exactly like this for members of my extended family to share photos. It is a slightly complicated process that includes the following:
1) Sign up for Amazon Web Services, notably their S3 (Simple Storage Service). There is a free storage tier that should cover the amount of users you described.
2) Set up a web application that accepts uploads. I use Uploadify in combination with jQuery and ajax, to upload to a servlet that accepts, scans, logs, and does whatever else I want with the file(s). On the servlet side, I use ESAPI's upload validation mechanism, part of the validation engine, which is just built on top of Commons File Upload, which I have also used by itself.
3) After processing the file(s) appropriately, I use JetS3t as my Java-AmazonS3 API and upload the file to Amazon S3. At that point, users can download or view photos depending on their level of access. The easiest way I have found to do this is to use JetS3t in combination with the web application's authentication to create temporary URLs, which give the user access to the file for a specific amount of time, after which the URL becomes unusable.
One more thing: if you are not concerned with file processing and completely trust the people uploading their files, you can upload directly to Amazon S3. However, I find it much easier to just upload to my server and do all of my processing, checking, and logging before taking the final step and putting the file on Amazon S3.
If you have any questions on the specifics of any of this, just let me know.
While Owen's suggestion is an excellent one, there is another option you can consider: what you are describing is a Content Repository.
Since you have sufficient control of the server to be able to install a (non-commercial) piece of software, you may be interested in the Apache Jackrabbit* Content Repository. It even includes a Java API, so you should be able to control the software (at least as far as adding, and extracting content) from your Servlets.
Actually, if you combine this idea with Owen's and expand on it, you could host the repository on the Amazon S3 space and use the free-tier Amazon EC2 instance to host the software itself. (Although, as I understand it, the free-tier EC2 instance is only free for the first year.)
HTH
NB. I'm sure other content repositories exist, but Jackrabbit is the only one I've played with (albeit briefly).
What is the best way to implement a big (1GB or more) file uploader website in PHP or Java? Using the default way of uploading in PHP or Java results in running out of RAM space and slowing the website very dramatically.
It would be unwise to open the file on the client side, read its whole content to memory, close it and then start sending the contents, precisely because the contents can exceed the available memory.
One alternative is to open the file, read a chunk of it (remembering where the last chunk ended of course), close the file, upload to the server, and reassemble the file on the server side by appending to previous chunks. This is not a trivial procedure, and should take into consideration things like resource management, IO faults and synchronization, especially when working in parallel with multiple threads.
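To give a rough idea of the chunking, here is a sketch in Java. It only shows the read-a-chunk and append-a-chunk halves plus a local loop standing in for the network; the class and method names are made up, and retries, integrity checks and parallel uploads are deliberately left out.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical sketch of chunked transfer; the network part is not shown.
final class ChunkedTransfer {

    // Sending side: reads up to chunkSize bytes starting at offset.
    // Returns an empty array once the offset is at or past the end of the file.
    static byte[] readChunk(Path source, long offset, int chunkSize) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(source.toFile(), "r")) {
            file.seek(offset);
            byte[] buffer = new byte[chunkSize];
            int total = 0;
            while (total < chunkSize) {
                int n = file.read(buffer, total, chunkSize - total);
                if (n < 0) {
                    break; // end of file
                }
                total += n;
            }
            return Arrays.copyOf(buffer, total);
        }
    }

    // Receiving side: appends one chunk to the file being reassembled.
    static void appendChunk(Path target, byte[] chunk) throws IOException {
        Files.write(target, chunk, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Local stand-in for the upload loop: in a real uploader each chunk
    // would be sent to the server instead of appended directly.
    static void transfer(Path source, Path target, int chunkSize) throws IOException {
        long offset = 0;
        while (true) {
            byte[] chunk = readChunk(source, offset, chunkSize);
            if (chunk.length == 0) {
                break;
            }
            appendChunk(target, chunk);
            offset += chunk.length;
        }
    }
}
```

Only one chunk is ever held in memory, so memory use is bounded by chunkSize rather than by the size of the file.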
We've been using http://www.javaatwork.com/ftp-java-upload-applet/details.html for uploading very big files to dedicated hosting. It works a treat even with lots of RAW (photo) files.
The only drawback is that it's not multi-threaded and locks your browser until everything is uploaded.
We have yet to find another Java uploader as good looking as this (important to us), but there are a few multi-threaded ones out there that look pretty bad :-)
I would recommend JumpLoader [google it], as it offers a lot of useful features. I have integrated it into my open-source CMS project and it works just fine (of course, a little tuning here and there is needed). It has a JavaScript interface which you can access with plain JavaScript or jQuery [I used the latter and coded a little plugin for it]. The only drawback is the JumpLoader branding across the top of the applet :P, which you can have removed for 100 bucks.
Overall, features like multiple uploads, image and document editing before upload, partitioned uploads, transmission integrity checks via MD5 fingerprinting and so on are very attractive.
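In case the MD5 fingerprinting part is unclear: the idea is that the sender computes a hash of the data, sends it along, and the receiver recomputes it and compares. The sketch below shows that idea with the standard MessageDigest class; it is not JumpLoader's actual code or API, and the class name Fingerprint is made up.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper illustrating an MD5-based integrity check.
final class Fingerprint {

    // Returns the MD5 digest of the data as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Receiver side: true if the received bytes match the sender's fingerprint.
    static boolean matches(byte[] received, String expectedHex) {
        return md5Hex(received).equalsIgnoreCase(expectedHex);
    }
}
```

MD5 is fine for spotting accidental corruption in transit, but it should not be relied on against deliberate tampering.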
I'm working on a small (java) project where a website needs to maintain a (preferably comma-separated) list of registered e-mail addresses, nothing else, and be able to check if an address is in the list. I have no control over the hosting or the server's lack of database support.
Prevayler seemed a good solution, but the website is a ghost town, with example code missing from just about everywhere it's supposed to be, so I'm a little wary.
What other options are recommended for such a task?
Use an embedded database like HSQLDB, H2 or Derby/JavaDB. They need no installation and can use simple files as their storage mechanism.
Yeah, Prevayler and its long-time competitor, space4j, are really good candidates for such a simple case. They're far simpler than a DB, yet provide some useful concepts and are way fast (since the file system is in fact only a backup of the in-memory datastore).
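Just to illustrate the "in-memory store with the file system as backup" idea for the e-mail list, here is a hand-rolled sketch. It does not use Prevayler's or space4j's API; the class name EmailRegistry is made up, and treating addresses as case-insensitive is my own assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical registry: addresses live in memory, and a comma-separated
// file is rewritten whenever a new address is added.
final class EmailRegistry {
    private final Path file;
    private final Set<String> addresses = new LinkedHashSet<>();

    // Loads any previously saved addresses from the file, if it exists.
    EmailRegistry(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            for (String address : content.split(",")) {
                String normalized = normalize(address);
                if (!normalized.isEmpty()) {
                    addresses.add(normalized);
                }
            }
        }
    }

    // Lookups never touch the disk.
    synchronized boolean contains(String address) {
        return addresses.contains(normalize(address));
    }

    // Returns true if the address was new; the file is rewritten only then.
    synchronized boolean register(String address) throws IOException {
        boolean added = addresses.add(normalize(address));
        if (added) {
            Files.write(file, String.join(",", addresses).getBytes(StandardCharsets.UTF_8));
        }
        return added;
    }

    // Assumption: addresses are compared trimmed and case-insensitively.
    private static String normalize(String address) {
        return address.trim().toLowerCase(Locale.ROOT);
    }
}
```

Rewriting the whole file on every registration is fine for a small list; for anything bigger, an append-only log or one of the embedded databases mentioned above is the better fit.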
You may want to consider Berkeley DB.