I am writing a RESTful service which consumes application/octet-stream and accepts binary files to write them to disk (Tomcat 8, Windows Server 2012R2, JAX-RS). I then need to insert the file contents into an Oracle table.
The service itself runs fine, accepts files and writes them to disk.
My problem (or call it a best-practice question) is how to transfer the data to the Oracle DB. Of course I can open a connection in the service itself, which gets called every time the service accepts a file, but is this really the "correct" way? We're talking about MANY small files (let's say 100 per minute, each about 300 bytes in size).
Should I create a connection pool? Or even a standalone program which keeps the Oracle connection open permanently? Unfortunately, I can't really benchmark at the moment because I am on an isolated test server.
So, tl;dr: How to transfer the content of many small files accepted by a RESTful service to an Oracle DB?
As you are deploying on Tomcat, using the Tomcat managed connection pool is the most generic way. We use this and get very good performance out of it. You could roll your own and benchmark it, but I am not sure about the merits of this. I know I would try the way that's best integrated with Tomcat first and only if it does not perform move to libs like C3P0.
Depending on your use-case you could do without writing the files to disk and instead just insert them into the DB. Since your files are small, there would not even be a reason to go async or fork threads for insertion.
Connection pools are the most generic way to go here. There's lots of reasons - separating the concerns of connection management from connection use, scalability control through configuration, overcoming the latency associated with connection setup,...
There's lots of implementations of simple connection pools out there, they can be found in application servers or libraries - c3p0 is a nice and easy one for a standalone, self-contained webapp.
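As a rough illustration of the pooled approach described above, here is a minimal sketch that assumes a javax.sql.DataSource is handed in (for example one looked up from Tomcat's JNDI resources). The class name UploadRepository, the table uploaded_files and its columns file_name and content are hypothetical and not taken from the question; the content column is assumed to be a RAW or BLOB type that accepts setBytes for payloads this small.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper: writes one uploaded payload per call using a pooled connection.
final class UploadRepository {
    // Table and column names are assumptions made for this sketch.
    private static final String INSERT_SQL =
            "INSERT INTO uploaded_files (file_name, content) VALUES (?, ?)";

    private final DataSource dataSource;

    UploadRepository(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Borrows a connection, inserts the bytes, and hands the connection back.
    int insert(String fileName, byte[] content) throws SQLException {
        // With a pooled DataSource, close() returns the connection to the pool
        // instead of tearing down the physical connection.
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            ps.setString(1, fileName);
            ps.setBytes(2, content);
            return ps.executeUpdate();
        }
    }
}
```

The JAX-RS resource method could call insert() directly with the request body, which would also cover the option of skipping the intermediate file on disk.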
Related
My employer has currently given me a project that has me scratching my head about synchronization.
I'm going to first talk about the situation I'm in:
I've been asked to create a PDF report/quotation tool that takes its data from CSV files. The actual database holding the data is used by old IBM software, and for reasons unknown they don't want any direct access to it; instead of copying the data to another database, they apparently found it perfectly fine to create a folder on the server with loads and loads of CSV files. The software has to load the data into the application, query it, transform it where needed, do calculations, and then return a PDF file to the end user.
The problem here is that getting, querying, and calculating things takes a fair amount of time. The other problem is that they want it to be a WebApp: the business team does not want to install any new software, and they have mostly been moving towards doing everything online since the start of the pandemic. Being a WebApp means that every computation, and all the data retrieval, has to be done by the WebApp.
My question: Is each call to a servlet by a separate user treated as a separate servlet and should I only synchronize the methods on the business logic (getting and using the data); or should I write some code that puts itself in the middle of the servlet, receives a user-id (as reference), that then runs the business-logic in a synchronized-fashion, then receiving data and returning the pdf-file?
(I hope you get the gist of it...)
Everything will run on Apache Tomcat 8, if that helps. The build uses Java 11 (LTS).
Sorry, no code yet. But I've made some drawings.
With java web applications, the usual pattern is for the components to not have conversational state (meaning information specific to a specific user's request). If you need to keep state for a user on the server, you can use the http session. With a SPA or Ajax application it's often easier to keep a lot of that kind of state in the browser. The less state you keep on the server the easier things are as your application scales, you don't have to pin sessions to servers (messing up load balancing) or copy lots of session state across a cluster.
For simple (non-reactive) web apps that do blocking i/o, each request-response cycle gets its own dedicated thread from tomcat's pool. That thread delivers the http request to the servlet, handles the business logic and blocks while talking to the database, then carries the http response.
(Reactive webapps are going to be more complex to build: you will need a non-blocking database driver and you will have fewer choices for databases, so I would steer clear of those, at least for your first web application.)
The threadpool used by tomcat has to protect itself from concurrent access but that doesn't impact your code. Likewise there are 3rd party middletier caching libraries that have to deal with concurrency but you can avoid dealing with it directly. All of your logic is confined to one thread so it doesn't interfere with processing done by other threads unless there are shared mutable data structures. Those data structures would be the part of the application where synchronization might be one of several possible solutions.
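To make the point about shared mutable data concrete, here is a small sketch of one of the possible alternatives to synchronized methods: a cache of parsed CSV rows kept in a ConcurrentHashMap, so that many request threads can share it safely. The class name ReportDataCache and the loader function are hypothetical stand-ins for whatever actually reads the files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical shared cache: the only state that several request threads touch.
final class ReportDataCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> loader;
    private final AtomicInteger loads = new AtomicInteger();

    ReportDataCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    // computeIfAbsent runs the loader at most once per key, even when many
    // threads ask for the same file at the same time.
    List<String> rowsFor(String fileName) {
        return cache.computeIfAbsent(fileName, name -> {
            loads.incrementAndGet();
            // Hand out an unmodifiable copy so callers cannot mutate shared data.
            return Collections.unmodifiableList(new ArrayList<>(loader.apply(name)));
        });
    }

    // Number of times the (expensive) loader actually ran.
    int loadCount() {
        return loads.get();
    }
}
```

Everything computed from the returned rows inside a single request stays in local variables and so is confined to that request's thread; only the cache itself needs to be thread-safe.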
Synchronization or other locking schemes are local to one instance of the application. If you want to stand up multiple instances of this application then you need to be aware each one would be locking separately from the others. So for some things it's better to do locking in the database, since that is shared across webapp instances.
If you can make use of a database to store your data, so that you can rely on the database for caching and indexing, then it seems likely your application should be able to avoid having to do a lot of locking.
If you want examples there are a lot of small examples for building web apps using spring at https://spring.io/guides. These are spring boot applications that are self hosted so you can put them together quickly and run them right away.
Going rogue with a database may not be the best course since databases need looking after by DBAs. My advice is put together two project plans, one for using a database, and one for using the flat files. The flat file one will have to allow for addressing issues like handling caching, indexing data, replication of data from the legacy database, and not having standard tools that generate pdfs from sql queries. The alternative plan using a database should have a lot less sorting out of infrastructure and a shorter time til you can get down to cranking out reports.
I have the following problem: I have a Java application (Spring Boot) which uses Angular in the frontend. This application needs to store some data on the client side; however, this data is lost when the client changes their browser or opens an anonymous browser tab.
I need an alternative, other than linking data to the user in the database. Something that is implemented in Java itself.
Is there any way I can store the data in Java itself? I know it will be volatile, but we can assume that my application server will be up 100% of the time.
Edit:
My server runs on an OpenShift platform with multiple pods, and the server's load balancer is configured for NON-sticky sessions. That's why we can assume that my server will be 100% active.
This really depends on the design of your server. For example, why is it guaranteed to be up 100% of the time? Do you have multiple redundant instances? In that case you need to coordinate that "storage" between all instances; you may even want to deal with a quorum of instances keeping the state etc. Doesn't seem to be trivial. Or do you have just one single instance? But how do you guarantee 100% uptime?
I strongly recommend using some kind of data store or at least distributed cache.
I'm writing a Java EE program that manages client data. The original specification was for it to talk to a hosted database that the client would setup, but now (just before deadline) they're saying they can't get that together and want it to talk to local files instead. (The program will be used in contexts where internet access is infeasible. Also, the user will want to email or dropbox the files around and expresses some hostility to the idea of my starting a mysql server on their local machine.)
The persistence layer is implemented with javax.persistence, configured to talk to a mysql database. What I'd love is to keep the program as similar as possible and switch to configuring my EntityManagerFactory to talk to a flat file if it can't connect to mysql. Is there a jdbc driver that will talk to a flat file?
(Searching on this subject turns up a number of options, but none of them look obviously supported/tested/likely to be robust. If there's a standard solution, it's the one I'm looking for.)
I have never connected to a database in Java before. May I know how I should go about accessing a Derby database from a servlet?
I have checked this: How do I access a database from my servlet or JSP?
But I saw comments on the article saying that this is a bad way to connect. Could anyone explain or show me the best way to code access to my Derby database?
Thank you very much.
They are all right indeed in suggesting that. We don't access the database directly from Servlets or JSPs; both of these are meant to be the web tier, aren't they?
So, what to do? Grab a JDBC tutorial. The official one is an excellent choice here. That will give you a good idea about connecting to database from Java, and grasp over JDBC API. After that you should go and read about DAO pattern, and how we employ that in real apps.
Moreover, I think you also should read about MVC pattern, because it seems to me that you are not very clear on that as well.
Once you understand all of this and have come up with a toy application using it, the next step would be to look into connection pooling.
Since you are using a servlet, you must be using a container like Apache Tomcat. You should look to define a connection pool like this: http://tomcat.apache.org/tomcat-5.5-doc/jndi-datasource-examples-howto.html. If you are using any other container, it will have a similar setup.
Other option is to create a separate DBManager kind of class which looks after initializing and returning connection. This class you can use in the servlet.
Using JDBC and your app server's connection pool is a good start. You can also use an API like Hibernate to make your life easier.
It is a "bad way", because it doesn't make use of a (JNDI-managed) connection pool to obtain connections. Although acquiring a connection costs "only" a few hundred milliseconds, this has impact in a busy multiuser environment. A connection pool will worry about opening and closing connections and release them immediately on every getConnection() call so that it effectively costs almost zero milliseconds. If you sum that up in a busy multiuser environment, then the differences are noticeable.
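To illustrate why borrowing from a pool is so much cheaper than opening a connection each time, here is a deliberately simplified toy pool. It is not how Tomcat's pool or c3p0 are implemented (real pools also validate connections, apply timeouts, and wrap them so that close() returns them to the pool); the class ToyPool and its method names are made up for this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Toy pool: creates its resources once up front and then only lends them out.
final class ToyPool<T> {
    private final BlockingQueue<T> idle;

    ToyPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            // The expensive "open" step happens here, once per pooled resource.
            idle.add(factory.get());
        }
    }

    // Waits until a resource is free, then lends it to the caller.
    T borrow() throws InterruptedException {
        return idle.take();
    }

    // Returns a borrowed resource; throws IllegalStateException if the pool is already full.
    void giveBack(T resource) {
        idle.add(resource);
    }
}
```

With a pool of size 2, a hundred borrow/giveBack cycles still call the factory only twice, which is the effect the paragraph above describes.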
A connection pool is usually configured as a JNDI datasource which is managed by the servlet container in question. As you didn't mention which one you're using, I can at most point to one of my answers which contains a Tomcat 6.0 targeted example: here.
Hope this helps.
I got a java web project handling several objects (again containing n objects of type A (e.g. time and value) and m objects of type B (e.g. time and String array)). The web projects itself contains several servlets/jsps for visualization as well as some logic for data manipulation and currently runs on an Apache Tomcat.
Is it possible to store the whole data in the servers (or most of the time: local) memory while the server is running? If the Tomcat is shut down, the data could be stored in a simple file, no restrictions there. On server startup, I just want to read in the files and write the objects to memory. How can I initiate the Tomcat to do so?
The reason why I do not want to use an extra database is, that I want to deliver a zip file containing the tomcat including the deployed *.war file (as I don't want my prof getting stuck with tomcat server setup etc.)
Thanks, ChrisH
You could implement ServletContextListener and write the load-from-file and save-to-file logic in the contextInitialized() and contextDestroyed() methods which are invoked during webapp's startup and shutdown respectively.
You can read and write objects to disk, but they all need to implement java.io.Serializable first. Here is a Serialization tutorial with code examples.
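A minimal sketch of such a save/load round trip, assuming a hypothetical Measurement class for the "type A" objects (time and value) and a hypothetical SnapshotStore helper; in the webapp these two methods would be called from contextDestroyed() and contextInitialized().

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical "type A" object: a time stamp and a value.
final class Measurement implements Serializable {
    private static final long serialVersionUID = 1L;
    final long time;
    final double value;

    Measurement(long time, double value) {
        this.time = time;
        this.value = value;
    }
}

// Hypothetical helper that writes the whole list to one file and reads it back.
final class SnapshotStore {
    static void save(Path file, List<Measurement> data) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            // Copy into an ArrayList, which is itself Serializable.
            out.writeObject(new ArrayList<>(data));
        }
    }

    @SuppressWarnings("unchecked")
    static List<Measurement> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (List<Measurement>) in.readObject();
        }
    }
}
```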
That said, have you considered an embedded database so that you don't need to install a database server? You could use the JDK6's built-in JavaDB for this or its competitor HSQLDB. Alternatively, if the data consists of pure key-value pairs, then you could also just use the java.util.Properties API for this (tutorial here). Just place the properties file somewhere in the classpath and use ClassLoader#getResourceAsStream() to get an InputStream of it, or place it somewhere in WEB-INF and use ServletContext#getResourceAsStream().
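For the key-value case, loading a properties file from an InputStream can be sketched as follows; the class name Settings is hypothetical, and where the stream comes from (classpath or WEB-INF) is left to the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper: reads key-value pairs from any InputStream.
final class Settings {
    static Properties load(InputStream in) throws IOException {
        Properties props = new Properties();
        // try-with-resources closes the stream once the pairs are read.
        try (InputStream stream = in) {
            props.load(stream);
        }
        return props;
    }
}
```

In a servlet this might be called as Settings.load(getServletContext().getResourceAsStream("/WEB-INF/app.properties")), where the file name is only an example.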
I think that HSQLDB is exactly what you need: a small database engine that can be embedded in a web application running on Apache Tomcat. It stores data in memory and can also write its contents to, and read them from, a file.
If the app shuts down unexpectedly, you'll lose all your data, because it won't have time to write it to disk.
You could use a database like SQLite/derby/hsql etc. which store their data to the filesystem.
If you don't want to mess with a DB, then you could store everything in memory and flush it to disk every time it's modified. A couple tips here:
Serialization can make this really easy. Make all your objects implement Serializable, and give them a serial version id
Use a BufferedOutputStream when writing to disk; this is faster than a straight FileOutputStream.
DO NOT overwrite your old data file directly! Write to a new file, and when done writing, move the completed file on top of your old file. That way, if the server shuts down while you're in the middle of writing your data file, you still have the good file which was written before.
You should acquire a read lock on your data while writing it to disk. Any other code which modifies the data should get a write lock on the data.
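The last two tips (write to a new file and then move it over the old one, and take a read lock while flushing) can be sketched roughly like this. SafeStore and the .tmp suffix are hypothetical, and plain text lines are used instead of serialized objects to keep the example short.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical in-memory store that flushes itself to disk without
// ever overwriting the last good data file in place.
final class SafeStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> entries = new ArrayList<>();
    private final Path dataFile;

    SafeStore(Path dataFile) {
        this.dataFile = dataFile;
    }

    // Modifications take the write lock, so they never overlap with a flush.
    void add(String entry) {
        lock.writeLock().lock();
        try {
            entries.add(entry);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Flushing only reads the data, so the read lock is enough.
    void flush() throws IOException {
        Path tmp = dataFile.resolveSibling(dataFile.getFileName() + ".tmp");
        lock.readLock().lock();
        try {
            // Write the complete new state to a temporary sibling file first.
            Files.write(tmp, entries, StandardCharsets.UTF_8);
        } finally {
            lock.readLock().unlock();
        }
        // Only a fully written file replaces the previous good copy.
        Files.move(tmp, dataFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Where the file system supports it, StandardCopyOption.ATOMIC_MOVE could be used for the final step instead; whether an existing target is replaced in that mode is implementation specific, so that is left out of the sketch.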
If you don't care about the possibility that your application may scribble all over your data files, your Tomcat / JVM may crash, or your machine may die, losing all in-memory objects, then managing persistence as you suggest is an option. But you'll have quite a bit of infrastructure to build, test and maintain. And you'll miss out on the "value add" tools that most RDBMSs provide: backup, a query tool, optimizers, replication, etc.
But if catastrophic data loss is not an option for you, you should use an RDBMS, ODBMS, whatever to do your persistence.