Given a JDBC-based application that was not designed for real-time propagation of changes, how can you propagate changes from one instance of the app running on computer A to another instance running on computer B, in a two-way synchronization scheme? How can you do this elegantly, without using SymmetricDS?
We thought of using XMPP and XStream: transforming POJOs to XML or JSON and sending them over XMPP (via the Smack API) to a pre-configured "chat room", where other bots listening there would replay the data they receive. That way, even client apps that were offline would receive the "DiscussionHistory" by sending their last "since" timestamp.
I have looked everywhere for a "near real-time database change propagation" solution in Java, or even in H2, in which changes are propagated between all registered nodes. The only solution I could think of is to use the XMPP protocol: build a "bot" chat room around it and have nodes send their data there while the others listen for changes.
The so-called "bots" are application instances on different computers, of an accounting application that should allow for real-time collaboration on the same database, but allow for offline modifications (so no centralized server to store changes).
One common approach is to build your caching so that the application always queries the database if a particular entry is not found in the cache. Then you only have to synchronize cache evictions to force all nodes in a group to reload a certain entry. This is fairly easily achieved using, for instance, Spring method caching and Ehcache.
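The eviction idea above can be sketched in plain Java, without Spring or Ehcache. Everything here is an illustrative assumption: `EvictionBus` is a hypothetical in-process stand-in for whatever transport carries eviction notices between nodes, `NodeCache` is a hypothetical read-through cache, and the "database" is just a map that all nodes can reach.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

/** Hypothetical stand-in for the transport that carries eviction notices between nodes. */
class EvictionBus {
    private final List<NodeCache> nodes = new CopyOnWriteArrayList<>();

    void register(NodeCache node) {
        nodes.add(node);
    }

    /** Tells every registered node to drop its cached copy of the key. */
    void broadcastEviction(String key) {
        for (NodeCache node : nodes) {
            node.evictLocal(key);
        }
    }
}

/** Read-through cache: a miss always falls back to the database loader. */
class NodeCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final Function<String, String> databaseLoader;
    private final EvictionBus bus;

    NodeCache(Function<String, String> databaseLoader, EvictionBus bus) {
        this.databaseLoader = databaseLoader;
        this.bus = bus;
        bus.register(this);
    }

    String get(String key) {
        return entries.computeIfAbsent(key, databaseLoader);
    }

    /** Call this after the node has written the changed row to the database. */
    void notifyChanged(String key) {
        bus.broadcastEviction(key);
    }

    void evictLocal(String key) {
        entries.remove(key);
    }
}

class EvictionDemo {
    public static void main(String[] args) {
        Map<String, String> database = new ConcurrentHashMap<>(); // assumption: shared store
        database.put("invoice-1", "draft");
        EvictionBus bus = new EvictionBus();
        NodeCache nodeA = new NodeCache(database::get, bus);
        NodeCache nodeB = new NodeCache(database::get, bus);

        System.out.println(nodeB.get("invoice-1")); // node B caches the current row
        database.put("invoice-1", "posted");        // node A updates the database...
        nodeA.notifyChanged("invoice-1");           // ...and announces the change
        System.out.println(nodeB.get("invoice-1")); // node B reloads from the database
    }
}
```

Only the key travels between nodes, never the data itself, so the database stays the single source of truth. Note that this sketch assumes all nodes can reach the same database, which may not hold for the offline case described in the question.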
My employer has currently given me a project that has me scratching my head about synchronization.
I'm going to first talk about the situation I'm in:
I've been asked to create a PDF report/quotation tool that takes its data from CSV files. The actual database that holds the data is used by old IBM software, and for reasons unknown they don't want any direct access to it; instead of copying the data to another database, they decided it was fine to create a folder on the server with loads and loads of CSV files. The tool has to load this data into the application, query it, transform it where needed, do calculations, and then return a PDF file to the end user.
The problem here is that getting, querying, and calculating things takes a fair amount of time. The other problem is that they want it to be a web app, because the business team does not want to install any new software; they have mostly been moving towards doing everything online since the start of the pandemic. Being a web app means that every computation, and likewise all data retrieval, has to be done by the web app.
My question: is each call to a servlet by a separate user treated as a separate servlet, so that I should only synchronize the methods in the business logic (getting and using the data)? Or should I write some code that sits in front of the servlet, receives a user ID (as a reference), runs the business logic in a synchronized fashion, and then receives the data and returns the PDF file?
(I hope you get the gist of it...)
Everything will run on Apache Tomcat 8, if that helps. The build targets Java 11 (LTS).
Sorry, no code yet. But I've made some drawings.
With Java web applications, the usual pattern is for the components not to have conversational state (meaning information specific to a particular user's request). If you need to keep state for a user on the server, you can use the HTTP session. With an SPA or Ajax application it's often easier to keep a lot of that kind of state in the browser. The less state you keep on the server, the easier things are as your application scales: you don't have to pin sessions to servers (messing up load balancing) or copy lots of session state across a cluster.
For simple (non-reactive) web apps that do blocking I/O, each request-response cycle gets its own dedicated thread from Tomcat's pool. That thread delivers the HTTP request to the servlet, handles the business logic and blocks while talking to the database, then carries the HTTP response.
(Reactive web apps are going to be more complex to build: you will need a non-blocking database driver and you will have fewer choices for databases, so I would steer clear of those, at least for your first web application.)
The thread pool used by Tomcat has to protect itself from concurrent access, but that doesn't impact your code. Likewise there are third-party middle-tier caching libraries that have to deal with concurrency, but you can avoid dealing with it directly. All of your logic for one request is confined to one thread, so it doesn't interfere with processing done by other threads unless there are shared mutable data structures. Those data structures would be the part of the application where synchronization might be one of several possible solutions.
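As a rough illustration of the distinction above, here is a minimal sketch in plain Java. It is not servlet code: `ReportService` and its methods are hypothetical names, and a fixed thread pool stands in for Tomcat's request threads. Local variables need no protection because each request thread has its own; only the shared map does, and a `ConcurrentHashMap` is one of several ways to handle it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ReportService {
    // Shared across all request threads, so it must be a thread-safe structure.
    private final Map<String, Integer> reportsPerUser = new ConcurrentHashMap<>();

    /** Simulates one request; everything except reportsPerUser is local to the calling thread. */
    String handleRequest(String userId, int quantity, int unitPriceCents) {
        int totalCents = quantity * unitPriceCents;     // thread-confined local state
        reportsPerUser.merge(userId, 1, Integer::sum);  // atomic update of shared state
        return userId + ":" + totalCents;
    }

    int reportCount(String userId) {
        return reportsPerUser.getOrDefault(userId, 0);
    }
}

class ReportServiceDemo {
    public static void main(String[] args) throws InterruptedException {
        ReportService service = new ReportService();
        ExecutorService pool = Executors.newFixedThreadPool(8); // stand-in for Tomcat's pool
        for (int i = 0; i < 1000; i++) {
            pool.submit(() -> service.handleRequest("alice", 2, 150));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(service.reportCount("alice"));
    }
}
```

Had `reportsPerUser` been a plain `HashMap` updated with a get-then-put, concurrent requests could lose updates; that is the kind of spot where `synchronized` or a concurrent collection is actually needed.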
Synchronization or other locking schemes are local to one instance of the application. If you want to stand up multiple instances of this application then you need to be aware each one would be locking separately from the others. So for some things it's better to do locking in the database, since that is shared across webapp instances.
If you can make use of a database to store your data, so that you can rely on the database for caching and indexing, then it seems likely your application should be able to avoid doing a lot of locking.
If you want examples, there are a lot of small examples for building web apps using Spring at https://spring.io/guides. These are Spring Boot applications that are self-hosted, so you can put them together quickly and run them right away.
Going rogue with a database may not be the best course, since databases need looking after by DBAs. My advice is to put together two project plans, one for using a database and one for using the flat files. The flat-file plan will have to allow for addressing issues like handling caching, indexing data, replicating data from the legacy database, and not having standard tools that generate PDFs from SQL queries. The alternative plan using a database should involve a lot less sorting out of infrastructure and a shorter time until you can get down to cranking out reports.
I have the following problem: I have a Java application (Spring Boot) that uses Angular in the frontend. This application needs to store some data on the client side; however, this data is lost when the client changes their browser or opens an anonymous browser tab.
I need an alternative, other than linking data to the user in the database. Something that is implemented in Java itself.
Is there any way I can store data in Java, even though I know it will be volatile? We can assume that my application server will be up 100% of the time.
Edit: My server runs on an OpenShift platform with multiple pods, and the server's load balancer is configured for non-sticky sessions. That's why we can assume that my server will be 100% available.
This really depends on the design of your server. For example, why is it guaranteed to be up 100% of the time? Do you have multiple redundant instances? In that case you need to coordinate that "storage" between all instances; you may even want to deal with a quorum of instances keeping the state etc. Doesn't seem to be trivial. Or do you have just one single instance? But how do you guarantee 100% uptime?
I strongly recommend using some kind of data store or at least distributed cache.
We're designing an architecture for communication between several applications, and we have decided to use Mirth as a (pseudo) ESB. In our processes we want to give control back to users as soon as we can, so when an action is fired by a user (for example, pressing the Save button after filling in a form), some necessary changes are made in the database and then a message has to be sent to another system. The user doesn't have to wait until the message is sent, so our application gives back control once the database changes are done. Message composition is done asynchronously in the background. But we don't really know which approach we should follow:
a) Start a new thread in our app where we collect all the necessary data (starting from "primary data", that is, some primary keys that allow us to find all the information) to fill an HL7 message, and send it to a queue where Mirth is listening.
b) Send the "primary data" to Mirth and delegate HL7 message composition to it. Mirth can access the database directly to collect the necessary data, or another option could be to invoke some REST/SOAP services of our own.
In case of option B, we have some doubts about how to invoke Mirth:
b.1) Our app makes database modifications and writes primary data on a queue (distributed transaction).
b.2) Our app makes database modifications and calls a SOAP or REST service published by Mirth, which does nothing more than write the message to a queue that Mirth is also reading from (no distributed transaction in our app).
Some argue that composing the message in our app and using Mirth only as a broker is "misusing" Mirth. On the other hand, some colleagues find that accessing the app database from Mirth is very intrusive and that Mirth should not know our schema. The last option, invoking an app service from Mirth that returns all the information needed for the HL7 message, is like sending "primary data" from the app to Mirth only to get it back when Mirth calls the service (passing that data as a parameter).
Thank you for your advice.
I'm not sure if Mirth is the appropriate tool to use as an Enterprise Service Bus where your requirements include real time notifications/events to allow the user to proceed after submitting a form.
Without knowing more, such as the architecture in play, we can't really advise you.
IMO, as someone experienced with Mirth integration as well as with designing database-dependent applications, I would say that Mirth isn't the appropriate tool for the job.
(1) There is not enough information for an "expert advice" and no single clear technically-justified answer
(2) Option (a) looks like least expensive and easiest to implement for the 1st version, especially with reuse of stable tested libraries like HAPI
(3) In your design treat your Enterprise service bus as a black box component and concentrate on designing the interfaces and clarifying the asynchronous message sequences. This way the service bus internals, the message routing and queuing decisions can be postponed to the deployment time with some coding effort and by following the adapter design pattern
(4) Arguments worded like "misusing", "intrusive", "like it", "nice" perhaps indicate a valid point of view, but as such they do not create measurable, verifiable decision criteria or performance indicators and should not be used alone
(5) This is the right time to apply a decision-making process and weight-evaluate the various options. As a minimal formal input I'd recommend the Plus/Minus/Interesting (PMI) technique
(6) In your decision the following points should not be omitted:
securing data privacy (health state is a private property protected by law in some countries)
fault tolerance (robustness, reliability, exception handling)
maintenance costs (do you have qualified people to maintain it, can the solution monitor and auto-correct itself, or will someone have to review millions of lines of logs manually)
development costs (do you have qualified people already, how many lines of code can you reuse vs. how many will you have to create/debug)
(7) I'm sorry that my answer is not directly helpful; my choice would be to compose the message in a reliable, secured application server, whatever that means in this case and regardless of how its axons or pseudopods would be connected
Last but not least: record why you made the choice, permanently, so that you can test and validate your assumptions at any later time, when the original decision makers have been lost in the sands of time
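Point (3) above, treating the service bus as a black box behind an adapter, might look roughly like the following sketch. All names (`OutboundMessagePort`, `QueueAdapter`, `FormService`, the `FORM_SAVED` message type) are hypothetical, and the adapter only collects messages in a list where a real one would wrap a queue client or an HTTP call to Mirth.

```java
import java.util.ArrayList;
import java.util.List;

/** Port the application codes against; it knows nothing about Mirth, queues, or HTTP. */
interface OutboundMessagePort {
    void send(String messageType, String payload);
}

/** Hypothetical adapter: a real one would wrap a queue client instead of a list. */
class QueueAdapter implements OutboundMessagePort {
    final List<String> queued = new ArrayList<>(); // stand-in for the real queue

    @Override
    public void send(String messageType, String payload) {
        queued.add(messageType + "|" + payload);
    }
}

/** Application code depends only on the port, so the bus can be swapped at deployment time. */
class FormService {
    private final OutboundMessagePort port;

    FormService(OutboundMessagePort port) {
        this.port = port;
    }

    void formSaved(long recordId) {
        // Database changes would be committed here, before the message is handed over.
        port.send("FORM_SAVED", Long.toString(recordId));
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        QueueAdapter adapter = new QueueAdapter();
        FormService service = new FormService(adapter);
        service.formSaved(42L);
        System.out.println(adapter.queued); // prints [FORM_SAVED|42]
    }
}
```

Whether the payload is a finished HL7 message (option a) or only the primary keys (option b) then becomes a detail of the adapter and the message contract, not of the application code that calls `formSaved`.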
I am building a rich app on GAE using Canoo's RIA Suite. This package splits Java Swing components into server-side and client-side parts. On the server, it looks like a 'desktop' Java application. The client keeps its own map between these halves. When GAE starts a new instance, the client-side parts don't know about it -- if the next request they send is routed to the wrong instance bad things happen.
I figure I could get around this problem if I did one of two things:
Forced a GAE instance to serve exactly one HTTP session.
Directed each HTTP request to a specific GAE instance.
My question is, in the GAE environment, can either of these be done?
Neither of these two options will solve your problem, because an App Engine instance can die and be replaced at any moment.
If you can save a state of your server-side "half" in a datastore, you can load it when a request hits the "wrong" instance, but it's probably not a very efficient solution.
You may be better off using a Compute Engine instance.
I agree that neither of those two options will work for you. The implication of your current design is that you are storing state in memory on an instance, which will not work with GAE (or any autoscaling distributed system). You should put any state into some distributed data store, whether that is memcache (which is volatile), the datastore or Cloud SQL.
GAE/J has built-in support for Java sessions: the session state is persisted in the datastore across requests so that it is valid on any instance. For this to work, everything stored in your session will need to be serializable.
You can enable this by following these instructions.
Otherwise you can manage persisting server state yourself into the datastore accelerated by memcache, and linking it to a 'session' with a cookie. If you go down this road make sure you understand the implications of eventual consistency in the GAE datastore.
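A minimal sketch of the "manage it yourself" route, using only standard Java serialization. `ExternalSessionStore` is a hypothetical stand-in backed by an in-memory map; in a real deployment it would be backed by the datastore and/or memcache, whose APIs are deliberately not shown here. `ServerSideState` is likewise an invented example of the state that has to survive an instance change.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical server-side "half" of the UI; it must be Serializable to leave the instance. */
class ServerSideState implements Serializable {
    private static final long serialVersionUID = 1L;
    final String openForm;
    final int selectedRow;

    ServerSideState(String openForm, int selectedRow) {
        this.openForm = openForm;
        this.selectedRow = selectedRow;
    }
}

/** Stand-in for the datastore or memcache; a real store lives outside any single instance. */
class ExternalSessionStore {
    private final Map<String, byte[]> rows = new ConcurrentHashMap<>();

    void save(String sessionId, Serializable state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        rows.put(sessionId, bytes.toByteArray());
    }

    /** Returns the stored state, or null when the session id is unknown. */
    Object load(String sessionId) {
        byte[] data = rows.get(sessionId);
        if (data == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class SessionStoreDemo {
    public static void main(String[] args) {
        ExternalSessionStore store = new ExternalSessionStore();
        // "Instance 1" handles a request and persists the state under the cookie value.
        store.save("cookie-abc", new ServerSideState("purchaseOrder", 7));
        // "Instance 2" receives the next request and rebuilds the state from the store.
        ServerSideState restored = (ServerSideState) store.load("cookie-abc");
        System.out.println(restored.openForm + " / row " + restored.selectedRow);
    }
}
```

Because the state is rebuilt from bytes on every load, any instance can serve any request, at the cost of a store round trip per request.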
I am developing a stand-alone Java client application that connects to a GlassFish v3 application for JPA/EJB facade-style transactions. In other words, my client application does not connect directly to the database for CRUD operations; it transfers JPA objects using stateless session EJBs.
I have scenarios where this client application will be used on an external network, connected through a VPN over the Internet on a 512 kbps DSL client connection, and a simple query takes a very long time. Watching the traffic graph, when I merge an entity in the client application I see megabytes of traffic (I couldn't believe how a purchase order entity could weigh more than 1 MB).
I have LAZY fetch in almost every many-to-many relationship, but I have a lot of many-to-one relationships between entities (but this is the great advantage of JPA!).
Could I do something to accelerate the speed of transactions between the JPA/EJB server and the remote Java client?
Thank you in advance.
How much data do you really transfer? Maybe the purchase order you're sending has a product, which has a model, which has a supplier, which has a set of models... and so on...
You could try serializing the object you're sending to the server into a file (using the standard ObjectOutputStream) and check how big the file is.
I'm seeing the traffic graph and when I merge an entity in the client application I see megabytes of traffic (I couldn't believe how a purchase order entity could weigh more than 1 MB).
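A minimal sketch of that measurement. It writes to an in-memory `ByteArrayOutputStream` rather than a file, which gives the same byte count without touching disk. The `Demo*` classes are invented stand-ins for the real entities; `SerializedSize.of` can be pointed at any `Serializable` object graph.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SerializedSize {
    /** Returns the number of bytes standard Java serialization produces for the object graph. */
    static int of(Serializable root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }
}

// Hypothetical entities, only to show how a small order can drag a large graph along.
class DemoSupplier implements Serializable {
    String name = "ACME";
    List<DemoProduct> catalogue = new ArrayList<>();
}

class DemoProduct implements Serializable {
    String name;
    DemoSupplier supplier;

    DemoProduct(String name, DemoSupplier supplier) {
        this.name = name;
        this.supplier = supplier;
    }
}

class DemoPurchaseOrder implements Serializable {
    DemoProduct product;
}

class SerializedSizeDemo {
    public static void main(String[] args) {
        DemoSupplier supplier = new DemoSupplier();
        DemoPurchaseOrder order = new DemoPurchaseOrder();
        order.product = new DemoProduct("widget", supplier);
        System.out.println("order alone: " + SerializedSize.of(order) + " bytes");

        // The supplier's catalogue is reachable from the order, so it is serialized too.
        for (int i = 0; i < 1000; i++) {
            supplier.catalogue.add(new DemoProduct("item-" + i, supplier));
        }
        System.out.println("order with catalogue: " + SerializedSize.of(order) + " bytes");
    }
}
```

Comparing the two numbers shows how much of the payload comes from objects reachable through relationships rather than from the order itself.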
RMI-IIOP is a bit more verbose than plain RMI. In my experience, it doesn't work well when transferring large graphs.
As far as I remember (but maybe things have changed in the meantime), when you transfer a lazily loaded entity, the parts that haven't been loaded yet are sent as-is (the proxy is serialized), which means you cannot access them on the client, because lazy loading won't work once there is no session anymore. Are you eagerly loading the entity before sending it back to the client?
Could I do something to accelerate the speed of transactions between the JPA/EJB server and the remote Java client?
But the crux of the problem is that you are in a scenario where you need to think about a strategy for transferring data. You must design your application in a way that you don't send large graphs; this concern must be addressed in the design of your app. Then you can decide whether to still send JPA entities or rely on Data Transfer Objects (DTOs).
You might also consider using an extended persistence context with a stateful session bean; this way I think an entity on the client side can still be loaded lazily. But I have never used this personally and don't know whether it works well or not.
If I understand your architecture correctly you have:
Client(works with disconnected Entities)
----RMI/IIOP--->
Server(SLSB, using entity manager, JPA persistence)
----JDBC------->
Database
In effect your SLSBs are expressing their interface in terms of the JPA Objects, your DTOs are the JPA objects. I see two possible scenarios:
your client needs only a subset of the data in your JPA objects, and you are transferring more than you actually need. For example, you might only need an employee's name and you send their entire life history.
you are traversing more of the relationship tree than you intend
My feeling is that you should first determine exactly what you are getting in the client. Should be pretty easy to add some trace statements to see exactly what data you have.
Possibly by tweaking the lazy loading etc, you can then control the behaviour.
My expectation is that you may need to define client-specific "subset" DTOs and have your SLSB act more as a facade, sending only the subset data. It's more work, but you have fine control over what's in the interface.
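A client-specific "subset" DTO could be sketched as follows. `Employee` here is a plain class standing in for the JPA entity (annotations omitted so the sketch runs without a persistence provider), and `EmployeeSummaryDto` is a hypothetical name; the point is only that the facade copies the fields the client needs and nothing else.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a JPA entity with a potentially large relationship. */
class Employee {
    long id;
    String name;
    List<String> employmentHistory = new ArrayList<>(); // could be thousands of rows

    Employee(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Flat, immutable subset that is cheap to send over the remote interface. */
final class EmployeeSummaryDto implements Serializable {
    private static final long serialVersionUID = 1L;
    final long id;
    final String name;

    private EmployeeSummaryDto(long id, String name) {
        this.id = id;
        this.name = name;
    }

    /** Copies only the fields the client needs; relationships are never touched. */
    static EmployeeSummaryDto from(Employee employee) {
        return new EmployeeSummaryDto(employee.id, employee.name);
    }
}

class DtoDemo {
    public static void main(String[] args) {
        Employee employee = new Employee(1L, "Ada");
        for (int year = 1990; year < 2020; year++) {
            employee.employmentHistory.add("review " + year);
        }
        EmployeeSummaryDto dto = EmployeeSummaryDto.from(employee);
        System.out.println(dto.id + " " + dto.name); // prints "1 Ada"
    }
}
```

In the session bean, a method would return `EmployeeSummaryDto` instead of the entity, so lazy relationships are never traversed during serialization.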
Architecturally, fine tuning a remote interface is quite a reasonable thing to need to do.
Could I do something to accelerate the speed of transactions between the JPA/EJB server and the remote Java client?
You can't accelerate things. However, you can transfer less (only the required part or lighter objects).