I need to design a solution for the following problem; please help me.
My problem:
We have a web project that uses four tables, called A, B, C, and D.
For each table we created a front-end page with a button that saves the record to the respective table.
Now we need to share each saved record with another application through a web-service integration.
I am familiar with JAX-WS web services.
We identified the required fields and created a single common WSDL for all four tables.
The web-service call should be raised only at the moment the user saves the record (event based).
We are using a synchronous web service, i.e. for every request the system waits for the response from the other end.
Suppose I first try to save a record in table A: I fill in all the required fields in form A and hit the save button. The record is saved in the database, a web-service request is raised, the record is sent to the server, and we wait for the response...
Meanwhile, suppose I try to send another record from the same form A, or a new record from form B.
How should I handle this scenario, given that a thread is already busy waiting for the server's response? In other words, how can I raise multiple requests concurrently while keeping them synchronous?
Please suggest possible solutions I can apply.
Any suggestion would be very helpful.
(Sorry for my bad English.)
Looking at your scenario, I see that you have something like:
Database -> Web JAX-WS Server -> Multiple JAX-WS Clients
When a client calls the WS server, a new thread is created to handle the request and produce the response for that client. Web servers are multithreaded and support multiple clients calling at the same time. Your problem is probably after the WS service.
If your WS is only reading the same table from two clients, there is no problem; but if one client tries to save while another reads, or two or more clients are updating the table, a transaction lock is probably the problem.
Depending on the database configuration, you may need to configure your transaction isolation options and handle your database connections carefully, opening and closing transactions only when absolutely required.
For example, if you are using MySQL with InnoDB (http://dev.mysql.com/doc/refman/5.0/es/innodb-transaction-isolation.html) and your transaction isolation is SERIALIZABLE, then when you perform a query the whole table is locked until the transaction ends, and any other client waits until the transaction is released or a timeout is raised.
But if you have REPEATABLE READ (the default for MySQL InnoDB), only the records read by one transaction are locked against other transactions. This can be "good" in some environments, but two SQL statements that apply to the same row can cause a deadlock.
Alternatively, you can use READ COMMITTED or READ UNCOMMITTED to allow reading the whole table while different records are modified. To handle the same record with minimal problems, the usual recommendation applies: keep your transactions open only for the minimal required time.
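As a rough illustration of that advice (a sketch only, not your actual DAO), here is a plain-JDBC fragment that sets READ COMMITTED and keeps the transaction open just long enough for the insert; the URL, table, and column names are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SaveRecordDao {

    public void saveRecord(String url, String user, String pass, String value) throws Exception {
        try (Connection con = DriverManager.getConnection(url, user, pass)) {
            // READ COMMITTED lets other clients keep reading the table
            // while this transaction modifies a different record.
            con.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
            con.setAutoCommit(false);
            try (PreparedStatement ps = con.prepareStatement("INSERT INTO table_a (field1) VALUES (?)")) {
                ps.setString(1, value);
                ps.executeUpdate();
                con.commit();   // commit right away: the transaction stays open only as long as needed
            } catch (Exception e) {
                con.rollback();
                throw e;
            }
        }
    }
}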
Also check your WS client for singleton patterns that destroy the first request when you create a second one, and check whether you are using a stateful WS, preserving the session or other server-side objects in a user session that are shared across different requests.
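On the client side of the original question, the calls can stay synchronous and still run concurrently if each save hands its call to a worker thread. A minimal sketch, where RecordPort and sendRecord are placeholders for the port and operation generated from your common WSDL:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RecordSender {

    // Placeholder for the JAX-WS port type generated by wsimport from your WSDL.
    public interface RecordPort {
        String sendRecord(String recordPayload);
    }

    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // The call itself stays synchronous; only the waiting moves to a worker
    // thread, so a second save (form A or form B) is not blocked by the first.
    public Future<String> sendAsync(RecordPort port, String recordPayload) {
        return pool.submit(() -> port.sendRecord(recordPayload));
    }
}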
Related to stateless:
Is it possible to use @WebService, @Stateless and @Singleton altogether in one EJB 3 bean?
I have some problems understanding the best concept for my problem.
My architecture is pretty basic: I have a backend with data that can be updated, and clients that load the data with some filters.
The backend keeps the data in an EhCache.
The data model is pretty basic, for example:
{
id: string,
startDate: date,
endDate: date,
username: string,
group: string
}
The data can only be modified by another backend service.
When data is modified, added, or deleted, a data update event is generated.
The clients are all web clients and use a Spring Boot REST service to fetch the data from the cache.
With the data request each client sends its own request settings. There are different settings, such as date and text filters. For example:
{
contentFilter: Filter,
startDateFilter: date,
endDateFilter: date
}
The backend uses these settings to filter the data from the cache and then sends the response with the filtered data.
When the cache generates an update event, every client is notified over a websocket connection.
Each client then requests the full data with the same request settings as before.
My problem is that there are many cache updates happening, and the clients can have a lot of data to load if the full dataset is loaded every time.
For example I have this scenario.
Full dataset in cache: 100 000 rows
Update of rows in cache: 5-10 random rows every 1-5 seconds
Client1 dataset with request filter: 5000 rows
Client2 dataset with request filter: 50 rows
Now every time a client receives an update notification, it loads its complete dataset (e.g. 5000 rows), and that happens every 1-5 seconds. If the updates keep hitting the same row, and that row isn't loaded by the client because of its filter settings, then the client is loading the data unnecessarily.
I am not sure what would be the best solution to reduce the client updates and increase the performance.
My first thought was to just send the updated row directly to the clients over the websocket connection.
But for that I would have to know whether the client "needs" the updated row. If the updates happen on rows that a client doesn't need to load because of its filter settings, then I would spam the client with unnecessary updates.
I could add a check on the client side to see whether the id of the updated row is in the loaded dataset, but then I would need a separate check for rows that are added to the cache rather than updated.
But I am not sure whether that is best practice, and unfortunately I cannot find many resources on this topic.
The most efficient things are always the most work, sadly.
I won't claim to be an expert at this kind of thing - on either the implementation(s) available or even the best practices - but I can give some food for thought at least, which may or may not be of help.
My first choice: your first thought.
You have the problem of knowing if the updated item is relevant to the client, due to the filters.
Save the filters for the client whenever they request the full data set!
Row gets updated, check through all the client filters to see if it is relevant to any of them, push out to those it is.
The effort for maintaining that filter cache is minimal (update whenever they change their filters), and you'll also be sending down minimal data to the clients. You also won't be iterating over a large dataset multiple times, just the smaller client set and only for the few rows that have been updated.
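A rough sketch of option 1, with generic Row/Filter types and a pluggable push callback standing in for your real classes and websocket send:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.BiPredicate;

public class UpdatePushService<R, F> {

    // clientId -> the filter that client sent with its last full-data request
    private final Map<String, F> filtersByClient = new ConcurrentHashMap<>();

    private final BiPredicate<F, R> matches;           // does this row pass this filter?
    private final BiConsumer<String, R> pushToClient;  // e.g. a websocket send to that client

    public UpdatePushService(BiPredicate<F, R> matches, BiConsumer<String, R> pushToClient) {
        this.matches = matches;
        this.pushToClient = pushToClient;
    }

    // Called by the REST endpoint whenever a client requests the full data set.
    public void rememberFilter(String clientId, F filter) {
        filtersByClient.put(clientId, filter);
    }

    // Called by the cache update listener for each changed row.
    public void onRowChanged(R row) {
        filtersByClient.forEach((clientId, filter) -> {
            if (matches.test(filter, row)) {
                pushToClient.accept(clientId, row);    // only clients whose filter matches get the row
            }
        });
    }
}

The matches predicate is simply the same filtering logic your REST endpoint already applies, evaluated for a single row.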
Another option:
If you don't go ahead with option 1, option 2 might be to group updates - assuming you have the luxury of not needing immediate, real-time updates.
Instead of telling the clients about every data update, only tell them every x seconds that there might be data waiting for them (might be, you little tease).
I was going to add other options but, to be honest, I don't see why you'd worry about much beyond option 1, maybe with an option 2 addition to reduce traffic if that's an issue.
'Best practice'-wise, sending down multiple FULL datasets to multiple clients multiple times a second is certainly not it.
Sending only the data relevant to each client is a much better solution, and if you can further reduce how much the client even needs to send (i.e. only their filter updates, rather than re-sending something you could already have saved), that's an added bonus.
Edit:
Ah, stateless server - though it's not really stateless. You're using web sockets, so the server has some kind of state for those connections. It's already stateful so option 1 doesn't really break anything.
If it's to be completely stateless, then you also can't store the updated rows of data, so you can't return those individually. You're back to what you're doing which is a full round-trip and data read + serve.
Option 3, though, if you're semi-stateless (you don't want to add any metadata to those socket connections) but do hold on to updated rows: timestamp them and have the clients send the time of their last update along with their filters. You can then return only the rows updated since that time that match their provided filters - the timestamp just becomes another filter (arguably that's still stateless).
Either way, limiting the updated data back down to the client is the main goal if for nothing else than saving data transfer.
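A small sketch of that timestamp-as-filter idea, with made-up row fields and a single username filter standing in for your real filter settings:

import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

public class DeltaQueryService {

    public static class Row {
        public String id;
        public Instant lastModified;   // stamped whenever the row is updated in the cache
        public String username;
    }

    // The client sends its usual filters plus the time of its last update;
    // the timestamp is applied like any other filter.
    public List<Row> fetch(List<Row> cachedRows, String usernameFilter, Instant since) {
        return cachedRows.stream()
                .filter(r -> usernameFilter == null || usernameFilter.equals(r.username))
                .filter(r -> since == null || r.lastModified.isAfter(since))
                .collect(Collectors.toList());
    }
}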
Edit 2:
Sounds like you may need to send two bits of data down (or three if you want to split things even further - makes life easier client-side, I guess):
{
newItems: [{...}, ...],
updatedItems: [{...}, ...],
deletedIds: [1,2...]
}
Yes, when their request for an update comes, you'll have to check through your updated items to see if any are deleted and of relevance to the client's filters, but you can send down a minimal list of ids rather than whole rows that your client can then remove.
I would like to retrieve data from my in-memory H2 database via a REST endpoint using Spring and Java 8. I have two endpoints: one to retrieve data, and a second one to add data to the database.
How can I achieve something like what is described below in the easiest way? I am not sure which solution would be better; I thought about a JMS queue or CompletableFuture (if that is possible). It should work for a few users, who will call to retrieve data saved under their id number.
Scenario:
The user calls the REST endpoint to retrieve data.
If the data is present in the database, it is retrieved and returned to the user.
If the data is not present in the database, the connection is held for 60 seconds, and if during that time something appears in the database (added via the endpoint for adding new data), that data is returned.
If the data is not present in the database and no new data appears within 60 seconds, the endpoint returns no content.
There are multiple ways of doing this; given the requirements as stated, I suggest the two approaches below.
Approach 1:
Find and retrieve the data if it is available, without waiting.
If the data is not available, set a resource id and a retrieveTime in the response headers and respond to the consumer.
Based on that resource id, you can have the data ready for the consumer once it becomes available.
This way your endpoint's service time stays consistent; ideally it shouldn't be more than 3 seconds.
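A rough Spring sketch of approach 1, using an in-memory map as a stand-in for the H2 lookup and made-up header names:

import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DataController {

    // Stand-in for the database lookup; in the real app this is your repository/DAO.
    private final Map<String, String> store = new ConcurrentHashMap<>();

    @GetMapping("/data/{id}")
    public ResponseEntity<String> getData(@PathVariable String id) {
        String value = store.get(id);
        if (value != null) {
            return ResponseEntity.ok(value);
        }
        // Not there yet: respond immediately with the resource id and a suggested
        // retrieve time, keeping the endpoint's response time consistent.
        return ResponseEntity.status(HttpStatus.ACCEPTED)
                .header("Resource-Id", id)
                .header("Retrieve-After", Instant.now().plusSeconds(60).toString())
                .build();
    }

    @PostMapping("/data/{id}")
    public void addData(@PathVariable String id, @RequestBody String value) {
        store.put(id, value);
    }
}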
Approach 2:
If the data is not available, sleep for up to 60 seconds (not within the database connection's scope) and then try again on the same thread.
No queue or async process is needed here.
The downside is that you tie up resources and the service time gets longer.
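One way to implement that 60-second wait is to poll in short sleeps rather than a single long sleep, so data added early is returned sooner and no database connection is held while sleeping; the lookup is a placeholder for your repository call:

import java.util.Optional;
import java.util.function.Supplier;

public class PollingFetcher {

    public <T> Optional<T> fetchWithWait(Supplier<Optional<T>> lookup) throws InterruptedException {
        long deadline = System.currentTimeMillis() + 60_000;   // 60-second budget
        while (System.currentTimeMillis() < deadline) {
            Optional<T> result = lookup.get();                 // fresh query each iteration
            if (result.isPresent()) {
                return result;
            }
            Thread.sleep(2_000);                               // wait a little before trying again
        }
        return Optional.empty();                               // caller maps this to "no content"
    }
}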
Apart from these approaches, if your systems use eventing, you can take an event-driven approach: when a record is persisted, send an event to the consumer (most databases today can send events to a source system).
I have developed a Java console application which does the following:
Fetch product details such as product id, name, cost, etc. from an Oracle database and put them in a map (say dbMap); one product can have multiple records, as there are sub-products.
Fetch similar product details from a REST server and store them in a map (say restMap).
Since the DB has the correct data, compare the two maps - dbMap and restMap - and identify what should be added, replaced, and removed on the REST server.
For this purpose, I create one JSON patch request per product - with add, replace, and remove operations (around a hundred or so for each product) - and send it to the REST server.
However, I see it takes a few minutes to perform all these operations, and they all happen sequentially: the database call, the REST server call, the comparison, and finally the patch to the REST server.
I am assuming that, instead of tackling all the data in a single thread, it might be faster to get the list of products and go product by product, with each product in its own thread, running these threads in parallel.
So each thread would do the following: fetch the details of one product from the database and also from the REST server, compare the two, generate a patch request (with add/remove/replace operations) for that product, and send it to the REST server.
Could you please suggest how I can implement this kind of threading architecture in Java? (There seem to be several ways, like thread pools, Akka, etc., and I am confused.)
Since the DB call and the REST call are not mutually dependent, you can make them in parallel on two threads, with one dedicated thread to process the comparison.
You may use a producer-consumer approach here.
You can use thread pools for this via ExecutorService:
http://tutorials.jenkov.com/java-util-concurrent/executorservice.html
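For example, a minimal ExecutorService sketch along those lines; ProductSyncTask is a placeholder for your fetch/compare/patch logic:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ProductSyncRunner {

    public interface ProductSyncTask {
        void sync(String productId);   // fetch from DB and REST, compare, send the JSON patch
    }

    public void syncAll(List<String> productIds, ProductSyncTask task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8);   // tune to what your DB and REST server can handle
        for (String id : productIds) {
            pool.submit(() -> task.sync(id));                     // one product per task
        }
        pool.shutdown();                                          // accept no new tasks, let submitted ones finish
        pool.awaitTermination(30, TimeUnit.MINUTES);
    }
}

The pool size is the main knob: too many threads and you may overload the database or hit REST rate limits, too few and you lose the parallelism.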
I am developing a dictionary application and using many external sources to collect the data.
This data is collected from those sources only the first time; after that I persist it to my DB and fetch it from there.
The problem I am facing is that some words like set, cut, and put have hundreds of meanings and many examples as well. It takes around 10 seconds to persist all this data to MySQL (I am using MyBatis to persist it), and because of this the response time is getting screwed up. Without the database persist, I get a response in 400-500 ms if I show the data directly after fetching it from the sources.
I am trying to find a way to persist the data in the background. I am using the MVC pattern, so the DAO layer is separate.
Is it a good idea to use threading in the dao layer as a solution? Or should I use some messaging tool like Kafka to send a message to persist the given word in background? What else can I do?
Note: I prefer MySQL as the db right now, will probably use redis for caching later on.
My overall answer to the question, plus further comments:
Do not bulk insert with the MyBatis foreach. Instead, execute the statement in a Java iteration over the list of objects to store, using ExecutorType REUSE or BATCH (read the documentation).
For transactions, configure the environment in the main mybatis-config.xml:
transactionManager type JDBC to manage the transaction in code: session = sessionFactory.openSession(); session.commit(); session.rollback();
transactionManager type MANAGED to let the container manage it.
Furthermore, you can let the web app send the response, while a new thread takes its time to store the data.
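A minimal sketch of both points together (a BATCH session driven from a background thread); the mapper statement id and parameter type are placeholders:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.ibatis.session.ExecutorType;
import org.apache.ibatis.session.SqlSession;
import org.apache.ibatis.session.SqlSessionFactory;

public class MeaningWriter {

    private final SqlSessionFactory sessionFactory;
    private final ExecutorService background = Executors.newSingleThreadExecutor();

    public MeaningWriter(SqlSessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // The controller returns the fetched definitions immediately; this worker
    // persists them afterwards without holding up the response.
    public void persistInBackground(List<?> meanings) {
        background.submit(() -> {
            // The BATCH executor reuses the prepared statement and flushes inserts in batches.
            try (SqlSession session = sessionFactory.openSession(ExecutorType.BATCH)) {
                for (Object meaning : meanings) {
                    session.insert("MeaningMapper.insertMeaning", meaning);   // hypothetical statement id
                }
                session.commit();   // one commit for the whole batch
            }
        });
    }
}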
In my code, there is a websocket server that persists information to the database on behalf of the connected client.
I am using Jetty 9, Hibernate, and Postgres.
Essentially, when a client posts a data object via the websocket:
The server deserializes the data
Checks to see if the object already exists in the database, based on content
If a match is found
The server will update the database row, and indicate that the item already exists
Else
The server will create a new database row and indicate that the item was added.
In this system:
An endpoint (i.e., a URL) corresponds to a single user principal
Multiple connections to the same endpoint are allowed, meaning multiple connections from different clients to a single user.
Data objects in the database are specific to a user
(This means that the server will only check rows belonging to the user for already-existing items)
This is all working well, except when 2 clients post the same data at precisely the same time. Neither server instance knows about the other, and the database ends up with 2 rows that are the same, except for the server-generated ID.
How can I prevent the server from writing the same data at the same time?
I can't use a UNIQUE constraint in the database, because I actually do have to support having multiple rows with the same data, but posted at different times.
I can get all the other sessions attached to the same endpoint, using "session.getOpenSessions()".
Can I:
Synchronize on the endpoint, somehow?
Synchronize on something attached to the session?
Do something with threading configuration?
Thanks for any suggestions.
When you say:
Checks to see if the object already exists in the database, based on content
You have already defined a unique constraint: there is a combination of columns that must be unique, so that when you match an existing row, you update instead of inserting.
The database already offers a centralized concurrency-control mechanism, so you should probably use a unique index on all the aforementioned columns.
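For illustration (with hypothetical column names, since I don't know which fields your existence check actually compares), a JPA mapping that declares that combination unique, so the database itself rejects the second concurrent insert:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;
import javax.persistence.UniqueConstraint;

@Entity
@Table(name = "data_item",
       uniqueConstraints = @UniqueConstraint(columnNames = {"user_id", "content_hash"}))
public class DataItem {

    @Id
    @GeneratedValue
    private Long id;                 // server-generated id, not part of the uniqueness rule

    @Column(name = "user_id")
    private String userId;           // rows are per-user, so the user is part of the key

    @Column(name = "content_hash")
    private String contentHash;      // hash (or copy) of the content fields you compare

    // getters and setters omitted
}

If rows with identical content posted at different times really must coexist, the posting timestamp (or a coarser time bucket) would have to be part of that constraint as well; otherwise, catch the constraint violation on insert and treat it as the "already exists" case, updating instead.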