GWT: Batching vs Disposability vs Statelessness - java

I recently watched several Google I/O videos where Google developers present GWT with respect to performance and security. In one of those videos the speaker mentions several GWT-isms:
Client-side request "batching"
"Disposability"
The pursuit of GWT app "statelessness"
With respect to "batching", it seems like GWT can be configured to queue up server-side RPC calls and send them all at once (instead of many tiny, performance-hindering calls). Unfortunately, I'm just not seeing the forest for the trees here: does GWT handle batching for you, or do you have to write the logic that performs this bundling/batching? If you have to do it, what kinds of calls can/should be bundled? How do you know when it's time to fire the batch off?
In GWT lingo, what does it mean when someone says:
"Clients and servers are disposable"; but
"Views" are not disposable
How do these concepts of "batching" and "disposability" relate to GWT app "statelessness"? The speaker defined statelessness as:
Browser embodies the session (?!?!)
Server is stateless - except for caching (?!?!)
Client never notices a restart (?!?!)
If someone could help give me concrete understanding of these 3 items and how they relate to each other I think I'll start to "get gwt". Thanks in advance!

does GWT handle batching for you, or do you have to write the logic that performs this bundling/batching? If you have to do it, what kinds of calls can/should be bundled? How do you know when it's time to fire the batch off?
GWT-RPC has no batching mechanism. You can (relatively) easily add one by queueing "commands" in a list and then sending the list as a single GWT-RPC call. Some projects can do that for you with minimal effort (GWT-Platform, for example).
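To make the queue-and-flush idea concrete, here is a minimal sketch in plain Java. Everything in it (Command, CommandBatcher, the transport function) is a hypothetical name invented for illustration, not part of GWT. The transport function stands in for the single GWT-RPC call, and it is synchronous here only to keep the example short; a real GWT-RPC call is asynchronous and reports back through a callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical command type; in a real GWT-RPC app it would have to be serializable. */
final class Command {
    final String name;
    final String payload;

    Command(String name, String payload) {
        this.name = name;
        this.payload = payload;
    }
}

/** Queues commands and hands the whole list to one transport call when flushed. */
final class CommandBatcher {
    private final List<Command> pending = new ArrayList<>();
    // Stand-in for the single RPC method that accepts a list of commands (assumption).
    private final Function<List<Command>, List<String>> transport;
    private int roundTrips = 0;

    CommandBatcher(Function<List<Command>, List<String>> transport) {
        this.transport = transport;
    }

    void enqueue(Command command) {
        pending.add(command);
    }

    /** Sends everything queued so far in one round trip; does nothing if the queue is empty. */
    List<String> flush() {
        if (pending.isEmpty()) {
            return new ArrayList<>();
        }
        List<Command> batch = new ArrayList<>(pending);
        pending.clear();
        roundTrips++;
        return transport.apply(batch);
    }

    int roundTrips() {
        return roundTrips;
    }
}

class BatchingDemo {
    public static void main(String[] args) {
        // Fake "server": answers each command in the batch, in order.
        CommandBatcher batcher = new CommandBatcher(batch -> {
            List<String> results = new ArrayList<>();
            for (Command c : batch) {
                results.add(c.name + ":" + c.payload);
            }
            return results;
        });
        batcher.enqueue(new Command("loadUser", "42"));
        batcher.enqueue(new Command("loadOrders", "42"));
        System.out.println(batcher.flush());      // [loadUser:42, loadOrders:42]
        System.out.println(batcher.roundTrips()); // 1
    }
}
```

When to call flush() is up to you. One reasonable choice is to flush once at the end of the current browser event, so that everything queued while handling a click goes out together.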
RequestFactory on the other hand has batching built-in: you create a RequestContext instance and batch calls to it until you fire() it.
"Clients and servers are disposable"; but "Views" are not disposable
The first is related to statelessness (and, for example, with AppEngine, you don't control when a new server instance is created, shut down or restarted: the server can disappear at any time, so don't keep state in memory).
The second is about performance: everything related to the DOM in the browser is slow, so constructing a new view (widgets stacked together) is heavy-weight (less so with Cell widgets though). As a result, you don't want to make views disposable, i.e. throw them away every now and then. You'd rather keep one view instance around and reuse it for the lifetime of the app.
Not exactly the same notion of "disposability".
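As a rough illustration of "keep one view instance around", here is a sketch in plain Java. ExpensiveView and ViewHolder are hypothetical names made up for this example and stand in for a real widget tree and whatever creates it. The lazy initialization is not synchronized, on the assumption that it only ever runs in single-threaded client-side code.

```java
import java.util.function.Supplier;

/** Stand-in for a view that is costly to build (hypothetical; a real one would create widgets). */
final class ExpensiveView {
    private String text = "";

    void setText(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }
}

/** Builds the view on first use and hands back the same instance afterwards. */
final class ViewHolder<V> {
    private final Supplier<V> factory;
    private V instance;
    private int creations = 0;

    ViewHolder(Supplier<V> factory) {
        this.factory = factory;
    }

    V get() {
        if (instance == null) {
            instance = factory.get();
            creations++;
        }
        return instance;
    }

    int creations() {
        return creations;
    }
}

class ViewReuseDemo {
    public static void main(String[] args) {
        ViewHolder<ExpensiveView> holder = new ViewHolder<>(ExpensiveView::new);
        holder.get().setText("first visit");
        // Coming back to the same screen reuses the view instead of rebuilding it.
        holder.get().setText("second visit");
        System.out.println(holder.creations()); // 1
    }
}
```

The state that changes between visits (here, the text) is pushed into the existing view each time; only the construction cost is paid once.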
Browser embodies the session (?!?!)
GWT is built for single-page apps. You can store state on the client simply in variables in your app; you don't need cookies or whatever to have the state shared between pages.
Server is stateless - except for caching (?!?!)
Storing session state on the server has a cost: the state must be persisted (particularly if the server is disposable), shared between servers (when you have a cluster or run in the cloud), etc. You'll spend as many resources maintaining your session state as doing actual business logic.
Client never notices a restart (?!?!)
HTTP is a disconnected protocol. If the server is restarted, the client won't know about it, and it shouldn't have to know about it.
If someone could help give me concrete understanding of these 3 items and how they relate to each other I think I'll start to "get gwt".
It's not about getting GWT, it's about getting the Web and getting single-page webapps, and how to scale them.
Whether they're made with GWT or jQuery on the client-side, and Java or Python or .NET on the server-side doesn't matter.
Read about REST; it sums it all up.

Related

Where to synchronize inside a Java WebApp

My employer has currently given me a project that has me scratching my head about synchronization.
I'm going to first talk about the situation I'm in:
I've been asked to create a PDF report/quotation tool that takes its data from CSV files. The actual database the data lives in is used by old IBM software, and for reasons unknown they don't want any direct access to it; instead of copying the data to another database, they apparently found it perfectly fine to create a folder on the server with loads and loads of CSV files. The software has to load this data into the application, query it, transform it where needed, do calculations and then return a PDF file to the end user.
The first problem is that getting, querying and calculating things takes a fair amount of time. The other problem is that they want it to be a WebApp: the business team does not want to install any new software, and they have mostly been moving towards doing everything online since the start of the pandemic. Being a WebApp means that every computation, as well as fetching the data, has to be done by the WebApp itself.
My question: Is each call to a servlet by a separate user treated as a separate servlet and should I only synchronize the methods on the business logic (getting and using the data); or should I write some code that puts itself in the middle of the servlet, receives a user-id (as reference), that then runs the business-logic in a synchronized-fashion, then receiving data and returning the pdf-file?
(I hope you get the gist of it...)
Everything will run on Apache Tomcat 8 if that helps. The build is on Java 11 (LTS).
Sorry, no code yet. But I've made some drawings.
With Java web applications, the usual pattern is for the components to not have conversational state (meaning information specific to a particular user's request). If you need to keep state for a user on the server, you can use the HTTP session. With a SPA or Ajax application it's often easier to keep a lot of that kind of state in the browser. The less state you keep on the server, the easier things are as your application scales: you don't have to pin sessions to servers (messing up load balancing) or copy lots of session state across a cluster.
For simple (non-reactive) web apps that do blocking i/o, each request-response cycle gets its own dedicated thread from tomcat's pool. That thread delivers the http request to the servlet, handles the business logic and blocks while talking to the database, then carries the http response.
(Reactive webapps are going to be more complex to build: you will need a non-blocking database driver and you will have fewer choices for databases, so I would steer clear of those, at least for your first web application.)
The threadpool used by tomcat has to protect itself from concurrent access but that doesn't impact your code. Likewise there are 3rd party middletier caching libraries that have to deal with concurrency but you can avoid dealing with it directly. All of your logic is confined to one thread so it doesn't interfere with processing done by other threads unless there are shared mutable data structures. Those data structures would be the part of the application where synchronization might be one of several possible solutions.
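Here is a small sketch of that distinction, using only java.util.concurrent. ReportService, the loader function and the file name are hypothetical, and the thread pool merely imitates the servlet container's request threads. The only shared mutable structure is the cache of loaded files, so it is the only place that needs a thread-safe type; everything built per request lives in local variables.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class ReportService {
    // Shared by all request threads, so it must be a thread-safe structure.
    private final ConcurrentHashMap<String, String> loadedFiles = new ConcurrentHashMap<>();
    // Stand-in for the slow "read and parse a CSV file" step (assumption).
    private final Function<String, String> loader;

    ReportService(Function<String, String> loader) {
        this.loader = loader;
    }

    String buildReport(String userId, String csvName) {
        // ConcurrentHashMap runs the loader at most once per key, even under contention.
        String data = loadedFiles.computeIfAbsent(csvName, loader);
        // Local variables are confined to the calling thread: no locking needed.
        StringBuilder report = new StringBuilder();
        report.append("report for ").append(userId).append(" using ").append(data);
        return report.toString();
    }
}

final class SharedStateDemo {
    /** Fires `requests` concurrent calls and returns how often the file was actually loaded. */
    static int simulate(int threads, int requests) {
        AtomicInteger loads = new AtomicInteger();
        ReportService service = new ReportService(name -> {
            loads.incrementAndGet();
            return "rows-of-" + name;
        });
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            final String userId = "user-" + i;
            pool.execute(() -> service.buildReport(userId, "prices.csv"));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 100)); // 1
    }
}
```

No method is marked synchronized here. Whether a cache like this is appropriate at all depends on how often the CSV files change, which the sketch ignores.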
Synchronization or other locking schemes are local to one instance of the application. If you want to stand up multiple instances of this application then you need to be aware each one would be locking separately from the others. So for some things it's better to do locking in the database, since that is shared across webapp instances.
If you can make use of a database to store your data, so that you can rely on the database for caching and indexing, then it seems likely your application should be able to avoid doing a lot of locking.
If you want examples there are a lot of small examples for building web apps using spring at https://spring.io/guides. These are spring boot applications that are self hosted so you can put them together quickly and run them right away.
Going rogue with a database may not be the best course, since databases need looking after by DBAs. My advice is to put together two project plans, one for using a database and one for using the flat files. The flat-file plan will have to allow for addressing issues like handling caching, indexing data, replication of data from the legacy database, and not having standard tools that generate PDFs from SQL queries. The alternative plan using a database should involve a lot less sorting out of infrastructure and a shorter time until you can get down to cranking out reports.

In GAE, is there a way to force an instance to serve only 1 session?

I am building a rich app on GAE using Canoo's RIA Suite. This package splits Java Swing components into server-side and client-side parts. On the server, it looks like a 'desktop' Java application. The client keeps its own map between these halves. When GAE starts a new instance, the client-side parts don't know about it -- if the next request they send is routed to the wrong instance bad things happen.
I figure I could get around this problem if I did one of two things:
Forced a GAE instance to serve exactly one HTTP session.
Directed each HTTP request to a specific GAE instance.
My question is, in the GAE environment, can either of these be done?
Neither of these two options will solve your problem, because an App Engine instance can die and be replaced at any moment.
If you can save a state of your server-side "half" in a datastore, you can load it when a request hits the "wrong" instance, but it's probably not a very efficient solution.
You may be better off using a Compute Engine instance.
I agree that neither of those two options will work for you. The implication of your current design is that you are storing state in memory on an instance, which will not work with GAE (or any autoscaling distributed system). You should put any state into some distributed data store, whether that is memcache (which is volatile), the datastore or Cloud SQL.
GAE/J has built in support for java sessions, the session state is persisted in the datastore across requests so that it is valid on any instance. For this to work, everything stored in your session will need to be serializable.
You can enable this by following these instructions.
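To check the "everything must be serializable" requirement before deploying, you can push a candidate session object through standard Java serialization yourself. The sketch below does that with java.io only; CartState and SessionRoundTrip are hypothetical names, and the round trip merely approximates what a session store has to do, it is not the GAE implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Example of state that could live in a session: it and all its fields are serializable. */
final class CartState implements Serializable {
    private static final long serialVersionUID = 1L;

    final String customer;
    final int itemCount;

    CartState(String customer, int itemCount) {
        this.customer = customer;
        this.itemCount = itemCount;
    }
}

final class SessionRoundTrip {
    /** Serializes the value to bytes and reads it back, as a session store would have to. */
    static <T extends Serializable> T copy(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T restored = (T) in.readObject();
                return restored;
            }
        } catch (IOException | ClassNotFoundException e) {
            // A NotSerializableException ends up here if any reachable field is not serializable.
            throw new IllegalStateException("value cannot be stored in a session", e);
        }
    }

    public static void main(String[] args) {
        CartState restored = copy(new CartState("alice", 3));
        System.out.println(restored.customer + " " + restored.itemCount); // alice 3
    }
}
```

A failure in copy() during a unit test is much cheaper than discovering a non-serializable field when a request lands on a different instance.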
Otherwise you can manage persisting server state yourself into the datastore accelerated by memcache, and linking it to a 'session' with a cookie. If you go down this road make sure you understand the implications of eventual consistency in the GAE datastore.

Combining java spring/thread and database access for time-critical web applications

I'm developing an MVC spring web app, and I would like to store the actions of my users (what they click on, etc.) in a database for offline analysis. Let's say an action is a tuple (long userId, long actionId, Date timestamp). I'm not specifically interested in the actions of my users, but I take this as an example.
I expect a lot of actions by a lot of (different) users per minute (or even per second). Hence the processing time is crucial.
In my current implementation, I've defined a datasource with a connection pool to store the actions in a database. I call a service from the request method of a controller, and this service calls a DAO which saves the action into the database.
This implementation is not efficient because it waits until the call from the controller all the way down to the database is done before returning the response to the user. Therefore I was thinking of wrapping this "action saving" into a thread, so that the response to the user is faster. The thread does not need to be finished to get the response.
I've no experience in these massive, concurrent and time-critical applications. So any feedback/comments would be very helpful.
Now my questions are:
How would you design such system?
would you implement a service and then wrap it into a thread called at every action?
What should I use?
I checked spring Batch, and this JobLauncher, but I'm not sure if it is the right thing for me.
What happens when there are concurrent accesses at the controller, the service, the DAO and the datasource level?
In more general terms, what are the best practices for designing such applications?
Thank you for your help!
Keep a singleton object at the application level and update it with every user action.
This singleton should hold a HashMap that is flushed periodically, say once it reaches a threshold of 10,000 entries, and saved to the DB as a Spring Batch job.
Also, each time a batch is processed, clear the map up to the last record that was saved. You could also re-initialize the singleton instance weekly or monthly. Bear in mind that this can cause update problems if your app is deployed across multiple JVMs, because each JVM gets its own singleton. You should also have the singleton throw CloneNotSupportedException from clone() so that it cannot be copied.
Here's what I did for that :
Used aspectJ to mark all the actions of the user I wanted to collect.
Then I sent this to log4j with an asynchronous dbAppender...
This lets you turn it on or off with log4j logging level.
Works perfectly.
If you are interested in the actions your users take, you should be able to figure that out from the HTTP requests they send, so you might be better off logging the incoming requests in an Apache webserver that forwards to your application server. Putting a cluster of web servers in front of application servers is a typical practice (they're good for serving static content) and they are usually logging requests anyway. That way the logging will be fast, your application will not have to deal with it, and the biggest work will be writing a script to slurp the logs into a database where you can do analysis.
Typically it is considered bad form to spawn your own threads in a Java EE application.
A better approach would be to write to a local queue via JMS and then have a separate component, e.g., a message driven bean (pretty easy with EJB or Spring) which persists it to the database.
Another approach would be to just write to a log file and then have a process read the log file and write to the database once a day or whenever.
The things to consider are:
How up-to-date do you need the information to be?
How critical is the information, can you lose some?
How reliable does the order need to be?
All of these will factor into how many threads you have processing your queue/log file, whether you need a persistent JMS queue and whether you should have the processing occur on a remote system to your main container.
Hope this answers your questions.
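To show the shape of the queue approach without pulling in JMS, here is a sketch that uses an in-memory BlockingQueue as a stand-in for the queue. This is an assumption made for brevity: unlike a persistent JMS queue, it loses its contents on a restart. UserAction, ActionRecorder and the sink are hypothetical names. The sketch deliberately starts no threads of its own; something outside it (a message-driven consumer, a container-managed scheduled task) is assumed to call drainOnce() regularly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** The tuple from the question: who did what, and when. */
final class UserAction {
    final long userId;
    final long actionId;
    final long timestampMillis;

    UserAction(long userId, long actionId, long timestampMillis) {
        this.userId = userId;
        this.actionId = actionId;
        this.timestampMillis = timestampMillis;
    }
}

final class ActionRecorder {
    private final BlockingQueue<UserAction> queue;
    private final int maxBatch;
    // Stand-in for the DAO call that writes a whole batch to the database (assumption).
    private final Consumer<List<UserAction>> sink;

    ActionRecorder(int capacity, int maxBatch, Consumer<List<UserAction>> sink) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.maxBatch = maxBatch;
        this.sink = sink;
    }

    /** Called on the request thread. Never blocks; returns false if the queue is full. */
    boolean record(UserAction action) {
        return queue.offer(action);
    }

    /** Called by the background consumer: writes up to maxBatch queued actions in one go. */
    int drainOnce() {
        List<UserAction> batch = new ArrayList<>(maxBatch);
        queue.drainTo(batch, maxBatch);
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
        return batch.size();
    }
}

class ActionRecorderDemo {
    public static void main(String[] args) {
        ActionRecorder recorder = new ActionRecorder(2, 10,
            batch -> System.out.println("writing " + batch.size() + " actions"));
        System.out.println(recorder.record(new UserAction(1L, 100L, 0L))); // true
        System.out.println(recorder.record(new UserAction(2L, 101L, 1L))); // true
        System.out.println(recorder.record(new UserAction(3L, 102L, 2L))); // false: queue full
        recorder.drainOnce(); // writing 2 actions
    }
}
```

The return value of record() is where the "can you lose some?" question shows up: with a bounded queue, a full queue means either dropping the action or making the request wait.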

How stateful should a web application be at most?

I heard a web application should be as stateless as possible. But it seems to me very hard to realize this often. For instance, what if I:
Process a request
Redirect the user to the start page
Want to display the result of the request?
If the result is a little bit more complex than just a string which could be passed as a parameter (or I don't want to include that information in the URL), then I cannot combine 2. and 3.
The only solution I can think of here is keeping the information as states in the Java program.
But that would break the rule of a stateless web application, wouldn't it?
I heard a web application should be as stateless as possible
What? There is state everywhere in a web app, both in the client and on the server. Frameworks like Sproutcore/Ember even have components called State Managers to manage, um, the state.
The server maintains some state in a user's session (typically).
Did you hear that HTTP is stateless? That's another story, and completely true. Also, it can be a good idea to write server side components that don't share state, due to threading concerns. But neither of those points should be taken to imply that your application doesn't have state.

JSF State Saving initially to server & on session timeout transfer to client?

Is there any state saving method that would allow a JSF application to initially save state data on the server, but after the session expiry interval transfer that state to the client, so that the app is always responsive even after the session timeout on the server, and memory is better managed on the server?
Or is there any way this could be implemented? I would expect this to be part of the JSF specification!
Edit
After suggestion by BalusC, I'm highly impressed with the Stateless JSF principles & the current implementation for it. If anyone else here is also interested in stateless JSF being added to the JSF spec, consider having a look at or voting this issue.
Stateless JSF offers huge performance boosts at the cost of some trade-offs, such as the inability to create views dynamically (e.g. by binding, JSTL tags, etc.) or to modify them.
A Stateless JSF operation mode would be incredibly useful for high-load applications and architectures:
https://web.archive.org/web/20140626062226/http://industrieit.com/blog/2011/11/stateless-jsf-high-performance-zero-per-request-memory-overhead/#comment-4
This has previously been suggested by Jacob:
http://weblogs.java.net/blog/jhook/archive/2006/01/experiment_goin.html
This would help JSF ditch the stigma of "slow and memory hog," and help keep up with current tech trends (stateless architectures.)
How is that technically possible? The server can never reliably predict beforehand if the next request would create a new session and thus the response of the current request has to use client side state saving instead of server side state saving. If you ever succeed to implement it using plain JSP/Servlet, feel free to post a JSF specification enhancement request.
Just use client side state saving and make sure that partial state saving is enabled. The overhead is relatively minor as compared to full state saving.
Note that it's possible to use JSF entirely stateless. See also this blog. The only major trade-off is that you can't create views dynamically (e.g. by binding, JSTL tags, etc.), nor manipulate them after creation (e.g. by adding/removing a component's children).
See also:
Why JSF saves the state of UI components on server?
