In my web application, our servlet does some socket work, and we log the socket data to a database.
I want to make that logging process asynchronous to improve performance.
My idea is to use a separate, dedicated thread for the logging. In my servlet, I just submit data to a cache and let the logging thread process the entries one by one.
I have little experience with threading. What collection can I use as the cache? What's the basic code pattern to implement this? Please provide some code to show how to achieve it.
Sorry for my poor English.
My application is a legacy system running in a production environment. It uses only servlets and JSP, no other Java EE technology. Adding JMS support seems too expensive for me.
A queue and a thread pool should work well. Publish your messages on a queue, and let the worker threads pick messages from the queue and save them in the database. Depending on your requirements and load, you can tune the queue and thread pool sizes.
If you want to write the log to a single file, you could use a Semaphore (or better, a mutex) in the logger class to prevent simultaneous writes, i.e. a race condition. Semaphores are synchronization primitives that let you ensure that only a certain number of threads can access a given data structure at any one time. I won't explain the whole concept, but Java provides this in the java.util.concurrent.Semaphore class. A mutex (mutual exclusion lock) is a semaphore that allows only one thread to hold it at any given time. Give it a try!
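To make the semaphore-as-mutex idea concrete, here is a minimal sketch. The class name GuardedLog is made up for illustration, and the StringBuilder is only a stand-in for the real log file; the one real API used is java.util.concurrent.Semaphore with a single permit.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical logger whose writes are serialized by a one-permit semaphore. */
class GuardedLog {
    // One permit means at most one thread is inside write() at a time.
    private final Semaphore mutex = new Semaphore(1);
    // Stand-in for the real log file (assumption for this sketch).
    private final StringBuilder out = new StringBuilder();

    void write(String line) {
        mutex.acquireUninterruptibly();
        try {
            out.append(line).append('\n');
        } finally {
            mutex.release(); // always give the permit back
        }
    }

    String contents() {
        mutex.acquireUninterruptibly();
        try {
            return out.toString();
        } finally {
            mutex.release();
        }
    }
}
```

Note that this only makes the writes safe. The calling thread still waits for the write, so on its own it does not make the logging asynchronous.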
If you want a dedicated thread to handle the logging, implement the Producer/Consumer pattern and use a queue to hold the log entries. The pattern is mostly used to help with thread synchronization and communication. Here is an example of a Producer/Consumer implementation that might help: http://www.tutorialspoint.com/javaexamples/thread_procon.htm
The other option is to wrap each logging operation in a task and submit it to a thread pool. The benefit is that the thread pool handles scheduling the threads and deciding when they execute. The downside is that you are not guaranteed FIFO logging, since the thread scheduler can arbitrarily choose which thread in the pool runs next.
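For the dedicated-thread (producer/consumer) variant, a rough sketch could look like the following. The class name AsyncSocketLogger, the queue capacity of 10,000 and the Consumer standing in for the database insert are all assumptions for illustration, not something from the question.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical async logger: servlet threads enqueue, one worker thread writes. */
class AsyncSocketLogger {
    // Bounded, so a slow database cannot exhaust memory (capacity is an assumption).
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);
    private final Consumer<String> sink; // stand-in for the real database insert
    private final Thread worker;
    private volatile boolean running = true;

    AsyncSocketLogger(Consumer<String> sink) {
        this.sink = sink;
        this.worker = new Thread(this::drain, "socket-logger");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Called from the servlet; never blocks. Returns false if the queue is full. */
    boolean log(String entry) {
        return queue.offer(entry);
    }

    /** Consumer loop: takes entries in FIFO order until shut down and drained. */
    private void drain() {
        try {
            while (running || !queue.isEmpty()) {
                String entry = queue.poll(100, TimeUnit.MILLISECONDS);
                if (entry == null) {
                    continue; // nothing arrived within the timeout, re-check the flag
                }
                try {
                    sink.accept(entry);
                } catch (RuntimeException e) {
                    // A failed insert must not kill the logging thread.
                    System.err.println("log write failed: " + e);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops the worker after it has written everything already queued. */
    void shutdown() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a servlet container you would probably create one instance at startup (for example in a ServletContextListener) and call shutdown() when the application stops. Because there is a single consumer, entries are written in the order they were queued.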
Unless your team lead insists on reinventing the wheel, use slf4j with logback's DBAppender behind an AsyncAppender. It's ready out of the box, and it works like a charm.
You should really read about logback's appenders.
A full example can be found here.
To give some context here, I have been following Project Loom for some time now. I have read The state of Loom. I have done asynchronous programming.
Asynchronous programming (provided by Java NIO) returns the thread to the thread pool when the task waits, and it goes to great lengths not to block threads. This gives a large performance gain: we can now handle many more requests, as they are not directly bound by the number of OS threads. But what we lose here is the context. The same task is now NOT associated with just one thread. All the context is lost once we dissociate tasks from threads. Exception traces do not provide very useful information, and debugging is difficult.
In comes Project Loom with virtual threads that become the single unit of concurrency. And now you can perform a single task on a single virtual thread.
It's all fine until now, but the article goes on to state, with Project Loom:
A simple, synchronous web server will be able to handle many more requests without requiring more hardware.
I don't understand how we get performance benefits with Project Loom over asynchronous APIs. Asynchronous APIs make sure not to keep any thread idle. So what does Project Loom do to make it more efficient and performant than asynchronous APIs?
EDIT
Let me rephrase the question. Let's say we have an HTTP server that takes in requests and does some CRUD operations against a backing persistent database. Say this HTTP server handles a lot of requests, 100K RPM. There are two ways of implementing this:
The HTTP server has a dedicated pool of threads. When a request comes in, a thread carries the task until it reaches the DB, where the task has to wait for the response from the DB. At this point, the thread is returned to the thread pool and goes on to do other tasks. When the DB responds, the response is again handled by some thread from the thread pool, which returns an HTTP response.
The HTTP server just spawns a virtual thread for every request. If there is I/O, the virtual thread just waits for it to complete and then returns the HTTP response. Basically, there is no pooling of virtual threads.
Given that the hardware and the throughput remain the same, would any one solution fare better than the other in terms of response times or handling more throughput?
My guess is that there would not be any difference w.r.t performance.
We don't get a benefit over asynchronous APIs. What we potentially will get is performance similar to asynchronous code, but with synchronous code.
The answer by @talex puts it crisply. Adding further to it:
Loom is more about a native concurrency abstraction, which additionally helps one write asynchronous code. Given that it's a VM-level abstraction rather than just a code-level one (like what we have been doing until now with CompletableFuture etc.), it lets one implement asynchronous behavior with less boilerplate.
With Loom, a more powerful abstraction is the savior. We have seen repeatedly how an abstraction with syntactic sugar lets one write programs more effectively, whether it was functional interfaces in JDK 8 or for-comprehensions in Scala.
With Loom, there isn't a need to chain multiple CompletableFutures (to save on resources); one can write the code synchronously. With each blocking operation encountered (ReentrantLock, I/O, JDBC calls), the virtual thread gets parked. And because these are lightweight threads, the context switch is much cheaper than for kernel threads.
When blocked, the actual carrier-thread (that was running the run-body of the virtual thread), gets engaged for executing some other virtual-thread's run. So effectively, the carrier-thread is not sitting idle but executing some other work. And comes back to continue the execution of the original virtual-thread whenever unparked. Just like how a thread-pool would work. But here, you have a single carrier-thread in a way executing the body of multiple virtual-threads, switching from one to another when blocked.
We get the same behavior (and hence performance) as manually written asynchronous code, while avoiding the boilerplate needed to do the same thing.
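As an illustration of writing blocking code on virtual threads, here is a small sketch. It assumes JDK 21 or later, where Executors.newVirtualThreadPerTaskExecutor() is a final API; the class name and the blockingLookup method (a stand-in for a JDBC or HTTP call) are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch only; requires JDK 21+ for virtual threads. */
class VirtualThreadSketch {
    /** Hypothetical blocking call, standing in for a JDBC query or HTTP request. */
    static String blockingLookup(int id) {
        try {
            // Sleeping parks the virtual thread; its carrier thread is free to run others.
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row-" + id;
    }

    /** Starts one virtual thread per request and collects results in request order. */
    static List<String> handleAll(int requests) {
        List<String> results = new ArrayList<>();
        // ExecutorService is AutoCloseable since JDK 19; close() waits for submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                futures.add(executor.submit(() -> blockingLookup(id)));
            }
            for (Future<String> f : futures) {
                results.add(f.get()); // plain blocking get, no callback chain
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

Calling handleAll(1000) starts a thousand virtual threads that all block in sleep at once, which would be expensive with a thousand platform threads.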
Consider the case of a web-framework, where there is a separate thread-pool to handle i/o and the other for execution of http requests. For simple HTTP requests, one might serve the request from the http-pool thread itself. But if there are any blocking (or) high CPU operations, we let this activity happen on a separate thread asynchronously.
This thread would collect the information from an incoming request, spawn a CompletableFuture, and chain it with a pipeline (a read from the database as one stage, followed by a computation on the result, followed by another stage to write back to the database, make web service calls, etc.). Each one is a stage, and the resulting CompletableFuture is returned to the web framework.
When the resultant future is complete, the web-framework uses the results to be relayed back to the client. This is how Play-Framework and others, have been dealing with it. Providing an isolation between the http thread handling pool, and the execution of each request. But if we dive deeper in this, why is it that we do this?
One core reason is to use resources effectively, particularly around blocking calls. Hence we chain with thenApply etc. so that no thread is blocked on any activity, and we do more with fewer threads.
This works great, but it is quite verbose. Debugging is indeed painful, and if one of the intermediate stages fails with an exception, the control flow goes haywire, requiring further code to handle it.
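For comparison, the chained style described above might look roughly like this. The three stage methods are hypothetical stand-ins for the database read, the computation and the write-back; only the CompletableFuture calls (supplyAsync, thenApply, exceptionally) are real API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Sketch of a request handled as a chain of CompletableFuture stages. */
class FuturePipelineSketch {
    // Hypothetical stages; real ones would talk to a database or web service.
    static String readFromDatabase(int userId) { return "user-" + userId; }
    static String compute(String row) { return row.toUpperCase(); }
    static String writeBack(String value) { return "stored:" + value; }

    /** Returns immediately; the stages run on the given pool. */
    static CompletableFuture<String> handle(int userId, ExecutorService pool) {
        return CompletableFuture
                .supplyAsync(() -> readFromDatabase(userId), pool) // stage 1: read
                .thenApply(FuturePipelineSketch::compute)          // stage 2: compute
                .thenApply(FuturePipelineSketch::writeBack)        // stage 3: write back
                // Extra code needed just to keep a failure in any stage from being lost.
                .exceptionally(t -> "error:" + t.getMessage());
    }
}
```

The web framework would receive the returned future and send the response when it completes. A stack trace from a failure inside one of the stages shows the pool thread's frames, not the request that started the chain, which is the debugging pain mentioned above.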
With Loom, we write synchronous code, and let someone else decide what to do when blocked. Rather than sleep and do nothing.
The http server has a dedicated pool of threads ....
How big a pool? (Number of CPUs)*N + C? With N>1, one can fall back to anti-scaling, as lock contention extends latency, whereas N=1 can under-utilize the available bandwidth. There is a good analysis here.
The http server just spawns...
That would be a very naive implementation of this concept. A more realistic one would strive for collecting from a dynamic pool which kept one real thread for every blocked system call + one for every real CPU. At least that is what the folks behind Go came up with.
The crux is to keep the {handlers, callbacks, completions, virtual threads, goroutines: all PEAs in a pod} from fighting over internal resources, so that they do not lean on system-based blocking mechanisms until absolutely necessary. This falls under the banner of lock avoidance, and might be accomplished with various queuing strategies (see libdispatch), etc. Note that this leaves the PEA divorced from the underlying system thread, because they are internally multiplexed between them. This is your concern about divorcing the concepts. In practice, you pass around your favourite language's abstraction of a context pointer.
As 1 indicates, there are tangible results that can be directly linked to this approach, and a few intangibles. Locking is easy: you just put one big lock around your transactions and you are good to go. That doesn't scale, but fine-grained locking is hard. It is hard to get working, and hard to choose the fineness of the grain. When to use { locks, CVs, semaphores, barriers, ... } is obvious in textbook examples, and a little less so in deeply nested logic. Lock avoidance makes that, for the most part, go away, and limits it to contended leaf components like malloc().
I maintain some skepticism, as the research typically shows a poorly scaled system that is transformed into a lock avoidance model and then shown to be better. I have yet to see a study that unleashes some experienced developers to analyze the synchronization behavior of the system, transform it for scalability, and then measure the result. But even if that were a win, experienced developers are a rare(ish) and expensive commodity; the heart of scalability is really financial.
I'm new to Google Cloud Platform. I'm using the App Engine standard environment. I need to create threads in Java, but I think it's not possible. Is it?
Here is the situation:
I need to create Feeds for users.
There are three databases with names d1, d2, d3.
Whenever a user sends a request for feeds Java creates three threads, one for each database. For example t1 for d1, t2 for d2 and t3 for d3. These threads must run asynchronously for better performance and after that the data from these 3 threads is combined and sent in the response back to user.
I know how to write code for this, but as you know I need threads for this work. If AppEngine standard Env. doesn't allow it then what can I do? Is there any other way?
In GCP Documentation they said:
To avoid using threads, consider Task Queues
I read about Task Queues. There are two types of queues: Push and Pull. Both run asynchronously but they do not send a response back to the user. I think they are only designed to complete tasks in the background.
Can you please let me know how I can achieve my goal? What do I need to learn for this?
Note: the answer is based solely on documentation, I'm not a java user.
Threads are supported by the standard environment, but with restrictions. From Threads:
Caution: Threads are a powerful feature that are full of surprises. To learn more about using threads with Java, we recommend
Goetz, Java Concurrency in Practice.
A Java application can create a new thread, but there are some
restrictions on how to do it. These threads can't "outlive" the
request that creates them.
An application can
Implement java.lang.Runnable.
Create a thread factory by calling com.google.appengine.api.ThreadManager.currentRequestThreadFactory().
Call the factory's newRequestThread method, passing in the Runnable, newRequestThread(runnable), or use the factory object
returned by
com.google.appengine.api.ThreadManager.currentRequestThreadFactory()
with an ExecutorService (e.g., call
Executors.newCachedThreadPool(factory)).
However, you must use one of the methods on ThreadManager to create
your threads. You cannot invoke new Thread() yourself or use the
default thread factory.
An application can perform operations against the current thread, such
as thread.interrupt().
Each request is limited to 50 concurrent request threads. The Java
runtime will throw a java.lang.IllegalStateException if you try to
create more than 50 threads in a single request.
When using threads, use high level concurrency objects, such as
Executor and Runnable. Those take care of many of the subtle but
important details of concurrency like Interrupts and scheduling
and bookkeeping.
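Going by the quoted documentation, the three database queries could be run in parallel with an ExecutorService built from a thread factory. The following is only a sketch: FeedAggregator and queryFeed are made-up names, and the factory is passed in as a parameter so the code also runs outside App Engine. On App Engine you would presumably pass ThreadManager.currentRequestThreadFactory() as described in the quote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

/** Sketch: query several databases in parallel and combine the results. */
class FeedAggregator {
    /** Hypothetical query; a real version would read feed items from the named database. */
    static List<String> queryFeed(String database) {
        return Arrays.asList(database + "-item1", database + "-item2");
    }

    /** Runs one task per database and merges the results in the order of the input list. */
    static List<String> buildFeed(ThreadFactory factory, List<String> databases) {
        ExecutorService executor = Executors.newCachedThreadPool(factory);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (String db : databases) {
                tasks.add(() -> queryFeed(db));
            }
            List<String> combined = new ArrayList<>();
            // invokeAll blocks until every task has finished, so the request
            // thread outlives the worker threads it created.
            for (Future<List<String>> f : executor.invokeAll(tasks)) {
                combined.addAll(f.get());
            }
            return combined;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while building feed", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("feed query failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

Since the request waits for all three tasks before responding, the threads do not outlive the request, which matches the restriction in the documentation.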
An elegant way to implement what you need would be to create a parameterized endpoint in your application
/runFeed?db=d1
And from your "main" application code you can perform a fetchAsync call from URLFetchService that will return you a java.util.concurrent.Future<HTTPResponse>
This will allow you a better monitoring of what your application does.
This will add network latency to your application and increase its cost, since URLFetchService is not free.
I am going through the different concurrency models for multi-threaded environments (http://tutorials.jenkov.com/java-concurrency/concurrency-models.html)
The article highlights three concurrency models.
Parallel Workers
The first concurrency model is what I call the parallel worker model. Incoming jobs are assigned to different workers.
Assembly Line
The workers are organized like workers at an assembly line in a factory. Each worker only performs a part of the full job. When that part is finished the worker forwards the job to the next worker.
Each worker is running in its own thread, and shares no state with other workers. This is also sometimes referred to as a shared nothing concurrency model.
Functional Parallelism
The basic idea of functional parallelism is that you implement your program using function calls. Functions can be seen as "agents" or "actors" that send messages to each other, just like in the assembly line concurrency model (AKA reactive or event driven systems). When one function calls another, that is similar to sending a message.
Now I want to map Java API support to these three concepts:
Parallel Workers: Is it the ExecutorService, ThreadPoolExecutor, and CountDownLatch APIs?
Assembly Line: Sending an event to a messaging system like JMS and using the messaging concepts of queues and topics.
Functional Parallelism: ForkJoinPool to some extent, and Java 8 streams. ForkJoinPool is easier to understand than streams.
Am I correct in mapping these concurrency models? If not please correct me.
Each of those models describes, from a general point of view, how the work is split and done, but when it comes to implementation, it really depends on your exact problem. Generally I see it like this:
Parallel Workers: a producer creates new jobs somewhere (e.g in a BlockingQueue) and many threads (via an ExecutorService) process those jobs in parallel. Of course, you could also use a CountDownLatch, but that means you want to trigger an action after exactly N subproblems have been processed (e.g you know your big problem may be split in N smaller problems, check the second example here).
Assembly Line: for every intermediate step, you have a BlockingQueue and one Thread or an ExecutorService. At each step, jobs are taken from one BlockingQueue and put into the next one to be processed further. Regarding your idea with JMS: JMS is there to connect distributed components, is part of Java EE, and was not designed for highly concurrent contexts (messages are usually kept on the hard disk before being processed).
Functional Parallelism: ForkJoinPool is a good example on how you could implement this.
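To illustrate the assembly line variant, here is a small two-stage sketch with one BlockingQueue in front of each worker thread. The stage logic (trim, then upper-case) and the end-of-stream marker are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Sketch of a two-stage assembly line connected by blocking queues. */
class AssemblyLineSketch {
    private static final String END = "__END__"; // hypothetical end-of-stream marker

    static List<String> run(List<String> jobs) {
        BlockingQueue<String> stage1 = new LinkedBlockingQueue<>();
        BlockingQueue<String> stage2 = new LinkedBlockingQueue<>();
        List<String> done = new ArrayList<>(); // only touched by the last worker

        // Worker 1: trims each job and forwards it to the next queue.
        Thread trimmer = new Thread(() -> {
            try {
                for (String job = stage1.take(); !job.equals(END); job = stage1.take()) {
                    stage2.put(job.trim());
                }
                stage2.put(END); // pass the marker down the line
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Worker 2: upper-cases each job and stores the finished result.
        Thread shouter = new Thread(() -> {
            try {
                for (String job = stage2.take(); !job.equals(END); job = stage2.take()) {
                    done.add(job.toUpperCase(Locale.ROOT));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        trimmer.start();
        shouter.start();
        try {
            for (String job : jobs) {
                stage1.put(job);
            }
            stage1.put(END);
            trimmer.join();
            shouter.join(); // after join, reading 'done' from this thread is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done;
    }
}
```

Each worker shares no state with the other except through the queues, which is the "shared nothing" property the article describes.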
An excellent question, to which the answer might not be quite as satisfying. The concurrency models listed show some of the ways you might go about implementing a concurrent system. The API provides tools that can be used to implement any of these models.
Let's start with ExecutorService. It allows you to submit tasks to be executed in a non-blocking way. The ThreadPoolExecutor implementation then limits the maximum number of threads available. The ExecutorService does not require a task to perform the complete process, as you might expect of a parallel worker. The task may be limited to a specific part of the process and send a message upon completion that starts the next step in an assembly line.
CountDownLatch and ExecutorService provide a means to block until all workers have completed, which may come in handy if a certain process has been divided into concurrent sub-tasks.
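A small sketch of that blocking-until-done use: a sum is split into sub-tasks on an ExecutorService, and a CountDownLatch makes the caller wait for all of them. The class and method names are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: wait for N concurrent sub-tasks with a CountDownLatch. */
class LatchSketch {
    /** Sums 1..n by splitting the range into 'parts' sub-tasks and waiting for all of them. */
    static long parallelSum(int n, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        CountDownLatch latch = new CountDownLatch(parts);
        AtomicLong total = new AtomicLong();
        int chunk = (n + parts - 1) / parts; // ceiling division
        for (int p = 0; p < parts; p++) {
            final int from = p * chunk + 1;
            final int to = Math.min(n, (p + 1) * chunk);
            pool.execute(() -> {
                try {
                    long sum = 0;
                    for (int i = from; i <= to; i++) {
                        sum += i;
                    }
                    total.addAndGet(sum);
                } finally {
                    latch.countDown(); // count down even if the sub-task fails
                }
            });
        }
        try {
            latch.await(); // blocks until all 'parts' sub-tasks have counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return total.get();
    }
}
```

ExecutorService.invokeAll would give a similar "wait for everything" effect without an explicit latch.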
The point of JMS is to provide a means for messaging between components. It does not enforce a specific model for concurrency. Queues and topics denote how a message is sent from a publisher to a subscriber. When you use queues the message is sent to exactly one subscriber. Topics on the other hand broadcast the message to all subscribers of the topic.
Similar behavior could be achieved within a single component by for example using the observer pattern.
ForkJoinPool is actually one implementation of ExecutorService (which might highlight the difficulty of matching a model to an implementation detail). It just happens to be optimized for working with large numbers of small tasks.
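As an example of the many-small-tasks style ForkJoinPool is built for, here is a sketch of a recursive array sum. SumTask and the threshold of 1,000 elements are arbitrary choices for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Sketch: divide-and-conquer sum of an array on the common ForkJoinPool. */
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // below this size, sum sequentially
    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                          // run the left half asynchronously
        return right.compute() + left.join(); // do the right half in this thread
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }
}
```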
Summary: There are multiple ways to implement a certain concurrency model in the Java environment. The interfaces, classes and frameworks used in implementing a program may vary regardless of the concurrency model chosen.
The actor model is another example of an assembly line, e.g. Akka.
I want to generate a text string that is going to be sent via a TCP socket. I got that working within a few minutes.
However, I want a producer-consumer pattern. I don't care whether sending fails or not.
Should I create a BlockingQueue at the application level for this? Should I create a service?
Note that I want a single thread to manage this job.
If it's a short task (as you commented), I'd recommend putting it in an AsyncTask so that it runs on a background thread. You can control everything about it separately, which will also help you debug it. Services are intended more for long-running tasks, so I wouldn't recommend one at this scope (it's also a bit harder to communicate with other Activities). Here you'll find the AsyncTask documentation, and here a good example.
The blocking structure depends on your needs, but I don't think you'll need one in your case. Anyway, if you do, there are lots of thread-safe data structures you can use; you might find this helpful.
Create a LinkedBlockingQueue where your producer adds data. Create a Timer that fires every second or so. The task of the Timer would be to send the messages over the wire.
For this, both the producer (the one generating the messages) and consumer (Timer) should have access to the LinkedBlockingQueue. The Timer will remove the first element of the LinkedBlockingQueue and then send it.
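A rough sketch of that queue-plus-Timer idea follows. TimedSender is a made-up name and the Consumer stands in for the actual socket write. Unlike the description above, which sends one element per tick, this version sends everything queued so far on each tick.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Sketch: producer adds to a queue, a Timer thread periodically sends what is queued. */
class TimedSender {
    private final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Consumer<String> transport; // stand-in for the TCP socket write
    private final Timer timer = new Timer("sender", true); // single daemon thread

    TimedSender(Consumer<String> transport, long periodMillis) {
        this.transport = transport;
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                flush();
            }
        }, periodMillis, periodMillis);
    }

    /** Producer side: safe to call from any thread. */
    void produce(String message) {
        queue.add(message);
    }

    /** Consumer side: sends everything queued so far, in FIFO order. */
    void flush() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        for (String message : batch) {
            try {
                transport.accept(message);
            } catch (RuntimeException ignored) {
                // The question says failures do not matter, so they are dropped here.
            }
        }
    }

    void stop() {
        timer.cancel();
    }
}
```

All sending happens on the Timer's one thread, which fits the "single thread to manage this job" requirement.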
Sounds good ?
I'm in the process of converting our Java code to use NIO, but I'm not sure of the best way to design it.
My initial approach was to create a pool of selector threads. The threads are started/killed as needed, and channels are registered to a selector thread when they are connected/accepted in a round-robin fashion. From there, each thread blocks on select(), and when woken up will run the appropriate callback associated with each channel that has a selected key.
In addition to this "multiple selector thread" design, I've also seen people say to use a single selector thread, and a pool of dispatch threads. When an IO operation is ready to be performed, the selector notifies a dispatcher thread, which then processes the request. This model has the benefit of not blocking the IO thread, but now we're forcing all of the IO into a single thread and dealing with synchronization/an event queue in the dispatcher.
Additionally I wouldn't be able to use a single direct byte buffer for reading each channel, passing it directly into the callback. Instead I'd have to copy the data out each time a read occurs into an array and reset. (I think..)
What's the best way to implement this?
Take a look at the Reactor Pattern
http://gee.cs.oswego.edu/dl/cpjslides/nio.pdf
How you want your selectors to work really depends on your use case (number of connections, message size, etc.).
What is the problem that you are trying to solve by converting from IO to NIO?
You really should look into Mina,
http://mina.apache.org/
It solves all the problems you mentioned.
Also have a look at Netty, which is really fast and feature-rich, and is used in big systems and by big companies like Red Hat (JBoss), Twitter, and Facebook.