Multi threaded UDP server with Netty - java

I'm trying to implement a UDP server with Netty. The idea is to bind only once (therefore creating only one Channel). This Channel is initialized with only one handler that dispatches processing of incoming datagrams among multiple threads via an ExecutorService.
public class SpringConfig {
    private Dispatcher dispatcher;
    private String host;
    private int port;

    public Bootstrap bootstrap() throws Exception {
        Bootstrap bootstrap = new Bootstrap()
            .group(new NioEventLoopGroup(1))
            .channel(NioDatagramChannel.class)
            .option(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT)
            .handler(dispatcher);
        ChannelFuture future = bootstrap.bind(host, port).await();
        if (!future.isSuccess()) {
            throw new Exception(String.format("Fail to bind on [host = %s , port = %d].", host, port), future.cause());
        }
        return bootstrap;
    }
}
public class Dispatcher extends ChannelInboundHandlerAdapter implements InitializingBean {
    private int workerThreads;
    private ExecutorService executorService;

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        final DatagramPacket packet = (DatagramPacket) msg;
        final Channel channel = ctx.channel();
        executorService.execute(new Runnable() {
            @Override
            public void run() {
                // Process the packet and produce a response packet (below)
                DatagramPacket responsePacket = ...;
                ChannelFuture future;
                try {
                    future = channel.writeAndFlush(responsePacket).await();
                } catch (InterruptedException e) {
                    log.warn("Failed to write response packet.");
                }
            }
        });
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        executorService = Executors.newFixedThreadPool(workerThreads);
    }
}
I have the following questions:
1. Should the DatagramPacket received by the channelRead method of the Dispatcher class be duplicated before being used by the worker thread? I wonder if this packet is destroyed after the channelRead method returns, even if a reference is kept by the worker thread.
2. Is it safe to share the Channel among all the worker threads and let them call writeAndFlush concurrently?

1. Nope. If you need the object to live longer, either turn it into something else or call ReferenceCountUtil.retain(datagram) and then ReferenceCountUtil.release(datagram) once you're done with it. You also shouldn't call await() inside the executor service; register a listener on the returned ChannelFuture and handle the outcome there.
2. Yes, Channel objects are thread-safe and can be called from many different threads.


Is it valid to pass netty channels to a queue and use it for writes on a different thread later on?

I have the following setup. There is a message distributor that spreads inbound client messages across a configured number of message queues (LinkedBlockingQueues in my case), based on an unique identifier called appId (per connected client):
public class MessageDistributor {
    private final List<BlockingQueue<MessageWrapper>> messageQueueBuckets;

    public MessageDistributor(List<BlockingQueue<MessageWrapper>> messageQueueBuckets) {
        this.messageQueueBuckets = messageQueueBuckets;
    }

    public void handle(String appId, MessageWrapper message) {
        // floorMod keeps the index in [0, size) even when hash(appId) is negative
        int index = Math.floorMod(hash(appId), messageQueueBuckets.size());
        try {
            messageQueueBuckets.get(index).put(message);
        } catch (Exception e) {
            // handle exception
        }
    }
}
As I also need to answer the message later on, I wrap the message object and the netty channel inside a MessageWrapper:
public class MessageWrapper {
    private final Channel channel;
    private final Message message;

    public MessageWrapper(Channel channel, Message message) {
        this.channel = channel;
        this.message = message;
    }

    public Channel getChannel() {
        return channel;
    }

    public Message getMessage() {
        return message;
    }
}
Furthermore, there is a message consumer, which implements a Runnable and takes new messages from the assigned blocking queue. This guy performs some expensive/blocking operations that I want to have outside the main netty event loop and which should also not block operations for other connected clients too much, hence the usage of several queues:
public class MessageConsumer implements Runnable {
    private final BlockingQueue<MessageWrapper> messageQueue;

    public MessageConsumer(BlockingQueue<MessageWrapper> messageQueue) {
        this.messageQueue = messageQueue;
    }

    public void run() {
        while (true) {
            try {
                MessageWrapper msgWrap = messageQueue.take();
                Channel channel = msgWrap.getChannel();
                Message msg = msgWrap.getMessage();
                doSthExepnsiveOrBlocking(channel, msg);
            } catch (Exception e) {
                // handle exception
            }
        }
    }

    public void doSthExepnsiveOrBlocking(Channel channel, Message msg) {
        // some expensive/blocking operations
    }
}
The setup of all classes looks like the following (the messageExecutor is a DefaultEventExecutorGroup with a size of 8):
int nrOfWorkers = config.getNumberOfClientMessageQueues();
List<BlockingQueue<MessageWrapper>> messageQueueBuckets = new ArrayList<>(nrOfWorkers);
for (int i = 0; i < nrOfWorkers; i++) {
    messageQueueBuckets.add(new LinkedBlockingQueue<>());
}
MessageDistributor distributor = new MessageDistributor(messageQueueBuckets);
List<MessageConsumer> consumers = new ArrayList<>(nrOfWorkers);
for (BlockingQueue<MessageWrapper> messageQueueBucket : messageQueueBuckets) {
    MessageConsumer consumer = new MessageConsumer(messageQueueBucket);
}
My goal with this approach is to isolate connected clients from each other (not fully, but at least a bit) and also to execute expensive operations on different threads.
Now my question is: Is it valid to wrap the netty channel object inside this MessageWrapper for later use and access its write method in some other thread?
Instead of building additional message distribution mechanics on top of netty, I decided to simply go with a separate EventExecutorGroup for my blocking channel handlers and see how it works.
Yes, it is valid to call Channel.* methods from other threads. That said, these methods perform best when they are called from the EventLoop thread that the Channel belongs to.
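For the bucket selection itself, here is a minimal sketch. It assumes String.hashCode() as the hash function (the question's hash(appId) is not shown), and BucketSelector and indexFor are hypothetical names. What it illustrates is mapping an appId to a stable index in [0, bucketCount), so that all messages from one client land in the same queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical helper: maps an appId to one of N queue buckets. */
final class BucketSelector {
    private BucketSelector() {}

    /** Returns an index in [0, bucketCount) that is stable for a given appId. */
    static int indexFor(String appId, int bucketCount) {
        // Assumption: String.hashCode() is the hash. It can be negative, so
        // floorMod is used; plain % could yield a negative index.
        return Math.floorMod(appId.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        List<BlockingQueue<String>> buckets = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            buckets.add(new LinkedBlockingQueue<>());
        }
        String appId = "client-42";
        int index = indexFor(appId, buckets.size());
        // offer() on an unbounded LinkedBlockingQueue does not block.
        buckets.get(index).offer("hello from " + appId);
        System.out.println("appId " + appId + " -> bucket " + index);
    }
}
```

Math.floorMod is used rather than % because a negative hash would otherwise produce a negative index and List.get would throw.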

Improving the Performance of Netty Server in JAVA

My Netty server receives data from a client at a very high rate. I think the client uses some kind of push mechanism to reach that rate; I don't know exactly what a push/pop mechanism is, but the client is clearly sending data very quickly.
I wrote a simple TCP Netty server that receives data from the client and adds it to a BlockingQueue (an ArrayBlockingQueue). Because Netty is event-based, accepting each message and storing it in the queue takes somewhat longer than the client tolerates, and the client raises an exception saying that the Netty server is not running. My server is running fine; it is just slow to accept each message and store it in the queue.
How can I prevent this? Is there a faster queue for this situation? I am using a BlockingQueue because another thread takes data from the queue and processes it, so I need a synchronized queue. How can I improve the performance of the server, or insert data at a very high rate? All I care about is latency, which needs to be as low as possible.
My Server code:
public class Server implements Runnable {
    private final int port;
    static String message;
    Channel channel;
    ChannelFuture channelFuture;
    int rcvBuf, sndBuf, lowWaterMark, highWaterMark;

    public Server(int port) {
        this.port = port;
        rcvBuf = 2048;
        sndBuf = 2048;
        lowWaterMark = 1024;
        highWaterMark = 2048;
    }

    public void run() {
        try {
            startServer();
        } catch (Exception ex) {
            System.err.println("Error in Server : " + ex);
        }
    }

    public void startServer() {
        // System.out.println("8888 Server started");
        EventLoopGroup group = new NioEventLoopGroup();
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(group)
                .channel(NioServerSocketChannel.class)
                .localAddress(new InetSocketAddress(port))
                .childOption(ChannelOption.SO_RCVBUF, rcvBuf * 2048)
                .childOption(ChannelOption.SO_SNDBUF, sndBuf * 2048)
                .childOption(ChannelOption.WRITE_BUFFER_WATER_MARK, new WriteBufferWaterMark(lowWaterMark * 2048, highWaterMark * 2048))
                .childOption(ChannelOption.TCP_NODELAY, true)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    public void initChannel(SocketChannel ch) throws Exception {
                        channel = ch;
                        System.err.println("OMS connected : " + ch.localAddress());
                        ch.pipeline().addLast(new ReceiveFromOMSDecoder());
                    }
                });
            channelFuture = b.bind(port).sync();
            channelFuture.channel().closeFuture().sync();
        } catch (InterruptedException ex) {
            System.err.println("Exception raised in SendToOMS class" + ex);
        } finally {
            group.shutdownGracefully();
        }
    }
}
My ServerHandler code:
public class ReceiveFromOMSDecoder extends MessageToMessageDecoder<ByteBuf> {
    private Charset charset;

    public ReceiveFromOMSDecoder() {
        this(Charset.defaultCharset());
    }

    /** Creates a new instance with the specified character set. */
    public ReceiveFromOMSDecoder(Charset charset) {
        if (charset == null) {
            throw new NullPointerException("charset");
        }
        this.charset = charset;
    }

    protected void decode(ChannelHandlerContext ctx, ByteBuf msg, List<Object> out) throws Exception {
        String buffer = msg.toString(charset);
        if (buffer != null && !buffer.isEmpty()) {
            Server.sq.insertStringIntoSendingQueue(buffer); // inserting into queue
        } else {
            Logger.error("Null string received" + buffer);
        }
    }

    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
        // Logger.error(cause.getMessage());
    }
}
Three quickies:
Doesn't look like you're sending a response. You probably should.
Don't block the IO thread. Use an EventExecutorGroup to dispatch the handling of the incoming payload. i.e. something like ChannelPipeline.addLast(EventExecutorGroup group, String name, ChannelHandler handler).
Just don't block in general. Ditch your ArrayBlockingQueue and take a look at JCTools or some other implementation to find a non-blocking analog.
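To make the last point concrete without pulling in a dependency, here is a sketch that uses the JDK's ConcurrentLinkedQueue as a stand-in for a JCTools queue (JCTools itself is not used here, and NonBlockingHandoff and its methods are hypothetical names). The producer side never parks the calling thread, which is what matters when the caller is a Netty I/O thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical hand-off between an I/O thread (producer) and a worker (consumer). */
final class NonBlockingHandoff {
    // Lock-free, unbounded FIFO queue from the JDK.
    private final Queue<String> queue = new ConcurrentLinkedQueue<>();

    /** Producer side: enqueues without blocking the calling thread. */
    boolean publish(String message) {
        return queue.offer(message);
    }

    /** Consumer side: moves everything currently queued into sink and returns the count. */
    int drainTo(List<String> sink) {
        int moved = 0;
        String next;
        while ((next = queue.poll()) != null) { // poll() returns null when the queue is empty
            sink.add(next);
            moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        NonBlockingHandoff handoff = new NonBlockingHandoff();
        handoff.publish("order-1");
        handoff.publish("order-2");
        List<String> batch = new ArrayList<>();
        int count = handoff.drainTo(batch);
        System.out.println(count + " messages drained: " + batch);
    }
}
```

Since this queue is unbounded, a consumer that falls behind makes memory grow. A bounded queue whose offer() returns false when full would let the producer drop or push back instead.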

Create thousands of Netty clients without also creating thousands of threads

I have created a fairly straightforward server using Netty 4. I have been able to scale it up to handle several thousand connections, and it never climbs above ~40 threads.
To test it, I also created a test client that opens thousands of connections. Unfortunately, this creates as many threads as it makes connections, and I was hoping to keep the client's thread count low. I have looked at many posts about this, but many examples show only a single-connection setup. This and this say to share a NioEventLoopGroup across clients, which I do. I'm getting a limited number of nioEventLoopGroup threads, but a thread per connection is being created somewhere else. I am not purposely creating threads in the pipeline and don't see what could be creating them.
Here is a snippet from the setup of my client code. It seems that it should maintain a fixed thread count based on what I've researched so far. Is there something I'm missing that I should be doing to prevent a thread per client connection?
final EventLoopGroup group = new NioEventLoopGroup();
for (int i = 0; i < 100; i++) {
    MockClient client = new MockClient(i, group);
}

public class MockClient implements Runnable {
    private final EventLoopGroup group;
    private int identity;

    public MockClient(int identity, final EventLoopGroup group) {
        this.identity = identity;
        this.group = group;
    }

    public void run() {
        try {
            connect();
        } catch (Exception e) {}
    }

    public void connect() throws Exception {
        Bootstrap b = new Bootstrap();
        b.group(group)
            .channel(NioSocketChannel.class)
            .handler(new MockClientInitializer(identity, this));
        final Runnable that = this;
        // Start the connection attempt
        b.connect(config.getHost(), config.getPort()).addListener(new ChannelFutureListener() {
            public void operationComplete(ChannelFuture future) throws Exception {
                if (future.isSuccess()) {
                    Channel ch = future.sync().channel();
                } else {
                    // if the server is down, try again in a few seconds
                    future.channel().eventLoop().schedule(that, 15, TimeUnit.SECONDS);
                }
            }
        });
    }
}
As has happened to me many times before, explaining the problem in detail made me think about it more and I came across the issue. I wanted to provide it here should anyone else come across the same issue with creating thousands of Netty clients.
I have one path in my pipeline that will create a timeout task to simulate a client connection rebooting. It turns out it was this timer task that was creating the extra threads per connection whenever it received a 'reboot' signal from the server (which happens every so often) up until there was a thread per connection.
private final HashedWheelTimer timer;

protected void channelRead0(ChannelHandlerContext ctx, Packet msg) throws Exception {
    Packet packet = reboot();
    ChannelFutureListener closeHandler = new ChannelFutureListener() {
        public void operationComplete(ChannelFuture future) throws Exception {
            RebootTimeoutTask timeoutTask = new RebootTimeoutTask(identity, client);
            timer.newTimeout(timeoutTask, SECONDS_FOR_REBOOT, TimeUnit.SECONDS);
        }
    };
    ctx.writeAndFlush(packet).addListener(new ChannelFutureListener() {
        public void operationComplete(ChannelFuture future) throws Exception {
            if (future.isSuccess()) {
                ...;
            } else {
                ...;
            }
        }
    });
}
Timeout Task
public class RebootTimeoutTask implements TimerTask {
    public RebootTimeoutTask(...) {...}

    public void run(Timeout timeout) throws Exception {
        ...
    }
}
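A note for anyone hitting the same thing: each HashedWheelTimer instance starts its own worker thread, so a timer held per handler or per connection grows the thread count with the number of connections, while one shared instance does not. The sketch below shows the shared-timer idea using the JDK's ScheduledExecutorService in place of HashedWheelTimer so that it runs without Netty. SharedTimerDemo and runTimeouts are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: many clients share ONE timer thread for their timeouts. */
final class SharedTimerDemo {
    private SharedTimerDemo() {}

    /** Schedules one timeout per simulated client on a single shared timer; returns how many fired. */
    static int runTimeouts(int clients, long delayMillis) {
        // One scheduler (one thread) for all clients, created once and shared.
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch fired = new CountDownLatch(clients);
        try {
            for (int i = 0; i < clients; i++) {
                timer.schedule(fired::countDown, delayMillis, TimeUnit.MILLISECONDS);
            }
            fired.await(10, TimeUnit.SECONDS); // wait (bounded) for the timeouts to run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.shutdownNow();
        }
        return clients - (int) fired.getCount();
    }

    public static void main(String[] args) {
        int firedCount = runTimeouts(1000, 50);
        System.out.println(firedCount + " timeouts fired on one timer thread");
    }
}
```

In the Netty client, the equivalent change would be to create the timer once and pass the same instance to every handler.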

What's the best way to reconnect after connection closed in Netty

Simple scenario:
A lower-level class A that extends SimpleChannelUpstreamHandler. This class is the workhorse that sends messages and receives responses.
A top-level class B that other parts of the system can use to send and receive messages (it can simulate both synchronous and asynchronous calls). This class creates the ClientBootstrap, sets the pipeline factory, invokes bootstrap.connect(), and eventually gets a handle/reference to class A, which it uses to send and receive messages. Something like:
ChannelFuture future = bootstrap.connect();
Channel channel = future.awaitUninterruptibly().getChannel();
A handler = channel.getPipeline().get(A.class);
I know in class A, I can override
public void channelClosed(ChannelHandlerContext ctx, ChannelStateEvent e);
so that when the remote server is down, I can be notified.
After the channel is closed, the original class A reference (handler above) held by class B is no longer valid, so I need to replace it with a new reference.
Ideally, I want class A to have a mechanism to notify class B from within the overridden channelClosed method so that bootstrap.connect() can be invoked again within class B. One way to do this is to have class A hold a reference to class B. To do that, I would need to pass the class B reference to the PipelineFactory and have the PipelineFactory pass it on to A.
Any other simpler way to achieve the same thing?
Channel.closeFuture() returns a ChannelFuture that will notify you when the channel is closed. You can add a ChannelFutureListener to the future in B so that you can make another connection attempt there.
You probably want to repeat this until the connection attempt succeeds finally:
private void doConnect() {
    Bootstrap b = ...;
    b.connect().addListener((ChannelFuture f) -> {
        if (!f.isSuccess()) {
            long nextRetryDelay = nextRetryDelay(...);
            f.channel().eventLoop().schedule(() -> {
                doConnect();
            }, nextRetryDelay, ...); // or you can give up at some point by just doing nothing.
        }
    });
}
I don't know if this is the right solution, but to fix the thread leak in trustin's solution I found I could shut down the event loop after the scheduled task had triggered:
b.connect().addListener((ChannelFuture f) -> {
    if (!f.isSuccess()) {
        final EventLoop eventloop = f.channel().eventLoop();
        long nextRetryDelay = nextRetryDelay(...);
        eventloop.schedule(() -> {
            doConnect();
            eventloop.shutdownGracefully();
        }, nextRetryDelay, ...);
    }
});
Here's another version encapsulating the reconnect behavior in a small helper class
Bootstrap clientBootstrap...
EventLoopGroup group = new NioEventLoopGroup();
Session session = new Session(clientBootstrap, group);
Disposable shutdownHook = session.start();

interface Disposable {
    void dispose();
}

class Session implements Disposable {
    private final EventLoopGroup scheduler;
    private final Bootstrap clientBootstrap;
    private int reconnectDelayMs;
    private Channel activeChannel;
    private AtomicBoolean shouldReconnect;

    Session(Bootstrap clientBootstrap, EventLoopGroup scheduler) {
        this.scheduler = scheduler;
        this.clientBootstrap = clientBootstrap;
        this.reconnectDelayMs = 1;
        this.shouldReconnect = new AtomicBoolean(true);
    }

    public Disposable start() {
        // Create a new connectFuture
        ChannelFuture connectFuture = clientBootstrap.connect();
        connectFuture.addListener((ChannelFuture cf) -> {
            if (cf.isSuccess()) {
                L.info("Connection established");
                reconnectDelayMs = 1;
                activeChannel = cf.channel();
                // Listen to the channel closing
                var closeFuture = activeChannel.closeFuture();
                closeFuture.addListener((ChannelFuture closeFut) -> {
                    if (shouldReconnect.get()) {
                        activeChannel.eventLoop().schedule(this::start, nextReconnectDelay(), TimeUnit.MILLISECONDS);
                    } else {
                        L.info("Session has been disposed won't reconnect");
                    }
                });
            } else {
                int delay = nextReconnectDelay();
                L.info("Connection failed will re-attempt in {} ms", delay);
                scheduler.schedule(this::start, delay, TimeUnit.MILLISECONDS);
            }
        });
        return this;
    }

    /** Call this to end the session */
    public void dispose() {
        try {
            shouldReconnect.set(false);
            scheduler.shutdownGracefully().sync();
            if (activeChannel != null) {
                activeChannel.closeFuture().sync();
            }
        } catch (InterruptedException e) {
            L.warn("Interrupted while shutting down TcpClient");
        }
    }

    private int nextReconnectDelay() {
        this.reconnectDelayMs = this.reconnectDelayMs * 2;
        return Math.min(this.reconnectDelayMs, 5000);
    }
}
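The delay calculation can be isolated from the Netty parts. In this sketch (ReconnectBackoff is a hypothetical name, and the 1 ms and 5000 ms values in the usage mirror the helper above) the stored delay itself is capped, so a long run of failures cannot keep doubling it until the int or long overflows.

```java
/** Hypothetical capped exponential backoff for reconnect attempts. */
final class ReconnectBackoff {
    private final long initialDelayMs;
    private final long maxDelayMs;
    private long current;

    ReconnectBackoff(long initialDelayMs, long maxDelayMs) {
        if (initialDelayMs <= 0 || maxDelayMs < initialDelayMs) {
            throw new IllegalArgumentException("need 0 < initialDelayMs <= maxDelayMs");
        }
        this.initialDelayMs = initialDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.current = initialDelayMs;
    }

    /** Returns the delay for this attempt, then doubles the stored delay up to the cap. */
    synchronized long nextDelayMs() {
        long delay = current;
        // Compare against max/2 first so the doubling itself can never overflow.
        current = (delay > maxDelayMs / 2) ? maxDelayMs : delay * 2;
        return delay;
    }

    /** Call after a successful connect so the next failure starts from the initial delay. */
    synchronized void reset() {
        current = initialDelayMs;
    }

    public static void main(String[] args) {
        ReconnectBackoff backoff = new ReconnectBackoff(1, 5000);
        for (int attempt = 1; attempt <= 16; attempt++) {
            System.out.println("attempt " + attempt + ": wait " + backoff.nextDelayMs() + " ms");
        }
    }
}
```

Unlike the helper above, this version returns the current delay before doubling it, so the first retry uses the initial delay.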

Pre-initializing a pool of worker threads to reuse connection objects (sockets)

I need to build a pool of workers in Java where each worker has its own connected socket; when the worker thread runs, it uses the socket but keeps it open for later reuse. We decided on this approach because creating, connecting, and destroying sockets on an ad-hoc basis costs too much overhead. We need a way to pre-initialize a pool of workers with their socket connections, ready to take on work, while keeping the socket resources safe from other threads (sockets are not thread-safe). So we need something along these lines...
public class SocketTask implements Runnable {
    Socket socket;

    public SocketTask() {
        // create + connect socket here
    }

    public void run() {
        // use socket here
    }
}
On application startup, we want to initialize the workers and, hopefully, the socket connections somehow too...
MyWorkerPool pool = new MyWorkerPool();
for( int i = 0; i < 100; i++)
pool.addWorker( new WorkerThread());
As work is requested by the application, we send tasks to the worker pool for immediate execution...
pool.queueWork( new SocketTask(..));
Updated with Working Code
Based on helpful comments from Gray and jontejj, I've got the following code working...
public class SocketTask implements Runnable {
    private String workDetails;
    private static final ThreadLocal<Socket> threadLocal =
        new ThreadLocal<Socket>() {
            protected Socket initialValue() {
                return new Socket();
            }
        };

    public SocketTask(String details) {
        this.workDetails = details;
    }

    public void run() {
        Socket s = getSocket(); // gets from threadlocal
        // send data on socket based on workDetails, etc.
    }

    public static Socket getSocket() {
        return threadLocal.get();
    }
}

ExecutorService threadPool =
    Executors.newFixedThreadPool(5, Executors.defaultThreadFactory());
int tasks = 15;
for (int i = 1; i <= tasks; i++) {
    threadPool.execute(new SocketTask("foobar-" + i));
}
I like this approach for several reasons...
Sockets are local objects (via ThreadLocal) available to the running tasks, eliminating concurrency issues.
Sockets are created once and kept open, then reused when new tasks get queued, eliminating socket object create/destroy overhead.
One idea would be to put the Sockets in a BlockingQueue. Then whenever you need a Socket your threads can take() from the queue and when they are done with the Socket they put() it back on the queue.
public void run() {
    Socket socket = socketQueue.take();
    try {
        // use the socket ...
    } finally {
        socketQueue.put(socket);
    }
}
This has the added benefits:
You can go back to using the ExecutorService code.
You can separate the socket communication from the processing of the results.
You don't need a 1-to-1 correspondence to processing threads and sockets. But the socket communications may be 98% of the work so maybe no gain.
When you are done and your ExecutorService completes, you can shutdown your sockets by just dequeueing them and closing them.
This does add the additional overhead of another BlockingQueue but if you are doing Socket communications, you won't notice it.
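A minimal sketch of that borrow/return pattern follows. It is generic so that it runs without a server: ResourcePool and withResource are hypothetical names, and StringBuilder stands in for Socket in the usage. The finally block is what guarantees the resource goes back on the queue even if the work throws.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool: resources are taken from and returned to a BlockingQueue. */
final class ResourcePool<T> {
    private final BlockingQueue<T> idle;

    ResourcePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // pre-initialize every resource up front
        }
    }

    /** Borrows a resource (waiting if none is idle), runs the task, and always returns the resource. */
    <R> R withResource(Function<T, R> task) {
        T resource;
        try {
            resource = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a resource", e);
        }
        try {
            return task.apply(resource);
        } finally {
            idle.add(resource); // cannot fail: capacity equals the number of resources
        }
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        // StringBuilder is a stand-in for a connected Socket.
        ResourcePool<StringBuilder> pool = new ResourcePool<>(2, StringBuilder::new);
        int length = pool.withResource(sb -> sb.append("ping").length());
        System.out.println("wrote " + length + " chars, idle resources: " + pool.idleCount());
    }
}
```

With real sockets the factory would open and connect each one, and shutdown would drain the queue and close them.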
we don't believe ThreadFactory addresses our needs ...
I think you could make this work if you used thread-locals. Your thread factory would create a thread that first opens the socket, stores it in a thread-local, then calls the Runnable arg which does all of the work with the socket, dequeuing jobs from the ExecutorService internal queue. Once it is done the method would finish and you could get the socket from the thread-local and close it.
Something like the following. It's a bit messy but you should get the idea.
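ExecutorService threadPool =
    Executors.newFixedThreadPool(..., new ThreadFactory() {
        public Thread newThread(final Runnable r) {
            Thread thread = new Thread(new Runnable() {
                public void run() {
                    // open the socket and store it in the thread-local
                    // our tasks would then get the socket from the thread-local
                    r.run();
                    // get the socket from the thread-local and close it
                }
            });
            return thread;
        }
    });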
So your tasks would implement Runnable and look like:
public class SocketWorker implements Runnable {
    private final ThreadLocal<Socket> threadLocal;

    public SocketWorker(ThreadLocal<Socket> threadLocal) {
        this.threadLocal = threadLocal;
    }

    public void run() {
        Socket socket = threadLocal.get();
        // use the socket ...
    }
}
I think you should use a ThreadLocal
package com.stackoverflow.q16680096;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class Main
{
    public static void main(String[] args)
    {
        ExecutorService pool = Executors.newCachedThreadPool();
        int nrOfConcurrentUsers = 100;
        for(int i = 0; i < nrOfConcurrentUsers; i++)
            pool.submit(new InitSocketTask());
        // do stuff...
        pool.submit(new Task());
    }
}

package com.stackoverflow.q16680096;
import java.net.Socket;
public class InitSocketTask implements Runnable
{
    public void run()
    {
        Socket socket = SocketPool.get();
        // Do initial setup here
    }
}

package com.stackoverflow.q16680096;
import java.net.Socket;
public final class SocketPool
{
    private static final ThreadLocal<Socket> SOCKETS = new ThreadLocal<Socket>(){
        protected Socket initialValue()
        {
            return new Socket(); // Pass in suitable arguments here...
        }
    };
    public static Socket get()
    {
        return SOCKETS.get();
    }
}

package com.stackoverflow.q16680096;
import java.net.Socket;
public class Task implements Runnable
{
    public void run()
    {
        Socket socket = SocketPool.get();
        // Do stuff with socket...
    }
}
Where each thread gets its own socket.
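To see the per-thread behaviour without opening real sockets, here is a sketch where a hypothetical Connection class stands in for Socket and a counter records how many instances the ThreadLocal creates. PerThreadConnections and countConnections are also hypothetical names. With a fixed pool, the count is bounded by the number of pool threads, not by the number of tasks.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo: one Connection per worker thread via ThreadLocal. */
final class PerThreadConnections {
    private PerThreadConnections() {}

    /** Stand-in for a connected socket. */
    static final class Connection {
        final int id;
        Connection(int id) { this.id = id; }
    }

    /** Runs the given number of tasks on a fixed pool and returns how many Connections were created. */
    static int countConnections(int threads, int tasks) {
        AtomicInteger created = new AtomicInteger();
        // initialValue runs at most once per thread, on that thread's first get().
        ThreadLocal<Connection> local =
            ThreadLocal.withInitial(() -> new Connection(created.incrementAndGet()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> local.get()); // each task uses its thread's own Connection
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return created.get();
    }

    public static void main(String[] args) {
        int connections = countConnections(5, 100);
        System.out.println("100 tasks used " + connections + " connections");
    }
}
```

One thing to keep in mind with this approach: nothing closes the sockets when a pool thread exits, and a cached thread pool discards idle threads after a while, so a fixed pool plus an explicit cleanup step is probably the safer combination.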

