I have a Java REST service which runs on Netty and is used for uploading (streaming) a large number of files of 5 MB to 500 MB. When I increase the number of concurrent uploads, at some point the application goes out of memory, which is expected, but I'm looking for recommendations on which Java GC and VM settings I should use in this scenario to improve performance and reduce the memory footprint.
I would really appreciate it if somebody could share similar experiences.
UPDATE: To add more context to the question, the REST service receives the file as a stream and passes the same stream on to Amazon S3.
You likely need flow control when sending large amounts of data. You can check whether the channel is writable (Channel.isWritable()) before sending data and wait until it is ready. You can use notifications via ChannelInboundHandler.channelWritabilityChanged(ChannelHandlerContext ctx) to track this. Without flow control, all your large files will sit in Netty's outbound buffer waiting to be sent and consume memory. A sketch of this pattern is shown below.
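For illustration, here is a minimal sketch of that pattern with Netty 4, assuming you relay the upload to another Netty channel; the class and the relay target are hypothetical, and if you push to S3 through a blocking SDK call instead, the same idea of pausing reads while too much data is in flight still applies:

```java
import io.netty.channel.Channel;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

// Sits in the pipeline of the outbound (relay) channel. Whenever that
// channel's outbound buffer crosses its write-buffer water marks, Netty
// fires channelWritabilityChanged(); we use it to stop/resume reading
// from the inbound channel so chunks never pile up in memory.
public class FlowControlHandler extends ChannelInboundHandlerAdapter {

    private final Channel inbound; // channel the upload is being read from

    public FlowControlHandler(Channel inbound) {
        this.inbound = inbound;
    }

    @Override
    public void channelWritabilityChanged(ChannelHandlerContext ctx) {
        // isWritable() flips according to WRITE_BUFFER_HIGH/LOW_WATER_MARK
        inbound.config().setAutoRead(ctx.channel().isWritable());
        ctx.fireChannelWritabilityChanged();
    }
}
```

The handler that reads the upload would then writeAndFlush() each chunk to the relay channel, and reads are automatically paused while that channel reports it is not writable.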
My Java application logs a fair amount of information to a logfile on disk. Some of this logged information is more important than the rest, but in rare cases the less important info is needed to explain to the end user why the code in production took a certain decision.
I was wondering if it would be a good idea to log the less important information to a socket instead of the file on disk. Is a socket write significantly faster than a disk write?
Update: Basically, I wanted to log to a socket in the same subnet or even the same machine, assuming that it would be faster than writing to disk. Another process (not part of my application) would then read from that socket at its convenience. I was thinking this would be logstash pulling from a socket. Async logging to disk using another thread is another alternative but I wanted to consider the socket option first if that is an easy solution with minimal performance hit.
You have a few choices:
local storage is usually faster than network
you could use async logging to disk, so your process fires and forgets (which is fast!); see the sketch after this list
logstash can read from Unix domain sockets, if you are on *nix; these are usually faster than disk I/O
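For the async option, this is roughly what fire-and-forget logging boils down to; in practice you would use something like Logback's AsyncAppender rather than rolling your own, so treat the class below as a stripped-down sketch (note the bounded queue, which is exactly the buffer discussed further down):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Callers enqueue a line and return immediately; a background daemon
// thread drains the queue and performs the actual disk writes.
public class AsyncFileLogger {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(100_000);

    public AsyncFileLogger(Path logFile) {
        Thread writer = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    String line = queue.take() + System.lineSeparator();
                    Files.write(logFile, line.getBytes(StandardCharsets.UTF_8),
                            StandardOpenOption.CREATE, StandardOpenOption.APPEND);
                }
            } catch (InterruptedException | IOException e) {
                Thread.currentThread().interrupt();
            }
        }, "async-logger");
        writer.setDaemon(true);
        writer.start();
    }

    public void log(String message) {
        queue.offer(message); // silently drops the message when the queue is full
    }
}
```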
If you are writing somewhere fast and from there the data is forwarded in a slower fashion (Logstash shipping over the network to some Elasticsearch instance), where does the buffering happen? Such a setup will generate a growing backlog of messages yet to be shipped if logging happens at a high rate for a prolonged period of time.
In the above scenarios buffering will happen (respectively):
direct sync write to disk: final log file on the disk is the buffer
async logging framework: buffers could eat into your heap or process memory (when outside the heap, or in some kernel area, they still end up in RAM)
Unix domain sockets: buffered in kernel space, so RAM again
With the last two options, things will get increasingly creaky in a sustained high-volume scenario.
Test and profile...
or just log to the local disk and rotate the files, deleting old ones.
A socket is not a destination; it's a transport. Your question "send data to socket" should therefore be rephrased as "send data to network", "send data to disk" or "send data to another process".
In all these cases, the socket itself is unlikely to be a bottleneck. The bottleneck will be either the network, the disk or application CPU usage, depending on where the data from the socket actually ends up. At the OS level, sockets are usually implemented as a zero-copy mechanism, which means the data is just passed to the other side as a pointer and is therefore highly efficient.
I am interested in implementing the following simple flow:
A client sends a simple message to a server process, which the server stores. Since the message does not have any hierarchical structure, IMO the best approach is to save it in a file instead of an RDB.
But I want to figure out how to optimize this, since as I see it there are two choices:
1. The server sends a 200 OK to the client and then stores the message, so the client does not notice any delay.
2. The server saves the message and then sends the 200 OK, but then the client notices the overhead of the file I/O.
I prefer the performance of (1), but this could lead to a client thinking all went OK when the message was actually never saved (for various error cases).
So I was thinking whether I could use NIO and memory-mapped files.
But I was wondering, is this a good candidate for using memory-mapped files? Would using a memory-mapped file guarantee that, e.g. if the process crashed, the message would be saved?
In my mind the flow would be creating/opening and closing many files, so is this a good candidate for memory-mapping files?
The server saves the message and then sends the 200 OK, but then the client notices the overhead of the file I/O.
I suggest you test this. I doubt a human will notice a 10 millisecond delay, and I expect you should do better than this for smaller messages.
So I was thinking whether I could use NIO and memory-mapped files.
I use memory mapping as it can reduce the overhead per write by up to 5 microseconds. Is this important to you? If not, I would stick with the simplest approach.
Would using a memory-mapped file guarantee that, e.g. if the process crashed, the message would be saved?
As long as the OS doesn't crash, yes.
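For reference, a write through a memory-mapped file looks roughly like the sketch below (the file name and the length-prefix layout are made up). Once the bytes land in the mapped region, the kernel will eventually flush them even if the JVM dies; calling force() additionally syncs them to disk right away, which also covers an OS crash at the cost of a slower call:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedMessageStore {
    public static void main(String[] args) throws IOException {
        byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
        try (FileChannel ch = FileChannel.open(Paths.get("messages.dat"),
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map a region big enough for this message; a real store would
            // map a larger region once and keep a running write offset.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4 + msg.length);
            buf.putInt(msg.length); // length prefix
            buf.put(msg);           // payload
            buf.force();            // optional: sync to disk now
        }
    }
}
```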
In my mind the flow would be creating/opening and closing many files, so is this a good candidate for memory-mapping files?
Opening and closing files is likely to be far more expensive than writing the data (by an order of magnitude). I would suggest keeping such operations to a minimum.
You might find this library of mine interesting: https://github.com/peter-lawrey/Java-Chronicle It allows you to persist messages on the order of single-digit microseconds for text and sub-microsecond for a small binary message.
I've got an underlying time-series database based on HBase (OpenTSDB); on top of it runs my application, which loads data from HBase. I need to stream the fetched time series to application clients. What is the best solution for this? If the client pauses processing of already received data, the server should also pause; if the client dies, the server should stop sending data.
I would like a pure Java solution. I know about ZeroMQ, but did not have a very nice experience with it. Maybe I should take a look at Netty?
P.S. The amount of data is large: gigabytes to tens of gigabytes per request to the server.
This may not be possible, but I thought I might just give it a try. I have a job that processes some data; it makes one of three decisions for each item it processes: keep, discard, or modify/reprocess (because it is unsure whether to keep or discard). This generates a very large amount of data because the reprocessing may break the data into many different parts.
My initial approach was to send it to my execution service that was processing the data, but because the number of items to process was large I would run out of memory very quickly. Then I decided to offload the queue to a messaging server (RabbitMQ), which works fine, but now I'm bound by network I/O. What I like about RabbitMQ is that it keeps messages in memory up to a certain level and then dumps old messages to the local drive, so if I have 8 GB of memory on my server I can still have a 100 GB message queue.
So my question is: is there any library that has a similar feature in Java? Something that I can use as a non-blocking queue that keeps only X items in the queue (either by number of items or by size) and writes the rest to the local drive.
Note: right now I'm only asking for this to be used on one server. In the future I might add more servers, but because each server is self-generating data I would try to take messages from one queue and push them to another if one server's queue is empty. The library would not need to have network access, but I would need to access the queue from another Java process. I know this is a long shot, but I thought if anyone knew, it would be SO.
Not sure if it is the approach you are looking for, but why not use a lightweight database like HSQLDB and a persistence layer like Hibernate? You can have your messages in memory, then commit them to the DB to save them to disk, and later retrieve them with a convenient SQL query.
Actually, as Cuevas wrote, HSQLDB could be a solution. If you use the "cached table" feature it provides, you can specify the maximum amount of memory used; data exceeding that limit will be sent to the hard drive. A rough sketch is shown below.
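Something like this, using plain JDBC and HSQLDB 2.x syntax; the table layout and the cache limit are only illustrative, and the exact property names and limits are worth double-checking against the HSQLDB documentation for your version:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

public class OverflowQueueDemo {
    public static void main(String[] args) throws Exception {
        // File-based HSQLDB database; CACHED tables keep only part of their
        // rows in memory and spill the rest to the .data file on disk.
        try (Connection con = DriverManager.getConnection("jdbc:hsqldb:file:./queuedb/queue", "SA", "")) {
            try (Statement st = con.createStatement()) {
                st.execute("CREATE CACHED TABLE messages (id BIGINT IDENTITY, payload VARBINARY(1000000))"); // run once
                st.execute("SET FILES CACHE ROWS 50000"); // rough cap on rows kept in memory
            }
            try (PreparedStatement ps = con.prepareStatement("INSERT INTO messages (payload) VALUES (?)")) {
                ps.setBytes(1, new byte[] {1, 2, 3});
                ps.executeUpdate();
            }
        }
    }
}
```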
Use the filesystem. It's old-school, yet so many engineers get bitten by libraries because they are lazy. True, HSQLDB provides lots of value-add features, but in the context of being lightweight...
I am a bit confused. I wrote a standalone Java app and now I want to use GAE to deploy it on the web, and along the way also learn about GAE.
In my application, I read data from a file, store it in memory, process it, and then store the results in memory or in a file.
I understand that now I need to store the results in GAE's datastore, which is fine. So I can run my program independently on my computer, then write the results to a file, and then use GAE to upload all the results to the datastore so users can query it. However, is there a way I can move the entire process into the GAE application? So the application reads data from a file, does the processing (using the memory of the application server and not my computer; it needs at least 4 GB of RAM), and then when it's done (might take 1-2 hours) writes everything to the GAE datastore? (So it's an internal "offline" process in which no users are involved.)
I'm a bit confused since Google doesn't mention anything about a memory quota.
Thanks!
You will not be able to do your offline processing the way you are envisioning. There is a limit to how much memory your app can use, but that is not the main problem. All processing in app engine is done in request handlers. In other words, any action you want your app to do will be written as if it is handling a web request. Each of these handlers is limited to 30 seconds of running time. If your process tries to run longer, it will get shut down. App engine is optimized for serving web requests, not doing heavy computations.
All that being said, you may be able to break up your computational tasks into 30 second chunks and store intermediate results in the datastore or memcache. In that case you could use a cron job or task queue (both described in the app engine docs) to keep calling your processing handlers until the data crunching was done.
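To illustrate the task-queue approach, a worker along the lines of the sketch below processes one chunk per request and re-enqueues itself until there is nothing left; the /tasks/process URL, the cursor parameter and the processNextChunk method are placeholders, not part of any real API:

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class ProcessChunkServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String cursor = req.getParameter("cursor");

        // Process a slice small enough to finish well within the request deadline,
        // storing intermediate results in the datastore or memcache.
        String nextCursor = processNextChunk(cursor);

        if (nextCursor != null) {
            // More work left: enqueue a follow-up task that picks up where we stopped.
            Queue queue = QueueFactory.getDefaultQueue();
            queue.add(TaskOptions.Builder.withUrl("/tasks/process").param("cursor", nextCursor));
        }
        resp.setStatus(200);
    }

    private String processNextChunk(String cursor) {
        // ... read the next slice of input, crunch it, return the next cursor
        // or null when everything has been processed (placeholder).
        return null;
    }
}
```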
In summary, yes, it may be possible to do what you want, but it might not be worth the trouble. Look into other cloud solutions like Amazon's EC2 or Hadoop if you want to do computationally intensive things.