I'm working on a remote administration toy project. For now, I'm able to capture screenshots and control the mouse using the Robot class. The screenshots are BufferedImage instances.
First of all, my requirements:
- Only a server and a client.
- Performance is important, since the client might be an Android app.
I've thought about opening two socket connections: one for mouse and system commands, and a second one for the video feed.
How could I convert the screenshots to a video stream? Should I convert them to a known video format, or would it be OK to just send a series of serialized images?
Compression is another problem. Sending the screen captures at full resolution results in a low frame rate, according to my preliminary tests. I think I need at least 24 fps to perceive movement, so I have to both downscale and compress. I could convert the BufferedImages to JPEG and set the compression rate, but I don't want to store the files on disk; they should live in RAM only. Another possibility would be to serialize instances (representing an uncompressed screenshot) to a GZIPOutputStream. What is the correct approach for this?
To summarize:
In case you recommend the "series of images" approach, how would you serialize them to the socket OutputStream?
If your proposition is to convert to a known video format, which classes or libraries are available?
Thanks in advance.
UPDATE: my tests, client and server on the same machine:
- Full-screen serialized BufferedImages (only dimension, type and int[]), without compression: 1.9 fps.
- Full-screen images through GZip streams: 2.6 fps.
- Downscaled images (640 px wide) and GZip streams: 6.56 fps.
- Full-screen images and RLE encoding: 4.14 fps.
- Downscaled images and RLE encoding: 7.29 fps.
If it's just screen captures, I would not compress them using a video compression scheme; most likely you don't want lossy compression (blurred details in small text, etc., are the most common defects).
For getting a workable "remote desktop" feel, remember the previously sent screenshot and send only the difference needed to get to the next one. If nothing (or very little) changes between frames, this is very efficient.
It will, however, not work well in certain situations, like playing a video, a game, or scrolling a lot in a document.
Compressing the difference between two BufferedImages can be done with more or less elaborate methods; a very simple, yet reasonably effective, method is to subtract one image from the other (resulting in zeros everywhere they are identical) and compress the result with simple RLE (run-length encoding).
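A minimal sketch of that idea (the helper below is hypothetical, not a complete codec): subtract the previous frame's pixels from the current frame's and run-length encode the result, so that runs of zeros dominate when little changes between frames:

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// pixel arrays come from BufferedImage.getRGB(0, 0, w, h, null, 0, w)
public static byte[] encodeDiffRle(int[] current, int[] previous) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    int i = 0;
    while (i < current.length) {
        int diff = current[i] - previous[i];
        int run = 1;
        while (i + run < current.length
                && current[i + run] - previous[i + run] == diff) {
            run++;
        }
        out.writeInt(run);  // run length
        out.writeInt(diff); // repeated difference value (0 where frames match)
        i += run;
    }
    out.flush();
    return bos.toByteArray();
}

The client reverses this by adding each decoded difference back onto its copy of the previous frame.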
Reducing the color precision can further reduce the amount of data (depending on the use case you could omit the least significant N bits of each color channel; most GUI applications look barely different if you reduce colors from 24 bits to 15 bits).
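As an illustration of that bit reduction (a sketch, using the 15-bit figure from the paragraph above), masking off the low three bits of each 8-bit channel leaves 5 bits per channel:

static int reduceTo15Bit(int argb) {
    return argb & 0xFFF8F8F8; // keep alpha, drop the low 3 bits of R, G and B
}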
Break the screen up into a grid of squares (or strips), and only send a grid square if it differs from the previously sent one:
// server start
sendScreenMetaToClient(); // width, height, how many grid squares
...

// server loop
BufferedImage[] prevScrnGrid = new BufferedImage[gridSquareCount];
while (isRunning) {
    BufferedImage scrn = captureScreen();
    BufferedImage[] scrnGrid = screenToGrid(scrn);
    for (int i = 0; i < scrnGrid.length; i++) {
        if (!isSameImage(scrnGrid[i], prevScrnGrid[i])) {
            prevScrnGrid[i] = scrnGrid[i];
            // tell the client grid square (i) follows, then send its bytes
            sendGridSquareToClient(i, scrnGrid[i]);
        }
    }
}
Don't send serialized Java objects; just send the image data.
ByteArrayOutputStream imgBytes = new ByteArrayOutputStream();
ImageIO.write(bufferedImage, "jpg", imgBytes);
imgBytes.flush();
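To push those bytes down the socket, simple length-prefixed framing works (a hedged sketch; socketOut is assumed to be the socket's OutputStream wrapped in a DataOutputStream):

byte[] data = imgBytes.toByteArray();
socketOut.writeInt(data.length); // length prefix so the client knows how much to read
socketOut.write(data);
socketOut.flush();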
Firstly, I might suggest only capturing a small part of the screen rather than downscaling and potentially losing information, perhaps with something like a sliding window which can be moved around by pushing the edges with a cursor. This is really just a small design suggestion, though.
As for compression, I would think that a series of separately compressed images would not compress as well as a decent video compression scheme, especially as frames are likely to remain largely unchanged between captures in this scenario.
One option would be to use Xuggle, which, as far as I understand, is capable of capturing the desktop via Robot in a number of video formats, but I can't tell if you can stream and decode with it.
For capturing JPEGs and converting them, you can also use this.
Streaming these videos seems to be a little more complicated, though.
Also, it seems that the abandoned Java Media Framework supports this functionality.
My knowledge in this area is not fantastic, to be honest, so sorry if I have wasted your time, but it looks like some more useful information on the feasibility of using Xuggle as a screen sharer has been compiled here. This also appears to link to their own notes on existing approaches.
If it doesn't need to be pure Java, I reckon this would all be much easier just by interfacing with a native screen-capture tool...
Maybe it would be easiest just to send the video as a series of JPEGs after all! You could always implement your own compression scheme if you were feeling a little crazy...
I think you described a good solution in your question. Convert the images to JPEG, but don't write them to disk as files. If you want it to be a known video format, use M-JPEG. M-JPEG is a stream of JPEG frames in a standard format. Many digital cameras, especially older ones, save videos in this format.
You can get some information about how to play an M-JPEG stream from this question's answers: Android and MJPEG
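For a feel of the wire format, here is a hedged sketch of serving M-JPEG over HTTP using the common multipart/x-mixed-replace framing (the boundary token and the nextJpegFrame() helper are my own placeholders; out is the socket's OutputStream):

import java.nio.charset.StandardCharsets;

out.write(("HTTP/1.0 200 OK\r\n"
        + "Content-Type: multipart/x-mixed-replace; boundary=frame\r\n\r\n")
        .getBytes(StandardCharsets.US_ASCII));
while (streaming) {
    byte[] jpeg = nextJpegFrame(); // assumed helper producing one encoded JPEG
    out.write(("--frame\r\n"
            + "Content-Type: image/jpeg\r\n"
            + "Content-Length: " + jpeg.length + "\r\n\r\n")
            .getBytes(StandardCharsets.US_ASCII));
    out.write(jpeg);
    out.write("\r\n".getBytes(StandardCharsets.US_ASCII));
    out.flush();
}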
If network bandwidth is a problem, then you'll want to use an inter-frame compression system such as MPEG-2, H.264, or similar. That requires much more processing than M-JPEG but is far more efficient.
If you're trying to get 24 fps video then there's no reason not to use modern video codecs. Why try to recreate that wheel?
Xuggler works fine for encoding H.264 video and sounds like it would serve your needs nicely.
Related
In short, my question is: which is the fastest format for encoding with JCodec without losing too much quality (like mangled colours)?
An example of what I mean by "mangled colours" can be found in the videos in the description of this issue.
The rest is contextual information about my considerations and what I have tried:
I am creating a screen recorder in Java. I have solved the issue of getting more than 10 FPS as BufferedImages (at least on Windows; Xorg is not very cooperative), but encoding is not fast enough to keep up.
My solution is threaded, with a producer, a consumer, and a BlockingQueue for transferring frames.
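A minimal sketch of that hand-off (the queue capacity, screenBounds and the encoder are placeholders, and the encodeImage(...) call is my assumption of a JCodec-style encoder, not the asker's actual code; imports come from java.util.concurrent):

BlockingQueue<BufferedImage> queue = new ArrayBlockingQueue<>(32);
AtomicBoolean recording = new AtomicBoolean(true);
ExecutorService pool = Executors.newFixedThreadPool(2);

pool.submit(() -> { // producer: grab frames as fast as the OS allows
    while (recording.get()) {
        queue.put(robot.createScreenCapture(screenBounds));
    }
    return null;
});

pool.submit(() -> { // consumer: drain the queue and encode
    while (recording.get() || !queue.isEmpty()) {
        BufferedImage frame = queue.poll(100, TimeUnit.MILLISECONDS);
        if (frame != null) {
            encoder.encodeImage(frame);
        }
    }
    return null;
});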
I need it to be able to encode at least 15 FPS full HD, but more is better.
I probably need to re-encode after the first encoding pass, but for now I just want to store the frames without losing too much quality, while saving at least some bits.
I am considering PRORES, since other formats do not seem to play well (most just don't write anything, and H.264 mangles the colours), but is that a viable alternative?
Other ways of storing a lot of BufferedImage objects are welcome too, but I would prefer encoding directly to video. (I was considering writing enumerated PNGs or BMPs to a zip, but I have not gotten my head around it yet.)
So, let's say I want to recode a PNG to JPEG in Java. The image has an extreme resolution, say 10,000 x 10,000 px. Using the "standard" Java image API writers and readers, you need at some point to have the entire image decoded in RAM, which takes an extreme amount of space (hundreds of MB). I have looked at how other tools do this, and I found that ImageMagick uses disk pixel storage, but that seems way too slow for my needs. What I need is a true streaming recoder. And by true streaming I mean reading and processing the data in chunks or bins, not just taking a stream as input and decoding it whole beforehand.
Now, first the theory: is it even possible, given the JPEG and PNG algorithms, to do this with streams, or in bins of data, so that there is no need to have the entire image in memory (or other storage)? In JPEG compression, the first few stages could be done on streams, but I believe Huffman encoding needs to build the entire tree of value probabilities after quantization, so it needs to analyze the whole image; the whole image would have to be decoded beforehand, or somehow on demand by regions.
And the golden question: if the above can be achieved, is there any Java library that actually works this way and saves a large amount of RAM?
If I create a 10,000 x 10,000 PNG file, full of incompressible noise, with ImageMagick like this:
convert -size 10000x10000 xc:gray +noise random image.png
I see ImageMagick uses 675M of RAM to create the resulting 572MB file.
I can convert it to a JPEG with vips like this:
vips im_copy image.png output.jpg
and vips uses no more than 100MB of RAM while converting, and takes 7 seconds on a reasonably specced iMac around 4 years old, albeit with SSD.
I have thought about this for a while, and I would really like to implement such a library. Unfortunately, it's not that easy. Different image formats store pixels in different ways. PNGs or GIFs may be interlaced. JPEGs may be progressive (multiple scans). TIFFs are often striped or tiled. BMPs are usually stored bottom-up. PSDs are channeled. Etc.
Because of this, the minimum amount of data you have to read to recode to a different format may in the worst case be the entire image (or maybe not, if the format supports random access and you can live with a lot of seeking back and forth)... Resampling (scaling) the image to a new file in the same format would probably work in most cases, though (probably not so well for progressive JPEGs, unless you can resample each scan separately).
If you can live with a disk buffer, though, as the second-best option, I have created some classes that allow BufferedImages to be backed by NIO MappedByteBuffers (memory-mapped file buffers, kind of like virtual memory). While performance isn't really like that of in-memory images, it's also not entirely useless. Have a look at MappedImageFactory and MappedFileBuffer.
I've written a PNG encoder/decoder that does that for the PNG format (it reads and writes progressively, which only requires keeping one row in memory): PNGJ
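From memory, row-based processing with PNGJ looks roughly like this; treat the exact class and method names as assumptions and check the PNGJ documentation:

PngReader reader = new PngReader(new File("in.png"));
PngWriter writer = new PngWriter(new File("out.png"), reader.imgInfo);
for (int row = 0; row < reader.imgInfo.rows; row++) {
    writer.writeRow(reader.readRow(row), row); // one row in memory at a time
}
reader.end();
writer.end();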
I don't know if there is something similar for JPEG.
How to process videos in Java without any external API?
I have an MPEG-4 or MPEG-2 video which I want to view in a JPanel, reading each frame one after another and then displaying the frames with paintComponent(), drawing each frame as a BufferedImage via the Graphics g.
My question is: how do I get an array of BufferedImages from the video files?
You certainly can process videos in Java without relying on external libraries. It will take a fair amount of coding effort on your part, so be forewarned.
If you're decoding an MPEG-2 file, it will probably be a program stream, so you'll need to write the code to take that apart. MPEG-4 part 2 video will probably arrive in an MP4 container, which will require a lot of code to take apart (not too hard, just a lot of details). And that's just the container; inside will be chunks of compressed video and audio.
Now you will need to decode either the MPEG-2 or MPEG-4 video. This will entail parsing variable-length codes from the bitstream and recovering syntax elements. For intraframes, this will give you reconstruction data to apply to a stream of macroblocks. You will combine things like DCT coefficients, differential coding, dequantization factors, zigzag scan patterns, discrete cosine transforms, and possibly post-processing filters in order to recover the original image. That's just for intraframes; then there are the interframes, which also apply motion vectors and copy data from previously decoded frames.
After decoding a frame, you will find that it is in a YUV colorspace. You will probably need to convert it manually to an RGB colorspace in order to plot it onto a BufferedImage.
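For a concrete idea of that last step, this is the standard integer approximation of the BT.601 YCbCr-to-RGB conversion for one pixel (the coefficients differ slightly for BT.709):

static int yuvToRgb(int y, int u, int v) {
    int c = y - 16, d = u - 128, e = v - 128;
    int r = clamp((298 * c + 409 * e + 128) >> 8);
    int g = clamp((298 * c - 100 * d - 208 * e + 128) >> 8);
    int b = clamp((298 * c + 516 * d + 128) >> 8);
    return (0xFF << 24) | (r << 16) | (g << 8) | b; // packed ARGB for BufferedImage
}

static int clamp(int x) { return Math.max(0, Math.min(255, x)); }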
It's entirely possible that you could implement all of this on your own. Or you could find an appropriate Java-friendly multimedia library, complete with its own API, to do the heavy lifting for you.
I'm transferring images over TCP/IP and I'd like to optimize this while keeping the quality as good as possible.
What kinds of methods or algorithms can I use?
P.S.
Now that I think about it, maybe I should ask: what is the best and fastest way to send an image
via TCP/IP?
To find the right answer to your question, you need to look at the images themselves. Are they real-world images captured with a camera? Or are they synthetic images, like icons or graphs?
Lossy compression (like JPEG) works very well for real scenes with many gradients and smooth edges. For images with solid colors and hard edges, you get a much higher (even perceived) loss in image quality and less gain in compression rate compared to lossless compression.
Basically, the established image formats for your domain are PNG (Portable Network Graphics) and JPEG. PNG images are always compressed losslessly, but their compression algorithm works better than the competition's, e.g. GIF. If the images are well-suited, you get compression rates comparable to JPEG; if not (as with real-world images), you get typical ZIP compression rates (around 50%).
After deciding between lossy and lossless compression (or a combination based on picture type; you could also compress each image in both formats and compare, if processing time matters less than network throughput), you should also take advantage of progressive coding, which is supported by both the JPEG and PNG formats.
With progressive coding, the data is basically organized such that the more data you receive, the better the quality (rather than just sending the image row by row). The advantage is that you can show the image to the user while it is still being received. However, for this you need a decoder that exposes this functionality.
I don't know about the libraries available in Java for this.
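Edit: one option in the standard library: ImageIO's JPEG writer can emit progressive JPEGs through ImageWriteParam. A sketch, with image and outputStream assumed to exist (MODE_DEFAULT lets the writer pick the progression scheme):

ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
ImageWriteParam param = writer.getDefaultWriteParam();
param.setProgressiveMode(ImageWriteParam.MODE_DEFAULT); // write progressive scans
try (ImageOutputStream ios = ImageIO.createImageOutputStream(outputStream)) {
    writer.setOutput(ios);
    writer.write(null, new IIOImage(image, null, null), param);
} finally {
    writer.dispose();
}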
You should check the Java Advanced Imaging API.
But to use it effectively you will need to understand what type of image operations are right for your problem. This will depend, among other things, on the encoding of your source image.
As for the "good quality as much as possible", you will most likely need to experiment with various compression techniques and their relevant parameters before deciding which one gives the right balance of speed, size and quality for your needs.
You may take a look at this. It's a comparison of common compression algorithms (quality and compression rate).
Edit: it is not directly Java, but you can probably find an implementation of the desired algorithm.
For images intended for human viewing, JPEG is quite nice. What is at the remote end? A browser?
In the software TeamViewer, the quality of the images can be changed. It looks like the image goes from 32-bit to 16-bit (or other values, like in the screen device settings in Windows). The image really is smaller, because you notice that the speed of the desktop sharing gets higher. I don't want something like "scale down, send and then scale up".
Now my question: is it possible to produce such a low-quality image?
Thanks
You have four alternatives for lossy compression:
reduce spatial resolution (size)
reduce bitdepth
compress in another domain (JPEG)
a combination of these
And you will probably get the best gain with JPEG for rich pictures like photos, and with bitdepth reduction (even down to an 8-bit-or-less palette) on images with less variation in colors. Note that bitdepth reduction is most effective when combined with lossless compression afterwards, like run-length encoding (did you know that even JPEG uses that?).
Yes, you can change the compression settings for many different types of images.
Google found this: Adjust JPEG image compression quality when saving images in Java
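The gist of that approach, as a sketch (the 0.7f quality factor is an arbitrary example; bufferedImage and out are assumed to exist):

ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
ImageWriteParam param = writer.getDefaultWriteParam();
param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
param.setCompressionQuality(0.7f); // 0.0 = smallest file, 1.0 = best quality
writer.setOutput(ImageIO.createImageOutputStream(out));
writer.write(null, new IIOImage(bufferedImage, null, null), param);
writer.dispose();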
You can use image converters for this purpose. When a user uploads a file, it is sent to the converter, which does its thing (according to defined settings). You would, however, need access to run applications on the server, I think.
ypnos already mentioned bit depth reduction. Reading your question, I also immediately thought of dithering, which will preserve the image better as you reduce the size of the color space. You can easily find implementations of the Floyd-Steinberg algorithm around the net; a grayscale sketch follows below.
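For illustration, a hedged sketch of Floyd-Steinberg on a grayscale raster, quantizing to pure black and white (per-channel color versions distribute the error the same way):

static void floydSteinberg(int[][] gray, int w, int h) {
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int oldPx = gray[y][x];
            int newPx = oldPx < 128 ? 0 : 255; // nearest of the two levels
            gray[y][x] = newPx;
            int err = oldPx - newPx;
            // push the quantization error onto unprocessed neighbors
            if (x + 1 < w)            gray[y][x + 1]     += err * 7 / 16;
            if (y + 1 < h) {
                if (x > 0)            gray[y + 1][x - 1] += err * 3 / 16;
                                      gray[y + 1][x]     += err * 5 / 16;
                if (x + 1 < w)        gray[y + 1][x + 1] += err * 1 / 16;
            }
        }
    }
}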