How should I cache an audio file in Java (using JLayer)?

How should I cache an audio file in Java (using JLayer)? - java

I'm developing an application that necessitates the use of sound files (through the JLayer audio library at the moment) and am wondering if it is possible to cache a file in memory so that it doesn't have to be read off of the hard disk every time it's needed.
For example, I can store an image in a BufferedImage object instead of calling ImageIO.read every time I need an image; is there a similar way to approach this using JLayer? Should I use a different audio library? I just don't want to be hitting the disk every time the user triggers a sound (potentially many times per second).
If it makes a difference, the audio is in MP3 format.

You can read any file (up to a certain size) into a byte[] and process it from there.
Java's abstraction of InputStreams (where the consumer need not know where the data is from) should make this very straightforward (assuming your audio library is not hard-coded to read from files).
Start by looking up ByteArrayOutputStream and ByteArrayInputStream.

Related

Java Sound API: Recording and monitoring same input source

I'm trying to record from the microphone to a wav file as per this example. At the same time, I need to be able to test for input level/volume and send an alert if it's too low. I've tried what's described in this link and seems to work ok.
The issue comes when trying to record and read bytes at the same time using one TargetDataLine (bytes read for monitoring are being skipped for recording and vice-versa.
Another thing is that these are long processes (hours probably) so memory usage should be considered.
How should I proceed here? Any way to clone TargetDataLine? Can I buffer a number of bytes while writing them with AudioSystem.write()? Is there any other way to write to a .wav file without filling the system memory?
Thanks!

If you are using a TargetDataLine for capturing audio similar to the example given in the Java Tutorials, then you have access to a byte array called "data". You can loop through this array to test the volume level before outputting it.
To do the volume testing, you will have to convert the bytes to some sort of sensible PCM data. For example, if the format is 16-bit stereo little-endian, you might take two bytes and assemble to either a signed short or a signed, normalized float, and then test.
I apologize for not looking more closely at your examples before posting my "solution".
I'm going to suggest that you extend InputStream, making a customized version that also performs the volume test. Override the 'read' method so that it obtains the byte that it returns from the code you have that tests the volume. You'll have to modify the volume-testing code to work on a per-byte basis and to pass through the required byte.
You should then be able to use this extended InputStream as an argument when you create the AudioInputStream for the output-to-wav stage.
I've used this approach to save audio successfully via two data sources: once from an array that is populated beforehand, once from a streaming audio mix passing through a "mixer" I wrote to combine audio data sources. The latter would be more like what you need to do. I haven't done it from a microphone source, though. But the same approach should work, as far as I can tell.

JAI: How do I extract a single page input stream from a multipaged TIFF image container?

I have a component that converts PDF documents to images, one image per page. Since the component uses converters producing in-memory images, it hits the JVM heap heavily and takes some time to finish conversions.
I'm trying to improve the overall performance of the conversion process, and found a native library with a JNI binding to convert PDFs to TIFFs. That library can convert PDFs to single TIFF files only (requires intermediate file system storage; does not even consume conversion streams), therefore result TIFF files have converted pages embedded, and not per-page images on the file system. Having a native library improves the overall conversion drastically and the performance gets really faster, but there is a real bottleneck: since I have to make a source-page to destination-page conversion, now I must extract every page from the result file and write all of them elsewhere. A simple and naive approach with RenderedImages:
final SeekableStream seekableStream = new FileSeekableStream(tempFile);
final ImageDecoder imageDecoder = createImageDecoder("tiff", seekableStream, null);
...
// V--- heap is wasted here
final RenderedImage renderedImage = imageDecoder.decodeAsRenderedImage(pageNumber);
// ... do the rest stuff ...
Actually speaking, I would really like just to extract a concrete page input stream from the TIFF container file (tempFile) and just redirect it to elsewhere without having it to be stored as an in-memory image. I would imagine an approach similar to containers processing where I need to seek for a specific entry to extract data from it (say, something like ZIP files processing, etc). But I couldn't find anything like that in ImageDecoder, or I'm probably wrong with my expectations and just missing something important here...
Is it possible to extract TIFF container page input streams using JAI API or probably third-party alternatives? Thanks in advance.

I could be wrong, but don't think JAI has support for splitting TIFFs without decoding the files to in-memory images. And, sorry for promoting my own library, but I think it does exactly what you need (the main part of the solution used to split TIFFs is contributed by a third party).
By using the TIFFUtilities class from com.twelvemonkeys.contrib.tiff, you should be able to split your multi-page TIFF to multiple single-page TIFFs like this:
TIFFUtilities.split(tempFile, new File("output"));
No decoding of the images are done, only splitting each IFD into a separate file, and writing the streams with corrected offsets and byte counts.
Files will be named output/0001.tif, output/0002.tif etc. If you need more control over the output name or have other requirements, you can easily modify the code. The code comes with a BSD-style license.

Android pdf writer APW high resolution images cause out of memory expection

I am using android pdf writer
(apw) in my app successfully for the most part. However, when I try to include a high resolution in a pdf document, I get an out of memory exception.
Immediately before creating the pdf file, the library must have the content itself converted into a string (representing the raw pdf content), which is then converted to a byte array. The byte array is written to the file in a file output stream (see example via website).
The out of memory expection occurs when the string is generated because representing all the pixels of a bitmap image in string format is very memory intensive. I could downsample the image using the android API, however, it is essential that the images are put into the pdf at high resolution (~2000 x 1000).
There are many scanner type apps which seem to be able to able generate pdf high res images, so there must be a way around it, surely. Granted, they may be using other libraries, but surely there is someone who has figured out a way around it with this library given that it is free and therefore popular(?)
I emailed the developer, but there was no response.
Potential solutions (I can think of) include:
Modifying the library to load a string representing e.g. the first 10% of the PDF, and writing to file chunk by chunk. (edit)
Modifying the library to output a stringoutput stream, or other output stream to a temp file (or final file) as the actual pdf content is being written in the pdfwriter object.
However as a relative java noob (and even more of a pdf specification noob), I am unable to understand the library well enough to do this myself.
Has anyone come across this problem and found a way around it? Anyone willing to hazard a suggestion, or take a look at the library itself even to see if there is a fix of some sort.
Thanks for your help.
nme32
Edit:
Logcat says heap size is in the range on 40 to 60mb before the crash. I understand (do correct me if not) that Android limits the available memory to apps depending on what else is running, though it is in the 50mb ballpark, depending on device.
When loading the image, I think APW essentially converts it to bitmap, that is represents the image pixel by pixel then puts it into string format, meaning it doesn't matter which image format you use, it may as well be bitmap.

First of all the resolution you are mentioning is very high. And i have already mentioned the issues related to Images in Android in this Answer
Secondly in case first solution doesn't work for you i would suggest Disk based LruCache.And store the chunks into that disk based cache and then retrieve and use it. Here is an Example of that.
Hope this would help. If it doesn't comment on this answer and i will add more solutions.

Avoid obtaining same InputStream more than once

I can see there are a number of posts regarding reuse InputStream. I understand InputStream is a one-time thing and cannot be reused.
However, I have a use case like this:
I have downloaded the file from DropBox by obtaining the DropBoxInputStream using the DropBox's Java SDK. I then need to upload the file to another system by passing the InputStream. However, as part of the download, I have to provide the MD5 of the file. So I have to read the file from the stream before uploading the file. Because the DropBoxInputStream I received can only be used once, I have to get another DropBoxInputStream after I have calculated the MD5 and before uploading the file. The procedure is like:
Get first DropBoxInputStream
Read from the DropBoxInputStream and calculate MD5
Get the second DropBoxInputStream
Upload file using the MD5 and the second DropBoxInputStream.
I am thinking that, if there are many way for me to "cache" or "backup" the InputStream before I calculate the MD5 so that I can save step 3 of obtaining the same DropBoxInputStream again?
Many thanks
EDIT:
Sorry I missed some information.
What I am currently doing is that I use a MD5DigestOutputStream to calculate MD5. I stream data across the MD5DigestOutputStream and save them locally as a temp file. Once the data goes through the MD5DigestOutputStream, it will calculate the MD5.
I then call a third party library to upload the file using the calculated md5 and a FileInputStream which reads from the temp file.
However, this requires huge disk space sometime and I want to remove the needs to use temp file. The library I use only accepts a MD5 and InputStream. This means I have to calculate the MD5 on my end. My plan is to use my MD5DigestOutputStream to write data to /dev/null (not keeping the file) so that I can calculate theMD5, and get the InputStream from DropBox again and pass that to the library I use. I assume the library will be able to get the file directly from DropBox without the need for me to cache the file either in the memory of at the disk. Will it work?

Input streams aren't really designed for creating copies or re-using, they're specifically for situations where you don't want to read off into a byte array and use array operations on that (this is especially useful when the whole array isn't available, as in, for e.g. socket comunication). You could buffer up into a byte array, which is the process of reading sections from the stream into a byte array buffer until you have enough information.
But that's unnecessary for calculating an md5. Notice that InputStream is abstract, so it needs be implemented in an extended class. It has many implementations- GZIPInputStream, fileinputstream etc. These are, in design pattern speak, decorators of the IO stream: they add extra functionality to the abstract base IO classes. For example, GZIPInputStream gzips up the stream.
So, what you need is a stream to do this for md5. There is, joyfully, a well documented similar thing: see this answer. So you should just be able to pass your dropbox input stream (as it will be itself an input stream) to create a new DigestInputStream, and then you can both take the md5 and continue to read as before.
Worried about type casting? The idea with decorators in Java is that, since the InputStream base class interfaces all the methods and 'beef' you need to do your IO, there's no harm in passing instances of objects inheriting from InputStream in the constructor of each stream implementation, and you can still do the same core IO.
Finally, I should probably answer your actual question- say you still want to "cache" or "backup" the stream anyway? Well, you could just write it to a byte array. This is well documented, but can become faff when your streams get more complicated. Alternatively, try looking at a PushbackInputStream. Here, you can easily write a function to read off n bytes, perform and operation on them, and then restore them to the stream. Generally good to avoid these implementations of streams in Java, as it's bad for memory use, but no worse than buffering everything up which you'd otherwise have to do.
Or, of course, I would have a go with DigestInputStream.
Hope this helps,
Best.

You don't need to open a new InputStream from DropBox.
Once you have read the file from DropBox, you have it locally. So it is either in memory (in a byte array) or you stored it in a local file. Now you can create an InputStream that reads the data from memory (ByteArrayInputStream) or disk (FileInputStream) in order to upload the file.
So instead of caching the InputStream (which you can't) you cache the contents (which you can).

Is it advisable to store large strings in memory, or repeatedly read a file?

Let's say I have various text/json/xml/whatever files (stored locally, in the assets directory), ranging in size from 20 - 500 KB. Assuming these files are going to be referenced frequently, throughout the application, is it better to:
A) Read the file once, the first time it's requested, and store the data in a variable
or
B) Read the file each time it's requested, grab the requested bit of information, and allow GC to clean up afterward?
Coming from web-dev, I generally use option (A), but I wonder if the storage limitation of mobile devices makes B preferred in this context (Android app development).
TYIA.

You can store your data into memory by compressing it.That it will reduce your memory footprint at any point of time.So this technique can be applicable to both PCs and mobile phones.Later on when you need the data, read and decompress it.So read the file once, then compress and store it in the memory.The following example uses GZIPOutputStream to compress a string.
public static String compress(String str){
ByteArrayOutputStream out = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(out);
gzip.write(str.getBytes());
gzip.close();
return out.toString("ISO-8859-1");
}

If the file is being requested frequently, definitely it's better to read the file once and store in cache.
You can also read this article titled How Google Taught me to Cache and Cash In in HighScalability website.

That depends on the total size of the files, accessing frequency, and your targeting customers. Although high-end phone got very large memory, the are many low-ends system which has fewer memory.
It might deserve to use some LRU cache to reach a balance.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.