Due to (quite annoying) limitations on many J2ME phones, audio files cannot be played until they are fully downloaded. So, in order to play live streams, I'm forced to download chunks at a time, and construct ByteArrayInputStreams, which I then feed to Players.
This works well, except that there's an annoying gap of about 1/4 of a second every time a stream ends and a new one is needed. Is there any way to eliminate this gap, or to work around the underlying limitation?
The only good way to play long (3 minutes and more) tracks with J2ME JSR 135, moderately reliably, on the largest number of handsets out there, is to use a "file://" URL when you create the Player, or to have the InputStream actually come from a FileConnection.
Recent BlackBerry phones can use a ByteArrayInputStream only when they have a large Java heap available.
A lot of phones running the Symbian operating system will allow you to put files in a private area for the J2ME application while still being able to play tracks from that location.
Unfortunately you can't get rid of these gaps, at least not on any device I've tried it on. It's very annoying indeed. It's part of the spec that you can't stream audio or video over HTTP.
If you want to stream from a server, the only way to do it is to use an RTSP server instead, though you'll need to check support for this on your device.
And faking RTSP using a local server on the device (rtsp://localhost...) doesn't work either; I tried that too.
EDIT2: Or you could just look at this, which seems to be exactly what you want: http://java.sun.com/javame/reference/apis/jsr135/javax/microedition/media/protocol/DataSource.html
I would create two Player instances and make sure I had received enough chunks before starting playback. Then I would play the first chunk through player one and load the second chunk into player two. Using the TimeBase class to keep track of elapsed time, and knowing how long each chunk takes to play, I would start the second chunk through the second player just as the first ends, load the third chunk into the first player, and so on until there are no more chunks to play.
The key here is using the TimeBase class properly to know when to make the transition. I think that should get rid of the annoying 1/4-second gap between chunks. I hope that works; let me know if it does, because it sounds really interesting.
EDIT: Player.prefetch() could also be useful here in reducing latency.
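For what it's worth, the rotation logic can be sketched in plain Java. The JSR-135 calls themselves (Manager.createPlayer(), Player.prefetch(), Player.start(), TimeBase) only run on a handset, so this models just the scheduling: two slots, one playing while the other preloads, swapping when the current chunk's duration has elapsed. All class and method names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPlayerRotation {
    private final List<String> log = new ArrayList<>();
    private int playingSlot = 0;  // slot currently playing
    private int loadedChunk = 1;  // next chunk, already prefetched in the idle slot

    /** Start chunk 0 in slot 0 and prefetch chunk 1 into slot 1. */
    public void begin() {
        log.add("slot0:play:chunk0");
        log.add("slot1:prefetch:chunk1");
    }

    /** Called when the TimeBase says the current chunk has ended. */
    public void onChunkEnd(int totalChunks) {
        if (loadedChunk >= totalChunks) return;
        int idle = 1 - playingSlot;
        log.add("slot" + idle + ":play:chunk" + loadedChunk);  // already prefetched
        loadedChunk++;
        if (loadedChunk < totalChunks) {
            // the slot that just finished playing now prefetches the next chunk
            log.add("slot" + playingSlot + ":prefetch:chunk" + loadedChunk);
        }
        playingSlot = idle;  // roles swap
    }

    public List<String> getLog() { return log; }

    public static void main(String[] args) {
        TwoPlayerRotation r = new TwoPlayerRotation();
        r.begin();
        for (int i = 0; i < 3; i++) r.onChunkEnd(4);
        System.out.println(r.getLog());
    }
}
```

The point of the swap is that the next Player is always in the prefetched state before it is needed, so the only remaining latency is the start() call itself.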
Some months ago, I wrote my own stream-source client in Java for streaming playlists to an Icecast2 server.
The logic is simple:
You have multiple "Channels", and every channel has a playlist (in this case, a folder filled with MP3 files). After a channel has started, it begins streaming by picking the first song and streaming it via HTTP to the Icecast2 server. As you can imagine, when a song ends, the next one is picked.
Here is the code which I am currently using for sending audio to icecast:
https://gist.github.com/z3ttee/e40f89b80af16715efa427ace43ed0b4
What I would like to achieve is to implement a crossfade between two songs. So when a song ends, it should fade out and fade in the next one simultaneously.
I am relatively new to working with audio in Java. What I do know is that I have to rework the way the audio is sent to Icecast. But there is the problem: I have no clue how or where to start.
If you have any idea where or how to start, feel free to share your experience.
Thank you in advance!
I think for cross-fading, you are likely going to have to use a library that works with the audio at the PCM level. If you wish to write your own mixer, the basic steps are as follows:
read the data via the input stream
using the stream's audio format, convert the audio to PCM
as PCM, the audio values can be mixed by simple addition -- so over the course of the cross-fade, ramp one side up from zero and the other down to zero
convert the audio back to the original format and stream that
A linear cross-fade, where the audio data is multiplied by steps that progress linearly from 0 to 1 or vice versa (e.g., 0.1, 0.2, 0.3, ...), will tend to leave the midpoint quieter than when running the beginning or ending track solo. A sine function is often used instead to keep the sum at a steady volume.
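A minimal sketch of that mixing step, assuming the audio has already been decoded to 16-bit PCM in short[] arrays of equal length (the class and method names here are mine, not from any library):

```java
public class CrossFade {

    /** Mix b into a across the whole buffer with a linear ramp. */
    public static short[] linear(short[] a, short[] b) {
        short[] out = new short[a.length];
        int last = Math.max(1, a.length - 1);
        for (int i = 0; i < a.length; i++) {
            double t = (double) i / last;              // fade position, 0..1
            out[i] = clamp(a[i] * (1.0 - t) + b[i] * t);
        }
        return out;
    }

    /** Equal-power ramp: cos/sin gains keep the summed power steady. */
    public static short[] equalPower(short[] a, short[] b) {
        short[] out = new short[a.length];
        int last = Math.max(1, a.length - 1);
        for (int i = 0; i < a.length; i++) {
            double t = (double) i / last;
            double gOut = Math.cos(t * Math.PI / 2);  // 1 -> 0
            double gIn  = Math.sin(t * Math.PI / 2);  // 0 -> 1
            out[i] = clamp(a[i] * gOut + b[i] * gIn);
        }
        return out;
    }

    /** Keep the mixed value inside the 16-bit range. */
    private static short clamp(double v) {
        return (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, Math.round(v)));
    }
}
```

On two equally loud signals, the linear version dips to half amplitude at the midpoint while the equal-power version stays near full perceived loudness, which is the quiet-midpoint effect described above.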
There are two libraries I know of that might be helpful for mixing, but would likely require some modification. One is TinySound, the other is AudioCue (which I wrote). The modifications required for AudioCue might be relatively painless. The output of the mixer is enclosed in the class AudioMixerPlayer, a runnable that is located on line 268 of AudioMixer.java. A possible plan would be to modify the output line of this code, substituting your broadcast line for the SourceDataLine.
I should add, the songs to be played would first be loaded into the AudioCue class, which then exposes the capability of real-time volume control. But it might be necessary to tinker with the manner in which the volume commands are issued.
I'm really interested in having this work and could offer some assistance. I'm just now getting involved in projects with Socket and SocketServer, and would like to get some hands-on with streaming audio.
I am developing a Java voice chat for a game, and I have a problem with mixing the audio when several players are talking at the same time. The audio is only sent to nearby players, so I'm storing each user's buffers separately on the client and sending the id along with the voice packet on the server. To play the audio back, I iterate over the list of users and check the buffers of existing users. However, I have a problem with the audio mixing; I am probably mixing it wrong. How should I mix these audio packets? The audio is 16-bit PCM. When several players are talking together, there is a lot of noise/hiss in the audio, and it is practically inaudible.
What would be the correct algorithm to apply to this mixer?
Based on my limited experience with mike inputs, my starting point would be to try the following steps:
convert 16-bit bytes to PCM
(consider applying a low-pass filter to the PCM, and possibly volume gain)
add the PCM values together from each line being mixed
convert PCM back to bytes
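Those steps can be sketched as follows for 16-bit little-endian PCM (byte order and framing are assumptions; check your AudioFormat). The hiss described in the question is often exactly what integer wrap-around sounds like when the summed samples overflow 16 bits, which is why the clamp in mix() matters.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class VoiceMixer {

    /** Step 1: decode little-endian 16-bit bytes into PCM samples. */
    public static short[] toPcm(byte[] bytes) {
        short[] pcm = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(pcm);
        return pcm;
    }

    /** Step 3: sum all lines, clamping so loud overlaps don't wrap around. */
    public static short[] mix(short[][] lines) {
        short[] out = new short[lines[0].length];
        for (int i = 0; i < out.length; i++) {
            int sum = 0;
            for (short[] line : lines) sum += line[i];
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return out;
    }

    /** Step 4: encode PCM back to little-endian bytes for playback. */
    public static byte[] toBytes(short[] pcm) {
        ByteBuffer buf = ByteBuffer.allocate(pcm.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (short s : pcm) buf.putShort(s);
        return buf.array();
    }
}
```

If the byte order or signedness in step 1 is wrong, every decoded sample is garbage and the mix will sound like pure noise, so that is the first thing to verify.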
I can't tell from your description if you are doing step 1 correctly.
A possible place for further research might be jitsi.org. Their service is written in Java and is open source; it would be interesting to know how they handle this. But it seems to me the most usual approach is that only one line is selected and played at any one time. There may be good reasons for this limitation, but I don't know whether they are technical (e.g., the noise accumulates in a way that drowns out the voices) or whether it's just that people talking at the same time create mass confusion quite easily. I suppose there may be echo/feedback considerations as well. I will be looking forward to seeing what information other people might contribute on this.
I've gone through the tutorials for the Java Sound API and I've successfully read off data from my microphone.
I would now like to go a step further and get data synchronously from multiple microphones in a microphone array (like a PS3 Eye or Respeaker)
I could get a TargetDataLine for each microphone and open/start/write the input to buffers - but I don't know how to do this in a way that will give me data that I can then line up time-wise (I would like to eventually do beamforming)
When reading from something like ALSA, I would get the bytes from the different microphones simultaneously, so I know that each byte from each microphone is from the same time instant. But the Java Sound API seems to have an abstraction that obfuscates this, because you are just dumping/writing data out of separate line buffers, and each line acts separately; you don't interact with the whole device/mic-array at once.
However, I've found someone who managed to do beamforming in Java with the Kinect 1.0, so I know it should be possible. The problem is that the secret sauce is inside a custom Mixer object inside a .jar that was pulled out of some other software, so I don't have any easy way to figure out how they pulled it off.
You will only be able to align data from multiple sources with the time synchronous accuracy to perform beam-forming if this is supported by the underlying hardware drivers.
If the underlying hardware provides you with multiple, synchronised, data-streams (e.g. recording in 2 channels - in stereo), then your array data will be time synchronised.
If you are relying on the OS to simply provide you with two independent streams, then maybe you can rely on timestamping. Do you get the timestamp of the first element? If so, then you can re-align the data by dropping samples based on your sample rate. There may be a final difference (delta-t) that you will have to factor in to your beam-forming algorithm.
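That re-alignment step can be sketched like this, assuming you can obtain a start timestamp for each stream (the class and method names are hypothetical):

```java
public class StreamAlign {

    /** Whole samples to drop from the stream that started earlier. */
    public static long samplesToDrop(double startEarlierSec, double startLaterSec,
                                     double sampleRate) {
        return (long) Math.floor((startLaterSec - startEarlierSec) * sampleRate);
    }

    /** Residual sub-sample offset (delta-t) left over after dropping,
     *  which the beam-forming algorithm has to factor in. */
    public static double residualSec(double startEarlierSec, double startLaterSec,
                                     double sampleRate) {
        double diff = startLaterSec - startEarlierSec;
        return diff - samplesToDrop(startEarlierSec, startLaterSec, sampleRate) / sampleRate;
    }
}
```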
Reading about the PS3 Eye (which has an array of microphones), you will be able to do this if the audio driver provides all the channels at once.
For Java, this probably means: can you open the line with an AudioFormat that includes 4 channels? If yes, then each frame in your buffer will contain a sample for every channel, and the decoded frame data will (almost certainly) be time-aligned.
To quote the Java docs : "A frame contains the data for all channels at a particular time".
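As a sketch of what that looks like with javax.sound.sampled (whether the device actually accepts a 4-channel format is up to the driver, so treat this as illustrative): within one frame, the samples for channels 0..3 sit next to each other, so samples at the same frame index are time-aligned across microphones. 16-bit signed little-endian data is assumed here.

```java
import javax.sound.sampled.AudioFormat;

public class MicArray {

    // 4 channels at 16 bits each -> a frame size of 8 bytes.
    public static AudioFormat fourChannelFormat(float sampleRate) {
        return new AudioFormat(sampleRate, 16, 4, true, false); // signed, little-endian
    }

    /** Split an interleaved 16-bit buffer into one short[] per channel. */
    public static short[][] deinterleave(byte[] buf, int channels) {
        int frames = buf.length / (2 * channels);
        short[][] out = new short[channels][frames];
        for (int f = 0; f < frames; f++) {
            for (int c = 0; c < channels; c++) {
                int i = (f * channels + c) * 2;
                out[c][f] = (short) ((buf[i] & 0xFF) | (buf[i + 1] << 8));
            }
        }
        return out;
    }
}
```

After deinterleaving, out[c][f] for every channel c refers to the same instant in time, which is exactly what a beam-former needs.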
I don't know what "beamforming" is, but if there is hardware that can provide synchronization, using it would obviously be the best solution.
Here, for what it is worth, is what should be a plausible algorithmic way to manage synchronization.
(1) Set up a frame counter for each TargetDataLine. You will have to convert bytes to PCM as part of this process.
(2) Set up some code to monitor the volume level on each line, some sort of RMS algorithm I would assume, on the PCM data.
(3) Create a loud, instantaneous burst that reaches each microphone at the same time, one that the RMS algorithm is able to detect and to give the frame count for the onset.
(4) Adjust the frame counters as needed, and reference them going forward on each line of incoming data.
Rationale: Java doesn't offer real-time guarantees, as explained in this article on real-time, low latency audio processing. But in my experience, the correspondence between the byte data and time (per the sample rate) is very accurate on lines closest to where Java interfaces with external audio services.
How long would frame counting remain accurate without drifting? I have never done any tests to research this. But on a practical level, I have coded a fully satisfactory "audio event" scheduler based on frame-counting, for playing multipart scores via real-time synthesis (all done with Java), and the timing is impeccable for the longest compositions attempted (6-7 minutes in length).
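Steps (2) through (4) above might look roughly like this; the window size and threshold are arbitrary values that would need tuning for a real setup:

```java
public class OnsetAligner {

    /** RMS of one window of PCM samples, normalized to the 0..1 range. */
    public static double rms(short[] pcm, int start, int len) {
        double sum = 0;
        for (int i = start; i < start + len; i++) {
            double s = pcm[i] / 32768.0;
            sum += s * s;
        }
        return Math.sqrt(sum / len);
    }

    /** Frame index of the first window whose RMS exceeds the threshold, or -1. */
    public static int onsetFrame(short[] pcm, int window, double threshold) {
        for (int start = 0; start + window <= pcm.length; start += window) {
            if (rms(pcm, start, window) > threshold) return start;
        }
        return -1;
    }
}
```

Running onsetFrame over each line's buffer after the burst gives one frame count per line; subtracting them yields the per-line correction to apply to the frame counters going forward.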
Android provides a default of 15 steps for its sound systems which you can access through Audio Manager. However, I would like to have finer control.
One method of doing so seems to be altering specific files within the Android system to divide the sound levels even further than the default. I would like to achieve the same effect programmatically, using Java.
Fine volume control would mean, for example, being able to divide the sound levels into one hundred distinct intervals. How do I achieve this?
One way, in Java, to get very precise volume adjustment is to access the PCM data directly and multiply it by some factor, usually from 0 up to 1. Another is to try to access the line's volume control, if it has one. I've given up trying to do the latter. The precision is okay in terms of amplitude, but the timing is terrible: one can only have one volume change per audio buffer read.
To access the PCM data directly, one has to iterate through the audio read buffer, translate the bytes into PCM, perform the multiplication then translate back to bytes. But this gives you per-frame control, so very smooth and fast fades can be made.
EDIT: To do this in Java, first check out the sample code snippet at the start of this java tutorial link, in particular, the section with the comment
// Here, do something useful with the audio data that's now in the audioBytes array...
There are several StackOverflow questions that show code for the math to convert audio bytes to PCM and back, using Java. Should not be hard to uncover with a search.
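As a sketch of the per-frame gain idea, assuming mono 16-bit little-endian PCM (the class and method names are mine, and the interpolation is the simplest possible linear ramp):

```java
public class PcmGain {

    /** Scale 16-bit little-endian PCM from startGain to endGain across the buffer. */
    public static byte[] ramp(byte[] audioBytes, double startGain, double endGain) {
        byte[] out = new byte[audioBytes.length];
        int frames = audioBytes.length / 2;
        for (int f = 0; f < frames; f++) {
            int i = f * 2;
            // decode one sample
            short s = (short) ((audioBytes[i] & 0xFF) | (audioBytes[i + 1] << 8));
            // a slightly different gain per frame -> click-free fades
            double gain = startGain + (endGain - startGain) * f / Math.max(1, frames - 1);
            int v = (int) Math.round(s * gain);
            v = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, v));
            // re-encode
            out[i] = (byte) v;
            out[i + 1] = (byte) (v >> 8);
        }
        return out;
    }
}
```

With a constant gain this is just a volume knob; varying the gain per frame is what makes fades smooth instead of stepping once per buffer.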
Pretty late to the party, but I'm currently trying to solve this issue as well. If you are making your own media-player app and are running an instance of a MediaPlayer, then you can use the function setVolume(leftScalar, rightScalar), where leftScalar and rightScalar are floats in the range 0.0 to 1.0, representing logarithmic-scale volume for each respective ear.
HOWEVER, this means that you must have a reference to the currently active MediaPlayer instance. If you are making a music app, no biggie. If you're trying to run a background service that allows users to give higher precision over all media output, I'm not sure how to use this in that scenario.
Hope this helps.
I am trying to implement my own remote desktop solution in java. Using sockets and TCP/UDP.
I know I could use VNC or something else, but it's an assignment from school that I want to do myself.
So for moving the mouse and clicking I could use the Robot class. I have two questions about this:
What about sending the video? I know the Robot class can capture the screen too, so should I just send images in a sequence and display in order at the other side of the connection? Is this the best way to implement remote desktop?
Also should I use TCP or UDP?
I think UDP would be harder to implement since I will have to figure out which image comes after the other.
What you are trying to do will work, but incredibly slowly. The images must be compressed before you send them over the net. Before compressing, the number of colors should be reduced. Also, only the portions of the image that have changed since the last update should be sent.
When transferring mouse coordinates, an update should only occur if the new mouse position is more than x pixels away from the last position, or more than y seconds have passed. Otherwise you spend so much traffic on the mouse position that there is no room for the images.
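That throttling rule can be sketched as a small helper; the threshold values are placeholders to tune against your bandwidth:

```java
public class MouseThrottle {
    private final int moveThreshold;   // x pixels
    private final long maxIntervalMs;  // y seconds, in ms
    private int lastX, lastY;
    private long lastSent;
    private boolean sentAny = false;

    public MouseThrottle(int moveThreshold, long maxIntervalMs) {
        this.moveThreshold = moveThreshold;
        this.maxIntervalMs = maxIntervalMs;
    }

    /** Returns true if this position should be sent over the wire now. */
    public boolean shouldSend(int x, int y, long nowMs) {
        boolean moved = Math.hypot(x - lastX, y - lastY) > moveThreshold;
        boolean due = sentAny && nowMs - lastSent >= maxIntervalMs;
        if (!sentAny || moved || due) {
            lastX = x;
            lastY = y;
            lastSent = nowMs;
            sentAny = true;
            return true;
        }
        return false;
    }
}
```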
UDP will be the best solution here, since it is fastest for video streaming (which is what you are effectively doing).
About 2:
UDP would be much harder: since it's a datagram-based protocol, there are limits on how much data you can send at a time, and it's not very likely that you'll be able to fit entire images into single datagrams. So you're going to have to work with differential/partial updates, which becomes complicated pretty quickly.
TCP, however, is stream-based and only delivers data in-order. If a packet in the middle disappears and needs to be re-sent, all following packets need to wait, even if they've been received by the target machine. This creates lag, which is often very undesirable in interactive applications.
So UDP is probably your best choice, but you can't design it around the assumption that you can send whole images at a time, you need to come up with a way to send just parts of images.
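One simple way to "send just parts of images" is tile-based diffing: compare the previous and current screen capture tile by tile and only transmit the tiles that changed. A naive sketch (the tile size and pixel-by-pixel comparison are deliberately simple; real implementations use hashing or smarter region merging):

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

public class TileDiff {

    /** Returns (tileX, tileY) pairs for tiles that differ between frames. */
    public static List<int[]> changedTiles(BufferedImage prev, BufferedImage cur, int tile) {
        List<int[]> dirty = new ArrayList<>();
        for (int ty = 0; ty < cur.getHeight(); ty += tile) {
            for (int tx = 0; tx < cur.getWidth(); tx += tile) {
                int w = Math.min(tile, cur.getWidth() - tx);
                int h = Math.min(tile, cur.getHeight() - ty);
                if (!sameRegion(prev, cur, tx, ty, w, h)) {
                    dirty.add(new int[]{tx, ty});
                }
            }
        }
        return dirty;
    }

    private static boolean sameRegion(BufferedImage a, BufferedImage b,
                                      int x0, int y0, int w, int h) {
        for (int y = y0; y < y0 + h; y++)
            for (int x = x0; x < x0 + w; x++)
                if (a.getRGB(x, y) != b.getRGB(x, y)) return false;
        return true;
    }
}
```

Each dirty tile is small enough to compress and send in one datagram along with its coordinates, which sidesteps the whole-image-per-packet problem.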
Basically, a video is a sequence of images (frames) displayed per second. You should send as many as your bandwidth allows.
On the other hand, there is no point in sending the raw image; you should compress it as much as you can, and definitely consider losing a lot of resolution in the process.
You can take a look at this SO question about image compression; if you compress enough, you may get reasonably smooth video.
It would be better to use Google Protocol Buffers or Apache Thrift. You will send binary data, which is smaller; this will make your software faster.