I have a stream that contains both audio and video. The video encoding is H.264 AVC and the audio encoding is G.711 µ-law (PCMU). (I have no control over the output format.)
When I try to display the video in a VideoView, the device says that it cannot play the video. I guess that's because Android doesn't support the aforementioned audio encoding.
Is there a way to somehow display the selected stream in the VideoView?
Can I just use a Java code snippet to 'compute' the codec? I've seen the class android.net.rtp.AudioCodec, which contains a field PCMU, and I can imagine that class would be useful.
Or do I have to include a library or some native code (FFmpeg) in my application and use that? If so, how?
How can I fix this?
If you want to manipulate a stream before you display it, you are unfortunately getting into tricky territory on an Android device. If you have any possibility of doing the conversion on the server side, it will likely be much easier (and you usually have more 'horsepower' on the server side too).
Assuming that you do need to convert on the device, a technique you can use is to stream the video from the server to your app, convert it as needed, and then 'stream' from a localhost server in your app to a VideoView in your app. Roughly, the steps are:
Stream the encoded file from the server as usual, 'chunk by chunk'
On your Android device, read from the stream and convert each chunk as it is received
Using a localhost HTTP server on your Android device, 'serve' the converted chunks to the MediaPlayer (the media player should be set up to use a URL pointing at your localhost HTTP server); a minimal sketch of this step follows below
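To make the localhost step concrete, here is a minimal sketch of such a server. The openSourceStream() and convertChunk() methods are hypothetical placeholders for the network read and the actual conversion; a real implementation would also need to parse the Range headers MediaPlayer sends and handle each client on its own thread.

import java.io.*;
import java.net.*;

public class LocalProxyServer extends Thread {
    @Override
    public void run() {
        try (ServerSocket server =
                 new ServerSocket(8080, 0, InetAddress.getByName("127.0.0.1"))) {
            while (!isInterrupted()) {
                try (Socket client = server.accept();
                     BufferedReader request = new BufferedReader(
                         new InputStreamReader(client.getInputStream()));
                     OutputStream out = client.getOutputStream();
                     InputStream source = openSourceStream()) {
                    // Skip the request line and headers (real code should parse them).
                    String line;
                    while ((line = request.readLine()) != null && !line.isEmpty()) { }
                    out.write(("HTTP/1.1 200 OK\r\n"
                             + "Content-Type: video/mp4\r\n\r\n").getBytes("UTF-8"));
                    byte[] chunk = new byte[64 * 1024];
                    int n;
                    while ((n = source.read(chunk)) != -1) {
                        out.write(convertChunk(chunk, n)); // convert as each chunk arrives
                    }
                    out.flush();
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private InputStream openSourceStream() throws IOException {
        return new URL("http://example.com/source-stream").openStream(); // placeholder
    }

    private byte[] convertChunk(byte[] data, int length) {
        // Placeholder: hand the chunk to your converter (e.g. an ffmpeg wrapper).
        return java.util.Arrays.copyOf(data, length);
    }
}

The VideoView is then pointed at the local server, e.g. videoView.setVideoURI(Uri.parse("http://127.0.0.1:8080/")).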
An example of this approach, I believe, is LibMedia: http://libeasy.alwaysdata.net (note, this is not a free library, AFAIK).
For the actual conversion you can use ffmpeg, as you suggest; there are a number of ffmpeg-wrapper-for-Android projects that you can either use or study to design your own. Some examples:
http://hiteshsondhi88.github.io/ffmpeg-android-java/
https://github.com/jhotovy/android-ffmpeg
Take a look at the note about libffmpeginvoke in the second link in particular if you are planning to write your own wrapper.
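As an illustration of the conversion itself, the command below copies the H.264 video untouched and re-encodes only the G.711 audio to AAC, which MediaPlayer supports. The FFmpeg and ExecuteBinaryResponseHandler classes follow the API of the ffmpeg-android-java wrapper from the first link; treat the exact class and method names as an assumption and check that project's README for the version you use. The file paths are placeholders.

import android.content.Context;
import com.github.hiteshsondhi88.libffmpeg.ExecuteBinaryResponseHandler;
import com.github.hiteshsondhi88.libffmpeg.FFmpeg;

public class AudioTranscoder {
    // Re-mux the video, re-encode the audio: G.711 u-law (PCMU) -> AAC.
    // NOTE: the wrapper requires a one-time ffmpeg.loadBinary(...) call
    // before execute() can be used.
    public static void transcode(Context context) throws Exception {
        FFmpeg ffmpeg = FFmpeg.getInstance(context);
        String[] cmd = {
            "-i", "/sdcard/input",      // placeholder input
            "-c:v", "copy",             // leave the H.264 track untouched
            "-c:a", "aac",              // transcode PCMU -> AAC
            "-b:a", "128k",
            "/sdcard/output.mp4"        // placeholder output
        };
        ffmpeg.execute(cmd, new ExecuteBinaryResponseHandler() {
            @Override
            public void onSuccess(String message) {
                // hand the converted file to the VideoView
            }

            @Override
            public void onFailure(String message) {
                // inspect ffmpeg's error output here
            }
        });
    }
}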
One thing to be aware of is that compression techniques, in particular video compression, often use an approach where one packet or frame is compressed relative to the frames before it (and sometimes even the frames after it). For example, the first frame might be a 'key frame', the next five frames might just contain data explaining how they differ from the key frame, the seventh frame might be another key frame, and so on. This means you generally want each 'chunk' to contain all the required key frames if you are converting the chunk from one format to another.
I don't think you will have this problem converting from PCM to AAC audio, but it is useful to be aware of the concept anyway.
Related
I know that ExoPlayer has support for RTSP, but I need C++ code that works with players on lots of OSs, so I need to parse the RTP packets into NAL units in C++ before passing them to ExoPlayer.
I found a way to decode RTP packets using live555 and extract their NAL units. According to ExoPlayer's documentation:
Components common to all ExoPlayer implementations are:
A MediaSource that defines the media to be played, loads the media, and from which the loaded media can be read.
A MediaSource is injected via ExoPlayer.prepare at the start of playback. ...
So I need a custom MediaSource that can extract NAL units from my C++ code.
At the class reference for MediaSource we can see that there are already some MediaSources available. I thought maybe SmoothStreaming MediaSource could work, but there's no description of what exactly it does, and its constructor requires a Uri or an SsManifest (what is that?).
I can see that there is a NAL unit utility in this library, so maybe things are already half done.
So how do I build, or use an already available, MediaSource that reads NAL units for ExoPlayer to play?
Additionally, how would you pass the NAL units from C++ to Java? In the code I found, they are simply stored in a C++ buffer. Should I read this buffer from Java somehow?
UPDATE:
I've been researching how this library works. It all begins with a MediaSource, which has objects like Extractor and DataSource. It seems that ExtractorMediaSource is a MediaSource where you can provide your own Extractor and DataSource.
As I understand it, a DataSource is a class that gets the raw bytes from any possible place, be it a file read or a network packet. Judging from the Extractor classes available in the library, like Mp4Extractor and Mp3Extractor, an Extractor is something that interprets the data read from the DataSource. The two main methods of the Extractor interface are:
void init(ExtractorOutput output)
int read(ExtractorInput input, PositionHolder seekPosition)
I don't know what ExtractorInput and ExtractorOutput are for, but they look important.
So somehow the Extractor reads from the DataSource, parses the data, and sends it to the Renderer in a common format?
I need to know what this common format is, so I can parse the NAL units that I read from a custom DataSource.
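To make those interfaces concrete, here is a hedged skeleton of a custom Extractor that forwards whole NAL units as video samples. The method set is the ExoPlayer 2 Extractor interface quoted above (exact signatures vary between library versions), the Format values are placeholders, and real code would need proper, increasing timestamps derived from the RTP headers rather than the constant used here.

import java.io.IOException;
import com.google.android.exoplayer2.C;
import com.google.android.exoplayer2.Format;
import com.google.android.exoplayer2.extractor.*;
import com.google.android.exoplayer2.util.MimeTypes;
import com.google.android.exoplayer2.util.ParsableByteArray;

public class NalUnitExtractor implements Extractor {
    private TrackOutput trackOutput;

    @Override
    public boolean sniff(ExtractorInput input) {
        return true; // we already know the DataSource delivers an H.264 stream
    }

    @Override
    public void init(ExtractorOutput output) {
        trackOutput = output.track(0, C.TRACK_TYPE_VIDEO);
        // Placeholder dimensions; in practice parse them from the SPS.
        trackOutput.format(Format.createVideoSampleFormat(
            null, MimeTypes.VIDEO_H264, null, Format.NO_VALUE,
            Format.NO_VALUE, 1920, 1080, Format.NO_VALUE, null, null));
        output.endTracks();
        output.seekMap(new SeekMap.Unseekable(C.TIME_UNSET)); // live: not seekable
    }

    @Override
    public int read(ExtractorInput input, PositionHolder seekPosition)
            throws IOException, InterruptedException {
        byte[] buffer = new byte[64 * 1024];
        int bytesRead = input.read(buffer, 0, buffer.length);
        if (bytesRead == C.RESULT_END_OF_INPUT) {
            return RESULT_END_OF_INPUT;
        }
        // Forward the NAL unit as one sample; timestamp and flags are placeholders.
        trackOutput.sampleData(new ParsableByteArray(buffer, bytesRead), bytesRead);
        trackOutput.sampleMetadata(0, C.BUFFER_FLAG_KEY_FRAME, bytesRead, 0, null);
        return RESULT_CONTINUE;
    }

    @Override
    public void seek(long position, long timeUs) { }

    @Override
    public void release() { }
}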
You cannot play NAL units or raw H.264 stream with ExoPlayer. The picture data must exist within a supported container/format.
It's not clear what your mysterious C++ code is doing. Is it an NDK setup? What's its role in Android decoding? Are you saying you're unable to pass a data array [from the C++ function] into some Android function as a function parameter? Is it something like this Java to C++ setup? It's not clear what your real problem is...
If you insist on ExoPlayer, I can tell you that FLV (on the supported containers list) might be the best option, since it can be built in real time (re-muxing). You first create an FLV header that holds the SPS and PPS data, followed by a keyframe (the H.264 data extracted from the first NAL). You'll have to get familiar with the FLV byte structure, but each frame header is around 13 bytes, followed by the NAL data; repeat for each frame until the end. This would be real-time transcoding. A sketch of the per-frame tag layout follows below.
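As a sketch of that byte structure (under my reading of the FLV spec, not from the question's code): each video tag is an 11-byte tag header, a 5-byte AVC video header, then the length-prefixed NAL data, then a 4-byte back-pointer. Real output must start with the FLV file header and an AVCDecoderConfigurationRecord tag carrying the SPS/PPS before any frames.

import java.io.DataOutputStream;
import java.io.IOException;

public class FlvTagWriter {
    // Wrap one H.264 NAL unit in an FLV video tag.
    public static void writeVideoTag(DataOutputStream out, byte[] nal,
                                     int timestampMs, boolean keyFrame)
            throws IOException {
        int dataSize = 5 + 4 + nal.length;          // video header + length prefix + NAL
        out.writeByte(9);                           // TagType 9 = video
        writeUInt24(out, dataSize);                 // DataSize
        writeUInt24(out, timestampMs & 0xFFFFFF);   // Timestamp (lower 24 bits)
        out.writeByte((timestampMs >>> 24) & 0xFF); // TimestampExtended
        writeUInt24(out, 0);                        // StreamID, always 0
        int frameType = keyFrame ? 1 : 2;           // 1 = key frame, 2 = inter frame
        out.writeByte((frameType << 4) | 7);        // CodecID 7 = AVC
        out.writeByte(1);                           // AVCPacketType 1 = NALU
        writeUInt24(out, 0);                        // CompositionTime
        out.writeInt(nal.length);                   // 4-byte NAL length prefix
        out.write(nal);                             // the NAL unit itself
        out.writeInt(11 + dataSize);                // PreviousTagSize back-pointer
    }

    private static void writeUInt24(DataOutputStream out, int v) throws IOException {
        out.writeByte((v >>> 16) & 0xFF);
        out.writeByte((v >>> 8) & 0xFF);
        out.writeByte(v & 0xFF);
    }
}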
As a second option, on Android you could just use MediaCodec to decode the H.264 extracted from the NAL units. Here is a useful example source. Just use it as:
MediaCodec.createDecoderByType("video/avc"); //then later give NAL units
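Expanding that call into a minimal decode loop. This is a sketch: the width/height are placeholders, each input buffer is assumed to hold one Annex-B NAL unit (prefixed with a 00 00 00 01 start code), and the decoder must see the SPS/PPS NAL units before any frame data.

import android.media.MediaCodec;
import android.media.MediaFormat;
import android.view.Surface;
import java.nio.ByteBuffer;

public class NalDecoder {
    private MediaCodec codec;

    public void start(Surface surface) throws Exception {
        codec = MediaCodec.createDecoderByType("video/avc");
        // Placeholder dimensions; in practice parse them from the SPS.
        MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1920, 1080);
        codec.configure(format, surface, null, 0);
        codec.start();
    }

    public void feed(byte[] nal, long presentationTimeUs) {
        int inIndex = codec.dequeueInputBuffer(10000);
        if (inIndex >= 0) {
            ByteBuffer buffer = codec.getInputBuffer(inIndex); // API 21+
            buffer.clear();
            buffer.put(nal);
            codec.queueInputBuffer(inIndex, 0, nal.length, presentationTimeUs, 0);
        }
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        int outIndex = codec.dequeueOutputBuffer(info, 10000);
        if (outIndex >= 0) {
            codec.releaseOutputBuffer(outIndex, true); // render to the Surface
        }
    }
}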
Study also the functions of this other source for ideas on how to decode via Android's own decoder.
Other starting points:
Android Decode raw h264 stream with MediaCodec (NAL units).
Decode H264 raw stream using mediacodec
I did something very similar; you can check my blog post about some aspects of customizing ExoPlayer (https://medium.com/@mahmoud.mohamed.bahaa/diving-into-exoplayer-getting-more-control-over-the-framework-cac436d1472c)
I'm trying to record from the microphone to a wav file as per this example. At the same time, I need to be able to test the input level/volume and send an alert if it's too low. I've tried what's described in this link and it seems to work OK.
The issue comes when trying to record and read bytes at the same time using one TargetDataLine (bytes read for monitoring are being skipped for recording, and vice versa).
Another thing is that these are long processes (probably hours), so memory usage should be considered.
How should I proceed here? Any way to clone TargetDataLine? Can I buffer a number of bytes while writing them with AudioSystem.write()? Is there any other way to write to a .wav file without filling the system memory?
Thanks!
If you are using a TargetDataLine for capturing audio similar to the example given in the Java Tutorials, then you have access to a byte array called "data". You can loop through this array to test the volume level before outputting it.
To do the volume testing, you will have to convert the bytes to some sort of sensible PCM data. For example, if the format is 16-bit stereo little-endian, you might take two bytes and assemble them into either a signed short or a signed, normalized float, and then test that.
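A minimal sketch of that test, assuming 16-bit little-endian PCM (stereo data interleaves both channels through the same loop): assemble each byte pair into a signed sample, normalize, and compute the RMS level.

public class LevelMeter {
    // Returns the RMS level in [0, 1]; near 0 means silence.
    public static double rmsLevel(byte[] data, int bytesRead) {
        double sum = 0;
        int samples = bytesRead / 2;
        for (int i = 0; i + 1 < bytesRead; i += 2) {
            short sample = (short) ((data[i] & 0xFF) | (data[i + 1] << 8)); // little-endian
            double s = sample / 32768.0; // normalize to [-1, 1]
            sum += s * s;
        }
        return samples == 0 ? 0 : Math.sqrt(sum / samples);
    }
}

If the returned level stays below some threshold (say 0.01) for several consecutive buffers, fire the low-input alert.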
I apologize for not looking more closely at your examples before posting my "solution".
I'm going to suggest that you extend InputStream, making a customized version that also performs the volume test. Override the read method so that each byte it returns first passes through your volume-testing code. You'll have to modify the volume-testing code to work on a per-byte basis and to pass through the required byte.
You should then be able to use this extended InputStream as an argument when you create the AudioInputStream for the output-to-wav stage.
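A sketch of that extended InputStream, written here as a FilterInputStream so both read() variants pass through one monitoring point (monitor() is a placeholder for your volume test):

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class LevelMonitoringInputStream extends FilterInputStream {
    public LevelMonitoringInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        byte[] single = new byte[1];
        int n = read(single, 0, 1); // funnel through the array read below
        return (n == -1) ? -1 : (single[0] & 0xFF);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = in.read(b, off, len);
        if (n > 0) {
            monitor(b, off, n); // volume-test the bytes passing through
        }
        return n;
    }

    private void monitor(byte[] b, int off, int n) {
        // Placeholder: e.g. compute an RMS level and alert if it stays too low.
    }
}

The output stage then becomes something like new AudioInputStream(new LevelMonitoringInputStream(new AudioInputStream(line)), format, AudioSystem.NOT_SPECIFIED), passed to AudioSystem.write() as before.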
I've used this approach to save audio successfully from two data sources: once from an array populated beforehand, and once from a streaming audio mix passing through a "mixer" I wrote to combine audio data sources. The latter is closer to what you need to do. I haven't done it from a microphone source, but the same approach should work, as far as I can tell.
So, I went over Java's sound tutorial and did not find it all that helpful.
Anyways, what I understood from the tutorial for recording sound from a mic is this:
Although they do show how to get a target data line and so on, they do not explain how you can actually record sound [or maybe I didn't understand it well].
My understanding so far has been this:
A Mixer can be your sound card or the sound software drivers, and is used to process sound, whether input or output
A TargetDataLine is used when you want to get sound into the computer, e.g. to save it to disk
A Port is where your external devices, like a mic, are connected
Problems that remain
How do I select the proper mixer? Java's tutorial says to get all the available mixers and query each one to see if it has what you want. That's quite vague for a beginner.
How do I get the port on which my integrated mic is? Specifically, how do I get input from it into the mixer?
How do I output this to the disk?
Using the AudioSystem.getTargetDataLine(AudioFormat format) method you will get
... a target data line that can be used for recording audio data in the format specified by the AudioFormat object. The returned line will be provided by the default system mixer, or, if not possible, by any other mixer installed in the system that supports a matching TargetDataLine object.
See the accepted answer for Java Sound API - capturing microphone for an example of this.
If you want more control of which data line to use you can enumerate all the mixers and the data lines they support and pick the one you want. Here is some more information regarding how you would go about doing that: Java - recording from mixer
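For illustration, a minimal enumeration sketch using the standard javax.sound.sampled API (the format values are placeholders):

import javax.sound.sampled.*;

public class ListCaptureMixers {
    public static void main(String[] args) throws Exception {
        AudioFormat format = new AudioFormat(44100f, 16, 2, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            if (mixer.isLineSupported(info)) {
                // This mixer can capture audio in the requested format.
                System.out.println(mixerInfo.getName() + " - "
                        + mixerInfo.getDescription());
                // To use it: TargetDataLine line = (TargetDataLine) mixer.getLine(info);
            }
        }
    }
}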
Once you've obtained the TargetDataLine you should open() it, and then call read() repeatedly to obtain data from that line. The byte[] that you fill with each call to read() can be written to disk, e.g. through a FileOutputStream; a minimal sketch follows.
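A sketch of exactly that loop. Note the raw bytes written this way are headerless PCM; to get a playable .wav, either wrap the line in an AudioInputStream and hand it to AudioSystem.write(stream, AudioFileFormat.Type.WAVE, file), or prepend a WAV header yourself. The format values and file name are placeholders.

import java.io.FileOutputStream;
import javax.sound.sampled.*;

public class CaptureLoop {
    private volatile boolean running = true;

    public void capture() throws Exception {
        AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
        TargetDataLine line = AudioSystem.getTargetDataLine(format);
        line.open(format);
        line.start();
        byte[] buffer = new byte[line.getBufferSize() / 5];
        try (FileOutputStream out = new FileOutputStream("capture.pcm")) {
            while (running) {
                int n = line.read(buffer, 0, buffer.length); // blocks until data arrives
                out.write(buffer, 0, n);
            }
        } finally {
            line.stop();
            line.close();
        }
    }

    public void stopCapture() {
        running = false;
    }
}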
I want to split a video file into multiple parts and then rejoin some of them to make a new video file.
I am doing it by looping over the packets using Xuggler and then writing some of them (after adjusting their timestamps) to the new file. But when I play the file, there is some disturbance in the transition frames. (It might be because decoding a frame depends on its preceding frame, which has been discarded by the program.)
How can I get rid of the disturbance?
Ideally you split on keyframes, since they usually don't depend on preceding frames.
The IPacket class has an isKey function to test for this condition.
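A sketch of using it to pick split points while looping over packets (the input name and segment handling are placeholders):

import com.xuggle.xuggler.IContainer;
import com.xuggle.xuggler.IPacket;

public class KeyFrameSplitter {
    public static void main(String[] args) {
        IContainer container = IContainer.make();
        if (container.open("in.mp4", IContainer.Type.READ, null) < 0) {
            throw new RuntimeException("could not open input");
        }
        IPacket packet = IPacket.make();
        while (container.readNextPacket(packet) >= 0) {
            if (packet.isKey()) {
                // Safe split point: close the current output segment here and
                // open the next one before writing this packet.
            }
            // Write the packet (with adjusted timestamps) to the current segment.
        }
        container.close();
    }
}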
I'm not sure what sort of compression format you are working with, though. I have tried splitting an mp4 stream with Xuggler and found the results to be quite buggy.
I want to write a program that will be able to call into my company's bi-weekly conference calls, and record the call, so it can then be made into a podcast.
I am thinking of using Gizmo's SIP interface (and the fact that it lets you make toll-free calls for free), but I am having trouble finding any example code (preferably in Java) that can make an audio call and get hold of the audio stream.
I have seen plenty of SIP programming tutorials that deal with establishing a session, and then they seem to just do some hand waving, and say "here is where you can establish the audio connection" without actually doing it.
I am experienced in Java, so I would prefer to use it, but other language suggestions are welcome as well.
I have never written a VOIP application, so I'm not really sure where to start. Can anyone suggest a good library or other resource that would help me get started?
Thanks!
Look for a VOIP softphone written in Java, then modify it to save the final audio stream instead of sending it to be played.
Side note: In many states you would be violating the law unless you do one of several things, varying by state: Notify the participants they're being recorded, insert BEEPs every N seconds, both, etc. Probably you only have to comply with the laws of the state you're calling from. Even worse, you may need to allow the users to decline recording (requires you to be there before recording starts). If you control the conference server, you may be able to get it to play a canned announcement that the call is being recorded.
You could do this with Twilio with almost no programming whatsoever. It will cost you 3¢ per minute, so if your company's bi-weekly call is 45 minutes long, you're looking at $1.35 per call, about as close to free as possible. Here are the steps:
Sign up for Twilio and make note of your Account ID and token
Create a publicly accessible file on your web server that does nothing but output the following XML (see the documentation for explanation of the record parameters):
<Response>
<Record timeout="30" finishOnKey="#" />
</Response>
When it's time to start the recording, perform a POST to this URL (documented here) with your browser or set up an automated process or script to do it for you:
POST http://api.twilio.com/2008-08-01/Accounts/ACCOUNT SID HERE/Calls HTTP/1.1
Called=CONFERENCE NUMBER HERE
&Url=WEB PAGE HERE
&Method=GET
&SendDigits=PIN CODE HERE
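If you'd rather script it than use the browser, the same POST can be issued from Java roughly like this (the SID, token, numbers and URL are placeholders, the 2008-08-01 API version is the one from the request above, and Twilio authenticates with HTTP Basic auth):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.Base64;

public class StartRecordingCall {
    public static void main(String[] args) throws Exception {
        String accountSid = "ACCOUNT_SID_HERE";
        String authToken = "AUTH_TOKEN_HERE";
        URL url = new URL("http://api.twilio.com/2008-08-01/Accounts/"
                + accountSid + "/Calls");
        String body = "Called=" + URLEncoder.encode("CONFERENCE_NUMBER_HERE", "UTF-8")
                + "&Url=" + URLEncoder.encode("http://example.com/record.xml", "UTF-8")
                + "&Method=GET"
                + "&SendDigits=" + URLEncoder.encode("PIN_CODE_HERE", "UTF-8");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        String auth = Base64.getEncoder().encodeToString(
                (accountSid + ":" + authToken).getBytes("UTF-8"));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes("UTF-8"));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}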
If you want to get really creative, you can write code to handle the result of the Record verb and email the link to the MP3 or WAV file that Twilio hosts for you. But if this is a one-off, you can skip that, because you can access all your recordings in your account's control panel anyway.
Try peers with the mediaDebug option set to true in peers.xml. This option records all outgoing and incoming media streams in a media/ folder, with a date pattern for the file name. Nevertheless, this file will probably not be usable as is: it contains raw, uncompressed linear PCM samples. You can use Audacity, sox or ffmpeg to convert it to whatever you want.
https://voip.dev.java.net/
They have some sample code there.