I am working on a speech recognizer project. As part of it, I want to scan a WAV file and detect where there is silence and where there is a word. When a word is found, I want to copy it from start to end into a new WAV file, so if the original WAV file has 10 words, the output is 10 files. The problem is detecting the silence or the word.
I would like suggestions on how to implement this in Java.
Please suggest.
Well, wav is just PCM data. I'd start by reading this:
http://en.wikipedia.org/wiki/Pulse-code_modulation
I've done this before...
You start by pulling samples out of the PCM data. You then check each one to see whether it is greater than a threshold value that you've set. For instance, assuming 16-bit samples, any magnitude from zero to 15000 is silence and anything above 15000 is sound. Just remember that 16-bit PCM samples are signed, so compare the absolute value or the negative half of the waveform will always look like silence. Also, remember log vs linear when you're playing with the threshold.
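The thresholding idea above can be sketched roughly as follows. This assumes 16-bit signed little-endian PCM that has already been read out of the WAV file; the class name SilenceDetector, the method names, and the minSilence parameter are made up for illustration. Because speech crosses zero constantly, the sketch only ends a word after a run of quiet samples rather than at the first quiet one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class SilenceDetector {

    /** Decodes 16-bit signed little-endian PCM bytes into samples. */
    public static short[] toSamples(byte[] pcm) {
        short[] samples = new short[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];      // high byte, keeps its sign
            samples[i] = (short) ((hi << 8) | lo);
        }
        return samples;
    }

    /**
     * Returns {start, end} pairs (end exclusive, in samples) for regions whose
     * magnitude exceeds threshold. A region ends only after minSilence
     * consecutive quiet samples.
     */
    public static List<int[]> findWords(short[] samples, int threshold, int minSilence) {
        List<int[]> words = new ArrayList<>();
        int start = -1;  // start of the current word, or -1 when in silence
        int quiet = 0;   // consecutive quiet samples seen inside a word
        for (int i = 0; i < samples.length; i++) {
            boolean loud = Math.abs((int) samples[i]) > threshold;
            if (loud) {
                if (start < 0) start = i;
                quiet = 0;
            } else if (start >= 0 && ++quiet >= minSilence) {
                words.add(new int[] {start, i - quiet + 1});
                start = -1;
                quiet = 0;
            }
        }
        if (start >= 0) words.add(new int[] {start, samples.length - quiet});
        return words;
    }
}
```

Each returned pair could then be copied into its own byte array and written out as a separate WAV file. In practice minSilence would probably be some fraction of a second's worth of samples, and the threshold would need tuning for the recording.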
Related
I'm trying to record from the microphone to a wav file as per this example. At the same time, I need to be able to test for input level/volume and send an alert if it's too low. I've tried what's described in this link and seems to work ok.
The issue comes when trying to record and read bytes at the same time using one TargetDataLine (bytes read for monitoring are skipped for recording and vice versa).
Another thing is that these are long processes (hours probably) so memory usage should be considered.
How should I proceed here? Any way to clone TargetDataLine? Can I buffer a number of bytes while writing them with AudioSystem.write()? Is there any other way to write to a .wav file without filling the system memory?
Thanks!
If you are using a TargetDataLine for capturing audio similar to the example given in the Java Tutorials, then you have access to a byte array called "data". You can loop through this array to test the volume level before outputting it.
To do the volume testing, you will have to convert the bytes to some sort of sensible PCM data. For example, if the format is 16-bit stereo little-endian, you might take two bytes and assemble to either a signed short or a signed, normalized float, and then test.
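As a rough illustration of that conversion, here is a sketch that assembles 16-bit signed little-endian samples from a byte buffer and computes a normalized RMS level. The class and method names are hypothetical, and the sketch pools all channels together instead of measuring left and right separately.

```java
// Hypothetical helper for testing the level of one captured buffer.
class LevelMeter {

    /**
     * RMS level in the range 0.0..1.0 for 16-bit signed little-endian PCM.
     * Only the first `length` bytes of `data` are examined.
     */
    public static double rms(byte[] data, int length) {
        int count = length / 2;          // number of 16-bit samples
        if (count == 0) return 0.0;
        double sumSquares = 0.0;
        for (int i = 0; i < count; i++) {
            // High byte keeps its sign; low byte is masked to 0..255.
            int sample = (data[2 * i + 1] << 8) | (data[2 * i] & 0xFF);
            double normalized = sample / 32768.0;
            sumSquares += normalized * normalized;
        }
        return Math.sqrt(sumSquares / count);
    }
}
```

Something like `if (LevelMeter.rms(data, numBytesRead) < 0.01)` could then trigger the "too low" alert, where 0.01 is an arbitrary placeholder threshold and numBytesRead is whatever the read call returned.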
I apologize for not looking more closely at your examples before posting my "solution".
I'm going to suggest that you extend InputStream, making a customized version that also performs the volume test. Override the 'read' method so that it obtains the byte that it returns from the code you have that tests the volume. You'll have to modify the volume-testing code to work on a per-byte basis and to pass through the required byte.
You should then be able to use this extended InputStream as an argument when you create the AudioInputStream for the output-to-wav stage.
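A minimal sketch of such a stream, assuming 16-bit signed little-endian audio. It extends FilterInputStream (itself an InputStream) so that only the read methods need overriding; the class name and getAndResetPeak() are invented for this example, and the peak tracking is not strictly thread-safe.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical pass-through stream that tracks the peak sample magnitude.
class LevelMonitoringInputStream extends FilterInputStream {
    private volatile int peak;    // largest |sample| seen since the last reset
    private int pendingLow = -1;  // low byte waiting for its high byte, or -1

    public LevelMonitoringInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) inspect((byte) b);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) inspect(buf[off + i]);
        return n;  // bytes are passed through unchanged
    }

    // Pairs bytes into samples, even when a sample is split across two reads.
    private void inspect(byte b) {
        if (pendingLow < 0) {
            pendingLow = b & 0xFF;
            return;
        }
        int sample = (b << 8) | pendingLow;
        pendingLow = -1;
        peak = Math.max(peak, Math.abs(sample));
    }

    /** Returns the peak since the previous call and starts a new measurement. */
    public int getAndResetPeak() {
        int p = peak;
        peak = 0;
        return p;
    }
}
```

The idea would be to wrap the line's stream, for example `new AudioInputStream(new LevelMonitoringInputStream(new AudioInputStream(line)), format, AudioSystem.NOT_SPECIFIED)`, pass that to AudioSystem.write(), and poll getAndResetPeak() from a timer. Since the data is streamed straight through, nothing accumulates in memory.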
I've used this approach to save audio successfully via two data sources: once from an array that is populated beforehand, once from a streaming audio mix passing through a "mixer" I wrote to combine audio data sources. The latter would be more like what you need to do. I haven't done it from a microphone source, though. But the same approach should work, as far as I can tell.
In my Java code I load a WAV file into an AudioInputStream using AudioInputStream ais = AudioSystem.getAudioInputStream(new File("file.wav"))
Once I have done that, I basically want to take the last n seconds (let's say the last 5 seconds) and "fade out" the volume (FloatControl.Type.MASTER_GAIN??).
Once done, I could write my AudioInputStream back to a WAV file using: AudioSystem.write(ais, Type.WAVE, file_output);
The result would be the same WAV file but with the last 5 seconds fading out (volume decreasing).
Any idea on how to do this? I tried changing the ais into bytes[], or a sourcedataline... but didn't find what I wanted, as most examples are about changing volume of an audio "in-play" (I also saw things around using Clip which also seems to be dealing an audio file in-play)
Many thanks everyone
Start by turning the sound into a byte array. Then turn the bytes into samples: you'll need to find a tutorial specifically for this, it's a little involved in Java (http://www.jsresources.org/ is a good resource). Samples are the direct representation of the sound wave.
To decrease the volume, multiply all the samples by something less than 1, and then save them back to a byte array. To fade out you'll need to multiply the last n samples by a decreasing function. Then write out the file with the proper WAV headers.
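As one possible reading of those steps, here is a sketch of a linear fade-out applied in place to 16-bit signed little-endian PCM bytes. The FadeOut class and its parameters are hypothetical, and a linear ramp is only one choice of "decreasing function".

```java
// Hypothetical helper: fades out the tail of a PCM byte array in place.
class FadeOut {

    /**
     * Linearly fades the last fadeFrames frames to silence.
     * pcm holds 16-bit signed little-endian samples, interleaved by channel.
     */
    public static void apply(byte[] pcm, int channels, int fadeFrames) {
        int frameSize = 2 * channels;                // bytes per frame
        int totalFrames = pcm.length / frameSize;
        int n = Math.min(fadeFrames, totalFrames);   // cannot fade more than exists
        int first = totalFrames - n;                 // first frame of the fade
        for (int f = 0; f < n; f++) {
            // Gain steps down from just under 1.0 to exactly 0.0 on the last frame.
            double gain = (n - 1 - f) / (double) n;
            for (int c = 0; c < channels; c++) {
                int i = (first + f) * frameSize + 2 * c;
                int sample = (pcm[i + 1] << 8) | (pcm[i] & 0xFF);
                int scaled = (int) Math.round(sample * gain);
                pcm[i] = (byte) scaled;              // low byte
                pcm[i + 1] = (byte) (scaled >> 8);   // high byte
            }
        }
    }
}
```

For a 5-second fade, fadeFrames would be about 5 times the format's sample rate. The faded bytes could then be wrapped in a ByteArrayInputStream, turned back into an AudioInputStream with the original AudioFormat and frame count, and handed to AudioSystem.write(), which takes care of the WAV header.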
These are just a few pointers for a complex process, hopefully they will send you in the right direction.
I'm currently making my own music player in Java (NetBeans IDE), and I'm working on the playlist, storing the file paths in an array. Should I save those file paths as binary or text? I plan on having a "save" and a "load" button that would save and load a playlist.
My advice: use the "cue" sheet format (which is ASCII). From the Wikipedia article: a cue sheet is a plain text file containing commands with one or more parameters.
Essential commands
FILE
Names a file containing the data and its format (such as MP3, and WAVE audio file formats, and plain "binary" disc images)
TRACK
Defines a track context, providing its number and type or mode (for instance AUDIO or various CD-ROM modes). Some commands that follow this command apply to the track rather than the entire disc.
INDEX
Indicates an index (position) within the current FILE. The position is specified in mm:ss:ff (minute-second-frame) format. There are 75 such frames per second of audio. In the context of cue sheets, "frames" refer to CD sectors, despite a different, lower-level structure in CDs also being known as frames. INDEX 01 is required and denotes the start of the track, while INDEX 00 is optional and denotes the pregap. The pregap of Track 1 is used for Hidden Track One Audio (HTOA). Optional higher-numbered indexes (02 through 99) are also allowed.
PREGAP and POSTGAP
Indicates the length of a track's pregap or postgap, which is not stored in any data file. The length is specified in the same minute-second-frame format as for INDEX.
Actually, you will probably find it more convenient to use File[] or Path[].
Prefer text over binary though.
You should always prefer human-readable formats when possible. JSON, XML, or line-oriented text are all good options.
It depends on whether or not you want the user to be able to open and read the playlist file. Humans can't read binary files, so you should probably use a text file unless you want the contents to be unreadable.
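For the line-oriented text option, a minimal sketch might look like this: one file path per line, stored as UTF-8. The PlaylistFile class and its method names are invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for the "save" and "load" buttons.
class PlaylistFile {

    /** Writes one track path per line, UTF-8 encoded. */
    public static void save(Path playlist, List<Path> tracks) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Path track : tracks) {
            lines.add(track.toString());
        }
        Files.write(playlist, lines, StandardCharsets.UTF_8);
    }

    /** Reads the playlist back, skipping blank lines. */
    public static List<Path> load(Path playlist) throws IOException {
        List<Path> tracks = new ArrayList<>();
        for (String line : Files.readAllLines(playlist, StandardCharsets.UTF_8)) {
            if (!line.trim().isEmpty()) {
                tracks.add(Paths.get(line));
            }
        }
        return tracks;
    }
}
```

A file like this can be opened and edited in any text editor, which is the main argument for text over binary here.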
I currently have a program which reads from a text file and writes to a database after each line it reads. The size of the text file is undetermined; some days the file could have more or fewer lines than other days.
I already have a SwingWorker that executes my functions, so my progress bar works, but right now I just have setIndeterminate set to true so the user knows something is being done, just not the actual progress.
Is there a way I can increment the progress bar after each line is read, but have it not reach 100 too early or too late, preferably without reading the text file entirely beforehand? Thanks, Beef.
I'd use File.length() to determine the file size. Then keep track of the number of bytes read to determine the progress.
If the file size is known before you start reading, you can read the file line by line and update the percentage after every line: count the number of bytes in each line, add it to a running total, and divide that total by the total number of bytes (i.e. file.length()).
Also wrap the output to the GUI (in your case the progress for the JProgressBar) in invokeLater(), because you are outside the EDT; more in Concurrency in Swing.
Determining the file size is easy using java.io.File, so the problem here is getting the size in bytes actually read. For a simple progress bar, three possibilities come to mind:
Estimation: Assume 1 Character = 1 Byte (or 1 Character = 2 Byte if your file is UTF-16,...). The 1 Byte guess will have you underestimate your actual size read so you will have a jump of the bar at the end depending on how many multi byte characters are in your file.
Calculate: Recode the characters read to a byte array using the file's character encoding and take the length of the encoded array.
Count: As proposed by Brendan in the comments below, use a CountingInputStream (in correct order after Buffering) to count the bytes actually read.
Nr. 2 seems like unnecessary overhead for this case to me, so I think I'd stick to Nr. 1 or Nr. 3.
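The counting approach can be sketched without any external library by extending FilterInputStream. The CountingStream class and the percent() helper below are written for this example and are not the library class mentioned above.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a counting stream, using only the JDK.
class CountingStream extends FilterInputStream {
    private long count;  // bytes actually read from the wrapped stream so far

    public CountingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) count++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) count += n;
        return n;
    }

    public long getCount() {
        return count;
    }

    /** Progress as an integer 0..100, suitable for SwingWorker.setProgress(). */
    public static int percent(long bytesRead, long totalBytes) {
        if (totalBytes <= 0) return 100;  // an empty file is already "done"
        return (int) Math.min(100, bytesRead * 100 / totalBytes);
    }
}
```

One way to use it is to wrap a FileInputStream in CountingStream, put an InputStreamReader and BufferedReader on top, and after each line call setProgress(CountingStream.percent(counting.getCount(), file.length())) inside doInBackground(). Because the reader buffers ahead, the count runs slightly ahead of the line being processed, which mostly matters for small files.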
I want to change the volume of an audio file
and save the new file using javax.sound.sampled.
I tried to use the mixer to create a source line
from the given file and a target line to the new file,
so that I can change the mixer settings to change the volume.
But the sound is being played to the system speaker.
Am I thinking along the right lines or not?
Is there any other way to record a file from a line?
The code is available here
A solution I got is www.jsresources.org/examples/AmplitudeConverter.html.
But can the same be done within javax.sound.sampled
without using external libraries?
To change the volume, if you don't use a "Control" (see the Java Sound Tutorials), there is the option of directly modifying the samples themselves.
In your innermost loop, convert the bytes in the innermost buffer into a sample (if it is WAV 16-bit encoding, then you need to put the two bytes together to make the single SHORT value), then multiply that value by a float that ranges from 0 to 1, where 0 is the quietest and 1 leaves the sound at full volume. Then take the result and break it back down into two bytes and pass it along.
Do you need the code to do this? There are several other posts here where folks convert from bytes to INTs or Float and back.
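For what it's worth, a rough sketch of the steps above using only javax.sound.sampled could look like the following. It assumes a 16-bit signed little-endian PCM file, reads the whole file into memory with readAllBytes() (Java 9+), and clamps the result so that gains above 1 do not wrap around. VolumeChanger and its methods are names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper; no lines or mixers are opened, so nothing is played.
class VolumeChanger {

    /** Multiplies every 16-bit little-endian sample by gain, in place. */
    public static void scale(byte[] pcm, float gain) {
        for (int i = 0; i + 1 < pcm.length; i += 2) {
            int sample = (pcm[i + 1] << 8) | (pcm[i] & 0xFF);
            int scaled = Math.round(sample * gain);
            // Clamp to the 16-bit range to avoid wrap-around distortion.
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            pcm[i] = (byte) scaled;
            pcm[i + 1] = (byte) (scaled >> 8);
        }
    }

    /** Reads a WAV file, scales its volume, and writes the result to out. */
    public static void convert(File in, File out, float gain)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream ais = AudioSystem.getAudioInputStream(in)) {
            AudioFormat fmt = ais.getFormat();
            if (fmt.getEncoding() != AudioFormat.Encoding.PCM_SIGNED
                    || fmt.getSampleSizeInBits() != 16 || fmt.isBigEndian()) {
                throw new IllegalArgumentException("sketch handles 16-bit LE PCM only");
            }
            byte[] pcm = ais.readAllBytes();
            scale(pcm, gain);
            long frames = pcm.length / fmt.getFrameSize();
            AudioInputStream result =
                new AudioInputStream(new ByteArrayInputStream(pcm), fmt, frames);
            AudioSystem.write(result, AudioFileFormat.Type.WAVE, out);
        }
    }
}
```

Calling `VolumeChanger.convert(new File("in.wav"), new File("out.wav"), 0.5f)` would halve the amplitude (the file names are placeholders). For long recordings it would be better to process the stream in chunks instead of loading everything at once.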
Hmmm. This question is pretty old. Well maybe my answer will help someone new to the same problem.