command (voice) recognition based project - java

I am looking for a way to recognise pre-registered voice commands in Java for a project, and I haven't come up with a good approach yet. I have looked into the fast Fourier transform and different ways to process wave files, but I can't decide how I should go about implementing it.
The idea is simple: the user records a short phrase in his/her own voice, and when the phrase is repeated the application should recognise which command was issued.
Any ideas or suggestions would be most welcome.
Thanks in advance.

Voice recognition is an unsolved problem that multibillion-dollar companies are spending millions and years to solve. Simply put, if you're only at the FFT level, you aren't going to do it yourself. Instead, you should be looking for libraries that do it for you. One is even included in Android; check out http://developer.android.com/reference/android/speech/RecognizerIntent.html

Related

Writing personal translation app using my own database to work on android

I would like to make an app that lets you search for part of a word or phrase, then returns the closest results from a personal database of words I have learnt in another language. Once the results have been returned, I would like the option to play the sound file associated with each result. I can write the database in whatever program I need, and the sound files would be in either wav or mp3 format. The app would also need to allow the user to input foreign letters; about 10 extra letters are required, as I am using Romanian. These could be separate on the screen if necessary, as in separate from the keyboard input.
Would this be an easy enough project to undertake, and what sort of size would it be? I am more than happy to spend about a week on it. I am familiar with coding, particularly in Python, so writing in Python would be best, but I can use Java also. This would need to work on the Android system. What is the best program to use to write the app?
Android Studio covers basically every aspect you've mentioned. You'll probably need to google a lot to learn how to implement individual things.
Short guide (research on your own how to do each step):
- You can use an SQLite database for storing your data.
- Add your media (sound files) to the app's resources and then play them from there.
- The Android keyboard supports different languages; just set it to Romanian.
You will need some basic understanding of Android programming if you want to do this. I'm uncertain how much time you're willing to put into this, but it should be doable in 2-3 weeks, heavily depending on your previous coding experience.
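The "search for part of a word" step from the question can be sketched in plain Java. This is only an illustration under stated assumptions: the class `WordSearch` and its methods are hypothetical names, and the entries live in an in-memory list rather than in the SQLite database suggested above. It folds away diacritics so that a query typed without the Romanian letters still matches.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of Android or SQLite): diacritic-insensitive
// substring search over an in-memory word list.
class WordSearch {

    // Lower-cases the text and strips combining marks, so that letters such
    // as "ă", "â", "î", "ș" and "ț" compare equal to their plain forms.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    // Returns every entry whose folded form contains the folded query,
    // in the order the entries were given.
    static List<String> search(List<String> entries, String query) {
        String needle = fold(query);
        List<String> hits = new ArrayList<>();
        for (String entry : entries) {
            if (fold(entry).contains(needle)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> words = List.of("mulțumesc", "bună ziua", "mâine", "pâine");
        System.out.println(search(words, "aine"));
        System.out.println(search(words, "multu"));
    }
}
```

With a real database, the same folding could be applied once when each entry is stored (in an extra column, for example), so that the query only has to be folded at search time.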

Possibly non-intrusive method to change complete android application code with a library

I'm having quite a tough problem while developing a testing framework for Android apps. The text got a bit long, so the actual question is in bold for those who don't want to read the context.
Basically, one of the features I'd like to implement right now is tracing user activity while the user is using the application. There's my app, which manages context data all the time, and the developer's app, the one being tested. My idea was to get the coordinates where the user touched the screen and take a screenshot at the same time. Then I'd use the coordinates to mark the spot on the screenshot to get an idea of what the user was doing with the app the whole time, take hints on user experience, and trace crashes.
Non-system apps cannot take a screenshot for security reasons, but an application itself can take a screenshot of its own Activities without much trouble for non-rooted users, e.g. like here. My only hope here is to interfere with the developer's code so that it does so while my testing app is running. Each Activity would then have to extend my overridden Activity instead of the regular one, implement an interface, implement a broadcast receiver, etc.
I am going to write a library for developers who would like their apps to be tested with my framework. I'd like it to do the job for me and be as non-intrusive as possible to use. What is the best way to achieve that?
The ideal case would be linking the library to the project, maybe with a small addition in the manifest, to get the job done, and afterwards just unlinking it and removing that bit of XML from the manifest for production.
That's an open question. I don't expect any code, but rather some nifty Java trick, Android OS functionality, or even a completely different approach that would solve my problem.
I tried to be as clear as possible with the question, but it's quite a tough matter for me to describe, so it may have turned out the opposite. Don't hesitate to ask me for more details, to ask me to explain myself more clearly, or even to rewrite the question. Thank you all very much for your help!

Having a hard time finding a Java mp3 player(api) with a lot of functionality

I've been looking at different APIs (recently I've used JLayer), and I can't seem to find one with as much functionality as I'm looking for.
I have an assignment to make a media player that plays MP3 files (it was recommended that we use some API). I know how to do the GUI, store the files in playlists, and other things. The thing that I need help with is the part that plays the music.
I don't know if there is an API that fills all of my needs, but this is my check list.
Needs to be able to:
- Play an mp3 file
- Pause and start again
- Set the volume
- Skip to a certain place in a song
- Would be nice to have the length of the song
I think that is essentially my list of things I'd like to see in an API. I've just been having a hard time finding one.
Any suggestions would be awesome! Thanks!
gstreamer-java (http://code.google.com/p/gstreamer-java/) should help you.

Java Voice Biometric

I want to develop an application based on Voice Biometric Recognition.
Specifically, I want to develop an application that records a voice from the telephone and identifies the speaker. If the same person calls again, it should recognize the voice. Like other biometric applications, my need here is voice biometrics. Are there any URLs or examples that would help me? I searched but was not able to find a solution.
FreeSpeech is a text-independent speaker verification system that verifies a caller's identity
I want to achieve something like FreeSpeech's recognition in my application.
Is it possible to do the following using any open-source software?
- The individual records a voice print, then
- The system keeps track of the voice prints and can distinguish recordings from live speech
If yes, can you please provide a URL or example that would help me?
Well, I got a lead from this URL for achieving the above task, but I was not able to get the expected output.
After spending 20 to 25 hours, I finally got a solution using the MARF framework.
I got the sample app from http://sourceforge.net/projects/marf/files/Applications/%5Bf%5D%20SpeakerIdentApp/0.3.0-devel-20060226/
For now, it's working fine for me. This link was very useful for making the sample app executable: http://marf.sourceforge.net/
You can take a look at this previous SO post, in which various Java speech recognition engines, such as Sphinx, are described.
I am not an expert in this domain, so please take my answer as is; it's not an authoritative one. I think you have different ways to achieve your goals:
- finding a Java library is one, and the most natural one
- recording the voice in Java and then applying one of the several algorithms available for the job; you may find many research papers dealing with the subject
- depending on your architecture choices, you may find libraries implemented in C that deal with voice signals; JNI or JNA is one way to use C/C++ libraries from Java, and Web Services or CORBA are other ways to achieve this
HTH
Jerome

Continuous speech recognition while singing?

As part of my application I'm looking to add speech recognition, but not really in the traditional sense. I have a bunch of lyrics (divided into verses) that are sung by someone, and the idea is to find what verse is currently being sung so it can be displayed on screen.
I've played around with sphinx and got some basic examples set up and working, but while there seems to be plenty of documentation around on registering spoken text where you can wait for a delay then process the result, I can't find much on the idea of recognising sentences continuously. This is of course before I get to the part where the words are being sung and not spoken!
Has anyone got any experience with this, and if so is there anywhere that would provide a good starting point? Or is what I'm trying to achieve way too ambitious with sphinx and is it never really going to work properly? I'm open to looking at other libraries but they must be free, and sphinx was the most widely talked about one I could dig up.
It's perfectly possible to recognize speech as soon as it's pronounced, with a little delay, all the more so if you know more or less what you expect to get. This is called a "partial result" and is available in all CMUSphinx decoders through the API. Basically, you can retrieve the hypothesis while decoding is still in progress.
There is one small issue to consider: how to stabilize this result (how to extract the stable part of it). The technique for this is called backtracking, and it can be implemented easily.
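One simple heuristic for extracting a stable part, which is not necessarily what is meant by backtracking above, is to keep only the leading words that the last few partial hypotheses agree on. The sketch below assumes the partial results arrive as plain strings; `StablePrefix` and `stableWords` are hypothetical names and not part of the CMUSphinx API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: treats the longest word prefix shared by the most
// recent partial hypotheses as the "stable" part of the result.
class StablePrefix {

    // Splits a hypothesis into words; an empty or blank string yields no words.
    static List<String> split(String hypothesis) {
        String trimmed = hypothesis.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Returns the leading words that every given hypothesis starts with.
    static List<String> stableWords(List<String> recentHypotheses) {
        if (recentHypotheses.isEmpty()) {
            return new ArrayList<>();
        }
        List<String> stable = new ArrayList<>(split(recentHypotheses.get(0)));
        for (String hypothesis : recentHypotheses.subList(1, recentHypotheses.size())) {
            List<String> words = split(hypothesis);
            int common = 0;
            while (common < stable.size() && common < words.size()
                    && stable.get(common).equals(words.get(common))) {
                common++;
            }
            // Shrink the stable part to what this hypothesis still agrees with.
            stable = new ArrayList<>(stable.subList(0, common));
        }
        return stable;
    }

    public static void main(String[] args) {
        List<String> partials = List.of(
                "twinkle twinkle little",
                "twinkle twinkle little star",
                "twinkle twinkle lidl star how");
        System.out.println(stableWords(partials));
    }
}
```

The number of recent hypotheses to compare is a tuning choice: a larger window gives a more conservative stable part at the cost of extra delay.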
For singing, provided the music can be filtered out, it's also doable.
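Since the lyrics are known in advance, a recognised fragment can be mapped back to a verse without needing a perfect transcription. The following is a minimal sketch of one possible approach, assuming the hypothesis and the verses are plain strings: it picks the verse that shares the most distinct words with the hypothesis. `VerseMatcher` and `bestVerse` are hypothetical names, and the scoring rule is an assumption rather than anything CMUSphinx provides.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: picks the verse sharing the most distinct words
// with a recognised (possibly partial or noisy) hypothesis.
class VerseMatcher {

    // Lower-cases the text and collects its distinct words.
    static Set<String> words(String text) {
        Set<String> result = new HashSet<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}']+")) {
            if (!word.isEmpty()) {
                result.add(word);
            }
        }
        return result;
    }

    // Returns the index of the best-matching verse, or -1 if no verse shares
    // any word with the hypothesis. On a tie, the earlier verse wins.
    static int bestVerse(List<String> verses, String hypothesis) {
        Set<String> heard = words(hypothesis);
        int best = -1;
        int bestScore = 0;
        for (int i = 0; i < verses.size(); i++) {
            Set<String> shared = words(verses.get(i));
            shared.retainAll(heard);
            if (shared.size() > bestScore) {
                bestScore = shared.size();
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> verses = List.of(
                "Amazing grace how sweet the sound",
                "I once was lost but now am found",
                "Was blind but now I see");
        System.out.println(bestVerse(verses, "once was lost"));
        System.out.println(bestVerse(verses, "blind but now"));
    }
}
```

Word overlap ignores word order, so verses that repeat the same words (a chorus, for example) would need an extra signal such as the previously displayed verse.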
