Continuous speech recognition while singing? - java

As part of my application I'm looking to add speech recognition, but not really in the traditional sense. I have a bunch of lyrics (divided into verses) that are sung by someone, and the idea is to find what verse is currently being sung so it can be displayed on screen.
I've played around with sphinx and got some basic examples set up and working, but while there seems to be plenty of documentation around on registering spoken text where you can wait for a delay then process the result, I can't find much on the idea of recognising sentences continuously. This is of course before I get to the part where the words are being sung and not spoken!
Has anyone got any experience with this, and if so is there anywhere that would provide a good starting point? Or is what I'm trying to achieve way too ambitious with sphinx and is it never really going to work properly? I'm open to looking at other libraries but they must be free, and sphinx was the most widely talked about one I could dig up.

It's perfectly possible to recognize speech as soon as it's pronounced, with a small delay, especially if you more or less know what to expect. This is called a "partial result" and is available in all CMUSphinx decoders through the API. Basically, you can retrieve the hypothesis while decoding is still in progress.
There is a small issue to consider in how to stabilize this result (how to extract the stable part of it), but the technique for that is called backtracking and can be implemented fairly easily.
For singing, provided the music can be filtered out, it's also doable.
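To illustrate the backtracking idea: compare successive partial hypotheses and treat only their common word prefix as stable. The sketch below is plain Java, not the CMUSphinx API - in a real Sphinx4 setup the partial hypotheses would come from the decoder, and matching the stable prefix against the verses would happen on top of this.

```java
public class StablePrefix {
    // Returns the word prefix shared by two successive partial hypotheses.
    // Words beyond this prefix may still change as decoding continues, so
    // only the prefix is safe to match against the lyrics and display.
    static String stablePart(String previous, String current) {
        String[] a = previous.split("\\s+");
        String[] b = current.split("\\s+");
        int n = Math.min(a.length, b.length);
        StringBuilder stable = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (!a[i].equals(b[i])) break;
            if (stable.length() > 0) stable.append(' ');
            stable.append(a[i]);
        }
        return stable.toString();
    }

    public static void main(String[] args) {
        // Two partial results from the decoder; the tail is still unstable.
        System.out.println(stablePart("twinkle twinkle little star",
                                      "twinkle twinkle little bar how"));
        // prints "twinkle twinkle little"
    }
}
```

Matching that stable prefix against each verse (e.g. by word overlap) is then enough to decide which verse to display.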

Related

Use Java or other languages in a Flutter Application

Since I got no answer and not much feedback on this question: Android Flutter Analyze Audio Waveform, and found nothing online about what I'm looking for, I'll simply ask a broader question. A comment on that answer told me to write native code and use a platform channel to connect it to Flutter, but when I asked for clarification I got nothing.
So my question is whether I can do operations in Java (which has been around for much longer, and thus has far more documentation) and then use the outcome in Flutter.
More precisely, could I do these things in Java and Flutter:
More precisely, could I do these things in Java and Flutter:
1) Analyse the audio waveform, find the peak points at specific frequencies, and use their timestamps to display them in Flutter;
Edit 1:
What are peak points?
This is the waveform of different frequency ranges (the orange one is bass, 80-255 Hz), and the points circled in black are peak points. I need to analyse the audio spectrum of a song and find the peak points at certain frequencies. When I find the peaks, I need to save their timestamps, for example 16 seconds in, and so on.
2) Edit 2:
I need to edit some photos into a video, like a video collage, in which each frame of a 30 or 60 fps video is an image.
3) Edit 3:
I need to add basic frame-specific effects to the video, for example a blur that changes from frame to frame, or a glare.
4) Add music to that video and save it to mp4, avi, or any other format.
5) Edit 4: Most importantly, I don't want to do all this in real time, but rather as an After Effects-style render process in which all the frames are rendered together. The only thing that would be nice is a progress bar telling the user that the render is at, for example, frame 200 of 300. I don't want to display any of the frames or the video, just render it in the background and then save it to an mp4 video that can be viewed afterwards.
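For point 1, a minimal local-maximum detector can be written in plain Java. The names, threshold, and sample rate below are made up for illustration, and a real implementation would first band-pass the signal (e.g. via an FFT) to isolate the 80-255 Hz range before looking for peaks:

```java
import java.util.ArrayList;
import java.util.List;

public class PeakFinder {
    // Returns timestamps (in seconds) of samples that are local maxima
    // above a threshold. 'envelope' is assumed to already be the
    // magnitude of the band of interest (e.g. bass after filtering).
    static List<Double> findPeaks(double[] envelope, double threshold, double sampleRate) {
        List<Double> timestamps = new ArrayList<>();
        for (int i = 1; i < envelope.length - 1; i++) {
            boolean localMax = envelope[i] > envelope[i - 1] && envelope[i] >= envelope[i + 1];
            if (localMax && envelope[i] >= threshold) {
                timestamps.add(i / sampleRate);
            }
        }
        return timestamps;
    }

    public static void main(String[] args) {
        double[] env = {0.1, 0.9, 0.2, 0.1, 0.8, 0.3};
        // At a 1 Hz "sample rate", the peaks at indices 1 and 4
        // map to 1.0 s and 4.0 s.
        System.out.println(findPeaks(env, 0.5, 1.0)); // [1.0, 4.0]
    }
}
```

The returned timestamps could then be serialized over a platform channel to the Flutter side for display.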
As you can see, it's a difficult process to do in a language for which you can hardly even find a tutorial on playing music, due to its early state. But UIs and some other things are much easier to do in Flutter, and Flutter is also multi-platform, so I'd prefer to stick with it.
Edit 5:
I took a look at Qt and JUCE, and found that Qt seems a valid alternative, but from what I understood it's more of a "closed" system: for example, I looked at the multimedia library and, as far as I can tell, you can do basic things like play a video, but not collage frames and save the result. (I don't know if I explained myself well.) JUCE, on the other hand, looks better, but it seems aimed more at PC audio VSTs than at mobile applications involving video rendering. Another thing is that these two are not free and open source like Flutter is.
Then there is Kivy, which may or may not be the best option: it's a Python port for mobile devices, and I have a lot of experience with Python, which I think is one of the easier languages to learn. On the other hand, it hasn't got that much UI power, and as you mentioned there could be problems using libraries on Android.
You stated I could use C++ or Java with Flutter, but you said C++ is a difficult process. So my question comes down to: could I write the processing in Java, as in a normal Android application, and then somehow use those functions in a Flutter app?
Edit 6:
I found a possible alternative:
Kha (http://kha.tech/). But again, I found nothing on how to use it with Flutter. Could it be a good idea?
I'm mostly asking for confirmation of whether I can use Java or any other language to do what I need in a Flutter application, and if so, whether it's very complicated (I'm a beginner, sort of). Some tutorials or links to kickstart the code would be helpful as well!
Flutter at this time is great for building UIs, but as you've mentioned, it doesn't have a lot of power or compatibility with libraries yet. A good part of the reason for that is that it doesn't have easy integration with C++, but I won't get into that now.
What you're asking is most likely possible, but it's not going to be simple at all to do. First, it sounds like you want to pull particular frames from a video and display them - that's going to be an additional complication. And don't forget that on a mobile device you have somewhat limited processing power - things will have to be very asynchronous, which can actually cause problems for Flutter unless you're careful.
As to your points:
This is a very general ask. I'd advise looking up Android audio processing libraries. I'm almost sure it's possible, but SO questions are not meant for asking advice on which framework to use. Try https://softwarerecs.stackexchange.com/.
Once again, fairly general and a bit unclear about what you're asking... try Software Recommendations. I assume you want to take several frames and make them into a video?
Some of those effects (e.g. zoom) you could definitely do in Flutter using a Transform, but that would only apply while playing in Flutter rather than being added to the video files themselves. To do that, you'll have to use a video library in Android/Java code.
Once again, the video library should do this.
This should also be part of the video library.
I do know of one audio/video library off the top of my head, called Processing, that may do what you need, but I'm not sure. It does have an Android SDK, though. OpenCV would be another, but only for video/image processing, and I haven't used it directly from Java, so I'm not sure how easy it is to use.
For how you'd actually go about implementing this along with Flutter... you're going to need to use Platform Channels. I mentioned them in the comment to your other answer, but figured you could look that up yourself. The documentation does a much better job of explaining how they work and how to set them up than I can. But the TL;DR is that they allow you to send serialized data from native code (Java/Kotlin/Swift etc.) to Flutter code (Dart) and vice versa, where it gets translated into similar data structures in the target language. You can set up various 'channels' upon which the data flows, and within those channels set up 'methods' which get called at either end, or simply send events back and forth.
The complication I mentioned at the beginning is that sending images back and forth across the channels between Java and Dart isn't all that optimal. You most likely won't get a smooth 24/30/60 fps of images being sent from Java to Dart, and it might slow down the rest of the Flutter UI significantly. So for the actual viewport you'll want to use a Texture instead, which simply displays data from the Android side. You'll have to figure out how to write to a texture from Android yourself, but there's lots of information available for that. Controls, the visualization of the audio, etc. can be done directly in Flutter with data retrieved from native code.
What you'll have is essentially a remote control written in dart/flutter, which sends various commands to a audio/video processing library & wrapper code in Java.
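To make the "remote control" idea concrete, here is a plain-Java sketch of the dispatch that the Android side of a platform channel performs. Everything here (the class name, the "startRender" method, its arguments) is hypothetical; in a real plugin this wiring is done through Flutter's io.flutter.plugin.common.MethodChannel and its MethodCallHandler rather than this stand-in class:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class RenderChannel {
    // Maps method names (as sent from Dart) to handlers, mimicking the
    // switch on call.method that a MethodCallHandler does in a real plugin.
    private final Map<String, Function<Map<String, Object>, Object>> handlers = new HashMap<>();

    void register(String method, Function<Map<String, Object>, Object> handler) {
        handlers.put(method, handler);
    }

    // Simulates receiving a serialized method call from the Dart side;
    // the return value is what would be sent back through result.success().
    Object onMethodCall(String method, Map<String, Object> args) {
        Function<Map<String, Object>, Object> h = handlers.get(method);
        if (h == null) throw new IllegalArgumentException("notImplemented: " + method);
        return h.apply(args);
    }

    public static void main(String[] args) {
        RenderChannel channel = new RenderChannel();
        // Hypothetical "startRender" command: echoes back the frame count
        // so the Dart side can drive its progress bar.
        channel.register("startRender", a -> (int) a.get("frames"));
        Map<String, Object> callArgs = new HashMap<>();
        callArgs.put("frames", 300);
        System.out.println(channel.onMethodCall("startRender", callArgs)); // 300
    }
}
```

The real MethodChannel adds the serialization and threading on top; the dispatch logic, however, looks much like this.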
If all that sounds complicated, that's because it is. And as much as flutter is very nice to build UIs in, I have doubts as to whether it's going to be worth the extra complications if you're only targeting android.
Not really related to the answer but rather some friendly advice:
There is one other thing I'll mention - I don't know your level of proficiency with programming and with different languages, but video/audio processing and the like are generally not done in Java but rather in actual native code (i.e. C/C++). As such, there are actually two levels of abstraction you're going to have to deal with here (to some degree - it will probably be abstracted somewhat, or a lot, depending on the library you're using): C/C++ to Java, and Java to Dart.
You may want to cut out the middleman and work more directly with native code - in that case I'd recommend at least taking a look at Qt or JUCE, as they may be more suitable than Flutter for your particular use case. There's also Kivy (which uses Python), which may work well, as there are a ton of image/video/audio processing libraries for Python... although they may not all work on Android, and there is still the C++ => Python translation to some degree. You'll have to look into licensing, though - Qt has a broad enough open-source licence for most Android apps, but JUCE you'd have to pay for unless you're doing open source. I'd recommend Qt slightly over the others, as it actually has native decoding of video frames etc., although you'd probably want to incorporate OpenCV or something for the more complicated effects you're talking about. But it would probably be on the same level of complexity as simply writing Java code, just with a slightly different UI style and easier integration with C++ libraries.

Writing personal translation app using my own database to work on android

I would like to make an app that lets you search for part of a word or phrase, then returns the closest results from a personal database of words I have learnt in another language. Once the results have been returned I would like the option to play the sound file with the associated results. I can write the database in whatever program I need, and the sounds files would be in either wav or mp3 format. The app would also need to allow the user to input foreign letters, there are about 10 extra required as I am using Romanian. These could be separate on the screen if necessary, as in separate to the keyboard input.
Would this be an easy enough project to undertake, and what sort of size would it be? I am more than happy to spend about a week on it. I am familiar with coding, particularly in Python, so writing in Python would be best, but I can use Java as well. This would need to work on Android. What is the best tool to use to write the app?
Android studio covers basically every aspect you've mentioned. You'll probably need to google a lot to learn how to implement individual things.
Short guide: (research on your own how to do each step)
You can use SQLite database for storing your data.
Add your media (sound files) to your resources, then play them as needed
Android keyboard supports different languages, just set it to Romanian
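For the "closest results" part of the search, one simple approach (independent of whether the words live in SQLite or elsewhere) is to rank candidates by Levenshtein edit distance. A minimal plain-Java sketch, with a hard-coded word list standing in for rows fetched from the database:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class ClosestWords {
    // Classic dynamic-programming Levenshtein edit distance:
    // the number of single-character edits to turn a into b.
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    // Returns the database words ranked by similarity to the query.
    static List<String> closest(String query, List<String> words, int limit) {
        return words.stream()
                .sorted(Comparator.comparingInt(w -> distance(query, w)))
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> db = Arrays.asList("multumesc", "buna", "noapte", "multime");
        System.out.println(closest("multumes", db, 2)); // [multumesc, multime]
    }
}
```

In the real app, the word list would come from a SQLite query (possibly prefiltered with LIKE to keep the ranking fast), and each result row would carry the path of its wav/mp3 file.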
You will need some basic understanding of Android programming if you want to do this. I'm uncertain how much time you're willing to put into this, but it should be doable in 2-3 weeks, heavily depending on your previous coding experience.

command (voice) recognition based project

I am searching for a way to recognise pre-registered voice commands in Java for a project, and I haven't come up with a good approach yet. I have looked into fast Fourier transforms and different ways to process wave files, but I can't decide how I should go about implementing it.
The idea is simple: the user records his/her voice saying a short phrase, and then when the phrase is repeated the application should recognise which command was issued.
Any ideas or suggestions would be most welcome.
Thanks in advance
Voice recognition is an unsolved problem that multibillion-dollar companies are spending millions of dollars and years trying to solve. Simply put, if you're only at the FFT level, you aren't going to do it yourself. Instead, you should be looking for libraries that do it for you. One is even included in Android - check out http://developer.android.com/reference/android/speech/RecognizerIntent.html
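That said, if the goal is only to match a handful of pre-registered templates rather than open-ended speech, the classic lightweight technique is dynamic time warping (DTW) over per-frame features. The toy sketch below warps 1-D feature sequences; a real system would use multi-dimensional MFCC-style frames derived from the FFT, and all names and data here are illustrative:

```java
public class DtwMatch {
    // Dynamic time warping distance between two 1-D feature sequences
    // (e.g. per-frame energies of a recorded command). Warping lets a
    // slower or faster rendition of the same phrase still match.
    static double dtw(double[] a, double[] b) {
        double[][] d = new double[a.length + 1][b.length + 1];
        for (double[] row : d) java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        d[0][0] = 0;
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                double cost = Math.abs(a[i - 1] - b[j - 1]);
                d[i][j] = cost + Math.min(d[i - 1][j],
                          Math.min(d[i][j - 1], d[i - 1][j - 1]));
            }
        }
        return d[a.length][b.length];
    }

    // Picks the registered template with the smallest warped distance.
    static int recognise(double[] input, double[][] templates) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < templates.length; k++) {
            double dist = dtw(input, templates[k]);
            if (dist < bestDist) { bestDist = dist; best = k; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] templates = {
            {0, 1, 2, 1, 0},   // command 0
            {2, 2, 0, 0, 2}    // command 1
        };
        // A slower rendition of command 0.
        double[] input = {0, 0, 1, 1, 2, 1, 0};
        System.out.println(recognise(input, templates)); // 0
    }
}
```

This is fine for a toy demo with a few commands from one speaker; anything beyond that is exactly the territory where a library like the Android recognizer is the right answer.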

Leap Motion Tremor Recognition?

I'm a fairly new developer who has been working on stuff using Leap Motion for Processing https://github.com/voidplus/leap-motion-processing. I'm enjoying Processing thus far.
I came across this demo on YouTube http://www.youtube.com/watch?v=7o1v7RayEV8&feature=youtube_gdata
I need to build something like this, but I have no idea where to start! I can't even tell what language this app is built in, and I can't find any documentation for it online....
If anybody could provide some pointers in the right direction it would be great! I'm going to continue to lurk the internet for more information....
You could build this in just about anything you like - Processing included. If you've already been using the Leap Motion API you should already know how to read the relevant values, so the rest would be a case of recording them over a period of time (perhaps into arrays or objects?) and then generating charts and statistical calculations on that data.
A good starting point for seeing which parameters can be usefully read from the Leap Motion can be seen here: http://js.leapmotion.com/examples/parcoords.html
This demo also includes sourcecode so you can quickly see how the values are being accessed from the API and handled.
(Note: this is written in JavaScript and uses ThreeJS, so best to view with a modern browser.)
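As a concrete example of the "record, then compute statistics" step: since Processing sketches are Java, something like the following carries over directly. It records a 1-D position trace (e.g. the palm's y coordinate per frame, read from the Leap Motion API) and estimates the tremor frequency from zero crossings around the mean. The names and frame rate are assumptions, not Leap Motion API:

```java
public class TremorStats {
    // Estimates oscillation frequency from a recorded position trace by
    // counting zero crossings around the mean: each full oscillation
    // crosses the mean twice.
    static double tremorFrequencyHz(double[] positions, double frameRate) {
        double mean = 0;
        for (double p : positions) mean += p;
        mean /= positions.length;
        int crossings = 0;
        for (int i = 1; i < positions.length; i++) {
            if ((positions[i - 1] - mean) * (positions[i] - mean) < 0) crossings++;
        }
        double seconds = positions.length / frameRate;
        return crossings / (2.0 * seconds);
    }

    public static void main(String[] args) {
        // A synthetic 5 Hz oscillation sampled at 100 fps for one second.
        double[] trace = new double[100];
        for (int i = 0; i < trace.length; i++) {
            trace[i] = Math.sin(2 * Math.PI * 5 * i / 100.0 + 0.3);
        }
        System.out.println(tremorFrequencyHz(trace, 100)); // roughly 5 Hz
    }
}
```

Charting the trace and the computed statistics is then straightforward in Processing itself, which is what the linked parallel-coordinates demo does in JavaScript.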

Having a hard time finding a Java mp3 player(api) with a lot of functionality

I've been looking at different APIs (recently I've used JLayer), and I can't seem to find one with as much functionality as I'm looking for.
I have an assignment to make a media player that plays MP3 files (it was recommended that we use an API). I know how to do the GUI, storing the files in playlists, and other things. The thing I need help with is the playing-the-music part.
I don't know if there is an API that fills all of my needs, but this is my check list.
Needs to be able to:
Play an mp3 file
Pause and start again
Able to set the volume
Able to skip to a certain place in a song
Would be nice to have the length of the song
I think that is essentially my list of things I'd like to see in an API, I've just been having a hard time finding one.
Any suggestions would be awesome! Thanks!
gstreamer-java - http://code.google.com/p/gstreamer-java/ - should help you.
