
Pocket Sphinx Speech Recognition Engine

48 Posts
5 Users
12 Likes
6,307 Views
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  

@spyder 

I looked into it a little bit.  It appears to be a personal assistant package.  Not sure if that would fit the needs of my project.  I don't want the personal assistant.  I just want the TTS and SRE tools.  I'm going to empty them out and have the robot learn how to talk from scratch using Python.  So I'll basically be building the personal assistant part from the ground up.

I tried to install things more simply, but it didn't seem to work.  I installed pocketsphinx-python first, thinking that I could get away with that alone.  But it didn't work.  I couldn't access pocketsphinx either from the command line or through Python.

So I went back and reinstalled pocketsphinx the way I originally installed it and this time it worked.  At least from the command line.   Unfortunately Python still doesn't recognize it.   So I guess I'm going to have to reinstall Pocketsphinx-Python again.  I guess it needs to be installed after pocketsphinx is already installed?

Only time will tell.  I'm getting sick of doing this.  Once I get it squared away I think I'll just stick with what actually works. 😊 

Obviously the way I did it the first time worked.  So I should be able to repeat that again.

 

 

 

DroneBot Workshop Robotics Engineer
James


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  

I got it working again!  😎 

Apparently you do need to install sphinxbase and pocketsphinx first, according to the video in the second post of this thread.  It will then work from the command line.

Then if you want to use it with Python you also need to install pocketsphinx-python.  And that has to be done AFTER sphinxbase and pocketsphinx have been installed.

Fortunately, even though I had installed pocketsphinx-python first, I didn't need to do an entire new install.  All I needed to do was rerun the following line:

sudo python3 setup.py install

That did the trick!

So there don't appear to be any shortcuts.  It looks like the whole shebang needs to be done as I had previously outlined.

It's not that bad, and at least it works when it's all done. 😊 

So it looks like I have pocketsphinx installed correctly for use with Python.
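For anyone repeating this install, a quick way to confirm both halves are in place is a small check like the one below. It is only a sketch: it assumes the command-line tool is named pocketsphinx_continuous and that the Python package imports as pocketsphinx, which may differ depending on the version you built. The function name check_install is made up for this example.

```python
import importlib.util
import shutil


def check_install(cli_name="pocketsphinx_continuous", module_name="pocketsphinx"):
    """Report whether the command-line tool and the Python module can be found.

    Both default names are assumptions; adjust them to match your build.
    """
    return {
        # shutil.which returns None when the program is not on the PATH
        "cli": shutil.which(cli_name) is not None,
        # find_spec returns None when Python cannot locate the module
        "python": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    status = check_install()
    print("command line tool found:", status["cli"])
    print("Python module found:    ", status["python"])
```

If the first line reports True and the second False, that matches the situation described above, where pocketsphinx worked from the command line but pocketsphinx-python still had to be reinstalled.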

Now I can move forward to write some of my own Python code to better control Pocket Sphinx.

I guess this concludes the installation process of Pocket Sphinx.

If there are any shortcuts I don't know what they are.

It's working now!  Yippee!

I'm really tired and it's nice to go to bed after SUCCESS! 👍 

It's not that hard to install as long as you just do everything in the proper order.

DroneBot Workshop Robotics Engineer
James


   
codecage
(@codecage)
Member Admin
Joined: 5 years ago
Posts: 1037
 

@robo-pi

Are you installing all this on a RasPI, a Jetson Nano, or what?  Inquiring minds want to know. 😎 

SteveG


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  
Posted by: @codecage

@robo-pi

Are you installing all this on a RasPI, a Jetson Nano, or what?  Inquiring minds want to know. 😎 

I currently have it installed on two Jetson Nanos.  I installed it manually on both because I wanted to make sure the installation process I used worked.   I'm running the NVIDIA Jetson Nano system image that Paul McWhorter had us install in his AI course.  I'm also running CODE-OSS as the Python IDE.

The system is Ubuntu 18.04 also known as the "Bionic Beaver".

I also have it installed up to where it works at the command line on a Raspberry Pi 4 that is also running Ubuntu 18.04.  I haven't installed pocketsphinx-python on the Raspberry Pi yet.   I'll probably do that today.

I will also be installing it on a second Raspberry Pi 4 that's running Raspbian.  I'm trying it out on different platforms and operating systems to see what the differences are.   There are slightly different things that need to be done, but nothing major.  The Ubuntu on Raspberry Pi is slightly different from the Ubuntu on the Jetson Nano.  Mostly little things like how the desktop is laid out and where to find various system parameters.

~~~~~

Just for the record.  I've also installed eSpeak on all four of these systems.   And again, there were very slight differences in the installation.  Mostly to do with the sound system.  Nothing major. 

I'm running CODE-OSS on all four systems too.  Again, this is the same CODE-OSS IDE that Paul McWhorter had us install on the Jetson Nano using CURL.  Although I'm thinking this should work with any Python IDE.  Raspbian comes with Geany and Thonny installed.  I could check to see if they work with eSpeak and Pocket Sphinx too.  I would imagine that they should work just fine.

I haven't tried installing eSpeak or Pocket Sphinx on Windows.  I used Microsoft Speech Platform on Windows so there's no need for eSpeak or Pocket Sphinx over there.   Of course, I also need to use C# instead of Python on Windows too because I don't think Python will work with Microsoft Speech Platform.  Although I'm not really sure about that.   I just know that Microsoft Speech Platform is designed to be used by C# so they work really well together.

In any case, I'll let you know when I get to Raspbian with Pocket Sphinx.  Hopefully it will install without a problem.

DroneBot Workshop Robotics Engineer
James


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  

Potentially Disgusting News!

I just came from a pocketsphinx forum.  I was asking for documentation on the methods available in pocketsphinx-python.

Instead of giving me the information I asked for all I could get are people telling me that Pocket Sphinx is "dead" and obsolete. 😫 

They are telling me to use Vosk-Kaldi instead.

I have no clue what Vosk-Kaldi is, and I can't find any good tutorials or information on it either.   So this is a bit disgusting.

I'm told to install Vosk and Kaldi and use them.   I've been told this by several people.  So apparently it's the new way to go.

So it looks like I might be back at square one having to learn all about vosk-kaldi now instead.  I hope it's as good as Pocket Sphinx in terms of flexibility and being useful through Python.

So I guess it's back to the drawing board.  What a BUMMER!

Looks like I'm back to pulling teeth with Vosk-Kaldi now.   Information doesn't appear to be readily available for this package either.

All I've been given is this:

Vosk API on GitHub

I guess I'll need to install this and see what's up.

It better be better!  That's all I have to say.

I want to move forward with my AI project and I'm getting bogged down trying to get a decent SRE that I can use from Python.

Hopefully this vosk-kaldi package will work!

Pocket Sphinx was working!  It was decoding my voice at near 100% accuracy.  It must like my voice. 😊 

Anyway, back to SQUARE ONE with vosk-kaldi.  We'll see where that leads.

DroneBot Workshop Robotics Engineer
James


   
(@starnovice)
Member
Joined: 5 years ago
Posts: 110
 

@robo-pi James, I found this documentation on Kaldi https://kaldi-asr.org/doc/  

 

Pat Wicker (Portland, OR, USA)


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  
Posted by: @starnovice

@robo-pi James, I found this documentation on Kaldi https://kaldi-asr.org/doc/  

 

Well, that's certainly nice to see. 😊 

I don't see any reference to "vosk", and I'm not even sure what vosk is all about.

However I just tried to install vosk using the GitHub page and it won't install.  I get an error that it can't find a matching distribution for my system.  This may be due to the arm64 architecture.

Not sure if I can use Kaldi without vosk?  This is all new territory for me here.

I'll look into the page you just posted and see if I can learn anything.

In the meantime Pocket Sphinx is working.  It's not like it didn't work.   And so far Pocket Sphinx has been decoding my voice close to 100% accuracy.  It must like my voice. 😊 

I just wanted to find more information about PocketSphinx-Python methods I could access from Python.  But when I ask that question people refuse to answer and just tell me to move over to vosk.  They don't even mention Kaldi.  But clearly vosk requires Kaldi.  And I have no clue what the difference is between Vosk and Kaldi or why both are required?

I feel like taking a vacation from the whole shebang right about now to be honest.

DroneBot Workshop Robotics Engineer
James


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  

Some More Information:

Installing vosk:

After pulling some teeth I finally got Vosk installed.   There's a special trick involved that they keep secret until you ask somebody!  I've got the secret now. 😊 

Apparently Vosk is a package that is needed if you want to use Kaldi with Python.  So that must be what Vosk is all about.  Vosk is a very small program, only 2.5MB.

Installing Kaldi:

Kaldi is the actual SRE.  And is quite large.  Not sure how big but it took a long time to download.

There are dependencies involved that must be installed first.  And these will depend on your computer.  Thankfully they have a script called "check_dependencies.sh" that will tell you what your computer is missing and how to install the missing dependencies.

I'm still working on downloading the GitHub Clone,....

The Exciting News:

From what I've read in the Kaldi documentation link provided by @starnovice Kaldi does have quite a few improvements over Pocket Sphinx.

First, it was designed in cooperation with NVIDIA and takes full advantage of the CUDA GPU.  So it should run much faster than Pocket Sphinx.  That's a huge PLUS there for those who will be using a Jetson Nano.  Not sure if this will help much on a Raspberry Pi 4.

Second, it uses new SRE technology called "Online Decoding".  This has nothing to do with being online on the Internet.  What this term actually means is that Kaldi is able to decode the incoming voice waveform to some degree even before it's been captured into an audio file.  Supposedly this is a major breakthrough in SRE.  It supposedly makes the system faster and more efficient.  I haven't read the details of the technology, nor would it be important to know them to use this SRE.  But clearly it's a feature that will help with the process overall.

Third,  It has three different DNNs or Deep Neural Network models that you can choose to use.  The documentation explains which model you might want to use depending on your SRE application.

So I guess they are right when they suggest using this much more modern SRE.  Based on just these few features it sounds like it's going to leave Pocket Sphinx in the dust!  And I just skimmed over this information so there are probably even more cool features too.
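To make the "Online Decoding" point above a bit more concrete, here is a rough sketch of what feeding audio to a recognizer chunk by chunk looks like from Python. It is written against the method names that Vosk's KaldiRecognizer exposes as far as I know (AcceptWaveform, PartialResult, Result, FinalResult, each result being a JSON string), so treat those names as an assumption to check against the Vosk examples. The helper name stream_decode is made up for this example.

```python
import json


def stream_decode(recognizer, chunks):
    """Feed audio chunks one at a time and yield text as soon as it is available.

    `recognizer` is assumed to behave like Vosk's KaldiRecognizer:
    AcceptWaveform(bytes) returns True when an utterance has ended, and
    PartialResult(), Result() and FinalResult() return JSON strings.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # The recognizer decided an utterance is complete.
            yield "final", json.loads(recognizer.Result()).get("text", "")
        else:
            # Still mid-utterance: report the best guess so far.
            yield "partial", json.loads(recognizer.PartialResult()).get("partial", "")
    # Flush whatever is still buffered once the audio stops.
    yield "final", json.loads(recognizer.FinalResult()).get("text", "")
```

The point of the design is that partial guesses come back while audio is still arriving, instead of waiting for a finished recording. With Vosk installed, the recognizer would be built from a downloaded model and the chunks would come from a microphone or a WAV file read a few thousand bytes at a time.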

~~~~

OK, while I was writing this post I installed 3 dependencies that my computer needed, and I just ran check_dependencies.sh and it says that everything is ALL OK.

So now it's time to do the actual install.

This is a pain that I need to start from square one again, but I guess this will be worth it in the long haul.

Hopefully the installation will go without a hitch and I'll have it up and running very soon.

DroneBot Workshop Robotics Engineer
James


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  

Update before I go to bed:

Kaldi took almost 2 hours to install.  And that was after having cloned it from github and installing all the dependencies.  So it's anything but lightweight software. 

I just tried a python example code for it and I'm getting a gazillion errors.   So that's it for tonight.  I'll look into all these errors tomorrow.

PocketSphinx might not be as nice.  But it is far lighter software overall.  I think Pocket Sphinx is still working.  Let me check.

Yep, Pocket Sphinx survived the installation of vosk-Kaldi.   So at least I still have Pocket Sphinx running underneath this new software.

I'm still open concerning which way I'll eventually go.  Pocket Sphinx was looking pretty promising, and at least it's working!  And it was actually working pretty well too.  But it did take some time to decode sentences.  Not too bad.  But there was some hesitation there.  I would hope that Kaldi would decode much faster, especially considering how huge the software is and all its bragging about using CUDA and advanced SRE technologies.

But it's not working right out of the box yet.  Still more trouble-shooting to do.

I'm going to bed now.  I might even take tomorrow off.  This has exhausted me just to get this installed!

DroneBot Workshop Robotics Engineer
James


   
codecage
(@codecage)
Member Admin
Joined: 5 years ago
Posts: 1037
 
Posted by: @robo-pi

I currently have it installed on two Jetson Nanos. 

I only have espeak installed at the moment on a Win10 laptop.  Have not installed anything on a Jetson Nano or RasPi yet.  I have a second Jetson arriving on Thursday according to email with tracking number I got from Nvidia this morning.  I think I'll do all my TTS & SRE work from there after benefiting from all your hard work! 🤣 

SteveG


   
codecage
(@codecage)
Member Admin
Joined: 5 years ago
Posts: 1037
 
Posted by: @robo-pi

There's a special trick involve they keep a secret until you ask somebody!  I got the secret now. 😊 

 

OK, what's the secret? Or were you sworn to secrecy?  😊  

SteveG


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  
Posted by: @codecage

OK, what's the secret? Or were you sworn to secrecy?  😊  

Here's the secret code:

pip3 install https://github.com/alphacep/vosk-api/releases/download/0.3.3/vosk-0.3.3-cp36-cp36m-linux_aarch64.whl

Probably not all that secret, but I get tired of having to hunt for all these little tricks myself. 🤣 

Vosk actually installed really quickly.
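For anyone wondering why that particular file is the "secret", the wheel's file name spells out which Python and which CPU it was built for. Here is a small sketch that splits the name into its tags and compares them with the machine it runs on. The function names are made up for this example, and the compatibility check is deliberately rough (pip's real matching rules are more involved).

```python
import platform
import sys


def wheel_tags(filename):
    """Split a wheel file name into (python tag, ABI tag, platform tag)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # The last three dash-separated fields of a wheel name are the tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def looks_compatible(filename):
    """Rough check: does the wheel target this CPython version and CPU type?"""
    python_tag, _abi_tag, platform_tag = wheel_tags(filename)
    this_python = "cp{}{}".format(*sys.version_info[:2])
    machine = platform.machine()
    return (python_tag == this_python
            and bool(machine)
            and platform_tag.endswith(machine))


if __name__ == "__main__":
    name = "vosk-0.3.3-cp36-cp36m-linux_aarch64.whl"
    print(wheel_tags(name))
    print("matches this machine:", looks_compatible(name))
```

So cp36 means CPython 3.6 and linux_aarch64 means 64-bit ARM Linux, which would explain why a plain install could not find a matching distribution and the direct link to this file was needed.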

Then I went on to install the Kaldi package because it was my understanding that vosk uses Kaldi.  But now they are telling me that I didn't need to do that and that vosk should work by itself.

That's crazy!

Besides, it doesn't seem to be working anyway!

So I'm currently confused.  I'm going to try to just install vosk on my second nano and see if it really does work without Kaldi.

And I'm still thinking that Pocket Sphinx might actually be all I need for what I want to do anyway.  I think this Kaldi SRE is designed to be able to talk to anyone about anything.  I don't need that.  My AI project is to build up a Linguistic AI model from the ground up. I don't want an AI system that's already built.  So it could be that these guys on the Pocket Sphinx forum are just reacting to common knowledge which may not even apply to what I'm trying to do.

Anyway, this whole thing is getting frustrating.   I just want to be able to recognize a few words.   And I'll even be putting those words into the dictionary myself.   So I really don't need a full-blown AI system that can already understand a gazillion words.  In truth it doesn't really understand anything.  It just recognizes the words in the dictionary but it has no idea what the words actually mean.  So it ends up being just another dumb chat bot that can recognize a huge amount of words.
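As a rough illustration of putting a few words into the dictionary by hand: CMU Sphinx pronunciation dictionaries are plain text files with one word per line followed by its phonemes. The sketch below writes such a file. The three starter words, the file name robot.dic and the function name write_dictionary are all made up for this example, and how the file gets handed to the decoder depends on which pocketsphinx Python API is installed.

```python
def write_dictionary(entries, path):
    """Write word/phoneme pairs, one word per line followed by its phonemes."""
    with open(path, "w") as handle:
        for word, phones in sorted(entries.items()):
            handle.write("{} {}\n".format(word.lower(), " ".join(phones)))


# Hypothetical starter vocabulary; the phonemes are ARPAbet symbols.
STARTER_WORDS = {
    "hello": ["HH", "AH", "L", "OW"],
    "robot": ["R", "OW", "B", "AA", "T"],
    "stop": ["S", "T", "AA", "P"],
}

if __name__ == "__main__":
    write_dictionary(STARTER_WORDS, "robot.dic")
```

A dictionary this small is the opposite of the "gazillion words" approach: the recognizer only has to choose among the handful of words the robot has been taught so far.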

I'm thinking of taking today off.  I'm really beat.  I just came back from Walmart.  They have a special senior time slot on Tuesday mornings from 6 to 7 for seniors only.   So that's when I go shopping.   A nice clean store that is freshly stocked, and the lowest risk of contracting the virus, I hope.

So after I put all this food away I'm going back to bed. 🤣 

DroneBot Workshop Robotics Engineer
James


   
codecage
(@codecage)
Member Admin
Joined: 5 years ago
Posts: 1037
 

@robo-pi

You do that!  I'm beginning to feel your frustration and you're the one doing all the work!  It's amazing all the run around one can get trying to get something installed when communicating with someone who should be able to make it much simpler.  I think it stems from the fact that they don't stop to realize you may be coming into this process from an entirely different avenue and for different reasons than they imagine.

I really appreciate all the work you are doing and giving such detailed descriptions of the pitfalls you have encountered.

And I think you may be onto something about using PocketSphinx and not bothering with vosk-kaldi.

I think I may set up two identical SD cards for my soon-to-arrive Jetson Nano, then put the PocketSphinx solution on one card and vosk-kaldi on the other.  But that may change depending on your future discoveries!

Thanks again for sharing!!!!  😎 

SteveG


   
codecage
(@codecage)
Member Admin
Joined: 5 years ago
Posts: 1037
 

@robo-pi

Are you using the JetPack version 4.2.1 dated 2019/07/19 that Paul McWhorter is using for his tutorial, or are you using later versions?  Or do you have a totally separate SD card for the McWhorter tutorials from the one you use with your Linguistic AI work?

SteveG


   
Robo Pi
(@robo-pi)
Robotics Engineer
Joined: 5 years ago
Posts: 1669
Topic starter  

I just ran the jetsonUtilities script and got the following information:

james@james-desktop:~/jetsonUtilities$ ./jetsonInfo.py
NVIDIA Jetson TX1
L4T 32.2.0 [ JetPack 4.2.1 ]
Ubuntu 18.04.2 LTS
Kernel Version: 4.9.140-tegra
CUDA 10.0.326

So yes, I'm running JetPack 4.2.1.

What I've done is create a 16GB SD card with this JetPack Ubuntu configured to my preferences.  It also knows my password, and is set up for my Wifi.

I then installed the following software on it:

CODE-OSS - set up for Python
Arduino IDE - with my favorite libraries
Vokoscreen - for making screen capture videos
Audacity - for editing audio files
Libre Office Writer - customized with my toolbar and templates
Kolour Paint
Gedit
Disk Usage Analyzer
Disks Tools
Chromium Browser - customized
Nano - my favorite command line editor

All these have also been added to my task bar.

This all fits on a 16 GB SD card with room to spare and I'll probably redo this later adding a few more favorites. 

In any case, this is just my new "System Image".   When I want a new system I just burn this image onto a larger SD card, usually a 64GB one.    So every time I start a new system card it's already completely configured with all my favorite core software installed and customized.   Having the original system on a 16GB card makes it easier to store the image as it's only 16GB instead of 64GB.   You can always burn it to a larger SD card. 

When I'm actually running the system it's always on a 64GB card so I have plenty of free space to work with. 

But this way when I burn a new system I don't need to start from scratch every time.  So my 16GB system card is my GOLD!

That's the starting point for all my other system cards.   I never need to start over from complete scratch again.  I don't want to have to reinstall and reconfigure all those core programs every time I make a new system card.

DroneBot Workshop Robotics Engineer
James


   