Pocket Sphinx Speech Recognition Engine

Help Wanted

Last Post by Robo Pi 4 years ago

Robo Pi

(@robo-pi)

Robotics Engineer

Joined: 5 years ago

Posts: 1669

Topic starter 2020-04-24 5:11 pm

Posted by: @codecage

Still need to figure out why my Win10 machines no longer recognize the SD cards after the starting image from NVIDIA is put on them.

I haven't had that problem. But here's a page with some suggestions:

How to Fix SD Card Not Detected on Windows 10

I'm having extreme frustration trying to find information on how to modify (or even find) the vocabulary dictionary that vosk uses.

It's frustrating enough to find information that's difficult to understand. But in this case I can't even find any information at all.

I'm about ready to go back to Pocket Sphinx, although Vosk does seem to decode speech into words more quickly and possibly even more accurately. But if the dictionary is not easy to modify, that's going to be a major problem for me.

Right now I'm trying to get Vosk to recognize "alysha" as a wake word. But apparently it doesn't have "alysha" in the dictionary; instead it keeps coming back with "elisa". I could accept "elisa" as the word it returns when I say "alysha" and deal with that in my Python code, but I'd rather just put "alysha" in the dictionary.
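The "deal with that in my Python code" workaround could look roughly like the sketch below. It assumes the recognizer hands back a plain text string; the alias table and the function names `normalize_wake_word` and `heard_wake_word` are hypothetical helpers, not part of Vosk.

```python
# Hypothetical alias table: word the recognizer returns -> word that was actually said.
WAKE_WORD_ALIASES = {"elisa": "alysha"}

def normalize_wake_word(text, aliases=WAKE_WORD_ALIASES):
    """Replace known mis-recognitions in a decoded utterance."""
    return " ".join(aliases.get(word, word) for word in text.lower().split())

def heard_wake_word(text, wake_word="alysha"):
    """True if the wake word (or one of its aliases) appears in the utterance."""
    return wake_word in normalize_wake_word(text).split()
```

This only papers over the problem for one word. It does not add anything to the recognizer's vocabulary, which is why an editable dictionary is still the better fix.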

In fact, for my Linguistic AI projects I'm absolutely going to need to be able to modify the dictionary.

So as things are right now, even though I have vosk up and running and it seems to be working fairly well, without the ability to easily modify and edit the dictionary it's going to be worthless for my purposes.

So I may have no choice but to go back to pocketsphinx. I know I can modify the dictionary there.

DroneBot Workshop Robotics Engineer
James

Robo Pi

(@robo-pi)

Topic starter 2020-04-25 1:09 am

The Final Judgement: Pocket Sphinx Wins!

After spending a lot of time pulling out my hair trying to find information on Vosk, I finally learned that it's not easy to modify its vocabulary dictionary. So I'm going back to Pocket Sphinx for my Linguistic AI project. Being able to build dictionaries from scratch and swap them out is a major part of how I plan to build my Linguistic AI system. I should mention here, for anyone who might be interested, that this is also quite easy to do using Microsoft Speech Platform on Windows.

In any case, I'll be moving back to Pocket Sphinx for my project. It actually appears to me to be decoding my speech very well. It can also be trained for specific voices, although I haven't looked into how that feature works yet.

I should also point out that Vosk does decode speech a bit faster, and probably more accurately too. So for someone who isn't concerned with customizing the vocabulary dictionary Vosk might be the better choice. But since dictionary manipulation is paramount for what I want to do, I'll be going with Pocket Sphinx.

As always, different people have different criteria for what they need in their projects. So which SRE is best for your purpose may differ. There is a small delay between when I stop talking and when Pocket Sphinx returns the decoded speech. It's not bad, but definitely noticeable. There may be parameters that can be adjusted for that. It could be just waiting to make sure you're done talking. Or it could just take a long time to look through its huge dictionary. If the latter is the case, then when I edit my dictionaries to only contain a few words it should respond much faster as well.

In any case, I'm going to move forward from here with Pocket Sphinx.

Robo Pi

(@robo-pi)

Topic starter 2020-04-25 4:54 am

So much fun that only a geek could have!

Okay, since I've moved forward with Pocket Sphinx I've been playing with the Pocket Sphinx dictionary.

The first thing I did was break the main dictionary up into smaller text files for future reference. I copied all the words that start with each letter of the alphabet and saved them each in their own file. I'll go through those later and create a new dictionary that only contains words that I'm interested in using for now.
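For anyone who would rather script that split than copy and paste, here is a minimal Python sketch. It assumes the dictionary is a plain text file with one entry per line; the function names and the `a.dict`, `b.dict`, ... output file naming are hypothetical choices, not anything Pocket Sphinx requires.

```python
from collections import defaultdict
from pathlib import Path

def group_by_first_letter(dict_lines):
    """Group dictionary entries by the first character of each entry."""
    groups = defaultdict(list)
    for line in dict_lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        first = entry[0].lower()
        # Entries starting with punctuation or digits go into one catch-all group.
        key = first if first.isalpha() else "other"
        groups[key].append(entry)
    return dict(groups)

def write_letter_files(groups, out_dir):
    """Write each group to its own file, e.g. out_dir/a.dict (naming is an assumption)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for key, entries in groups.items():
        (out / f"{key}.dict").write_text("\n".join(entries) + "\n", encoding="utf-8")
```

Usage would be along the lines of `write_letter_files(group_by_first_letter(open(path, encoding="utf-8")), "by_letter")`, where `path` points at wherever the main dictionary lives on your install.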

The second thing I did was create a dictionary that only has a few keywords and the entire alphabet of letters only.

This is where the weakness of the speech recognition really shows. It has extreme trouble recognizing similar-sounding letters correctly. I'll need to see if I can improve on this, because part of my idea is to be able to spell words for the robot when she asks me to spell a word. So it would be paramount to get individual letters right. Letters like b, d, e, p, and t are often reported incorrectly when just single letters are spoken. But even a human may have difficulty knowing for sure which letter was spoken in some of these cases. It does much better on longer words.

I was also right about it responding much faster with a smaller dictionary. Fewer words to look up.

In addition to the individual letters of the alphabet, I also have "my name is James" in the dictionary as individual words, as well as "stop" and "listening", again as individual words.
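A reduced dictionary like this can also be generated by filtering the full one against a word list. The sketch below is one way to do it in Python, assuming each entry is a word followed by its phonemes and that alternate pronunciations are marked with a suffix such as `(2)`. That suffix convention is an assumption worth checking against your own dictionary file, and the function names are hypothetical.

```python
import string

# Words from the post: the keywords plus every letter of the alphabet.
KEEP = set("my name is james stop listening".split()) | set(string.ascii_lowercase)

def base_word(entry_word):
    """Strip an alternate-pronunciation suffix such as "(2)" from a dictionary word."""
    return entry_word.split("(", 1)[0].lower()

def filter_dictionary(dict_lines, keep=KEEP):
    """Return only the entries whose word is in the keep set, in original order."""
    kept = []
    for line in dict_lines:
        parts = line.split()
        if parts and base_word(parts[0]) in keep:
            kept.append(line.strip())
    return kept
```

Keeping the alternate pronunciations of each kept word is a deliberate choice here, since dropping them might make short words like single letters even harder to recognize.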

Because these are the only words currently in the dictionary it nails it every time when I say 'stop listening'. And so I can use those words to have the program stop listening. That works very well.
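The stop command itself can be checked with a few lines of Python once the decoded text is available as a string. This is a minimal sketch: `should_stop` and `listen_loop` are hypothetical names, and `utterances` stands in for whatever your recognizer loop yields.

```python
def should_stop(hypothesis, command=("stop", "listening")):
    """True if the command words appear consecutively in the decoded text."""
    words = hypothesis.lower().split()
    n = len(command)
    return any(tuple(words[i:i + n]) == tuple(command)
               for i in range(len(words) - n + 1))

def listen_loop(utterances):
    """Consume decoded utterances until the stop command is heard.

    Returns the utterances handled before the stop command.
    """
    handled = []
    for text in utterances:
        if should_stop(text):
            break
        handled.append(text)
    return handled
```

Requiring the two words back to back, rather than just both being present, avoids stopping on something like "listening ... stop" spread across a longer utterance.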

Lots more work to do for sure.

Vosk comparison

I can't really make a fair comparison with Vosk because I have no way to reduce the dictionary like I did for Pocket Sphinx. However, I did try repeating the alphabet to see how many letters Vosk could recognize. One problem with Vosk is that, because it has a large dictionary that cannot be easily reduced, it refuses to recognize a lot of letters and instead responds with words that sound similar to the names of the letters, and sometimes with words that aren't even close. So it probably has similar problems with very short sounds.

Microsoft Speech Platform

I found this to be the case with Microsoft Speech Platform too. SREs tend to do better with more complex words, and even phrases. This is because there's a lot more information to match up. In fact, with Microsoft Speech Platform I was able to put short phrases into the dictionary, and that worked very well, as it could recognize phrases with far greater consistency than individual words.

Back to Pocket Sphinx

I haven't found any phrases in the Pocket Sphinx dictionary. It appears to have a format of a single word followed by a space and then the phonemes. So it's not possible to define phrases, at least not with spaces between the words. It may be possible to define a phrase as one long word, like "stoplistening". I'll have to look into exactly what the capabilities are there.
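To make the "one long word" idea concrete, here is a small Python sketch that parses entries in the word-then-phonemes format described above and glues two of them into a single entry. The phoneme strings are illustrative and should be copied from your own dictionary file, the helper names are hypothetical, and whether Pocket Sphinx actually decodes such a compound entry well is untested.

```python
def parse_entry(line):
    """Split a dictionary line into (word, [phonemes]).

    Raises ValueError on an empty line.
    """
    word, *phonemes = line.split()
    return word, phonemes

def make_compound(entries, words):
    """Build one entry for several words by concatenating their phonemes."""
    phonemes = []
    for w in words:
        phonemes.extend(entries[w])
    return "".join(words) + " " + " ".join(phonemes)

# Illustrative entries only; check the real phonemes in your dictionary.
sample = ["stop S T AA P", "listening L IH S AH N IH NG"]
entries = dict(parse_entry(line) for line in sample)
compound = make_compound(entries, ["stop", "listening"])
# compound == "stoplistening S T AA P L IH S AH N IH NG"
```

The resulting line could then be appended to a custom dictionary file to test whether the phrase is recognized more reliably than the two separate words.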

In any case, I'm off to the races for Linguistic AI. 😊

I'll just try to work around any limitations for now. I don't want to get too bogged down with the SRE.
