Looking for Suggestions for Voice Recognition with ESP32-S3

18 Posts
3 Users
2 Reactions
906 Views
huckOhio
(@huckohio)
Member
Joined: 5 years ago
Posts: 301
Topic starter  

I am looking for suggestions/recommendations on adding Voice Recognition (VR) to a project I completed last year ( https://forum.dronebotworkshop.com/show-tell/rabbit-shack-chicken-coop-automation-project/#post-43862 ).  The project above is based on the ESP32-S3.

I've never worked with VR, so this is completely new to me. My initial thought was to have a VR module recognize a voice command and then pass data to the existing ESP32-S3 to execute the command.
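A minimal sketch of that hand-off, assuming the VR module and the existing ESP32-S3 talk over a plain serial link. Everything here is hypothetical and for illustration only: the 4-byte frame layout, the start byte, the command IDs, and the function names do not come from any particular VR module's protocol.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4-byte frame: [start][command id][argument][checksum].
// These values are placeholders, not taken from a real VR module.
constexpr std::uint8_t kStartByte = 0xA5;

enum class VoiceCommand : std::uint8_t {
    OpenCoopDoor = 1,
    CloseCoopDoor = 2,
    LightsOn = 3,
};

struct Decoded {
    VoiceCommand command;
    std::uint8_t argument;
};

// XOR of the first three bytes; enough to catch a corrupted byte on the wire.
std::uint8_t checksum(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
    return static_cast<std::uint8_t>(a ^ b ^ c);
}

// VR side: turn a recognized command into bytes to write to the serial port.
std::array<std::uint8_t, 4> encodeFrame(VoiceCommand cmd, std::uint8_t arg) {
    const std::uint8_t id = static_cast<std::uint8_t>(cmd);
    return {{kStartByte, id, arg, checksum(kStartByte, id, arg)}};
}

// ESP32-S3 side: accept a frame only if start byte, checksum and id all check out.
bool decodeFrame(const std::array<std::uint8_t, 4>& f, Decoded& out) {
    if (f[0] != kStartByte) return false;
    if (f[3] != checksum(f[0], f[1], f[2])) return false;
    if (f[1] < 1 || f[1] > 3) return false;  // unknown command id
    out.command = static_cast<VoiceCommand>(f[1]);
    out.argument = f[2];
    return true;
}
```

The receiving sketch would read four bytes from its serial port, pass them to `decodeFrame`, and act on the command only when it returns true, so line noise cannot trigger the coop door by accident.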

I have an ESP32-S3-EYE that has VR, but reading the documentation, it does not appear to have any GPIO pins available that I can use to send commands to the ESP32-S3.  I also found the Voice Recognition Module V3, which has tutorials for connecting to an Arduino UNO, but I'm not sure if it can be used with the ESP32 (it may need a level converter).  I also found some VR libraries that are for specific Arduino boards. Also, Espressif has some VR libraries, but they appear to only work with the ESP-IDF.
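On the level-converter question, a rough worked example may help. It assumes the module's TX line swings to 5 V and that a simple two-resistor divider (instead of a converter board) feeds the ESP32 RX pin. The resistor values and the "acceptable" voltage window below are assumptions for illustration; check the real limits in the datasheets before wiring anything.

```cpp
// Unloaded output of a two-resistor divider: Vout = Vin * R2 / (R1 + R2),
// where R1 is the series resistor and R2 goes from the output to ground.
double dividerOutputVolts(double vinVolts, double r1Ohms, double r2Ohms) {
    return vinVolts * r2Ohms / (r1Ohms + r2Ohms);
}

// Assumed window for a 3.3 V logic input: high enough to read as a logic 1,
// low enough not to overdrive the pin. Both limits are placeholders.
bool withinAssumed3V3Window(double volts) {
    const double assumedMinHigh = 2.5;
    const double assumedMaxInput = 3.6;
    return volts >= assumedMinHigh && volts <= assumedMaxInput;
}
```

With 1 kΩ in series and 2 kΩ to ground, `dividerOutputVolts(5.0, 1000.0, 2000.0)` works out to 5 × 2000 / 3000, or about 3.33 V, which falls inside the assumed window, while the raw 5 V signal does not.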

I would appreciate any suggestions/links/etc. to get me started.

Thanks!  Mike


   
Ron
(@zander)
Father of a miniature Wookie
Joined: 4 years ago
Posts: 7914
 

@huckohio I can't help, Mike, but I will point out that the ESP world is in some disarray at the moment, with lots of broken code. The board package has changed from 2.x to 3.x to make use of the S3 series' unique features. Unfortunately, that broke a lot of existing code and libraries.

Maybe as a first sanity check, update your boards and then recompile all your code to make sure nothing is broken. If it is, roll your boards back to 2.0.17.
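One way to make that sanity check visible in the code itself is a compile-time guard on the core version. This is a sketch, assuming the ESP32 Arduino core defines `ESP_ARDUINO_VERSION_MAJOR` (in an .ino file it arrives via the automatic `Arduino.h` include; in a .cpp file you would include `Arduino.h` first). The function name `coreGeneration` is made up for this example.

```cpp
// When this is not built with the ESP32 Arduino core (for example on a PC),
// the macro is undefined, so fall back to 0 to keep the file compiling.
#ifndef ESP_ARDUINO_VERSION_MAJOR
#define ESP_ARDUINO_VERSION_MAJOR 0
#endif

constexpr int kCoreMajor = ESP_ARDUINO_VERSION_MAJOR;

// Report which generation of the core this build was compiled against.
const char* coreGeneration() {
#if ESP_ARDUINO_VERSION_MAJOR >= 3
    return "3.x";
#elif ESP_ARDUINO_VERSION_MAJOR == 2
    return "2.x";
#else
    return "unknown (not an ESP32 Arduino build)";
#endif
}
```

Printing `coreGeneration()` at startup makes it obvious which core a given board was flashed with after an update or a rollback, and the same `#if` pattern can wrap any call that changed between 2.x and 3.x.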

I recall someone on the forum spoke of using some sort of cloud-based VR, but I think that is a poor solution. I also have a vague

Good luck.

p.s. I just did a quick check of existing libraries, and I don't see anything that could be considered 'ready for prime time'.

p.s. In case you missed it, see these links

https://randomnerdtutorials.com/esp32-migrating-version-2-to-3-arduino/

https://docs.espressif.com/projects/arduino-esp32/en/latest/migration_guides/2.x_to_3.0.html

First computer 1959. Retired from my own computer company 2004.
Hardware - Expert in 1401, and 360, fairly knowledge in PC plus numerous MPU's and MCU's
Major Languages - Machine language, 360 Macro Assembler, Intel Assembler, PL/I and PL1, Pascal, Basic, C plus numerous job control and scripting languages.
My personal scorecard is now 1 PC hardware fix (circa 1982), 1 open source fix (at age 82), and 2 zero day bugs in a major OS.


   
(@davee)
Member
Joined: 4 years ago
Posts: 1908
 

Hi Mike @huckohio,

  I hadn't even heard of the ESP32-S3-EYE board before, but out of curiosity I did a 5-second Google search.

As you have no doubt already seen, the Getting Started guide at

https://github.com/espressif/esp-who/blob/master/docs/en/get-started/ESP32-S3-EYE_Getting_Started_Guide.md#13-block-diagram

says:

Except for GPIO3 that can be used to configure LED statuses, all GPIOs of the ESP32-S3-WROOM-1 module have already been used to control specific components or functions of the board. If you would like to configure any pins yourself, please refer to the schematics provided in Section Related Documents.

Which confirms that the board designers have not made life easy for you.

A further 5 seconds looking at one of the schematics suggested that there MIGHT be opportunities to reconfigure one or more GPIO pins so that communication with another processor becomes feasible. That would require surgery to the board, and probably also to the supporting software from Espressif, both of which are likely to be tricky at best and possibly practically impossible. (Bear in mind, some of the processor pins and wiring shown on the schematics might be hidden under the rectangular lid ... check before deciding on any hardware surgery.)

So whilst I would not rule out the possibility of an experienced practitioner of both hardware and software being able to use the board as you suggest, it looks like a lot of skilled work, with no guarantee of success.

If you are looking for a challenge, then you might consider purchasing one and starting with the 'out-of-the-box' functionality, to evaluate whether the voice recognition capability would meet your needs, before worrying about the problems of interfacing it to a second processor. With a little luck, you may get some useful experience, but be prepared for hitting a brick wall or two.

Ron (@zander) has already pointed out other contemporary issues with the Espressif family, which he is more familiar with than I am, so also be prepared for a 'rocky ride' on the 'out of the box' support situation.

Good luck, with whatever you decide.

Best wishes, Dave


   
Ron reacted
Ron
(@zander)
Father of a miniature Wookie
Joined: 4 years ago
Posts: 7914
 

@huckohio Does that board have WiFi or Bluetooth? If so, use that (maybe esp-now) to communicate to the other esp32 to do the VR.
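If the ESP-NOW route is taken, the message between the two boards can stay tiny. Below is a sketch of one possible payload and the receiver-side check. The struct layout, the magic value, and the function name are all hypothetical; the only outside fact relied on is that classic ESP-NOW limits a payload to 250 bytes. It also assumes both boards use the same struct layout and byte order, which holds when both are ESP32s built from the same code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Classic ESP-NOW payload limit in bytes.
constexpr std::size_t kEspNowMaxPayload = 250;

// Hypothetical marker so the receiver can ignore packets that are not ours.
constexpr std::uint16_t kMagic = 0xC0DE;

// Hypothetical message: which command was heard, plus a counter so the
// receiver can ignore a repeat of the message it has just handled.
struct VoiceMsg {
    std::uint16_t magic;
    std::uint8_t commandId;
    std::uint8_t sequence;
};
static_assert(sizeof(VoiceMsg) <= kEspNowMaxPayload, "message must fit in one packet");

// Receiver side: validate the raw bytes before trusting them.
// data/len are the buffer and length handed to the receive callback.
bool parseVoiceMsg(const std::uint8_t* data, std::size_t len, VoiceMsg& out) {
    if (data == nullptr || len != sizeof(VoiceMsg)) return false;
    VoiceMsg tmp;
    std::memcpy(&tmp, data, sizeof tmp);  // avoids unaligned access
    if (tmp.magic != kMagic) return false;
    out = tmp;
    return true;
}
```

On the sending board, the filled-in struct would be passed as the data buffer to `esp_now_send` with `sizeof(VoiceMsg)` as the length. On the receiving board, the registered receive callback would hand its data pointer and length straight to `parseVoiceMsg`.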

I think you will do a lot better using a Pi4; it has VR (search on this forum for 'voice recognition'). Maybe start there, then back-port the ESP32-S3-EYE code to the Pi4?

I am not a fan of VR and have disabled it on all the devices I own that have it (except Alexa); I don't even use Siri. I had several bad experiences with my car listening to me and then doing something stupid.

Good Luck.

EDIT: Just stumbled on this: Espressif’s speech recognition SDK, in https://www.espressif.com/en/products/devkits/esp-eye/overview

huckOhio
(@huckohio)
Member
Joined: 5 years ago
Posts: 301
Topic starter  

@zander Thanks Ron.  I compiled all three systems (rabbit shed, chicken coop, and house) with the new library with no issues.  I am not a fan of the ESP32 solutions because they primarily use the ESP-IDF, with which I have no experience.

I am still researching.


   
huckOhio
(@huckohio)
Member
Joined: 5 years ago
Posts: 301
Topic starter  

@davee Thanks Dave.  I am not that "skilled practitioner" so I've eliminated the ESP32-S3-EYE as a possibility.


   
huckOhio
(@huckohio)
Member
Joined: 5 years ago
Posts: 301
Topic starter  

@zander

Posted by: @zander

If so, use that (maybe esp-now) to communicate to the other esp32 to do the VR.

That is one possibility that I am looking into.

Thanks Ron.

 


   
(@davee)
Member
Joined: 4 years ago
Posts: 1908
 

Hi @huckohio,

   Probably a wise decision to give the ESP32-S3-EYE a miss at present, as I think there is a significant risk that you will hit some pretty hard brick walls, which can be very challenging, especially if you are working on your own.

  Of course, the more you do, the more skills you will acquire, and you may become a 'skilled practitioner' in your own right, so don't give up.

Best wishes and good luck, Dave


   
Ron
(@zander)
Father of a miniature Wookie
Joined: 4 years ago
Posts: 7914
 

@huckohio Maybe a Pi Zero 2?

huckOhio
(@huckohio)
Member
Joined: 5 years ago
Posts: 301
Topic starter  

@zander I'll look at it.


   
Ron
(@zander)
Father of a miniature Wookie
Joined: 4 years ago
Posts: 7914
 

@huckohio Here are a few links.

https://dronebotworkshop.com/hacking-google-aiy-voice-kit-part-1/

https://dronebotworkshop.com/hacking-google-aiy-voice-kit-2/

This camera may be better in terms of pins, and I think it has an onboard mic. I just got a couple in the mail last week. https://dronebotworkshop.com/xiao-esp32s3-sense/#XIAO_ESP32S3_Sense_Pinouts

Hope this helps.

Ron
(@zander)
Father of a miniature Wookie
Joined: 4 years ago
Posts: 7914
 

@huckohio Here is another possibility. It's still an ESP32-S3, but it appears to have quite a few free pins and an on-board MEMS mic. I have 2 but have not yet messed with them. The price is also reasonable.

https://www.seeedstudio.com/XIAO-ESP32S3-Sense-p-5639.html

huckOhio
(@huckohio)
Member
Joined: 5 years ago
Posts: 301
Topic starter  

@zander Thanks Ron.  As I was researching options/links last night, I decided on the ESP32-S3-Box-3 (Link).  It's a little overkill, but it works with the Arduino IDE and has GPIO pins that I can use.

I did look at Bill's videos with the Seeeduino, but I was under the impression that the mic was for video audio, not for voice commands.  I may have misunderstood. 

While this started out as a fun thing to add to the chicken coop, I am seeing other possible usages.

Thanks for all your comments/recommendations.  I appreciate you (and @davee) and the help you offer everyone.

 

Mike


   
Ron
(@zander)
Father of a miniature Wookie
Joined: 4 years ago
Posts: 7914
 

@huckohio That looks interesting, but it appears to need ESP-IDF, which you earlier were not happy with. Am I misunderstanding, or did I miss something on a very quick look?

On this page

The link may not be showing, so here it is again:

[Image: Screenshot 2024 06 15 at 15.08.45]

I screen-grabbed the following:

[Image: Screenshot 2024 06 15 at 15.04.33]

huckOhio
(@huckohio)
Member
Joined: 5 years ago
Posts: 301
Topic starter  

@zander Ron, I hope I didn't misunderstand the marketing material, but here is what Espressif listed under features:

image

The last bullet lists the development frameworks as ESP-IDF, Arduino, PlatformIO, CircuitPython, and more.


   