Yes, but some commands require knowing who is speaking, right? If you say "Add XXX to my shopping list" or "Tell YY that I'm leaving the house", what happens?
I think it's worth looking at how they handle this and thinking carefully about the use cases.
We will certainly have user-specific commands via messages, so if we don't have this information on the voice side, we either need to disable those commands or find a way to know who is speaking.
Currently, the user is used to determine which language to classify the text in. If we no longer have this information, we need a global setting for the voice part.
Likewise, for the Rhasspy part, it would be great to define what we want in terms of interface in Gladys.
We can plan for the case where we recognize who is speaking. But it won't work right away: as far as I know, no open-source tool can currently do this.
However, yes, implementing this possibility at the code level can be considered.
Of course! I'm not saying we should do it, we just need to plan ahead and make sure Gladys doesn't crash if a request requires knowing who is speaking.
Also, we will need to block requests that have non-textual responses: "show me the living room camera" is of little use by voice.
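
To make the points above concrete, here is a rough sketch of the checks a voice request could go through. It uses Python for brevity and is not Gladys code. The intent names, the `DEFAULT_VOICE_LANGUAGE` setting and the `check_voice_request` function are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical intent names, for illustration only (not real Gladys identifiers).
USER_DEPENDENT_INTENTS = {"shopping-list.add", "message.send"}
NON_TEXTUAL_INTENTS = {"camera.show-image"}

# Assumed global setting, used when the speaker (and so their language) is unknown.
DEFAULT_VOICE_LANGUAGE = "en"


@dataclass
class User:
    name: str
    language: str


@dataclass
class VoiceRequest:
    intent: str
    speaker: Optional[User] = None  # None when we cannot tell who is speaking


def check_voice_request(request: VoiceRequest) -> Tuple[str, str]:
    """Return (language, status) instead of raising, so nothing crashes."""
    # Fall back to the global language when no speaker is known.
    language = request.speaker.language if request.speaker else DEFAULT_VOICE_LANGUAGE

    # Responses that are not text (e.g. a camera image) make no sense by voice.
    if request.intent in NON_TEXTUAL_INTENTS:
        return language, "not_voice_friendly"

    # Commands tied to a user are refused cleanly when the speaker is unknown.
    if request.intent in USER_DEPENDENT_INTENTS and request.speaker is None:
        return language, "needs_speaker"

    return language, "ok"
```

With this shape, "Add XXX to my shopping list" from an unknown speaker would come back as `needs_speaker` rather than failing, and the same request would go through once a speaker can be identified.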
Hello hello,
I came across a great offline, open-source voice recognition solution (on a Discord server)
It really works like a charm. I tried the small model (39 MB) and it works perfectly, in real time.
Plus, the documentation is well done and it installs in 2 minutes.
Honestly, I think it's a great lead for Gladys
(And on an unrelated note, do you know how to change your username on the forum?)
Hello @jeremy37! I don't think that has progressed; I believe @damalgos is currently moving.
However, we now have a Google Home integration, which in one way or another gives you voice recognition, both at home and on your phone or in your car: