Raspberry Pi voice assistant

Introduction

On my main computer (Raspberry Pi 4, 4 GB), I often launch the same tasks or open the same documents. With a voice command, I could run these tasks faster, even without sitting in front of my screen. An omnidirectional microphone could pick up voice commands from several metres away.

The microphone then acts as a third way of controlling a computer, after the keyboard and the mouse.

The voice assistants from the GAFAM merely query the web to answer simple questions; they are not designed to control a computer.

On a Raspberry Pi (tested on a Raspberry Pi 4 with 2 GB of RAM), use piTalk, which can be downloaded from GitHub.

Example talking.py file, which works after installation with the provided raspberryPico.sh script.

The configuration file is in XML format: ini/keys.xml
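
The layout of ini/keys.xml is not shown on this page. Judging only from the tags that charger_cles() reads in the listing further down (key elements with a name attribute, containing action, script, mots and description), a file could look like the sample embedded in the sketch below. The root tag name keys, the entry heure and its contents are assumptions made for illustration, and parse_keys is a hypothetical helper, not part of piTalk. The BACKGROUND placeholder is presumably used because a bare & is not allowed in XML text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the tags read by charger_cles() come from the
# source; the root name <keys> and the entry itself are made up.
SAMPLE_XML = """
<keys>
  <key name="heure">
    <action>script</action>
    <script>date BACKGROUND</script>
    <mots>quelle heure est-il|donne-moi l'heure</mots>
    <description>Print the current time</description>
  </key>
</keys>
"""


def parse_keys(xml_text):
    """Build the same kind of dictionary as charger_cles(), from a string."""
    keys = {}
    for child in ET.fromstring(xml_text):
        if child.tag != 'key':
            continue
        entry = {}
        for sub in child:
            text = sub.text or ''
            if sub.tag == 'script':
                # BACKGROUND stands for the shell '&' (run in the background)
                entry['script'] = text.replace('BACKGROUND', '&')
            elif sub.tag == 'mots':
                # alternative phrasings are separated by '|'
                entry['mots'] = text.lower().split('|')
            elif sub.tag in ('action', 'description'):
                entry[sub.tag] = text
        keys[child.get('name')] = entry
    return keys


if __name__ == '__main__':
    print(parse_keys(SAMPLE_XML))
```

Each phrasing listed in mots is compared with the whole recognized sentence, so every variant the recognizer may return has to be listed.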

Python source code, 5 July 2021

#!/usr/bin/env python3
# coding: utf-8
# the coding line above avoids errors with accented characters
# https://openclassrooms.com/forum/sujet/quot-syntaxerror-non-ascii-character-xc3-in-filequot-12536

# project documentation: http://gangand.net/pp/projets/assistant_vocal/

import speech_recognition as sr
import os

# This script was written by Greg Colburn (ThePony on github). Feel free
# to contact me on FB at https://www.facebook.com/9millionthGreg BUT...
# when contacting me please be sure to tell my WHY you are messaging me!
# donations to keep snacks on hand for my coding spurts are welcome at
# paypal at (colburn greg at.symbol.goes.here ya hoo dot com), thanks

# DOC : https://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20
# encoding=utf8
import sys
if sys.version_info[0] < 3:
    # Python 2 only: reload() and sys.setdefaultencoding() are not available
    # in Python 3, where strings are already Unicode
    reload(sys)
    sys.setdefaultencoding('utf8')
# avoids errors such as
# UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

r = sr.Recognizer()
m = sr.Microphone()


# new single variable to use from now on
# the 'browser' action requires 'site'
# exit words are compared with the lowercased recognized text, so they must be lowercase
exitWord  = ['quitter', 'quitterie']

cle = {}

# See also
# DOC : https://stackoverflow.com/questions/36244380/enumerate-for-dictionary-in-pythonw
# for index, (key, value) in enumerate(your_dict.items()):
    # print(index, key, value)

import subprocess

def afficher_cles():
    print("*" * 50)
    for n, j in enumerate(cle, start=1):
        # print("%3d - %15s : %s" % (n, j, cle[j]['mots']) )
        print("%3d - %15s : %s" % (n, j, ', '.join(cle[j]['mots'])))
        # print("%15s : %s" % (j, cle[j]['script']) )
    print("*" * 50)
    print()

def charger_cles():
    """ load the keys from the ini/keys.xml file
    """
    # DOC : XML parsing syntax, https://docs.python.org/3/library/xml.etree.elementtree.html
    import xml.etree.ElementTree as ET
    tree = ET.parse('ini/keys.xml')
    root = tree.getroot()
    n= 0
    for child in root:
        n = n + 1
        if child.tag == 'key':
            print ('key no. {} : {}'.format(n,str(child.get('name'))) )
            cle[str(child.get('name'))] = {}
            for c2 in child:
                key = str(child.get('name'))
                if c2.tag == 'action':
                    print ('action : %s' % str(c2.text))
                    cle[key]['action'] = str(c2.text)
                elif c2.tag == 'script':
                    print ('script : %s' % str(c2.text))
                    cle[key]['script'] = str(c2.text).replace("BACKGROUND", "&")
                elif c2.tag == 'mots':
                    print ('mots   : %s' % str(c2.text))
                    cle[key]['mots'] = str(c2.text).lower().split('|')
                elif c2.tag == 'description':
                    print ('desc   : %s' % str(c2.text))
                    cle[key]['description'] = str(c2.text)
            print('')

def date_heure():
    # https://waytolearnx.com/2020/06/date-et-heure-en-python.html
    from datetime import datetime
    t = datetime.now()
    # return str(t.year) + ' ' + str(t.month) + ' ' + str(t.hour) + ' ' + str(t.minute) + ' ' + str(t.second)
    # https://docs.python.org/fr/3.5/library/string.html
    # https://stackoverflow.com/questions/339007/how-to-pad-zeroes-to-a-string
    return '{0}-{1:02d}-{2:02d}-{3:02d}-{4:02d}-{5:02d}'.format(t.year, t.month, t.day, t.hour, t.minute, t.second)

def lancement_intelligent():
    """ Use a configuration file,
    which is easier to edit """
    try:

        ## ----------------------------------------------------------------
        ## The lines below work well with a stand microphone
        ## connected to an HFR209-B USB sound card,
        ## but there is a problem with the TONOR USB microphone
        print("Stand microphone, one moment please ...")
        with m as source: r.adjust_for_ambient_noise(source)
        ## The above line takes an ambient noise sample to set threshold levels.
        ## This may not work on all microphones and should be tweaked as needed
        ## ----------------------------------------------------------------

        ## ----------------------------------------------------------------
        ## TONOR tests
        ## see https://pypi.org/project/SpeechRecognition/1.3.0/
        # print("TONOR microphone, one moment please ...")
        # with m as source:
            # r.adjust_for_ambient_noise(source)  # https://github.com/Uberi/speech_recognition/blob/master/examples/calibrate_energy_threshold.py

        # ## ----------------------------------------------------------------


        while True:
            print("Ready, please speak. (Press Ctrl+C or say 'quitter' to quit)")
            with m as source: audio = r.listen(source)
            print(".")
            try:
                # value = r.recognize_google(audio)
                value = r.recognize_google(audio, language="fr-FR")
                # switch recognition to French
                # DOC : https://stackoverflow.com/questions/49732536/how-to-change-the-language-of-google-speech-recognition

                if str is bytes:
                    commande_vocale = "{}".format(value).encode("utf-8")
                else:
                    commande_vocale = "{}".format(value)
                commande_vocale = commande_vocale.lower()
                print("commande_vocale = %s " % commande_vocale)

                if commande_vocale in exitWord:
                    quit()
                elif commande_vocale in ['aide', 'menu']:
                    afficher_cles()
                    continue

                cle_trouvee = False
                for j in cle:
                    # print("%15s : %s" % (j, cle[j]['mots']) )
                    if commande_vocale in cle[j]['mots']:
                        if cle[j]['action'] == 'script':    # almost anything can be scripted, so the 'script' action is preferred
                            print("Description : %s" % cle[j]['description'])
                            print('--------------------------------------------')
                            # write the script to a temporary file, then run it without waiting for it to finish
                            with open("/tmp/talking.sh", "w") as f:
                                f.write(cle[j]['script'])
                            subprocess.Popen(['sh', '/tmp/talking.sh'])
                            cle_trouvee = True
                        break

                if not cle_trouvee: # no key matched: fall back to a web search
                    print("Web search: %s" % commande_vocale)
                    liste = commande_vocale.split(' ')
                    recherche_google = 'https://www.google.com/search?q=' + '+'.join(liste)
                    print(recherche_google)
                    subprocess.Popen(['chromium-browser', recherche_google])

            except sr.UnknownValueError:
                print("The Google API did not understand ...")
            except sr.RequestError as e:
                print("!")
                #print("Uh oh! Couldn't request results from Google Speech Recognition service; {0}".format(e))

    except KeyboardInterrupt:
        pass

def main():
    charger_cles()
    # subprocess.Popen( ['clear']).pid
    afficher_cles()
    lancement_intelligent()

main()
# 'action' : 'main_pgm'
# 'action' : 'exec' ,
# 'action' : 'script' ,
# 'action' : 'browser' ,
# 'action' : 'gpio_domotique' ,
# 'action' : 'recherche_fichier'
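
When no key matches, the listing above builds the Google search URL by joining the spoken words with '+'. French commands often contain accents or apostrophes, so a variant that percent-encodes the query may be more robust. The following is a minimal sketch for Python 3 using only the standard library; the function name build_search_url is hypothetical and not part of piTalk.

```python
from urllib.parse import urlencode


def build_search_url(commande_vocale):
    """Return a Google search URL with the query safely encoded.

    urlencode() replaces spaces with '+' and percent-encodes
    accented or reserved characters (as UTF-8).
    """
    return 'https://www.google.com/search?' + urlencode({'q': commande_vocale})


if __name__ == '__main__':
    print(build_search_url('météo à paris'))
```

The returned string could be passed to the browser in place of recherche_google.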

See also Kalliopé

Other sources (backups)

First connect a USB microphone to one of the Raspberry Pi's USB ports.
Try an omnidirectional microphone such as the TONOR, which costs about €30.
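
The listing above opens the default input device with sr.Microphone(). When several sound devices are connected (for example a USB sound card and a USB microphone), selecting the device by name may help. In the SpeechRecognition library, sr.Microphone.list_microphone_names() returns the device names and sr.Microphone(device_index=...) selects one of them. The helper below, find_device_index, is a hypothetical name and the device names in the example are made up; it only does the name matching, so it can be tried without any audio hardware.

```python
def find_device_index(names, wanted):
    """Return the index of the first device whose name contains `wanted`
    (case-insensitive), or None if there is no match."""
    wanted = wanted.lower()
    for index, name in enumerate(names):
        if name and wanted in name.lower():
            return index
    return None


if __name__ == '__main__':
    # Made-up device names; on the Raspberry Pi they would come from
    # sr.Microphone.list_microphone_names()
    names = ['bcm2835 Headphones', 'USB Audio Device', 'TONOR USB Microphone']
    print(find_device_index(names, 'tonor'))
```

The result could then be used as m = sr.Microphone(device_index=index), falling back to the default device when it is None.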

Commands

This could be the start of a basic home automation system. Examples of commands:
"Quelle heure est-il" / "What time is it",
"Quel temps fait-il" / "What's the weather like",
"Jouer un bon morceau de musique" / "Play some good music",
"Allume la radio" / "Turn the radio on",
"Alarme dans 10 minutes" / "Alarm in 10 minutes",
"Une bonne blague" / "A good joke",
dictate text,
find and open a file with s3f (super fast file finder),
run several actions at once by launching a shell script by voice,
open the last file(s) edited in the word processor,
open the mail client,
shut down the machine,
shut down the machine in XX minutes,
play music.
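
A command such as shutting down the machine in XX minutes needs the number extracted from the recognized sentence. Here is a minimal sketch, assuming the French phrasing "éteindre la machine dans N minutes", that the recognizer returns the number as digits, and that the standard shutdown command is run through sudo. The function name shutdown_command and the exact phrasing are assumptions, not part of piTalk.

```python
import re

# Assumed phrasing: "éteindre la machine dans 10 minutes" (the delay is optional)
PATTERN = re.compile(r"^éteindre la machine(?: dans (\d+) minutes?)?$")


def shutdown_command(commande_vocale):
    """Return the argument list for `shutdown`, or None if the sentence
    is not a shutdown request."""
    match = PATTERN.match(commande_vocale.strip().lower())
    if match is None:
        return None
    delay = match.group(1)
    # `shutdown -h now` halts immediately, `shutdown -h +N` after N minutes
    return ['sudo', 'shutdown', '-h', 'now' if delay is None else '+' + delay]


if __name__ == '__main__':
    print(shutdown_command('éteindre la machine dans 10 minutes'))
```

The returned list is in the form expected by subprocess.Popen, as used elsewhere in talking.py.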
