Opened 8 years ago

Last modified 5 years ago

#705 new enhancement/feature request

Voice recognition based control for Navit

Reported by: tegzed Owned by: Dandor (tegzed)
Priority: minor Milestone: version 0.6.0
Component: core Version: git master
Severity: Keywords: input, ui, speech, voice, control
Cc:

Description

Hello,

Recently I have been thinking about adding voice recognition support to navit, which could make it easier to control navit while driving. There are open source tools and libraries for voice recognition (like CMUSphinx). With such a tool it would be possible to bind spoken commands to navit functions like zooming in and out, stopping navigation, etc. It would also be useful to send voice "keyboard commands" for menu navigation, like "up", "down", "enter", etc. Typing could also be made easier, for example for search input, with commands like "type Budapest".

Possibly the dbus interface could be used to send recognized voice data from a command-line voice recognition tool to navit. Alternatively, libSphinx could be compiled into the navit executable to use its functionality directly.

What do you think about the idea? What are the possible further usage areas? Does anybody have some experience with such tools?

Attachments (5)

voice2dbus.pl (337 bytes) - added by tegzed 8 years ago.
simple tool to filter pocketsphinx output and forward recognized text to navit via dbus interface
wlist5o.dic (454 bytes) - added by tegzed 8 years ago.
added dictionary entries for new voice commands
pocketsphinx.diff (2.6 KB) - added by tegzed 8 years ago.
call dbus-send directly from pocketsphinx_continuous to eliminate the need for intermediate forwarding script
voice_recognition.diff (6.2 KB) - added by tegzed 8 years ago.
improved voice recognition patch and added support for toggling northing and 2D/3D view
navit_voice_command_map.txt (151 bytes) - added by tegzed 8 years ago.
command mapping file for pocketsphinx_continuous to enable user defined command binding

Change History (11)

comment:1 Changed 8 years ago by tegzed

  • Priority changed from major to minor
  • Type changed from defect/bug to enhancement/feature request

Changed 8 years ago by tegzed

simple tool to filter pocketsphinx output and forward recognized text to navit via dbus interface

comment:2 Changed 8 years ago by tegzed

Hello,

I did some initial work on this topic. Using the attached modifications I was able to bind some navit commands to spoken commands as well as issuing virtual keystrokes by voice.

If you want to test it, the following steps should be taken:

  1. Build a dbus-enabled navit with the above patch applied.
  2. Install the pocketsphinx package with the wsj language model.
  3. Replace the dictionary file /usr/share/pocketsphinx/model/lm/wsj/wlist5o.dic with the attached one (save the original if you need it). It contains the limited set of voice commands that navit accepts. It is worth using a small dictionary since it improves the recognition rate.
  4. Start navit.
  5. Launch the voice data forwarding tool (voice2dbus.pl is also attached; the dbus-send utility must be installed):
    pocketsphinx_wsj | perl voice2dbus.pl

If everything went ok, you can issue navit voice commands.

Currently the following commands are supported:

FAR --> zoom out
CLOSE --> zoom in

and some virtual keystrokes: ENTER ESCAPE LEFT RIGHT UPWARD DOWN

Some of the voice commands seem weird at first, but the synonyms used appear to be recognized more reliably than the more natural words. The list of commands and "keystrokes" can be extended easily; this set is enough for test purposes. The voice forwarding "tool" is very primitive and ugly at the moment, but it is functional. Maybe a separate application can be built later with libpocketsphinx to do this.
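For readers without the attachment at hand, the forwarding step described above can be sketched roughly as follows. This is not the attached voice2dbus.pl, only a minimal Python illustration of the same idea. It assumes the recognizer prints one recognized word per line; the action labels on the right side of the table (zoom_out, zoom_in, up and so on) and the function names translate and forward are hypothetical, and the actual dbus-send call is left out because its arguments are not given in this ticket.

```python
# Spoken word -> (kind, action). The spoken words are the ones listed in this
# ticket; the action labels are hypothetical, not Navit identifiers.
VOICE_TABLE = {
    "FAR": ("command", "zoom_out"),
    "CLOSE": ("command", "zoom_in"),
    "ENTER": ("key", "enter"),
    "ESCAPE": ("key", "escape"),
    "LEFT": ("key", "left"),
    "RIGHT": ("key", "right"),
    "UPWARD": ("key", "up"),
    "DOWN": ("key", "down"),
}


def translate(line):
    """Return (kind, action) for one recognized line, or None if unknown."""
    word = line.strip().upper()
    return VOICE_TABLE.get(word)


def forward(lines, send=print):
    """Pass every recognized entry to `send`; return how many were sent.

    `send` is a placeholder for whatever delivers the action to navit
    (for example a wrapper around dbus-send). By default it only prints.
    """
    count = 0
    for line in lines:
        hit = translate(line)
        if hit is not None:
            send(hit)
            count += 1
    return count
```

In a pipeline, forward(sys.stdin) would take the place of the Perl filter, with send replaced by a function that delivers the action over dbus. Unknown words are dropped silently, which matches the idea of keeping the accepted vocabulary small.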

comment:3 Changed 8 years ago by mineque

  • Owner changed from KaZeR to tegzed

comment:4 Changed 8 years ago by mineque

  • Owner changed from tegzed to Dandor (tegzed)

Changed 8 years ago by tegzed

added dictionary entries for new voice commands

Changed 8 years ago by tegzed

call dbus-send directly from pocketsphinx_continuous to eliminate the need for intermediate forwarding script

Changed 8 years ago by tegzed

improved voice recognition patch and added support for toggling northing and 2D/3D view

Changed 8 years ago by tegzed

command mapping file for pocketsphinx_continuous to enable user defined command binding

comment:5 Changed 8 years ago by tegzed

Hi,

I did some more work on voice recognition. After applying the attached pocketsphinx.diff patch to the pocketsphinx source (in the folder trunk/pocketsphinx/src/programs of the cmusphinx svn tree), the pocketsphinx_continuous utility calls dbus-send directly, so the voice2dbus.pl script is no longer needed to forward commands to navit.

The patched utility also reads a command mapping file called navit_voice_command_map.txt from the directory where pocketsphinx_continuous is launched, and uses it to map user-defined words to commands. With this mapping the user can bind any command to any spoken word. This makes it possible, for example, to use a non-English sphinx configuration or to associate a well recognizable word with a command. The mapping file is plain text: each entry is the navit command text followed by the spoken word, separated by whitespace. See the attached navit_voice_command_map.txt for an example. The supported commands are also listed in that file.

Since my previous comment, support for toggling northing and 2D/3D view has been added. If you have any ideas about which navit activities should be supported by voice recognition, please write a comment on this ticket.
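As a rough illustration of the mapping file format described above (command text first, spoken word second, whitespace in between), here is a minimal Python sketch of a parser. It is not the code from pocketsphinx.diff. The command names in the sample (zoom_in, zoom_out) are hypothetical placeholders, and skipping lines with fewer than two fields is an assumption, since the exact layout of the attached file is not reproduced in this ticket.

```python
def parse_command_map(text):
    """Return a dict mapping each spoken word (upper-cased) to a command.

    Each entry is expected to be: <command text> <spoken word>, separated
    by whitespace. Assumption: lines with fewer than two fields are ignored.
    """
    mapping = {}
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) < 2:
            continue
        command = fields[0]
        spoken = " ".join(fields[1:])
        mapping[spoken.upper()] = command
    return mapping


# Hypothetical sample content; the real command names are listed in the
# attached navit_voice_command_map.txt.
SAMPLE = """\
zoom_in    CLOSE
zoom_out   FAR

zoom_in    naeher
"""

COMMANDS = parse_command_map(SAMPLE)
```

Keying the dictionary by the spoken word makes the lookup at recognition time a single step, and it lets several words (for instance an English and a non-English one) point at the same command.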

comment:6 Changed 5 years ago by usul

  • Keywords input ui speech voice control added
  • Milestone set to version 0.6.0

I really like the idea, but would think that Navit is a pretty hard scenario:

  • Recognition outdoors in loud scenarios (think about a drive on a highway)
  • Embedded devices with bad performance
  • Huge global dataset of streets/places, all with English and local names

Anyway, I will schedule it for the next major release. Maybe somebody wants to have another look at this.
