Opened 10 years ago
Last modified 8 years ago
#705 new enhancement/feature request
Voice recognition based control for Navit
Reported by: | tegzed | Owned by: | Dandor (tegzed)
---|---|---|---
Priority: | minor | Milestone: | version 0.6.0
Component: | core | Version: | git master
Severity: | | Keywords: | input, ui, speech, voice, control
Cc: | | |
Description
Hello,
Recently I was thinking about adding voice recognition support to Navit, which could make it easier to control Navit while driving. There are open source tools and libraries for voice recognition (like CMUSphinx). Using such a tool, it would be possible to bind spoken commands to Navit functions like zooming in and out, stopping navigation, etc. It would also be useful to send spoken "keyboard commands" for menu navigation, like "up", "down", "enter", etc. Typing could also be made easier, for example helping search input with commands like "type Budapest".
Possibly the dbus interface can be used to send recognized voice data from a command-line voice recognition tool to Navit, or libSphinx could be compiled into the Navit executable to use its functionality directly.
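As a sketch of the D-Bus route: a recognizer would only need to translate each spoken word into a single dbus-send call against Navit's session-bus interface. The service, object path, and zoom method below follow Navit's D-Bus binding, but treat the exact names and signatures as assumptions to verify against your build; the function just prints the call it would make.

```shell
# Sketch only: map one recognized word to the dbus-send invocation
# that would drive Navit. The service/object/interface names follow
# Navit's D-Bus binding but should be verified against your build.
word_to_dbus_cmd() {
  case "$1" in
    FAR)   factor=-2 ;;  # zoom out
    CLOSE) factor=2  ;;  # zoom in
    *)     return 1  ;;  # ignore unmapped words
  esac
  echo "dbus-send --session --dest=org.navit_project.navit" \
       "/org/navit_project/navit/default_navit" \
       "org.navit_project.navit.navit.zoom int32:$factor"
}

# Print the call we would make for a spoken "FAR" (zoom out):
word_to_dbus_cmd FAR
```

Echoing instead of executing keeps the sketch testable without a running Navit; replacing the `echo` with direct execution gives the real forwarder.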
What do you think about the idea? What are the possible further usage areas? Does anybody have some experience with such tools?
Attachments (5)
Change History (11)
comment:1 Changed 10 years ago by tegzed
- Priority changed from major to minor
- Type changed from defect/bug to enhancement/feature request
Changed 10 years ago by tegzed
simple tool to filter pocketsphinx output and forward recognized text to navit via dbus interface
comment:2 Changed 10 years ago by tegzed
Hello,
I did some initial work on this topic. Using the attached modifications I was able to bind some navit commands to spoken commands as well as issuing virtual keystrokes by voice.
If you want to test it, the following steps should be taken:
- build a dbus-enabled Navit with the above patch applied
- install pocketsphinx package with wsj language model
- replace the dictionary file /usr/share/pocketsphinx/model/lm/wsj/wlist5o.dic with the attached one (save the original if you need it). This contains the limited set of voice commands that Navit accepts. It is worth using a small dictionary, since it improves the recognition rate.
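For reference, a pocketsphinx .dic file is plain text: one word per line, followed by its pronunciation in the CMU phoneset. A trimmed dictionary for the commands in this ticket might look like the fragment below; the pronunciations shown are illustrative, so take the real entries from the original wlist5o.dic rather than from here.

```text
FAR F AA R
CLOSE K L OW S
ENTER EH N T ER
ESCAPE IH S K EY P
LEFT L EH F T
RIGHT R AY T
UPWARD AH P W ER D
DOWN D AW N
```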
- Start navit
- launch the voice data forwarding tool (voice2dbus.pl, also attached; the dbus-send utility needs to be installed):
pocketsphinx_wsj | perl voice2dbus.pl
If everything went ok, you can issue navit voice commands.
Currently the following commands are supported:
- FAR --> zoom out
- CLOSE --> zoom in
and some virtual keystrokes: ENTER ESCAPE LEFT RIGHT UPWARD DOWN
Some voice commands look odd at first, but the chosen synonyms seem to be recognized more reliably than the more natural words. The list of commands and "keystrokes" can be extended easily; this set is enough for test purposes. The voice forwarding "tool" is very primitive and ugly at the moment, but it is functional; a separate application could be built later with libpocketsphinx to do this.
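For anyone curious what the forwarding step amounts to before reading voice2dbus.pl: it is essentially a line filter that strips the utterance prefix from each pocketsphinx hypothesis and hands the recognized text onward. A minimal shell sketch, assuming the older `pocketsphinx_continuous` output format of "000000001: FAR" (check against your version), with the actual D-Bus send replaced by an `echo`:

```shell
# Minimal sketch of a pocketsphinx -> Navit forwarder.
# Assumes each hypothesis line looks like "000000001: FAR"
# (older pocketsphinx_continuous output); adjust as needed.
forward_line() {
  # Strip the leading utterance id, keep only the recognized text.
  text=$(printf '%s\n' "$1" | sed 's/^[0-9]*: *//')
  [ -n "$text" ] || return 0
  echo "would send to navit: $text"  # replace echo with a dbus-send call
}

# Real usage would be:
#   pocketsphinx_continuous | while read -r line; do forward_line "$line"; done
forward_line "000000001: FAR"
```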
comment:3 Changed 10 years ago by mineque
- Owner changed from KaZeR to tegzed
comment:4 Changed 10 years ago by mineque
- Owner changed from tegzed to Dandor (tegzed)
Changed 10 years ago by tegzed
call dbus-send directly from pocketsphinx_continuous to eliminate the need for the intermediate forwarding script
Changed 10 years ago by tegzed
improved voice recognition patch and added support for toggling northing and 2D/3D view
Changed 10 years ago by tegzed
command mapping file for pocketsphinx_continuous to enable user defined command binding
comment:5 Changed 10 years ago by tegzed
Hi,
I did some more work on voice recognition. By applying the attached pocketsphinx.diff patch to the pocketsphinx source (in folder trunk/pocketsphinx/src/programs of the cmusphinx svn tree), the pocketsphinx_continuous utility will call dbus-send directly, so the voice2dbus.pl script is no longer needed to forward commands to Navit.
It also uses a command mapping file called navit_voice_command_map.txt, read from the directory where pocketsphinx_continuous is launched, to map user-defined words to commands. Using this mapping, the user can bind any command to any spoken word. This enables, for example, using a non-English Sphinx configuration, or associating an easily recognizable word with a command. The format of the mapping file is textual: the Navit command text followed by the spoken word, separated by whitespace. See the attached navit_voice_command_map.txt for an example; the supported commands are also listed in that file.
Since my previous comment, support for toggling northing and 2D/3D view has been added. If you have any ideas about which Navit activities should be controllable by voice recognition, please comment on this ticket.
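To make the mapping-file format concrete: each line is a Navit command followed by the spoken word, whitespace-separated, so looking up a recognized word is a one-line awk over the file. The entries below are illustrative placeholders, not Navit's real command names (those are listed in the attached navit_voice_command_map.txt):

```shell
# Illustrative mapping file: Navit command text, then the spoken word.
# These command names are placeholders; see navit_voice_command_map.txt
# for the real supported commands.
cat > navit_voice_command_map.txt <<'EOF'
zoom_out FAR
zoom_in CLOSE
toggle_northing COMPASS
EOF

# Look up the Navit command bound to a recognized word:
lookup_command() {
  awk -v w="$1" '$2 == w { print $1 }' navit_voice_command_map.txt
}

lookup_command FAR   # prints zoom_out
```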
comment:6 Changed 8 years ago by usul
- Keywords input ui speech voice control added
- Milestone set to version 0.6.0
I really like the idea, but I think Navit is a pretty hard scenario:
- Recognition outdoors in loud environments (think of driving on a highway)
- Embedded devices with poor performance
- A huge global dataset of streets/places, all with English and local names
Anyway, I will schedule it for the next major release. Maybe somebody wants to have a look at this again.