Opened 6 years ago

Closed 5 years ago

#1030 closed defect/bug (fixed)

Navit crashes if not run from source dir

Reported by: sleske
Owned by: KaZeR
Priority: major
Milestone:
Component: core
Version: git master
Severity:
Keywords:
Cc:

Description

If I build Navit (via the autotools build) and run it from the source directory, everything works.

However, if I start it from somewhere else, e.g. with the current directory one level above the source directory, it crashes:

$ ./navit/navit -c navit/navit.xml 
config file n is set to `navit/navit.xml'
navit:main_real:trying navit/navit.xml
navit:map_new:invalid type 'textfile'
navit:navit_new:return 0x85243f0
navit:navit_ref:refcount 2
navit:vehicle_new:invalid type 'gpsd'
navit:tracking_ref:refcount 2
navit:tracking_unref:refcount 1
navit:speech_new:wrong type 'cmdline'
navit:xinclude:Unable to include /usr/local/share/navit/maps/*.xml
navit:navit_init:no gui
navit:navit_destroy:enter 0x85243f0
navit:tracking_unref:refcount 0
navit:navit_unref:refcount -1219161105
navit:navit_destroy:enter 0x85243f0
GNU gdb (GDB) 7.3-debian
[....]
#4  0x0806598c in sigsegv (sig=11) at debug.c:81
#5  <signal handler called>
#6  0x08095831 in transform_get_projection (this_=0x0) at transform.c:604
#7  0x0806dfbc in do_draw (displaylist=0x8524880, cancel=1, flags=0) at graphics.c:2071
#8  0x0806ea54 in graphics_draw_cancel (gra=0x0, displaylist=0x8524880) at graphics.c:2258
#9  0x0807d5b7 in navit_destroy (this_=0x85243f0) at navit.c:3201
#10 0x0807d8a6 in navit_unref (this_=0x85243f0) at navit.c:3255
#11 0x0805c3eb in end_element (context=0x85229c8, element_name=0x85c4280 "navit", 
    user_data=0xbfb9d11c, error=0xbfb9d05c) at xmlconfig.c:669
[...]

This is on i686, running Debian Linux, SVN rev. 5019. I can reproduce this starting with rev. 4981.

Change History (6)

comment:1 Changed 6 years ago by tryagain

Do you have Navit installed system-wide when you call it this way?

What are you trying to achieve by doing so?

Navit depends on a number of items being available. The first of them is the config file (navit.xml), of course. But the config file should also point to some minimal (probably undocumented) set of modules if they are not built into the executable.

When Navit starts from the build directory, it expects to find modules in the appropriate subdirectories of the build tree. Otherwise, it expects to be installed system-wide and looks for modules in system-wide paths like /usr/local/share/navit/lib or something similar (which may be defined during the pre-build configuration process).

If you are wondering why Navit crashes when no modules are found, please rephrase your ticket. I agree this is a problem, but this ticket isn't about it.

If you want Navit to look for the modules in a subdirectory relative to navit.xml, expecting that the given navit.xml resides in the build tree, I'd say it's a bad idea. A user might run a system-wide installed Navit with a handmade ~/navit.xml, and we do not expect Navit to look for modules in the user's home directory in that case.

It might also crash if you have an older version of Navit (especially its modules) installed system-wide but are running a freshly compiled navit binary. It is not a problem at all if Navit crashes in such a situation.

tryagain

comment:2 Changed 6 years ago by tryagain

I think I have got the idea behind this ticket.

You're asking that Navit detect whether it is being run from the build directory by examining the directory where the navit binary sits, rather than the current directory from which the command to start Navit is issued.

This is different from the current behavior, but seems to be acceptable.

I still have no idea why the change is needed, though. Why do you run Navit from somewhere other than the directory it was built in?

comment:3 Changed 6 years ago by sleske

Thanks for the explanation. I did not realize that modules are loaded at runtime (I naively thought everything was compiled into the "navit" binary).

This explains the error messages like "invalid type 'gpsd'", "no gui" etc.

Then this bug is really about the confusing behavior. Navit should print some error message about not finding its modules, and maybe a note like "Did you install Navit correctly?" It should definitely not crash; crashing is rarely helpful :-).

Anyway, seen like this, this bug is just annoying, not really problematic.

I'll leave it open, and try to fix Navit so it prints some meaningful error message rather than crashing if it does not find its modules.

comment:4 Changed 6 years ago by sleske

Ok, I now understand better what's happening:

Navit cannot start without its plugins

The fact that Navit does not start if you are not running it from the directory of the "navit" binary is not a bug but by design, as explained by tryagain.

I've added some warnings to make this easier to see (rev. 5064). Navit now prints

Warning: No plugins found. Is Navit installed correctly?

if the plugin loading code did not find any plugins.
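
As a rough sketch of this kind of check (not the actual change in rev. 5064), the code below counts the files matching a plugin search pattern with POSIX glob() and prints the warning when there are none. The function names and the idea of passing a single glob pattern are assumptions made for the example.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>

/* Hypothetical helper: count the paths matching a shell-style pattern.
 * The pattern (for example a plugin directory plus a wildcard) is
 * supplied by the caller; nothing here is taken from Navit itself. */
size_t count_matches(const char *pattern)
{
    glob_t g;
    size_t n = 0;

    assert(pattern != NULL);

    if (glob(pattern, 0, NULL, &g) == 0) {
        n = g.gl_pathc;
        globfree(&g);   /* only free after a successful glob() */
    }
    return n;
}

/* Print the warning quoted above if nothing matches.
 * Returns 1 if at least one match was found, 0 otherwise. */
int check_plugins(const char *pattern, FILE *out)
{
    if (count_matches(pattern) == 0) {
        fprintf(out, "Warning: No plugins found. Is Navit installed correctly?\n");
        return 0;
    }
    return 1;
}
```

The point of the design is that startup can continue or stop cleanly after the warning, so the user sees a cause instead of a string of "invalid type" messages.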

Navit crashes instead of shutting down

This is apparently a problem with the reference counting implementation.

The problem is that navit_destroy is called twice:

When parsing the <navit> tag, in end_element (xmlconfig.c:655ff), navit_init is called. If no GUI is found (navit.c:1972), navit_init will call navit_destroy. Then end_element calls navit_unref, even though the navit structure has already been freed by navit_destroy, and mayhem ensues. In my test, the refcount in the deallocated navit structure happens to look negative, so navit_unref calls navit_destroy, and segfaults.

I'm not sure how the reference counting is supposed to work, but as I understand it, navit_destroy must never be called directly. The only code calling it should be navit_unref (once the refcount reaches zero).

However, Navit's code is littered with calls to navit_destroy. Now I'm confused:

  • How can this work? Won't calls to navit_destroy all lead to crashes if navit_unref is called later?
  • What is the point of using refcounting on the navit struct? The struct lives as long as Navit is running, so there's no need to ever free it, is there?
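
To make the rule above concrete, here is a minimal sketch of the reference-counting pattern as I understand it. It uses a hypothetical struct object rather than Navit's real structures, and the destroyed counter exists only so the behaviour can be observed. The destroy function is static, so unref is the only code that can reach it, and a failing init reports the error instead of freeing the object itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object (not Navit's struct). */
struct object {
    int refcount;
    int *destroyed;     /* observation hook: bumped once when freed */
};

struct object *object_new(int *destroyed)
{
    struct object *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcount = 1;    /* the creator holds the first reference */
    o->destroyed = destroyed;
    return o;
}

/* static: only object_unref() may call this. */
static void object_destroy(struct object *o)
{
    if (o->destroyed != NULL)
        (*o->destroyed)++;
    free(o);
}

struct object *object_ref(struct object *o)
{
    assert(o->refcount > 0);
    o->refcount++;
    return o;
}

void object_unref(struct object *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0)
        object_destroy(o);
}

/* A failing init only reports the failure; it does not free the object.
 * Whoever holds a reference drops it later with object_unref(). */
int object_init(struct object *o, int have_gui)
{
    (void)o;
    return have_gui ? 0 : -1;
}
```

With this layout, a failed init followed by unref from the caller frees the object exactly once, which is the sequence that went wrong in the backtrace above.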

comment:5 Changed 6 years ago by sleske

I have removed the calls to navit_destroy in navit_init (rev. 5065). This fixes the crash.

I'd still be interested to get answers to my questions above... I'll keep this bug open for a while.

comment:6 Changed 5 years ago by sleske

  • Resolution set to fixed
  • Status changed from new to closed

The problem described is fixed, so no point in keeping this open.

The questions about refcounting I'll try to get answered elsewhere.
