Opened 6 years ago

Closed 4 years ago

Last modified 4 years ago

#794 closed defect/bug (fixed)

Search in binfile for "Germany, Düsseldorf, Gladbacher Str" finds Gladbacher Str in Neuss, not Düsseldorf

Reported by: sleske
Owned by: cp15
Priority: major
Milestone:
Component: mapdrivers/OSM
Version: git master
Severity:
Keywords:
Cc:

Description

If I search for the street "Gladbacher Str" in Düsseldorf (Germany), Navit finds the "Gladbacher Str" in Neuss, not the one in Düsseldorf.

To reproduce:

  • Get a binfile created from OSM data which contains Neuss and Düsseldorf, e.g. using Navit's planet extractor.
  • Open the search dialog and search for country "Germany", city "Düsseldorf", street "Gladbacher Str".

Navit will list "Düsseldorf, Gladbacher Str" in the search results. However, if you jump to the result on the map, you see that it found the Gladbacher Str in Neuss (to the west of the Rhine), not the one in Düsseldorf.

Change History (14)

comment:1 Changed 6 years ago by sleske

Some more information about the problem:

  • The search on openstreetmap.org does find the right street (search for "Gladbacher Str, Düsseldorf"), so it's not a problem with the OSM data.
  • Navit correctly finds other streets near Gladbacher Str (e.g. "Ahnenweg", which branches off from Gladbacher Str), so it does not appear to be a problem with Düsseldorf in general.
  • The Gladbacher Str in Neuss is very close to the border with Düsseldorf (Neuss borders on Düsseldorf). Maybe this is confusing Navit.

comment:2 Changed 6 years ago by sleske

I tried to debug the problem, and I found out that apparently Navit actually *does* find the street, but suppresses the search result.

In map/binfile/binfile.c, the static function duplicate() (line 2024 ff.) is called for each result of the search. It appears to be meant to weed out duplicate search results.

If I disable this function (by always making it return 0), I get about 40 results for "Gladbacher Str", among them both the one in Neuss and the one in Düsseldorf.

As far as I can tell, the "duplicate" function just throws out duplicate results based on name, so if a search legitimately finds two results with identical names, it would only return the first. The "duplicate" function also tries to check coordinates, but I don't understand that part; in the cases where I debugged it, it never found coordinates, so that part of the code didn't do anything.
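
The effect described above can be sketched as follows. This is a minimal Python illustration, not Navit's C code: the helper name dedupe_by_name and the (street, town) tuples are assumptions made up for the example, and the coordinate check is left out because it never triggered during debugging.

```python
def dedupe_by_name(results):
    """Keep only the first result for each street name (hypothetical helper)."""
    seen = set()
    kept = []
    for name, town in results:
        if name in seen:
            continue  # dropped purely because the name was seen before
        seen.add(name)
        kept.append((name, town))
    return kept


# Illustrative search results in the order the search might return them.
results = [
    ("Gladbacher Str", "Neuss"),
    ("Gladbacher Str", "Düsseldorf"),
    ("Ahnenweg", "Düsseldorf"),
]
filtered = dedupe_by_name(results)
```

With this input only the Neuss entry survives for "Gladbacher Str", which matches the observed behaviour: whichever street happens to come first wins.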

comment:3 Changed 6 years ago by sleske

As far as I see, this problem will occur every time a search finds two (or more) different streets with the same name: only one of them will be shown.

Note that this will not only occur in cases where city borders are not strictly observed by the search algorithm (as in this example); many cities in Germany really have different streets with the same name. For example, there are 8 (!) "Gartenstraße" in Berlin. So this is not just a problem with the search function being a bit too generous when deciding which streets belong to a given city.

The only solution I can see would be to group the search results into sets of connected streets, and then show only one result per set. That should solve the problem. I'd like to tackle this problem, but first I'd appreciate some feedback on whether my analysis is correct and whether there are things I overlooked. Thanks!
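
A rough sketch of the grouping idea, in Python rather than Navit's C. It assumes each street segment is available as a way with a name and a set of node ids (the ways dictionary and all ids below are hypothetical, not the binfile layout) and treats two ways as connected when they have the same name and share a node. A union-find structure merges them into groups:

```python
from collections import defaultdict


def group_connected(ways):
    """Group same-named ways that share at least one node id.

    `ways` maps a way id to (street name, set of node ids). This structure
    is an assumption for illustration only.
    """
    parent = {way_id: way_id for way_id in ways}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Remember the first way seen at each (name, node) and merge later ones.
    first_way_at = {}
    for way_id, (name, nodes) in ways.items():
        for node in nodes:
            key = (name, node)
            if key in first_way_at:
                parent[find(way_id)] = find(first_way_at[key])
            else:
                first_way_at[key] = way_id

    groups = defaultdict(list)
    for way_id in ways:
        groups[find(way_id)].append(way_id)
    return sorted(sorted(group) for group in groups.values())


ways = {
    1: ("Gartenstraße", {10, 11}),
    2: ("Gartenstraße", {11, 12}),  # continues way 1 at node 11
    3: ("Gartenstraße", {50, 51}),  # same name, not connected
    4: ("Ahnenweg", {12, 13}),      # touches way 2 but has another name
}
groups = group_connected(ways)
```

Here ways 1 and 2 end up in one group and way 3 in its own, so the search would list two "Gartenstraße" entries. A street interrupted by a gap (for example at a square) would still be split by this simple shared-node test.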

comment:4 Changed 6 years ago by woglinde

Hi,

as discussed with you on IRC, the problem comes from the is_in tag, and the solution is to move Navit further towards using relations instead of the tag.

Bye Henning

comment:5 follow-up: Changed 6 years ago by sleske

Thanks for the feedback. Actually, I believe there are two separate problems here:

1) Navit's search sometimes finds streets outside the city it's supposed to search in (in this case a street is found in Neuss, while the search was for Düsseldorf).

2) In the results of the search by street, Navit never shows more than one result per street name. That means that if a search correctly finds multiple streets with the same name, all but one will be suppressed. To see this problem, try searching for "Berlin, Gartenstraße" in Navit. When I tried it (using an extract of Berlin, Germany from the Navit planet extractor), Navit found only one street "Gartenstraße", while e.g. Google Maps lists 8 streets (in different city districts).

I believe these two problems are completely independent (though they often occur together).

Work on the "is_in problem" you mention would only solve 1), so I believe 2) still needs to be addressed.

comment:6 in reply to: ↑ 5 Changed 6 years ago by tryagain

Replying to https://wiki.navit-project.org/index.php/user:sleske:

> 2) In the results of the search by street, Navit never shows more than one result per street name. That means that if a search correctly finds multiple streets with the same name, all but one will be suppressed. To see this problem, try searching for "Berlin, Gartenstraße" in Navit. When I tried it (using an extract of Berlin, Germany from the Navit planet extractor), Navit found only one street "Gartenstraße", while e.g. Google Maps lists 8 streets (in different city districts).

Indeed, if a single street consists of multiple polylines, the house number search is affected by the same problem. House numbers are searched within some area around one of the polylines representing the street, so houses placed relatively far from that segment are not found.

As a workaround, address filtering can be done in the POI dialog.

comment:7 Changed 5 years ago by sleske

Note: #1005 describes a similar problem, apparently also caused by the approximation of city areas.

comment:8 Changed 4 years ago by sleske

Another duplicate: #1097.

comment:9 Changed 4 years ago by tryagain

Hi,

I'm working now on these issues...

In the OSM wiki I found the addr:suburb tag, which can be used to distinguish between identically named streets in different city districts. It seems to see some use in Russia.

However, it looks like the Berlin Gartenstrasse streets do not carry an addr:suburb tag. How would you distinguish them?

I don't think it would be useful to show the sub-city district for every street found, as that would clutter the search result list. Also, when a street crosses a city district boundary, its house numbering tends to continue. People would [usually] consider it a single street, and would be displeased if we asked them to choose one of several virtual streets in that case.

Here in Russia an address usually does not contain a suburb name; knowing the city, street and house number is usually enough to take a taxi or order a pizza. For street name collisions, we use the suburb name (or the postal code for paper mail). addr:suburb is intended for such collisions. It would not be specified for most streets and buildings, so displaying it would not clutter the list.

How would you tell a taxi driver which Gartenstrasse in Berlin you want?

Technically, we could use the postal_code tag of the street/building to distinguish the Berlin Gartenstrasse streets. Do you think using postal_code from the street together with addr:suburb would be useful to distinguish addresses?

I am also thinking of no longer relying on the distance from the street way to the house. Sometimes houses are placed too far from the street, and sometimes there are no street ways at all, but a few houses share a street name in their addr:street tag. I think these should be findable by street name too.

tryagain.

Last edited 4 years ago by tryagain

comment:10 follow-up: Changed 4 years ago by sleske

> In the OSM wiki I found the addr:suburb tag, which can be used to distinguish between identically named streets in different city districts. [...] However, it looks like the Berlin Gartenstrasse streets do not carry an addr:suburb tag. How would you distinguish them?

Generally, in Germany suburbs are used as well: For example, if you look at a street directory of Berlin, it will show the suburb after the street to disambiguate multiple streets with the same name. You can also see this on Google Maps: Search for "Gartenstraße, Berlin", and Google maps will list multiple streets, each with suburb and postcode for disambiguation.

As to representation in OSM data: I think the addr:suburb tag is rarely used in Germany. I believe instead the suburbs, if they exist, exist as areas (closed way or relation tagged with boundary=administrative), so you must check which suburb area a given street/house is in.

As an example, look up "Gartenstraße,Berlin" on http://nominatim.openstreetmap.org/ . It will list multiple Gartenstraße (because Berlin has several), and for each it will list the suburb it's in. If you click the "detail" link, you can see that Nominatim uses the suburb areas in OSM data to find the suburb.

We should do something similar; I don't know whether this is currently feasible in Navit and/or maptool.
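
To make the area lookup concrete, here is a small Python sketch of a point-in-polygon test (ray casting) used to find the suburb that contains a given point. It is not taken from Navit, maptool or Nominatim. The function names, the suburb names and the planar coordinates are all assumptions for illustration; real boundaries would come from the boundary=administrative areas, and a street would need a representative point.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is `point` inside the closed `polygon`?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def suburb_of(point, suburbs):
    """Return the name of the first suburb polygon containing `point`."""
    for name, polygon in suburbs.items():
        if point_in_polygon(point, polygon):
            return name
    return None


# Two made-up rectangular suburbs sharing the edge x = 4.
suburbs = {
    "Suburb A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Suburb B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
```

For example, suburb_of((1, 1), suburbs) gives "Suburb A" and a point outside both rectangles gives None. A street that crosses a suburb boundary has no single answer, which is the case raised elsewhere in this ticket.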

> I don't think it would be useful to show the sub-city district for every street found, as that would clutter the search result list. Also, when a street crosses a city district boundary, its house numbering tends to continue.

Yes, it probably makes sense to only show the district/suburb information if there are multiple streets with the same name in one city. Using the postal_code is also possible, but I think it's less useful. The suburb names are usually much better known than the postcodes, so that's what people use. Of course, postcode might be useful as a fallback.
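
As a sketch of that display rule (Python, with a hypothetical display_labels helper and placeholder suburb and postcode values, not real Berlin data): the qualifier is appended only when a name occurs more than once, using the suburb if known and the postcode as a fallback.

```python
from collections import Counter


def display_labels(streets):
    """Build result labels, adding a qualifier only for colliding names.

    `streets` is a list of dicts with 'name' and optional 'suburb' and
    'postcode' keys (a made-up structure for illustration).
    """
    counts = Counter(s["name"] for s in streets)
    labels = []
    for s in streets:
        if counts[s["name"]] == 1:
            labels.append(s["name"])  # unique name, no qualifier needed
            continue
        qualifier = s.get("suburb") or s.get("postcode")
        labels.append(f'{s["name"]} ({qualifier})' if qualifier else s["name"])
    return labels


streets = [
    {"name": "Gartenstraße", "suburb": "Suburb A", "postcode": "00001"},
    {"name": "Gartenstraße", "postcode": "00002"},  # suburb unknown
    {"name": "Ahnenweg", "suburb": "Suburb A"},
]
labels = display_labels(streets)
```

Streets with a unique name keep a plain label, so the list only gets longer entries where disambiguation is actually needed.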

Anyway, I believe the most important point is to actually list multiple streets with the same name, instead of only showing one (as I explained in the report). If two identically-named streets are listed, it will look strange, but at least both streets can be selected (which you cannot do at present). Maybe I can tackle that shortly.

> I am also thinking of no longer relying on the distance from the street way to the house. Sometimes houses are placed too far from the street, and sometimes there are no street ways at all, but a few houses share a street name in their addr:street tag. I think these should be findable by street name too.

That's a good idea, but I believe this is a separate problem. I think you should file a separate bug for that.

comment:11 in reply to: ↑ 10 ; follow-up: Changed 4 years ago by tryagain

Replying to http://sleske.myopenid.com/:

> > In the OSM wiki I found the addr:suburb tag, which can be used to distinguish between identically named streets in different city districts. [...] However, it looks like the Berlin Gartenstrasse streets do not carry an addr:suburb tag. How would you distinguish them?

> Generally, in Germany suburbs are used as well: For example, if you look at a street directory of Berlin, it will show the suburb after the street to disambiguate multiple streets with the same name. You can also see this on Google Maps: Search for "Gartenstraße, Berlin", and Google maps will list multiple streets, each with suburb and postcode for disambiguation.

> As to representation in OSM data: I think the addr:suburb tag is rarely used in Germany. I believe instead the suburbs, if they exist, exist as areas (closed way or relation tagged with boundary=administrative), so you must check which suburb area a given street/house is in.

> As an example, look up "Gartenstraße,Berlin" on http://nominatim.openstreetmap.org/ . It will list multiple Gartenstraße (because Berlin has several), and for each it will list the suburb it's in. If you click the "detail" link, you can see that Nominatim uses the suburb areas in OSM data to find the suburb.

> We should do something similar; I don't know whether this is currently feasible in Navit and/or maptool.

> > I don't think it would be useful to show the sub-city district for every street found, as that would clutter the search result list. Also, when a street crosses a city district boundary, its house numbering tends to continue.

> Yes, it probably makes sense to only show the district/suburb information if there are multiple streets with the same name in one city. Using the postal_code is also possible, but I think it's less useful. The suburb names are usually much better known than the postcodes, so that's what people use. Of course, postcode might be useful as a fallback.

> Anyway, I believe the most important point is to actually list multiple streets with the same name, instead of only showing one (as I explained in the report). If two identically-named streets are listed, it will look strange, but at least both streets can be selected (which you cannot do at present). Maybe I can tackle that shortly.

Reporting multiple streets will be misleading.

For example, Nominatim reports six streets for "heerstrasse, berlin". But it is actually a single street that is drawn partly as primary and partly as residential and has two associated postcodes. It still keeps its house numbering regardless of these changes.

What would you say if Navit presented you with six results for "heerstrasse" and required you to select exactly one of them before you could find a house number?

Nominatim does not split the address search into town, street and house number steps, so you can simply enter one of

15, heerstrasse, berlin
119, heerstrasse, berlin
647, heerstrasse, berlin

and get your result. In Navit, a similar query can currently be done with POI address filtering.

But with our current step-by-step house number search, you would have to select the right part of the street first, and only then could you find the house. Selecting the right part will be tricky, especially when you are not familiar with the suburb names and postal codes of the city.

What I suggest is removing duplicate streets by both the addr:street (name) tag and addr:suburb, if it is given. House numbers would then be filtered by street name, suburb name and house number within the whole city polygon, as is done for streets now.
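
The suggestion can be sketched like this in Python. It is an illustration only, not Navit code: the helpers dedupe_streets and find_houses and the sample records are hypothetical, only the OSM tag names addr:street, addr:suburb and addr:housenumber are real, and the houses list is assumed to be already restricted to the city polygon.

```python
def dedupe_streets(streets):
    """Keep one street per (name, addr:suburb) pair (hypothetical helper)."""
    seen = set()
    kept = []
    for street in streets:
        key = (street["name"], street.get("addr:suburb"))
        if key not in seen:
            seen.add(key)
            kept.append(street)
    return kept


def find_houses(houses, street_name, suburb=None):
    """Match houses by addr:street (and addr:suburb if given), not by distance."""
    return [
        h["addr:housenumber"]
        for h in houses
        if h.get("addr:street") == street_name
        and (suburb is None or h.get("addr:suburb") == suburb)
    ]


streets = [
    {"name": "Heerstraße"},  # one street drawn as several ways
    {"name": "Heerstraße"},
    {"name": "Gartenstraße", "addr:suburb": "Suburb A"},
    {"name": "Gartenstraße", "addr:suburb": "Suburb B"},
]
houses = [
    {"addr:street": "Heerstraße", "addr:housenumber": "15"},
    {"addr:street": "Heerstraße", "addr:housenumber": "647"},
    {"addr:street": "Gartenstraße", "addr:suburb": "Suburb A",
     "addr:housenumber": "3"},
]
unique_streets = dedupe_streets(streets)
heerstrasse_numbers = find_houses(houses, "Heerstraße")
```

Note that same-named streets without addr:suburb still collapse into one entry under this key, so the Berlin case would only improve where the tag is present.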

If you want to flood your Navit search results with duplicate streets, just add "item_coord_rewind(item);" right before the line "if (!item_coord_get(item, &d->c, 1)) {" in binfile.c. I think you would be unhappy with the result.

Another idea would be to report districts matching the search phrase during the street search, and to return the user to the street search if they select a district. They could then select a street within that district and find house numbers only within it. But this will require processing suburb multipolygon relations in maptool, and we have to test whether the CPU load on the server stays within acceptable limits. I think I will test this after I finish testing the currently experimental feature of village/town/city multipolygons.

tryagain.

comment:12 in reply to: ↑ 11 Changed 4 years ago by sleske

Replying to http://wiki.navit-project.org/index.php/user:tryagain:

> Reporting multiple streets will be misleading.

> For example, Nominatim reports six streets for "heerstrasse, berlin". But it is actually a single street that is drawn partly as primary and partly as residential and has two associated postcodes. It still keeps its house numbering regardless of these changes.

Yes, obviously these street "fragments" must somehow be united. My idea was to only report one entry per connected set of streets (see comment 3 above). Thus a street would only be reported multiple times if the "fragments" are not all connected together.

> What I suggest is removing duplicate streets by both the addr:street (name) tag and addr:suburb, if it is given.

Nice idea. However, this means we will present streets multiple times if they are long and pass through several suburbs; that also seems ugly (or maybe not?).

> House numbers would then be filtered by street name, suburb name and house number within the whole city polygon, as is done for streets now.

Wouldn't searching the whole city be too much (and too slow)? But that probably needs to be tested; anyway, if we sometimes do not find house numbers because they are too far from "their" street, that's a serious bug, so the extra work may well be worth it.

Anyway, go ahead with your experiments. I'm looking forward to seeing your results.

comment:13 Changed 4 years ago by usul

  • Resolution set to fixed
  • Status changed from new to closed

I tested it with the current Linux version and it works well and finds the right result at 51°12'41"N 6°45'42"E. So the ticket is closed.

comment:14 Changed 4 years ago by sleske

Actually, I kept this bug open because of the discussion of the more general problem with duplicate street names. I opened a new bug for this: #1134. The problem described here is indeed fixed.
