Introduction
Many questions and issues concern the map data in the application. This article explains some technical details of the internal data format and data processing. It may interest developers and non-developers alike, but it assumes familiarity with the OSM data structure.
All the OsmAnd data is in 'obf' files. 'obf' files have a complex structure and can consist of multiple parts. The 'obf' file is in essence a "container" format containing multiple data blocks.
Currently obf files can contain (multiple) POI parts, (multiple) Map parts, (multiple) Transport parts and (multiple) Address parts. This list of "functional" parts can be extended in the future.
To combine, split off (extract) or delete parts of an obf file, you can use the 'binary_inspector' console tool provided with OsmAndMapCreator. As a side note: using 'obf' files larger than 2 GB is not recommended at all.
POI, Transport part
Address part
1. How does mapcreator generate its list of all places that will appear later in OsmAnd's offline address search? What objects are used in detail for that? What nodes with a place tag are included, and which are excluded?
All places that are visible in OsmAnd as Cities are taken from nodes that have the tag place (wiki.openstreetmap.org/wiki/Place). Currently used are city, town, suburb, village, hamlet.
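As an illustration only, the node filter described above could look like the following minimal sketch. The dictionary layout for nodes and the names `ADDRESS_PLACE_TYPES` and `is_address_place` are assumptions made for this example; they are not taken from the OsmAndMapCreator source.

```python
# Place values that the text lists as currently used for the address index.
ADDRESS_PLACE_TYPES = {"city", "town", "suburb", "village", "hamlet"}


def is_address_place(tags):
    """Return True if a node's tags carry one of the accepted place values."""
    return tags.get("place") in ADDRESS_PLACE_TYPES


# Hypothetical sample nodes, each with an id and a tag dictionary.
nodes = [
    {"id": 1, "tags": {"place": "town", "name": "Springfield"}},
    {"id": 2, "tags": {"place": "locality", "name": "Old Mill"}},
    {"id": 3, "tags": {"amenity": "cafe"}},
]

places = [node for node in nodes if is_address_place(node["tags"])]
print([node["id"] for node in places])
```

Only the first node survives the filter, because place=locality is not in the list given above and the third node has no place tag at all.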
2. How does mapcreator handle an area polygon that is given via a relation with boundary=administrative? How do you associate a place given as a node with its boundary, when it is present in the OSM data?
Simply put: it currently works by name.
Mapcreator tries to visit all boundaries, creates a closed (!) boundary from a relation or from separate ways, and associates it with one of the name(s). After that it tries to match the place with a boundary name by way of an algorithm. There is also an additional check that the boundary contains the place.
If there are many boundaries of different admin_level with the same name (containing each other, like district/town/region having the same name) the highest admin_level with exact matching will be chosen.
TODO: More details should be here (about districts of the city ...)
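The name-plus-containment idea can be sketched roughly as follows. This is a simplified illustration, not the actual mapcreator algorithm: it assumes boundaries are already closed rings of (lat, lon) pairs, uses a plain ray-casting test for containment, and only handles exact name matches. The names `point_in_polygon` and `match_boundary` and the dictionary fields are hypothetical.

```python
def point_in_polygon(lat, lon, ring):
    """Ray-casting test. `ring` is a closed list of (lat, lon) vertices,
    i.e. the first and the last vertex are identical."""
    inside = False
    for (lat1, lon1), (lat2, lon2) in zip(ring, ring[1:]):
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


def match_boundary(place, boundaries):
    """Among boundaries with exactly the same name that contain the place,
    return the one with the highest admin_level (None if there is none)."""
    candidates = [
        b for b in boundaries
        if b["name"] == place["name"]
        and point_in_polygon(place["lat"], place["lon"], b["ring"])
    ]
    return max(candidates, key=lambda b: b["admin_level"], default=None)


# Hypothetical data: a region and a town boundary sharing the name "Alpha".
region = {"name": "Alpha", "admin_level": 4,
          "ring": [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]}
town = {"name": "Alpha", "admin_level": 8,
        "ring": [(4, 4), (4, 6), (6, 6), (6, 4), (4, 4)]}
place = {"name": "Alpha", "lat": 5, "lon": 5}

print(match_boundary(place, [region, town])["admin_level"])
```

Both boundaries contain the node and match its name, so the one with the higher admin_level is picked, as described in the paragraph on boundaries of different admin_level.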
3. Where is the documentation about which admin level is the correct one to build an association to a certain place node? Which countries prefer which admin level(s)?
Currently the admin_level is an association between admin_level relation. The admin_centre is not used, because only a few relations provide that information.
4. How does mapcreator know which street belongs to which place? Are there different cases when a boundary polygon is given and when there is none?
There are many strategies to check, and they are prioritized:
- The most important are places and their boundaries. For the "street management algorithm" to work correctly, matching the place to the (multiple) boundaries must succeed. If the street belongs to many boundaries it will be registered in all appropriate places.
- is_in tag (deprecated). If a street has the "is_in" tag, its value is parsed and split by comma, and the street is attached to all cities (by exact name matching).
(TO CHECK: basic check street is in city radius?)
- If the street doesn't belong to any boundary (a boundary that was not properly closed could be the cause), mapcreator tries to find the closest/biggest city and registers the street there (sometimes it registers the street in a town 1 km away and misses a hamlet only 100 m away).
- OsmAndMapCreator can display what streets are associated to what city (context menu -> Show address). Local obf files should be present and configured in Settings.
- The binary inspector tool can show the list of streets for each city. Run it to see its parameters.
- Currently all index files contain gen.log. Viewing the log file, you can find errors in the map creation process, which could explain why some streets are not in the proper place in the address index.
Address Part - workflow
There are these relations:
city -> 0..1 boundary
boundary -> 0.. city (used to define suburb of city)
1. iterate all Osm NODEs and register as cities if the tag = PLACE is present.
- extract cities (TOWN, CITY)
- extract villages (anything else)
2. iterate all RELATIONs and WAYs with type=boundary and register all boundaries
- a boundary is an Entity (way or relation) with tag 'boundary=administrative' or with tag 'place=...'
- a boundary should have admin_level > 4 or no admin_level at all
- boundary is not always associated with a city (or state, ...).
- boundary can have 'admin_center', 'label' member pointing to a city node
- boundary exactly matches by name city node and city node is in boundary
- boundary matches start, end or substring by name city node and city node is in boundary
2.1 Many boundaries can be associated with one city. Here is the order in which the most important boundary is taken and associated with the city:
- Boundary is matched by name exactly and has tag place
- Boundary is matched by name exactly and has admin_level 8 > 7 > 6 > 9 > 10 > 5... or nothing
- Boundary has admin_id matching
- All other cases including sorting of admin_level.
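The priority order of step 2.1 can be approximated with a sort key, as in the sketch below. It is an illustration under stated assumptions: boundaries are plain dictionaries, the admin_level preference list is cut off after 5 because the text elides the rest, and the "admin_id matching" rule is omitted because the text does not define it. The names `ADMIN_LEVEL_ORDER`, `level_rank`, `boundary_priority` and `best_boundary` are hypothetical.

```python
# Preference order for admin_level as given in the text: 8 > 7 > 6 > 9 > 10 > 5.
ADMIN_LEVEL_ORDER = [8, 7, 6, 9, 10, 5]


def level_rank(admin_level):
    """Position in the preference list; unlisted or missing levels sort last."""
    if admin_level in ADMIN_LEVEL_ORDER:
        return ADMIN_LEVEL_ORDER.index(admin_level)
    return len(ADMIN_LEVEL_ORDER)


def boundary_priority(boundary, city_name):
    """Smaller tuples mean higher priority."""
    exact = boundary.get("name") == city_name
    if exact and "place" in boundary.get("tags", {}):
        return (0, 0)  # exact name match and a place tag
    if exact:
        return (1, level_rank(boundary.get("admin_level")))
    # All other cases, still sorted by admin_level.
    return (2, level_rank(boundary.get("admin_level")))


def best_boundary(boundaries, city_name):
    """Return the highest-priority boundary for a city, or None."""
    return min(
        boundaries,
        key=lambda b: boundary_priority(b, city_name),
        default=None,
    )


# Hypothetical candidates for a city named "Alpha".
candidates = [
    {"name": "Alpha", "admin_level": 6, "tags": {}},
    {"name": "Alpha", "admin_level": 8, "tags": {}},
    {"name": "Greater Alpha", "admin_level": 8, "tags": {}},
]
print(best_boundary(candidates, "Alpha")["admin_level"])
```

Encoding the order as a tuple keeps the rule list readable and makes it easy to slot in a further rank once the missing rule is specified.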
3. If the city doesn't have any assigned boundary, then all boundaries that don't have center cities and contain that city will be checked, and the boundary with admin_level >=7 will be assigned.
4. For each boundary, make a list of cities that are in it.
5. iterate all RELATIONS and find addresses (https://wiki.openstreetmap.org/wiki/Relations/Proposed/Postal_Addresses)
- relation with "address" tag type, and is "house" or "a6" address_type
- search for associatedStreet relation and house members
- try to find the city for the street and city for house address.
- look up cities (we must already have found them in the steps before!!)
- if we have city and street, register it to database:
- for street registration, see: register street for a city
- if street is registered, and we are processing street:
- iterate over all relation members:
- find street -> write the nodes of the street to db
- find house -> write the house to the street
- if street is registered, and we are processing house:
- find house number
- find house border: if found, store: building for the street
Register street (street, location of street (los), cities):
for each city:
- Find existing street registration within the city:
- if street exists:
- if city part is unknown -> update the existing street's city part
- try to find cityPart for our street, and lookup the street again
- if street does not exist: (might change after the lookup)
- register the street for city, city part, location, and street name
- Find Or Register street
- find close cities to the street
- if the street is in the boundary of the city, add the city for search
- if no city was found using boundaries, find the closest city for the street
- Register street:
- for the found cities
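The "Find Or Register street" procedure above can be sketched like this. It is a minimal illustration rather than the real implementation: city parts are ignored, the registry is a plain dictionary keyed by (city name, street name), and the `contains` and `distance` callbacks as well as all names are assumptions made for the example.

```python
def find_or_register_street(registry, street_name, location, cities,
                            contains, distance):
    """Return the street records for every city the street belongs to,
    creating records that do not exist yet.

    registry: dict mapping (city name, street name) -> street record
    contains: callback(city, location) -> bool, boundary containment
    distance: callback(city, location) -> number, used for the fallback
    """
    # Prefer cities whose boundary contains the street.
    targets = [city for city in cities if contains(city, location)]
    # If no boundary matched, fall back to the closest city.
    if not targets and cities:
        targets = [min(cities, key=lambda city: distance(city, location))]

    records = []
    for city in targets:
        key = (city["name"], street_name)
        if key not in registry:
            registry[key] = {
                "city": city["name"],
                "name": street_name,
                "location": location,
            }
        records.append(registry[key])
    return records


# Hypothetical helpers: boundaries are axis-aligned boxes here.
def in_box(city, location):
    min_lat, min_lon, max_lat, max_lon = city["box"]
    lat, lon = location
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def squared_distance(city, location):
    return ((city["center"][0] - location[0]) ** 2
            + (city["center"][1] - location[1]) ** 2)


cities = [
    {"name": "A", "box": (0, 0, 10, 10), "center": (5, 5)},
    {"name": "B", "box": (20, 20, 30, 30), "center": (25, 25)},
]
registry = {}
first = find_or_register_street(registry, "Main Street", (3, 3), cities,
                                in_box, squared_distance)
print([record["city"] for record in first])
```

Calling the function again with the same street and city returns the existing record instead of creating a duplicate, which mirrors the "find existing street registration" branch.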
6. iterate all NODES then WAYS then RELATIONS (iterate main entity)
6.1 find ways - interpolations:
- for each interpolation, findOrRegister a street with the location of the interpolation
- for each two nodes create a building representing the interpolation
6.2 for any entity, find addr:housenumber and addr:street tag (can be also the interpolation nodes again!!!)
- skip if building for this entity already exists!
- findOrRegister streets for the street
- find the house number
- if housenumber contains '-', try to create interpolated house number (missing latlon2?)
- if housenumber contains '/', try to lookup second street addr:street2 (--> seems only for RU osm: https://wiki.openstreetmap.org/wiki/RU:Key:addr)
TODO: there are more variations for this: addr:housenumber2, addr2:street, addr2:housenumber etc.
- for each street, store the existing house
6.3 for way with tag - name & tag - highway, but without addr:housenumber and addr:street:
Note: this might be ways for cars, with names (highway, or so)
- skip if such street exists already
- findOrRegister the street for city
- write the nodes for each found street in each city
6.4 Each relation with "postal_code", store for later use.
Note: this does not include the address:type = pc and addr:postalcode !!!
7. process post codes
- for each stored postal_code relation
- for each building member, update the postal_code
8. write the index:
- split cities into: cities+towns, suburbs (suburb with is_in tag), villages (not city or town)
- write cities+towns using suburbs
- read street from cities+towns + appropriate suburbs for each town
- here, there might be several streets with the same name in one city; in that case we try to find the city part (suburb) that the street is in. There should not be more than one street with the same name within one city part!
- for each street
- For each building, register/create/find postcode, register the street
- write villages
- same as towns...
- write extracted postcodes and their streets
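The first sub-step of step 8, splitting the places into three groups, might look like the sketch below. It follows the wording above literally, so a suburb without an is_in tag falls into the "villages" group; whether mapcreator really treats such suburbs this way is an assumption of this example, as are the function name and the data layout.

```python
def split_places(places):
    """Split places into cities+towns, suburbs (with is_in) and villages."""
    groups = {"cities_towns": [], "suburbs": [], "villages": []}
    for place in places:
        tags = place["tags"]
        kind = tags.get("place")
        if kind in ("city", "town"):
            groups["cities_towns"].append(place)
        elif kind == "suburb" and "is_in" in tags:
            groups["suburbs"].append(place)
        else:
            # Everything that is not a city or a town, read literally.
            groups["villages"].append(place)
    return groups


# Hypothetical places.
places = [
    {"name": "Alpha", "tags": {"place": "city"}},
    {"name": "Beta", "tags": {"place": "town"}},
    {"name": "Alpha North", "tags": {"place": "suburb", "is_in": "Alpha"}},
    {"name": "Gamma", "tags": {"place": "village"}},
    {"name": "Delta", "tags": {"place": "hamlet"}},
]

groups = split_places(places)
for group_name, members in groups.items():
    print(group_name, [p["name"] for p in members])
```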