Understanding scores xlocate 1 vs xlocate 2/Developer

Posted: Thu Jul 13, 2023 7:07 am
by Bernd Welter
Hi Jochen,

recently I was asked by an xLocate 1 user who has started migrating to the newer xLocate 2 API about the handling of scores in (multifield) geocoding. He was used to interpreting the xLocate 1 total score in a specific manner:
  • During the automatic process of geocoding he distinguished between
    • "Addresses with a first hit with a total score > xx%" : these addresses are flagged "geocoded"
    • "The others" (<= xx% or no hits at all): these addresses are flagged "to be revised manually"
  • Addresses of the second category will then be checked by a user before they can be used in the succeeding processes.
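The triage described above can be sketched in a few lines. This is only an illustration of the user's logic, not part of any xLocate API: the function name `triage`, the plain list of total scores (best hit first) and the threshold parameter standing in for "xx%" are all assumptions.

```python
from typing import Sequence


def triage(hit_total_scores: Sequence[float], threshold: float) -> str:
    """Flag an address based on the total score of its first hit.

    hit_total_scores: total scores of the returned hits, best hit first
                      (hypothetical input; empty if there were no hits).
    threshold:        the "xx%" cut-off chosen by the user.
    """
    # Only the first hit counts, and it must be strictly above the threshold.
    if hit_total_scores and hit_total_scores[0] > threshold:
        return "geocoded"
    # Score <= threshold, or no hits at all.
    return "to be revised manually"
```

With a threshold of 80, an address whose first hit scored 100 in xLocate 1 is flagged "geocoded", while the same address scoring 70 in xLocate 2 ends up in the manual queue.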
With this simple approach he was successful with xLocate 1, and he applied the same logic to xLocate 2. For some addresses, however, xLocate 2 returns a significantly different score, which caused the following issue:
  • For some addresses the total score dropped from 100% to just 70%: they used to be "geocoded" automatically but now need the manual check. This means:
    • Much more manual revision
    • In roughly 9 out of 10 cases the user's confirmation simply consists of accepting hit number 1
What was first seen as a bug may just require an explanation:
  • The underlying data structures are different: xLocate 1 accesses the map data, whereas xLocate 2 requires a special index directory structure
  • The engine's search is different
  • The metric of the score computation has changed, and xLocate 2 also returns the field scores
What does this mean?
  • Threshold values used for this classification may have to be adjusted
  • You may have to include the field scores in your process. As the interpretation of these values may depend on your business, there is no generic approach:
    • Some users may focus on postalCode/city name and give low priority to street/housenumber
    • Others would need a proper confirmation of the street and housenumber as well.
If you want us to assist with the migration from xLocate 1 to xLocate 2 or PTV Developer, let us know. We can then set up meetings and look into your approach.

Bernd

Re: Understanding scores xlocate 1 vs xlocate 2/Developer

Posted: Thu Jul 13, 2023 9:09 am
by bocajo
The total score is mainly used to sort the result list. This applies to both xLocate 1 and xLocate 2. As Bernd said, the score calculation of xLocate 2 differs from that of xLocate 1. There is a technical concept describing how the score calculation of xLocate 2 works.

I recommend using not only the total score as the criterion for whether an address has been geocoded correctly, but also considering the field scores.
For example, a result with a total score >= 80 could be considered geocoded. A total score < 80 does not necessarily mean that the result is too bad to be considered geocoded. For these cases you could use the field scores: for example, if the field scores for postalCode, (city or district) and street are all >= 80, you could also consider the result geocoded. See the example below:
example-field-scores.png
This is just one example that you need to adapt to your use case and test well before you use it productively ;)
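The rule described above could be sketched roughly like this. It is an illustration only: the function name `is_geocoded` and the plain dictionary of field scores are assumptions, and the keys are taken from the wording in this post rather than checked against the actual xLocate 2 response, so map them to whatever your client returns.

```python
from typing import Mapping


def is_geocoded(total_score: float,
                field_scores: Mapping[str, float],
                threshold: float = 80) -> bool:
    """Accept a hit by total score, or fall back to the field scores."""
    # A high total score is sufficient on its own.
    if total_score >= threshold:
        return True

    def ok(field: str) -> bool:
        # A field without a score is treated as not matched.
        return field_scores.get(field, 0) >= threshold

    # Fallback: postalCode AND (city OR district) AND street.
    return ok("postalCode") and (ok("city") or ok("district")) and ok("street")
```

A hit with a total score of 70 but field scores of 100 for postalCode, city and street would be accepted by this rule, whereas the same hit with a street score of 50 would still go to manual revision. Users who also need the housenumber confirmed would add that field to the fallback condition.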