GI_Forum 2023, Volume 11, Issue 1
Verlag der Österreichischen Akademie der Wissenschaften Austrian Academy of Sciences Press
GI_Forum 2023, Volume 11, Issue 1, pp. 110-117, 2023/06/27
Geocoding is a process of converting human-readable addresses into latitude and longitude points. Whilst most geocoders tend to perform well on structured addresses, their performance drops significantly in the presence of unstructured addresses, such as locations written in informal language. In this paper, we make an extensive comparison of geocoder performance on unstructured location mentions within tweets. Using nine geocoders and a worldwide English-language Twitter dataset, we compare the geocoders’ recall, precision, consensus and bias values. As in previous similar studies, Google Maps showed the highest overall performance. However, with the exception of Google Maps, we found that geocoders which use open data have higher performance than those which do not. The open-data geocoders showed the least per-continent bias and the highest consensus with Google Maps. These results suggest the possibility of improving geocoder performance on unstructured locations by extending or enhancing the quality of openly available datasets.
Keywords: commercial geocoders, natural language, Twitter, open data, spaCy
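As a rough illustration of the kind of evaluation summarised above, the following minimal sketch scores one geocoder against ground-truth coordinates. The metric definitions here are assumptions made for illustration only and are not taken from the paper: recall is treated as the share of location mentions for which the geocoder returned any result, and precision as the share of returned results lying within a distance threshold of the true location. The function names and the 50 km threshold are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def recall_and_precision(predictions, truths, threshold_km=50.0):
    """Score one geocoder (assumed definitions, see the text above).

    predictions: list of (lat, lon) tuples, or None where nothing was returned.
    truths:      list of ground-truth (lat, lon) tuples, same order and length.
    """
    # Keep only the mentions for which the geocoder returned a result.
    answered = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    # A returned result counts as correct if it is close enough to the truth.
    correct = sum(1 for p, t in answered if haversine_km(p, t) <= threshold_km)
    recall = len(answered) / len(truths) if truths else 0.0
    precision = correct / len(answered) if answered else 0.0
    return recall, precision
```

Under the same assumptions, a consensus score between two geocoders could be computed by passing one geocoder's outputs as `truths` for the mentions it answered, so that agreement is measured against that geocoder instead of against ground truth.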