Extracting precise geographical information from textual content, referred to as toponym recognition, is fundamental to geographical information retrieval and crucial to many spatial analyses, e.g., mining location-based information from social media, news reports, and surveys. A recent article, "TopoBERT: A Plug and Play Toponym Recognition Module Harnessing Fine-tuned BERT," was published in the International Journal of Digital Earth. TopoBERT is a ready-to-use toponym recognition module that takes advantage of large pretrained language models. It outperforms five baseline models and demonstrates state-of-the-art performance. Its generalizability has been tested on an unseen dataset, and the module and data are open-sourced so that they can benefit other scholars with similar interests.
This work is the result of collaborative research by eminent scholars, including Lab member Bing Zhou, Dr. Lei Zou, Dr. Yingjie Hu, Dr. Yi Qiang, and Dr. Daniel Goldberg.
Figure: TopoBERT architectures. (a) Architecture with CNN1D as classifier. (b) Architecture with linear classifier. (c) Architecture with MLP as classifier
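To make variant (b) in the figure concrete, the following is a minimal sketch of a linear classifier applied to each token embedding, followed by grouping the predicted tags into place names. It is an illustration only, not the authors' implementation: the B-LOC/I-LOC/O tag set, the function names, the two-dimensional toy embeddings (standing in for BERT outputs), and the hand-picked weights are all assumptions made for this example.

```python
from typing import List, Sequence

# Hypothetical tag set for illustration; the paper's actual label scheme may differ.
LABELS = ["O", "B-LOC", "I-LOC"]


def linear_head(embedding: Sequence[float],
                weights: Sequence[Sequence[float]],
                bias: Sequence[float]) -> List[float]:
    """Compute one logit per label: logits[k] = weights[k] . embedding + bias[k]."""
    return [sum(w * x for w, x in zip(row, embedding)) + b
            for row, b in zip(weights, bias)]


def tag_tokens(embeddings: Sequence[Sequence[float]],
               weights: Sequence[Sequence[float]],
               bias: Sequence[float]) -> List[str]:
    """Assign each token the label with the highest logit."""
    tags = []
    for emb in embeddings:
        logits = linear_head(emb, weights, bias)
        best = max(range(len(logits)), key=logits.__getitem__)
        tags.append(LABELS[best])
    return tags


def extract_toponyms(tokens: Sequence[str], tags: Sequence[str]) -> List[str]:
    """Merge a B-LOC token and the I-LOC tokens that follow it into one place name."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-LOC":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-LOC" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Toy inputs: 2-dimensional vectors stand in for contextual token embeddings.
TOKENS = ["Flooding", "in", "New", "Orleans"]
EMBEDDINGS = [[0.1, 0.1], [0.0, 0.2], [2.0, 0.0], [0.0, 2.0]]
WEIGHTS = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # one row per label (assumed values)
BIAS = [0.5, 0.0, 0.0]

tags = tag_tokens(EMBEDDINGS, WEIGHTS, BIAS)
print(tags)
print(extract_toponyms(TOKENS, tags))
```

In a real model the embeddings would come from the fine-tuned BERT encoder and the weights would be learned; variants (a) and (c) would swap the linear head for a 1D convolutional layer or a multilayer perceptron, respectively.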
This paper is innovative, original, significant, and theoretically sound. Detecting location names in geospatial big data, such as social media, online help-requesting platforms, and news media, is crucial for spatial knowledge discovery. Given the rapid growth of Natural Language Processing technologies, this is a very timely study and should be of interest to a wide audience.