Abstract
To address the challenges that fuzzy, incomplete, and non-standardized inputs pose for place name and address matching, this paper proposes a large-model-driven multi-way recall optimization method. The method combines four complementary recall strategies: keyword matching with an inverted index ensures basic matching efficiency and interpretability; vector similarity improves the identification of semantic variants; geographic proximity uses spatial coordinates to better handle fuzzy geographic references; and large-model generative recall covers non-normalized input. Using the standard place name database of Guangzhou and a constructed test set of 21,300 perturbed addresses, the experiments show that the method significantly outperforms single-path baselines on accuracy, recall, MRR, and NDCG@10. The results verify the effectiveness and robustness of the strategy in complex input scenarios and offer a feasible technical path toward high-precision place name and address matching.
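The abstract does not say how the candidates from the four recall paths are merged or how the ranking metrics are computed. As a minimal sketch only, the following assumes reciprocal rank fusion (a swapped-in technique, not necessarily the authors' method) and a single correct standard address per query. The function names, the constant `k=60`, and the address IDs are all hypothetical.

```python
import math
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60, top_n=10):
    """Merge ranked candidate lists with reciprocal rank fusion (assumed here)."""
    scores = defaultdict(float)
    for candidates in ranked_lists.values():
        for rank, addr_id in enumerate(candidates, start=1):
            # A candidate gains more score the higher it ranks in each path.
            scores[addr_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), then by ID for a deterministic order.
    ordered = sorted(scores, key=lambda a: (-scores[a], a))
    return ordered[:top_n]


def reciprocal_rank(ranking, gold):
    """Reciprocal rank of the single correct address, 0.0 if it is absent."""
    for rank, addr_id in enumerate(ranking, start=1):
        if addr_id == gold:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranking, gold, k=10):
    """NDCG@k with one relevant item, so the ideal DCG equals 1."""
    for rank, addr_id in enumerate(ranking[:k], start=1):
        if addr_id == gold:
            return 1.0 / math.log2(rank + 1)
    return 0.0


# Hypothetical outputs of the four recall paths for one query address.
recalls = {
    "keyword": ["A1", "A2", "A3"],   # keyword + inverted index
    "vector": ["A2", "A1", "A4"],    # vector similarity
    "geo": ["A2", "A5"],             # geographic proximity
    "llm": ["A2", "A3"],             # large-model generative recall
}
fused = rrf_fuse(recalls)
```

Averaging `reciprocal_rank` over all test queries would give MRR, and averaging `ndcg_at_k` would give NDCG@10 under the single-gold-address assumption. Rank-based fusion is used in this sketch because it avoids having to calibrate scores across heterogeneous recall paths.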
Source
Technology Innovation and Application (《科技创新与应用》)
2025, No. 36, pp. 35-38 (4 pages)
Funding
Supported by the National Key Research and Development Program of China (2022YFC3800704-2).
Keywords
place name and address matching
multi-way recall
large model
semantic retrieval
inverted index