An Efficient Approach for Geo-Multimedia Cross-Modal Retrieval

Abstract

Due to the rapid development of mobile Internet technologies, such as online social networking and location-based services, a massive amount of multimedia data with geographical information is generated and uploaded to the Internet. In this paper, we propose a novel type of cross-modal multimedia retrieval, called geo-multimedia cross-modal retrieval, which aims to find a set of geo-multimedia objects according to geographical distance proximity and semantic concept similarity. Previous studies on cross-modal retrieval and spatial keyword search cannot address this problem effectively because they do not consider multimedia data with geo-tags (geo-multimedia). Firstly, we present the definition of the kNN geo-multimedia cross-modal query and introduce relevant concepts such as spatial distance and semantic similarity measurement. As the key notion of this work, the cross-modal semantic representation space is formulated for the first time. We then propose a novel framework for geo-multimedia cross-modal retrieval, which includes multi-modal feature extraction, cross-modal semantic space mapping, geo-multimedia spatial indexing, and cross-modal semantic similarity measurement. To bridge the semantic gap between different modalities, we also propose a method named cross-modal semantic matching (CoSMat for short), which contains two important components, CorrProj and LogsTran, and aims to build a common semantic representation space for cross-modal semantic similarity measurement. In addition, to implement semantic similarity measurement, we employ a deep learning based method to learn multi-modal features that contain more high-level semantic information. Moreover, a novel hybrid index, the GMR-Tree, is carefully designed, which combines signatures of semantic representations with the R-Tree. An efficient GMR-Tree-based kNN search algorithm called kGMCMS is developed.
Comprehensive experimental evaluations on real and synthetic datasets clearly demonstrate that our approach outperforms state-of-the-art methods.
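The core query the abstract describes ranks geo-multimedia objects by a combination of geographical proximity and semantic similarity in a common representation space. The minimal sketch below illustrates that idea with a brute-force scorer; the scoring formula, the `alpha` weight, and the distance normalization are illustrative assumptions, not the paper's actual kGMCMS algorithm or GMR-Tree index.

```python
import math
import heapq

def geo_distance(p, q):
    # Euclidean distance between (x, y) points; a real system would
    # use great-circle distance over (lat, lon) coordinates.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cosine_similarity(u, v):
    # Cosine similarity between two vectors in the learned
    # common semantic representation space.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def knn_geo_semantic(query_loc, query_vec, objects, k, alpha=0.5, max_dist=100.0):
    # Brute-force kNN over (id, location, semantic-vector) objects,
    # ranked by a weighted sum of spatial proximity and semantic
    # similarity. `alpha` and `max_dist` are hypothetical parameters.
    scored = []
    for obj_id, loc, vec in objects:
        spatial = 1.0 - min(geo_distance(query_loc, loc) / max_dist, 1.0)
        semantic = cosine_similarity(query_vec, vec)
        scored.append((alpha * spatial + (1 - alpha) * semantic, obj_id))
    return [obj_id for _, obj_id in heapq.nlargest(k, scored)]

# Example: a nearby, semantically matching object ranks first.
objects = [("a", (1, 1), [1.0, 0.0]),
           ("b", (50, 50), [0.0, 1.0]),
           ("c", (2, 2), [0.9, 0.1])]
top2 = knn_geo_semantic((0, 0), [1.0, 0.0], objects, k=2)
```

An index such as the paper's GMR-Tree avoids this linear scan by pruning R-Tree subtrees whose spatial bounds and semantic signatures cannot contain a top-k result.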

Publication DOI: https://doi.org/10.1109/ACCESS.2019.2940055
Divisions: College of Engineering & Physical Sciences > Systems analytics research institute (SARI)
College of Engineering & Physical Sciences
Additional Information: This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Uncontrolled Keywords: Cross-modal retrieval, deep learning, geo-multimedia, kNN spatial search, General Computer Science, General Materials Science, General Engineering
Publication ISSN: 2169-3536
Last Modified: 31 Oct 2024 08:34
Date Deposited: 16 Sep 2019 10:58
Full Text Link:
Related URLs: https://ieeexpl ... cument/8827517/ (Publisher URL)
http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Article
Published Date: 2019-12-23
Published Online Date: 2019-09-09
Accepted Date: 2019-09-09
Authors: Zhu, Lei
Long, Jun
Zhang, Chengyuan
Yu, Weiren (ORCID Profile 0000-0002-1082-9475)
Yuan, Xinpan
Sun, Longzhi

Version: Accepted Version

License: Creative Commons Attribution

