Training‐free few‐shot construction tool and material detection using pre‐trained vision‐language model

Abstract

Direct visual understanding of construction entities, such as tools and materials (T&M), underpins construction management and resource scheduling. Traditional supervised learning methods suffer from high annotation costs, heavy computational demands, and limited datasets. In contrast, training‐free approaches offer an effective alternative that is well suited to construction scenarios constrained by data scarcity and limited resources. Moreover, vision‐language models (VLMs) learn image semantics directly through natural language supervision and demonstrate strong zero‐shot detection capabilities without retraining. However, existing methods often exhibit limited image–text semantic alignment in construction scenarios, which restricts their effectiveness, so approaches that enhance cross‐modal understanding in such domain‐specific contexts are needed. To address this challenge, this paper proposes a training‐free, knowledge‐enhanced VLM to recognize T&M in construction tasks. The proposed approach leverages image matching and image–text knowledge alignment strategies, retaining the training‐free nature of existing VLMs while gaining the performance benefits of knowledge integration. This method offers a novel solution for construction management and robotic collaboration tasks that are traditionally constrained by their dependence on data and computational resources.
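As a rough illustration of how image matching and image–text alignment could be combined without any training, the sketch below scores a candidate image region against a few labelled example embeddings and against text prompt embeddings, then fuses the two scores. It is not the authors' implementation: the function names, the fusion weight `alpha`, the class labels, and the toy 3-D vectors (standing in for the outputs of a pre-trained VLM encoder) are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(query, support, text_embeds, alpha=0.5):
    """Fuse image-matching and image-text scores; no training involved.

    query:       embedding of a candidate region (hypothetical encoder output)
    support:     {label: [embeddings of the few labelled example images]}
    text_embeds: {label: embedding of a knowledge-enriched text prompt}
    alpha:       weight of the image-matching score (assumed, not from the paper)
    """
    scores = {}
    for label in text_embeds:
        # Image matching: best similarity to any few-shot example of this class.
        img_score = max(cosine(query, s) for s in support[label])
        # Image-text alignment: similarity to the class prompt embedding.
        txt_score = cosine(query, text_embeds[label])
        scores[label] = alpha * img_score + (1 - alpha) * txt_score
    best = max(scores, key=scores.get)
    return best, scores


# Toy vectors standing in for real encoder outputs (hypothetical values).
support = {
    "hammer": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "rebar": [[0.1, 0.9, 0.2]],
}
text_embeds = {"hammer": [1.0, 0.0, 0.0], "rebar": [0.0, 1.0, 0.0]}

label, scores = classify([0.85, 0.15, 0.05], support, text_embeds)
print(label, scores)
```

In a real pipeline the embeddings would come from a pre-trained VLM's image and text encoders, and the candidate regions from a separate proposal step; both are outside the scope of this sketch.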

Publication DOI: https://doi.org/10.1111/mice.70129
Divisions: College of Engineering & Physical Sciences > School of Infrastructure and Sustainable Engineering > Civil Engineering
College of Engineering & Physical Sciences
Aston University (General)
Funding Information: National Natural Science Foundation of China, Grant/Award Number: 72201226; Research Grants Council, Grant/Award Numbers: 26208323, C6044-23GF
Additional Information: Copyright © 2025 The Author(s). Computer-Aided Civil and Infrastructure Engineering published by Wiley Periodicals LLC on behalf of Editor. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Publication ISSN: 1467-8667
Last Modified: 12 Nov 2025 08:04
Date Deposited: 11 Nov 2025 14:20
Related URLs: https://onlinel ... 1111/mice.70129 (Publisher URL)
PURE Output Type: Article
Published Date: 2025-11-06
Published Online Date: 2025-11-06
Accepted Date: 2025-10-27
Authors: Zhang, Zhaoxin
Yu, Yantao
Pan, Zaolin
Antwi‐Afari, Maxwell Fordjour (ORCID Profile 0000-0002-6812-7839)
