Zhang, Zhaoxin, Yu, Yantao, Pan, Zaolin and Antwi‐Afari, Maxwell Fordjour (2025). Training‐free few‐shot construction tool and material detection using pre‐trained vision‐language model. Computer-Aided Civil and Infrastructure Engineering.
Abstract
Direct visual understanding of construction entities, such as tools and materials (T&M), underpins construction management and resource scheduling. Traditional supervised learning methods suffer from high annotation costs, severe computational demands, and limited datasets. In contrast, training‐free approaches offer an effective alternative well suited to construction scenarios constrained by data scarcity and limited resources. Moreover, vision‐language models (VLMs) can learn image semantics directly through natural language supervision and demonstrate strong zero‐shot detection capabilities without retraining. However, existing methods often exhibit limited image–text semantic alignment in construction scenarios, which restricts their effectiveness in construction tasks. There is therefore an urgent need for approaches that enhance cross‐modal understanding in such domain‐specific contexts. To address this challenge, this paper proposes a training‐free, knowledge‐enhanced VLM to recognize T&M in construction tasks. The proposed approach leverages image matching and image–text knowledge alignment strategies, thereby retaining the training‐free nature of existing VLMs while benefiting from the performance gains brought by knowledge integration. This method offers a novel solution for construction management and robotic collaboration tasks that are traditionally constrained by their dependence on data and computational resources.
| Publication DOI: | https://doi.org/10.1111/mice.70129 |
|---|---|
| Divisions: | College of Engineering & Physical Sciences > School of Infrastructure and Sustainable Engineering > Civil Engineering; College of Engineering & Physical Sciences; Aston University (General) |
| Funding Information: | National Natural Science Foundation of China, Grant/Award Number: 72201226; Research Grants Council, Grant/Award Numbers: 26208323, C6044-23GF |
| Additional Information: | Copyright © 2025 The Author(s). Computer-Aided Civil and Infrastructure Engineering published by Wiley Periodicals LLC on behalf of Editor. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made. |
| Publication ISSN: | 1467-8667 |
| Last Modified: | 12 Nov 2025 08:04 |
| Date Deposited: | 11 Nov 2025 14:20 |
| Full Text Link: | |
| Related URLs: | https://onlinel ... 1111/mice.70129 (Publisher URL) |
| PURE Output Type: | Article |
| Published Date: | 2025-11-06 |
| Published Online Date: | 2025-11-06 |
| Accepted Date: | 2025-10-27 |
| Authors: | Zhang, Zhaoxin; Yu, Yantao; Pan, Zaolin; Antwi‐Afari, Maxwell Fordjour (ORCID: 0000-0002-6812-7839) |