
Research Article
Local Feature Patch Matching Enhanced Re-ranking Model for Clothing Image Retrieval
@INPROCEEDINGS{10.4108/eai.18-12-2025.2365290,
  author={Simeng Cheng and Zhuangye Luo and Feng Zeng and Xiaowei Xie},
  title={Local Feature Patch Matching Enhanced Re-ranking Model for Clothing Image Retrieval},
  proceedings={Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China},
  publisher={EAI},
  proceedings_a={IIKI},
  year={2026},
  month={6},
  keywords={image retrieval re-ranking model self-attention cross-attention Swin Transformer},
  doi={10.4108/eai.18-12-2025.2365290}
}
Simeng Cheng
Zhuangye Luo
Feng Zeng
Xiaowei Xie
Year: 2026
Conference: IIKI
Publisher: EAI
DOI: 10.4108/eai.18-12-2025.2365290
Abstract
Effective clothing image retrieval depends on robust global features and highly discriminative local features. We propose a re-ranking model that addresses two problems in clothing image retrieval: discriminating fine-grained differences and interference from occlusion. A Swin Transformer first extracts global features for a global-level retrieval, which yields an initial set of candidate images. A re-ranking network composed of multiple BiAttention layers then mines more discriminative local features from query-candidate image pairs. Each BiAttention module captures intra-image semantic information through self-attention, which strengthens the representation of fine-grained differences, and captures inter-image correlation through bidirectional cross-attention, which improves robustness to occlusion. The re-ranking network also incorporates spatial relative attention to reinforce spatial position constraints. Positional importance weighting is further applied to matched feature blocks so that the final similarity score for an image pair concentrates on the important feature regions. We build a large-scale clothing image dataset to validate the model. Experimental results show that the proposed method is efficient in large-scale clothing image retrieval.
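The two-stage flow described in the abstract (global retrieval to shortlist candidates, then local re-ranking of that shortlist) can be illustrated with a minimal sketch. This is not the authors' BiAttention network: the learned self-attention and cross-attention matching is replaced here by a simple stand-in, namely the best cosine match per query patch, and the positional importance weights are assumed to be given rather than learned. All function and parameter names (`global_retrieve`, `patch_similarity`, `rerank`, `weights`) are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def global_retrieve(query_global, gallery_globals, k):
    """Stage 1: rank the gallery by global-feature similarity and keep the top-k indices."""
    scores = [cosine(query_global, g) for g in gallery_globals]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def patch_similarity(query_patches, cand_patches, weights):
    """Stage 2 stand-in: weighted average of each query patch's best match.

    `weights` plays the role of positional importance weighting (assumed
    given here, one non-negative weight per query patch).
    """
    best = [max(cosine(q, c) for c in cand_patches) for q in query_patches]
    return sum(w * b for w, b in zip(weights, best)) / sum(weights)


def rerank(query_patches, gallery_patches, candidates, weights):
    """Re-order the stage-1 candidates by the local patch-level score."""
    scored = [
        (patch_similarity(query_patches, gallery_patches[i], weights), i)
        for i in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored]
```

In this sketch the re-ranking step only touches the shortlist returned by `global_retrieve`, so the more expensive pairwise comparison runs on a handful of candidates rather than the whole gallery, which is the usual motivation for a retrieve-then-re-rank design.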


