Local Feature Patch Matching Enhanced Re-ranking Model for Clothing Image Retrieval

Simeng Cheng; Zhuangye Luo; Feng Zeng; Xiaowei Xie

Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.4108/eai.18-12-2025.2365290,
    author={Simeng Cheng and Zhuangye Luo and Feng Zeng and Xiaowei Xie},
    title={Local Feature Patch Matching Enhanced Re-ranking Model for Clothing Image Retrieval},
    proceedings={Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China},
    publisher={EAI},
    proceedings_a={IIKI},
    year={2026},
    month={6},
    keywords={image retrieval, re-ranking model, self-attention, cross-attention, Swin Transformer},
    doi={10.4108/eai.18-12-2025.2365290}
}

Simeng Cheng¹, Zhuangye Luo¹, Feng Zeng¹,*, Xiaowei Xie¹

1: School of Computer Science and Engineering, Central South University, Changsha, 410017, China

*Contact email: fengzeng@csu.edu.cn

Abstract

Effective clothing image retrieval depends on robust global features and highly discriminative local features. We propose a re-ranking model that addresses two problems in clothing image retrieval: discriminating fine-grained differences and interference from occlusion. First, a Swin Transformer extracts global features for global-level retrieval, which yields an initial set of candidate images. Then, a re-ranking network composed of multiple BiAttention layers mines more discriminative local features from query-candidate image pairs. Each BiAttention module captures intra-image semantic information through self-attention, which strengthens the representation of fine-grained differences, and captures inter-image correlation through bidirectional cross-attention, which improves robustness to occlusion. In addition, the re-ranking network incorporates spatial relative attention to reinforce spatial position constraints. Positional importance weighting is further applied to matched feature blocks, so that the final similarity score for an image pair concentrates on the important feature regions. A large-scale clothing image dataset is built to validate the model. Experimental results show that the proposed method is efficient in large-scale clothing image retrieval.

Keywords: image retrieval, re-ranking model, self-attention, cross-attention, Swin Transformer
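
The two-stage flow described in the abstract (global retrieval to obtain candidates, then pairwise re-ranking on local features) can be illustrated with a minimal sketch. This is not the authors' BiAttention network: it assumes global and patch features are already extracted and replaces the learned layers with a parameter-free, bidirectional softmax matching over patch cosine similarities. The function names, the `temperature` value, the dictionary layout, and the per-patch weights `patch_weights` are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(src, dst, temperature=0.1):
    """For each patch in src, return the attention-weighted similarity to dst.

    Stand-in for one direction of cross-attention (assumption: no learned
    projections, similarities are used directly as attention logits).
    """
    scores = []
    for p in src:
        sims = [cosine(p, q) for q in dst]
        weights = softmax([s / temperature for s in sims])
        scores.append(sum(w * s for w, s in zip(weights, sims)))
    return scores


def pair_score(query_patches, cand_patches, patch_weights):
    """Bidirectional patch-matching score for one query-candidate pair.

    patch_weights holds one hypothetical importance weight per query patch,
    so that important regions contribute more to the final score.
    """
    q2c = attend(query_patches, cand_patches)   # query -> candidate
    c2q = attend(cand_patches, query_patches)   # candidate -> query
    forward = sum(w * s for w, s in zip(patch_weights, q2c)) / sum(patch_weights)
    backward = sum(c2q) / len(c2q)
    return 0.5 * (forward + backward)


def retrieve(query, gallery, top_k, patch_weights):
    """Stage 1: rank by global similarity. Stage 2: re-rank the top_k by patches."""
    ranked = sorted(
        gallery,
        key=lambda item: cosine(query["global"], item["global"]),
        reverse=True,
    )
    candidates = ranked[:top_k]
    reranked = sorted(
        candidates,
        key=lambda item: pair_score(query["patches"], item["patches"], patch_weights),
        reverse=True,
    )
    return [item["id"] for item in reranked]


if __name__ == "__main__":
    # Toy features, invented purely for demonstration.
    query = {"global": [1.0, 0.0], "patches": [[1.0, 0.0], [0.0, 1.0]]}
    gallery = [
        {"id": "a", "global": [0.8, 0.6], "patches": [[1.0, 0.0], [0.0, 1.0]]},
        {"id": "b", "global": [1.0, 0.0], "patches": [[1.0, 0.0], [1.0, 0.0]]},
        {"id": "c", "global": [0.0, 1.0], "patches": [[0.0, 1.0], [0.0, 1.0]]},
    ]
    print(retrieve(query, gallery, top_k=2, patch_weights=[1.0, 1.0]))
```

In this toy setup the candidate with the closest global feature is not the one whose patches match the query best, so the second stage can change the order produced by the first. Only the `top_k` candidates are scored pairwise, which is what keeps a re-ranking stage of this kind affordable on a large gallery.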

Published: 2026-06-17
Publisher: EAI

DOI: http://dx.doi.org/10.4108/eai.18-12-2025.2365290
