Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China

Research Article

From Foundation to Field: LISA Fine-Tuning for Mine Open-Vocabulary Segmentation

@INPROCEEDINGS{10.4108/eai.18-12-2025.2365260,
    author={JiBo Wang and Libin Jiao and Zhen Bao and Wenchao Gao and Lianzhi Huo},
    title={From Foundation to Field: LISA Fine-Tuning for Mine Open-Vocabulary Segmentation},
    proceedings={Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China},
    publisher={EAI},
    proceedings_a={IIKI},
    year={2026},
    month={6},
    keywords={Multimodal large models, open-vocabulary semantic segmentation, underground coal-mine applications, LoRA fine-tuning},
    doi={10.4108/eai.18-12-2025.2365260}
}
    
JiBo Wang¹, Libin Jiao¹, Zhen Bao², Wenchao Gao¹, Lianzhi Huo³,*
  • 1: School of Artificial Intelligence, China University of Mining and Technology-Beijing, Beijing, China
  • 2: CHN Energy Science and Technology and Environment Co., Ltd., China; CHN Energy Zhi Shen Control Technology Co., Ltd., China
  • 3: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
*Contact email: huolz@aircas.ac.cn

Abstract

Underground mining environments are characterized by dim illumination, cluttered man-made structures, and frequent occlusions. However, most existing underground segmentation methods still follow closed-set pixel classification over fixed labels, making them unable to use natural-language instructions or dynamically segment context-specific targets. In this work, we present MineLISA, an instruction-guided segmentation framework adapted from the Language Instructed Segmentation Assistant (LISA) for industrial underground mining applications. MineLISA takes natural-language prompts as input and generates pixel-level masks for underground mining objects on the MUSeg multimodal semantic-segmentation dataset. To adapt LISA under realistic resource constraints, we employ LoRA-based parameter-efficient fine-tuning on vision-language alignment modules and the lightweight segmentation decoder, and re-weight the segmentation loss to emphasize thin, safety-critical structures such as cables and pipelines. This design improves the alignment between textual instructions and underground visual patterns while remaining suitable for hardware with limited GPU memory. Experiments on MUSeg show that, compared with the original LISA, MineLISA achieves substantially improved instruction-conditioned mask predictions and more stable segmentation across diverse underground object categories, indicating its potential for real-world coal-mine deployment.
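The two adaptation ideas named in the abstract, a LoRA-style low-rank update and a re-weighted segmentation loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the rank, the scaling factor, the class weights, or the exact loss form, so the function names (`lora_forward`, `weighted_pixel_ce`), the `alpha / r` scaling convention, and the weighted-mean normalization below are all assumptions chosen for illustration.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W is the frozen pretrained weight; only the low-rank factors
    A (r x d_in) and B (d_out x r) would be trained. The alpha / r
    scaling is the common LoRA convention, assumed here.
    """
    r = len(A)  # rank = number of rows of A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def weighted_pixel_ce(probs, labels, class_weights):
    """Class-weighted cross-entropy, normalized by the total weight.

    probs: one list of class probabilities per pixel.
    labels: one ground-truth class index per pixel.
    class_weights: hypothetical per-class weights; a larger weight makes
    errors on that class (e.g. a thin cable class) cost more.
    """
    num = 0.0
    den = 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        num += -w * math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        den += w
    return num / den


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[1.0, 1.0]]       # rank-1 factor
    B = [[0.0], [0.0]]     # zero-initialized, so the update starts at zero
    print(lora_forward([2.0, 3.0], W, A, B, alpha=8.0))

    probs = [[0.5, 0.5], [0.9, 0.1]]
    labels = [0, 1]        # second pixel belongs to the poorly predicted class
    print(weighted_pixel_ce(probs, labels, [1.0, 1.0]))
    print(weighted_pixel_ce(probs, labels, [1.0, 3.0]))
```

With `B` initialized to zero the adapted layer reproduces the frozen layer exactly, which is why LoRA training starts from the pretrained behavior. Raising the weight of a class that is predicted poorly increases the loss, so the optimizer is pushed harder toward fixing that class.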

Keywords
Multimodal large models, open-vocabulary semantic segmentation, underground coal-mine applications, LoRA fine-tuning
Published
2026-06-17
Publisher
EAI
http://dx.doi.org/10.4108/eai.18-12-2025.2365260
Copyright © 2025–2026 EAI