
Research Article
From Foundation to Field: LISA Fine-Tuning for Mine Open-Vocabulary Segmentation
@INPROCEEDINGS{10.4108/eai.18-12-2025.2365260,
  author={JiBo Wang and Libin Jiao and Zhen Bao and Wenchao Gao and Lianzhi Huo},
  title={From Foundation to Field: LISA Fine-Tuning for Mine Open-Vocabulary Segmentation},
  proceedings={Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China},
  publisher={EAI},
  proceedings_a={IIKI},
  year={2026},
  month={6},
  keywords={Multimodal large models, open-vocabulary semantic segmentation, underground coal-mine applications, LoRA fine-tuning},
  doi={10.4108/eai.18-12-2025.2365260}
}
JiBo Wang
Libin Jiao
Zhen Bao
Wenchao Gao
Lianzhi Huo
Year: 2026
IIKI
EAI
DOI: 10.4108/eai.18-12-2025.2365260
Abstract
Underground mining environments are characterized by dim illumination, cluttered man-made structures, and frequent occlusions. Most existing underground segmentation methods still perform closed-set pixel classification over a fixed label set, so they cannot follow natural-language instructions or segment context-specific targets on demand. In this work, we present MineLISA, an instruction-guided segmentation framework adapted from the Language Instructed Segmentation Assistant (LISA) for industrial underground mining applications. MineLISA takes natural-language prompts as input and generates pixel-level masks for underground mining objects on the MUSeg multimodal semantic-segmentation dataset. To adapt LISA under realistic resource constraints, we apply LoRA-based parameter-efficient fine-tuning to the vision-language alignment modules and the lightweight segmentation decoder, and we re-weight the segmentation loss to emphasize thin, safety-critical structures such as cables and pipelines. This design improves the alignment between textual instructions and underground visual patterns while remaining suitable for hardware with limited GPU memory. Experiments on MUSeg show that, compared with the original LISA, MineLISA produces substantially better instruction-conditioned masks and more stable segmentation across diverse underground object categories, indicating its potential for real-world coal-mine deployment.
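The two adaptation ideas named in the abstract, a LoRA update and a re-weighted segmentation loss, can be illustrated with a minimal Python sketch. The abstract does not give the rank, scaling factor, class weights, or the exact form of the loss, so everything below is an assumption for illustration: `lora_forward` shows the generic LoRA formulation (frozen weight plus a scaled low-rank product), and `weighted_pixel_ce` uses class-weighted cross-entropy as one common way to emphasize rare, thin classes. Neither function name comes from the paper, and this is not the authors' implementation.

```python
import math


def _matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """Generic LoRA forward pass: y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) stays frozen; only the small factors A (r x d_in)
    and B (d_out x r) would be trained. alpha and r are illustrative
    hyperparameters, not values taken from the paper.
    """
    r = len(A)                      # rank of the low-rank update
    base = _matvec(W, x)            # frozen pretrained path
    update = _matvec(B, _matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def weighted_pixel_ce(logits, labels, class_weights):
    """Class-weighted cross-entropy over pixels (assumed loss form).

    logits: one list of class scores per pixel
    labels: one integer class id per pixel
    class_weights: one weight per class; larger values make errors on
        that class (e.g. a hypothetical "cable" class) cost more.
    The result is normalized by the sum of the applied weights.
    """
    total, norm = 0.0, 0.0
    for scores, y in zip(logits, labels):
        m = max(scores)             # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        w = class_weights[y]
        total += w * (log_z - scores[y])
        norm += w
    return total / norm
```

With `B` initialized to zeros, `lora_forward` reproduces the frozen layer exactly, which is why LoRA fine-tuning starts from the pretrained model's behavior. Raising the weight of a class in `weighted_pixel_ce` increases the contribution of its misclassified pixels to the loss relative to the unweighted case.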


