Pervasive Computing Technologies for Healthcare. 17th EAI International Conference, PervasiveHealth 2023, Malmö, Sweden, November 27-29, 2023, Proceedings

Research Article

Heuristic-Based Extraction and Unigram Analysis of Nursing Free Text Data Residing in Large EHR Clinical Notes

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-59717-6_9,
        author={Syed Mohtashim Abbas Bokhari and Kriste Krstovski and Jennifer Withall and Rachel Lee and Patricia Dykes and Mai Tran and Kenrick Cato and Sarah Rossetti},
        title={Heuristic-Based Extraction and Unigram Analysis of Nursing Free Text Data Residing in Large EHR Clinical Notes},
        booktitle={Pervasive Computing Technologies for Healthcare. 17th EAI International Conference, PervasiveHealth 2023, Malm\"{o}, Sweden, November 27-29, 2023, Proceedings},
        proceedings_a={PERVASIVEHEALTH},
        publisher={Springer},
        year={2024},
        month={6},
        keywords={nursing documentation, health informatics, clinical notes, nursing notes, heuristics, natural language processing, information retrieval, unigram analysis},
        doi={10.1007/978-3-031-59717-6_9}
    }
    
  • Syed Mohtashim Abbas Bokhari, Kriste Krstovski, Jennifer Withall, Rachel Lee, Patricia Dykes, Mai Tran, Kenrick Cato, Sarah Rossetti
    Year: 2024
    Heuristic-Based Extraction and Unigram Analysis of Nursing Free Text Data Residing in Large EHR Clinical Notes
    PERVASIVEHEALTH
    Springer
    DOI: 10.1007/978-3-031-59717-6_9
Syed Mohtashim Abbas Bokhari (1,*), Kriste Krstovski (2), Jennifer Withall (1), Rachel Lee (3), Patricia Dykes (4), Mai Tran (1), Kenrick Cato (5), Sarah Rossetti (1)
  • 1: Department of Biomedical Informatics, Columbia University
  • 2: Data Science Institute, Columbia University
  • 3: School of Nursing, Columbia University
  • 4: Harvard Medical School, Brigham and Women’s Hospital
  • 5: University of Pennsylvania
*Contact email: mohtashim_abbas@yahoo.com

Abstract

Free text in nurses’ notes can play an important role in clinical decision-making; however, this information has not been explored to its full potential because it is hard to extract from electronic health records (EHRs). Free text is a subset of the information recorded in nursing notes. Automated extraction of free text is challenging because of the size and structural diversity of EHRs, and understanding these structural and content-level differences is essential for extraction. Free text is embedded in other, relatively structured text, which makes it difficult to detect automatically. Moreover, notes carry no indicator of whether they are free text. As a first step toward automating the extraction process, we explore heuristic-based algorithms with the goal of establishing a baseline and developing an annotated dataset, which could then be used to develop machine learning-based extraction algorithms for a more scalable solution. In this research, we analyze over 200,000 EHR notes and extract 40,000 free text notes from them. Furthermore, we use a unigram language model to analyze the differences between free and structured text to better understand the free text content.
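
As a rough illustration of what a unigram comparison between free and structured text could look like, the sketch below estimates a smoothed unigram model for each group of notes and ranks words by the log ratio of their probabilities. This is not the authors' implementation: the whitespace tokenization, the add-one smoothing, the log-ratio score, and the function names are all assumptions made for the example.

```python
import math
from collections import Counter


def unigram_counts(notes):
    """Count lowercase, whitespace-separated tokens across a list of notes."""
    counts = Counter()
    for note in notes:
        counts.update(note.lower().split())
    return counts


def unigram_log_ratios(free_notes, structured_notes, alpha=1.0):
    """Return {word: log(P_free(word) / P_structured(word))}.

    Probabilities use add-alpha smoothing over the shared vocabulary
    (an assumption of this sketch, not a detail taken from the paper).
    """
    free = unigram_counts(free_notes)
    structured = unigram_counts(structured_notes)
    vocab = set(free) | set(structured)
    free_total = sum(free.values()) + alpha * len(vocab)
    structured_total = sum(structured.values()) + alpha * len(vocab)
    return {
        word: math.log(
            ((free[word] + alpha) / free_total)
            / ((structured[word] + alpha) / structured_total)
        )
        for word in vocab
    }


def top_words(log_ratios, k=5):
    """Words most characteristic of free text (highest log ratio first)."""
    return sorted(log_ratios, key=log_ratios.get, reverse=True)[:k]
```

Words with a positive score are relatively more frequent in the free text group, and words with a negative score are relatively more frequent in the structured group. A real pipeline would likely need clinical-text-aware tokenization and stop-word handling, which this sketch omits.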

Keywords
nursing documentation, health informatics, clinical notes, nursing notes, heuristics, natural language processing, information retrieval, unigram analysis
Published
2024-06-04
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-59717-6_9