Mining User-Generated Content for Security

Research Article

Security Level Classification of Confidential Documents Written in Turkish

  • @INPROCEEDINGS{10.1007/978-3-642-12630-7_41,
        author={Erdem Alparslan and Hayretdin Bahsi},
        title={Security Level Classification of Confidential Documents Written in Turkish},
        proceedings={Mining User-Generated Content for Security},
        proceedings_a={MINUCS},
        year={2012},
        month={10},
        keywords={document classification, security, Turkish, support vector machine, na{\"\i}ve Bayes, TF-IDF, stemming, data loss prevention},
        doi={10.1007/978-3-642-12630-7_41}
    }
    
  • Erdem Alparslan
    Hayretdin Bahsi
    Year: 2012
    Security Level Classification of Confidential Documents Written in Turkish
    MINUCS
    Springer
    DOI: 10.1007/978-3-642-12630-7_41
Erdem Alparslan1,*, Hayretdin Bahsi1,*
  • 1: National Research Institute of Electronics and Cryptology-TUBITAK
*Contact email: ealparslan@uekae.tubitak.gov.tr, bahsi@uekae.tubitak.gov.tr

Abstract

This article introduces a methodology for classifying the security level of confidential documents written in Turkish. Internal documents of TUBITAK UEKAE holding various security levels (unclassified, restricted, secret) were classified using Support Vector Machines (SVMs) [1] and naïve Bayes classifiers [3][9]. To represent term-document relations, the recommended TF-IDF metric [2] was chosen to construct a weight matrix. Compared with English, Turkic languages pose a particularly difficult natural language processing problem: stemming. The Turkish stemming tool Zemberek was used to extract features stripped of their suffixes. Experimental results and success metrics are presented at the end of the article.
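
As a rough illustration of the TF-IDF weight matrix mentioned in the abstract, the following Python sketch builds one from a handful of token lists. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which TF-IDF variant was used, so this assumes relative term frequency multiplied by log(N / df). The function name `tfidf_matrix` and the toy Turkish tokens are hypothetical, and the tokens are assumed to have already been stemmed (for example by Zemberek).

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocabulary, weight matrix) for a list of token lists.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))

    matrix = []
    for doc in docs:
        counts = Counter(doc)
        row = []
        for term in vocab:
            tf = counts[term] / len(doc) if doc else 0.0
            idf = math.log(n_docs / df[term])
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix


# Hypothetical, already-stemmed toy documents (not from the paper's corpus).
docs = [
    ["gizli", "rapor", "proje"],
    ["rapor", "toplantı", "duyuru"],
    ["gizli", "anahtar", "proje", "proje"],
]
vocab, weights = tfidf_matrix(docs)
```

Each row of `weights` is one document and each column one vocabulary term; rows of this kind are what a classifier such as an SVM or naïve Bayes would take as input features, alongside the document's security-level label.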

Keywords
document classification, security, Turkish, support vector machine, naïve Bayes, TF-IDF, stemming, data loss prevention
Published
2012-10-23
http://dx.doi.org/10.1007/978-3-642-12630-7_41