5th International ICST Conference on Collaborative Computing: Networking, Applications, Worksharing

Research Article

Exploiting tags for concept extraction and information integration

  • @INPROCEEDINGS{10.4108/ICST.COLLABORATECOM2009.8330 ,
        author={Martha L. Escobar-Molano and Antonio Badia and Rafael Alonso},
        title={Exploiting tags for concept extraction and information integration},
        proceedings={5th International ICST Conference on Collaborative Computing: Networking, Applications, Worksharing},
        proceedings_a={COLLABORATECOM},
        year={2009},
        month={12},
        keywords={Centralized control Collaboration Computer science Data analysis Data engineering Data mining Machine learning Ontologies Tagging Vocabulary},
        doi={10.4108/ICST.COLLABORATECOM2009.8330}
    }
    
  • Martha L. Escobar-Molano
    Antonio Badia
    Rafael Alonso
    Year: 2009
    Exploiting tags for concept extraction and information integration
    COLLABORATECOM
    ICST
    DOI: 10.4108/ICST.COLLABORATECOM2009.8330
Martha L. Escobar-Molano (1), Antonio Badia (2), Rafael Alonso (1)
  • 1: SET Corporation, 1005 Glebe Road, Suite 400, Arlington VA
  • 2: Computer Engineering and Computer Science, University of Louisville, Louisville KY

Abstract

The use of tags to annotate content creates an opportunity to explore ways of automating the extraction of semantics from data sources. Semantic information is needed for many complex tasks, such as concept extraction and information integration. To establish the value of user-generated annotation, this paper presents two experiments in which only user tags are used as input. At the core of semantic extraction is the identification of the concepts and relationships present in the data. We show, through an experimental study on tagged photographs, how to extract the concepts associated with photographs and the relationships among those concepts. Our experiments demonstrate that supervised machine learning techniques can extract a concept associated with a photograph with an overall precision score of 80%. Our experiments also show that a variation of the Jaccard similarity coefficient on sets of tags can be used to determine equivalence relationships between the concepts associated with these sets.
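
To make the tag-set comparison concrete, the following is a minimal sketch of the standard Jaccard similarity coefficient applied to two sets of tags. The abstract does not specify which variation of the coefficient the authors use, so this shows only the textbook form. The function names, the tag normalization, the example tags, and the 0.5 threshold are all assumptions made for illustration, not details taken from the paper.

```python
def jaccard(tags_a, tags_b):
    """Standard Jaccard coefficient |A & B| / |A | B| on two tag collections.

    Assumption: tags are compared case-insensitively after trimming
    whitespace. The paper's variation of the coefficient may differ.
    """
    a = {t.strip().lower() for t in tags_a}
    b = {t.strip().lower() for t in tags_b}
    if not a and not b:
        # Two empty tag sets carry no evidence of similarity.
        return 0.0
    return len(a & b) / len(a | b)


def likely_equivalent(tags_a, tags_b, threshold=0.5):
    """Hypothetical decision rule: treat the concepts behind two tag sets
    as equivalent when their similarity reaches the threshold.
    The threshold value is illustrative only."""
    return jaccard(tags_a, tags_b) >= threshold


if __name__ == "__main__":
    # Made-up tag sets for two groups of photographs.
    beach = ["beach", "sea", "sand"]
    shore = ["Sea", "sand", "sun"]
    print(jaccard(beach, shore))
    print(likely_equivalent(beach, shore))
```

In this sketch, two tag sets that share most of their tags score close to 1 and disjoint sets score 0, which is the intuition behind using tag overlap as evidence that two sets describe the same concept.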