Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24–25, 2019, Proceedings

Research Article

Noise Reduction in Network Embedding

  • @INPROCEEDINGS{10.1007/978-3-030-32388-2_10,
        author={Cong Li and Donghai Guan and Zhiyuan Cui and Weiwei Yuan and Asad Khattak and Muhammad Fahim},
        title={Noise Reduction in Network Embedding},
        proceedings={Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24--25, 2019, Proceedings},
        proceedings_a={MLICOM},
        year={2019},
        month={10},
        keywords={Network embedding, Noise identification, Voting},
        doi={10.1007/978-3-030-32388-2_10}
    }
    
  • Cong Li
    Donghai Guan
    Zhiyuan Cui
    Weiwei Yuan
    Asad Khattak
    Muhammad Fahim
    Year: 2019
    Noise Reduction in Network Embedding
    MLICOM
    Springer
    DOI: 10.1007/978-3-030-32388-2_10
Cong Li*, Donghai Guan*, Zhiyuan Cui*, Weiwei Yuan*, Asad Khattak1,*, Muhammad Fahim2,*
  • 1: Zayed University
  • 2: Innopolis University
*Contact email: 18851870127@163.com, dhguan@nuaa.edu.cn, 565508802@qq.com, yuanweiwei@nuaa.edu.cn, Asad.Khattak@zu.ac.ae, m.fahim@innopolis.ru

Abstract

Network embedding aims to learn latent representations that effectively preserve the structure of a network and the information of its vertices. Recently, networks with rich side information, such as vertex labels and links between vertices, have attracted significant interest because of their wide applications, such as node classification and link prediction. It is well known that real-world networks often contain mislabeled vertices and edges, which cause the embedding to preserve erroneous information. However, current semi-supervised graph embedding algorithms assume that vertex labels are ground truth. Manually relabeling all mislabeled vertices is rarely feasible; therefore, effectively reducing noise so as to maximize performance on graph analysis tasks is extremely important. In this paper, we focus on reducing the label noise ratio in a dataset to obtain a more reasonable embedding. We propose two methods that can be applied to any semi-supervised network embedding algorithm: the first uses a model to identify potentially noisy vertices and correct them, and the second uses two voting strategies to relabel vertices precisely. To the best of our knowledge, we are the first to tackle this issue in network embedding. Our experiments are conducted on three public data sets.
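
The abstract does not say which two voting strategies are used. As an illustration only, the sketch below assumes the two most common ones in the label-noise literature, majority voting and consensus voting, applied to the predictions of several already-trained models. The function name `vote_relabel` and its parameters are hypothetical and are not taken from the paper.

```python
from collections import Counter


def vote_relabel(labels, predictions, strategy="majority"):
    """Return a copy of `labels` with suspected noisy labels replaced.

    Hypothetical helper, not the paper's implementation.

    labels:      the given (possibly noisy) label of each vertex
    predictions: one list of predicted labels per model, aligned with `labels`
    strategy:    "majority"  relabels a vertex when more than half of the
                             models agree on a label other than the given one
                 "consensus" relabels only when every model agrees on it
    """
    if strategy not in ("majority", "consensus"):
        raise ValueError("strategy must be 'majority' or 'consensus'")

    n_models = len(predictions)
    # Number of agreeing models required before a label is overwritten.
    needed = n_models if strategy == "consensus" else n_models // 2 + 1

    cleaned = list(labels)
    for i, given in enumerate(labels):
        votes = Counter(model[i] for model in predictions)
        top_label, top_count = votes.most_common(1)[0]
        if top_label != given and top_count >= needed:
            cleaned[i] = top_label
    return cleaned


# Toy example: four vertices, three models.
given_labels = ["a", "b", "a", "b"]
model_predictions = [
    ["a", "a", "b", "b"],
    ["a", "a", "b", "b"],
    ["a", "a", "a", "b"],
]
majority_result = vote_relabel(given_labels, model_predictions, "majority")
consensus_result = vote_relabel(given_labels, model_predictions, "consensus")
```

In this toy example, majority voting relabels the second and third vertices, while consensus voting relabels only the second, because one model still agrees with the given label of the third. Consensus voting is the more conservative choice: it changes fewer labels and is less likely to corrupt a correct one.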