
Research Article
Feature Filtering Spectral Clustering Method Based on High Dimensional Online Clustering Method
@INPROCEEDINGS{10.1007/978-3-030-97124-3_14,
  author={Zizhou Feng and Yujian Gu and Bin Yang and Baitong Chen and Wenzheng Bao},
  title={Feature Filtering Spectral Clustering Method Based on High Dimensional Online Clustering Method},
  proceedings={Simulation Tools and Techniques. 13th EAI International Conference, SIMUtools 2021, Virtual Event, November 5-6, 2021, Proceedings},
  proceedings_a={SIMUTOOLS},
  year={2022},
  month={3},
  keywords={Golgi apparatus Malonylation SMOTE Protein},
  doi={10.1007/978-3-030-97124-3_14}
}
Zizhou Feng
Yujian Gu
Bin Yang
Baitong Chen
Wenzheng Bao
Year: 2022
SIMUTOOLS
Springer
DOI: 10.1007/978-3-030-97124-3_14
Abstract
The Golgi apparatus is an important eukaryotic organelle. It plays a key role in protein synthesis in eukaryotic cells, and its dysfunction can lead to various genetic and neurodegenerative diseases. To develop better drugs for these diseases, one key problem is identifying the category of Golgi apparatus proteins. Previous work has often used the physical and chemical properties of Golgi proteins for feature extraction, but accurate identification of sub-Golgi proteins remains a challenge for existing methods. In this paper, we use the TAPE-BERT model to extract features from Golgi proteins. To create a balanced dataset from the unbalanced Golgi dataset, we apply the SMOTE oversampling method. In addition, we select 300 important feature dimensions to identify the types of Golgi proteins. In 10-fold cross-validation and on an independent test set, the accuracy reached 90.6% and 95.31%, respectively.
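The abstract does not say how the SMOTE step was implemented. As a rough illustration only, the core idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours) can be sketched in plain Python. The function name `smote_oversample`, the parameter `k`, and the toy vectors are assumptions made for this sketch and are not taken from the paper; in practice a library such as imbalanced-learn would normally be used.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    Simplified sketch: hypothetical helper, not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy stand-ins for protein feature vectors (made-up data, not from the paper)
majority = [[float(i), float(i % 3)] for i in range(10)]
minority = [[0.0, 5.0], [1.0, 6.0], [2.0, 5.5]]

# Add synthetic minority points until both classes have the same size
n_needed = len(majority) - len(minority)
balanced_minority = minority + smote_oversample(minority, n_needed, k=2)
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region already occupied by the minority data rather than duplicating existing points.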