2019 VOL.2 Jun No.3

  • Title: Database Annotation Data Clustering Method Based On Semi-Supervised Learning
  • Name: Bingjie Liu
  • Company: Jiangxi Vocational and Technical College of Communication
  • Abstract:

    The traditional k-means clustering method requires the number of clusters to be specified in advance and selects the initial cluster centers at random, which makes the clustering results unstable and the clustering error large. To address these problems, an improved K-means clustering algorithm based on semi-supervised learning is proposed for clustering database annotation data. First, a minimum spanning tree of the graph is built from a small amount of labeled data and split iteratively to obtain the number of clusters and the initial cluster centers required by the K-means algorithm; the remaining data are then clustered following the traditional k-means procedure. Experiments show that, compared with traditional k-means clustering, the method needs fewer iterations, is considerably more stable, and has a lower clustering error.

  • Keyword: Semi-Supervised Learning, Data Clustering, K-Means Clustering Algorithm, Graph Theory Knowledge
  • DOI: 10.12250/jpciams2019030118
  • Citation form: Bingjie Liu. Database Annotation Data Clustering Method Based On Semi-Supervised Learning[J]. Computer Informatization and Mechanical System, 2019, vol. 2, pp. 69-73.
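
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not state how the minimum spanning tree is split, so the rule used here (repeatedly cut the longest edge of any subtree that still mixes labels) is an assumption, and all function names (`minimum_spanning_tree`, `seed_centers`, `kmeans`) are hypothetical. The sketch only shows how a few labeled points could supply both the cluster number and the initial centers before a standard k-means loop runs.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns edges (i, j, w)."""
    best = {j: (dist(points[0], points[j]), 0) for j in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        w, i = best.pop(j)
        edges.append((i, j, w))
        for k in best:  # relax distances through the newly added node
            d = dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges

def components(n, edges):
    """Connected components of a forest on nodes 0..n-1 (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j, _ in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

def seed_centers(points, labels):
    """Split the MST of the labeled points until every subtree has one label.

    Assumed splitting rule (not taken from the paper): cut the longest edge
    of a subtree that still contains more than one label.
    """
    edges = minimum_spanning_tree(points)
    while True:
        comps = components(len(points), edges)
        mixed = [c for c in comps if len({labels[v] for v in c}) > 1]
        if not mixed:
            break
        members = set(mixed[0])
        longest = max((e for e in edges if e[0] in members), key=lambda e: e[2])
        edges.remove(longest)
    # One center per subtree; the number of subtrees is the cluster number k.
    return [centroid([points[v] for v in c]) for c in comps]

def kmeans(data, centers, max_iter=100):
    """Standard k-means (Lloyd) iterations starting from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in data:
            idx = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            clusters[idx].append(p)
        new_centers = [centroid(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Toy usage: four labeled points seed the clustering of eight points in total.
labeled_points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels = ["a", "a", "b", "b"]
unlabeled_points = [(1, 0), (1, 1), (9, 10), (11, 11)]

initial_centers = seed_centers(labeled_points, labels)
final_centers, clusters = kmeans(labeled_points + unlabeled_points, initial_centers)
```

Because the initial centers come from the labeled data rather than from random draws, repeated runs on the same input start from the same state, which is the kind of stability the abstract attributes to the method.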
Tsuruta Institute of Medical Information Technology
Address:[502,5-47-6], Tsuyama, Tsukuba, Saitama, Japan TEL:008148-28809 fax:008148-28808 Japan,Email:jpciams@hotmail.com,2019-09-16