
2018 Vol.1 Jun No.3

  • Title: Research on Classification and Redundant Information Filtering of Massive Data in Big Database
  • Name: Adair Cornell
  • Company: Aarhus University, Denmark
  • Abstract:

    This paper studies the precise classification of massive data in large databases, together with the filtering of redundant information that acts as interference. In traditional data classification methods, the frequent points are concentrated and are difficult to eliminate. Node classification techniques with low adaptivity reject nodes in regions of high disturbance and deep attenuation, which greatly limits both classification precision and immunity to disturbance. A new optimized data classification method and a corresponding redundant information model are proposed, based on chaotic probability analysis. The classification error rate is mapped to a probability density function using the channel mapping function method, and the classification probability is allocated according to this probability density function. Chaotic probability analysis is then used to generate a random series that reflects the essential features of the data and meets the requirements of random frequency classification. Data clustering and optimized classification are realized in this way. Simulations were run on the KDD_CUP2009 experimental database, and the results show that the proposed method classifies each type of data effectively. Compared with the traditional neural network fuzzy c-means method, the classification precision improved by 17.8%. This indicates that the model and algorithm have excellent classification performance and can be applied in engineering practice to tasks such as data mining, fault diagnosis and target recognition.

  • Keywords: Database; Massive data; Classification; Redundant information; Chaos
  • DOI: 10.12250/jpciams2018030119
  • Citation form: Adair Cornell. Research on Classification and Redundant Information Filtering of Massive Data in Big Database[J]. Computer Informatization and Mechanical System, 2018, vol. 1, pp. 1-7.
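
The abstract describes mapping classification error rates to a probability distribution and driving the allocation with a chaotic random series. The sketch below is only an illustration of that general idea, not the paper's algorithm: the abstract does not name the chaotic map or the mapping function, so the logistic map, the simple normalization of error rates, and all function names here are assumptions made for the example.

```python
def logistic_map_series(x0, n, r=4.0):
    """Return n iterates of the logistic map x -> r * x * (1 - x).

    Assumption: the logistic map stands in for the unspecified
    chaotic generator mentioned in the abstract.
    """
    series = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        series.append(x)
    return series


def error_rates_to_probabilities(error_rates):
    """Normalize per-class error rates so that they sum to 1.

    Assumption: plain normalization replaces the paper's
    channel mapping function, which the abstract does not define.
    """
    total = sum(error_rates)
    if total <= 0:
        raise ValueError("error rates must have a positive sum")
    return [e / total for e in error_rates]


def allocate_classes(probabilities, chaotic_series):
    """Assign a class index to each chaotic value by cumulative lookup.

    Logistic-map values are not uniformly distributed on [0, 1], so
    the resulting class frequencies only approximate `probabilities`.
    """
    cumulative = []
    running = 0.0
    for p in probabilities:
        running += p
        cumulative.append(running)

    picks = []
    for u in chaotic_series:
        for idx, bound in enumerate(cumulative):
            if u <= bound:
                picks.append(idx)
                break
        else:
            # Guard against rounding: fall back to the last class.
            picks.append(len(cumulative) - 1)
    return picks


if __name__ == "__main__":
    probs = error_rates_to_probabilities([0.1, 0.3, 0.6])
    series = logistic_map_series(x0=0.3, n=10)
    print(allocate_classes(probs, series))
```

The cumulative lookup is chosen only because it is the simplest way to turn a value in [0, 1] into a class index; a real implementation would follow the mapping defined in the full paper.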

Tsuruta Institute of Medical Information Technology
Address: [502,5-47-6], Tsuyama, Tsukuba, Saitama, Japan. TEL: 008148-28809. Fax: 008148-28808. Email: jpciams@hotmail.com. 2019-09-16