An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality

Amanpreet Kaur Toor; Amarpreet Singh

International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 7 - Number 2
Year of Publication: 2014
Authors: Amanpreet Kaur Toor, Amarpreet Singh
10.5120/ijais14-451136

Amanpreet Kaur Toor and Amarpreet Singh 2014. An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality. International Journal of Applied Information Systems. 7, 2 (April 2014), 5-9. DOI=http://dx.doi.org/10.5120/ijais14-451136

@article{10.5120/ijais2017451568,
author = {Amanpreet Kaur Toor and Amarpreet Singh},
title = {An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality},
journal = {International Journal of Applied Information Systems},
issue_date = {April 2014},
volume = {7},
number = {},
month = {April},
year = {2014},
issn = {},
pages = {5-9},
numpages = {},
url = {/archives/volume7/number2/618-1136},
doi = { 10.5120/ijais14-451136},
publisher = {© 2013 by IJAIS Journal},
address = {}
}

%1 451136
%A Amanpreet Kaur Toor
%A Amarpreet Singh
%T An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality
%J International Journal of Applied Information Systems
%@ 
%V 7
%N 
%P 5-9
%D 2014
%I © 2013 by IJAIS Journal

Abstract

Cluster analysis is one of the critical methods in data mining, and the choice of clustering algorithm directly affects the clustering results. This paper proposes an Advanced Clustering Algorithm (ACA) to address the concerns of high dimensionality and large data sets [1]. ACA avoids repeatedly recomputing the distance of each data object to the cluster, which saves execution time. It requires only a simple data structure to store information in each iteration for use in the next iteration. Experimental results show that ACA can effectively improve the speed and accuracy of clustering while reducing the computational complexity relative to the traditional Kohonen SOM algorithm. This paper presents ACA and its simulated experimental results on different data sets.
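The abstract does not spell out the steps of ACA, so the following is only a minimal sketch of the general idea it describes: keep a small per-point record (assigned cluster and distance to that cluster's center) from one iteration and reuse it in the next to skip distance computations. It assumes a k-means-style procedure with Euclidean distance, in the spirit of the enhanced k-means of Fahim et al. listed in the references. The function name `cached_kmeans` and all parameters are hypothetical, and the shortcut is a heuristic that may not always match an exact nearest-center assignment.

```python
import math
import random


def cached_kmeans(points, k, max_iter=100, seed=0):
    """k-means-style clustering that caches per-point state between iterations.

    Illustrative assumption only; this is not the ACA procedure from the paper.
    Returns (centers, labels, full_searches), where full_searches counts how
    often a point had to be compared against every center.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)        # cached cluster index per point
    cached = [math.inf] * len(points)  # cached distance to the assigned center
    full_searches = 0

    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            if labels[i] >= 0:
                d_own = math.dist(p, centers[labels[i]])
                if d_own <= cached[i]:
                    # The point is no farther from its own center than before:
                    # keep the assignment and skip the other k-1 distances.
                    cached[i] = d_own
                    continue
            # Otherwise fall back to a full nearest-center search.
            full_searches += 1
            dists = [math.dist(p, c) for c in centers]
            best = min(range(k), key=dists.__getitem__)
            if best != labels[i]:
                changed = True
            labels[i], cached[i] = best, dists[best]

        # Recompute each center as the mean of its current members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # leave an empty cluster's center as is
        centers = new_centers
        if not changed:
            break  # no label changed, so the centers are stable

    return centers, labels, full_searches
```

For example, calling `cached_kmeans(points, 2)` on a list of 2-D tuples returns the final centers, one label per point, and the number of full searches performed; comparing that count with `len(points)` times the number of iterations gives a rough sense of how many distance computations the cache avoided.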

References

Yuan F, Meng Z. H, Zhang H. X and Dong C. R, "A New Algorithm to Get the Initial Centroids," Proc. of the 3rd International Conference on Machine Learning and Cybernetics, pp. 26–29, August 2004.
Sun Jigui, Liu Jie, Zhao Lianyu, "Clustering algorithms Research", Journal of Software, Vol 19, No 1, pp. 48-61, January 2008.
Amanpreet Kaur Toor, Amarpreet Singh, "Analysis of Clustering Algorithm based on Number of Clusters, error rate, Computation Time and Map Topology on large Data Set", International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Volume 2, Issue 6, November-December 2013.
Amanpreet Kaur Toor, Amarpreet Singh, "A Survey paper on recent clustering approaches in data mining", International Journal of Advanced Research in Computer Science and Software Engineering, Vol 3, Issue 11, November 2013.
Sun Shibao, Qin Keyun, "Research on Modified K-means Data Cluster Algorithm", Computer Engineering, vol. 33, No. 13, pp. 200–201, July 2007.
Merz C and Murphy P, UCI Repository of Machine Learning Databases, Available: ftp://ftp.ics.uci.edu/pub/machine-learning-databases
Fahim A M, Salem A M, Torkey F A, "An efficient enhanced k-means clustering algorithm", Journal of Zhejiang University Science A, Vol. 10, pp. 1626-1633, July 2006.
Zhao YC, Song J. GDILC: A grid-based density isoline clustering algorithm. In: Zhong YX, Cui S, Yang Y, eds. Proc. of the Internet Conf. on Info-Net. Beijing: IEEE Press, 2001. 140–145. http://ieeexplore.ieee.org/iel5/7719/21161/00982709.pdf
Huang Z, "Extensions to the k-means algorithm for clustering large data sets with categorical values," Data Mining and Knowledge Discovery, Vol. 2, pp. 283–304, 1998.
K. A. Abdul Nazeer, M. P. Sebastian, "Improving the Accuracy and Efficiency of the k-means Clustering Algorithm", Proceedings of the World Congress on Engineering, vol 1, London, July 2009.
Fred ALN, Leitão JMN. Partitional vs hierarchical clustering using a minimum grammar complexity approach. In: Proc. of the SSPR & SPR 2000. LNCS 1876, 2000. 193–202. http://www.sigmod.org/dblp/db/conf/sspr/sspr2000.htm
Gelbard R, Spiegler I. Hempel's raven paradox: A positive approach to cluster analysis. Computers and Operations Research, 2000, 27(4): 305–320.
Huang Z. A fast clustering algorithm to cluster very large categorical data sets in data mining. In: Proc. of the SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery. Tucson, 1997. 146–151.
Ding C, He X. K-Nearest-Neighbor in data clustering: Incorporating local information into global optimization. In: Proc. of the ACM Symp. on Applied Computing. Nicosia: ACM Press, 2004. 584–589. http://www.acm.org/conferences/sac/sac2004/
Hinneburg A, Keim D. An efficient approach to clustering in large multimedia databases with noise. In: Agrawal R, Stolorz PE, Piatetsky-Shapiro G, eds. Proc. of the 4th Int'l Conf. on Knowledge Discovery and Data Mining (KDD'98). New York: AAAI Press, 1998. 58–65.
Zhang T, Ramakrishnan R, Livny M. BIRCH: An efficient data clustering method for very large databases. In: Jagadish HV, Mumick IS, eds. Proc. of the 1996 ACM SIGMOD Int'l Conf. on Management of Data. Montreal: ACM Press, 1996. 103–114.
Birant D, Kut A. ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data & Knowledge Engineering, 2007, 60(1): 208–221.

Keywords

ACA, SOM, Clustering, Large Data Set, High Dimensionality, Cluster Analysis

Index Terms

Computer Science

Information Sciences
