Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms

  • Ajiboye Adeleke R. and Isah-kebbe Hauwau and Oladele Tinuke O. 2014. Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms. International Journal of Applied Information Systems. 7, 7 (August 2014), 21-26. DOI=http://dx.doi.org/10.5120/ijais451211
  • @article{10.5120/ijais2017451568,
    author = {Ajiboye Adeleke R. and Isah-kebbe Hauwau and Oladele Tinuke O.},
    title = {Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms},
    journal = {International Journal of Applied Information Systems},
    issue_date = {August 2014},
    volume = {7},
    number = {},
    month = {August},
    year = {2014},
    issn = {},
    pages = {21-26},
    numpages = {},
    url = {/archives/volume7/number7/668-1211},
    doi = { 10.5120/ijais14-451211},
    publisher = {© 2013 by IJAIS Journal},
    address = {}
    }
    
  • %1 451211
    %A Ajiboye Adeleke R. 
    %A Isah-kebbe Hauwau
    %A Oladele Tinuke O.
    %T Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms
    %J International Journal of Applied Information Systems
    %@ 
    %V 7
    %N 
    %P 21-26
    %D 2014
    %I © 2013 by IJAIS Journal
    

Abstract

Exploring the features of a dataset through clustering algorithms is a viable way to reveal a conceptual description of the data for better understanding, grouping and decision making. Some clustering algorithms, especially partitioning-based ones, cluster any data presented to them even when no similar features are present. This study explores the performance accuracy of partitioning-based algorithms and a probabilistic model-based algorithm. Experiments were conducted using k-means, k-medoids and the EM algorithm. The study implements each algorithm in RapidMiner, and the generated results were validated for correctness in accordance with the external criteria method. The clusters formed reveal the capabilities and drawbacks of each algorithm on the data points.
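As a rough illustration of the ideas named in the abstract, the following Python sketch runs a minimal k-means and scores the result against known class labels using purity, one common external criterion. It is not the authors' RapidMiner workflow. The function names, the farthest-first initialisation, the toy data points and the choice of purity are all assumptions made for this example; the abstract does not say which external measure the study used.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centroids(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=100):
    """Minimal Lloyd-style k-means; returns one cluster label per point."""
    centroids = farthest_first_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def purity(cluster_labels, true_labels):
    """External criterion: fraction of points that belong to the
    majority true class of the cluster they were assigned to."""
    matched = 0
    for cluster in set(cluster_labels):
        members = [t for c, t in zip(cluster_labels, true_labels) if c == cluster]
        matched += Counter(members).most_common(1)[0][1]
    return matched / len(true_labels)


# Hypothetical toy data: two well-separated groups with known classes.
grouped_points = [
    (1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (1.1, 1.3),
    (8.0, 8.2), (8.3, 7.9), (7.9, 8.1), (8.2, 8.4),
]
grouped_truth = ["a"] * 4 + ["b"] * 4
grouped_labels = kmeans(grouped_points, k=2)
print("purity on grouped data:", purity(grouped_labels, grouped_truth))

# Evenly spaced points have no natural grouping, yet k-means still
# partitions them into k clusters.
even_points = [(float(i),) for i in range(8)]
even_labels = kmeans(even_points, k=2)
print("clusters found on evenly spaced data:", len(set(even_labels)))
```

The second call shows the behaviour the abstract warns about: a partitioning algorithm returns k clusters whether or not the data contain groups, which is why checking the result against an external reference is useful.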

Keywords

Clustering, Algorithm, K-means, EM-clustering, K-medoids

Index Terms

Computer Science
Information Sciences