Number 5

MCAIM: Modified CAIM Discretization Algorithm for Classification

  • Shivani V. Vora and R. G. Mehta 2012. MCAIM: Modified CAIM Discretization Algorithm for Classification. International Journal of Applied Information Systems. 3, 5 (July 2012), 42-50. DOI=http://dx.doi.org/10.5120/ijais450542
  • @article{10.5120/ijais2017451568,
    author = {Shivani V. Vora and R. G. Mehta},
    title = {MCAIM: Modified CAIM Discretization Algorithm for Classification},
    journal = {International Journal of Applied Information Systems},
    issue_date = {July 2012},
    volume = {3},
    number = {5},
    month = {July},
    year = {2012},
    issn = {},
    pages = {42-50},
    numpages = {},
    url = {/archives/volume3/number5/230-0542},
    doi = { 10.5120/ijais12-450542},
    publisher = {© 2010 by IJAIS Journal},
    address = {}
    }
    
  • %1 450542
    %A Shivani V. Vora
    %A R. G. Mehta
    %T MCAIM: Modified CAIM Discretization Algorithm for Classification
    %J International Journal of Applied Information Systems
    %@ 
    %V 3
    %N 5
    %P 42-50
    %D 2012
    %I © 2010 by IJAIS Journal
    

Abstract

Discretization is the process of dividing a continuous attribute into a finite set of intervals, producing an attribute with a small number of distinct values by associating a discrete numerical value with each generated interval. Discretization is usually performed prior to the learning process and plays an important role in data mining and knowledge discovery. The results of the CAIM discretization algorithm are not satisfactory in some cases, which led us to modify the algorithm. The Modified CAIM (MCAIM) algorithm is compared with other discretization techniques on classification accuracy and outperforms them. Because MCAIM generates a larger number of intervals, the CAIR criterion is used to merge intervals in MCAIM discretization. The resulting scheme gives better classification accuracy with a reduced number of intervals.
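To make the two criteria named in the abstract concrete, the following is a minimal Python sketch. It assumes the standard definitions from the discretization literature: the CAIM criterion of Kurgan and Cios, and CAIR as mutual information divided by joint entropy. The abstract does not describe how MCAIM modifies CAIM, so that modification is not reproduced here. The function names, the toy data, and the cut points are all hypothetical and chosen only for illustration.

```python
import math
from bisect import bisect_left


def quanta_matrix(values, labels, boundaries):
    """Count class labels per interval.

    `boundaries` is a sorted list of inner cut points; interval r covers
    (boundaries[r-1], boundaries[r]], so a value equal to a cut point
    falls into the lower interval.
    """
    classes = sorted(set(labels))
    counts = [[0] * (len(boundaries) + 1) for _ in classes]
    for v, y in zip(values, labels):
        r = bisect_left(boundaries, v)
        counts[classes.index(y)][r] += 1
    return counts


def caim(counts):
    """Standard CAIM criterion: mean over intervals of max_r^2 / M_r."""
    n_intervals = len(counts[0])
    total = 0.0
    for r in range(n_intervals):
        column = [row[r] for row in counts]
        m_r = sum(column)
        if m_r:  # skip empty intervals to avoid division by zero
            total += max(column) ** 2 / m_r
    return total / n_intervals


def cair(counts):
    """CAIR: mutual information I(C;D) divided by joint entropy H(C,D)."""
    m = sum(map(sum, counts))
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(row[r] for row in counts) for r in range(len(counts[0]))]
    info = 0.0
    entropy = 0.0
    for i, row in enumerate(counts):
        for r, c in enumerate(row):
            if c:
                p = c / m
                info += p * math.log2(p / ((row_tot[i] / m) * (col_tot[r] / m)))
                entropy -= p * math.log2(p)
    return info / entropy if entropy else 0.0


# Hypothetical toy attribute with two classes.
values = [1.0, 1.5, 2.0, 5.0, 5.5, 6.0]
labels = ["a", "a", "a", "b", "b", "b"]

good = quanta_matrix(values, labels, [3.0])  # cut separates the classes
bad = quanta_matrix(values, labels, [1.2])   # cut mixes the classes

print(caim(good), cair(good))
print(caim(bad), cair(bad))
```

On this toy data the cut point at 3.0 separates the two classes cleanly, so both criteria score it higher than the cut at 1.2. A merging step of the kind the abstract mentions could, under these assumptions, compare CAIR before and after joining two adjacent intervals.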

Keywords

Discretization, Class-attribute interdependency maximization, CAIM, MCAIM, CAIR

Index Terms

Computer Science
Information Sciences