
Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques

  • International Journal of Applied Information Systems
  • Foundation of Computer Science (FCS), NY, USA
  • Volume 3 - Number 3
  • Year of Publication: 2012
  • Authors: Hamid Reza Khosravani
  • http:/ijais12-450475
  • Hamid Reza Khosravani 2012. Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques. International Journal of Applied Information Systems. 3, 3 (July 2012), 8-12. DOI=http://dx.doi.org/10.5120/ijais450475
  • @article{10.5120/ijais2017451568,
    author = {Hamid Reza Khosravani},
    title = {Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques},
    journal = {International Journal of Applied Information Systems},
    issue_date = {July 2012},
    volume = {3},
    number = {3},
    month = {July},
    year = {2012},
    issn = {},
    pages = {8-12},
    numpages = {},
    url = {/archives/volume3/number3/210-0475},
    doi = { http:/ijais12-450475},
    publisher = {© 2010 by IJAIS Journal},
    address = {}
    }
    
  • %1 450475
    %A Hamid Reza Khosravani
    %T Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques
    %J International Journal of Applied Information Systems
    %@ 
    %V 3
    %N 3
    %P 8-12
    %D 2012
    %I © 2010 by IJAIS Journal
    

Abstract

Data quality plays an important role in the knowledge discovery process in databases. Researchers have so far proposed two approaches to data quality evaluation. The first is based on statistical methods, while the second uses data mining techniques, which further improve evaluation results by relying on extracted knowledge. Our proposed method follows the second approach and focuses on the accuracy dimension of data quality, covering both its syntactic and semantic aspects.

Keywords

Data Quality Mining, Association Rules, Categorical Feature, Numerical Feature

Index Terms

Computer Science
Information Sciences