Performance Analysis of Dimensionality Reduction Techniques in the Context of Clustering

Authors: T. Sudha and P. Nagendra Kumar
Volume 8, No. 3, Special Issue: June 2019, pp. 66-71

Abstract

Data mining is a major area of research, and clustering is one of its main functionalities. High dimensionality is a central difficulty in clustering, and dimensionality reduction can be used to address it. This work presents a comparative study of two dimensionality reduction techniques, t-distributed stochastic neighbour embedding (t-SNE) and probabilistic principal component analysis (PPCA), in the context of clustering. High-dimensional data were reduced to low-dimensional data with each technique. Cluster analysis was then performed on the high-dimensional data and on the low-dimensional data sets obtained through t-SNE and PPCA, with a varying number of clusters. Mean squared error, time, and space were used as the parameters for comparison. The results show that converting the high-dimensional data into low-dimensional data takes longer with PPCA than with t-SNE. The data set reduced through PPCA requires less storage space than the data set reduced through t-SNE.
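The comparison protocol described in the abstract (cluster the original and the reduced data for several cluster counts, then record mean squared error, time, and space) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed; mean squared error is assumed to mean the average squared distance from each point to its nearest centroid; space is assumed to be 8 bytes per stored value; the data are synthetic; and `keep_first_columns` is a hypothetical placeholder reducer that stands in for t-distributed stochastic neighbour embedding or probabilistic principal component analysis, neither of which is implemented here.

```python
import random
import time


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def clustering_mse(points, centroids):
    """Assumed error measure: mean squared distance to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)


def storage_bytes(points, bytes_per_value=8):
    """Assumed space measure: rows x columns x 8 bytes (double precision)."""
    return len(points) * len(points[0]) * bytes_per_value


def keep_first_columns(points, q):
    """Hypothetical placeholder reducer, NOT t-SNE or PPCA."""
    return [p[:q] for p in points]


def make_blobs(n_per_blob=30, dims=10, seed=0):
    """Synthetic data: two Gaussian blobs centred at 0 and 10 in every dimension."""
    rng = random.Random(seed)
    return [
        [rng.gauss(center, 1.0) for _ in range(dims)]
        for center in (0.0, 10.0)
        for _ in range(n_per_blob)
    ]


def evaluate(points, cluster_counts):
    """Cluster for each k and record the error and the clustering time."""
    report = {}
    for k in cluster_counts:
        start = time.perf_counter()
        centroids = kmeans(points, k)
        elapsed = time.perf_counter() - start
        report[k] = {"mse": clustering_mse(points, centroids), "seconds": elapsed}
    return report


high = make_blobs()
start = time.perf_counter()
low = keep_first_columns(high, 2)  # swap in a real reducer here
reduction_seconds = time.perf_counter() - start

for name, data in (("original", high), ("reduced", low)):
    print(name, storage_bytes(data), "bytes")
    for k, row in evaluate(data, (1, 2, 3)).items():
        print(f"  k={k}  mse={row['mse']:.3f}  time={row['seconds']:.5f}s")
print(f"reduction time: {reduction_seconds:.6f}s")
```

To compare real techniques, the placeholder would be replaced by a function that maps the high-dimensional rows to low-dimensional rows; the timing around that call then gives the conversion time, and `storage_bytes` gives the space figure for each reduced data set.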

Keywords

Clustering, Dimensionality Reduction, t-distributed Stochastic Neighbour Embedding, Probabilistic Principal Component Analysis



Asian Journal of Computer Science and Technology is a peer-reviewed international journal that publishes high-quality scientific articles (both theory and practice) and research papers covering all aspects of future computer and information technology areas.


Editor-in-Chief
Dr. K. Ganesh
Global Lead, Supply Chain Management, Center of Competence and Senior Knowledge Expert at McKinsey and Company, India

Editorial Advisory Board
Dr. Eng. Hamid Ali Abed AL-Asadi
Department of Computer Science, Basra University, Iraq
Dr. Norjihan Binti Abdul Ghani
Department of Information System, University of Malaya, Malaysia
Dr. Christos Bouras
Department of Computer Engineering & Informatics, University of Patras, Greece
Dr. Maizatul Akmar Binti Ismail
Department of Information System, University of Malaya, Malaysia
Dr. Harold Castro
Department of Systems Engineering and Computing, University of the Andes, Colombia
Dr. Busyairah Binti Syd Ali
Department of Software Engineering, University of Malaya, Malaysia
Dr. Sri Devi Ravana
Department of Information System, University of Malaya, Malaysia
Dr. Karpaga Selvi Subramanian
Department of Computer Engineering, Mekelle University, Ethiopia
Dr. Mazliza Binti Othman
Department of Computer System & Technology, University of Malaya, Malaysia
Dr. Chiam Yin Kia
Department of Software Engineering, University of Malaya, Malaysia
Dr. OUH Eng Lieh
Department of Information Systems, Singapore Management University, Singapore



© 2015 The Research Publication. All rights reserved.
