Graph-Based Clustering and Data Visualization Algorithms [Vathy-Fogarassy & Abonyi 2013-06-05].pdf
(
3907 KB
)
Pobierz
SPRINGER BRIEFS IN COMPUTER SCIENCE
Ágnes Vathy-Fogarassy
János Abonyi
Graph-Based
Clustering and
Data Visualization
Algorithms
SpringerBriefs in Computer Science
Series Editors
Stan Zdonik
Peng Ning
Shashi Shekhar
Jonathan Katz
Xindong Wu
Lakhmi C. Jain
David Padua
Xuemin Shen
Borko Furht
V. S. Subrahmanian
Martial Hebert
Katsushi Ikeuchi
Bruno Siciliano
For further volumes:
http://www.springer.com/series/10028
Ágnes Vathy-Fogarassy
János Abonyi
Graph-Based Clustering
and Data Visualization
Algorithms
123
Ágnes Vathy-Fogarassy
Computer Science and Systems Technology
University of Pannonia
Veszprém
Hungary
János Abonyi
Department of Process Engineering
University of Pannonia
Veszprém
Hungary
ISSN 2191-5768
ISBN 978-1-4471-5157-9
DOI 10.1007/978-1-4471-5158-6
ISSN 2191-5776 (electronic)
ISBN 978-1-4471-5158-6 (eBook)
Springer London Heidelberg New York Dordrecht
Library of Congress Control Number: 2013935484
Ó
János Abonyi 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed. Exempted from this legal reservation are brief
excerpts in connection with reviews or scholarly analysis or material supplied specifically for the
purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the
work. Duplication of this publication or parts thereof is permitted only under the provisions of
the Copyright Law of the Publisher’s location, in its current version, and permission for use must
always be obtained from Springer. Permissions for use may be obtained through RightsLink at the
Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Clustering, as a special area of data mining, is one of the most commonly used
methods for discovering hidden structure of data. Clustering algorithms group a set
of objects in such a way that objects in the same cluster are more similar to each
other than to those in other clusters. Cluster analysis can be used to quantize data,
extract cluster prototypes for the compact representation of the data set, select
relevant features, segment data into homogeneous subsets, and to initialize
regression and classification models.
Graph-based clustering algorithms are powerful in giving results close to the
human intuition [1]. The common characteristic of graph-based clustering methods
developed in recent years is that they build a graph on the set of data and then use
the constructed graph during the clustering process [2–9]. In graph-based clus-
tering methods objects are considered as vertices of a graph, while edges between
them are treated differently by the various approaches. In the simplest case, the
graph is a complete graph, where all vertices are connected to each other, and the
edges are labeled according to the degree of the similarity of the objects. Con-
sequently, in this case the graph is a weighted complete graph.
In case of large data sets the computation of the complete weighted graph
requires too much time and storage space. To reduce complexity many algorithms
work only with sparse matrices and do not utilize the complete graph. Sparse
similarity matrices contain information only about a small subset of the edges,
mostly those corresponding to higher similarity values. These sparse matrices
encode the most relevant similarity values and graphs based on these matrices
visualize these similarities in a graphical way.
Another way to reduce the time and space complexity is the application of a
vector quantization (VQ) method (e.g. k-means [10], neural gas (NG) [11], Self-
Organizing Map (SOM) [12]). The main goal of the VQ is to represent the entire
set of objects by a set of representatives (codebook vectors), whose cardinality is
much lower than the cardinality of the original data set. If a VQ method is used to
reduce the time and space complexity, and the clustering method is based on
graph-theory, vertices of the graph represent the codebook vectors and the edges
denote the connectivity between them.
Weights assigned to the edges express similarity of pairs of objects. In this book
we will show that similarity can be calculated based on distances or based on
v
Plik z chomika:
musli_com
Inne pliki z tego folderu:
Algorithm Design for Networked Information Technology Systems [Ghosh 2003-11-18].pdf
(122310 KB)
Algorithm Design.pdf
(43807 KB)
3D Imaging in Medicine_ Algorithms, Systems, Applications [Höhne, Fuchs & Pizer 2011-12-08].pdf
(21977 KB)
2D Object Detection and Recognition_ Models, Algorithms, and Networks [Amit 2002-11-01].pdf
(7379 KB)
A History of Algorithms - From the Pebble to the Microchip.djvu
(6719 KB)
Inne foldery tego chomika:
0_Computer History
1_Principles of Programming Languages
3_Theory
4_Theory of Computation
5_Parallel and Distributed
Zgłoś jeśli
naruszono regulamin