Validity studies among hierarchical methods of cluster analysis using cophenetic correlation coefficient
DOI: https://doi.org/10.15392/bjrs.v7i2A.668
Keywords: cluster analysis, cophenetic correlation coefficient, INAA
Abstract
The literature presents many methods for partitioning a data set, and it is difficult to choose the most suitable one, since different combinations of clustering methods and dissimilarity measures can lead to different grouping patterns and to false interpretations. Nevertheless, little effort has been expended in evaluating these methods empirically on archaeological data sets. The objective of this work is therefore to compare different hierarchical cluster analysis methods and identify the most appropriate one. The study used a data set of 45 samples of ceramic fragments analyzed by instrumental neutron activation analysis (INAA). The methods compared were single linkage, complete linkage, average linkage, centroid, and Ward. Validation was done using the cophenetic correlation coefficient; comparing these values, the average linkage method obtained the best results. An R script with some functions was created to obtain the cophenetic correlation. These values made it possible to choose the most appropriate method for the data set.
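The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' R script: it is written in Python with only the standard library, it assumes Euclidean distance, it covers only single, complete, and average linkage (centroid and Ward are left out for brevity), and the `samples` values are hypothetical toy data rather than the INAA measurements. The function names `cophenetic_distances`, `pearson`, and `cophenetic_correlation` are introduced here for illustration. The cophenetic correlation coefficient is taken to be the Pearson correlation between the original pairwise distances and the heights at which each pair of samples is first merged in the dendrogram.

```python
from itertools import combinations
from math import dist, sqrt

METHODS = ("single", "complete", "average")


def cophenetic_distances(points, method="average"):
    """Naive agglomerative clustering; returns {(i, j): height of first merge}."""
    if method not in METHODS:
        raise ValueError(f"unsupported method: {method}")
    n = len(points)
    members = {i: [i] for i in range(n)}  # active cluster id -> sample indices
    # Distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): dist(points[i], points[j]) for i, j in combinations(range(n), 2)}
    coph = {}
    next_id = n
    while len(members) > 1:
        # Merge the closest pair of active clusters.
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        del d[(a, b)]
        # Every pair split across a and b is joined for the first time here.
        for i in members[a]:
            for j in members[b]:
                coph[(min(i, j), max(i, j))] = height
        na, nb = len(members[a]), len(members[b])
        for k in list(members):
            if k in (a, b):
                continue
            dka = d.pop((min(k, a), max(k, a)))
            dkb = d.pop((min(k, b), max(k, b)))
            if method == "single":
                d[(k, next_id)] = min(dka, dkb)
            elif method == "complete":
                d[(k, next_id)] = max(dka, dkb)
            else:  # average linkage: size-weighted mean of the two distances
                d[(k, next_id)] = (na * dka + nb * dkb) / (na + nb)
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return coph


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


def cophenetic_correlation(points, method):
    """Correlate original pairwise distances with cophenetic distances."""
    pairs = list(combinations(range(len(points)), 2))
    original = [dist(points[i], points[j]) for i, j in pairs]
    coph = cophenetic_distances(points, method)
    return pearson(original, [coph[p] for p in pairs])


# Hypothetical toy data: two well-separated groups of three samples each.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
results = {m: cophenetic_correlation(samples, m) for m in METHODS}
for name, r in results.items():
    print(f"{name:9s} cophenetic correlation = {r:.4f}")
best = max(results, key=results.get)
print("highest coefficient:", best)
```

Under this sketch, the method with the highest coefficient is the one whose dendrogram distorts the original distances least, which is the selection criterion the abstract describes. The ranking on the toy data says nothing about the ranking on the ceramic data set.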
Copyright (c) 2021 Brazilian Journal of Radiation Sciences
This work is licensed under a Creative Commons Attribution 4.0 International License.
Licensing: The BJRS articles are licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/