Journal Nonlinear World №3 for 2021
Article in issue:
Agglomerative clusterization with DBSCAN algorithm and iterative method
Type of article: scientific article
DOI: https://doi.org/10.18127/j20700970-202103-03
UDC: 004.8
Authors:

D.A. Kuznetsov1, N.P. Plotnikova2, S.A. Fedosin3

1−3 National Research Mordovia State University (Saransk, Mordovia, Russia)

Abstract:

The tasks of grouping, classifying, and clustering data are now widely encountered in many areas of activity, from librarianship to social networks. One complex and important task is the grouping of normative-reference information. Solving this problem makes it possible to perform complex operations on normative-reference information: building a hierarchical structure from scratch, adding new entities or groups to an existing structure, and combining several lists of normative-reference information regardless of whether an original hierarchical structure is present. This article describes the use of DBSCAN as an agglomerative-iterative clusterization algorithm. The iterative part of the algorithm is necessary to cluster the full list of entities at every hierarchical level. Clustering metrics such as the adjusted Rand index, the Jaccard index, and the Fowlkes-Mallows index are considered. A new metric based on these metrics is proposed. A combination of the Word2Vec and TF-IDF algorithms is used to convert the textual names of objects to numerical form.
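
The three clustering metrics named in the abstract can all be expressed through pair counting, and the following minimal sketch illustrates that shared basis. It is not the authors' implementation, and it does not reproduce the new combined metric, whose definition is not given here. The function names and the sample labelings are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_true, labels_pred):
    """Classify every pair of items by whether the two labelings agree.

    Returns (tp, fp, fn, tn):
      tp - pair grouped together in both labelings
      fp - together in the predicted labeling only
      fn - together in the reference labeling only
      tn - separated in both labelings
    """
    tp = fp = fn = tn = 0
    items = list(zip(labels_true, labels_pred))
    for (t1, p1), (t2, p2) in combinations(items, 2):
        same_true = t1 == t2
        same_pred = p1 == p2
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def jaccard_index(labels_true, labels_pred):
    """Pair-counting Jaccard index: tp / (tp + fp + fn)."""
    tp, fp, fn, _ = pair_counts(labels_true, labels_pred)
    denom = tp + fp + fn
    # Assumption: with no co-clustered pairs at all, treat the labelings as identical.
    return tp / denom if denom else 1.0


def fowlkes_mallows_index(labels_true, labels_pred):
    """Geometric mean of pairwise precision and pairwise recall."""
    tp, fp, fn, _ = pair_counts(labels_true, labels_pred)
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index in its pair-counting form."""
    tp, fp, fn, tn = pair_counts(labels_true, labels_pred)
    if fp == 0 and fn == 0:
        return 1.0  # the two labelings induce the same partition
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


if __name__ == "__main__":
    reference = [0, 0, 1, 1]   # hypothetical expert grouping
    predicted = [0, 0, 1, 2]   # hypothetical clustering result
    print(jaccard_index(reference, predicted))
    print(fowlkes_mallows_index(reference, predicted))
    print(adjusted_rand_index(reference, predicted))
```

If DBSCAN marks some entities as noise (conventionally label -1), this sketch would treat all noise points as one cluster. Whether that is appropriate depends on the evaluation protocol, which the abstract does not specify.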

Pages: 29-36
For citation

Kuznetsov D.A., Plotnikova N.P., Fedosin S.A. Agglomerative clusterization with DBSCAN algorithm and iterative method. Nonlinear World. 2021. V. 19. № 3. P. 29−36. DOI: https://doi.org/10.18127/j20700970-202103-03 (In Russian)

Date of receipt: 29.06.2021
Approved after review: 16.07.2021
Accepted for publication: 24.08.2021