Journal Highly Available Systems № 1, 2026
Article in issue:
Composite criteria for assessing thematic similarity of scientific agents
Type of article: scientific article
DOI: https://doi.org/10.18127/j20729472-202601-16
UDC: 519.854.2
Authors:

K.A. Kalugin1

1 Trapeznikov Institute of Control Sciences, Russian Academy of Sciences (Moscow, Russia)

1 netter2@rambler.ru

Abstract:

Problem statement: This work investigates the problem of developing a criterion for assessing the thematic similarity of scientific agents. The exponential growth of scientific publications and the complexity of interdisciplinary collaborations necessitate precise, automated assessment of the thematic proximity of researchers (scientific agents). Existing approaches (scientometric, network, thematic) often fail to account for scientists' dynamic terminological preferences, which leads to inaccuracies in forming research teams, allocating expertise, and identifying overlapping works.

Objective: Development and experimental testing of a new composite criterion for assessing the thematic similarity of scientific agents based on the analysis of their terminological profiles, as well as testing alternative validation methods (super-criteria).

Results: The SPM composite criterion is proposed, integrating rank (Spearman) and linear (Pearson) correlations with a filtering mechanism based on the scalar product of term frequency vectors. An experimental comparison with baseline metrics (Jaccard, Spearman, Pearson) on a sample of researchers from the Institute of Control Sciences RAS, involving expert assessment, demonstrated its higher accuracy. Super-criteria based on hypotheses about co-authorship and cluster membership were developed and tested, confirming their effectiveness for the indirect assessment of similarity criteria quality.
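
The abstract does not give the SPM formula, so the following is only a minimal sketch of how such a composite could be assembled from the ingredients it names. It assumes that each scientific agent is represented by a term frequency vector over a shared vocabulary, that the filter is a threshold on the normalized scalar product, and that the two correlations are combined by a plain average. The function names, the threshold value, and the averaging rule are all hypothetical and are not taken from the article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def jaccard(x, y):
    """Jaccard index over the sets of terms with non-zero frequency."""
    a = {i for i, v in enumerate(x) if v > 0}
    b = {i for i, v in enumerate(y) if v > 0}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalized_dot(x, y):
    """Scalar product of the two vectors divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = sqrt(sum(a * a for a in x))
    ny = sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0


def composite_similarity(x, y, threshold=0.1):
    """Hypothetical composite: filter by scalar product, then average correlations.

    Assumption: the threshold and the averaging rule are illustrative only;
    the article's actual SPM definition is not given in the abstract.
    """
    if normalized_dot(x, y) < threshold:
        return 0.0  # profiles share too little vocabulary; filtered out
    return (spearman(x, y) + pearson(x, y)) / 2


# Toy term frequency vectors over a shared five-term vocabulary
agent_a = [12, 5, 0, 3, 0]
agent_b = [10, 4, 1, 2, 0]
agent_c = [0, 0, 0, 0, 7]

print(composite_similarity(agent_a, agent_b))  # close profiles
print(composite_similarity(agent_a, agent_c))  # disjoint vocabularies, filtered
```

In this sketch the filter keeps pairs with almost no shared vocabulary from receiving a spurious correlation score, which is one plausible reading of the filtering mechanism mentioned above. The baseline metrics (Jaccard, Spearman, Pearson) are exposed as separate functions so they can be compared against the composite on the same vectors.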

Practical significance: The proposed methodology optimizes the process of forming research teams, reduces the time spent searching for relevant experts, and minimizes duplication of research in related fields.

Pages: 81-84
For citation

Kalugin K.A. Composite criteria for assessing thematic similarity of scientific agents. Highly Available Systems. 2026. V. 22. № 1. P. 81–84. DOI: https://doi.org/10.18127/j20729472-202601-16 (in Russian)

Date of receipt: 24.02.2026
Approved after review: 26.02.2026
Accepted for publication: 10.03.2026