Journal: Highly Available Systems, № 2, 2021
Article in issue:
Cluster analysis of information system users based on keyboard dynamics characteristics
Type of article: scientific article
DOI: https://doi.org/10.18127/j20729472-202102-04
UDC: 004.8
Authors:

A.P. Karpenko, Yu.V. Yamchenko, D.S. Dubrovkin

Bauman Moscow State Technical University (Moscow, Russia)

Abstract:

The development of new big data and machine learning technologies has recently led to a growing number of studies of human behavioral patterns in the information environment. These studies belong to psychoinformatics, a field that emerged at the intersection of computer science and psychology. Its results are applied in areas such as electronic commerce, information security, and human resource management. The research methods of this field often require additional interaction between users and the research system during data collection. In addition, the amount of data that can be collected from open sources is limited by the personal data protection policies of those resources. This creates a need to find alternative sources of information and to develop new methods for analyzing it.

We consider user keystroke dynamics (KD) as such a source. KD methods allow continuous, hidden monitoring of the user's state through common input devices such as the keyboard and mouse. This opens up significant opportunities for collecting data and for analyzing and assessing user behavior during everyday activities.

Most work on keystroke dynamics addresses user authentication and identification. However, it is also of interest to identify groups of users with similar KD characteristics. Solving this problem makes it possible to approach the assessment of users' emotional states and personality types based on KD.

This article begins a series of works devoted to studying user behavior characteristics and personality types using methods of KD analysis. The purpose of this work is to study the feasibility of clustering users by their KD characteristics. The objectives of the study are to organize the collection of user data, build a KD characteristic vector for each user, assess the correlations between user KD characteristics, and conduct a cluster analysis of the users' KD characteristic vectors.
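
The objectives listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the KD characteristics or the clustering algorithm used, so the two features (mean key dwell time and mean flight time between keys), the k-means procedure, and all data and function names below are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import mean

def kd_vector(events):
    """Build a toy KD characteristic vector from (press_ms, release_ms) pairs.

    Assumed features: mean dwell time (key held down) and mean flight time
    (gap between releasing one key and pressing the next).
    """
    dwell = [release - press for press, release in events]
    flight = [events[i + 1][0] - events[i][1] for i in range(len(events) - 1)]
    return (mean(dwell), mean(flight))

def pearson(xs, ys):
    """Pearson correlation coefficient between two feature columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def dist2(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (first k points as centroids)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels

# Hypothetical key events for four users (invented numbers, milliseconds).
users = {
    "u1": [(0, 80), (200, 290), (400, 470)],
    "u2": [(0, 85), (210, 300), (410, 485)],
    "u3": [(0, 150), (450, 610), (900, 1040)],
    "u4": [(0, 160), (460, 610), (920, 1070)],
}

vectors = [kd_vector(ev) for ev in users.values()]
dwell_col = [v[0] for v in vectors]
flight_col = [v[1] for v in vectors]
r = pearson(dwell_col, flight_col)   # correlation between the two features
labels = kmeans(vectors, k=2)        # cluster label per user
```

In this toy data the two fast typists (u1, u2) and the two slow typists (u3, u4) end up in different clusters. With real data, the features would normally be standardized before clustering, and strongly correlated features would be candidates for removal or dimensionality reduction.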

The first part of the work describes the methodology for collecting user KD data and for the cluster analysis of this data. We also describe the KD characteristics used and investigate the correlations between them. The next part presents the results of a cluster analysis carried out on KD data from employees of an IT company in accordance with the described methodology. Finally, the results of the cluster analysis are discussed and conclusions are drawn.

The methodology presented in this work can be used to organize the collection of user data, construct user KD characteristic vectors, and perform cluster analysis of these vectors. The results will also be used in subsequent publications on assessing user emotional states and personality types based on keystroke dynamics.

Pages: 45-57
For citation

Karpenko A.P., Yamchenko Yu.V., Dubrovkin D.S. Cluster analysis of information system users based on keyboard dynamics characteristics. Highly Available Systems. 2021. V. 17. № 2. P. 45−57. DOI: https://doi.org/10.18127/j20729472-202102-04 (in Russian)

Date of receipt: 13.05.2021
Approved after review: 20.05.2021
Accepted for publication: 02.06.2021