I.M. Zatsman1
1 FRC «Computer Science and Control» of RAS (Moscow, Russia)
1 izatsman@yandex.ru
The paper examines how the concepts defined by GOST R 59853-2021 «AS. Terms and Definitions» can be detailed, a need that arose during the design of a lexicographic information system at the FRC CSC RAS. The system comprises a bilingual electronic dictionary (implemented as a dictionary database), a corpus of parallel texts, and a knowledge base. It differs from domestic and foreign analogues chiefly in that, for certain categories of language units, the dictionary and the corpus are linked not only by words but also by word meanings, whose descriptions can change over time. In planning the project, the terms of this GOST had to be supplemented with features that indicate the system objects and object transformations that developers need. For example, the term «Information Processing in AS» covers changes in the content of information (in the system, these are changes to the definitions of language units). Planning required a feature for the degree of socialization of content changes; this feature details the term by distinguishing the individual and collective knowledge of experts and tracing its evolution. However, detailing concepts with features alone does not yield a list of the main objects of the system.
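The linkage described above (dictionary and corpus related by meanings whose descriptions change over time) can be illustrated with a minimal sketch. It is not the data model of the system itself: the class names `DefinitionVersion`, `Meaning`, and `CorpusLink`, the helper `links_for_meaning`, and the boolean `collective` flag standing in for the degree of socialization are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DefinitionVersion:
    """One dated wording of a meaning's description (hypothetical structure)."""
    text: str
    valid_from: date
    # Assumed stand-in for the degree of socialization:
    # False = one expert's individual knowledge, True = collectively agreed.
    collective: bool


@dataclass
class Meaning:
    """A meaning of a language unit together with the history of its description."""
    meaning_id: str
    versions: List[DefinitionVersion] = field(default_factory=list)

    def revise(self, version: DefinitionVersion) -> None:
        # Keep the history chronological so lookups by date stay simple.
        if self.versions and version.valid_from < self.versions[-1].valid_from:
            raise ValueError("versions must be added in chronological order")
        self.versions.append(version)

    def definition_on(self, day: date) -> Optional[DefinitionVersion]:
        # Return the latest version already in force on the given day, if any.
        current = None
        for version in self.versions:
            if version.valid_from <= day:
                current = version
        return current


@dataclass(frozen=True)
class CorpusLink:
    """Ties a parallel-text fragment to a headword and to one of its meanings."""
    headword: str
    meaning_id: str
    source_fragment: str
    target_fragment: str


def links_for_meaning(links: List[CorpusLink], meaning_id: str) -> List[CorpusLink]:
    # Select corpus fragments by meaning rather than by the word form alone.
    return [link for link in links if link.meaning_id == meaning_id]
```

Under these assumptions, a change to a definition is recorded as a new `DefinitionVersion` rather than an overwrite, so corpus fragments linked through `meaning_id` stay attached to the meaning while its description evolves.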
The aim of the paper is to analyse GOST R 59853-2021 «AS. Terms and Definitions» from the perspective of the developers of the lexicographic information system. The main outcome is a set of defined attributes that indicate the objects of this system and their transformations. Indicating these objects reveals that the GOST's system of terms is incomplete; the paper proposes supplementing it with the term «knowledge».
To make the terminology complete enough to describe the design requirements for the lexicographic information system, the basic notions («knowledge», «information», «data») were first detailed, and a medium version of the DIKW hierarchy (data, information, knowledge, wisdom) was constructed. To create the medium version, the three basic notions were detailed into eight objects of the lexicographic information system.
The practical significance of including the term «knowledge» was demonstrated in the design of the lexicographic information system: its use made it possible to specify the objects of this system and to divide the concept of «data processing» into several types.
Zatsman I.M. Terms of the standard for creation and development of computerized system: detailing of system objects and their transformations. Highly Available Systems. 2024. V. 20. № 4. P. 35–43. DOI: https://doi.org/10.18127/j20729472-202404-04
(in Russian)