V.E. Dementiev1, N.A. Andriyanov2
1 Ulyanovsk State Technical University (Ulyanovsk, Russia)
2 Financial University under the Government of the Russian Federation (Moscow, Russia)
Problem. Digital systems for collecting and analyzing information are now important in many industries, including transport. However, there is currently no universal solution for collecting information about the operation of a taxi ordering service. The task involves both selecting significant indicators and processing them effectively. In particular, predicting the cost of a taxi order, which can be done with machine learning methods, is highly relevant.
Purpose. To select significant factors for a taxi ordering service, implement their collection in a database, and study a number of regression models with various preprocessing methods, including the use of a generative model.
Results. The article considers the problem of predicting the effective cost of a taxi order from a large number of parameters. Special attention is paid to the parameters used for the assessment and to the collection of the necessary information. The dataset was provided by one of the taxi service companies in Ulyanovsk. A data preprocessing method based on a doubly stochastic random field model and a combined forecasting method based on K-nearest neighbors, gradient boosting, and random forest are proposed. Particular attention is paid to describing the data with dimensionality reduction methods. Forecast quality is assessed on data in which experts have labeled the required cost. The proposed method reduced the mean square error by 7% compared to the known machine learning algorithms. The developed algorithm can be used to set the cost of a taxi order.
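The combined forecasting idea can be illustrated with a minimal sketch, assuming scikit-learn and a simple averaging of the three regressors. The synthetic data, the feature names (`distance_km`, `hour`, `free_drivers`), the price formula, and all hyperparameters are hypothetical and are not taken from the article. The way the article actually combines the models is not specified here, and the doubly stochastic preprocessing step is omitted.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the order data (hypothetical features and prices).
rng = np.random.default_rng(0)
n = 400
distance_km = rng.uniform(1, 20, n)
hour = rng.integers(0, 24, n)
free_drivers = rng.integers(0, 30, n)
X = np.column_stack([distance_km, hour, free_drivers])
rush_hour = np.isin(hour, [8, 9, 18, 19])
y = 60 + 18 * distance_km + 40 * rush_hour - 1.5 * free_drivers
y = y + rng.normal(0, 15, n)  # noise standing in for unobserved factors

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three base regressors named in the abstract.
# KNN is distance-based, so its inputs are standardized first.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
    "gb": GradientBoostingRegressor(random_state=0),
    "rf": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Assumption: the combination is a plain average of the base predictions.
ensemble = VotingRegressor(estimators=list(models.items()))

# Fit every model and record its mean square error on the held-out part.
mse = {}
for name, model in {**models, "ensemble": ensemble}.items():
    model.fit(X_train, y_train)
    mse[name] = mean_squared_error(y_test, model.predict(X_test))

for name, value in mse.items():
    print(f"{name}: MSE = {value:.1f}")
```

Comparing the ensemble's entry in `mse` with the entries of the individual models mirrors the kind of comparison reported above; on synthetic data the ranking carries no meaning for the real service.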
Practical significance. A regional taxi ordering service can improve the efficiency of the pricing policy through the application of trained models. However, to obtain a reliable model, a larger training sample is required.
Dementiev V.E., Andriyanov N.A. System for predicting the effective price of ordering a taxi service using machine learning algorithms. Information-measuring and Control Systems. 2022. V. 20. № 5. P. 67−73. DOI: https://doi.org/10.18127/j20700814-202205-10 (in Russian)