S.T. Tsaplin¹
¹Department ICS6 of Computer Systems and Networks,
Bauman Moscow State Technical University (Moscow, Russia)
¹tsaplin@bmstu.ru
Traditional approaches to time series forecasting, including statistical methods (ARIMA, SARIMA) and modern machine learning algorithms, have known limitations on complex data with nonlinear dependencies and multivariate conditions. Despite progress in machine learning methods and large language models, improving the accuracy of time series forecasting remains a relevant task, especially when training data are limited and heterogeneous external factors must be taken into account. The integration of visual language models (VLMs) into time series forecasting is a little-studied approach that requires a systematic analysis of its applicability and effectiveness.
Aim: to assess the applicability of visual language models (VLMs) to time series forecasting through a comparative analysis with existing methods based on five key criteria: forecast accuracy, computational speed, ease of use, interpretability of results, and the ability to effectively account for heterogeneous external factors and exogenous variables. The study compares classical statistical approaches, machine learning algorithms, and multimodal systems in terms of their performance in processing temporal data.
[...] number of data points (19-40) for effective operation. A comparative analysis of four classes of models revealed a significant advantage of VLM in forecasting accuracy: the Time-VLM model demonstrated an MSE of 4.80 versus 15.80 for the classical ARIMA model, 8.90 for LSTM, and 6.50 for Time-LLM. VLM received the highest scores for forecast accuracy (5/5 points) and for the ability to integrate diverse external factors (5/5 points). The Time-VLM architecture combines a Retrieval-Augmented Learner for processing temporal data, a Vision-Augmented Learner for transforming numerical series into informative visual representations, and a Text-Augmented Learner for generating relevant contextual descriptions; together these components enable efficient processing of multimodal data. A comparison using the weighted composite index yielded the following results: LLM (3.8/5), VLM (3.65/5), ML models (3.6/5), and statistical models (3.15/5).
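The arithmetic behind these comparisons can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the MSE values are those reported above, while the function names, the equal criterion weights, and three of the five example scores (speed, ease of use, interpretability) are hypothetical, since the abstract states neither the weights nor the full score table.

```python
# MSE values as reported in the abstract.
reported_mse = {"ARIMA": 15.80, "LSTM": 8.90, "Time-LLM": 6.50, "Time-VLM": 4.80}


def relative_mse_reduction(baseline: float, candidate: float) -> float:
    """Fraction by which the candidate MSE is lower than the baseline MSE."""
    return (baseline - candidate) / baseline


# Reduction achieved by Time-VLM relative to each of the other models.
reductions = {
    name: relative_mse_reduction(mse, reported_mse["Time-VLM"])
    for name, mse in reported_mse.items()
    if name != "Time-VLM"
}


def weighted_index(scores: dict, weights: dict) -> float:
    """Weighted mean of criterion scores; weights are normalised to sum to 1."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same criteria")
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in scores) / total_weight


# HYPOTHETICAL inputs: equal weights, and only the accuracy and
# external-factor scores (5/5) are taken from the abstract.
example_weights = {
    "accuracy": 1.0,
    "speed": 1.0,
    "ease_of_use": 1.0,
    "interpretability": 1.0,
    "external_factors": 1.0,
}
example_scores = {
    "accuracy": 5,          # from the abstract
    "speed": 2,             # hypothetical
    "ease_of_use": 2,       # hypothetical
    "interpretability": 3,  # hypothetical
    "external_factors": 5,  # from the abstract
}
example_index = weighted_index(example_scores, example_weights)

if __name__ == "__main__":
    for name, value in reductions.items():
        print(f"Time-VLM vs {name}: {value:.1%} lower MSE")
    print(f"Illustrative composite index: {example_index:.2f}/5")
```

With the reported values, the Time-VLM MSE is roughly 70% lower than that of ARIMA, 46% lower than that of LSTM, and 26% lower than that of Time-LLM. The illustrative composite index is not expected to reproduce the 3.65/5 reported for VLM, because the actual weights and scores are not given here.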
These results demonstrate the feasibility of applying VLM to time series forecasting and support an informed choice of forecasting method based on the specifics of a given problem, available computing resources, and forecast accuracy requirements. The study confirms the effectiveness of a multimodal approach combining visual, textual, and temporal information, especially when training data are limited (few-shot and zero-shot learning). The practical value of this work lies in its systematization of the advantages and limitations of various forecasting approaches, which helps specialists select suitable methods for specific applications in economics, finance, industry, and other fields. The scientific novelty lies in the first systematic analysis of the applicability of VLM to time series forecasting. The main limitations of VLM are the high complexity of practical application and significant computational requirements.
Tsaplin S.T. Analysis of the possibilities of applying VLM to time series forecasting. Information-measuring and Control Systems. 2026. V. 24. № 1. P. 34−42. DOI: https://doi.org/10.18127/j20700814-202601-04 (in Russian)
- Zhong S., Ruan W., Jin M., Sengupta T., Kang Y., Jansen S., Nuenen van T. Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting. 2025. URL: https://arxiv.org/abs/2502.04395.
- Vision-Language Models for Vision Tasks: A Survey. 2024. URL: https://arxiv.org/pdf/2304.00685.pdf.
- Hyndman R.J., Athanasopoulos G. Forecasting: Principles and Practice. 3rd ed. Melbourne, Australia: OTexts. 2021. 382 p.
- Filaretov G.F., Larin A.A. Issledovanie i razrabotka EWMA-algoritma obnaruzheniya razladki gaussovskogo vremennogo ryada po matematicheskomu ozhidaniyu. Datchiki i sistemy. 2022. № 6. S. 9−14. DOI: 10.25728/datsys.2022.6.2. (in Russian)
- Box G.E.P., Jenkins G.M., Reinsel G.C., Ljung G.M. Time Series Analysis: Forecasting and Control. 5th ed. Hoboken: John Wiley & Sons. 2016. 712 p.
- Tregub A.V., Tregub I.V. Metodika postroeniya modeli ARIMA dlya prognozirovaniya dinamiki vremennykh ryadov. Lesnoi vestnik. 2011. № 5. S. 179−183. (in Russian)
- Kalugin T.R., Kim A.K., Petrusevich D.A. Analiz modelei ADL(p, q), ispolzuemykh dlya opisaniya svyazei mezhdu vremennymi ryadami. Rossiiskii tekhnologicheskii zhurnal. 2020. T. 8. № 2. S. 7−22. DOI: 10.32362/2500-316X-2020-8-2-7-22. (in Russian)
- Chen T., Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, USA. 2016. P. 785−794.
- Goodfellow I., Bengio Y., Courville A. Deep Learning. Cambridge: MIT Press. 2016. 800 p.
- Lim B., Zohren S. Deep Learning for Time Series Forecasting: Tutorial and Literature Survey. 2022. URL: https://arxiv.org/pdf/2004.10240.pdf.
- Freitas C.M., Prates R.C., Ochi L.S. Prompt-Driven Time Series Forecasting with Large Language Models. Proceedings of 17th International Conference on Agents and Artificial Intelligence (ICAART 2025). Funchal, Portugal. 2025. P. 1−8.
- Jin M., Wang S., Ma L., Chu Z., Zhang J.Y., Shi X., Chen P.Y., Liang Y., Li Y.F., Pan S., Wen Q. Time-LLM: Time Series Forecasting by Reprogramming Large Language Models. 2023. URL: https://arxiv.org/abs/2310.01728.
- Gruver N., Finzi M., Zohren S., Roberts S.J. Large Language Models Are Zero-Shot Time Series Forecasters. 2023.
- Goyal A., Agarwal S., Gadekallu V.S. From Pixels to Predictions: Spectrogram and Vision Transformer for Better Time Series Forecasting. 2024. URL: https://arxiv.org/abs/2403.11047.
- Wang Y., Cabrera A., Wilson A.G., Choi Y. XForecast: Evaluating Natural Language Explanations for Time Series Forecasting. 2024.
- Tan M., Merrill M.A., Gupta V., Hartvigsen T., Alsentzer E. Are Language Models Actually Useful for Time Series Forecasting? Advances in Neural Information Processing Systems 37 (NeurIPS 2024). Vancouver, Canada. 2024.
- Nielsen A. Practical Time Series Analysis: Prediction with Statistics and Machine Learning. Sebastopol: O'Reilly Media. 2019. 368 p.
- Liu H., Xu S., Zhao Z., Zheng W.L., Zhang Y., Zhao Y., Yu X., Yu C., Chen H., Bian J., Liu T.Y., Qin T., Fan W., Xie X. Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis. Advances in Neural Information Processing Systems 37 (NeurIPS 2024). Vancouver, Canada. 2024.
- Zhou H., Zhang S., Peng J., Zhang S., Li J., Xiong H., Zhang W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021). 2021. V. 35. № 12. P. 11106−11115.

