Application of large language models for machine learning interpretability in dynamic pricing

Journal: Nonlinear World, No. 2, 2026

Type of article: scientific article

DOI: https://doi.org/10.18127/j20700970-202602-08

UDC: 004.8

Keywords: machine learning, explainable artificial intelligence, large language models, SHAP, dynamic pricing, mathematical modeling, CatBoost, feature engineering

Authors:

V.D. Paskanov1

1 Financial University under the Government of the Russian Federation (Moscow, Russia)
1 vyacheslav.paskanov@yandex.ru

Abstract:

In the modern retail landscape, dynamic pricing is increasingly driven by advanced machine learning algorithms. Ensemble methods, particularly Gradient Boosting on Decision Trees (GBDT) such as CatBoost, provide state-of-the-art predictive accuracy for demand forecasting and revenue optimization [1, 2]. However, the inherent complexity and non-linear nature of these models create a "black box" phenomenon. This opacity significantly limits the adoption of algorithmic pricing by business stakeholders, as analysts struggle to audit or explain the rationale behind specific pricing recommendations [3, 4]. While classical Explainable AI (XAI) frameworks like SHAP (SHapley Additive exPlanations) and LIME offer mathematically rigorous feature attributions, their visual outputs (e.g., waterfall plots, summary graphs) demand substantial statistical literacy, thus failing to bridge the communication gap between data scientists and business managers [5, 6].

Objective: this research aims to design, formalize, and evaluate a software architecture that integrates a predictive CatBoost pipeline with SHAP-based attributions and large language models (LLMs). The goal is to automatically translate high-dimensional SHAP attributions into coherent, human-readable analytical reports while targeting 100% explanatory fidelity and eliminating the risk of LLM hallucinations [7, 8].

The study introduces a rigorous data pipeline applied to the Online Retail dataset. A unique target variable formulation is proposed, where the predictive target is defined as the logarithm of aggregated future revenue over subsequent temporal periods. The period price is selected based on the mode of revenue maximization. The predictive core utilizes CatBoost, optimized via the Optuna framework [2]. The interpretability layer strictly separates mathematical computation from linguistic generation: SHAP values are calculated deterministically and formatted into highly structured tuple prompts [1, 6]. The LLM is strictly constrained by a meta-prompt to act solely as a verbalizer for these numeric arrays. The system supports three operational modes: local (single transaction), global (dataset-wide), and an innovative batch mode for segment-level analysis.
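As a rough illustration of two steps in this pipeline, the sketch below builds a log-of-future-revenue target and serializes SHAP attributions into a tuple-style prompt. It is a minimal sketch under stated assumptions, not the article's implementation: the function names, the `horizon` and `top_k` parameters, the prompt wording, and all numbers are hypothetical, and the SHAP values are supplied by hand rather than computed from a trained CatBoost model.

```python
import math


def log_future_revenue(revenue_by_period, horizon):
    """Target for period t: log of revenue summed over periods t+1 .. t+horizon.

    `horizon` is an assumed parameter. Periods without a full future
    window (or with non-positive future revenue) get None.
    """
    targets = []
    for t in range(len(revenue_by_period)):
        window = revenue_by_period[t + 1 : t + 1 + horizon]
        if len(window) < horizon or sum(window) <= 0:
            targets.append(None)
        else:
            targets.append(math.log(sum(window)))
    return targets


def build_tuple_prompt(base_value, prediction, attributions, top_k=3):
    """Serialize (feature, value, shap) tuples, largest |shap| first.

    The instruction line is a hypothetical stand-in for the meta-prompt
    that restricts the LLM to verbalizing the supplied numbers.
    """
    ranked = sorted(attributions, key=lambda a: abs(a[2]), reverse=True)[:top_k]
    lines = [
        "Verbalize ONLY the numbers listed below; do not add other factors.",
        f"base_value={base_value:.3f}",
        f"prediction={prediction:.3f}",
    ]
    for feature, value, shap_value in ranked:
        lines.append(f"({feature!r}, {value!r}, {shap_value:+.3f})")
    return "\n".join(lines)


# Hypothetical inputs: four periods of revenue and three hand-written attributions.
targets = log_future_revenue([100.0, 120.0, 80.0, 150.0], horizon=2)
prompt = build_tuple_prompt(
    5.10,
    5.42,
    [("unit_price", 2.55, -0.12), ("month", 11, 0.31), ("country", "UK", 0.13)],
)
print(targets)
print(prompt)
```

In this toy input the attributions sum to 0.32, so the base value of 5.10 plus the attributions equals the prediction of 5.42, which mirrors the additive property of SHAP. Keeping the numeric step separate from the text step means the prompt can be logged and audited independently of whatever the LLM later writes.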

Empirical evaluations demonstrate exceptional performance across all system components. The CatBoost regression model achieved an R² of 0.89 and a Mean Absolute Percentage Error (MAPE) of 11.2%, significantly outperforming baseline econometric models [2, 9]. The LLM-driven interpretation layer successfully generated precise, context-aware business narratives in under 1.5 seconds per local request [7]. Furthermore, the batch processing module exhibited remarkable scalability, synthesizing SHAP distributions for segments of up to 10,000 SKUs in just 45 seconds. The generated text proved to be highly accurate, aligning perfectly with the underlying SHAP values without introducing any extraneous or fabricated insights [8, 10].
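The abstract does not say how agreement between the generated text and the SHAP values was measured. One simple way such a check could be automated is sketched below, purely as an assumption: every decimal number cited in the narrative is compared against the supplied SHAP magnitudes. The function name `unsupported_numbers`, the rounding precision, and the example sentences are hypothetical.

```python
import re


def unsupported_numbers(text, shap_values, decimals=2):
    """Return decimal numbers cited in `text` that match no supplied SHAP magnitude.

    Signs are ignored because narratives usually express direction in
    words ("raised", "lowered") rather than with a minus sign.
    """
    allowed = {round(abs(v), decimals) for v in shap_values}
    cited = [float(m) for m in re.findall(r"\d+\.\d+", text)]
    return [n for n in cited if round(n, decimals) not in allowed]


# Hypothetical SHAP values and two hypothetical LLM outputs.
shap_values = [0.31, 0.13, -0.12]
faithful = "Month raised the forecast by 0.31, while unit price lowered it by 0.12."
fabricated = "A holiday promotion added 0.50 to the forecast."

print(unsupported_numbers(faithful, shap_values))    # []
print(unsupported_numbers(fabricated, shap_values))  # [0.5]
```

A check like this catches only invented numbers. It does not detect an invented feature name attached to a correct number, so a fuller audit would also need to match feature names against the prompt.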

The proposed architecture establishes a new paradigm for trustworthy AI in e-commerce. By automating the extraction and verbalization of pricing drivers, the system drastically reduces the cognitive load on analysts [7]. It transforms raw predictive mathematics into actionable, transparent business intelligence, making complex machine learning models accessible and accountable to non-technical decision-makers [4, 6].

Pages: 69-79

For citation

Paskanov V.D. Application of large language models for machine learning interpretability in dynamic pricing. Nonlinear World. 2026. V. 24. № 2. P. 69–79. DOI: https://doi.org/10.18127/j20700970-202602-08 (In Russian)

References

1. Lundberg S.M., Lee S.I. A unified approach to interpreting model predictions. Advances in neural information processing systems. 2017. V. 30. P. 4765–4774.
2. Prokhorenkova L., Gusev G., Vorobev A., Dorogush A.V., Gulin A. CatBoost: unbiased boosting with categorical features. Advances in neural information processing systems. 2018. V. 31. P. 6638–6648.
3. Zhao H., Chen H., Yang F., Liu N., Deng H., Cai H., Wang S., Yin D., Du M. Explainability for Large Language Models: A Survey. ACM Transactions on Intelligent Systems and Technology. 2024. V. 15. № 2. P. 1–38.
4. Krause S., Stolzenburg F. From data to commonsense reasoning: The use of large language models for explainable AI. arXiv preprint arXiv:2407.03778. 2024.
5. Ribeiro M.T., Singh S., Guestrin C. "Why should I trust you?" Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 2016. P. 1135–1144.
6. Kim J., Lee H., Park S. Design and Evaluation Methods for LLM-Based Explainable AI (XAI). International Journal of Applied Machine Learning. 2025. V. 5. № 1. P. 12–29.
7. Singh A., Kumar P., Patel R. et al. LLMs for Explainable AI: A Comprehensive Survey. arXiv preprint arXiv:2504.00125. 2024.
8. Safonov A. Neural network approach to demand estimation and dynamic pricing in retail. arXiv preprint arXiv:2412.00920. 2024.
9. Xiong Z., Wang Y., Chen L. et al. A Hybrid Deep Learning Framework for Dynamic Pricing: Integrating XGBoost and LSTM. Science Excel Transactions. 2025. V. 12. P. 104–118.
10. Liu J., Smith R. Real-Time Dynamic Pricing Using Machine Learning: Integrating Customer Sentiment and Predictive Models. The Science and Information Organization. 2024. V. 16. № 9. P. 88–102.
11. Slack D., Hilgard S., Jia E., Singh S., Lakkaraju H. Reliable post hoc explanations: Modeling uncertainty in explainability. Advances in neural information processing systems. 2021. V. 34. P. 9391–9404.
12. Gianfagna L., Zimuel E. XAI meets LLMs: A survey of the relation between explainable AI and large language models. arXiv preprint arXiv:2407.15248. 2024.

Date of receipt: 11.03.2026

Approved after review: 25.03.2026

Accepted for publication: 03.04.2026