Research output: Working paper
TY - UNPB
T1 - Demand forecasting with high dimensional data
T2 - the case of SKU retail sales forecasting with intra- and inter-category promotional information
AU - Ma, Shaohui
AU - Fildes, Robert
AU - Huang, Tao
PY - 2014
Y1 - 2014
N2 - In marketing analytics applications in OR, the modeler often faces the problem of selecting key variables from a large number of possibilities. For example, SKU-level retail store sales are affected by inter- and intra-category effects, which potentially need to be considered when deciding on promotional strategy and producing operational forecasts, but no research has put this well-accepted concept into forecasting practice: an obvious obstacle is the ultra-high dimensionality of the variable space. This paper develops a four-step methodological framework to overcome the problem. It is illustrated by investigating the value of both intra- and inter-category SKU-level promotional information in improving forecast accuracy. The method consists of the identification of potentially influential categories, the building of the explanatory variable space, variable selection and model estimation by a multistage LASSO regression, and the use of a rolling scheme to generate forecasts. The success of this new method for dealing with high dimensionality is demonstrated by improvements in forecasting accuracy compared to alternative methods of simplifying the variable space. The empirical results show that models integrating more information perform significantly better than the baseline model when using the proposed methodological framework. In general, we can improve forecasting accuracy by 14.3 percent over the model using only the SKU’s own predictors. Of the improvement achieved, 88.1 percent comes from the intra-category information and only 11.9 percent from the inter-category information. The substantive marketing results also have implications for promotional category management.
AB - In marketing analytics applications in OR, the modeler often faces the problem of selecting key variables from a large number of possibilities. For example, SKU-level retail store sales are affected by inter- and intra-category effects, which potentially need to be considered when deciding on promotional strategy and producing operational forecasts, but no research has put this well-accepted concept into forecasting practice: an obvious obstacle is the ultra-high dimensionality of the variable space. This paper develops a four-step methodological framework to overcome the problem. It is illustrated by investigating the value of both intra- and inter-category SKU-level promotional information in improving forecast accuracy. The method consists of the identification of potentially influential categories, the building of the explanatory variable space, variable selection and model estimation by a multistage LASSO regression, and the use of a rolling scheme to generate forecasts. The success of this new method for dealing with high dimensionality is demonstrated by improvements in forecasting accuracy compared to alternative methods of simplifying the variable space. The empirical results show that models integrating more information perform significantly better than the baseline model when using the proposed methodological framework. In general, we can improve forecasting accuracy by 14.3 percent over the model using only the SKU’s own predictors. Of the improvement achieved, 88.1 percent comes from the intra-category information and only 11.9 percent from the inter-category information. The substantive marketing results also have implications for promotional category management.
KW - Analytics
KW - OR in Marketing
KW - Forecasting
KW - Retailing
KW - Promotions
M3 - Working paper
T3 - Department of Management Science Working Papers
BT - Demand forecasting with high dimensional data
PB - Department of Management Science, Lancaster University
CY - Lancaster
ER -