The effect of data preprocessing on a retail price optimization system

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Journal publication date: 04/2016
Journal: Decision Support Systems
Number of pages: 12
Pages (from-to): 16-27
Publication status: Published
Early online date: 2/02/16
Original language: English


Revenue management (RM) is making a significant impact on pricing research and practice, from the aviation and hospitality industries to retailing. However, empirical data conditions in retail differ from those in other industries, in particular in the large number of products within and across categories. To set adequate static prices with established RM models, the data are often simplified by data pruning (the exclusion of subsets of data that are deemed irrelevant or unsuitable) and data aggregation (the combination of disparate data points). Although such data preprocessing is ubiquitous in retailing, its impact is insufficiently considered in current RM research. Preprocessing could bias the demand model estimates and, in turn, affect the price optimization system, the optimized price set, and the profit maxima; these effects have not yet been investigated. This paper empirically studies the impact of two commonly used data preprocessing techniques in retail RM, data pruning and data aggregation, using simulated and empirical retail scanner data. Taking a systems perspective, we numerically assess the biases that data preprocessing introduces into the estimates of a two-stage demand model, the resulting price elasticities, the optimized price sets, and the profits they yield. Results show that data aggregation and data pruning both bias demand model estimates, to different extents, and produce less profitable price sets than unbiased reference solutions. The results indicate the importance of data preprocessing as a cause of estimation bias and suboptimal pricing in retail price optimization systems.
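As a rough illustration of how aggregation alone can distort an elasticity estimate, the sketch below fits a log-log demand curve to noise-free toy data, once at store level and once after aggregating across stores. It is not the paper's two-stage demand model: the constant-elasticity demand function, the two hypothetical stores, the price values, and the helper `loglog_slope` are all assumptions made for this example.

```python
import math


def loglog_slope(prices, quantities):
    """OLS slope of ln(quantity) on ln(price): a constant-elasticity estimate."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Assumed toy setting: both stores share the same demand curve q = 100 * p^(-2).
TRUE_ELASTICITY = -2.0
SCALE = 100.0


def demand(price):
    return SCALE * price ** TRUE_ELASTICITY


# Hypothetical weekly prices; the stores do not always charge the same price.
store_a_prices = [1.0, 2.0, 1.0, 2.0]
store_b_prices = [1.0, 1.0, 2.0, 2.0]
store_a_sales = [demand(p) for p in store_a_prices]
store_b_sales = [demand(p) for p in store_b_prices]

# Store-level (disaggregated) fit: every observation lies on the true curve.
disaggregated = loglog_slope(
    store_a_prices + store_b_prices,
    store_a_sales + store_b_sales,
)

# Aggregated fit: weekly sales summed across stores, prices averaged.
agg_prices = [(a + b) / 2 for a, b in zip(store_a_prices, store_b_prices)]
agg_sales = [a + b for a, b in zip(store_a_sales, store_b_sales)]
aggregated = loglog_slope(agg_prices, agg_sales)

print(f"true elasticity:        {TRUE_ELASTICITY:.3f}")
print(f"store-level estimate:   {disaggregated:.3f}")
print(f"aggregated estimate:    {aggregated:.3f}")
```

In this toy setting the store-level fit recovers the assumed elasticity of -2, while the aggregated fit comes out less elastic, because summed sales paired with averaged prices no longer lie on a single constant-elasticity curve. The size and direction of such a distortion in real scanner data depend on the data and the model, which is the question the paper studies.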