Five steps to optimize your sales forecasts using AI
Unsold products are one of the key threats to the profit margin of a retailer. In a world where consumer desires are ever changing and the length of product life cycles is continually on the decline, predicting which products will sell is becoming ever more important.
Virtually every retailer since the dawn of time has faced the same conundrum: how can I know in advance which products my clients will buy?
#1 Be familiar with the data you are trying to predict
A good forecaster is like a craftsman who selects the right tool for the right job. The driver behind that selection is the economic reality the company faces. Much of this reality can be captured by looking at three aspects of the value the company is trying to predict: the level of aggregation, the nature of the product and the type of data available to the company.
#2 Think about what you need from your forecasts
Different companies have different forecasting requirements. For some, it may be more important to get an accurate view of the trends influencing sales across the whole company. Others may be more interested in forecasts for specific product categories. Still others may want quantitative input to make careful bets on potential outliers that can generate a substantial fraction of company revenue. Where your company's interests lie has major implications for model and variable selection.
For instance, when making predictions at the product level it is crucial to select a relatively small number of variables to prevent overfitting, i.e. ending up with a model that has simply learned the historic data by heart. With too many variables the algorithm is unable to generalize and can only make accurate predictions for scenarios identical to those it has already observed, a skill which is utterly useless for forecasting purposes. By contrast, predictions at a more aggregated level can often benefit from including a larger number of variables.
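The effect can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sales records are invented, and the "model" is just a table of group averages rather than any particular forecasting algorithm. One version groups on every available variable, the other on brand alone.

```python
from statistics import mean

# Hypothetical sales records: (product attributes, units sold).
train = [
    ({"brand": "A", "color": "red", "price_band": "low", "week": 1}, 100),
    ({"brand": "A", "color": "blue", "price_band": "low", "week": 2}, 110),
    ({"brand": "B", "color": "red", "price_band": "high", "week": 3}, 40),
    ({"brand": "B", "color": "blue", "price_band": "high", "week": 4}, 50),
]
holdout = [
    ({"brand": "A", "color": "red", "price_band": "low", "week": 5}, 104),
    ({"brand": "B", "color": "blue", "price_band": "high", "week": 6}, 46),
]


def fit_group_means(rows, variables):
    """Predict the mean units of the training rows sharing the chosen variables."""
    groups = {}
    for features, units in rows:
        key = tuple(features[v] for v in variables)
        groups.setdefault(key, []).append(units)
    table = {key: mean(values) for key, values in groups.items()}
    fallback = mean(units for _, units in rows)  # used for unseen combinations

    def predict(features):
        return table.get(tuple(features[v] for v in variables), fallback)

    return predict


def mean_abs_error(predict, rows):
    """Average absolute difference between predicted and actual units."""
    return mean(abs(predict(features) - units) for features, units in rows)


# Many variables: every training row becomes its own group (learned by heart).
many = fit_group_means(train, ["brand", "color", "price_band", "week"])
# Few variables: rows are pooled per brand, so new weeks can still be matched.
few = fit_group_means(train, ["brand"])

for name, model in [("many variables", many), ("few variables", few)]:
    print(name,
          "| train MAE:", mean_abs_error(model, train),
          "| holdout MAE:", mean_abs_error(model, holdout))
```

On this toy data the many-variable version reproduces the training rows perfectly but has nothing to say about weeks it has never seen, while the brand-only version is slightly wrong on the training data and far closer on the held-out rows.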
#3 Look at the relevance of historic information
The nature of the product is also of paramount importance. Some products, such as specific car models, are sold for years on end with only minor facelifts. For such products, historical sales are a major factor in predicting future sales. Other products, such as clothing or toys, change far more from one season to the next. As a result, historical sales for a specific product are often not available, or by the time they are, it is already too late to make significant changes to the order quantity.
For these fluctuating products it is important to look at other information sources rather than the product's own sales history. The first course of action is to look for defining properties of products that carry information about likely sales, such as brand, color, price, and position in the product hierarchy. Using data mining and web scraping techniques to gather more information on products from external sources can also be extremely valuable. Tracking prices at other retailers or checking the popularity of products on social media can provide important insights that improve product forecasts.
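As a rough sketch of how product properties can stand in for missing sales history, the example below estimates demand for a new product from the past products that share the most attributes with it. The catalog, the attribute names and the choice of a simple "count the matching attributes" similarity are all assumptions made for illustration, not a recommended method.

```python
from statistics import mean

# Hypothetical attributes used to compare products.
ATTRIBUTES = ["brand", "color", "price_band", "category"]

# Hypothetical first-season sales of products that are no longer new.
past_products = [
    {"brand": "Acme", "color": "red", "price_band": "mid", "category": "jacket", "units": 120},
    {"brand": "Acme", "color": "blue", "price_band": "mid", "category": "jacket", "units": 100},
    {"brand": "Zeta", "color": "red", "price_band": "high", "category": "jacket", "units": 40},
    {"brand": "Zeta", "color": "green", "price_band": "low", "category": "toy", "units": 300},
]


def similarity(a, b):
    """Number of attributes on which two products agree."""
    return sum(a[attr] == b[attr] for attr in ATTRIBUTES)


def estimate_units(new_product, history, k=2):
    """Average the sales of the k past products most similar to the new one."""
    ranked = sorted(history, key=lambda p: similarity(new_product, p), reverse=True)
    return mean(p["units"] for p in ranked[:k])


new_product = {"brand": "Acme", "color": "green", "price_band": "mid", "category": "jacket"}
print("Estimated first-season units:", estimate_units(new_product, past_products))
```

Externally gathered signals, such as competitor prices or social media popularity, could be added as further attributes in the same way, provided they are available before the order has to be placed.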
#4 Actively question the quality of your data
A key aspect that must never be overlooked is the data itself. Inspecting its quality, and working out how it can be transformed into a format suitable for modeling, is essential to making good predictions. It is one thing to select a model that should work well in theory. Getting the model to work on real data is an entirely different animal.
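A first inspection can be as simple as counting obvious problems before any model is fitted. The sketch below assumes a hypothetical record layout with `date`, `sku` and `units` fields and checks for three common issues: missing values, negative sales, and repeated date/product combinations. Real data will call for checks specific to the company.

```python
from collections import Counter

# Hypothetical daily sales records; the field names are assumptions.
rows = [
    {"date": "2024-01-01", "sku": "A1", "units": 10},
    {"date": "2024-01-02", "sku": "A1", "units": None},  # missing value
    {"date": "2024-01-03", "sku": "A1", "units": -4},    # negative sales
    {"date": "2024-01-03", "sku": "A1", "units": -4},    # repeated record
    {"date": "2024-01-04", "sku": "A1", "units": 12},
]


def quality_report(records):
    """Count basic data quality problems in a list of sales records."""
    missing = sum(1 for r in records if r["units"] is None)
    negative = sum(1 for r in records if r["units"] is not None and r["units"] < 0)
    key_counts = Counter((r["date"], r["sku"]) for r in records)
    # Every occurrence beyond the first for a given (date, sku) is a duplicate.
    duplicates = sum(count - 1 for count in key_counts.values())
    return {
        "rows": len(records),
        "missing_units": missing,
        "negative_units": negative,
        "duplicate_keys": duplicates,
    }


report = quality_report(rows)
for check, count in report.items():
    print(f"{check}: {count}")
```

Whether a negative value is an error or a legitimate return, and whether a duplicate should be dropped or summed, are business questions that the counts alone cannot answer.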
#5 Use the right tool for the job
Once these aspects have been carefully weighed, the right model has to be selected. Conventional wisdom dictates that it is much better to start with a simple model, such as traditional linear or logarithmic regression. These basic techniques then serve as a benchmark against which to compare more advanced techniques based on recent advances in machine learning, such as deep neural networks or random forests. Never use a complex model without comparing it to a simpler approach.
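One way to make that comparison routine is to score every model on the same held-out periods. The sketch below is illustrative only: the sales figures are invented, the simple benchmark is a least-squares linear trend, and the "candidate" is a deliberately trivial stand-in (a flat average of the last three periods), since a real neural network or random forest would require a third-party library. Any forecaster exposing the same interface could be dropped in instead.

```python
from statistics import mean

# Hypothetical monthly unit sales; the last two months are held out.
history = [100, 104, 109, 115, 118, 124, 130, 133]
train, holdout = history[:6], history[6:]


def fit_linear_trend(values):
    """Least-squares line through the values; returns a step-ahead forecaster."""
    n = len(values)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    # step=1 is the first period after the training data.
    return lambda step: intercept + slope * (n - 1 + step)


def fit_recent_average(values, window=3):
    """Stand-in candidate: forecasts the mean of the last `window` values."""
    level = mean(values[-window:])
    return lambda step: level


def holdout_mae(forecaster, actuals):
    """Mean absolute error of the forecaster over the held-out periods."""
    return mean(abs(forecaster(step) - actual)
                for step, actual in enumerate(actuals, start=1))


baseline_mae = holdout_mae(fit_linear_trend(train), holdout)
candidate_mae = holdout_mae(fit_recent_average(train), holdout)

print("linear trend MAE:", round(baseline_mae, 2))
print("candidate MAE:   ", round(candidate_mae, 2))
print("candidate beats the simple benchmark:", candidate_mae < baseline_mae)
```

The point of the harness is the last line: a more elaborate model only earns its place if it beats the simple benchmark on data it was not trained on.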