In the manufacturing industry, the demand for custom-made products for special applications is becoming increasingly important. To make a purchase decision, potential customers need information such as the weight, price, or delivery time of the end product. For custom products, however, this information is often not known in advance or requires time-consuming calculations by the manufacturer’s employees. AI methods can remedy this by predicting this information at an early stage. This makes it easier to acquire customers and gives both sides a much better basis for planning. In this article, we examine what such a prediction can look like, using price forecasting as an example. The example is based on a task that we solved in cooperation with Wilo SE, one of the world’s leading manufacturers of pumps and pump systems, as part of our strategic partnership.
Initial situation
Wilo SE offers its customers solutions for a wide range of applications involving the movement of water. This results in a very broad product range, from small heating pumps produced in quantities of millions to custom-made products in the project business. In the latter, customers can specify their products individually on the basis of a large number of product features, e.g., which motor is installed or what requirements are placed on the materials. However, this freedom in configuration makes it difficult to calculate certain quantities such as production costs, which are crucial for determining the sales price. The calculations require manual steps, which ties up valuable employee capacity and delays communication with customers.
To address this problem of time-consuming forecasting, we have developed a machine learning method. The method uses historical order data to predict the price from the features selected by the customers. For this purpose, and as part of the strategic partnership, an ensembling method (more precisely, a stacking method) was developed by employees of the Rhine-Ruhr Machine Learning Competence Center (ML2R), from which the Lamarr Institute for Machine Learning and Artificial Intelligence has since emerged. We present this method below.
Stacking
In simplified terms, stacking trains base learners on the existing training data and uses their predictions to extend the training dataset with new features for a meta learner, which outputs the final result. The base learners and the meta learner must therefore be trained sequentially.
As the base learner, we used a simple neural network that consists of only a few fully connected linear layers and performs a classification task. Using the output of the base learner together with the existing features, we trained a meta learner that performs regression. We evaluated several models for this purpose and chose a Random Forest.
A Random Forest draws random subsets of the training data (typically bootstrap samples, i.e., drawn with replacement) and trains a simple decision tree on each of them. Each tree generates its own prediction, and these predictions are then averaged to obtain the result.
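The sampling-and-averaging idea can be illustrated with a minimal sketch in plain Python. It is not the model used in the project: the trees are reduced to one-split "stumps" on a single feature, the random feature selection of a full Random Forest is omitted, and the data (`motor_kw`, `price`) is invented for illustration.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single numeric feature."""
    best = None
    # Every distinct value except the largest is a candidate split point,
    # so both sides of the split are guaranteed to be non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # The sample contains a single distinct x: predict the overall mean.
        overall = mean(ys)
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return mean(predict_stump(stump, x) for stump in forest)


# Invented toy data: price jumps once the motor exceeds 4 kW.
motor_kw = [1, 2, 3, 4, 5, 6, 7, 8]
price = [100, 110, 120, 130, 400, 410, 420, 430]

forest = fit_forest(motor_kw, price)
low = predict_forest(forest, 2)
high = predict_forest(forest, 7)
print(f"predicted price at 2 kW: {low:.0f}, at 7 kW: {high:.0f}")
```

Because every tree sees a slightly different sample, the individual predictions differ, and averaging them smooths out the errors of any single tree.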
Base learner
As the base learner, we trained a neural network, more precisely a multilayer perceptron, for a few epochs on a related learning task, such as classifying whether a pump is cheap or expensive. As its output, we used the activations of the penultimate layer, which provide the features needed for training the meta learner.
The features obtained in this way encode information about the target to be predicted, in this case the price. The base learner can also be optimized with respect to computation time or memory requirements, e.g., by using a smaller model.
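The following sketch shows how such a pipeline could be wired together with scikit-learn. It is an illustration under assumptions, not the project code: the data is synthetic, the layer sizes, the median price threshold and the helper names `penultimate_features` and `stack` are invented, and the meta learner is a Random Forest as described in this article.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for historical orders (assumption): five numeric
# product features and a price that depends on them non-linearly.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
price = (1000 + 300 * X[:, 0] + 200 * X[:, 1] * X[:, 2]
         + rng.normal(scale=20, size=600))

X_train, X_test, y_train, y_test = train_test_split(X, price, random_state=0)

# Auxiliary task for the base learner: cheap vs. expensive, split at the
# median training price (the threshold is an assumption).
is_expensive = (y_train > np.median(y_train)).astype(int)
base = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=30, random_state=0)
base.fit(X_train, is_expensive)  # few epochs, so a convergence warning is expected


def penultimate_features(mlp, X):
    """Activations of the last hidden layer (ReLU is MLPClassifier's default)."""
    activations = X
    for weights, bias in zip(mlp.coefs_[:-1], mlp.intercepts_[:-1]):
        activations = np.maximum(activations @ weights + bias, 0.0)
    return activations


def stack(X):
    """Original features plus the base learner's learned features."""
    return np.hstack([X, penultimate_features(base, X)])


# Meta learner: regression on the extended feature set.
meta = RandomForestRegressor(n_estimators=100, random_state=0)
meta.fit(stack(X_train), y_train)

mae = mean_absolute_error(y_test, meta.predict(stack(X_test)))
print(f"stacked feature count: {stack(X_test).shape[1]}, test MAE: {mae:.1f}")
```

The base learner has to be fitted before `stack` can be called, which reflects the sequential training order of stacking. In practice, the base learner's features for the training set are often produced out-of-fold so that the meta learner does not rely on overfitted features; the sketch skips this step for brevity.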
Meta learner
With the features from the base learner, we evaluated different meta learners. We optimized the MSE (mean squared error) and the MAE (mean absolute error) between the prediction and the actual price. A Random Forest performed best and also yields good results in terms of resource requirements.
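For reference, both error measures can be computed in a few lines. The prices and predictions below are invented purely to show the arithmetic.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences; large errors weigh more heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences, in the same unit as the price."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual_prices = [100, 200, 300, 400]
predicted_prices = [110, 190, 320, 400]

mse = mean_squared_error(actual_prices, predicted_prices)   # (100 + 100 + 400 + 0) / 4
mae = mean_absolute_error(actual_prices, predicted_prices)  # (10 + 10 + 20 + 0) / 4
print(mse, mae)  # 150.0 10.0
```

The MAE reads directly as "the prediction is off by 10 currency units on average", while the MSE reacts more strongly to the single larger error of 20, which is why it is useful to look at both.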
Trust
An important challenge when AI methods are introduced in a company is establishing trust in the system. One way to address this is to report the uncertainty of a prediction, which is possible for many types of models, so that individual cases can be checked manually downstream to avoid serious errors. We attach prediction intervals to the predicted prices in order to increase users’ acceptance of the system and to avoid errors as far as possible.
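This article does not specify how the intervals are computed. One simple option for a tree ensemble, shown here as a sketch under that assumption, is to take empirical quantiles over the individual tree predictions and to flag wide intervals for manual review. The function names, the 90% coverage and the 20% width threshold are all hypothetical, and the spread between trees is only a rough proxy for uncertainty, not a calibrated interval.

```python
from statistics import mean


def percentile(sorted_values, q):
    """Linearly interpolated percentile for q in [0, 1]."""
    position = q * (len(sorted_values) - 1)
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(sorted_values) - 1)
    fraction = position - lower_index
    return (sorted_values[lower_index] * (1 - fraction)
            + sorted_values[upper_index] * fraction)


def price_with_interval(tree_predictions, coverage=0.9, max_relative_width=0.2):
    """Point estimate, interval bounds and a flag for manual review."""
    values = sorted(tree_predictions)
    alpha = (1 - coverage) / 2
    lower = percentile(values, alpha)
    upper = percentile(values, 1 - alpha)
    point = mean(values)
    return {
        "price": point,
        "lower": lower,
        "upper": upper,
        # Wide interval relative to the price: hand the case to an employee.
        "needs_review": (upper - lower) / point > max_relative_width,
    }


# Invented per-tree predictions for two configurations.
confident = price_with_interval([1000, 1010, 990, 1005, 995])
uncertain = price_with_interval([600, 1400, 900, 1200, 1000])
print(confident["needs_review"], uncertain["needs_review"])  # False True
```

Showing the interval next to the price lets users see at a glance when the trees disagree, so that manual effort is spent only on the uncertain cases.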
Result
With the presented method, we have developed a procedure that can give companies an important speed advantage. Where employees previously had to work out the missing values manually to determine the desired quantity, this can now be done within seconds of customers stating their requirements. In addition, we gain insight into how strongly the product features selected by customers affect the result. For example, we learn how the choice of motor affects pricing. If these effects match the knowledge of the experts, this can further increase confidence in the model.
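The article does not state which technique was used to obtain these insights. One common, model-agnostic option is permutation importance: shuffle one feature across all orders and measure how much the prediction error grows. The sketch below assumes this technique and uses a hypothetical `toy_model` in place of the trained price model.

```python
import random


def mean_absolute_error(predict, rows, targets):
    return sum(abs(predict(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, seed=0):
    """Increase in MAE when one feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + (value,) + row[j + 1:]
                    for row, value in zip(rows, column)]
        importances.append(
            mean_absolute_error(predict, shuffled, targets) - baseline
        )
    return importances


# Hypothetical stand-in for a trained price model: the price depends on the
# motor power but ignores the paint colour code.
def toy_model(row):
    motor_kw, colour_code = row
    return 500 + 100 * motor_kw


# Invented orders: (motor power in kW, colour code).
rows = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0), (6, 1), (7, 0), (8, 1)]
targets = [toy_model(row) for row in rows]

motor_importance, colour_importance = permutation_importance(
    toy_model, rows, targets
)
print(f"motor: {motor_importance:.1f}, colour: {colour_importance:.1f}")
```

A feature the model relies on, such as the motor here, produces a clear increase in error when shuffled, while an ignored feature produces none. Experts can compare such a ranking with their own expectations.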