Machine Learning and Artificial Intelligence have become an integral part of today’s world. These technologies find relevant content for us on the internet, navigate us through traffic, recognize us in pictures, and understand our spoken commands. In our blog, we aim to explain the basics of these technologies and to explore research projects and use cases. So first, let’s find out how machines can learn at all.
The concept of learning is already familiar to us humans: A human being learns from birth, and the ability to learn has allowed humanity to develop its current state of knowledge, technology, and communal values. Learning is a fundamental condition for adapting to one’s environment: In learning, we change our behavior or thinking based on experience or new insights. In this respect, learning is also a core prerequisite for intelligent life. But how can the ability to learn be transferred to a machine, an artificial system? How can we create Artificial Intelligence?
By “artificial system” we mean a computer: a machine that stores and processes data, for example by performing calculations. Computers have become an important part of our lives: smartphones serve as communication channels, it is impossible to imagine offices without screens, and global logistics and industry have been digitized across the board. The computing and storage power of modern computers is immense: what would take a human several weeks to calculate, a smartphone can do in a fraction of a second.
For a long time, computers only carried out instructions given by humans and lacked the ability to abstract or even learn autonomously. Because of this, computers could only solve tasks that could be described by simple instructions, such as arithmetic steps. Research in Machine Learning addresses the question of how computers can learn to master complex tasks. Many breakthroughs in recent years have enabled computers to reliably solve seemingly complex tasks like face or speech recognition. But how can a computer learn?
Data as the basis for learning
Machines also learn through experience. The first difficulty is to represent this experience to computers in the form of data. As with human learning, most Machine Learning methods are based on concrete examples. However, the computer should not merely memorize these examples; it must also recognize complex patterns and relationships in the data. Usually, the data contains some concrete, measured quantity or property that the system is supposed to predict, the so-called label. Special forms of Machine Learning have also been developed to handle other data situations. These allow, for example, learning from data without labels, or actively querying the data during the learning process.
A simple example: A system is to be designed that identifies an animal as either a dog or a cat (the label, shown in the right column of the table) based on some features. Although examples are available, the system must also be able to classify animals that differ from these examples.
No. | Length | Weight | Fur color | Fur type | Species (label) |
1 | 45 cm | 7 kg | Dark | Short | Cat |
2 | 40 cm | 6.7 kg | Dark | Long | Dog |
3 | 52 cm | 11.2 kg | Spotted | Rough | Dog |
4 | 43 cm | 6.3 kg | Light | Short | Cat |
5 | 55 cm | 12.4 kg | Spotted | Long | Dog |
… | … | … | … | … | … |
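To make "representing experience as data" concrete, here is a minimal Python sketch of how the five rows of the table could be stored and separated into features and labels. The class name `Animal`, its field names, and the helper `split_features_and_labels` are hypothetical choices for illustration; only the five rows shown in the table are included, since the remaining rows are elided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Animal:
    """One example from the table (field names are illustrative assumptions)."""
    length_cm: float
    weight_kg: float
    fur_color: str
    fur_type: str
    species: str  # the label the system should learn to predict


# Only the five rows shown in the table; further rows are elided there.
ANIMALS = [
    Animal(45, 7.0, "Dark", "Short", "Cat"),
    Animal(40, 6.7, "Dark", "Long", "Dog"),
    Animal(52, 11.2, "Spotted", "Rough", "Dog"),
    Animal(43, 6.3, "Light", "Short", "Cat"),
    Animal(55, 12.4, "Spotted", "Long", "Dog"),
]


def split_features_and_labels(animals):
    """Separate the inputs (features) from the quantity to predict (label)."""
    features = [(a.length_cm, a.weight_kg, a.fur_color, a.fur_type) for a in animals]
    labels = [a.species for a in animals]
    return features, labels


features, labels = split_features_and_labels(ANIMALS)
print(features[0], "->", labels[0])
```

Keeping features and labels apart mirrors the learning setup: the system sees the features and is asked to predict the label.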
For prediction, the computer processes the available information mathematically and statistically. In Machine Learning, these sequences of computations are designed to be flexible so that the system can adapt to the available data. The adjustable settings of these computations, comparable to tuning knobs, are called parameters. Training is then the step-by-step adjustment of these parameters so that the system progressively improves its ability to solve the task. Algorithms, i.e., specific sequences of calculation steps, are used for training. The set of parameters and their interrelationships is often referred to as a model because, in a sense, it models the training data. There are many different classes of models, each of which comes with its own training algorithms. Among the best-known methods, not least because they are relatively easy to use, are regressions, decision trees, random forests, support vector machines, artificial neural networks, and probabilistic graphical models.
In our example, one parameter could, for instance, be a weight threshold: animals weighing more than 10 kilograms are classified as dogs, since the data contains no cat above that weight. At the beginning of training, this parameter would be chosen randomly and would then gradually converge to a meaningful value during the learning process. With the help of more parameters, the animals can be distinguished even more accurately.
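A minimal sketch of such a one-parameter model, written in Python, might look as follows. It uses only the weights and labels from the five table rows; the function names `classify` and `count_correct`, and the choice to label everything at or below the threshold as a cat, are assumptions made for illustration.

```python
# (weight in kg, species) pairs taken from the five rows of the table.
EXAMPLES = [(7.0, "Cat"), (6.7, "Dog"), (11.2, "Dog"), (6.3, "Cat"), (12.4, "Dog")]


def classify(weight_kg, threshold_kg):
    """The whole model: a single parameter, the weight threshold."""
    # Assumption for this sketch: anything not above the threshold is a cat.
    return "Dog" if weight_kg > threshold_kg else "Cat"


def count_correct(threshold_kg):
    """Number of table rows whose predicted species matches the label."""
    return sum(classify(w, threshold_kg) == s for w, s in EXAMPLES)


# Different parameter values lead to different behavior on the same data.
for threshold in (5.0, 10.0, 15.0):
    print(f"threshold {threshold} kg: {count_correct(threshold)} of {len(EXAMPLES)} correct")
```

With a threshold of 10 kg, four of the five rows are classified correctly; the 6.7 kg dog in row 2 is mistaken for a cat. A weight threshold alone cannot separate these examples, which is why further parameters, for instance on fur type, would be needed.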
Optimization and validation
To improve the model step by step, one must be able to measure how well it fits the data. To do this, one can, for example, let the model solve the task for the training data at hand and check whether the predicted labels match the real labels. The mathematical formulation of this idea is usually called an objective function; its value depends on the parameters and the observed data. Mathematical optimization can then be used to gradually improve the parameters based on the objective function.
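As a hedged illustration, the following Python sketch defines one possible objective function for the dog-or-cat example, the fraction of misclassified examples, and minimizes it by simply trying a list of candidate thresholds. The single-threshold model, the names `predict`, `objective`, and `optimize`, and the candidate grid are assumptions for this sketch; practical methods typically use more refined optimization procedures.

```python
# (weight in kg, species) pairs taken from the five rows of the table.
EXAMPLES = [(7.0, "Cat"), (6.7, "Dog"), (11.2, "Dog"), (6.3, "Cat"), (12.4, "Dog")]


def predict(weight_kg, threshold_kg):
    """Hypothetical one-parameter model: heavier than the threshold means dog."""
    return "Dog" if weight_kg > threshold_kg else "Cat"


def objective(threshold_kg, examples):
    """Fraction of examples whose predicted label differs from the real label."""
    errors = sum(predict(w, threshold_kg) != s for w, s in examples)
    return errors / len(examples)


def optimize(examples, candidates):
    """Try each candidate parameter value and keep the best one seen so far."""
    best_threshold, best_value = None, float("inf")
    for threshold in candidates:
        value = objective(threshold, examples)
        if value < best_value:  # strictly better than everything tried before
            best_threshold, best_value = threshold, value
    return best_threshold, best_value


# Candidate thresholds from 5.0 kg to 15.0 kg in steps of 0.5 kg (arbitrary grid).
candidates = [5.0 + 0.5 * i for i in range(21)]
best_threshold, best_value = optimize(EXAMPLES, candidates)
print("best threshold:", best_threshold, "objective:", best_value)
```

On these five rows several thresholds tie: both 6.5 kg and 10 kg misclassify exactly one animal, so the search keeps the first one it encounters. The objective function alone cannot tell which of them will work better on new animals.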
A trained model can now be used to make predictions for data where the label is unknown. But how can one determine the quality of the model on unseen data, or even compare two trained models? This is why the available labeled data is usually divided into training and test data. The training data is used solely to train the model. The model is then applied to the test data, for which labels are also available. Because this data was not used in training, it gives an indication of the quality of the model on unseen data. This process is called validation. In the graphic below, the sub-steps in Machine Learning are summarized once again in simplified form.
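The split into training and test data can be sketched in Python as follows. This is an illustration under stated assumptions: the single-threshold model, the helper names `train_test_split`, `fit_threshold`, and `accuracy`, the 60/40 split, and the fixed random seed are all hypothetical choices, and with only the five table rows the resulting numbers carry little meaning.

```python
import random

# (weight in kg, species) pairs taken from the five rows of the table.
EXAMPLES = [(7.0, "Cat"), (6.7, "Dog"), (11.2, "Dog"), (6.3, "Cat"), (12.4, "Dog")]


def predict(weight_kg, threshold_kg):
    """Hypothetical one-parameter model: heavier than the threshold means dog."""
    return "Dog" if weight_kg > threshold_kg else "Cat"


def accuracy(threshold_kg, examples):
    """Fraction of examples whose predicted label matches the real label."""
    correct = sum(predict(w, threshold_kg) == s for w, s in examples)
    return correct / len(examples)


def train_test_split(examples, test_fraction=0.4, seed=0):
    """Shuffle a copy of the data and hold out a fraction as test data."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training data, test data)


def fit_threshold(train_examples):
    """Training: pick the candidate threshold that works best on the training data."""
    candidates = [5.0 + 0.5 * i for i in range(21)]  # 5.0 kg to 15.0 kg
    return max(candidates, key=lambda t: accuracy(t, train_examples))


train, test = train_test_split(EXAMPLES)
threshold = fit_threshold(train)  # the test data is never used here
print("training accuracy:", accuracy(threshold, train))
print("test accuracy:", accuracy(threshold, test))
```

The test accuracy, not the training accuracy, is the figure to use when comparing two trained models, because it is measured on examples the models have not seen during training.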
This covers some important, fundamental aspects of Machine Learning. Has this sparked your interest in Machine Learning and Artificial Intelligence? Continue to follow the blog of the Lamarr Institute (formerly known as Machine Learning Competence Center Rhine-Ruhr (ML2R)). The researchers of the Institute share their expertise with users and experts alike. You can read posts marked “Basics” to systematically build knowledge in the field of Machine Learning. Do you have questions or comments? Feel free to contact the Lamarr team!