Learning from Experience

Systems that learn to operate and adapt without supervision

Experience-Based Learning
© Fraunhofer IIS

In the last few years, there have been groundbreaking Artificial Intelligence (AI) applications in many fields of research and day-to-day business. Apart from novel algorithmic developments and increased computational resources, the main fuel behind these successes was the availability of huge amounts of annotated data. Modern AI algorithms try to discover patterns within static data or construct models that can predict the »correct« output – as indicated by a »teacher« – for a set of given inputs.

To take the next step in AI, though, and design truly autonomous systems that can operate in the real world and interact with humans, such as autonomous vehicles, parcel delivery drones or intelligent production systems, we need algorithms with different properties, because the requirements are different. Autonomous systems must operate safely without supervision, make a series of decisions towards achieving a goal, and adapt to unforeseen situations that arise from the complexity of the world.

Experience-Based Learning is a central component towards designing such autonomous systems. Here, the behavior of the system is not pre-defined with a set of static rules or static machine learning models trained offline, but the system constantly improves its behavior by collecting new data from the environment.

The field of Reinforcement Learning facilitates the training of agents that discover a strategy (also called a controller or policy) that improves their performance by interacting with the environment through a trial-and-error process, thus providing the theoretical foundation and a large ecosystem of algorithmic approaches for experience-based training of autonomous systems.
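The trial-and-error process described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one of many Reinforcement Learning algorithms, on a toy corridor environment. The environment, the reward values, the hyperparameters and the function names are assumptions made for illustration and do not come from the work described here.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor.

    States are 0..n_states-1, the agent starts at 0 and the goal is the
    rightmost state. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max(range(2), key=lambda a: q[state][a])
            next_state = min(max(state + moves[action], 0), n_states - 1)
            done = next_state == n_states - 1
            # Assumed reward: +1 at the goal, small cost for every other step.
            reward = 1.0 if done else -0.01
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(2), key=lambda a: row[a]) for row in q]
```

The agent is never told that moving right is correct. It only observes rewards, and the greedy policy over the learned table is the strategy it discovered. The last state is terminal, so its table entry is never updated.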

Our main drive in the »Experience-Based Learning« pillar is to support the transfer of ideas, algorithms and success stories of Reinforcement Learning Research to Industrial Applications of Autonomous Systems. To achieve this, we focus on approaches that lead to Dependable Reinforcement Learning.

Safety-aware Reinforcement Learning

One of the greatest strengths of Reinforcement Learning is the ability to learn high-quality and creative solutions for the complex problem at hand. For many real-world applications however, a high-performing solution alone is not enough, since in many situations systems must be controlled by transparent and safety-aware mechanisms. A typical example here is autonomous vehicles, where the agent in control must not only fulfill objectives like reaching the desired destination while maintaining fuel efficiency and passenger comfort, but also minimize the probability of collision at the same time.

To address these safety-relevant requirements, we utilize algorithms that learn both a risk estimator, either from available data or during the interaction between the agent and the environment, and a trade-off factor between the value of the goal to be achieved and the risk of a specific strategy that achieves this goal.
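One common way to combine a value estimate, a risk estimate and a trade-off factor is a Lagrangian-style penalty. The sketch below assumes this formulation for illustration only and is not necessarily the method used in the applications described here. The action names, the numbers and the risk budget are hypothetical.

```python
def select_action(values, risks, trade_off):
    """Pick the action with the best value after subtracting weighted risk.

    values and risks map each action to an estimate; in practice both
    would come from learned estimators.
    """
    scores = {a: values[a] - trade_off * risks[a] for a in values}
    return max(scores, key=scores.get)


def update_trade_off(trade_off, observed_risk, risk_budget, step_size=1.0):
    """Dual-ascent step on the trade-off factor.

    The factor grows while the observed risk exceeds the budget and
    shrinks otherwise, but never drops below zero.
    """
    return max(0.0, trade_off + step_size * (observed_risk - risk_budget))


# Hypothetical estimates for a merging decision.
values = {"merge_now": 1.0, "wait": 0.6}
risks = {"merge_now": 0.30, "wait": 0.02}
```

With a trade-off factor of zero the agent ignores risk and merges immediately. As the factor grows, the lower-risk action eventually scores higher.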

Within ADA, this approach has been utilized by the »KI-Framework für autonome Systeme« Application to learn an autonomous driving agent that safely performs risk-sensitive high-level decisions, like merging into roundabouts, taking unprotected turns and changing lanes. Moreover, in the »Effiziente Suche und Repräsentation von Trackingdaten« Application, the risk of losing the ball in a soccer match was estimated using offline data from real games and a safety-aware Reinforcement Learning agent was trained to play an 11 vs 11 soccer game in simulation.

An additional layer of safety and robustness can be added by adopting Hierarchical Reinforcement Learning approaches. Here, instead of an algorithm learning to solve a task directly, low-level controllers/policies are responsible for performing basic functionalities, while higher-level policies learn to solve the complex task by re-using, combining and sequencing the available low-level policies. The low-level components can be hand-engineered, trained individually or even encapsulate advanced control logic like Model Predictive Controllers. This approach has been extensively used by the »KI-Framework für autonome Systeme« Application to learn more robust and interpretable autonomous driving policies. Here, low-level strategies such as »follow lane«, »switch to left/right lane« or »increase/decrease speed« are available and a coordinating agent is trained to utilize these in order to safely navigate through traffic to a destination point.  
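The hierarchical structure can be sketched as follows. The low-level skills are hand-engineered placeholders, and the coordinator is reduced to a lookup table that stands in for the trained high-level policy. All types, field names and thresholds are assumptions made for illustration.

```python
from typing import Dict, NamedTuple, Tuple


class Observation(NamedTuple):
    lane: int         # 0 is the rightmost lane
    speed: float      # metres per second
    gap_ahead: float  # distance to the vehicle in front, in metres


class Command(NamedTuple):
    target_lane: int
    acceleration: float


# Low-level skills: each turns an observation into a control command.
def follow_lane(obs):
    return Command(obs.lane, 0.0)


def switch_left(obs):
    return Command(obs.lane + 1, 0.0)


def switch_right(obs):
    return Command(max(obs.lane - 1, 0), 0.0)


def increase_speed(obs):
    return Command(obs.lane, 1.0)


def decrease_speed(obs):
    return Command(obs.lane, -1.0)


SKILLS = {f.__name__: f for f in (follow_lane, switch_left, switch_right,
                                  increase_speed, decrease_speed)}


def situation(obs, safe_gap=30.0):
    """Coarse, hypothetical abstraction of the traffic situation."""
    return "blocked" if obs.gap_ahead < safe_gap else "free"


class Coordinator:
    """High-level policy that only chooses which skill to run."""

    def __init__(self, skill_for_situation: Dict[str, str]):
        # In a trained system this mapping would be learned, not given.
        self.skill_for_situation = skill_for_situation

    def act(self, obs: Observation) -> Tuple[str, Command]:
        name = self.skill_for_situation[situation(obs)]
        return name, SKILLS[name](obs)
```

Because the coordinator can only select among known skills, every decision it makes can be read as a skill name, which is part of what makes this structure easier to inspect.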

Imitation Learning

The Reinforcement Learning paradigm is based on the availability of a reward function that quantifies how beneficial or detrimental the decisions the agent took in each time-step are. Based on this information, the agent can progressively learn to improve its performance and solve a given task more efficiently. In several cases though, designing an appropriate reward function can be hard. For example, how can we quantify the »internal« reward function of the driver of a car, balancing between comfort, fuel/cost efficiency and reaching a destination in the least amount of time?
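To make the difficulty concrete, here is a sketch of a hand-designed reward for the driving example, together with the discounted return that an agent would try to maximize. The terms and the weights are purely hypothetical. Choosing them well is exactly the hard part described above.

```python
def step_reward(elapsed_s, fuel_l, jerk, reached_goal,
                w_time=0.1, w_fuel=1.0, w_comfort=0.05, goal_bonus=10.0):
    """Hypothetical per-step reward balancing time, fuel and comfort.

    elapsed_s: seconds spent in this step
    fuel_l:    litres of fuel consumed in this step
    jerk:      change of acceleration, used as a proxy for discomfort
    """
    reward = -w_time * elapsed_s - w_fuel * fuel_l - w_comfort * jerk ** 2
    if reached_goal:
        reward += goal_bonus
    return reward


def discounted_return(rewards, gamma=0.99):
    """Sum of rewards where a reward t steps ahead is weighted by gamma**t."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total
```

Small changes to the weights can change which behavior maximizes the return, for example trading a shorter travel time against a less comfortable ride.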

In such situations, Imitation Learning can be utilized. Here, we either have examples of an expert performing the task at hand (for example recorded data from several human drivers are available in the autonomous driving case) and the agent learns to »mimic« their strategy or a »teacher/student« setting is defined, where we can interactively query the expert directly on what the best actions in specific situations are in order to progressively train the agent to reach the performance of the expert.
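Both settings can be sketched in a few lines. The first function is a minimal form of behavioral cloning from recorded demonstrations. The second is a teacher/student loop in the spirit of interactive methods such as DAgger. The state encoding, the `expert` and `visit_states` callables and the majority-vote learner are simplifying assumptions made for illustration.

```python
from collections import Counter, defaultdict


def behavioral_cloning(demonstrations):
    """Learn a policy from (state, action) pairs recorded from an expert.

    For every state the policy picks the action the expert chose most
    often. States must be hashable, for example discretized features.
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


def teacher_student(expert, visit_states, rounds=3):
    """Interactive imitation: query the expert on states the student visits.

    expert(state) returns the expert's action for a state.
    visit_states(policy) returns the states reached when running the
    current student policy (a hypothetical stand-in for a rollout).
    """
    dataset = []
    policy = {}
    for _ in range(rounds):
        for state in visit_states(policy):
            dataset.append((state, expert(state)))
        policy = behavioral_cloning(dataset)
    return policy
```

The interactive variant collects labels on the states the student actually reaches, which helps when the student drifts into situations that the recorded demonstrations never covered.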

Imitation Learning can also be leveraged as a process towards interpreting the decisions of the trained agent. Here, we can use Imitation Learning to distil the behavior of a trained black-box agent policy (e.g., a Neural Network) to a Binary Decision Tree as in the »teacher/student« setting. This way, the generated Decision Tree has the same performance as the original policy but can be interpreted as a set of »if-then-else« rules. This approach is utilized by the »KI-Framework für autonome Systeme« Application, where high-level behavioral driving policies (e.g., deciding when to change lanes or merge into traffic on highways) are trained using Reinforcement Learning and are then extracted to Binary Decision Trees. These trees can in turn be verified either manually by a safety engineer or using automated formal verification methods.
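The distillation step can be sketched with a small, self-written decision tree learner. The black-box policy below is a hypothetical stand-in for a trained Neural Network, and the feature names, sampling grid and tree depth are assumptions made for illustration. A production setup would more likely use an established library.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_tree(rows, labels, depth=0, max_depth=2):
    """Greedy binary tree: a leaf is a label, a node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return majority
    _, f, threshold, left, right = best
    return (f, threshold,
            fit_tree([rows[i] for i in left], [labels[i] for i in left],
                     depth + 1, max_depth),
            fit_tree([rows[i] for i in right], [labels[i] for i in right],
                     depth + 1, max_depth))


def predict(tree, row):
    """Follow the splits until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


def to_rules(tree, names, indent=""):
    """Render the tree as nested if-then-else rules."""
    if not isinstance(tree, tuple):
        return f"{indent}return {tree!r}\n"
    f, threshold, left, right = tree
    return (f"{indent}if {names[f]} <= {threshold}:\n"
            + to_rules(left, names, indent + "    ")
            + f"{indent}else:\n"
            + to_rules(right, names, indent + "    "))


def black_box_policy(gap, speed):
    """Hypothetical teacher; in practice this would be the trained agent."""
    return "change_lane" if gap < 30 and speed > 20 else "keep_lane"


# Query the teacher on sampled states, then fit the student tree.
states = [(gap, speed) for gap in range(0, 100, 10) for speed in range(0, 40, 5)]
actions = [black_box_policy(gap, speed) for gap, speed in states]
tree = fit_tree(states, actions)
rules = to_rules(tree, ("gap", "speed"))
```

On this toy teacher the tree reproduces every sampled decision. For a real policy the agreement between teacher and student would have to be measured, and the resulting rules are what a safety engineer or a verification tool would inspect.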

»ADA wants to know« Podcast

In our new podcast series »ADA wants to know«, the people responsible for the competence pillars are in conversation with Ada and provide insight into their research focuses, challenges and methods. In this episode, Christopher Mitschler talks about Learning from Experience. Listen in now!

Our focus areas within AI research

Our work at the ADA Lovelace Center is aimed at developing the following methods and procedures in nine domains of artificial intelligence from an applied perspective.

Automated Learning

Automated Learning covers a large area starting with the automation of feature detection and selection for given datasets as well as model search and optimization, continuing with their automated evaluation, and ending with the adaptive adjustment of models through training data and system feedback.


 

Sequence-based Learning

Sequence-based Learning concerns itself with the temporal and causal relationships found in data in applications such as language processing, event processing, biosequence analysis, or multimedia files. Observed events are used to determine the system’s current status, and to predict future conditions. This is possible both in cases where only the sequence in which the events occurred is known, and when they are labelled with exact time stamps.


Data-centric AI (DCAI) offers a new perspective on AI modeling that shifts the focus from model building to the curation of high-quality annotated training datasets, because in many AI projects, that is where the leverage for model performance lies. DCAI offers methods such as model-based annotation error detection, design of consistent multi-rater annotation systems for efficient data annotation, use of weak and semi-supervised learning methods to exploit unannotated data, and human-in-the-loop approaches to improve models and data.


To ensure safe and appropriate adoption of artificial intelligence in fields such as medical decision-making and quality control in manufacturing, it is crucial that the machine learning model is comprehensible to its users. An essential factor in building transparency and trust is to understand the rationale behind the model's decision making and its predictions. The ADA Lovelace Center is conducting research on methods to create comprehensible and trustworthy AI systems in the competence pillar of Trustworthy AI, contributing to human-centered AI for users in business, academia, and society.


Process-aware Learning is the link between process mining, the data-based analysis and modeling of processes, and machine learning. The focus is on predicting process flows, process metrics, and process anomalies. This is made possible by extracting process knowledge from event logs and transferring it into explainable prediction models. In this way, influencing factors can be identified and predictive process improvement options can be defined.

Mathematical optimization plays a crucial role in model-based decision support, providing planning solutions in areas as diverse as logistics, energy systems, mobility, finance, and building infrastructure, to name but a few examples. The Center is expanding its already extensive expertise in a number of promising areas, in particular real-time planning and control.

Semantics

The task of semantics is to describe data and data structures in a formally defined, standardized, consistent and unambiguous manner. For the purposes of Industry 4.0, numerous entities (such as sensors, products, machines, or transport systems) must be able to interpret the properties, capabilities or conditions of other entities in the value chain.

Tiny Machine Learning (TinyML) brings AI even to microcontrollers. It enables low-latency inference on edge devices that typically have only a few milliwatts of power consumption. To achieve this, Fraunhofer IIS is conducting research on multi-objective optimization for efficient design space exploration and advanced compression techniques. Furthermore, hierarchical and informed machine learning, efficient model architectures and genetic AI pipeline composition are explored in our research. We enable the intelligent products of our partners.


Hardware-aware Machine Learning (HW-aware ML) focuses on algorithms, methods and tools to design, train and deploy HW-specific ML models. This includes a wide range of techniques to increase energy efficiency and robustness against HW faults, e.g. robust training for quantized DNN models using Quantization- and Fault-aware Training, and optimized mapping and deployment to specialized (e.g. neuromorphic) hardware. At Fraunhofer IIS, we complement this with extensive research in the field of Spiking Neural Network training, optimization, and deployment.

Other topics of interest

 

ADA Lovelace Center for Analytics, Data and Applications

With its unparalleled combination of AI research and AI applications in industry, the ADA Lovelace Center is a forum in which partners can forge mutual connections, benefit from each other’s know-how, and work on joint projects.

Reinforcement Learning Seminar

 

»Reinforcement Learning (RL)« is an area of machine learning. The goal here is to enable an autonomous agent to accomplish a task through trial-and-error without using annotated training data. The agent is not given examples of correct actions, but must interact with the environment to discover a strategy that maximizes the expected cumulative reward for the task at hand. Through a combination of theory and actual industry case studies, this two-day seminar will enable you to understand the value and impact of this technology on your business.