Experience-Based Learning

Systems that learn to operate and adapt without supervision


In the last few years, there have been groundbreaking Artificial Intelligence (AI) applications in many research fields and day-to-day business. Apart from novel algorithmic developments and increased computational resources, the main fuel behind these successes was the availability of huge amounts of annotated data. Modern AI algorithms discover patterns within static data or construct models that predict the »correct« output – as indicated by a »teacher« – for a set of given inputs.

To take the next step in AI, however – designing truly autonomous systems that can operate in the real world and interact with humans, such as autonomous vehicles, parcel delivery drones or intelligent production systems – we need algorithms with different properties, because the requirements are different: autonomous systems must operate safely without supervision, make a series of decisions towards achieving a goal, and adapt to unforeseen situations arising from the complexity of the world.

Experience-Based Learning is a central component in designing such autonomous systems. Here, the behavior of the system is not pre-defined by a set of static rules or by static machine learning models trained offline; instead, the system constantly improves its behavior by collecting new data from the environment.

The field of Reinforcement Learning provides the theoretical foundation and a large ecosystem of algorithmic approaches for experience-based training of autonomous systems: agents discover a strategy (also called a controller or policy) that improves their performance by interacting with the environment in a trial-and-error process.
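To make this trial-and-error loop concrete, the following minimal sketch trains a tabular Q-learning agent on a toy »chain« environment. The environment, hyperparameters and update rule are textbook illustrations, not taken from any specific ADA application.

    import random
    from collections import defaultdict

    class ChainEnv:
        """Toy task: start at position 0, reach position 5; each step costs
        a small penalty, reaching the goal yields a reward of +10."""
        def __init__(self, length=6):
            self.length, self.state = length, 0

        def reset(self):
            self.state = 0
            return self.state

        def step(self, action):  # action: 0 = move left, 1 = move right
            self.state = max(0, self.state - 1) if action == 0 else min(self.length - 1, self.state + 1)
            done = self.state == self.length - 1
            return self.state, (10.0 if done else -1.0), done

    env = ChainEnv()
    q = defaultdict(lambda: [0.0, 0.0])     # Q-value table: state -> value per action
    alpha, gamma, epsilon = 0.1, 0.95, 0.1  # learning rate, discount, exploration rate

    for episode in range(500):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy: mostly exploit the current strategy, sometimes explore
            a = random.randrange(2) if random.random() < epsilon else max((0, 1), key=lambda i: q[s][i])
            s_next, r, done = env.step(a)
            # temporal-difference update: improve the value estimates from experience
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next

After a few hundred episodes, the learned Q-values encode the strategy »always move right«, discovered purely from interaction rather than from annotated examples.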

Our main drive in the »Experience-Based Learning« pillar is to support the transfer of ideas, algorithms and success stories from Reinforcement Learning research to industrial applications of autonomous systems. To achieve this, we focus on approaches that lead to Dependable Reinforcement Learning.

Safety-aware Reinforcement Learning

One of the greatest strengths of Reinforcement Learning is its ability to learn high-quality and creative solutions for the complex problem at hand. For many real-world applications, however, a high-performing solution alone is not enough: in many situations, systems must be controlled by transparent and safety-aware mechanisms. A typical example is autonomous vehicles, where the agent in control must not only fulfill objectives like reaching the desired destination while maintaining fuel efficiency and passenger comfort, but at the same time minimize the probability of a collision.

To address these safety-relevant requirements, we utilize algorithms that learn both a risk estimator – either from available data or during the interaction between the agent and the environment – and a trade-off factor between the value of the goal to be achieved and the risk of a specific strategy that achieves this goal.
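A minimal sketch of how such a trade-off can be applied at decision time is given below. The names (value_fn, risk_fn, lam) and the linear value-minus-risk combination are illustrative assumptions; in practice, both estimators would be learned models, e.g., a risk network regressed on observed failure events.

    import numpy as np

    def select_action(state, actions, value_fn, risk_fn, lam):
        """Pick the action maximizing estimated value minus lam-weighted risk."""
        scores = [value_fn(state, a) - lam * risk_fn(state, a) for a in actions]
        return actions[int(np.argmax(scores))]

    # Dummy estimators standing in for learned models:
    value_fn = lambda s, a: 1.0 if a == "merge" else 0.0  # progress towards the goal
    risk_fn = lambda s, a: 0.8 if a == "merge" else 0.1   # e.g., estimated collision probability

    # With a high trade-off factor, the agent prefers the cautious action:
    print(select_action(None, ["wait", "merge"], value_fn, risk_fn, lam=2.0))  # -> "wait"

The trade-off factor lam makes the safety behavior explicit and tunable: raising it shifts the agent from goal-seeking towards risk-averse decisions.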

Within ADA, this approach has been utilized by the »KI-Framework für autonome Systeme« Application to train an autonomous driving agent that safely performs risk-sensitive high-level decisions, like merging at roundabouts, taking unprotected turns and changing lanes. Moreover, in the »Effiziente Suche und Repräsentation von Trackingdaten« Application, the risk of losing the ball in a soccer match was estimated using offline data from real games, and a safety-aware Reinforcement Learning agent was trained to play an 11-vs-11 soccer game in simulation.

An additional layer of safety and robustness can be added by adopting Hierarchical Reinforcement Learning approaches. Here, instead of an algorithm learning to solve a task directly, low-level controllers/policies are responsible for performing basic functionalities, while higher-level policies learn to solve the complex task by re-using, combining and sequencing the available low-level policies. The low-level components can be hand-engineered, trained individually, or even encapsulate advanced control logic like Model Predictive Controllers. This approach has been used extensively by the »KI-Framework für autonome Systeme« Application to learn more robust and interpretable autonomous driving policies: low-level strategies such as »follow lane«, »switch to left/right lane« or »increase/decrease speed« are available, and a coordinating agent is trained to utilize them in order to navigate safely through traffic to a destination point, as sketched below.
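The following sketch illustrates this coordination scheme. The maneuver names mirror the text, while the control logic inside each low-level policy and the dummy environment are placeholders; a real application would wrap a traffic simulator and trained or hand-engineered controllers.

    LOW_LEVEL_POLICIES = {
        "follow_lane":    lambda obs: {"steer": 0.0, "accel": 0.0},
        "switch_left":    lambda obs: {"steer": -0.3, "accel": 0.0},
        "switch_right":   lambda obs: {"steer": 0.3, "accel": 0.0},
        "increase_speed": lambda obs: {"steer": 0.0, "accel": 0.5},
        "decrease_speed": lambda obs: {"steer": 0.0, "accel": -0.5},
    }

    class DummyTrafficEnv:
        """Stand-in environment; a real application would wrap a traffic simulator."""
        def step(self, control):
            return {}, -0.1, False  # observation, reward, episode-done flag

    def run_option(env, obs, option, max_steps=20):
        """Execute one low-level policy until the episode ends or a time limit
        is hit; the coordinating agent only learns over these coarse decisions."""
        policy, total_reward, done = LOW_LEVEL_POLICIES[option], 0.0, False
        for _ in range(max_steps):
            obs, reward, done = env.step(policy(obs))
            total_reward += reward
            if done:
                break
        return obs, total_reward, done

    obs, reward, done = run_option(DummyTrafficEnv(), {}, "switch_left")

Because the high-level agent only chooses among a handful of named maneuvers, its decisions are both easier to learn and easier to inspect than raw steering and acceleration commands.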

Imitation Learning

The Reinforcement Learning paradigm is based on the availability of a reward function that quantifies how beneficial or detrimental the agent's decisions at each time-step are. Based on this information, the agent can progressively learn to improve its performance and solve a given task more efficiently. In several cases, though, designing an appropriate reward function can be hard. For example, how can we quantify the »internal« reward function of the driver of a car, who balances comfort, fuel/cost efficiency and reaching the destination in the least amount of time?
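When the objectives are explicit, such a reward can be written down directly, as in the hypothetical driving reward below; the features and weights are invented for illustration. The difficulty described above arises exactly when such weights cannot be stated.

    def driving_reward(progress_m, fuel_l, jerk_ms3,
                       w_progress=1.0, w_fuel=5.0, w_comfort=0.2):
        """Per-time-step reward: encourage progress (meters gained), penalize
        fuel use (liters) and discomfort (jerk). All weights are illustrative."""
        return w_progress * progress_m - w_fuel * fuel_l - w_comfort * abs(jerk_ms3)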

In such situations, Imitation Learning can be utilized. Here, we either have examples of an expert performing the task at hand (for example, recorded data from several human drivers in the autonomous driving case) and the agent learns to »mimic« their strategy, or a »teacher/student« setting is defined, where the expert can be queried interactively about the best actions in specific situations in order to progressively train the agent to reach the performance of the expert.
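In its simplest form (behavioral cloning), the first variant reduces to supervised learning on recorded (state, expert action) pairs, as the following sketch shows. The random data stands in for logged demonstrations, and the expert rule is a placeholder; any classifier could take the role of the policy.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    states = rng.normal(size=(1000, 4))              # e.g., speed, gap, lane offset, ...
    expert_actions = (states[:, 0] > 0).astype(int)  # placeholder for the expert's decisions

    # Behavioral cloning: fit a classifier that maps states to expert actions
    policy = LogisticRegression().fit(states, expert_actions)
    imitated_action = policy.predict(states[:1])     # imitated decision for a new state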

Imitation Learning can also be leveraged as a process for interpreting the decisions of a trained agent. Here, we can use Imitation Learning to distill the behavior of a trained black-box agent policy (e.g., a Neural Network) into a Binary Decision Tree, as in the »teacher/student« setting. This way, the generated Decision Tree achieves performance comparable to the original policy but can be interpreted as a set of »if-then-else« rules. This approach is utilized by the »KI-Framework für autonome Systeme« Application, where high-level behavioral driving policies (e.g., deciding when to change lanes or merge into highway traffic) are trained using Reinforcement Learning and are then extracted into Binary Decision Trees. These trees can in turn be verified either manually by a safety engineer or using automated formal verification methods.
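A minimal sketch of the distillation step: the black-box policy labels a set of states with its chosen actions, and a shallow decision tree is fitted to these labels in the »teacher/student« fashion. The placeholder policy and feature names are invented; in practice, the teacher would be the trained neural-network agent.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(1)
    states = rng.normal(size=(1000, 3))                               # stand-in driving features
    black_box_policy = lambda s: (s[:, 0] + s[:, 1] > 0).astype(int)  # placeholder trained agent

    # Teacher/student distillation: the black box labels states with its actions,
    # and a shallow tree is fitted to reproduce them
    teacher_actions = black_box_policy(states)
    tree = DecisionTreeClassifier(max_depth=3).fit(states, teacher_actions)
    print(export_text(tree, feature_names=["gap", "rel_speed", "lane_offset"]))

The printed tree is a set of nested »if-then-else« rules over the input features, which is exactly the representation a safety engineer or an automated verification tool can inspect.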

Other topics of interest

 

ADA Lovelace Center for Analytics, Data and Applications

With its unparalleled combination of AI research and AI applications in industry, the ADA Lovelace Center is a forum in which partners can forge mutual connections, benefit from each other’s know-how, and work on joint projects.

Reinforcement Learning Seminar

 

»Reinforcement Learning (RL)« is an area of machine learning. The goal is to enable an autonomous agent to accomplish a task through trial and error, without annotated training data. The agent is not given examples of correct actions but must interact with the environment to discover a strategy that maximizes the expected cumulative reward for the task at hand. Through a combination of theory and real industry case studies, this two-day seminar will enable you to understand the value and impact of this technology on your business.