Semantics

Creating transparency in AI decisions

© Fraunhofer IIS

Data is the raw material for all machine learning and artificial intelligence applications. Useful and meaningful insights can only be extracted from this data if the knowledge associated with or contained in it, i.e. its »semantics«, is captured in a suitable way during or after data creation, described in a form that humans and machines can understand equally well, and linked to the actual data.

With reference to these requirements, the competence pillar »Semantics« deals with two key areas:

  • Acquisition of knowledge: The first key area focuses on the question of how »model knowledge« in various specific application areas (such as driver assistance, self-localization, digital pathology, or segmentation of XXL tomography data) can be captured and jointly described with the measurement data used and required for this purpose (e.g., vital data and emotions of persons in the vehicle, localization parameters, microscopy data of histological tissue, XXL tomography data).
  • Usability of knowledge: The second key area deals with the challenge of linking the captured information or semantics with the associated measurement data in such a way that both can be made available and used in various applications by means of methodological approaches from the fields of data analysis, machine learning and artificial intelligence.

In the context of knowledge acquisition, a survey in the form of structured interviews with the experts of the application projects was started in order to document in which form the semantics (i.e. the »knowledge«) regarding the related questions and the different data sources (images, image volumes, videos, multimodal time series, etc.) is available, captured and managed in the different application projects. The goal of this survey is, on the one hand, to establish a common understanding of the term »semantics« and, on the other hand, to find synergies in its capture and use.

Based on the feedback collected in this way, a first clustering of the different methods for knowledge capture was performed. These approaches can currently be divided into the following groups.

Iconic annotation

In iconic annotation, regions in 2D and 3D image data are outlined and marked (»labeled«). In the field of »Digital Pathology«, for example, these labeled regions consist of different tissue areas with certain anatomical or pathological properties such as »tumor«, »connective tissue« or »inflamed tissue«, whereas in the segmentation of XXL-CT data they describe, for example, »screws«, »sheets« or »rivets«. Similar approaches are also used for capturing information in video streams (e.g., of soccer matches), where the 2D positions of the ball and the players are manually marked over time, along with important events (foul, goal, out).
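A labeled region of this kind can be sketched as a small data structure. The following is only an illustration, assuming that regions are stored as polygon outlines in pixel coordinates; the class name `LabeledRegion`, the helper `area_per_label` and the sample coordinates are hypothetical and not taken from the application projects.

```python
from dataclasses import dataclass


@dataclass
class LabeledRegion:
    """One manually drawn region: a label plus a polygon outline (assumed format)."""
    label: str
    polygon: list  # [(x, y), ...] vertices in pixel coordinates

    def area(self) -> float:
        """Polygon area in square pixels, computed with the shoelace formula."""
        total = 0.0
        n = len(self.polygon)
        for i in range(n):
            x1, y1 = self.polygon[i]
            x2, y2 = self.polygon[(i + 1) % n]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


def area_per_label(regions):
    """Sum the annotated area for each label."""
    totals = {}
    for region in regions:
        totals[region.label] = totals.get(region.label, 0.0) + region.area()
    return totals


# Hypothetical annotations on one image.
annotations = [
    LabeledRegion("tumor", [(0, 0), (4, 0), (4, 3), (0, 3)]),
    LabeledRegion("connective tissue", [(5, 0), (9, 0), (5, 4)]),
    LabeledRegion("tumor", [(10, 10), (12, 10), (12, 12), (10, 12)]),
]
print(area_per_label(annotations))
```

Keeping the label next to the geometry means that simple statistics, such as the annotated area per class, can be derived directly from the annotations.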

 

Simulation

For applications from the fields of »Autonomous Driving« or »Automatic AI-based Analysis of Games« (efficient search and representation of tracking data, e.g. from football, basketball or ice hockey), commercially available driving and game simulators are used, among other sources, in addition to real data, which is hard to obtain. The simulator automatically provides the information (»semantics«) that the data analysis is supposed to predict, thus forming the »Measurable Ground Truth«.

Reference systems

For self-localization, indoor tracking, and navigation applications using low-cost smartphones, high-quality sensors such as precise optical tracking systems or robots are used as reference systems.
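One common way to use such a reference system is to score the smartphone's position estimates against the reference positions. The sketch below assumes time-aligned 2D positions in meters and uses the root-mean-square error as the metric; the function name `position_rmse` and the sample values are hypothetical, and the projects may use other metrics.

```python
import math


def position_rmse(estimates, reference):
    """Root-mean-square Euclidean error between estimated and reference 2D positions.

    Both arguments are lists of (x, y) tuples that are assumed to be
    sampled at the same time instants.
    """
    if not estimates or len(estimates) != len(reference):
        raise ValueError("need two non-empty position lists of equal length")
    squared_errors = [
        (ex - rx) ** 2 + (ey - ry) ** 2
        for (ex, ey), (rx, ry) in zip(estimates, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: reference track from an optical system, estimates from a phone.
reference_track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
phone_track = [(0.1, 0.0), (1.0, 0.2), (2.3, 0.0)]
print(round(position_rmse(phone_track, reference_track), 3))
```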

Semantic networks and rule-based systems

Expert knowledge about a domain (e.g., about the composition of assemblies in automobiles or airplanes) is defined and stored in the form of suitable machine-readable rules and formal relation graphs, which can then be interpreted by a machine.
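As a minimal sketch of this idea, a relation graph can be stored as subject-relation-object triples, and a single transitivity rule lets a machine derive facts that no expert entered explicitly. The door assembly, the relation names `part_of` and `made_of`, and the function names are illustrative assumptions, not the representation used in the application projects.

```python
# Hypothetical facts about a car door assembly, stored as
# (subject, relation, object) triples.
FACTS = {
    ("rivet", "part_of", "hinge"),
    ("hinge", "part_of", "door"),
    ("door", "part_of", "car_body"),
    ("hinge", "made_of", "steel"),
}


def infer_transitive(facts, relation):
    """Apply the rule (a r b) and (b r c) => (a r c) until no new fact appears."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, r, o) in closed if r == relation]
        for a, b in pairs:
            for b2, c in pairs:
                if b == b2 and (a, relation, c) not in closed:
                    closed.add((a, relation, c))
                    changed = True
    return closed


def components_of(facts, whole):
    """Return every entity that is directly or indirectly part of `whole`."""
    closed = infer_transitive(facts, "part_of")
    return sorted(s for (s, r, o) in closed if r == "part_of" and o == whole)


print(components_of(FACTS, "car_body"))
```

Here the rivet is reported as a component of the car body, although only its relation to the hinge was stated.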

The goal of processing and compiling these findings is to create a catalog of recommendations for capturing the different kinds of semantics of different data, before turning to the second focus, the usability of the knowledge for different applications.

Our focus areas within AI research

Our work at the ADA Lovelace Center is aimed at developing the following methods and procedures in nine domains of artificial intelligence from an applied perspective.

Automated learning

Automated learning covers a vast field that ranges from automated feature recognition and selection for datasets, model search and optimization, or automated evaluation of these processes through to adaptive model adjustment using training data and system feedback. It plays a key role in areas such as assistance systems for data-driven decision support.

Sequence-based learning

Sequence-based learning concerns itself with the temporal and causal relationships found in data in applications such as language processing, event processing, biosequence analysis, or multimedia files. Observed events are used to determine the system’s current status, and to predict future conditions. This is possible both in cases where only the sequence in which the events occurred is known, and when they are labelled with exact time stamps.

Experience-based learning

Experience-based learning refers to methods whereby a system is able to optimize itself by interacting with its environment and evaluating the feedback it receives, or dynamically adjusting to changing environmental conditions. Examples include automatic generation of models for evaluation and optimization of business processes, transport flows, or control systems for robots in industrial production.

Few Labels Learning

Major breakthroughs in AI involving tasks such as language recognition, object recognition or machine translation can be attributed in part to the availability of vast annotated datasets. Yet in many real-life scenarios, particularly in industry, such datasets are much more limited. We therefore conduct research on learning using small annotated datasets in the context of techniques for unsupervised, semi-supervised and transfer learning.

For several years, we have seen unbridled growth in the volume of digital data in existence, giving rise to the field of big data. When this data is used to generate knowledge, there is a need to explain the ensuing results and forecasts to users in a plausible and transparent manner. At the ADA Center, this issue is explored under the heading of explainable learning, with the goal of boosting acceptance for artificial intelligence among users in industry, research and society at large.

Mathematical optimization plays a crucial role in model-based decision support, providing planning solutions in areas as diverse as logistics, energy systems, mobility, finance, and building infrastructure, to name but a few examples. The Center is expanding its already extensive expertise in a number of promising areas, in particular real-time planning and control.

Semantics

The task of semantics is to describe data and data structures in a formally defined, standardized, consistent and unambiguous manner. For the purposes of Industry 4.0, numerous entities (such as sensors, products, machines, or transport systems) must be able to interpret the properties, capabilities or conditions of other entities in the value chain.

Few Data Learning

We use few data learning to address key research issues involved in processing and augmenting data, or generating sufficient datasets, for instance in AI applications using material master data in industry. This includes processing flawed datasets and using simulation techniques to generate missing data.

You may also be interested in

What the ADA Lovelace Center offers you

 

The ADA Lovelace Center for Analytics, Data and Applications, together with its cooperation partners, offers continuing education programs on concepts, methods and concrete applications in the field of data analytics and AI.

Seminars with the following focus topics are offered: