Comprehensible AI for multimodal state detection

Multimodal detection of cognitive overload


In many application areas, detecting affective and cognitive states can be beneficial. In usability testing, for example, state detection can provide better insight into how a product affects its users and indicate whether the product is overloading them.

However, some states are expressed extremely subtly, which makes detecting them a major challenge. A single modality such as video, for example, is not sufficient to detect cognitive overload robustly. Robust detection only becomes possible by combining different modalities, such as gaze detection and various biosignals.

The requirements and the available modalities may change depending on the deployment scenario. Interfering factors can also affect the signals and introduce uncertainty into the detection of the state.

The goal of this application is therefore to develop a modular and robust system for multimodal state detection. Besides data fusion, the quantification of uncertainties plays a particularly important role: it makes it possible to assess the reliability of each modality and to weight that modality accordingly in the overall assessment.


Interaction of competencies for a suitable system

A wide range of competencies is needed for the development of a robust multimodal system for the detection of cognitive and affective states.

Sequence-based Learning is used to analyze data streams with a temporal dimension, with neural networks extracting the features automatically. As part of the application, a network architecture optimized for the analysis of ECG signals was also developed.

However, motion artifacts or other interfering factors can affect the signals, making them less suitable for state detection. The system should be able to detect this and take it into account when combining the individual signals. To this end, the aim is to enrich the interpretation of each modality with an uncertainty quantification within the framework of Trustworthy AI. This should make the system more robust and also increase confidence in it.
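
One simple way to let uncertainty steer the combination of modalities is inverse-variance weighting. The following Python sketch is only an illustration and not the method used in the application. The function name fuse_estimates, the modality names and all numbers are hypothetical. It assumes that each modality model outputs a probability of cognitive overload together with a variance, and that the errors of the modalities are independent.

```python
def fuse_estimates(estimates, eps=1e-6):
    """Inverse-variance weighted fusion of per-modality estimates.

    estimates: list of (probability, variance) pairs, one per modality.
    Returns (fused_probability, fused_variance).
    The fused variance is only meaningful if the modality errors are
    independent, which is an assumption of this sketch.
    """
    if not estimates:
        raise ValueError("at least one modality estimate is required")
    # A noisy modality (large variance) receives a small weight.
    weights = [1.0 / (variance + eps) for _, variance in estimates]
    total = sum(weights)
    fused_probability = sum(
        w * probability for w, (probability, _) in zip(weights, estimates)
    ) / total
    fused_variance = 1.0 / total
    return fused_probability, fused_variance


# Hypothetical readings: ECG is clean, gaze is moderately noisy,
# video is heavily disturbed (for example by motion artifacts).
readings = {
    "ecg": (0.8, 0.01),
    "gaze": (0.6, 0.04),
    "video": (0.2, 0.5),
}
probability, variance = fuse_estimates(list(readings.values()))
print(f"fused probability: {probability:.2f}, fused variance: {variance:.4f}")
```

In this sketch the disturbed video signal barely influences the result, while the fused variance gives a rough indication of how much the overall assessment can be trusted.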

In the context of this application, the focus is on the detection of cognitive overload in particular. The need to detect this state arises in different use cases, such as usability testing or human-computer interaction. Accordingly, data on cognitive overload from different scenarios will be included. The data will be annotated in collaboration with experts in the field of stress research, which gives them an appropriate meaning (Semantics). In the next steps, models will be trained on the collected data and validated in application-oriented scenarios.