Learning in layered multimodal classifier architectures for cognitive technical systems
Also available in print in the library: W: W-H 14.770
Faculty: Fakultät für Ingenieurwissenschaften, Informatik und Psychologie
Institution: Institut für Neuroinformatik
Resource / media type: Dissertation, Text
Date of first publication: 2016-07-08
Modern computer systems have fundamentally changed our way of living. They improve our effectiveness by assisting us in our work and daily tasks. However, current systems are limited to the direct input of commands. Furthermore, they are unable to take active decisions on behalf of the user, mostly because they lack information about the user. Cognitive technical systems (CTS) address these deficiencies by recognizing user states and the user’s environment with the help of sensor data. The derived information is collected in a knowledge base and further processed by the application and the dialog management to perform the decision making. In this thesis, new methods addressing sensor-based state recognition in the context of CTS in human-computer interaction are developed and empirically evaluated. The focus is set on large multimodal and temporal multiple classifier systems. Furthermore, the work covers sequential classifiers, the handling of partially available information, and the integration of sub-symbolic and symbolic information for complex state recognition. The following approaches are presented in this work: ensemble Gaussian mixture model (EGMM), conditioned hidden Markov model (CHMM), fuzzy conditioned hidden Markov model (FCHMM), hidden Markov model using graph probability densities (HMM-GPD), Markov fusion network (MFN), Kalman filter for classifier fusion, and layered classifier architectures. The EGMM extends the classical GMM by the ensemble technique in order to achieve a more robust density estimation. The CHMM and the FCHMM extend the HMM by an additional causal sequence which influences the hidden states. The CHMM uses a sequence of discrete causes, whereas the FCHMM uses a sequence of causes with fuzzy memberships. Both approaches can further be used to integrate symbolic information. The HMM-GPD introduces graph probability densities as observations in the HMM.
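The ensemble idea behind the EGMM can be illustrated with a minimal sketch. It is not the implementation from the thesis: it assumes one-dimensional data, bootstrap resampling, a randomly chosen number of components per ensemble member, and plain averaging of the member densities. All function names (`fit_gmm`, `fit_egmm`, `egmm_pdf`) and parameters are hypothetical.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm(data, k, rng, n_iter=30, min_var=1e-3):
    """Fit a 1-D GMM with k components by plain EM (illustrative only)."""
    means = rng.sample(data, k)
    spread = max(data) - min(data)
    variances = [max((spread / k) ** 2, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pi / total for pi in p])
        # M-step: re-estimate weights, means and variances.
        counts = [max(sum(r[j] for r in resp), 1e-12) for j in range(k)]
        weights = [c / sum(counts) for c in counts]
        for j in range(k):
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / counts[j]
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / counts[j]
            variances[j] = max(var, min_var)  # floor avoids collapsing components
    return list(zip(weights, means, variances))


def gmm_pdf(x, gmm):
    """Density of a single GMM given as (weight, mean, variance) triples."""
    return sum(w * gauss_pdf(x, m, v) for w, m, v in gmm)


def fit_egmm(data, n_models=5, component_choices=(1, 2, 3), seed=0):
    """Assumed ensemble scheme: each member sees a bootstrap sample and a random k."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]
        k = rng.choice(component_choices)
        models.append(fit_gmm(sample, k, rng))
    return models


def egmm_pdf(x, models):
    """Ensemble density: the average of the member densities."""
    return sum(gmm_pdf(x, g) for g in models) / len(models)


# Usage on synthetic bimodal data (made up for illustration).
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(5.0, 1.0) for _ in range(100)]
models = fit_egmm(data)
print(egmm_pdf(0.0, models), egmm_pdf(-8.0, models))
```

Because the ensemble averages members with different numbers of components, no single choice of the component count has to be exactly right, which is one plausible reading of why parameter selection becomes less critical.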
The MFN and the Kalman filter for classifier fusion are probabilistic algorithms for temporal and multimodal late fusion which are robust against sensor failures. Within this thesis, the unidirectional layered architecture (ULA) and the bidirectional layered architecture (BLA) are proposed. Both architectures recognize complex classes based on probabilistic logical rules and the temporal combination of basic patterns. Each layer recognizes patterns based on the class predictions of the underlying layer. Hence, upper layers recognize more complex patterns. The BLA additionally propagates information in the direction of the lower layers. The empirical evaluation of the proposed methods is performed on datasets for affective state and activity recognition, e.g. the Freetalk dataset, AVEC 2011, AVEC 2012, AVEC 2013 and UUlmMAD. The EGMM proved to be more robust and accurate than the conventional GMM approaches. It was also shown that selecting suitable parameters is considerably easier for the EGMM. Further evaluations showed that the multimodal late fusion using the CHMM outperformed the HMM on the Freetalk dataset. The HMM-GPD was studied in the field of activity recognition and showed a good view-invariant performance. The classification was performed on sequences of graphs extracted from partially occluded skeleton models. The MFN and the Kalman filter for classifier fusion were studied on the AVEC datasets and achieved good results in comparison to other approaches. Furthermore, it was shown that they outperformed classic point-wise and windowed fusion approaches. A comprehensive study analyzing the ULA showed that the FCHMM is well-suited to recognize states on different layers given unsegmented sequential data. A dynamic Markov logic network implemented the probabilistic logical rules in the uppermost layer. The thesis further presents a new dataset which was recorded in order to study the BLA.
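How a Kalman filter can serve as a temporal late-fusion scheme that tolerates sensor failures can be sketched as follows. This is a simplified illustration, not the thesis implementation: it assumes a scalar latent class score following a random walk, treats each classifier output as a noisy measurement of that score, and simply skips the update step when an output is missing. The function name `kalman_fuse` and the parameters `q`, `r`, `x0`, and `p0` are hypothetical.

```python
def kalman_fuse(decisions, q=0.01, r=0.25, x0=0.5, p0=1.0):
    """Fuse classifier outputs over time with a scalar Kalman filter.

    decisions: one tuple per time step holding the outputs of all
               classifiers; None marks a missing output (sensor failure).
    q: assumed process noise, r: assumed measurement noise.
    Returns a list of (estimate, variance) pairs, one per time step.
    """
    x, p = x0, p0
    fused = []
    for outputs in decisions:
        # Predict: random-walk model, so only the uncertainty grows.
        p = p + q
        # Update: incorporate every available classifier output in turn.
        for y in outputs:
            if y is None:
                continue  # missing output: keep the prediction
            k = p / (p + r)          # Kalman gain
            x = x + k * (y - x)      # correct the estimate
            p = (1.0 - k) * p        # reduce the uncertainty
        fused.append((x, p))
    return fused


# Usage with made-up outputs of two classifiers; the third step has no data.
decisions = [(0.9, 0.8), (None, 0.85), (None, None), (0.2, None)]
for estimate, variance in kalman_fuse(decisions):
    print(round(estimate, 3), round(variance, 3))
```

When all outputs are missing at a time step, the estimate is carried over unchanged and only the variance increases, so later measurements are weighted more strongly once the sensors return.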
The development of a CTS brings new challenges to the recognition of the user’s state and environment. The presented work identifies important properties in this area and proposes and evaluates methods tailored to this application area.
Multiple criteria decision making
Multisensor data fusion
Free keywords: Ensemble GMM
Probabilistic graphical model
Markov fusion network
Kalman filter for classifier fusion
Unidirectional layered architecture
Bidirectional layered architecture
Inequality constraint multi-class F2-support vector machine
Graph probability density
Conditional hidden Markov model
Fuzzy conditional hidden Markov model
DDC subject group: DDC 000 / Computer science, information & general works