Text classification for spoken dialogue systems
Faculties: Fakultät für Ingenieurwissenschaften, Informatik und Psychologie
Institutions: Institut für Nachrichtentechnik
Institut für Künstliche Intelligenz
The main objective of this thesis is to apply and evaluate text classification approaches for speech-based utterance classification problems in the design of advanced spoken dialogue systems (SDSs). SDSs are speech-based human-machine interfaces that can be applied in various domains. A new generation of SDSs should be multi-domain and user-adaptive. Designing multi-domain, user-adaptive SDSs involves several utterance classification problems: domain detection of user utterances, and user state recognition, which includes recognition of the user's verbal intelligence and emotions. Text classification approaches can be applied to these problems. Text classification consists of the following stages: feature extraction, term weighting, dimensionality reduction, and machine learning.

The thesis has three aims:
1. To identify the best combinations of state-of-the-art text classification approaches for the considered utterance classification problems.
2. To improve utterance classification performance for SDSs.
3. To improve the computational performance of utterance classification for SDSs.

For the first aim, different term weighting methods (IDF, CW, GR, TM2, RF, TRR, and NTW), dimensionality reduction methods ("stop-word" filtering in combination with stemming, weight-based feature selection, and feature transformation based on term clustering), and machine learning algorithms (k-NN, SVM, and the Rocchio classifier) were evaluated on several datasets: two corpora for domain detection of user utterances, two corpora for verbal intelligence recognition, and one corpus for text-based user emotion recognition. The best combinations of text classification approaches were identified as follows:
- For domain detection of user utterances: k-NN + TRR or SVM + IDF with feature transformation based on term clustering.
- For user verbal intelligence recognition: k-NN + CW. The CW method provides a small number of useful terms that characterize only the class of higher verbal intelligence. Verbal intelligence appears to be easier to recognize in dialogues than in monologues.
- For text-based emotion recognition: k-NN + NTW without dimensionality reduction. Emotion recognition based on linguistic information does not reach the performance of audio-based and video-based emotion recognition.

Novel approaches were proposed and tested to achieve the second and third aims of the thesis:
- Collectives of different term weighting methods. This approach makes it possible to exploit the advantages of different term weighting methods. Collectives of term weighting methods based on majority voting can significantly improve the classification performance of utterance classification with the k-NN algorithm. These results can be significantly improved further by weighted voting, with the weights optimized by a self-adjusting genetic algorithm (GA).
- A novel feature transformation based on terms belonging to classes. It significantly reduces the dimensionality, which becomes equal to the number of classes. The novel feature transformation significantly improves the computational performance of utterance classification in terms of computation time. It is especially effective in combination with the collectives of term weighting methods: using the two novel approaches together can significantly improve both the classification results and the computational performance.
- A novel approach to artificial neural network (ANN) structure optimization. The novel approach has a simplified ANN structure representation, requires fewer computational resources, and has fewer tuning parameters than the baseline approach. Additionally, its results can be improved with wrapper-based feature selection, where the wrapper can be performed by the self-adjusting GA. The novel approach to ANN design significantly improves the classification results and the computational performance of utterance classification with ANNs.

Therefore, the novel approaches improve both the classification performance (the second aim of the thesis) and the computational performance (the third aim of the thesis) of utterance classification.
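The voting step behind the collectives of term weighting methods can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that one k-NN classifier has already been run per term weighting method and that each has produced a class label for the same utterance. The function names, the domain labels, and the hand-picked weights are hypothetical; in the thesis the voting weights are optimized by a self-adjusting GA, which is not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most collective members.

    `predictions` holds one label per term weighting method.
    Ties are broken in favour of the label that appears first in the list
    (an assumption made for this sketch).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    `weights` holds one non-negative weight per collective member.
    Here the weights are fixed by hand; optimizing them is out of scope.
    """
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Hypothetical predictions of three k-NN classifiers, each trained on
# features produced by a different term weighting method.
predictions = ["travel", "weather", "travel"]

majority_label = majority_vote(predictions)
weighted_label = weighted_vote(predictions, [0.2, 0.9, 0.3])
```

With equal weights the two functions agree; the weighted variant lets a member that is more reliable on the task outvote a majority of less reliable members.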
Subject Headings: Sprachdialogsystem [GND]
Machine learning [LCSH]
User interfaces (Computer systems) [LCSH]