
Novel methods for text preprocessing and classification

vts_9647_14616.pdf (2.219Mb)
219 pages
Publication date
2015-08-18
Authors
Gasanova, Tatiana
Dissertation


Faculties
Fakultät für Ingenieurwissenschaften und Informatik
Abstract
Written text is a form of communication that represents language (speech) using signs and symbols. For a given language, text depends on the same structures as speech (vocabulary, grammar, and semantics) and on a structured system of signs and symbols (a formal alphabet). Written text has always been an instrument for exchanging information, recording history, spreading knowledge, maintaining financial accounts, and forming legal systems. With the development of computers and the Internet, the amount of textual information in digital form has grown dramatically. There is an increasing need to process this information automatically for a variety of text processing tasks, such as information retrieval, machine translation, question answering, topic categorization and segmentation, and sentiment analysis. Many important text processing tasks fall into the field of text classification. This thesis addresses the development and evaluation of novel text preprocessing methods that combine supervised and unsupervised learning models to reduce the dimensionality of the feature space and improve classification performance. Metaheuristic approaches for generating Support Vector Machines and Artificial Neural Networks and optimizing their parameters are modified, applied to text classification, and compared with other state-of-the-art methods using different text representations.
Date created
2015
Subject headings
[GND]: Automatische Klassifikation
[LCSH]: Text processing (Computer science)
[Free subject headings]: Text classification | Text preprocessing
[DDC subject group]: DDC 000 / Computer science, information & general works
License
Standard
https://oparu.uni-ulm.de/xmlui/license_v3

DOI & citation

Please use this identifier to cite or link to this item: http://dx.doi.org/10.18725/OPARU-3242

Gasanova, Tatiana (2015): Novel methods for text preprocessing and classification. Open Access Repositorium der Universität Ulm und Technischen Hochschule Ulm. Dissertation. http://dx.doi.org/10.18725/OPARU-3242