Hierarchical probabilistic graphical models for image recognition
Faculties: Fakultät für Ingenieurwissenschaften und Informatik
The motivation for this thesis was a practical one: I was looking for a generic framework for image understanding with a strong focus on object recognition. Here, understanding means extracting, without supervision, some of the inherent structure that natural images exhibit, such as repeating local structure and the composition of natural images from distinct objects. Even though object recognition may seem like a single well-defined task, actual applications reveal a widely differing range of requirements. Objects may always appear in one spatial configuration (scale and orientation, possibly in 3D) or in a wide variety of perspectives. The background may be uniform, or it may be complex and contain many other objects (often referred to as clutter), and, last but not least, occlusion can make the task even more difficult. Additionally, the training information may consist of fully labeled images or of target object examples only, possibly at a priori unknown positions. Faced with this multitude of problem details on the one hand and the ease with which humans perform this task on the other, I turned to neurophysiology in search of principles of algorithm construction for solving these problems. At the same time, I relied on the strong mathematical foundation of Bayesian statistics, because it provides a very useful framework for algorithm construction, many tools for using and analyzing those algorithmic constructs, and a common language and embedding for these algorithms. With these tools in hand, I constructed a set of neural networks, trained them with different levels of autonomy and invariance to the changes that appear in real-world images, and present their competitive performance and applicability to real-world problems.
Subject Headings: Bilderkennung [GND]
Neural networks (Computer science) [LCSH]