The development of image recognition systems is a complex and highly specialized process that requires expert knowledge of image recognition and machine learning. Although image recognition systems hold great potential for everyday applications, their adoption is currently limited by the lack of appropriate development tools. The purpose of this project is the development of a software framework for users without expert knowledge in computer vision and machine learning. Building such a framework requires adapting the standard development process in computer vision to the needs of non-expert users. Specifically, the framework developed and presented in this work, called FOREST (Flexible Object REcognition SysTem),
- highly automates the development process
- simplifies the development of non-automatable components, and
- provides intuitive user interfaces that require no training or prior knowledge.
In contrast to existing development tools, FOREST is not tied to one specialized development process; instead, it lets users adapt its generic recognition functionality to the intended recognition task. FOREST requires only an image data source and annotations for that image data to learn a classifier for the task at hand, e.g., recognizing windows that were left open.
FOREST implements its flexible recognition functionality by providing a large set of image region detection and feature description algorithms. A boosting classifier then selects the image features that are discriminative for the intended recognition task.
The efficient annotation of images is essential for the successful deployment of such a framework. Therefore, efficient annotation techniques were investigated and a semi-automatic annotation process was developed: an image data set is clustered by similarity and presented to the user, who can annotate entire clusters of images in one go. The clustering can be recalculated interactively using an adapted similarity metric derived from the partial annotations the user has already provided.
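The cluster-then-annotate loop can be sketched as below. This is a simplified proxy, assuming images are already reduced to feature vectors: KMeans stands in for whatever similarity-based clustering the framework uses, and a per-dimension reweighting stands in for the adapted similarity metric; the labels and index choices are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Two synthetic "visual" groups of images (e.g. open vs. closed windows),
# 30 images each, described by 8-dimensional feature vectors.
feats = np.vstack([rng.normal(0, 0.3, (30, 8)),
                   rng.normal(2, 0.3, (30, 8))])

km = KMeans(n_clusters=2, n_init=10, random_state=1).fit(feats)

# The user inspects each cluster and labels it as a whole, so one click
# annotates many images in one go (label strings are illustrative).
user_labels = {0: "closed", 1: "open"}
annotations = [user_labels[c] for c in km.labels_]

# Interactive refinement: weight feature dimensions by how well they
# separate the partially annotated classes, then recluster under this
# adapted metric (here: 5 user-annotated images per class).
labeled = np.array([0] * 5 + [1] * 5)
idx = np.r_[0:5, 30:35]
sep = np.abs(feats[idx[labeled == 0]].mean(0)
             - feats[idx[labeled == 1]].mean(0))
weights = sep / sep.sum()
km2 = KMeans(n_clusters=2, n_init=10, random_state=1).fit(feats * weights)
```

The reweighting emphasizes dimensions that distinguish the annotated examples, so the recomputed clusters tend to align better with the user's intended categories.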