PhD Thesis Proposal - Multi-layer Shape Feature Representation and Object Localization Using CNN based Search Evaluation for Augmented Reality

Who: Elham Etemad

Title: Multi-layer Shape Feature Representation and Object Localization Using CNN based Search Evaluation for Augmented Reality

Examining Committee:

Dr. Qigang Gao - Faculty of Computer Science (Supervisor)
Dr. Evangelos E. Milios - Faculty of Computer Science (Reader)
Dr. Derek Reilly - Faculty of Computer Science (Reader)
Dr. Vlado Keselj - Faculty of Computer Science (External Examiner)

 

Chair: Dr. Qigang Gao - Faculty of Computer Science

 

Abstract:

The wide availability of visual data creates a need for agents that can process and understand visual content and thereby make people’s lives easier. The most essential technique for building such agents is object recognition, which in turn relies on image representation. With the rapid evolution of mobile, vision-based augmented reality applications, there is high demand for efficient feature representation and object recognition techniques.

In this research, we propose an image representation method based on perceptual shape features and their spatial distributions. The N-gram concept from natural language processing is adopted to generate a set of shape-based visual words for encoding image features. A multi-layer representation method, called the Spatio-Shape Pyramid (SSP), is proposed by combining hierarchical visual words with a spatial pyramid. The strength of SSP is evaluated in comparison with other state-of-the-art methods. We also propose an object localization module to boost the performance of current object detection techniques. This module uses an image’s edge information as a cue to determine the locations of objects in the image. The Generic Edge Tokens (GETs) of the image are extracted based on the perceptual organization model of human vision. These edge tokens are parsed using a Best First Search (BFS) strategy to find optimal object locations, where the objective function is the detection score returned by a deep Convolutional Neural Network (CNN). We have evaluated our method with the Region-based CNN (RCNN) object detection method on benchmark datasets from the object recognition domain, and the experiments show promising results.
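As a rough illustration of the best-first search idea mentioned in the abstract, the sketch below always expands the highest-scoring candidate region first. It is not the proposal's implementation: the candidate moves (shifting one box edge at a time), the names `best_first_search`, `neighbors`, `toy_score` and `TARGET`, and the toy score (overlap with a known box, standing in for a CNN detection score) are all assumptions made for this example. The proposed method instead parses Generic Edge Tokens to generate candidates.

```python
import heapq

def best_first_search(start, neighbors, score, max_expansions=1000):
    """Expand the highest-scoring candidate first; return the best one seen."""
    best, best_score = start, score(start)
    frontier = [(-best_score, start)]  # heapq is a min-heap, so negate scores
    visited = {start}
    for _ in range(max_expansions):
        if not frontier:
            break
        _, current = heapq.heappop(frontier)
        for cand in neighbors(current):
            if cand in visited:
                continue
            visited.add(cand)
            s = score(cand)
            if s > best_score:
                best, best_score = cand, s
            heapq.heappush(frontier, (-s, cand))
    return best, best_score

# Hypothetical ground-truth box (x0, y0, x1, y1), used only by the toy score.
TARGET = (20, 30, 60, 80)

def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])

def toy_score(box):
    """Intersection-over-union with TARGET; a stand-in for a CNN detection score."""
    iw = max(0, min(box[2], TARGET[2]) - max(box[0], TARGET[0]))
    ih = max(0, min(box[3], TARGET[3]) - max(box[1], TARGET[1]))
    inter = iw * ih
    return inter / (area(box) + area(TARGET) - inter)

def neighbors(box, step=10, size=100):
    """Candidate boxes obtained by moving one edge by `step` pixels (assumed move set)."""
    for i in range(4):
        for delta in (step, -step):
            cand = list(box)
            cand[i] += delta
            x0, y0, x1, y1 = cand
            if 0 <= x0 < x1 <= size and 0 <= y0 < y1 <= size:
                yield (x0, y0, x1, y1)

# Start from the whole 100x100 image and let the score guide the search.
found_box, found_score = best_first_search((0, 0, 100, 100), neighbors, toy_score)
print(found_box, found_score)
```

In a real detector, the objective would be the CNN's detection score for each candidate region, which is far more expensive to evaluate than the toy score; that cost is the reason to bound the number of expansions.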

An extended investigation will be conducted to integrate both global (GET) and local (SIFT) features with the feature representation learned by a deep CNN. The proposed method is expected to improve Augmented Reality applications that require robust object recognition and localization.

Time

Location

Room 211, Goldberg Computer Science Building