Arthur Daniel Costea, Robert Varga and Sergiu Nedevschi
Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 17), Honolulu, HI, USA, 21-26 July 2017, pp. 993-1002
In this paper, we propose a novel boosting-based sliding window solution for object detection which can keep up with the precision of state-of-the-art deep learning approaches, while being 10 to 100 times faster. The solution takes advantage of multisensorial perception and exploits information from color, motion and depth. We introduce multimodal multiresolution filtering of signal intensity, gradient magnitude and orientation channels, in order to capture structure at multiple scales and orientations. To achieve scale-invariant classification features, we analyze the effect of scale change on features for different filter types and propose a correction scheme. To improve recognition, we incorporate 2D and 3D context by generating spatial, geometric and symmetrical channels. Finally, we evaluate the proposed solution on multiple benchmarks for the detection of pedestrians, cars and bicyclists. We achieve competitive results at over 25 frames per second.
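As a rough illustration of what filtering intensity, gradient magnitude and orientation channels at several resolutions can look like, the following is a minimal sketch and not the authors' implementation. It assumes a single grayscale image, unsigned orientation binning, and plain box (mean) filters standing in for the paper's filter bank; the function names `gradient_channels`, `box_filter` and `multires_filter`, the number of orientation bins, and the window sizes are all hypothetical choices made for this example.

```python
import numpy as np

def gradient_channels(gray, n_orient=6):
    """Stack intensity, gradient magnitude and n_orient orientation channels."""
    gray = gray.astype(np.float64)
    gy, gx = np.gradient(gray)
    mag = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, pi), then quantized into bins.
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((ang / np.pi * n_orient).astype(int), n_orient - 1)
    # Each orientation channel holds the magnitude of pixels falling in its bin.
    orient = np.stack([mag * (bins == k) for k in range(n_orient)])
    return np.concatenate([gray[None], mag[None], orient])

def box_filter(channel, size):
    """Mean filter over a size x size window (odd size), using an integral image."""
    pad = size // 2
    padded = np.pad(channel, pad, mode="edge")
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = channel.shape
    total = (integral[size:size + h, size:size + w]
             - integral[:h, size:size + w]
             - integral[size:size + h, :w]
             + integral[:h, :w])
    return total / (size * size)

def multires_filter(channels, sizes=(3, 5, 9)):
    """Filter every channel at every window size (assumed sizes, all odd)."""
    return np.stack([box_filter(c, s) for s in sizes for c in channels])
```

Calling `multires_filter(gradient_channels(image))` on an H x W image yields `len(sizes) * (2 + n_orient)` feature maps of the same size, from which a boosted classifier could draw per-pixel features. Box filters are used here only because they are cheap to evaluate with integral images; the scale correction scheme and the motion, depth and context channels mentioned in the abstract are not covered by this sketch.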