Most existing approaches to autonomous driving fall into one of two categories: modular pipelines that build an extensive model of the environment, and imitation learning approaches that map images directly to control outputs. A recently proposed third paradigm, direct perception, aims to combine the advantages of both by using a neural network to learn appropriate low-dimensional intermediate representations. However, existing direct perception approaches are restricted to simple highway situations, lacking the ability to navigate intersections, stop at traffic lights or respect speed limits. In this talk, I will first give a brief overview of several perception modules that we have developed for extracting 2D and 3D representations for autonomous driving. Next, I will question the necessity of these modules and propose an alternative perception approach which maps video input directly to low-dimensional affordance indicators that are suitable for autonomous navigation in complex urban environments given high-level directional commands. I will demonstrate that the proposed direct approach compares favorably to two recently proposed baselines, both in terms of goal-directed navigation and in terms of obeying traffic rules and avoiding accidents. Finally, I will give a brief personal outlook on open research topics at the intersection of perception and self-driving cars.
Andreas Geiger is a full professor at the University of Tübingen and a group leader at the Max Planck Institute for Intelligent Systems. Prior to this, he was a visiting professor at ETH Zürich and a research scientist in the Perceiving Systems department of Dr. Michael Black at the MPI-IS. He studied at KIT, EPFL and MIT and received his PhD degree in 2013 from the Karlsruhe Institute of Technology. His research interests are at the intersection of 3D reconstruction, 3D motion estimation and visual scene understanding, with a particular focus on integrating rich prior knowledge and deep learning to improve perception in intelligent systems. In 2012, he published the KITTI vision benchmark suite, which has become one of the most influential testbeds for evaluating stereo, optical flow, scene flow, detection, tracking, motion estimation and segmentation algorithms. His work on stereo reconstruction and optical flow estimation is ranked amongst the top-performing methods in several international competitions. His work has been recognized with several prizes, including the IEEE PAMI Young Investigator Award, the Heinz Maier-Leibnitz Prize of the German Science Foundation (DFG), the German Pattern Recognition Award, the Ernst-Schoemperlen Award and the KIT Doctoral Award. In 2013, he received the CVPR best paper runner-up award for his work on probabilistic visual self-localization. He also received the best paper award at GCPR 2015 and 3DV 2015, as well as the best student paper award at 3DV 2017. He is an associate member of the Max Planck ETH Center for Learning Systems and the International Max Planck Research School for Intelligent Systems, and serves as an area chair and associate editor for several computer vision conferences and journals (CVPR, ICCV, ECCV, PAMI, IJCV).