MIA: Georgia Gkioxari, Visual recognition from one or more images (2021)

Models, Inference and Algorithms
Broad Institute of MIT and Harvard
September 22, 2021

Georgia Gkioxari
Facebook AI

Visual recognition from one or more images

Building machines that see and recognize from visual inputs is not easy. In the last decade, there have been tremendous efforts toward this goal and new ways have emerged that process image inputs, either in the form of single inputs or as streams, known as videos. These advances have allowed us to build systems which can recognize objects in 2D and 3D, identify their class and their pose or track them in time. While not perfect, these models work robustly enough and are now part of our everyday life, including in our homes via smart home devices and in our daily activities via our smart phones. In this talk, I will give an overview of state-of-the-art visual recognition models I’ve worked on, covering a range of visual tasks, such as object detection, human-object interaction, human pose tracking and 3D object understanding
