Babies are known to acquire visual "common sense" concepts, such as object permanence, gravity, and intuitive physics, at a young age. For example, infants play with toy blocks, which helps them build intuition about the physical behavior of the world. While deep neural networks have exhibited state-of-the-art performance on many computer vision tasks, more complex reasoning (e.g., 'what will happen next in this scene?') requires an understanding of how the physical world behaves. We explore the ability of deep feedforward models to learn such intuitive physics. Using a 3D game engine, we create small towers of wooden blocks and train large convolutional network models to accurately predict their stability and estimate block trajectories. The models are able to generalize to new physical scenarios and to images of real blocks.
Adam is a research engineer at Facebook AI Research, where he has worked on distributed neural network training, computer vision, visual common sense, and graph embeddings. Prior to joining Facebook, Adam worked at D. E. Shaw Research, where he developed software and algorithms for Anton, a special-purpose supercomputer for molecular dynamics simulation. Adam holds a B.Sc. in computer science and physics and an M.Eng. in computer science from MIT.