There are many billions of images in Dropbox. About 15% of them are photos of documents -- receipts, business cards, contracts, etc. -- with text content that is hidden from our search index. To allow our users to search for these "documents," Dropbox built an OCR system. In this talk I'll describe Dropbox's OCR project from its initial, small-scale deployment of a third-party library through the development and large-scale deployment of a homegrown solution using deep networks, focusing on the performance and scaling problems we encountered and solved along the way.
Thomas Berg is a Machine Learning Engineer at Dropbox, where he's worked on image classification, OCR, and user activity prediction. He has a PhD from Columbia University, where he worked on face recognition and fine-grained image classification in Peter Belhumeur's lab.