Training neural network models and deploying them in production poses a unique set of computing challenges. The ability to train large models quickly allows researchers to explore the model landscape rapidly and push the boundaries of what is possible with deep learning. However, a single training run often consumes several exaflops of compute and can take a month or more to finish. Similarly, some problem areas, such as speech synthesis and recognition, have real-time requirements, which limit how long a model can take to evaluate in production. In this presentation, I will talk about three systems challenges that need to be addressed so that we can continue to train and deploy rich neural network models.
I am an architect of the high-performance-computing-inspired training platform at SVAIL, which is used to train some of the largest recurrent neural network models in the world. I also spend a large part of my time exploring models for both speech recognition and speech synthesis, and working out what it would take to train these models at scale and deploy them to hundreds of millions of our users. I am the primary author of the WarpCTC project, which is commonly used for speech recognition. Before joining industry, I got my PhD in Computer Science from UC Davis, focusing on parallel algorithms for GPU computing, and subsequently went to Stanford for a Master's in Financial Math.