Michael I. Jordan
Title: On the Computational and Statistical Interface and "Big Data"
University of California, Berkeley
Abstract: The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the statistical and computational sciences.  That classical perspectives from these fields are not adequate to address emerging problems in "Big Data" is apparent from their sharply divergent nature at an elementary level---in computer science, the growth of the number of data points is a source of "complexity" that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of "simplicity" in that inferences are generally stronger and asymptotic results can be invoked.  Indeed, if data are a statistician's principal resource, why should more data be burdensome in some sense?  Shouldn't it be possible to exploit the increasing inferential strength of data at scale to keep computational complexity at bay?  I present three research vignettes that pursue this theme, the first involving the deployment of resampling methods such as the bootstrap on parallel and distributed computing platforms, the second involving large-scale matrix completion, and the third introducing a methodology of "algorithmic weakening," whereby hierarchies of convex relaxations are used to control statistical risk as data accrue.
Joint work with Venkat Chandrasekaran, Ariel Kleiner, Lester Mackey, Purna Sarkar, and Ameet Talwalkar.
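To make the resampling theme concrete, here is a minimal sketch of a percentile bootstrap in plain Python; it is illustrative only (the function name `bootstrap_ci` and all parameters are assumptions, not the authors' actual distributed implementation):

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Draw n_boot resamples with replacement and evaluate the statistic on each.
    estimates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# In a parallel or distributed setting, each worker could run bootstrap_ci
# on its own partition of the data and the per-partition results could be
# combined -- the flavor of "bag of little bootstraps"-style procedures.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(500)]
lo, hi = bootstrap_ci(data, n_boot=200)
```

The resample loop is embarrassingly parallel, which is what makes this family of methods attractive on distributed platforms.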

Trevor Hastie
Title: Sparse Linear Models
Stanford University

Abstract: In a statistical world faced with an explosion of data, regularization has become an important ingredient. In many problems we have many more variables than observations, and the lasso penalty and its hybrids have become increasingly useful. This talk presents a general framework for fitting large-scale regularization paths for a variety of problems. We describe the approach and demonstrate it via examples using our R package GLMNET. We then outline a series of related problems that can be addressed with extensions of these ideas.
Joint work with Jerome Friedman, Rob Tibshirani and Noah Simon.
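A regularization path can be sketched with a few lines of numpy; the following is a hedged illustration (it uses proximal gradient/ISTA with warm starts for clarity, whereas GLMNET itself uses coordinate descent, and the names `lasso_ista` and `lasso_path` are made up for this sketch):

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, b0=None, n_iter=500):
    """Proximal gradient (ISTA) for min_b (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p) if b0 is None else b0.copy()
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b

def lasso_path(X, y, lams):
    """Solve along a decreasing penalty grid, warm-starting each fit from the
    previous solution -- the general shape of path algorithms."""
    b, path = None, []
    for lam in sorted(lams, reverse=True):
        b = lasso_ista(X, y, lam, b0=b)
        path.append((lam, b))
    return path

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
beta = np.zeros(10)
beta[0], beta[1] = 1.0, -1.0
y = X @ beta + 0.1 * rng.standard_normal(50)
lam_max = np.max(np.abs(X.T @ y)) / len(y)  # smallest penalty with all-zero fit
path = lasso_path(X, y, [lam_max, 0.1 * lam_max, 0.01 * lam_max])
```

Warm starts along a decreasing grid are what make whole-path fitting cheap relative to solving each penalty level from scratch.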

Yoram Singer
Title: BOOM: BOOsting with Momentum
Google Research

Abstract: In this talk, we review one of the largest machine learning platforms at Google, called Sibyl. Sibyl can handle over 100B examples in 100B dimensions so long as each example is very sparse. The recent version of Sibyl fuses Nesterov's accelerated gradient method with parallel boosting. The result is an algorithm that retains the momentum and convergence properties of the accelerated gradient method while taking into account the curvature of the objective function. The algorithm, termed BOOM, is fast to converge, supports any smooth convex loss function, and is easy to parallelize. We conclude with a few examples of problems at Google that the system handles.
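The momentum ingredient can be shown in isolation; the following is a sketch of Nesterov's accelerated gradient method on a toy quadratic (this illustrates the acceleration scheme only, not the BOOM algorithm or its fusion with parallel boosting, and `nesterov_agd` is an invented name):

```python
import numpy as np

def nesterov_agd(grad, x0, step, n_iter=200):
    """Nesterov's accelerated gradient method: each gradient step is taken at
    an extrapolated "momentum" point rather than at the current iterate."""
    x = np.asarray(x0, dtype=float)
    x_prev = x.copy()
    for k in range(1, n_iter + 1):
        v = x + ((k - 1) / (k + 2)) * (x - x_prev)  # look-ahead (momentum) point
        x_prev, x = x, v - step * grad(v)           # gradient step taken at v
    return x

# Toy objective: f(x) = 0.5 x^T A x - b^T x, whose gradient is A x - b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_hat = nesterov_agd(lambda x: A @ x - b, np.zeros(2), step=1.0 / 4.0)
```

The step size must be at most the reciprocal of the gradient's Lipschitz constant (here the largest eigenvalue of A, about 3.62, so 1/4 is safe); the momentum term is what upgrades the O(1/k) rate of plain gradient descent to O(1/k^2).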