Talk: “Differential Privacy and Applications in Adaptive Data Analysis”

Toniann Pitassi, Professor at the Department of Computer Science, University of Toronto
December 16, 2015 - 12:00
Auditorio Ramón Picarte, DCC, 3rd Floor North.
Jocelyn Simmonds

Abstract: Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. Existing approaches to ensuring the validity of inferences drawn from data assume a fixed procedure to be performed, selected before the data are examined. In common practice, however, data analysis is an intrinsically adaptive process, with new analyses generated on the basis of data exploration, as well as the results of previous analyses on the same data. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis. As an application, we show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses.

In this talk we will discuss the issue of false discovery in statistical data analysis, and then give a brief introduction to differential privacy. We will show how techniques from differential privacy can be used to prevent spurious discoveries in statistical data analysis.
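To make the reusable-holdout idea concrete, here is a minimal Python sketch in the spirit of the thresholding approach the abstract alludes to. It is a simplified illustration under stated assumptions, not the speaker's implementation: the function names, the `threshold` and `sigma` defaults, and the use of Laplace noise are all illustrative choices, and the sketch leaves out details of the published algorithm such as a budget on how often the holdout may be consulted. The analyst only sees the holdout estimate (with noise) when it disagrees noticeably with the training estimate; otherwise the training estimate is returned and the holdout reveals nothing new.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """Answer one query, given its per-sample values on each data set.

    train_vals / holdout_vals: values of the analyst's query function
    evaluated on every training / holdout sample (hypothetical interface).
    threshold, sigma: illustrative parameters, not taken from the source.
    """
    rng = rng or random.Random()
    train_mean = sum(train_vals) / len(train_vals)
    holdout_mean = sum(holdout_vals) / len(holdout_vals)

    # If the two estimates agree up to a noisy threshold, answer from the
    # training set alone, so the holdout leaks no further information.
    if abs(train_mean - holdout_mean) < threshold + laplace(sigma, rng):
        return train_mean

    # Otherwise the training estimate looks overfit: answer from the
    # holdout, perturbed with fresh noise.
    return holdout_mean + laplace(sigma, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    train = [rng.random() for _ in range(1000)]
    holdout = [rng.random() for _ in range(1000)]
    print(thresholdout(train, holdout, rng=rng))
```

The noise added to both the comparison and the returned holdout value is what connects this mechanism to differential privacy: each answer depends only weakly on any single holdout sample, which limits how much an adaptive sequence of queries can overfit to the holdout.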

This talk is based on a recent paper in Science, entitled "The Reusable Holdout: Preserving Validity in Adaptive Data Analysis", and is joint work with Dwork, Feldman, Hardt, Reingold and Roth.