TPDP 2025 – Theory and Practice of Differential Privacy

Workshop Information

TPDP 2025 will take place on June 2 and 3 at Google in Mountain View, CA.

Venue Information: Click here for venue information, including exact location of the workshop, parking information, and check-in information.

Logistics: The workshop will be held at Google Mountain View. Nearby airports include San Francisco International Airport (SFO) and San Jose International Airport (SJC). Nearby hotels include the Ameswell Hotel, Hyatt Centric Mountain View, Shashi Hotel Mountain View Palo Alto, and Aloft Mountain View.

Registration: Registration is closed.

DC-Area Watch Party: Christine Task at Knexus Research is hosting a DC-area Watch Party for those that can't travel to CA, but would still like to meet in person to watch presentations and discuss research. The Watch Party will take place at Knexus Research (1951 Kidwell Dr, Vienna VA 22182). It will run 11:30am - 6pm EDT on June 2, and 11:30am - 3pm EDT on June 3rd. Lunch is brown bag (bring your own) with snacks and coffee/tea provided, and a group dinner will be organized if there is sufficient interest. Please register here for the DC-area watch party by May 23. Registration for the DC-Area Watch Party is separate from TPDP registration (but also free) - please fill out both registration forms if you plan to attend. Contact Christine Task with DC-specific questions.

Program

All times are Pacific Daylight Time

Monday, June 2

8:00-9:00		Breakfast
9:00-9:05		Welcome
9:05-9:50		Keynote #1 Practical differentially private statistical estimation should ideally combine strong error guarantees, computational efficiency, robustness, and minimal reliance on user-specified assumptions. In this talk, I will highlight recent progress toward these goals, through the fundamental problem of mean estimation. In the first part, I will present differentially private algorithms for high-dimensional mean estimation whose error optimally adapts to the effective dimensionality of the distribution. These estimators can achieve dimension-free error whenever possible—for instance, for distributions that are concentrated on a small number of principal components—overcoming a limitation of prior methods, which suffer from a curse of dimensionality and require sample sizes that scale with the ambient dimension even in these favorable regimes. In the second part, I will discuss recent results that uncover both limitations and new directions for designing computationally efficient estimators with affine-invariant error guarantees and robustness properties. This talk is primarily based on joint work with Gavin Brown, Yuval Dagan, Michael Jordan, Xuelin Yang, and Nikita Zhivotovskiy. Lydia Zakynthinou's website
9:50-10:35		Contributed Talks: Session #1 We present a principled, per-instance approach to quantifying the difficulty of unlearning via fine-tuning. We begin by sharpening an analysis of noisy gradient descent for unlearning, obtaining a better utility--unlearning tradeoff by replacing worst-case privacy loss bounds with per-instance privacy losses, each of which bounds the (Renyi) divergence to retraining without an individual data point. To demonstrate the practical applicability of our theory, we present empirical results showing that our theoretical predictions are born out both for Stochastic Gradient Langevin Dynamics (SGLD) as well as for standard fine-tuning without explicit noise. We further demonstrate that per-instance privacy losses correlate well with several existing data difficulty metrics, while also identifying harder groups of data points, and introduce novel evaluation methods based on loss barriers. All together, our findings provide a foundation for more efficient and adaptive unlearning strategies tailored to the unique properties of individual data points. We consider the privacy privacy amplification properties of a sampling scheme in which a user's data is used in $k$ steps chosen randomly and uniformly from a sequence (or set) of $t$ steps. This sampling scheme has been recently applied in the context of differentially private optimization [Chua et al., 2024a, Choquette-Choo et al., 2024] and is also motivated by communication-efficient high-dimensional private aggregation [Asi et al., 2025]. Existing analyses of this scheme either rely on privacy amplification by shuffling which leads to overly conservative bounds or require Monte Carlo simulations that are computationally prohibitive in most practical scenarios. We study the problem of reconstructing tabular data from aggregate statistics, in which the attacker aims to identify interesting claims about the sensitive data that can be verified with 100% certainty given the aggregates. Successful attempts in prior work have conducted studies in settings where the set of published statistics is rich enough that entire datasets can be reconstructed with certainty. In our work, we instead focus on the regime where many possible datasets match the published statistics, making it impossible to reconstruct the entire private dataset perfectly (i.e., when approaches in prior work fail). We propose the problem of partial data reconstruction, in which the goal of the adversary is to instead output a subset of rows and/or columns that are guaranteed to be correct. We introduce a novel integer programming approach that first generates a set of claims and then verifies whether each claim holds for all possible datasets consistent with the published aggregates. We evaluate our approach on the housing-level microdata from the U.S. Decennial Census release, demonstrating that privacy violations can still persist even when information published about such data is relatively sparse.
10:35-11:00		Break
11:00-12:30		Poster Session #1
12:30-1:30		Lunch (provided)
1:30-2:30		Panel Discussion: Using DP Data Panelists: Jörg Drechsler (Institute for Employment Research), Stefano Iacus (Harvard), Harikesh Nair (Google)
2:30-2:50		Break
2:50-3:35		Contributed Talks: Session #2 We propose a simple heuristic privacy analysis of noisy clipped stochastic gradient descent (DP-SGD) in the setting where only the last iterate is released and the intermediate iterates remain hidden. Namely, our heuristic assumes a linear structure for the model. We show experimentally that our heuristic is predictive of the outcome of privacy auditing applied to various training procedures. Thus it can be used prior to training as a rough estimate of the final privacy leakage. We also probe the limitations of our heuristic by providing some artificial counterexamples where it underestimates the privacy leakage. The standard composition-based privacy analysis of DP-SGD effectively assumes that the adversary has access to all intermediate iterates, which is often unrealistic. However, this analysis remains the state of the art in practice. While our heuristic does not replace a rigorous privacy analysis, it illustrates the large gap between the best theoretical upper bounds and the privacy auditing lower bounds and sets a target for further work to improve the theoretical privacy analyses. We also empirically support our heuristic and show existing privacy auditing attacks are bounded by our heuristic analysis in both vision and language tasks. Recent research demonstrated that training large language models involves memorization of a significant fraction of training data. Such memorization can lead to privacy violations when training on sensitive user data and thus motivates the study of data memorization's role in learning. In this work, we demonstrate that several simple and well-studied binary classification problems exhibit a trade-off between the number of samples available to a learning algorithm and the amount of information about the training data that a learning algorithm needs to memorize to be accurate. In particular, $\Omega(d)$ bits of information about the training data need to be memorized when a single $d$-dimensional example is available, which then decays as $\Theta(d/n)$ as the number of examples grows (for $n\leq \sqrt{d}$). Further, this rate is achieved (up to logarithmic factors) by simple learning algorithms. Our results build on the work of Brown et al. (2021) and establish a new framework for proving memorization lower bounds that is based on an approximate version of strong data processing inequalities. We study differentially private algorithms for analyzing graphs in the challenging setting of continual release with fully dynamic updates, where edges are inserted and deleted over time, and the algorithm is required to update the solution at every time step. Previous work has presented differentially private algorithms for many graph problems that can handle insertions only or deletions only (called partially dynamic algorithms) and obtained some hardness results for the fully dynamic setting. The only algorithms in the latter setting were for the edge count, given by Fichtenberger, Henzinger, and Ost (ESA 21), and for releasing the values of all graph cuts, given by Fichtenberger, Henzinger, and Upadhyay (ICML 23). We provide the first differentially private and fully dynamic graph algorithms for several other fundamental graph statistics (including the triangle count, the number of connected components, the size of the maximum matching, and the degree histogram), analyze their error and show strong lower bounds on the error for all algorithms in this setting. We study two variants of edge differential privacy for fully dynamic graph algorithms: event-level and item-level. We give upper and lower bounds on the error of both event-level and item-level fully dynamic algorithms for several fundamental graph problems. No fully dynamic algorithms that are private at the item-level (the more stringent of the two notions) were known before. In the case of item-level privacy, for several problems, our algorithms match our lower bounds.
3:40-5:10		Poster Session #2
5:20-7:00		Reception

Tuesday, June 3

8:00-9:00		Breakfast
9:00-9:50		Keynote #2 In this talk I try to stitch together some recent work - some my own and some others’ - to probe practical, important, and intertwined questions: How can we provide and evaluate differentially private (DP) synthetic data to address epistemic concerns of practitioners? Are we targeting the problems practitioners face every day - say, DP under extreme class imbalance, for e.g., EHRs and fraud detection - or merely the ones that make tidy papers? With the advent of LLM supremacy, how do we want to treat its output (Is it public? Is it useful under DP?) Do we understand the right LLM threat models, and do DP defenses make sense? I will trace these questions from theory to practice as best I can, and frame open challenges that I hope we can address as a community. Lucas Rosenblatt's website
9:50-10:35		Contributed Talks: Session #3 We initiate the study of differentially private learning in the proportional dimensionality regime, in which the number of data samples n and problem dimension d approach infinity at rates proportional to one another, meaning that d/n-->delta as n-->infinity for an arbitrary, given constant 0<$delta$<$infinity$. This setting is significantly more challenging than that of all prior theoretical work in high-dimensional differentially private learning, which, despite the name, has assumed that delta=0 or is sufficiently small for problems of sample complexity O(d), a regime typically considered “low-dimensional” or “classical” by modern standards in high-dimensional statistics. This paper studies the problem of differentially private empirical risk minimization (DP-ERM) for binary linear classification. We obtain an efficient $(\varepsilon,\delta)$-DP algorithm with an empirical zero-one risk bound of $\tilde{O}\left(\frac{1}{\gamma^2\varepsilon n} + \frac{\|S_{\mathrm{out}}\|}{\gamma n}\right)$ where $n$ is the number of data points, $S_{\mathrm{out}}$ is an arbitrary subset of data one can remove and $\gamma$ is the margin of linear separation of the remaining data points (after $S_{\mathrm{out}}$ is removed). Here, $\tilde{O}(\cdot)$ hides only logarithmic terms. In the agnostic case, we improve the existing results when the number of outliers is small. Our algorithm is highly adaptive because it does not require knowing the margin parameter $\gamma$ or outlier subset $S_{\mathrm{out}}$. We initiate a study of algorithms for model training with user-level differential privacy (DP), where each example may be attributed to multiple users, which we call the multi-attribution model. We first provide a carefully chosen definition of user-level DP under the multi-attribution model. Training in the multi-attribution model is facilitated by solving the contribution bounding problem, i.e. the problem of selecting a subset of the dataset for which each user is associated with a limited number of examples. We propose a greedy baseline algorithm for the contribution bounding problem. We then empirically study this algorithm for a synthetic logistic regression task and a transformer training task, including studying variants of this baseline algorithm that optimize the subset chosen using different techniques and criteria. We find that the baseline algorithm remains competitive with its variants in most settings, and build a better understanding of the practical importance of a bias-variance tradeoff inherent in solutions to the contribution bounding problem.
10:35-11:00		Break
11:00-12:30		Panel Discussion: DP Law and Policy Panelists: Ryan Steed (Princeton), Mayana Pereira (Capital One), Nitin Kohli (UC Berkeley), Alex Wood (Harvard)
12:30-1:30		Lunch (provided)
1:30-2:30		Contributed Talks: Session #4 Differential Privacy has become the gold standard for quantifying privacy in data analysis and machine learning algorithms. Thanks to nearly two decades of research, there are now multiple competing notions of differential privacy—(ε,δ)-DP, Rényi DP, and Privacy Loss Distribution (PLD) tail—each with its own advantages and limitations. In this talk, I will describe how all these notions can be viewed as primal and dual formulations of each other through Laplace transforms. Notably, the Laplace transform expressions of DP provides an elegant framework for reasoning about DP and its properties. To showcase this perspective I will introduce a new composition theorem for (ε,δ)-DP that is tight—even in the constants—and improves upon the result of Kairouz et al. (2015), which was previously considered optimal. We further explore applications of the fingerprinting method in private and adaptive data analysis. By combining the exponential family with a proper version of Stoke's theorem, we develop a simple and versatile fingerprinting framework, and prove several new results: First, we show a sample complexity lower bound of $\widetilde{Omega(\log(Q) \sqrt(\log(N)) / \alpha^3)$ for adaptive data analysis on linear queries, assuming the algorithm is both distrbutional and emprically accurate. Up to the ``emprically accurate'' caveat, this is nearly tight and matches the upper bound from [BNSSSU'16]. Secondly, we characterize the sample complexity for answering a workload of $Q$ random counting queries over a universe of size $N$, in both ``high accuracy'' and ``low accuracy'' regimes. Thirdly, we revisit the [BUV'14] sample complexity lower bound for answering a worst-case ensemble of counting queries, and improve the lower bound therein by a $\sqrt{\log(1/\delta)}$ factor, thereby achieving a tight-up-to-constant lower bound. We study the fundamental problem of estimating an unknown discrete distribution $p$ over $d$ symbols, given $n$ i.i.d. samples from the distribution. We are interested in minimizing the KL divergence between the true distribution and the algorithm's estimate. We first show that the standard minimax objective is too ``worst-case'' to shed light on optimality of algorithms for this problem, where we prove that simple DP estimators with poor empirical performance are already minimax optimal. This is because minimax objective focuses on minimizing the estimation error on the worst-case data distribution, and fails to shed light on an algorithm's performance on individual (non-worst-case) instances $p$. Thus, we study this problem from an instance-optimality viewpoint, where the algorithm's error on $p$ is compared to the minimum achievable estimation error over a small local neighborhood of $p$. Under natural notions of local neighborhood, we propose algorithms that achieve instance-optimality up to constant factors, with and without a differential privacy constraint. Our upper bounds rely on (private) variants of the Good-Turing estimator. Our lower bounds use additive local neighborhoods that more precisely captures the hardness of distribution estimation in KL divergence, compared to the permutation and multiplicative neighborhoods considered in prior works. We propose a framework to convert $(\varepsilon, \delta)$-approximate Differential Privacy (DP) mechanisms into $(\varepsilon', 0)$-pure DP mechanisms under certain conditions, a process we call ``purification.'' This algorithmic technique leverages randomized post-processing with calibrated noise to eliminate the $\delta$ parameter while achieving near-optimal privacy-utility tradeoff for pure DP. It enables a new design strategy for pure DP algorithms: first run an approximate DP algorithm with certain conditions, and then purify. This approach allows one to leverage techniques such as strong composition and propose-test-release that require $\delta>0$ in designing pure-DP methods with $\delta=0$. We apply this framework in various settings, including Differentially Private Empirical Risk Minimization (DP-ERM), stability-based release, and query release tasks. To the best of our knowledge, this is the first work with a statistically and computationally efficient reduction from approximate DP to pure DP. Finally, we illustrate the use of this reduction for proving lower bounds under approximate DP constraints with explicit dependence in $\delta$, avoiding the sophisticated fingerprinting code construction. https://arxiv.org/abs/2503.21071
2:30-4:00		Poster Session #3
4:00-4:45		Contributed Talks: Session #5 This work introduces Mayfly, a federated analytics approach enabling aggregate queries over ephemeral on-device data streams without central persistence of sensitive user data. Mayfly minimizes data via on-device windowing and contribution bounding through SQL-programmability, anonymizes user data via streaming differential privacy (DP), and mandates immediate in-memory cross-device aggregation on the server -- ensuring only privatized aggregates are revealed to data analysts. Deployed for a sustainability use case estimating transportation carbon emissions from private location data, Mayfly computed over 4 million statistics across more than 500 million devices with a per-device, per-week DP ε=2 while meeting strict data utility requirements. To achieve this, we designed a new DP mechanism for Group-By-Sum workloads leveraging statistical properties of location data, with potential applicability to other domains. In distributed differential privacy, multiple parties collaboratively analyze their combined data while protecting the privacy of each party's data from the eyes of the others. Interestingly, for certain fundamental two-party functions like inner product and Hamming distance, the accuracy of distributed solutions significantly lags behind what can be achieved in the centralized model. However, under computational differential privacy, these limitations can be circumvented using oblivious transfer via secure multi-party computation. Yet, no results show that oblivious transfer is indeed necessary for accurately estimating a non-Boolean functionality. In particular, for the inner-product functionality, it was previously unknown whether oblivious transfer is necessary even for the best possible constant additive error. In this work, we prove that any computationally differentially private protocol that estimates the inner product over {-1,1}^n x {-1,1}^n up to an additive error of O(n^{1/6}), can be used to construct oblivious transfer. In particular, our result implies that protocols with sub-polynomial accuracy are equivalent to oblivious transfer. In this accuracy regime, our result improves upon Haitner, Mazor, Silbak, and Tsfadia [STOC '22] who showed that a key-agreement protocol is necessary. The technical literature about data privacy largely consists of two complementary approaches: formal definitions of conditions sufficient for privacy preservation and attacks that demonstrate privacy breaches. Differential privacy is an accepted standard in the former sphere. However, differential privacy's powerful adversarial model and worst-case guarantees may make it too stringent in some situations, especially when achieving it comes at a significant cost to data utility. Meanwhile, privacy attacks aim to expose real and worrying privacy risks associated with existing data aggregation systems, but do not identify what properties are necessary to defend against them.

Accepted Papers

Poster Session 1

A Programming Framework for Estimating Privacy Loss in Differential Privacy
Takumi Hiraoka, Shumpei Shiina, Kenjiro Taura, Takumi Hiraoka

Differentially Private Winnow for Learning Halfspaces and Decision Lists
Mark Bun, William Fang

Empirical Improvements for Differentially Private Subspace Recovery
Vikrant Singhal, Roy Rinberg, Eran Malach, Seth Neel, Salil Vadhan

Leveraging Randomness in Model and Data Partitioning for Privacy Amplification
Andy Dong, Wei-Ning Chen, Ayfer Ozgur

Laplace Transform Interpretation of Differential Privacy
Rishav Chourasia, Uzair Javaid, Biplap Sikdar

Benchmarking Differentially Private Tabular Data Synthesis Algorithms
Kai Chen, Xiaochen Li, Chen Gong, Ryan McKenna, Tianhao Wang

Winning the MIDST Challenge: New Membership Inference Attacks on Diffusion Models for Tabular Data Synthesis
Xiaoyu Wu, Yifei Pang, Terrance Liu, Steven Wu

Mayfly: Private Aggregate Insights from Ephemeral Streams of On-Device User Data
Christopher Bian, Albert Cheu, Stanislas Chiknavaryan, Zoe Gong, Marco Gruteser, Oliver Guinan, Yannis Guzman, Peter Kairouz, Artem Lagzdin, Ryan McKenna, Grace Ni, Edo Roth, Maya Spivak, Timon Van Overveldt, Ren Yi

Differentially Private Distributed Inference
Marios Papachristo, Amin Rahimian

Oracle-Efficient Differentially Private Learning with Public Data
Rathin Desai, Mark Bun, Adam Block, Zhiwei Steven Wu, Abhishek Shetty

Faster Rates for Private Adversarial Bandits
Vinod Raman, Kunal Talwar, Hilal Asi

SquareχPO: Differentially Private and Robust χ2-Preference Optimization in Offline Direct Alignment
Xingyu Zhou, Yulian Wu, Wenqian Weng, Francesco Orabona

Differentially Private Multi-Sampling from Distributions
Albert Cheu, Debanuj Nayak, Debanuj Nayak

Continuous Private Release of Sparse Histograms
Edith Cohen, Vadym Doroshenko, Badih Ghazi, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi, Ethan Leeman, Ishika Mitra, Adam Sealfon

Differentially Private Non-Parametric Confidence Intervals
Tomer Shoham, Moshe Shenfeld, Katrina Ligett, Noa Velner-Harris

Integrating Feature Correlation in Differential Privacy with Applications in DP-ERM
Tianyu Wang, Luhao Zhang, Rachel Cummings

Finding the Right Fit: Differentially Private Network Data Release At Scale
Vasanta Chaganti, Ziming Yuan, Amelia Meles, Xinxin Li

It's My Data Too: Private ML for Datasets with Multi-User Training Examples
Arun Ganesh, Ryan McKenna, Brendan McMahan, Adam Smith, Fan Wu

Differentially Private Secure Multiplication with Erasures and Adversaries
Haoyang Hu, Viveck R. Cadambe

Controlling the spread of Epidemics on Networks with Differential Privacy
Dung Nguyen, Aravind Srinivasan, Renata Valieva, Anil Vullikanti, Jiayi Wu

Differentially Private Quasi-Concave Optimization: Bypassing the Lower Bound and Application to Geometric Problems
Kobbi Nissim, Eliad Tsfadia, Chao Yan

Private Lossless Multiple Release
Joel Daniel Andersson, Lukas Retschmeier, Boel Nelson, Rasmus Pagh

Differentially Private Bilevel Optimization
Guy Kornowski

Differentially Private Adaptation of Diffusion Models via Noisy Aggregated Embeddings
Pura Peetathawatchai, Wei-Ning Chen, Berivan Isik, Sanmi Koyejo, Albert No

Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model
Rachel Cummings, Alessandro Epasto, Jieming Mao, Tamalika Mukherjee, Tingting Ou, Peilin Zhong

Adapting to Linear Separable Subsets with Large-Margin in Differentially Private Learning
Erchi Wang, Yuqing Zhu, Yu-Xiang Wang

Retraining with Predicted Hard Labels Provably Increases Model Accuracy
Rudrajit Das, Inderjit S Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong

Private Geometric Median in Nearly-Linear Time
Syamantak Kumar, Daogao Liu, Kevin Tian, Chutong Yang

FedDPSyn: Federated Tabular Data Synthesis with Computational Differential Privacy
Shufan Zhang, Haochen Sun, Karl Knopf, Shubhankar Mohapatra, Wei Pang, Calvin Wang, Yingke Wang, Masoumeh Shafieinejad, David Emerson, Xi He

Understanding Private Learning From Feature Perspective
Meng Ding, Mingxi Lei, Shaopeng Fu, Di Wang, Jinhui Xu

Generate-then-Verify: Reconstructing Data from Limited Published Statistics
Terrance Liu, Eileen Xiao, Pratiksha Thaker, Adam Smith, Zhiwei Steven Wu

Relating Definitions of Computational Differential Privacy in Wider Parameter Regimes
Fredrik Meisingseth, Christian Rechberger

Unbounded Differential Privacy Secure Against Timing Attacks
Zachary Ratliff, Salil Vadhan

(ε, δ) Considered Harmful – Best Practices for Reporting Differential Privacy Guarantees
Juan Felipe Gomez, Bogdan Kulynych, Georgios Kaissis, Jamie Hayes, Borja Balle, Antti Honkela

Differential Privacy via Gaussian Mixing Mechanisms
Omri Lev, Ayush Sekhari, Ashia Wilson

Sample-Optimal Private Regression in Polynomial Time
Prashanti Anderson, Ainesh Bakshi, Stefan Tiegel, Mahbod Majid

Privacy for Free in the Over-Parameterized Regime
Simone Bombari, Marco Mondelli

Lower Bounds for Public-Private Computation under Distribution Shift
Amrith Setlur, Pratiksha Thaker, Jonathan Ullman

PREAMBLE: Private and Efficient Aggregation of Block Sparse Vectors and Applications
Hilal Asi, Vitaly Feldman, Hannah Keller, Guy Rothblum, Kunal Talwar

Poster Session 2

Avoiding Pitfalls for Privacy Accounting of Subsampled Mechanisms under Composition
Christian Janos Lebeda, Matthew Regehr, Gautam Kamath, Thomas Steinke

"We Need a Standard”: Toward an Expert–Informed Privacy Label for Differential Privacy
Onyinye Dibia, Mengyi Lu, Prianka Bhattacharjee, Yuanyuan Feng, Joseph P. Near

Optimal Differentially Private Sampling of Unbounded Gaussians
Valentio Iverson, Gautam Kamath, Argyris Mouzakis

Differentially Private Gomory-Hu Trees
Anders Aamand, Justin Chen, Mina Dalirrooyfard, Slobodan Mitrovic, Yuriy Nevmyvaka, Sandeep Silwal, Yinzhan Xu

Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data
Shlomi Hod, Lucas Rosenblatt, Julia Stoyanovich

Accelerating Multiparty Noise Generation Using Lookups
Fredrik Meisingseth, Christian Rechberger, Fabian Schmid

Differentially Private Steering for Large Language Model Alignment
Anmol Goel, Yaxi Hu, Iryna Gurevych, Amartya Sanyal

Scaling Laws for Differentially Private Language Models
Ryan McKenna, Yangsibo Huang, Amer Sinha, Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Badih Ghazi, George Kaissis, Ravi Kumar, Ruibo Liu, Da Yu, Chiyuan Zhang

Fingerprinting Codes Meet Geometry: Improved Lower Bounds for Private Query Release and Adaptive Data Analysis
Xin Lyu, Kunal Talwar

Agnostic Private Density Estimation for GMMs via List Global Stability
Mohammad Afzali, Hassan Ashtiani, Christopher Liaw

Leveraging Per-Instance Privacy for Machine Unlearning
Nazanin Mohammadi Sepahvand, Anvith Thudi, Berivan Isik, Ashmita Bhattacharyya, Nicolas Papernot, Eleni Triantafillou, Daniel M. Roy, Gintare Karolina Dziugaite

Privacy amplification by random allocation
Moshe Shenfeld, Vitaly Feldman

Decentralized Differentially Private Power Method
Andrew Campbell, Anna Scaglione, Sean Peisert

Purifying Approximate Differential Privacy with Randomized Post-processing
Yingyu Lin, Erchi Wang, Yi-An Ma, Yu-Xiang Wang

PrivJail: Enforcing Differential Privacy in Pythonic Data Processing
Shumpei Shiina, Sho Nakatani, Takumi Hiraoka, Kenjiro Taura, Shumpei Shiina

Optimal Survey Design for Private Mean Estimation
Yu-Wei Chen, Jordan Awan, Raghu Pasupathy

Mildly Accurate Computationally Differentially Private Inner Product Protocols Imply Oblivious Transfer
Iftach Haitner, Noam Mazor, Jad Silbak, Eliad Tsfadia, Chao Yan

Enhancing One-run Privacy Auditing with Quantile Regression-Based Membership Inference
Terrance Liu, Matteo Boglioni, Yiwei Fu, Shengyuan Hu, Pratiksha Thaker, Zhiwei Steven Wu

Online Factorization Mechanisms for Queries with Bounded VC Dimension
Aleksandar Nikolov, Haohua Tang, Aleksandar Nikolov

Empirical Privacy Variance
Yuzheng Hu, Fan Wu, Ruicheng Xian, Yuhang Liu, Lydia Zakynthinou, Pritish Kamath, Chiyuan Zhang, David Forsyth

Instance-Optimality for Private KL Distribution Estimation
Jiayuan Ye, Vitaly Feldman, Kunal Talwar

Private Edge Density Estimation for Random Graphs: Optimal, Efficient and Robust
Hongjie Chen, Jingqiu Ding, Yiding Hua, David Steurer

Distributed Differentially Private Data Analytics via Secure Sketching
Jakob Burkhardt, Hannah Keller, Claudio Orlandi, Chris Schwiegelshohn

Enforcing Demographic Coherence: A Harms Aware Framework for Reasoning about Private Data Release
Mark Bun, Marco Carmosino, Palak Jain, Gabriel Kaptchuk, Satchit Sivakumar

On the Sample Complexity of Differentially Private Policy Gradient
Yi He, Xingyu Zhou

Is API Access to LLMs Useful for Generating Private SyntheticTabular Data?
Marika Swanberg, Ryan McKenna, Edo Roth, Albert Cheu, Peter Kairouz

Private prediction for large-scale synthetic text generation
Kareem Amin, Alex Bie, Weiwei Kong, Alexey Kurakin, Natalia Ponomareva, Umar Syed, Andreas Terzis, Sergei Vassilvitskii

Differential Privacy for Connectedness Indices
Tom Rutter, Amin Rahimian

Privacy in Metalearning and Multitask Learning: Modeling and Separations
Maryam Aliakbarpour, Konstantina Bairaktari, Adam Smith, Marika Swanberg, Jonathan Ullman

The Last Iterate Advantage: Empirical Auditing and\\Principled Heuristic Analysis of Differentially Private SGD
Thomas Steinke, Milad Nasr, Arun Ganesh, Borja Balle, Christopher A. Choquette-Choo, Matthew Jagielski, Jamie Hayes, Abhradeep Guha Thakurta, Adam Smith, Andreas Terzis

New Bounds for Private Graph Optimization Problems via Synthetic Graphs
Anders Aamand, Rasmus Pagh, Lukas Retschmeier

InfTDA: A Simple TopDown Mechanism for Hierarchical Differentially Private Counting Queries
Fabrizio Boninsegna

Bridging Privacy and Accuracy: Asymptotically Negligible Noise by Exponential Mechanism
Young Hyun Cho, Yu Wei Chen, Jordan Awan

Schedule Indistinguishability: Applying Differential Privacy Models to Protect Runtime Timing Behaviors
Maryam Ghorbanvirdi, Sibin Mohan

Efficient Optimization and Asymptotic Analysis of the Generalized Gaussian Mechanism for Differential Privacy
Rachel Cummings, Soufiane Fafe

Poster Session 3

Differential Privacy and Ratio Statistics
Tomer Shoham, Katrina Ligett

Optimal Survey Design for Private Mean Estimation
Yu-Wei Chen, Jordan Awan, Raghu Pasupathy

Leveraging Per-Instance Privacy for Machine Unlearning
Nazanin Mohammadi Sepahvand, Anvith Thudi, Berivan Isik, Ashmita Bhattacharyya, Nicolas Papernot, Eleni Triantafillou, Daniel M. Roy, Gintare Karolina Dziugaite

How Well Can Differential Privacy Be Audited in One Run?
Amit Keinan, Moshe Shenfeld, Katrina Ligett

Enforcing Demographic Coherence: A Harms Aware Framework for Reasoning about Private Data Release
Mark Bun, Marco Carmosino, Palak Jain, Gabriel Kaptchuk, Satchit Sivakumar

Differentially Private Learning Needs Better Model Initialization and Self-Distillation
Ivoline Ngong, Joseph P. Near, Niloofar Mireshghallah

Sample-Efficient Private Learning of Mixtures of Gaussians
Mahbod Majid, Hasan Ashtiani, Shyam Narayanan

Differentially Private Sequential Data Synthesis with Structured State Space Model and Diffusion Model
Tomoya Matsumoto, Takayuki Miura, Toshiki Shibahara, Masanobu Kii, Kazuki Iwahana, Osamu Saisho, Shingo Okamura

Partial-Information Fragment Inference from LLMs
Lucas Rosenblatt, Bin Han, Robert Wolfe, Bill Howe

Debiasing Functions of Private Statistics in Postprocessing
Flavio Calmon, Elbert Du, Cynthia Dwork, Brian Finley, Grigory Franguridi

Private Mean Estimation with Person-Level Differential Privacy
Sushant Agarwal, Gautam Kamath, Mahbod Majid, Argyris Mouzakis, Rose Silver, Jonathan Ullman

Better Private Distribution Testing by Leveraging Unverified Auxiliary Data
Maryam Aliakbarpour, Arnav Burudgunte, Clément Canonne, Ronitt Rubinfeld

Two Algorithms for Prediction with Expert Advice under Local Differential Privacy
Ben Jacobsen, Kassem Fawaz

Efficient, Black-box Inference Attacks on Shared Representations in Multitask Learning
John Abascal, Nicolás Berrios, Alina Oprea, Jonathan Ullman, Adam Smith, Matthew Jagielski

Trade-offs in Data Memorization via Strong Data Processing Inequalities
Vitaly Feldman, Guy Kornowski, Xin Lyu, Xin Lyu

The Correlated Gaussian Sparse Histogram Mechanism
Christian Janos Lebeda, Lukas Retschmeier

Label Differential Privacy can Release Unexpected Leakage
Hinata Sekiguchi, Shun Takagi, Satoshi Hasegawa, Marin Matsumoto, Masato Oguchi

Differentially Private Distributed Mean Estimation with Constrained User Correlations
Sajani Vithana, Viveck R. Cadambe, Flavio P. Calmon, Haewon Jeong

An Iterative Algorithm for Differentially Private $k$-PCA with adaptive noise
Johanna Düngler, Amartya Sanyal

Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy
Syomantak Chaudhuri, Maryam Aliakbarpour, Thomas A. Courtade, Alireza Fallah, Michael I. Jordan

Bridging Privacy and Accuracy: Asymptotically Negligible Noise by Exponential Mechanism
Young Hyun Cho, Yu Wei Chen, Jordan Awan

A Private Approximation of the 2nd-Moment Matrix of Any Subsamplable Input
Bar Mahpud, Or Sheffet

Private graphon estimation via sum-of-squares
Hongjie Chen, Jingqiu Ding, Tommaso d'Orsi, Yiding Hua, Chih-Hung Liu, David Steurer

Online Learning and Unlearning
Yaxi Hu, Bernhard Schölkopf, Amartya Sanyal

Fully Dynamic Graph Algorithms with Edge Differential Privacy
Sofya Raskhodnikova, Teresa Anna Steiner

Differentially Private Shortest Distances in Continual Release Model
Rachel Cummings, Tamalika Mukherjee, Jalaj Upadhyay, Hantao Yu, Zongrui Zou, Hantao Yu

Differentially Private Learning Beyond the Classical Dimensionality Regime
Cynthia Dwork, Pranay Tankala, Linjun Zhang

Scaling Laws for Differentially Private Language Models
Ryan McKenna, Yangsibo Huang, Amer Sinha, Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Badih Ghazi, George Kaissis, Ravi Kumar, Ruibo Liu, Da Yu, Chiyuan Zhang

Faster Algorithms for Person-Level Differentially Private Stochastic Convex Optimization
Andrew Lowy, Daogao Liu, Hilal Asi

dpmm: Differentially Private Marginal Models, a Library for Synthetic Tabular Data Generation
Sofiane Mahiou, Amir Dizche, Reza Nazari, Xinmin Wu, Ralph Abbey, Jorge Silva, Georgi Ganev

Differential Privacy and the Survey Data Pipeline
Joerg Drechsler, James Bailie

Towards Vertically Distributed Differentially Private Synthetic Data Generation
Yucheng Fu, Tianyao Gu, Elaine Shi, Tianhao Wang

Private Quantile Estimation in the Two-Server Model
Anders Aamand, Fabrizio Boninsegna, Jacob Imola, Hannah Keller, Rasmus Pagh, Amrita Roy Chowdhury

Fast Adaptive Private Query Answering for Large Data Domains
Brett Mullins, Miguel Fuentes, Yingtai Xiao, Daniel Kifer, Cameron Musco, Daniel Sheldon

Virtual Presentations

Auditing Differential Privacy Guarantees Using Density Estimation
Antti Koskela, Jafar Mohammadi
[Click Here for Video]

User-Level Differential Privacy in Medical Machine Learning
Johannes Kaiser, Jakob Eigenmann, Daniel Rueckert, Georgios Kaissis
[Click Here for Video]

Learning from End User Data with Shuffled Differential Privacy over Kernel Densities
Tal Wagner
[Click Here for Video]

Call for Papers

Differential privacy (DP) is the leading framework for data analysis with rigorous privacy guarantees. In the last two decades, it has transitioned from the realm of pure theory to large scale, real world deployments.

Differential privacy is an inherently interdisciplinary field, drawing researchers from a variety of academic communities including machine learning, statistics, security, theoretical computer science, databases, and law. The combined effort across a broad spectrum of computer science is essential for differential privacy to realize its full potential. To this end, this workshop aims to stimulate discussion among participants about both the state-of-the-art in differential privacy and the future challenges that must be addressed to make differential privacy more practical.

Specific topics of interest for the workshop include (but are not limited to):

Theory of DP
DP and security
Privacy preserving machine learning
DP and statistics
DP and data analysis
Trade-offs between privacy protection and analytic utility
DP and surveys
Programming languages for DP
Relaxations of DP
Relation to other privacy notions and methods
Experimental studies using DP
DP implementations
DP and policy making
Applications of DP
Reconstruction attacks and memorization

Submissions: Authors are invited to submit a short abstract of new work or work published since June 2024 (the most recent TPDP submission deadline). Submissions must be 4 pages maximum, not including references. Submissions may also include appendices, but these are only read at reviewer's discretion. There is no prescribed style file, but authors should ensure a minimum of 1-inch margins and 10pt font. Submissions are not anonymized, and should include author names and affiliations.

Submissions will undergo a lightweight review process and will be judged on originality, relevance, interest, and clarity. Based on the volume of submissions to TPDP 2024 and the workshop's capacity constraints, we expect that the review process will be somewhat more competitive than in years past. Accepted abstracts will be presented at the workshop either as a talk or a poster.

The workshop will not have formal proceedings and is not intended to preclude later publication at another venue. In-person attendance is encouraged, though authors of accepted abstracts who cannot attend in person will be invited to submit a short video to be linked on the TPDP website.

Selected papers from the workshop will be invited to submit a full version of their work for publication in a special issue of the Journal of Privacy and Confidentiality.

Important Dates

Abstract Submission: March 20, 2025 (AoE)
Notification: May 1, 2025
Workshop: June 2-3, 2025

Workshop Host

Google logo

Student Travel Stipend Support

Platinum Tier Sponsors

Apple logo

Google logo

Gold Tier Sponsors

Capital One logo

Silver Tier Sponsors

DPella logo

Submission website

https://tpdp25.cs.uchicago.edu

For concerns regarding submissions, please contact tpdp.chairs@gmail.com

Organizing and Program Committee

Joe Near (co-chair)
University of Vermont
Jayshree Sarathy (co-chair)
Northeastern University
Adam Sealfon
Google
Adam Smith
Boston University
Ajinkya K Mulay
Meta
Alejandro Russo
Chalmers University of Technology, Göteborg University, DPella AB
Alessandro Epasto
Google Research
Alexandra Wood
Berkman Klein Center for Internet & Society at Harvard University
Amartya Sanyal
University of Copenhagen
Amin Rahimian
University of Pittsburgh
Anand Sarwate
Rutgers University
Andrew Lowy
University of Wisconsin-Madison
Anish
UC Berkeley Undergrad
Anne-Sophie Charest
Université Laval
Antti Koskela
Nokia Bell Labs
Anupama Nandi
Yale University
Arun Ganesh
Google
Ayelet Gordon-Tapiero
Hebrew University
Christian Janos Lebeda
Inria
Christine Task
Knexus
Damien Desfontaines
Tumult Labs
Danfeng Zhang
Duke University
Daogao Liu
University of Washington
Dima Usynin
TUM, Oblivious AI
Dung Nguyen
University of Virginia
Eli Chien
Georgia Institute of Technology
Enayat Ullah
Meta
Gautam Kamath
University of Waterloo
Gavin Brown
University of Washington
Hao Wu
University of Waterloo
Hilal Asi
Apple
Hilla Schefler
The Technion
Ivoline Ngong
University of Vermont
Jalaj Upadhyay
Rutgers University
Jatan Loya
Google
Jiayuan Ye
National University of Singapore
Joann Chen
San Diego State University
Joerg Drechsler
Institute for Employment Research and LMU Munich
Johes Bater
Tufts University
Jonathan Ullman
Northeastern University
Juba Ziani
Georgia Tech
Kunal Talwar
Apple
Linjun Zhang
Rutgers University
Linsheng Liu
George Washington University
Lucas Rosenblatt
NYU
Ludmila Glinskih
Google
Lydia Zakynthinou
UC Berkeley
Mahdi haghifam
Northeastern University
Mark Bun
Boston University
Maryam Aliakbarpour
Rice University
Matthew Jagielski
Google DeepMind
Matthew Joseph
Google Research
Mayana Pereira
Capital One
Moshe Shenfeld
Hebrew University of Jerusalem
Olya Ohrimenko
The University of Melbourne
Om Thakkar
OpenAI
Or Sheffet
Bar Ilan University
Palak Jain
Boston University
Peter Kirouz
Google
Pierre Tholoniat
Columbia University
Prajjwal Gupta
Cloudflare Research
Priyanka Nanayakkara
Harvard University
Rachel Cummings
Columbia University
Rasmus Pagh
University of Copenhagen
Roi Livni
Tel Aviv University
Ryan McKenna
Google
Saeyoung Rho
Columbia University
Sajani Vithana
Harvard University
Sam Haney
Tumult Labs
Samuel Haney
Tumult Labs
Satchit Sivakumar
Boston University
Shengyuan Hu
Carnegie Mellon University
Shlomi Hod
Boston University
Shubhankar Mohapatra
University of Waterloo
Sofya Raskhodnikova
Boston University
Tamalika Mukherjee
Columbia university
Thomas Humphries
University of Waterloo
Thomas Steinke
Google DeepMind
Tianhao Wang
University of Virginia
Tudor Cebere
Inria
Uri Stemmer
Tel Aviv University
Vitaly Feldman
Apple
Viveck Cadambe
Georgia Tech
Xingyu Zhou
Wayne State University
Yaodong Yu
OpenAI
Yingtai Xiao
TikTok
Youssef Allouah
EPFL
Yuzheng Hu
University of Illinois Urbana-Champaign
Yves-Alexandre de Montjoye
Imperial College London
Zeyu Ding
Binghamton University

TPDP 2025 - Theory and Practice of Differential Privacy

Google Mountain View - June 2-3, 2025