Tuesday, April 28, 2026

Model Inversion Attacks 2026 — Extracting Training Data from AI Models

The model inversion paper that changed how I think about AI privacy came out of Google Brain in 2021. Nicholas Carlini and colleagues set out to answer a simple question: if you query GPT-2 enough times, can you get it to reproduce text from its training data verbatim? The answer was yes — unambiguously and reproducibly. Personal email addresses. Phone numbers. Specific private text strings that appeared once in the training corpus. The model had memorised them and would reproduce…

Read full article →

No comments:

Post a Comment