The True Story of How GPT-2 Became Maximally Lewd

In this video, we recount an incident that occurred at OpenAI while researchers were trying to finetune GPT-2 to be as helpful and ethical as possible. We tell how inadvertently flipping a single minus sign led GPT-2 to become the embodiment of a well-known cardinal sin.

#ai #aisafety #alignment

▀▀▀▀▀▀▀▀▀SOURCES & READINGS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
OpenAI blog post
OpenAI paper behind the blog post
RLHF explainer on Hugging Face
RLHF explainer on Concrete Problems in AI Safety, by @RobertMilesAI

▀▀▀▀▀▀▀▀▀PATRONS & MEMBERS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Riley Matthews Vladimir Silyaev Nathanael Moody Alcher Black RMR Nathan Metzger Monadologist Glenn Tarigan NMS James Babcock Colin Ricardo Long Hoang Tor Barstad Gayman Crothers Stuart Alldritt Chris Painter Juan Benet Falcon Scientist Jeff Christian Loomis Tomarty Edward Yu Ahmed Elsayyad Chad M Jones Emmanuel Fredenrich Honyopenyoko Neal Strobl bparro Danealor Craig Falls Vincent Weisser Alex Hall Ivan Bachcin joe39504589 Klemen Slavic blasted0glass Scott Alexander noggieB Dawson John Slape Gabriel Ledung Jeroen De Dauw Craig Ludington Jacob Van Buren Superslowmojoe Michael Zimmermann Nathan Fish Bleys Goodson Ducky Bryan Egan Matt Parlmer Tim Duffy rictic marverati Luke Freeman Dan Wahl Ken Mc leonid andrushchenko Alcher Black Rey Carroll William Clelland ronvil AWyattLife codeadict Lazy Scholar Torstein Haldorsen Supreme Reader Michał Zieliński 뿌리와 가지있는 나무 connect

▀▀▀▀▀▀▀CREDITS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Direction: Hannah Levingstone (@hannah_luloo)
Written by: Jai (@Laneless_) & :3
Line Producer & Production Manager: Kristy Steffens
Quality Assurance Lead: Lara Robinowitz (@CelestialShibe)
Animation: Damon Edgson, Gabriel Diaz (@gabreleiros), Ira Klages (@dux), Keith Kavanagh (@johnnycigarettex), Michela Biancini, Owen Peurois (@owenpeurois), Colors Giraldo (@colorsofdoom), Jordan Gilbert (@Twin_Knight/ Twin Knight Studios), Zack Gilbert (@Twin_Knight/ Twin Knight Studios), Neda Lay (@Nezhahah)
Background Art: Hané Harnett (@thepeonyvibes), Zoe Martin-Parkinson (@zoemar_son)
Compositing: Renan Kogut (@kogut_r), Patrick O’Callaghan, Ira Klages (@dux)
Narrator: Rob Miles
VO Editor: Tony Dipiazza
Sound Design and Music: Epic Mountain