Making and breaking tokenizers
Friday, 17 October, 2025 - 12:00 to 13:00
Despite massive investments in training large language models, tokenizers remain a critical but often neglected component, with weaknesses that can cause wild hallucinations, bypass safety guardrails, and break downstream applications. This talk will cover: Our recent research on automatically detecting problematic 'glitch...