Most top news publishers block AI training bots via robots.txt, but many are also blocking the retrieval bots that determine whether their sites appear in AI-generated answers.
BuzzStream analyzed the robots.txt files of 100 top news sites across the US and UK and found that 79% block at least one training bot. More notably, 71% also block at least one retrieval or live search bot.
Training bots gather content to build AI models, while retrieval bots fetch content in real time when users ask questions. Sites blocking retrieval bots may not appear when AI tools…
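
For readers who want to see how this distinction plays out in a robots.txt file, here is a minimal sketch using Python's standard `urllib.robotparser`. The grouping of user-agent tokens into training and retrieval bots is an assumption made for illustration and is not BuzzStream's list; the sample robots.txt, the `blocked_bots` helper, and the `example.com` URL are likewise hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed grouping of crawler tokens, for illustration only
# (not the list used in the BuzzStream analysis).
TRAINING_BOTS = ["GPTBot", "CCBot", "Google-Extended"]
RETRIEVAL_BOTS = ["OAI-SearchBot", "ChatGPT-User", "PerplexityBot"]

# Hypothetical robots.txt that blocks one bot from each group.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_bots(robots_txt, bots, url="https://example.com/"):
    """Return the bots from `bots` that may not fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


if __name__ == "__main__":
    print("Training bots blocked:", blocked_bots(SAMPLE_ROBOTS_TXT, TRAINING_BOTS))
    print("Retrieval bots blocked:", blocked_bots(SAMPLE_ROBOTS_TXT, RETRIEVAL_BOTS))
```

A site in the position the study describes would show entries in both lists. Note that robots.txt only states a preference; whether a given crawler honors it is up to the crawler's operator.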
