We've been told to trust AI research assistants to read the web for us and hand back tidy, sourced answers. A new study says: be careful. Researchers at Cornell Tech showed that as few as 13 words, slipped into an ordinary Reddit comment, can quietly steer tools like ChatGPT's Deep Research and Google's Gemini toward scams and products that don't even exist.
It's a small trick with big implications: the same AI tools millions rely on to research purchases, health, and decisions can be manipulated by anyone who can post a comment online. Here's exactly what's going on — and how to keep yourself safe.
What the Researchers Found
Cornell Tech researchers Tingwei Zhang, Harold Triedman, and Vitaly Shmatikov built and tested an attack they call WARP — short for Web Agent Retrieval Poisoning. The finding is unsettlingly simple: adding roughly 13 promotional words to a web page an AI agent already reads is enough to make the AI absorb that text as if it were trustworthy and surface it in its answer.
In practice, that means a scammer can drop a short, innocuous-looking line into a Reddit thread, and an AI "deep research" tool summarizing that thread may end up recommending a fake product or a scam site — presenting it to the user with the same confidence as legitimate results.
How the WARP Attack Works
The attack exploits how modern AI research agents operate. Here's the chain in plain terms:
- You ask an AI tool to research something — say, "best budget headphones."
- The agent browses the open web, including user-generated content like Reddit comments, forums, and reviews.
- An attacker has planted ~13 words in one of those pages, phrased to look like a genuine recommendation.
- The agent reads that text as trustworthy information, not as a manipulation attempt.
- The AI's final answer is skewed — it may cite or recommend the attacker's scam product or site.
The key insight: the attacker doesn't need to hack the AI or the website. They just need to add a little text to a page the AI will read — and the AI does the rest.
What Is Prompt Injection?
WARP is a flavor of prompt injection — one of the most important unsolved security problems in AI. Prompt injection happens when text inside the content an AI reads gets treated as instructions rather than as data.
Because AI agents can't reliably tell the difference between "here's information to summarize" and "here's a command to follow," malicious text hidden in a web page, email, or document can hijack the model's behavior. WARP shows how cheap and effective this is against AI search specifically — a domain people increasingly trust for real decisions. (It's the flip side of the safety push we covered with Claude Fable 5's guardrails: capable AI also expands the attack surface.)
Gemini vs ChatGPT: The Gap
Not all tools fared equally. The study found a striking difference in how often each AI swallowed the poisoned content.
| AI Research Tool | How often it cited poisoned content |
|---|---|
| Google Gemini (Deep Research) | ~12% of citations |
| OpenAI (Deep Research) | ~0.4% of citations |
That's a roughly 30x difference. It doesn't mean one tool is "safe" and the other isn't — both can be manipulated — but it shows how much the filtering of untrusted, user-generated content varies between systems. How well a model resists this is now a real quality differentiator, alongside the capabilities we track in our best AI models guide.
Why This Matters for You
AI search isn't a lab curiosity anymore — a large and growing share of people use it for everyday research, including shopping and health questions. That makes WARP-style attacks genuinely consequential:
- Scams at scale. A single planted comment can influence many users' AI answers, not just one.
- Trust is the weapon. People tend to trust AI summaries more than random search results, so a poisoned recommendation can be more persuasive.
- It's cheap and deniable. Attackers need no special access — just the ability to post content the AI will read.
- Hard to spot. The manipulation lives in a source page, not in the AI's clean-looking final answer.
How to Stay Safe
You don't need to abandon AI research tools — just use them with a healthy dose of skepticism. Practical habits:
- Treat AI answers as a starting point, not a verdict — especially for purchases, money, or health.
- Click the sources. Open the pages the AI cites and judge them yourself; don't trust a recommendation you can't trace.
- Be wary of unfamiliar brands or products an AI suggests, particularly if they're hard to find elsewhere.
- Never enter payment or login details on a site just because an AI pointed you there.
- Cross-check important answers with a normal search or a second tool before acting.
Frequently Asked Questions
What did the Cornell prompt-injection study find?
Researchers at Cornell Tech showed that as few as 13 promotional words slipped into an ordinary web page — like a Reddit comment — can quietly steer AI research tools such as ChatGPT's Deep Research and Google's Gemini toward scams and products that don't exist. They named the technique WARP (Web Agent Retrieval Poisoning).
What is a prompt-injection attack?
Prompt injection is when hidden or malicious text on a web page (or in a document) gets read by an AI system and treated as instructions, rather than as ordinary content. Because AI agents browse and 'trust' the pages they read, attackers can plant text that manipulates the AI's answer — for example, pushing a scam product.
What is the WARP attack?
WARP stands for Web Agent Retrieval Poisoning. It's the attack demonstrated by Cornell Tech researchers Tingwei Zhang, Harold Triedman and Vitaly Shmatikov. It adds around 13 promotional words to a page that an AI research agent already reads, and the agent absorbs them as if they were trustworthy, skewing its recommendations.
Is Gemini or ChatGPT more vulnerable?
In the study, Google's Gemini Deep Research was far more susceptible — it pulled in the poisoned content in about 12% of citations, compared with roughly 0.4% for OpenAI's Deep Research. Both can be affected, but the gap shows how differently AI tools filter user-generated content.
Why are AI research tools vulnerable to this?
AI 'deep research' agents work by browsing many web pages — including user-generated content like forum posts and comments — and summarizing them. They often can't reliably tell trustworthy information from planted instructions, so a small amount of malicious text on a page they cite can influence the final answer.
How can I protect myself from poisoned AI answers?
Treat AI research results as a starting point, not gospel. Verify product names, links and claims against official sources before buying or acting; be wary of unfamiliar brands an AI recommends; click through to the actual sources the AI cites; and never enter payment details on a site solely because an AI suggested it. Cross-check important answers with a normal search.
Does this mean AI search is unsafe to use?
Not unsafe to use, but not blindly trustworthy. AI research tools are powerful and useful, but this study shows they can be manipulated through the open web. Use them for speed and breadth, then apply human judgment and verification for anything that involves money, security, or important decisions.
Can the companies fix prompt injection?
They can reduce it, but prompt injection is an open, hard problem. Mitigations include better filtering of untrusted content, separating instructions from data, source-reputation scoring, and sandboxing what agents can act on. Expect ongoing improvements rather than a single permanent fix, which is why user verification still matters.
Final Thoughts
The WARP study is a useful reality check. AI research tools are genuinely powerful, but they read the same messy, manipulable web we do — and right now they can be nudged by a sentence-sized trick. That doesn't make them useless; it makes them tools that still need a human in the loop.
The takeaway is simple: enjoy the speed of AI search, but verify before you trust — especially when money or safety is on the line. As AI agents take on more of our decisions, knowing they can be poisoned by 13 words is exactly the kind of awareness that keeps you safe. We'll keep tracking AI security as it evolves.