How Google AI Overviews Choose Citations: Query Fan-Out

By Cory Maki · Jun 16, 2026

Most founders still picture AI Overviews the way they picture a featured snippet: Google finds the top-ranking page for a query and lifts a paragraph. That model is wrong, and acting on it wastes effort. The reality is that one prompt no longer produces one search. It produces dozens.

The mechanism that does this is called query fan-out, and it is the single most important concept for understanding why your competitor gets cited and you don't — even when you outrank them. Once you understand fan-out, the entire optimization problem changes shape. You stop chasing keywords and start covering a question field.

Query fan-out turns one prompt into many simultaneous searches

Query fan-out is a retrieval technique that expands a single user query into multiple sub-queries to capture different possible intents. Google has been explicit about this. In its launch material for AI Mode, the company described the system as breaking your question into subtopics and issuing a multitude of queries simultaneously on your behalf — and Google Search Central now confirms that both AI Overviews and AI Mode may use the same technique.

The behavior is consistent across the industry. Most AI-powered search engines, including Google's AI Overviews, AI Mode, Gemini, ChatGPT, Perplexity, Microsoft Copilot, and Grok, use query fan-out expansion methods to answer user prompts. The reasoning is practical: a single keyword-style search can't reliably cover all the angles of an open-ended question.

A concrete example makes it obvious. If someone typed "best sneakers for walking" into AI Mode, it could break that query up into subqueries like "best sneakers for men," "best sneakers for walking in different seasons," "sneakers for walking on a trail" and "best slip-on sneakers." The user never typed those phrases. The system generated them, ran them in parallel, and assembled the answer from the combined results.
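
To make the mechanics concrete, here is a minimal sketch of the expand, search in parallel, and merge loop. It is an illustration under stated assumptions, not Google's implementation: the sub-queries are hard-coded from the example above (a real system generates them with a language model), and `TOY_INDEX`, `expand`, `search`, and `fan_out` are hypothetical names standing in for a real search backend.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sub-queries, copied from the sneaker example. A real system
# would generate these with a language model; here they are hard-coded.
SUB_QUERIES = {
    "best sneakers for walking": [
        "best sneakers for men",
        "best sneakers for walking in different seasons",
        "sneakers for walking on a trail",
        "best slip-on sneakers",
    ]
}

# Toy index standing in for a search backend: sub-query -> matching URLs.
TOY_INDEX = {
    "best sneakers for men": ["shoes.example/mens", "reviews.example/top10"],
    "best sneakers for walking in different seasons": ["shoes.example/seasons"],
    "sneakers for walking on a trail": ["trail.example/guide", "reviews.example/top10"],
    "best slip-on sneakers": ["shoes.example/slip-on"],
}

def expand(query):
    """Return the original query plus its generated sub-queries."""
    return [query] + SUB_QUERIES.get(query, [])

def search(sub_query):
    """Look up one sub-query in the toy index."""
    return TOY_INDEX.get(sub_query, [])

def fan_out(query):
    """Run every sub-query in parallel and merge results, dropping duplicates."""
    sub_queries = expand(query)
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(search, sub_queries))
    merged = []
    for urls in result_lists:
        for url in urls:
            if url not in merged:
                merged.append(url)
    return merged

candidates = fan_out("best sneakers for walking")
```

Notice that every URL in `candidates` was retrieved by a sub-query the user never typed. In this toy index the original phrase matches nothing at all.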

The scale is larger than most people assume

This is not three or four extra searches. Research from iPullRank found that Google fires hundreds of searches per single user query in AI Mode, and that systems run up to about 20 retrieval iterations before terminating. The same body of analysis found that AI search queries average 70–80 words compared to 3–4 words for traditional searches, a 17–26x increase in query complexity as measured by word count. Google AI Mode uses a custom Gemini model purpose-built for this decomposition.

Citations are chosen by a multi-stage filter, not by ranking

Getting into the candidate pool and getting cited are two different events. Google AI Overviews selects sources through a multi-stage filtering pipeline that progressively narrows 200–500 candidate documents down to 5–15 cited sources. The process moves through four stages: semantic retrieval, E-E-A-T authority filtering, Gemini LLM re-ranking at the passage level, and final data fusion into a coherent summary with inline citations.
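
The funnel is easier to reason about as code. The sketch below is a simplified model of the stages named above, not the actual pipeline: the three scores on `Candidate`, the `min_authority` threshold, and the function `select_citations` are all hypothetical, and the final data-fusion step is reduced to taking the top few survivors.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    relevance: float      # hypothetical semantic-match score, 0..1
    authority: float      # hypothetical trust score, 0..1
    passage_score: float  # hypothetical passage-level re-rank score, 0..1

def select_citations(candidates, pool_size=300, min_authority=0.5, max_cited=10):
    # Stage 1: semantic retrieval keeps the most relevant documents.
    pool = sorted(candidates, key=lambda c: c.relevance, reverse=True)[:pool_size]
    # Stage 2: the authority filter drops low-trust sources.
    trusted = [c for c in pool if c.authority >= min_authority]
    # Stage 3: passage-level re-ranking reorders the survivors.
    reranked = sorted(trusted, key=lambda c: c.passage_score, reverse=True)
    # Stage 4: only the top few are cited in the fused answer.
    return reranked[:max_cited]

docs = [
    Candidate("a.example", relevance=0.9, authority=0.8, passage_score=0.4),
    Candidate("b.example", relevance=0.7, authority=0.9, passage_score=0.9),
    Candidate("c.example", relevance=0.95, authority=0.2, passage_score=0.95),
    Candidate("d.example", relevance=0.1, authority=0.9, passage_score=0.8),
]
cited = select_citations(docs, pool_size=3, max_cited=2)
cited_urls = [c.url for c in cited]
```

In this toy run, `c.example` has the highest relevance but fails the authority filter, and `b.example` is cited ahead of `a.example` despite a lower retrieval score. That is the pattern the rest of this article is about: the order you enter the pool in is not the order you leave it in.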

What the user actually sees is smaller still. Google AI Overviews usually show 3–4 visible citations, though they draw on more sources behind the scenes. Gemini sits in a similar range. So a vast amount of retrieval and filtering happens before a single link surfaces.

The decisive shift is that ranking no longer determines the outcome. Only 38% of AIO-cited pages now rank in the organic top 10, down from 76% less than a year ago, meaning traditional SEO rankings alone are an increasingly unreliable path to AIO visibility. I covered the broader implications of this in the "GEO is just SEO" debate — the foundations overlap, but the selection logic genuinely diverges.

Rank gets you into the pool; structure and authority decide the rest

The cleanest way to hold this in your head: rank gets content into the candidate pool; structure and authority determine whether it survives to citation. Put another way, retrieval makes you a candidate, and relevance decides whether you're picked.

The relevance test is specific. AI engines look for tight semantic alignment between the user's question, the AI's draft answer, and the cited source. The closer your page's wording matches the language of the query and the framing of the answer, the more likely you are to be selected. Because fan-out generates the queries, that wording has to match queries the user never typed.
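
As a rough intuition for "semantic alignment," consider the crude proxy below. It scores word overlap (Jaccard similarity) between a fan-out sub-query and a passage. This is an assumption-laden stand-in: production systems compare meaning with learned embeddings rather than literal word sets, and the passages and function names here are invented for illustration.

```python
import re

def tokens(text):
    """Lowercase word set; a crude stand-in for a semantic embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(a, b):
    """Jaccard similarity between two texts' word sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

sub_query = "sneakers for walking on a trail"
passage_a = "These sneakers are built for walking on a rocky trail."
passage_b = "Our company was founded in 2010 and loves footwear."

score_a = overlap(sub_query, passage_a)
score_b = overlap(sub_query, passage_b)
```

The first passage shares every word of the sub-query and scores well above the second, which shares none. The point survives the simplification: a passage written in the language of the sub-query is easier to match than one written in the language of your About page.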

Passages get cited, not pages

This is the part founders most often miss. AI Overviews don't cite your homepage or your blog post as a whole. Gemini can use passage indexing to incorporate specific sections of a website to fill out its AI Overview response. This provides more context-specific relevance.

The practical consequence is that your content needs to be built out of discrete, self-contained answer units. The data here is unusually precise: AI prioritizes passages that fully answer queries in 134–167 word self-contained units. A page can be comprehensive and still fail this test if its information isn't organized into extractable blocks.
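
If you want to audit a draft against that range, a word count per passage is enough to start. The sketch below assumes passages are separated by blank lines, which is my simplification; `passage_report` is a hypothetical helper, and the 134 and 167 defaults are simply the figures quoted above.

```python
def passage_report(text, low=134, high=167):
    """Split on blank lines and report each passage's word count and fit."""
    report = []
    for block in text.split("\n\n"):
        words = block.split()
        if not words:
            continue
        count = len(words)
        report.append({"words": count, "in_range": low <= count <= high})
    return report

# Synthetic input: one 150-word passage and one 40-word passage.
sample = ("word " * 150).strip() + "\n\n" + ("word " * 40).strip()
report = passage_report(sample)
```

Word count only tells you whether a block is the right size. Whether it is self-contained, meaning it answers the question without leaning on the paragraphs around it, still needs a human read.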

This also explains why Reddit dominates citation share. A long thread is a dense cluster of independent passages. A Reddit thread with 200 comments contains hundreds of discrete passages, each a potential retrieval target. A 3,000-word blog post might yield 5 extractable passages. A Reddit thread with the same word count can yield 50. I unpack the strategic side of this in "Why Reddit Dominates AI Search Citations." For the mechanics of building extractable content, see "How to Make Your Content Citable by AI."

Specificity and density beat length

Google's own patents point toward what survives re-ranking. Google's patent WO2024064249A1, which describes ranking source passages for AI summaries, explicitly references "information density" and "specificity signals" as selection factors. The practical target is roughly two to three quantified data points per 300-word block. Vague claims lose; "experts recommend X grams" beats "adults need a lot."
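
A quick way to check a draft against the two-to-three-per-300-words target is to count number-bearing words per block. This is a crude proxy of my own, not the patent's method: it treats any word containing a digit as a data point, so it will also count things like "top 10." The sample sentences reuse a statistic quoted earlier in this article.

```python
import re

HAS_DIGIT = re.compile(r"\d")

def data_points_per_block(text, block_size=300):
    """Count words containing a digit in each consecutive block of words."""
    words = text.split()
    counts = []
    for start in range(0, len(words), block_size):
        block = words[start:start + block_size]
        counts.append(sum(1 for w in block if HAS_DIGIT.search(w)))
    return counts

vague = "Few cited pages rank highly anymore."
specific = "Only 38% of AIO-cited pages now rank in the organic top 10, down from 76%."

vague_counts = data_points_per_block(vague)
specific_counts = data_points_per_block(specific)
```

Both sentences make the same claim. Only the second gives a re-ranker something specific to extract.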

Structured data compounds the effect. One analysis found that structured data implementation boosts AIO selection probability by 73%. FAQ, HowTo, Article, and Product schema types show the strongest impact because they create semantically clear, extractable answer units that Gemini can parse unambiguously during re-ranking. If schema and crawlability are the problem, Why ChatGPT Can't See Your Website walks through the fixes.
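
For reference, here is a minimal sketch of FAQ markup using the schema.org `FAQPage`, `Question`, and `Answer` types. The helper `faq_jsonld` is hypothetical, and the question and answer text are taken from the FAQ at the end of this article; swap in your own pairs.

```python
import json

def faq_jsonld(pairs):
    """Build schema.org FAQPage markup from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

markup = faq_jsonld([
    (
        "Do I need to rank in the top 10 to get cited?",
        "No. Ranking gets your content into the candidate pool, "
        "but it does not decide whether you are cited.",
    ),
])

# Embed the result in the page head or body as a JSON-LD script tag.
script_tag = '<script type="application/ld+json">' + json.dumps(markup) + "</script>"
```

Each question and answer pair in the markup should match text that is visible on the page, so the structured data and the extractable passage reinforce each other.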

Topical coverage is the real defense against fan-out instability

Here is the uncomfortable truth about fan-out: the sub-queries are unstable. They change run to run. So optimizing for a single fan-out query is a losing game. Coverage is the hedge. Sites with 80%+ topical coverage retain 85.4% of their AI visibility despite 73% fan-out query instability. This is why comprehensive topic clusters outperform individually optimized pages.
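
A toy model shows why coverage works as a hedge. Everything below is hypothetical: the topic labels, the two fan-out runs, and the `coverage` function, which assumes each sub-query maps cleanly to one topic. It does not reproduce the figures above; it only illustrates the mechanism.

```python
def coverage(site_topics, sub_queries):
    """Fraction of sub-queries that map to a topic the site covers."""
    if not sub_queries:
        return 0.0
    hits = sum(1 for q in sub_queries if q in site_topics)
    return hits / len(sub_queries)

# Two hypothetical fan-out runs for the same prompt; only "pricing" repeats.
run_1 = ["pricing", "integrations", "security", "alternatives"]
run_2 = ["pricing", "onboarding", "reviews", "migration"]

narrow_site = {"pricing"}
broad_site = {
    "pricing", "integrations", "security", "alternatives",
    "onboarding", "reviews", "migration",
}

narrow = [coverage(narrow_site, run) for run in (run_1, run_2)]
broad = [coverage(broad_site, run) for run in (run_1, run_2)]
```

The narrow site matches one sub-query in four on each run, and only because "pricing" happened to repeat. The broad site matches every sub-query in both runs, even though three of the four changed.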

The stakes justify the work. A study by Seer Interactive shows that getting cited in Google's AI Overviews results in 120% more organic clicks per impression and a 41% increase in paid clicks compared with when your brand is not cited. Citation is not a vanity metric; it moves traffic and revenue.

If you want a structured way to act on all of this, start by measuring where you stand using the ARC Method audit, then sequence the work with the 90-day GEO roadmap for SaaS founders. And because fan-out can surface outdated or off-source descriptions of your product, monitor how AI describes your brand on an ongoing basis. The full framework is in my book, Reddit, AI Overviews & GEO.

The summary is simple. AI Overviews don't reward the best-ranking page. They reward the source whose passages most precisely answer the dozens of questions the user didn't know they were asking. Build for the question field, not the keyword.

Frequently asked questions

What is query fan-out in Google AI Overviews?

Query fan-out is a retrieval technique where Google expands one user query into multiple related sub-queries across subtopics and data sources, runs them simultaneously, then synthesizes the results into a single answer. Google has confirmed both AI Overviews and AI Mode may use it. Research suggests Google can fire hundreds of searches per single user prompt, with systems running up to about 20 retrieval iterations before terminating.

How many sources does an AI Overview actually cite?

The visible answer typically shows only 3–4 citations, though the system draws on more behind the scenes. The selection pipeline narrows roughly 200–500 candidate documents down to about 5–15 cited sources through semantic retrieval, E-E-A-T filtering, passage-level re-ranking, and final data fusion — far more processing than the few links users see.

Do I need to rank in the top 10 to get cited?

No. Only about 38% of AI-Overview-cited pages now rank in the organic top 10, down from 76% a year earlier. Ranking gets your content into the candidate pool, but passage structure, information density, schema, and semantic alignment with the query and draft answer decide whether you're actually cited.

Why does Google cite passages instead of whole pages?

Gemini uses passage indexing to pull specific self-contained sections into an AI Overview, because that provides more context-specific relevance to each fan-out sub-query. Analysis indicates AI favors self-contained answer units of roughly 134–167 words. This is also why forum threads like Reddit get cited heavily — a single thread contains dozens of discrete, extractable passages.

How do I optimize for query fan-out if the sub-queries keep changing?

You optimize for coverage rather than individual sub-queries. Fan-out queries are unstable, but sites with 80%+ topical coverage of their domain retain about 85.4% of their AI visibility despite roughly 73% query instability. Build comprehensive topic clusters, structure content into self-contained passages, add quantified specifics, and implement schema, which can raise selection probability by around 73%.

About the author

Cory Maki is an AI search strategist based in Taichung, Taiwan, specializing in GEO, AI reputation management, and AI branding for SaaS founders. Author of Reddit, AI Overviews & GEO and creator of the ARC Method.