Ranking vs. selection

Traditional search ranks pages against a query. Answer engines do something different: they retrieve a candidate set of strong pages, then choose which one to quote. Two pages can both rank in the top three results, yet only one will be cited in the AI answer. The difference is selection.

Selection is the moment an AI system decides "this page contains the answer in a form I can use." A page that ranks high but buries its answer five paragraphs deep loses to a page that ranks slightly lower but states the answer cleanly in its second paragraph.

> Definition: Extractability is the property of a page that lets a language model lift a self-contained answer without inference, paraphrasing across paragraphs, or guessing at structure.

What "extractability" actually means

Extractability is not the same as readability. Readability is for humans. Extractability is structural: the answer to a likely question exists as a complete unit, near the top of the relevant section, and is not split across multiple paragraphs that require the model to stitch them together.

The strongest extractable units share three properties:

  • The first sentence states the answer directly.
  • Following sentences add context, conditions, or evidence — not the answer itself.
  • The unit is preceded by a heading that matches the question being asked.
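The three properties above can be approximated mechanically. The sketch below is a hypothetical heuristic (not a published algorithm): it checks whether a section's first sentence shares key terms with its heading and is not generic preamble. The function name, stopword list, and threshold are all assumptions for illustration.

```python
import re

def answer_first(heading: str, paragraph: str) -> bool:
    """Rough heuristic: does the first sentence of `paragraph` look like a
    direct answer to `heading`? Rejects preamble openers, then requires
    keyword overlap between the heading and the opening sentence."""
    first = re.split(r"(?<=[.!?])\s+", paragraph.strip())[0].lower()
    preamble = ("in this article", "before we begin", "as we all know")
    if first.startswith(preamble):
        return False
    # Heading terms (minus stopwords) should appear in the first sentence.
    stop = {"what", "is", "the", "a", "an", "how", "of", "to", "for"}
    terms = set(re.findall(r"[a-z]+", heading.lower())) - stop
    hits = sum(1 for t in terms if t in first)
    return hits >= max(1, len(terms) // 2)

heading = "What is extractability?"
good = "Extractability is the property that lets a model lift a self-contained answer. It matters because selection is binary."
bad = "In this article we will explore many topics. Definitions come later."
print(answer_first(heading, good))  # True
print(answer_first(heading, bad))   # False
```

A real audit would use better sentence splitting and stemming, but even this crude check separates answer-first sections from buried-answer ones.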

The five selection signals AI systems use

Across observable behavior in Google AI Overviews, ChatGPT search, and Perplexity, five signals consistently correlate with selection:

  1. Answer-first structure. The answer appears in the opening sentence of its section, not in a conclusion.
  2. Heading-question alignment. H2 and H3 headings are phrased as questions or as direct topic statements.
  3. Visible-content / schema parity. JSON-LD describes content that is also visible on the page.
  4. Definition density. Technical terms are defined inline, not assumed.
  5. Source clarity. A clear author, organization, and last-updated date align with the topic.
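Signal 3, visible-content / schema parity, is the easiest of the five to verify automatically. The sketch below is a minimal, assumption-laden check: it pulls the first JSON-LD block out of a page with a regex, strips tags to approximate the visible text, and reports any schema string that never appears on the rendered page. Production tooling would use a real HTML parser; the function name and sample page are invented for illustration.

```python
import json
import re

def schema_parity(html: str) -> list[str]:
    """Return JSON-LD string values that do NOT appear in the page's
    visible text. A non-empty result suggests a schema/content mismatch."""
    m = re.search(r'<script type="application/ld\+json">(.*?)</script>',
                  html, re.S)
    data = json.loads(m.group(1)) if m else {}
    # Approximate "visible text": drop scripts, then strip all tags.
    visible = re.sub(r"<script.*?</script>", "", html, flags=re.S)
    visible = re.sub(r"<[^>]+>", " ", visible).lower()

    missing = []
    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key not in ("@type", "@context"):
                    walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)
        elif isinstance(node, str) and node.lower() not in visible:
            missing.append(node)
    walk(data)
    return missing

page = """<html><body>
<h2>What is extractability?</h2>
<p>Extractability is the property that lets a model lift an answer.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage",
 "name": "What is extractability?",
 "description": "A claim that appears nowhere on the page."}
</script></body></html>"""

print(schema_parity(page))  # ['A claim that appears nowhere on the page.']
```

Anything this check returns is a claim your JSON-LD makes that a visitor (or a crawler rendering the page) cannot see, which is exactly the parity gap the signal describes.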

Common mistakes that cause skipping

| Mistake | Why it hurts selection |
|---|---|
| Wall-of-text paragraphs | Forces the model to guess where the answer starts and ends. |
| Content hidden in tabs or accordions | Some crawlers do not see it; selection probability drops. |
| Schema describing content not visible on page | Trust signal degrades; some engines deprioritize the page. |
| Vague H2 like "Overview" or "More info" | Heading does not match any natural query. |
| Answer hidden after a long intro | The model lifts the intro instead of the answer, or skips the page. |

What to do about it

For each page that should be answer-eligible, audit three things in this order:

  1. Lead with the answer. Move the direct answer to the first paragraph under each H2.
  2. Restate the question in the heading. Replace abstract headings with question-shaped or topic-direct ones.
  3. Align schema with visible content. Every claim in your JSON-LD should be findable on the rendered page.
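Step 2, restating the question in the heading, can be spot-checked in bulk. The sketch below is a hypothetical heuristic, not a standard tool: it flags boilerplate labels like "Overview" and very short non-question headings as unlikely to match any natural query. The vague-label set and the word-count threshold are assumptions you would tune for your own site.

```python
# Boilerplate labels that match no natural query (assumed list; extend as needed).
VAGUE = {"overview", "introduction", "more info", "details", "conclusion"}

def audit_headings(headings: list[str]) -> dict[str, str]:
    """Flag headings unlikely to match a natural query. Returns a
    per-heading verdict: vague label, too generic, or ok."""
    report = {}
    for h in headings:
        if h.strip().lower() in VAGUE:
            report[h] = "vague label: replace with a question-shaped heading"
        elif len(h.split()) < 3 and not h.endswith("?"):
            report[h] = "too generic: restate the question it answers"
        else:
            report[h] = "ok"
    return report

print(audit_headings(["Overview", "What is extractability?", "Pricing"]))
```

Running this over a sitemap's extracted H2s gives a quick worklist for the heading rewrite in step 2.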