
Standard summarization is a lossy operation. You ask a model to compress a text, and you get something back that captures maybe 60% of what mattered — you just don't know which 60% got dropped.
Chain-of-Density (CoD) is a prompting technique from a 2023 paper that attacks this problem directly. Instead of asking for one summary, it forces the model to generate increasingly dense summaries through iterative passes, each one identifying what the previous iteration missed. The result: summaries that are more entity-dense, more specific, and more faithful to the original text than single-pass approaches.
This isn't magic. It's a structure that makes the model accountable for what it omits.
When you ask a model to "summarize this article in 3 sentences," you're making a tradeoff you don't control. The model decides what's important. You get what you get.
The CoD paper found that human raters preferred summaries denser than the ones a plain summarization prompt produces: more specific names, dates, claims, and relationships packed into the same token budget. But entity density is hard to optimize for directly. You can't just tell a model "be more specific." Specificity is a direction, not a target.
Chain-of-Density solves this by creating a process: generate a sparse summary, identify what's missing, generate a denser version that includes the gaps, repeat. Each iteration makes explicit what's being added. The density becomes a visible property of the output, not an accident.
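To make "density" something you can watch across passes, it helps to put a number on it. The sketch below is a crude proxy and not the paper's metric: it counts capitalized tokens and number-like tokens per word with a regular expression. The name `rough_entity_density` and the regex are assumptions for illustration; a real pipeline would likely swap in a proper named-entity recognizer.

```python
import re

# Crude stand-in for named-entity recognition (an assumption for illustration):
# a token counts if it starts with an uppercase letter or starts with a digit
# (optionally preceded by "$").
_ENTITY_LIKE = re.compile(r"^(?:[A-Z][\w'-]*|\$?\d[\w.,%$]*)$")


def rough_entity_density(summary: str) -> float:
    """Return entity-like tokens per whitespace-separated token."""
    tokens = [t.strip(".,;:!?()\"") for t in summary.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    # Skip the first token: a sentence-initial capital is not evidence of an entity.
    hits = sum(1 for t in tokens[1:] if _ENTITY_LIKE.match(t))
    return hits / len(tokens)


sparse = "The deal faces scrutiny from regulators."
dense = "The $40B deal faces scrutiny from UK CMA and FTC."
print(rough_entity_density(sparse), rough_entity_density(dense))
```

The absolute value means little with a proxy this rough. What matters is whether it rises from one pass to the next.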
The CoD pattern has four steps, run iteratively on the same source text:
1. Generate an initial summary (1-2 sentences)
2. Identify the 3 most important entities missing from the summary
3. Rewrite the summary to include those entities with more detail
4. Repeat steps 2-3 until the summary reaches target density
That's the structure. Here's what it looks like as an actual prompt.
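```python
# {text} is a placeholder; fill it with str.replace so the literal {n} is left alone.
CHAIN_OF_DENSITY_PROMPT = """
You will generate increasingly concise, entity-dense summaries of the provided text.

Following the format below, generate and show up to 5 successive passes of the summary.

Each pass must:
- Include the entities listed as missing in the previous pass
- Stay concise while adding specific detail for each new entity
- End with the most important entities still not covered

Format each pass as:
[PASS {n}]: <summary text>
Missing Entities: <comma-separated list of 3 most important entities not yet covered>

Source text: {text}

Stop when the missing entities list is empty or the summary covers all critical information.
"""
```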
The Missing Entities line is the key innovation. It forces the model to articulate what it's omitting before it omits it. That articulation is what makes the next iteration focused rather than random.
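If the whole chain is requested in a single call with the prompt above, the reply has to be split back into passes before you can use it. Here is a minimal sketch of that parsing step, assuming the model follows the `[PASS n]` / `Missing Entities:` format exactly. `parse_passes` and `PASS_PATTERN` are hypothetical names introduced for illustration.

```python
import re

# Matches one pass. The summary may sit on the same line as "Missing Entities:"
# or on the line before it, so DOTALL lets the lazy group span a newline.
PASS_PATTERN = re.compile(
    r"\[PASS\s+(\d+)\]:\s*(.*?)\s*Missing Entities:[ \t]*([^\n]*)",
    re.DOTALL,
)


def parse_passes(model_output: str) -> list[dict]:
    """Split a multi-pass reply into dicts with pass number, summary, and missing entities."""
    passes = []
    for number, summary, missing in PASS_PATTERN.findall(model_output):
        entities = [e.strip() for e in missing.split(",") if e.strip()]
        # Treat a reply such as "none" or "none significant" as an empty list.
        if len(entities) == 1 and entities[0].lower().startswith("none"):
            entities = []
        passes.append({
            "pass": int(number),
            "summary": summary.strip().strip('"'),
            "missing": entities,
        })
    return passes
```

Real model output drifts from the requested format now and then, so a production parser would probably need a fallback for replies that match nothing.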
The same pattern can also run as a loop of separate API calls, one per pass:
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def chain_of_density(text: str, max_passes: int = 5) -> list[dict]:
    """
    Run chain-of-density prompting on source text.
    Returns list of passes with summary and missing entities.
    """
    results = []
    current_missing = ["_start_"]  # sentinel to trigger first iteration

    while current_missing and len(results) < max_passes:
        if not results:
            # First pass: a sparse summary plus the entities it leaves out.
            prompt = f"""Generate a 1-2 sentence summary of this text.
Then list the 3 most important entities (names, dates, claims, events) that your summary does NOT cover.

Format:
Summary: <your summary>
Missing: <3 entities not covered>

Text: {text}"""
        else:
            # Later passes: fold the previously missing entities into the summary.
            prev = results[-1]
            prompt = f"""Rewrite this summary to include the missing entities.
Keep it concise but add specific details for each missing entity.
Then identify 3 new important entities still not covered.

Previous Summary: {prev['summary']}
Previously Missing: {', '.join(prev['missing'])}

Format:
Summary: <rewritten summary with added detail>
Missing: <3 new entities not yet covered>

Text: {text}"""

        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=300,
            messages=[{"role": "user", "content": prompt}],
        )

        parsed = _parse_cod_response(response.content[0].text)
        current_missing = parsed['missing']
        results.append(parsed)

    return results


def _parse_cod_response(text: str) -> dict:
    """Parse the model output into structured fields."""
    lines = [l.strip() for l in text.split('\n')]
    summary_line = [l for l in lines if l.startswith('Summary:')]
    missing_line = [l for l in lines if l.startswith('Missing:')]

    # removeprefix strips only the leading label, not later occurrences in the text.
    summary = summary_line[0].removeprefix('Summary:').strip() if summary_line else ''
    missing_text = missing_line[0].removeprefix('Missing:').strip() if missing_line else ''
    missing = [e.strip() for e in missing_text.split(',') if e.strip()]
    # A reply such as "none" means nothing is left to add, so the loop should stop.
    if len(missing) == 1 and missing[0].lower().startswith('none'):
        missing = []

    return {"summary": summary, "missing": missing}
```
Running this on a 2,000-word news article about a semiconductor merger, the first pass looks like:
[PASS 1]: "NVIDIA's proposed acquisition of Arm Holdings faces regulatory scrutiny in multiple jurisdictions." Missing Entities: UK CMA, FTC, semiconductor IP, Jensen Huang, 2026 timeline
Second pass:
[PASS 2]: "NVIDIA's proposed $40B acquisition of Arm Holdings faces pushback from UK CMA and US FTC regulators, who cite concerns over semiconductor IP consolidation." Missing Entities: Jensen Huang, SoftBank, cross-licensing agreements, GPU market dominance
Third pass:
[PASS 3]: "NVIDIA's $40B acquisition of Arm Holdings — backed by SoftBank — faces regulatory headwinds from UK CMA and US FTC over semiconductor IP consolidation, with concerns centered on GPU market dominance and cross-licensing implications for chipmakers including AMD, Intel, and Qualcomm." Missing Entities: none significant
The third pass is what you'd actually use. It's specific, it names the parties, it captures the stakes, and it is roughly tweet-length. The iteration made that quality visible and deliberate.
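One cheap sanity check on a chain like this is whether each pass actually absorbed the entities the previous pass listed as missing. The sketch below uses a case-insensitive substring match, which is an assumption that will miss paraphrases (for example "the CEO" for "Jensen Huang"). `uncovered_entities` is a hypothetical helper name.

```python
def uncovered_entities(previous_missing: list[str], new_summary: str) -> list[str]:
    """Return entities that were listed as missing and still do not appear verbatim."""
    haystack = new_summary.lower()
    return [e for e in previous_missing if e.lower() not in haystack]


pass_1_missing = ["UK CMA", "FTC", "semiconductor IP", "Jensen Huang", "2026 timeline"]
pass_2_summary = (
    "NVIDIA's proposed $40B acquisition of Arm Holdings faces pushback from "
    "UK CMA and US FTC regulators, who cite concerns over semiconductor IP consolidation."
)

print(uncovered_entities(pass_1_missing, pass_2_summary))
# → ['Jensen Huang', '2026 timeline']
```

Entities that stay uncovered are not necessarily a failure. The model may have judged them less important on a second look, but the check makes that decision visible.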
CoD is not always the right tool. Here's the honest comparison:
| Scenario | Use CoD | Use Standard Summary |
|---|---|---|
| News article, research paper, earnings call | ✅ | |
| Slack thread, casual email, quick update | | ✅ |
| When you need specific entities (names, dates, numbers) | ✅ | |
| When you just need the gist fast | | ✅ |
| Evaluation use case (measuring recall of specific facts) | ✅ | |
| User-facing summarization with latency constraints | | ✅ |
The latency issue is real: CoD makes 3-5 API calls per document. For a consumer UI, that's too slow. For internal tools, research pipelines, and evaluation workflows, the quality difference is worth the extra latency.
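The arithmetic behind that overhead is simple when every pass re-sends the source text. A rough estimator follows. The overhead and summary sizes are placeholder assumptions rather than measurements, and so is the 2,700-token figure for a 2,000-word article.

```python
def estimate_input_tokens(
    source_tokens: int,
    passes: int,
    prompt_overhead: int = 80,   # assumed size of the instructions in each prompt
    summary_tokens: int = 100,   # assumed size of the previous summary
) -> int:
    """Rough input-token total when every pass re-sends the source text."""
    total = 0
    for i in range(passes):
        total += source_tokens + prompt_overhead
        if i > 0:
            # From the second pass on, the previous summary is part of the prompt.
            total += summary_tokens
    return total


single = estimate_input_tokens(2700, passes=1)
three = estimate_input_tokens(2700, passes=3)
print(single, three, round(three / single, 2))
# → 2780 8540 3.07
```

Because the source text dominates each prompt, input cost grows almost linearly with the number of passes, and latency does too since the calls run one after another.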
CoD transfers well to code summarization. The pattern: generate a summary of a function, identify what's not explained (edge cases, dependencies, return conditions), rewrite with those details included.
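```python
def summarize_function_cod(func_code: str) -> str:
    """Apply chain-of-density to code explanation."""
    prompt = f"""Explain what this function does in 2 sentences.
Then identify 3 important implementation details (edge cases, dependencies, state changes, or return conditions) not yet covered.

Rewrite the explanation to include those details.

Code:
{func_code}"""

    # Reuses the `client` created in the earlier example; max_tokens is an arbitrary choice.
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text
```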
The entity-dense principle works the same way: specific function names, parameter constraints, exception cases, and side effects are the "entities" of code explanation. A CoD summary tells you not just what the function does but what to watch out for.
**Token cost is 3-5x standard summarization.** Each pass is a separate API call that re-sends the source text. Budget accordingly: this is for internal pipelines, not high-volume consumer features.
**The Missing Entities line is load-bearing.** Don't remove it. The iteration structure only works because each pass has to explicitly name what's being omitted. Without that constraint, the model just rewrites the same summary slightly differently.
**Quality depends on the model.** Claude and GPT-4 class models handle the iteration well. Smaller models tend to plateau early: they run out of "new" entities to add and start rephrasing existing content. Test on your target model before building a pipeline around it.
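A plateau of that kind can be flagged mechanically by checking whether a pass introduces any new entity-like terms at all. The sketch below uses a regex proxy for entities (capitalized words and tokens starting with a digit), which is an assumption and will misfire on sentence-initial words. `entity_like_terms` and `added_terms` are hypothetical names.

```python
import re


def entity_like_terms(text: str) -> set[str]:
    """Crude proxy for entities: capitalized words and number-like tokens, lowercased."""
    return {m.lower() for m in re.findall(r"(?:\b[A-Z][\w-]*|\$?\d\w*)", text)}


def added_terms(previous: str, current: str) -> set[str]:
    """Entity-like terms present in the current pass but not in the previous one."""
    return entity_like_terms(current) - entity_like_terms(previous)


prev = "NVIDIA plans to buy Arm."
rephrased = "NVIDIA intends to acquire Arm."
denser = "NVIDIA plans to buy Arm for $40B, pending FTC review."

print(added_terms(prev, rephrased))       # empty set: nothing new, likely a plateau
print(sorted(added_terms(prev, denser)))  # ['$40b', 'ftc']
```

An empty result on two consecutive passes is a reasonable signal to stop iterating early on that model.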
**The last pass isn't always the best.** Sometimes an earlier pass has the right density for the use case. Iterate until Missing Entities is empty, but use judgment about which pass to actually use.
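When the deciding constraint is length, that judgment can be partly automated. Here is a minimal sketch, assuming passes are ordered from sparse to dense and shaped like the `{"summary": ..., "missing": ...}` dicts returned by `chain_of_density`. `pick_pass` is a hypothetical helper, and a word count is only a stand-in for whatever budget you actually have.

```python
def pick_pass(passes: list[dict], max_words: int) -> dict:
    """Return the latest (densest) pass whose summary fits within max_words."""
    if not passes:
        raise ValueError("no passes to choose from")
    fitting = [p for p in passes if len(p["summary"].split()) <= max_words]
    # If nothing fits, fall back to the first, sparsest pass.
    return fitting[-1] if fitting else passes[0]


example = [
    {"summary": "a b", "missing": ["x"]},
    {"summary": "a b c d", "missing": ["y"]},
    {"summary": "a b c d e f g h", "missing": []},
]
print(pick_pass(example, max_words=5)["summary"])
# → a b c d
```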
CoD works because it externalizes the tradeoff between concision and coverage. Standard summarization makes this tradeoff implicitly — the model decides, you get what you get. CoD makes the tradeoff explicit and iterative, which means you can see exactly what was omitted and why the final version includes what it does.
That's useful beyond the specific technique. Any time you're working with a model and the quality is "close but not quite," consider whether the problem is that you're asking for the output without asking the model to account for what it's leaving out.
Chain-of-Density prompting. Paper: "From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting" (Adams et al., 2023). 3-5 iterative passes per document. Best for research pipelines, evaluation workflows, and internal tools where latency is acceptable.