AI models ranked for creative writing - fiction, poetry, screenwriting, and storytelling. Scored with bonuses for large output windows (long chapters) and extended context (maintaining story coherence across long narratives).
| # | Model | Score |
|---|---|---|
| 1 | Claude Opus 4.7Anthropic | 95 |
| 2 | GPT-5.5OpenAI | 93 |
| 3 | Gemini 3.1 Pro Preview Custom ToolsGoogle | 92 |
| 4 | Gemini 3.1 Pro PreviewGoogle | 92 |
| 5 | GPT-5.4 ProOpenAI | 92 |
| 6 | GPT-5.4OpenAI | 92 |
| 7 | GPT-5.5 ProOpenAI | 91 |
| 8 | GPT-5.2 ProOpenAI | 91 |
| 9 | Claude Opus 4.6 (Fast)Anthropic | 90 |
| 10 | Claude Opus 4.6Anthropic | 90 |
| 11 | GPT-5.2-CodexOpenAI | 90 |
| 12 | GPT-5.2OpenAI | 90 |
| 13 | GPT-5.3-CodexOpenAI | 89 |
| 14 | GPT-5 ProOpenAI | 89 |
| 15 | Gemini 3 Flash PreviewGoogle | 88 |
| 16 | GPT-5.1-Codex-MaxOpenAI | 88 |
| 17 | GPT-5 CodexOpenAI | 88 |
| 18 | GPT-5OpenAI | 88 |
| 19 | GPT-5.1OpenAI | 87 |
| 20 | GPT-5.1-CodexOpenAI | 87 |
| 21 | GPT-5.1-Codex-MiniOpenAI | 87 |
| 22 | DeepSeek V4 ProDeepSeek | 87 |
| 23 | o3 Deep ResearchOpenAI | 87 |
| 24 | o3 ProOpenAI | 87 |
| 25 | o3OpenAI | 87 |
| 26 | GPT-5.3 ChatOpenAI | 87 |
| 27 | Claude Sonnet 4.6Anthropic | 85 |
| 28 | Claude Opus 4.5Anthropic | 85 |
| 29 | GPT-5.1 ChatOpenAI | 87 |
| 30 | Gemini 2.5 ProGoogle | 84 |
Large output windows (16K+ tokens) let AI write full chapters in one go. Extended context (128K+) helps maintain character consistency and plot coherence across long narratives.
Higher-quality models excel at wordplay, meter, and emotional resonance. Streaming enables real-time collaboration as the AI generates verse-by-verse.
Models with strong instruction-following produce well-formatted screenplays. Function calling can integrate with outline tools to maintain story structure.
Large context windows are essential for maintaining consistency in fantasy and sci-fi worlds. Feed the AI your existing lore documents and it can generate content that fits seamlessly.
Claude and GPT-4o lead for creative prose, producing varied sentence structures, unexpected metaphors, and authentic character voices. Claude tends toward literary fiction styles while GPT-4o excels at genre fiction. Both avoid the repetitive patterns common in smaller models.
Models with 128K+ context windows can maintain character voices, plot threads, and tonal consistency across 50,000+ words when given proper context. Gemini 2.5 Pro and Claude excel here. Shorter-context models lose consistency after a few chapters.
Use specific style instructions, provide writing samples to emulate, and choose models ranked high for creativity benchmarks. Avoid models optimized purely for factual accuracy - they tend to produce flat, encyclopedic prose. Temperature settings between 0.7-0.9 help.
Claude excels at poetry with natural rhythm and imagery. GPT-4o handles screenwriting format conventions well. For songwriting, models with music training data produce better rhyme schemes and meter. All benefit from genre-specific prompting.