LMC ValueScore compresses two things into a single integer: how good a model is at what you are asking it to do, and how much it costs per million tokens. This page walks through the full formula, the empirical price anchors we use, the rationale behind every design choice, and the failure modes we deliberately engineered around.
The composite quality score (0-100) comes from our standard benchmark-driven scoring pipeline (90 percent benchmarks, 5 percent capabilities, 5 percent context window).
Anchors are computed from 290 priced text models in the Q2 2026 catalog snapshot (38 free-tier models and 24 image/video models excluded). They are frozen quarterly to keep rankings stable: Morningstar, MSCI, and S&P rating systems all use the same frozen-window approach for the same reason.
The high-value pick wins the ValueScore ranking because it hits a good quality score at a blended price near our p10 anchor. The frontier flagship has higher raw quality but sits above the p90 anchor, so the soft-floor pScore pulls its ValueScore down; buyers paying for absolute quality know exactly what trade-off they are making.
A naive ratio would always put free models at the top. Our shadow price (2 times p10) prevents this by giving free models a realistic stand-in for their effective cost, so they compete on quality like everyone else.
In an earlier version Claude Opus 4.6 scored exactly 0 because its price sat at the p90 anchor, producing a meaningless "premium flagship = 0" headline. The soft floor of 1 on pScore means Opus now lands around 15: honest ("not a best-value pick") without being nonsensical.
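The soft floor itself is a clamp on the price score. The exact pScore curve is not specified on this page, so the log-scale interpolation between the anchors below is an assumption; only the floor of 1 and the anchor endpoints come from the text.

```python
import math

P10, P90 = 0.087, 4.80  # Q2 2026 anchors, $/M

def p_score(price: float) -> float:
    """Hypothetical price score: 100 at the p10 anchor, falling to the
    soft floor of 1 at p90. Log-interpolation between anchors is an
    assumption; the floor prevents a hard zero at the p90 price."""
    span = math.log(P90) - math.log(P10)
    raw = 100 * (math.log(P90) - math.log(price)) / span
    return max(1.0, min(100.0, raw))

print(p_score(4.80))   # 1.0 -- floored, no longer a meaningless zero
print(p_score(0.087))  # 100.0
```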
Version 1 hard-gated models below composite 40, turning a 0.1-point input change into a 50-point swing in ValueScore. The smooth sigmoid gate (centered at 50, steepness 0.25) replaces that cliff with a transition band wider than benchmark noise.
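The gate parameters from the text translate directly into a logistic function. How the gate feeds into the final score (e.g. as a multiplier) is not specified here, so this sketch shows only the gate itself.

```python
import math

def quality_gate(composite: float, center: float = 50.0, k: float = 0.25) -> float:
    """Smooth sigmoid gate replacing v1's hard cutoff: near 0 for weak
    models, near 1 for strong ones, with a wide transition band around
    `center` instead of a cliff."""
    return 1.0 / (1.0 + math.exp(-k * (composite - center)))

# A 0.1-point input change now moves the gate smoothly:
print(round(quality_gate(39.9), 3))  # 0.074
print(round(quality_gate(40.0), 3))  # 0.076
```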
Treating list output price as real cost would rank o1-pro, o3, and R1 artificially high. Our reasoning expansion factor multiplies the output contribution by 3x to 15x depending on the model family, based on numbers published in the o1 system card and DeepSeek R1 paper.
If we recomputed anchors on every cache refresh, a single price change from a big provider could reshuffle the ranking for models that did not change at all. Frozen quarterly anchors keep rank movement tied to real changes in either quality or price, not to anchor volatility.
A naive quality-over-price ratio explodes as price approaches zero, so a $0 model with quality 40 would beat a $0.10 model with quality 95. Cobb-Douglas utility is the standard economic substitute: U = Q^alpha times P^(1-alpha), where P is the price score (higher means cheaper), cannot be dominated by sending any one component to an extreme. It also has a clean normative interpretation as constant-elasticity preferences, which gives us a defensible one-sentence explanation.
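With both Q and the price score on a 0-100 scale, the Cobb-Douglas form stays on 0-100 by construction (100^alpha x 100^(1-alpha) = 100). A sketch, using the 0.6 quality weight stated in the next paragraph; the 0-100 normalization of both inputs is an assumption.

```python
ALPHA = 0.6  # quality weight (see the 60/40 discussion on this page)

def value_score(quality: float, price_score: float) -> float:
    """Cobb-Douglas blend U = Q^alpha * P^(1-alpha), where P is the
    price score (higher = cheaper). With both inputs on 0-100 the
    result stays on 0-100, and neither a free price nor a perfect
    quality score can dominate on its own."""
    return (quality ** ALPHA) * (price_score ** (1 - ALPHA))

# A quality-40 model at the best possible price score no longer beats
# a quality-95 model that is nearly as cheap:
free_low = value_score(40, 100)
cheap_high = value_score(95, 90)
assert cheap_high > free_low
```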
We chose 60 percent quality, 40 percent price based on what production buyers repeatedly tell us matters more: a cheap-but-unreliable model is a worse purchase than a slightly-expensive-but-dependable one. A pure 50/50 would let a marginal-quality model leapfrog a clearly-better model on small price differences. Higher than 60 would make price almost irrelevant, which defeats the purpose of a value index. A future version may let you pass your own alpha as a URL parameter.
Quarterly. The current anchors (p10 = $0.087/M, p90 = $4.80/M) were computed from the Q2 2026 snapshot of 290 priced text models. We deliberately avoid recomputing on every cache refresh because unstable anchors produce unstable rankings that churn for reasons unrelated to the models. Morningstar and MSCI rating systems take the same approach: stability is a feature, not a bug.
Free models use a shadow price of 2x the p10 anchor ($0.174/M) to prevent automatic domination of the ranking. Image-generation and video-generation models return null because their cost structures (per-image, per-second) are not comparable to per-token pricing and mixing them would give misleading results. If you need a value-adjusted ranking for image or video models, we are considering a separate index for them.
Yes. Our composite quality score has a standard error of about 2-4 points from benchmark variance alone, and the price anchors add another 1-2 points of uncertainty from the rolling catalog. An LMC ValueScore of 73.4 vs 73.1 is within noise, and reporting decimals would imply a false precision we cannot back up. Integers communicate the true resolution of the underlying data.
Reasoning models (OpenAI o1/o3/o4, DeepSeek R1, Alibaba QwQ, and their distilled variants) bill hidden "thinking" tokens on top of the visible output. Our blended price multiplies the output-token contribution by a family-specific expansion factor published in the model system cards: o1-pro=15x, o1=8x, o3=10x, R1=8x, QwQ=6x. Without this adjustment, reasoning models look 3-15 times cheaper than they actually are.