Word Count and Reading Time: Why Estimates Vary and What's Actually Accurate

Different tools count words differently, and reading-time estimates for the same article can differ by a third or more across platforms. Learn what counts (and what doesn't), why Medium's estimate differs from yours, and how to make more accurate reading-time predictions.

8 min read

Microsoft Word, Google Docs, your CMS, and your custom React component will all give you different word counts for the same text. Reading-time estimates vary even more wildly. The differences come from decisions about what counts as a word — and most tools don't document those decisions.

What counts as a word?

Word counting sounds trivial. It isn't. Every counter has to make decisions such as:

  • Hyphenated words: Is "state-of-the-art" one word or four? Most counters say one. Some scientific tools say four.
  • Contractions: Is "don't" one word or two? Almost universally: one.
  • URLs: Is "https://example.com" a word? Some counters skip URLs entirely; others count the full URL as one word.
  • Numbers and dates: Is "2024" a word? "2024-10-27"? Most counters include numbers as words; some exclude them.
  • Code blocks: Should code be word-counted? Markdown processors disagree.
  • Acronyms: "U.S.A." — one word or three? Usually one.

Word counts can vary by 5–10% across tools for the same text, especially in technical content.
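
To see how these decisions move the total, here is a minimal sketch of a counter with two switchable policies. The function name countWordsWithPolicy and the option names splitHyphens and skipUrls are hypothetical, chosen for illustration only; they are not taken from any real tool.

```javascript
// Hypothetical helper: counts words under two configurable policies.
// The option names are illustrative, not part of any real library.
function countWordsWithPolicy(text, { splitHyphens = false, skipUrls = false } = {}) {
  let tokens = text.trim().split(/\s+/).filter(Boolean);

  if (skipUrls) {
    // Drop tokens that start with an http(s) scheme.
    tokens = tokens.filter((token) => !/^https?:\/\//i.test(token));
  }

  if (splitHyphens) {
    // Treat each hyphen-separated part as its own word.
    tokens = tokens.flatMap((token) => token.split('-').filter(Boolean));
  }

  return tokens.length;
}

const sample = "Don't miss our state-of-the-art guide at https://example.com today";
console.log(countWordsWithPolicy(sample));                         // 8
console.log(countWordsWithPolicy(sample, { splitHyphens: true })); // 11
console.log(countWordsWithPolicy(sample, { skipUrls: true }));     // 7
```

The same sentence yields 7, 8, or 11 words depending on the policy, which is why two tools rarely agree on technical text.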

The standard algorithm

The most common approach (roughly what Word, Google Docs, and most JavaScript libraries do):

  1. Strip HTML tags (if HTML).
  2. Replace whitespace runs with single spaces.
  3. Split on whitespace.
  4. Count non-empty resulting tokens.

This treats "state-of-the-art" as one word (hyphens stay attached) and "U.S.A." as one word (periods stay attached). URLs typically count as one word because they have no internal whitespace.

Reading time math

The standard formula:

reading_time_minutes = word_count / WPM

The contentious value is WPM (words per minute). Different platforms use different defaults:

  • Medium: 265 WPM (relatively fast).
  • Substack: 200 WPM.
  • Common default for blog posts (figure attributed to Wikipedia): 250 WPM.
  • Academic reading rate: 200–250 WPM.
  • Technical reading (with comprehension): 100–200 WPM.

A 2,000-word article comes out at about 7.5 minutes at Medium's rate and 10 minutes at Substack's. Same article, with one reported reading time roughly 33% longer than the other.
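
The 2,000-word comparison can be reproduced with a few lines of JavaScript. This is a sketch: the WPM values are the ones quoted in this article, and rounding up to a whole minute with a minimum of 1 is an assumption, since platforms differ in how they round. The names PLATFORM_WPM, readingTimeMinutes, and formatReadingTime are illustrative.

```javascript
// WPM defaults as quoted in this article; treat them as assumptions.
const PLATFORM_WPM = { medium: 265, substack: 200 };

// Basic formula: minutes = words / WPM.
function readingTimeMinutes(wordCount, wpm) {
  if (!(wpm > 0)) throw new RangeError('wpm must be a positive number');
  return wordCount / wpm;
}

// Assumed display rule: round up to whole minutes, never below 1.
function formatReadingTime(wordCount, wpm) {
  const minutes = Math.max(1, Math.ceil(readingTimeMinutes(wordCount, wpm)));
  return `${minutes} min read`;
}

console.log(readingTimeMinutes(2000, PLATFORM_WPM.medium).toFixed(2));   // "7.55"
console.log(readingTimeMinutes(2000, PLATFORM_WPM.substack).toFixed(2)); // "10.00"
console.log(formatReadingTime(2000, PLATFORM_WPM.medium));               // "8 min read"
```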

Why reading-time estimates lie

Real reading speed varies enormously: 100–600 WPM depending on familiarity with the topic. Reading-time estimates are best understood as "commitment signals" rather than accurate predictions. They help users decide whether to start; they rarely match actual time spent.

Adjustments for content type

Naive word/WPM math ignores content density:

  • Code blocks: typically read at half the WPM of prose. Add ~30 seconds per code block to estimates.
  • Math equations: very slow — 60 seconds per equation isn't unreasonable.
  • Tables and figures: add 10–30 seconds per item.
  • Images: users typically spend 5–10 seconds; add per image.
  • Video embeds: reading time should not include video duration; consider showing both separately.

A more realistic formula:

reading_time_minutes = (prose_words / 250) + (code_blocks × 0.5) + (images × 0.15)
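
The adjusted formula translates directly into code. In this sketch the function name adjustedReadingTimeMinutes and its parameter names are hypothetical; the constants are the ones from the formula above (250 WPM for prose, 0.5 minutes per code block, 0.15 minutes per image).

```javascript
// Sketch of the adjusted formula; the result is in minutes.
function adjustedReadingTimeMinutes({ proseWords = 0, codeBlocks = 0, images = 0 } = {}) {
  const PROSE_WPM = 250;
  const MINUTES_PER_CODE_BLOCK = 0.5; // ~30 seconds each
  const MINUTES_PER_IMAGE = 0.15;     // ~9 seconds each

  return (
    proseWords / PROSE_WPM +
    codeBlocks * MINUTES_PER_CODE_BLOCK +
    images * MINUTES_PER_IMAGE
  );
}

const minutes = adjustedReadingTimeMinutes({ proseWords: 1500, codeBlocks: 4, images: 3 });
console.log(minutes.toFixed(2)); // "8.45"
```

For this example the naive estimate would be 6 minutes (1,500 / 250); code blocks and images add almost two and a half more.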

Character count vs word count

Some platforms (Twitter/X, SMS, app store descriptions) limit by character, not word. Decisions there:

  • With or without spaces: Twitter counts spaces; some legal limits don't.
  • Unicode counting: "😀" is 1 character visually, but it is 4 bytes in UTF-8 or 2 UTF-16 code units. Twitter counts emojis as 2 characters.
  • Combining characters: "é" can be 1 code point or 2 (e + combining accent). NFC normalization composes the two-code-point form into one, so normalizing before counting changes the result.
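
These differences are easy to observe in JavaScript, whose strings are sequences of UTF-16 code units. The sketch below uses only standard built-ins (String.prototype.normalize, TextEncoder, Intl.Segmenter); the helper name countGraphemes is illustrative, and Intl.Segmenter requires a reasonably recent runtime.

```javascript
const emoji = '\u{1F600}'; // the "😀" emoji
console.log(emoji.length);                           // 2 (UTF-16 code units)
console.log([...emoji].length);                      // 1 (code points)
console.log(new TextEncoder().encode(emoji).length); // 4 (UTF-8 bytes)

const composed = '\u00E9';    // "é" as a single code point
const decomposed = 'e\u0301'; // "e" followed by a combining acute accent
console.log(composed === decomposed);                  // false
console.log(composed === decomposed.normalize('NFC')); // true
console.log([...decomposed].length);                   // 2
console.log([...decomposed.normalize('NFC')].length);  // 1

// Counts user-perceived characters (grapheme clusters).
function countGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return [...segmenter.segment(text)].length;
}

console.log(countGraphemes(decomposed)); // 1
```

Which of these counts is "right" depends on the limit being enforced: storage limits usually care about bytes, while user-facing limits are closer to grapheme clusters.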

Implementing reliable word count in code

For JavaScript:

function countWords(text) {
  return text
    .replace(/<[^>]*>/g, ' ')   // strip HTML
    .replace(/\s+/g, ' ')        // normalize whitespace
    .trim()
    .split(' ')
    .filter(Boolean)
    .length;
}

For text with mixed scripts (CJK + Latin), whitespace splitting breaks down. Chinese and Japanese are written without spaces between words, so you need a segmentation library (e.g., jieba for Chinese, kuromoji for Japanese) for accurate word counts. For approximate word counts in CJK, divide character count by 1.5 as a rough heuristic.
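
When a segmentation library is more than you need, the 1.5 heuristic can be combined with ordinary whitespace splitting. The following is only a rough sketch: the function name estimateMixedWordCount is hypothetical, the choice to treat Han, Hiragana, and Katakana characters as "CJK" is an assumption, and the result is an estimate rather than a true word count.

```javascript
// Rough estimate for mixed CJK + Latin text.
// A heuristic, not a substitute for a real segmenter.
function estimateMixedWordCount(text) {
  // Assumption: Han, Hiragana, and Katakana characters count as CJK here.
  const cjkPattern = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]/gu;
  const cjkChars = (text.match(cjkPattern) || []).length;

  // Blank out the CJK characters, split the rest on whitespace, and keep
  // only tokens with at least one letter or digit (drops stray punctuation).
  const otherWords = text
    .replace(cjkPattern, ' ')
    .split(/\s+/)
    .filter((token) => /[\p{L}\p{N}]/u.test(token)).length;

  return Math.round(cjkChars / 1.5) + otherWords;
}

console.log(estimateMixedWordCount('我喜欢阅读 React documentation')); // 5
```

Here the five Han characters contribute an estimated 3 words (5 / 1.5, rounded) and the Latin text contributes 2.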

SEO implications

Search engines don't directly use reading time, but content length tends to correlate with ranking in some contexts:

  • Pages targeting informational queries with under 300 words often rank poorly, likely because they don't look comprehensive.
  • 1,500–2,500 words is often cited as the sweet spot for "how-to" and educational content, based on Backlinko's analysis.
  • Reading time displayed in metadata helps users commit and may reduce bounce rate (whether bounce rate itself affects ranking is disputed).

Word counts for typography mockups

When mocking up layouts before content exists, these averages are useful targets:

  • Average English word: 4.7 characters + 1 space = 5.7 typed characters.
  • Average sentence: 15–20 words.
  • Average paragraph: 3–5 sentences (roughly 45–100 words).
  • Standard column width: 60–75 characters per line for optimal readability.
  • Average page (8.5×11, single-spaced, 12pt): ~500 words.
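
These averages can be turned into a quick sizing estimate for placeholder copy. This is a sketch under the figures listed above: the function name estimateMockupSize is hypothetical, and the default of 66 characters per line is simply one value picked from the 60–75 range.

```javascript
const AVG_TYPED_CHARS_PER_WORD = 5.7; // 4.7 letters + 1 space, per the list above

// Estimates how many characters and lines a block of placeholder text occupies.
function estimateMockupSize(wordCount, charsPerLine = 66) {
  const characters = Math.round(wordCount * AVG_TYPED_CHARS_PER_WORD);
  const lines = Math.ceil(characters / charsPerLine);
  return { characters, lines };
}

const page = estimateMockupSize(500);
console.log(page.characters); // 2850
console.log(page.lines);      // 44
```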

Key Takeaways

  • Word counts vary 5–10% across tools because of decisions about hyphens, URLs, numbers, and acronyms.
  • Reading-time estimates use 200–265 WPM. The same article can show a reading time about 33% longer on one platform than on another.
  • Real reading time varies 100–600 WPM by topic familiarity. Estimates are commitment signals, not predictions.
  • Adjust reading time for code blocks (~0.5 min each), images (~10 sec each), tables, and equations.
  • Character counts have Unicode complications: emojis count as 2 on Twitter; combining characters depend on normalization.

Get accurate word, character, and reading-time stats on any text using our Word Counter.