Burstiness in AI Detection

What Is Burstiness?

In AI text detection, burstiness refers to how much perplexity (a measure of how surprising a piece of text is to a language model) varies from sentence to sentence within a document. GPTZero popularized it as a detection signal in 2023.
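Since burstiness is built on sentence-level perplexity, it may help to see how one sentence's perplexity is commonly derived. The sketch below assumes a scoring model has already supplied per-token natural-log probabilities. The function name and the sample values are hypothetical, chosen only for illustration.

```python
import math


def sentence_perplexity(token_logprobs):
    """Perplexity of one sentence: exp of the mean negative log-probability.

    token_logprobs is assumed to hold natural-log probabilities, one per
    token, as produced by whatever scoring model is in use.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_neg_logprob = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_neg_logprob)


# Hypothetical values: every token predicted with probability 0.5 vs. 0.05.
predictable = [math.log(0.5)] * 4
surprising = [math.log(0.05)] * 4

print(sentence_perplexity(predictable))
print(sentence_perplexity(surprising))
```

The less probable the tokens, the higher the perplexity, which is what makes it usable as a measure of surprise.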

Formally, burstiness is often computed as the coefficient of variation of sentence-level perplexities, that is, their standard deviation σ divided by their mean μ:

burstiness = σ(sentence_perplexities) / μ(sentence_perplexities)

A high burstiness score means perplexity varies a lot from sentence to sentence. A low score means perplexity is uniform across the document.
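The coefficient of variation can be sketched in a few lines. This version assumes the sentence perplexities are already available as a list of numbers and uses the population standard deviation. That choice is an assumption, since the text does not say whether the population or sample form is intended.

```python
import statistics


def burstiness(sentence_perplexities):
    """Coefficient of variation: standard deviation divided by mean."""
    if len(sentence_perplexities) < 2:
        raise ValueError("need at least two sentences to measure variation")
    mean = statistics.fmean(sentence_perplexities)
    if mean == 0:
        raise ValueError("mean perplexity must be non-zero")
    # Population standard deviation (an assumption, see the note above).
    return statistics.pstdev(sentence_perplexities) / mean


# Hypothetical perplexity values, made up for illustration.
uniform_doc = [20.0, 21.0, 19.0, 20.0]
varied_doc = [5.0, 60.0, 12.0, 90.0]

print(burstiness(uniform_doc))
print(burstiness(varied_doc))
```

Dividing by the mean makes the score scale-free, so documents with different average perplexity can be compared on the same footing.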

Why Human Writing Is Bursty

Human writers are not optimizing for statistical consistency. They write some sentences effortlessly and others laboriously. They use creative metaphors, unexpected turns of phrase, and idiosyncratic word choices — interspersed with more formulaic passages. This creates high variance in how "surprising" their text is to a language model.

Why AI Text Has Low Burstiness

AI language models are trained to produce coherent, fluent text, and they generate it by favoring likely continuations. The result tends to be consistently probable, neither very surprising nor very predictable, which yields unusually uniform sentence-level perplexity. The "bursts" of high-perplexity text that characterize human writing are largely absent.

Burstiness in Practice

Burstiness is most powerful as a complementary signal to raw perplexity. A text can have moderate average perplexity (and so escape perplexity-only detectors) but very low burstiness, which suggests, though does not prove, AI origin. Used together, perplexity and burstiness catch cases that either signal would miss alone.
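A minimal sketch of combining the two signals follows. The thresholds, the function name, and the simple "either signal" rule are all hypothetical. They are placeholders for illustration, not values taken from any real detector.

```python
import statistics


def assess(sentence_perplexities, ppl_threshold=30.0, burst_threshold=0.3):
    """Report both signals; thresholds are illustrative placeholders."""
    mean_ppl = statistics.fmean(sentence_perplexities)
    cv = statistics.pstdev(sentence_perplexities) / mean_ppl
    low_perplexity = mean_ppl < ppl_threshold
    low_burstiness = cv < burst_threshold
    return {
        "mean_perplexity": mean_ppl,
        "burstiness": cv,
        "low_perplexity": low_perplexity,
        "low_burstiness": low_burstiness,
        # Flag when either signal looks machine-like (an assumed rule).
        "flagged": low_perplexity or low_burstiness,
    }


# Hypothetical document: moderate average perplexity, but very uniform.
uniform_moderate = assess([40.0, 41.0, 39.0, 40.0])
# Hypothetical document: similar average range, but highly varied.
varied = assess([10.0, 95.0, 25.0, 70.0])

print(uniform_moderate["flagged"], varied["flagged"])
```

The first document passes the perplexity check but fails the burstiness check, which is the case described above. A real system would likely calibrate thresholds on labeled data and treat the result as a score rather than a verdict.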

Sophisticated humanizer tools now specifically target burstiness — inserting high-perplexity sentences to create artificial variance. This is an active area of the detection arms race.
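The effect of this tactic on the score can be illustrated with made-up numbers. The sketch below assumes the coefficient-of-variation definition with population standard deviation, and all perplexity values are hypothetical.

```python
import statistics


def burstiness(sentence_perplexities):
    """Coefficient of variation of sentence-level perplexities."""
    mean = statistics.fmean(sentence_perplexities)
    return statistics.pstdev(sentence_perplexities) / mean


# Hypothetical uniform document, as low-burstiness text might look.
original = [20.0, 21.0, 19.0, 20.0, 20.0, 21.0]

# The same document after two high-perplexity sentences are inserted.
perturbed = original[:2] + [85.0] + original[2:5] + [95.0] + original[5:]

print(burstiness(original))
print(burstiness(perturbed))
```

A handful of inserted outliers is enough to raise the score sharply, which is why burstiness alone is fragile against this kind of manipulation.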
