False Positive Rate

What Is the False Positive Rate?

The false positive rate (FPR) measures how often a detector incorrectly classifies a negative case as positive. In AI content detection, this means how often human-written text is wrongly flagged as AI-generated.

Formally: FPR = False Positives / (False Positives + True Negatives)

A detector with a 10% FPR will incorrectly flag about 1 in 10 human-written texts as AI-generated. At scale the absolute numbers grow quickly: across 10,000 human-written submissions, that is roughly 1,000 false accusations.
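The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular tool's implementation; the function name `false_positive_rate` and the label convention (1 for AI-generated, 0 for human-written) are assumptions made for the example.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN).

    Assumed label convention: 1 = AI-generated (positive), 0 = human-written (negative).
    """
    # Human-written texts that the detector flagged as AI.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Human-written texts that the detector correctly left alone.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("FPR is undefined without any human-written samples")
    return fp / (fp + tn)


# Ten human-written texts (one wrongly flagged) plus two AI-generated texts.
y_true = [0] * 10 + [1, 1]
y_pred = [1] + [0] * 9 + [1, 0]

fpr = false_positive_rate(y_true, y_pred)
print(f"FPR = {fpr:.0%}")
```

Note that the two AI-generated texts do not enter the calculation at all: FPR depends only on how the detector treats human-written text.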

Why FPR Matters More Than Accuracy

Overall accuracy is the headline benchmark number, but FPR determines whether a tool is safe to deploy in high-stakes contexts. Accuracy averages over both classes, so strong performance on AI-generated text can hide poor performance on human-written text. A detector with 90% accuracy but a 20% FPR is dangerous: it will falsely accuse 1 in 5 students who wrote their own work.
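To make the 90% accuracy and 20% FPR combination concrete, here is a small sketch with a hypothetical test set. The counts and the `Confusion` class are invented for illustration and are not taken from any benchmark.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # AI-generated texts correctly flagged
    fp: int  # human-written texts wrongly flagged
    tn: int  # human-written texts correctly passed
    fn: int  # AI-generated texts missed

    @property
    def accuracy(self):
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    @property
    def fpr(self):
        return self.fp / (self.fp + self.tn)


# Hypothetical balanced test set: 500 AI-generated texts (all caught)
# and 500 human-written texts (100 of them wrongly flagged).
c = Confusion(tp=500, fp=100, tn=400, fn=0)

print(f"accuracy = {c.accuracy:.0%}, FPR = {c.fpr:.0%}")
```

In this example every error the detector makes is a false accusation, yet the accuracy figure still looks strong because the AI-generated half of the test set is classified perfectly.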

Academic institutions, publishers, and moderation systems need to weigh FPR carefully. The costs of the two error types are asymmetric: a false accusation typically does far more reputational and legal damage than a missed AI-generated piece.

FPR Across Content Types

FPR varies significantly by writing domain. STEM academic writing has naturally low perplexity — a key detection signal — which causes detectors calibrated on general corpora to over-flag it as AI-generated. Our research shows FPR as high as 34% on STEM content with general-purpose detectors.

Non-native English speakers are also disproportionately affected. Studies from Stanford and MIT have shown that non-native academic writing exhibits lower lexical diversity and lower burstiness — both signals that detectors interpret as AI-like.
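Because FPR differs between groups of writers, it can help to report it per group rather than as one pooled number. The sketch below assumes a list of `(group, flagged)` records for texts known to be human-written; the group names and counts are made up for illustration and are not results from the research cited above.

```python
from collections import defaultdict


def fpr_by_group(records):
    """Per-group FPR for human-written texts.

    records: iterable of (group, flagged) pairs. Every text is assumed to be
    human-written, so each flag counts as a false positive.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for group, was_flagged in records:
        total[group] += 1
        flagged[group] += int(was_flagged)
    return {group: flagged[group] / total[group] for group in total}


# Made-up illustrative data: 10 texts per group.
records = (
    [("stem", True)] * 3 + [("stem", False)] * 7
    + [("general", True)] * 1 + [("general", False)] * 9
)

rates = fpr_by_group(records)
for group, rate in rates.items():
    print(f"{group}: FPR = {rate:.0%}")
```

A pooled FPR over these 20 texts would be 20%, which understates the rate for one group and overstates it for the other.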

Current Benchmark Numbers

Across our March 2026 benchmark of 1,200 human-written samples:

Originality.ai: 7% FPR (best in class)
Writer.com: 8% FPR
GPTZero: 10% FPR
Copyleaks: 12% FPR
Sapling AI: 17% FPR

See the full benchmark comparison for methodology and per-category breakdowns.
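As a rough guide to how precise these percentages are, the sketch below converts each reported FPR into an implied count of false positives and a 95% Wilson score interval. It assumes every tool was run on all 1,200 samples and that samples are independent; neither assumption is confirmed here, so the full methodology should take precedence. The helper `wilson_interval` is written for this example.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


N_SAMPLES = 1200  # assumption: each tool saw the full human-written set

reported_fpr = {
    "Originality.ai": 0.07,
    "Writer.com": 0.08,
    "GPTZero": 0.10,
    "Copyleaks": 0.12,
    "Sapling AI": 0.17,
}

intervals = {}
for tool, fpr in reported_fpr.items():
    lo, hi = wilson_interval(fpr, N_SAMPLES)
    intervals[tool] = (lo, hi)
    implied_fp = round(fpr * N_SAMPLES)
    print(f"{tool}: ~{implied_fp} false positives, 95% CI {lo:.1%} to {hi:.1%}")
```

Under these assumptions the intervals for tools one percentage point apart overlap, so small gaps at this sample size are best read with some caution.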

Relationship to False Negative Rate

There is an inherent tradeoff between the two error rates. Detectors typically produce a score and flag text whose score exceeds a decision threshold. Lowering that threshold makes the detector more sensitive (lower FNR, catching more AI text) but tends to increase FPR. Detector developers tune the threshold based on their target use case. Tools aimed at content moderation typically prioritize low FNR; tools aimed at academic integrity should prioritize low FPR.
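The tradeoff can be illustrated with a small threshold-selection sketch. It assumes the detector outputs a score between 0 and 1 and flags a text when its score is strictly above the threshold; the scores and the helper names `threshold_for_target_fpr` and `error_rates` are hypothetical, not any vendor's API.

```python
import math


def threshold_for_target_fpr(human_scores, target_fpr):
    """Lowest threshold from the data that flags at most target_fpr of human texts.

    Assumed flag rule: a text is flagged when score > threshold.
    """
    ranked = sorted(human_scores, reverse=True)
    allowed = math.floor(target_fpr * len(ranked))  # human texts we may flag
    if allowed >= len(ranked):
        return float("-inf")  # any score may be flagged
    return ranked[allowed]


def error_rates(threshold, human_scores, ai_scores):
    """Return (FPR, FNR) at the given threshold."""
    fpr = sum(s > threshold for s in human_scores) / len(human_scores)
    fnr = sum(s <= threshold for s in ai_scores) / len(ai_scores)
    return fpr, fnr


# Made-up detector scores for texts with known origin.
human_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.90]
ai_scores = [0.45, 0.60, 0.75, 0.80, 0.95]

strict = threshold_for_target_fpr(human_scores, 0.10)
lenient = threshold_for_target_fpr(human_scores, 0.30)

for name, t in [("strict", strict), ("lenient", lenient)]:
    fpr, fnr = error_rates(t, human_scores, ai_scores)
    print(f"{name}: threshold={t:.2f}, FPR={fpr:.0%}, FNR={fnr:.0%}")
```

With these made-up scores, holding FPR to 10% means missing some AI-generated texts, while accepting a 30% FPR catches all of them. A real calibration would use a much larger held-out set of human-written text from the target domain.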
