Email Verification API: End-to-End Contract and Failure Modes

Email verification looks simple on a whiteboard: generate a token, send an email, click a link, mark the user verified.

In production, and especially in CI or agent-driven workflows, the “email step” becomes an integration boundary with retries, duplicates, delays, and adversarial input. If you want your email verification API to be reliable, you need an end-to-end contract that explicitly states what the system guarantees, what it only attempts, and how clients should behave when reality deviates.

This guide focuses on that contract and the failure modes it must cover. If you are implementing with Mailhook, the canonical, up-to-date feature and API reference is the project’s llms.txt (always treat that as the source of truth for integration details).

What “email verification API” should mean (for automation and agents)

For engineering teams, “email verification API” often gets conflated with “API that checks an email address exists.” This article is about a different problem:

You have a system that sends a verification email (OTP or magic link).
You need programmatic proof that the correct email arrived.
You need deterministic extraction of the verification artifact.
You need retry-safe, parallel-safe behavior.

That last point is where most implementations fail. A contract that does not address retries and duplicates is incomplete.

The end-to-end contract: define resources, IDs, and semantics

A reliable email verification flow spans multiple systems:

Your application (token generation, resend logic)
Your mail sending infrastructure (ESP, SMTP relays)
An inbound email receiver (an inbox API, IMAP mailbox, or custom SMTP)
Your verification consumer (test harness, QA automation, or an LLM agent tool)

The contract must specify the “shape” of the interaction across those boundaries.

1) Resource model: inbox-first beats address-only

If the only thing your “email verification API” returns is a string email address, clients cannot reliably:

isolate parallel runs,
avoid reading stale messages,
correlate a specific attempt,
enforce TTL and cleanup.

A stronger contract returns an inbox descriptor (provider-agnostic concept):

email (the address you can send to)
inbox_id (a stable handle to fetch messages for that address)
created_at
expires_at (or a TTL)

This lets clients use deterministic retrieval: “read messages for inbox X,” not “search a shared mailbox for something that looks right.”
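A minimal TypeScript sketch of such a descriptor follows. The field names mirror the list above; the `isInboxUsable` helper is hypothetical and not part of any provider's API.

```typescript
// Provider-agnostic inbox descriptor; field names follow the list above.
interface InboxDescriptor {
  email: string;      // address the application sends to
  inbox_id: string;   // stable handle used for every read
  created_at: string; // ISO 8601 timestamp
  expires_at: string; // ISO 8601 timestamp (or derived from a TTL)
}

// Hypothetical helper: refuse to read from an inbox that has already expired.
function isInboxUsable(inbox: InboxDescriptor, nowMs: number = Date.now()): boolean {
  const expiresMs = Date.parse(inbox.expires_at);
  if (Number.isNaN(expiresMs)) return false; // malformed timestamp: fail closed
  return nowMs < expiresMs;
}
```

Passing the descriptor around, rather than the bare address, keeps the read handle and the expiry next to each other wherever the address is used.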

Mailhook is built around this inbox-first approach (programmable disposable inboxes, messages delivered as JSON). See the Mailhook llms.txt for the exact contract.

2) Identity and dedupe: distinguish message ID vs delivery ID

Your contract should differentiate:

Message identity: a stable identifier for the email content (often aligned with RFC 5322 Message-ID, but do not assume uniqueness across all systems).
Delivery identity: a stable identifier for a delivery event to your webhook or API client (critical because webhooks are typically at-least-once).

Why it matters:

A single email can be delivered multiple times (retries).
Two distinct emails can be “equivalent” from a test’s point of view (multiple resends with the same OTP format).
Your consumer must dedupe at the correct layer.

A good email verification contract explicitly provides or enables:

message-level dedupe (avoid re-processing the same email)
delivery-level dedupe (avoid re-processing the same webhook delivery)
artifact-level dedupe (avoid “double clicking” the same magic link or re-submitting the same OTP)
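The three layers can be sketched as one check that runs before any processing. This is an illustration under assumptions: the `DeliveryEvent` shape and its field names are hypothetical, and the in-memory sets stand in for a durable store with unique constraints.

```typescript
// Hypothetical event shape: one webhook delivery carrying one message.
interface DeliveryEvent {
  delivery_id: string;   // identifies this delivery event
  message_id: string;    // identifies the email itself
  artifact_hash: string; // identifies the extracted OTP or link
}

type Decision =
  | "process"
  | "duplicate_delivery"
  | "duplicate_message"
  | "duplicate_artifact";

class Deduper {
  private deliveries = new Set<string>();
  private messages = new Set<string>();
  private artifacts = new Set<string>();

  // Checks the outermost layer first and records each ID as it is seen.
  check(e: DeliveryEvent): Decision {
    if (this.deliveries.has(e.delivery_id)) return "duplicate_delivery";
    this.deliveries.add(e.delivery_id);
    if (this.messages.has(e.message_id)) return "duplicate_message";
    this.messages.add(e.message_id);
    if (this.artifacts.has(e.artifact_hash)) return "duplicate_artifact";
    this.artifacts.add(e.artifact_hash);
    return "process";
  }
}
```

Returning a distinct reason per layer makes it easy to log which kind of duplicate a run actually hit.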

3) Waiting semantics: no sleeps, only deadlines

Most flaky verification tests are flaky because they do some variation of:

“sleep 10 seconds, then check the inbox once”

A robust contract instead defines waiting semantics:

A deadline-based wait (overall time budget)
Polling or long-polling semantics when webhooks are unavailable
A clear “not found yet” state (not an error)

Even if you use webhooks, you still need a fallback plan. Production networks fail, CI environments drop inbound requests, and webhook endpoints have deploy windows.
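A deadline-based wait might look like the sketch below. All names are hypothetical: `fetchMessages` stands in for whatever call lists the messages of one inbox, and the clock and sleep functions are injectable so the loop can be tested without real delays. An empty poll is a normal outcome; only the deadline ends the wait.

```typescript
interface Message { id: string; subject: string }

type WaitResult =
  | { status: "found"; message: Message }
  | { status: "timed_out" }; // time budget exhausted; a result, not an exception

interface WaitOptions {
  fetchMessages: () => Promise<Message[]>; // hypothetical: list messages for one inbox
  matches: (m: Message) => boolean;
  deadlineMs: number;                      // absolute time, ms since epoch
  now?: () => number;
  sleep?: (ms: number) => Promise<void>;
}

async function pollUntilDeadline(opts: WaitOptions): Promise<WaitResult> {
  const now = opts.now ?? (() => Date.now());
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let delayMs = 250;
  for (;;) {
    const found = (await opts.fetchMessages()).find(opts.matches);
    if (found) return { status: "found", message: found };
    const remaining = opts.deadlineMs - now();
    if (remaining <= 0) return { status: "timed_out" };
    await sleep(Math.min(delayMs, remaining)); // never sleep past the deadline
    delayMs = Math.min(delayMs * 2, 4000);     // bounded exponential backoff
  }
}
```

The starting delay and the backoff cap are arbitrary illustration values; tune them to your provider's rate limits.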

4) Delivery semantics: webhook-first, polling fallback

A verification-capable inbound email API should support:

Webhooks: for low-latency, scalable, event-driven consumption
Polling API: as a deterministic fallback and for environments where inbound webhooks are hard

But the contract must include the hard parts:

Webhooks are typically at-least-once (duplicates are normal).
Webhooks can arrive out of order.
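Polling must specify cursor semantics so clients neither miss nor re-read messages, plus timeouts and backoff so that many clients polling at once do not create a thundering herd.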

Mailhook supports real-time webhooks and polling, and also supports signed payloads (useful for authenticity). Confirm details in the llms.txt.
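To make the authenticity check concrete, here is a sketch of signature verification with a replay window. The scheme is an assumption chosen for illustration (hex HMAC-SHA256 over `timestamp.rawBody`, with the timestamp and signature carried in headers); it is not taken from Mailhook's documentation, so check the llms.txt for the real scheme before relying on any of it.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMED scheme, for illustration only: hex HMAC-SHA256 over `${timestamp}.${rawBody}`.
function signPayload(secret: string, timestampSec: number, rawBody: string): string {
  return createHmac("sha256", secret).update(`${timestampSec}.${rawBody}`).digest("hex");
}

function verifyWebhook(args: {
  secret: string;
  rawBody: string;       // exact body as received, before JSON.parse
  timestampSec: number;  // from a hypothetical timestamp header
  signatureHex: string;  // from a hypothetical signature header
  nowSec: number;
  toleranceSec?: number;
}): boolean {
  const tolerance = args.toleranceSec ?? 300;
  // Reject stale or future-dated requests to bound the replay window.
  if (Math.abs(args.nowSec - args.timestampSec) > tolerance) return false;
  const expected = Buffer.from(signPayload(args.secret, args.timestampSec, args.rawBody), "hex");
  const received = Buffer.from(args.signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Verifying against the raw body matters because re-serializing parsed JSON can change the bytes and invalidate an otherwise valid signature.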

[Figure: system flow diagram. An application sends a verification email to a disposable inbox address; an inbound email API normalizes the message into JSON and delivers it to a webhook endpoint with signature verification; a separate polling fallback path fetches messages by inbox_id.]

5) Content contract: treat email as hostile input, expose minimal artifacts

For verification, you rarely need the full HTML email body. You need the smallest artifact that proves verification is possible:

OTP code
verification URL (magic link)

Your contract should define:

a normalized JSON message form (headers, timestamps, routing info)
a safe artifact extraction approach (prefer text/plain when possible)
a minimized “agent view” if LLM agents will touch the content

If you allow LLM agents to see raw HTML, you increase the risk of prompt injection and unsafe tool use. A safer contract is: “tool extracts OTP or allowlisted URL, agent receives only that.”
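As an illustration of a minimized agent view, the sketch below extracts an OTP from the text/plain part only and returns nothing else. The message shape is hypothetical, and the 6-digit numeric format is an assumption to adjust to your own OTPs.

```typescript
// Hypothetical normalized message: only the body parts matter here.
interface NormalizedMessage { text?: string; html?: string }

// The only thing the agent ever receives.
type AgentView =
  | { artifact_type: "otp"; artifact_value: string }
  | { artifact_type: "none" };

// Assumes a 6-digit numeric OTP; the HTML part is deliberately never read.
function extractOtp(message: NormalizedMessage): AgentView {
  const text = message.text ?? "";
  const candidates = new Set(text.match(/\b\d{6}\b/g) ?? []);
  // Zero or several distinct candidates is ambiguous: fail closed.
  if (candidates.size !== 1) return { artifact_type: "none" };
  return { artifact_type: "otp", artifact_value: Array.from(candidates)[0] };
}
```

Failing closed on ambiguity means a malicious or malformed email produces "no artifact" rather than a guess the agent might act on.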

6) Lifecycle contract: TTL, drain window, and cleanup

Verification inboxes should be disposable. The contract should state:

TTL defaults and ability to configure expiration
how late arrivals are handled (a drain window model is common)
deletion semantics (immediate delete vs tombstone)

This is both a reliability concern (avoid stale message selection) and a security/privacy concern (minimize retained secrets).
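One way to read the drain window model is as a three-state lifecycle, sketched below. This is an interpretation rather than a standard: the state names and the `drainWindowMs` parameter are hypothetical, and the assumption is that late mail is still recorded during the drain window but no longer selected for verification.

```typescript
type InboxState = "active" | "draining" | "closed";

// Hypothetical lifecycle:
//   active   - before expires_at; messages may be selected for verification
//   draining - grace period after expiry; late arrivals are recorded, not selected
//   closed   - inbox and its messages are eligible for deletion
function inboxState(expiresAtMs: number, drainWindowMs: number, nowMs: number): InboxState {
  if (nowMs < expiresAtMs) return "active";
  if (nowMs < expiresAtMs + drainWindowMs) return "draining";
  return "closed";
}
```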

Failure modes: what breaks, what it looks like, and what the contract must do

Below is a practical failure-mode map you can use in design reviews.

Failure modes across layers

Layer	Failure mode	Symptom	Contract requirement (mitigation)
App	Resend behavior changes	Multiple emails arrive, test picks wrong one	Attempt-scoped correlation token and message matchers, artifact-level idempotency
App	Token TTL mismatch	Link/OTP is expired when used	Contracted time budget, explicit resend policy, log token issuance time
SMTP/ESP	Delivery delay / greylisting	Email arrives late or not within fixed sleep	Deadline-based wait, webhook-first with polling fallback
Domain/DNS	MX misconfiguration (custom domain)	No emails ever arrive	Contracted “smoke test” and domain routing validation steps
Inbound provider	Duplicate ingestion	Same message appears multiple times	Message + delivery IDs, deterministic dedupe
Webhook	Retries on 5xx/timeouts	Duplicate webhook deliveries	Signed payload verification + delivery dedupe key
Webhook	Spoofing / replay	Fake verification artifact arrives	Signature over raw body, timestamp tolerance, replay detection
Polling	Cursor bugs / non-monotonic ordering	Missing or repeated messages in list	Opaque cursor semantics, seen-ID set, bounded backoff
Parsing	HTML structure changes	Regex fails, can’t find OTP/link	Parse structured JSON fields, prefer text/plain, layered extraction
Agent	Prompt injection via email content	Agent takes unintended action	Minimized agent view, strict tool surface, URL allowlist

The 7 failure modes to explicitly test in CI

If you only test “happy path,” your contract is unproven. Add tests that simulate:

Duplicate delivery of the same webhook payload
Out-of-order webhook arrival
Polling fallback (webhook intentionally disabled)
Two verification emails in the same inbox (resend)
Late arrival (message arrives near deadline)
Email format drift (extra HTML, different subject)
Replay attempt (same delivery resent later)

These tests drive better contracts because they force you to encode expectations.
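For example, the resend and out-of-order cases both force a decision about which message wins. The sketch below encodes one possible policy, assuming the newest matching email carries the valid token; the `ReceivedMessage` shape and its field names are hypothetical.

```typescript
interface ReceivedMessage {
  message_id: string;
  received_at: string; // ISO 8601 timestamp set by the receiver
  subject: string;
}

// Picks the newest matching message regardless of the order events arrived in.
// Ties on timestamp are broken by message_id so the result is deterministic.
function selectLatest(
  messages: ReceivedMessage[],
  matches: (m: ReceivedMessage) => boolean,
): ReceivedMessage | undefined {
  return messages
    .filter(matches) // filter returns a copy, so the input is not reordered
    .sort((a, b) => {
      const byTime = Date.parse(b.received_at) - Date.parse(a.received_at);
      if (byTime !== 0) return byTime;
      return a.message_id < b.message_id ? -1 : a.message_id > b.message_id ? 1 : 0;
    })[0];
}
```

A CI test can then feed the same two messages in both arrival orders and assert that the selection does not change.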

A practical end-to-end contract (provider-agnostic)

Use this as a reference when designing your integration.

Inbox provisioning contract

Input: optional metadata for correlation (run ID, attempt ID), optional domain choice.

Output:

inbox_id
email
created_at, expires_at

Client obligations:

Create one inbox per attempt (not per test suite, not per environment).
Store inbox_id as the primary handle.

Message delivery contract

Webhook event (if enabled):

Contains normalized message JSON
Contains delivery_id
Is signed (recommended); the client verifies the signature and enforces a replay window

Polling API (fallback):

List messages for inbox_id with cursor pagination
Support “wait until deadline” behavior in the client
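A client-side sketch of cursor pagination combined with a seen-ID set follows. `listPage` is a hypothetical stand-in for the provider's "list messages for inbox_id" call, and the `Page` shape with `next_cursor` is an assumption; the cursor is treated as opaque and only passed back.

```typescript
interface MessageRef { id: string }
interface Page { items: MessageRef[]; next_cursor: string | null }

// Walks all pages once and returns only messages not seen in earlier passes.
// The seen-ID set makes repeated or overlapping pages harmless.
async function listNewMessages(
  listPage: (cursor: string | null) => Promise<Page>,
  seen: Set<string>,
  maxPages: number = 20, // hard bound so a cursor bug cannot loop forever
): Promise<MessageRef[]> {
  const fresh: MessageRef[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const page: Page = await listPage(cursor);
    for (const item of page.items) {
      if (!seen.has(item.id)) {
        seen.add(item.id);
        fresh.push(item);
      }
    }
    if (page.next_cursor === null) break;
    cursor = page.next_cursor;
  }
  return fresh;
}
```

Because polling is only eventually consistent, the caller should repeat this pass until its deadline rather than treating one empty result as final.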

Provider obligations:

Webhooks are at-least-once
Polling is eventually consistent

Client obligations:

Must be idempotent on delivery_id
Must be idempotent on the extracted artifact

Artifact extraction contract

Output:

artifact_type: otp or verification_url
artifact_value: the OTP string or URL
artifact_hash: stable hash for consume-once enforcement
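One way to derive and use `artifact_hash` is sketched below, assuming SHA-256 over the inbox ID, artifact type, and value. Scoping by inbox is a design assumption so that identical OTPs in different attempts do not collide; the in-memory set stands in for a unique-keyed table in your own database.

```typescript
import { createHash } from "node:crypto";

type ArtifactType = "otp" | "verification_url";

// Stable hash: the same inbox, type, and value always produce the same key.
function artifactHash(inboxId: string, type: ArtifactType, value: string): string {
  return createHash("sha256")
    .update(`${inboxId}\n${type}\n${value.trim()}`)
    .digest("hex");
}

// Consume-once guard; a real system would rely on a unique constraint instead.
class ConsumeOnce {
  private used = new Set<string>();

  // Returns true the first time a hash is consumed, false on every repeat.
  tryConsume(hash: string): boolean {
    if (this.used.has(hash)) return false;
    this.used.add(hash);
    return true;
  }
}
```

Note that a hash of a short OTP can be brute-forced, so the hash should be handled with the same care as the OTP itself.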

Rules:

Prefer deterministic extraction that does not depend on brittle HTML
Validate URLs before using them (scheme, host allowlist, no open redirects if you follow them)

For URI syntax and parsing rules, RFC 3986 is the baseline reference.
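A sketch of the URL checks using the standard `URL` class follows. The function name and the allowlist contents are hypothetical; the checks themselves (scheme, embedded credentials, exact host match) are the ones listed in the rules above.

```typescript
// Returns the parsed URL if it passes every check, otherwise null.
function validateVerificationUrl(raw: string, allowedHosts: ReadonlySet<string>): URL | null {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return null; // not an absolute, parseable URL
  }
  if (url.protocol !== "https:") return null;
  // Blocks lookalikes such as https://app.example.com@evil.test/
  if (url.username !== "" || url.password !== "") return null;
  // Exact host match only; suffix matching would admit app.example.com.evil.test
  if (!allowedHosts.has(url.hostname)) return null;
  return url;
}
```

If the client follows redirects, each hop needs the same validation, since an allowlisted host can still redirect elsewhere.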

Reference implementation sketch (client-side)

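The sketch below is intentionally provider-agnostic TypeScript. The helper functions are declared with hypothetical signatures and left unimplemented; the key is the behavior, not the endpoint naming.

import { randomUUID } from "node:crypto";

type Inbox = { inboxId: string; email: string; expiresAt: string };
type Message = { id: string; subject: string; to: string; text?: string };

type VerificationArtifact =
  | { type: "otp"; value: string; hash: string }
  | { type: "verification_url"; value: string; hash: string };

// Hypothetical helpers: implement these against your own app and inbox provider.
declare function createInbox(args: { attemptId: string; ttlSeconds: number }): Promise<Inbox>;
declare function triggerSignUp(args: { email: string; attemptId: string }): Promise<void>;
// Rejects if the deadline passes without a matching message.
declare function waitForMessage(args: {
  inboxId: string;
  deadlineMs: number;
  matcher: { subjectIncludes: string; to: string };
}): Promise<Message>;
declare function extractVerificationArtifact(args: { message: Message }): VerificationArtifact;
declare function markArtifactConsumed(args: { attemptId: string; artifactHash: string }): Promise<void>;

async function verifyEmailFlow(): Promise<VerificationArtifact> {
  // One attempt ID and one inbox per attempt keeps parallel runs isolated.
  const attemptId = randomUUID();
  const inbox: Inbox = await createInbox({ attemptId, ttlSeconds: 900 });

  await triggerSignUp({ email: inbox.email, attemptId });

  // Absolute deadline for the whole wait, not a fixed sleep.
  const deadlineMs = Date.now() + 60_000;

  // Prefer webhook-driven ingestion into your datastore.
  // If no webhook event arrives, poll by inboxId until the deadline.
  const message = await waitForMessage({
    inboxId: inbox.inboxId,
    deadlineMs,
    // Keep matchers narrow and deterministic.
    // Example: subject contains "Verify" and recipient matches inbox.email.
    matcher: { subjectIncludes: "Verify", to: inbox.email },
  });

  const artifact = extractVerificationArtifact({ message });

  // Consume-once semantics.
  // Use artifact.hash as an idempotency key in your own database.
  await markArtifactConsumed({ attemptId, artifactHash: artifact.hash });

  return artifact;
}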

What matters here:

A unique inbox per attempt
A deadline-based wait
Narrow matchers
Artifact-level idempotency

Where Mailhook fits (without guessing features)

Mailhook provides the building blocks that make the above contract practical:

Programmable disposable inbox creation via API
Received emails delivered as structured JSON
Real-time webhook notifications
Polling API for email retrieval
Signed payloads (useful for webhook authenticity)
Shared domains for instant starts, plus custom domain support
Batch email processing

For the exact API shape, payload fields, signature scheme, and current behavior, use the canonical spec: Mailhook llms.txt.

If you want a fast starting point, you can also explore the product overview at Mailhook.

Frequently Asked Questions

What is an email verification API in the context of CI and AI agents? It is an API-driven workflow that provisions an inbox, receives the verification email as machine-readable data (JSON), and supports deterministic waiting plus safe extraction of OTPs or verification links.

Why is “webhook-first, polling fallback” part of the contract? Because webhooks provide low latency and scale, but polling provides deterministic recovery when webhooks are unreachable, delayed, or misconfigured.

What is the most common failure mode in verification email automation? Inbox reuse. Reusing an inbox across retries or parallel runs causes stale selection, duplicates, and races. The simplest fix is one inbox per attempt.

Is DKIM or “email signed by” enough to trust webhook events? No. DKIM relates to the email message itself, not the authenticity of the HTTP webhook request carrying your JSON payload. You still need webhook signature verification and replay defenses.

How do I make verification safe for LLM agents? Do not expose raw HTML by default. Extract a minimal artifact (OTP or a strictly validated URL), constrain tool actions, and treat email content as hostile input.

Build a verification contract your tests (and agents) can actually trust

If your current email verification tests are flaky, slow, or unsafe for autonomous agents, it is almost always a contract problem: unclear IDs, undefined retry semantics, weak correlation, and no idempotency.

Mailhook is designed for this inbox-first, JSON-first model. Use the canonical integration reference at mailhook.co/llms.txt, then try provisioning disposable inboxes and consuming verification emails in a deterministic, retry-safe way at mailhook.co.