
Sign Up Email Testing: Stop Duplicates and Bot Loops


Signup flows look simple until you automate them. Then you discover a frustrating reality: the sign up email is the noisiest part of the pipeline. Messages arrive late, arrive twice, or arrive after your test already moved on. If you add LLM agents on top, you can also get “bot loops”, where an agent re-triggers signup or replays a verification link until rate limits or lockouts kick in.

This guide focuses on two reliability killers in sign up email testing:

  • Duplicates (same email event processed multiple times)
  • Bot loops (automation repeatedly triggering the same email, or repeatedly consuming the same email)

The goal is not to make the test pass once. It is to make the email step deterministic, idempotent, and safe to retry.

Why duplicates happen in sign up email testing (it is not just your mail provider)

Duplicates typically come from at-least-once behavior somewhere in the chain. It helps to name the layer, so you can dedupe at the right boundary.

| Where duplicates are born | Common cause | What it looks like in tests | Best fix |
| --- | --- | --- | --- |
| Your app | “Resend verification email” triggered twice, retries without idempotency, double form submits | Two emails with different tokens | Add an idempotency key per signup attempt, enforce one active token |
| Your job queue | Worker retries without a dedupe key | Same template, same token sent twice | Make the send job idempotent (attempt_id) |
| SMTP delivery path | Greylisting, transient failures, upstream retries | Two near-identical messages, possibly same Message-ID | Deduplicate by a stable message identifier and artifact |
| Webhook delivery | Your endpoint times out, provider retries | Same message delivered multiple times | Verify signatures and implement webhook idempotency |
| Polling consumer | Cursor bugs, eventual consistency, fetching “latest” repeatedly | Same message processed on every poll | Use a cursor or store “seen message ids” |
| CI / agent orchestration | Test retries rerun the same logical attempt | More emails than expected, flaky assertions | Isolate inbox per attempt, correlate run ids |

A key takeaway: you cannot reliably “prevent” duplicates in distributed systems. You can only design so duplicates are harmless.

Why bot loops happen (and why they are worse than duplicates)

A duplicate is one event repeated. A bot loop is a feedback cycle.

Common loops in signup automation:

  • Retry loop: the agent times out waiting for the email, retries the signup, triggering another email, then repeats.
  • Replay loop: the agent receives the verification email, clicks the magic link, gets an error, and clicks again indefinitely.
  • Parser loop: the agent fails to extract the OTP, asks for resend, and keeps accumulating emails while still reading the oldest one.
  • Webhook replay loop (security + reliability): if you do not verify signed webhook payloads, including a timestamp tolerance that rejects stale deliveries, a captured payload can be replayed and cause repeated processing.

The fix is to treat signup verification like a small state machine with budgets:

  • A single attempt id
  • A single inbox scope
  • A bounded wait
  • A single consume of the verification artifact
  • A hard stop when budgets are exceeded
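The budgets above can be sketched as a small attempt object that refuses further work once a limit is hit. This is a minimal sketch, not a prescribed implementation: the class name `SignupAttempt`, its fields, and the default limits are assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when an attempt runs out of time, resends, or redemptions."""


@dataclass
class SignupAttempt:
    # All names and defaults here are hypothetical; tune them to your flow.
    attempt_id: str
    inbox_id: str
    deadline: float            # absolute value on the time.monotonic() clock
    max_resends: int = 1
    max_redemptions: int = 1
    resends: int = 0
    redemptions: int = 0

    def check_deadline(self) -> None:
        """Hard stop once the bounded wait has elapsed."""
        if time.monotonic() >= self.deadline:
            raise BudgetExceeded(f"{self.attempt_id}: wait budget exhausted")

    def record_resend(self) -> None:
        """Count a resend, or stop if the resend budget is used up."""
        if self.resends >= self.max_resends:
            raise BudgetExceeded(f"{self.attempt_id}: resend budget exhausted")
        self.resends += 1

    def record_redemption(self) -> None:
        """Count a redemption, or stop if the artifact was already tried."""
        if self.redemptions >= self.max_redemptions:
            raise BudgetExceeded(f"{self.attempt_id}: redemption budget exhausted")
        self.redemptions += 1
```

Because every limit raises the same exception type, the test runner or agent harness needs only one place to catch it and end the attempt.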

Figure: a signup attempt creates a disposable inbox, triggers an email send, receives an email event (webhook or polling), extracts a verification artifact once, and marks it consumed to prevent duplicates and retry loops.

The deterministic pattern: inbox-per-attempt plus idempotent consume

If you are still using shared inboxes (or plus-addressing into one mailbox), you are fighting the wrong battle. The clean pattern for sign up email testing is:

  • Create a fresh disposable inbox per signup attempt
  • Send the signup verification email to that address
  • Wait deterministically (webhook-first, polling fallback)
  • Extract a minimal artifact (OTP or URL)
  • Consume it exactly once

Mailhook is designed for this style of automation: you create disposable inboxes via API and receive inbound messages as structured JSON, delivered via real-time webhooks and/or retrieved via polling. For exact endpoints and payload fields, use the canonical reference at Mailhook llms.txt.

Dedupe correctly: pick the right keys (message id vs artifact id)

To stop duplicates, you need a stable key for “this email event” and a stable key for “this verification action”. They are not always the same.

Recommended dedupe keys

| Dedupe scope | What you are preventing | Suggested key | Notes |
| --- | --- | --- | --- |
| Message-level | Processing the same email more than once | Provider message id (preferred), or normalized Message-ID header | RFC 5322 defines Message-ID, but it is not guaranteed unique in practice, treat as best-effort |
| Artifact-level | Clicking the same verification link twice, or reusing an OTP | Hash of extracted artifact (OTP value, token, or canonicalized URL) | Canonicalize URL (strip tracking params) before hashing |
| Attempt-level | Creating multiple “active” attempts that race | attempt_id you generate before sending email | Store this in your DB and logs |
| Webhook delivery | Running your webhook handler twice | delivery_id or message id from payload | Return 2xx only after durable write |

If you can only implement one thing: artifact-level idempotency. Even if you receive three emails, only the first artifact should be consumed.
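One way to derive the artifact-level key is sketched below. It is a minimal sketch under assumptions: the tracking parameters to strip (`utm_*`, `fbclid`, `gclid`) are examples rather than an exhaustive list, and the function names are hypothetical.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of tracking parameters; extend for your email provider.
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid"}


def canonicalize_url(url: str) -> str:
    """Lowercase the host, drop tracking params, and sort the rest."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_PARAMS
    ]
    query.sort()
    # The fragment is kept because some magic links carry the token there.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(query), parts.fragment)
    )


def artifact_hash(kind: str, value: str) -> str:
    """Stable idempotency key for an OTP ("otp") or verification link ("url")."""
    normalized = canonicalize_url(value) if kind == "url" else value.strip()
    return hashlib.sha256(f"{kind}:{normalized}".encode("utf-8")).hexdigest()
```

Two emails carrying the same token but different tracking parameters then hash to the same key, so the second one is recognized as already handled.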

Webhooks: assume at-least-once delivery and build idempotency in

Webhook retries are normal, not exceptional. Providers retry when:

  • Your endpoint times out
  • You return a non-2xx
  • Your load balancer closes the connection

So your webhook handler must be:

  • Authenticated (verify signed payloads)
  • Replay-resistant (timestamp tolerance, nonce if available)
  • Idempotent (same event can arrive twice)

Mailhook supports signed payloads for security, which lets you verify that the webhook really came from Mailhook and was not altered. Follow the verification procedure described in llms.txt.
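To make the authenticated and replay-resistant requirements concrete, here is a generic HMAC check with a timestamp tolerance. It is a sketch of the general pattern only: the signing scheme (HMAC-SHA256 over `timestamp.body`, hex-encoded), the five-minute tolerance, and the function names are assumptions, not Mailhook's documented procedure, so follow llms.txt for the real scheme.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """Hypothetical scheme: hex HMAC-SHA256 over b"<timestamp>.<body>"."""
    signed = timestamp.encode("utf-8") + b"." + body
    return hmac.new(secret, signed, hashlib.sha256).hexdigest()


def verify_signature(
    secret: bytes,
    timestamp: str,
    signature: str,
    body: bytes,
    tolerance_s: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Reject stale deliveries first, then compare in constant time."""
    current = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    if abs(current - sent_at) > tolerance_s:
        return False  # outside the replay window
    expected = sign(secret, timestamp, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The check runs against the raw request bytes, before any JSON parsing, because re-serializing the payload can change the bytes that were signed.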

Minimal webhook handler shape (pseudocode)

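handleWebhook(request):
  payload = request.body
  if not verify_signature(request.headers, payload):
    return 401

  event_id = payload.event_id OR payload.message.id

  # Skip only events whose work was already handed off to the queue
  if db.get_status("webhook_events", event_id) == "enqueued":
    return 200

  db.upsert("webhook_events", {event_id, status: "pending", received_at: now()})
  enqueue("process_message", {message_id: payload.message.id, inbox_id: payload.inbox.id})
  db.update("webhook_events", event_id, {status: "enqueued"})

  return 200

Design note: write the idempotency record first, then enqueue, and mark the record as enqueued only after the enqueue succeeds. If the enqueue fails, the handler returns a non-2xx, the record stays pending, and the provider's retry is processed instead of being dropped as a duplicate. The downstream job still needs its own dedupe key, because a crash between enqueue and update can enqueue the same message twice.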
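A runnable version of this handler shape, using SQLite as the durable store, might look like the sketch below. The table layout, the `pending`/`enqueued` status values, and the `enqueue` callback are assumptions for illustration. The status column lets a delivery whose enqueue failed be retried rather than treated as a duplicate.

```python
import sqlite3
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    """Create the idempotency table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS webhook_events ("
        " event_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " received_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.commit()


def handle_event(conn: sqlite3.Connection, event_id: str,
                 enqueue: Callable[[], None]) -> int:
    """Return the HTTP status the webhook endpoint should send."""
    # Durable write first; the primary key makes a repeat insert a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO webhook_events (event_id, status) VALUES (?, 'pending')",
        (event_id,),
    )
    conn.commit()
    status = conn.execute(
        "SELECT status FROM webhook_events WHERE event_id = ?", (event_id,)
    ).fetchone()[0]
    if status == "enqueued":
        return 200  # duplicate delivery, work already handed off
    try:
        enqueue()
    except Exception:
        return 500  # record stays 'pending', so the provider retry is processed
    conn.execute(
        "UPDATE webhook_events SET status = 'enqueued' WHERE event_id = ?",
        (event_id,),
    )
    conn.commit()
    return 200
```

Signature verification is assumed to have happened before `handle_event` is called.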

For general webhook retry behavior and signature verification patterns, Stripe’s webhook docs are a good reference model, even if you are not using Stripe: webhook best practices.

Polling: stop “latest message wins” bugs with cursors and time budgets

Polling is a perfectly valid fallback, but “fetch latest and parse” is a common source of duplicates and bot loops.

A safer polling contract:

  • Poll until a deadline
  • Filter narrowly (recipient + attempt correlation)
  • Track a cursor or store processed message ids
  • Select the first message that matches the attempt, not “whatever arrived most recently”

Minimal polling loop (pseudocode)

waitForSignupEmail(inbox_id, attempt_id, deadline):
  seen = set()

  while now() < deadline:
    messages = api.list_messages(inbox_id)

    for m in messages:
      if m.id in seen:
        continue
      seen.add(m.id)

      if not matches_attempt(m, attempt_id):
        continue

      artifact = extract_verification_artifact(m)
      return {message_id: m.id, artifact}

    sleep(backoff())

  throw Timeout("No matching signup email")

This single change, “remember what you already looked at”, prevents a surprising amount of flakiness.
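The same loop can be written as runnable Python with an explicit deadline and capped exponential backoff. This is a sketch under assumptions: `list_messages` and `matches` are caller-supplied callables standing in for your API client and attempt matcher, and each message is assumed to be a dict with an `id` key. None of these names come from the Mailhook API.

```python
import time
from typing import Callable, Iterable


class SignupEmailTimeout(Exception):
    """No matching signup email arrived before the deadline."""


def wait_for_signup_email(
    list_messages: Callable[[], Iterable[dict]],
    matches: Callable[[dict], bool],
    timeout_s: float = 60.0,
    base_delay_s: float = 0.5,
    max_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until a message matches, inspecting each message id only once."""
    deadline = time.monotonic() + timeout_s
    seen = set()
    delay = base_delay_s
    while True:
        for message in list_messages():
            if message["id"] in seen:
                continue  # already inspected on an earlier poll
            seen.add(message["id"])
            if matches(message):
                return message
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise SignupEmailTimeout("No matching signup email")
        sleep(min(delay, remaining))            # never sleep past the deadline
        delay = min(delay * 2, max_delay_s)     # capped exponential backoff
```

Injecting `sleep` keeps the function testable without real waiting, and using a monotonic clock keeps the deadline safe from wall-clock adjustments.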

Correlation: make the right email easy to identify

Duplicates get dangerous when you cannot tell which email belongs to which attempt.

Correlation options, from strongest to weakest:

  • Inbox isolation: one disposable inbox per attempt (best)
  • Explicit attempt token in the email content: include attempt_id in the template (works well for internal systems)
  • Custom header: add X-Correlation-Id: <attempt_id> when sending
  • Subject tags: helpful, but easiest to break with localization or template changes

If you control the sender, a custom header is usually the cleanest, because it avoids brittle HTML parsing. If you do not control the sender (third-party SaaS), inbox isolation and narrow matchers are your best tools.
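As an illustration of the custom-header option, the sketch below sets `X-Correlation-Id` on the sending side and matches it on the receiving side using Python's standard `email` package. It assumes you have the raw RFC 5322 message text; if your provider hands you parsed headers as JSON, the comparison is the same but the lookup differs. The function names and addresses are hypothetical.

```python
from email import message_from_string
from email.message import EmailMessage

CORRELATION_HEADER = "X-Correlation-Id"


def build_verification_email(to_addr: str, attempt_id: str, body: str) -> str:
    """Sender side: stamp the attempt id into a custom header."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = "Verify your email"
    msg[CORRELATION_HEADER] = attempt_id
    msg.set_content(body)
    return msg.as_string()


def matches_attempt(raw_message: str, attempt_id: str) -> bool:
    """Receiver side: exact match on the header, no HTML parsing involved."""
    msg = message_from_string(raw_message)
    # Header names are matched case-insensitively by the email package.
    values = msg.get_all(CORRELATION_HEADER, [])
    return any(value.strip() == attempt_id for value in values)
```

An exact comparison is deliberate here: substring or prefix matching would let `att-1` match `att-10`.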

For a deep dive on which headers are worth trusting, see the RFC that defines the message format: RFC 5322.

“Consume once” rules that stop replay loops

Once you extract a verification link or OTP, your automation must treat it like a one-time capability.

Implement these rules:

  • Store a consumed marker keyed by artifact_hash
  • Do not click a link or submit an OTP twice, even if the UI says “try again”
  • If redemption fails, stop and surface a debuggable error (do not retry blindly)

A simple database table is enough:

| Column | Purpose |
| --- | --- |
| artifact_hash | Idempotency key, prevents double-consume |
| attempt_id | Links consume back to the run |
| consumed_at | Debuggability and audit |
| result | Success, already_used, expired, invalid |

This is how you turn a potentially unbounded loop into a finite workflow.
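The table and rules above can be sketched with SQLite, where the primary key on `artifact_hash` does the enforcement. The table name `consumed_artifacts`, the `redeem` callback, and the local `duplicate_skipped` return value are assumptions for illustration. `duplicate_skipped` means this automation declined to redeem again, which is distinct from the server answering `already_used`.

```python
import sqlite3
from typing import Callable


def init_consumed(conn: sqlite3.Connection) -> None:
    """Create the consume-once table using the columns described above."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS consumed_artifacts ("
        " artifact_hash TEXT PRIMARY KEY,"
        " attempt_id TEXT NOT NULL,"
        " consumed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " result TEXT)"
    )
    conn.commit()


def redeem_once(conn: sqlite3.Connection, artifact_hash: str, attempt_id: str,
                redeem: Callable[[], str]) -> str:
    """Claim the artifact first, then redeem it at most once."""
    try:
        conn.execute(
            "INSERT INTO consumed_artifacts (artifact_hash, attempt_id) VALUES (?, ?)",
            (artifact_hash, attempt_id),
        )
        conn.commit()
    except sqlite3.IntegrityError:
        return "duplicate_skipped"  # someone already claimed this artifact
    result = redeem()  # e.g. "success", "already_used", "expired", "invalid"
    conn.execute(
        "UPDATE consumed_artifacts SET result = ? WHERE artifact_hash = ?",
        (result, artifact_hash),
    )
    conn.commit()
    return result
```

Claiming before redeeming means a crash mid-redemption leaves a row with no result, which surfaces as a debuggable failure instead of a silent second click.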

LLM agents: prevent “autonomous resend” behavior with tool constraints

LLM agents are great at improvising, which is exactly what you do not want in auth flows.

If an agent is allowed to:

  • trigger signup
  • request resend
  • read emails
  • click links

then a small parsing glitch can cause it to spam resend and produce a self-sustaining loop.

The fix is to give the agent constrained tools and explicit budgets:

  • create_signup_attempt() returns {attempt_id, email, inbox_id, expires_at}
  • wait_for_signup_email(attempt_id) returns a single message or timeout
  • extract_verification_artifact(message) returns a single URL or OTP
  • redeem_artifact_once(attempt_id, artifact) enforces idempotency and returns a final status

Do not give the agent a generic “open browser and click anything in the email HTML” instruction. Prefer text extraction from structured JSON fields, then validate the URL against an allowlist before any navigation.
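The allowlist check mentioned above could look like the following sketch. The host `app.example.test` is a placeholder, and the HTTPS-only and no-credentials rules are assumptions about what a reasonable policy includes.

```python
from urllib.parse import urlsplit

# Placeholder host; replace with the hosts your signup flow really uses.
ALLOWED_HOSTS = frozenset({"app.example.test"})


def is_allowed_verification_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only https links whose exact host is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lowercased, without port or credentials
    except ValueError:
        return False  # malformed URL
    if parts.scheme != "https":
        return False
    if parts.username is not None or parts.password is not None:
        return False  # rejects tricks like https://trusted@evil.test/
    return host is not None and host in allowed_hosts
```

Comparing the parsed hostname exactly, rather than checking whether the URL string contains or ends with a trusted name, is what blocks lookalikes such as `app.example.test.evil.test`.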

Observability: log the identifiers that make duplicates explainable

When a signup test fails, you want to answer these questions in one minute:

  • Which attempt was this?
  • Which inbox was used?
  • How many messages arrived, and when?
  • Which message was selected?
  • Which artifact was extracted?
  • Was the artifact consumed before?

A practical logging schema:

  • attempt_id
  • inbox_id
  • message_id
  • artifact_hash
  • delivery_method (webhook or polling)
  • latency_ms (send to receive)

If you use Mailhook, you can build this without parsing raw MIME, because messages are delivered as structured JSON and can be processed deterministically (see llms.txt for the canonical contract).

A short checklist to stop duplicates and bot loops

Use this as a pre-merge gate for email-dependent signup tests:

  • Use inbox-per-attempt, not shared inboxes
  • Wait via webhook-first, keep polling as fallback
  • Implement webhook idempotency and verify signed payloads
  • Implement artifact-level consume-once semantics
  • Add budgets (max resends, max wait time, max redemption attempts)
  • Log attempt_id, inbox_id, message_id, and artifact_hash

Where Mailhook fits

If your current approach depends on scraping a shared mailbox UI or parsing unpredictable HTML emails, duplicates and loops are almost guaranteed over time.

Mailhook provides the primitives that make signup automation boring again:

  • Create disposable inboxes via API
  • Receive emails as structured JSON
  • Get real-time webhook notifications (with signed payloads)
  • Use polling as a fallback retrieval path
  • Scale with batch processing, shared domains, or custom domain support

To integrate against the real API semantics and payload fields, start with Mailhook llms.txt, then explore the product at Mailhook.
