Why do email duplicates happen even with reliable mail providers?

Duplicates come from at-least-once delivery behavior across several layers: your app, job queues, SMTP delivery, webhook delivery, polling consumers, and CI orchestration. This is normal distributed-systems behavior and should be handled with idempotency.

What is the difference between duplicates and bot loops in email testing?

A duplicate is a repetition of a single event, whereas a bot loop is a feedback cycle in which automation triggers the same action over and over. Bot loops are more dangerous because they can create unbounded retry cycles that consume resources and trip rate limits.

How should webhook idempotency be implemented for email testing?

Verify signed payloads, enforce a timestamp tolerance for replay protection, and store event IDs to prevent duplicate processing. Return 2xx only after a durable write, and handle repeated arrivals of the same event gracefully.

What is the recommended approach for LLM agents that handle signup emails?

Instead of giving agents generic email-parsing abilities, give them constrained tools with explicit budgets. Implement tools with built-in safeguards, such as create_signup_attempt(), wait_for_signup_email(), extract_verification_artifact(), and redeem_artifact_once().

Signup Email Testing: Stop Duplicates and Bot Loops

Signup flows look simple until you automate them. Then you discover a frustrating reality: the signup email is the noisiest part of the pipeline. Messages arrive late, arrive twice, or arrive after your test has already moved on. If you add LLM agents on top, you can also get “bot loops”, where an agent re-triggers the signup or replays the verification link until rate limits or lockouts kick in.

This guide focuses on two reliability killers in signup email testing:

Duplicates (the same email event being processed multiple times)
Bot loops (automation repeatedly triggering the same email, or repeatedly consuming the same email)

The goal is not to “get the test to pass once”, but to make the email step deterministic, idempotent, and safe to retry.

Why duplicates happen in signup email testing (it is not just your mail provider)

Duplicates usually come from at-least-once behavior somewhere in the chain. It helps to name the layer so that you can dedupe at the right boundary.

Where duplicates originate	Common cause	What it looks like in tests	Best fix
Your app	“Resend verification email” triggered twice, retries without idempotency, double form submits	Two emails with different tokens	Add one idempotency key per signup attempt, enforce a single active token
Your job queue	Worker retries without a dedupe key	Same template, same token sent twice	Make the send job idempotent (attempt_id)
SMTP delivery path	Greylisting, transient failures, upstream retries	Two near-identical messages, possibly with the same `Message-ID`	Deduplicate by a stable message identifier and by artifact
Webhook delivery	Your endpoint times out, the provider retries	Same message delivered multiple times	Verify signatures and implement webhook idempotency
Polling consumer	Cursor bugs, eventual consistency, repeatedly fetching “latest”	Same message processed on every poll	Use a cursor or store “seen message ids”
CI / agent orchestration	Test retries rerun the same logical attempt	More emails than expected, flaky assertions	Isolate the inbox per attempt, correlate run ids

One key takeaway: in distributed systems you cannot reliably “prevent” duplicates. You can only design so that duplicates are harmless.

Why bot loops happen (and why they are worse than duplicates)

A duplicate is a repetition of one event. A bot loop is a feedback cycle.

Common loops in signup automation:

Retry loop: the agent times out while waiting for the email, retries the signup, triggers a second email, then repeats.
Replay loop: the agent receives the verification email, clicks the magic link, gets an error, and clicks again indefinitely.
Parser loop: the agent fails to extract the OTP, asks for a resend, and keeps accumulating emails while it continues to read the oldest one.
Webhook replay loop (security + reliability): if you do not verify signed webhook payloads (and a timestamp / replay tolerance), a captured payload can be replayed and cause repeated processing.

The fix is to treat signup verification as a small state machine with budgets:

A single attempt id
A single inbox scope
A bounded wait
A single consume of the verification artifact
A hard stop when budgets are exceeded

Figure: a simple flow diagram showing a signup attempt that creates a disposable inbox, triggers the email send, receives the email event (webhook or polling), extracts the verification artifact once, and marks it consumed to prevent duplicates and retry loops.
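The budgeted state machine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state names, the `SignupAttempt` class, the `BudgetExceeded` exception, and the default budget of two sends are all hypothetical and not taken from any particular library.

```python
import time
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    CREATED = "created"
    EMAIL_REQUESTED = "email_requested"
    ARTIFACT_EXTRACTED = "artifact_extracted"
    CONSUMED = "consumed"
    FAILED = "failed"


class BudgetExceeded(Exception):
    """Raised when an attempt tries to go past one of its budgets."""


# Allowed transitions; anything else indicates a bug or a loop.
TRANSITIONS = {
    State.CREATED: {State.EMAIL_REQUESTED, State.FAILED},
    State.EMAIL_REQUESTED: {State.EMAIL_REQUESTED, State.ARTIFACT_EXTRACTED, State.FAILED},
    State.ARTIFACT_EXTRACTED: {State.CONSUMED, State.FAILED},
    State.CONSUMED: set(),   # terminal: the artifact is consumed exactly once
    State.FAILED: set(),     # terminal: hard stop
}


@dataclass
class SignupAttempt:
    attempt_id: str
    inbox_id: str
    deadline: float          # absolute time.monotonic() value (bounded wait)
    max_sends: int = 2       # illustrative budget: initial send plus one resend
    sends: int = 0
    state: State = State.CREATED

    def advance(self, new_state: State) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {new_state.name}")
        if new_state is not State.FAILED and time.monotonic() > self.deadline:
            self.state = State.FAILED
            raise BudgetExceeded("wait budget exceeded")
        if new_state is State.EMAIL_REQUESTED:
            if self.sends >= self.max_sends:
                self.state = State.FAILED
                raise BudgetExceeded("send budget exceeded")
            self.sends += 1
        self.state = new_state
```

Because `CONSUMED` and `FAILED` have no outgoing transitions, a caller that keeps retrying gets an exception instead of another email or another click.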

The deterministic pattern: inbox-per-attempt plus idempotent consume

If you are still using shared inboxes (or plus-addressing into one mailbox), you are fighting the wrong battle. The clean pattern for signup email testing is:

Create a fresh disposable inbox per signup attempt
Send the signup verification email to that address
Wait deterministically (webhook-first, polling fallback)
Extract a minimal artifact (OTP or URL)
Consume it exactly once

Mailhook is designed for this style of automation: you create disposable inboxes via API and receive inbound messages as structured JSON, delivered via real-time webhooks and/or retrieved via polling. For exact endpoints and payload fields, use the canonical reference at Mailhook llms.txt.

Dedupe correctly: choose the right keys (message id vs artifact id)

To stop duplicates, you need a stable key for “this email event” and a stable key for “this verification action”. They are not always the same.

Recommended dedupe keys

Dedupe scope	What you are preventing	Suggested key	Notes
Message-level	Processing the same email more than once	Provider message id (preferred), or normalized `Message-ID` header	RFC 5322 defines `Message-ID`, but in practice uniqueness is not guaranteed; treat it as best-effort
Artifact-level	Clicking the same verification link twice, or reusing an OTP	Hash of the extracted artifact (OTP value, token, or canonicalized URL)	Canonicalize the URL before hashing (strip tracking params)
Attempt-level	Creating multiple “active” attempts that race	An `attempt_id` generated before you send the email	Store it in your DB and logs
Webhook delivery	Running your webhook handler twice	`delivery_id` or message id from the payload	Return 2xx only after a durable write

If you can implement only one thing, make it artifact-level idempotency. Even if you receive three emails, only the first artifact should be consumed.
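As a rough sketch of the artifact-level key, the snippet below canonicalizes a verification URL and hashes it with the standard library. The `TRACKING_PARAMS` set, the function names, and the `kind:value` hashing format are assumptions chosen for illustration; adjust the parameter list to whatever your sender actually appends.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters that never affect which token is redeemed.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "fbclid", "gclid",
}


def canonicalize_url(url: str) -> str:
    """Lowercase scheme and host, drop tracking params, sort the remaining query."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    query.sort()
    # The fragment is kept because some magic links carry the token there.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(query), parts.fragment)
    )


def artifact_hash(kind: str, value: str) -> str:
    """Return a stable dedupe key for an OTP ("otp") or a link ("url")."""
    normalized = canonicalize_url(value) if kind == "url" else value.strip()
    return hashlib.sha256(f"{kind}:{normalized}".encode("utf-8")).hexdigest()
```

The canonical form is meant only as a dedupe key. When you actually redeem the link, navigate to the URL as it was received.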

Webhooks: assume at-least-once delivery and build in idempotency

Webhook retries are normal, not exceptional. Providers retry when:

Your endpoint times out
You return a non-2xx response
Your load balancer closes the connection

So your webhook handler must be all of the following:

Authenticated (verify signed payloads)
Replay-resistant (timestamp tolerance, plus a nonce if available)
Idempotent (the same event can arrive twice)

Mailhook supports signed payloads for security, which lets you verify that a webhook really came from Mailhook and was not altered. Follow the verification procedure described in llms.txt.

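Minimal webhook handler shape (pseudocode)

handleWebhook(request):
  payload = request.body

  # Reject unauthenticated or tampered requests before touching any state
  if not verify_signature(request.headers, payload):
    return 401

  event_id = payload.event_id OR payload.message.id

  # Atomic insert against a unique key; false means the event is already recorded
  if not db.insert_if_absent("webhook_events", {event_id, received_at: now()}):
    return 200

  try:
    enqueue("process_message", {message_id: payload.message.id, inbox_id: payload.inbox.id})
  except:
    # Undo the record so the provider's retry is not skipped as a duplicate
    db.delete("webhook_events", event_id)
    return 500

  return 200

Design note: write the idempotency record first, then enqueue. If the enqueue fails, remove the record (or write both in one transaction) and return a non-2xx response, so that the provider's retry is processed rather than skipped as a duplicate.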
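A runnable version of this handler shape might look like the sketch below, using SQLite as the durable store. The signing scheme here (hex HMAC-SHA256 over `timestamp.body`, carried in `X-Timestamp` and `X-Signature` headers with a 300-second tolerance) is a generic assumption for illustration and is not Mailhook's documented procedure; the payload field names follow the pseudocode above and are equally unverified.

```python
import hashlib
import hmac
import json
import sqlite3
import time

TOLERANCE_SECONDS = 300  # assumed replay window
SCHEMA = "CREATE TABLE IF NOT EXISTS webhook_events (event_id TEXT PRIMARY KEY, received_at REAL)"


def verify_signature(secret: bytes, timestamp: str, body: bytes, signature: str, now: float) -> bool:
    """Check freshness first, then compare HMACs in constant time."""
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    if abs(now - sent_at) > TOLERANCE_SECONDS:
        return False
    expected = hmac.new(secret, timestamp.encode() + b"." + body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature.encode())


def handle_webhook(db: sqlite3.Connection, secret: bytes, headers: dict, body: bytes, enqueue, now=None) -> int:
    """Return the HTTP status code the endpoint should respond with."""
    now = time.time() if now is None else now
    if not verify_signature(secret, headers.get("X-Timestamp", ""), body, headers.get("X-Signature", ""), now):
        return 401
    payload = json.loads(body)
    event_id = payload.get("event_id") or payload["message"]["id"]
    try:
        with db:  # one transaction: the record is rolled back if enqueue raises
            cur = db.execute(
                "INSERT OR IGNORE INTO webhook_events (event_id, received_at) VALUES (?, ?)",
                (event_id, now),
            )
            if cur.rowcount == 0:
                return 200  # duplicate delivery: acknowledge without reprocessing
            enqueue({"message_id": payload["message"]["id"], "inbox_id": payload["inbox"]["id"]})
    except Exception:
        return 500  # nothing was recorded, so the provider's retry will be processed
    return 200
```

The primary key on `event_id` makes the duplicate check atomic, and wrapping the insert and the enqueue in one transaction keeps a failed enqueue from leaving behind a record that would mask the retry.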

For general webhook retry behavior and signature verification patterns, Stripe's webhook docs are a good reference model even if you are not using Stripe: webhook best practices.

Polling: stop “latest message wins” bugs with cursors and time budgets

Polling is a perfectly valid fallback, but “fetch latest and parse” is a common source of duplicates and bot loops.

A safer polling contract:

Poll until a deadline
Filter narrowly (recipient + attempt correlation)
Track a cursor or store processed message ids
Select the first message that matches the attempt, not “whichever arrived most recently”

Minimal polling loop (pseudocode)

waitForSignupEmail(inbox_id, attempt_id, deadline):
  seen = set()

  while now() < deadline:
    messages = api.list_messages(inbox_id)

    for m in messages:
      if m.id in seen:
        continue
      seen.add(m.id)

      if not matches_attempt(m, attempt_id):
        continue

      artifact = extract_verification_artifact(m)
      return {message_id: m.id, artifact}

    sleep(backoff())

  throw Timeout("No matching signup email")

This single change, “remember what you already looked at”, prevents a surprising amount of flakiness.
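The pseudocode leaves `backoff()` and the deadline handling undefined. One possible way to fill them in is sketched below, with exponential backoff, full jitter, and an injectable clock so the loop can be tested without real sleeping. The `list_messages` and `matches` callables and the `"id"` key on each message are assumptions; substitute whatever your mail API actually returns.

```python
import random
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 0.5, cap: float = 5.0) -> Iterator[float]:
    """Yield jittered delays whose upper bound doubles until it reaches `cap`."""
    delay = base
    while True:
        yield random.uniform(0, delay)
        delay = min(cap, delay * 2)


def wait_for_message(
    list_messages: Callable[[], list],
    matches: Callable[[dict], bool],
    timeout: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until a not-yet-seen message matches, or raise TimeoutError."""
    deadline = clock() + timeout
    seen = set()
    for delay in backoff_delays():
        for message in list_messages():
            if message["id"] in seen:
                continue  # never re-evaluate a message already looked at
            seen.add(message["id"])
            if matches(message):
                return message
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError("no matching signup email before the deadline")
        sleep(min(delay, remaining))  # never sleep past the deadline
```

Capping each sleep at the remaining time keeps the total wait bounded by `timeout`, which is the time budget the polling contract calls for.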

Correlation: make the right email easy to identify

Duplicates become dangerous when you cannot tell which email belongs to which attempt.

Correlation options, from strongest to weakest:

Inbox isolation: one disposable inbox per attempt (best)
Explicit attempt token in the email content: include the attempt_id in the template (works well for internal systems)
Custom header: add X-Correlation-Id: <attempt_id> when sending
Subject tags: helpful, but the easiest to break with localization or template changes

If you control the sender, a custom header is usually the cleanest option, because it avoids brittle HTML parsing. If you do not control the sender (third-party SaaS), inbox isolation and narrow matchers are your best tools.

For a deep dive into which headers are worth trusting, see the RFC that defines the message format: RFC 5322.
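For the custom-header option, here is a small sketch of both sides using Python's standard `email` package: the sender stamps the attempt id, and the test matches on it. The addresses, subject, and function names are placeholders. If your provider delivers messages as structured JSON, the receiving side would simply read the header from whichever field the payload exposes rather than parsing raw bytes.

```python
from email import message_from_bytes
from email.message import EmailMessage

CORRELATION_HEADER = "X-Correlation-Id"


def build_verification_email(to_addr: str, attempt_id: str, link: str) -> EmailMessage:
    """Sender side: attach the attempt id as a custom header."""
    msg = EmailMessage()
    msg["From"] = "noreply@example.com"   # placeholder sender
    msg["To"] = to_addr
    msg["Subject"] = "Verify your email"
    msg[CORRELATION_HEADER] = attempt_id
    msg.set_content(f"Confirm your address: {link}\n")
    return msg


def matches_attempt(raw_message: bytes, attempt_id: str) -> bool:
    """Test side: accept a message only if its header names this attempt."""
    received = message_from_bytes(raw_message)
    value = received.get(CORRELATION_HEADER)  # header lookup is case-insensitive
    return value is not None and value.strip() == attempt_id
```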

“Consume once” rules that stop replay loops

Once you have extracted the verification link or OTP, your automation should treat it as a one-time capability.

Implement these rules:

Store a consumed marker keyed by artifact_hash
Do not click a link or submit an OTP twice, even if the UI says “try again”
If redemption fails, stop and surface a debuggable error (do not retry blindly)

A simple database table is enough:

Column	Purpose
`artifact_hash`	Idempotency key, prevents double-consume
`attempt_id`	Links the consume to the run
`consumed_at`	Debuggability and audit
`result`	Success, already_used, expired, invalid

This is how you turn a potentially unbounded loop into a finite workflow.
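A minimal sketch of that table and the consume-once check, using SQLite from the standard library. The table name, the `redeem_once` helper, and the `redeem` callback (whatever actually clicks the link or submits the OTP) are hypothetical names introduced for illustration.

```python
import sqlite3
import time
from typing import Callable

SCHEMA = """
CREATE TABLE IF NOT EXISTS consumed_artifacts (
    artifact_hash TEXT PRIMARY KEY,   -- idempotency key
    attempt_id    TEXT NOT NULL,
    consumed_at   REAL NOT NULL,
    result        TEXT                -- success, already_used, expired, invalid
)
"""


def redeem_once(db: sqlite3.Connection, artifact_hash: str, attempt_id: str,
                redeem: Callable[[], str]) -> str:
    """Call `redeem` at most once per artifact_hash and record its result."""
    with db:
        cur = db.execute(
            "INSERT OR IGNORE INTO consumed_artifacts (artifact_hash, attempt_id, consumed_at) "
            "VALUES (?, ?, ?)",
            (artifact_hash, attempt_id, time.time()),
        )
    if cur.rowcount == 0:
        return "already_used"  # a marker already exists: do not redeem again
    # The marker is committed before redeeming, so a crash or exception here
    # leaves the artifact marked as consumed instead of inviting a blind retry.
    result = redeem()
    with db:
        db.execute(
            "UPDATE consumed_artifacts SET result = ? WHERE artifact_hash = ?",
            (result, artifact_hash),
        )
    return result
```

Writing the marker before the redemption is a deliberate trade-off: a failure midway costs one attempt, but it can never turn into a second click on the same link.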

LLM agents: prevent “autonomous resend” behavior with tool constraints

LLM agents are great at improvisation, which is exactly what you do not want in auth flows.

If an agent is allowed to:

trigger a signup
request a resend
read emails
click links

then a small parsing glitch can cause it to spam resends and produce a self-sustaining loop.

The fix is to give the agent constrained tools and explicit budgets:

create_signup_attempt() returns {attempt_id, email, inbox_id, expires_at}
wait_for_signup_email(attempt_id) returns a single message or a timeout
extract_verification_artifact(message) returns a single URL or OTP
redeem_artifact_once(attempt_id, artifact) enforces idempotency and returns a final status

Do not give the agent a generic “open browser and click anything in the email HTML” instruction. Prefer text extraction from structured JSON fields, then validate the URL against an allowlist before any navigation.
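The allowlist check can be quite small. The sketch below assumes a hypothetical `ALLOWED_HOSTS` set and a `DisallowedURL` exception; the host shown is a placeholder for your own application's domain.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = frozenset({"app.example.com"})  # placeholder: your app's verification host


class DisallowedURL(ValueError):
    """Raised when an extracted link must not be navigated to."""


def validate_verification_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> str:
    """Return the URL unchanged if it is safe to open, otherwise raise."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise DisallowedURL(f"unexpected scheme: {parts.scheme!r}")
    if parts.username or parts.password:
        # Blocks tricks such as https://app.example.com@evil.test/
        raise DisallowedURL("credentials in URL are not allowed")
    host = parts.hostname or ""
    if host not in allowed_hosts:
        raise DisallowedURL(f"host not on allowlist: {host!r}")
    return url
```

Comparing the full hostname, rather than checking for a substring or suffix, is what rejects lookalikes such as `app.example.com.evil.test`.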

Observability: log the identifiers that make duplicates explainable

When a signup test fails, you want to answer these questions within a minute:

Which attempt was this?
Which inbox was used?
How many messages arrived, and when?
Which message was selected?
Which artifact was extracted?
Had the artifact already been consumed?

A practical logging schema:

attempt_id
inbox_id
message_id
artifact_hash
delivery_method (webhook or polling)
latency_ms (send to receive)

If you use Mailhook, you can build this without parsing raw MIME, because messages are delivered as structured JSON and can be processed deterministically (see llms.txt for the canonical contract).
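One way to emit the logging schema above is a single JSON line per selected message, as in this sketch built on the standard `logging` module. The logger name and the `log_signup_email` helper are illustrative choices, not part of any provider's SDK.

```python
import json
import logging

logger = logging.getLogger("signup_email")

DELIVERY_METHODS = {"webhook", "polling"}


def log_signup_email(attempt_id: str, inbox_id: str, message_id: str,
                     artifact_hash: str, delivery_method: str, latency_ms: int) -> str:
    """Log one JSON line carrying every identifier needed to explain a duplicate."""
    if delivery_method not in DELIVERY_METHODS:
        raise ValueError(f"unknown delivery_method: {delivery_method!r}")
    record = {
        "attempt_id": attempt_id,
        "inbox_id": inbox_id,
        "message_id": message_id,
        "artifact_hash": artifact_hash,
        "delivery_method": delivery_method,
        "latency_ms": latency_ms,  # time from send to receive
    }
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line
```

Keeping every field on one line means a single search for an `attempt_id` or `artifact_hash` shows whether two log entries describe the same email or the same verification action.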

A short checklist for stopping duplicates and bot loops

Use this as a pre-merge gate for email-dependent signup tests:

Use inbox-per-attempt, not shared inboxes
Wait webhook-first, and keep polling as a fallback
Implement webhook idempotency and verify signed payloads
Implement artifact-level consume-once semantics
Add budgets (max resends, max wait time, max redemption attempts)
Log attempt_id, inbox_id, message_id, and artifact_hash

Where Mailhook fits

If your current approach depends on scraping a shared mailbox UI or parsing unpredictable HTML emails, duplicates and loops are almost guaranteed over time.

Mailhook provides the primitives that make signup automation boring again:

Create disposable inboxes via API
Receive emails as structured JSON
Get real-time webhook notifications (with signed payloads)
Use polling as a fallback retrieval path
Scale with batch processing, shared domains, or custom domain support

To integrate against the real API semantics and payload fields, start with Mailhook llms.txt, then explore the product at Mailhook.