Reliable webhooks: signatures, idempotency, and retries without chaos
Patterns for robust webhook ingestion across Stripe, GitHub, and critical partner integrations.
Last updated: 2/14/2026
Introduction: The silent failure of event-driven integration
Webhooks are the backbone of modern SaaS integration. Stripe sends payment confirmations. GitHub notifies CI/CD pipelines. Shopify triggers order fulfillment. Yet webhooks are also one of the highest-risk silent-failure points in production systems.
The danger is that webhook delivery looks simple—it's just an HTTP POST—but operates under fundamentally unreliable conditions:
- The sender will retry. Stripe retries failed live-mode deliveries with exponential backoff for up to three days. If your handler isn't idempotent, you'll process the same payment twice.
- The network will lie. Your server may process the event successfully, but the sender never receives the `200 OK` (due to a timeout or dropped connection). The sender retries, and you process it again.
- Attackers will forge. Without signature verification, anyone who discovers your webhook URL can send fake events (e.g., fabricating a "payment_succeeded" event to ship products for free).
Reliability depends on three non-negotiable pillars: strong signature verification, asynchronous ingestion, and idempotent execution.
Pillar 1: Signature verification
Every webhook provider worth its salt signs payloads with HMAC-SHA256 (or similar). The provider shares a secret key with you; every request includes a computed signature in a header.
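As a minimal sketch of what that check involves when a provider signs the body directly, the function below verifies a GitHub-style `X-Hub-Signature-256` header (`sha256=` followed by the hex HMAC of the raw body). The function name and parameters are illustrative, not part of any provider SDK.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Verifies a header of the form "sha256=<hex digest>".
// `secret` is the shared webhook secret; `rawBody` is the unparsed request body.
export function verifyGithubSignature(
  rawBody: string,
  header: string | undefined,
  secret: string,
): boolean {
  const prefix = 'sha256=';
  if (!header || !header.startsWith(prefix)) return false;

  const expected = createHmac('sha256', secret).update(rawBody, 'utf8').digest();
  const received = Buffer.from(header.slice(prefix.length), 'hex');

  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The constant-time comparison matters: a plain `===` on hex strings can leak how many leading characters matched.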
How it works (Stripe example)
```typescript
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
const webhookSecret = process.env.STRIPE_WEBHOOK_SECRET!;

// enqueueForProcessing is your own queue producer (not shown here).
export async function handleWebhook(req: Request): Promise<Response> {
  // On a Fetch-style Request, headers is a Headers object, so use .get()
  const signature = req.headers.get('stripe-signature');
  const rawBody = await req.text(); // Must be the raw body, not parsed JSON

  if (!signature) {
    return new Response('Missing signature', { status: 400 });
  }

  let event: Stripe.Event;
  try {
    event = stripe.webhooks.constructEvent(rawBody, signature, webhookSecret);
  } catch (err) {
    // Signature invalid: reject immediately, log for security audit
    return new Response('Invalid signature', { status: 401 });
  }

  // Signature valid: safe to process
  await enqueueForProcessing(event);
  return new Response('OK', { status: 200 });
}
```

Critical rules:
- Verify before any side effect. Never touch the database, send an email, or trigger a workflow before the signature is validated.
- Use the raw body. Parsing the JSON and re-serializing it can change formatting (key ordering, whitespace), invalidating the HMAC.
- Validate timestamps. Providers such as Stripe include a signed timestamp in the signature header. Reject events older than a reasonable window (e.g., 5 minutes) to prevent replay attacks.
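To make the timestamp rule concrete, here is a hedged sketch of a manual check for a Stripe-style header of the form `t=<unix seconds>,v1=<hex>`, where the signed payload is the timestamp, a dot, and the raw body. In practice the provider's SDK usually does this for you; the function name, the `nowSeconds` parameter, and the single-`v1` simplification are assumptions made for illustration.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

const TOLERANCE_SECONDS = 300; // the 5-minute window mentioned above

// Checks both the HMAC and the age of a "t=<unix seconds>,v1=<hex>" header.
// Simplification: if several v1 entries are present, only the last one is checked.
export function verifyTimestampedSignature(
  rawBody: string,
  header: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  const parts = new Map<string, string>();
  for (const item of header.split(',')) {
    const idx = item.indexOf('=');
    if (idx > 0) parts.set(item.slice(0, idx).trim(), item.slice(idx + 1).trim());
  }

  const t = parts.get('t');
  const v1 = parts.get('v1');
  const timestamp = Number(t);
  if (!t || !v1 || !Number.isInteger(timestamp)) return false;

  // Replay protection: reject anything outside the tolerance window.
  if (Math.abs(nowSeconds - timestamp) > TOLERANCE_SECONDS) return false;

  // The timestamp is part of the signed payload, so it cannot be altered
  // without invalidating the signature.
  const expected = createHmac('sha256', secret).update(`${t}.${rawBody}`, 'utf8').digest();
  const received = Buffer.from(v1, 'hex');
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Because the timestamp is covered by the HMAC, an attacker who captures a valid request cannot simply refresh the `t` value to slip past the window.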
Pillar 2: Asynchronous ingestion
The most common architectural mistake is processing the webhook synchronously inside the HTTP handler. This fails because:
- Timeout risk: Stripe gives you 20 seconds to return a `2xx`. Complex processing (database writes, third-party API calls, email sending) can easily exceed this.
- Retry amplification: If you time out, the sender retries, and your handler starts the work _again_, potentially creating duplicate side effects or overlapping transactions.
The correct pattern: Accept-and-Queue
```
POST /webhooks/stripe → Verify signature → Persist raw event → Return 200 → Done (< 100ms)
                                                  ↓
                                     Queue Worker picks up event
                                                  ↓
                                    Process with idempotency guard
```

The HTTP handler does three things and nothing more:
- Verify the signature.
- Persist the raw event envelope (including headers, delivery metadata, and the raw body) to a database or event store.
- Return `200 OK` immediately.
A background worker (SQS consumer, Bull queue processor, Kafka consumer) then picks up the event and processes it with full idempotency guarantees.
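The three handler steps can be sketched independently of any framework. Everything below is hypothetical scaffolding: `IngestDeps`, `StoredEvent`, and `ingestWebhook` are illustrative names, and the `verify`, `persist`, and `enqueue` functions stand in for your signature check, database insert, and queue client.

```typescript
// Hypothetical shapes for illustration; swap in your own database and queue client.
export interface StoredEvent {
  eventId: string;
  headers: Record<string, string>;
  rawBody: string;
  status: 'pending';
  receivedAt: Date;
}

export interface IngestDeps {
  verify(rawBody: string, headers: Record<string, string>): boolean;
  persist(event: StoredEvent): Promise<void>;
  enqueue(eventId: string): Promise<void>;
}

export async function ingestWebhook(
  rawBody: string,
  headers: Record<string, string>,
  deps: IngestDeps,
): Promise<{ status: number }> {
  // 1. Verify the signature before any side effect.
  if (!deps.verify(rawBody, headers)) return { status: 401 };

  // Parse only to read the provider's event id; the raw body is what gets stored.
  let eventId: unknown;
  try {
    eventId = JSON.parse(rawBody).id;
  } catch {
    return { status: 400 };
  }
  if (typeof eventId !== 'string') return { status: 400 };

  // 2. Persist the raw envelope, then hand the id to the queue.
  await deps.persist({ eventId, headers, rawBody, status: 'pending', receivedAt: new Date() });
  await deps.enqueue(eventId);

  // 3. Acknowledge. No business logic has run at this point.
  return { status: 200 };
}
```

Persisting before enqueueing means a lost queue message can still be recovered from the stored row; the reverse order could leave a worker looking for an event that was never saved.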
Pillar 3: Idempotent execution
Because retries are guaranteed, your webhook processing logic must be idempotent: executing the same event multiple times must produce the same result as executing it once.
Implementation pattern: Idempotency by event_id
Every major webhook provider includes a unique event identifier (event.id in Stripe, X-GitHub-Delivery in GitHub). Use this as a deduplication key:
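```typescript
// Assumes `db` is a Prisma-style client with a unique index on eventId, and that
// WebhookEvent and handlePaymentSucceeded are defined elsewhere in your codebase.
async function processWebhookEvent(event: WebhookEvent) {
  // 1. Make sure a row exists (the ingress handler normally inserts it as 'pending').
  //    Concurrent upserts can still raise a unique-constraint error; the queue retry absorbs it.
  await db.webhookEvents.upsert({
    where: { eventId: event.id },
    create: { eventId: event.id, status: 'pending', receivedAt: new Date() },
    update: {}, // leave an existing row untouched
  });

  // 2. Claim the event with a single conditional write. A separate read followed by
  //    a write would let two workers both pass the read and both start processing.
  const claim = await db.webhookEvents.updateMany({
    where: { eventId: event.id, status: { in: ['pending', 'failed'] } },
    data: { status: 'processing' },
  });
  if (claim.count === 0) {
    return; // Already completed, or another worker holds the claim: skip silently
  }

  // 3. Execute business logic
  try {
    await handlePaymentSucceeded(event.data);
    await db.webhookEvents.update({
      where: { eventId: event.id },
      data: { status: 'completed', processedAt: new Date() },
    });
  } catch (error) {
    await db.webhookEvents.update({
      where: { eventId: event.id },
      data: { status: 'failed', error: error instanceof Error ? error.message : String(error) },
    });
    throw error; // Re-throw so the queue retries
  }
}
```

Key states: pending → processing → completed / failed. The processing state prevents race conditions when two retries arrive simultaneously, provided the move into it is a single conditional write rather than a read followed by a write. A worker that crashes mid-run leaves the row in processing, so pair this with a timeout that returns stale claims to failed.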
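To see why the processing state blocks concurrent duplicates, the sketch below replaces the database with an in-memory `Map`. `InMemoryEventLog` and `processOnce` are hypothetical names used only for this illustration; a real deployment needs the same claim semantics enforced by the database, since workers do not share memory.

```typescript
type Status = 'pending' | 'processing' | 'completed' | 'failed';

export class InMemoryEventLog {
  private readonly statuses = new Map<string, Status>();

  // Returns true only for the caller that moves the event into 'processing'.
  tryClaim(eventId: string): boolean {
    const current = this.statuses.get(eventId) ?? 'pending';
    if (current === 'processing' || current === 'completed') return false;
    this.statuses.set(eventId, 'processing');
    return true;
  }

  finish(eventId: string, status: 'completed' | 'failed'): void {
    this.statuses.set(eventId, status);
  }
}

export async function processOnce(
  log: InMemoryEventLog,
  eventId: string,
  sideEffect: () => Promise<void>,
): Promise<'ran' | 'skipped'> {
  if (!log.tryClaim(eventId)) return 'skipped';
  try {
    await sideEffect();
    log.finish(eventId, 'completed');
    return 'ran';
  } catch (error) {
    log.finish(eventId, 'failed'); // a later retry may claim it again
    throw error;
  }
}
```

Calling `processOnce` several times for the same event id, whether concurrently or after completion, runs the side effect once; a failed run releases the claim so the queue's retry can pick it up.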
Deepening the analysis: Real-world failure scenarios
| Scenario | Without proper handling | With the three pillars |
|---|---|---|
| Stripe retries a payment_intent.succeeded event 3 times | Customer is charged once but goods are shipped 3 times, or 3 database records are created. | Idempotency check on event.id ensures the shipment triggers once. |
| Attacker discovers webhook URL | Fake invoice.paid events mark unpaid invoices as paid. | HMAC signature verification rejects all forged payloads. |
| Handler takes 25 seconds to process | Stripe times out, retries, handler now has two concurrent executions. | Handler returns 200 in <100ms. Queue worker processes asynchronously. |
| Network drops the 200 response | Stripe never sees the acknowledgment, retries. Handler processes again. | Idempotency guard detects the completed status, skips reprocessing. |
When webhook reliability accelerates delivery
Treating webhooks as first-class integration contracts prevents the most expensive class of bugs: silent data corruption in financial and transactional systems.
Decision prompts for your engineering context:
- Which side-effect operations require mandatory idempotency keys?
- How do you distinguish "in progress" from "completed" under concurrent retries?
- What retention window for raw events covers realistic network delays and partner SLA windows?
Continuous optimization roadmap
- Validate HMAC signature and timestamp at ingress. Reject invalid or replayed events before any processing.
- Return `2xx` fast and delegate to queue workers. The HTTP handler should complete in under 100ms.
- Enforce idempotency by provider `event_id`. Use the pending/processing/completed/failed state model, and move an event into processing with a single atomic conditional update.
- Run periodic reconciliation. Fetch recent events from the provider's API and compare against your processed events to catch anything that was dropped.
- Instrument retry, deduplication, and terminal-failure metrics. Track how often events are retried, deduplicated, and permanently failed.
- Stress-test duplicate and reorder scenarios in staging. Simulate the exact failure modes (duplicate delivery, out-of-order events, delayed retries) that production will encounter.
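The reconciliation step in the roadmap above reduces to a set comparison once both sides are loaded. The sketch below assumes you have already fetched recent event ids from the provider (for Stripe, its Events list endpoint is one option) and loaded local statuses keyed by event id; `reconcile` and `ReconciliationReport` are illustrative names.

```typescript
export interface ReconciliationReport {
  missing: string[];    // delivered by the provider, never recorded locally
  unfinished: string[]; // recorded locally but not in the 'completed' state
}

// Compares the provider's view of recent events against local processing state.
export function reconcile(
  providerEventIds: readonly string[],
  localStatuses: ReadonlyMap<string, string>,
): ReconciliationReport {
  const missing: string[] = [];
  const unfinished: string[] = [];

  for (const id of providerEventIds) {
    const status = localStatuses.get(id);
    if (status === undefined) {
      missing.push(id);
    } else if (status !== 'completed') {
      unfinished.push(id);
    }
  }
  return { missing, unfinished };
}
```

Events in `missing` can be re-fetched and pushed through the normal ingestion path; because processing is idempotent, replaying them is safe even if a delayed delivery arrives later.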
How to validate production evolution
Measure webhook reliability by tracking:
- Duplicate side effects prevented: How many duplicate payments, shipments, or notifications were blocked by idempotency guards?
- Manual reconciliation effort: How much time does the team spend manually fixing missed or double-processed events?
- Financial incidents from non-idempotent retries: How many incidents with direct revenue impact were caused by duplicate processing?
Want to convert this plan into measurable execution with lower technical risk? Talk to a web specialist with Imperialis to design, implement, and operate this evolution.