Automation / Halmurat T. / January 23, 2024 / 8 min

Testing Webhooks Requires a Different Strategy

Filed under: api-testing / framework-design / design-patterns

Table of Contents
  • API Testing: You Control the Conversation
  • Webhook Testing: The Event Controls You
  • Why Our API Test Patterns Failed
  • The Strategy That Actually Works
  • Part 1: Build a Webhook Listener
  • Part 2: Test Idempotency
  • Part 3: Solve the CI Reachability Problem
  • Part 4: Verify Signature Validation
  • The Pattern I Use Now
  • Your Next Step

We had 200+ API tests running green for months. Then the team added a payment webhook integration — Stripe sends a callback when a payment succeeds — and we tried testing it using the same request-response patterns. The tests passed locally about 70% of the time and failed in CI about 90% of the time. We spent a week debugging what we thought were environment issues before realizing the fundamental problem: webhook testing can’t use the same strategy as API testing, and treating them the same is why your tests are unreliable.

API Testing: You Control the Conversation

API testing is comfortable because you control both sides of the interaction. You send a request, you get a response, you assert on it. The flow is synchronous from the test’s perspective — even if the HTTP call is technically async, you wait for the response before asserting.

src/test/java/api/PaymentApiTest.java
@Test
public void createPayment_returnsConfirmation() {
    Response response = given()
        .body(new PaymentRequest("card_visa", 5000, "usd"))
        .post("/api/payments");

    // You control the timing: assert immediately after the response
    assertThat(response.statusCode()).isEqualTo(201);
    assertThat(response.jsonPath().getString("status")).isEqualTo("pending");
}

This is predictable. The test sends a request, waits for a response, and asserts. There’s no timing ambiguity. If the assertion fails, the data is wrong — not the timing.

Webhook Testing: The Event Controls You

Webhooks invert the relationship. You don’t call the webhook — the external system calls you. Your test has to:

  1. Set up a listener that can receive the callback
  2. Trigger the event that causes the webhook to fire
  3. Wait for the callback to arrive (or time out)
  4. Assert on the payload that was delivered

That “wait for the callback” step is where everything breaks. You don’t know when the webhook will fire. You don’t know how many times it might fire (retries). And in CI, you have an additional problem: the webhook sender needs a URL that actually reaches your test environment.

src/test/java/webhook/PaymentWebhookTest.java
@Test
public void paymentSucceeded_webhookDeliversPayload() throws Exception {
    // Step 1: Start a listener BEFORE triggering the event
    CompletableFuture<WebhookPayload> received = webhookListener.waitForEvent(
        "payment_intent.succeeded", Duration.ofSeconds(30)
    );

    // Step 2: Trigger the action that causes the webhook
    stripeTestHelper.createSuccessfulPayment("card_visa", 5000, "usd");

    // Step 3: Block until the webhook arrives or timeout
    WebhookPayload payload = received.get(30, TimeUnit.SECONDS);

    // Step 4: NOW you can assert
    assertThat(payload.getType()).isEqualTo("payment_intent.succeeded");
    assertThat(payload.getData().getAmount()).isEqualTo(5000);
}

Notice the inversion. The listener starts before the triggering action. If you trigger first and then start listening, you might miss the webhook entirely — it could arrive in the milliseconds between the trigger and the listener starting.
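One way to make that ordering requirement less fragile is to let the receiving side hold on to events that arrive before anyone is waiting for them. The sketch below is only an illustration of that idea, not the listener used in this post: `BufferingEventHub`, `publish`, and `waitFor` are hypothetical names, and the payload is reduced to a plain `String`.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: one future per event type, created by whichever side shows up first.
final class BufferingEventHub {
    private final Map<String, CompletableFuture<String>> slots = new ConcurrentHashMap<>();

    // Called by the HTTP handler when a webhook arrives.
    void publish(String eventType, String payload) {
        slots.computeIfAbsent(eventType, k -> new CompletableFuture<>()).complete(payload);
    }

    // Called by the test; the returned future is already complete if the event came early.
    CompletableFuture<String> waitFor(String eventType) {
        return slots.computeIfAbsent(eventType, k -> new CompletableFuture<>());
    }
}
```

Because both sides go through `computeIfAbsent`, it no longer matters which one runs first. The trade-off is that this sketch keeps only the first event per type, so a real suite would need to key on something narrower (such as an event ID) and reset the hub between tests.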

Why Our API Test Patterns Failed

Here’s what we tried first (and why it failed):

src/test/java/webhook/BrokenWebhookTest.java
// BAD: polling the database after triggering the event
@Test
public void paymentSucceeded_updatesDatabase() throws InterruptedException {
    stripeTestHelper.createSuccessfulPayment("card_visa", 5000, "usd");

    // Poll the database hoping the webhook has been processed
    Thread.sleep(5000); // How long is long enough? Nobody knows.

    Payment payment = paymentRepository.findLatest();
    assertThat(payment.getStatus()).isEqualTo("succeeded");
}

The Thread.sleep(5000) is the telltale sign of a broken webhook test. Five seconds is enough locally but not in a slow CI environment. Ten seconds works in CI but makes your suite painfully slow. And any fixed sleep is a lie — you’re guessing at timing instead of waiting for the actual event.
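If database state really is the only signal a test can observe, a bounded poll is at least a smaller lie than a fixed sleep: it returns as soon as the condition holds and gives up at a known deadline. The helper below is a minimal sketch under that assumption; `Poll.until` is a hypothetical name, not something from this project.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper: re-check a condition until it holds or a deadline passes.
final class Poll {
    static boolean until(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // deadline passed and the condition never held
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
        return true;
    }
}
```

In the broken test above, the sleep could become something like `Poll.until(() -> "succeeded".equals(paymentRepository.findLatest().getStatus()), Duration.ofSeconds(30), Duration.ofMillis(250))`. That removes the guesswork about duration, but it still observes a side effect rather than the event itself.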

The Strategy That Actually Works

After a week of pain, we settled on a four-part strategy that’s been reliable across three different projects since then.

Part 1: Build a Webhook Listener

Create a lightweight HTTP server in your test harness that receives webhook callbacks. This isn’t a mock — it’s a real endpoint that the webhook sender delivers to.

src/test/java/webhook/WebhookListener.java
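import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public class WebhookListener {
    // One pending future per event type, completed when a matching webhook arrives
    private final Map<String, CompletableFuture<WebhookPayload>> pending
        = new ConcurrentHashMap<>();
    private final HttpServer server;

    public WebhookListener(int port) throws IOException {
        // Start a real HTTP server that receives webhook callbacks
        server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/webhooks", this::handleWebhook);
        server.start();
    }

    public CompletableFuture<WebhookPayload> waitForEvent(
            String eventType, Duration timeout) {
        CompletableFuture<WebhookPayload> future = new CompletableFuture<>();
        pending.put(eventType, future);
        // Auto-timeout so tests don't hang forever (orTimeout requires Java 9+)
        future.orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS);
        return future;
    }

    // Release the port when the test run finishes
    public void stop() {
        server.stop(0);
    }

    private void handleWebhook(HttpExchange exchange) throws IOException {
        // parsePayload (not shown) deserializes the request body
        WebhookPayload payload = parsePayload(exchange);
        CompletableFuture<WebhookPayload> future = pending.remove(payload.getType());
        if (future != null) {
            future.complete(payload);
        }
        // -1 means "no response body"; 0 would start a chunked response
        exchange.sendResponseHeaders(200, -1);
        exchange.close();
    }
}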

The key is CompletableFuture — it lets the test block until the webhook arrives without polling. The orTimeout ensures tests fail fast instead of hanging indefinitely if the webhook never comes.
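To see both outcomes in isolation, here is a small self-contained sketch with no HTTP involved. `FutureTimingDemo` and `deliveredInTime` are hypothetical names; a delayed executor stands in for the handler thread that would normally complete the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo: a future completed from another thread, guarded by orTimeout (Java 9+).
final class FutureTimingDemo {
    static String deliveredInTime(long deliveryDelayMs, long timeoutMs) {
        CompletableFuture<String> future = new CompletableFuture<>();
        future.orTimeout(timeoutMs, TimeUnit.MILLISECONDS);

        // Simulates the webhook arriving on a different thread after a delay.
        CompletableFuture.delayedExecutor(deliveryDelayMs, TimeUnit.MILLISECONDS)
            .execute(() -> future.complete("payment_intent.succeeded"));

        try {
            return future.join(); // blocks without polling
        } catch (CompletionException e) {
            if (e.getCause() instanceof TimeoutException) {
                return "timed out";
            }
            throw e;
        }
    }
}
```

A call such as `deliveredInTime(20, 2000)` returns as soon as the simulated delivery happens rather than after the full two seconds, while `deliveredInTime(500, 50)` gives up after roughly 50 ms. That is the behavior a webhook test wants: wait exactly as long as needed, and no longer than the deadline.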

Part 2: Test Idempotency

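Webhook providers retry on failure. Stripe, for example, retries failed deliveries with exponential backoff, for up to three days in live mode according to its documentation, so the same event can reach your handler more than once. Your webhook handler needs to be idempotent, and your tests need to verify that.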

src/test/java/webhook/IdempotencyTest.java
@Test
public void duplicateWebhook_processedOnlyOnce() throws Exception {
    WebhookPayload payload = createPaymentPayload("evt_123", 5000);

    // Deliver the same webhook twice
    webhookHandler.process(payload);
    webhookHandler.process(payload);

    // Should only create one payment record
    assertThat(paymentRepository.findByEventId("evt_123")).hasSize(1);
}

Warning: We skipped this test initially. It cost us a production bug where duplicate Stripe retries created duplicate payment records. The test takes 30 seconds to write and would have caught the issue before it hit customers.
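For reference, the handler side of idempotency can be as small as an atomic "insert if absent" keyed on the event ID. The class below is a minimal in-memory sketch of that idea, not the handler from our project: `IdempotentPaymentHandler` and its methods are hypothetical, and the map stands in for what would more likely be a database table with a unique constraint on the event ID.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical handler: records a payment once per event ID, however often it is delivered.
final class IdempotentPaymentHandler {
    // Stand-in for a payments table with a unique constraint on the event ID.
    private final Map<String, Integer> paymentsByEventId = new ConcurrentHashMap<>();

    // Returns true if the event was processed, false if it was a duplicate delivery.
    boolean process(String eventId, int amountInCents) {
        return paymentsByEventId.putIfAbsent(eventId, amountInCents) == null;
    }

    int recordCount() {
        return paymentsByEventId.size();
    }
}
```

The check and the write happen in one atomic `putIfAbsent` call, which matters because retries can overlap with the original delivery. A separate "look up, then insert" sequence would leave a window where two deliveries both see no record.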

Part 3: Solve the CI Reachability Problem

This is the gotcha that catches everyone.

Warning: Locally, your webhook listener runs on localhost:8080 and test-mode callbacks reach it through a forwarder such as the Stripe CLI or a tunnel. In CI, there is usually no such forwarder: the container’s localhost isn’t reachable from the internet, and your webhook sender can’t deliver to a machine it can’t reach.

Three options, in order of my preference:

| Approach | Pros | Cons |
| --- | --- | --- |
| Mock the webhook sender | Fast, no network dependency, works everywhere | Doesn’t test the real integration |
| Use a tunnel (ngrok/localtunnel) | Tests the real integration | Adds external dependency, can be flaky |
| Use provider’s test/CLI tools | Stripe CLI can forward webhooks locally | Provider-specific, not all providers offer this |

For CI, I recommend mocking the webhook sender and running the real integration tests in a staging environment on a schedule (nightly). Trying to make real webhooks work in ephemeral CI containers is a maintenance nightmare — I’ve seen teams spend more time maintaining the tunnel setup than writing actual tests.

src/test/java/webhook/MockedWebhookTest.java
@Test
public void paymentWebhook_processesCorrectly() throws Exception {
    // In CI: simulate the webhook delivery directly
    WebhookPayload payload = createPaymentPayload("evt_test_123", 5000);
    webhookHandler.process(payload);

    // findByEventId returns a collection, as in the idempotency test
    List<Payment> payments = paymentRepository.findByEventId("evt_test_123");
    assertThat(payments).hasSize(1);
    assertThat(payments.get(0).getStatus()).isEqualTo("succeeded");
    assertThat(payments.get(0).getAmount()).isEqualTo(5000);
}

This tests your webhook processing logic without needing a reachable URL. Save the end-to-end webhook delivery test for a stable environment where the network is predictable.

Part 4: Verify Signature Validation

Idempotency prevents duplicate processing, but it doesn’t stop someone from crafting a fake payload and hitting your endpoint. If your webhook handler doesn’t verify the signature header, anyone who discovers your webhook URL can trigger fake payment events — and your system will happily process them. Stripe signs every payload with a secret; your handler should reject anything that doesn’t match.

src/test/java/webhook/SignatureValidationTest.java
@Test
public void invalidSignature_rejectsPayload() {
    WebhookPayload payload = createPaymentPayload("evt_123", 5000);
    String invalidSignature = "invalid_sig_abc123";

    assertThrows(SignatureVerificationException.class, () ->
        webhookHandler.processWithSignature(payload, invalidSignature)
    );

    // Verify the forged payload was never persisted
    assertThat(paymentRepository.findByEventId("evt_123")).isEmpty();
}

Don’t stop at invalid signatures — test for replay attacks too. Stripe’s signature header includes a timestamp, and your handler should reject payloads older than a threshold (we use 5 minutes). Re-sending a legitimately signed but stale payload is a real attack vector, and a one-line timestamp check in your handler plus one test to verify it closes that gap.
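To make the two checks concrete, here is a minimal sketch modelled on the scheme Stripe documents: an HMAC-SHA256 over the timestamp, a dot, and the raw request body, keyed with the endpoint secret. `SignatureCheck`, `sign`, and `isValid` are hypothetical names, parsing of the signature header is left out, and in a real project the provider’s SDK verifier is usually the better choice.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical verifier: signature check plus a replay window.
final class SignatureCheck {
    static final long TOLERANCE_SECONDS = 300; // the 5-minute threshold mentioned above

    // Hex-encoded HMAC-SHA256 over "<timestamp>.<body>".
    static String sign(String secret, long timestamp, String body) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal((timestamp + "." + body).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    static boolean isValid(String secret, long timestamp, String body,
                           String signature, long nowSeconds) {
        // Reject stale (or far-future) timestamps: a correctly signed replay fails here.
        if (Math.abs(nowSeconds - timestamp) > TOLERANCE_SECONDS) {
            return false;
        }
        byte[] expected = sign(secret, timestamp, body).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signature.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, actual); // constant-time comparison
    }
}
```

Passing `nowSeconds` in as a parameter keeps the replay test deterministic: the test can sign a payload, then call `isValid` with a clock value 301 seconds later and expect a rejection, without sleeping or mocking the system clock.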

The Pattern I Use Now

After building webhook test infrastructure on three projects, the pattern is always the same:

  1. Unit test your webhook handler — mock the payload, verify processing logic, idempotency, and signature validation
  2. Integration test with a listener — real HTTP server, trigger real events in test mode, assert on received payloads
  3. E2E test in staging only — real webhook delivery from the provider, run nightly not on every PR

If you want to route webhook test failures to the right team in a multi-squad setup, the same squad tagging pattern we use for Cucumber scenarios works here too. And if your CI pipeline already uses GitHub Actions for test execution, the mock approach integrates cleanly without any additional infrastructure.

Your Next Step

Pick one webhook integration in your project. Write one test that verifies the happy path using a CompletableFuture listener instead of Thread.sleep. If that single test is more reliable than what you have now, you’ve got your pattern — apply it everywhere else.

Stop treating webhooks like APIs. They’re fundamentally different, and your testing strategy should reflect that.
