Shared Session Cookies Corrupted Our Parallel Tests

Account names were changing to values from tests that never touched them. A billing account named SmallBusiness_Ontario_2024 would suddenly become TC2609796_EnergyUser_1710345821 — a name pattern that belonged to an energy-sector test class running on a completely different thread. No test in the billing class ever wrote that string. The data was crossing boundaries that should have been impossible, and it took us three weeks of wrong assumptions before we found the actual cause: every thread in our parallel suite was sharing the same server-side session.

What Were the Symptoms?

The failures were maddening because they had no consistent pattern — except that they all involved data showing up where it didn’t belong. This is the hallmark of a server-side session collision, not a client-side state leak.

Our test suite ran 340+ tests across 8 parallel threads using Java 21, Playwright, and TestNG with parallel="methods". The failures started showing up after we scaled from 4 threads to 8. At 4 threads, we saw maybe one mysterious failure per week. At 8, it was 5-10 per run.

The error messages were bizarre:

FAILED: testAccountRename_SmallBusiness
  Expected account name: "SmallBusiness_Ontario_2024"
  Actual account name:   "TC2609796_EnergyUser_1710345821"

FAILED: testBillingAddressUpdate_Residential
  Expected address line 1: "742 Evergreen Terrace"
  Actual address line 1:   "100 Industrial Pkwy"
  → This address belongs to the CommercialAccount test class

The TC2609796_EnergyUser_<timestamp> pattern was a test data naming convention from our energy-sector test module. That module ran in a completely different TestNG group. No import, no shared utility, no common data file connected it to the billing tests. The string had no business being on that account.

Where Did We Look First (And Why It Was Wrong)?

We spent nearly two weeks chasing browser-side contamination. Every hypothesis led to a real finding, but none of them were the root cause.

Hypothesis 1: Playwright BrowserContext leak. We assumed Playwright was sharing state between contexts. We audited every browser.newContext() call, verified each test got a fresh context, and confirmed teardown was happening in @AfterMethod. Contexts were properly isolated — Playwright wasn’t the problem.

Hypothesis 2: Thread.sleep timing races. A few tests used Thread.sleep(2000) before assertions instead of proper waits. We replaced them with explicit Playwright waits — page.locator().waitFor() and page.waitForURL() calls. Test stability improved marginally, but the cross-contamination continued.

Hypothesis 3: Static field visibility. We grep’d for static fields across the framework — anything mutable that multiple threads could touch. We found a few static DateTimeFormatter instances (immutable, not a problem) and one static List<String> used for tracking test IDs (a real bug, but not the cause of the data corruption).

Hypothesis 4: Missing copyAccount() call. Our test data setup required calling copyAccount() to clone a template account before each test modified it. We found one test class that skipped this step and was mutating the template directly. We fixed it. The cross-contamination still happened.

Each of these was a legitimate issue worth fixing. But none explained how TC2609796_EnergyUser_1710345821 ended up on an account that only the billing test class ever accessed.

How Was the Test Framework Architected?

Understanding the architecture is critical because the optimization that caused the bug was invisible if you only looked at the test code.

The stack: Java 21 + Playwright (Java binding) + TestNG (parallel="methods", 8 threads) + Google Guice for dependency injection. Each test method ran on whichever TestNG thread was available, with no thread affinity.

The critical piece was the authentication optimization. Logging in through Okta took 4-6 seconds per test — Okta’s redirect chain, MFA bypass, consent page, redirect back. With 340 tests, that’s 25+ minutes just on login overhead. So we optimized:

public class AuthenticationStateService {
    private static final String AUTH_STATE_FILE = "auth-state-qa1.json";

    @BeforeSuite
    public void authenticateOnce() {
        Browser browser = playwright.chromium().launch();
        BrowserContext context = browser.newContext();
        // Log in once via Okta, save the cookies
        performOktaLogin(context);
        context.storageState(new BrowserContext
            .StorageStateOptions()
            .setPath(Paths.get(AUTH_STATE_FILE)));
        context.close();
        browser.close();
    }
}

Then every test loaded those same cookies:

public class BrowserProvider {
    public BrowserContext createContext() {
        return browser.newContext(
            new Browser.NewContextOptions()
                // Every thread loads the SAME cookie file
                .setStorageStatePath(
                    Paths.get("auth-state-qa1.json"))
        );
    }
}

This looked correct. Each test got its own BrowserContext — Playwright’s isolation boundary. Cookies, localStorage, and sessionStorage are scoped to the context. The Playwright docs on authentication explicitly recommend this pattern for reusing auth state.

The problem is that Playwright’s isolation is browser-side only.

What Was Actually Happening on the Server?

Here’s what the architecture looked like at runtime with 8 threads:

Thread-14  →  BrowserContext A  →  Cookie: session_id=abc123  ─┐
Thread-15  →  BrowserContext B  →  Cookie: session_id=abc123  ─┤
Thread-16  →  BrowserContext C  →  Cookie: session_id=abc123  ─┤
Thread-17  →  BrowserContext D  →  Cookie: session_id=abc123  ─┤  →  Server: ONE session
Thread-18  →  BrowserContext E  →  Cookie: session_id=abc123  ─┤     (shared state map)
Thread-19  →  BrowserContext F  →  Cookie: session_id=abc123  ─┤
Thread-20  →  BrowserContext G  →  Cookie: session_id=abc123  ─┤
Thread-21  →  BrowserContext H  →  Cookie: session_id=abc123  ─┘

Eight browser contexts. Eight isolated cookie jars. But every jar contained the same session cookie, because they all loaded from the same auth-state-qa1.json file.

From the server’s perspective, all 8 threads were the same user in the same session. The application server maintained a single session object keyed by session_id=abc123. That session object held state: the currently viewed account, the last navigation context, pending form data, CSRF tokens.

When Thread-14 opened account SmallBusiness_Ontario_2024 and Thread-17 called updateAccountName("TC2609796_EnergyUser_1710345821"), the server’s session state determined which account to update. If Thread-17’s request arrived while the server’s session still had Thread-14’s account context loaded, the rename hit the wrong account.

This application was a legacy enterprise portal that stored the active account in server-side session state — a pattern common in older Java EE apps. Modern stateless APIs pass the account ID in the request URL or body, but session-stateful apps like this one are still everywhere in enterprise QA.

// Server session state (shared across all 8 threads because same session_id)
HttpSession session = request.getSession(); // Same session for ALL threads

// Thread-14's request: view billing account
session.setAttribute("currentAccount", "ACC-4521"); // SmallBusiness_Ontario

// Thread-17's request (arrives 50ms later): rename energy account
// But session.getAttribute("currentAccount") still returns ACC-4521!
String accountToRename = (String) session.getAttribute("currentAccount"); // getAttribute returns Object
accountService.rename(accountToRename, "TC2609796_EnergyUser_1710345821");
// Renamed the WRONG account

This is a textbook race condition, but it’s invisible from the test side. The tests were correct. The browser isolation was correct. The server was treating 8 concurrent conversations as one.
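The interleaving above can be reproduced without a browser or a server. The following is a minimal sketch, not the portal's actual code: the two maps stand in for the server's session store and account table, and the account ID ACC-9001 and the name EnergyTemplate are hypothetical placeholders for the energy test's own account. The requests are replayed in a fixed order so the outcome is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionCollisionDemo {
    // Hypothetical stand-ins for the server: session state keyed by session ID,
    // and account names keyed by account ID.
    static final Map<String, Map<String, String>> sessions = new HashMap<>();
    static final Map<String, String> accountNames = new HashMap<>();

    // "View account" stores the active account in the caller's session.
    static void viewAccount(String sessionId, String accountId) {
        sessions.computeIfAbsent(sessionId, id -> new HashMap<>())
                .put("currentAccount", accountId);
    }

    // "Rename" resolves the target from session state, not from the request.
    static void renameCurrentAccount(String sessionId, String newName) {
        String accountId = sessions.get(sessionId).get("currentAccount");
        accountNames.put(accountId, newName);
    }

    // Replays one interleaving and returns the resulting account names.
    static Map<String, String> run(String billingSession, String energySession) {
        sessions.clear();
        accountNames.clear();
        accountNames.put("ACC-4521", "SmallBusiness_Ontario_2024");
        accountNames.put("ACC-9001", "EnergyTemplate"); // hypothetical energy account

        viewAccount(energySession, "ACC-9001");   // energy test opens its account
        viewAccount(billingSession, "ACC-4521");  // billing test opens its account
        renameCurrentAccount(energySession, "TC2609796_EnergyUser_1710345821");
        return new HashMap<>(accountNames);
    }

    public static void main(String[] args) {
        System.out.println("shared session:    " + run("abc123", "abc123"));
        System.out.println("separate sessions: " + run("abc123", "def456"));
    }
}
```

With one session ID for both callers, the rename lands on ACC-4521. With two session IDs, the same request order renames only the energy account, which is the property the fix needs to restore.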

How Did We Fix It?

The fix was straightforward once we understood the root cause: each thread needed its own server-side session, which meant each thread needed its own authentication cookies.

Per-Thread Auth State Files

Instead of one auth-state-qa1.json shared by all threads, each thread logs in independently on its first test and saves its own cookie file:

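import com.microsoft.playwright.BrowserContext;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;

public class AuthenticationStateService {

    // One state file per thread, so each thread owns its own server-side session
    public String getThreadStateFilePath() {
        long threadId = Thread.currentThread().threadId();
        return String.format("auth-state-qa1-thread-%d.json", threadId);
    }

    public boolean hasValidThreadState() {
        Path statePath = Paths.get(getThreadStateFilePath());
        if (!Files.exists(statePath)) return false;
        try {
            // Check if file is less than 30 minutes old
            Instant modified = Files.getLastModifiedTime(statePath).toInstant();
            return Duration.between(modified, Instant.now()).toMinutes() < 30;
        } catch (IOException e) {
            // Unreadable state file: treat it as missing so the thread logs in again
            return false;
        }
    }

    public void saveThreadState(BrowserContext context) {
        context.storageState(new BrowserContext
            .StorageStateOptions()
            .setPath(Paths.get(getThreadStateFilePath())));
    }
}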

Updated BrowserProvider

The browser provider loads the thread-specific file instead of the shared one:

public class BrowserProvider {
    @Inject private AuthenticationStateService authService;

    public BrowserContext createContext() {
        String stateFile = authService.getThreadStateFilePath();
        return browser.newContext(
            new Browser.NewContextOptions()
                .setStorageStatePath(Paths.get(stateFile)));
    }
}

Login-Once-Per-Thread in BaseTestDI

The base test class handles login and state persistence. Each thread logs in once, then reuses its own cookies for subsequent tests on that same thread:

@BeforeMethod
public void setUp() {
    BrowserContext context;
    if (authService.hasValidThreadState()) {
        context = browserProvider.createContext(); // Loads thread-specific cookies
    } else {
        context = browser.newContext(); // Fresh context, no cookies
        Page page = context.newPage();
        performOktaLogin(page);
        // Wait for Okta redirect to complete before saving
        page.waitForURL("**/dashboard**");
        authService.saveThreadState(context);
    }
    this.page = context.newPage();
}

The key detail is the page.waitForURL("**/dashboard**") before saving state. Without it, you save cookies mid-redirect and get an auth state file with expired or incomplete tokens. We learned this the hard way — our first implementation of per-thread state saved too early and every thread re-authenticated on every test.

The Result

After deploying per-thread auth state:

Before: 8 threads sharing 1 server session. 5-10 cross-contamination failures per run. Account names appearing from unrelated test classes.

After: 8 threads, 8 independent server sessions. Zero cross-contamination failures. 340 tests, 8 threads, 12-minute full suite.

The login overhead increased from one 5-second login to eight 5-second logins: 40 seconds of login time in total across the threads, paid once at the start of the run. Since each thread reuses its own cookies for subsequent tests, the per-test login cost stayed at zero. That is roughly 35 extra seconds of login time in exchange for eliminating every shared-session failure in the suite.
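The login count can be checked with a small simulation. This is a sketch under stated assumptions, not the suite's real setup: a fixed thread pool stands in for TestNG's workers, a ThreadLocal stands in for the per-thread state file, and the "login" is just a counter increment. The class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LoginCountDemo {
    /** Runs `tests` no-op tests on `threads` workers and returns how many logins happened. */
    static int countLogins(int tests, int threads) {
        AtomicInteger logins = new AtomicInteger();
        // Each worker "logs in" lazily, the first time it runs a test.
        ThreadLocal<String> session =
                ThreadLocal.withInitial(() -> "session-" + logins.incrementAndGet());

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tests; i++) {
            pool.execute(() -> session.get()); // later tests on a thread reuse its session
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tests", e);
        }
        return logins.get();
    }

    public static void main(String[] args) {
        int logins = countLogins(340, 8);
        System.out.println(logins + " logins for 340 tests, about "
                + logins * 5 + " seconds of login time at 5 seconds each");
    }
}
```

The number of logins is bounded by the number of worker threads rather than the number of tests, which is why the per-test cost does not grow with the suite.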

Why Does This Matter Beyond Our Codebase?

Browser isolation without session isolation is an illusion. Playwright’s BrowserContext, Selenium’s WebDriver instances, Cypress’s cy.session() — they all isolate state on the browser side. None of them guarantee that the server treats separate browser contexts as separate sessions. If two contexts send the same session cookie, the server sees one user.

Session contamination is one symptom of a shared environment problem. For the full isolation strategy, see building a controlled test environment.

This is the same category of bug I wrote about in the race condition that hid behind our retry config and why shared test users break parallel execution, but it’s more insidious. In those cases, the shared resource was obvious — a test user, a database record. Here, the shared resource was invisible. The cookie file looked like an implementation detail of an auth optimization. Nobody reviewed it as a concurrency concern.

If you’re running parallel tests with a shared auth optimization, audit your setup with this checklist:

Parallel Auth Isolation Checklist

Count your session cookies. After all threads are running, how many unique session_id values exist? If the answer is less than your thread count, you have shared sessions.
Check your auth state files. Are you loading the same file across threads? Search for storageState, setStorageStatePath, cy.session(), or whatever your framework’s auth persistence mechanism is. If the path is static (no thread ID, no worker ID), it’s shared.
Verify at the server. Hit a /whoami or session-info endpoint from two parallel threads. Compare the session IDs in the response. If they match, your threads are sharing a server-side session.
Test the symptom. Run your suite with workers: 1 (or threads: 1). If cross-contamination failures disappear, shared state is the cause. Then run at full parallelism and check whether the failure pattern correlates with data from other test classes appearing in your assertions.
Audit session-dependent operations. Any operation that reads “current” state from the session — current account, current cart, current user preferences — is vulnerable. If the server resolves “current” from the session rather than from the request URL or request body, you need session isolation.
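The first checklist item can be automated once each test logs its thread name and session cookie value. The sketch below assumes those pairs have already been collected (how they are collected depends on the framework); the Observation record and the findShared helper are hypothetical names, not part of any library.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SharedSessionAudit {
    /** One log entry: which thread saw which session_id cookie value. */
    record Observation(String thread, String sessionId) {}

    /** Returns every session ID seen by more than one thread, with the threads involved. */
    static Map<String, Set<String>> findShared(List<Observation> observations) {
        Map<String, Set<String>> threadsBySession = new TreeMap<>();
        for (Observation o : observations) {
            threadsBySession.computeIfAbsent(o.sessionId(), id -> new TreeSet<>())
                            .add(o.thread());
        }
        // Keep only the session IDs that appear on two or more threads.
        threadsBySession.values().removeIf(threads -> threads.size() < 2);
        return threadsBySession;
    }

    public static void main(String[] args) {
        List<Observation> log = List.of(
                new Observation("Thread-14", "abc123"),
                new Observation("Thread-15", "abc123"),
                new Observation("Thread-16", "def456"));
        System.out.println(findShared(log)); // prints {abc123=[Thread-14, Thread-15]}
    }
}
```

An empty result means every thread has its own session; any entry in the map names the threads that need separate auth state.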

Frequently Asked Questions

Does Playwright's BrowserContext guarantee server-side session isolation?

No. BrowserContext isolates browser-side state — cookies, localStorage, sessionStorage — but does not create a new server-side session. If two contexts load the same session cookie from a shared storageState file, the server treats them as one session. All requests with the same session_id cookie hit the same session object on the server, regardless of which BrowserContext sent them.

Why didn't the bug show up with 4 threads?

With fewer threads, the window for concurrent requests hitting the same session was smaller. The race condition existed at 4 threads but triggered rarely enough (~1/week) to be dismissed as flakiness. At 8 threads, collisions rose to 5-10 failures per run, enough to force investigation.
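One rough way to see why doubling the threads more than doubles the exposure is to count the pairs of threads that can interleave on the shared session. This is a simplified model and an assumption on my part, not a measurement from the suite: it ignores request timing entirely and does not by itself explain the full jump in failure rate.

```java
public class CollisionPairs {
    /** Number of distinct thread pairs that can interleave on one session (n choose 2). */
    static int pairs(int threads) {
        return threads * (threads - 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println("4 threads: " + pairs(4) + " pairs"); // prints 4 threads: 6 pairs
        System.out.println("8 threads: " + pairs(8) + " pairs"); // prints 8 threads: 28 pairs
    }
}
```

Going from 4 to 8 threads raises the number of interleaving pairs from 6 to 28, so the opportunities for a collision grow much faster than the thread count.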

Does per-thread login negate the performance benefit of shared auth state?

Mostly no. Each thread logs in once on its first test and reuses its own cookies for all subsequent tests on that thread. With 8 threads and 340 tests, that’s 8 logins instead of 340. Compared to the original 1-login optimization, it adds ~35 seconds total — but eliminates every cross-contamination failure.

Does this apply to Selenium and Cypress too?

Yes. Any framework that reuses cookies or session tokens across parallel workers is susceptible. Selenium’s cookie injection via driver.manage().addCookie(), Cypress’s cy.session(), and custom token-caching strategies all share the same risk. Browser-side isolation does not imply server-side isolation.

How do I confirm shared sessions are the cause without changing code?

Log the session_id cookie value at the start of each test, prefixed with the thread ID. If multiple threads report the same session_id, they share a server-side session. This takes 5 minutes to add and doesn’t require any framework changes.
