Halmurat T.

Senior SDET


Automation · Halmurat T. · March 24, 2026 · 13 min read

Parallel Execution Without the Refactor Tax

Filed under: selenium / parallel-execution / framework-design / testng

Table of Contents
  • What Are the Two Levels of Parallelism?
  • Why Does Thread-Level Parallelism Require So Much Refactoring?
  • How Does Process-Level Parallelism Avoid the Refactor Tax?
  • The Automation Server Pattern
  • BrowserStack SDK: Cloud-Managed Process-Level Parallelism
  • How Much Faster Did the Suite Get?
  • What Actually Broke (And What I Had to Fix)
  • How Do You Decide Between Thread-Level and Process-Level Parallelism?
  • Your Next Step


Every parallel execution guide I’ve read jumps straight to ThreadLocal, DriverFactory redesigns, and test data isolation patterns. That’s one way to do it — and sometimes the right way. But it’s not the only way. There are actually two levels of parallelism in test automation, and picking the wrong one is why teams spend weeks refactoring frameworks that didn’t need it. Understanding that difference is what took our suite from 4 hours to 30 minutes — without a framework rewrite.

What Are the Two Levels of Parallelism?

Most teams treat “parallel execution” as a single concept. It’s not. There are two fundamentally different approaches, and they have completely different costs.

Thread-level parallelism runs multiple test methods as separate threads inside a single JVM process. All threads share the same memory space. This is what you get with TestNG’s parallel execution modes or JUnit 5’s junit.jupiter.execution.parallel.enabled=true. It’s fast and efficient, but every piece of shared state — your WebDriver instance, test data, reporting context — becomes a potential race condition. This is the level that demands ThreadLocal, user isolation patterns, and careful framework design.
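For concreteness, here is what switching on thread-level parallelism looks like in TestNG; a minimal sketch, where the suite name and package are placeholders:

```xml
<!-- testng.xml: run test methods on up to 4 threads inside one JVM -->
<suite name="regression" parallel="methods" thread-count="4">
  <test name="all-tests">
    <packages>
      <package name="tests.*"/>
    </packages>
  </test>
</suite>
```

The `parallel` and `thread-count` attributes are real TestNG suite options; the one-line switch is exactly why this mode looks cheap until the shared-state problems surface.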

Process-level parallelism runs tests in completely separate OS processes. Each process has its own memory space, its own JVM (if applicable), and its own state. Processes can’t accidentally share a WebDriver instance because they physically can’t access each other’s memory. No ThreadLocal needed. No DriverFactory rewrite. The isolation is built into the operating system.
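The isolation is easy to demonstrate from a shell, since each backgrounded subshell is a separate OS process; a toy sketch (the session name is made up):

```shell
# Processes do not share memory: a variable set in one child process is
# invisible to its sibling. No ThreadLocal needed; the OS enforces it.
(DRIVER="chrome-session-1"; echo "process A sees DRIVER=$DRIVER") &
(echo "process B sees DRIVER=${DRIVER:-unset}") &
wait
```

Process A sets the variable in its own memory; process B, a sibling process, never sees it.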

| | Thread-Level | Process-Level |
|---|---|---|
| How it works | Multiple threads in one JVM | Separate OS processes, each with its own JVM |
| Memory | Shared heap — all threads see the same objects | Isolated — processes can’t access each other’s memory |
| Isolation | You enforce it (ThreadLocal, careful design) | The OS enforces it (free) |
| Shared state risk | High — static fields, singletons, and shared refs are all race conditions | None — processes physically can’t share state |
| Framework changes needed | ThreadLocal wrappers, user isolation, reporting context scoping | Minimal — fix test data collisions and implicit ordering |
| Resource overhead | Low — threads are lightweight | Higher — each process loads its own JVM |
| Startup time | Fast — threads spin up in milliseconds | Slower — JVM startup per process |
| Debugging | Harder — race conditions are non-deterministic | Easier — failures are isolated to one process |
| Tools | TestNG parallel=methods, JUnit 5 parallel | Maven forkCount, separate user accounts, Docker containers, BrowserStack SDK |
| Best for | New frameworks designed for thread safety | Existing frameworks that need parallel execution without a rewrite |
[ NOTE ]

The distinction matters because most “how to run tests in parallel” guides assume thread-level parallelism and prescribe ThreadLocal as the cure. If you choose process-level parallelism instead, you skip that entire class of problems. The trade-off is different — more resource overhead per process — but for many teams, it’s the faster path to results.

Why Does Thread-Level Parallelism Require So Much Refactoring?

Thread-level parallelism requires heavy refactoring because all threads share the same heap memory, meaning any static field, singleton, or shared reference becomes a race condition when accessed from multiple test methods simultaneously. You need ThreadLocal discipline across your entire framework to prevent threads from interfering with each other.
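Here is the failure mode in miniature, with a plain Object standing in for a static WebDriver reference (a hypothetical stripped-down sketch, not code from the framework):

```java
// Why a static driver field breaks under thread-level parallelism:
// every thread in the JVM reads the SAME static reference, which is
// exactly how two parallel tests end up driving one browser session.
public class SharedDriverDemo {
    static Object driver = new Object(); // stand-in for a static WebDriver field

    // Returns true when two threads observe the same instance.
    static boolean threadsShareDriver() {
        Object[] seen = new Object[2];
        Thread t1 = new Thread(() -> seen[0] = driver);
        Thread t2 = new Thread(() -> seen[1] = driver);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0] == seen[1];
    }

    public static void main(String[] args) {
        System.out.println(threadsShareDriver()); // prints: true
    }
}
```

Both threads got the same instance, so any interleaving of their WebDriver commands lands on one browser session.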

I’ve written about this in detail — the 3-day thread safety bug that turned out to be a shared WebDriver instance leaking between threads. The fix was ThreadLocal, but the investigation and refactoring across the framework took weeks.

The traditional parallel execution path means solving three problems simultaneously:

  1. Thread-safe driver management — wrapping every WebDriver in ThreadLocal so threads don’t share browser sessions
  2. Test data isolation — ensuring parallel workers don’t share test users or collide on database records
  3. Infrastructure — standing up a Selenium Grid (or Docker Selenium) to handle concurrent browser sessions
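Item 1 above can be sketched in a few lines; again a plain Object stands in for WebDriver, and the class name is made up:

```java
// The ThreadLocal fix: each thread lazily gets its OWN instance, so two
// parallel tests can no longer share a session. Real code would hold a
// ThreadLocal<WebDriver> initialized with a driver factory instead.
public class ThreadLocalDriverDemo {
    static final ThreadLocal<Object> DRIVER = ThreadLocal.withInitial(Object::new);

    // Returns true when two threads observe the same instance.
    static boolean threadsShareDriver() {
        Object[] seen = new Object[2];
        Thread t1 = new Thread(() -> seen[0] = DRIVER.get());
        Thread t2 = new Thread(() -> seen[1] = DRIVER.get());
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0] == seen[1];
    }

    public static void main(String[] args) {
        System.out.println(threadsShareDriver()); // prints: false
    }
}
```

The pattern itself is five lines; the weeks go into applying the same discipline to test data, reporting context, and every helper that ever touched the old static field.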

That’s easily 2-3 weeks of work for a mid-size suite. It’s the right investment if you’re building a framework from scratch. But if you’re inheriting a mature framework with hundreds of tests, static driver instances, and a team that’ll push back on a multi-sprint refactor — there’s a faster path.

How Does Process-Level Parallelism Avoid the Refactor Tax?

Process-level parallelism avoids the ThreadLocal refactor entirely because each process runs in its own isolated memory space. Two processes literally cannot share a WebDriver instance — they don’t have access to each other’s heap. The operating system enforces the isolation that ThreadLocal enforces at the language level.

I’ve used process-level parallelism in two very different setups over my career, and both worked for the same fundamental reason.

The Automation Server Pattern

At one enterprise project, we had a dedicated automation server with four OS-level user accounts. Each user account would clone the test repository independently, run a subset of the test suite in its own process, and drop the results into a shared network folder when finished. A simple script at the end merged the results.

run-parallel.sh — simplified
# Each user runs on the same server but as a separate OS process
su - testuser1 -c "cd /home/testuser1/repo && mvn test -Dsuite=group1" &
su - testuser2 -c "cd /home/testuser2/repo && mvn test -Dsuite=group2" &
su - testuser3 -c "cd /home/testuser3/repo && mvn test -Dsuite=group3" &
su - testuser4 -c "cd /home/testuser4/repo && mvn test -Dsuite=group4" &
wait
# Merge results from all four runs
cp /home/testuser*/repo/target/surefire-reports/*.xml /shared/results/

No ThreadLocal. No framework changes. Each user account was a completely isolated process with its own JVM, its own WebDriver, its own memory. The only shared resource was the results folder — and that was write-only at the end.

Was it elegant? No. Was it running in production for over a year, cutting our suite time by 4x? Yes.

We didn’t call it “process-level parallelism” at the time. We called it “we need this to run faster and we have a server with 16 cores sitting mostly idle.” But it was the same principle that modern tooling now formalizes.

BrowserStack SDK: Cloud-Managed Process-Level Parallelism

BrowserStack SDK is the same concept, productized. It’s a Java agent that intercepts your WebDriver creation at the JVM level. When your test calls new ChromeDriver(), the SDK replaces it with a remote BrowserStack session — without changing any code. Your tests think they’re running locally. They’re actually running on BrowserStack’s cloud. The official SDK documentation covers the full setup and benefits for TestNG.

Combined with Maven Surefire’s forkCount, each forked JVM process gets its own remote browser session. Same isolation principle as the four-user-account server, but managed by cloud infrastructure instead of bash scripts.

The setup is three steps:

1. Add the dependency:

pom.xml
<dependencies>
  <!-- Existing Selenium, TestNG dependencies stay unchanged -->
  <dependency>
    <groupId>com.browserstack</groupId>
    <artifactId>browserstack-java-sdk</artifactId>
    <version>LATEST</version>
  </dependency>
</dependencies>

2. Create the configuration:

browserstack.yml
userName: ${BROWSERSTACK_USERNAME}
accessKey: ${BROWSERSTACK_ACCESS_KEY}
framework: testng
parallelsPerPlatform: 5
platforms:
  - os: Windows
    osVersion: 11
    browserName: Chrome
    browserVersion: latest
  - os: OS X
    osVersion: Sonoma
    browserName: Safari
    browserVersion: latest
  - os: Windows
    osVersion: 11
    browserName: Firefox
    browserVersion: latest
browserstackLocal: false
buildName: "regression-${BUILD_NUMBER}"

3. Wire up Maven and run:

pom.xml — Surefire plugin
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- forkCount spawns separate JVM processes; each fork gets its own session -->
    <forkCount>4</forkCount>
    <reuseForks>false</reuseForks>
    <argLine>
      -javaagent:${com.browserstack:browserstack-java-sdk:jar}
    </argLine>
  </configuration>
</plugin>
terminal
mvn test

Your existing tests — with their existing new ChromeDriver() calls, existing page objects, existing assertions — now run on BrowserStack’s cloud across multiple parallel sessions. No ThreadLocal. No DriverFactory rewrite. No Selenium Grid to maintain.

How Much Faster Did the Suite Get?

  • Before (sequential): 4 hours
  • After (parallel): 30 min
  • Infrastructure maintenance: 0 hrs/week

An 8x improvement. And that included Safari and Firefox, which we’d never tested against. We found browser-specific failures that had been hiding in production — rendering issues and API inconsistencies that Chrome silently handled but Safari didn’t. The parallel execution was the primary goal, but the cross-browser coverage alone justified the BrowserStack license.

[ WARNING ]

Expect your first cloud run to be slower per-test than local. Network latency between your CI server and BrowserStack adds ~200ms per WebDriver command. A test that takes 30 seconds locally might take 45 seconds remotely. You compensate with parallelism — multiple sessions running simultaneously still crush a single session grinding through hundreds of tests sequentially.

What Actually Broke (And What I Had to Fix)

Process-level parallelism sidesteps ThreadLocal, but it doesn’t sidestep every problem. Parallel execution — at any level — exposes shortcuts that sequential runs hide. Here’s what I had to work through:

Test data collisions. Multiple tests created records using the same hardcoded identifiers. Sequentially, they ran and cleaned up one at a time. In parallel, they collided. I went through the data setup utilities and added timestamp suffixes to prevent collisions.

src/test/java/data/OrderData.java
public class OrderData {
    public static String uniqueProductId(String baseId) {
        // Timestamp suffix prevents collisions across parallel sessions
        return baseId + "-" + System.currentTimeMillis();
    }
}

Implicit test ordering. Several tests assumed another test had already created their test data. Sequential execution masked this because TestNG’s default ordering happened to run them in the right order. In parallel, the setup test sometimes ran on a different fork and finished last. I extracted shared setup logic into @BeforeClass methods where it belonged.

Utility class adjustments. A few shared helper classes had static state that didn’t survive process forking cleanly — cached configuration values and shared HTTP client instances that assumed a single execution context. None of these required a full ThreadLocal refactor, but they did need attention.

Config tuning. Getting the right parallelsPerPlatform count took experimentation. Too many parallel sessions and BrowserStack’s queue would back up. Too few and we weren’t getting the speedup we wanted.

The bulk of the tests passed without modification. But “most tests just worked” doesn’t mean “zero effort” — the tests that broke needed real investigation, and the utility fixes added up to a week of targeted work.

How Do You Decide Between Thread-Level and Process-Level Parallelism?

The right choice depends on where you are today and what constraint is tightest — time, money, or control.

| Situation | Level | Approach |
|---|---|---|
| Existing suite, need parallel fast | Process | BrowserStack SDK or multi-process runner — days, not sprints |
| Building a new framework from scratch | Thread | Design for ThreadLocal from day one |
| Running 10+ suites across multiple teams | Either | Self-hosted Grid (Selenoid) for thread-level; dedicated servers for process-level |
| Need cross-browser coverage, not just speed | Process | BrowserStack SDK — browser matrix is a config change |
| Pre-commit or fast-feedback loop | Thread | Local parallel with ThreadLocal — no network latency |
| Budget is tight, have spare hardware | Process | The automation server pattern — 4 users, 4 processes, shared results folder |

The resourceful move is to start with process-level parallelism as a bridge. Get results now, plan the proper ThreadLocal refactor for next quarter. When you eventually build thread-safe infrastructure, you can keep BrowserStack for cross-browser validation and move your primary speed-focused execution to local thread-level parallel.

I’ve seen too many teams spend 3 sprints “preparing for parallel” while their suite stays at 4 hours. The 80% solution running today beats the 100% solution planned for Q3. For more on building stable parallel test infrastructure, explore our test automation guides.

Your Next Step

Look at your current test execution setup and ask: are we even using the right level of parallelism? If your team has been blocked on a ThreadLocal refactor for months, try process-level first. Pick 10 tests, run them in separate processes (even something as simple as two terminal windows running different test groups), and see if they pass without changes. If they do — and most will — you’ve validated the approach without writing a single line of framework code.
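A toy version of that experiment, with echo/sleep stubs standing in for real test commands such as mvn test:

```shell
# Crude process-level parallelism: each backgrounded pipeline is its own
# OS process. Swap the echo/sleep stubs for real commands, e.g.
#   mvn test -Dsuite=group1
(echo "group1 start"; sleep 1; echo "group1 done") &
(echo "group2 start"; sleep 1; echo "group2 done") &
wait   # block until both child processes exit
echo "all groups finished"
```

If both groups pass here the way they pass sequentially, you have evidence the suite tolerates process-level isolation without framework changes.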

§ Frequently Asked Questions
+ Is process-level parallelism always better than thread-level?

No. Thread-level parallelism is more resource-efficient — threads share memory and have less startup overhead than separate processes. If you have a well-designed framework with proper ThreadLocal discipline, thread-level parallelism gives you better performance per CPU core. Process-level is better when you need parallel execution now and can’t afford the framework refactor.

+ Can I combine both levels of parallelism?

Yes, and many mature teams do. You can run process-level parallelism across machines or cloud sessions while also running thread-level parallelism within each process. But start with one level. Adding both simultaneously doubles the debugging surface when something breaks.
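Under Maven Surefire with the TestNG provider, the combination can be sketched like this; the counts are illustrative, not recommendations:

```xml
<!-- Two forked JVMs (process level), each running TestNG methods
     on 4 threads (thread level): up to 8 concurrent tests -->
<configuration>
  <forkCount>2</forkCount>
  <reuseForks>false</reuseForks>
  <parallel>methods</parallel>
  <threadCount>4</threadCount>
</configuration>
```

`forkCount`, `reuseForks`, `parallel`, and `threadCount` are real Surefire parameters; tune one level at a time so failures stay attributable.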

+ Does process-level parallelism work with Playwright?

Yes. Playwright already uses process-level isolation by default — each worker runs in a separate process. This is one reason Playwright suites tend to be more stable in parallel out of the box compared to Selenium with TestNG, which defaults to thread-level parallelism.

+ What about Docker containers — is that process-level or something else?

Docker containers are process-level parallelism with extra isolation. Each container is an isolated process with its own filesystem, network, and memory. Running tests in separate Docker containers gives you the same isolation benefits as separate OS processes, plus reproducibility and easy cleanup. Tools like Selenoid use this approach.

§ Further Reading

  • Your Test Suite Is Slow for 5 Reasons — Not Just One
    Parallelism alone won't save your slow test suite. Here are five layers of optimization — from test design to cloud infrastructure — ranked by real impact.

  • Selenium's Alert Handling Crashed Our Parallel Suite
    How UnhandledAlertException broke 8-thread parallel execution and why Playwright's event-driven dialog model avoids that entire failure pattern in practice.

  • The Browser Errors Your Test Suite Never Catches
    Your UI tests pass green while the console throws errors. Learn to catch JavaScript and page errors in Selenium and Playwright Java — before users do.


§ Colophon

Halmurat T. — Senior SDET writing about test automation, CI/CD, and QA strategy from 10+ years in the enterprise trenches.


© 2026 Halmurat T. · Written in plain text, shipped in plain time.
