Asjad Khan

•Feb 25, 2026•

When Tests Should Run Headless vs Headed in Playwright

When should you run tests in headless vs headed mode in Playwright? Learn the differences between the two modes and when to use each.

Your test suite works smoothly on your device. All of the tests have passed. You pushed the branch, kicked off the CI, and a few tests failed with timeouts. You check the logs, but find nothing. So you try running those exact tests locally in headless mode to match what CI does. They pass.

Now, you feel stuck. The failures only occur in CI's headless environment, but they cannot be reproduced. You toggle between modes a few times, maybe adjust some timeouts, and eventually the tests pass. Problem solved, right?

Not really. You just masked the issue. Headless and headed aren't just "browser window on vs off." By default, Playwright actually uses two different Chromium binaries: a lightweight Chromium headless shell for headless mode and the full Chromium browser for headed mode. They're separate implementations with different rendering behavior, GPU handling, and font rendering. What appears to be a flaky test is often a real behavioral difference between those two binaries.

When to use each mode

Here's the thing: most teams don't have a clear reason for running tests in one mode versus the other. They pick headless for CI because that's the default, headed for local work because it's easier to watch, and switch modes when something breaks, hoping it'll magically fix itself.

This guide explains when each mode is appropriate, what actually differs between them (and why), and how to stop treating mode-switching as a debugging strategy.

Use headless when:

Running tests in CI/CD pipelines
Scaling parallel test execution
Scraping the web or running routine health checks
You need faster execution for regression suites

Use headed when:

Debugging failed tests (especially timing issues)
Working with animations or CSS transitions
Developing new test automation
Troubleshooting hover states or drag interactions

The workflow that works

Write and debug in headed mode locally. Validate and run at scale in headless mode in CI. Don't switch modes randomly just to fix a flaky test.

Let's start with what Playwright actually runs under the hood, because that context changes how you think about everything else.

Headless Shell vs New Headless

Before getting into when to use each mode, it helps to understand what Playwright does behind the scenes. This isn't just "browser with a window" vs "browser without a window."

Playwright ships two separate Chromium binaries:

Chromium Headless Shell (default for headless): A stripped-down browser built on Chromium's //content module. It doesn't depend on X11/Wayland or D-Bus, and it has a smaller footprint. But it's a different implementation than what you see in headed mode. Think of it as a purpose-built scraping and automation binary, not "Chrome minus the window."
Full Chromium (used for headed mode): The real browser, with the complete rendering pipeline, GPU integration, and all the platform features you'd expect.

This means that when your test passes headed but fails headless, you're not just dealing with a missing window. You're running a different binary with different rendering behavior, different GPU handling, and different font rendering.

Since Chrome 112, there's a third option: the new headless mode. It runs the full Chrome browser without a visible window. Same code, same rendering, same behavior as headed mode. You can opt into it in Playwright by setting channel: 'chromium':

// playwright.config.ts
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      name: "chromium",
      use: { ...devices["Desktop Chrome"], channel: "chromium" },
    },
  ],
});

This eliminates most behavioral differences between headed and headless, at the cost of a larger binary and slightly more resource usage. If you don't need the lighter footprint of headless shell (most test suites don't), the new headless mode is the safer default. It's closer to what your users actually see.

There are trade-offs to be aware of in CI. Unlike headless shell — which has no X11/Wayland or D-Bus dependencies — the new headless mode uses the full Chromium binary, which depends on system libraries like libX11, libXcomposite, and D-Bus. You don't need a running display server (it's still headless), but these libraries must be installed on the host. Running npx playwright install --with-deps handles this, and Playwright's official Docker images include everything pre-configured. The binary is also larger, which increases CI download times. See issue #33566 for a full breakdown of the differences.

You can also control which binaries get installed. If you only run headless shell (the default), skip downloading the full browser:

npx playwright install --with-deps --only-shell

Or if you're using new headless mode exclusively, skip headless shell:

npx playwright install --with-deps --no-shell
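To check whether a failure is specific to one binary, it can help to define both variants as separate projects and run the failing spec against each. The following is a minimal sketch, not a recommended permanent setup. The project names "headless-shell" and "new-headless" are made up for illustration, and both binaries must be installed, so neither --only-shell nor --no-shell applies here.

```typescript
// playwright.config.ts
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      // Default behavior: headless runs use the headless shell binary.
      // The project name is hypothetical.
      name: "headless-shell",
      use: { ...devices["Desktop Chrome"] },
    },
    {
      // Opt-in: the full Chromium binary running in new headless mode.
      // The project name is hypothetical.
      name: "new-headless",
      use: { ...devices["Desktop Chrome"], channel: "chromium" },
    },
  ],
});
```

A single spec can then be run against one variant with npx playwright test --project=new-headless. If it fails only under headless-shell, the difference between the binaries is the likely cause.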

With that context, let's look at when each mode makes sense.

What Headless Mode Is Best For

Headless mode runs the browser without a visible window. The DOM, JavaScript, and rendering pipeline all work — layout, paint, and compositing still happen in an off-screen buffer (which is why screenshots work) — but nothing is drawn to a display. By default, Playwright uses the headless shell binary for this, which is lighter but behaviorally different from the full browser.

CI Environments Need Headless

CI environments typically don't have display servers. Running headed there would require X virtual framebuffer (Xvfb) on Linux or similar workarounds. Headless bypasses this entirely. You get a working browser without needing to configure display infrastructure.

Resource Efficiency at Scale

When you're running many parallel test workers on a CI machine, each headed browser instance competes for GPU resources. Headless shell uses software rendering, which is CPU-bound but more predictable across CI hardware configurations where GPU availability varies.

This matters when you're scaling horizontally: you can add more CI machines without worrying about GPU availability or display server limits.
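As a rough illustration of a parallel headless setup, the config below pins the worker count in CI and leaves Playwright's default in place locally. The value 4 is an assumption, not a recommendation. Size it to the CPU and memory of your CI machines.

```typescript
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Run tests within a file in parallel, not only across files.
  fullyParallel: true,
  // Assumption: 4 workers suits the CI machine. Locally, undefined keeps
  // Playwright's default worker count.
  workers: process.env.CI ? 4 : undefined,
  use: {
    headless: true,
  },
});
```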

Web Scraping and Data Extraction

Web scraping is where headless mode really shines. It gives you a fully functional browser without the overhead of drawing to a display or managing window focus.

When you're scraping a large number of pages, the rendering overhead quickly adds up. Headless mode lets you run multiple browser contexts simultaneously without worrying about display resource limits.

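// Efficient scraping setup with headless
import { chromium } from "playwright";

const browser = await chromium.launch({ headless: true });
const context = await browser.newContext();
const page = await context.newPage();
// ... navigate and extract data here ...
await browser.close();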

You get JavaScript execution, DOM manipulation, and network handling without paying the cost of GPU acceleration or visual rendering. That's exactly what data extraction needs.

Speed for Repetitive Tasks

Routine health checks that repeatedly hit the same endpoints don't require visual feedback. Headless shell typically runs faster in these scenarios because it avoids GPU-accelerated compositing and display overhead, even though layout and paint still happen internally.

The speed difference isn't huge for individual tests, but when you're running thousands of tests daily, those seconds add up.

Headless works for automation at scale. When you need to see what's actually happening, headed mode becomes necessary.

What Headed Mode Is Best For

Headed mode runs the full Chromium browser with your system's native GPU rendering. Since it's a different binary from headless shell, timing, resource allocation, and rendering behavior can all differ.

Debugging Becomes Visual

When a test fails and you can't figure out why, headed mode shows you what's happening. You see the actual element the test is clicking, whether it's visible, and whether something else is covering it.

Playwright's slowMo option combined with headed mode makes test behavior obvious. Set it in your config:

// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    launchOptions: {
      slowMo: 500,
    },
  },
});

Then run with npx playwright test --headed. You watch the browser step through each action with a 500ms pause between operations. Sometimes the problem is immediately clear: the button you're clicking is off-screen, a modal is blocking the element, or the page hasn't finished loading.

If you want the full debugging experience with the Playwright Inspector (stepping through actions, editing locators, viewing actionability logs), use the --debug flag instead. It implies headed mode, no timeout, and runs tests one by one:

npx playwright test --debug

Animations Need Rendering

Animations and CSS transitions can behave differently between headless shell and headed mode. Because headless shell is a different browser implementation, it processes frame painting and compositing differently from the full Chromium browser. This can affect when Playwright observes state transitions during animations.

If you're using the new headless mode (channel: 'chromium'), this gap mostly disappears since it runs the same rendering code as headed mode. But with the default headless shell, slide-in panels, fade effects, or loading spinners may render with different timing. Headed mode is useful for visually confirming that your test is waiting for the right state.

Developing New Tests

When you're writing a new test from scratch, headed mode with --workers=1 lets you watch exactly what's happening:

npx playwright test new-feature.spec.ts --headed --workers=1

You notice immediately if a button is off-screen, if timing feels wrong, or if the test is clicking the wrong element.

Switching to headless too early in development means you're guessing whether your selectors are correct and whether your waits are sufficient.

Understanding when to use each mode is one thing. Knowing what breaks differently between them is what actually saves you debugging time. Here's where things get specific.

Failures that show up in one mode

Most failures that show up in only one mode point to weak test design rather than browser bugs. But because Playwright's headless shell and full Chromium are different binaries, the differences are real, not imagined. Tests that rely on timing assumptions, visual rendering details, or skip proper waiting strategies are the ones that break.

Timing differences between binaries

Since headless shell and full Chromium are different implementations, they can process DOM updates and layout at different speeds. This doesn't mean one is "faster" in a simple way. It means the timing of when Playwright observes a state change can vary. A common pattern that gets blamed on mode:

await page.locator("button").click();
await page.waitForSelector(".modal");
await page.locator(".modal").click();

The waitForSelector call here is actually redundant — page.locator(".modal").click() already performs actionability checks that auto-wait for the element to be visible, stable, and enabled before clicking. The real problem is that waitForSelector is a legacy API that doesn't integrate with Playwright's modern locator and assertion model. Mixing old-style waits with locator actions makes test intent unclear and harder to debug when timing varies between binaries.

The idiomatic Playwright approach uses web-first assertions to verify application state, and relies on locator auto-waiting for actions:

await page.locator("button").click();
await expect(page.locator(".modal")).toBeVisible();
await page.locator(".modal-action").click();

toBeVisible() auto-retries until the element is visible or the assertion times out (5 seconds by default). This serves a different purpose than waitForSelector: it's an assertion that verifies your app reached the expected state, not just a wait. If the modal never appears, you get a clear assertion failure instead of a cryptic timeout on the next click.

If you're still using waitForSelector, replace it: use web-first assertions to verify state, and let locator actions handle their own waiting. This makes tests more readable and produces better error messages when something fails, regardless of which binary runs them.

Font loading inconsistencies

Font files, especially Web Open Font Format (WOFF) files, can load differently between modes due to:

Missing system fonts in CI images
Different font fallback chains
Subpixel rendering differences
Snapshot timing before fonts are ready

await page.goto("https://example.com");
await page.screenshot({ path: "screenshot.png" });

The screenshot might show fallback fonts in headless shell but custom fonts in headed. This happens because the two binaries use different font rendering paths, and CI images often ship with fewer system fonts than your local machine.

Charts and data visualization libraries depend on exact font metrics for layout calculations. When fonts differ, tests that check element positions can fail.

To fix it, wait for fonts to load explicitly before taking screenshots or measuring elements:

await page.goto("https://example.com");
await page.evaluate(() => document.fonts.ready);
await page.screenshot({ path: "screenshot.png" });

Note: you might see examples that add page.waitForLoadState("networkidle") before the font check. Avoid this. networkidle waits for zero network requests over 500ms, which is unreliable (analytics pings, websockets, and heartbeat requests can keep it waiting indefinitely or cause it to resolve too early). The document.fonts.ready promise is sufficient on its own since it resolves when all font face loads are complete.

GPU and WebGL behavior

WebGL applications run significantly slower in headless shell unless hardware acceleration is explicitly enabled. Headless shell disables GPU acceleration by default, which means WebGL falls back to software-based rendering or may not work at all depending on the operation. The full Chromium browser used in headed mode (and new headless mode) uses your system's actual GPU.

Without hardware acceleration, WebGL-heavy tests can time out in headless shell while working fine in headed mode. The performance gap can be 10x or more for canvas-intensive operations.

If you're using new headless mode (channel: 'chromium'), GPU handling works the same as headed mode, so this is mostly a non-issue. For headless shell, you can attempt to enable GPU acceleration with launch flags:

// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    launchOptions: {
      args: ["--use-gl=egl"], // or '--use-gl=desktop' on some systems
    },
  },
});

Note that --use-gl=egl requires an EGL implementation to be available on the host. Most standard CI runners (GitHub Actions, GitLab CI) don't ship with GPU drivers or EGL libraries out of the box. You'll need to install packages like libegl1 and mesa-utils, or use Playwright's official Docker images which include the necessary dependencies. Without them, this flag silently falls back to software rendering, and you won't see the improvement you expect.

This matters for:

Games and interactive 3D applications
Mapping libraries (Mapbox, Leaflet with canvas renderer)
Video editors or canvas-based drawing tools
Data visualization with complex animations
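Because the fallback to software rendering is silent, it can help to log which renderer WebGL actually reports before trusting a GPU flag. The sketch below is a heuristic. The helper name isSoftwareRenderer and the list of renderer names are assumptions, and the list is not exhaustive.

```typescript
// Heuristic: names commonly reported by software rasterizers.
// This list is an assumption and is not exhaustive.
const SOFTWARE_RENDERER_PATTERN = /swiftshader|llvmpipe|softpipe/i;

// Hypothetical helper: returns true when the renderer string looks like
// software rendering, or when no renderer string is available at all.
function isSoftwareRenderer(renderer: string | null): boolean {
  if (!renderer) return true;
  return SOFTWARE_RENDERER_PATTERN.test(renderer);
}

// Inside a test, the renderer string could be read from the page like this:
//
//   const renderer = await page.evaluate(() => {
//     const gl = document.createElement("canvas").getContext("webgl");
//     const info = gl?.getExtension("WEBGL_debug_renderer_info");
//     return gl && info
//       ? String(gl.getParameter(info.UNMASKED_RENDERER_WEBGL))
//       : null;
//   });
//   console.log("WebGL renderer:", renderer, isSoftwareRenderer(renderer));
```

If the logged renderer still looks like a software rasterizer after adding the launch flag, the host is probably missing GPU drivers or EGL libraries.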

Hover and layout issues

Hover-related failures usually come from layout problems, not headless skipping hover logic. Chromium computes pointer events and CSS hover states the same way in both modes. Playwright performs actionability checks before interacting with elements, but these checks don't catch every layout-related issue.

await page.locator(".menu-trigger").hover();
await page.locator(".dropdown-item").click();

If this fails in headless, the real causes are typically:

The element is offscreen and needs scrolling
The dropdown appears but is overlapped by another element
Layout shifts occur during the interaction

To fix this, don't rely on hover alone. Verify visibility explicitly using Playwright's auto-waiting assertions:

await page.locator(".menu-trigger").hover();
// Use web-first assertion that auto-waits
await expect(page.locator(".dropdown-menu")).toBeVisible();
await page.locator(".dropdown-item").click();

Understanding these patterns matters, but the solution is usually better test design, not mode-switching. Here's the workflow that works.

A Workflow That Uses Both Modes Properly

No single mode works for everything. The goal is to use each mode where it works best and to understand when to switch.

Locally: Write and Debug in Headed Mode

When you're writing new tests, use headed mode:

npx playwright test --headed --workers=1

You'll see exactly what the test is doing and catch obvious issues immediately. Single worker means you can watch one test execute from start to finish. You notice if a button is off-screen, if timing feels wrong, or if the test is clicking the wrong element.

This is your development environment. Speed doesn't matter yet. Clarity does.

In CI: Validate in Headless

CI runs should use headless mode with parallelization. This is where you catch race conditions, timing issues that only appear under load, and problems with resource contention.

// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    // Headless in CI, headed locally
    headless: !!process.env.CI,
  },
});

This automatically runs headed locally and headless in CI. No manual toggling needed. No accidentally pushing headed mode config that breaks CI.
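One caveat: a plain truthiness check on process.env.CI treats any non-empty string as true, including "false" or "0". Most CI providers set CI=true, so this rarely matters. If your environment sets the variable differently, a stricter check is one option. The helper name isCI below is hypothetical.

```typescript
// Hypothetical helper: interpret the CI environment variable more strictly
// than a plain truthiness check.
function isCI(env: Record<string, string | undefined>): boolean {
  const raw = (env.CI || "").trim().toLowerCase();
  // Unset, empty, "0", and "false" mean "not CI". Anything else counts as CI.
  return raw !== "" && raw !== "0" && raw !== "false";
}

// In playwright.config.ts this could be used as:
//   use: { headless: isCI(process.env) }
```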

For debugging CI failures: use traces first

When a test fails in headless CI, don't immediately switch to headed mode. Modern Playwright debugging relies on traces, not re-running in headed mode.

Enable trace recording:

    use: {
      screenshot: 'only-on-failure',
      video: 'retain-on-failure',
      trace: 'on-first-retry',
    }

Playwright's trace viewer shows you exactly what happened in headless mode, including:

Network requests with timing
DOM snapshots at each step
Console logs and JavaScript errors
Screenshots at failure points

Download the trace from CI artifacts and open it with npx playwright show-trace. This gives you more debugging clues than watching a headed run, especially for CI-only failures.

Traces often reveal the root cause faster than trying to reproduce failures locally in headed mode. For a deeper walkthrough of CI-specific debugging techniques, see our guide on how to debug Playwright tests in CI.
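One detail about the trace setting: 'on-first-retry' records a trace only when a failed test is retried, so it produces nothing if retries are disabled. A minimal sketch of a complete config follows. The retry count of 2 is an assumption to adjust for your suite.

```typescript
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Assumption: retry twice in CI, never locally.
  retries: process.env.CI ? 2 : 0,
  use: {
    screenshot: "only-on-failure",
    video: "retain-on-failure",
    // Recorded only when a failed test is retried for the first time,
    // so this has no effect when retries is 0.
    trace: "on-first-retry",
  },
});
```

If you'd rather not retry, trace: 'retain-on-failure' records every run and keeps the trace only for failing tests, at the cost of some recording overhead.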

The strategy that doesn't work

Switching to headed mode when a test is flaky, seeing it pass once, and assuming the problem is solved. This creates false confidence. The test is still flaky in headless, which is where it runs in production CI. You haven't fixed anything. You've just found a mode where the race condition happens to be less frequent.

Running both modes in CI as a workaround is tempting, but it doubles your CI time and usually masks test design problems instead of fixing them.

Fix the root cause instead

If it's a timing issue, use Playwright's locators with web-first assertions (avoid explicit waitForSelector)
If it's a font loading problem, wait for document.fonts.ready
If it's GPU-related, enable hardware acceleration in headless shell, or switch to new headless mode
If it's animation-related, use web-first assertions that auto-wait for the expected state
If mode-specific differences keep biting you, consider switching to new headless mode (channel: 'chromium') to eliminate the binary gap entirely

Mode-switching is a deployment choice, not a debugging tool. If tests fail in one mode, fix the test or your headless configuration rather than toggling modes until things pass.

Wrapping Up

Headless and headed aren't good vs bad. They're different tools for different stages of your workflow, and in Playwright's case, they're literally different browser binaries by default.

Run headed when you're actively developing or debugging tests. Use headless when you're running at scale in CI. If the behavioral gap between the two keeps causing problems, switch to the new headless mode (channel: 'chromium') so both modes use the same browser code.

When a failure shows up in only one mode, investigate the cause instead of switching modes and hoping the failure goes away. The difference usually traces back to the headless shell's different rendering behavior, font handling, or GPU path. All of those are fixable without changing modes.

The teams that struggle with headed vs headless treat it as a toggle to fix flaky tests. The teams that succeed use headed for development, headless for CI, lean on Playwright's web-first assertions with auto-waiting, and debug with traces instead of mode-switching.

