Halmurat T.
· 10 min read

2 Minutes a Day Cost Me More Than You Think

Every morning at a large Canadian telecom, I’d open my laptop and stare at the same ritual: launch the VPN client, wait for it to load, click the right profile, type my credentials, fumble through the MFA token, wait for the connection handshake, verify it actually connected. Seven steps before I could open my IDE. It took maybe 2-3 minutes. It felt like 20.

The time wasn’t the problem. The problem was what those 2-3 minutes did to my brain before I’d written a single line of code.

Why Small Frictions Are Expensive

You’ve heard the compound interest analogy — small amounts add up over time. The same math applies to daily friction, but worse, because the cost isn’t just time.

Let’s do the basic arithmetic first. A 2.5-minute task, once per workday, across 250 working days:

2.5 min × 250 days = 625 minutes = ~10.4 hours per year

Ten hours. On connecting to VPN. That’s more than a full workday — gone. And that’s being generous with the estimate. Some mornings the client hung, or the MFA token expired mid-entry, or the profile list didn’t load. Those mornings it was 5 minutes. Some mornings it was “restart the client and try again.”

But here’s what the time math misses: every one of those mornings started with friction instead of flow.

There’s a concept that psychology borrows from chemistry called “activation energy”: the mental effort required to start a task. When you open your laptop and the first thing you do is productive work, the activation energy is low. You pick up where you left off. When the first thing you do is wrestle with a VPN client, you’ve spent that effort on IT infrastructure before your actual work even begins.

I noticed the pattern after about three weeks. Every morning, after VPN connected, I’d check Slack. Then email. Then maybe Jira. Then get coffee. Then eventually open the codebase. The VPN wasn’t just costing me 2.5 minutes — it was costing me the first 15-20 minutes of my morning because it broke the “open laptop → start working” habit.

The Manual Process (All 7 Steps)

Here’s exactly what connecting to VPN looked like every morning:

  1. Find and launch the VPN client from the system tray or applications folder
  2. Wait 5-10 seconds for the client window to fully load
  3. Select the correct connection profile from a dropdown (there were 3 profiles — production, staging, corporate)
  4. Enter username in the credentials field
  5. Enter password in the password field
  6. Handle the MFA step — open the authenticator app on my phone, copy the 6-digit token, paste it before the 30-second window expired
  7. Click Connect and wait for the handshake to complete, then verify the status indicator turned green

Seven steps. Most of them trivial. All of them interruptible — if Slack pinged between steps 5 and 6, the MFA token might expire while I was reading the message.

The annoying part wasn’t any single step. It was the sequence dependency. Miss the MFA window? Start over from step 6. Client not responding? Force-quit and restart from step 1. Wrong profile selected? Disconnect, go back to step 3.
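Those restart rules are also what any automation of this flow has to encode. Here is a minimal sketch of the idea, assuming each step is a function that returns True on success and carries the index of the step to fall back to. The names `run_steps` and `Step` and the shortened step list are hypothetical, not part of the script described later.

```python
from typing import Callable, List, Tuple

# (name, action, restart_index): action returns True on success;
# restart_index is the step to resume from when this step fails.
Step = Tuple[str, Callable[[], bool], int]


def run_steps(steps: List[Step], max_failures: int = 3) -> List[str]:
    """Run steps in order; after a failure, jump back to that step's restart point."""
    log: List[str] = []
    failures = 0
    i = 0
    while i < len(steps):
        name, action, restart_index = steps[i]
        if action():
            log.append(f"ok:{name}")
            i += 1
        else:
            failures += 1
            log.append(f"fail:{name}")
            if failures >= max_failures:
                raise RuntimeError(f"Gave up after {failures} failures at '{name}'")
            i = restart_index
    return log


# Simulated run: the first MFA token expires, the second one is accepted.
mfa_results = iter([False, True])
demo_steps: List[Step] = [
    ("launch", lambda: True, 0),                 # client hung: restart from the top
    ("select_profile", lambda: True, 1),         # wrong profile: reselect
    ("enter_mfa", lambda: next(mfa_results), 2), # token expired: retry the MFA step
    ("connect", lambda: True, 0),
]
print(run_steps(demo_steps))
# ['ok:launch', 'ok:select_profile', 'fail:enter_mfa', 'ok:enter_mfa', 'ok:connect']
```

The failure budget is shared across all steps, so a flaky morning still ends with a clear error instead of an endless loop.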

Automating the VPN Friction Away

I’m an SDET. I automate things for a living. One Friday afternoon, I decided I was done doing this manually.

I built a small Python desktop app using pyautogui, a library that simulates mouse clicks and keyboard input on the actual GUI. The VPN client didn’t have a CLI or API (enterprise security software often locks those down), so GUI automation was the only path.

Here’s the core approach:

vpn_connect.py
import subprocess
import time

import pyautogui


def connect_vpn():
    # Launch the VPN client; macOS `open` brings it to the front if it is already running
    subprocess.Popen(["open", "/Applications/VPNClient.app"])
    time.sleep(8)  # Enterprise VPN clients are not fast

    # Locate and click the profile dropdown
    try:
        profile_location = pyautogui.locateOnScreen("profile_dropdown.png")
    except pyautogui.ImageNotFoundException:
        profile_location = None  # newer PyAutoGUI releases raise instead of returning None
    if profile_location:
        pyautogui.click(profile_location)
        time.sleep(1)

The key challenge was timing. GUI automation lives and dies by wait times. Too short, and you click before the element renders. Too long, and you’re wasting the time you’re trying to save. I ended up using image recognition (locateOnScreen) to detect when UI elements were actually visible before interacting with them.

vpn_connect.py
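def wait_for_element(image_path, timeout=30):
    """Wait until a UI element appears on screen and return its location."""
    start = time.time()
    while time.time() - start < timeout:
        try:
            # The confidence argument requires OpenCV (opencv-python) to be installed
            location = pyautogui.locateOnScreen(image_path, confidence=0.9)
        except pyautogui.ImageNotFoundException:
            location = None  # newer PyAutoGUI releases raise on a miss
        if location:
            return location
        time.sleep(0.5)
    raise TimeoutError(f"Element not found: {image_path}")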

If you’ve ever written Selenium waits, this should look familiar. Same concept — explicit wait with a polling interval — applied to a desktop GUI instead of a browser. The confidence=0.9 parameter handles slight rendering variations (different screen brightness, font smoothing, etc.).
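Because the polling loop itself has nothing to do with pyautogui, it can be pulled out and unit-tested without a screen. The sketch below is one way to do that, not the code from the original script. The names `wait_until` and `probe` are hypothetical, and it assumes the probe returns None while the element is missing.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(probe: Callable[[], Optional[T]], timeout: float = 30.0,
               interval: float = 0.5, description: str = "condition") -> T:
    """Poll `probe` until it returns a non-None value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout  # monotonic clock ignores wall-clock changes
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Timed out waiting for {description}")
        time.sleep(interval)


# A fake probe stands in for the screen: two misses, then a bounding box.
answers = iter([None, None, (10, 20, 30, 40)])
print(wait_until(lambda: next(answers), timeout=1.0, interval=0.01))
# (10, 20, 30, 40)
```

In the real script the probe would be a small wrapper around the image lookup. In a test it is just a lambda, which makes the timeout and polling behavior easy to verify.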

The MFA step was the trickiest part. I couldn’t automate the authenticator app on my phone, so the script pauses, waits for me to type the token, then handles the rest:

vpn_connect.py
def handle_mfa():
    """Focus the MFA field and wait for manual token entry."""
    mfa_field = wait_for_element("mfa_input.png")
    pyautogui.click(mfa_field)
    # Token entry is manual, but everything else is automated.
    # The script watches for the "Connected" status to confirm success.
    wait_for_element("connected_status.png", timeout=60)
    print("VPN connected successfully")

The whole thing took about 2 hours to build on a Friday afternoon. Most of that time was capturing screenshots for the image recognition and tuning the wait times.

Morning VPN Routine

Before: 7 manual steps, 2-3 min, full attention required

After: run script, type MFA token, done in 30 seconds

Result: 80% less effort

The Compound Effect

After the first week of using it, I noticed something I didn’t expect. It wasn’t about the 2 minutes I saved. It was about what happened after the VPN connected.

Before automation: Open laptop → VPN ritual → check Slack → check email → get coffee → eventually start working. First productive code: ~20 minutes after opening laptop.

After automation: Open laptop → run script → type 6 digits → start working. First productive code: ~3 minutes after opening laptop.

The script didn’t just remove 2 minutes of clicking. It removed the transition tax — the mental gap between “arriving at work” and “doing work.” When VPN connection is a single action instead of a 7-step process, your brain doesn’t need a recovery period afterward.

Over a year, the conservative math says I saved 10+ hours of raw VPN-fumbling time. But the real savings were the roughly 17 minutes of drift (20 minutes to first productive code before, about 3 after) that used to follow the VPN connection every single morning. That’s:

17 min × 250 days = 4,250 minutes = ~70 hours per year

Seventy hours. Almost two full work weeks. Not from a complex automation framework or a CI/CD optimization. From a 2-hour Python script that clicks buttons.
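For anyone who wants to plug in their own numbers, the two calculations reduce to one small function. This is a sketch using the figures from this post. The 40-hour work week used for the last line is an assumption, and `annual_hours` is just an illustrative name.

```python
def annual_hours(minutes_per_day: float, workdays: int = 250) -> float:
    """Convert a daily cost in minutes into hours per working year."""
    return minutes_per_day * workdays / 60


direct = annual_hours(2.5)  # the VPN clicking itself
drift = annual_hours(17)    # the drift that followed it (20 min before, 3 min after)

print(f"direct: {direct:.1f} h/year")                 # direct: 10.4 h/year
print(f"drift:  {drift:.1f} h/year")                  # drift:  70.8 h/year
print(f"drift in 40-hour weeks: {drift / 40:.2f}")    # drift in 40-hour weeks: 1.77
```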

This Is How SDETs Think

Here’s the broader point. SDETs spend their careers looking at repetitive manual processes and building automation around them. But most of us only apply that lens to test suites and CI pipelines.

The VPN script taught me to apply the same thinking to everything. Since then, I’ve automated:

  • Environment setup scripts that configure local dev environments with one command instead of following a 12-step wiki page
  • Jira ticket templates that pre-fill the 6 required fields our project demanded for every bug report
  • Test data generation that used to involve manually querying 3 databases and cross-referencing IDs

None of these were “automation engineering.” They were quality-of-life improvements that took 1-3 hours to build and paid for themselves within a week.

The instinct to automate repetitive friction is one of the most valuable things an SDET brings to a team — and it extends well beyond writing test scripts. When you’ve spent a decade building thread-safe parallel execution frameworks and debugging flaky tests at 2am, you develop a low tolerance for wasting time on things a computer should handle.

What I’d Do Differently

If I were building this today, I’d skip pyautogui entirely and use pywinauto (Windows) or atomac (macOS) — libraries that interact with native accessibility APIs instead of image recognition. Image matching works, but it breaks when the OS ships a UI update or your display scaling changes. Accessibility-based automation is more resilient because it targets the control tree, not pixels on screen.
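To make "targets the control tree" concrete, here is a toy model in plain Python. It is not the pywinauto or atomac API, and the control types and titles are made up for illustration. The point is that the lookup depends on a control's role and name, so it keeps working when the button moves or gets a new look.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Control:
    control_type: str
    title: str
    children: List["Control"] = field(default_factory=list)


def find_control(root: Control, control_type: str, title: str) -> Optional[Control]:
    """Depth-first search for the first control matching type and title."""
    if root.control_type == control_type and root.title == title:
        return root
    for child in root.children:
        match = find_control(child, control_type, title)
        if match is not None:
            return match
    return None


# Hypothetical tree for a VPN client window.
window = Control("Window", "VPN Client", [
    Control("ComboBox", "Profile"),
    Control("Pane", "Credentials", [
        Control("Edit", "Username"),
        Control("Edit", "Password"),
    ]),
    Control("Button", "Connect"),
])

button = find_control(window, "Button", "Connect")
print(button.title if button else "not found")  # Connect
```

A screenshot-based locator has no equivalent of this query. It only knows what the button looked like on the day the reference image was captured.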

I’d also wrap the whole thing in a system startup hook instead of a manual script launch. The goal was “open laptop → working,” but I still had to remember to run the script. A launch agent (macOS) or scheduled task (Windows) that triggers on login would have closed that last gap.
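On macOS, a launch agent is a property list placed in ~/Library/LaunchAgents. The sketch below builds one with the standard-library plistlib module. The label, interpreter path, and script path are placeholders, and the example only prints where the file would go rather than writing it.

```python
import plistlib
from pathlib import Path


def build_launch_agent(label: str, python_path: str, script_path: str) -> bytes:
    """Return launchd property-list XML that runs the script when the agent loads at login."""
    agent = {
        "Label": label,                                 # unique identifier for the job
        "ProgramArguments": [python_path, script_path], # command and its arguments
        "RunAtLoad": True,                              # start as soon as the agent is loaded
    }
    return plistlib.dumps(agent)  # XML format by default


label = "com.example.vpnconnect"  # hypothetical label
xml = build_launch_agent(label, "/usr/bin/python3", "/Users/me/scripts/vpn_connect.py")
target = Path.home() / "Library" / "LaunchAgents" / f"{label}.plist"
print(f"Would write {len(xml)} bytes to {target}")
```

One caveat to check before relying on this: a GUI automation script started this way may still need the same accessibility and screen-capture permissions it needed when launched by hand.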

Your Turn

Here’s your challenge for this week: pick one daily friction point and time it.

Not the big ones — not “our build takes 40 minutes” or “deployments are manual.” Those are important, but they have organizational inertia. Pick a small, personal one. Something you do every day that takes 1-5 minutes and annoys you slightly.

Time it for three days. Multiply by 250. Then ask yourself: is that worth a Friday afternoon of scripting?
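If you want a number to go with that question, a break-even estimate is a few lines. This is a rough sketch with made-up timing samples. `break_even_days` is a hypothetical helper, and it counts only the minutes you can measure, not the focus you get back.

```python
import math
from statistics import mean


def break_even_days(samples_min, build_hours, automated_min=0.0):
    """Workdays until the minutes saved add up to the time spent building."""
    saved_per_day = mean(samples_min) - automated_min
    if saved_per_day <= 0:
        raise ValueError("automation does not save time with these numbers")
    return math.ceil(build_hours * 60 / saved_per_day)


# Three timed mornings (hypothetical), a 2-hour build, 30 seconds once automated.
print(break_even_days([2.0, 2.5, 3.0], build_hours=2, automated_min=0.5))  # 60

# Same build, but counting a 20-minute start-up drift that drops to 3 minutes.
print(break_even_days([20, 20, 20], build_hours=2, automated_min=3))       # 8
```

On clicking time alone the script pays for itself in about 60 workdays. Once the drift is counted, it takes under two weeks.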

The answer is almost always yes. Not because of the time math. Because of the brain energy you’ll get back every single morning.


What is pyautogui and when should you use it for automation?

pyautogui is a Python library that programmatically controls mouse and keyboard input. It works with any desktop application — not just browsers. Use it when the application you need to automate doesn’t have a CLI, API, or accessibility hooks. It’s the “last resort” automation approach, but for enterprise security software (VPN clients, certificate managers) that intentionally lock down programmatic access, it’s often the only option. The main downside is fragility — UI changes can break image recognition, so it works best for stable, rarely-updated applications like enterprise VPN clients.

How does GUI automation compare to browser automation like Selenium or Playwright?

The concepts are nearly identical: locating elements, waiting for them to appear, interacting with them, verifying results. The difference is the locator strategy. Browser automation uses DOM selectors (CSS, XPath, test IDs). GUI automation with pyautogui uses image recognition or coordinate-based clicking, while accessibility-based tools such as pywinauto target the control tree instead. Image recognition is less reliable but works with any application regardless of its technology stack. If you’ve written Selenium or Playwright tests, you already understand the timing and synchronization challenges; desktop GUI automation just applies them to a different surface.

Is it worth automating a task that only takes 2-3 minutes?

Almost always yes, if you do it daily. The direct time savings (10+ hours/year) are meaningful but secondary. The real value is eliminating friction from your daily routine. Small repetitive tasks create “transition costs” — the mental overhead of switching from productive work to mundane clicking and back. Removing that friction improves focus and reduces the drift time that typically follows annoying interruptions. The rule of thumb: if a task takes under 5 minutes but you do it at least once a day, and you can automate it in under 4 hours, do it.
