Selenium CAPTCHA Bypass: Tokens vs. Clicks — Which One’s Faster?

markus009

Markus

Posted on November 26, 2024

In a previous article (which I also cross-posted to Dev.to), I tested two CAPTCHA-solving methods, tokens and clicks, using Puppeteer. Back then, I promised a follow-up that would run the same comparison with Selenium. Well, the time has come to deliver on that promise! In this article, I’ll show you how the same two approaches perform when bypassing reCAPTCHA with a different tool. Ready? Let’s dive in.

Getting Started: Preparing Selenium for Google reCAPTCHA

For this experiment, I used modules from the same provider (2Captcha), this time written for Python. Why Python? Puppeteer is a JavaScript (Node.js) library, whereas Selenium offers official bindings for several languages, and Python is one of the best supported.

Initially, I hoped for a plug-and-play module, similar to the one I used with Puppeteer, where you could simply switch CAPTCHA-solving methods via configuration. Unfortunately, such a Python module doesn’t seem to exist (or I couldn’t find it). So, here’s what I ended up using:

  • Token-based CAPTCHA solving extension: recaptcha_v2.
  • Click-based CAPTCHA solving extension: selenium-recaptcha-solver-using-grid.

Modifying the Token-Based Selenium Module to Bypass reCAPTCHA

Finding a working module for token-based CAPTCHA solving took some effort. The solution I found was set up by default for 2Captcha’s demo page, not Google’s official reCAPTCHA demo. Fixing this required a few tweaks to the module’s code.

Here’s the part of the code I updated:

# CONFIGURATION
url = "https://www.google.com/recaptcha/api2/demo"
# The 2Captcha key is read from an environment variable; the name must not contain spaces
apikey = os.getenv('API_KEY')

# LOCATORS
sitekey_locator = "//div[@id='g-recaptcha']"
submit_button_captcha_locator = "//button[@data-action='demo_action']"
success_message_locator = "//p[contains(@class,'successMessage')]"

What Changed?

  • Updated the URL: I replaced the demo page URL with Google’s reCAPTCHA demo link.
  • Added my API key: This allowed the module to authenticate with 2Captcha.
  • Adjusted locators: The default XPath selectors didn’t work with Google’s reCAPTCHA demo, so I replaced them with site-specific selectors.
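
For the API key step, here is a minimal sketch of loading the key from the environment and failing early when it is missing. The variable name `API_KEY` and the helper `load_api_key` are assumptions for illustration, not names taken from the module.

```python
import os


def load_api_key(env_var: str = "API_KEY") -> str:
    """Return the key stored in env_var, or raise if it is unset or blank."""
    # env_var is a hypothetical name; use whatever your shell actually exports.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

Failing at startup is cheaper than discovering a missing key only after the browser has already loaded the page.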

Here’s the final version of the LOCATORS:

# LOCATORS
sitekey_locator = "//div[@id='recaptcha-demo']"
submit_button_captcha_locator = "//input[@id='recaptcha-demo-submit']"
success_message_locator = "//div[contains(@class,'recaptcha-success')]"

Important Note: These locators work for Google’s demo page but may not apply to other sites, as HTML structures vary. Always adjust locators to match your target website.
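
To show where the three locators come into play, here is a rough sketch of the token flow around them. It is not the module's actual code: `read_sitekey` and `submit_token` are hypothetical helpers, and the sketch assumes the widget container carries a `data-sitekey` attribute and that the page has the usual `g-recaptcha-response` field. The `driver` argument is expected to be a Selenium WebDriver; the string `"xpath"` is the value of Selenium's `By.XPATH`.

```python
from typing import Any

# Locators for Google's reCAPTCHA demo page, as listed above.
SITEKEY_LOCATOR = "//div[@id='recaptcha-demo']"
SUBMIT_LOCATOR = "//input[@id='recaptcha-demo-submit']"
SUCCESS_LOCATOR = "//div[contains(@class,'recaptcha-success')]"


def read_sitekey(driver: Any) -> str:
    """Read the site key from the widget container (assumes data-sitekey)."""
    element = driver.find_element("xpath", SITEKEY_LOCATOR)
    return element.get_attribute("data-sitekey")


def submit_token(driver: Any, token: str) -> bool:
    """Put the solved token into the response field, submit, check success."""
    # Assumption: the page contains the standard g-recaptcha-response textarea.
    driver.execute_script(
        "document.getElementById('g-recaptcha-response').value = arguments[0];",
        token,
    )
    driver.find_element("xpath", SUBMIT_LOCATOR).click()
    # No explicit wait here; this only checks what is on the page right now.
    return len(driver.find_elements("xpath", SUCCESS_LOCATOR)) > 0
```

The site key from `read_sitekey` is what gets sent to the solving service together with the page URL, and the token that comes back goes into `submit_token`. A real script would likely add an explicit wait (for example Selenium's `WebDriverWait`) before checking for the success message.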

Adjusting the Click-Based Selenium Module to Bypass reCAPTCHA

The selenium-recaptcha-solver-using-grid module uses a grid-clicking approach, perfect for solving image-based reCAPTCHA V2 (like “select all traffic lights”). While researching, I learned an interesting fact: this module supports machine recognition (probably powered by AI) for faster solving. By default, a human handles the CAPTCHA, but enabling this feature speeds things up.

For simplicity, I stuck with the default setup and only made two small edits:

  • Replaced the module’s default demo URL with Google’s reCAPTCHA demo (https://www.google.com/recaptcha/api2/demo).
  • Inserted my API key from 2Captcha.

Other than that, the module worked straight out of the box, solving CAPTCHAs efficiently without any additional adjustments.
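
To make the grid-clicking idea concrete, here is a rough sketch of turning a solver's answer into tile positions. It assumes the answer arrives as a string such as `click:1/4/7` with 1-based tile numbers, which is my understanding of 2Captcha's Grid method; check the service's documentation before relying on it. `parse_grid_answer` and `tile_position` are hypothetical helpers, not functions from selenium-recaptcha-solver-using-grid.

```python
from typing import List, Tuple


def parse_grid_answer(answer: str) -> List[int]:
    """Turn an answer like 'click:1/4/7' into 1-based tile numbers."""
    prefix = "click:"
    if not answer.startswith(prefix):
        raise ValueError(f"unexpected answer format: {answer!r}")
    return [int(part) for part in answer[len(prefix):].split("/") if part]


def tile_position(tile: int, columns: int = 3) -> Tuple[int, int]:
    """Map a 1-based tile number to a 0-based (row, column) pair.

    Tiles are counted left to right, top to bottom.
    """
    return divmod(tile - 1, columns)
```

For a 3x3 grid, `parse_grid_answer("click:1/4/7")` gives `[1, 4, 7]`, which `tile_position` maps to rows 0, 1 and 2 of column 0, i.e. the left column. Each position can then be matched to the corresponding tile element and clicked with Selenium.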

Performance Test: Which Method to Bypass reCAPTCHA Using Selenium Is Faster?

Once the setup was complete, it was time for the showdown. Each module was tested independently, under standard conditions (no proxies or IP switching). Here’s how they performed:

  • Token method: 1 minute 30 seconds per CAPTCHA.
  • Click method: 2 minutes 30 seconds per CAPTCHA.

That’s a significant difference: the click method takes about 67% longer per CAPTCHA. To put it into perspective:

  • Over 24 hours, a single-threaded module could solve:
    • 960 CAPTCHAs using tokens.
    • 576 CAPTCHAs using clicks.
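  • Tokens save you roughly 9.6 hours per day compared to clicks: the 576 CAPTCHAs that keep the click method busy for a full 24 hours take the token method about 14.4 hours.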
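
The daily figures above follow directly from the per-CAPTCHA timings. A small sketch of the arithmetic, with helper names of my own choosing:

```python
SECONDS_PER_DAY = 24 * 60 * 60

TOKEN_SECONDS = 90    # 1 minute 30 seconds per CAPTCHA
CLICK_SECONDS = 150   # 2 minutes 30 seconds per CAPTCHA


def solves_per_day(seconds_per_solve: int) -> int:
    """Whole CAPTCHAs a single thread can finish in 24 hours."""
    return SECONDS_PER_DAY // seconds_per_solve


def hours_to_solve(count: int, seconds_per_solve: int) -> float:
    """Hours a single thread needs to finish `count` CAPTCHAs."""
    return count * seconds_per_solve / 3600


print(solves_per_day(TOKEN_SECONDS))  # 960
print(solves_per_day(CLICK_SECONDS))  # 576
```

The same helpers give the time each method needs for a fixed workload, for example by comparing `hours_to_solve(576, TOKEN_SECONDS)` with `hours_to_solve(576, CLICK_SECONDS)`.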

The Takeaway

The results are clear: token-based CAPTCHA solving is faster with both Puppeteer and Selenium. If speed is your goal, tokens are the way to go. However, click-based solutions have their place, especially for grid-based CAPTCHA challenges.

In the end, it all boils down to your specific use case. Speed or flexibility — the choice is yours!
