Using curl-impersonate in Node.js to avoid blocks

Megan Lee

Posted on November 21, 2024

Written by Antonello Zanini✏️

One of the biggest challenges for Node.js automation scripts is getting blocked by anti-bot measures. No site likes bots, so many web servers implement bot protection solutions to stop them.

The key to identifying bots lies in examining low-level network details, as most HTTP clients do not use the same underlying connection libraries as browsers. This is where [curl-impersonate](https://github.com/lwthiker/curl-impersonate) comes in.

As a customized build of curl, it adopts the same low-level network libraries as popular browsers, making its requests nearly identical to those of legitimate users.

In this article, you will learn what curl-impersonate is and how to use it in Node.js to bypass bot detection in web scraping and automation scripts. If you want to read more about how curl normally works before jumping into this piece, take a look at our introduction guide.

How to use curl-impersonate in Node.js

curl-impersonate is a specialized build of curl that can impersonate real-world browsers. Unlike standard curl, it adjusts request headers, TLS fingerprints, and other parameters to make its requests closely resemble those from browsers like Chrome, Firefox, and Safari.

By doing so, curl-impersonate helps to fool anti-bot mechanisms into thinking that your automated request is coming from a normal browser instead of an HTTP client. This makes the project useful for scenarios like web scraping or any situation where a site might otherwise restrict or block access from automated tools.

curl-impersonate is available through Docker images so that you can use it as a command in your terminal at the OS level. Additionally, the project provides the libcurl-impersonate library which opens the door to specific bindings in multiple programming languages, including Node.js. Let’s now see how to use curl-impersonate in a Node.js script!

Install curl-impersonate

The npm registry lists a few Node.js bindings for the curl-impersonate project:

Search results on npm for curl-impersonate showing different TypeScript and Node.js packages for bypassing TLS fingerprinting with popularity and maintenance stats.  

While none of these options clearly stands out from the others, [node-curl-impersonate](https://www.npmjs.com/package/node-curl-impersonate) is one of the most reliable choices. It is written in TypeScript, actively maintained, receives frequent updates, and has been under continuous development for over a year.

Add node-curl-impersonate to your project’s dependencies with the following command:

npm install node-curl-impersonate

Note: node-curl-impersonate is only compatible with Unix-based operating systems like Linux and macOS. If you are on Windows and cannot use the WSL (Windows Subsystem for Linux), consider using [ts-curl-impersonate](https://www.npmjs.com/package/ts-curl-impersonate) as an alternative as it comes with native Windows support.

Configure and use the client

First, import node-curl-impersonate in your JavaScript or TypeScript script:

import CurlImpersonate from "node-curl-impersonate";

Keep in mind that node-curl-impersonate is an ES module, so you cannot import it with a require() like a CommonJS package. If you do not know what that means, read our article on CommonJS vs. ES modules in Node.js.

CurlImpersonate is a constructor you can use to initialize a curl-impersonate request, as in the example below:

const curlImpersonate = new CurlImpersonate("https://example.com", {
  method: "GET",
  impersonate: "chrome-116",
  headers: {},
});

The constructor takes a URL and an optional options object. Here is a breakdown of the available options:

  • method — The HTTP method to use for the request. Currently, only "GET" and "POST" are supported
  • impersonate — A string identifying the browser to impersonate. The supported options are "chrome-110", "chrome-116", "firefox-109", and "firefox-117"
  • headers — A key-value object containing custom HTTP headers to merge with the headers set automatically by curl-impersonate. Note that this is not optional
  • body — An optional object used as a JSON body for a POST request.
  • verbose — An optional boolean flag to enable verbose mode, which logs what the client does behind the scenes
  • flags — An optional array of additional flags to pass to the underlying libcurl-impersonate library
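As a rough illustration of how these options fit together, the sketch below assembles an options object for a POST request. The buildOptions() helper and its validation are hypothetical and are not part of node-curl-impersonate; only the option names and the supported values come from the list above.

```javascript
// Values taken from the option list above.
const SUPPORTED_METHODS = ["GET", "POST"];
const SUPPORTED_BROWSERS = ["chrome-110", "chrome-116", "firefox-109", "firefox-117"];

// Hypothetical helper (not part of node-curl-impersonate): validates the input
// and returns an options object shaped like the one passed to the constructor.
function buildOptions({
  method = "GET",
  impersonate = "chrome-116",
  headers = {},
  body,
  verbose = false,
  flags = [],
} = {}) {
  if (!SUPPORTED_METHODS.includes(method)) {
    throw new Error(`Unsupported method: ${method}`);
  }
  if (!SUPPORTED_BROWSERS.includes(impersonate)) {
    throw new Error(`Unsupported browser: ${impersonate}`);
  }
  const options = { method, impersonate, headers, verbose, flags };
  // a body is only attached to POST requests
  if (method === "POST" && body !== undefined) {
    options.body = body;
  }
  return options;
}

const postOptions = buildOptions({
  method: "POST",
  impersonate: "firefox-117",
  headers: { "Accept-Language": "en-US" },
  body: { query: "example" },
});
console.log(postOptions);
```

The resulting object could then be passed as the second argument of the CurlImpersonate constructor. Rejecting unknown values up front is a design choice of this sketch, meant to surface typos before any request is sent.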

To make the request, call makeRequest() on the returned instance:

await curlImpersonate.makeRequest();

Alternatively, you can create the instance without a URL and pass it later to makeRequest():

const curlImpersonate = new CurlImpersonate(undefined, {
  method: "GET",
  impersonate: "chrome-116",
  headers: {},
});

await curlImpersonate.makeRequest("https://example.com");
// ...
// curlImpersonate.makeRequest(...)

This allows you to reuse the same CurlImpersonate instance for multiple requests. The approach works best for GET requests, because POST requests usually require a body, which can only be set in the constructor.
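To make the reuse pattern concrete, here is a minimal sketch that sends several GET requests through a single client, one after another. The fetchAll() helper and the stub client are hypothetical names introduced for illustration; the sketch only assumes that the client exposes makeRequest(url) and resolves to an object with statusCode and response fields, as in the examples in this article.

```javascript
// Sends one request per URL through a single client instance, sequentially.
// `client` is anything exposing makeRequest(url), such as a CurlImpersonate
// instance created without a URL.
async function fetchAll(client, urls) {
  const results = [];
  for (const url of urls) {
    const { statusCode, response } = await client.makeRequest(url);
    results.push({ url, statusCode, response });
  }
  return results;
}

// Stand-in client for a dry run without network access (illustrative only).
const stubClient = {
  async makeRequest(url) {
    return { statusCode: 200, response: `<html><!-- ${url} --></html>` };
  },
};

const fetchAllDemo = fetchAll(stubClient, [
  "https://example.com/a",
  "https://example.com/b",
]);
fetchAllDemo.then((results) => {
  console.log(results.map((r) => `${r.url} -> ${r.statusCode}`));
});
```

In a real script, you would replace stubClient with the CurlImpersonate instance created above. Running the requests sequentially keeps the example simple and avoids hitting the target site with parallel traffic.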

Do not forget that node-curl-impersonate only works with Unix-based systems. Attempting to use it on Windows will result in the following error:

Error: Unsupported Platform! win32

If you are a Windows user, you can work around that issue by using WSL.
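If your script may run on different machines, a small guard can report the problem before the library throws. The sketch below is an assumption-laden example: the isSupportedPlatform() helper is hypothetical, and the list of supported platforms is inferred from the note above (Linux and macOS), using the identifiers Node.js reports in process.platform.

```javascript
// Platforms assumed to be supported, based on the note above (Linux and macOS).
// "linux" and "darwin" are the values Node.js exposes in process.platform.
const SUPPORTED_PLATFORMS = ["linux", "darwin"];

// Hypothetical helper: returns true when the given platform is in the list.
function isSupportedPlatform(platform = process.platform) {
  return SUPPORTED_PLATFORMS.includes(platform);
}

if (!isSupportedPlatform()) {
  console.warn(
    `node-curl-impersonate is unlikely to work on "${process.platform}". ` +
      "Consider WSL or ts-curl-impersonate instead."
  );
}
```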

Perform a request against an anti-bot-protected site

Kick is a popular streaming service, especially among younger audiences, and its popularity is growing quickly. If you try to perform web scraping on Kick, you are likely to encounter the following anti-bot detection page that blocks automated requests:

Verification screen on kick.com requiring users to confirm they are human via a Cloudflare CAPTCHA check before proceeding.  

With node-curl-impersonate, you can bypass Kick's anti-bot measures and access the site's HTML content. Here is how you can do it:

import CurlImpersonate from "node-curl-impersonate";

(async () => {
  // initialize a curl-impersonate request with the specified options
  const curlImpersonate = new CurlImpersonate("https://kick.com/", {
    method: "GET",
    impersonate: "chrome-116",
    headers: {},
  });

  // perform the request
  const curlResponse = await curlImpersonate.makeRequest();

  // extract the response data
  const response = curlResponse.response;
  const responseStatusCode = curlResponse.statusCode;

  // if the server responded with a 4xx or 5xx error
  if (responseStatusCode && ["4", "5"].includes(responseStatusCode.toString()[0])) {
    // error handling logic...
    console.error("Error response:", response);
  } else {
    // handle the response...
    console.log(response);
  }
})();

If you launch the above script, the output will be the HTML content of Kick's home page:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charSet="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no"
    />
    <link rel="preload" as="image" href="/img/kick-logo.svg" />
    <!-- omitted for brevity... -->
    <title>Kick</title>
    <meta
      name="description"
      content="Kick is a streaming platform that makes it easy for you to find and watch your favorite content."
    />
    <!-- omitted for brevity... -->
  </head>
</html>

Awesome, the result confirms that you were able to access the target page without being blocked!

Setting browser-like HTTP headers is not enough to avoid blocks

curl-impersonate is certainly an interesting technology, but you may wonder what makes it so powerful and unique.

The common assumption when it comes to fooling anti-bot systems is that all you need to do is replicate browser requests. That is not entirely wrong, but it is far from easy to accomplish. Let's see why!

Open your browser in incognito mode and visit the Kick home page—the target web page of this article. In the “Network” tab of DevTools, you will see the request that the browser makes:

Browser DevTools showing network activity for kick.com, including request and response headers for a GET request with status code 200.  

Notice how Chrome includes special HTTP headers in the request. On the surface, that is the only difference from a request made with a regular HTTP client.

Right-click on the request and select the Copy > Copy as fetch (Node.js) option. This is what you would get:

fetch("https://kick.com/", {
  "headers": {
    "sec-ch-ua": "\"Chromium\";v=\"130\", \"Google Chrome\";v=\"130\", \"Not?A_Brand\";v=\"99\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "upgrade-insecure-requests": "1"
  },
  "referrerPolicy": "strict-origin-when-cross-origin",
  "body": null,
  "method": "GET"
});

fetch() is a function that comes from the Node.js Fetch API. See why the above code does not require an external library in our piece on the Fetch API in Node.js.

Copy the request to a JavaScript script and execute it. You will get the following [403 Forbidden](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403) page:

<!DOCTYPE html>
<html lang="en-US">
  <head>
    <title>Just a moment...</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <meta name="robots" content="noindex,nofollow">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <!-- omitted for brevity... -->
  </head>
</html>

In this case, Kick was able to detect your request as coming from an automated script and block it. How is that even possible? Read on!
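Before moving on, note that a script can recognize this kind of block programmatically. The following sketch is a simple heuristic based only on the response shown above (a 403 status and a "Just a moment..." title); the looksLikeChallengePage() helper is a hypothetical name, and other anti-bot pages may look different.

```javascript
// Heuristic: the blocked response shown above is a 403 whose <title> is
// "Just a moment...". Other anti-bot pages may use different markers.
function looksLikeChallengePage(statusCode, html) {
  const titleMatch = /<title>([^<]*)<\/title>/i.exec(html ?? "");
  const title = titleMatch ? titleMatch[1].trim() : "";
  return statusCode === 403 && title === "Just a moment...";
}

// Shortened versions of the two HTML documents shown in this article.
const blockedHtml =
  '<!DOCTYPE html><html lang="en-US"><head><title>Just a moment...</title></head></html>';
const homeHtml =
  '<!DOCTYPE html><html lang="en"><head><title>Kick</title></head></html>';

console.log(looksLikeChallengePage(403, blockedHtml)); // true
console.log(looksLikeChallengePage(200, homeHtml)); // false
```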

Why curl-impersonate is effective against most anti-bot solutions

What we did above is mimic the behavior of a browser at the application layer, making an equivalent HTTPS request to that of your browser. But remember, the Internet operates over a stack of layers!

Diagram of the OSI model showing the seven layers from physical to application, with corresponding data units like bits, frames, packets, segments, and data.

To reach the server, your HTTPS request must pass through the TLS channel established on top of the transport layer, then through the IP layer, and so on.

As a web developer, you spend most of your day working with technologies at the application layer. However, it is essential not to overlook the underlying layers that enable the application layer to function.

Anti-bot solutions analyze all aspects of incoming requests, from high-level application details down to lower-level elements. To determine if a request is genuine, they cannot rely solely on application-layer details at the HTTPS level. Otherwise, eluding bot detection would be a piece of cake!

So, the most advanced bot protection systems on the market like Cloudflare focus on low-level network aspects, such as the TLS fingerprint of the request.

TLS fingerprinting as a key to discovering bots

When a client, such as your browser or a scraping bot, initiates a secure connection with a server, the two first perform a TLS handshake.

Diagram of a TLS handshake showing communication between a client and server, including steps like ClientHello, ServerHello, certificate validation, key exchange, and establishment of a secured connection.

During that process, the client and server negotiate encryption settings. This handshake includes details like the TLS version, cipher suites, and extensions that the client supports.

Based on the information exchanged during the handshake, it is possible to generate a "fingerprint" that helps distinguish one client from another.

This is how most bot detection systems can tell if you are using a real browser or not. Browsers use well-known TLS libraries that are generally different from those used by HTTP clients.

The consequence of this is that the TLS fingerprint of a request made by a browser is quite different from that of an HTTP client — even if they share the same HTTP headers.
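To give an idea of how such a fingerprint can be derived, the sketch below builds a JA3-style string from ClientHello values. It mirrors the layout of the ja3 fields in the API outputs shown later in this article: version, ciphers, extensions, curves, and point formats, with GREASE values removed. The function names and the sample values are illustrative assumptions, not a real browser profile. The matching digest is the MD5 hash of the resulting string, which is omitted here to keep the example dependency-free.

```javascript
// GREASE values (RFC 8701) follow the pattern 0x0A0A, 0x1A1A, ..., 0xFAFA.
// Clients insert them at random, so they are dropped before fingerprinting.
function isGrease(value) {
  return (value & 0x0f0f) === 0x0a0a && (value >> 8) === (value & 0xff);
}

// Builds a JA3-style string: version,ciphers,extensions,curves,point formats.
function buildJa3({ version, ciphers, extensions, curves, points }) {
  const join = (values) => values.filter((v) => !isGrease(v)).join("-");
  return [version, join(ciphers), join(extensions), join(curves), join(points)].join(",");
}

// Shortened, illustrative ClientHello values (not a real browser profile).
const ja3 = buildJa3({
  version: 772,
  ciphers: [0x3a3a, 4865, 4866, 4867],
  extensions: [0x4a4a, 35, 43, 0, 51],
  curves: [0x1a1a, 29, 23, 24],
  points: [0],
});
console.log(ja3); // 772,4865-4866-4867,35-43-0-51,29-23-24,0
```

Because the string depends on which ciphers and extensions the client offers, and in which order, two clients built on different TLS libraries rarely produce the same value.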

You can verify that by targeting the Scrapfly TLS Fingerprinting API in your browser and comparing the result with clients like node-curl-impersonate and the Fetch API.

Chrome 130 returns:

{
  "ja3": "772,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,35-27-43-11-16-65281-10-13-65037-5-18-23-45-0-17513-51,25497-29-23-24,0",
  "ja3n": "772,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,0-5-10-11-13-16-18-23-27-35-43-45-51-17513-65037-65281,25497-29-23-24,0",
  "ja3_digest": "370fa7191028e260eac290c51745d8f8",
  "ja3n_digest": "eb5a4e1d21094c5caf044c8f3117f306",
  "scrapfly_fp": "version:772|ch_ciphers:GREASE-4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53|ch_extensions:GREASE-0-5-10-11-13-16-18-23-27-35-43-45-51-17513-65037-65281-GREASE|groups:GREASE-25497-29-23-24|points:0|compression:0|supported_versions:GREASE-772-771|supported_protocols:h2-http11|key_shares:GREASE-25497-29|psk:1|signature_algs:1027-2052-1025-1283-2053-1281-2054-1537|early_data:0|",
  "scrapfly_fp_digest": "58e05a62bade1452454ea0b0cc49c971",
  "tls": {
    "version": "0x0303 - TLS 1.2",
    "ciphers": [
      "0x3A3A",
      "TLS_AES_128_GCM_SHA256",
      "TLS_AES_256_GCM_SHA384",
      "TLS_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
      "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
      "TLS_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_RSA_WITH_AES_128_CBC_SHA",
      "TLS_RSA_WITH_AES_256_CBC_SHA"
    ],
    "curves": [
      "TLS_GREASE (0x1A1A)",
      "Unknown curve 0x6399",
      "X25519 (29)",
      "secp256r1 (23)",
      "secp384r1 (24)"
    ],
    "extensions": [
      "GREASE (0x4A4A)",
      "session_ticket (35) (IANA)",
      "compress_certificate (27) (IANA)",
      "supported_versions (43) (IANA)",
      "ec_point_formats (11) (IANA)",
      "application_layer_protocol_negotiation (16) (IANA)",
      "extensionRenegotiationInfo (boringssl) (65281) (IANA)",
      "supported_groups (10) (IANA)",
      "signature_algorithms (13) (IANA)",
      "extensionEncryptedClientHello (65037) (boringssl)",
      "status_request (5) (IANA)",
      "signed_certificate_timestamp (18) (IANA)",
      "extended_master_secret (23) (IANA)",
      "psk_key_exchange_modes (45) (IANA)",
      "server_name (0) (IANA)",
      "extensionApplicationSettings (17513) (boringssl)",
      "key_share (51) (IANA)",
      "GREASE (0x8A8A)"
    ],
    "points": [
      "0x00"
    ],
    "protocols": [
      "h2",
      "http/1.1"
    ],
    "versions": [
      "43690",
      "772",
      "771"
    ],
    "handshake_duration": "184.049664ms",
    "is_session_resumption": false,
    "session_ticket_supported": true,
    "support_secure_renegotiation": true,
    "supported_tls_versions": [
      43690,
      772,
      771
    ],
    "supported_protocols": [
      "h2",
      "http11"
    ],
    "signature_algorithms": [
      1027,
      2052,
      1025,
      1283,
      2053,
      1281,
      2054,
      1537
    ],
    "psk_key_exchange_mode": "AQ==",
    "cert_compression_algorithms": "AA==",
    "early_data": false,
    "using_psk": false,
    "selected_protocol": "h2",
    "selected_curve_group": 29,
    "selected_cipher_suite": 4865,
    "key_shares": [
      6682,
      25497,
      29
    ]
  }
}

node-curl-impersonate returns:

{
  "ja3": "772,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,35-43-65281-45-51-5-16-0-27-13-23-11-10-17513-18,29-23-24,0",
  "ja3n": "772,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,0-5-10-11-13-16-18-23-27-35-43-45-51-17513-65281,29-23-24,0",
  "ja3_digest": "d737eab1c0aba59b4b466cf91d42a47a",
  "ja3n_digest": "0fb2c926015957b7e56038e269a7c58a",
  "scrapfly_fp": "version:772|ch_ciphers:GREASE-4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53|ch_extensions:GREASE-0-5-10-11-13-16-18-23-27-35-43-45-51-17513-65281-GREASE|groups:GREASE-29-23-24|points:0|compression:0|supported_versions:GREASE-772-771|supported_protocols:h2-http11|key_shares:GREASE-29|psk:1|signature_algs:1027-2052-1025-1283-2053-1281-2054-1537|early_data:0|",
  "scrapfly_fp_digest": "81fbc443bb8cb67310e62d982c1e4c98",
  "tls": {
    "version": "0x0303 - TLS 1.2",
    "ciphers": [
      "0x6A6A",
      "TLS_AES_128_GCM_SHA256",
      "TLS_AES_256_GCM_SHA384",
      "TLS_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
      "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
      "TLS_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_RSA_WITH_AES_128_CBC_SHA",
      "TLS_RSA_WITH_AES_256_CBC_SHA"
    ],
    "curves": [
      "TLS_GREASE (0xBABA)",
      "X25519 (29)",
      "secp256r1 (23)",
      "secp384r1 (24)"
    ],
    "extensions": [
      "GREASE (0x0A0A)",
      "session_ticket (35) (IANA)",
      "supported_versions (43) (IANA)",
      "extensionRenegotiationInfo (boringssl) (65281) (IANA)",
      "psk_key_exchange_modes (45) (IANA)",
      "key_share (51) (IANA)",
      "status_request (5) (IANA)",
      "application_layer_protocol_negotiation (16) (IANA)",
      "server_name (0) (IANA)",
      "compress_certificate (27) (IANA)",
      "signature_algorithms (13) (IANA)",
      "extended_master_secret (23) (IANA)",
      "ec_point_formats (11) (IANA)",
      "supported_groups (10) (IANA)",
      "extensionApplicationSettings (17513) (boringssl)",
      "signed_certificate_timestamp (18) (IANA)",
      "GREASE (0x5A5A)",
      "padding (21) (IANA)"
    ],
    "points": [
      "0x00"
    ],
    "protocols": [
      "h2",
      "http/1.1"
    ],
    "versions": [
      "23130",
      "772",
      "771"
    ],
    "handshake_duration": "221.314783ms",
    "is_session_resumption": false,
    "session_ticket_supported": true,
    "support_secure_renegotiation": true,
    "supported_tls_versions": [
      23130,
      772,
      771
    ],
    "supported_protocols": [
      "h2",
      "http11"
    ],
    "signature_algorithms": [
      1027,
      2052,
      1025,
      1283,
      2053,
      1281,
      2054,
      1537
    ],
    "psk_key_exchange_mode": "AQ==",
    "cert_compression_algorithms": "AA==",
    "early_data": false,
    "using_psk": false,
    "selected_protocol": "h2",
    "selected_curve_group": 29,
    "selected_cipher_suite": 4865,
    "key_shares": [
      47802,
      29
    ]
  }
}

fetch() returns:

{
  "ja3": "772,4866-4867-4865-49199-49195-49200-49196-158-49191-103-49192-107-163-159-52393-52392-52394-49327-49325-49315-49311-49245-49249-49239-49235-162-49326-49324-49314-49310-49244-49248-49238-49234-49188-106-49187-64-49162-49172-57-56-49161-49171-51-50-157-49313-49309-49233-156-49312-49308-49232-61-60-53-47-255,0-11-10-35-16-22-23-13-43-45-51,29-23-30-25-24-256-257-258-259-260,0-1-2",
  "ja3n": "772,4866-4867-4865-49199-49195-49200-49196-158-49191-103-49192-107-163-159-52393-52392-52394-49327-49325-49315-49311-49245-49249-49239-49235-162-49326-49324-49314-49310-49244-49248-49238-49234-49188-106-49187-64-49162-49172-57-56-49161-49171-51-50-157-49313-49309-49233-156-49312-49308-49232-61-60-53-47-255,0-10-11-13-16-22-23-35-43-45-51,29-23-30-25-24-256-257-258-259-260,0-1-2",
  "ja3_digest": "f376ddf05a7a38d2fb080069329ce2a2",
  "ja3n_digest": "7b70814919c3f12abb0b7d0b603462aa",
  "scrapfly_fp": "version:772|ch_ciphers:4866-4867-4865-49199-49195-49200-49196-158-49191-103-49192-107-163-159-52393-52392-52394-49327-49325-49315-49311-49245-49249-49239-49235-162-49326-49324-49314-49310-49244-49248-49238-49234-49188-106-49187-64-49162-49172-57-56-49161-49171-51-50-157-49313-49309-49233-156-49312-49308-49232-61-60-53-47-255|ch_extensions:0-10-11-13-16-22-23-35-43-45-51|groups:29-23-30-25-24-256-257-258-259-260|points:0-1-2|compression:0|supported_versions:772-771|supported_protocols:http11|key_shares:29|psk:1|signature_algs:1027-1283-1539-2055-2056-2057-2058-2059-2052-2053-2054-1025-1281-1537-771-769-770-1026-1282-1538|early_data:0|",
  "scrapfly_fp_digest": "8b2bf560717049d7bb701693d9f0d90b",
  "tls": {
    "version": "0x0303 - TLS 1.2",
    "ciphers": [
      "TLS_AES_256_GCM_SHA384",
      "TLS_CHACHA20_POLY1305_SHA256",
      "TLS_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
      "TLS_DHE_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
      "TLS_DHE_RSA_WITH_AES_128_CBC_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384",
      "TLS_DHE_RSA_WITH_AES_256_CBC_SHA256",
      "TLS_DHE_DSS_WITH_AES_256_GCM_SHA384",
      "TLS_DHE_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_DHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_256_CCM_8",
      "TLS_ECDHE_ECDSA_WITH_AES_256_CCM",
      "TLS_DHE_RSA_WITH_AES_256_CCM_8",
      "TLS_DHE_RSA_WITH_AES_256_CCM",
      "TLS_ECDHE_ECDSA_WITH_ARIA_256_GCM_SHA384",
      "TLS_ECDHE_RSA_WITH_ARIA_256_GCM_SHA384",
      "TLS_DHE_DSS_WITH_ARIA_256_GCM_SHA384",
      "TLS_DHE_RSA_WITH_ARIA_256_GCM_SHA384",
      "TLS_DHE_DSS_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8",
      "TLS_ECDHE_ECDSA_WITH_AES_128_CCM",
      "TLS_DHE_RSA_WITH_AES_128_CCM_8",
      "TLS_DHE_RSA_WITH_AES_128_CCM",
      "TLS_ECDHE_ECDSA_WITH_ARIA_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_ARIA_128_GCM_SHA256",
      "TLS_DHE_DSS_WITH_ARIA_128_GCM_SHA256",
      "TLS_DHE_RSA_WITH_ARIA_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384",
      "TLS_DHE_DSS_WITH_AES_256_CBC_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256",
      "TLS_DHE_DSS_WITH_AES_128_CBC_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA",
      "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
      "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
      "TLS_DHE_DSS_WITH_AES_256_CBC_SHA",
      "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA",
      "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
      "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
      "TLS_DHE_DSS_WITH_AES_128_CBC_SHA",
      "TLS_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_RSA_WITH_AES_256_CCM_8",
      "TLS_RSA_WITH_AES_256_CCM",
      "TLS_RSA_WITH_ARIA_256_GCM_SHA384",
      "TLS_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_RSA_WITH_AES_128_CCM_8",
      "TLS_RSA_WITH_AES_128_CCM",
      "TLS_RSA_WITH_ARIA_128_GCM_SHA256",
      "TLS_RSA_WITH_AES_256_CBC_SHA256",
      "TLS_RSA_WITH_AES_128_CBC_SHA256",
      "TLS_RSA_WITH_AES_256_CBC_SHA",
      "TLS_RSA_WITH_AES_128_CBC_SHA",
      "TLS_EMPTY_RENEGOTIATION_INFO"
    ],
    "curves": [
      "X25519 (29)",
      "secp256r1 (23)",
      "X448 (30)",
      "secp521r1 (25)",
      "secp384r1 (24)",
      "ffdhe2048 (256)",
      "ffdhe3072 (257)",
      "ffdhe4096 (258)",
      "ffdhe6144 (259)",
      "ffdhe8192 (260)"
    ],
    "extensions": [
      "server_name (0) (IANA)",
      "ec_point_formats (11) (IANA)",
      "supported_groups (10) (IANA)",
      "session_ticket (35) (IANA)",
      "application_layer_protocol_negotiation (16) (IANA)",
      "encrypt_then_mac (22) (IANA)",
      "extended_master_secret (23) (IANA)",
      "signature_algorithms (13) (IANA)",
      "supported_versions (43) (IANA)",
      "psk_key_exchange_modes (45) (IANA)",
      "key_share (51) (IANA)"
    ],
    "points": [
      "0x00",
      "0x01",
      "0x02"
    ],
    "protocols": [
      "http/1.1"
    ],
    "versions": [
      "772",
      "771"
    ],
    "handshake_duration": "195.733862ms",
    "is_session_resumption": false,
    "session_ticket_supported": true,
    "support_secure_renegotiation": true,
    "supported_tls_versions": [
      772,
      771
    ],
    "supported_protocols": [
      "http11"
    ],
    "signature_algorithms": [
      1027,
      1283,
      1539,
      2055,
      2056,
      2057,
      2058,
      2059,
      2052,
      2053,
      2054,
      1025,
      1281,
      1537,
      771,
      769,
      770,
      1026,
      1282,
      1538
    ],
    "psk_key_exchange_mode": "AQ==",
    "cert_compression_algorithms": "AA==",
    "early_data": false,
    "using_psk": false,
    "selected_protocol": "http/1.1",
    "selected_curve_group": 29,
    "selected_cipher_suite": 4865,
    "key_shares": [
      29
    ]
  }
}

As you can tell, the TLS fingerprints generated by Chrome and node-curl-impersonate are much closer to each other than either is to the one produced by fetch().

Most likely, the remaining differences between the TLS fingerprints of Chrome and node-curl-impersonate come from the browser version: the script impersonates Chrome 116, while the browser used for the test was Chrome 130. This similarity plays a key role in bot detection and explains why node-curl-impersonate was able to retrieve the HTML content of the Kick home page while the Fetch API failed.
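Spotting differences in long fingerprint strings by eye is error-prone, so a few lines of code can help. The sketch below compares the ja3n values copied from the Chrome 130 and node-curl-impersonate outputs above, field by field. The parseJa3() and diffJa3() helpers are hypothetical names written for this comparison, and the diff ignores ordering.

```javascript
// ja3n values copied from the Chrome 130 and node-curl-impersonate outputs above.
const chromeJa3n =
  "772,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,0-5-10-11-13-16-18-23-27-35-43-45-51-17513-65037-65281,25497-29-23-24,0";
const curlJa3n =
  "772,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,0-5-10-11-13-16-18-23-27-35-43-45-51-17513-65281,29-23-24,0";

const FIELD_NAMES = ["version", "ciphers", "extensions", "curves", "points"];

// Splits a JA3-style string into named lists of values.
function parseJa3(ja3) {
  const parts = ja3.split(",");
  return Object.fromEntries(
    FIELD_NAMES.map((name, i) => [name, parts[i] ? parts[i].split("-") : []])
  );
}

// Reports, per field, the values present in only one of the two fingerprints.
function diffJa3(a, b) {
  const left = parseJa3(a);
  const right = parseJa3(b);
  const diff = {};
  for (const name of FIELD_NAMES) {
    const onlyInA = left[name].filter((v) => !right[name].includes(v));
    const onlyInB = right[name].filter((v) => !left[name].includes(v));
    if (onlyInA.length || onlyInB.length) {
      diff[name] = { onlyInA, onlyInB };
    }
  }
  return diff;
}

console.log(diffJa3(chromeJa3n, curlJa3n));
```

On this data, the only values unique to Chrome 130 are extension 65037 (listed as extensionEncryptedClientHello in its output) and curve 25497 (shown as "Unknown curve 0x6399"), while the cipher lists match exactly.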

How curl-impersonate works

To achieve the result highlighted earlier, the team behind curl-impersonate had to patch curl to resemble a browser as closely as possible. In particular, these are the changes they introduced:

  • Compiling curl with BoringSSL, the TLS library used by Google Chrome, instead of OpenSSL. For the Firefox version, curl was compiled with NSS, Firefox’s TLS library
  • Modifying the way curl configures several SSL options and TLS extensions
  • Adding support for new TLS extensions
  • Adjusting the settings for curl's HTTP/2 connections
  • Running curl with non-default flags, such as --ciphers, --curves, and specific -H headers (like the [User-Agent](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent)), to further mimic the behavior of a browser

These modifications allow requests made by curl-impersonate to be nearly identical, from a network perspective, to those of a real browser.
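The last point in the list can be pictured as assembling a long curl command line from a browser profile. The sketch below is a loose illustration of that idea only: the buildCurlArgs() helper and the profile values are placeholders chosen for this example, not the exact settings shipped by curl-impersonate. It relies on the --ciphers, --curves, and -H flags mentioned above, with list values joined by colons.

```javascript
// Illustrative profile: the values below are placeholders chosen for the
// example, not the exact settings shipped by curl-impersonate.
const exampleProfile = {
  ciphers: ["ECDHE-ECDSA-AES128-GCM-SHA256", "ECDHE-RSA-AES128-GCM-SHA256"],
  curves: ["X25519", "P-256", "P-384"],
  headers: {
    "User-Agent":
      "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
  },
};

// Turns a profile into a list of curl command-line arguments.
function buildCurlArgs(profile, url) {
  const args = [
    "--ciphers", profile.ciphers.join(":"),
    "--curves", profile.curves.join(":"),
  ];
  // one -H flag per custom header
  for (const [name, value] of Object.entries(profile.headers)) {
    args.push("-H", `${name}: ${value}`);
  }
  args.push(url);
  return args;
}

console.log(buildCurlArgs(exampleProfile, "https://example.com").join(" "));
```

Keep in mind that flags alone are not enough: without the patched TLS library and HTTP/2 settings described in the list, a stock curl build would still produce a different fingerprint.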

You can find all the implementation details in the guides on the official blog, which explain how they managed to fully impersonate Chrome and mimic Firefox.

The advantages of curl-impersonate over browser automation tools

If you are an expert in Node.js web automation, you might assume that using headless browsers controlled by technologies like Playwright or Puppeteer is more effective than utilizing curl-impersonate. No surprise, those two libraries are listed in our list of the best Node.js web scraping technologies. After all, browser automation tools also enable you to interact with the elements on the page, whereas curl-impersonate is just an HTTP client that can only retrieve web pages.

Still, there are Node.js web automation scenarios where a library like node-curl-impersonate might be a better choice than Playwright or Puppeteer. The reason is that anti-bot systems often use a two-step approach to detect and block bots.

The first step checks whether the request is coming from a legitimate browser, as explained earlier in this article. If the request seems suspicious, it is blocked. Otherwise, the server delivers the HTML document of the page.

In the second step, special JavaScript scripts included in the page inspect the browser's settings and configurations to generate a browser fingerprint. This is then sent back to the anti-bot system to determine whether the user is legitimate. The second step works because automation tools tend to configure browsers in ways that differ from regular browsers. These differences are enough for anti-bot solutions to understand that they are dealing with an automated request. For more information, check out our guide on Playwright Extra.

In contrast, curl-impersonate cannot execute JavaScript, so it skips the second step entirely. If passing the second step is not required to be considered a legitimate user, node-curl-impersonate can keep sending requests to the target server without the resource overhead and slowness typical of browser automation tools, which remain even in headless mode.

Conclusion

In this article, we explored what curl-impersonate is, how to use it in Node.js, and why it can be more effective than browser automation tools in bypassing anti-bot systems. We learned that the key to its success lies in low-level network details, such as TLS fingerprinting. With this special build of curl, you can take your automation scripts in Node.js to the next level! If you have any further questions about using curl-impersonate in Node.js, feel free to comment below.


