Launching Crawlee, the web scraping and browser automation library for Node.js

mnmkng

Ondra Urban

Posted on August 23, 2022

Hello world,

Today, drawing on our team's years of experience, we're launching Crawlee, the web scraping and browser automation library for Node.js that's built for fast development and maximum reliability in production.

Main features

🖼 Supports headless browsers with Playwright or Puppeteer

⚡️ Supports raw HTTP crawling with Cheerio or JSDOM

🎛 Automated parallelization and scaling of crawlers for top performance

🐾 Avoids blocking using smart sessions, proxies, and browser fingerprints

🚎 Simple management and persistence of queues of URLs to crawl

🗜 Written completely in TypeScript for type safety and code autocompletion

📚 Comprehensive documentation, code examples, and tutorials

💪🏼 Actively maintained and developed by Apify—we use it ourselves!
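
To make the queue and parallelization features a bit more concrete, here is a minimal, dependency-free sketch of the general idea: a de-duplicating URL queue drained by a fixed number of concurrent workers. This is not Crawlee's code or API. `UrlQueue`, `crawl`, and the `handler` callback are hypothetical names made up for illustration, and the queue lives only in memory, with no persistence.

```typescript
// Illustrative only: NOT Crawlee's implementation. All names are hypothetical.

class UrlQueue {
    private readonly seen = new Set<string>();
    private readonly pending: string[] = [];

    // Drop the #fragment so the same page is not queued twice.
    private static normalize(url: string): string {
        const hashIndex = url.indexOf('#');
        return hashIndex === -1 ? url : url.slice(0, hashIndex);
    }

    // Returns true if the URL was new and has been queued.
    add(url: string): boolean {
        const key = UrlQueue.normalize(url);
        if (this.seen.has(key)) return false;
        this.seen.add(key);
        this.pending.push(key);
        return true;
    }

    // Returns the next URL in FIFO order, or undefined when the queue is empty.
    next(): string | undefined {
        return this.pending.shift();
    }

    get size(): number {
        return this.pending.length;
    }
}

// Drain the queue with at most `concurrency` handlers in flight.
// `handler` returns the links it discovered; unseen ones are queued.
// Simplification: a worker stops as soon as it finds the queue empty, so
// parallelism can drop while other workers are still discovering links.
// Every queued URL is still processed, because the worker that queued it
// always asks the queue for more work afterwards.
async function crawl(
    queue: UrlQueue,
    handler: (url: string) => Promise<string[]>,
    concurrency: number,
): Promise<number> {
    let processed = 0;
    const worker = async (): Promise<void> => {
        for (let url = queue.next(); url !== undefined; url = queue.next()) {
            const discovered = await handler(url);
            discovered.forEach((link) => queue.add(link));
            processed += 1;
        }
    };
    await Promise.all(Array.from({ length: concurrency }, () => worker()));
    return processed;
}
```

You would call it as `await crawl(queue, handler, 5)`, where `handler` fetches a page and returns the links it found. A real crawler also needs retries, persistence, and scaling that adapts to system load, which is the kind of work a library takes off your hands.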

Getting started

Visit crawlee.dev or run the following command:

npx crawlee create my-crawler

Liked Crawlee?

💛 You can support the project on GitHub, Product Hunt, or Hacker News
