Fast, high-level Python framework for web crawling and scraping
License
BSD-3-Clause
Languages
- Python
- Go Template
- HTML

About Scrapy
Scrapy is a fast, high-level web crawling and scraping framework for Python that extracts structured data from websites. Spiders are written in code, which makes Scrapy a good fit for projects that need a programmable crawler rather than a point-and-click tool.
The framework is built on an asynchronous engine that handles many requests concurrently, and it is highly extensible through middlewares, item pipelines, and signals. It runs on Windows, macOS, and Linux and requires Python 3.10 or newer.
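As an illustration of the item pipeline extension point, the sketch below shows a small pipeline that cleans up scraped records. The class name is hypothetical, and the code assumes items are plain dicts; it uses the long-documented `process_item(item, spider)` method shape, which may differ slightly between Scrapy versions.

```python
class StripWhitespacePipeline:
    """Hypothetical item pipeline: trims surrounding whitespace from string fields."""

    def process_item(self, item, spider):
        # Assumption: items are plain dicts. Copy the pairs into a list first
        # so the dict is not iterated while its values are being reassigned.
        for key, value in list(item.items()):
            if isinstance(value, str):
                item[key] = value.strip()
        # Return the item so that any later pipeline stage receives it.
        return item
```

A pipeline like this is usually enabled through the `ITEM_PIPELINES` setting, for example `ITEM_PIPELINES = {"myproject.pipelines.StripWhitespacePipeline": 300}`, where the module path is a placeholder and the number sets the order in which pipelines run.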
Scrapy is maintained by Zyte, formerly Scrapinghub, alongside many other contributors. It is released under the BSD 3-Clause license, installs with pip, and runs entirely locally with no hosted service.
Key features
- Code-defined spiders for crawling and scraping
- Asynchronous engine for concurrent requests
- Structured data extraction from web pages
- Extensible via middlewares and item pipelines
- Cross-platform, runs on Python 3.10+
Details
- First released: 2010
- Platforms: Windows · macOS · Linux
- Language: Python
- Deployment: Library
- Maintainer: Zyte
- License: BSD-3-Clause
