# Scrapers
Scrapers are reusable extractors that you can create, manage, and run on multiple URLs. This API is ideal for scenarios where you need to extract the same type of data from multiple pages or websites.
## How It Works
- Create a new empty scraper to get a scraper ID
- Generate code for your scraper (see Code Mode)
- Run the scraper on specific URLs whenever you need
- List all your available scrapers
- Delete scrapers you no longer need
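The steps above can be sketched as a thin client. The helper below only assembles each request (method, URL, headers, payload) rather than sending it, so the lifecycle is visible without network access; the endpoint paths mirror the sections that follow, while `build_request` itself is an illustrative convenience, not part of any official SDK.

```python
# Illustrative sketch of the scraper lifecycle. Endpoint paths come from
# this page; build_request is a hypothetical helper, not an official client.
BASE_URL = "https://api.parsera.org"

def build_request(method, path, api_key, payload=None):
    """Assemble the pieces of one API call without sending it."""
    request = {
        "method": method,
        "url": f"{BASE_URL}{path}",
        "headers": {"X-API-KEY": api_key},
    }
    if payload is not None:
        request["headers"]["Content-Type"] = "application/json"
        request["json"] = payload
    return request

api_key = "<YOUR_API_KEY>"

# 1. Create an empty scraper (the response carries a scraper_id).
create = build_request("POST", "/v1/scrapers/new", api_key)

# 2. Run it on a URL once code has been generated for it.
run = build_request("POST", "/v1/scrapers/run", api_key, payload={
    "scraper_id": "hackernews",
    "url": "https://news.ycombinator.com/front?day=2024-09-11",
})

# 3. Delete it when no longer needed.
delete = build_request("DELETE", "/v1/scrapers/hackernews", api_key)
```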
## List Scrapers
List all available scrapers for your account:
Endpoint: `GET /v1/scrapers`

```shell
curl https://api.parsera.org/v1/scrapers \
  --header 'X-API-KEY: <YOUR_API_KEY>'
```

Response:
Returns a list of scraper objects with `id` and `name` fields:

```json
[
  {
    "id": "scraper-id-123",
    "name": "hackernews"
  },
  {
    "id": "scraper-id-456",
    "name": "linkedin-jobs"
  }
]
```

## Create New Scraper
Create a new empty scraper for your account. This endpoint returns a `scraper_id` that you can use with the `/v1/scrapers/generate` endpoint to generate scraping code.
Endpoint: `POST /v1/scrapers/new`

```shell
curl https://api.parsera.org/v1/scrapers/new \
  --header 'X-API-KEY: <YOUR_API_KEY>' \
  --request POST
```

Response:
Returns a scraper ID that can be used to generate the scraper:
```json
{
  "scraper_id": "hackernews"
}
```

Note: After creating a new scraper, use the returned `scraper_id` to generate code for your scraper. See the Code Mode documentation for details.
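Wiring the create step to the next one amounts to pulling `scraper_id` out of the response body. A minimal sketch, assuming the JSON shape shown above (the helper name is illustrative, not part of the API):

```python
# Extract the scraper_id from a /v1/scrapers/new response body.
# The sample payload matches the response shown above.
def scraper_id_from_response(body: dict) -> str:
    try:
        return body["scraper_id"]
    except KeyError:
        raise ValueError(f"unexpected response shape: {body!r}")

response_body = {"scraper_id": "hackernews"}
scraper_id = scraper_id_from_response(response_body)
# scraper_id can now be passed to /v1/scrapers/generate and /v1/scrapers/run.
```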
## Run Scraper
Run an existing scraper on one or multiple URLs:
Endpoint: `POST /v1/scrapers/run`

```shell
curl https://api.parsera.org/v1/scrapers/run \
  --header 'Content-Type: application/json' \
  --header 'X-API-KEY: <YOUR_API_KEY>' \
  --data '{
    "scraper_id": "hackernews",
    "url": "https://news.ycombinator.com/front?day=2024-09-11"
  }'
```

Run on multiple URLs:
You can also run a scraper on multiple URLs at once (up to 100 URLs):
```shell
curl https://api.parsera.org/v1/scrapers/run \
  --header 'Content-Type: application/json' \
  --header 'X-API-KEY: <YOUR_API_KEY>' \
  --data '{
    "scraper_id": "hackernews",
    "url": [
      "https://news.ycombinator.com/front?day=2024-09-11",
      "https://news.ycombinator.com/front?day=2024-09-12",
      "https://news.ycombinator.com/front?day=2024-09-13"
    ]
  }'
```

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `scraper_id` | string | - | ID of the scraper to run |
| `url` | string or array | null | URL or list of URLs to scrape (max 100 URLs) |
| `proxy_country` | string | null | Proxy country, see Proxy Countries |
| `cookies` | array | null | Cookies to use during extraction, see Cookies |
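The parameters above can be assembled into a request body programmatically. A minimal sketch: the 100-URL limit and the optional fields come from the table, while `build_run_payload` and the `"US"` proxy value are illustrative, not part of any official client.

```python
# Build the JSON body for POST /v1/scrapers/run from the parameters
# described in the table above. Hypothetical helper, for illustration.
def build_run_payload(scraper_id, url, proxy_country=None, cookies=None):
    # The API accepts at most 100 URLs per run.
    if isinstance(url, list) and len(url) > 100:
        raise ValueError("at most 100 URLs per run")
    payload = {"scraper_id": scraper_id, "url": url}
    # Optional parameters default to null and are simply omitted when unset.
    if proxy_country is not None:
        payload["proxy_country"] = proxy_country
    if cookies is not None:
        payload["cookies"] = cookies
    return payload

payload = build_run_payload(
    "hackernews",
    ["https://news.ycombinator.com/front?day=2024-09-11",
     "https://news.ycombinator.com/front?day=2024-09-12"],
    proxy_country="US",  # example value; see Proxy Countries for valid codes
)
```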
## Delete Scraper
Delete an existing scraper by its ID:
Endpoint: `DELETE /v1/scrapers/{scraper_id}`

```shell
curl https://api.parsera.org/v1/scrapers/hackernews \
  --header 'X-API-KEY: <YOUR_API_KEY>' \
  --request DELETE
```

Parameters:

| Parameter | Type | Description |
|---|---|---|
| `scraper_id` | string | ID of the scraper to delete (path parameter) |
Response:
Returns a success message on successful deletion:
```json
{
  "message": "Scraper deleted successfully."
}
```

Note: Old scrapers (prefixed with `scraper:`) cannot be deleted via this API. Only scrapers created through the `/v1/scrapers/new` endpoint can be deleted.
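Given the note above, a caller cleaning up scrapers in bulk may want to skip legacy IDs before issuing deletes. The prefix check mirrors the note; the helper itself is an illustrative sketch, not part of the API.

```python
# Filter out legacy scrapers (IDs prefixed with "scraper:"), which the
# DELETE endpoint rejects per the note above. Illustrative helper.
def deletable(scraper_ids):
    return [sid for sid in scraper_ids if not sid.startswith("scraper:")]

ids = ["scraper:legacy-1", "scraper-id-123", "hackernews"]
to_delete = deletable(ids)
```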
## Code Mode
To generate scraping code and run it in code mode, see the Code Mode documentation.
## Migration from Agents API
If you're using the older Agents API (`agents.parsera.org`), please refer to the Agents API (Deprecated) documentation for migration guidance.
## More Features
Enhance your scraper with additional features:
- Code Mode - Generate and run custom scraper code
- Specify Output Types - Define data types for extracted fields
- Setting Proxy - Access content from different geographic locations
- Setting Cookies - Handle authentication and session cookies