Manage Scrapers
Create and manage classic code-based scrapers via the Extractor API. Classic scrapers use generated Python code for fast, deterministic, cost-effective extraction — ideal for production scraping tasks.
Generate Scraper
Generate Python scraping code by providing a sample URL and the fields you want to extract. If template_id is omitted, a new scraper is created automatically.
This endpoint is asynchronous — it returns immediately and the code is generated in the background.
Endpoint: POST /v1/extractor/generate
```bash
curl https://api.parsera.org/v1/extractor/generate \
  --header 'Content-Type: application/json' \
  --header 'X-API-KEY: <YOUR_API_KEY>' \
  --data '{
    "url": "https://news.ycombinator.com/",
    "attributes": [
      {
        "name": "title",
        "description": "News title"
      },
      {
        "name": "points",
        "description": "Number of points"
      }
    ]
  }'
```

Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| template_id | string | null | Scraper ID. If omitted, a new scraper is created automatically |
| url | string | null | Sample URL to generate the scraper from |
| content | string | null | Raw HTML or text content (alternative to url) |
| prompt | string | "" | Additional prompt for scraper generation |
| attributes | array | [] | A list of attribute objects with name and description fields. You can also specify Output Types |
| proxy_country | string | null | Proxy country for the sample request, see Proxy Countries |
| cookies | array | null | Cookies to use during generation, see Cookies |
Note: You must provide either url or content, and either attributes or prompt.
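The either/or requirement above can be sketched as a small client-side check. This is an illustrative helper, not part of any official SDK; it only encodes the rule stated in the note:

```python
def validate_generate_payload(payload: dict) -> None:
    """Check the documented requirements for POST /v1/extractor/generate:
    the payload must contain either 'url' or 'content', and either
    'attributes' or a non-empty 'prompt'."""
    if not (payload.get("url") or payload.get("content")):
        raise ValueError("Provide either 'url' or 'content'.")
    if not (payload.get("attributes") or payload.get("prompt")):
        raise ValueError("Provide either 'attributes' or 'prompt'.")


# Valid: has 'url' and 'attributes', so no exception is raised.
validate_generate_payload({
    "url": "https://news.ycombinator.com/",
    "attributes": [{"name": "title", "description": "News title"}],
})
```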
Response (202 Accepted):
```json
{
  "template_id": "abc123",
  "status": "generating"
}
```

To check generation progress, poll the scraper details endpoint.
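Because generation happens in the background, a client typically polls GET /v1/extractor/{scraper_id} until the status leaves generating. A minimal stdlib sketch, assuming the endpoint and response shapes shown on this page; the helper names and retry behavior are illustrative, not part of any official SDK:

```python
import json
import time
import urllib.request

API = "https://api.parsera.org/v1/extractor"
HEADERS = {"X-API-KEY": "<YOUR_API_KEY>"}


def wait_until_ready(fetch_status, interval=2.0, timeout=120.0):
    """Poll fetch_status() until it returns a terminal status.

    fetch_status is any callable returning one of the documented
    generation statuses: 'generating', 'ready', 'failed'.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("ready", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError("scraper generation did not finish in time")


def fetch_scraper_status(template_id):
    # GET /v1/extractor/{scraper_id} returns JSON with a 'status' field.
    req = urllib.request.Request(f"{API}/{template_id}", headers=HEADERS)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["status"]


# Example (requires a real API key and template_id):
# status = wait_until_ready(lambda: fetch_scraper_status("abc123"))
```

Separating the polling loop from the HTTP call keeps the retry logic easy to test and reuse with the async run endpoints as well.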
Create Empty Scraper
Create a new empty classic scraper. This is useful when you need a scraper ID before triggering generation — otherwise, POST /v1/extractor/generate auto-creates one for you.
Endpoint: POST /v1/extractor/new
```bash
curl https://api.parsera.org/v1/extractor/new \
  --header 'X-API-KEY: <YOUR_API_KEY>' \
  --request POST
```

Response:
```json
{
  "template_id": "abc123"
}
```

Get Scraper Details
Get a classic scraper's details and code generation status:
Endpoint: GET /v1/extractor/{scraper_id}
```bash
curl https://api.parsera.org/v1/extractor/abc123 \
  --header 'X-API-KEY: <YOUR_API_KEY>'
```

Response:
```json
{
  "template_id": "abc123",
  "type": "extractor",
  "name": "hackernews",
  "mode": "code",
  "url": "https://news.ycombinator.com/",
  "status": "ready"
}
```

Generation statuses: generating, ready, failed
Delete Scraper
Delete a classic scraper:
Endpoint: DELETE /v1/extractor/{scraper_id}
```bash
curl https://api.parsera.org/v1/extractor/abc123 \
  --header 'X-API-KEY: <YOUR_API_KEY>' \
  --request DELETE
```

Response:
```json
{
  "message": "Scraper deleted successfully."
}
```

Running Scrapers
Once a classic scraper is generated, run it via the Scrapers API using POST /v1/scrapers/run or POST /v1/scrapers/run_async.
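A hedged sketch of triggering a run from Python: the endpoint path and X-API-KEY header come from this page, but the request body schema is defined by the Scrapers API reference and is not documented here, so the payload is left to the caller:

```python
import json
import urllib.request

RUN_URL = "https://api.parsera.org/v1/scrapers/run"


def run_scraper(api_key, body):
    """POST a synchronous run request for a generated scraper.

    'body' is the JSON payload described in the Scrapers API docs;
    its exact fields are out of scope for this page.
    """
    req = urllib.request.Request(
        RUN_URL,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example (requires a real API key and a payload from the Scrapers API docs):
# result = run_scraper("<YOUR_API_KEY>", {...})
```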
Related Documentation
- Extractor API — One-shot extraction endpoints
- Code Mode — Details on code generation and benefits
- Scrapers API — Run scrapers synchronously or asynchronously
