Agents API (Deprecated)

⚠️ DEPRECATED: This API is deprecated and will be removed in a future version. Please migrate to the new Scrapers API, which uses the /v1/scrapers/* endpoints on api.parsera.org.

The new Scrapers API offers the same functionality with improved performance and a unified base URL.

Legacy Agents API

The legacy Agents API (agents.parsera.org) generates reusable custom scrapers. Using it involves two main steps:

  1. Call generate to build a scraper;
  2. Call scrape to run that scraper on a specific URL.
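The two steps above can be sketched in Python as a pair of prepared requests. This is a minimal illustration that mirrors the curl examples on this page, not an official client: the helper names build_request and send are hypothetical, and send assumes the service answers with a JSON body, which this page does not specify.

```python
import json
import urllib.request

BASE_URL = "https://agents.parsera.org/v1"  # legacy base URL used on this page


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a legacy endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )


def send(request: urllib.request.Request) -> object:
    """Send a prepared request. Assumption: the response body is JSON."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Step 1: describe the scraper to build.
generate_request = build_request(
    "generate",
    {
        "name": "hackernews",
        "url": "https://news.ycombinator.com/",
        "attributes": {"title": "News title", "points": "Number of points"},
    },
    api_key="<YOUR_API_KEY>",
)

# Step 2: run the named scraper on a specific URL.
scrape_request = build_request(
    "scrape",
    {"name": "hackernews", "url": "https://news.ycombinator.com/front?day=2024-09-11"},
    api_key="<YOUR_API_KEY>",
)
```

Nothing is sent until send() is called, for example send(generate_request) followed by send(scrape_request).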

generate

Ask the agent to build a new scraper:

curl https://agents.parsera.org/v1/generate \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: <YOUR_API_KEY>' \
--data '{
    "name": "hackernews",
    "url": "https://news.ycombinator.com/",
    "attributes": {
        "title": "News title",
        "points": "Number of points"
    }
}'

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | - | Name of the agent |
| url | string | - | Website URL |
| prompt | string | "" | Prompt for initial scraping |
| attributes | object | - | A map of name - description pairs of data fields to extract from the webpage. Also, you can specify Output Types. |
| proxy_country | string | UnitedStates | Proxy country, see Proxy Countries |
| cookies | array | Empty | Cookies to use during extraction, see Cookies |
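As a rough sketch of how the parameters above fit together, the hypothetical helper below assembles a generate request body. It includes the optional fields only when the caller sets them, on the assumption that omitted fields fall back to the defaults listed above. The cookie format is described on the Cookies page, so the list is passed through unchanged.

```python
from typing import Dict, List, Optional


def generate_payload(
    name: str,
    url: str,
    attributes: Dict[str, str],
    prompt: Optional[str] = None,
    proxy_country: Optional[str] = None,
    cookies: Optional[List] = None,
) -> dict:
    """Build the JSON body for generate (hypothetical helper, not part of the API)."""
    # Required fields are always present.
    payload = {"name": name, "url": url, "attributes": attributes}
    # Optional fields are added only when set, so that the documented
    # defaults apply otherwise (assumption: omitted means default).
    optional = {"prompt": prompt, "proxy_country": proxy_country, "cookies": cookies}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


body = generate_payload(
    name="hackernews",
    url="https://news.ycombinator.com/",
    attributes={"title": "News title", "points": "Number of points"},
    prompt="Only front-page stories",  # illustrative prompt text
)
```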

scrape

Apply an existing scraper to a webpage:

curl https://agents.parsera.org/v1/scrape \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: <YOUR_API_KEY>' \
--data '{
    "name": "hackernews",
    "url": "https://news.ycombinator.com/front?day=2024-09-11"
}'

You can access pre-built agents by prepending public/ to the agent name, for example, public/crunchbase to access the Crunchbase agent.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | - | Name of the agent |
| url | string | - | URL of the webpage to extract data from |
| proxy_country | string | UnitedStates | Proxy country, see Proxy Countries |
| cookies | array | Empty | Cookies to use during extraction, see Cookies |
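The public/ naming convention for pre-built agents can be illustrated with a small helper. This is a sketch only: public_agent and scrape_payload are hypothetical names, and the target URL is a placeholder rather than a page known to work with the crunchbase agent.

```python
from typing import Dict

PUBLIC_PREFIX = "public/"


def public_agent(name: str) -> str:
    """Return the name that addresses a pre-built agent, e.g. 'public/crunchbase'."""
    # Leave the name alone if it already carries the prefix.
    return name if name.startswith(PUBLIC_PREFIX) else PUBLIC_PREFIX + name


def scrape_payload(name: str, url: str) -> Dict[str, str]:
    """Build the minimal JSON body for scrape: the two required fields."""
    return {"name": name, "url": url}


payload = scrape_payload(
    public_agent("crunchbase"),
    "https://www.crunchbase.com/organization/example",  # placeholder URL
)
```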

list

List all available agents:

curl --location 'https://agents.parsera.org/v1/list' \
--header 'X-API-KEY: <YOUR_API_KEY>'

remove

Remove an existing agent:

curl --location 'https://agents.parsera.org/v1/remove' \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: <YOUR_API_KEY>' \
--data '{
    "name": "hackernews"
}'

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | - | Name of the agent |
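For comparison with the curl commands, here is one way the list and remove calls might be prepared in Python. The function names are hypothetical. The list call is built as a GET because the curl example sends no body, and the shape of either response is not documented on this page, so the sketch stops at building the requests.

```python
import json
import urllib.request

BASE_URL = "https://agents.parsera.org/v1"  # legacy base URL used on this page


def list_request(api_key: str) -> urllib.request.Request:
    """Prepare GET /v1/list; only the API key header is needed."""
    return urllib.request.Request(
        f"{BASE_URL}/list",
        headers={"X-API-KEY": api_key},
        method="GET",
    )


def remove_request(name: str, api_key: str) -> urllib.request.Request:
    """Prepare POST /v1/remove with the agent name as a JSON body."""
    return urllib.request.Request(
        f"{BASE_URL}/remove",
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )


# Either request can be sent with urllib.request.urlopen(...).
to_list = list_request("<YOUR_API_KEY>")
to_remove = remove_request("hackernews", "<YOUR_API_KEY>")
```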

Migrating to Scrapers API

To migrate from the Agents API to the new Scrapers API:

  1. Use api.parsera.org instead of agents.parsera.org
  2. Replace /v1/generate with /v1/scrapers/generate and use template_id instead of name
  3. Replace /v1/scrape with /v1/scrapers/run and use template_id instead of name
  4. Replace /v1/list with GET /v1/scrapers
  5. Update the attributes format from a map to a list (see Output Types)
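The URL changes in steps 1 to 4 can be captured in a small lookup, sketched below with a hypothetical migrate_url helper. It rewrites only the host and path. The switch from name to template_id and the new attributes format are left out, since their details live in the Scrapers API documentation. This page names no replacement for /v1/remove, so the helper raises an error for it instead of guessing.

```python
from urllib.parse import urlsplit, urlunsplit

NEW_HOST = "api.parsera.org"

# Legacy path -> new path, taken from the migration steps on this page.
PATH_MAP = {
    "/v1/generate": "/v1/scrapers/generate",
    "/v1/scrape": "/v1/scrapers/run",
    "/v1/list": "/v1/scrapers",  # called with GET in the new API
}


def migrate_url(legacy_url: str) -> str:
    """Rewrite a legacy Agents API URL to its Scrapers API equivalent."""
    parts = urlsplit(legacy_url)
    if parts.path not in PATH_MAP:
        raise ValueError(f"No documented replacement for {parts.path}")
    return urlunsplit(
        (parts.scheme, NEW_HOST, PATH_MAP[parts.path], parts.query, parts.fragment)
    )


new_url = migrate_url("https://agents.parsera.org/v1/scrape")
```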

For complete documentation on the new API, see Scrapers API.
