Parsera

Agent

AI-powered data extraction using an agent that browses websites, navigates pages, and extracts structured data. Ideal for complex sites that require interaction or when you don't know the exact page structure.

The Agent API supports two modes:

  • One-shot extraction — extract data from a URL in a single request, no scraper is saved
  • Reusable scrapers — build a persistent scraper that can be run repeatedly on similar pages

One-Shot Extraction

Extract data with an AI agent in a single request. The agent browses the page, extracts the requested data, and returns it. No scraper is created — results are temporary.

Start Extraction

Endpoint: POST /v1/agent/extract

curl https://api.parsera.org/v1/agent/extract \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: <YOUR_API_KEY>' \
--data '{
    "url": "https://news.ycombinator.com/",
    "prompt": "Extract the top 10 news titles with their scores and comment counts",
    "attributes": [
        {
            "name": "title",
            "description": "News title"
        },
        {
            "name": "score",
            "description": "Number of points"
        },
        {
            "name": "comments",
            "description": "Number of comments"
        }
    ]
}'

Parameters:

Parameter     Type    Default  Description
url           string  -        URL for the agent to start from
prompt        string  -        Instruction describing what to extract
attributes    array   null     Optional list of attribute objects with name and description fields. See Output Types
callback_url  string  null     URL to POST results to when extraction completes

Response (202 Accepted):

{
    "task_id": "agent-extract-abc123",
    "status": "pending"
}

Poll Extraction Status

Poll for results using the task_id:

Endpoint: GET /v1/agent/extract/{task_id}

curl https://api.parsera.org/v1/agent/extract/agent-extract-abc123 \
--header 'X-API-KEY: <YOUR_API_KEY>'

Response (in progress):

{
    "task_id": "agent-extract-abc123",
    "status": "running"
}

Response (completed):

{
    "task_id": "agent-extract-abc123",
    "status": "completed",
    "data": [
        {
            "title": "Show HN: A new approach to web scraping",
            "score": "142",
            "comments": "58"
        }
    ]
}

Statuses: pending, running, completed, failed

If you provided a callback_url, results are also POSTed there when ready.
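The submit-then-poll flow above can be sketched as a small Python client. This is an illustration using only the standard library; the endpoints, headers, and status values are the ones shown above, but the poll interval and timeout are arbitrary choices:

```python
import json
import time
import urllib.request

API_KEY = "<YOUR_API_KEY>"
BASE = "https://api.parsera.org/v1/agent/extract"
TERMINAL_STATUSES = {"completed", "failed"}


def is_terminal(status):
    # A task is done once it leaves pending/running.
    return status in TERMINAL_STATUSES


def _request(url, payload=None):
    # POST when a payload is given, GET otherwise.
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, headers={
        "Content-Type": "application/json",
        "X-API-KEY": API_KEY,
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def extract(url, prompt, attributes=None, poll_interval=5, timeout=300):
    # Start the extraction, then poll until a terminal status or timeout.
    payload = {"url": url, "prompt": prompt}
    if attributes:
        payload["attributes"] = attributes
    task_id = _request(BASE, payload)["task_id"]
    deadline = time.time() + timeout
    while time.time() < deadline:
        result = _request(f"{BASE}/{task_id}")
        if is_terminal(result["status"]):
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"task {task_id} still running after {timeout}s")
```

If you pass a callback_url instead, you can skip the polling loop entirely and handle the POSTed result in your own endpoint.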

Reusable Scrapers

Instead of extracting data one time, you can build a reusable agentic scraper. During the build process, the AI agent browses your target site and creates a persistent script that can interact with pages — navigating, clicking, scrolling, and extracting data.

Once built, the scraper can be run repeatedly on similar URLs via the Scrapers API, without invoking the AI agent again.

Important: A reusable scraper is tied to the layout and patterns of the site it was built on. If the target site changes its structure, the scraper may stop working correctly and will need to be regenerated.

Build Scraper

Build an agentic scraper by providing a URL and instructions. The AI agent will browse the site, learn its layout, and create a reusable extraction script.

Endpoint: POST /v1/agent/generate

curl https://api.parsera.org/v1/agent/generate \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: <YOUR_API_KEY>' \
--data '{
    "url": "https://news.ycombinator.com/",
    "prompt": "Extract news titles with their scores",
    "attributes": [
        {
            "name": "title",
            "description": "News title"
        },
        {
            "name": "score",
            "description": "Number of points"
        }
    ]
}'

Parameters:

Parameter    Type    Default  Description
template_id  string  null     Existing scraper ID. If omitted, a new scraper is created automatically
url          string  -        URL for the agent to build on
prompt       string  -        Instruction describing what to extract
attributes   array   null     Optional list of attribute objects with name and description fields. See Output Types

Response (202 Accepted):

{
    "template_id": "abc123",
    "status": "generating"
}

The build process runs in the background. Poll the scraper details endpoint to check progress.
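This page doesn't spell out the path of the scraper details endpoint, so the snippet below assumes it follows the common pattern GET /v1/scrapers/{template_id}; confirm the real path in the Manage Scrapers reference before relying on it. Given that assumption, waiting for a build to finish might look like:

```python
import json
import time
import urllib.request

API_KEY = "<YOUR_API_KEY>"


def is_building(status):
    # The generate endpoint reports "generating" while the build is in progress.
    return status == "generating"


def wait_until_built(template_id, poll_interval=10, timeout=600):
    # Hypothetical details URL; check the Manage Scrapers reference for the real path.
    url = f"https://api.parsera.org/v1/scrapers/{template_id}"
    deadline = time.time() + timeout
    while time.time() < deadline:
        req = urllib.request.Request(url, headers={"X-API-KEY": API_KEY})
        with urllib.request.urlopen(req) as resp:
            details = json.load(resp)
        if not is_building(details.get("status", "")):
            return details
        time.sleep(poll_interval)
    raise TimeoutError(f"scraper {template_id} still generating after {timeout}s")
```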

Running Reusable Scrapers

Once an agentic scraper is built and ready, run it via the Scrapers API using POST /v1/scrapers/run or POST /v1/scrapers/run_async.
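As a sketch of calling the run endpoint from Python: the exact request schema for POST /v1/scrapers/run is not shown on this page, so the template_id and url fields below are assumptions to be checked against the Scrapers API reference:

```python
import json
import urllib.request

API_KEY = "<YOUR_API_KEY>"


def build_run_payload(template_id, url):
    # Assumed body shape for POST /v1/scrapers/run; see the Scrapers API
    # reference for the authoritative schema.
    return {"template_id": template_id, "url": url}


def run_scraper(template_id, url):
    req = urllib.request.Request(
        "https://api.parsera.org/v1/scrapers/run",
        data=json.dumps(build_run_payload(template_id, url)).encode(),
        headers={"Content-Type": "application/json", "X-API-KEY": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For long-running pages, POST /v1/scrapers/run_async plus polling follows the same pattern as the one-shot extraction flow above.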

Manage Scrapers

See Manage Scrapers for the full management reference (create, get details, delete).
