Overview

The WDS REST API lets you configure crawl and scrape jobs, discover pages, extract data, and monitor task execution through versioned HTTP endpoints.

Base URL

The base URL depends on your deployment method.
Example (Docker): http://localhost:2807
Endpoints live under /api/{version} by default (see links below for concrete routes). In Helm deployments, you can add a base‑path prefix via global.ingress.basePath.
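
As a rough illustration of how these pieces combine, the following sketch assembles an API root from a deployment base URL, an optional base path, and a version segment. The helper name api_root, the version string "v1", and the prefix "/wds" are assumptions made for the example, as is the placement of the base path before /api; check your deployment and the Swagger UI for the actual values.

```python
def api_root(base_url: str, version: str, base_path: str = "") -> str:
    """Join the base URL, an optional base-path prefix, and /api/{version}.

    Hypothetical helper: it only does string assembly and assumes the
    base path (if any) sits between the host and /api.
    """
    parts = [base_url.rstrip("/")]
    prefix = base_path.strip("/")
    if prefix:
        parts.append(prefix)
    parts.append(f"api/{version}")
    return "/".join(parts)


# Docker example from above; "v1" is an assumed version string.
docker_root = api_root("http://localhost:2807", "v1")

# Helm-style deployment with an assumed base path of "/wds".
helm_root = api_root("http://localhost:2807", "v1", base_path="/wds")
```

Stripping slashes before joining means the result has no doubled or missing separators, whichever way the base URL and prefix are written.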

Explore the API

Swagger UI: browse and try endpoints interactively at /api/swagger.
Playground: if deployed, use the test site at /playground/ for predictable, repeatable examples.

Key Resources

Jobs: start a job with a JobConfig, receive initial DownloadTasks.
- Reference: jobs overview and start endpoint in ../jobs.html.
Tasks: operate on tasks to continue the crawl or extract data.
- Crawl: discover follow‑up pages and return new DownloadTasks.
- Scrape: extract text/attributes from a page.
- Scrape Multiple: batch multiple extractions in one request.
- Info: get DownloadTaskStatus (state, errors, request/response details).
- Reference: task endpoints in ../tasks.html.

Typical Flow

Start: POST Jobs Start with JobConfig -> returns initial DownloadTasks (one per Start URL).
Crawl: GET Tasks Crawl with a task + selector -> returns more DownloadTasks.
Scrape: GET Tasks Scrape (or Scrape Multiple) with a task + selector(s) -> returns extracted values.
Monitor: GET Tasks Info for DownloadTaskStatus to check progress and results.
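
The control flow of these steps can be sketched as a small loop. This is a minimal sketch, not the WDS client: the three callables (start, crawl, scrape) are hypothetical stand-ins for whatever code you write to call Jobs Start, Tasks Crawl, and Tasks Scrape, and tasks are assumed to be represented by hashable identifiers. The Monitor step and all HTTP details are left out; see ../jobs.html and ../tasks.html for the real routes and payloads.

```python
from collections import deque
from typing import Callable, Dict, Hashable, Iterable


def run_flow(
    start: Callable[[], Iterable[Hashable]],
    crawl: Callable[[Hashable], Iterable[Hashable]],
    scrape: Callable[[Hashable], object],
    max_tasks: int = 100,
) -> Dict[Hashable, object]:
    """Drive Start -> Crawl -> Scrape over a queue of task identifiers.

    start()      stands in for Jobs Start and yields the initial tasks.
    crawl(task)  stands in for Tasks Crawl and yields follow-up tasks.
    scrape(task) stands in for Tasks Scrape and returns extracted values.
    """
    pending = deque(start())
    seen = set()
    results: Dict[Hashable, object] = {}
    while pending and len(seen) < max_tasks:
        task = pending.popleft()
        if task in seen:
            continue  # the same task may be discovered from several pages
        seen.add(task)
        results[task] = scrape(task)
        pending.extend(crawl(task))
    return results


# Usage with in-memory fakes instead of HTTP calls.
fake_links = {"a": ["b", "c"], "b": ["c"], "c": []}
flow_results = run_flow(
    start=lambda: ["a"],
    crawl=lambda task: fake_links[task],
    scrape=lambda task: task.upper(),
)
```

Passing the API calls in as functions keeps the loop testable without a running server. The max_tasks cap is a safety limit added for this sketch, not a WDS setting.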
