Crawl Tool

Finds links on the current page that match a selector and returns a new download task for each one, so the crawl can continue. Supports notifications on retries.

Arguments

Name	Type	Description
task	DownloadTask	Required. A task from the previous Start or Crawl tool response
selector	string	Required. Selector that picks the links to follow on the page (see Remarks for the format)
attributeName	string	Optional. Name of the attribute to read from each matched element. Use `val` to read the inner text instead. Default value: `href`
maxDepth	int	Optional. Maximum crawl depth, measured on the URL path (‘example.com’ = 0, ‘example.com/index.html’ = 0, ‘example.com/path/’ = 1, etc). Must be a non-negative integer. If null, the depth is not limited
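
The depth rule for maxDepth can be illustrated with a small sketch. It assumes that depth equals the number of directory segments in the URL path, which matches the three examples in the table; the function names `url_path_depth` and `within_max_depth` are hypothetical, and treating the limit as inclusive is also an assumption, not something this page states.

```python
from typing import Optional
from urllib.parse import urlsplit


def url_path_depth(url: str) -> int:
    """Count directory segments in the URL path (assumed interpretation)."""
    # urlsplit only recognises the host when a scheme or leading '//' is present.
    if "://" not in url:
        url = "//" + url
    path = urlsplit(url).path
    # Everything before the final '/' is treated as directories; the last
    # piece is a file name, or empty when the path ends with '/'.
    directories = path.split("/")[:-1]
    return sum(1 for segment in directories if segment)


def within_max_depth(url: str, max_depth: Optional[int]) -> bool:
    """Assumption: a null maxDepth means no limit and the limit is inclusive."""
    return max_depth is None or url_path_depth(url) <= max_depth
```

Under this reading, a path without a trailing slash such as ‘example.com/path’ counts as depth 0, because its last segment is treated as a file name. The tool itself may classify that case differently.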

Remarks

The selector argument has the format `CSS|XPATH: selector`. The part before the colon names the selector type, and the part after it is a selector written in that type's syntax. Supported types:

CSS
XPATH
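
To make the selector format concrete, here is a minimal sketch of how a client might validate a selector string before sending it. The helper `parse_selector` is hypothetical and is not part of the tool. Accepting the type prefix in any letter case is an assumption made for leniency. The string is split on the first colon only, because XPath expressions can contain colons themselves (for example in axis names).

```python
SUPPORTED_TYPES = ("CSS", "XPATH")


def parse_selector(value: str) -> tuple[str, str]:
    """Split 'TYPE: expression' into (TYPE, expression)."""
    # partition() splits on the first ':' only, so colons inside the
    # expression (e.g. 'following-sibling::a') are left untouched.
    kind, separator, expression = value.partition(":")
    kind = kind.strip().upper()  # assumption: type prefix is case-insensitive
    expression = expression.strip()
    if not separator or kind not in SUPPORTED_TYPES or not expression:
        raise ValueError(f"expected 'CSS: ...' or 'XPATH: ...', got {value!r}")
    return kind, expression
```

For example, `parse_selector("CSS: a.next")` yields the type `CSS` and the expression `a.next`, while a string with no recognised prefix raises `ValueError`.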

DownloadTask

Represents a single page download request produced by a crawl or scrape job.

Fields:

Name	Type	Description
Id	string	Required. Task Id
Url	string	Required. Page URL
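
As an illustration of how these fields combine with the arguments above, the sketch below models a DownloadTask and assembles the arguments for a Crawl call. The class layout, the `build_crawl_arguments` helper, and the use of a plain dictionary as the payload are assumptions for illustration; this page does not specify a wire format, and the task Id shown in the usage note is a placeholder.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DownloadTask:
    Id: str   # task Id, as returned by a previous Start or Crawl call
    Url: str  # page URL


def build_crawl_arguments(
    task: DownloadTask,
    selector: str,
    attribute_name: str = "href",     # documented default
    max_depth: Optional[int] = None,  # None means no depth limit
) -> dict:
    """Assemble the Crawl arguments as a dictionary (hypothetical helper)."""
    if max_depth is not None and max_depth < 0:
        raise ValueError("maxDepth must be a non-negative integer or null")
    return {
        "task": asdict(task),
        "selector": selector,
        "attributeName": attribute_name,
        "maxDepth": max_depth,
    }
```

A call such as `build_crawl_arguments(DownloadTask("task-1", "https://example.com/"), "CSS: a", max_depth=2)` produces a dictionary whose keys mirror the argument names in the Arguments table.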

Return Type

Array of DownloadTask
