Crawl
Executes a download task and collects links to subsequent pages of the web resource.
Syntax
wds.Crawl( downloadTask, selector )
Arguments
Name | Type | Description |
---|---|---|
downloadTask | DownloadTask | Required. A download task taken from the result set of a previous command |
selector | String | Required. Selector that picks the links of interest on the downloaded web page |
Remarks
The selector argument has the following format: CSS|XPATH: selector. The first part defines the selector type; the second part is a selector written in the syntax of that type.
Supported types: CSS, XPATH
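As an illustration of the selector format, the sketch below passes an XPath selector instead of a CSS one. It is not taken from the product documentation: it assumes that the `xpath:` prefix is accepted in lowercase (as `css:` is in the documented example) and that the XPath expression should select the anchor elements themselves rather than their `href` attributes. The job name, server address, and start URL are reused from the example in this page.

```sql
-- Sketch only: the XPath expression and the lowercase 'xpath:' prefix are assumptions.
DECLARE @jobConfig wds.JobConfig = 'JobName: TestJob1; Server: wds://localhost:2807; StartUrls: http://playground.svc';

SELECT
    nav.Task.Url AS URL
FROM wds.Start(@jobConfig) root
-- Select every <a> inside a <ul> whose class list contains the token "nav",
-- which mirrors the CSS selector 'ul.nav a'.
OUTER APPLY wds.Crawl(
    root.Task,
    'xpath: //ul[contains(concat(" ", normalize-space(@class), " "), " nav ")]//a'
) nav;
```

The `contains(concat(...))` idiom is used because a plain `@class="nav"` test would only match elements whose class attribute is exactly `nav`, whereas the CSS selector `ul.nav` also matches elements that carry additional classes.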
Return type
TABLE (Task wds.DownloadTask)
Return value
A table of DownloadTask values, one row per URL found.
Examples
Creating a job and getting download tasks for all sidebar links on the index page of the Playground:
DECLARE @jobConfig wds.JobConfig = 'JobName: TestJob1; Server: wds://localhost:2807; StartUrls: http://playground.svc';
SELECT
    nav.Task.Url AS URL
FROM wds.Start(@jobConfig) root
OUTER APPLY wds.Crawl(root.Task, 'css: ul.nav a') nav;
URL |
---|
http://playground.svc/ |
http://playground.svc/armor_and_accessories/1/ |
http://playground.svc/beast_and_creature_items/1/ |
http://playground.svc/elemental_and_nature_items/1/ |
http://playground.svc/magical_artifacts/1/ |
http://playground.svc/potions_and_elixirs/1/ |
http://playground.svc/rings_and_amulets/1/ |
http://playground.svc/wands_and_staffs/1/ |
http://playground.svc/weapons/1/ |
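Because wds.Crawl takes a download task from a previous result set and returns new download tasks, calls can presumably be chained to go one level deeper. The following is a minimal sketch under that assumption, not a documented recipe: the second selector, `ul.pagination a`, is hypothetical and stands for whatever markup the target pages use for their paging links, and the job name `TestJob2` is likewise made up for illustration.

```sql
-- Sketch only: 'ul.pagination a' and the job name are hypothetical.
DECLARE @jobConfig wds.JobConfig = 'JobName: TestJob2; Server: wds://localhost:2807; StartUrls: http://playground.svc';

SELECT
    nav.Task.Url   AS CategoryUrl,
    pages.Task.Url AS PageUrl
FROM wds.Start(@jobConfig) root
-- First level: sidebar links on the index page.
CROSS APPLY wds.Crawl(root.Task, 'css: ul.nav a') nav
-- Second level: paging links found on each sidebar target page.
CROSS APPLY wds.Crawl(nav.Task, 'css: ul.pagination a') pages;
```

CROSS APPLY drops any row for which the applied function returns no rows, so categories without paging links would not appear in this result; OUTER APPLY, as used in the example above, would keep them with NULL in the `PageUrl` column.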