CrawlMdrConfig Tools

CrawlMdrConfig tools are a set of functions for building and updating multi-dimensional recursive crawling configurations. They allow you to define and modify the structure, crawl parameters, and scraping rules for complex web resource extraction jobs.

Each tool returns a new or modified CrawlMdrConfig object. Pass the returned object to the next tool call as its required crawlMdrConfig argument.
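
The chaining pattern described above can be sketched as follows. This is a minimal illustration in Python, not the actual tool implementation: the functions `create` and `upsert_sub`, the dictionary layout of the config, and the example paths and selectors are all assumptions made for this example. Only the tool names and argument names come from this page.

```python
import copy


def create():
    """Hypothetical stand-in for CrawlMdrConfigCreate: an empty config with path /."""
    return {"path": "/", "subs": {}}


def upsert_sub(crawl_mdr_config, path, selector, attribute_name="href"):
    """Hypothetical stand-in for CrawlMdrConfigUpsertSub.

    Works on a copy (an assumption of this sketch) and returns it, so the
    result of one call can be handed to the next call.
    """
    updated = copy.deepcopy(crawl_mdr_config)
    updated["subs"][path] = {"selector": selector, "attributeName": attribute_name}
    return updated


# Each call receives the object returned by the previous call.
config = create()
config = upsert_sub(config, "/products", "a.product-link")
config = upsert_sub(config, "/products/reviews", "a.review-link")
```

The point of the sketch is only the data flow: no tool other than the create step starts from scratch, and every later step takes the previous result as input.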

CrawlMdrConfigCreate

Creates a new, empty CrawlMdrConfig object with the path /.

Arguments

None

CrawlMdrConfigUpsertSub

Adds or updates a sub-level in the MDR tree for crawling.

Arguments

| Name | Type | Description |
| --- | --- | --- |
| crawlMdrConfig | Object | Required. MDR configuration object from the previous tool call. |
| path | String | Required. Path to a level in the MDR tree. It should start with / and contain at least one step. Steps are separated by /. The path must not end with /. |
| selector | String | Required. A valid CSS or XPath selector. |
| attributeName | String | Optional. Attribute name to get data from. Use val to get inner text. Default value: href |
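
The path rules listed for the path argument can be checked before calling a tool. The following is a rough sketch in Python; the function name `is_valid_mdr_path` is hypothetical, and rejecting empty steps (as in `/a//b`) is an assumption, since this page does not say how such paths are treated.

```python
def is_valid_mdr_path(path: str) -> bool:
    """Check a path against the rules stated for the path argument."""
    # Must start with /
    if not path.startswith("/"):
        return False
    # Must not end with / (this also rejects "/" itself, which has no steps)
    if path.endswith("/"):
        return False
    # Steps are separated by /; empty steps are rejected (assumption of this sketch)
    steps = path[1:].split("/")
    return all(step != "" for step in steps)


print(is_valid_mdr_path("/products/reviews"))
print(is_valid_mdr_path("/products/"))
```

Under these rules `/products/reviews` passes, while `/products/`, `products`, and `/` are rejected.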

CrawlMdrConfigUpsertCrawlParams

Adds or updates crawl parameters for a specific MDR tree level.

Arguments

| Name | Type | Description |
| --- | --- | --- |
| crawlMdrConfig | Object | Required. MDR configuration object from the previous tool call. |
| path | String | Required. Path to a level in the MDR tree. It should start with / and contain at least one step. Steps are separated by /. The path must not end with /. |
| selector | String | Required. A valid CSS or XPath selector. |
| attributeName | String | Optional. Attribute name to get data from. Use val to get inner text. Default value: href |
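
"Adds or updates" can be read as upsert semantics keyed by path: the first call for a path adds an entry, and a later call for the same path changes it. The sketch below illustrates that reading in Python. The function `upsert_crawl_params`, the flat dictionary used to hold levels, and the example path and selectors are hypothetical, and replacing the whole entry on update (rather than merging) is an assumption not confirmed by this page.

```python
def upsert_crawl_params(levels, path, selector, attribute_name="href"):
    """Hypothetical model of an upsert: add the entry for path, or overwrite it."""
    updated = dict(levels)  # leave the input untouched
    updated[path] = {"selector": selector, "attributeName": attribute_name}
    return updated


levels = {}
# First call for /categories adds the crawl parameters (CSS selector).
levels = upsert_crawl_params(levels, "/categories", "a.next-page")
# Second call for the same path updates them (XPath selector).
levels = upsert_crawl_params(levels, "/categories", "//a[@rel='next']")
```

After both calls there is still a single entry for `/categories`, now holding the XPath selector and the default attribute name href.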

CrawlMdrConfigUpsertScrapeParams

Adds or updates a scrape parameter for a field on a specific MDR tree level.

Arguments

| Name | Type | Description |
| --- | --- | --- |
| crawlMdrConfig | Object | Required. MDR configuration object from the previous tool call. |
| path | String | Required. Path to a level in the MDR tree. It should start with / and contain at least one step. Steps are separated by /. The path must not end with /. |
| fieldName | String | Required. Name of the data field that will contain the data scraped with the provided selector and attribute name. |
| selector | String | Required. A valid CSS or XPath selector. |
| attributeName | String | Optional. Attribute name to get data from. Use val to get inner text. Default value: val |
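
The attributeName argument decides what is read from a matched element: val means the inner text, and any other value names an attribute. A small Python sketch of that rule follows. The `Element` class and `read_value` function are hypothetical helpers for illustration and are not part of the tools; returning `None` for a missing attribute is likewise an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for an element matched by a selector."""
    text: str
    attributes: dict = field(default_factory=dict)


def read_value(element, attribute_name="val"):
    """Return inner text for val, otherwise the named attribute (None if absent)."""
    if attribute_name == "val":
        return element.text
    return element.attributes.get(attribute_name)


link = Element(text="Blue Widget", attributes={"href": "/items/42"})
title = read_value(link)        # default val: inner text, as for scrape parameters
url = read_value(link, "href")  # named attribute, the default for crawl levels
```

This also shows why the defaults differ between tools: crawl levels usually need a link target (href), while scraped fields usually need the visible text (val).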
