GetCrawlMdrData Tool

The GetCrawlMdrData tool retrieves batches of scraped data from a crawling job performed by the CrawlMdr tool. It fetches results incrementally through a data cursor, so large datasets can be processed one batch at a time. Each call returns the scraped data in JSON format, together with a cursor for fetching the next batch when more data is available.

Arguments

| Name | Type | Description |
| --- | --- | --- |
| dataCursor | CrawlMdrDataCursor | Required. Cursor from CrawlMdrResult for fetching data batches. |
| downloadTasksCount | Int | Required. Number of download tasks to process in this request. In most cases, one download task corresponds to one document. If a table was handled, each download task returns as many documents as the table contained. Additionally, if there are multiple data objects on each level, the document count is the product of the counts on all levels. |
| path | String | Required. Path to a level in the MDR tree. It should start with / and contain at least one step; steps are separated by /. The path must not end with /. |
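
The constraints on path and the document-count rule for downloadTasksCount can be sketched in Python as follows. This is a minimal illustration, not part of the tool: the function names validate_path and estimate_document_count are hypothetical, rejecting empty steps such as "/a//b" is an assumption drawn from "at least one step", and the count estimate assumes every download task has the same number of data objects on each level.

```python
import math
from typing import Sequence


def validate_path(path: str) -> None:
    """Raise ValueError if `path` breaks the rules stated for the path argument."""
    if not path.startswith("/"):
        raise ValueError("path should start with '/'")
    if path.endswith("/"):
        # This also rejects a bare "/", which contains no step at all.
        raise ValueError("path must not end with '/'")
    steps = path[1:].split("/")
    # Assumption: an empty step (two consecutive slashes) is invalid.
    if any(step == "" for step in steps):
        raise ValueError("path must not contain empty steps")


def estimate_document_count(download_tasks_count: int,
                            objects_per_level: Sequence[int]) -> int:
    """Tasks multiplied by the product of data-object counts on each level.

    Assumes all download tasks share the same per-level counts.
    """
    return download_tasks_count * math.prod(objects_per_level)
```

For example, under these assumptions a request with downloadTasksCount of 2, where each task has 3 data objects on one level and 4 on the next, would be expected to yield 2 × 3 × 4 = 24 documents.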

Return Type

Returns a CrawlMdrData object containing the scraped data and a cursor for the next batch.

| Name | Type | Description |
| --- | --- | --- |
| Data | Array of String | Required. Array of scraped data objects in JSON format. |
| DataCursor | CrawlMdrDataCursor | Optional. Cursor for fetching the next batch of data (null if no more data). |
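
Because Data is an array of strings in JSON format, each entry typically needs to be parsed before use. The sketch below assumes that the result arrives as a dictionary keyed by the field names above and that each string is one complete JSON document; the decode_batch helper and the sample values (title, price) are hypothetical and written by hand for illustration.

```python
import json
from typing import Any, Dict, List


def decode_batch(result: Dict[str, Any]) -> List[Any]:
    """Parse each JSON string in the Data array into a Python object."""
    return [json.loads(item) for item in result["Data"]]


# Hypothetical response, hand-written for illustration (not real tool output).
sample = {
    "Data": ['{"title": "A", "price": 10}', '{"title": "B", "price": 12}'],
    "DataCursor": None,
}
documents = decode_batch(sample)
```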

CrawlMdrDataCursor

| Name | Type | Description |
| --- | --- | --- |
| JobId | String | Required. Job ID. |
| NextCursor | String | Optional. Cursor for fetching the next batch of scraped data (null if done). |
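
The incremental fetching described above can be sketched as a loop that keeps requesting batches until the cursor reports that no data is left. How the tool is actually invoked is not specified here, so call_tool is a hypothetical callable that takes the three arguments as a dictionary and returns a CrawlMdrData-shaped dictionary; iter_batches is likewise an illustrative name. The loop stops when DataCursor is null and, as an assumption, also when NextCursor is null.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: takes the tool arguments, returns a
# CrawlMdrData-shaped dict with "Data" and "DataCursor" keys.
ToolCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_batches(call_tool: ToolCall, cursor: Dict[str, Any], path: str,
                 download_tasks_count: int) -> Iterator[List[str]]:
    """Yield each batch's Data array until the cursor reports no more data."""
    current: Optional[Dict[str, Any]] = cursor
    while current is not None:
        result = call_tool({
            "dataCursor": current,
            "downloadTasksCount": download_tasks_count,
            "path": path,
        })
        yield result["Data"]
        # DataCursor is null when there is no more data.
        current = result.get("DataCursor")
        # Assumption: a null NextCursor also means the job is exhausted.
        if current is not None and current.get("NextCursor") is None:
            current = None
```

Writing the loop as a generator lets the caller process one batch at a time instead of holding the whole dataset in memory.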
