CrawlersProtectionBypass

Crawlers protection bypass settings

Fields

Name               Type                 Description
MaxResponseSizeKb  Int                  Optional. Maximum response size in kilobytes. Default value is 1000.
MaxRedirectHops    Int                  Optional. Maximum number of redirect hops. Default value is 10.
RequestTimeoutSec  Int                  Optional. Request timeout in seconds. Default value is 30.
CrawlDelays        Array of CrawlDelay  Optional. Crawl delays for hosts.

Initialization String Format

An instance can be initialized with a string of the following format: MaxResponseSizeKb: size; MaxRedirectHops: hops; RequestTimeoutSec: timeout

Methods

Methods that help with initialization.

AddCrawlDelay

Adds a new crawl delay from an existing CrawlDelay instance

Syntax
AddCrawlDelay( crawlDelay )
Arguments
Name        Type        Description
crawlDelay  CrawlDelay  Required. CrawlDelay instance
Return type

CrawlersProtectionBypass

Return value

Returns the instance on which it was called
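
As a minimal sketch, a crawl delay can also be passed inline instead of through a separate variable. This assumes the wds types are installed and that CAST converts a string to wds.CrawlDelay through the same initialization string format that DECLARE uses. The variable name @bypass is hypothetical.

```sql
-- Hypothetical variable name; the string follows the documented initialization format.
DECLARE @bypass wds.CrawlersProtectionBypass = 'MaxResponseSizeKb: 1000; MaxRedirectHops: 3; RequestTimeoutSec: 30';

-- Assumption: CAST from a string yields a CrawlDelay the same way a DECLARE initializer does.
SET @bypass = @bypass.AddCrawlDelay(CAST('Host: host1.com; Delay: robots' AS wds.CrawlDelay));
```

If the conversion is not accepted in this position, declaring a wds.CrawlDelay variable first and passing that variable is the form shown in the CrawlDelay examples.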

AddDelay

Adds a new crawl delay for a host from a delay string

Syntax
AddDelay( host, delay )
Arguments
Name   Type    Description
host   String  Required. Host
delay  String  Required. Delay string. See CrawlDelay
Return type

CrawlersProtectionBypass

Return value

Returns the instance on which it was called
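
Because the method returns the instance on which it was called, several delays can probably be added in a single statement by chaining calls. The following is a sketch only: it assumes the wds types are installed and that chained calls behave like the equivalent sequence of separate SET statements. The variable name @bypass is hypothetical.

```sql
-- Hypothetical variable name; the string follows the documented initialization format.
DECLARE @bypass wds.CrawlersProtectionBypass = 'MaxResponseSizeKb: 1000; MaxRedirectHops: 3; RequestTimeoutSec: 30';

-- Assumption: each AddDelay call returns the updated instance, so calls can be chained.
SET @bypass = @bypass.AddDelay('host1.com', '0').AddDelay('host2.com', '1-3');
```

Writing one SET statement per host, as in the examples in this section, remains the form the documentation itself demonstrates.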

Examples

Creating a new instance initialized from a string:

DECLARE @crawlersProtectionBypass wds.CrawlersProtectionBypass = 'MaxResponseSizeKb: 1000; MaxRedirectHops: 3; RequestTimeoutSec: 1';
SET @crawlersProtectionBypass = @crawlersProtectionBypass.AddDelay('host1.com', '0');
SET @crawlersProtectionBypass = @crawlersProtectionBypass.AddDelay('host2.com', '1-3');
SET @crawlersProtectionBypass = @crawlersProtectionBypass.AddDelay('host3.com', 'robots');

SET @jobConfig.CrawlersProtectionBypass = @crawlersProtectionBypass;

Setting the CrawlersProtectionBypass from a string:

SET @jobConfig.CrawlersProtectionBypass = 'MaxResponseSizeKb: 1000; MaxRedirectHops: 3; RequestTimeoutSec: 1';

CrawlDelay

Crawl delay for a host

Fields

Name   Type    Description
host   String  Required. Host
delay  String  Required. Delay string

Remarks

The delay string can be a single number (for example, 0), a range of numbers separated by a dash (for example, 1-5), or the literal value robots.

Initialization String Format

An instance can be initialized with a string of the following format: Host: host; Delay: 0|1-5|robots

Examples

Creating a new instance initialized from a string:

DECLARE @robotsCrawlDelay wds.CrawlDelay = 'Host: host1.com; Delay: robots';
DECLARE @rangeCrawlDelay wds.CrawlDelay = 'Host: host2.com; Delay: 1-5';
DECLARE @noCrawlDelay wds.CrawlDelay = 'Host: host3.com; Delay: 0';

Adding the instances to an existing CrawlersProtectionBypass (here @crawlersProtectionBypass, declared as in the CrawlersProtectionBypass examples above):

SET @crawlersProtectionBypass = @crawlersProtectionBypass.AddCrawlDelay(@robotsCrawlDelay);
SET @crawlersProtectionBypass = @crawlersProtectionBypass.AddCrawlDelay(@rangeCrawlDelay);
SET @crawlersProtectionBypass = @crawlersProtectionBypass.AddCrawlDelay(@noCrawlDelay);
