JobConfig Tools

The JobConfig* tools build and update the job configuration objects passed to the StartJob tool. They cover the parameters a job needs, including start URLs, HTTP headers, cookies, proxy settings, download error handling, and crawl limits.

Each tool returns a new or modified JobConfig object. The returned JobConfig object is passed to the next tool call as a required input parameter.
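The chaining pattern described above can be sketched locally. The functions below are hypothetical stand-ins for JobConfigCreate, JobConfigAddStartUrl, and JobConfigSetJobType. The field names inside the object (jobName, startUrls, jobType) are assumptions for illustration, since this page does not describe the JobConfig schema.

```python
import copy
from typing import Any, Dict

JobConfig = Dict[str, Any]


def job_config_create(job_name: str, start_url: str) -> JobConfig:
    """Local stand-in for JobConfigCreate (field names are assumed)."""
    return {"jobName": job_name, "startUrls": [start_url]}


def job_config_add_start_url(job_config: JobConfig, start_url: str) -> JobConfig:
    """Local stand-in for JobConfigAddStartUrl; returns a modified copy."""
    updated = copy.deepcopy(job_config)
    updated["startUrls"].append(start_url)
    return updated


def job_config_set_job_type(job_config: JobConfig, job_type: str) -> JobConfig:
    """Local stand-in for JobConfigSetJobType; returns a modified copy."""
    if job_type not in ("Internet", "Intranet"):
        raise ValueError(f"unsupported job type: {job_type!r}")
    updated = copy.deepcopy(job_config)
    updated["jobType"] = job_type
    return updated


# Each call receives the object returned by the previous call.
config = job_config_create("docs-crawl", "https://example.com/")
config = job_config_add_start_url(config, "https://example.com/blog/")
config = job_config_set_job_type(config, "Internet")
```

The final value of config is what would be handed to StartJob. An intermediate result that is not passed on is simply lost.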

JobConfigCreate

Creates a new job configuration object.

Arguments

Name Type Description
jobName String Required. Unique job name
startUrl String Required. Initial crawling entry point URL

JobConfigAddStartUrl

Adds a new start URL to an existing job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
startUrl String Required. Additional start URL

JobConfigSetJobType

Sets the job type for the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
jobType String Required. Job type (“Internet” or “Intranet”)

JobConfigHeadersUpsertDefaultHeader

Adds or updates a default HTTP header in the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
headerName String Required. Header name
headerValue String Required. Header value
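"Adds or updates" can be read as upsert semantics: a new header name is added, and an existing one has its value replaced. The sketch below illustrates that behavior on a plain dictionary. The function name is hypothetical, and matching names case-insensitively is an assumption based on HTTP field names being case-insensitive; the actual tool may compare names differently.

```python
from typing import Dict


def upsert_default_header(headers: Dict[str, str], name: str, value: str) -> Dict[str, str]:
    """Return a copy of headers with the given header added or replaced.

    Assumption: an existing header is matched case-insensitively.
    """
    updated = {k: v for k, v in headers.items() if k.lower() != name.lower()}
    updated[name] = value
    return updated


headers: Dict[str, str] = {}
headers = upsert_default_header(headers, "User-Agent", "example-crawler/1.0")
headers = upsert_default_header(headers, "Accept-Language", "en")
# Same header in different case: the earlier value is replaced, not duplicated.
headers = upsert_default_header(headers, "user-agent", "example-crawler/2.0")
```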

JobConfigRestartSetJobRestartMode

Sets the job restart mode for the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
jobRestartMode String Required. Restart mode (“Continue” or “FromScratch”)

JobConfigHttpsSetSuppressHttpsCertificateValidation

Sets whether to suppress HTTPS certificate validation.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
suppressHttpsCertificateValidation Bool Required. Suppress HTTPS certificate validation

JobConfigCookiesSetUseCookies

Sets whether to use cookies for requests in the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
useCookies Bool Required. Use cookies

JobConfigProxySetUseProxy

Sets whether to use a proxy for requests in the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
useProxy Bool Required. Use proxy

JobConfigProxySetSendOvertRequestsOnProxiesFailure

Sets whether to fall back to overt (direct, unproxied) requests if all proxies fail.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
sendOvertRequestsOnProxiesFailure Bool Required. Send overt requests on proxy failure

JobConfigProxySetIterateProxyResponseCodes

Sets HTTP response codes for which requests should be resent with another proxy.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
iterateProxyResponseCodes String Required. Comma-separated HTTP response codes (e.g., “401,403”)
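To show how a comma-separated value such as "401,403" might be interpreted, here is a minimal sketch that parses the string and checks a response status against it. Both function names are hypothetical, and the validation rules (whitespace tolerated, codes limited to 100-599) are assumptions rather than documented behavior.

```python
from typing import FrozenSet


def parse_response_codes(raw: str) -> FrozenSet[int]:
    """Parse a comma-separated list of HTTP status codes."""
    codes = set()
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty segments such as a trailing comma
        code = int(part)  # raises ValueError for non-numeric input
        if not 100 <= code <= 599:
            raise ValueError(f"not an HTTP status code: {code}")
        codes.add(code)
    return frozenset(codes)


def should_try_next_proxy(status_code: int, iterate_codes: FrozenSet[int]) -> bool:
    """True if a response with this status should be resent via another proxy."""
    return status_code in iterate_codes


codes = parse_response_codes("401,403")
```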

JobConfigProxyUpsertProxy

Adds or updates a proxy configuration in the job configuration.

Arguments

Name Type Description    
jobConfig Object Required. JobConfig object    
protocol String Required. Proxy protocol (“http”, “https”, or “socks5”)
host String Required. Proxy host    
port Int Required. Proxy port    
userName String Optional. Proxy username    
password String Optional. Proxy password    
connectionsLimit Int Optional. Max connections    
availableHosts String Optional. Comma-separated list of available hosts    
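Because this tool mixes required and optional arguments, a small helper for assembling the argument object can be useful. The sketch below is an illustration only: the helper name is hypothetical, omitting unset optional arguments is an assumption about how the tool is called, and the port range check reflects the general TCP port range rather than anything stated on this page.

```python
from typing import Any, Dict, List, Optional

SUPPORTED_PROTOCOLS = ("http", "https", "socks5")


def build_upsert_proxy_args(
    job_config: Dict[str, Any],
    protocol: str,
    host: str,
    port: int,
    user_name: Optional[str] = None,
    password: Optional[str] = None,
    connections_limit: Optional[int] = None,
    available_hosts: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble arguments for JobConfigProxyUpsertProxy (hypothetical helper)."""
    if protocol not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported proxy protocol: {protocol!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    args: Dict[str, Any] = {
        "jobConfig": job_config,
        "protocol": protocol,
        "host": host,
        "port": port,
    }
    optional = {
        "userName": user_name,
        "password": password,
        "connectionsLimit": connections_limit,
        # The table specifies a comma-separated string for availableHosts.
        "availableHosts": ",".join(available_hosts) if available_hosts else None,
    }
    # Assumption: optional arguments that are not set are left out entirely.
    args.update({k: v for k, v in optional.items() if v is not None})
    return args


args = build_upsert_proxy_args(
    {"jobName": "docs-crawl"},
    "socks5",
    "proxy.example.com",
    1080,
    connections_limit=4,
    available_hosts=["example.com", "example.org"],
)
```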

JobConfigDownloadErrorHandlingSetPolicy

Sets the download error handling policy for the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
downloadErrorHandlingPolicy String Required. Policy (“Skip” or “Retry”)
retriesLimit Int Optional. Max retries (if policy is “Retry”)
retryDelayMs Int Optional. Delay before retry in ms (if policy is “Retry”)
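The interaction between the policy, retriesLimit, and retryDelayMs can be illustrated with a small retry loop. This is a sketch under stated assumptions, not the crawler's implementation: it assumes retriesLimit counts attempts after the first one, that the delay is applied between attempts, and that a download which still fails is skipped. The function name and the injectable sleep parameter are hypothetical.

```python
import time
from typing import Callable, Optional


def download_with_policy(
    fetch: Callable[[], bytes],
    policy: str,
    retries_limit: int = 0,
    retry_delay_ms: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Run fetch under a "Skip" or "Retry" policy; None means the page was skipped."""
    if policy not in ("Skip", "Retry"):
        raise ValueError(f"unsupported policy: {policy!r}")
    # Assumption: one initial attempt, plus retries only under "Retry".
    attempts = 1 + (retries_limit if policy == "Retry" else 0)
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt + 1 < attempts:
                sleep(retry_delay_ms / 1000)  # wait before the next attempt
    return None


calls = {"count": 0}


def flaky_fetch() -> bytes:
    """Fails twice with a network error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return b"<html></html>"


delays = []  # records requested delays instead of actually sleeping
body = download_with_policy(
    flaky_fetch, "Retry", retries_limit=3, retry_delay_ms=500, sleep=delays.append
)
```

Under "Skip" the same loop makes exactly one attempt, which is why retriesLimit and retryDelayMs only matter when the policy is "Retry".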

JobConfigCrawlersProtectionBypassSetMaxResponseSizeKb

Sets the maximum response size (in KB) for the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
maxResponseSizeKb Int Required. Max response size in KB

JobConfigCrawlersProtectionBypassSetMaxRedirectHops

Sets the maximum number of redirect hops for the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
maxRedirectHops Int Required. Max redirect hops

JobConfigCrawlersProtectionBypassSetRequestTimeoutSec

Sets the request timeout (in seconds) for the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
requestTimeoutSec Int Required. Timeout in seconds

JobConfigCrawlersProtectionBypassUpsertCrawlDelay

Adds or updates a crawl delay for a specific host in the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
host String Required. Host for crawl delay
delay String Required. Delay value (“0”, “1-5”, “robots”)
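The three example delay values suggest three forms: a fixed number, a range, and the keyword "robots". The sketch below parses them under assumptions this page does not confirm: that the numbers are seconds, that a range such as "1-5" means a random delay between the two bounds, and that "robots" defers to the delay declared in the host's robots.txt. The function names are hypothetical.

```python
import random
from typing import Tuple, Union

DelaySpec = Union[str, Tuple[float, float]]


def parse_crawl_delay(raw: str) -> DelaySpec:
    """Parse "0", "1-5", or "robots" into a delay specification."""
    value = raw.strip()
    if value == "robots":
        return "robots"
    if "-" in value:
        low_text, high_text = value.split("-", 1)
        low, high = float(low_text), float(high_text)
    else:
        low = high = float(value)
    if low < 0 or high < low:
        raise ValueError(f"invalid delay range: {raw!r}")
    return (low, high)


def pick_delay(spec: DelaySpec, robots_delay: float = 0.0) -> float:
    """Choose the delay for the next request (assumed unit: seconds)."""
    if spec == "robots":
        return robots_delay  # assumed to come from the host's robots.txt
    low, high = spec
    return random.uniform(low, high)


fixed = parse_crawl_delay("0")
ranged = parse_crawl_delay("1-5")
from_robots = parse_crawl_delay("robots")
```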

JobConfigCrossDomainAccessSetPolicy

Sets the cross-domain access policy for the job configuration.

Arguments

Name Type Description
jobConfig Object Required. JobConfig object
crossDomainAccess String Required. Policy (“None”, “Subdomains”, “CrossDomains”)
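One plausible reading of the three policy values is sketched below as a link filter. The interpretation is an assumption, not documented behavior: "None" keeps the crawl on the start URL's host, "Subdomains" also allows hosts beneath it, and "CrossDomains" allows any host. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_link_allowed(start_url: str, link_url: str, policy: str) -> bool:
    """Decide whether a discovered link may be followed under the given policy."""
    if policy not in ("None", "Subdomains", "CrossDomains"):
        raise ValueError(f"unsupported policy: {policy!r}")
    if policy == "CrossDomains":
        return True
    start_host = urlsplit(start_url).hostname or ""
    link_host = urlsplit(link_url).hostname or ""
    if link_host == start_host:
        return True
    if policy == "Subdomains":
        # Require a dot boundary so "badexample.com" does not match "example.com".
        return bool(start_host) and link_host.endswith("." + start_host)
    return False


start = "https://example.com/"
same_host = is_link_allowed(start, "https://example.com/about", "None")
subdomain = is_link_allowed(start, "https://docs.example.com/", "Subdomains")
other_site = is_link_allowed(start, "https://example.org/", "Subdomains")
```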
