Web Crawler for Databases

Add a pluggable web crawler to your database and enrich your queries with data from intranet and Internet sources

Why Choose Web Data Source?

A pluggable solution for data enthusiasts that turns Internet and intranet resources into just another data source you can query

High Performance

Automatically scales resources to match user demand, ensuring optimal performance

Secure

The embedded solution provides security and anonymization that keep the gathering process safe and hidden from competitors

Customizable

Provides toolsets for popular databases, enabling seamless web data extraction

Advanced Features

Explore key features that make Web Data Source a powerful solution for your needs

Cloud Integration

This cloud-native solution can be deployed in any Kubernetes-compatible environment

Durability

Built from a minimal set of components, the solution runs on minimal hardware and restores itself after environment outages.
It just works!

Backward Compatibility

Losing backward compatibility in work tools is painful! This solution is designed with backward compatibility as one of its main architecturally significant requirements

Usage Examples

Explore how to interact with web data sources in SQL queries

Getting URLs from a page


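-- List the URL of every link matched by the 'ul.nav a' CSS selector on the start page.
-- @jobConfig (the job configuration) must be declared before this query runs.
SELECT
    nav.Task.Url AS URL
FROM wds.Start(@jobConfig) AS root
    OUTER APPLY wds.Crawl(root.Task, 'css: ul.nav a', null) AS nav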

Getting data from pages


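-- Follow every link matched by 'table a' and scrape a name and a price from each page.
-- @jobConfig (the job configuration) must be declared before this query runs.
SELECT
    p.Task.Url AS URL,
    wds.ScrapeFirst(p.Task, 'css: h1', null) AS Name,
    wds.ScrapeFirst(p.Task, 'css: .price span', null) AS Price
FROM wds.Start(@jobConfig) AS r
    OUTER APPLY wds.Crawl(r.Task, 'css: table a', null) AS p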
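
Combining both steps

The two queries above could plausibly be chained into one: collect the navigation links first, then follow the table links on each of those pages. This is only a sketch. It uses the same wds.Start, wds.Crawl and wds.ScrapeFirst calls and the same selectors as the examples above, and it assumes that a Task returned by wds.Crawl can be passed to another wds.Crawl call, which this page does not confirm. The column aliases CategoryURL and PageURL are illustrative.

```sql
-- Sketch under assumptions: a Task produced by wds.Crawl is assumed to be
-- accepted as input by a second wds.Crawl call (not confirmed by this page).
-- @jobConfig (the job configuration) must be declared before this query runs.
SELECT
    nav.Task.Url AS CategoryURL,  -- page reached from the navigation menu
    p.Task.Url AS PageURL,        -- page reached from a table link on that page
    wds.ScrapeFirst(p.Task, 'css: h1', null) AS Name,
    wds.ScrapeFirst(p.Task, 'css: .price span', null) AS Price
FROM wds.Start(@jobConfig) AS root
    OUTER APPLY wds.Crawl(root.Task, 'css: ul.nav a', null) AS nav
    OUTER APPLY wds.Crawl(nav.Task, 'css: table a', null) AS p
```

Because OUTER APPLY keeps rows whose right side returns nothing, a navigation page without matching table links would still appear, with NULL in the page columns. Check the documentation for how the functions handle a NULL Task before relying on this.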

Ready to Dive In?

Start your journey with Web Data Source today

Browse Documentation