Deploying WDS API Server in Air-Gapped Environments

Web Data Source (WDS) API Server can run in air-gapped environments.

To deploy the system there, copy the required docker images to a private docker registry that is reachable from the air-gapped environment. The set of images depends on the selected deployment type (see services).
In addition to the system services, the third-party components must either be available in the private docker registry or be provided as a service.

In the script examples and docker-compose configurations on this page, [registry] stands for the address of the private docker registry. Replace it with the actual address before using an example.
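
When the examples are saved to files, the [registry] placeholder could be filled in with a small shell helper. The following is a minimal sketch and not part of WDS: the function name apply_registry and the address registry.example.local:5000 are assumptions for illustration, and the address is assumed not to contain the | character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replaces every literal "[registry]" read from stdin
# with the given registry address and writes the result to stdout.
apply_registry() {
  local registry="$1"
  # "|" is the sed delimiter so that "/" and ":" in the address need no escaping.
  sed "s|\[registry]|${registry}|g"
}

# Example with an assumed registry address:
printf 'image: [registry]/webdatasource/docs:v2.0.0\n' | apply_registry "registry.example.local:5000"
```

To process a whole saved configuration, the same function can be used with redirection, for example: apply_registry "registry.example.local:5000" < template.yml > wds-docker-compose.yml (both file names are placeholders).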

Script Examples

The following examples show how to copy all necessary docker images to a private docker registry.
There are two scripts:

  1. archive-wds-images - downloads all required docker images and archives them into a single wds-docker-images.tar
  2. push-wds-images - loads the images from the archive, then tags and pushes all WDS images (or, after trimming the IMAGES list, only those required for a specific deployment option) to a private docker registry

archive-wds-images.bat

@echo off
setlocal enabledelayedexpansion

if "%~1"=="" (
  echo Error: No output tar archive path provided.
  echo Usage: %~0 ^<path-to-wds-images.tar^>
  exit /b 1
)

set ARCHIVE_PATH=%~1

REM List of images you want to bundle into the tar
set IMAGES=^
zcube/bitnami-compat-mongodb:6.0 ^
bitnami/minio:2024 ^
webdatasource/solidstack:v2.0.0 ^
webdatasource/crawler:v2.0.0 ^
webdatasource/datakeeper:v2.0.0 ^
webdatasource/idealer:v2.0.0 ^
webdatasource/scraper:v2.0.0 ^
webdatasource/dapi:v2.0.0 ^
webdatasource/playground:v2.0.0 ^
webdatasource/docs:v2.0.0

echo Pulling WDS images...
for %%I in (%IMAGES%) do (
  echo Pulling %%I...
  docker pull %%I
  echo.
)

echo Creating a single tar archive with all WDS images at %ARCHIVE_PATH%...
REM Quote the path so that archive locations containing spaces work
docker save -o "%ARCHIVE_PATH%" %IMAGES%

echo.
echo Done! Created %ARCHIVE_PATH% containing all WDS images.

endlocal

push-wds-images.bat

@echo off
setlocal enabledelayedexpansion

if "%~1"=="" (
  echo Error: No registry address provided.
  echo Usage: %~0 ^<registry-address^> ^<path-to-wds-docker-images.tar^>
  exit /b 1
)

if "%~2"=="" (
  echo Error: No tar archive path provided.
  echo Usage: %~0 ^<registry-address^> ^<path-to-wds-docker-images.tar^>
  exit /b 1
)

set PRIVATE_REGISTRY=%~1
set ARCHIVE_PATH=%~2
set IMAGES=^
zcube/bitnami-compat-mongodb:6.0 ^
bitnami/minio:2024 ^
webdatasource/solidstack:v2.0.0 ^
webdatasource/crawler:v2.0.0 ^
webdatasource/datakeeper:v2.0.0 ^
webdatasource/idealer:v2.0.0 ^
webdatasource/scraper:v2.0.0 ^
webdatasource/dapi:v2.0.0 ^
webdatasource/playground:v2.0.0 ^
webdatasource/docs:v2.0.0

echo Loading images from tar file: %ARCHIVE_PATH%
REM Quote the path so that archive locations containing spaces work
docker load -i "%ARCHIVE_PATH%"

echo.
echo Tagging and pushing wds images to %PRIVATE_REGISTRY%...
for %%I in (%IMAGES%) do (
  echo Tagging %%I as %PRIVATE_REGISTRY%/%%I
  docker tag %%I %PRIVATE_REGISTRY%/%%I

  echo Pushing %PRIVATE_REGISTRY%/%%I
  docker push %PRIVATE_REGISTRY%/%%I
  echo.
)

echo Done! WDS images loaded and pushed to %PRIVATE_REGISTRY%.

endlocal

Download and archive WDS docker images:

IMPORTANT! The current user should have access to the Internet

archive-wds-images.bat wds-docker-images.tar

Load and push WDS docker images to a private registry:

IMPORTANT! The current user should have access and be logged in to the private registry

push-wds-images.bat [registry] wds-docker-images.tar

archive-wds-images.sh

#!/usr/bin/env bash
set -e

if [ -z "$1" ]; then
  echo "Error: No output tar archive path provided."
  echo "Usage: $0 /path/to/wds-docker-images.tar"
  exit 1
fi

ARCHIVE_PATH="$1"

IMAGES=(
    "zcube/bitnami-compat-mongodb:6.0"
    "bitnami/minio:2024"
    "webdatasource/solidstack:v2.0.0"
    "webdatasource/crawler:v2.0.0"
    "webdatasource/datakeeper:v2.0.0"
    "webdatasource/idealer:v2.0.0"
    "webdatasource/scraper:v2.0.0"
    "webdatasource/dapi:v2.0.0"
    "webdatasource/playground:v2.0.0"
    "webdatasource/docs:v2.0.0"
)

echo "Pulling all images..."
for IMAGE in "${IMAGES[@]}"; do
  echo "Pulling: ${IMAGE}"
  docker pull "${IMAGE}"
  echo
done

echo "Creating a single tar archive with all WDS images at ${ARCHIVE_PATH}..."
docker save -o "${ARCHIVE_PATH}" "${IMAGES[@]}"

echo
echo "Done! Created ${ARCHIVE_PATH} containing all WDS images."

push-wds-images.sh

#!/usr/bin/env bash
set -e

# The registry address is the first argument and the archive path is the second
if [ -z "$1" ]; then
  echo "Error: No registry address provided."
  echo "Usage: $0 <registry-address> /path/to/wds-docker-images.tar"
  exit 1
fi

if [ -z "$2" ]; then
  echo "Error: No tar archive provided."
  echo "Usage: $0 <registry-address> /path/to/wds-docker-images.tar"
  exit 1
fi

PRIVATE_REGISTRY="$1"
ARCHIVE_PATH="$2"
IMAGES=(
    "zcube/bitnami-compat-mongodb:6.0"
    "bitnami/minio:2024"
    "webdatasource/solidstack:v2.0.0"
    "webdatasource/crawler:v2.0.0"
    "webdatasource/datakeeper:v2.0.0"
    "webdatasource/idealer:v2.0.0"
    "webdatasource/scraper:v2.0.0"
    "webdatasource/dapi:v2.0.0"
    "webdatasource/playground:v2.0.0"
    "webdatasource/docs:v2.0.0"
)

echo "Loading images from tar file: ${ARCHIVE_PATH}"
docker load -i "${ARCHIVE_PATH}"

echo
echo "Tagging and pushing WDS images to ${PRIVATE_REGISTRY}..."
for IMAGE in "${IMAGES[@]}"; do
  FINAL_TAG="${PRIVATE_REGISTRY}/${IMAGE}"
  echo "Tagging ${IMAGE} as ${FINAL_TAG}"
  docker tag "${IMAGE}" "${FINAL_TAG}"

  echo "Pushing ${FINAL_TAG}"
  docker push "${FINAL_TAG}"
  echo
done

echo "Done! WDS images loaded and pushed to ${PRIVATE_REGISTRY}."

Download and archive WDS docker images:

IMPORTANT! The current user should have access to the Internet

chmod +x archive-wds-images.sh
./archive-wds-images.sh wds-docker-images.tar

Load and push WDS docker images to a private registry:

IMPORTANT! The current user should have access and be logged in to the private registry

chmod +x push-wds-images.sh
./push-wds-images.sh [registry] wds-docker-images.tar
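
The push scripts above push every image in their IMAGES list. Pushing only the images that one deployment option needs could be sketched as follows. The image sets are taken from the docker-compose configurations on this page; the option labels (single-external, single-bundled, multi-external, multi-bundled) and both function names are hypothetical and are not names used by WDS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the images used by one deployment option, one per line.
# The option labels are illustrative only.
images_for_option() {
  local mongo="zcube/bitnami-compat-mongodb:6.0"
  local minio="bitnami/minio:2024"
  local single="webdatasource/solidstack:v2.0.0"
  local multi="webdatasource/crawler:v2.0.0
webdatasource/datakeeper:v2.0.0
webdatasource/idealer:v2.0.0
webdatasource/scraper:v2.0.0
webdatasource/dapi:v2.0.0"
  local ui="webdatasource/playground:v2.0.0
webdatasource/docs:v2.0.0"

  case "$1" in
    single-external) printf '%s\n' "$single" "$ui" ;;
    single-bundled)  printf '%s\n' "$mongo" "$single" "$ui" ;;
    multi-external)  printf '%s\n' "$multi" "$ui" ;;
    multi-bundled)   printf '%s\n' "$mongo" "$minio" "$multi" "$ui" ;;
    *) echo "Unknown option: $1" >&2; return 1 ;;
  esac
}

# Not invoked here. Assumes the images were already loaded with "docker load".
push_option() {
  local registry="$1" option="$2" images image
  images="$(images_for_option "$option")" || return 1
  printf '%s\n' "$images" | while IFS= read -r image; do
    docker tag "$image" "${registry}/${image}" && docker push "${registry}/${image}"
  done
}

# Example: list what the multi-service option with external MongoDB would push
images_for_option multi-external
```

A call such as push_option "[registry]" multi-external would then tag and push only those seven images.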

Deploy to Kubernetes

To deploy a WDS API Server instance to a Kubernetes cluster in an air-gapped environment, set the global.registry parameter in the Helm Chart values.
For the other configuration options, see the Helm Chart documentation.

To deploy the WDS API Server using the Helm CLI, run the following commands:

IMPORTANT! Ensure you have access to a Kubernetes cluster.

# Add the repository (if using a remote repository)
helm repo add webdatasource https://webdatasource.github.io/wds.helm

# Update your local repository cache
helm repo update

# Install the chart
helm install webdatasource webdatasource/wds-helm-chart \
    --namespace webdatasource --create-namespace \
    --set global.registry="[registry]" \
    --set global.coreServices.databases.mongodb.connectionString="mongodb+srv://<user>:<password>@<host>/WebDataSource?appName=<cluster>&readPreference=secondary"
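
After installation, it may be worth confirming that every pod image is pulled from the private registry, since any other image reference cannot be pulled in an air-gapped cluster. The following is a minimal sketch: both function names are hypothetical, and the webdatasource namespace is the one used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads image references from stdin (one per line) and
# prints those that do not start with "<registry>/".
images_outside_registry() {
  local registry="$1" image
  # "|| [ -n ... ]" keeps the last line even when it has no trailing newline.
  while IFS= read -r image || [ -n "$image" ]; do
    [ -n "$image" ] || continue
    case "$image" in
      "$registry"/*) ;;                # served by the private registry
      *) printf '%s\n' "$image" ;;     # would require another registry
    esac
  done
}

# Not invoked here: needs kubectl access to the cluster.
check_cluster_images() {
  kubectl get pods --namespace webdatasource \
      -o jsonpath='{.items[*].spec.containers[*].image}' \
    | tr -s '[:space:]' '\n' \
    | images_outside_registry "$1"
}

# Example on sample input with an assumed registry address:
printf 'registry.example.local:5000/webdatasource/dapi:v2.0.0\nbitnami/minio:2024\n' \
  | images_outside_registry "registry.example.local:5000"
```

An empty result from check_cluster_images "[registry]" would indicate that all container images in the namespace come from the private registry.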

IMPORTANT! Ensure the helm and kubernetes providers are configured in your Terraform project.

Minimal Terraform configuration to deploy WDS API Server:

# MongoDB connection string for WDS API Server deployment
variable "mongodb_connection_string" {
  type      = string
  sensitive = true
}

# License key for WDS API Server deployment in MultiService mode
variable "license_key" {
  type      = string
  sensitive = true
}

# Docker registry of WDS API Server service images
variable "registry" {
  type    = string
  default = "[registry]"
}

# Helm chart version for WDS API Server deployment
variable "helm_chart_version" {
  type    = string
  default = "1.0.0"
}

# Whether Terraform creates the namespace for WDS API Server deployment
# Set `create_namespace` to false to deploy in an existing namespace
variable "create_namespace" {
  type    = bool
  default = true
}

# Namespace for WDS API Server deployment
variable "namespace" {
  type    = string
  default = "webdatasource"
}

# Enable or disable ingress for WDS API Server deployment
variable "enable_ingress" {
  type    = bool
  default = true
}

# Kubernetes namespace resource
resource "kubernetes_namespace" "webdatasource" {
  count = var.create_namespace ? 1 : 0
  metadata {
    name = var.namespace
  }
}

# Helm release resource
resource "helm_release" "wds-server" {
  name       = "wds-server"
  repository = "https://webdatasource.github.io/wds.helm"
  chart      = "wds-helm-chart"
  version    = var.helm_chart_version

  namespace        = var.namespace
  create_namespace = false

  values = [
    yamlencode({
      global = {
        registry = var.registry
        ingress = {
          enabled = var.enable_ingress
          annotations = {
            # Keeps each MCP client session on the same pod; otherwise, clients may get 404 errors.
            # If an ingress controller other than nginx is used, change the annotation key accordingly.
            "nginx.ingress.kubernetes.io/upstream-hash-by" = "Mcp-Session-Id"
          }
        }
      }
    }),
    sensitive(yamlencode({
      global = {
        coreServices = {
          databases = {
            mongodb = {
              connectionString = var.mongodb_connection_string
            }
          }
          license = {
            key = var.license_key
          }
        }
      }
    }))
  ]

  depends_on = [
    kubernetes_namespace.webdatasource
  ]
}

NOTE This code can be used to create a dedicated Terraform Module for WDS API Server.
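
The two sensitive variables above have no defaults, so they must be supplied at apply time. One way to do this without writing secrets to a file is Terraform's TF_VAR_<name> environment variables. The following is a minimal sketch: every value is a placeholder, the registry address is an assumption, and deploy_wds is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# Terraform reads input variables from environment variables named TF_VAR_<name>.
# All values below are placeholders; replace them with real ones.
export TF_VAR_mongodb_connection_string='mongodb+srv://<user>:<password>@<host>/WebDataSource?appName=<cluster>&readPreference=secondary'
export TF_VAR_license_key='<license-key>'
export TF_VAR_registry='registry.example.local:5000'  # assumed private registry address

# Defined but not invoked here; call it from the Terraform project directory.
deploy_wds() {
  terraform init && terraform apply
}
```

In an air-gapped environment, terraform init also needs the provider plugins and the Helm chart to be reachable from inside the network, which this sketch does not cover.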

Docker Compose Deployment Options

The docker-compose files for an air-gapped environment differ slightly: the image references point to the private registry that the images were pushed to. Therefore, use the docker-compose files from this page (with the private docker registry address).

The deployment options themselves stay the same, and all of them are available in air-gapped environments:

This deployment option connects to an external MongoDB. For evaluation purposes, a free MongoDB Atlas cluster can be used.
In the wds.solidstack service, replace the value of the DB_CONNECTION_STRING environment variable with a real connection string to a MongoDB database.

IMPORTANT! The MongoDB connection string must contain a database name. It can be Solidstack, as in the example connection string, or any other name.
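
The database-name requirement can be checked before the connection string is placed in a configuration. The following is a minimal sketch: the function name mongo_db_name is hypothetical, the example URI is made up, and credentials are assumed not to contain an unescaped "/" character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the database name found in a MongoDB connection
# string, or returns a non-zero status when the string has none.
mongo_db_name() {
  local uri="$1"
  local rest="${uri#*://}"        # drop the scheme (mongodb:// or mongodb+srv://)
  case "$rest" in
    */*) ;;                       # a path separator is present
    *) return 1 ;;                # hosts only, no database name
  esac
  local path="${rest#*/}"         # everything after the first "/"
  local db="${path%%[?]*}"        # cut the query string, if any
  [ -n "$db" ] || return 1
  printf '%s\n' "$db"
}

# Example with a made-up connection string:
mongo_db_name 'mongodb+srv://user:secret@cluster.example.net/Solidstack?readPreference=secondary'
```

A connection string such as mongodb://host1,host2 or mongodb://host1/?replicaSet=rs0 makes the function fail, which signals that a database name still has to be added.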


services:  
  wds.solidstack:
    image: [registry]/webdatasource/solidstack:v2.0.0
    restart: always
    hostname: solidstack
    environment:
      - EXTERNAL_IP_ADDRESS_CONFIGS=intranet
      - JOB_TYPES=intranet
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Solidstack?appName=<cluster>&readPreference=secondary
    ports:
      - 2807:8080
  
  wds.playground:
    image: [registry]/webdatasource/playground:v2.0.0
    restart: always
    hostname: playground
    ports:
    - 2808:80
  
  wds.docs:
    image: [registry]/webdatasource/docs:v2.0.0
    restart: always
    ports:
    - 2809:80

This deployment option bundles all third-party services, so the system can be evaluated even in an air-gapped environment.

IMPORTANT! The MongoDB connection string must contain a database name. It can be Solidstack, as in the example connection string, or any other name.


services:
  mongodb-primary:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-primary
      - MONGODB_ROOT_USER=root
      - MONGODB_ROOT_PASSWORD=TestPassword
      - MONGODB_REPLICA_SET_MODE=primary
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
  
  mongodb-secondary:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-secondary
    depends_on:
      - mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-secondary
      - MONGODB_REPLICA_SET_MODE=secondary
      - MONGODB_INITIAL_PRIMARY_HOST=mongodb-primary
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
      - MONGODB_INITIAL_PRIMARY_ROOT_PASSWORD=TestPassword

  mongodb-arbiter:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-arbiter
    depends_on:
      - mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-arbiter
      - MONGODB_REPLICA_SET_MODE=arbiter
      - MONGODB_INITIAL_PRIMARY_HOST=mongodb-primary
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
      - MONGODB_INITIAL_PRIMARY_ROOT_PASSWORD=TestPassword
 
  wds.solidstack:
    image: [registry]/webdatasource/solidstack:v2.0.0
    restart: always
    hostname: solidstack
    environment:
      - EXTERNAL_IP_ADDRESS_CONFIGS=intranet
      - JOB_TYPES=intranet
      - DB_CONNECTION_STRING=mongodb://root:TestPassword@mongodb-primary,mongodb-secondary/Solidstack?authSource=admin&replicaSet=replicaset&readPreference=secondary
    ports:
    - 2807:8080
  
  wds.playground:
    image: [registry]/webdatasource/playground:v2.0.0
    restart: always
    hostname: playground
    ports:
    - 2808:80
  
  wds.docs:
    image: [registry]/webdatasource/docs:v2.0.0
    restart: always
    ports:
    - 2809:80

This deployment option connects to an external MongoDB. For evaluation purposes, a free MongoDB Atlas cluster can be used.
In the wds.dapi, wds.datakeeper, and wds.idealer services, replace the values of the DB_CONNECTION_STRING environment variables with real connection strings to a MongoDB database.

By default, the system database (MongoDB) is used to cache web pages. To change this, provide the wds.datakeeper service with a CACHE_CONNECTION_STRING.
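
The only CACHE_CONNECTION_STRING shown on this page is the s3:// value in the bundled multi-service configuration. A string of the same shape could be assembled as sketched below. The function name is hypothetical, and whether other schemes or query parameters are supported is not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assembles a cache connection string in the same shape as
# the s3:// example on this page.
# Arguments: access key, secret key, host:port, bucket, ssl flag (true/false).
s3_cache_connection_string() {
  printf 's3://%s:%s@%s/%s?ssl=%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Reproduces the evaluation value used with the bundled MinIO service:
s3_cache_connection_string TestAccessKey TestSecretKey minio:9000 pages-cache false
```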

IMPORTANT! MongoDB connection strings must contain a database name. The names can be the same as in the example connection strings or different.

IMPORTANT! Contact us for a LICENSE


services: 
  wds.crawler:
    image: [registry]/webdatasource/crawler:v2.0.0
    restart: always
    hostname: crawler
    environment:
      - DATAKEEPER_ORIGIN=http://datakeeper
      - SERVICE_HOST=crawler
      - EXTERNAL_IP_ADDRESS_CONFIGS=intranet
      - LICENSE_KEY=[LICENSE]
  
  wds.datakeeper:
    image: [registry]/webdatasource/datakeeper:v2.0.0
    restart: always
    hostname: datakeeper
    environment:
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Datakeeper?appName=<cluster>&readPreference=secondary
      - IDEALER_ORIGIN=http://idealer
      - LICENSE_KEY=[LICENSE]

  wds.idealer:
    image: [registry]/webdatasource/idealer:v2.0.0
    restart: always
    hostname: idealer
    environment:
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Idealer?appName=<cluster>&readPreference=secondary
      - LICENSE_KEY=[LICENSE]
  
  wds.scraper:
    image: [registry]/webdatasource/scraper:v2.0.0
    restart: always
    hostname: scraper
    environment:
      - LICENSE_KEY=[LICENSE]
  
  wds.dapi:
    image: [registry]/webdatasource/dapi:v2.0.0
    restart: always
    hostname: dapi
    environment:
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Dapi?appName=<cluster>&readPreference=secondary
      - DATAKEEPER_ORIGIN=http://datakeeper
      - SCRAPER_ORIGIN=http://scraper
      - IDEALER_ORIGIN=http://idealer
      - JOB_TYPES=intranet
      - LICENSE_KEY=[LICENSE]
    ports:
      - 2807:8080
  
  wds.playground:
    image: [registry]/webdatasource/playground:v2.0.0
    restart: always
    hostname: playground
    ports:
    - 2808:80
  
  wds.docs:
    image: [registry]/webdatasource/docs:v2.0.0
    restart: always
    ports:
    - 2809:80

This deployment option bundles all third-party services, so the system can be evaluated even in an air-gapped environment.
There is only one instance per service, and credentials are not protected, since this is an evaluation environment.

IMPORTANT! Contact us for a LICENSE


services:
  mongodb-primary:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-primary
      - MONGODB_ROOT_USER=root
      - MONGODB_ROOT_PASSWORD=TestPassword
      - MONGODB_REPLICA_SET_MODE=primary
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
  
  mongodb-secondary:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-secondary
    depends_on:
      - mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-secondary
      - MONGODB_REPLICA_SET_MODE=secondary
      - MONGODB_INITIAL_PRIMARY_HOST=mongodb-primary
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
      - MONGODB_INITIAL_PRIMARY_ROOT_PASSWORD=TestPassword

  mongodb-arbiter:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-arbiter
    depends_on:
      - mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-arbiter
      - MONGODB_REPLICA_SET_MODE=arbiter
      - MONGODB_INITIAL_PRIMARY_HOST=mongodb-primary
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
      - MONGODB_INITIAL_PRIMARY_ROOT_PASSWORD=TestPassword
  
  minio:
    image: [registry]/bitnami/minio:2024
    restart: always
    hostname: minio
    environment:
      - MINIO_DEFAULT_BUCKETS=pages-cache
      - MINIO_ROOT_USER=TestAccessKey
      - MINIO_ROOT_PASSWORD=TestSecretKey
      - MINIO_FORCE_NEW_KEYS=no
 
  wds.crawler:
    image: [registry]/webdatasource/crawler:v2.0.0
    restart: always
    hostname: crawler
    environment:
      - DATAKEEPER_ORIGIN=http://datakeeper
      - SERVICE_HOST=crawler
      - EXTERNAL_IP_ADDRESS_CONFIGS=intranet
      - LICENSE_KEY=[LICENSE]
  
  wds.datakeeper:
    image: [registry]/webdatasource/datakeeper:v2.0.0
    restart: always
    hostname: datakeeper
    environment:
      - DB_CONNECTION_STRING=mongodb://root:TestPassword@mongodb-primary,mongodb-secondary/Datakeeper?authSource=admin&replicaSet=replicaset&readPreference=secondary
      - CACHE_CONNECTION_STRING=s3://TestAccessKey:TestSecretKey@minio:9000/pages-cache?ssl=false
      - IDEALER_ORIGIN=http://idealer
      - LICENSE_KEY=[LICENSE]

  wds.idealer:
    image: [registry]/webdatasource/idealer:v2.0.0
    restart: always
    hostname: idealer
    environment:
      - DB_CONNECTION_STRING=mongodb://root:TestPassword@mongodb-primary,mongodb-secondary/Idealer?authSource=admin&replicaSet=replicaset&readPreference=secondary
      - LICENSE_KEY=[LICENSE]
  
  wds.scraper:
    image: [registry]/webdatasource/scraper:v2.0.0
    restart: always
    hostname: scraper
    environment:
      - LICENSE_KEY=[LICENSE]
  
  wds.dapi:
    image: [registry]/webdatasource/dapi:v2.0.0
    restart: always
    hostname: dapi
    environment:
      - DB_CONNECTION_STRING=mongodb://root:TestPassword@mongodb-primary,mongodb-secondary/Dapi?authSource=admin&replicaSet=replicaset&readPreference=secondary
      - DATAKEEPER_ORIGIN=http://datakeeper
      - SCRAPER_ORIGIN=http://scraper
      - IDEALER_ORIGIN=http://idealer
      - JOB_TYPES=intranet
      - LICENSE_KEY=[LICENSE]
    ports:
      - 2807:8080
  
  wds.playground:
    image: [registry]/webdatasource/playground:v2.0.0
    restart: always
    hostname: playground
    ports:
    - 2808:80
  
  wds.docs:
    image: [registry]/webdatasource/docs:v2.0.0
    restart: always
    ports:
    - 2809:80

Running Docker Compose

After a deployment option is selected and its docker-compose configuration has been copied to a file (e.g., wds-docker-compose.yml), run docker compose with the command for your operating system.

NOTE: If another version is currently running, replace the old docker-compose configuration with the new one (of the same option) and run the command again. Only the services with changed versions will be recreated. If the new version is of a different option, execute the docker compose down command beforehand.

Windows (cmd):

set "COMPOSE_FILE=wds-docker-compose.yml" && docker compose up -d

Linux/macOS:

export COMPOSE_FILE=wds-docker-compose.yml && docker compose up -d
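
Before running either command, it may help to confirm that no placeholder from the examples on this page is left in the saved file. The following is a minimal sketch: the function name unresolved_placeholders is hypothetical, and the list of placeholders covers only those that appear in the configurations above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the numbered lines of a compose file that still
# contain a placeholder from this page's examples. Prints nothing when none remain.
unresolved_placeholders() {
  grep -n -F -e '[registry]' -e '[LICENSE]' \
       -e '<user>' -e '<password>' -e '<host>' "$1" || true
}

# Demo on a throwaway file:
demo_file="$(mktemp)"
printf 'image: [registry]/webdatasource/docs:v2.0.0\nrestart: always\n' > "$demo_file"
unresolved_placeholders "$demo_file"
rm -f "$demo_file"
```

For a real deployment, the call would be unresolved_placeholders wds-docker-compose.yml, and any printed line points at a value that still needs to be replaced.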
