Deploying WDS API Server in Air-Gapped Environments

Web Data Source API Server can work in air-gapped environments.

To deploy the system in an air-gapped environment, first copy the required docker images to a private docker registry that is reachable from that environment. The set of images depends on the selected deployment type (see services).
In addition to the system services, the third-party components must either be available in the private docker registry or be provided as a service.

In the script examples and docker-compose configurations on this page, [registry] stands for the address of the private docker registry. Replace it with the actual address before copying and using them.

Script Examples

The following examples show how to copy all necessary docker images to a private docker registry.
There are two scripts:

  1. archive-wds-images - pulls all required docker images and saves them into a single tar archive (for example, wds-docker-images.tar)
  2. push-wds-images - loads the images from the archive, then tags and pushes them to a private docker registry (all of them, or only those required for a specific deployment option)
archive-wds-images.bat
@echo off
setlocal enabledelayedexpansion

if "%~1"=="" (
  echo Error: No output tar archive path provided.
  echo Usage: %~0 ^<path-to-wds-images.tar^>
  exit /b 1
)

set ARCHIVE_PATH=%~1

REM List of images you want to bundle into the tar
set IMAGES=^
zcube/bitnami-compat-mongodb:6.0 ^
bitnami/minio:2024 ^
webdatasource/solidstack:v1.0.0 ^
webdatasource/crawler:v1.0.0 ^
webdatasource/datakeeper:v1.0.0 ^
webdatasource/idealer:v1.0.0 ^
webdatasource/scraper:v1.0.0 ^
webdatasource/dapi:v1.0.0 ^
webdatasource/playground:v1.0.0 ^
webdatasource/docs:v1.0.0

echo Pulling WDS images...
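for %%I in (%IMAGES%) do (
  echo Pulling %%I...
  docker pull %%I
  REM Stop on the first failed pull so the archive is never built from an incomplete set
  if errorlevel 1 exit /b 1
  echo.
)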

echo Creating a single tar archive with all WDS images at %ARCHIVE_PATH%...
REM Quote the path so that archive locations containing spaces work
docker save -o "%ARCHIVE_PATH%" %IMAGES%
if errorlevel 1 exit /b 1

echo.
echo Done! Created %ARCHIVE_PATH% containing all WDS images.

endlocal
push-wds-images.bat
@echo off
setlocal enabledelayedexpansion

if "%~1"=="" (
  echo Error: No registry address provided.
  echo Usage: %~0 ^<registry-address^> ^<path-to-wds-docker-images.tar^>
  exit /b 1
)

if "%~2"=="" (
  echo Error: No tar archive path provided.
  echo Usage: %~0 ^<registry-address^> ^<path-to-wds-docker-images.tar^>
  exit /b 1
)

set PRIVATE_REGISTRY=%~1
set ARCHIVE_PATH=%~2
set IMAGES=^
zcube/bitnami-compat-mongodb:6.0 ^
bitnami/minio:2024 ^
webdatasource/solidstack:v1.0.0 ^
webdatasource/crawler:v1.0.0 ^
webdatasource/datakeeper:v1.0.0 ^
webdatasource/idealer:v1.0.0 ^
webdatasource/scraper:v1.0.0 ^
webdatasource/dapi:v1.0.0 ^
webdatasource/playground:v1.0.0 ^
webdatasource/docs:v1.0.0

echo Loading images from tar file: %ARCHIVE_PATH%
REM Quote the path so that archive locations containing spaces work
docker load -i "%ARCHIVE_PATH%"
if errorlevel 1 exit /b 1

echo.
echo Tagging and pushing wds images to %PRIVATE_REGISTRY%...
for %%I in (%IMAGES%) do (
  echo Tagging %%I as %PRIVATE_REGISTRY%/%%I
  docker tag %%I %PRIVATE_REGISTRY%/%%I
  if errorlevel 1 exit /b 1

  echo Pushing %PRIVATE_REGISTRY%/%%I
  docker push %PRIVATE_REGISTRY%/%%I
  if errorlevel 1 exit /b 1
  echo.
)

echo Done! WDS images loaded and pushed to %PRIVATE_REGISTRY%.

endlocal
Download and archive WDS docker images:

IMPORTANT! The current user should have access to the Internet

archive-wds-images.bat wds-docker-images.tar
Load and push WDS docker images to a private registry:

IMPORTANT! The current user must have access to the private registry and be logged in to it (docker login)

push-wds-images.bat [registry] wds-docker-images.tar
archive-wds-images.sh
#!/usr/bin/env bash
set -e

if [ -z "$1" ]; then
  echo "Error: No output tar archive path provided."
  echo "Usage: $0 /path/to/wds-docker-images.tar"
  exit 1
fi

ARCHIVE_PATH="$1"

IMAGES=(
    "zcube/bitnami-compat-mongodb:6.0"
    "bitnami/minio:2024"
    "webdatasource/solidstack:v1.0.0"
    "webdatasource/crawler:v1.0.0"
    "webdatasource/datakeeper:v1.0.0"
    "webdatasource/idealer:v1.0.0"
    "webdatasource/scraper:v1.0.0"
    "webdatasource/dapi:v1.0.0"
    "webdatasource/playground:v1.0.0"
    "webdatasource/docs:v1.0.0"
)

echo "Pulling all images..."
for IMAGE in "${IMAGES[@]}"; do
  echo "Pulling: ${IMAGE}"
  docker pull "${IMAGE}"
  echo
done

echo "Creating a single tar archive with all WDS images at ${ARCHIVE_PATH}..."
docker save -o "${ARCHIVE_PATH}" "${IMAGES[@]}"

echo
echo "Done! Created ${ARCHIVE_PATH} containing all WDS images."
push-wds-images.sh
#!/usr/bin/env bash
set -e

if [ -z "$1" ]; then
  echo "Error: No registry address provided."
  echo "Usage: $0 <registry-address> /path/to/wds-docker-images.tar"
  exit 1
fi

if [ -z "$2" ]; then
  echo "Error: No tar archive provided."
  echo "Usage: $0 <registry-address> /path/to/wds-docker-images.tar"
  exit 1
fi

PRIVATE_REGISTRY="$1"
ARCHIVE_PATH="$2"
IMAGES=(
    "zcube/bitnami-compat-mongodb:6.0"
    "bitnami/minio:2024"
    "webdatasource/solidstack:v1.0.0"
    "webdatasource/crawler:v1.0.0"
    "webdatasource/datakeeper:v1.0.0"
    "webdatasource/idealer:v1.0.0"
    "webdatasource/scraper:v1.0.0"
    "webdatasource/dapi:v1.0.0"
    "webdatasource/playground:v1.0.0"
    "webdatasource/docs:v1.0.0"
)

echo "Loading images from tar file: ${ARCHIVE_PATH}"
docker load -i "${ARCHIVE_PATH}"

echo
echo "Tagging and pushing WDS images to ${PRIVATE_REGISTRY}..."
for IMAGE in "${IMAGES[@]}"; do
  FINAL_TAG="${PRIVATE_REGISTRY}/${IMAGE}"
  echo "Tagging ${IMAGE} as ${FINAL_TAG}"
  docker tag "${IMAGE}" "${FINAL_TAG}"

  echo "Pushing ${FINAL_TAG}"
  docker push "${FINAL_TAG}"
  echo
done

echo "Done! WDS images loaded and pushed to ${PRIVATE_REGISTRY}."
Download and archive WDS docker images:

IMPORTANT! The current user should have access to the Internet

chmod +x archive-wds-images.sh
./archive-wds-images.sh wds-docker-images.tar
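
Because the archive has to be carried across the air gap before it can be loaded, it can be worth recording a checksum on the connected machine and verifying it after the transfer. The following is a minimal sketch of that idea and not part of the WDS scripts: the function names are hypothetical, and it assumes sha256sum (GNU coreutils) is available on both machines.

```shell
#!/usr/bin/env bash
# Sketch only: checksum helpers for moving the image archive across the air gap.
# write_checksum and verify_checksum are hypothetical names, not WDS tooling.

# Write <archive>.sha256 next to the archive (run on the Internet-connected machine).
write_checksum() {
  local archive="$1"
  # Work inside the archive's directory so the checksum file stores a bare file name.
  ( cd "$(dirname "$archive")" \
      && sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Verify the archive against <archive>.sha256 (run inside the air-gapped environment).
# Returns non-zero if the archive is missing or its contents changed in transit.
verify_checksum() {
  local archive="$1"
  ( cd "$(dirname "$archive")" \
      && sha256sum -c "$(basename "$archive").sha256" )
}
```

A typical use would be write_checksum wds-docker-images.tar after archiving, copying both files to the target environment, and running verify_checksum wds-docker-images.tar before the push script.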
Load and push WDS docker images to a private registry:

IMPORTANT! The current user must have access to the private registry and be logged in to it (docker login)

chmod +x push-wds-images.sh
./push-wds-images.sh [registry] wds-docker-images.tar
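
The push script builds each target tag as registry address, slash, image name, so the registry argument has to be a bare host (optionally with a port), without a URL scheme or trailing slash; docker rejects image references that contain a scheme such as https://. As a hedged sketch, the hypothetical helpers below (not part of the WDS scripts) normalize an address and preview the resulting tag without calling docker.

```shell
#!/usr/bin/env bash
# Sketch only: normalize_registry and final_tag are hypothetical helper names.

# Strip a leading URL scheme and any trailing slashes from a registry address.
normalize_registry() {
  local registry="$1"
  registry="${registry#http://}"
  registry="${registry#https://}"
  # Remove trailing slashes one at a time until none is left.
  while [ "${registry%/}" != "$registry" ]; do
    registry="${registry%/}"
  done
  printf '%s\n' "$registry"
}

# Print the tag of the form <registry>/<image> for one image.
final_tag() {
  printf '%s/%s\n' "$(normalize_registry "$1")" "$2"
}
```

For example, final_tag https://registry.example.local:5000/ webdatasource/dapi:v1.0.0 prints registry.example.local:5000/webdatasource/dapi:v1.0.0, where registry.example.local:5000 is a placeholder address.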

Deployment Options

The docker-compose files for an air-gapped environment differ slightly from the regular ones: every image reference points to the private docker registry that the images were pushed to. Therefore, use the docker-compose files from this page, which already reference the private registry.
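
The configurations on this page write the registry address as the literal placeholder [registry]. One possible way to fill it in, sketched here with a hypothetical function name and file names, is a small sed substitution. It assumes the registry address contains no "|", "&", or backslash characters, which holds for ordinary host:port addresses.

```shell
#!/usr/bin/env bash
# Sketch only: render_compose is a hypothetical helper, not part of WDS.

# Print a compose file with every literal "[registry]" replaced by the given address.
render_compose() {
  local registry="$1" template="$2"
  # "|" is the sed delimiter because registry addresses may contain "/" and ":".
  sed "s|\[registry\]|${registry}|g" "$template"
}
```

For example, render_compose registry.example.local:5000 wds-docker-compose.template.yml > wds-docker-compose.yml writes a ready-to-use file, where both the address and the template file name are placeholders.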

The deployment options themselves are unchanged, and all of them are available in air-gapped environments:

This deployment option connects to an external MongoDB. For evaluation purposes, a Free MongoDB Atlas cluster can be used.
In the wds.solidstack service, replace the value of the DB_CONNECTION_STRING environment variable with a real connection string to a MongoDB database.

IMPORTANT! The MongoDB connection string must contain a database name. It can be Solidstack, as in the example connection string, or any other name.
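
A missing database name is easy to overlook, so a quick check before starting the services may help. The sketch below is an assumption-laden illustration rather than WDS tooling: has_database_name is a hypothetical function, and the pattern only looks for a non-empty path segment after the host list in a mongodb:// or mongodb+srv:// URI.

```shell
#!/usr/bin/env bash
# Sketch only: has_database_name is a hypothetical helper, not part of WDS.

# Succeed if the connection string has a non-empty database name after the host part,
# e.g. mongodb+srv://user:pass@host/Solidstack?appName=cluster
has_database_name() {
  printf '%s' "$1" | grep -Eq '^mongodb(\+srv)?://[^/]+/[^/?]+'
}
```

For example, has_database_name "mongodb+srv://user:pass@host.example/?appName=cluster" fails because nothing stands between the slash and the question mark.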

services:  
  wds.solidstack:
    image: [registry]/webdatasource/solidstack:v1.0.0
    restart: always
    hostname: solidstack.svc
    environment:
      - EXTERNAL_IP_ADDRESS_CONFIGS=intranet
      - JOBS_TYPES=intranet
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Solidstack?appName=<cluster>&readPreference=secondary
    ports:
      - 2807:8081
  
  wds.playground:
    image: [registry]/webdatasource/playground:v1.0.0
    restart: always
    hostname: playground.svc
    ports:
    - 2808:80
  
  wds.docs:
    image: [registry]/webdatasource/docs:v1.0.0
    restart: always
    ports:
    - 2809:80
This deployment option connects to an external MongoDB. For evaluation purposes, a Free MongoDB Atlas cluster can be used.
In the wds.dapi, wds.datakeeper, and wds.idealer services, replace the values of the DB_CONNECTION_STRING environment variables with real connection strings to a MongoDB database.

By default, the system database (MongoDB) is used to cache web pages. To change this, provide the wds.datakeeper service with a CACHE_CONNECTION_STRING.

IMPORTANT! Each MongoDB connection string must contain a database name. The names can be the same as in the example connection strings or different.

services: 
  wds.crawler:
    image: [registry]/webdatasource/crawler:v1.0.0
    restart: always
    hostname: crawler.svc
    environment:
      - DATAKEEPER_ORIGIN=http://datakeeper.svc
      - SERVICE_HOST=crawler.svc
      - EXTERNAL_IP_ADDRESS_CONFIGS=intranet
  
  wds.datakeeper:
    image: [registry]/webdatasource/datakeeper:v1.0.0
    restart: always
    hostname: datakeeper.svc
    environment:
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Datakeeper?appName=<cluster>&readPreference=secondary
      - IDEALER_ORIGIN=http://idealer.svc

  wds.idealer:
    image: [registry]/webdatasource/idealer:v1.0.0
    restart: always
    hostname: idealer.svc
    environment:
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Idealer?appName=<cluster>&readPreference=secondary
  
  wds.scraper:
    image: [registry]/webdatasource/scraper:v1.0.0
    restart: always
    hostname: scraper.svc
  
  wds.dapi:
    image: [registry]/webdatasource/dapi:v1.0.0
    restart: always
    hostname: dapi.svc
    environment:
      - DB_CONNECTION_STRING=mongodb+srv://<user>:<password>@<host>/Dapi?appName=<cluster>&readPreference=secondary
      - DATAKEEPER_ORIGIN=http://datakeeper.svc
      - SCRAPER_ORIGIN=http://scraper.svc
      - IDEALER_ORIGIN=http://idealer.svc
      - JOBS_TYPES=intranet
    ports:
      - 2807:8081
  
  wds.playground:
    image: [registry]/webdatasource/playground:v1.0.0
    restart: always
    hostname: playground.svc
    ports:
    - 2808:80
  
  wds.docs:
    image: [registry]/webdatasource/docs:v1.0.0
    restart: always
    ports:
    - 2809:80
This deployment option contains all third-party services for system evaluation, even in an air-gapped environment. There is only one instance per service, and credentials are not protected since this is an evaluation environment.

services:
  mongodb-primary:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-primary.svc
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-primary.svc
      - MONGODB_ROOT_USER=root
      - MONGODB_ROOT_PASSWORD=TestPassword
      - MONGODB_REPLICA_SET_MODE=primary
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
  
  mongodb-secondary:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-secondary.svc
    depends_on:
      - mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-secondary.svc
      - MONGODB_REPLICA_SET_MODE=secondary
      - MONGODB_INITIAL_PRIMARY_HOST=mongodb-primary.svc
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
      - MONGODB_INITIAL_PRIMARY_ROOT_PASSWORD=TestPassword

  mongodb-arbiter:
    image: [registry]/zcube/bitnami-compat-mongodb:6.0
    restart: always
    hostname: mongodb-arbiter.svc
    depends_on:
      - mongodb-primary
    environment:
      - MONGODB_ADVERTISED_HOSTNAME=mongodb-arbiter.svc
      - MONGODB_REPLICA_SET_MODE=arbiter
      - MONGODB_INITIAL_PRIMARY_HOST=mongodb-primary.svc
      - MONGODB_REPLICA_SET_KEY=TestReplicasetKey
      - MONGODB_INITIAL_PRIMARY_ROOT_PASSWORD=TestPassword
  
  minio:
    image: [registry]/bitnami/minio:2024
    restart: always
    hostname: minio.svc
    environment:
      - MINIO_DEFAULT_BUCKETS=pages-cache
      - MINIO_ROOT_USER=TestAccessKey
      - MINIO_ROOT_PASSWORD=TestSecretKey
      - MINIO_FORCE_NEW_KEYS=no
 
  wds.crawler:
    image: [registry]/webdatasource/crawler:v1.0.0
    restart: always
    hostname: crawler.svc
    environment:
      - DATAKEEPER_ORIGIN=http://datakeeper.svc
      - SERVICE_HOST=crawler.svc
      - EXTERNAL_IP_ADDRESS_CONFIGS=intranet
  
  wds.datakeeper:
    image: [registry]/webdatasource/datakeeper:v1.0.0
    restart: always
    hostname: datakeeper.svc
    environment:
      - DB_CONNECTION_STRING=mongodb://root:TestPassword@mongodb-primary.svc,mongodb-secondary.svc/Datakeeper?authSource=admin&replicaSet=replicaset&readPreference=secondary
      - CACHE_CONNECTION_STRING=s3://TestAccessKey:TestSecretKey@minio.svc:9000/pages-cache?ssl=false
      - IDEALER_ORIGIN=http://idealer.svc

  wds.idealer:
    image: [registry]/webdatasource/idealer:v1.0.0
    restart: always
    hostname: idealer.svc
    environment:
      - DB_CONNECTION_STRING=mongodb://root:TestPassword@mongodb-primary.svc,mongodb-secondary.svc/Idealer?authSource=admin&replicaSet=replicaset&readPreference=secondary
  
  wds.scraper:
    image: [registry]/webdatasource/scraper:v1.0.0
    restart: always
    hostname: scraper.svc
  
  wds.dapi:
    image: [registry]/webdatasource/dapi:v1.0.0
    restart: always
    hostname: dapi.svc
    environment:
      - DB_CONNECTION_STRING=mongodb://root:TestPassword@mongodb-primary.svc,mongodb-secondary.svc/Dapi?authSource=admin&replicaSet=replicaset&readPreference=secondary
      - DATAKEEPER_ORIGIN=http://datakeeper.svc
      - SCRAPER_ORIGIN=http://scraper.svc
      - IDEALER_ORIGIN=http://idealer.svc
      - JOBS_TYPES=intranet
    ports:
      - 2807:8081
  
  wds.playground:
    image: [registry]/webdatasource/playground:v1.0.0
    restart: always
    hostname: playground.svc
    ports:
    - 2808:80
  
  wds.docs:
    image: [registry]/webdatasource/docs:v1.0.0
    restart: always
    ports:
    - 2809:80

Running Docker Compose

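After selecting a deployment option and copying its docker-compose configuration to a file (e.g., wds-docker-compose.yml), start the system with the command that matches your shell. The first command is for the Windows command prompt and the second is for Linux and macOS shells. In the Windows command, the quotes keep the space before && out of the variable value.

set "COMPOSE_FILE=wds-docker-compose.yml" && docker compose up
export COMPOSE_FILE=wds-docker-compose.yml && docker compose up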
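
In an air-gapped environment, any image reference that does not point to the private registry fails to pull, so it may be useful to scan the compose file before starting it. The following sketch uses a hypothetical function name and assumes each image reference sits on its own "image:" line, as in the configurations on this page.

```shell
#!/usr/bin/env bash
# Sketch only: foreign_images is a hypothetical helper, not part of WDS.

# Print every image reference in a compose file that does not start with "<registry>/".
# Empty output means all images are expected to come from the private registry.
foreign_images() {
  local registry="$1" compose_file="$2"
  awk -v prefix="${registry}/" \
    '$1 == "image:" && index($2, prefix) != 1 { print $2 }' "$compose_file"
}
```

For example, foreign_images registry.example.local:5000 wds-docker-compose.yml also lists any reference that still contains an unreplaced [registry] placeholder; the registry address here is a placeholder.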
