
Configuration Reference

This document covers the configuration options for the dk CLI, including configuration files, environment variables, and per-project settings.

Configuration File

The dk CLI uses hierarchical YAML configuration. Settings are loaded from three scopes (lowest to highest precedence):

  1. System: /etc/datakit/config.yaml
  2. User: ~/.config/dk/config.yaml
  3. Repo: {git-root}/.dk/config.yaml

Higher-precedence scopes override lower ones. Command-line flags override all scopes.
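The layering described above can be sketched in Python as a deep merge applied from the lowest-precedence scope to the highest. This is only an illustration, not the dk implementation: the source does not say whether nested sections are merged key by key or replaced wholesale, so the key-by-key behavior here is an assumption, and `deep_merge` and `merge_scopes` are hypothetical helper names.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`.

    Nested mappings are merged key by key (an assumption about dk's
    behavior); any other value is replaced outright.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def merge_scopes(*scopes: dict) -> dict:
    """Merge scopes passed in lowest-to-highest precedence order."""
    result: dict = {}
    for scope in scopes:
        result = deep_merge(result, scope)
    return result


# Illustrative contents of the three config files (system, user, repo).
system_cfg = {"dev": {"runtime": "k3d", "k3d": {"clusterName": "dk-local"}}}
user_cfg = {"dev": {"runtime": "compose"}}
repo_cfg = {"dev": {"k3d": {"clusterName": "team-cluster"}}}

merged_cfg = merge_scopes(system_cfg, user_cfg, repo_cfg)
print(merged_cfg)
```

In this sketch the user scope changes only `dev.runtime` and the repo scope changes only `dev.k3d.clusterName`, so both overrides survive in the merged result.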

Full Configuration

# .dk/config.yaml — Example with all supported settings

# Local development settings
dev:
  runtime: k3d               # Runtime type: k3d or compose
  workspace: /path/to/work   # Path to DK workspace (optional)
  k3d:
    clusterName: dk-local    # k3d cluster name (DNS-safe)

# Plugin registry settings
plugins:
  registry: ghcr.io/infobloxopen   # Default OCI registry for plugins
  mirrors:                          # Fallback registries (tried in order)
    - ghcr.io/backup-org
    - internal.registry.io
  overrides:                        # Per-plugin version/image overrides
    postgresql:
      version: v8.13.0             # Pin a specific version
    s3:
      image: custom-s3:v1          # Full image override (bypasses registry)
  destinations:                     # Per-destination connection overrides
    postgresql:
      connection_string: "postgresql://user:pass@host:5432/mydb?sslmode=disable"
    s3:
      bucket: my-output-bucket
      region: us-west-2
      endpoint: "http://localstack:4566"
    file:
      path: /custom/output/path

Configuration Sections

dev

Local development settings.

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| dev.runtime | string | Runtime type: k3d or compose | k3d |
| dev.workspace | string | Path to DK workspace | (none) |
| dev.k3d.clusterName | string | k3d cluster name (DNS-safe) | dk-local |

plugins

Plugin registry and override settings.

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| plugins.registry | string | Default OCI registry for destination plugins | ghcr.io/infobloxopen |
| plugins.mirrors | string[] | Fallback registries tried in order when primary fails | [] |
| plugins.overrides.<name>.version | string | Pin a specific version for a plugin (semver) | built-in default |
| plugins.overrides.<name>.image | string | Full image reference (bypasses registry + naming) | (none) |

Image resolution precedence:

  1. plugins.overrides.<name>.image → used as-is
  2. plugins.overrides.<name>.version → {registry}/cloudquery-plugin-{name}:{version}
  3. Default → {registry}/cloudquery-plugin-{name}:{built-in-version}
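As a rough sketch of the three rules above, the following Python function resolves an image reference from a `plugins` config mapping. The function name `resolve_image` and the built-in version values passed to it are hypothetical placeholders; the real built-in versions are not listed in this document, and mirror fallback is not modeled here.

```python
DEFAULT_REGISTRY = "ghcr.io/infobloxopen"


def resolve_image(name: str, plugins_cfg: dict, builtin_version: str) -> str:
    """Resolve the image for destination plugin `name`.

    `builtin_version` stands in for dk's built-in default version,
    which this document does not specify.
    """
    override = plugins_cfg.get("overrides", {}).get(name, {})
    registry = plugins_cfg.get("registry", DEFAULT_REGISTRY)

    # Rule 1: a full image override is used as-is.
    if override.get("image"):
        return override["image"]

    # Rule 2: a pinned version; Rule 3: the built-in version.
    version = override.get("version") or builtin_version
    return f"{registry}/cloudquery-plugin-{name}:{version}"


# Mirrors the `plugins` section of the example configuration.
plugins_cfg = {
    "registry": "ghcr.io/infobloxopen",
    "overrides": {
        "postgresql": {"version": "v8.13.0"},
        "s3": {"image": "custom-s3:v1"},
    },
}

# "v0.0.0" is a placeholder, not a real built-in version.
pg_image = resolve_image("postgresql", plugins_cfg, "v0.0.0")
s3_image = resolve_image("s3", plugins_cfg, "v0.0.0")
file_image = resolve_image("file", plugins_cfg, "v0.0.0")
print(pg_image, s3_image, file_image, sep="\n")
```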

plugins.destinations

Per-destination connection and spec overrides. These settings control how dk run connects destination plugins to their backing services (databases, object stores, etc.).

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| plugins.destinations.<name>.connection_string | string | Full connection string (postgresql) | auto-detected |
| plugins.destinations.<name>.bucket | string | S3 bucket name | dk-output |
| plugins.destinations.<name>.region | string | AWS region for S3 | us-east-1 |
| plugins.destinations.<name>.endpoint | string | Custom S3 endpoint (e.g., LocalStack) | auto-detected |
| plugins.destinations.<name>.path | string | Output directory for file destination | /home/nonroot/cq-sync-output |

Spec resolution order (highest to lowest precedence):

  1. Config override — explicit values in plugins.destinations.<name>.*
  2. In-cluster auto-detect — discovered from running k3d services via kubectl
  3. Built-in default — hardcoded fallback values
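One way to picture this resolution order is as three layers applied from lowest to highest precedence. The Python sketch below assumes resolution happens per field, so an override of `bucket` still leaves the auto-detected `endpoint` in place; the document does not confirm that detail, and `resolve_spec` is a hypothetical name.

```python
# Built-in defaults taken from the table above.
BUILTIN_DEFAULTS = {
    "s3": {"bucket": "dk-output", "region": "us-east-1"},
    "file": {"path": "/home/nonroot/cq-sync-output"},
}


def resolve_spec(name: str, config_overrides: dict, detected: dict) -> dict:
    """Layer built-in defaults, auto-detected values, then config overrides."""
    spec = dict(BUILTIN_DEFAULTS.get(name, {}))   # lowest precedence
    spec.update(detected.get(name, {}))           # in-cluster auto-detect
    spec.update(config_overrides.get(name, {}))   # highest precedence
    return spec


# Values a detection step might have produced for the "default" namespace.
detected = {
    "s3": {
        "endpoint": "http://dk-localstack-localstack.default.svc.cluster.local:4566",
        "bucket": "dk-output",
    }
}
# Values from plugins.destinations.* in the config file.
overrides = {"s3": {"bucket": "my-data-lake"}}

s3_spec = resolve_spec("s3", overrides, detected)
print(s3_spec)
```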

Auto-detection details:

During dk run, the CLI queries the k3d cluster for known services:

  • PostgreSQL: Looks for dk-postgres-postgres service in the current namespace. If found, builds the connection string using the in-cluster DNS name (dk-postgres-postgres.<namespace>.svc.cluster.local:5432) with default credentials (postgres:postgres, database postgres).
  • S3 (LocalStack): Looks for dk-localstack-localstack service in the current namespace. If found, uses its in-cluster DNS endpoint (http://dk-localstack-localstack.<namespace>.svc.cluster.local:4566), sets force_path_style: true, and uses the dk-output bucket.
  • File: No auto-detection needed. Defaults to /home/nonroot/cq-sync-output inside the container, which is bind-mounted to ./cq-sync-output/ on the host.
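The values produced by auto-detection follow directly from the service names and defaults listed above. The sketch below only builds those values for a given namespace; it does not perform the kubectl lookup, the helper names are hypothetical, and the exact connection-string form dk emits (for example, any query parameters such as sslmode) is not stated in this document.

```python
def postgres_connection_string(namespace: str) -> str:
    """Connection string for the in-cluster PostgreSQL service.

    Uses the default credentials (postgres:postgres) and the
    `postgres` database described above.
    """
    host = f"dk-postgres-postgres.{namespace}.svc.cluster.local"
    return f"postgresql://postgres:postgres@{host}:5432/postgres"


def localstack_spec(namespace: str) -> dict:
    """S3 settings pointing at the in-cluster LocalStack service."""
    host = f"dk-localstack-localstack.{namespace}.svc.cluster.local"
    return {
        "endpoint": f"http://{host}:4566",
        "force_path_style": True,
        "bucket": "dk-output",
    }


print(postgres_connection_string("default"))
print(localstack_spec("default"))
```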

Examples:

Override the PostgreSQL connection string for a custom database:

dk config set plugins.destinations.postgresql.connection_string \
  "postgresql://myuser:mypass@custom-host:5432/analytics?sslmode=disable"

Point S3 output at a custom bucket and endpoint:

dk config set plugins.destinations.s3.bucket my-data-lake
dk config set plugins.destinations.s3.endpoint "http://minio:9000"

Change the file output path:

dk config set plugins.destinations.file.path /data/output

View the effective destination configuration:

dk config list | grep destinations

registry

Registry settings for artifact publishing.

| Field | Type | Description |
| --- | --- | --- |
| default | string | Default registry URL for dk publish |
| credentials | array | List of registry credentials |
| credentials[].registry | string | Registry hostname |
| credentials[].username | string | Username (supports env vars) |
| credentials[].token | string | Access token (supports env vars) |

environments

Environment configuration for promotions.

| Field | Type | Description |
| --- | --- | --- |
| <env>.gitops | string | GitOps repository URL |
| <env>.path | string | Path within repository |
| <env>.auto_merge | boolean | Auto-merge PRs (default: false) |
| <env>.approvers | array | Required approvers |
| <env>.approval_count | integer | Number of approvals required |

lineage

OpenLineage backend configuration.

| Field | Type | Description |
| --- | --- | --- |
| backend | string | Backend type: marquez, datahub, custom |
| endpoint | string | API endpoint URL |
| api_key | string | Optional API key |

defaults

Default values for CLI flags.

| Field | Type | Description |
| --- | --- | --- |
| output | string | Default output format |
| timeout | duration | Default command timeout |
| namespace | string | Default namespace |
| log_level | string | Logging level |

Environment Variables

Environment variables override configuration file values.

Core Variables

| Variable | Description | Example |
| --- | --- | --- |
| DK_CONFIG | Config file path | /custom/config.yaml |
| DK_NAMESPACE | Default namespace | analytics |
| DK_OUTPUT_FORMAT | Output format | json |
| DK_LOG_LEVEL | Log level | debug |
| DK_DEBUG | Enable debug mode | true |
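To illustrate how environment variables can take priority over file values, here is a small Python sketch. The mapping from variable names to config keys (for example, `DK_LOG_LEVEL` to `defaults.log_level`) is an assumption inferred from the descriptions in this document rather than a documented contract, and `apply_env` is a hypothetical helper.

```python
from copy import deepcopy

# Assumed mapping of environment variables to (section, key) pairs.
ENV_TO_SETTING = {
    "DK_NAMESPACE": ("defaults", "namespace"),
    "DK_OUTPUT_FORMAT": ("defaults", "output"),
    "DK_LOG_LEVEL": ("defaults", "log_level"),
}


def apply_env(config: dict, environ: dict) -> dict:
    """Return a copy of `config` with any set, non-empty variables applied."""
    result = deepcopy(config)
    for var, (section, key) in ENV_TO_SETTING.items():
        value = environ.get(var)
        if value:
            result.setdefault(section, {})[key] = value
    return result


file_config = {"defaults": {"namespace": "my-project", "log_level": "info"}}
# Pass os.environ here in real use; a plain dict keeps the example reproducible.
sample_env = {"DK_LOG_LEVEL": "debug", "DK_OUTPUT_FORMAT": "json"}

with_env = apply_env(file_config, sample_env)
print(with_env)
```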

Registry Variables

| Variable | Description | Example |
| --- | --- | --- |
| DK_REGISTRY | Default registry | ghcr.io/myorg |
| DK_REGISTRY_USER | Registry username | ci-bot |
| DK_REGISTRY_TOKEN | Registry token | ghp_xxx... |

Lineage Variables

| Variable | Description | Example |
| --- | --- | --- |
| OPENLINEAGE_URL | OpenLineage endpoint | http://marquez:5000/api/v1/lineage |
| OPENLINEAGE_API_KEY | API key for lineage | api-key-xxx |

Development Variables

| Variable | Description | Example |
| --- | --- | --- |
| DK_DEV_NETWORK | Docker network name | dk-network |
| DK_DEV_TIMEOUT | Dev stack timeout | 120s |

Project Configuration

Project-specific settings in .dk/config.yaml:

# .dk/config.yaml (in project root)

# Project-specific registry
registry:
  default: ghcr.io/myteam

# Project namespace
defaults:
  namespace: my-project

Configuration Precedence

  1. Command-line flags (highest priority) — e.g., --registry
  2. Environment variables
  3. Repo configuration ({git-root}/.dk/config.yaml)
  4. User configuration (~/.config/dk/config.yaml)
  5. System configuration (/etc/datakit/config.yaml)
  6. Built-in defaults (lowest priority)
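The precedence list above can be read as a first-match lookup across ordered sources. The Python sketch below returns both the value and the source that supplied it. The layer contents are made-up examples and `lookup` is a hypothetical helper, not part of dk.

```python
def lookup(key: str, layers: list) -> tuple:
    """Return (value, source) from the first layer that defines `key`.

    `layers` is a list of (source_name, settings) pairs ordered from
    highest to lowest priority, with dotted keys as in `dk config set`.
    """
    for source, settings in layers:
        if key in settings:
            return settings[key], source
    raise KeyError(key)


layers = [
    ("flag", {}),
    ("env", {"defaults.log_level": "debug"}),
    ("repo", {"registry.default": "ghcr.io/myteam",
              "defaults.namespace": "my-project"}),
    ("user", {"defaults.namespace": "analytics"}),
    ("system", {}),
    ("built-in", {"defaults.log_level": "info", "dev.runtime": "k3d"}),
]

for key in ("defaults.namespace", "defaults.log_level", "dev.runtime"):
    value, source = lookup(key, layers)
    print(f"{key} = {value} (from {source})")
```

Here the repo value for `defaults.namespace` wins over the user value, the environment wins for `defaults.log_level`, and `dev.runtime` falls through to the built-in default.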

Use dk config list to see the effective value and source for each setting.


Local Development Stack

The dk dev up command deploys Helm charts to a local k3d cluster, providing the following services:

| Service | Port | Purpose |
| --- | --- | --- |
| Redpanda | 19092 | Kafka-compatible streaming |
| LocalStack | 4566 | AWS S3 emulation |
| PostgreSQL | 5432 | Relational database |
| Marquez | 5000, 3000 | Data lineage tracking |

Chart versions and Helm values can be overridden via dk config:

dk config set dev.charts.redpanda.version 25.2.0
dk config set dev.charts.postgres.values.primary.resources.limits.memory 1Gi

Governance Configuration

policies.yaml

Define organization-wide policies:

# .dk/policies.yaml

policies:
  # Require classification on all outputs
  require_classification: true

  # Owner email pattern
  owner_pattern: "^[a-z-]+@example\\.com$"

  # Maximum retention for PII data
  max_pii_retention_days: 730

  # Required tags for confidential data
  confidential_required_tags:
    - gdpr

  # Require description on packages
  require_description: true

  # Minimum description length
  min_description_length: 20

Policy Enforcement

Policies are checked by dk lint:

dk lint --policy .dk/policies.yaml
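To make the policy semantics concrete, here is a minimal Python sketch of how checks like these could be evaluated against one package's metadata. It is not how dk lint is implemented: the package field names (`owner`, `classification`, `pii`, `retention_days`, `tags`, `description`) and the `check_package` function are hypothetical, and the interpretation of each policy is an assumption based on its comment in the example file.

```python
import re

# Same values as the policies.yaml example, written as a Python dict.
POLICIES = {
    "require_classification": True,
    "owner_pattern": r"^[a-z-]+@example\.com$",
    "max_pii_retention_days": 730,
    "confidential_required_tags": ["gdpr"],
    "require_description": True,
    "min_description_length": 20,
}


def check_package(pkg: dict, policies: dict) -> list:
    """Return a list of human-readable policy violations for `pkg`."""
    violations = []

    if policies.get("require_classification") and not pkg.get("classification"):
        violations.append("missing classification")

    pattern = policies.get("owner_pattern")
    if pattern and not re.search(pattern, pkg.get("owner", "")):
        violations.append("owner does not match owner_pattern")

    max_days = policies.get("max_pii_retention_days")
    if max_days is not None and pkg.get("pii") and pkg.get("retention_days", 0) > max_days:
        violations.append(f"PII retention exceeds {max_days} days")

    if pkg.get("classification") == "confidential":
        required = policies.get("confidential_required_tags", [])
        missing = [tag for tag in required if tag not in pkg.get("tags", [])]
        if missing:
            violations.append("confidential data missing tags: " + ", ".join(missing))

    description = pkg.get("description", "")
    min_len = policies.get("min_description_length", 0)
    if policies.get("require_description") and not description:
        violations.append("missing description")
    elif len(description) < min_len:
        violations.append(f"description shorter than {min_len} characters")

    return violations


good_pkg = {
    "owner": "data-team@example.com",
    "classification": "confidential",
    "pii": True,
    "retention_days": 365,
    "tags": ["gdpr"],
    "description": "Daily customer orders export",
}
bad_pkg = dict(good_pkg, owner="Bob@other.org", retention_days=1000,
               tags=[], description="too short")

print(check_package(good_pkg, POLICIES))
print(check_package(bad_pkg, POLICIES))
```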

Shell Completion

Enable tab completion for a better CLI experience.

Bash

# Add to ~/.bashrc
source <(dk completion bash)

Zsh

# Add to ~/.zshrc
source <(dk completion zsh)

Fish

dk completion fish | source

Logging

Log Levels

| Level | Description |
| --- | --- |
| debug | Verbose output for troubleshooting |
| info | Normal operation (default) |
| warn | Warnings only |
| error | Errors only |

Setting Log Level

# Environment variable
export DK_LOG_LEVEL=debug

# Config file
defaults:
  log_level: debug

# Command line
dk run --log-level debug

Log Output

# JSON format for parsing
export DK_LOG_FORMAT=json

# Include timestamps
export DK_LOG_TIMESTAMPS=true

See Also