WIP-0010: Whitebox Host Manager

Introduction

This WIP proposes a Whitebox Host Manager: a small privileged service that executes a narrow set of host-level operations on behalf of unprivileged Whitebox containers.

The goal is to reduce operational coupling and blast radius, while keeping the Host Manager itself intentionally thin.

Current pain points:

  1. Multiple containers run privileged and mount /dev, D-Bus, and Docker socket.
  2. Host operations (network, devices, service lifecycle, power operations) are spread across plugins.
  3. There is no single audited execution path for host-level actions.

The Host Manager is not a security sandbox against malicious plugins — plugins are trusted code once installed. The main gains are cleaner boundaries, lower complexity in plugins, and more predictable host operations.

Problem Statement

Current State: Many Privileged Containers

  • backend, rq-worker, and plugin daemons may need host access.
  • Device/network plugins run host commands (nmcli, device access) directly.
  • Docker service management is not centralized.

This creates a wide blast radius: one bug in one plugin can impact unrelated host functions.

What Needs Host Access

  1. Network commands (nmcli and related status queries)
  2. Device enumeration/proxying (/dev, hotplug)
  3. Docker service lifecycle for plugin-defined services (WIP-0006)
  4. System-level operations (power status, shutdown)

Scope and Non-Goals

In Scope

  • A privileged container that executes any command on behalf of unprivileged callers
  • Audited, authenticated requests from Whitebox containers
  • Client library in backend that validates plugin commands against declared capability patterns
  • Migration path from direct privileged operations

Non-Goals

  • Re-implementing OS subsystems in the Host Manager
  • Flight orchestration logic inside Host Manager
  • Watchdog/monitoring behavior (separate concern)
  • Runtime security isolation between plugins in the same Python process

Proposed Architecture

Command Execution Facade

The Host Manager is a dumb executor. It receives a command (executable + arguments), runs it via subprocess, and returns the result. It has no knowledge of command families, plugins, or domain concepts. No command registry, no whitelisting logic — just authenticated execution.

The backend is responsible for validating whether a plugin is allowed to run a given command, based on patterns declared in the plugin's pyproject.toml. The Host Manager trusts authenticated callers to have already performed this validation.

Examples of commands the backend would forward:

  • ["nmcli", "dev", "wifi", "list", "ifname", "wlan0"]
  • ["nmcli", "connection", "up", "MyNetwork"]
  • ["docker", "compose", "up", "-d", "srs"]
  • ["systemctl", "poweroff"]

Runtime Placement

The Host Manager runs as a privileged Docker container with host access, managed by Docker Compose alongside the rest of the Whitebox stack.

  • Same deployment/update pipeline as the rest of Whitebox
  • Dependencies are available out of the box from the image
  • Always runs on the matching whitebox-base image — no separate versioning or dependency management
  • No additional packaging or provisioning logic outside Docker

The container is granted the host capabilities it needs (e.g. /dev, D-Bus, Docker socket) so it can execute host operations on behalf of unprivileged containers. The host itself is kept to the strict minimum — no logic or services run outside Docker.

Stack

Python, using the same framework family as the Whitebox backend (Django/DRF conventions and validation patterns).

Transport and Auth

  • HTTP API on 127.0.0.1 (TCP, localhost only)
  • Single shared bearer token: WHITEBOX_HM_AUTH_TOKEN
  • Callers identify themselves via the actor field in requests (backend or plugin:<name>) — trusted, not cryptographically enforced
  • Strict request logging (caller, command, args hash, outcome, duration)

API Model

The Host Manager exposes a single endpoint that executes a command. It has no command registry, no whitelisting, and no knowledge of what the command does. It trusts the bearer token and runs what it's told.

All validation — checking whether a plugin is allowed to run a given command — happens in the backend client before the request reaches the Host Manager.

Base path: /api/v1

Command Execution Contract

POST /api/v1/execute
Content-Type: application/json
Authorization: Bearer <token>

{
  "command": ["nmcli", "dev", "wifi", "list", "ifname", "wlan0"],
  "request_id": "uuid",
  "actor": "backend|plugin:<name>",
  "reason": "optional reason for audit",
  "timeout": 30
}

The command field is an array of strings — the executable and its arguments, passed directly to subprocess.run(). No shell interpretation.

Response:

{
  "status": "ok",
  "stdout": "...",
  "stderr": "...",
  "return_code": 0,
  "duration_ms": 123
}

Error (command failed):

{
  "status": "error",
  "stdout": "...",
  "stderr": "...",
  "return_code": 1,
  "duration_ms": 456
}

Error (Host Manager rejection):

{
  "status": "rejected",
  "message": "unauthorized"
}

Shutdown and Reboot Policy

The Host Manager does not decide flight policy or user-facing confirmation flow.

  • Plugins request reboot/shutdown indirectly via backend workflow.
  • Backend controls user confirmation and flight-aware policy.
  • During active flight, backend defaults to denying reboot/shutdown unless explicitly confirmed as emergency.

Flight-domain decisions stay in the application layer. The Host Manager is a dumb executor.

Graceful Shutdown Sequence

  1. Backend receives shutdown intent (user action, power event, or admin action).
  2. Backend emits pre-shutdown events to plugins and waits for cleanup deadline.
  3. Backend calls POST /api/v1/execute with ["systemctl", "poweroff"].
  4. Host Manager executes OS shutdown.

No callback loop from Host Manager to backend is required.
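The sequence above can be sketched as backend-side code. The plugin-notification hook and the client wiring here are assumptions for illustration, not existing Whitebox APIs:

```python
import time


def graceful_shutdown(client, notify_plugins, cleanup_deadline_s: float = 10.0) -> dict:
    """Emit pre-shutdown events, wait out the cleanup deadline, then poweroff."""
    notify_plugins("pre_shutdown")          # step 2: let plugins clean up
    time.sleep(cleanup_deadline_s)          # wait for the agreed deadline
    return client.execute(                  # step 3: forward to the Host Manager
        command=["systemctl", "poweroff"],
        actor="backend",
        allowed_patterns=[r"systemctl poweroff"],
        reason="graceful shutdown",
    )
```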

Server-Side: Host Manager

The Host Manager is intentionally minimal. It authenticates the request, runs the command via subprocess, and returns the result. No command registry, no domain knowledge.

import subprocess
import time
from typing import Any


class CommandExecutor:
    """Executes commands via subprocess. No whitelisting — that's the caller's job."""

    def execute(self, command: list[str], timeout: float = 30.0) -> dict[str, Any]:
        start = time.monotonic()
        try:
            result = subprocess.run(
                command,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            duration_ms = int((time.monotonic() - start) * 1000)
            return {
                "status": "ok" if result.returncode == 0 else "error",
                "stdout": result.stdout,
                "stderr": result.stderr,
                "return_code": result.returncode,
                "duration_ms": duration_ms,
            }
        except subprocess.TimeoutExpired:
            duration_ms = int((time.monotonic() - start) * 1000)
            return {
                "status": "error",
                "stdout": "",
                "stderr": f"Command timed out after {timeout}s",
                "return_code": -1,
                "duration_ms": duration_ms,
            }
        except OSError as exc:
            # e.g. executable not found; return the documented error shape
            # instead of letting the exception escape as a 500.
            duration_ms = int((time.monotonic() - start) * 1000)
            return {
                "status": "error",
                "stdout": "",
                "stderr": str(exc),
                "return_code": -1,
                "duration_ms": duration_ms,
            }

No shell interpretation — commands are passed directly to subprocess.run() as argument lists.
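The executor needs only a thin HTTP layer around it. A framework-agnostic sketch of the endpoint's core logic follows; the real service would wrap this in a Django/DRF view, and `handle_execute` plus the executor interface it assumes are illustrative names, not existing code:

```python
from typing import Any


def handle_execute(
    payload: dict[str, Any], authorized: bool, executor: Any
) -> tuple[int, dict[str, Any]]:
    """Core of POST /api/v1/execute: reject unauthenticated callers,
    validate the command shape, then delegate to the executor."""
    if not authorized:
        return 401, {"status": "rejected", "message": "unauthorized"}
    command = payload.get("command")
    if not isinstance(command, list) or not all(isinstance(a, str) for a in command):
        return 400, {"status": "rejected", "message": "command must be a list of strings"}
    timeout = float(payload.get("timeout", 30.0))
    result = executor.execute(command, timeout=timeout)
    return 200, result
```

Keeping the shape check here (a list of strings, never a single string) is what guarantees the no-shell-interpretation property end to end.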

Logging

The Host Manager logs every command it runs, tagged by request ID, including the raw stdout and stderr. Each log entry also records the actor, command, duration, and return code for audit purposes.

Plugin Capability Declarations

Plugins declare which commands they need to execute in pyproject.toml as regex patterns:

[tool.whitebox.host-capabilities]
commands = ["nmcli .*", "docker compose .*"]

Because permission enforcement lives in the backend client and the Host Manager is a generic executor, adding support for a new host command is purely declarative — the plugin author adds a pattern to pyproject.toml and no changes to the Host Manager are needed.

Client Library

The client lives in the Whitebox backend. It reads installed plugins' capability declarations, validates that a requested command matches the calling plugin's allowed patterns, and only then forwards to the Host Manager.

import re
from dataclasses import dataclass
from typing import Any

import httpx


@dataclass
class HostManagerConfig:
    base_url: str
    auth_token: str
    timeout_seconds: float = 30.0


class HostManagerClient:
    """Backend-side client. Validates plugin permissions, then delegates to Host Manager."""

    def __init__(self, config: HostManagerConfig):
        self._config = config
        self._client = httpx.Client(
            base_url=config.base_url,
            headers={"Authorization": f"Bearer {config.auth_token}"},
            timeout=config.timeout_seconds,
        )

    def execute(
        self,
        command: list[str],
        actor: str,
        allowed_patterns: list[str],
        reason: str | None = None,
        timeout: float = 30.0,
    ) -> dict[str, Any]:
        command_str = " ".join(command)
        if not any(re.fullmatch(p, command_str) for p in allowed_patterns):
            raise PermissionError(
                f"Command '{command_str}' not allowed for actor '{actor}'. "
                f"Allowed patterns: {allowed_patterns}"
            )
        response = self._client.post(
            "/api/v1/execute",
            json={
                "command": command,
                "actor": actor,
                "reason": reason,
                "timeout": timeout,
            },
        )
        response.raise_for_status()
        return response.json()

The backend reads each plugin's [tool.whitebox.host-capabilities] section at plugin discovery time and passes the allowed patterns to the client when a plugin requests command execution. The Host Manager never sees or evaluates these patterns — permission enforcement is entirely in the backend.
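The permission check at the heart of the client can be exercised on its own. A minimal sketch of the same full-match logic, using the patterns shown earlier:

```python
import re


def command_allowed(command: list[str], allowed_patterns: list[str]) -> bool:
    """Mirror of the client's check: the command, joined with spaces,
    must fully match at least one declared pattern."""
    command_str = " ".join(command)
    return any(re.fullmatch(p, command_str) for p in allowed_patterns)
```

Because `re.fullmatch` anchors both ends, a pattern like `nmcli .*` cannot be satisfied by a command that merely contains `nmcli` somewhere in its arguments.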

Security and Trust Boundaries

Enforced Boundaries

  • Only the Host Manager container has privileged host access.
  • Other containers cannot directly execute host commands — they go through the backend client.
  • The backend validates commands against plugin-declared capability patterns before forwarding.

Plugin Permissions

Plugin capability declarations ([tool.whitebox.host-capabilities]) are enforced by the backend client at runtime. The backend reads these patterns during plugin discovery and rejects commands that don't match. This is a real enforcement boundary — plugins cannot bypass it without modifying backend code.

The Host Manager itself performs no permission checks beyond bearer token auth. It trusts authenticated callers.

Power Events

The Host Manager provides access to host power state (via executed status commands), but does not implement shutdown policy.

Responsibility Split

  • Host Manager: reads and forwards power state, executes requested host actions
  • Backend: decides policy (notify UI, postpone, confirm, then call shutdown)

Failure Modes

Host Manager Unavailable

  • Client retries with exponential backoff
  • Backend/plugins degrade gracefully (empty lists, unavailable status)
  • No backend crash propagation
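The retry behavior might look like the following sketch. `execute_with_retry` is an illustrative name, and `ConnectionError` stands in for whatever transport error the HTTP client actually raises (e.g. httpx.ConnectError):

```python
import time


def execute_with_retry(execute_fn, max_attempts: int = 3, base_delay_s: float = 0.5):
    """Retry a Host Manager call with exponential backoff.

    Returns a degraded 'unavailable' result instead of raising, so the
    backend and plugins keep running when the Host Manager is down.
    """
    for attempt in range(max_attempts):
        try:
            return execute_fn()
        except ConnectionError:
            if attempt < max_attempts - 1:
                time.sleep(base_delay_s * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    return {"status": "unavailable", "stdout": "", "stderr": "", "return_code": -1}
```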

Docker Daemon Unavailable

Since the Host Manager runs as a Docker container, it shares Docker's failure domain. If the Docker daemon crashes, the Host Manager becomes unavailable along with all other Whitebox services.

Migration Plan

  1. Introduce Host Manager container and backend client with no behavior changes.
  2. Migrate WirelessInterfaceManager — replace direct nmcli subprocess calls with backend client calls.
  3. Migrate device access — replace direct /dev access with Host Manager calls.
  4. Migrate plugin-defined Docker service lifecycle (WIP-0006 integration).
  5. Remove privileged mounts/capabilities from backend and plugin daemons.
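Step 2 could look roughly like this before/after. The WirelessInterfaceManager internals shown here are assumptions for illustration, not actual Whitebox code:

```python
import subprocess


def list_wifi_direct(ifname: str) -> str:
    """Before: direct nmcli call, requires a privileged container."""
    result = subprocess.run(
        ["nmcli", "dev", "wifi", "list", "ifname", ifname],
        capture_output=True,
        text=True,
    )
    return result.stdout


def list_wifi_via_host_manager(client, ifname: str) -> str:
    """After: the same command forwarded through the backend client."""
    result = client.execute(
        command=["nmcli", "dev", "wifi", "list", "ifname", ifname],
        actor="plugin:network",
        allowed_patterns=[r"nmcli .*"],
    )
    return result["stdout"]
```

The command itself is unchanged; only the execution path moves, which is what lets step 5 strip the privileged mounts afterwards.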

Design Decisions

  1. Transport: TCP on localhost (127.0.0.1) with bearer token. Simple, debuggable with standard HTTP tooling, and sufficient for a single-host deployment where all callers are trusted.

  2. Deployment: Privileged Docker container, not a host systemd service. Keeps the host to the strict minimum — no logic or dependencies outside Docker. The container always runs on the matching whitebox-base image, so versioning and dependency management are handled by the existing Docker Compose pipeline.

  3. No command registry in Host Manager: The Host Manager has no knowledge of command families or whitelists. It's a generic subprocess executor behind bearer token auth. Permission enforcement lives in the backend client, which validates commands against plugin-declared capability patterns. This means new plugins can use new host commands without patching the Host Manager.

  4. Caller identity: Single shared token with honor-system actor identification. All callers use one WHITEBOX_HM_AUTH_TOKEN and self-report via the actor field. This matches the trust model — plugins are trusted code. Per-caller tokens can be introduced later if the trust model changes.

  5. Rollout: Docker Compose gains a host-manager service. During migration, both direct host access and Host Manager coexist until all plugins are migrated.