Observability & Metrics

High-fidelity observability using eBPF and the unified act-agent

Updated Feb 14, 2026

ACT leverages a unified agent architecture combined with eBPF (Extended Berkeley Packet Filter) to provide high-fidelity, low-overhead observability for your infrastructure and applications.

Architecture

The observability stack is built into the unified act-agent binary, which runs on every server. This eliminates the need for sidecars or multiple agents, ensuring a minimal resource footprint (typically under 5 MB of RAM).

eBPF Data Plane

On modern Linux kernels, ACT loads eBPF programs to collect metrics directly from the kernel. This method allows for:

  • Zero Instrumentation Code: No changes are required to your application code.
  • Low Overhead: Metrics are aggregated in kernel space using highly efficient maps.
  • Deep Visibility: Access to data unavailable to standard userspace agents (e.g., exact disk latency, TCP retransmissions).

The agent attaches to various kernel hooks:

  • sched_switch: Tracks CPU usage and context switches with microsecond precision.
  • tcp_rtt (LruHashMap): Measures network latency between all mesh peers.
  • stats_map: Collects per-container resource consumption from the kernel.
  • XDP Ingress: Packet-level ingress monitoring at the interface level, before full network stack processing.
  • Security Audit (execve, openat): Real-time auditing of process execution and sensitive file accesses using LSM hooks.

When eBPF is unavailable, the agent falls back to /proc parsing (see Graceful Degradation below).

Graceful Degradation

If eBPF is not available (e.g., on older kernels or incompatible environments), act-agent automatically falls back to a Procfs Provider. It continues to collect essential metrics via /proc filesystem and the Docker socket, ensuring basic visibility is always available.
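The procfs path boils down to parsing plain-text kernel files. As a rough illustration (the parser names and the selected fields are hypothetical, not the agent's actual code), reading CPU jiffies from /proc/stat and memory counters from /proc/meminfo might look like:

```python
def parse_proc_stat(text: str) -> dict:
    """Parse aggregate CPU jiffies from the text of /proc/stat."""
    for line in text.splitlines():
        if line.startswith("cpu "):  # aggregate line, not cpu0/cpu1/...
            fields = [int(v) for v in line.split()[1:]]
            user, nice, system, idle = fields[:4]
            return {"user": user, "system": system,
                    "idle": idle, "total": sum(fields)}
    raise ValueError("no aggregate cpu line found")

def parse_meminfo(text: str) -> dict:
    """Parse selected counters (in kB) from the text of /proc/meminfo."""
    wanted = {"MemTotal", "MemFree", "Cached", "SwapTotal", "SwapFree"}
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            out[key] = int(rest.split()[0])  # first token is the kB value
    return out
```

Operating on file contents rather than paths keeps the fallback dependency-free and easy to unit-test off a Linux host.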

Collected Metrics

The agent collects a comprehensive set of metrics across five categories:

1. Host Metrics

  • CPU: Usage, load averages, context switches.
  • Memory: Used, free, cache, swap.
  • Disk: Read/write operations, throughput, latency.
  • Network: Bytes in/out, packets, errors on all interfaces.

2. Container Metrics

  • Resource Usage: Per-container CPU, memory, and I/O.
  • Lifecycle: Detection of new containers and zombie processes.
  • Topology: Mapping of container-to-container communication.
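On hosts with a cgroup v2 unified hierarchy, per-container resource usage can be read from interface files such as cpu.stat under /sys/fs/cgroup. A minimal sketch (the function names are illustrative; the agent's real collector may differ):

```python
def parse_cgroup_cpu_stat(text: str) -> dict:
    """Parse cgroup v2 cpu.stat: 'key value' lines, times in microseconds."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[key] = int(value)
    return stats

def container_cpu_seconds(cpu_stat_text: str) -> float:
    """Total CPU time consumed by the container's cgroup, in seconds."""
    return parse_cgroup_cpu_stat(cpu_stat_text)["usage_usec"] / 1_000_000
```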

3. Application Performance (APM)

  • HTTP/HTTPS: Kernel-level socket tracing for request rates and latency distributions (p50, p90, p99).
  • Network Acceleration: Packet-level stats via XDP.
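The p50/p90/p99 figures can be derived from a window of request latency samples. A simple nearest-rank estimator (a sketch; the agent may aggregate histograms in kernel space rather than export raw samples):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample >= p% of the window."""
    if not samples:
        raise ValueError("empty sample window")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def latency_summary(samples_ms):
    """p50/p90/p99 summary over a window of request latencies (ms)."""
    return {p: percentile(samples_ms, p) for p in (50, 90, 99)}
```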

4. Security Audit

  • Process Activity: Tracking of every execve call with UID/GID and command-line context.
  • File Integrity: Monitoring of sensitive file opens (openat) across the system.

5. Agent Self-Monitoring

  • Overhead Tracking: The agent monitors its own CPU (permille) and Memory (RSS) usage to ensure it remains within the advertised ultra-low footprint.
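The permille figure reduces to CPU time consumed over wall-clock time, and RSS on Linux is readable from /proc/self/status. A sketch of the bookkeeping (these helpers are hypothetical, not the agent's API):

```python
def cpu_permille(cpu_delta_s: float, wall_delta_s: float) -> int:
    """CPU usage in permille over an interval: 10 permille = 1% of one core."""
    if wall_delta_s <= 0:
        return 0
    return round(cpu_delta_s / wall_delta_s * 1000)

def parse_vm_rss_kb(status_text: str) -> int:
    """Extract the VmRSS value (kB) from the text of /proc/self/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("VmRSS not found")
```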

Runtime Modes

The act-agent supports multiple modes of operation:

  • Daemon Mode (--daemon): The default mode. Aggregates metrics and pushes them to the Control Plane.
  • Buffered Ingestion: The API ingestor uses a write-ahead buffer (channel) and background worker to handle high-frequency metric streams without database contention.
  • Query Mode (--query <subsystem>): A one-shot command that outputs current local state as JSON for debugging.

Visualization

Metrics are visualized in the ACT Dashboard:

  • Server Detail: Real-time resource graphs.
  • Observability Tab (Deep Systems): Advanced eBPF-powered charts including:
    • HTTP APM: Dual-axis request rate and latency monitor.
    • XDP Dashboard: Interface-level packet throughput.
    • Security Runtime Audit: Timeline of execution and system file events.
    • Efficiency Monitor: Precise tracking of agent resource utilization.
  • Container View: Per-container performance and topology.
  • Mesh Dashboard: Visualization of inter-server traffic and health.