Observability & Metrics

High-fidelity observability using eBPF and the unified act-agent

Updated Feb 14, 2026

ACT leverages a unified agent architecture combined with eBPF (Extended Berkeley Packet Filter) to provide high-fidelity, low-overhead observability for your infrastructure and applications.

Architecture

The observability stack is built into the unified act-agent binary, which runs on every server. This eliminates the need for sidecars or multiple agents, ensuring a minimal resource footprint (typically under 5 MB of RAM).

eBPF Data Plane

On modern Linux kernels, ACT loads eBPF programs to collect metrics directly from the kernel. This method allows for:

  • Zero Instrumentation Code: No changes are required to your application code.
  • Low Overhead: Metrics are aggregated in kernel space using highly efficient maps.
  • Deep Visibility: Access to data unavailable to standard userspace agents (e.g., exact disk latency, TCP retransmissions).

The agent attaches to various kernel hooks:

  • sched_switch: Tracks CPU usage and context switches with microsecond precision.
  • tcp_rtt (LruHashMap): Measures network latency between all mesh peers.
  • stats_map: Collects per-container resource consumption from the kernel.
  • XDP Ingress: Packet-level ingress monitoring at the interface level, before full network stack processing.
  • Security Audit (execve, openat): Real-time auditing of process execution and sensitive file accesses using LSM hooks.

When eBPF is unavailable, the agent falls back to /proc parsing (see Graceful Degradation below).

Graceful Degradation

If eBPF is not available (e.g., on older kernels or incompatible environments), act-agent automatically falls back to a Procfs Provider. It continues to collect essential metrics via /proc filesystem and the Docker socket, ensuring basic visibility is always available.
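The procfs path boils down to parsing plain-text kernel files. As a rough illustration (the parser names and the selected fields are hypothetical, not the agent's actual code), reading CPU jiffies from /proc/stat and memory counters from /proc/meminfo might look like:

```python
def parse_proc_stat(text: str) -> dict:
    """Parse aggregate CPU jiffies from the text of /proc/stat."""
    for line in text.splitlines():
        if line.startswith("cpu "):  # aggregate line, not cpu0/cpu1/...
            fields = [int(v) for v in line.split()[1:]]
            user, nice, system, idle = fields[:4]
            return {"user": user, "system": system,
                    "idle": idle, "total": sum(fields)}
    raise ValueError("no aggregate cpu line found")

def parse_meminfo(text: str) -> dict:
    """Parse selected counters (in kB) from the text of /proc/meminfo."""
    wanted = {"MemTotal", "MemFree", "Cached", "SwapTotal", "SwapFree"}
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            out[key] = int(rest.split()[0])  # first token is the kB value
    return out
```

Operating on file contents rather than paths keeps the fallback dependency-free and easy to unit-test off a Linux host.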

Collected Metrics

The agent collects a comprehensive set of metrics across five categories:

1. Host Metrics

  • CPU: Usage, load averages, context switches.
  • Memory: Used, free, cache, swap.
  • Disk: Read/write operations, throughput, latency.
  • Network: Bytes in/out, packets, errors on all interfaces.

2. Container Metrics

  • Resource Usage: Per-container CPU, memory, and I/O.
  • Lifecycle: Detection of new containers and zombie processes.
  • Topology: Mapping of container-to-container communication.
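On hosts with a cgroup v2 unified hierarchy, per-container resource usage can be read from interface files such as cpu.stat under /sys/fs/cgroup. A minimal sketch (the function names are illustrative; the agent's real collector may differ):

```python
def parse_cgroup_cpu_stat(text: str) -> dict:
    """Parse cgroup v2 cpu.stat: 'key value' lines, times in microseconds."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[key] = int(value)
    return stats

def container_cpu_seconds(cpu_stat_text: str) -> float:
    """Total CPU time consumed by the container's cgroup, in seconds."""
    return parse_cgroup_cpu_stat(cpu_stat_text)["usage_usec"] / 1_000_000
```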

3. Application Performance (APM)

  • HTTP/HTTPS: Kernel-level socket tracing for request rates and latency distributions (p50, p90, p99).
  • Network Acceleration: Packet-level stats via XDP.
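The p50/p90/p99 figures can be derived from a window of request latency samples. A simple nearest-rank estimator (a sketch; the agent may aggregate histograms in kernel space rather than export raw samples):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample >= p% of the window."""
    if not samples:
        raise ValueError("empty sample window")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def latency_summary(samples_ms):
    """p50/p90/p99 summary over a window of request latencies (ms)."""
    return {p: percentile(samples_ms, p) for p in (50, 90, 99)}
```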

4. Security Audit

  • Process Activity: Tracking of every execve call with UID/GID and command-line context.
  • File Integrity: Monitoring of sensitive file opens (openat) across the system.

5. Agent Self-Monitoring

  • Overhead Tracking: The agent monitors its own CPU (permille) and Memory (RSS) usage to ensure it remains within the advertised ultra-low footprint.
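The permille figure reduces to CPU time consumed over wall-clock time, and RSS on Linux is readable from /proc/self/status. A sketch of the bookkeeping (these helpers are hypothetical, not the agent's API):

```python
def cpu_permille(cpu_delta_s: float, wall_delta_s: float) -> int:
    """CPU usage in permille over an interval: 10 permille = 1% of one core."""
    if wall_delta_s <= 0:
        return 0
    return round(cpu_delta_s / wall_delta_s * 1000)

def parse_vm_rss_kb(status_text: str) -> int:
    """Extract the VmRSS value (kB) from the text of /proc/self/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("VmRSS not found")
```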

Runtime Modes

The act-agent supports multiple modes of operation:

  • Daemon Mode (--daemon): The default mode. Aggregates metrics and pushes them to the Control Plane.
  • Buffered Ingestion: The API ingestor uses a write-ahead buffer (channel) and background worker to handle high-frequency metric streams without database contention.
  • Query Mode (--query <subsystem>): A one-shot command that outputs current local state as JSON for debugging.

Visualization

Metrics are visualized in the ACT Dashboard:

  • Server Detail: Real-time resource graphs.
  • Observability Tab (Deep Systems): Advanced eBPF-powered charts including:
    • HTTP APM: Dual-axis request rate and latency monitor.
    • XDP Dashboard: Interface-level packet throughput.
    • Security Runtime Audit: Timeline of execution and system file events.
    • Efficiency Monitor: Precise tracking of agent resource utilization.
  • Container View: Per-container performance and topology.
  • Mesh Dashboard: Visualization of inter-server traffic and health.