Observability

Built-in metrics and logs, plus how to forward telemetry to your existing stack.

Dashboard metrics

Every web service and worker exposes CPU, memory, request rate, 5xx error rate, and p95 latency in Service → Metrics. Metric data is retained for 30 days on Pro and 90 days on Enterprise.

| Metric | Use it for |
| --- | --- |
| CPU % | Right-sizing plans and autoscaling thresholds |
| Memory % | OOM investigation before pods restart |
| Request rate | Traffic spikes and deploy correlation |
| 5xx rate | Regression detection after releases |
| p95 latency | SLA tracking and slow-endpoint hunts |

Logs

Build and runtime stdout/stderr stream to Service → Logs. Filter by deploy, replica, or time range. If your app writes JSON lines, StackBlaze parses the severity and message fields so you can filter on them.
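
As an illustration, here is a minimal sketch of emitting JSON lines from a Python app using only the standard library. The severity and message field names come from the description above; the JsonLineFormatter class, the build_logger helper, the extra logger field, and the use of Python level names as severity values are assumptions made for this example, so confirm which severity values the log filter recognizes.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # Assumption: Python level names (INFO, ERROR, ...) are accepted as severity values.
            "severity": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
        }
        if record.exc_info:
            # json.dumps escapes newlines, so a traceback still fits on one line.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes JSON lines to stdout."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate output
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("order created")
    log.error("payment failed")
```

Writing one JSON object per line to stdout keeps the app independent of any platform-specific logging library.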

Note

High-traffic services on Hobby plans retain 7 days of logs. Upgrade to Pro for 30-day retention and log export via API.

Alerts

Configure alerts under Project → Alerts. Triggers include CPU above threshold, memory above threshold, deploy failure, and health check failure. Notifications go to email, Slack, or PagerDuty webhooks.

Export to external tools

StackBlaze does not lock you into a proprietary APM. Common patterns:

OpenTelemetry

Add the OTel SDK to your app and set OTEL_EXPORTER_OTLP_ENDPOINT to Datadog, Honeycomb, or Grafana Cloud. No sidecar required on web services.
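
A minimal sketch of the SDK setup in Python, assuming the opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http packages are installed. The configure_tracing function and the service name are hypothetical; the OpenTelemetry classes and the OTEL_EXPORTER_OTLP_ENDPOINT variable are part of the standard OpenTelemetry Python distribution.

```python
import os


def configure_tracing(service_name: str) -> bool:
    """Enable OTLP trace export when an endpoint is configured.

    Returns True if tracing was set up, False if it was skipped.
    """
    if not os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT"):
        # No endpoint set (for example in local development): leave tracing off.
        return False

    # Imported here so the SDK is only required where tracing is enabled.
    from opentelemetry import trace
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    provider = TracerProvider(resource=Resource.create({"service.name": service_name}))
    # With no arguments, the exporter reads OTEL_EXPORTER_OTLP_ENDPOINT
    # (and OTEL_EXPORTER_OTLP_HEADERS, if set) from the environment.
    provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    trace.set_tracer_provider(provider)
    return True
```

Call configure_tracing once at startup, before the app handles requests. Most vendors also expect an API key, commonly supplied through OTEL_EXPORTER_OTLP_HEADERS; the header name depends on the vendor.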

Log drains

Enterprise plans support HTTPS log drains (Splunk, Elastic, Axiom). Contact sales to enable drains on your organization.

Deploy correlation

Metrics and logs include a deploy_id label. When the error rate spikes after a deploy, roll back from Rollbacks, then compare p95 latency between the deploy versions in the metrics UI.
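
If you have exported latency samples and want to run the same comparison offline, the sketch below groups them by deploy_id and computes a nearest-rank p95 for each deploy. The record shape (a dict with deploy_id and latency_ms keys) and the sample values are hypothetical, not a documented export format.

```python
import math
from collections import defaultdict


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(95 * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def p95_by_deploy(samples: list[dict]) -> dict[str, float]:
    """Group latency samples by deploy_id and return the p95 for each."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for sample in samples:
        grouped[sample["deploy_id"]].append(sample["latency_ms"])
    return {deploy_id: p95(latencies) for deploy_id, latencies in grouped.items()}


if __name__ == "__main__":
    # Hypothetical samples; the record shape is an assumption for this sketch.
    samples = (
        [{"deploy_id": "dep_a", "latency_ms": float(ms)} for ms in range(1, 101)]
        + [{"deploy_id": "dep_b", "latency_ms": float(ms * 2)} for ms in range(1, 101)]
    )
    print(p95_by_deploy(samples))
```

A p95 that rises sharply between two consecutive deploy IDs points at the newer deploy as the likely source of the regression.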

Deep dive: Observability beyond metrics