# Observability and Tracing
DSP-META implements OpenTelemetry distributed tracing to enable monitoring and debugging of requests across the service infrastructure.
## Overview
The backend uses OpenTelemetry with the Rust tracing ecosystem to provide:
- **Distributed Tracing**: Continue traces from reverse proxies and upstream services
- **Automatic Span Creation**: HTTP requests and instrumented functions automatically create spans
- **W3C TraceContext**: Standard-compliant trace propagation via HTTP headers
- **Flexible Export**: Optional OTLP exporter for sending traces to observability backends
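
How these pieces fit together is easiest to see in code. Below is a minimal sketch of a subscriber setup with an optional OTLP layer, assuming the `tracing-subscriber` and `tracing-opentelemetry` crates; it is illustrative, not the actual dsp-meta initialization code.

```rust
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt, EnvFilter};

fn init_tracing() {
    let filter = EnvFilter::try_from_env("DSP_META_LOG_FILTER")
        .unwrap_or_else(|_| EnvFilter::new("info"));

    // An Option<Layer> is itself a Layer, so the OTLP bridge is simply
    // absent when OTEL_EXPORTER_OTLP_ENDPOINT is unset.
    let otel_layer = std::env::var("OTEL_EXPORTER_OTLP_ENDPOINT").ok().map(|_endpoint| {
        // Build the OTLP pipeline (opentelemetry-otlp) here and hand its
        // tracer to the bridge via .with_tracer(...); elided in this sketch.
        tracing_opentelemetry::layer()
    });

    tracing_subscriber::registry()
        .with(filter)
        .with(tracing_subscriber::fmt::layer().compact())
        .with(otel_layer)
        .init();
}
```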
## Architecture
```mermaid
graph LR
    A[Reverse Proxy] -->|traceparent header| B[TraceLayer]
    B --> C[Handler Functions]
    C --> D[OTLP Exporter]
    D --> E[Tempo/Jaeger/etc]
    E --> F[Grafana]
```
The trace flow:

- **TraceLayer** (`src/api/router.rs`) extracts the W3C TraceContext from HTTP headers and creates an `http_request` span as a child of the extracted context
- **Handler Functions** with `#[instrument]` create child spans automatically
- **OTLP Exporter** (optional) sends spans to observability backends
- **Grafana/Tempo** visualize the distributed traces
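
For reference, here is a hedged sketch of what this TraceLayer wiring can look like with `tower-http` and `tracing-opentelemetry`; the actual code in `src/api/router.rs` may differ in its details.

```rust
use axum::{body::Body, http::Request};
use opentelemetry::global;
use opentelemetry_http::HeaderExtractor;
use tower_http::trace::TraceLayer;
use tracing_opentelemetry::OpenTelemetrySpanExt;

// Sketch only: extract the W3C context from incoming headers and make it
// the parent of a per-request `http_request` span. Assumes a global
// TraceContextPropagator has been registered at startup.
let trace_layer = TraceLayer::new_for_http().make_span_with(|req: &Request<Body>| {
    let parent_cx = global::get_text_map_propagator(|propagator| {
        propagator.extract(&HeaderExtractor(req.headers()))
    });
    let span = tracing::info_span!(
        "http_request",
        method = %req.method(),
        uri = %req.uri(),
    );
    // With no traceparent the extracted context is empty and this span
    // becomes a new root (the automatic fallback described below).
    span.set_parent(parent_cx);
    span
});
```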
### Automatic Fallback

When no `traceparent` header is present (e.g., local development), new root spans are created automatically.
## Configuration
### Environment Variables
| Variable | Description | Default | Example |
|---|---|---|---|
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP endpoint URL for exporting traces (standard OpenTelemetry env var) | Not set (local only) | `http://localhost:4317` |
| `DSP_META_LOG_FILTER` | Log level filter | `info` | `debug` |
| `DSP_META_LOG_FMT` | Log output format | `compact` | `json` |
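
As a quick illustration of how these defaults play out, here is a small hedged sketch of reading the variables at startup (the parsing itself is illustrative, not the dsp-meta source):

```rust
use std::env;

// Fall back to the documented defaults when a variable is unset.
let log_filter = env::var("DSP_META_LOG_FILTER").unwrap_or_else(|_| "info".to_string());
let log_fmt = env::var("DSP_META_LOG_FMT").unwrap_or_else(|_| "compact".to_string());

// No default here: trace export stays disabled when the endpoint is unset.
let otlp_endpoint: Option<String> = env::var("OTEL_EXPORTER_OTLP_ENDPOINT").ok();
```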
### Local Development (No Export)

By default, traces are only logged locally:

```bash
just serve-dev
```
### Local Development with Grafana

Export traces to a local Grafana + Tempo stack:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 just serve-dev
```
### Production

In production, configure the OTLP endpoint to send traces to your observability backend:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=https://tempo.yourcompany.com:4317
DSP_META_LOG_FILTER=info
DSP_META_LOG_FMT=json
```
## Testing Locally with Grafana
### Prerequisites
- Docker and Docker Compose
- The DSP-META repository
### Quick Start (Recommended)
The easiest way to test with observability is using the dedicated just targets.

Start both the observability stack and the application:

```bash
just serve-with-observability
```

This will:

1. Start the Grafana LGTM stack (all-in-one observability container)
2. Start dsp-meta with the OTLP exporter enabled
3. Display URLs for accessing the services
For development with automatic reloading on code changes:

```bash
just serve-dev-with-observability
```
Start services independently:
# Start observability stack
just observability-up
# In another terminal, start your app
just serve-dev
# When done, stop observability stack
just observability-down
After running any of these commands, the observability stack will be available at:

- Grafana: http://localhost:3001
- OTLP gRPC endpoint: http://localhost:4317
- OTLP HTTP endpoint: http://localhost:4318
### Manual Setup (Alternative)
If you prefer manual control:
#### Step 1: Start Observability Stack

Start the Grafana LGTM all-in-one observability stack:

```bash
docker-compose -f docker-compose.observability.yml up -d
```
This starts a single container with:
- Grafana on port 3001
- Tempo (traces backend)
- Loki (logs backend)
- Mimir/Prometheus (metrics backend)
- Pyroscope (profiling backend)
- OTLP gRPC endpoint on port 4317
- OTLP HTTP endpoint on port 4318
#### Step 2: Start Application with Exporter

Run the application with the OTLP endpoint configured:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
DSP_META_LOG_FILTER=info \
just serve-dev
```

Or build and run the binary directly:

```bash
cargo build
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
./target/debug/dsp-meta
```
You should see the log message: `OTLP exporter configured successfully`
#### Step 3: Generate Traces

Generate some traces by making HTTP requests.

Make a request without trace context (creates a new root span):

```bash
curl http://localhost:3000/api/v1/projects/0001
```

Make a request with trace context (continues an existing trace):

```bash
curl -H "traceparent: 00-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-bbbbbbbbbbbbbbbb-01" \
  http://localhost:3000/api/v1/projects/0001
```

Generate multiple traces:

```bash
for i in {1..10}; do
  curl http://localhost:3000/api/v1/projects/0001
  sleep 0.5
done
```
#### Step 4: View Traces in Grafana

- Open Grafana at http://localhost:3001
- Click **Explore** (compass icon in the left sidebar)
- Select **Tempo** from the datasource dropdown
- Select the **Search** query type
- Filter by **Service Name**: `dsp-meta`
- Click **Run query**
- Click any trace to see the full span waterfall visualization
### What You'll See

In the Grafana trace view:

- **Trace ID**: Unique identifier for the distributed trace
- **Root Span** (`http_request`): Created by the TraceLayer for each HTTP request
- **Child Spans**: Functions annotated with `#[instrument]`
- **Duration**: Time taken by each span
- **Attributes**: HTTP method, URI, status code, latency, etc.
### Verifying Trace Propagation
When you send a request with a traceparent header, the Trace ID in Grafana should match the ID you sent.
This confirms that trace context propagation is working correctly.
## Span Propagation
### W3C TraceContext Format
The `traceparent` header follows the W3C TraceContext standard:

```
traceparent: 00-{trace-id}-{parent-span-id}-{trace-flags}
```

Example:

```
traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
```

- `00`: Version
- `0af7651916cd43dd8448eb211c80319c`: 32-character hex trace ID
- `b7ad6b7169203331`: 16-character hex parent span ID
- `01`: Trace flags (sampled)
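
If you need to inspect these values in code, here is a small hedged sketch of splitting a `traceparent` into its fields, with only basic validation; it is not taken from the dsp-meta codebase.

```rust
// Split "00-{trace-id}-{parent-span-id}-{trace-flags}" into its four parts.
// Real code should also verify the hex content, per the W3C spec.
fn parse_traceparent(value: &str) -> Option<(&str, &str, &str, &str)> {
    let mut parts = value.split('-');
    let version = parts.next()?;
    let trace_id = parts.next()?;
    let parent_span_id = parts.next()?;
    let flags = parts.next()?;
    (trace_id.len() == 32 && parent_span_id.len() == 16)
        .then_some((version, trace_id, parent_span_id, flags))
}

// Using the example above:
// parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
// -> Some(("00", "0af7651916cd43dd8448eb211c80319c", "b7ad6b7169203331", "01"))
```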
### Testing Propagation

Test that your application correctly continues traces:

```bash
# Generate unique trace and span IDs
TRACE_ID=$(openssl rand -hex 16)
SPAN_ID=$(openssl rand -hex 8)

# Send a request with that trace context
curl -v \
  -H "traceparent: 00-${TRACE_ID}-${SPAN_ID}-01" \
  http://localhost:3000/api/v1/projects/0001

# Search for this trace ID in Grafana
echo "Search Grafana for trace ID: ${TRACE_ID}"
```
In Grafana, you should find a trace with that exact Trace ID, proving the application continued the distributed trace.
## Adding Custom Span Attributes
You can enrich spans with custom attributes in your handler functions:
```rust
use tracing::{field::Empty, instrument, Span};

// Fields must be declared on the span (here as Empty placeholders) before
// they can be recorded later with `Span::record`.
#[instrument(fields(user_id = Empty, cache_hit = Empty, query_duration_ms = Empty))]
pub async fn my_handler() {
    let span = Span::current();

    // Add custom attributes
    span.record("user_id", "12345");
    span.record("cache_hit", true);
    span.record("query_duration_ms", 42);

    // These attributes will appear in the trace view
}
```
These custom attributes will be visible in Grafana and can be used for filtering and analysis.
## Troubleshooting
### No traces appearing in Grafana
**Check that the observability stack is running:**

```bash
docker ps | grep otel-lgtm
```

You should see a container named `otel-lgtm` running.
**Check that the application is exporting traces.** Look for this log message when starting the application:

```
Configuring OTLP exporter with endpoint: http://localhost:4317
```

If it is not present, verify that the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable is set.
**Check the LGTM container logs:**

```bash
docker logs dsp-meta-otel-lgtm-1
```

Look for any errors receiving or processing spans.
### Connection errors to Tempo

If you see connection errors (for example, when the application itself runs in a container), try Docker's `host.docker.internal` hostname to reach the host:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4317 just serve-dev
```
### Traces appear but are incomplete

Ensure all handler functions use the `#[instrument]` attribute:

```rust
use std::sync::Arc;
use axum::{extract::{Path, State}, response::Response};
use tracing::instrument;

// Shortcode, AppState, and DspMetaError are dsp-meta project types.
#[instrument(skip(state))]
pub async fn get_by_shortcode(
    Path(shortcode): Path<Shortcode>,
    State(state): State<Arc<AppState>>,
) -> Result<Response, DspMetaError> {
    // handler code
}
```

The `skip` parameter prevents large objects from being logged.
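
When a handler takes several large arguments, `skip_all` combined with explicitly declared fields keeps the span lean while still capturing what matters. A small sketch, not from the dsp-meta codebase:

```rust
use tracing::instrument;

// Skip every argument, but still record the shortcode as a span field.
#[instrument(skip_all, fields(shortcode = %shortcode))]
pub async fn get_by_shortcode(shortcode: String /* , state, ... */) {
    // handler code
}
```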
## Production Considerations
### Reverse Proxy Configuration

In production, your reverse proxy should inject trace context headers. Example nginx configuration:

```nginx
location / {
    # Generate trace context if not present
    set $trace_id $request_id;
    # Note: the W3C spec treats an all-zero parent span ID as invalid, and
    # some SDKs will discard this header; prefer a proxy with native tracing
    # support where possible.
    proxy_set_header traceparent "00-$trace_id-0000000000000000-01";
    proxy_pass http://dsp-meta:3000;
}
```
### Sampling

For high-traffic services, consider configuring sampling to reduce trace volume:

```rust
// In main-server.rs, configure the sampler
use opentelemetry_sdk::trace::{Config, Sampler, TracerProvider};

let tracer_provider = TracerProvider::builder()
    .with_config(
        Config::default()
            .with_resource(resource)
            .with_sampler(Sampler::TraceIdRatioBased(0.1)), // Sample 10% of traces
    )
    .build();
```
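
If upstream services already make sampling decisions (carried in the `traceparent` flags), a parent-based sampler respects them and only applies the ratio to new root traces. A hedged sketch using the same `opentelemetry_sdk` API:

```rust
use opentelemetry_sdk::trace::Sampler;

// Honor the incoming sampling decision when a parent context exists;
// fall back to 10% ratio sampling for new root traces.
let sampler = Sampler::ParentBased(Box::new(Sampler::TraceIdRatioBased(0.1)));
```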
### Security

#### Endpoint Security

Ensure the OTLP endpoint is secured with authentication when it is exposed to production networks. The current implementation sends traces without authentication.
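
One option, if you export over gRPC with `opentelemetry-otlp`'s tonic transport, is to attach an authorization header to the exporter. A hedged sketch (builder APIs vary between opentelemetry-otlp versions, and the token value is purely a placeholder):

```rust
use opentelemetry_otlp::WithExportConfig;
use tonic::metadata::MetadataMap;

// Attach a bearer token to every OTLP export request.
let mut metadata = MetadataMap::new();
metadata.insert("authorization", "Bearer <token>".parse().unwrap());

let exporter = opentelemetry_otlp::new_exporter()
    .tonic()
    .with_endpoint("https://tempo.yourcompany.com:4317")
    .with_metadata(metadata);
```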
## Available Just Commands

The following just commands are available for managing observability:

| Command | Description |
|---|---|
| `just observability-up` | Start Grafana LGTM observability stack |
| `just observability-down` | Stop observability stack |
| `just observability-clean` | Stop stack and remove volumes (deletes all stored data) |
| `just serve-with-observability` | Start observability stack and run dsp-meta |
| `just serve-dev-with-observability` | Start stack and run dsp-meta with hot reload |
## Cleanup

Stop and remove the observability stack:

```bash
# Stop containers
just observability-down

# Remove volumes (deletes stored traces)
just observability-clean
```

Or with docker-compose directly:

```bash
# Stop containers
docker-compose -f docker-compose.observability.yml down

# Remove volumes (deletes stored traces)
docker-compose -f docker-compose.observability.yml down -v
```