
Observability and Tracing

DSP-META implements OpenTelemetry distributed tracing to enable monitoring and debugging of requests across the service infrastructure.

Overview

The backend uses OpenTelemetry with the Rust tracing ecosystem to provide:

  • Distributed Tracing: Continue traces from reverse proxies and upstream services
  • Automatic Span Creation: HTTP requests and instrumented functions automatically create spans
  • W3C TraceContext: Standard-compliant trace propagation via HTTP headers
  • Flexible Export: Optional OTLP exporter for sending traces to observability backends

Architecture

graph LR
    A[Reverse Proxy] -->|traceparent header| B[TraceLayer]
    B --> C[Handler Functions]
    C --> D[OTLP Exporter]
    D --> E[Tempo/Jaeger/etc]
    E --> F[Grafana]

The trace flow:

  1. TraceLayer (src/api/router.rs) extracts W3C TraceContext from HTTP headers and creates an http_request span as a child of the extracted context
  2. Handler Functions with #[instrument] create child spans automatically
  3. OTLP Exporter (optional) sends spans to observability backends
  4. Grafana/Tempo visualize the distributed traces

Automatic Fallback

When no traceparent header is present (e.g., local development), new root spans are created automatically.
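For illustration, here is a minimal sketch of what such a span-creation hook can look like, assuming tower-http's TraceLayer, the opentelemetry-http HeaderExtractor, and tracing-opentelemetry's span extension trait; the actual hook lives in src/api/router.rs and may differ in detail:

use http::Request;
use opentelemetry::global;
use opentelemetry_http::HeaderExtractor;
use tracing::info_span;
use tracing_opentelemetry::OpenTelemetrySpanExt;

// Hypothetical make_span hook for tower-http's TraceLayer. Assumes the
// W3C TraceContextPropagator was installed via
// global::set_text_map_propagator at startup.
fn make_http_request_span<B>(request: &Request<B>) -> tracing::Span {
    // Extract the W3C TraceContext (traceparent/tracestate) from the headers.
    let parent_ctx = global::get_text_map_propagator(|propagator| {
        propagator.extract(&HeaderExtractor(request.headers()))
    });

    let span = info_span!(
        "http_request",
        method = %request.method(),
        uri = %request.uri(),
    );
    // Attach the extracted context; with no traceparent header the
    // extracted context is empty and the span becomes a new root.
    span.set_parent(parent_ctx);
    span
}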

Configuration

Environment Variables

Variable | Description | Default | Example
---------|-------------|---------|--------
OTEL_EXPORTER_OTLP_ENDPOINT | OTLP endpoint URL for exporting traces (standard OpenTelemetry env var) | Not set (local only) | http://localhost:4317
DSP_META_LOG_FILTER | Log level filter | info | debug
DSP_META_LOG_FMT | Log output format | compact | json

Local Development (No Export)

By default, traces are only logged locally:

just serve-dev

Local Development with Grafana

Export traces to a local Grafana + Tempo stack:

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 just serve-dev

Production

In production, configure the OTLP endpoint to send traces to your observability backend:

OTEL_EXPORTER_OTLP_ENDPOINT=https://tempo.yourcompany.com:4317
DSP_META_LOG_FILTER=info
DSP_META_LOG_FMT=json

Testing Locally with Grafana

Prerequisites

  • Docker and Docker Compose
  • The DSP-META repository

The easiest way to test with observability is to use the dedicated just targets.

Start both the observability stack and the application:

just serve-with-observability

This will:

  1. Start the Grafana LGTM stack (all-in-one observability container)
  2. Start dsp-meta with the OTLP exporter enabled
  3. Display URLs for accessing the services

For development with automatic reloading on code changes:

just serve-dev-with-observability

Start services independently:

# Start observability stack
just observability-up

# In another terminal, start your app
just serve-dev

# When done, stop observability stack
just observability-down

After running any of these commands, the observability stack will be available at:

  - Grafana: http://localhost:3001
  - OTLP gRPC endpoint: http://localhost:4317
  - OTLP HTTP endpoint: http://localhost:4318

Manual Setup (Alternative)

If you prefer manual control:

Step 1: Start Observability Stack

Start the Grafana LGTM all-in-one observability stack:

docker-compose -f docker-compose.observability.yml up -d

This starts a single container with:

  • Grafana on port 3001
  • Tempo (traces backend)
  • Loki (logs backend)
  • Mimir/Prometheus (metrics backend)
  • Pyroscope (profiling backend)
  • OTLP gRPC endpoint on port 4317
  • OTLP HTTP endpoint on port 4318

Step 2: Start Application with Exporter

Run the application with the OTLP endpoint configured:

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
DSP_META_LOG_FILTER=info \
just serve-dev

Or build and run the binary directly:

cargo build
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
./target/debug/dsp-meta

You should see the log message: OTLP exporter configured successfully

Step 3: Generate Traces

Generate some traces by making HTTP requests:

Make a request without trace context (creates a new root span):

curl http://localhost:3000/api/v1/projects/0001

Make a request with trace context (continues an existing trace):

curl -H "traceparent: 00-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-bbbbbbbbbbbbbbbb-01" \
     http://localhost:3000/api/v1/projects/0001

Generate multiple traces:

for i in {1..10}; do
  curl http://localhost:3000/api/v1/projects/0001
  sleep 0.5
done

Step 4: View Traces in Grafana

  1. Open Grafana at http://localhost:3001
  2. Click Explore (compass icon in the left sidebar)
  3. Select Tempo from the datasource dropdown
  4. Select Search query type
  5. Filter by Service Name: dsp-meta
  6. Click Run query
  7. Click any trace to see the full span waterfall visualization

What You'll See

In the Grafana trace view:

  • Trace ID: Unique identifier for the distributed trace
  • Root Span (http_request): Created by TraceLayer for each HTTP request
  • Child Spans: Functions annotated with #[instrument]
  • Duration: Time taken by each span
  • Attributes: HTTP method, URI, status code, latency, etc.

Verifying Trace Propagation

When you send a request with a traceparent header, the Trace ID in Grafana should match the ID you sent. This confirms that trace context propagation is working correctly.

Span Propagation

W3C TraceContext Format

The traceparent header follows the W3C TraceContext standard:

traceparent: 00-{trace-id}-{parent-span-id}-{trace-flags}

Example:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01

  • 00: Version
  • 0af7651916cd43dd8448eb211c80319c: 32-character hex trace ID
  • b7ad6b7169203331: 16-character hex parent span ID
  • 01: Trace flags (01 = sampled)
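
For illustration only (the OpenTelemetry propagator does the real parsing), a small function that splits a traceparent value into these four fields might look like this:

/// Illustrative only: split a traceparent value into its four fields
/// and check their expected lengths per the W3C TraceContext spec.
fn parse_traceparent(value: &str) -> Option<(&str, &str, &str, &str)> {
    let mut parts = value.splitn(4, '-');
    let (version, trace_id, parent_id, flags) =
        (parts.next()?, parts.next()?, parts.next()?, parts.next()?);
    if version.len() == 2 && trace_id.len() == 32 && parent_id.len() == 16 && flags.len() == 2 {
        Some((version, trace_id, parent_id, flags))
    } else {
        None
    }
}

fn main() {
    let tp = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01";
    assert!(parse_traceparent(tp).is_some());
}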

Testing Propagation

Test that your application correctly continues traces:

# Generate unique trace and span IDs
TRACE_ID=$(openssl rand -hex 16)
SPAN_ID=$(openssl rand -hex 8)

# Send request with that trace context
curl -v \
  -H "traceparent: 00-${TRACE_ID}-${SPAN_ID}-01" \
  http://localhost:3000/api/v1/projects/0001

# Search for this trace ID in Grafana
echo "Search Grafana for trace ID: ${TRACE_ID}"

In Grafana, you should find a trace with that exact Trace ID, proving the application continued the distributed trace.

Adding Custom Span Attributes

You can enrich spans with custom attributes in your handler functions:

use tracing::{field::Empty, instrument, Span};

// Fields must be declared on the span up front (Empty is a placeholder);
// recording a field that was never declared is silently dropped by tracing.
#[instrument(fields(user_id = Empty, cache_hit = Empty, query_duration_ms = Empty))]
pub async fn my_handler() {
    let span = Span::current();

    // Add custom attributes
    span.record("user_id", "12345");
    span.record("cache_hit", true);
    span.record("query_duration_ms", 42);

    // These attributes will appear in the trace view
}

These custom attributes will be visible in Grafana and can be used for filtering and analysis.

Troubleshooting

No traces appearing in Grafana

Check observability stack is running

docker ps | grep otel-lgtm

You should see a container named otel-lgtm running.

Check application is exporting traces

Look for this log message when starting the application:

Configuring OTLP exporter with endpoint: http://localhost:4317

If not present, verify the OTEL_EXPORTER_OTLP_ENDPOINT environment variable is set.

Check LGTM container logs

docker logs dsp-meta-otel-lgtm-1

Look for any errors receiving or processing spans.

Connection errors to Tempo

If you see connection errors (for example, when the application runs inside a container and needs to reach the stack on the Docker host), try Docker's host.docker.internal hostname:

OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4317 just serve-dev

Traces appear but are incomplete

Ensure all handler functions use the #[instrument] attribute:

#[instrument(skip(state))]
pub async fn get_by_shortcode(
    Path(shortcode): Path<Shortcode>,
    State(state): State<Arc<AppState>>,
) -> Result<Response, DspMetaError> {
    // handler code
}

The skip parameter prevents large objects from being logged.

Production Considerations

Reverse Proxy Configuration

In production, your reverse proxy should inject trace context headers. Example nginx configuration:

location / {
    # Derive a trace ID from nginx's $request_id (32 hex characters)
    set $trace_id $request_id;
    proxy_set_header traceparent "00-$trace_id-0000000000000000-01";

    proxy_pass http://dsp-meta:3000;
}
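
Note that this snippet is a simplification: it unconditionally overwrites any incoming traceparent, and an all-zero parent span ID is technically invalid under the W3C spec, so some backends may treat such traces as new roots. A proxy-side OpenTelemetry module that generates real span IDs is generally preferable.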

Sampling

For high-traffic services, consider configuring sampling to reduce trace volume:

// In main-server.rs, configure the sampler
use opentelemetry_sdk::trace::{Config, Sampler, TracerProvider};

let tracer_provider = TracerProvider::builder()
    .with_config(
        Config::default()
            .with_resource(resource)
            .with_sampler(Sampler::TraceIdRatioBased(0.1)) // Sample 10% of traces
    )
    .build();
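
If the service sits behind a traced proxy, wrapping the ratio sampler as Sampler::ParentBased(Box::new(Sampler::TraceIdRatioBased(0.1))) respects the sampling decision already made upstream.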

Security

Endpoint Security

Ensure the OTLP endpoint is secured with authentication when it is exposed to production networks. The current implementation sends traces without authentication.
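
As an illustrative sketch only (the builder API varies across opentelemetry-otlp versions, and this is not what the current implementation does), auth metadata can be attached to the tonic OTLP exporter roughly like this:

use opentelemetry_otlp::WithExportConfig;
use tonic::metadata::MetadataMap;

// Placeholder token; inject a real secret from the environment in practice.
let mut metadata = MetadataMap::new();
metadata.insert("authorization", "Bearer <token>".parse().unwrap());

let exporter = opentelemetry_otlp::new_exporter()
    .tonic()
    .with_endpoint("https://tempo.yourcompany.com:4317")
    .with_metadata(metadata);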

Available Just Commands

The following just commands are available for managing observability:

Command | Description
--------|------------
just observability-up | Start the Grafana LGTM observability stack
just observability-down | Stop the observability stack
just observability-clean | Stop the stack and remove volumes (deletes all stored data)
just serve-with-observability | Start the observability stack and run dsp-meta
just serve-dev-with-observability | Start the stack and run dsp-meta with hot reload

Cleanup

Stop and remove the observability stack:

Using just:

# Stop containers
just observability-down

# Remove volumes (deletes stored traces)
just observability-clean

Or with docker-compose directly:

# Stop containers
docker-compose -f docker-compose.observability.yml down

# Remove volumes (deletes stored traces)
docker-compose -f docker-compose.observability.yml down -v

Further Reading