Introduction

This is the monorepo for the DSP Repository.

The DaSCH Service Platform (DSP) consists of two main components:

  • DSP VRE: The DSP Virtual Research Environment (VRE), where researchers can work on their data during the lifetime of the project. It consists of the DSP-APP, DSP-API and DSP-TOOLS.
    The DSP VRE is developed in various other git repositories.
  • DSP Repository: The DSP Repository is the long-term archive for research data. It consists of the DSP Archive and the Discovery and Presentation Environment (DPE).
    The DSP Repository is developed in this monorepo.

Additionally, the monorepo contains the Mosaic component library (design system).

For system architecture details, see DPE Architecture and Project Structure.

This documentation provides an overview of the project structure. It covers the components of the system architecture, the design system we use for development, and the processes we follow when working on the DSP Repository, including onboarding information.

About this Documentation

This documentation is built using mdBook.

Prerequisites

Before contributing, please ensure you have the basic toolchain installed (at minimum, the just command runner).

Any further dependencies can be installed using just commands:

just install-requirements

Building and Serving the Documentation

To run the documentation locally, use:

just docs-serve

Contributing to the Documentation

mdBook uses Markdown for documentation.

The documentation is organized into chapters and sections, which are defined in the SUMMARY.md file. Each section corresponds to a Markdown file in the src directory.
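For illustration, a minimal SUMMARY.md might look like this (the chapter file names are hypothetical):

```markdown
# Summary

[Introduction](./introduction.md)

- [Workflows and Conventions](./workflows.md)
- [DPE Architecture](./dpe/architecture.md)
  - [Project Structure](./dpe/project-structure.md)
```

Each list item becomes a chapter in the sidebar; nested items become sections of their parent chapter.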

To configure the documentation (e.g. adding plugins), modify the book.toml file.

Deployment

This documentation is deployed to GitHub Pages automatically on every push to main via the gh-pages.yml workflow.

Workflows and Conventions

Entry Points

The primary entry point of this repository is the README file, which should point anyone to the information they need.

For any interaction or coding-related workflow, the justfile is the primary source of truth. Run just without arguments to see all available commands with descriptions.

Key Development Commands

Command                         Description
just check                      Run formatting and linting checks
just build                      Build all targets
just test                       Run all tests
just fmt                        Format all Rust code (cargo fmt + leptosfmt)
just run                        Run server (release mode)
just watch                      Watch for changes and run tests
just watch-dpe                  Run DPE with hot reload
just watch-mosaic-playground    Run Mosaic playground with hot reload
just install-requirements       Install all development dependencies
just install-e2e-requirements   Install Playwright browsers for E2E tests
just docs-serve                 Serve documentation locally at localhost:3000
just validate-data              Validate all data files in the default data directory

Git Workflow

We use a rebase workflow. All changes are made on a branch, then rebased onto main before being merged. This keeps a clean, linear commit history.

  • Rebase-merge: PRs are integrated using rebase-merge (not squash or merge commits). Every commit on a branch becomes a commit on main.
  • Clean commit history: Before merging, clean up the branch so that each commit represents one logical unit of change. Squash fixups, reword messages, and reorder commits so the history reads well on main.
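Assuming a reasonably recent git, the effect of this workflow on history can be demonstrated in a throwaway repository (branch and commit names are illustrative):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "chore: initial commit"

# Work happens on a branch...
git switch -qc my-feature
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "feat: add a thing"

# ...which is rebased onto the latest main, then fast-forward merged:
git rebase -q main
git switch -q main
git merge -q --ff-only my-feature

git rev-list --merges --count HEAD   # prints 0: history is linear
```

Every commit on the branch is now a commit on main, with no merge commit in between.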

Commit Conventions

Follow Conventional Commits. Scopes match crate names: dpe-server, dpe-core, dpe-web, dpe-api-oai, mosaic-tiles, mosaic-playground, mosaic-playground-macro.
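For example (message contents are hypothetical):

```text
feat(dpe-api-oai): support resumption tokens in ListRecords

fix(dpe-web): escape project titles in search results
```

The scope names the crate whose behaviour changes; the type determines changelog placement and version bump.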

Types

Prefix     Meaning                          Changelog    Version bump
feat:      New user-visible functionality   Features     minor
fix:       Bug fix                          Bug Fixes    patch
perf:      Performance improvement          Performance  patch
revert:    Revert a previous commit         Reverts      patch
refactor:  Code restructuring               hidden       none
test:      Tests                            hidden       none
ci:        CI/CD                            hidden       none
docs:      Documentation                    hidden       none
build:     Build system, deps               hidden       none
style:     Formatting                       hidden       none
chore:     Maintenance                      hidden       none

Commit Organization

Group commits by user-visible impact, not by implementation journey.

  1. Each feat: or fix: commit = one changelog entry visible to deployers
  2. Internal work (build:, ci:, refactor:, docs:, chore:, test:) is hidden from changelog — squash aggressively
  3. Ask: "would a developer deploying this care?" If yes → feat: or fix:. If no → hidden type.
  4. Debugging journeys (trial-and-error, reverts, iterative fixes) belong in the PR description, not the commit history

Pull Request Workflow

PR Template

Fixes LINEAR-ID, LINEAR-ID, ...

## Motivation
Why this work was needed. What problem it solves for users.

## Summary
1-3 bullet points of user-visible changes.

## Key Changes
### [Topic]
- change details

## Challenges and Decisions
What was tried, what failed, and key architecture decisions.
Structure as sub-sections when multiple challenges exist:

### [Challenge title]
**Problem:** description of the issue encountered
**Tried:** approaches that didn't work and why
**Solution:** what worked and why it's the right approach

## Gotchas
Things future developers should know. Each gotcha should be
actionable — not just "this is hard" but "do X instead of Y".

## Test Plan
- [ ] verification steps

Why This Format Matters

The "Challenges and Decisions" section captures the debugging journey that would otherwise be lost when commits are squashed. Well-structured challenges become high-quality learnings automatically.

PR Creation Process

  1. Create as draft: gh pr create --draft
  2. Assign to the requesting developer: gh pr edit [PR_NUMBER] --add-assignee [USERNAME]
  3. Include a "Review Notes" section mentioning that separate commits should be checked for easier review

What Goes Where

Information                         Put it in...
New feature / breaking change       Commit message (feat: / feat!:)
Bug fix                             Commit message (fix:)
Build/CI/refactor details           Commit message (hidden type)
Why the work was needed             PR Motivation section
What was tried and failed           PR Challenges section
Architecture decisions + rationale  PR Challenges section
Things to watch out for             PR Gotchas section
Structured, searchable knowledge    Learnings doc (dasch-specs)

Release Workflow

Releases are automated via Release Please. On every push to main, Release Please reads conventional commit messages and either creates or updates a release PR. Merging the release PR creates a GitHub Release with auto-generated release notes.

Code Review

See Review Guidelines for the review checklist.

CI/CD

GitHub Actions workflows run automatically on pushes and pull requests. See Release, Deployment and Versioning for details on the CI/CD pipelines.

Project Structure and Code Organization

Overview

This repository is a Rust workspace structured as a monorepo. All Rust crates are organized as subdirectories within the modules/ directory.

modules/
├── dpe/                       # Discovery and Presentation Environment
│   ├── core/                  # Pure domain types, repositories, data loading (crate: dpe-core)
│   ├── api-oai/               # OAI-PMH 2.0 API (crate: dpe-api-oai)
│   ├── web/                   # Web layer: Leptos components, pages (crate: dpe-web)
│   ├── server/                # Server binary: route composition, Datastar fragments (crate: dpe-server)
│   ├── web-e2e-tests/         # Playwright E2E tests
│   ├── public/                # Static assets
│   ├── style/                 # CSS / Tailwind
│   └── Dockerfile             # Production container image
└── mosaic/                    # Mosaic component library (design system)
    ├── tiles/                 # Reusable Leptos UI components (crate: mosaic-tiles)
    ├── playground/            # Component playground application (crate: mosaic-playground)
    ├── playground_macro/      # Proc macro for playground page generation (crate: mosaic-playground-macro)
    └── playground-e2e-tests/  # Playwright E2E tests for the playground

Crate and Folder Naming Convention

Crate names follow the {module}-{role} pattern. Folder names strip the module prefix, keeping only the role part. Hyphens in crate names become underscores in folder names when needed for Rust compatibility (proc macro crates).

Crate                    Folder                   Role
dpe-core                 dpe/core                 Pure domain types and data access (zero framework deps)
dpe-api-oai              dpe/api-oai              OAI-PMH 2.0 API (depends on dpe-core only)
dpe-web                  dpe/web                  Leptos SSR components, pages, #[server] functions
dpe-server               dpe/server               Server binary — composes all routes
mosaic-tiles             mosaic/tiles             Reusable UI component library
mosaic-playground        mosaic/playground        Component showcase application
mosaic-playground-macro  mosaic/playground_macro  Proc macro for playground page generation

API Crate Pattern

Each API is a separate crate under modules/dpe/:

  • Naming: dpe-api-{name} (e.g., dpe-api-oai)
  • Dependencies: dpe-core for domain types; never depends on other API crates or dpe-web
  • Entry point: Exports a handler function (e.g., pub async fn oai_handler(...))
  • Composition: dpe-server wires the handler into the Axum router
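The pattern can be sketched in framework-free Rust. This is purely illustrative: real handlers are async Axum handlers returning responses, and the functions below are simplified stand-ins:

```rust
// Hypothetical, framework-free sketch of the API crate pattern.
// Real handlers are `pub async fn` Axum handlers; plain functions
// are used here so only the structure is shown.

mod dpe_api_oai {
    /// Stand-in for the exported entry point, e.g. `pub async fn oai_handler(...)`.
    /// Depends only on domain types (dpe-core), never on other API crates.
    pub fn oai_handler(verb: &str) -> String {
        format!("<OAI-PMH verb=\"{verb}\"/>")
    }
}

mod dpe_server {
    /// The composition root: wires each API handler onto a route.
    pub fn routes() -> Vec<(&'static str, fn(&str) -> String)> {
        vec![("/oai", super::dpe_api_oai::oai_handler)]
    }
}

fn main() {
    let routes = dpe_server::routes();
    let (path, handler) = routes[0];
    println!("{path} -> {}", handler("Identify"));
}
```

The important property is the direction of dependencies: the API module knows nothing about the server; the server knows every API entry point.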

For detailed crate responsibilities and the dependency graph, see DPE Project Structure.

Release, Deployment and Versioning

CI/CD Pipelines

All CI/CD workflows are defined as GitHub Actions in .github/workflows/.

Checks and Tests

Every push and pull request runs:

  • check.yml — Formatting (rustfmt, leptosfmt) and linting (clippy)
  • test.yml — Runs the full test suite
  • scout-dpe.yml / scout-mosaic-playground.yml — Docker image vulnerability scanning (see Security)

Accessibility Testing

Defined in a11y-dpe.yml.

Runs on PRs and pushes to main that touch DPE UI code (modules/dpe/web/, modules/dpe/style/, modules/dpe/public/). Builds the DPE, then runs Playwright accessibility tests with axe-core against WCAG 2.1 AA.

Fuzz Testing

Defined in fuzz.yml.

Runs nightly at 02:00 UTC (and on manual dispatch). Fuzzes tab_validation and query_params targets for 10 minutes each using cargo-fuzz on nightly Rust. Corpus is cached between runs. On crash, automatically creates a GitHub issue with reproduction instructions.

Reusable Actions

Common CI steps are extracted into composite actions in .github/actions/:

Action          Purpose
build-dpe       Compile DPE (Rust musl binary + Leptos site assets) and stage artifacts
docker-publish  Set up Buildx, log in to Docker Hub, build and push an image
docker-scout    Run Docker Scout CVE scan and upload SARIF results

Mosaic Playground

The Mosaic component library playground has two deployment paths:

PR Preview (Cloud Run)

Defined in cloud-run-mosaic-pull-request.yml.

When a pull request modifies files under modules/mosaic/, a preview of the Mosaic playground is automatically deployed to Google Cloud Run. The preview URL is posted as a comment on the PR and updated on each push.

  • Trigger: PRs that touch modules/mosaic/** (same-repo only, not forks)
  • Service: Ephemeral Cloud Run service per PR
  • Cleanup: The Cloud Run service and container image are deleted when the PR is closed or merged

Authentication uses Workload Identity Federation (keyless, OIDC-based).

Production (Docker Hub + Jenkins)

Defined in mosaic-docker-publish.yml.

When changes to modules/mosaic/ are merged to main, the playground image is built, pushed to Docker Hub, and a Jenkins webhook triggers the production deployment.

DPE

PR Preview (Cloud Run)

Defined in cloud-run-dpe-pull-request.yml.

When a pull request modifies files under modules/dpe/, a preview of the DPE is automatically deployed to Google Cloud Run. Works the same way as the Mosaic preview: ephemeral service per PR, cleaned up on close/merge.

Continuous Deployment (Docker Hub + Jenkins)

Defined in dpe-docker-publish.yml.

On every push to main:

  1. Builds site assets with cargo-leptos
  2. Builds a static musl-linked binary
  3. Pushes the Docker image to Docker Hub (daschswiss/dpe:{tag})
  4. Triggers a Jenkins webhook for DEV deployment

Release Publishing

Defined in dpe-release-publish.yml.

When a GitHub Release is published (tag starting with v), builds and pushes a release-tagged Docker image.

Release Please

Defined in release-please.yml.

On every push to main, Release Please reads conventional commit messages and creates or updates a release PR with auto-generated changelog. Merging the release PR creates a GitHub Release.

Configuration lives in .github/release-please/config.json and .github/release-please/manifest.json.

Documentation (GitHub Pages)

Defined in gh-pages.yml. The mdBook documentation is built and deployed to GitHub Pages on pushes to main.

Claude Code

Defined in claude.yml.

Responds to @claude mentions in PR comments and issue comments. Supports code review (@claude review) and general assistance. Runs with limited permissions (contents: read, pull-requests: write).

Security

Why Security Scanning Matters

Software depends on a deep stack of third-party components: base OS images, system libraries, language runtimes, and application dependencies. Vulnerabilities are regularly discovered in these components — the CVE database publishes thousands each year. A single unpatched dependency in a Docker image can become an entry point for attackers in production.

Manual tracking of vulnerabilities across all dependencies is not practical. Automated scanning integrates into the development workflow so that new vulnerabilities are surfaced early — ideally before code reaches production.

Container Image Scanning with Docker Scout

We use Docker Scout to scan our Docker images for known vulnerabilities (CVEs). Scout analyzes the Software Bill of Materials (SBOM) of each image — the full inventory of OS packages, libraries, and application dependencies — and matches them against vulnerability databases.

What Gets Scanned

Image                                             Workflow                     Trigger
DPE (daschswiss/dpe)                              scout-dpe.yml                PRs touching modules/dpe/** or Cargo.lock
Mosaic Playground (daschswiss/mosaic-playground)  scout-mosaic-playground.yml  PRs touching modules/mosaic/** or Cargo.lock

How It Works

Each Scout workflow:

  1. Builds the Docker image locally — the image is loaded into the runner's Docker daemon (load: true) but never pushed to a registry. This means Scout scans exactly what would be deployed, without exposing unreviewed images.

  2. Runs a CVE analysis — Docker Scout compares the image's SBOM against known vulnerability databases, filtering for critical and high severity issues.

  3. Posts a PR comment — a summary of findings is posted directly on the pull request, giving developers immediate visibility without leaving their review workflow.

  4. Uploads a SARIF report — results are uploaded to the GitHub Security tab in SARIF format (Static Analysis Results Interchange Format), the industry standard for security tool output. This integrates with GitHub's code scanning alerts.
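As a condensed, illustrative sketch (not the actual workflow file; action versions, tags, and file names are assumptions), these steps might be wired as:

```yaml
- name: Build image locally (never pushed)
  uses: docker/build-push-action@v6
  with:
    load: true                      # load into the runner's daemon only
    tags: daschswiss/dpe:pr-scan
- name: Scan for CVEs
  uses: docker/scout-action@v1
  with:
    command: cves
    image: daschswiss/dpe:pr-scan
    only-severities: critical,high
    sarif-file: scout-results.sarif
- name: Upload SARIF to code scanning
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: scout-results.sarif
```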

What To Do With Results

Scout results are currently informational — they do not block merging. When a scan reports vulnerabilities:

  • Critical/High in base image — check if a newer base image version is available that patches the issue. For DPE (distroless), these are rare. For Mosaic (Debian-based), update the base image tag.
  • Critical/High in dependencies — check if a dependency update resolves the issue. Run cargo update and re-test.
  • False positives — some CVEs may not be exploitable in our context. Document the rationale if choosing to accept the risk.

Prerequisites

  • Docker Scout is enabled for the daschswiss Docker Hub organization
  • Repository secrets DOCKER_USER and DOCKER_HUB_TOKEN (shared with publish workflows)
  • GitHub Advanced Security or a public repository (for SARIF upload)

Future Enhancements

  • Production comparison — using Docker Scout's compare command to show only new vulnerabilities introduced by a PR (requires configuring Docker Scout environments on Docker Hub)
  • Main-branch scanning — continuous monitoring of production images
  • Blocking on critical CVEs — failing the PR check when critical vulnerabilities are detected

DPE Architecture

The Discovery and Presentation Environment (DPE) serves research project metadata as a web application.

Crate Structure

dpe-core          Pure domain types, repositories, data loading
                  Dependencies: serde, serde_json only
                       │
          ┌────────────┼────────────┐
          │            │            │
     dpe-api-oai   dpe-web     (future APIs)
     OAI-PMH 2.0  Leptos SSR
     + axum        + Datastar
          │            │
          └────────────┘
                 │
           dpe-server
           Route composition
           (binary: dpe)
  • dpe-core: Framework-free domain layer. All types, repository traits, Fs implementations, and data loading.
  • dpe-api-oai: OAI-PMH 2.0 endpoint. Depends only on dpe-core — no Leptos.
  • dpe-web: Leptos SSR components, pages, and #[server] wrappers. Re-exports dpe-core types for backward compatibility.
  • dpe-server: Thin composition root. Wires Leptos routes (dpe-web) and API handlers (dpe-api-oai) into a single Axum server.

Hypermedia-Driven Architecture

The DPE uses a hypermedia-driven architecture where the server is the single source of truth for UI state. Interactivity is provided by Datastar (~14KB JS) instead of client-side frameworks or WASM.

Why Datastar over Leptos islands:

  • No WASM compilation step (faster builds)
  • Smaller client-side footprint (~14KB vs ~200KB+ WASM)
  • Server controls all state (HATEOAS)
  • Graceful degradation — works as plain HTML links without JavaScript
  • Simpler mental model — HTML attributes, not reactive signals

Rendering Model

Pages are rendered server-side by Leptos SSR. Dynamic content updates (tab switching, search autocomplete) are handled by Datastar SSE fragments.

Initial page load:
  Browser → GET /projects/ABC1 → Leptos SSR → Full HTML page

Tab switch (with JS):
  Browser → GET /projects/ABC1/tab/publications (SSE)
         ← PatchElements (#project-tabs replacement)
         ← ExecuteScript (history.replaceState for URL)

Tab switch (without JS):
  Browser → GET /projects/ABC1?tab=publications → Full page reload

Fragment Route Convention

Fragment endpoints are pure Axum handlers (not Leptos routes) that render Leptos components to HTML strings and deliver them as Datastar SSE events.

Route pattern: resource-action nesting

GET /projects/{id}              → Full page (Leptos SSR)
GET /projects/{id}/tab/{tab}    → SSE fragment (Axum + Datastar)

Because the two routes sit at different path depths in Axum's radix trie, they never conflict, so no header-based discrimination is needed.
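Because the fragment route takes the tab name straight from the URL path, the handler must validate it against a known set (this is also what the tab_validation fuzz target exercises). A hypothetical std-only sketch:

```rust
/// Hypothetical sketch of tab-name validation for fragment routes.
/// The real tab set and error handling in dpe-server may differ.
const KNOWN_TABS: &[&str] = &["description", "publications", "contact"];

/// Returns the canonical tab name, or None for anything unknown,
/// so the fragment handler can answer 404 instead of rendering garbage.
fn validate_tab(raw: &str) -> Option<&'static str> {
    KNOWN_TABS.iter().copied().find(|t| *t == raw)
}

fn main() {
    assert_eq!(validate_tab("publications"), Some("publications"));
    assert_eq!(validate_tab("../etc/passwd"), None);
    println!("tab validation ok");
}
```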

HATEOAS Tab Pattern

The server returns the complete tab component (tab bar + panel) in each SSE response. This means:

  • Server controls which tab is active (aria-selected)
  • Server controls which tabs are visible (e.g., hide Publications if none exist)
  • Server pushes the bookmarkable URL via ExecuteScript + history.replaceState

The client never needs to track tab state — the server-rendered HTML IS the state.

Datastar Attribute Patterns

<!-- Tab link with Datastar enhancement -->
<a href="/projects/ABC1?tab=publications"
   role="tab" aria-selected="false"
   data-on:click__prevent="@get('/projects/ABC1/tab/publications', {retry: 'never'})"
   data-indicator:_tab_loading>
  Publications
</a>

<!-- SSE failure fallback on container -->
<div id="project-tabs"
     data-on:datastar-fetch="
       (evt.detail.type === 'error' || evt.detail.type === 'retries-failed')
       && evt.detail.el.closest('#project-tabs')
       && (window.location.href = evt.detail.el.getAttribute('href'))
     ">

Datastar Attribute Conventions

  • Signal naming: Use _ prefix for client-only signals (e.g., _tab_loading). The underscore excludes the signal from server payloads.
  • No __debounce on __prevent anchors: Do NOT combine __prevent with __debounce or __throttle on anchor elements — known Datastar timing issue.
  • retry: 'never': Use on @get() calls where fallback to full navigation is preferred over retrying.
  • Graceful degradation: Every Datastar-enhanced <a> must have a valid href for no-JS fallback.

See Also

DPE Project Structure

Workspace Layout

modules/dpe/
├── core/             dpe-core          Pure domain (serde only)
├── api-oai/          dpe-api-oai       OAI-PMH 2.0 endpoint
├── web/              dpe-web           Leptos SSR + Datastar fragments
├── server/           dpe-server        Axum binary (composition root)
├── web-e2e-tests/                      Playwright E2E tests
├── public/                             Static assets
└── style/                              Tailwind CSS

Dependency Graph

dpe-core          ← pure domain, no framework deps
  ↑
  ├── dpe-api-oai ← OAI-PMH endpoint
  ├── dpe-web     ← Leptos SSR pages + components
  └── dpe-server  ← composition root, Datastar fragment handlers

Crate Responsibilities

dpe-core (core/)

Framework-free domain layer. Contains:

  • Domain types: Project, Record, Person, Organization, Attribution, etc.
  • Repository traits: ProjectRepository, RecordRepository
  • Fs implementations: FsProjectRepository, FsRecordRepository (backed by in-memory caches)
  • Data loading: Project and record caches (OnceLock<Vec<T>>) loaded from JSON on first access
  • Utilities: lang_value(), get_data_dir()

Dependencies: serde, serde_json only.
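The OnceLock-based cache loading described above can be sketched with std only. The type and loader here are simplified stand-ins for the real dpe-core code, which deserializes JSON files from the data directory:

```rust
use std::sync::OnceLock;

/// Simplified stand-in for a dpe-core domain type.
#[derive(Debug, Clone)]
struct Project {
    shortcode: String,
}

/// The cache is initialized on first access and shared afterwards.
static PROJECTS: OnceLock<Vec<Project>> = OnceLock::new();

fn projects() -> &'static [Project] {
    PROJECTS.get_or_init(|| {
        // Real code loads and deserializes JSON files here.
        vec![Project { shortcode: "ABC1".into() }]
    })
}

fn main() {
    // Both calls return the same cached slice; the loader runs once.
    assert_eq!(projects().len(), 1);
    assert_eq!(projects()[0].shortcode, "ABC1");
}
```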

dpe-api-oai (api-oai/)

OAI-PMH 2.0 Data Provider. Implements the six required verbs (Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord).

Depends on dpe-core for domain types — no Leptos or web framework dependency.

dpe-web (web/)

Leptos SSR web layer. Contains:

  • Pages: home, about, project, projects (with filters and pagination)
  • Components: navbar, footer, project cards, tab panels, search input
  • Domain re-exports: domain/mod.rs re-exports dpe-core types for a single import path
  • Server functions: #[server] wrappers around dpe-core functions

dpe-server (server/)

Composition root and Axum binary. Contains:

  • Route wiring: Leptos SSR routes, OAI-PMH handler, Datastar fragment endpoints, /healthz
  • Fragment handlers: fragments.rs — pure Axum handlers that render Leptos components to HTML and return Datastar SSE events
  • Configuration: config.rs — figment-based layered config (defaults → dpe.toml → DPE_* env vars)
  • Logging: tracing-subscriber with env-filter and JSON support

Key Patterns

  • Domain types in dpe-core, not in web or API crates
  • API crates depend on dpe-core only, never on each other or on dpe-web
  • dpe-server contains no business logic — only route composition and fragment rendering
  • Fragment handlers use Owner::new() + view! { ... }.to_html() to render Leptos components from pure Axum handlers

DPE Testing Strategy

The DPE follows a 4-layer testing pyramid, adapted from the Sipi testing strategy. Target distribution: ~50% unit, ~30% E2E, ~15% snapshot, ~5% fuzz.

Testing Pyramid

          ╱╲
         ╱  ╲         Layer 4: Fuzz Testing (nightly CI)
        ╱────╲        cargo-fuzz, corpus persisted
       ╱      ╲
      ╱  E2E   ╲     Layer 3: E2E Tests (Playwright)
     ╱──────────╲     Tab switching, search, accessibility (axe-core)
    ╱            ╲
   ╱  Snapshots   ╲   Layer 2: Snapshot Tests (insta)
  ╱────────────────╲   SSR output, SSE fragments, ARIA attributes
 ╱                  ╲
╱    Unit Tests      ╲ Layer 1: Unit Tests (cargo test)
╱────────────────────╲ Fragment handlers, OAI protocol, domain logic

Layer 1: Unit Tests

  • Location: #[cfg(test)] modules in each crate
  • Runner: cargo test --workspace
  • Scope: Fragment handlers, OAI protocol, domain types, data loading, filtering/pagination
  • Crate: dpe-core tests run independently — cargo test -p dpe-core

Layer 2: Snapshot Tests (insta)

  • Dependency: insta with yaml and filters features
  • Location: Adjacent snapshots/ directories
  • CI: Set INSTA_UPDATE=new so failures produce .snap.new artifacts for review
  • Scope: SSR output, SSE fragment response bodies, ARIA attributes

Layer 3: E2E Tests (Playwright)

  • Location: modules/dpe/web-e2e-tests/
  • Runner: npx playwright test
  • Scope: Tab switching, search autocomplete, scroll preservation, accessibility (axe-core), visual regression
  • Accessibility: Full-page axe-core scans against WCAG 2.1 AA

Layer 4: Fuzz Testing

  • Tool: cargo-fuzz (nightly Rust)
  • Schedule: Nightly CI, 10 minutes per target
  • Targets: Tab name validation, SSE response construction, query parameter parsing
  • Corpus: Persisted between runs

CI Pipeline Budget

Target: ≤ 10 minutes wall-clock per PR.

Parallel job group 1 (~2 min):
  cargo fmt --check
  cargo clippy --all-targets -- -D warnings
  cargo-deny check

Parallel job group 2 (~5 min):
  cargo nextest run --workspace
  cargo leptos build --release
  cargo-llvm-cov (coverage → Codecov)

Parallel job group 3 (~5 min):
  Playwright E2E tests
  axe-core accessibility scans
  Lighthouse CI performance budgets

Testing Conventions

Test naming: Use descriptive names following the test_{what}_{condition}_{expected} pattern. For example: test_parse_project_missing_title_returns_error.
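A unit test following this convention might look like the following sketch (the parse_title function and its behaviour are hypothetical, for illustration only):

```rust
/// Hypothetical parser, used only to illustrate the naming pattern.
fn parse_title(json: &str) -> Result<String, String> {
    // Toy implementation: expects `{"title":"..."}` and nothing fancier.
    json.split("\"title\":\"")
        .nth(1)
        .and_then(|rest| rest.split('"').next())
        .map(str::to_owned)
        .ok_or_else(|| "missing title".to_owned())
}

#[cfg(test)]
mod tests {
    use super::*;

    // Pattern: test_{what}_{condition}_{expected}
    #[test]
    fn test_parse_title_missing_title_returns_error() {
        assert!(parse_title("{}").is_err());
    }

    #[test]
    fn test_parse_title_valid_input_returns_title() {
        assert_eq!(parse_title(r#"{"title":"DSP"}"#).unwrap(), "DSP");
    }
}

fn main() {
    // The same checks, runnable outside the test harness:
    assert!(parse_title("{}").is_err());
    assert_eq!(parse_title(r#"{"title":"DSP"}"#).unwrap(), "DSP");
}
```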

Test locations:

  • Unit tests: In-crate #[cfg(test)] modules or adjacent _tests.rs files
  • Snapshot files: .snap files committed to git in snapshots/ directories
  • E2E tests: web-e2e-tests/ for DPE, playground-e2e-tests/ for Mosaic
  • Fuzz corpus: Persisted in the repository under fuzz/corpus/

Test file naming: {feature}_tests.rs for Rust, {feature}.spec.ts for Playwright.

Snapshot tests: Use the insta crate. Use with_settings! for scrubbing dynamic values (timestamps, IDs). CI runs with INSTA_UPDATE=new so unexpected changes produce .snap.new files for review.

DPE Operations Guide

Operations documentation for the DPE infrastructure team.

Docker Image

  • Base: gcr.io/distroless/static-debian12:nonroot
  • User: uid 65534 (nonroot, built-in to distroless)
  • Shell: None (distroless — no SSH possible)
  • Binary: Static musl-linked dpe (CLI with subcommands)

CLI Commands

The dpe binary provides three subcommands:

Command                      Description
dpe serve                    Start the web server
dpe validate <data_dir>      Validate all data files under the given directory
dpe healthcheck [--url URL]  Check if the server is healthy (default: http://localhost:8080/healthz)

dpe validate

Validates JSON data files for structural correctness and cross-reference integrity.

dpe validate ./data

What it checks:

  • JSON schema validity for all data file types (projects, persons, organizations, records, clusters, collections)
  • Cross-references between projects, persons, and organizations
  • Orphaned files that are not referenced by any parent entity

Exit codes:

  • 0 — all data files are valid
  • 1 — validation errors found (details printed to stderr)

dpe healthcheck

Lightweight probe for Docker HEALTHCHECK or monitoring:

dpe healthcheck                                    # default: http://localhost:8080/healthz
dpe healthcheck --url http://localhost:9090/healthz # custom URL
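A Docker HEALTHCHECK can build on this subcommand. A sketch (the binary path is an assumption; the actual production Dockerfile may differ):

```dockerfile
# Exec form is required: distroless images ship no shell.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD ["/dpe", "healthcheck"]
```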

Ports

Port  Protocol  Purpose
8080  HTTP      Application server

Environment Variables

Variable             Required  Default                  Description
RUST_LOG             No        info                     Log level filter (e.g., dpe_server=info,tower_http=debug)
DPE_DATA_DIR         No        modules/dpe/server/data  Path to project/record JSON data files. Legacy alias: DATA_DIR (checked if DPE_DATA_DIR is unset)
DPE_FATHOM_SITE_ID   No        (none)                   Fathom Analytics site ID (not a secret)
LEPTOS_SITE_ADDR     No        0.0.0.0:8080             Listen address and port
LEPTOS_SITE_ROOT     No        site                     Path to static site assets
LEPTOS_SITE_PKG_DIR  No        pkg                      JS/CSS package subdirectory
LEPTOS_OUTPUT_NAME   No        dpe                      CSS/JS output filename prefix
LEPTOS_ENV           No        PROD                     Leptos environment (DEV or PROD)
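The DPE_DATA_DIR / DATA_DIR fallback can be sketched in plain Rust. This is a hypothetical stand-in for dpe-core's get_data_dir(), with the environment lookup injected so the sketch can be exercised without mutating the process environment; real code would pass |k| std::env::var(k).ok():

```rust
/// Hypothetical sketch of the DPE_DATA_DIR / DATA_DIR fallback chain.
fn get_data_dir(lookup: impl Fn(&str) -> Option<String>) -> String {
    lookup("DPE_DATA_DIR")                                        // preferred variable
        .or_else(|| lookup("DATA_DIR"))                           // legacy alias
        .unwrap_or_else(|| "modules/dpe/server/data".to_owned())  // built-in default
}

fn main() {
    // Neither variable set: the built-in default applies.
    assert_eq!(get_data_dir(|_| None), "modules/dpe/server/data");

    // Only the legacy alias set: it is honoured.
    let legacy = |k: &str| (k == "DATA_DIR").then(|| "/legacy".to_owned());
    assert_eq!(get_data_dir(legacy), "/legacy");

    // Both set: DPE_DATA_DIR wins.
    let both = |k: &str| match k {
        "DPE_DATA_DIR" => Some("/data".to_owned()),
        _ => Some("/legacy".to_owned()),
    };
    assert_eq!(get_data_dir(both), "/data");
}
```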

Health Check

  • Endpoint: GET /healthz
  • Response: 200 OK (no body)
  • Purpose: Lightweight probe for Traefik/load balancers. Does not hit Leptos SSR.

Data Volume

  • Mount point: Value of DPE_DATA_DIR
  • Access: Read-only
  • Contents: Project metadata JSON files, organized by type (projects/, persons/, organizations/, clusters/, collections/, records/)

Resource Requirements

The DPE is lightweight — it serves static data with no database.

  • Memory: ~50-100 MB typical
  • CPU: Minimal (SSR rendering is fast, data is cached in-memory)
  • Disk: Data files + static assets (~50 MB)

Logging

Structured logging via tracing-subscriber. Configure levels with RUST_LOG:

# Default (info level)
RUST_LOG=info

# Debug HTTP requests
RUST_LOG=dpe_server=info,tower_http=debug

# Verbose debugging
RUST_LOG=debug

Observability

Fathom Analytics

Privacy-friendly, GDPR-compliant analytics. No cookies, no personal data collected.

Configuration: Set the DPE_FATHOM_SITE_ID environment variable to your Fathom site ID (not a secret). The tracking script is automatically injected into the HTML shell.

What gets tracked:

  • Page views
  • Tab switches (detected automatically via history.replaceState)

Disable: Omit the DPE_FATHOM_SITE_ID environment variable — no tracking script is rendered.

Discovery, Design and Development Process

Any work on the DSP Repository is done in collaboration between Management, Product Management (PM), the Research Data Unit (RDU) and Development (Dev). The process should look as outlined below, but may be adjusted to fit the needs of the project and the team.

In discovery, PM validates that an opportunity is aligned with DaSCH's strategy. In collaboration with the RDU, PM verifies that the opportunity can provide a desirable outcome.

PM will create a project description, including low-fidelity wireframes (Balsamiq or pen and paper), based on which they define user flows and journeys.

If any design components are needed, these will be added to the design system.

Finally, high-fidelity wireframes will be created in Figma, if needed.

Based on the project description and the wireframes, Dev will refine the project description, create Linear tickets and implement it accordingly.

When the implementation is done, PM will verify that the outcome was achieved and identify opportunities for further improvements.

Tech Stack

Core

Technology           Purpose
Rust (Edition 2021)  Primary development language
Axum                 HTTP web framework
Leptos               Reactive UI framework (SSR for DPE, islands for Mosaic)
Datastar             SSE-based interactivity for DPE (~14KB JS, no WASM)
Tailwind CSS v4      Utility-first CSS framework
DaisyUI              Tailwind component plugin
Tokio                Async runtime
figment              Layered configuration (defaults → TOML → env vars)

Data & Persistence

Technology          Purpose
serde / serde_json  Serialization and deserialization
Static JSON files   Current data storage (database TBD)

Testing & Quality

Technology            Purpose
cargo test / nextest  Rust test runner
insta                 Snapshot testing for SSR output
Playwright            End-to-end browser tests
axe-core              Accessibility scanning (WCAG 2.1 AA)
cargo-fuzz            Fuzz testing (nightly CI)

Build & Development

Technology    Purpose
cargo-leptos  Leptos build tool (handles Tailwind, WASM, site assets)
just          Command runner for development workflows
leptosfmt     Leptos-aware code formatter
Biome         Linter/formatter for E2E test TypeScript

Documentation & Observability

Technology                    Purpose
mdBook + mdbook-alerts        Project documentation
Fathom Analytics              Privacy-friendly web analytics (GDPR-compliant, no cookies)
tracing + tracing-subscriber  Structured logging

Architecture Principles

We keep the design evolutionary, starting from the simplest possible solution and iterating on it. At first, providing data from static JSON files is sufficient. Following clean architecture principles, swapping out the persistence layer is easy.

TypeScript is used exclusively for testing and development tooling, not for production runtime code. The core application remains purely Rust-based.

Testing and Quality Assurance

We follow the Testing Pyramid approach: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.

Unit and integration tests are written in Rust; end-to-end tests are written either in Rust or in JavaScript using Playwright.

Design System Testing

The design system playground includes comprehensive testing infrastructure:

Interactive Testing (MCP):

  • Start playground server: just watch-mosaic-playground
  • Use Claude Code with Playwright MCP commands for visual verification
  • Commands whitelisted in .claude/settings.json
  • Best for: Component development, design verification, manual testing

Automated Testing (CI/CD):

  • TypeScript-based Playwright setup in modules/mosaic/playground-e2e-tests/
  • Functional, accessibility, and responsive design testing in CI
  • HTML + JSON reporters for CI/CD integration
  • Best for: End-to-end user flows, automated regression detection

For single-component interactions, prefer Rust tests; Playwright is for complete user flows.

Unit tests are the foundation of our testing strategy. They test individual components in isolation, ensuring that each part of the codebase behaves as expected. Unit tests are fast to write and to execute, and they provide immediate feedback on the correctness of the code.

Integration tests verify the interaction between different components, ensuring that they work together as expected. Integration tests may check the integration between the business logic and the presentation layer, or between the view and the business logic.

End-to-end tests verify the entire system. They simulate real user interactions and check that the system behaves as expected.
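
To make the base of the pyramid concrete, here is a hedged sketch of a plain unit test. The helper function is hypothetical, not an actual DPE API; the point is that the unit under test is pure, fast, and tested in isolation.

```rust
/// Illustrative helper (not an actual DPE function): normalizes a
/// project shortcode for display.
pub fn normalize_shortcode(input: &str) -> String {
    input.trim().to_uppercase()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn trims_and_uppercases() {
        assert_eq!(normalize_shortcode("  080e "), "080E");
    }

    #[test]
    fn leaves_clean_input_unchanged() {
        assert_eq!(normalize_shortcode("080E"), "080E");
    }
}
```

Such tests run with `cargo test` (or nextest) and give immediate feedback without any server or browser setup.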

In addition to the functional tests, we also need to implement performance tests.

We aim to follow the practice of Test Driven Development (TDD), where tests are written before the code is implemented. This helps to ensure that the code is testable and meets the requirements.

Code Review Guidelines

Review checklist for the DSP Repository. Organized by priority.

Always Check

Fragment Endpoints

  • New fragment endpoints follow resource-action nesting convention (see DPE Architecture)
  • New Datastar interactions have <a href> fallback for graceful degradation
  • ARIA semantics present on interactive components (role, aria-selected, aria-controls)

Testing

  • insta snapshots added/updated for changed SSR output
  • E2E test covers the user-facing behavior
  • axe-core scan passes on affected pages
  • Unit tests for fragment handler edge cases (invalid tab, missing project, etc.)

Architecture

  • New API crates follow the dpe-api-{name} pattern with dpe-core as only domain dependency (see Project Structure)
  • dpe-core has no framework dependencies (no leptos, no axum)
  • Validate command covers all data file types (DPE)
  • E2E test directory naming: web-e2e-tests/ for DPE, playground-e2e-tests/ for Mosaic

CLI

  • CLI subcommands are documented in help text

Documentation

Commits

  • Commits follow conventional commits (correct prefix, scope matches crate name) — see Workflows and Conventions
  • One topic per commit — apply the "and" test
  • Each commit builds and passes tests
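
For illustration (the scopes shown are hypothetical), a change described as "add the endpoint and fix formatting" fails the "and" test and should be split into two commits, e.g.:

```
feat(dpe-server): add project fragment endpoint
style(dpe-server): format fragment handlers with leptosfmt
```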

Security

  • No secrets in config files, Cargo.toml, or git
  • Path parameters validated before filesystem access
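
A hedged sketch of the kind of check this means, using only the standard library (the function name is illustrative, not the actual DPE code): a path parameter is accepted only if it is non-empty and consists purely of normal components, which rules out `..` traversal, absolute paths, and `.` segments.

```rust
use std::path::{Component, Path};

/// Illustrative validator: true only for plain relative paths made of
/// normal components (no `..`, no leading `/`, no `.` segments).
fn is_safe_path_param(param: &str) -> bool {
    !param.is_empty()
        && Path::new(param)
            .components()
            .all(|c| matches!(c, Component::Normal(_)))
}

fn main() {
    assert!(is_safe_path_param("projects/0001.json"));
    assert!(!is_safe_path_param("../../etc/passwd"));
    assert!(!is_safe_path_param("/etc/passwd"));
    assert!(!is_safe_path_param(""));
}
```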

Style

  • Follow existing Datastar attribute patterns (signal naming with _ prefix) — see DPE Architecture
  • Fragment handlers in fragments/ module, not inline in main.rs
  • Domain types belong in dpe-core, not in web or API crates
  • API crate exposes a handler function (e.g., pub async fn oai_handler(...)) for composition in dpe-server
  • Leptos components use view! macro consistently
  • Test files follow naming convention: {feature}_tests.rs for Rust, {feature}.spec.ts for Playwright

Skip

  • Snapshot .snap file contents — verify accepted, don't review formatting
  • Formatting-only changes (cargo fmt / leptosfmt diffs)
  • Cargo.lock changes from dependency updates

Onboarding

Rust

The main technology we use is Rust. A solid understanding of Rust is needed, although frontend work in particular does not require deep Rust knowledge.

Rust HTTP Server

We use Axum as our HTTP server.

Serialization and Deserialization

We use serde for serialization and deserialization of data.

Web UI

We use Leptos as our UI framework for building reactive web applications in Rust.

Leptos is a full-stack web framework that allows writing both server and client code in Rust. It provides reactive primitives and a component model similar to modern JavaScript frameworks.

Key features:

  • The islands Cargo feature is enabled workspace-wide (Leptos 0.8 build requirement)
  • Only the Mosaic component library uses actual island components with client-side WASM hydration
  • DPE uses SSR-only with Datastar for interactivity — no client-side WASM
  • The architecture follows the MPA ("multi-page app") paradigm
  • Server-side rendering
  • Fine-grained reactivity
  • Component-based architecture
  • Full Rust syntax support
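
As a rough illustration of the component model, here is a minimal Leptos component sketch. It assumes the leptos crate; the component name and CSS classes are hypothetical, shown for orientation only and not taken from the codebase.

```rust
// Sketch only: a server-renderable component using the view! macro.
use leptos::prelude::*;

#[component]
pub fn StatusBadge(label: String) -> impl IntoView {
    view! {
        <span class="badge badge-primary">{label}</span>
    }
}
```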

Architectural Design Patterns

We follow concepts such as Clean Architecture (also described in the book of the same name), Hexagonal Architecture, and Onion Architecture. Familiarity with these concepts will be helpful.

Some of the patterns must be adapted to the idioms of Rust, but the general principles are the same.

Testing

We follow the Testing Pyramid approach: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.

Unit and integration tests are written in Rust, following the Rust testing best practices. End-to-end tests can be written using Playwright. Leptos has some built-in support for Playwright.

Domain Driven Design

We do not follow strict Domain Driven Design (DDD) principles, but we try to follow some of the concepts. In particular, we try to keep the language used in code aligned with the domain language.

Test Driven Development

We aim to practice Test Driven Development (TDD) and Behavior Driven Development (BDD).

Database

We are still evaluating the database to use.

For the initial development, we work with static content or JSON files.

Mosaic Component Library

The Mosaic component library provides reusable UI components built with Leptos and Tailwind CSS.

Components are defined in modules/mosaic/tiles/ and can be previewed in the playground application at modules/mosaic/playground/.

To run the playground locally:

just watch-mosaic-playground

Pull requests that modify files in modules/mosaic/ automatically receive a Cloud Run preview deployment. The preview URL is posted as a comment on the PR.