Introduction
This is the monorepo for the DSP Repository.
The DaSCH Service Platform (DSP) consists of two main components:
- DSP VRE: The DSP Virtual Research Environment (VRE), where researchers can work on their data during the lifetime of the project. It consists of the DSP-APP, DSP-API and DSP-TOOLS. The DSP VRE is developed in various other git repositories.
- DSP Repository: The DSP Repository is the long-term archive for research data. It consists of the DSP Archive and the Discovery and Presentation Environment (DPE). The DSP Repository is developed in this monorepo.
Additionally, the monorepo contains the Mosaic component library (design system).
For system architecture details, see DPE Architecture and Project Structure.
This documentation provides an overview of the project structure. It covers the different components of the system architecture, the design system we use for development, and the processes we follow when working on the DSP Repository, including onboarding information.
About this Documentation
This documentation is built using mdBook.
Pre-requisites
Before contributing, please ensure you have the following installed:
Any further dependencies can be installed using `just` commands:

```shell
just install-requirements
```
Building and Serving the Documentation
To run the documentation locally, use:

```shell
just docs-serve
```
Contributing to the Documentation
mdBook uses Markdown for documentation.
The documentation is organized into chapters and sections, which are defined in the SUMMARY.md file.
Each section corresponds to a Markdown file in the src directory.
To configure the documentation (e.g. adding plugins), modify the book.toml file.
Deployment
This documentation is deployed to GitHub Pages automatically on every push to main via the gh-pages.yml workflow.
Workflows and Conventions
Entry Points
The first entry point into this repository is the README file, which should point anyone to the information they need.
For any interaction or coding-related workflow, the justfile is the primary source of truth. Run just without arguments to see all available commands with descriptions.
Key Development Commands
| Command | Description |
|---|---|
| `just check` | Run formatting and linting checks |
| `just build` | Build all targets |
| `just test` | Run all tests |
| `just fmt` | Format all Rust code (cargo fmt + leptosfmt) |
| `just run` | Run server (release mode) |
| `just watch` | Watch for changes and run tests |
| `just watch-dpe` | Run DPE with hot reload |
| `just watch-mosaic-playground` | Run Mosaic playground with hot reload |
| `just install-requirements` | Install all development dependencies |
| `just install-e2e-requirements` | Install Playwright browsers for E2E tests |
| `just docs-serve` | Serve documentation locally at localhost:3000 |
| `just validate-data` | Validate all data files in the default data directory |
Git Workflow
We use a rebase workflow. All changes are made on a branch, then rebased onto main before being merged. This keeps a clean, linear commit history.
- Rebase-merge: PRs are integrated using rebase-merge (not squash or merge commits). Every commit on a branch becomes a commit on main.
- Clean commit history: Before merging, clean up the branch so that each commit represents one logical unit of change. Squash fixups, reword messages, and reorder commits so the history reads well on main.
Commit Conventions
Follow Conventional Commits. Scopes match crate names: dpe-server, dpe-core, dpe-web, dpe-api-oai, mosaic-tiles, mosaic-playground, mosaic-playground-macro.
Types
| Prefix | Meaning | Changelog | Version bump |
|---|---|---|---|
| `feat:` | New user-visible functionality | Features | minor |
| `fix:` | Bug fix | Bug Fixes | patch |
| `perf:` | Performance improvement | Performance | patch |
| `revert:` | Revert a previous commit | Reverts | patch |
| `refactor:` | Code restructuring | hidden | none |
| `test:` | Tests | hidden | none |
| `ci:` | CI/CD | hidden | none |
| `docs:` | Documentation | hidden | none |
| `build:` | Build system, deps | hidden | none |
| `style:` | Formatting | hidden | none |
| `chore:` | Maintenance | hidden | none |
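The mapping in the table above can be sketched as a small Rust helper. This is an illustration only, not code from the repository; `commit_type`, `bump_for`, and the `Bump` enum are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Bump {
    Minor,
    Patch,
    None,
}

/// Extract the type prefix from a conventional commit header,
/// e.g. "feat(dpe-web): add tab panel" -> "feat".
fn commit_type(header: &str) -> &str {
    let head = header.split(':').next().unwrap_or("");
    let head = head.trim_end_matches('!'); // "!" marks a breaking change
    head.split('(').next().unwrap_or("").trim()
}

/// Map a commit type to the version bump from the table above.
fn bump_for(commit_type: &str) -> Bump {
    match commit_type {
        "feat" => Bump::Minor,
        "fix" | "perf" | "revert" => Bump::Patch,
        // refactor, test, ci, docs, build, style, chore: hidden, no bump
        _ => Bump::None,
    }
}
```

For example, `commit_type("fix(dpe-core): handle empty data dir")` yields `"fix"`, which maps to a patch bump.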
Commit Organization
Group commits by user-visible impact, not by implementation journey.
- Each `feat:` or `fix:` commit = one changelog entry visible to deployers
- Internal work (`build:`, `ci:`, `refactor:`, `docs:`, `chore:`, `test:`) is hidden from the changelog — squash aggressively
- Ask: "would a developer deploying this care?" If yes → `feat:` or `fix:`. If no → hidden type.
- Debugging journeys (trial-and-error, reverts, iterative fixes) belong in the PR description, not the commit history
Pull Request Workflow
PR Template
```markdown
Fixes LINEAR-ID, LINEAR-ID, ...

## Motivation

Why this work was needed. What problem it solves for users.

## Summary

1-3 bullet points of user-visible changes.

## Key Changes

### [Topic]

- change details

## Challenges and Decisions

What was tried, what failed, and key architecture decisions.
Structure as sub-sections when multiple challenges exist:

### [Challenge title]

**Problem:** description of the issue encountered
**Tried:** approaches that didn't work and why
**Solution:** what worked and why it's the right approach

## Gotchas

Things future developers should know. Each gotcha should be
actionable — not just "this is hard" but "do X instead of Y".

## Test Plan

- [ ] verification steps
```
Why This Format Matters
The "Challenges and Decisions" section captures the debugging journey that would otherwise be lost when commits are squashed. Well-structured challenges become high-quality learnings automatically.
PR Creation Process
- Create as draft: `gh pr create --draft`
- Assign to the requesting developer: `gh pr edit [PR_NUMBER] --add-assignee [USERNAME]`
- Include a "Review Notes" section mentioning that separate commits should be checked for easier review
What Goes Where
| Information | Put it in... |
|---|---|
| New feature / breaking change | Commit message (`feat:` / `feat!:`) |
| Bug fix | Commit message (`fix:`) |
| Build/CI/refactor details | Commit message (hidden type) |
| Why the work was needed | PR Motivation section |
| What was tried and failed | PR Challenges section |
| Architecture decisions + rationale | PR Challenges section |
| Things to watch out for | PR Gotchas section |
| Structured, searchable knowledge | Learnings doc (dasch-specs) |
Release Workflow
Releases are automated via Release Please. On every push to main, Release Please reads conventional commit messages and either creates or updates a release PR. Merging the release PR creates a GitHub Release with auto-generated release notes.
Code Review
See Review Guidelines for the review checklist.
CI/CD
GitHub Actions workflows run automatically on pushes and pull requests. See Release, Deployment and Versioning for details on the CI/CD pipelines.
Project Structure and Code Organization
Overview
This repository is a Rust workspace structured as a monorepo. All Rust crates are organized as subdirectories within the modules/ directory.
```
modules/
├── dpe/                      # Discovery and Presentation Environment
│   ├── core/                 # Pure domain types, repositories, data loading (crate: dpe-core)
│   ├── api-oai/              # OAI-PMH 2.0 API (crate: dpe-api-oai)
│   ├── web/                  # Web layer: Leptos components, pages (crate: dpe-web)
│   ├── server/               # Server binary: route composition, Datastar fragments (crate: dpe-server)
│   ├── web-e2e-tests/        # Playwright E2E tests
│   ├── public/               # Static assets
│   ├── style/                # CSS / Tailwind
│   └── Dockerfile            # Production container image
└── mosaic/                   # Mosaic component library (design system)
    ├── tiles/                # Reusable Leptos UI components (crate: mosaic-tiles)
    ├── playground/           # Component playground application (crate: mosaic-playground)
    ├── playground_macro/     # Proc macro for playground page generation (crate: mosaic-playground-macro)
    └── playground-e2e-tests/ # Playwright E2E tests for the playground
```
Crate and Folder Naming Convention
Crate names follow the {module}-{role} pattern. Folder names strip the module prefix, keeping only the role part. Hyphens in crate names become underscores in folder names when needed for Rust compatibility (proc macro crates).
| Crate | Folder | Role |
|---|---|---|
| `dpe-core` | `dpe/core` | Pure domain types and data access (zero framework deps) |
| `dpe-api-oai` | `dpe/api-oai` | OAI-PMH 2.0 API (depends on dpe-core only) |
| `dpe-web` | `dpe/web` | Leptos SSR components, pages, `#[server]` functions |
| `dpe-server` | `dpe/server` | Server binary — composes all routes |
| `mosaic-tiles` | `mosaic/tiles` | Reusable UI component library |
| `mosaic-playground` | `mosaic/playground` | Component showcase application |
| `mosaic-playground-macro` | `mosaic/playground_macro` | Proc macro for playground page generation |
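The naming rule can be expressed as a small function. This is a sketch of the convention, not repository code; `crate_to_folder` is a hypothetical helper.

```rust
/// Map a `{module}-{role}` crate name to its folder path.
/// Proc-macro crates swap hyphens for underscores in the folder name.
fn crate_to_folder(crate_name: &str, is_proc_macro: bool) -> String {
    // Split at the first hyphen: module prefix vs. role suffix.
    let (module, role) = crate_name.split_once('-').unwrap_or((crate_name, ""));
    let role = if is_proc_macro {
        role.replace('-', "_") // Rust compatibility for proc macro crates
    } else {
        role.to_string()
    };
    format!("{module}/{role}")
}
```

So `dpe-api-oai` maps to `dpe/api-oai`, while `mosaic-playground-macro` (a proc macro crate) maps to `mosaic/playground_macro`.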
API Crate Pattern
Each API is a separate crate under `modules/dpe/`:

- Naming: `dpe-api-{name}` (e.g., `dpe-api-oai`)
- Dependencies: `dpe-core` for domain types; never depends on other API crates or `dpe-web`
- Entry point: Exports a handler function (e.g., `pub async fn oai_handler(...)`)
- Composition: `dpe-server` wires the handler into the Axum router
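The shape of this pattern can be sketched without the real frameworks. In the actual code the handler is async and returns an Axum response; here plain `String`s stand in so the sketch stays dependency-free, and `compose_routes` is a hypothetical illustration of the composition root.

```rust
// In a hypothetical `dpe-api-{name}` crate: the single public entry point.
// (The real handler is `pub async fn` and returns an Axum response.)
pub fn oai_handler(verb: &str) -> String {
    format!("<OAI-PMH verb=\"{verb}\"/>")
}

// In `dpe-server`: the composition root pairs paths with handlers.
// API crates never see each other; only the server knows all routes.
pub fn compose_routes() -> Vec<(&'static str, fn(&str) -> String)> {
    let routes: Vec<(&'static str, fn(&str) -> String)> =
        vec![("/oai", oai_handler)];
    routes
}
```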
For detailed crate responsibilities and the dependency graph, see DPE Project Structure.
Release, Deployment and Versioning
CI/CD Pipelines
All CI/CD workflows are defined as GitHub Actions in .github/workflows/.
Checks and Tests
Every push and pull request runs:
- check.yml — Formatting (`rustfmt`, `leptosfmt`) and linting (`clippy`)
- test.yml — Runs the full test suite
- scout-dpe.yml / scout-mosaic-playground.yml — Docker image vulnerability scanning (see Security)
Accessibility Testing
Defined in a11y-dpe.yml.
Runs on PRs and pushes to main that touch DPE UI code (modules/dpe/web/, modules/dpe/style/, modules/dpe/public/). Builds the DPE, then runs Playwright accessibility tests with axe-core against WCAG 2.1 AA.
Fuzz Testing
Defined in fuzz.yml.
Runs nightly at 02:00 UTC (and on manual dispatch). Fuzzes tab_validation and query_params targets for 10 minutes each using cargo-fuzz on nightly Rust. Corpus is cached between runs. On crash, automatically creates a GitHub issue with reproduction instructions.
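The kind of function a target like `tab_validation` exercises can be sketched in std-only Rust. This is a hypothetical stand-in (the accepted tab names beyond `publications` are invented here); the key property is that it must never panic, whatever bytes arrive.

```rust
/// Hypothetical stand-in for the code under fuzz: classify arbitrary
/// bytes as a valid tab name or reject them, without ever panicking.
fn is_valid_tab(input: &[u8]) -> bool {
    match std::str::from_utf8(input) {
        Ok(s) => matches!(s, "description" | "publications"),
        Err(_) => false, // non-UTF-8 input is rejected, not a crash
    }
}

// A cargo-fuzz target would wrap it roughly like this (requires the
// libfuzzer-sys crate, so it is shown only as a comment here):
//
// fuzz_target!(|data: &[u8]| { let _ = is_valid_tab(data); });
```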
Reusable Actions
Common CI steps are extracted into composite actions in .github/actions/:
| Action | Purpose |
|---|---|
| `build-dpe` | Compile DPE (Rust musl binary + Leptos site assets) and stage artifacts |
| `docker-publish` | Set up Buildx, log in to Docker Hub, build and push an image |
| `docker-scout` | Run Docker Scout CVE scan and upload SARIF results |
Mosaic Playground
The Mosaic component library playground has two deployment paths:
PR Preview (Cloud Run)
Defined in cloud-run-mosaic-pull-request.yml.
When a pull request modifies files under modules/mosaic/, a preview of the Mosaic playground is automatically deployed to Google Cloud Run. The preview URL is posted as a comment on the PR and updated on each push.
- Trigger: PRs that touch `modules/mosaic/**` (same-repo only, not forks)
- Service: Ephemeral Cloud Run service per PR
- Cleanup: The Cloud Run service and container image are deleted when the PR is closed or merged
Authentication uses Workload Identity Federation (keyless, OIDC-based).
Production (Docker Hub + Jenkins)
Defined in mosaic-docker-publish.yml.
When changes to modules/mosaic/ are merged to main, the playground image is built, pushed to Docker Hub, and a Jenkins webhook triggers the production deployment.
DPE
PR Preview (Cloud Run)
Defined in cloud-run-dpe-pull-request.yml.
When a pull request modifies files under modules/dpe/, a preview of the DPE is automatically deployed to Google Cloud Run. Works the same way as the Mosaic preview: ephemeral service per PR, cleaned up on close/merge.
Continuous Deployment (Docker Hub + Jenkins)
Defined in dpe-docker-publish.yml.
On every push to main:
- Builds site assets with `cargo-leptos`
- Builds a static musl-linked binary
- Pushes the Docker image to Docker Hub (`daschswiss/dpe:{tag}`)
- Triggers a Jenkins webhook for DEV deployment
Release Publishing
Defined in dpe-release-publish.yml.
When a GitHub Release is published (tag starting with v), builds and pushes a release-tagged Docker image.
Release Please
Defined in release-please.yml.
On every push to main, Release Please reads conventional commit messages and creates or updates a release PR with auto-generated changelog. Merging the release PR creates a GitHub Release.
Configuration lives in .github/release-please/config.json and .github/release-please/manifest.json.
Documentation (GitHub Pages)
Defined in gh-pages.yml. The mdBook documentation is built and deployed to GitHub Pages on pushes to main.
Claude Code
Defined in claude.yml.
Responds to @claude mentions in PR comments and issue comments. Supports code review (@claude review) and general assistance. Runs with limited permissions (contents: read, pull-requests: write).
Security
Why Security Scanning Matters
Software depends on a deep stack of third-party components: base OS images, system libraries, language runtimes, and application dependencies. Vulnerabilities are regularly discovered in these components — the CVE database publishes thousands each year. A single unpatched dependency in a Docker image can become an entry point for attackers in production.
Manual tracking of vulnerabilities across all dependencies is not practical. Automated scanning integrates into the development workflow so that new vulnerabilities are surfaced early — ideally before code reaches production.
Container Image Scanning with Docker Scout
We use Docker Scout to scan our Docker images for known vulnerabilities (CVEs). Scout analyzes the Software Bill of Materials (SBOM) of each image — the full inventory of OS packages, libraries, and application dependencies — and matches them against vulnerability databases.
What Gets Scanned
| Image | Workflow | Trigger |
|---|---|---|
| DPE (`daschswiss/dpe`) | scout-dpe.yml | PRs touching `modules/dpe/**` or `Cargo.lock` |
| Mosaic Playground (`daschswiss/mosaic-playground`) | scout-mosaic-playground.yml | PRs touching `modules/mosaic/**` or `Cargo.lock` |
How It Works
Each Scout workflow:
- Builds the Docker image locally — the image is loaded into the runner's Docker daemon (`load: true`) but never pushed to a registry. This means Scout scans exactly what would be deployed, without exposing unreviewed images.
- Runs a CVE analysis — Docker Scout compares the image's SBOM against known vulnerability databases, filtering for critical and high severity issues.
- Posts a PR comment — a summary of findings is posted directly on the pull request, giving developers immediate visibility without leaving their review workflow.
- Uploads a SARIF report — results are uploaded to the GitHub Security tab in SARIF format (Static Analysis Results Interchange Format), the industry standard for security tool output. This integrates with GitHub's code scanning alerts.
What To Do With Results
Scout results are currently informational — they do not block merging. When a scan reports vulnerabilities:
- Critical/High in base image — check if a newer base image version is available that patches the issue. For DPE (distroless), these are rare. For Mosaic (Debian-based), update the base image tag.
- Critical/High in dependencies — check if a dependency update resolves the issue. Run `cargo update` and re-test.
- False positives — some CVEs may not be exploitable in our context. Document the rationale if choosing to accept the risk.
Prerequisites
- Docker Scout is enabled for the `daschswiss` Docker Hub organization
- Repository secrets `DOCKER_USER` and `DOCKER_HUB_TOKEN` (shared with publish workflows)
- GitHub Advanced Security or a public repository (for SARIF upload)
Future Enhancements
- Production comparison — using Docker Scout's `compare` command to show only new vulnerabilities introduced by a PR (requires configuring Docker Scout environments on Docker Hub)
- Main-branch scanning — continuous monitoring of production images
- Blocking on critical CVEs — failing the PR check when critical vulnerabilities are detected
DPE Architecture
The Discovery and Presentation Environment (DPE) serves research project metadata as a web application.
Crate Structure
```
            dpe-core         Pure domain types, repositories, data loading
                             Dependencies: serde, serde_json only
                │
   ┌────────────┼────────────┐
   │            │            │
dpe-api-oai  dpe-web    (future APIs)
OAI-PMH 2.0  Leptos SSR
+ axum       + Datastar
   │            │
   └─────┬──────┘
         │
    dpe-server
    Route composition
    (binary: dpe)
```
- dpe-core: Framework-free domain layer. All types, repository traits, Fs implementations, and data loading.
- dpe-api-oai: OAI-PMH 2.0 endpoint. Depends only on dpe-core — no Leptos.
- dpe-web: Leptos SSR components, pages, and `#[server]` wrappers. Re-exports dpe-core types for backward compatibility.
- dpe-server: Thin composition root. Wires Leptos routes (dpe-web) and API handlers (dpe-api-oai) into a single Axum server.
Hypermedia-Driven Architecture
The DPE uses a hypermedia-driven architecture where the server is the single source of truth for UI state. Interactivity is provided by Datastar (~14KB JS) instead of client-side frameworks or WASM.
Why Datastar over Leptos islands:
- No WASM compilation step (faster builds)
- Smaller client-side footprint (~14KB vs ~200KB+ WASM)
- Server controls all state (HATEOAS)
- Graceful degradation — works as plain HTML links without JavaScript
- Simpler mental model — HTML attributes, not reactive signals
Rendering Model
Pages are rendered server-side by Leptos SSR. Dynamic content updates (tab switching, search autocomplete) are handled by Datastar SSE fragments.
Initial page load:

```
Browser → GET /projects/ABC1 → Leptos SSR → Full HTML page
```

Tab switch (with JS):

```
Browser → GET /projects/ABC1/tab/publications  (SSE)
        ← PatchElements  (#project-tabs replacement)
        ← ExecuteScript  (history.replaceState for URL)
```

Tab switch (without JS):

```
Browser → GET /projects/ABC1?tab=publications → Full page reload
```
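The no-JS path boils down to the server reading the `tab` query parameter itself. A minimal std-only sketch (the valid tab names other than `publications`, and the `tab_from_query` helper, are hypothetical):

```rust
/// Resolve the requested tab from a raw query string, falling back to a
/// default tab when the parameter is absent or names an unknown tab.
fn tab_from_query(query: &str) -> &str {
    query
        .split('&')
        .find_map(|pair| pair.strip_prefix("tab="))
        .filter(|t| matches!(*t, "description" | "publications"))
        .unwrap_or("description") // hypothetical default tab
}
```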
Fragment Route Convention
Fragment endpoints are pure Axum handlers (not Leptos routes) that render Leptos components to HTML strings and deliver them as Datastar SSE events.
Route pattern: resource-action nesting

```
GET /projects/{id}           → Full page (Leptos SSR)
GET /projects/{id}/tab/{tab} → SSE fragment (Axum + Datastar)
```
Different path depths in Axum's radix trie mean no conflict and no header-based discrimination.
HATEOAS Tab Pattern
The server returns the complete tab component (tab bar + panel) in each SSE response. This means:
- Server controls which tab is active (`aria-selected`)
- Server controls which tabs are visible (e.g., hide Publications if none exist)
- Server pushes the bookmarkable URL via `ExecuteScript` + `history.replaceState`
The client never needs to track tab state — the server-rendered HTML IS the state.
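Server-owned tab state can be sketched as a plain rendering function: the server re-renders the whole tab bar on each request and sets `aria-selected` itself. The names and markup here are illustrative, not the actual dpe-web component (which uses Leptos `view!`).

```rust
/// Render a complete tab bar for a project, with the active state decided
/// entirely on the server. Each link keeps a real href for no-JS fallback.
fn render_tab_bar(project: &str, tabs: &[&str], active: &str) -> String {
    tabs.iter()
        .map(|tab| {
            let selected = *tab == active;
            format!(
                "<a href=\"/projects/{project}?tab={tab}\" role=\"tab\" aria-selected=\"{selected}\">{tab}</a>"
            )
        })
        .collect()
}
```

Because the response always contains the full bar, the client never diffs or tracks which tab is open; it just swaps in the server's HTML.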
Datastar Attribute Patterns
```html
<!-- Tab link with Datastar enhancement -->
<a href="/projects/ABC1?tab=publications"
   role="tab" aria-selected="false"
   data-on:click__prevent="@get('/projects/ABC1/tab/publications', {retry: 'never'})"
   data-indicator:_tab_loading>
  Publications
</a>

<!-- SSE failure fallback on container -->
<div id="project-tabs"
     data-on:datastar-fetch="
       (evt.detail.type === 'error' || evt.detail.type === 'retries-failed')
       && evt.detail.el.closest('#project-tabs')
       && (window.location.href = evt.detail.el.getAttribute('href'))
     ">
```
Datastar Attribute Conventions
- Signal naming: Use a `_` prefix for client-only signals (e.g., `_tab_loading`). The underscore excludes the signal from server payloads.
- No `__debounce` on `__prevent` anchors: Do NOT combine `__prevent` with `__debounce` or `__throttle` on anchor elements — known Datastar timing issue.
- `retry: 'never'`: Use on `@get()` calls where fallback to full navigation is preferred over retrying.
- Graceful degradation: Every Datastar-enhanced `<a>` must have a valid `href` for no-JS fallback.
See Also
- Project Structure — Crate responsibilities and dependency graph
- Testing Strategy — Testing pyramid and CI pipeline
- Operations — Docker, environment variables, deployment
DPE Project Structure
Workspace Layout
```
modules/dpe/
├── core/            dpe-core      Pure domain (serde only)
├── api-oai/         dpe-api-oai   OAI-PMH 2.0 endpoint
├── web/             dpe-web       Leptos SSR + Datastar fragments
├── server/          dpe-server    Axum binary (composition root)
├── web-e2e-tests/                 Playwright E2E tests
├── public/                        Static assets
└── style/                         Tailwind CSS
```
Dependency Graph
```
dpe-core   ← pure domain, no framework deps
   ↑
   ├── dpe-api-oai  ← OAI-PMH endpoint
   ├── dpe-web      ← Leptos SSR pages + components
   └── dpe-server   ← composition root, Datastar fragment handlers
```
Crate Responsibilities
dpe-core (core/)
Framework-free domain layer. Contains:
- Domain types: `Project`, `Record`, `Person`, `Organization`, `Attribution`, etc.
- Repository traits: `ProjectRepository`, `RecordRepository`
- Fs implementations: `FsProjectRepository`, `FsRecordRepository` (backed by in-memory caches)
- Data loading: Project and record caches (`OnceLock<Vec<T>>`) loaded from JSON on first access
- Utilities: `lang_value()`, `get_data_dir()`
Dependencies: serde, serde_json only.
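The `OnceLock<Vec<T>>` caching pattern mentioned above can be sketched with std only. The loader here returns stub data; the real one deserializes JSON from the data directory, and the `Project` shape shown is illustrative.

```rust
use std::sync::OnceLock;

#[derive(Debug)]
struct Project {
    shortcode: String,
}

// Loaded at most once, then shared for the lifetime of the process.
static PROJECTS: OnceLock<Vec<Project>> = OnceLock::new();

fn projects() -> &'static [Project] {
    PROJECTS.get_or_init(|| {
        // Real code: read and deserialize JSON files under the data dir.
        vec![Project { shortcode: "ABC1".into() }]
    })
}
```

`get_or_init` guarantees the closure runs only on the first call, even under concurrent access, which is why repeated lookups are cheap.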
dpe-api-oai (api-oai/)
OAI-PMH 2.0 Data Provider. Implements the six required verbs (Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord).
Depends on dpe-core for domain types — no Leptos or web framework dependency.
dpe-web (web/)
Leptos SSR web layer. Contains:
- Pages: `home`, `about`, `project`, `projects` (with filters and pagination)
- Components: navbar, footer, project cards, tab panels, search input
- Domain re-exports: `domain/mod.rs` re-exports `dpe-core` types for a single import path
- Server functions: `#[server]` wrappers around `dpe-core` functions
dpe-server (server/)
Composition root and Axum binary. Contains:
- Route wiring: Leptos SSR routes, OAI-PMH handler, Datastar fragment endpoints, `/healthz`
- Fragment handlers: `fragments.rs` — pure Axum handlers that render Leptos components to HTML and return Datastar SSE events
- Configuration: `config.rs` — figment-based layered config (defaults → `dpe.toml` → `DPE_*` env vars)
- Logging: `tracing-subscriber` with env-filter and JSON support
Key Patterns
- Domain types in `dpe-core`, not in web or API crates
- API crates depend on `dpe-core` only, never on each other or on `dpe-web`
- `dpe-server` contains no business logic — only route composition and fragment rendering
- Fragment handlers use `Owner::new()` + `view! { ... }.to_html()` to render Leptos components from pure Axum handlers
DPE Testing Strategy
The DPE follows a 4-layer testing pyramid, adapted from the Sipi testing strategy. Target distribution: ~50% unit, ~30% E2E, ~15% snapshot, ~5% fuzz.
Testing Pyramid
```
                    ╱╲
                   ╱  ╲       Layer 4: Fuzz Testing (nightly CI)
                  ╱────╲      cargo-fuzz, corpus persisted
                 ╱      ╲
                ╱  E2E   ╲    Layer 3: E2E Tests (Playwright)
               ╱──────────╲   Tab switching, search, accessibility (axe-core)
              ╱            ╲
             ╱  Snapshots   ╲   Layer 2: Snapshot Tests (insta)
            ╱────────────────╲  SSR output, SSE fragments, ARIA attributes
           ╱                  ╲
          ╱     Unit Tests     ╲    Layer 1: Unit Tests (cargo test)
         ╱──────────────────────╲   Fragment handlers, OAI protocol, domain logic
```
Layer 1: Unit Tests
- Location: `#[cfg(test)]` modules in each crate
- Runner: `cargo test --workspace`
- Scope: Fragment handlers, OAI protocol, domain types, data loading, filtering/pagination
- Crate: dpe-core tests run independently — `cargo test -p dpe-core`
Layer 2: Snapshot Tests (insta)
- Dependency: `insta` with `yaml` and `filters` features
- Location: Adjacent `snapshots/` directories
- CI: Set `INSTA_UPDATE=new` so failures produce `.snap.new` artifacts for review
- Scope: SSR output, SSE fragment response bodies, ARIA attributes
Layer 3: E2E Tests (Playwright)
- Location: `modules/dpe/web-e2e-tests/`
- Runner: `npx playwright test`
- Scope: Tab switching, search autocomplete, scroll preservation, accessibility (axe-core), visual regression
- Accessibility: Full-page axe-core scans against WCAG 2.1 AA
Layer 4: Fuzz Testing
- Tool: cargo-fuzz (nightly Rust)
- Schedule: Nightly CI, 10 minutes per target
- Targets: Tab name validation, SSE response construction, query parameter parsing
- Corpus: Persisted between runs
CI Pipeline Budget
Target: ≤ 10 minutes wall-clock per PR.
Parallel job group 1 (~2 min):

- `cargo fmt --check`
- `cargo clippy --all-targets -- -D warnings`
- `cargo-deny check`

Parallel job group 2 (~5 min):

- `cargo nextest run --workspace`
- `cargo leptos build --release`
- `cargo-llvm-cov` (coverage → Codecov)

Parallel job group 3 (~5 min):

- Playwright E2E tests
- axe-core accessibility scans
- Lighthouse CI performance budgets
Testing Conventions
Test naming: Use descriptive names following the test_{what}_{condition}_{expected} pattern. For example: test_parse_project_missing_title_returns_error.
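The naming pattern can be illustrated with a toy parser. Everything here is hypothetical (the `parse_project_title` helper and its string-splitting "parsing" stand in for real JSON handling); the point is how the test name spells out what is parsed, under which condition, and with what expected outcome.

```rust
/// Toy stand-in for real JSON parsing: pull out a top-level "title" value.
fn parse_project_title(json: &str) -> Result<String, String> {
    json.split_once("\"title\":\"")
        .and_then(|(_, rest)| rest.split_once('"'))
        .map(|(title, _)| title.to_string())
        .ok_or_else(|| "missing title".to_string())
}

#[cfg(test)]
mod tests {
    use super::*;

    // test_{what}_{condition}_{expected}
    #[test]
    fn test_parse_project_missing_title_returns_error() {
        assert!(parse_project_title("{}").is_err());
    }
}
```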
Test locations:
- Unit tests: In-crate `#[cfg(test)]` modules or adjacent `_tests.rs` files
- Snapshot files: `.snap` files committed to git in `snapshots/` directories
- E2E tests: `web-e2e-tests/` for DPE, `playground-e2e-tests/` for Mosaic
- Fuzz corpus: Persisted in the repository under `fuzz/corpus/`
Test file naming: `{feature}_tests.rs` for Rust, `{feature}.spec.ts` for Playwright.
Snapshot tests: Use the insta crate. Use `with_settings!` for scrubbing dynamic values (timestamps, IDs). CI runs with `INSTA_UPDATE=new` so unexpected changes produce `.snap.new` files for review.
DPE Operations Guide
Operations documentation for the DPE infrastructure team.
Docker Image
- Base: `gcr.io/distroless/static-debian12:nonroot`
- User: uid 65534 (nonroot, built into distroless)
- Shell: None (distroless — no shell access, no SSH possible)
- Binary: Static musl-linked `dpe` (CLI with subcommands)
CLI Commands
The dpe binary provides three subcommands:
| Command | Description |
|---|---|
| `dpe serve` | Start the web server |
| `dpe validate <data_dir>` | Validate all data files under the given directory |
| `dpe healthcheck [--url URL]` | Check if the server is healthy (default: `http://localhost:8080/healthz`) |
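The subcommand surface from the table can be sketched as a tiny dispatcher. The `Command` enum and `parse_command` function are illustrative only; the real binary may use a CLI-parsing crate.

```rust
#[derive(Debug, PartialEq)]
enum Command {
    Serve,
    Validate { data_dir: String },
    Healthcheck { url: String },
}

fn parse_command(args: &[&str]) -> Option<Command> {
    match args {
        ["serve"] => Some(Command::Serve),
        ["validate", dir] => Some(Command::Validate { data_dir: dir.to_string() }),
        // Documented default probe URL when --url is omitted.
        ["healthcheck"] => Some(Command::Healthcheck {
            url: "http://localhost:8080/healthz".to_string(),
        }),
        ["healthcheck", "--url", url] => Some(Command::Healthcheck { url: url.to_string() }),
        _ => None,
    }
}
```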
dpe validate
Validates JSON data files for structural correctness and cross-reference integrity.
```shell
dpe validate ./data
```
What it checks:
- JSON schema validity for all data file types (projects, persons, organizations, records, clusters, collections)
- Cross-references between projects, persons, and organizations
- Orphaned files that are not referenced by any parent entity
Exit codes:

- `0` — all data files are valid
- `1` — validation errors found (details printed to stderr)
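The exit-code contract, combined with the cross-reference check described above, can be sketched in std-only Rust. The `validate` function and its error wording are illustrative, not the actual validator.

```rust
use std::collections::HashSet;

/// Check that every record references a known project; return the
/// documented exit code (0 = valid, 1 = errors, details on stderr).
fn validate(project_ids: &[&str], record_refs: &[&str]) -> i32 {
    let known: HashSet<&str> = project_ids.iter().copied().collect();
    let mut errors = 0;
    for r in record_refs {
        if !known.contains(r) {
            eprintln!("record references unknown project: {r}");
            errors += 1;
        }
    }
    if errors == 0 { 0 } else { 1 }
}
```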
dpe healthcheck
Lightweight probe for Docker HEALTHCHECK or monitoring:
```shell
dpe healthcheck                                       # default: http://localhost:8080/healthz
dpe healthcheck --url http://localhost:9090/healthz   # custom URL
```
Ports
| Port | Protocol | Purpose |
|---|---|---|
| 8080 | HTTP | Application server |
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `RUST_LOG` | No | `info` | Log level filter (e.g., `dpe_server=info,tower_http=debug`) |
| `DPE_DATA_DIR` | No | `modules/dpe/server/data` | Path to project/record JSON data files. Legacy alias: `DATA_DIR` (checked if `DPE_DATA_DIR` is unset) |
| `DPE_FATHOM_SITE_ID` | No | (none) | Fathom Analytics site ID (not a secret) |
| `LEPTOS_SITE_ADDR` | No | `0.0.0.0:8080` | Listen address and port |
| `LEPTOS_SITE_ROOT` | No | `site` | Path to static site assets |
| `LEPTOS_SITE_PKG_DIR` | No | `pkg` | JS/CSS package subdirectory |
| `LEPTOS_OUTPUT_NAME` | No | `dpe` | CSS/JS output filename prefix |
| `LEPTOS_ENV` | No | `PROD` | Leptos environment (`DEV` or `PROD`) |
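The documented lookup order for the data directory (`DPE_DATA_DIR`, then the legacy `DATA_DIR` alias, then the default) can be sketched as a pure function. In the real server these values come from `std::env::var`; plain `Option`s keep the sketch side-effect free, and `data_dir` is a hypothetical helper name.

```rust
/// Resolve the data directory with the documented precedence:
/// DPE_DATA_DIR → legacy DATA_DIR → built-in default.
fn data_dir(dpe_data_dir: Option<&str>, legacy_data_dir: Option<&str>) -> String {
    dpe_data_dir
        .or(legacy_data_dir)
        .unwrap_or("modules/dpe/server/data") // documented default
        .to_string()
}
```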
Health Check
- Endpoint: `GET /healthz`
- Response: `200 OK` (no body)
- Purpose: Lightweight probe for Traefik/load balancers. Does not hit Leptos SSR.
Data Volume
- Mount point: Value of `DPE_DATA_DIR`
- Access: Read-only
- Contents: Project metadata JSON files, organized by type (`projects/`, `persons/`, `organizations/`, `clusters/`, `collections/`, `records/`)
Resource Requirements
The DPE is lightweight — it serves static data with no database.
- Memory: ~50-100 MB typical
- CPU: Minimal (SSR rendering is fast, data is cached in-memory)
- Disk: Data files + static assets (~50 MB)
Logging
Structured logging via tracing-subscriber. Configure levels with RUST_LOG:
```shell
# Default (info level)
RUST_LOG=info

# Debug HTTP requests
RUST_LOG=dpe_server=info,tower_http=debug

# Verbose debugging
RUST_LOG=debug
```
Observability
Fathom Analytics
Privacy-friendly, GDPR-compliant analytics. No cookies, no personal data collected.
Configuration: Set the DPE_FATHOM_SITE_ID environment variable to your Fathom site ID (not a secret). The tracking script is automatically injected into the HTML shell.
What gets tracked:
- Page views
- Tab switches (detected automatically via `history.replaceState`)
Disable: Omit the DPE_FATHOM_SITE_ID environment variable — no tracking script is rendered.
Discovery, Design and Development Process
Any work on the DSP Repository is done in collaboration between Management, Product Management (PM), the Research Data Unit (RDU) and Development (Dev). The process should look as outlined below, but may be adjusted to fit the needs of the project and the team.
In discovery, PM validates that an opportunity is aligned with DaSCH's strategy. In collaboration with the RDU, PM verifies that the opportunity can provide a desirable outcome.
PM will create a project description, including low-fidelity wireframes (Balsamiq or pen and paper), based on which they define user flows and journeys.
If any design components are needed, these will be added to the design system.
Finally, high-fidelity wireframes will be created in Figma, if needed.
Based on the project description and the wireframes, Dev will refine the project description, create Linear tickets and implement it accordingly.
When the implementation is done, PM will verify that the outcome was achieved and identify opportunities for further improvements.
Tech Stack
Core
| Technology | Purpose |
|---|---|
| Rust (Edition 2021) | Primary development language |
| Axum | HTTP web framework |
| Leptos | Reactive UI framework (SSR for DPE, islands for Mosaic) |
| Datastar | SSE-based interactivity for DPE (~14KB JS, no WASM) |
| Tailwind CSS v4 | Utility-first CSS framework |
| DaisyUI | Tailwind component plugin |
| Tokio | Async runtime |
| figment | Layered configuration (defaults → TOML → env vars) |
Data & Persistence
| Technology | Purpose |
|---|---|
| serde / serde_json | Serialization and deserialization |
| Static JSON files | Current data storage (database TBD) |
Testing & Quality
| Technology | Purpose |
|---|---|
| cargo test / nextest | Rust test runner |
| insta | Snapshot testing for SSR output |
| Playwright | End-to-end browser tests |
| axe-core | Accessibility scanning (WCAG 2.1 AA) |
| cargo-fuzz | Fuzz testing (nightly CI) |
Build & Development
| Technology | Purpose |
|---|---|
| cargo-leptos | Leptos build tool (handles Tailwind, WASM, site assets) |
| just | Command runner for development workflows |
| leptosfmt | Leptos-aware code formatter |
| Biome | Linter/formatter for E2E test TypeScript |
Documentation & Observability
| Technology | Purpose |
|---|---|
| mdBook + mdbook-alerts | Project documentation |
| Fathom Analytics | Privacy-friendly web analytics (GDPR-compliant, no cookies) |
| tracing + tracing-subscriber | Structured logging |
Architecture Principles
We keep the design evolutionary, starting from the simplest possible solution and iterating on it. At first, providing data from static JSON files is sufficient. Following clean architecture principles, swapping out the persistence layer is easy.
TypeScript is used exclusively for testing and development tooling, not for production runtime code. The core application remains purely Rust-based.
Testing and Quality Assurance
We follow the Testing Pyramid approach: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.
Unit and integration tests are written in Rust; end-to-end tests are written either in Rust or in TypeScript using Playwright.
Design System Testing
The design system playground includes comprehensive testing infrastructure:
Interactive Testing (MCP):
- Start playground server: `just watch-mosaic-playground`
- Use Claude Code with Playwright MCP commands for visual verification
- Commands whitelisted in `.claude/settings.json`
- Best for: Component development, design verification, manual testing
Automated Testing (CI/CD):
- TypeScript-based Playwright setup in `modules/mosaic/playground-e2e-tests/`
- Functional, accessibility, and responsive design testing in CI
- HTML + JSON reporters for CI/CD integration
- Best for: End-to-end user flows, automated regression detection
For single-component interactions, prefer Rust tests; reserve Playwright for complete user flows.
Unit tests are the foundation of our testing strategy. They test individual components in isolation, ensuring that each part of the codebase behaves as expected. Unit tests are fast to write and to execute, and they provide immediate feedback on the correctness of the code.
Integration tests verify the interaction between different components, ensuring that they work together as expected. Integration tests may check the integration between the business logic and the presentation layer, or between the view and the business logic.
End-to-end tests verify the entire system. They simulate real user interactions and check that the system behaves as expected.
In addition to the functional tests, we also need to implement performance tests.
We aim to follow the practice of Test Driven Development (TDD), where tests are written before the code is implemented. This helps to ensure that the code is testable and meets the requirements.
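As a minimal illustration of the unit-test style at the base of the pyramid, here is a plain Rust function with its tests; `slugify` is a hypothetical helper, not taken from the codebase.

```rust
// Hypothetical helper used to illustrate a plain Rust unit test;
// `slugify` is not an actual DSP Repository function.
fn slugify(input: &str) -> String {
    input
        .trim()
        .to_lowercase()
        .chars()
        .map(|c| if c.is_alphanumeric() { c } else { '-' })
        .collect()
}

fn main() {
    println!("{}", slugify("DSP Repository"));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn replaces_non_alphanumeric_characters() {
        assert_eq!(slugify("DSP Repository"), "dsp-repository");
    }

    #[test]
    fn trims_and_lowercases() {
        assert_eq!(slugify("  Mosaic  "), "mosaic");
    }
}
```

In TDD, the two tests above would be written (and seen to fail) before `slugify` is implemented.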
Code Review Guidelines
Review checklist for the DSP Repository. Organized by priority.
Always Check
Fragment Endpoints
- New fragment endpoints follow resource-action nesting convention (see DPE Architecture)
- New Datastar interactions have an `<a href>` fallback for graceful degradation
- ARIA semantics are present on interactive components (`role`, `aria-selected`, `aria-controls`)
Testing
- insta snapshots added/updated for changed SSR output
- E2E test covers the user-facing behavior
- axe-core scan passes on affected pages
- Unit tests for fragment handler edge cases (invalid tab, missing project, etc.)
Architecture
- New API crates follow the `dpe-api-{name}` pattern with `dpe-core` as the only domain dependency (see Project Structure)
- `dpe-core` has no framework dependencies (no Leptos, no Axum)
- Validate command covers all data file types (DPE)
- E2E test directory naming: `web-e2e-tests/` for DPE, `playground-e2e-tests/` for Mosaic
CLI
- CLI subcommands are documented in help text
Documentation
- Documentation updated when patterns change (see About this Documentation)
- New environment variables documented in DPE Operations
Commits
- Commits follow conventional commits (correct prefix, scope matches crate name) — see Workflows and Conventions
- One topic per commit — apply the "and" test
- Each commit builds and passes tests
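For example, a commit message following this convention might look like the following (the scope and wording are illustrative only):

```text
fix(dpe-core): reject empty project shortcodes

Shortcode parsing previously accepted the empty string,
which produced broken links in the project list.
```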
Security
- No secrets in config files, Cargo.toml, or git
- Path parameters validated before filesystem access
Style
- Follow existing Datastar attribute patterns (signal naming with `_` prefix) — see DPE Architecture
- Fragment handlers in the `fragments/` module, not inline in `main.rs`
- Domain types belong in `dpe-core`, not in web or API crates
- API crate exposes a handler function (e.g., `pub async fn oai_handler(...)`) for composition in `dpe-server`
- Leptos components use the `view!` macro consistently
- Test files follow naming convention: `{feature}_tests.rs` for Rust, `{feature}.spec.ts` for Playwright
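The handler-composition rule above can be sketched in plain Rust. The module names stand in for the real `dpe-api-{name}` crates, and the signatures are deliberately simplified (no Axum extractors or async) to show only the composition pattern.

```rust
// Simplified sketch of the composition pattern: each API crate exposes a
// handler function, and the server crate mounts them. Module names stand
// in for the real `dpe-api-{name}` crates; signatures are illustrative.

mod dpe_api_oai {
    // In the real crate this would be `pub async fn oai_handler(...)`
    // taking Axum extractors; simplified to a sync signature here.
    pub fn oai_handler(verb: &str) -> String {
        format!("<OAI-PMH verb=\"{verb}\"/>")
    }
}

mod dpe_server {
    use super::dpe_api_oai;

    // The server crate composes handlers exposed by the API crates.
    pub fn dispatch(path: &str, query: &str) -> Option<String> {
        match path {
            "/oai" => Some(dpe_api_oai::oai_handler(query)),
            _ => None,
        }
    }
}

fn main() {
    println!("{:?}", dpe_server::dispatch("/oai", "Identify"));
}
```

The point of the pattern is that the server crate depends on the API crates, never the other way around.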
Skip
- Snapshot `.snap` file contents — verify they were accepted, don't review formatting
- Formatting-only changes (`cargo fmt`/`leptosfmt` diffs)
- `Cargo.lock` changes from dependency updates
Onboarding
Rust
The main technology we use is Rust. A solid understanding of Rust is needed, though frontend work in particular does not require deep knowledge of the language.
Rust HTTP Server
We use Axum as our HTTP server.
Serialization and Deserialization
We use serde for serialization and deserialization of data.
Web UI
We use Leptos as our UI framework for building reactive web applications in Rust.
Leptos is a full-stack web framework that allows writing both server and client code in Rust. It provides reactive primitives and a component model similar to modern JavaScript frameworks.
Key features:
- The `islands` Cargo feature is enabled workspace-wide (a Leptos 0.8 build requirement)
- Only the Mosaic component library uses actual island components with client-side WASM hydration
- DPE is SSR-only, using Datastar for interactivity — no client-side WASM
- The architecture follows the MPA ("multi-page app") paradigm
- Server-side rendering
- Fine-grained reactivity
- Component-based architecture
- Full Rust syntax support
Architectural Design Patterns
We follow concepts such as Clean Architecture (there is also a book of that name), Hexagonal Architecture, and Onion Architecture. Familiarity with these concepts will be helpful.
Some of the patterns must be adapted to the idioms of Rust, but the general principles are the same.
Testing
We follow the Testing Pyramid approach to testing: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.
Unit and integration tests are written in Rust, following Rust testing best practices. End-to-end tests can be written using Playwright; Leptos has some built-in support for Playwright.
Domain Driven Design
We do not follow strict Domain Driven Design (DDD) principles, but we try to follow some of the concepts. In particular, we try to keep the language used in code aligned with the domain language.
Test Driven Development
We aim to practice Test Driven Development (TDD) and Behavior Driven Development (BDD).
Database
We are still evaluating the database to use.
For the initial development, we work with static content or JSON files.
Mosaic Component Library
The Mosaic component library provides reusable UI components built with Leptos and Tailwind CSS.
Components are defined in `modules/mosaic/tiles/` and can be previewed in the playground application at `modules/mosaic/playground/`.
To run the playground locally:
just watch-mosaic-playground
Pull requests that modify files in `modules/mosaic/` automatically receive a Cloud Run preview deployment. The preview URL is posted as a comment on the PR.