Introduction

This is the monorepo for the DSP Repository.

The DaSCH Service Platform (DSP) consists of two main components:

  • DSP VRE: The DSP Virtual Research Environment (VRE), where researchers can work on their data during the lifetime of the project. It consists of the DSP-APP, DSP-API and DSP-TOOLS.
    The DSP VRE is developed in various other git repositories.
  • DSP Repository: The DSP Repository is the long-term archive for research data. It consists of the DSP Archive and the Discovery and Presentation Environment (DPE).
    The DSP Repository is developed in this monorepo.

Additionally, the monorepo contains the DSP Design System and DaSCH's website.

This documentation provides an overview of the project structure. It covers the components of the system architecture, the design system used for development, and the processes we follow when working on the DSP Repository, including onboarding information.

About this Documentation

This documentation is built using mdBook.

Pre-requisites

Before contributing, please ensure you have the following installed:

Any further dependencies can be installed using just commands:

just install-requirements

Building and Serving the Documentation

To run the documentation locally, use:

just docs-serve

Contributing to the Documentation

mdBook uses Markdown for documentation.

The documentation is organized into chapters and sections, which are defined in the SUMMARY.md file. Each section corresponds to a Markdown file in the src directory.

To configure the documentation (e.g. adding plugins), modify the book.toml file.

Deployment

This documentation is built and deployed to GitHub Pages on pushes to main (see Release, Deployment and Versioning).

Workflows and Conventions

Entry Points

The first entry point of this repository is the README file, which should give anyone an indication of where to find any information they need.

For any interaction or coding-related workflow, the justfile is the primary source of truth. The justfile contains all the commands and workflows that are used in this repository, along with their descriptions.

Key Development Commands

The justfile provides self-documenting commands. Key workflows include:

  • just check - Run formatting and linting checks
  • just build - Build all targets
  • just test - Run all tests
  • just run - Run main server
  • just watch - Watch for changes and run tests
  • just run-watch-playground - Run design system playground with hot reload
  • just playground install - Install playground dependencies
  • just playground test - Run design system tests

Run just without arguments to see all available commands with descriptions.

Any further information should be located in the documentation.

CI/CD

GitHub Actions workflows run automatically on pushes and pull requests. See Release, Deployment and Versioning for details on the CI/CD pipelines, including PR preview deployments and production releases.

Git Workflow

For this repository, we use a rebase workflow. This means that all changes should be made on a branch, and then rebased onto the main branch before being merged.

This allows us to keep a clean commit history and avoid merge commits.

Project Structure and Code Organization

Overview

This repository contains the source code for the DSP Repository. It is structured as a Rust workspace, with multiple crates.

All Rust crates are organized as subdirectories within the modules/ directory.

Additionally, any non-Rust code or assets are placed either in the assets/ directory or as a separate module within the modules/ directory.

Release, Deployment and Versioning

CI/CD Pipelines

All CI/CD workflows are defined as GitHub Actions in .github/workflows/.

Checks and Tests

Every push and pull request runs:

  • check.yml — Formatting (rustfmt) and linting (clippy)
  • test.yml — Runs the full test suite

Mosaic Demo

The Mosaic component library demo has two deployment paths:

PR Preview (Cloud Run)

Defined in cloud-run-pull-request.yml.

When a pull request modifies files under modules/mosaic/, a preview of the Mosaic demo is automatically deployed to Google Cloud Run. The preview URL is posted as a comment on the PR and updated on each push.

  • Trigger: PRs that touch modules/mosaic/** (same-repo only, not forks)
  • Service: Ephemeral Cloud Run service per PR
  • Cleanup: The Cloud Run service and container image are deleted when the PR is closed or merged

Authentication uses Workload Identity Federation (keyless, OIDC-based).

Required secrets: GCP_WORKLOAD_IDENTITY_PROVIDER, GCP_SERVICE_ACCOUNT, GCP_REGION, GCP_ARTIFACT_REGISTRY.

Production (Docker Hub + Jenkins)

Defined in docker-publish.yml.

When changes to modules/mosaic/ are merged to main, the demo image is built, pushed to Docker Hub, and a Jenkins webhook triggers the production deployment.

Documentation (GitHub Pages)

Defined in gh-pages.yml. The mdBook documentation is built and deployed to GitHub Pages on pushes to main.

Project Structure

Warning: This page is not up to date.

Workspace Layout

.
├── Cargo.toml         # Workspace manifest
├── types/             # Domain models, traits, and shared errors
├── services/          # Pure business logic implementations
├── storage/           # Data persistence (DB, in-memory) implementations
├── html_api/          # HTML routes, templates, and SSE endpoints
├── http_api/          # JSON RPC/REST endpoints
├── server/            # Web server binary crate
└── cli/               # CLI binary crate for tools and scripts

Crate Responsibilities

types/ – Domain Types and Interfaces

This crate defines all core data structures, error types, and trait interfaces shared across the application. It is dependency-light and logic-free.

  • Domain models (ProjectCluster, ResearchProject, Collection, Dataset, Record, etc.)
  • Error types (AppError)
  • Trait definitions:
    • Service traits: MetadataService
    • Repository traits: MetadataRepository

Example:

use async_trait::async_trait;

#[async_trait]
pub trait MetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError>;
}
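The shared error type and a domain model could be sketched as follows. This is a minimal, standard-library-only illustration: the `AppError` variants and the `ResearchProject` fields are assumptions made for the example, and the trait is written synchronously to keep the sketch free of external crates.

```rust
use std::fmt;

/// Shared error type. The variants are illustrative assumptions.
#[derive(Debug, PartialEq)]
pub enum AppError {
    NotFound(String),
    InvalidInput(String),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::NotFound(id) => write!(f, "not found: {id}"),
            AppError::InvalidInput(msg) => write!(f, "invalid input: {msg}"),
        }
    }
}

impl std::error::Error for AppError {}

/// Domain model with hypothetical fields; it carries data only, no logic.
#[derive(Debug, Clone, PartialEq)]
pub struct ResearchProject {
    pub id: String,
    pub name: String,
}

/// Synchronous simplification of the repository trait.
pub trait MetadataRepository {
    fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError>;
}
```

Keeping these definitions free of framework types is what lets every other crate depend on `types` without pulling in a web or database stack.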

services/ – Business Logic

Implements the types::service traits and contains all pure application logic.

  • Depends on types
  • Free of side effects and I/O
  • Easily testable
  • Orchestrates workflows and enforces business rules

Example:

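use async_trait::async_trait;

pub struct MetadataServiceImpl<R: MetadataRepository> {
    pub repo: R,
}

// `Send + Sync` is needed because `#[async_trait]` produces `Send` futures.
#[async_trait]
impl<R: MetadataRepository + Send + Sync> MetadataService for MetadataServiceImpl<R> {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        // Delegate to the repository; business rules would be enforced here.
        self.repo.find_by_id(id).await
    }
}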

storage/ – Persistence Layer

Implements the types::storage traits to access data in external systems such as SQLite or in-memory stores.

  • No business logic
  • Easily swappable with mocks or test implementations

Example:

pub struct InMemoryMetadataRepository { ... }

#[async_trait]
impl MetadataRepository for InMemoryMetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        // In-memory lookup logic
    }
}
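One way the in-memory lookup could be filled in is a `HashMap` keyed by project ID. The sketch below is an assumption-laden illustration, not the repository's actual implementation: the struct fields, the `with_projects` constructor, and the `NotFound` variant are hypothetical, and the trait is synchronous so that the example needs only the standard library.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct ResearchProject {
    pub id: String,   // hypothetical field
    pub name: String, // hypothetical field
}

#[derive(Debug, PartialEq)]
pub enum AppError {
    NotFound(String), // hypothetical variant
}

/// Synchronous simplification of the repository trait.
pub trait MetadataRepository {
    fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError>;
}

/// Keeps all projects in a map keyed by their ID.
#[derive(Default)]
pub struct InMemoryMetadataRepository {
    projects: HashMap<String, ResearchProject>,
}

impl InMemoryMetadataRepository {
    /// Hypothetical constructor that indexes the given projects by ID.
    pub fn with_projects(projects: Vec<ResearchProject>) -> Self {
        let projects = projects.into_iter().map(|p| (p.id.clone(), p)).collect();
        Self { projects }
    }
}

impl MetadataRepository for InMemoryMetadataRepository {
    fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        self.projects
            .get(id)
            .cloned()
            .ok_or_else(|| AppError::NotFound(id.to_string()))
    }
}
```

Because the type only implements the trait, it can stand in for a database-backed repository in tests without any change to the service layer.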

html_api/ – HTML + SSE

Handles the user-facing UI layer, serving HTML pages and fragments, and live-updating data via SSE.

  • HTML templates
  • SSE endpoints for live features like notifications, progress updates
  • Routes for page rendering and form submissions

Example:

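// Registered on the router, e.g. `.route("/users/:id", get(user_profile))`;
// Axum has no `#[get]` attribute macro.
async fn user_profile(
    State(app): State<AppState>,
    Path(id): Path<Uuid>,
) -> Result<impl IntoResponse, AppError> {
    // `?` needs a `Result` return type; assumes `AppError: IntoResponse`.
    let user = app.user_service.get_user_profile(id).await?;
    Ok(HtmlTemplate(UserProfileTemplate { user }))
}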

http_api/ – Machine-Readable HTTP API

Exposes the application logic through a structured JSON API for integration with JavaScript frontends or third-party services.

  • Cleanly separates business logic from representation
  • Handles serialization and input validation

Example:

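// Registered on the router, e.g. `.route("/api/users/:id", get(get_user))`;
// Axum has no `#[get]` attribute macro.
async fn get_user(
    State(app): State<AppState>,
    Path(id): Path<Uuid>,
) -> Result<impl IntoResponse, AppError> {
    // `?` needs a `Result` return type; assumes `AppError: IntoResponse`.
    let user = app.user_service.get_user_profile(id).await?;
    Ok(Json(user))
}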

server/ – Web Server Binary

This crate is the entrypoint for running the full web application.

  • Loads configuration
  • Initializes services and storage
  • Combines all route layers (html_api, http_api)
  • Starts the Axum server

Example:

#[tokio::main]
async fn main() -> Result<(), AppError> {
    let storage = PostgresUserRepository::new(...);
    let service = UserServiceImpl { repo: storage };

    let app = Router::new()
        .merge(html_api::routes(service.clone()))
        .merge(http_api::routes(service.clone()));

    axum::Server::bind(&addr).serve(app.into_make_service()).await?;
    Ok(())
}

cli/ – Command-Line Interface

Provides a CLI for administrative or batch tasks such as:

  • Import/export of data
  • Cleanup scripts
  • Background migrations
  • Developer utilities

Example (using clap):

use std::path::PathBuf;

use clap::Parser;

// clap derives kebab-case subcommand names: `import-users`, `reindex-search`.
#[derive(Parser)]
enum Command {
    ImportUsers { file: PathBuf },
    ReindexSearch,
}

Run via:

cargo run --bin cli -- import-users ./users.csv
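To make the mapping from command-line arguments to the `Command` enum concrete, here is a hand-written, standard-library-only sketch of what the derive macro does conceptually. It is not how clap is implemented, and the `parse_command` function and its error messages are hypothetical.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
pub enum Command {
    ImportUsers { file: PathBuf },
    ReindexSearch,
}

/// Parses the arguments that follow the binary name (hypothetical helper).
pub fn parse_command(args: &[&str]) -> Result<Command, String> {
    match args {
        // Subcommand with exactly one positional argument.
        ["import-users", file] => Ok(Command::ImportUsers {
            file: PathBuf::from(*file),
        }),
        ["import-users"] => Err("import-users requires a file path".to_string()),
        // Subcommand without arguments.
        ["reindex-search"] => Ok(Command::ReindexSearch),
        _ => Err(format!("unrecognized arguments: {args:?}")),
    }
}
```

With this sketch, `parse_command(&["import-users", "./users.csv"])` yields `Command::ImportUsers` carrying the given path, mirroring the invocation above.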

Benefits of This Structure

| Aspect | Benefit |
|---|---|
| Separation of concerns | Clear boundaries between domain, logic, persistence, and delivery |
| Modular | Each crate can be tested and reused independently |
| Team-friendly | Frontend-focused devs work in html_api; backend devs focus on services and storage |
| Testable | Services and repositories can be mocked for unit/integration testing |
| Extensible | Add more APIs (e.g., GraphQL, CLI commands) without modifying existing code |

Development Guidelines

  • Never put business logic in route handlers. Use the service layer.
  • Keep domain models and interfaces free of framework dependencies.
  • Each crate has a single responsibility.
  • SSE endpoints live in html_api, not as a separate API crate.
  • Prefer async traits for I/O-related operations.
  • Write integration tests in the same crate or create a top-level tests/ crate for system-wide tests.
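The first guideline can be illustrated with a framework-free sketch in which the handler only translates between the transport and the service, while the rule itself lives in the service. All names here (`ProjectService`, `project_title`, `handle_get_project`) and the status-code mapping are assumptions made for the example.

```rust
#[derive(Debug, PartialEq)]
pub enum AppError {
    InvalidInput(String),
    NotFound(String),
}

/// Hypothetical service that owns the business rules.
pub struct ProjectService {
    known_ids: Vec<String>,
}

impl ProjectService {
    pub fn new(known_ids: Vec<String>) -> Self {
        Self { known_ids }
    }

    /// Business rule: an ID must be non-empty and refer to a known project.
    pub fn project_title(&self, id: &str) -> Result<String, AppError> {
        if id.trim().is_empty() {
            return Err(AppError::InvalidInput("id must not be empty".to_string()));
        }
        if self.known_ids.iter().any(|known| known == id) {
            Ok(format!("Project {id}"))
        } else {
            Err(AppError::NotFound(id.to_string()))
        }
    }
}

/// Thin handler: maps the service result to a (status, body) pair and
/// contains no rules of its own.
pub fn handle_get_project(service: &ProjectService, id: &str) -> (u16, String) {
    match service.project_title(id) {
        Ok(title) => (200, title),
        Err(AppError::InvalidInput(msg)) => (400, msg),
        Err(AppError::NotFound(id)) => (404, format!("no project with id {id}")),
    }
}
```

Because the handler holds no logic, the same service can back both html_api and http_api, and the rules can be unit-tested without starting a server.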

Future Growth Possibilities

  • Add a worker/ crate for background jobs
  • Add a scheduler/ crate for periodic tasks
  • Add a tests/ crate for orchestrated integration tests
  • Add a graphql_api/ or admin_api/ if needed

Getting Started

To run the application server:

cargo run --bin server

To run the CLI:

cargo run --bin cli -- help

Summary

This modular design ensures clarity, maintainability, and smooth collaboration for both backend and frontend developers. The split between crates follows clean architecture principles and allows for focused development, rapid iteration, and clear testing strategies.

Discovery, Design and Development Process

Any work on the DSP Repository is done in collaboration between Management, Product Management (PM), the Research Data Unit (RDU) and Development (Dev). The process should look as outlined below, but may be adjusted to fit the needs of the project and the team.

In discovery, PM validates that an opportunity is aligned with DaSCH's strategy. In collaboration with the RDU, PM verifies that the opportunity can provide a desirable outcome.

PM will create a project description, including low-fidelity wireframes (Balsamiq or pen and paper), based on which they define user flows and journeys.

If any design components are needed, these will be added to the design system.

Finally, high-fidelity wireframes will be created in Figma, if needed.

Based on the project description and the wireframes, Dev will refine the project description, create Linear tickets and implement it accordingly.

When the implementation is done, PM will verify that the outcome was achieved and identify opportunities for further improvements.

Tech Stack

Core Technologies

  • Rust - Primary development language (Edition 2021, Toolchain 1.86.0)
  • Axum - HTTP web framework with WebSocket support
  • Leptos - Reactive UI framework for Rust
  • Tailwind CSS v4 - Utility-first CSS framework
  • Database TBD - Currently using static JSON files

Development & Testing

  • Cargo test - Rust test runner
  • Playwright - End-to-end testing
  • Leptosfmt - Leptos code formatter

Architecture Principles

We keep the design evolutionary, starting from the simplest possible solution and iterating on it. At first, providing data from static JSON files, or working with static content, is sufficient. Following clean architecture principles, swapping out the persistence layer is easy.
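As a rough sketch of why the persistence layer stays swappable, the consumer below depends only on a trait, so a static source can later be replaced by a database-backed one without touching it. The names `ProjectSource`, `StaticSource`, `DatabaseSource`, and `Catalog` are hypothetical, and the static source returns hard-coded data as a stand-in for content read from JSON files.

```rust
/// Abstraction over where project data comes from (hypothetical trait).
pub trait ProjectSource {
    fn titles(&self) -> Vec<String>;
}

/// Stand-in for data loaded from static JSON files.
pub struct StaticSource;

impl ProjectSource for StaticSource {
    fn titles(&self) -> Vec<String> {
        vec!["Static project".to_string()]
    }
}

/// Stand-in for a future database-backed implementation.
pub struct DatabaseSource {
    pub rows: Vec<String>,
}

impl ProjectSource for DatabaseSource {
    fn titles(&self) -> Vec<String> {
        self.rows.clone()
    }
}

/// Consumer that knows only the trait, never the concrete source.
pub struct Catalog {
    source: Box<dyn ProjectSource>,
}

impl Catalog {
    pub fn new(source: Box<dyn ProjectSource>) -> Self {
        Self { source }
    }

    /// Renders all titles as a comma-separated listing.
    pub fn listing(&self) -> String {
        self.source.titles().join(", ")
    }
}
```

Switching from `Catalog::new(Box::new(StaticSource))` to a `DatabaseSource` changes only the construction site, which in this layout would be the server crate.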

Implementation Notes

TypeScript is used exclusively for testing and development tooling, not for production runtime code. The core application remains purely Rust-based.

Testing and Quality Assurance

We follow the Testing Pyramid approach: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.

Unit and integration tests are written in Rust. End-to-end tests are written either in Rust or in JavaScript using Playwright.

Design System Testing

The design system playground includes comprehensive testing infrastructure:

Interactive Testing (MCP):

  • Start playground server: just run-watch-playground
  • Use Claude Code with Playwright MCP commands for visual verification
  • Commands whitelisted in .claude/settings.json
  • Best for: Component development, design verification, manual testing

Automated Testing (CI/CD):

  • TypeScript-based Playwright setup with tooling
  • Functional, accessibility, and responsive design testing in CI
  • ESLint, Prettier, and TypeScript checking
  • HTML + JSON reporters for CI/CD integration
  • Best for: End-to-end user flows, automated regression detection

Setup: just playground install then just playground test

Key Commands:

  • just playground test - Run all tests
  • just playground test-ui - Interactive test runner
  • just playground test-debug - Debug mode with browser DevTools
  • just playground type-check - TypeScript validation
  • just playground lint-and-format - Code quality checks

For single component interactions, prefer Rust tests. Playwright is for complete user flows.

Note: We are still evaluating Playwright integration for broader testing use cases.

Unit tests are the foundation of our testing strategy. They test individual components in isolation, ensuring that each part of the codebase behaves as expected. Unit tests are fast to write and to execute, and they provide immediate feedback on the correctness of the code.

Integration tests verify the interaction between different components, ensuring that they work together as expected. Integration tests may check the integration between the business logic and the presentation layer, or between the view and the business logic.

End-to-end tests verify the entire system. They simulate real user interactions and check that the system behaves as expected.

In addition to functional tests, we also need to implement performance tests.

We aim to follow the practice of Test Driven Development (TDD), where tests are written before the code is implemented. This helps to ensure that the code is testable and meets the requirements.
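A test-first unit test in Rust might look like the sketch below: the tests in the `tests` module are written first and state the expected behavior, and the function is then implemented until they pass. The `normalize_shortcode` function and its rule (exactly four hexadecimal digits, upper-cased) are hypothetical and serve only as an example.

```rust
/// Normalizes a shortcode: trims whitespace and upper-cases it.
/// Returns `None` unless the input is exactly four hexadecimal digits.
pub fn normalize_shortcode(input: &str) -> Option<String> {
    let trimmed = input.trim();
    if trimmed.len() == 4 && trimmed.chars().all(|c| c.is_ascii_hexdigit()) {
        Some(trimmed.to_ascii_uppercase())
    } else {
        None
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn accepts_and_normalizes_valid_shortcodes() {
        assert_eq!(normalize_shortcode(" 08ab "), Some("08AB".to_string()));
    }

    #[test]
    fn rejects_malformed_shortcodes() {
        assert_eq!(normalize_shortcode("wxyz"), None);
        assert_eq!(normalize_shortcode("12345"), None);
    }
}
```

Tests placed in a `#[cfg(test)]` module next to the code are compiled only for test builds and are picked up by `just test`, assuming that command wraps `cargo test`.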

Onboarding

Rust

The main technology we use is Rust. A solid understanding of Rust is needed, although frontend work in particular does not require deep Rust knowledge.

Rust HTTP Server

We use Axum as our HTTP server.

Serialization and Deserialization

We use serde for serialization and deserialization of data.

Web UI

We use Leptos as our UI framework for building reactive web applications in Rust.

Leptos is a full-stack web framework that allows writing both server and client code in Rust. It provides reactive primitives and a component model similar to modern JavaScript frameworks.

Key features:

  • Leptos must only be used with the islands feature
  • The architecture follows the multi-page app (MPA) paradigm
  • Server-side rendering
  • Fine-grained reactivity
  • Component-based architecture
  • Full Rust syntax support

Architectural Design Patterns

We follow concepts such as Clean Architecture (there is also a book), Hexagonal Architecture or Onion Architecture. Familiarity with these concepts will be helpful.

Some of the patterns must be adapted to the idioms of Rust, but the general principles are the same.

Testing

We follow the Testing Pyramid approach: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.

Unit and integration tests are written in Rust, following the Rust testing best practices. End-to-end tests can be written using Playwright. Leptos has some built-in support for Playwright.

Domain Driven Design

We do not follow strict Domain Driven Design (DDD) principles, but we try to follow some of the concepts. In particular, we try to keep the language used in code aligned with the domain language.

Test Driven Development

We should absolutely do TDD and BDD.

Database

We are still evaluating the database to use.

For the initial development, we work with static content or JSON files.

Mosaic Component Library

The Mosaic component library provides reusable UI components built with Leptos and Tailwind CSS.

Components are defined in modules/mosaic/tiles/ and can be previewed in the demo application at modules/mosaic/demo/.

To run the demo locally:

just watch-mosaic-demo

Pull requests that modify files in modules/mosaic/ automatically receive a Cloud Run preview deployment. The preview URL is posted as a comment on the PR.