
Introduction

This is the monorepo for the DSP Repository.

The DaSCH Service Platform (DSP) consists of two main components:

  • DSP VRE: The DSP Virtual Research Environment (VRE), where researchers can work on their data during the lifetime of the project. It consists of the DSP-APP, DSP-API and DSP-TOOLS.
    The DSP VRE is developed in various other git repositories.
  • DSP Repository: The DSP Repository is the long-term archive for research data. It consists of the DSP Archive and the Discovery and Presentation Environment (DPE).
    The DSP Repository is developed in this monorepo.

Additionally, the monorepo contains the DSP Design System and DaSCH's website.

This documentation provides an overview of the project structure. It covers the different components of the system architecture, the design system we use for development, and the processes we follow for working on the DSP Repository, including onboarding information.

About this Documentation

This documentation is built using mdBook.

Prerequisites

Before contributing, please ensure you have the following installed:

Any further dependencies can be installed using just commands:

just install-requirements

Building and Serving the Documentation

To run the documentation locally, use:

just docs-serve

Contributing to the Documentation

mdBook uses Markdown for documentation.

The documentation is organized into chapters and sections, which are defined in the SUMMARY.md file. Each section corresponds to a Markdown file in the src directory.
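As a shape reference, a SUMMARY.md maps chapter titles to files with nested list items (the chapter names below are illustrative, not the actual table of contents):

```markdown
# Summary

- [Introduction](./introduction.md)
- [Project Structure](./project-structure.md)
  - [Crates](./project-structure/crates.md)
- [Design System](./design-system.md)
```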

To configure the documentation (e.g. adding plugins), modify the book.toml file.
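For orientation, a minimal book.toml has this shape (the values are illustrative; plugins are registered as preprocessor sections):

```toml
[book]
title = "DSP Repository Documentation"
src = "src"

# Plugins are added as preprocessor tables, e.g.:
# [preprocessor.some-plugin]
```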

Deployment

This documentation is not yet deployed. The deployment process will be defined in the future.

Workflows and Conventions

Entry Points

The primary entry point of this repository is the README file, which should point anyone to the information they need.

For any interaction or coding-related workflow, the justfile is the primary source of truth. The justfile contains all the commands and workflows that are used in this repository, along with their descriptions.
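For reference, a just recipe pairs a name with the commands it runs, with its description given as a comment above the recipe (the recipe body below is illustrative; the justfile itself is authoritative):

```just
# Serve the documentation locally
docs-serve:
    mdbook serve docs --open
```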

Any further information should be located in the documentation.

Git Workflow

For this repository, we use a rebase workflow. This means that all changes should be made on a branch, and then rebased onto the main branch before being merged.

This allows us to keep a clean commit history and avoid merge commits.
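The effect of this workflow can be demonstrated in a throwaway repository: after rebasing, the branch history is linear and contains no merge commit (paths and commit messages below are illustrative):

```shell
set -e
# Create a throwaway repo (requires git >= 2.28 for `init -b`)
dir=$(mktemp -d) && cd "$dir"
git init -q -b main
git config user.email dev@example.com
git config user.name dev

echo base > base.txt && git add base.txt && git commit -qm "initial"

# Work happens on a feature branch...
git switch -q -c feature
echo feature > feature.txt && git add feature.txt && git commit -qm "feature work"

# ...while main moves ahead.
git switch -q main
echo more > more.txt && git add more.txt && git commit -qm "main moved on"

# Rebase the branch onto main before merging: history stays linear.
git switch -q feature
git rebase -q main
git log --format=%s
```
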

Project Structure and Code Organization

Overview

This repository contains the source code for the DSP Repository. It is structured as a Rust workspace, with multiple crates.

All Rust crates are organized as subdirectories within the modules/ directory.

Project Structure

[!WARNING]
This page is not up-to-date.

Workspace Layout

.
├── Cargo.toml         # Workspace manifest
├── types/             # Domain models, traits, and shared errors
├── services/          # Pure business logic implementations
├── storage/           # Data persistence (DB, in-memory) implementations
├── html_api/          # HTML routes, templates, and SSE endpoints
├── http_api/          # JSON RPC/REST endpoints
├── server/            # Web server binary crate
└── cli/               # CLI binary crate for tools and scripts

Crate Responsibilities

types/ – Domain Types and Interfaces

This crate defines all core data structures, error types, and trait interfaces shared across the application. It is dependency-light and logic-free.

  • Domain models (ProjectCluster, ResearchProject, Collection, Dataset, Record, etc.)
  • Error types (AppError)
  • Trait definitions:
    • Service traits: MetadataService
    • Repository traits: MetadataRepository

Example:

#[async_trait]
pub trait MetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError>;
}

services/ – Business Logic

Implements the types::service traits and contains all pure application logic.

  • Depends on types
  • Free of side effects and I/O
  • Easily testable
  • Orchestrates workflows and enforces business rules

Example:

pub struct MetadataServiceImpl<R: MetadataRepository> {
    pub repo: R,
}

#[async_trait]
impl<R: MetadataRepository> MetadataService for MetadataServiceImpl<R> {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        self.repo.find_by_id(id).await
    }
}

storage/ – Persistence Layer

Implements the types::storage traits to access data in external systems such as SQLite or in-memory stores.

  • No business logic
  • Easily swappable with mocks or test implementations

Example:

use std::collections::HashMap;

// Concrete fields are illustrative; any in-memory store works.
pub struct InMemoryMetadataRepository {
    data: HashMap<String, ResearchProject>,
}

#[async_trait]
impl MetadataRepository for InMemoryMetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        // In-memory lookup; assumes AppError has a NotFound variant.
        self.data.get(id).cloned().ok_or(AppError::NotFound)
    }
}

html_api/ – HTML Hypermedia + SSE

Handles the user-facing UI layer, serving HTML pages and fragments, and live-updating data via SSE.

  • Askama templates
  • DataStar for link generation
  • SSE endpoints for live features like notifications, progress updates
  • Routes for page rendering and form submissions

Example:

// Axum has no attribute-macro routing; the handler is registered with:
// Router::new().route("/users/:id", get(user_profile))
async fn user_profile(State(app): State<AppState>, Path(id): Path<Uuid>) -> Result<impl IntoResponse, AppError> {
    let user = app.user_service.get_user_profile(id).await?;
    Ok(HtmlTemplate(UserProfileTemplate { user }))
}

http_api/ – Machine-Readable HTTP API

Exposes your application logic through a structured JSON API for integration with JavaScript frontends or third-party services.

  • Cleanly separates business logic from representation
  • Handles serialization and input validation

Example:

// Registered with: Router::new().route("/api/users/:id", get(get_user))
async fn get_user(State(app): State<AppState>, Path(id): Path<Uuid>) -> Result<impl IntoResponse, AppError> {
    let user = app.user_service.get_user_profile(id).await?;
    Ok(Json(user))
}

server/ – Web Server Binary

This crate is the entrypoint for running the full web application.

  • Loads configuration
  • Initializes services and storage
  • Combines all route layers (html_api, http_api)
  • Starts the Axum server

Example:

#[tokio::main]
async fn main() -> Result<(), AppError> {
    let storage = PostgresUserRepository::new(...);
    let service = UserServiceImpl { repo: storage };

    let app = Router::new()
        .merge(html_api::routes(service.clone()))
        .merge(http_api::routes(service.clone()));

    // Bind and serve (axum 0.7+ style; the address is illustrative)
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
    axum::serve(listener, app).await?;
    Ok(())
}

cli/ – Command-Line Interface

Provides a CLI for administrative or batch tasks such as:

  • Import/export of data
  • Cleanup scripts
  • Background migrations
  • Developer utilities

Example (using clap):

#[derive(Parser)]
enum Command {
    ImportUsers { file: PathBuf },
    ReindexSearch,
}

Run via:

cargo run --bin cli -- import-users ./users.csv

Benefits of This Structure

| Aspect | Benefit |
| --- | --- |
| Separation of concerns | Clear boundaries between domain, logic, persistence, and delivery |
| Modular | Each crate can be tested and reused independently |
| Team-friendly | Frontend-focused devs work in html_api; backend devs focus on services and storage |
| Testable | Services and repositories can be mocked for unit/integration testing |
| Extensible | Add more APIs (e.g., GraphQL, CLI commands) without modifying existing code |

Development Guidelines

  • Never put business logic in route handlers. Use the service layer.
  • Keep domain models and interfaces free of framework dependencies.
  • Each crate has a single responsibility.
  • SSE endpoints live in html_api, not as a separate API crate.
  • Prefer async traits for I/O-related operations.
  • Write integration tests in the same crate or create a top-level tests/ crate for system-wide tests.

Future Growth Possibilities

  • Add a worker/ crate for background jobs
  • Add a scheduler/ crate for periodic tasks
  • Add a tests/ crate for orchestrated integration tests
  • Add a graphql_api/ or admin_api/ if needed

Getting Started

To run the application server:

cargo run --bin server

To run the CLI:

cargo run --bin cli -- help

Summary

This modular design ensures clarity, maintainability, and smooth collaboration for both backend and frontend developers. The split between crates follows clean architecture principles and allows for focused development, rapid iteration, and clear testing strategies.

DSP Design System

Introduction

The DSP Design System is a customization of the IBM Carbon Design System. It follows Carbon in terms of design language and implementation. It is customized in the following ways:

  • It is not a general purpose design system, but a purpose-built system. As such, it is much smaller and less complex.
  • It is not generic or customizable; instead, the DSP brand is baked into it, which reduces complexity of use.
  • It is purposefully kept small:
    • It comes with two themes (dark and light, corresponding to "gray-90" and "gray-10" in Carbon) and no option for custom theming.
    • The set of available styles (colors, typography, spacing, etc.) is kept intentionally small to promote consistent user interfaces.
    • It only has the components that are strictly needed. Additional components may be added, when necessary.
    • It only has the component variants that are strictly needed. Additional component variants may be added, when necessary.
  • It may have purpose-specific components. (E.g. Carbon does not provide a "Card" component, but rather a "Tile" component, from which cards can be built. The DSP Design System instead would provide a "Card" component.)

Current Implementation Status

The DSP Design System is currently in early development with the following components implemented:

Available Components

Button

  • Variants: Primary, Secondary, Outline
  • Status: 🚧 Work in progress (styling verification against Carbon needed)
  • Features: Basic button functionality with variant support
  • Variants: Accent only, with prefix, with suffix, full (prefix + accent + suffix)
  • Status: ✅ Functional
  • Features: Configurable text sections with accent styling

Shell

  • Purpose: Application navigation and layout wrapper
  • Status: 🚧 Work in progress
  • Features: Header with logo, placeholder navigation, action buttons

Tile

  • Variants: Base, Clickable
  • Status: 🚧 Work in progress (styling verification against Carbon needed)
  • Features: Content containers with Carbon-compliant styling (no borders, shadows, or rounded corners)

Development Environment

Playground

  • URL: http://localhost:3400 (via just run-watch-playground)
  • Features: Live component testing with structured sections and isolated examples
  • Status: ✅ Fully functional with improved layout

Component Architecture Decision

Composability Approach

We use a Maud-Native + Props combination for component composability:

Core Principles

  1. Maud Markup Return Types: Components return maud::Markup instead of String for zero-copy composition
  2. Props Structs: Complex components use dedicated props structs for clear parameter grouping
  3. Simple Functions: Basic components remain as simple functions
  4. Flexible Text Input: Use impl Into<String> for text parameters
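Principle 4 in isolation: `impl Into<String>` lets callers pass either `&str` or `String` without explicit conversion (the `label` helper below is illustrative, not a real component):

```rust
// Illustrative helper: accepts &str and String alike.
fn label(text: impl Into<String>) -> String {
    format!("<span class=\"dsp-label\">{}</span>", text.into())
}

fn main() {
    let owned = String::from("Datasets");
    // Both call styles compile and produce the same markup.
    assert_eq!(label("Datasets"), label(owned));
    println!("{}", label("Datasets"));
}
```
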

Component Patterns

Simple Components

use maud::{Markup, html};

pub fn button(text: impl Into<String>) -> Markup {
    html! {
        button .dsp-button { (text.into()) }
    }
}

Complex Components with Props

pub struct CardProps {
    pub title: String,
    pub content: Markup,
    pub variant: CardVariant,
}

pub fn card(props: CardProps) -> Markup {
    html! {
        div .dsp-card {
            h2 .dsp-card__title { (props.title) }
            div .dsp-card__content { (props.content) }
        }
    }
}

Component Composition

// Direct nesting - zero-copy composition
pub fn page_header(title: impl Into<String>, actions: Markup) -> Markup {
    html! {
        header .dsp-page-header {
            h1 { (title.into()) }
            div .dsp-page-header__actions {
                (actions)  // Direct Markup insertion
            }
        }
    }
}

Benefits

  • Efficient: No string concatenation overhead
  • Type Safe: Compile-time guarantees for component structure
  • Composable: Components nest naturally without conversion
  • Extensible: Props structs make adding parameters easy
  • Consistent: Unified approach across all components

Migration Path

  1. Convert existing components to return Markup
  2. Add props structs for components with 3+ parameters
  3. Use impl Into<String> for text inputs
  4. Test composition in playground

Discovery, Design and Development Process

Any work on the DSP Repository is done in collaboration between Management, Product Management (PM), the Research Data Unit (RDU) and Development (Dev). The process should look as outlined below, but may be adjusted to fit the needs of the project and the team.

In discovery, PM validates that an opportunity is aligned with DaSCH's strategy. In collaboration with the RDU, PM verifies that the opportunity can provide a desirable outcome.

PM will create a project description, including low-fidelity wireframes (Balsamiq or pen and paper), based on which they define user flows and journeys.

If any design components are needed, these will be added to the design system.

Finally, high-fidelity wireframes will be created in Figma, if needed.

Based on the project description and the wireframes, Dev will refine the project description, create Linear tickets and implement it accordingly.

When the implementation is done, PM will verify that the outcome was achieved and identify opportunities for further improvements.

Tech Stack

  • Rust
  • Axum
  • Askama or Maud
  • DataStar
  • Database TBD

We keep the design evolutionary, starting from the simplest possible solution and iterating on it. At first, providing data from static JSON files, or working with static content, is sufficient. Following clean architecture principles, swapping out the persistence layer is easy.

Testing and Quality Assurance

We follow the Testing Pyramid approach to testing: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.

Unit and integration tests are written in Rust; end-to-end tests are written either in Rust or in JavaScript using Playwright.

[!note] We still need to verify that Playwright works well with the current setup.

Playwright may be used for the following:

  • End-to-end tests simulating entire user flows.
  • Visual regression tests.

For single user interactions, we should not need Playwright. For these, we should use the Rust testing framework.

Unit tests are the foundation of our testing strategy. They test individual components in isolation, ensuring that each part of the codebase behaves as expected. Unit tests are fast to write and to execute, and they provide immediate feedback on the correctness of the code.
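As a sketch, a unit test exercises one pure function in isolation (the `slugify` function below is hypothetical, not part of the codebase):

```rust
// Hypothetical pure function under test.
fn slugify(title: &str) -> String {
    title.trim().to_lowercase().replace(' ', "-")
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn slugify_normalizes_whitespace_and_case() {
        assert_eq!(slugify("  DSP Repository "), "dsp-repository");
    }
}

fn main() {
    // Same assertion, runnable outside `cargo test`:
    assert_eq!(slugify("  DSP Repository "), "dsp-repository");
}
```
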

Integration tests verify the interaction between different components, ensuring that they work together as expected. Integration tests may check the integration between the business logic and the presentation layer, or between the view and the business logic.

End-to-end tests verify the entire system. They simulate real user interactions and check that the system behaves as expected.

In addition to the functional tests, we also need to implement performance tests.

We aim to follow the practice of Test Driven Development (TDD), where tests are written before the code is implemented. This helps to ensure that the code is testable and meets the requirements.

Onboarding

Rust

The main technology we use is Rust. A solid understanding of Rust is needed, though frontend work in particular does not require deep knowledge of the language.

Rust HTTP Server

We use Axum as our HTTP server.

Serialization and Deserialization

We use serde for serialization and deserialization of data.

Rust Templating

We are still evaluating two templating engines for rendering HTML in Rust:

Askama is a compile-time templating engine that is very similar to Jinja2. It uses a file-based approach: templates live in separate files, and dynamic content is injected into them.

Maud is a macro-based templating engine that allows writing HTML directly in Rust code. It is more similar to JSX, where the HTML is written inline with the Rust code.

We have to try both engines and see which one fits our needs better. In principle, we can use both engines in the same project, if they excel at different things. But for simplicity, we should try to stick to one engine.

Architectural Design Patterns

We follow concepts such as Clean Architecture, Hexagonal Architecture or Onion Architecture. Familiarity with these concepts will be helpful.

Some of the patterns must be adapted to the idioms of Rust, but the general principles are the same.

Testing

We follow the Testing Pyramid approach to testing: the majority of tests are unit tests, with a smaller number of integration tests and a few end-to-end tests.

Unit and integration tests are written in Rust, following Rust testing best practices. End-to-end tests are written either in Rust or in JavaScript using Playwright.

We are looking into how to integrate Playwright into our setup. Playwright may be used for the following:

  • End-to-end tests simulating entire user flows.
  • Visual regression tests.

For single user interactions, we should not need Playwright. For these, we should use the Rust testing framework.

Domain Driven Design

We do not follow strict Domain Driven Design (DDD) principles, but we try to follow some of the concepts. In particular, we try to keep the language used in code aligned with the domain language.

Test Driven Development

We should absolutely do TDD and BDD.

Database

We are still evaluating the database to use.

For the initial development, we work with static content or JSON files.

Hypermedia

Instead of the traditional single-page application architecture, where the client is a JavaScript application that runs in the browser and communicates with the server via JSON over HTTP, we use a hypermedia approach.

In this approach, the server sends HTML pages or fragments to the client, which simply renders the HTML without much JavaScript. The client also does not need to maintain state; that is handled by the server. Server and client keep a connection open through which the server can push updates to the client as so-called server-sent events (SSE), keeping the UI responsive and interactive.
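The SSE wire format itself is simple: each event is a few `field: value` lines terminated by a blank line. A minimal sketch of that framing (a real handler would use Axum's SSE support rather than formatting frames by hand):

```rust
// Format one server-sent event carrying an HTML fragment.
fn sse_event(event: &str, html_fragment: &str) -> String {
    format!("event: {event}\ndata: {html_fragment}\n\n")
}

fn main() {
    let frame = sse_event("notification", "<li>New dataset published</li>");
    print!("{frame}");
    // A blank line terminates each event.
    assert!(frame.ends_with("\n\n"));
}
```
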

This approach has several advantages:

  • The client is much simpler, as it does not need to maintain state or manage complex interactions.
  • The client does not have many security concerns, as the server is the source of truth and can control what is sent to the client.
  • The server can update the client at any time, without the need for the client to poll for updates.

Furthermore, with this approach, we can share much of the code between the server and the client. With the UI-code living on the server, we can use the same language (Rust) for both server and client, and there is no strict separation between backend and frontend.

The hypermedia approach is not a new concept: it was widely used in the past, was largely replaced by the single-page application approach, and has lately been making a comeback.

This approach is best known from HTMX. HTMX is fairly well established, and there is a lot of learning material available. Its major downside is that, used by itself, it is not sufficient for most use cases; most of the time, you need to combine it with other libraries, such as Alpine.js.

Rather than HTMX, we are using DataStar, which provides similar functionality, but is more compliant with web standards, and provides a more complete solution, so that we do not need to combine it with other libraries.

DataStar is a fairly new project, so there is not much learning material available yet. However, it is very similar to HTMX, so that most of the HTMX learning material can be applied to DataStar as well.

Design System

Rather than using a generic design system, and designing solutions on top of it ad hoc, we are using a purpose-built design system. This will help us to keep the design consistent and coherent, reduce complexity, and give our products a clearer brand identity.

This design system is based on the IBM Carbon Design System, but is customized to our needs.

Concept, Aim and Purpose

We build the DSP Design System on top of the Carbon Design System, because it communicates the reliability and trustworthiness that users expect from an archive. Furthermore, it is a well established design system, where concerns such as accessibility and usability have been well thought out.

There are several reasons why we do not use the Carbon Design System as is, and instead build our own design system on top of it:

  • It is a customizable, generic, general-purpose design system.
    As such, it provides a lot of options and flexibility, but comes with a lot of complexity.
    By customizing it to our needs, we can reduce that complexity. By limiting options (e.g. components, icons, tokens) to the ones we need, we can simplify design and ensure consistency in our products.
  • It is customizable, but we do not need that flexibility.
    By creating a purpose-built design system, we can bake our brand into it and reduce complexity even further.
  • The official and community implementations of Carbon do not align with our tech stack.
    By creating our own implementation, we can tailor it to our needs and move the implementation of individual components to dedicated design system work, rather than having to implement them in the context of project work.
    The design system work will be done in close collaboration between Dev and PM, and is an up-front investment that will improve the speed and quality of project work in the long run.

Implementation

It is part of the discovery process to define components needed for any project.

These are then implemented in the design system as individual, reusable components. It should then be possible to import these components as Rust modules and use them in the project code.

The details of the implementation depend on the rendering engine we decide to use.

Playground

We create a playground for the design system, where we can test and experiment with the components.

This playground is a separate Rust crate that depends on the design system crate. It is loosely inspired by Storybook, but intentionally kept simpler.

For each component, we create a page that shows the component in action, in all its variants and states. We also create pages for compounds and patterns, showing how to use the components together. Finally, we create sample pages that show how to use the components in a real-world scenario.