# Introduction
This is the monorepo for the DSP Repository.
The DaSCH Service Platform (DSP) consists of two main components:
- DSP VRE: The DSP Virtual Research Environment (VRE), where researchers can work on their data during the lifetime of the project. It consists of the DSP-APP, DSP-API and DSP-TOOLS. The DSP VRE is developed in various other git repositories.
- DSP Repository: The DSP Repository is the long-term archive for research data. It consists of the DSP Archive and the Discovery and Presentation Environment (DPE). The DSP Repository is developed in this monorepo.
Additionally, the monorepo contains the DSP Design System and DaSCH's website.
This documentation provides an overview of the project structure. It covers the different components of the system architecture, the design system we use for development, and the processes we follow when working on the DSP Repository, including onboarding information.
# About this Documentation
This documentation is built using mdBook.
## Prerequisites
Before contributing, please ensure you have the following installed:

Any further dependencies can be installed using `just` commands:

```shell
just install-requirements
```
## Building and Serving the Documentation
To run the documentation locally, use:

```shell
just docs-serve
```
## Contributing to the Documentation
mdBook uses Markdown for documentation.
The documentation is organized into chapters and sections, which are defined in the `SUMMARY.md` file.
Each section corresponds to a Markdown file in the `src` directory.
To configure the documentation (e.g. adding plugins), modify the `book.toml` file.
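For illustration, a minimal `book.toml` might look like the following sketch; the title and preprocessor name here are placeholders, not the repository's actual configuration:

```toml
[book]
title = "DSP Repository Documentation"  # placeholder title
src = "src"

# Plugins are registered as preprocessor tables; "example" is hypothetical.
[preprocessor.example]
command = "mdbook-example"
```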
## Deployment
This documentation is not yet deployed. The deployment process will be defined in the future.
# Workflows and Conventions
## Entry Points
The first entry point of this repository is the `README` file, which should give anyone an indication of where to find the information they need.
For any interaction or coding-related workflow, the `justfile` is the primary source of truth. The `justfile` contains all the commands and workflows that are used in this repository, along with their descriptions.
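As an illustration of the shape such recipes take (the recipes below are hypothetical; the repository's actual `justfile` is the source of truth):

```just
# Serve the documentation locally
docs-serve:
    mdbook serve docs

# Install further dependencies
install-requirements:
    cargo install mdbook
```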
Any further information should be located in the documentation.
## Git Workflow
For this repository, we use a rebase workflow. This means that all changes should be made on a branch, and then rebased onto the main branch before being merged.
This allows us to keep a clean commit history and avoid merge commits.
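The workflow can be demonstrated end to end in a throwaway repository; branch and commit names below are examples only:

```shell
# Self-contained demo of the rebase workflow in a temporary repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git checkout -q -b main
git commit -q --allow-empty -m "initial"

# Work happens on a feature branch...
git checkout -q -b feature/demo
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"

# ...while main moves on.
git checkout -q main
echo "main" > main.txt
git add main.txt
git commit -q -m "main moved on"

# Rebase replays the feature commits on top of main: linear history, no merge commit.
git checkout -q feature/demo
git rebase -q main
git log --oneline
```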
# Project Structure and Code Organization
## Overview
This repository contains the source code for the DSP Repository. It is structured as a Rust workspace with multiple crates.
All Rust crates are organized as subdirectories within the `modules/` directory.
## Project Structure
> [!WARNING]
> This page is not up-to-date.
### Workspace Layout
```text
.
├── Cargo.toml    # Workspace manifest
├── types/        # Domain models, traits, and shared errors
├── services/     # Pure business logic implementations
├── storage/      # Data persistence (DB, in-memory) implementations
├── html_api/     # HTML routes, templates, and SSE endpoints
├── http_api/     # JSON RPC/REST endpoints
├── server/       # Web server binary crate
└── cli/          # CLI binary crate for tools and scripts
```
### Crate Responsibilities
#### `types/` – Domain Types and Interfaces

This crate defines all core data structures, error types, and trait interfaces shared across the application. It is dependency-light and logic-free.

- Domain models (`ProjectCluster`, `ResearchProject`, `Collection`, `Dataset`, `Record`, etc.)
- Error types (`AppError`)
- Trait definitions:
  - Service traits: `MetadataService`
  - Repository traits: `MetadataRepository`
Example:

```rust
pub trait MetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError>;
}
```
#### `services/` – Business Logic

Implements the `types::service` traits and contains all pure application logic.

- Depends on `types`
- Free of side effects and I/O
- Easily testable
- Orchestrates workflows and enforces business rules
Example:

```rust
pub struct MetadataServiceImpl<R: MetadataRepository> {
    pub repo: R,
}

#[async_trait]
impl<R: MetadataRepository> MetadataService for MetadataServiceImpl<R> {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        self.repo.find_by_id(id).await
    }
}
```
#### `storage/` – Persistence Layer

Implements the `types::storage` traits to access data in external systems such as SQLite or in-memory stores.

- No business logic
- Easily swappable with mocks or test implementations
Example:

```rust
pub struct InMemoryMetadataRepository { /* ... */ }

#[async_trait]
impl MetadataRepository for InMemoryMetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        // In-memory lookup logic
    }
}
```
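The elided lookup can be filled in as follows. This is a self-contained, hypothetical sketch: the real `types` crate defines the actual models and errors, and `async fn` in traits (stable since Rust 1.75) stands in for `#[async_trait]`. The single-poll executor is only here so the snippet runs without tokio; the real server uses an async runtime.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical stand-ins for the domain types defined in `types/`.
#[derive(Clone, Debug)]
pub struct ResearchProject {
    pub id: String,
    pub name: String,
}

#[derive(Debug)]
pub enum AppError {
    NotFound(String),
}

pub trait MetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError>;
}

pub struct InMemoryMetadataRepository {
    pub items: HashMap<String, ResearchProject>,
}

impl MetadataRepository for InMemoryMetadataRepository {
    async fn find_by_id(&self, id: &str) -> Result<ResearchProject, AppError> {
        self.items
            .get(id)
            .cloned()
            .ok_or_else(|| AppError::NotFound(id.to_string()))
    }
}

// Minimal executor: a single poll suffices because the future above
// never actually awaits anything.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    fn raw_waker(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(raw_waker, noop, noop, noop);
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    match pin!(fut).poll(&mut cx) {
        Poll::Ready(out) => out,
        Poll::Pending => unreachable!("the sketch future never awaits"),
    }
}
```

Because the repository is behind a trait, tests can construct it directly with seed data and exercise the service layer without any database.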
#### `html_api/` – HTML Hypermedia + SSE

Handles the user-facing UI layer, serving HTML pages and fragments, and live-updating data via SSE.

- Askama templates
- Datastar for link generation
- SSE endpoints for live features such as notifications and progress updates
- Routes for page rendering and form submissions
Example:

```rust
#[get("/users/:id")]
async fn user_profile(State(app): State<AppState>, Path(id): Path<Uuid>) -> impl IntoResponse {
    let user = app.user_service.get_user_profile(id).await?;
    HtmlTemplate(UserProfileTemplate { user })
}
```
#### `http_api/` – Machine-Readable HTTP API

Exposes the application logic through a structured JSON API for integration with JavaScript frontends or third-party services.

- Cleanly separates business logic from representation
- Handles serialization and input validation
Example:

```rust
#[get("/api/users/:id")]
async fn get_user(State(app): State<AppState>, Path(id): Path<Uuid>) -> impl IntoResponse {
    let user = app.user_service.get_user_profile(id).await?;
    Json(user)
}
```
#### `server/` – Web Server Binary

This crate is the entry point for running the full web application.

- Loads configuration
- Initializes services and storage
- Combines all route layers (`html_api`, `http_api`)
- Starts the Axum server
Example:

```rust
#[tokio::main]
async fn main() -> Result<(), AppError> {
    let storage = PostgresUserRepository::new(...);
    let service = UserServiceImpl { repo: storage };

    let app = Router::new()
        .merge(html_api::routes(service.clone()))
        .merge(http_api::routes(service.clone()));

    axum::Server::bind(&addr).serve(app.into_make_service()).await?;
    Ok(())
}
```
#### `cli/` – Command-Line Interface

Provides a CLI for administrative or batch tasks such as:

- Import/export of data
- Cleanup scripts
- Background migrations
- Developer utilities
Example (using clap):

```rust
#[derive(Parser)]
enum Command {
    ImportUsers { file: PathBuf },
    ReindexSearch,
}
```
Run via:

```shell
cargo run --bin cli -- import-users ./users.csv
```
## Benefits of This Structure
| Aspect | Benefit |
|---|---|
| Separation of concerns | Clear boundaries between domain, logic, persistence, and delivery |
| Modular | Each crate can be tested and reused independently |
| Team-friendly | Frontend-focused devs work in `html_api`; backend devs focus on `services` and `storage` |
| Testable | Services and repositories can be mocked for unit/integration testing |
| Extensible | Add more APIs (e.g. GraphQL, CLI commands) without modifying existing code |
## Development Guidelines

- Never put business logic in route handlers. Use the service layer.
- Keep domain models and interfaces free of framework dependencies.
- Each crate has a single responsibility.
- SSE endpoints live in `html_api`, not in a separate API crate.
- Prefer async traits for I/O-related operations.
- Write integration tests in the same crate, or create a top-level `tests/` crate for system-wide tests.
## Future Growth Possibilities

- Add a `worker/` crate for background jobs
- Add a `scheduler/` crate for periodic tasks
- Add a `tests/` crate for orchestrated integration tests
- Add a `graphql_api/` or `admin_api/` if needed
## Getting Started

To run the application server:

```shell
cargo run --bin server
```

To run the CLI:

```shell
cargo run --bin cli -- help
```
## Summary
This modular design ensures clarity, maintainability, and smooth collaboration for both backend and frontend developers. The split between crates follows clean architecture principles and allows for focused development, rapid iteration, and clear testing strategies.
# DSP Design System

## Introduction
The DSP Design System is a customization of the IBM Carbon Design System. It follows Carbon in terms of design language and implementation. It is customized in the following ways:
- It is not a general-purpose design system, but a purpose-built system. As such, it is much smaller and less complex.
- It is not generic or customizable; instead, the DSP brand is baked into it, which simplifies its use.
- It is purposefully kept small:
  - It comes with two themes (dark and light, corresponding to "gray-90" and "gray-10" in Carbon) and no option for custom theming.
  - The set of available styles (colors, typography, spacing, etc.) is kept intentionally small to promote consistent user interfaces.
  - It only has the components that are strictly needed. Additional components may be added when necessary.
  - It only has the component variants that are strictly needed. Additional component variants may be added when necessary.
- It may have purpose-specific components. (E.g. Carbon does not provide a "Card" component, but rather a "Tile" component, from which cards can be built. The DSP Design System instead would provide a "Card" component.)
## Current Implementation Status
The DSP Design System is currently in early development with the following components implemented:
### Available Components

#### Button

- Variants: Primary, Secondary, Outline
- Status: 🚧 Work in progress (styling verification against Carbon needed)
- Features: Basic button functionality with variant support

#### Banner

- Variants: Accent only, with prefix, with suffix, full (prefix + accent + suffix)
- Status: ✅ Functional
- Features: Configurable text sections with accent styling

#### Shell

- Purpose: Application navigation and layout wrapper
- Status: 🚧 Work in progress
- Features: Header with logo, placeholder navigation, action buttons

#### Tile

- Variants: Base, Clickable
- Status: 🚧 Work in progress (styling verification against Carbon needed)
- Features: Content containers with Carbon-compliant styling (no borders, shadows, or rounded corners)
### Development Environment

#### Playground

- URL: http://localhost:3400 (via `just run-watch-playground`)
- Features: Live component testing with structured sections and isolated examples
- Status: ✅ Fully functional with improved layout
## Component Architecture Decision

### Composability Approach

We use a Maud-Native + Props combination for component composability:

#### Core Principles

- Maud `Markup` return types: Components return `maud::Markup` instead of `String` for zero-copy composition
- Props structs: Complex components use dedicated props structs for clear parameter grouping
- Simple functions: Basic components remain as simple functions
- Flexible text input: Use `impl Into<String>` for text parameters
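The flexible-text-input principle can be illustrated with a small, hypothetical helper (not part of the design system): callers may pass either `&str` or `String` without converting at the call site.

```rust
// Hypothetical helper demonstrating `impl Into<String>` parameters.
fn label(text: impl Into<String>) -> String {
    format!("dsp: {}", text.into())
}

// Both call styles compile; no .to_string() needed at the call site:
//   label("save")
//   label(String::from("save"))
```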
#### Component Patterns

##### Simple Components

```rust
use maud::{Markup, html};

pub fn button(text: impl Into<String>) -> Markup {
    html! {
        button .dsp-button { (text.into()) }
    }
}
```
##### Complex Components with Props

```rust
pub struct CardProps {
    pub title: String,
    pub content: Markup,
    pub variant: CardVariant,
}

pub fn card(props: CardProps) -> Markup {
    html! {
        div .dsp-card {
            h2 .dsp-card__title { (props.title) }
            div .dsp-card__content { (props.content) }
        }
    }
}
```
##### Component Composition

```rust
// Direct nesting – zero-copy composition
pub fn page_header(title: impl Into<String>, actions: Markup) -> Markup {
    html! {
        header .dsp-page-header {
            h1 { (title.into()) }
            div .dsp-page-header__actions {
                (actions) // Direct Markup insertion
            }
        }
    }
}
```
#### Benefits

- Efficient: No string concatenation overhead
- Type safe: Compile-time guarantees for component structure
- Composable: Components nest naturally without conversion
- Extensible: Props structs make adding parameters easy
- Consistent: Unified approach across all components
#### Migration Path

1. Convert existing components to return `Markup`
2. Add props structs for components with 3+ parameters
3. Use `impl Into<String>` for text inputs
4. Test composition in the playground
# Discovery, Design and Development Process
Any work on the DSP Repository is done in collaboration between Management, Product Management (PM), the Research Data Unit (RDU) and Development (Dev). The process should look as outlined below, but may be adjusted to fit the needs of the project and the team.
In discovery, PM validates that an opportunity is aligned with DaSCH's strategy. In collaboration with the RDU, PM verifies that the opportunity can provide a desirable outcome.
PM will create a project description, including low-fidelity wireframes (Balsamiq or pen and paper), based on which they define user flows and journeys.
If any design components are needed, these will be added to the design system.
Finally, high-fidelity wireframes will be created in Figma, if needed.
Based on the project description and the wireframes, Dev will refine the project description, create Linear tickets and implement it accordingly.
When the implementation is done, PM will verify that the outcome was achieved and identify opportunities for further improvements.
# Tech Stack

- Rust
- Axum
- Askama or Maud
- DataStar
- Database TBD
We keep the design evolutionary, starting from the simplest possible solution and iterating on it. At first, providing data from static JSON files, or working with static content, is sufficient. Following clean architecture principles, swapping out the persistence layer later is easy.
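A minimal sketch of what that swap looks like, with hypothetical names: the caller depends only on a trait, so a static-data store can later be replaced by a database-backed one without touching the caller. The hand-rolled parsing is deliberately naive, just enough for a flat string array; a real implementation would use serde.

```rust
// Hypothetical store trait; callers never see the concrete backend.
trait ProjectStore {
    fn all(&self) -> Vec<String>;
}

// Backend 1: static JSON seed data (naive parse of a flat string array).
struct StaticJsonStore {
    raw: &'static str,
}

impl ProjectStore for StaticJsonStore {
    fn all(&self) -> Vec<String> {
        self.raw
            .trim()
            .trim_matches(|c| c == '[' || c == ']')
            .split(',')
            .map(|s| s.trim().trim_matches('"').to_string())
            .filter(|s| !s.is_empty())
            .collect()
    }
}

// Swapping in a database-backed implementation changes nothing here.
fn list_projects(store: &dyn ProjectStore) -> Vec<String> {
    store.all()
}
```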
# Testing and Quality Assurance

We follow the Testing Pyramid approach to testing: the majority of tests are unit tests, with a smaller number of integration tests, and a few end-to-end tests.
Unit and integration tests are written in Rust; end-to-end tests are written either in Rust or in JavaScript using Playwright.
> [!NOTE]
> We still need to verify that Playwright works well with the current setup.
Playwright may be used for the following:
- End-to-end tests simulating entire user flows.
- Visual regression tests.
For single user interactions, we should not need Playwright. For these, we should use the Rust testing framework.
Unit tests are the foundation of our testing strategy. They test individual components in isolation, ensuring that each part of the codebase behaves as expected. Unit tests are fast to write and to execute, and they provide immediate feedback on the correctness of the code.
Integration tests verify the interaction between different components, ensuring that they work together as expected. Integration tests may check the integration between the business logic and the presentation layer, or between the view and the business logic.
End-to-end tests verify the entire system. They simulate real user interactions and check that the system behaves as expected.
In addition to the functional tests, we also need to implement performance tests.
We aim to follow the practice of Test Driven Development (TDD), where tests are written before the code is implemented. This helps to ensure that the code is testable and meets the requirements.
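As a minimal illustration of the unit-test layer (the function and test are hypothetical, not part of the codebase): under TDD, the assertion would be written first, and the function implemented to make it pass.

```rust
// Hypothetical pure function, easy to test in isolation.
pub fn slugify(title: &str) -> String {
    title.trim().to_lowercase().replace(' ', "-")
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn slugifies_a_title() {
        // Written first under TDD; the implementation follows.
        assert_eq!(slugify("  DSP Repository "), "dsp-repository");
    }
}
```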
# Onboarding

## Rust
The main technology we use is Rust. A solid understanding of Rust is needed, although frontend work in particular does not require deep knowledge of the language.
## Rust HTTP Server
We use Axum as our HTTP server.
## Serialization and Deserialization
We use serde for serialization and deserialization of data.
## Rust Templating
We are still evaluating two templating engines for rendering HTML in Rust:
Askama is a compile-time templating engine that is very similar to Jinja2. It uses a file-based approach, where separate template files are stored and dynamic content is injected into them.
Maud is a macro-based templating engine that allows writing HTML directly in Rust code. It is more similar to JSX, where the HTML is written inline with the Rust code.
We have to try both engines and see which one fits our needs better. In principle, we can use both engines in the same project, if they excel at different things. But for simplicity, we should try to stick to one engine.
## Architectural Design Patterns
We follow concepts such as Clean Architecture, Hexagonal Architecture or Onion Architecture. Familiarity with these concepts will be helpful.
Some of the patterns must be adapted to the idioms of Rust, but the general principles are the same.
## Testing
We follow the Testing Pyramid approach to testing: the majority of tests are unit tests, with a smaller number of integration tests, and a few end-to-end tests.
Unit and integration tests are written in Rust, following the Rust testing best practices. End-to-end tests are written either in Rust or in JavaScript using Playwright.
We are looking into how to integrate Playwright into our setup. Playwright may be used for the following:
- End-to-end tests simulating entire user flows.
- Visual regression tests.
For single user interactions, we should not need Playwright. For these, we should use the Rust testing framework.
## Domain Driven Design
We do not follow strict Domain Driven Design (DDD) principles, but we try to follow some of the concepts. In particular, we try to keep the language used in code aligned with the domain language.
## Test Driven Development

We should absolutely do Test Driven Development (TDD) and Behavior Driven Development (BDD).
## Database
We are still evaluating the database to use.
For the initial development, we work with static content or JSON files.
# Hypermedia
Instead of the traditional single-page application architecture, where the client is a JavaScript application that runs in the browser, and communicates with the server via JSON over HTTP, we are using a hypermedia approach.
In this approach, the server sends HTML pages or fragments to the client. The client simply renders the HTML, without much JavaScript. The client also does not need to maintain state; this is done by the server. Server and client keep a connection open, through which the server can push updates to the client as so-called server-sent events (SSE), which ensures responsiveness and interactivity.
This approach has several advantages:
- The client is much simpler, as it does not need to maintain state or manage complex interactions.
- The client does not have many security concerns, as the server is the source of truth and can control what is sent to the client.
- The server can update the client at any time, without the need for the client to poll for updates.
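For illustration, the wire format of server-sent events is plain text: each event is a block of `event:` and `data:` lines terminated by a blank line. The HTML payloads below are made-up examples of the kind of fragments a hypermedia server might push:

```text
event: fragment
data: <div id="notifications">3 new records imported</div>

event: fragment
data: <progress max="100" value="42"></progress>
```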
Furthermore, with this approach, we can share much of the code between the server and the client. With the UI-code living on the server, we can use the same language (Rust) for both server and client, and there is no strict separation between backend and frontend.
The hypermedia approach is not a new concept; it was used in the past but was widely replaced by the single-page application approach. Lately, the hypermedia approach has been making a comeback.
This approach is best known from HTMX. HTMX is fairly well established and there is a lot of learning material available. Its major downside is that it is not sufficient for most use cases when used by itself. Most of the time, you need to combine it with other libraries, such as Alpine.js.
Rather than HTMX, we are using DataStar, which provides similar functionality, but is more compliant with web standards, and provides a more complete solution, so that we do not need to combine it with other libraries.
DataStar is a fairly new project, so there is not much learning material available yet. However, it is very similar to HTMX, so that most of the HTMX learning material can be applied to DataStar as well.
# Design System
Rather than using a generic design system, and designing solutions on top of it ad hoc, we are using a purpose-built design system. This will help us to keep the design consistent and coherent, reduce complexity, and give our products a clearer brand identity.
This design system is based on the IBM Carbon Design System, but is customized to our needs.
## Concept, Aim and Purpose
We build the DSP Design System on top of the Carbon Design System, because it communicates the reliability and trustworthiness that users expect from an archive. Furthermore, it is a well established design system, where concerns such as accessibility and usability have been well thought out.
There are several reasons why we do not use the Carbon Design System as is, and instead build our own design system on top of it:

- It is a customizable, generic, general-purpose design system. As such, it provides a lot of options and flexibility, but comes with a lot of complexity. By customizing it to our needs, we can reduce complexity. By limiting options (e.g. components, icons, tokens, etc.) to the ones we need, we can simplify design and ensure consistency in our products.
- It is customizable, but we do not need that flexibility. By creating a purpose-built design system, we can bake our brand into it and reduce complexity even further.
- The official and community implementations of Carbon do not align with our tech stack. By creating our own implementation, we can tailor it to our needs and move the implementation of individual components to dedicated design system work, rather than having to implement them in the context of project work.

The design system work will be done in close collaboration between Dev and PM, and is an up-front investment that will improve the speed and quality of project work in the long run.
## Implementation
It is part of the discovery process to define components needed for any project.
These are then implemented in the design system as individual, reusable components. It should then be possible to import these components as Rust modules and use them in the project code.
The details of the implementation depend on the rendering engine we decide to use.
## Playground
We create a playground for the design system, where we can test and experiment with the components.
This playground is a separate Rust crate that depends on the design system crate. It is loosely inspired by Storybook, but intentionally kept simpler.
For each component, we create a page that shows the component in action, in all its variants and states. We also create pages for compounds and patterns, where we show how to use the components together. Finally, we create sample pages that show how to use the components in a real-world scenario.