
DSP-METADATA-GUI Metadata Module

The dsp-metadata-gui is a GUI application written in Python for collecting project specific metadata.

It enables researchers and project managers who deposit research data on the DaSCH Service Platform (DSP) to add metadata about their project and datasets to the DSP repository. Providing metadata makes the project searchable on the platform, which is an integral part of the FAIR principles.

The metadata follows the schema defined by the dsp-ontologies.

Install and run

The application has only been tested on Python 3.10.

Note: There are a number of known potential issues. See the troubleshooting section for details.

To be able to use Python 3.10, you may want to use a tool such as pyenv to manage your Python versions.
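Before installing, it can help to confirm which interpreter is active. The following is a minimal sketch of such a check; the function name `is_tested_version` is hypothetical and not part of dsp-metadata-gui.

```python
import sys

# The version this documentation states the application has been tested on.
TESTED_VERSION = (3, 10)


def is_tested_version(version_info=sys.version_info) -> bool:
    """Return True if the major and minor version match the tested version."""
    return tuple(version_info[:2]) == TESTED_VERSION


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    if is_tested_version():
        print(f"Python {current}: this is the tested version.")
    else:
        print(f"Python {current}: the application has only been tested on 3.10.")
```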

Installation via pip

To install the application, run:

pip install dsp-metadata-gui

Afterwards, the program can be started by running the command dsp-metadata in your terminal of choice.

Installation from source

Clone this repo, install all requirements as described in the Development section below, and run make run.

Usage

Collecting Metadata

The application is divided into two windows:

  1. The main window lets you organize a list of projects, for which you can collect metadata. Several actions can be performed with projects, e.g. editing or exporting the project.

  2. The project window opens when you edit a project. Here the actual metadata can be added, modified and saved.

To add a project, you will need the project short code, which is assigned to you by the DaSCH Client Services.
A project is always associated with a folder on your local machine. If any files should be included with the metadata import, these files must be within that folder. Once all metadata are added and valid, and the overall RDF graph of the metadata set validates against the ontology, the project can be exported for upload to the DSP.

All data is stored locally in the file ~/DaSCH/config/repos.data. For more detail, see here.
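To locate this file, for example before making a backup, the path can be resolved in Python. This is a minimal sketch that assumes only the path stated above. The helper names are hypothetical and not part of dsp-metadata-gui, and nothing is assumed about the file's internal format.

```python
from pathlib import Path
from typing import Optional


def repos_data_path(home: Optional[Path] = None) -> Path:
    """Build the path to DaSCH/config/repos.data below the given home directory."""
    base = home if home is not None else Path.home()
    return base / "DaSCH" / "config" / "repos.data"


def describe_store(path: Path) -> str:
    """Report whether the data file exists, without reading its contents."""
    if path.is_file():
        return f"{path} exists ({path.stat().st_size} bytes)"
    return f"{path} not found"


if __name__ == "__main__":
    print(describe_store(repos_data_path()))
```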

Conversion to V2

The metadata generated by the application conforms to the first version of the data model for metadata.
This corresponds to the data that can currently be viewed in the DaSCH Metadata Browser.

The initial data model will eventually be replaced by model V2, which introduces major improvements.
Metadata V2 will eventually be collected directly in the web interface rather than in this Python application.
Until the web interface for editing metadata is implemented, this application provides a script to automatically convert V1 .ttl files into V2 .json files.

NB: The conversion cannot be fully automated, as model V2 is richer in information than V1.
For convenience, the conversion adds the string XX wherever the output cannot be determined with sufficient confidence. Please check those instances manually.
The conversion also does some "guessing", e.g. for the language of literal values or the display text for URLs. If the output can be determined with a sufficient level of confidence, the conversion will not add XX. However, it is still advisable to check the entire output for potential errors.
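Finding the XX markers by hand is tedious in a large file, so a small script can list them. The following is a minimal sketch, not part of the application: the function name `find_placeholders` and the structure of the sample data are hypothetical, and the only assumption taken from the text above is that unresolved values contain the string XX.

```python
import json
from typing import Any, Iterator, Tuple

PLACEHOLDER = "XX"


def find_placeholders(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, value) for every string value that contains the placeholder."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and PLACEHOLDER in node:
        yield path, node


# Hypothetical sample; the real V2 JSON layout may differ.
sample = json.loads(
    '{"project": {"name": "Example", "howToCite": "XX"},'
    ' "datasets": [{"title": "XX"}, {"title": "Data"}]}'
)

for location, value in find_placeholders(sample):
    print(f"{location}: {value!r}")
```

For a converted file, load it with `json.load` and pass the result to `find_placeholders`. The check is a plain substring match, so it also reports legitimate values that happen to contain XX; treat the output as a list of candidates to review.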

V2 JSON metadata can in turn be converted to V2 RDF metadata using another script. This should not require any additional data cleaning.

The most important changes from V1 to V2 include the following additions:

  • Support for multi-language literals

  • howToCite on project level

  • country property for addresses

  • Creation and modification timestamps

  • JSON schema validation

NB: A new button has been added to run the JSON conversions in the GUI without having to export first.

Development

Development Environment

Poetry

Ensure you have poetry installed.

To install all requirements, run

poetry install

To add a package as a dependency, use

poetry add <package-name>

Documentation

The documentation is created using mkdocs and mkdocstrings with markdown_include.include. To build the documentation, install all of these using pip.

To serve the documentation locally, run make doc. To deploy the documentation to GitHub Pages, run make deploy-doc.

Release

Automated releases can be created using release-please.

Automatically publishing a new release to PyPI does not currently work. Run the release GitHub Action manually instead.