DSP-METADATA-GUI Metadata Module
The dsp-metadata-gui is a GUI application written in Python for collecting project-specific metadata.
Its aim is to enable researchers and project managers who deposit research data on the DaSCH Service Platform (DSP) to add metadata about the project and its datasets to the DSP repository. Providing metadata makes the project searchable on the platform, which is an integral part of the FAIR principles.
The metadata follows the schema defined by the dsp-ontologies.
Install and run
The application has only been tested on Python 3.10.
Note: There are a number of known potential issues. See the troubleshooting section.
To be able to use Python 3.10, you may want to use a tool such as pyenv to manage your Python versions.
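Before installing, it can help to confirm which interpreter is active. The following is a minimal sketch, not part of the application: the helper name `is_tested_python` and the constant `TESTED_VERSION` are made up for illustration, and the check only compares the major and minor version against 3.10.

```python
import sys

# The only version this README reports as tested.
TESTED_VERSION = (3, 10)


def is_tested_python(version_info=sys.version_info):
    """Return True if the major.minor version matches the tested version."""
    return tuple(version_info[:2]) == TESTED_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_tested_python():
        print(f"Python {current}: matches the tested version.")
    else:
        print(f"Python {current}: untested; consider switching to 3.10, e.g. via pyenv.")
```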
Installation via pip
To install the application, run:
pip install dsp-metadata-gui
Afterwards, the program can be started by running the command dsp-metadata in your terminal of choice.
Installation from source
Clone this repo, install all requirements as described below, and run make run.
Usage
Collecting Metadata
The application is divided into two windows:
- The main window lets you organize a list of projects for which you can collect metadata. Several actions can be performed on projects, e.g. editing or exporting a project.
- The project window, which opens when you edit a project, is where the actual metadata can be added, modified and saved.
To add a project, you will need the project short code, which is assigned to you by the DaSCH Client Services.
A project is always associated with a folder on your local machine. If any files should be included with the metadata import, these files must be within that folder.
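To check up front whether a file would qualify for inclusion, one can test whether its path lies inside the project folder. The sketch below is an illustration only and does not reflect how the application itself performs this check. The function name `is_inside_project` is hypothetical, and the example paths are placeholders.

```python
from pathlib import Path


def is_inside_project(file_path, project_folder):
    """Return True if file_path lies within project_folder.

    Both paths are resolved first, so '..' segments and symlinks
    cannot make an outside file look like an inside one.
    """
    folder = Path(project_folder).expanduser().resolve()
    candidate = Path(file_path).expanduser().resolve()
    try:
        # relative_to raises ValueError if candidate is not below folder.
        candidate.relative_to(folder)
    except ValueError:
        return False
    # The folder itself is not a file inside the folder.
    return candidate != folder


if __name__ == "__main__":
    # Placeholder paths for illustration.
    print(is_inside_project("~/my-project/data/table.csv", "~/my-project"))
    print(is_inside_project("~/elsewhere/table.csv", "~/my-project"))
```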
Once all metadata are added and valid, and the overall RDF graph of the metadata set validates against the ontology, the project can be exported for upload to the DSP.
All data is stored locally in the file ~/DaSCH/config/repos.data. For more detail, see here.
Conversion to V2
The metadata generated by the application conforms to the first version of the data model for metadata.
This corresponds to the data that can currently be viewed in the DaSCH Metadata Browser.
The initial data model will eventually be replaced by the model V2 which introduces major improvements.
Metadata V2 will eventually be collected directly in the web interface rather than in this Python application.
In the meantime, until the web interface for editing metadata is implemented, this application provides a script to automatically convert V1 .ttl files into V2 .json files.
NB: The conversion cannot be fully automated, as the model V2 is richer in information than V1.
For convenience, the conversion adds the string XX wherever the output cannot be determined with sufficient confidence. Please check those instances manually.
The conversion also does some "guessing", e.g. for the language of literal values or the display text for URLs. If the output can be determined with a sufficient level of confidence, the conversion will not add XX. However, it is still advisable to check the entire output for potential errors.
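The manual check for XX markers can be supported by a small script that walks a converted JSON document and lists every string value containing the marker. This is a sketch under assumptions and not part of the application: the function name `find_placeholders` and the sample field names are made up, only values (not keys) are inspected, and plain substring matching may also flag legitimate text that happens to contain "XX".

```python
import json

# Marker the conversion inserts where it could not determine a value.
PLACEHOLDER = "XX"


def find_placeholders(node, path="$"):
    """Yield (path, value) for every string value containing the placeholder."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_placeholders(item, f"{path}[{index}]")
    elif isinstance(node, str) and PLACEHOLDER in node:
        yield path, node


if __name__ == "__main__":
    # Hypothetical sample data; real V2 files follow the V2 JSON schema.
    sample = json.loads('{"name": "Demo", "items": [{"text": "XX"}, {"text": "ok"}]}')
    for location, value in find_placeholders(sample):
        print(f"{location}: {value!r}")
```

For a real file, the sample can be replaced by the result of `json.load` on the converted .json file.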
V2 JSON metadata can in turn be converted to V2 RDF metadata using another script. This should not require any additional data cleaning.
The most important changes from V1 to V2 include the following additions:
- Support for multi-language literals
- howToCite on project level
- country property for addresses
- Creation and modification timestamps
- JSON schema validation
NB: A new button has been added to run the JSON conversions in the GUI without having to export first.
Development
Development Environment
Poetry
Ensure you have poetry installed.
To install all requirements, run
poetry install
To install packages, use
poetry add <package-name>
Documentation
The documentation is created using mkdocs and mkdocstrings with markdown_include.include. To build the documentation, make sure all of these are installed, using pip.
To serve the documentation locally, run make doc. To deploy the documentation to GitHub Pages, run make deploy-doc.
Release
Automated releases can be created using release-please.
Automatically publishing a new release to PyPI does not work. Run the release GitHub Action manually.