Making an MDAKit
Here, we outline the process of creating an MDAKit that fulfills all of the requirements for acceptance into the MDAKit registry. For a video walk-through of this tutorial, watch our recorded tutorial on YouTube.
An MDAKit can be structured much more freely than code in the core MDAnalysis library. To be accepted into the registry, however, a kit must meet several requirements:
Code in the package uses the MDAnalysis library
The code is open source and published under an OSI approved license
Code is versioned and provided in an accessible version-controlled repository (GitHub, GitLab, Bitbucket, etc.)
Code authors and maintainers are clearly designated
Minimal documentation is provided (what your code does, how to install it, and how to use it)
At least minimal regression tests are present; continuous integration is encouraged
It is also highly encouraged that the MDAKit satisfies:
Code is installable as a standard package
Information on bug reporting, user discussions, and community guidelines is made available
These requirements ensure that registered packages are FAIR-compliant and hold up to an ideal scientific standard. Without prior experience, some of the requirements listed above can be daunting. To aid in this process, we make use of the MDAKit cookiecutter in this example.
Building from an existing project
Registering an existing package as an MDAKit is a straightforward process. Since the structure of an MDAKit is not as strict as the code found in the MDAnalysis core library, chances are very little restructuring is needed for registration. The primary concern is ensuring that the core MDAKit requirements are met, as listed at the top of this document.
One of the more pressing requirements for kit registration is clearly identifying the license that is applied to your code. This is typically included in a LICENSE file at the top level of your repository. Without a license, the only assumption a user can make about your code is that they are not in a position to use it. Your license needs to be compatible with the GPLv2+ license currently used by MDAnalysis, in addition to the licenses of any other packages your MDAKit depends on. Take time to consider how you would like to license your project. Further information on open source licensing can be found from sources such as: choose a license, tl;dr Legal, the Open Source Initiative, and the Software Sustainability Institute.
Hosting code in a version controlled repository
Since the MDAKits registry makes heavy use of the GitHub Actions infrastructure, registration of a kit requires that all code maintainers have a GitHub account for communication purposes.
For this reason, if your code is not already hosted in an accessible version-controlled repository, hosting on GitHub is recommended, although other services such as Bitbucket or GitLab, or self-hosting, are also possible.
The registry does not require that your code be available through packaging repositories such as the Python Package Index or conda-forge, although having your code available through these services is highly encouraged.
After registration, users can find the installation instructions for the source code on your MDAKit page. These instructions are specified in the src_install field of the metadata.yaml file (see Specification of the metadata.yaml file).
Basic documentation is required for MDAKit registration. The detail and depth of the documentation are ultimately up to you, but we require at a minimum that you provide README-style documentation explaining what the code is supposed to do, how to install it, and the basics of its use. Although this is the minimum, we highly recommend that you consider generating your documentation with dedicated tools such as Sphinx, which allows you to generate static documentation from reStructuredText-formatted plain text taken directly from your code. This makes it easier to keep your documentation in step with code changes.
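As an illustration of documentation that lives next to the code, the sketch below shows a docstring written with reStructuredText field lists, a format that Sphinx's autodoc extension can render. The function `center_of_positions` and its parameter are hypothetical and are not part of any MDAKit.

```python
def center_of_positions(positions):
    """Return the unweighted mean position of a set of atoms.

    :param positions: sequence of ``(x, y, z)`` coordinate tuples
    :returns: the mean ``(x, y, z)`` position as a tuple
    :raises ValueError: if ``positions`` is empty
    """
    if not positions:
        raise ValueError("positions must not be empty")
    n = len(positions)
    return tuple(sum(p[k] for p in positions) / n for k in range(3))


# The docstring is available at runtime; this is the text that
# documentation tools read when building pages from the code.
print(center_of_positions.__doc__)
```

Because the description sits in the same file as the function, a change to the function's behavior and the matching change to its documentation can be made in the same commit.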
We also require that minimal regression tests are present. These tests are not just useful for when you make changes to your code, but also when any package dependencies (e.g. MDAnalysis, NumPy, and Python) change. Additionally, tests inform the users of your packages that the code performs at least the way you say it should and give them confidence that it can be used. Basic tests can be written with a variety of packages, such as the pytest package (the default choice for MDAnalysis organization projects) or the unittest package. Further improvements to your testing procedure may include automatically running the tests on pushing to your remote repositories, often referred to as continuous integration (CI). CI can be set up using repository pipeline tools, such as GitHub Actions.
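A minimal regression test could look like the sketch below. The `rmsf` function here is a hypothetical pure-Python stand-in for your own analysis code (a real kit would typically work on an MDAnalysis Universe instead), and the reference values were worked out by hand for the tiny input shown.

```python
import math


def rmsf(frames):
    """Per-atom root-mean-square fluctuation over a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for atom in range(n_atoms):
        # Mean position of this atom over all frames.
        mean = [sum(frame[atom][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


def test_rmsf_regression():
    # Two frames, two atoms: atom 0 moves 2 units along x, atom 1 is fixed.
    frames = [
        [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
        [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    ]
    # Reference values computed by hand for this input.
    expected = [1.0, 0.0]
    result = rmsf(frames)
    assert len(result) == len(expected)
    for got, want in zip(result, expected):
        assert math.isclose(got, want, abs_tol=1e-12)
```

With its default settings, pytest collects functions whose names start with `test_` from files named `test_*.py`, so in a real kit the test function would usually live in its own file and import the analysis code from the package.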
When submitting an MDAKit to the registry, include the instructions for running the tests in the required metadata.yaml file (see a full example in the registration section below).
Assuming that your tests are in a tests/ directory at the top level of your repository, you could define your test commands as:
    run_tests:
      - git clone latest
      - pytest -v tests/
The git clone latest command makes a clone of your repository based on your latest release tag on GitHub and navigates into the repository root. Note that this is not a true git command, but is instead specific to the MDAKits registry workflow and depends on the project_home field in the metadata.yaml file (see Specification of the metadata.yaml file).
The pytest command then runs the tests found inside the tests/ directory. If your tests are elsewhere, change this path appropriately.
Dependencies that are only required for testing are indicated in the test_dependencies field. Suppose your package uses pytest and uses MDAnalysisTests for sample data. This is reflected in your MDAKit metadata with:
    test_dependencies:
      - mamba install pytest MDAnalysisTests
Registering an MDAKit
The registration process is the same regardless of how the kit was created.
For simplicity, the following examples will reference the rmsfkit MDAKit created in the cookiecutter section.
In order to submit your MDAKit to the registry, you will need to create a pull request on GitHub against the MDAnalysis/MDAKits repository.
Do this by creating a fork of the MDAnalysis/MDAKits repository.
Clone the fork to your machine, navigate into MDAKits/mdakits/, and make an empty directory with your MDAKit name:
    git clone git@github.com:yourusername/MDAKits
    cd MDAKits/mdakits
    mkdir rmsfkit/
    cd rmsfkit
Create a metadata.yaml file for your MDAKit in this directory (see Specification of the metadata.yaml file for details).
The contents of metadata.yaml for rmsfkit are:
    project_name: rmsfkit
    authors:
      - https://github.com/yourusername/rmsfkit/blob/main/AUTHORS.md
    maintainers:
      - yourusername
    description: An analysis module for calculating the root-mean-square fluctuation of atoms in molecular dynamics simulations.
    keywords:
      - rms
      - rmsf
    license: GPL-2.0-or-later
    project_home: https://github.com/yourusername/rmsfkit
    documentation_home: https://rmsfkit.readthedocs.io/en/latest/
    documentation_type: API

    ## Optional entries
    src_install:
      - git clone https://github.com/yourusername/rmsfkit.git
      - cd rmsfkit/
      - pip install .
    python_requires: ">=3.9"
    mdanalysis_requires: ">=2.0.0"
    run_tests:
      - pytest --pyargs rmsfkit.tests
    development_status: Beta
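Before opening a pull request, it can help to check that no required entry was left out. The sketch below is a hypothetical local check, not the registry's own validation. It assumes that every field listed before the "Optional entries" comment in the example above is required, and it works on a plain dictionary so that it needs no YAML library.

```python
# Field names taken from the example metadata above. Treating everything
# listed before "Optional entries" as required is an assumption of this sketch.
REQUIRED_FIELDS = (
    "project_name",
    "authors",
    "maintainers",
    "description",
    "keywords",
    "license",
    "project_home",
    "documentation_home",
    "documentation_type",
)


def missing_required_fields(metadata):
    """Return the required field names that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not metadata.get(name)]


# Example metadata with the license entry accidentally left out.
metadata = {
    "project_name": "rmsfkit",
    "authors": ["https://github.com/yourusername/rmsfkit/blob/main/AUTHORS.md"],
    "maintainers": ["yourusername"],
    "description": "An analysis module for calculating the RMSF of atoms.",
    "keywords": ["rms", "rmsf"],
    "project_home": "https://github.com/yourusername/rmsfkit",
    "documentation_home": "https://rmsfkit.readthedocs.io/en/latest/",
    "documentation_type": "API",
}

print(missing_required_fields(metadata))
```

The authoritative list of required and optional fields is the one given in Specification of the metadata.yaml file; adjust `REQUIRED_FIELDS` to match it.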
Commit and push the metadata.yaml file to your fork:
    git add metadata.yaml
    git commit -m "Adding rmsfkit"
    git push origin main
Refresh the forked repository page in your browser. Under “Contribute”, open a pull request. Add a title with the name of the kit and add a quick description. Click “Create pull request” and wait for the tests to pass. Once this is done, you can add a comment along the lines of “@MDAnalysis/mdakits-reviewers, ready for review”. The reviewers will get back to you with any change requests before merging it in as a kit. At this point there are no additional steps for registering your kit!
Maintaining a kit
There are a variety of reasons a kit may behave unexpectedly after being submitted to the registry. Apart from active development of the kit itself, changes in its dependencies, or even in Python, can introduce new functionality or deprecate old functionality. For this reason, each kit's continuous integration is rerun weekly to confirm the kit's expected behavior. In the event that a kit no longer passes its tests, an issue is automatically raised in MDAnalysis/MDAKits, notifying the maintainers indicated in the metadata.yaml file. While the registry developers will be happy to help where possible, the maintainers of the MDAKit are ultimately responsible for resolving such issues and ensuring that the tests pass. The issue will automatically close after the next CI run if the tests pass again.