Digestmap’s Python bindings

A digestmap is an efficient mapping between content hashes (from SWHID to SHA1). It was designed after an idea for a hash conversion service.

The Python package documented below provides bindings to the Rust crate swh-digestmap. Use the crate to create a digestmap.
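To make the mapping concrete, the sketch below computes the two hashes involved using only the standard library. It relies on the fact that a content SWHID embeds the sha1_git of the content (the Git blob hash), while SHA1 is the plain digest of the same bytes. The function names are illustrative and are not part of swh-digestmap.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Plain SHA1 of the raw content bytes, as a hex string."""
    return hashlib.sha1(data).hexdigest()


def sha1_git_of(data: bytes) -> str:
    """Git blob hash: SHA1 over a 'blob <length>' header, a NUL byte, then the content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def content_swhid_of(data: bytes) -> str:
    """Content SWHID, which embeds the sha1_git of the content."""
    return "swh:1:cnt:" + sha1_git_of(data)


data = b"hello\n"
print(content_swhid_of(data))  # the key a digestmap is queried with
print(sha1_of(data))           # the value a digestmap maps it to
```

The two digests cannot be derived from each other without the content itself, which is why a precomputed mapping is useful.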

Direct use

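from swh.digestmap import DigestMap

# Open the digestmap stored in "dest_folder"
digestmap = DigestMap("dest_folder")
# Look up the SHA1 of a content object from its SWHID
digestmap.sha1_from_swhid("swh:1:cnt:0000000000000000000000000000000000000004")

# Look up by sha1_git through the storage-style API
found = digestmap.content_get([b"0000000000000000000000000000000000000004"], algo="sha1_git")
if found and found[0]:
    # Only hashes() is meant to be used on the returned object
    hashes_dict = found[0].hashes()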
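When several hashes are looked up at once, the per-item check above can be folded into a small helper. This is a minimal sketch: collect_hashes is a hypothetical name, and it assumes only what the example shows, namely that content_get() returns a list whose entries are either falsy (not found) or objects exposing .hashes().

```python
from typing import Any, Dict, List, Optional, Sequence


def collect_hashes(found: Sequence[Any]) -> List[Optional[Dict[str, bytes]]]:
    """Return one hashes dict per lookup result, or None where nothing was found.

    Assumption: each entry of `found` is either falsy or has a .hashes() method,
    as in the content_get() example above.
    """
    return [item.hashes() if item else None for item in found]
```

The output keeps the same order and length as the input, so each result can be matched back to the hash that was queried.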

Use as a Software Heritage storage backend

The Python package registers digestmap as a Software Heritage storage backend. However, it only partially implements swh.storage.interface.StorageInterface.content_get(): the returned content objects should only be used to call .hashes(), as in the example above. The returned dict contains only the hashes known to the digestmap, namely sha1 and sha1_git. If these limitations are acceptable (for example, because you are using swh-fuse), the backend can be configured as follows:

storage:
  cls: digestmap
  path: "/path/to/digestmap/folder"
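The same configuration can also be built from Python. The sketch below assumes that swh-storage is installed and that its get_storage() factory accepts the backend's cls and path as keyword arguments, mirroring the YAML keys above. The two helper names are hypothetical.

```python
from typing import Any, Dict


def digestmap_storage_config(path: str) -> Dict[str, Any]:
    """Build the same mapping as the 'storage' section of the YAML above."""
    return {"cls": "digestmap", "path": path}


def open_digestmap_storage(path: str):
    """Instantiate the backend through swh.storage (assumed to be installed)."""
    # Imported lazily so this module can be loaded without swh-storage.
    from swh.storage import get_storage

    return get_storage(**digestmap_storage_config(path))
```

Keep in mind the limitation described above: only content_get() followed by .hashes() is expected to work on the returned storage object.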

Develop

pip install -r requirements-swh.txt
pip install -r requirements-test.txt
pip install .
pytest

We test via pytest because the DigestMap binding needs a Python able to import swh.model.model.

To build the package, run cibuildwheel . from the repository’s root.