.. _swh-mosaic:

.. include:: README.rst

Quick Start
===========

Basic Usage Example::

    from pathlib import Path

    from swh.mosaic import MosaicCreator, MosaicReader, IdxDescription

    # Open a new MOSAIC file for writing
    creator = MosaicCreator(
        Path("example.mosaic"),
        indexes=[IdxDescription.SHA1FMPHGO, IdxDescription.SHA256FMPHGO],
        comments=["Example MOSAIC file"]
    )

    # Add a sample object. Keys must be given in the same order as `indexes` above.
    obj1 = b"Hello World"
    obj1_sha1 = b"1" * 20  # fake SHA1 hash
    obj1_sha256 = b"1" * 32  # fake SHA256 hash
    creator.add([obj1_sha1, obj1_sha256], obj1)

    # Finalize the file (write its indexes)
    creator.close()

    # Open it for reading
    reader = MosaicReader(Path("example.mosaic"))
    print(f"Objects: {reader.objects_counter}")
    print(f"Comments: {reader.comments}")

    # Load an index to enable lookups
    reader.load_index(IdxDescription.SHA1FMPHGO)
    retrieved = reader.lookup(obj1_sha1)
    print(f"Retrieved: {retrieved}")

    # Loading an index also enables iteration
    reader.load_index(IdxDescription.SHA256FMPHGO)
    for (obj_sha256, obj_content) in reader:
        # Name each file after the hex form of its key: raw digest bytes
        # may contain characters that are invalid in file names.
        with open(obj_sha256.hex(), 'wb') as f:
            f.write(obj_content)

Note that:

- ``MosaicReader`` is optimized for random access. If you need fast
  iteration over the objects in a MOSAIC, use the Rust crate directly.
- The ``MosaicReader`` constructor only reads the file header. You must call
  ``load_index()`` before performing lookups or iteration.
- The tile threshold (default: 32 MB) determines when new tiles are created
  and affects the maximum object size calculation.

Available Index Types
=====================

The following indexes are supported through the :py:class:`IdxDescription` enum.
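
Each index is keyed by a digest of the object, so real keys would normally come
from a hash function rather than the placeholder bytes used in the Quick Start.
The sketch below uses only Python's standard ``hashlib``. The helper
``index_keys`` is hypothetical and not part of ``swh.mosaic``. It assumes that
the Git-style SHA1 is the Git blob hash and that BLAKE2 means a 32-byte BLAKE2s
digest; neither is stated on this page, so check both before relying on them.

```python
import hashlib


def index_keys(content: bytes) -> dict:
    """Return candidate index keys for ``content``, keyed by index name.

    Hypothetical helper for illustration; not part of swh.mosaic.
    """
    # Git hashes a blob as SHA1 over a "blob <size>\0" header plus the content.
    # Assumption: "Git-style SHA1" refers to this blob hash.
    git_header = b"blob %d\x00" % len(content)
    return {
        "SHA1FMPHGO": hashlib.sha1(content).digest(),
        "SHA1GITFMPHGO": hashlib.sha1(git_header + content).digest(),
        "SHA256FMPHGO": hashlib.sha256(content).digest(),
        # Assumption: a 32-byte BLAKE2s digest; the page only says "BLAKE2".
        "BLAKE2FMPHGO": hashlib.blake2s(content, digest_size=32).digest(),
    }


keys = index_keys(b"Hello World")
for name, key in keys.items():
    # Print the key length and its hex form for each index name.
    print(f"{name}: {len(key)} bytes, {key.hex()}")
```

Under these assumptions, the values for the index names chosen at creation
time could be passed to ``creator.add()`` in place of the fake hashes, in the
same order as ``indexes``.
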
Currently, all indexes rely on an FMPHGO minimal perfect hash (MPH) and differ
only in the semantics of their keys:

- ``SHA1FMPHGO``: keys are the objects' SHA1 digests
- ``SHA1GITFMPHGO``: keys are Git-style SHA1 digests
- ``SHA256FMPHGO``: keys are SHA256 digests
- ``BLAKE2FMPHGO``: keys are BLAKE2 digests

Context Manager Support
=======================

``MosaicCreator`` supports the context manager protocol::

    from pathlib import Path

    from swh.mosaic import MosaicCreator, IdxDescription

    # Writing with a context manager (the file is closed automatically)
    with MosaicCreator(
        Path("example.mosaic"),
        indexes=[IdxDescription.SHA1GITFMPHGO]
    ) as creator:
        creator.add([b"1" * 20], b"data")  # fake Git-style SHA1 key

In this case, the file is finalized when the ``with`` block exits.

.. only:: standalone_package_doc

   Indices and tables
   ------------------

   * :ref:`genindex`
   * :ref:`modindex`
   * :ref:`search`