.. _roadmap-2021: Roadmap 2021 ============ (Version 1.0, last modified 05/04/2021) This document provides an overview of the technical roadmap of Software Heritage for 2021. The `Kanban board `_ is seen through our forge. .. contents:: :depth: 3 .. Collect ------- Faster and more reliable save code now ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: openscience - task: `T3082 `_ - lead: ardumont - effort: 1 PM Includes work: - set up dedicated fast track pipeline for save code now - improve save code now monitoring (user and admin) Improve deposit integration, management and display ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: openscience - task: `T3128 `_ - lead: moranegg - effort: 3 PM Includes work: - full invenioRDM integration `T2344 `_ - metadata only deposit `T2540 `_ Save forge now ^^^^^^^^^^^^^^ - tags: expand - task: `T1538 `_ - lead: ardumont - effort: 1 PM - tooling & process Admin tooling for takedown notices (URLs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: contract, compliance - task: `T3087 `_ - lead: anlambert - effort: 2 PM Includes work: - admin interface - journal of operations - web page with list of accepted TDN Preserve -------- Complete and up-to-date archive copy on S3 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: stability - task: `T3085 `_ - lead: douardda - effort: 1 PM Includes work: - live update of the objects - regular dumps of the (anonymized) Merkle graph Scale-out graph storage in production ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: scalability - task: `T2214 `_ - lead: vlorentz - effort: 3 PM Includes work: - Cassandra: `T1892 `_ (*maybe with external help*) Scale-out object storage prototype ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: stability, scalability, *externalized* - task: `T3054 `_ - lead: dachary - effort: 3 PM Cold storage archive in Vitam instance at CINES ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: contract - task: `T3113 `_ - lead: douardda - effort: 4 PM Mirrors ^^^^^^^ - tags: stability, scalability - depends: scale-out object storage - task: `T3116 `_ - lead: douardda - effort: 3 PM Includes work: - get up and running at least one mirror SWHID v2 ^^^^^^^^ - tags: stability, evolution, datamodel - task: `T3134 `_ - lead: zack - effort: 6 PM Includes work: - complete on paper spec - align with new git hashes - including migration plan from v1 - understand impact on internal microservice architecture - keep correspondence with v1 (there may be multiple v2 for one v1!) - reviewed by crypto experts Integrity ^^^^^^^^^ - tags: stability, reliability - task: `T3135 `_ - lead: olasd - effort: 2 PM Includes work: - making sure objects aren’t corrupted before insertion `T399 `_ - ... and that existing ones are not part of `T75 `_ - make corruption check periodically Share ----- swh-graph in production ^^^^^^^^^^^^^^^^^^^^^^^ - tags: scalability - task: `T2220 `_ - lead: zack - effort: 2 PM Efficient and reliable Vault download ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: stability - task: `T3096 `_ - lead: vlorentz - effort: 3 PM Includes work: - swh-graph may speed up a lot operations Web API 2.0 ^^^^^^^^^^^ - tags: reliability, interoperability - task: `T2194 `_ - lead: anlambert - effort: 4 PM Includes work: - OpenAPI specification - implementation Expose metadata and make them searchable ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: openscience - task: `T3097 `_ - lead: vlorentz - effort: 3 PM Includes work: - index extrinsic metadata in swh-search/Elasticsearch from the journal `T2073 `_ - create API endpoint to access raw_extrinsic_metadata `T2938 `_ - show metadata in the web UI `T2088 `_ Full text search prototype ^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: feature, wishlist - task: `T2204 `_ - lead: anlambert - effort: 3 PM Includes work: - requires integration with swh-graph and/or provenance index Organize -------- Collect extrinsic metadata ^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: compliance - task: `T2202 `_ - lead: vlorentz - effort: 3 PM Includesd work: - working pipeline - at least 1 instance running ClearlyDefined - forge metadata (info on the main page, etc.) Provenance in production ^^^^^^^^^^^^^^^^^^^^^^^^ - tags: contract, feature - task: `T3112 `_ - lead: zack - effort: 6 PM Prior art ^^^^^^^^^ - tags: compliance - depends: provenance \| swh-graph in production - task: `T3136 `_ - lead: zack - effort: 3 PM Includes work: - pinpoint origin of selected source code artifacts - possibly integrated with swh-scanner Measurement ----------- Efficient archive counters (HyperLogLog) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: measure, comm - task: `T2912 `_ - lead: vsellier - effort: 1 PM Distribution of origins by forge ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: measure, comm - task: `T3127 `_ - lead: anlambert - effort: 1 PM Stats on regular crawling by forge ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: measure, comm - task: `T1363 `_ - lead: olasd - effort: 1 PM Includes work: - lag, periodicity, # of changes since last visit, etc. View deposits per user (admin and user) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: measure, support - task: `T3128 `_ - lead: ardumont - effort: 1 PM Reliable user-level monitoring of services ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: stability - task: `T3129 `_ - lead: vsellier - effort: 2 PM Includes work: - status.softwareheritage.org Documentation ------------- Write use case-specific documentation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: comm, web, doc - task: `T2234 `_ - lead: moranegg - effort: 2 PM Includes FAQ for: - users - ambassadors Improve quality of code documentation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: doc, *externalized* - task: TODO - lead: TBD - effort: 2PM Includes work: - doc(string) audit - team training about doc writing Documentation strategy ^^^^^^^^^^^^^^^^^^^^^^ - tags: doc - task: `T2624 `_ - lead: moranegg - effort: 1 PM Includes work: - respective role of docs.s.o, wiki, www.s.o, etc. Community --------- Tooling for fundraising campaigns ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: web - task: `T3077 `_ - lead: anlambert - effort: 1 PM Dedicated page to list status of supported listers/loaders ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - tags: web, doc - task: `T3117 `_ - lead: anlambert - effort: 1 PM Includes work: - `T1870 `_ - design web page - process to maintain up to date - make clearly visible and link to Sloan subgrants Tooling ------- Migration to GitLab ^^^^^^^^^^^^^^^^^^^ - tags: forge, development - task: `T2225 `_ - lead: olasd - effort: 1PM