Command-line interface#
swh indexer#
Software Heritage Indexer tools.
The Indexer is used to mine the content of the archive and extract derived information from archive source code artifacts.
swh indexer [OPTIONS] COMMAND [ARGS]...
Options
- -C, --config-file <config_file>#
Configuration file.
journal-client#
Listens for new objects from the SWH Journal and runs the indexer whose name is passed as the argument.
Passing ‘*’ as the indexer name runs all indexers.
swh indexer journal-client [OPTIONS] {origin_intrinsic_metadata|extrinsic_metadata|content_mimetype|content_fossology_license|*}
Options
- --broker <brokers>#
Kafka broker to connect to.
- --prefix <prefix>#
Prefix of Kafka topic names to read from.
- --group-id <group_id>#
Consumer/group id for reading from Kafka.
- -m, --stop-after-objects <stop_after_objects>#
Maximum number of objects to replay. Default is to run forever.
- -b, --batch-size <batch_size>#
Batch size. Default is 200.
Arguments
- INDEXER#
Required argument
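As an illustration of how the options and the INDEXER argument fit together, here is a minimal Python sketch that only assembles a journal-client command line and prints it, without running anything. The helper name `journal_client_argv`, the configuration file name, the broker address, the topic prefix, and the group id are all hypothetical placeholders, not values taken from this documentation. It assumes, following the synopsis above, that `--config-file` belongs to `swh indexer` itself and is therefore placed before the subcommand.

```python
import shlex


def journal_client_argv(indexer, config_file=None, broker=None, prefix=None,
                        group_id=None, stop_after_objects=None, batch_size=None):
    """Assemble (but do not run) a `swh indexer journal-client` command line."""
    argv = ["swh", "indexer"]
    if config_file is not None:
        # Group-level option: goes before the subcommand name.
        argv += ["--config-file", config_file]
    argv.append("journal-client")
    options = [
        ("--broker", broker),
        ("--prefix", prefix),
        ("--group-id", group_id),
        ("--stop-after-objects", stop_after_objects),
        ("--batch-size", batch_size),
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    argv.append(indexer)  # required INDEXER argument
    return argv


# Hypothetical values, for illustration only.
cmd = journal_client_argv(
    "*",
    config_file="indexer.yml",
    broker="kafka.example.org:9092",
    prefix="swh.journal.objects",
    group_id="example-indexer",
    stop_after_objects=1000,
)
# shlex.join quotes the '*' so a shell would not expand it as a glob.
print(shlex.join(cmd))
```

When typing the command by hand, the same care applies: an unquoted `*` is normally expanded by the shell into file names before `swh` ever sees it.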
mapping#
Manage Software Heritage Indexer mappings.
swh indexer mapping [OPTIONS] COMMAND [ARGS]...
list#
Prints the list of known mappings.
swh indexer mapping list [OPTIONS]
list-terms#
Prints the list of known CodeMeta terms, and which mappings support them.
swh indexer mapping list-terms [OPTIONS]
Options
- --exclude-mapping <exclude_mapping>#
Exclude the given mapping from the output.
- --concise#
Don’t print the list of mappings supporting each term.
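The two listing commands can be sketched the same way. The Python snippet below only builds and prints the command lines; `list_terms_argv` is a hypothetical helper, and `ExampleMapping` is a placeholder name (actual mapping names would come from the output of `swh indexer mapping list`).

```python
import shlex


def list_terms_argv(exclude_mapping=None, concise=False):
    """Assemble (but do not run) a `swh indexer mapping list-terms` command line."""
    argv = ["swh", "indexer", "mapping", "list-terms"]
    if exclude_mapping is not None:
        argv += ["--exclude-mapping", exclude_mapping]
    if concise:
        argv.append("--concise")
    return argv


# List the known mappings first, then the terms without one of them.
print(shlex.join(["swh", "indexer", "mapping", "list"]))
# "ExampleMapping" is a placeholder, not a documented mapping name.
print(shlex.join(list_terms_argv(exclude_mapping="ExampleMapping", concise=True)))
```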
translate#
Translates FILE from the format handled by the MAPPING_NAME mapping to the CodeMeta format.
swh indexer mapping translate [OPTIONS] MAPPING_NAME FILE
Arguments
- MAPPING_NAME#
Required argument
- FILE#
Required argument
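A translate invocation takes the two positional arguments in the order shown in the synopsis. The sketch below assembles such a command in Python without executing it; `translate_argv`, the mapping name `ExampleMapping`, and the file path are hypothetical and chosen only to show argument order and quoting.

```python
import os
import shlex


def translate_argv(mapping_name, file):
    """Assemble (but do not run) a `swh indexer mapping translate` command line."""
    # MAPPING_NAME comes first, then FILE, as in the synopsis.
    return ["swh", "indexer", "mapping", "translate", mapping_name, os.fspath(file)]


# Placeholder mapping name and path; the space in the path shows the quoting.
cmd = translate_argv("ExampleMapping", "my project/package.json")
print(shlex.join(cmd))
```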
rpc-serve#
Starts a Software Heritage Indexer RPC HTTP server.
swh indexer rpc-serve [OPTIONS] CONFIG_PATH
Options
- --host <host>#
Host on which to run the server.
- --port <port>#
Port the server binds to.
- --debug, --nodebug#
Indicates whether the server should run in debug mode.
Arguments
- CONFIG_PATH#
Required argument
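For the RPC server, the paired `--debug` / `--nodebug` flag is the only option that is not a simple name-value pair. A minimal Python sketch of assembling the command line follows; it does not start a server. The helper `rpc_serve_argv`, the host, the port number, and the configuration path are hypothetical values used for illustration, and leaving `debug` unset omits both flags so that the server's own default applies.

```python
import shlex


def rpc_serve_argv(config_path, host=None, port=None, debug=None):
    """Assemble (but do not run) a `swh indexer rpc-serve` command line."""
    argv = ["swh", "indexer", "rpc-serve"]
    if host is not None:
        argv += ["--host", host]
    if port is not None:
        argv += ["--port", str(port)]
    if debug is not None:
        # Boolean flag pair: pick exactly one of the two spellings.
        argv.append("--debug" if debug else "--nodebug")
    argv.append(config_path)  # required CONFIG_PATH argument
    return argv


# Illustrative values only; none of them are documented defaults.
cmd = rpc_serve_argv("indexer-server.yml", host="127.0.0.1", port=5007, debug=False)
print(shlex.join(cmd))
```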