swh.indexer.ctags module

swh.indexer.ctags.compute_language(content, log=None)
swh.indexer.ctags.run_ctags(path, lang=None, ctags_command='ctags')

Run ctags on file path with optional language.

Parameters
  • path – path to the file

  • lang – language for that path (optional)

  • ctags_command – ctags command to run (optional, defaults to 'ctags')

Yields

dict – ctags’ output
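As a rough illustration of the "Yields" contract above, the sketch below parses JSON lines of the kind universal-ctags emits with --output-format=json and yields one dict per symbol. The helper parse_ctags_json, the output key names, and the sample line are hypothetical and not part of swh.indexer; the real run_ctags may select or rename fields differently.

```python
import json
from typing import Dict, Iterable, Iterator


def parse_ctags_json(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per tag line (hypothetical helper, not the swh API)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        # universal-ctags marks symbol entries with "_type": "tag"
        if entry.get("_type") != "tag":
            continue
        yield {
            "name": entry.get("name"),
            "kind": entry.get("kind"),
            "line": entry.get("line"),
            "lang": entry.get("language"),
        }


# Illustrative sample, shaped like one line of universal-ctags JSON output.
sample = [
    '{"_type": "tag", "name": "main", "path": "hello.c", '
    '"pattern": "/^int main(void)$/", "kind": "function", '
    '"line": 3, "language": "C"}',
    "",
]
symbols = list(parse_ctags_json(sample))
```

Writing the parser as a generator mirrors the documented behaviour of yielding results one at a time, so a caller can stop early without parsing the whole output.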

class swh.indexer.ctags.CtagsIndexer(config=None, **kw)

Bases: swh.indexer.indexer.ContentIndexer

CONFIG_BASE_FILENAME = 'indexer/ctags'
ADDITIONAL_CONFIG = {
    'languages': ('dict', {'ada': 'Ada', 'adl': None, 'agda': None}),
    'tools': ('dict', {
        'name': 'universal-ctags',
        'version': '~git7859817b',
        'configuration': {
            'command_line': 'ctags --fields=+lnz --sort=no --links=no --output-format=json <filepath>',
        },
    }),
    'workdir': ('str', '/tmp/swh/indexer.ctags'),
}
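The command_line entry above is a template containing a <filepath> placeholder. The sketch below shows one way such a template could be turned into an argument list; build_argv and the example path are hypothetical, and the indexer itself may substitute the placeholder differently.

```python
import shlex
from typing import List

# Copied from the 'tools' entry of ADDITIONAL_CONFIG.
COMMAND_LINE = (
    "ctags --fields=+lnz --sort=no --links=no "
    "--output-format=json <filepath>"
)


def build_argv(command_line: str, path: str) -> List[str]:
    """Split the template, then substitute the <filepath> placeholder.

    Hypothetical helper: splitting before substituting keeps a path
    that contains spaces as a single argument.
    """
    return [
        path if arg == "<filepath>" else arg
        for arg in shlex.split(command_line)
    ]


# Example path chosen for illustration only.
argv = build_argv(COMMAND_LINE, "/tmp/example dir/hello.c")
```

The resulting list can be passed to subprocess.run without shell=True, which avoids quoting problems with unusual file names.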
prepare()

Prepare the runtime configuration the indexer needs. The indexer cannot run without this step.

filter(ids)

Filter out known sha1s and return only missing ones.
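The filtering step can be pictured as a set-membership check. This is a minimal sketch that assumes the already indexed sha1s are available as an in-memory set; filter_missing and the sample identifiers are hypothetical, and the real method presumably asks the indexer storage which ids are missing.

```python
from typing import Iterable, Iterator, Set


def filter_missing(ids: Iterable[bytes], known: Set[bytes]) -> Iterator[bytes]:
    """Yield only the sha1s that are not already known (hypothetical helper)."""
    for sha1 in ids:
        if sha1 not in known:
            yield sha1


# Made-up 20-byte identifiers for illustration.
already_indexed = bytes.fromhex("aa" * 20)
not_indexed = bytes.fromhex("bb" * 20)

missing = list(filter_missing([already_indexed, not_indexed], {already_indexed}))
```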

index(id, data)

Index sha1s’ content and store result.

Parameters
  • id (bytes) – content’s identifier

  • data (bytes) – raw content in bytes

Returns

a dict representing the content’s ctags, with keys:

  • id (bytes): content’s identifier (sha1)

  • ctags ([dict]): ctags list of symbols

Return type

dict

persist_index_computations(results: List[Dict], policy_update: str) → Dict[str, int]

Persist the results in storage.

Parameters
  • results – list of ctags results, each a dict with the following keys:

      - id (bytes): content’s identifier (sha1)
      - ctags ([dict]): ctags list of symbols

  • policy_update – either ‘update-dups’ or ‘ignore-dups’ to respectively update duplicates or ignore them
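To make the two policy_update values concrete, here is a minimal sketch that uses a plain dict as a stand-in for the storage backend. The function persist, the in-memory store, and the 'written' summary key are all hypothetical; the actual storage API and the keys of the returned summary are not specified in this section.

```python
from typing import Dict, List


def persist(
    store: Dict[bytes, List[dict]],
    results: List[Dict],
    policy_update: str,
) -> Dict[str, int]:
    """Write results into store according to policy_update (illustrative only)."""
    if policy_update not in ("update-dups", "ignore-dups"):
        raise ValueError(f"unknown policy_update: {policy_update!r}")
    written = 0
    for result in results:
        key = result["id"]
        # 'ignore-dups' leaves an existing entry untouched.
        if key in store and policy_update == "ignore-dups":
            continue
        # New entries are always written; 'update-dups' also overwrites.
        store[key] = result["ctags"]
        written += 1
    return {"written": written}  # hypothetical summary key


store: Dict[bytes, List[dict]] = {}
first = [{"id": b"\x01", "ctags": [{"name": "old"}]}]
second = [{"id": b"\x01", "ctags": [{"name": "new"}]}]

summary_first = persist(store, first, "ignore-dups")
summary_ignored = persist(store, second, "ignore-dups")
kept_after_ignore = store[b"\x01"]
summary_updated = persist(store, second, "update-dups")
```

With 'ignore-dups' the second call leaves the stored symbols unchanged, while 'update-dups' replaces them.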

results
scheduler