The refget Python package

The refget Python package implements the GA4GH Refget Specifications, which define standards for identifying and distributing reference biological sequences, such as reference genomes. The specifications cover three levels of data: sequences, sequence collections, and pangenomes (in progress).

Standard Local use Client Agent
Sequences
Sequence Collections

The Python package refget includes these utilities:

Refget Sequences

  1. A lightweight Python client for a remote refget sequences server.
  2. Local caching of retrieved results, improving performance for applications that require repeated lookups.
  3. A fully functioning local implementation of the refget sequences protocol for local analysis, backed by memory, SQLite, or MongoDB.
  4. Convenience functions for computing refget sequence digests from Python and handling FASTA files directly.
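To illustrate what a refget sequence digest is, here is a minimal sketch using only the Python standard library. It assumes the GA4GH `sha512t24u` scheme (SHA-512, truncated to 24 bytes, base64url-encoded) with an `SQ.` prefix on the uppercased sequence. The function names `sha512t24u` and `refget_sequence_digest` are illustrative and are not the package's own API; in practice, use the package's convenience functions.

```python
import base64
import hashlib


def sha512t24u(data: bytes) -> str:
    """SHA-512 digest, truncated to 24 bytes, base64url-encoded (32 characters)."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def refget_sequence_digest(sequence: str) -> str:
    """Illustrative helper (assumption): uppercase the sequence, then digest it."""
    normalized = sequence.upper().encode("ascii")
    return "SQ." + sha512t24u(normalized)


if __name__ == "__main__":
    print(refget_sequence_digest("ACGT"))
```

Because 24 bytes encode to exactly 32 base64 characters, the digest never needs padding, which keeps it safe to use in URLs.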

Refget Sequence Collections

  1. A lightweight Python client that can retrieve data from a remote refget sequence collections server (refget.SequenceCollectionClient).
  2. A local implementation of the refget sequence collections protocol.
  3. Convenience functions for computing refget sequence collection digests from Python.
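As a rough picture of what a sequence collection digest involves, the sketch below follows the general approach of the specification as understood here: each attribute array is serialized as canonical JSON and digested, and the resulting object of attribute digests is digested again. Treat the details as assumptions: `json.dumps` with sorted keys and compact separators only approximates RFC 8785 canonicalization for simple strings and integers, the choice of `names`, `lengths`, and `sequences` as the digested attributes is assumed, and `seqcol_digest` is a hypothetical name rather than the package's API.

```python
import base64
import hashlib
import json


def sha512t24u(data: bytes) -> str:
    """SHA-512 digest, truncated to 24 bytes, base64url-encoded."""
    return base64.urlsafe_b64encode(hashlib.sha512(data).digest()[:24]).decode("ascii")


def canonical_json(value) -> bytes:
    """Approximate canonical JSON: sorted keys, no insignificant whitespace."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def seqcol_digest(names: list, lengths: list, sequences: list) -> str:
    """Hypothetical helper: digest each attribute array, then digest the digests."""
    attributes = {"names": names, "lengths": lengths, "sequences": sequences}
    # Level 1: one digest per attribute array.
    level1 = {key: sha512t24u(canonical_json(values)) for key, values in attributes.items()}
    # Level 0: a single digest over the object of attribute digests.
    return sha512t24u(canonical_json(level1))


if __name__ == "__main__":
    raw = {"chr1": "ACGT", "chr2": "GGCC"}
    names = list(raw)
    lengths = [len(seq) for seq in raw.values()]
    # Sequences are represented by their refget sequence digests, not raw text.
    sequences = ["SQ." + sha512t24u(seq.encode("ascii")) for seq in raw.values()]
    print(seqcol_digest(names, lengths, sequences))
```

The layered design means two collections that share, say, the same sequences but different names will share the `sequences` attribute digest while differing at the top level.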

Refget Pangenomes

Implementation is still a work in progress.

Install

The built package is hosted on PyPI. Install it with pip or your preferred installer:

pip install refget
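To confirm the installation from Python, one option is to query the installed distribution's metadata with the standard library. This is a small sketch; `installed_version` is an illustrative helper name, not part of refget.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("refget")
    print(f"refget version: {found}" if found else "refget is not installed")
```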