Package refget documentation
create_refget_router
create_refget_router(sequences=False, collections=True, pangenomes=False)
Create a FastAPI router for the sequence collection API. This router provides endpoints for retrieving and comparing sequence collections. You can choose which endpoints to include by setting the sequences, collections, or pangenomes flags.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sequences | bool | Include sequence endpoints | False |
collections | bool | Include sequence collection endpoints | True |
pangenomes | bool | Include pangenome endpoints | False |
Returns:
Type | Description |
---|---|
APIRouter | A FastAPI router with the specified endpoints |
Examples:
app.include_router(create_refget_router(sequences=False, pangenomes=False))
Source code in refget/refget_router.py
fasta_to_seqcol_dict
fasta_to_seqcol_dict(fasta_file_path, digest_function=sha512t24u_digest)
Convert a FASTA file into a Sequence Collection object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fasta_file_path | str | Path to the FASTA file | required |
digest_function | DigestFunction | Digest function to use. Defaults to sha512t24u_digest. | sha512t24u_digest |
Returns:
Type | Description |
---|---|
dict | A canonical sequence collection object |
Source code in refget/utilities.py
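To make the conversion concrete, here is a minimal, standard-library-only sketch of how a FASTA file might be turned into a sequence collection dictionary. It is not the code in refget/utilities.py. The helper names `sha512t24u` and `fasta_text_to_seqcol_dict`, the `SQ.` prefix on sequence digests, and the uppercasing before hashing are assumptions based on the GA4GH refget conventions; the output keys follow the `names`, `lengths`, and `sequences` attributes mentioned elsewhere on this page.

```python
import base64
import hashlib


def sha512t24u(data: bytes) -> str:
    """SHA-512, truncated to the first 24 bytes, then base64url-encoded."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def fasta_text_to_seqcol_dict(fasta_text: str, digest_function=sha512t24u) -> dict:
    """Build a dict of names, lengths, and sequence digests from FASTA text."""
    names, chunks = [], []
    for raw_line in fasta_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header: keep the first whitespace-delimited token as the name.
            names.append((line[1:].split() or [""])[0])
            chunks.append([])
        elif chunks:
            # Sequence line: append to the record opened by the last header.
            chunks[-1].append(line)
    # Assumption: residues are uppercased before hashing.
    sequences = ["".join(parts).upper() for parts in chunks]
    return {
        "names": names,
        "lengths": [len(seq) for seq in sequences],
        # Assumption: sequence digests carry an "SQ." prefix.
        "sequences": ["SQ." + digest_function(seq.encode("ascii")) for seq in sequences],
    }


# Hypothetical two-record FASTA, passed as text so the example needs no file.
seqcol = fasta_text_to_seqcol_dict(">chr1 demo\nACGT\nacgt\n>chr2\nGG\n")
print(seqcol["names"], seqcol["lengths"])
```

The sketch takes text rather than a path only to stay self-contained. The dictionary returned by the library function may contain additional attributes.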
fasta_to_digest
fasta_to_digest(fa_file_path, inherent_attrs=['names', 'sequences'])
Given a FASTA file path, return the top-level digest of its sequence collection.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fa_file_path | str \| Path | Path to the fasta file | required |
inherent_attrs | Optional[list] | Attributes to include in the digest. | ['names', 'sequences'] |
Returns:
Type | Description |
---|---|
str | The top-level digest for this sequence collection |
Source code in refget/utilities.py
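The `inherent_attrs` argument controls which attributes contribute to the top-level digest. The sketch below illustrates that idea under the assumption that the digest follows the two-step scheme of the GA4GH sequence collections specification: digest each attribute's canonical JSON, then digest the object holding those digests. The names `canonical_json` and `top_level_digest` are hypothetical, the sequence values are placeholders rather than real digests, and `json.dumps` with sorted keys only approximates full RFC 8785 canonicalization.

```python
import base64
import hashlib
import json


def sha512t24u(data: bytes) -> str:
    """SHA-512, truncated to the first 24 bytes, then base64url-encoded."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def canonical_json(value) -> bytes:
    """Compact JSON with sorted keys (a stand-in for RFC 8785 canonicalization)."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def top_level_digest(seqcol: dict, inherent_attrs=("names", "sequences")) -> str:
    """Digest each inherent attribute, then digest the object of those digests."""
    level1 = {attr: sha512t24u(canonical_json(seqcol[attr])) for attr in inherent_attrs}
    return sha512t24u(canonical_json(level1))


# Placeholder collection; "SQ.aaa" and "SQ.bbb" are not real sequence digests.
seqcol = {"names": ["chr1", "chr2"], "lengths": [8, 2], "sequences": ["SQ.aaa", "SQ.bbb"]}
default_digest = top_level_digest(seqcol)
with_lengths_digest = top_level_digest(seqcol, ("names", "lengths", "sequences"))
print(default_digest != with_lengths_digest)
```

Under this scheme, attributes left out of `inherent_attrs` do not affect the result, so two collections that differ only in a non-inherent attribute share a digest.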
SequenceClient
SequenceClient(urls=['https://www.ebi.ac.uk/ena/cram'], raise_errors=None)
Bases: RefgetClient
A client for interacting with a refget sequences API.
Initializes the sequences client.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
urls | list | A list of base URLs of the sequences API. Defaults to ["https://www.ebi.ac.uk/ena/cram"]. | ['https://www.ebi.ac.uk/ena/cram'] |
raise_errors | bool | Whether to raise errors or log them. Defaults to None, which will guess. | None |
Attributes:
Name | Type | Description |
---|---|---|
urls | list | The list of base URLs of the sequences API. |
Source code in refget/clients.py
get_metadata
get_metadata(digest)
Retrieves metadata for a given sequence digest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
digest | str | The digest of the sequence. | required |
Returns:
Type | Description |
---|---|
dict | The metadata. |
Source code in refget/clients.py
get_sequence
get_sequence(digest, start=None, end=None)
Retrieves a sequence for a given digest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
digest | str | The digest of the sequence. | required |
Returns:
Type | Description |
---|---|
str | The sequence. |
Source code in refget/clients.py
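As an illustration of the `get_sequence(digest, start=None, end=None)` call shape, the sketch below wraps it in a small helper and exercises it against an offline stub. `StubSequenceClient`, `fetch_region`, and the `"demo-digest"` key are hypothetical. The stub also assumes `start` is 0-based and inclusive and `end` is exclusive, which this page does not state.

```python
class StubSequenceClient:
    """Offline stand-in exposing the documented get_sequence signature."""

    def __init__(self, sequences: dict):
        self._sequences = sequences

    def get_sequence(self, digest, start=None, end=None):
        # Assumption: start is 0-based inclusive and end is exclusive.
        return self._sequences[digest][start:end]


def fetch_region(client, digest: str, start: int, end: int) -> str:
    """Validate the coordinates, then delegate to client.get_sequence."""
    if start < 0 or end < start:
        raise ValueError(f"invalid region: start={start}, end={end}")
    return client.get_sequence(digest, start=start, end=end)


# "demo-digest" is a placeholder key, not a real refget digest.
client = StubSequenceClient({"demo-digest": "ACGTACGTAC"})
region = fetch_region(client, "demo-digest", 2, 6)
print(region)
```

Because `fetch_region` relies only on the method signature, a `SequenceClient` instance could presumably be passed in place of the stub, given network access to one of its configured URLs.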
SequenceCollectionClient
SequenceCollectionClient(urls=['https://seqcolapi.databio.org'], raise_errors=None)
Bases: RefgetClient
A client for interacting with a refget sequence collections API.
Initializes the sequence collection client.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
urls | list | A list of base URLs of the sequence collection API. Defaults to ["https://seqcolapi.databio.org"]. | ['https://seqcolapi.databio.org'] |
Attributes:
Name | Type | Description |
---|---|---|
urls | list | The list of base URLs of the sequence collection API. |
Source code in refget/clients.py
compare
compare(digest1, digest2)
Compares two sequence collections.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
digest1 | str | The digest of the first sequence collection. | required |
digest2 | str | The digest of the second sequence collection. | required |
Returns:
Type | Description |
---|---|
dict | The JSON response containing the comparison of the two sequence collections. |
Source code in refget/clients.py
get_attribute
get_attribute(attribute, digest, level=2)
Retrieves a specific attribute for a given digest and detail level.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
attribute | str | The attribute to retrieve. | required |
digest | str | The digest of the attribute. | required |
Returns:
Type | Description |
---|---|
dict | The JSON response containing the attribute. |
Source code in refget/clients.py
get_collection
get_collection(digest, level=2)
Retrieves a sequence collection for a given digest and detail level.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
digest | str | The digest of the sequence collection. | required |
level | int | The level of detail for the sequence collection. Defaults to 2. | 2 |
Returns:
Type | Description |
---|---|
dict | The JSON response containing the sequence collection. |
Source code in refget/clients.py
list_attributes
list_attributes(attribute, page=None, page_size=None)
Lists all available values for a given attribute with optional paging support.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
attribute | str | The attribute to list values for. | required |
page | int | The page number to retrieve. Defaults to None. | None |
page_size | int | The number of items per page. Defaults to None. | None |
Returns:
Type | Description |
---|---|
dict | The JSON response containing the list of available values for the attribute. |
Source code in refget/clients.py
list_collections
list_collections(page=None, page_size=None, attribute=None, attribute_digest=None)
Lists all available sequence collections with optional paging and attribute filtering support.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
page | int | The page number to retrieve. Defaults to None. | None |
page_size | int | The number of items per page. Defaults to None. | None |
attribute | str | The attribute to filter by. Defaults to None. | None |
attribute_digest | str | The attribute digest to filter by. Defaults to None. | None |
Returns:
Type | Description |
---|---|
dict | The JSON response containing the list of available sequence collections. |
Source code in refget/clients.py
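The `page` and `page_size` arguments suggest a simple loop for walking every collection. The sketch below shows one way to do that against an offline stub. It assumes that pages are numbered from 0 and that each response holds its items under a `"results"` key; neither detail is stated on this page. `StubCollectionClient` and `iter_collections` are hypothetical names.

```python
class StubCollectionClient:
    """Offline stand-in exposing the documented list_collections signature."""

    def __init__(self, digests):
        self._digests = list(digests)

    def list_collections(self, page=None, page_size=None, attribute=None, attribute_digest=None):
        # Assumption: pages start at 0 and items live under "results".
        page = page or 0
        page_size = page_size or 10
        start = page * page_size
        return {"results": self._digests[start:start + page_size]}


def iter_collections(client, page_size=100, first_page=0):
    """Yield items page by page until an empty page is returned."""
    page = first_page
    while True:
        response = client.list_collections(page=page, page_size=page_size)
        results = response.get("results", [])
        if not results:
            return
        yield from results
        page += 1


# Placeholder identifiers, not real collection digests.
client = StubCollectionClient(["a", "b", "c", "d", "e"])
all_items = list(iter_collections(client, page_size=2))
print(all_items)
```

Stopping on the first empty page avoids depending on any total-count field, at the cost of one extra request.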
service_info
service_info()
Retrieves information about the service.
Returns:
Type | Description |
---|---|
dict | The service information. |
Source code in refget/clients.py
RefgetDBAgent
RefgetDBAgent(engine=None, postgres_str=None, schema=f'{SCHEMA_FILEPATH}/seqcol.json', inherent_attrs=['names', 'lengths', 'sequences'])
Bases: object
Primary aggregator agent and interface to all other agents.
Parameterize it via these environment variables:
- POSTGRES_HOST
- POSTGRES_DB
- POSTGRES_USER
- POSTGRES_PASSWORD
Source code in refget/agents.py
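The four environment variables above can be gathered into a single connection string, for example to pass as `postgres_str`. The sketch below is one way to do that. The `postgresql://user:password@host/db` layout is the common SQLAlchemy-style URL and is an assumption here, since this page does not specify the format `postgres_str` expects. `postgres_url_from_env` and the demo values are hypothetical.

```python
import os
from urllib.parse import quote_plus

REQUIRED_VARS = ("POSTGRES_HOST", "POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD")


def postgres_url_from_env(env=None) -> str:
    """Assemble a postgresql:// URL from the POSTGRES_* variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
    user = quote_plus(env["POSTGRES_USER"])
    password = quote_plus(env["POSTGRES_PASSWORD"])
    return f"postgresql://{user}:{password}@{env['POSTGRES_HOST']}/{env['POSTGRES_DB']}"


# Demo values only; a real deployment would read os.environ instead.
demo_env = {
    "POSTGRES_HOST": "db.example.org",
    "POSTGRES_DB": "refget",
    "POSTGRES_USER": "alice",
    "POSTGRES_PASSWORD": "p@ss/word",
}
url = postgres_url_from_env(demo_env)
print(url)
```

Failing early on missing variables gives a clearer error than a connection attempt against a half-built URL.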
truncate
truncate()
Delete all records from the database
Source code in refget/agents.py
SequenceCollectionAgent
SequenceCollectionAgent(engine, inherent_attrs=None)
Bases: object
Agent for interacting with a database of sequence collections
Source code in refget/agents.py
add
add(seqcol)
Add a sequence collection to the database, given a SequenceCollection object
Source code in refget/agents.py
add_from_dict
add_from_dict(seqcol_dict)
Add a sequence collection from a seqcol dictionary
Source code in refget/agents.py
add_from_fasta_pep
add_from_fasta_pep(pep, fa_root)
Given a path to a PEP file and a root directory containing the fasta files, load the fasta files into the refget database.
Args:
- pep_path (str): Path to the PEP file
- fa_root (str): Root directory containing the fasta files
Source code in refget/agents.py
SequenceAgent
SequenceAgent(engine)
Bases: object
Agent for interacting with a database of sequences
Source code in refget/agents.py