Running a Refgenie Server
This tutorial walks you through setting up and running your own refgenie server. A refgenie server allows you to share genome assets with others on your network or the public internet.
Why Run Your Own Server?
There are several reasons you might want to run your own refgenie server:
- Internal distribution: Share assets privately across your organization without uploading to public servers
- Custom assets: Serve specialized or proprietary genome assets that aren't available on public servers
- Performance: Run a local server for faster asset access within your network
- Community sharing: Distribute your resources through the familiar refgenie interface
Prerequisites
Before starting a server, you need:
- Refgenie installed with server extras:
  pip install refgenie[server]
- Built assets: You need local assets to serve. If you haven't built any yet, see the build documentation.
- Archives created: Assets must be archived before they can be served.
Server vs Dashboard
Refgenie provides two web interfaces:
- refgenie1 serve: A full server for distributing assets to remote clients
- refgenie1 dash: A local dashboard for browsing your own assets
Creating Archives
Before serving assets, you must create archives. Archives are compressed, downloadable packages of your assets.
Set up an archive folder
First, make sure an archive folder is configured. You can set it during initialization or via an environment variable:
# Set via environment variable
export REFGENIE_GENOME_ARCHIVE_FOLDER=~/.refgenie/archives
# Or specify during init
refgenie1 init --genome-archive-folder ~/.refgenie/archives
Archive individual assets
Create an archive for a specific asset:
refgenie1 archive create hg38/fasta
This creates a .tar.gz archive in your genome_archive_folder that can be served to clients.
Archive multiple assets
You can archive multiple assets at once:
refgenie1 archive create hg38/fasta hg38/bowtie2_index mm10/fasta
Archive during build
You can also create archives automatically when building:
refgenie1 build hg38/fasta --archive
List archives
View all available archives:
refgenie1 archive list
Archives Required for Serving
The server only serves archived assets. If an asset isn't archived, clients won't be able to pull it.
Starting the Server
Basic server startup
Start the server with default settings (port 8000):
refgenie1 serve
The server will be available at http://localhost:8000.
Custom port
Specify a different port:
refgenie1 serve --port 8080
Development mode with auto-reload
For development, enable auto-reload so the server restarts automatically whenever the code changes:
refgenie1 serve --reload
Production Deployment
For production use, consider running behind a reverse proxy (nginx, Apache) with HTTPS enabled. The --reload flag should not be used in production.
Using the Local Dashboard
The dashboard provides a read-only web interface for browsing your local assets:
refgenie1 dash
By default, this opens http://localhost:8080 in your browser. Specify a different port with:
refgenie1 dash --port 9000
The dashboard displays:
- All genomes and their aliases
- Available assets for each genome
- Asset metadata and descriptions
Server API Overview
The refgenie server exposes a REST API with several endpoint groups:
Core Endpoints
| Endpoint | Description |
|---|---|
| / or /index | Landing page with all available genomes |
| /healthcheck | Health check endpoint (returns {"status": "ok"}) |
| /summary | Database summary with counts of genomes, asset groups, and assets |
| /species/summary | Statistics grouped by species |
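The health check endpoint is handy for scripts that need to wait until the server is ready. The following is a minimal sketch using only the Python standard library; it assumes the server answers /healthcheck with the JSON body {"status": "ok"} as listed in the table. The function names and the `opener` parameter are hypothetical helpers introduced here, not part of refgenie.

```python
import json
import time
import urllib.request


def is_healthy(base_url, opener=urllib.request.urlopen):
    """Return True if GET /healthcheck answers with {"status": "ok"}."""
    url = base_url.rstrip("/") + "/healthcheck"
    try:
        with opener(url, timeout=5) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, HTTP error, timeout, or a non-JSON body.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_healthy(base_url, attempts=10, delay=1.0,
                       opener=urllib.request.urlopen):
    """Poll the health check up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if is_healthy(base_url, opener=opener):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:8000")` could gate a deployment script right after launching refgenie1 serve. The `opener` argument exists only so the function can be exercised without a running server.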
Asset Endpoints
| Endpoint | Description |
|---|---|
| /page/genome/{digest} | Genome splash page with available assets |
| /page/asset/{digest}/{asset_group} | Asset detail page |
| /archives/{digest}/download | Download an archive by its digest |
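Clients normally fetch archives with refgenie1 pull, but the download endpoint can also be called directly, for instance to test a deployment. The sketch below builds the URL from the /archives/{digest}/download pattern in the table and streams the response to disk. The function names and the `opener` parameter are hypothetical; how you obtain an archive digest is outside the scope of this sketch.

```python
import shutil
import urllib.parse
import urllib.request


def archive_download_url(base_url, digest):
    """Build the download URL for an archive from its digest."""
    return "{}/archives/{}/download".format(
        base_url.rstrip("/"), urllib.parse.quote(digest, safe="")
    )


def download_archive(base_url, digest, dest_path,
                     opener=urllib.request.urlopen):
    """Stream the archive to dest_path in chunks instead of loading it into memory."""
    url = archive_download_url(base_url, digest)
    # urllib follows common HTTP redirects by default.
    with opener(url, timeout=60) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

Streaming with `shutil.copyfileobj` matters here because genome archives can be several gigabytes.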
GA4GH DRS Endpoints
The server implements the GA4GH Data Repository Service (DRS) specification:
| Endpoint | Description |
|---|---|
| /ga4gh/drs/service-info | DRS service information |
| /ga4gh/drs/objects/{object_id} | Get DRS object metadata |
| /ga4gh/drs/objects/{object_id}/access/{access_id} | Get access URL for download |
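To illustrate how the two object endpoints fit together, here is a rough sketch of the client-side logic. It assumes the object metadata follows the GA4GH DRS schema, where each entry in `access_methods` carries either an inline `access_url` or an `access_id` to exchange at the access endpoint; whether this server populates one or the other is not confirmed here. All function names are hypothetical.

```python
import urllib.parse

DRS_PREFIX = "/ga4gh/drs"  # path prefix taken from the endpoint table


def drs_object_url(base_url, object_id):
    """URL that returns the DRS object metadata."""
    return "{}{}/objects/{}".format(
        base_url.rstrip("/"), DRS_PREFIX, urllib.parse.quote(object_id, safe="")
    )


def drs_access_url(base_url, object_id, access_id):
    """URL that returns the download location for one access method."""
    return "{}/access/{}".format(
        drs_object_url(base_url, object_id),
        urllib.parse.quote(access_id, safe=""),
    )


def access_targets(base_url, drs_object):
    """List one URL per access method: a direct URL, or the access endpoint to query."""
    targets = []
    for method in drs_object.get("access_methods", []):
        if "access_url" in method:
            targets.append(method["access_url"]["url"])
        elif "access_id" in method:
            targets.append(
                drs_access_url(base_url, drs_object["id"], method["access_id"])
            )
    return targets
```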
Data Channel Endpoints
| Endpoint | Description |
|---|---|
| /data_channel/ | List available data channels |
| /data_channel/index.yaml | Compiled index of all data channels |
Interactive API Documentation
Visit /docs for Swagger UI or /redoc for ReDoc-style documentation.
Configuration for Production
Using PostgreSQL
For production deployments, consider using PostgreSQL instead of SQLite. Create a database configuration file:
# ~/.refgenie/refgenie_db_config.yaml
type: postgresql
name: refgenie
host: localhost
port: 5432
user: refgenie_user
password: your_secure_password
Set the path to this config:
export REFGENIE_DB_CONFIG_PATH=~/.refgenie/refgenie_db_config.yaml
See the configuration documentation for more details.
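Before pointing refgenie at the database, it can help to confirm that the same credentials work with a standard PostgreSQL client. The sketch below turns the fields from the config file into a libpq-style connection URI that tools such as psql accept. The mapping of `name` to the database name is an assumption based on the field names, and `postgres_uri` is a hypothetical helper; this is not how refgenie itself builds its connection.

```python
import urllib.parse


def postgres_uri(config):
    """Build a postgresql:// URI from a dict with the config file's fields."""
    if config.get("type") != "postgresql":
        raise ValueError(
            "expected type 'postgresql', got {!r}".format(config.get("type"))
        )
    # Percent-encode values that may contain reserved characters such as @ or /.
    user = urllib.parse.quote(str(config["user"]), safe="")
    password = urllib.parse.quote(str(config["password"]), safe="")
    name = urllib.parse.quote(str(config["name"]), safe="")
    return "postgresql://{}:{}@{}:{}/{}".format(
        user, password, config["host"], int(config["port"]), name
    )
```

Avoid printing the resulting URI in shared logs, since it embeds the password.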
Environment Variables
Key environment variables for server configuration:
| Variable | Default | Description |
|---|---|---|
| REFGENIE_HOME_PATH | ~/.refgenie | Base directory for refgenie files |
| REFGENIE_GENOME_FOLDER | $REFGENIE_HOME_PATH/genomes | Directory for genome assets |
| REFGENIE_GENOME_ARCHIVE_FOLDER | $REFGENIE_HOME_PATH/archives | Directory for asset archives |
| REFGENIE_DB_CONFIG_PATH | $REFGENIE_HOME_PATH/refgenie_db_config.yaml | Path to database config |
| REFGENIE_LOG_LEVEL | INFO | Logging verbosity |
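Because three of the defaults derive from REFGENIE_HOME_PATH, it is easy to lose track of which paths are actually in effect. The following sketch mirrors the table's defaults so a deployment script can print or check the effective values. It is an illustration of the documented defaults only; refgenie's own resolution logic may differ, and `resolve_refgenie_settings` is a hypothetical name.

```python
import os


def resolve_refgenie_settings(environ=None):
    """Apply the documented defaults to whichever variables are unset."""
    env = os.environ if environ is None else environ
    home = os.path.expanduser(env.get("REFGENIE_HOME_PATH", "~/.refgenie"))

    def path_setting(name, default_leaf):
        # An explicit variable wins; otherwise fall back to a path under home.
        return os.path.expanduser(env.get(name, os.path.join(home, default_leaf)))

    return {
        "REFGENIE_HOME_PATH": home,
        "REFGENIE_GENOME_FOLDER": path_setting("REFGENIE_GENOME_FOLDER", "genomes"),
        "REFGENIE_GENOME_ARCHIVE_FOLDER": path_setting(
            "REFGENIE_GENOME_ARCHIVE_FOLDER", "archives"
        ),
        "REFGENIE_DB_CONFIG_PATH": path_setting(
            "REFGENIE_DB_CONFIG_PATH", "refgenie_db_config.yaml"
        ),
        "REFGENIE_LOG_LEVEL": env.get("REFGENIE_LOG_LEVEL", "INFO"),
    }
```

Calling `resolve_refgenie_settings()` with no argument reads the current process environment; passing a dict makes it easy to preview a planned configuration.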
Serving from Remote Storage
For serving assets from cloud storage (e.g., S3), configure a remote:
refgenie1 remote add --type http --prefix https://your-bucket.s3.amazonaws.com/archives --description "S3 archive storage"
When a remote is configured, the server will redirect download requests to the remote URL instead of serving files directly.
Configuring Data Channels
Data channels allow you to aggregate asset classes and recipes from external sources. Set the path to your data channels configuration:
export DATA_CHANNELS_CONFIG_PATH=/path/to/data_channels.yaml
See the data channels documentation for the configuration format.
Docker Deployment
For containerized deployments, you can run the server with Docker:
docker run --rm -d -p 80:80 \
-v /path/to/genomes:/genomes \
-v /path/to/archives:/archives \
-e REFGENIE_GENOME_FOLDER=/genomes \
-e REFGENIE_GENOME_ARCHIVE_FOLDER=/archives \
--name refgenie-server \
refgenie/refgenie:latest refgenie1 serve --port 80
Client Configuration
Once your server is running, clients can subscribe to it:
refgenie1 subscribe http://your-server-address:8000
Then pull assets normally:
refgenie1 pull hg38/fasta
Troubleshooting
Server won't start
- Ensure the server extras are installed: pip install refgenie[server]
- Check that the port isn't already in use
- Verify database configuration is correct
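One quick way to check the port is to try binding to it yourself. This is a small standard-library sketch; `port_is_free` is a hypothetical helper, and it checks a single interface (127.0.0.1 by default), which may differ from the address the server binds to.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if host:port can be bound, meaning nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or insufficient permissions.
            return False
    return True
```

For the default setup you would call `port_is_free(8000)` before running refgenie1 serve.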
Assets not showing up
- Confirm assets are archived: refgenie1 archive list
- Check the archive folder path is correct
- Verify database contains the assets: refgenie1 list
Download failures
- Ensure archive files exist in the archive folder
- If using remote storage, verify the remote configuration is correct
- Check server logs for detailed error messages
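If downloads start but the resulting files are unusable, a truncated or corrupted archive on disk is a likely cause. The sketch below scans a folder for .tar.gz files and reports any that cannot be read as gzip-compressed tarballs. The recursive search is an assumption, since the layout inside the archive folder is not described here, and `find_broken_archives` is a hypothetical helper.

```python
import tarfile
import zlib
from pathlib import Path


def find_broken_archives(archive_folder):
    """Return the .tar.gz files under archive_folder that fail to open or list."""
    broken = []
    for path in sorted(Path(archive_folder).expanduser().rglob("*.tar.gz")):
        try:
            with tarfile.open(path, "r:gz") as tar:
                tar.getmembers()  # walks every header, so truncation is detected
        except (tarfile.TarError, OSError, EOFError, zlib.error):
            broken.append(path)
    return broken
```

An empty result from `find_broken_archives("~/.refgenie/archives")` means every archive found was readable; it does not prove that the expected archives exist.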
Next Steps
- Configuration options - Detailed configuration reference
- Data channels - Set up external data sources
- Building assets - Create assets to serve
- Public servers - List of available public servers