Endpoint Hits Collector in RefgenieServer
Overview
The endpoint_hits_collector is a core component in RefgenieServer designed to track and process hits to specific API endpoints. Its primary use case is to collect download counts for assets, but it is built to be easily extensible for other analytics or logging purposes.
How It Works
Middleware Collection
- The
EndpointHitCounterMiddlewareis registered as a middleware in the FastAPI app. - On each request, it inspects the matched route and path parameters.
- It records a hit by calling
record_hiton theendpoint_hits_collector(attached toapp.state).
Background Processing
- The
endpoint_hits_collectoris periodically processed by a background job, managed by an APSchedulerBackgroundScheduler. - The scheduler is started in the FastAPI app's
lifespancontext and runs thehandle_endpoint_collector_hitsfunction at a configurable interval (default: 60 seconds) - This function executes all registered handlers (see below), passing them the collected hits, and then resets the collector for the next interval.
Current Usage: Download Counts
- The main use case in RefgenieServer is to collect and persist download counts for assets.
- Handlers in
refgenieserver.stats.handlers.HANDLERS_CATALOGprocess the collected hits and update the relevant statistics (e.g., incrementing download counters in a database).
Extending Functionality: Adding More Handlers
Developers can extend the analytics and processing capabilities by adding new handlers. Here’s how:
- Create a Handler Function
- A handler is a function that takes two arguments: the
EndpointCollectorinstance and theRefgenieinstance. - It should process the collected hits as needed (e.g., log to a file, update metrics, trigger alerts).
def my_custom_handler(endpoint_collector, refgenie):
# Process collected hits
hits = endpoint_collector.get_hits()
# ... custom logic ...
- Register the Handler
- Add your handler to the
HANDLERS_CATALOGdictionary inrefgenieserver/stats/handlers.py:
HANDLERS_CATALOG = {
"download_count": download_count_handler,
"my_custom": my_custom_handler,
}
- Automatic Execution
- All handlers in the catalog are automatically called by
handle_endpoint_collector_hitsat each interval. - No further changes are needed to the scheduler or main app.
Configuration
- The interval for processing hits is controlled by the
DOWNLOAD_COUNT_DUMP_JOB_INTERVAL_SECONDSenvironment variable (default: 60 seconds). - This can be set in your environment or configuration files.
Summary
- The
endpoint_hits_collectorprovides a flexible, centralized way to collect and process endpoint usage data. - It is managed automatically by middleware and a background scheduler.
- Developers can easily extend its functionality by adding new handlers to the handler catalog.