Active

ZeptoR8R

Intelligent request router

A high-performance request router with AI-aware load balancing. ZeptoR8R distributes traffic based on model capacity, queue depth, and latency requirements, with the goal of keeping AI inference fast under load.

Quickstart

# Install ZeptoR8R
$ npm install -g @aisar/r8r
# Start with config
$ r8r --config ./r8r.yaml

Features

AI-Aware Routing

Routes requests based on model type, token count, and inference complexity.
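As a rough illustration of what routing on model type and token count could look like, here is a minimal Python sketch. It is not ZeptoR8R's implementation: the `InferenceRequest` type, the pool names, and the `LONG_CONTEXT_TOKENS` threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class InferenceRequest:
    model: str            # hypothetical model name, e.g. "chat-large"
    prompt_tokens: int    # tokens in the incoming prompt
    max_new_tokens: int   # tokens the client asks the model to generate


# Assumed threshold for illustration only; not a ZeptoR8R default.
LONG_CONTEXT_TOKENS = 4096


def classify(req: InferenceRequest) -> str:
    """Return the name of a (hypothetical) worker pool for this request."""
    # Model type is checked first: embedding models never generate tokens.
    if req.model.startswith("embed"):
        return "embedding-pool"
    # Total token budget serves as a crude proxy for inference complexity.
    total_tokens = req.prompt_tokens + req.max_new_tokens
    if total_tokens > LONG_CONTEXT_TOKENS:
        return "long-context-pool"
    return "default-pool"
```

A request such as `InferenceRequest("chat-large", 6000, 512)` would land in the long-context pool, while a short chat request falls through to the default pool.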

Load Balancing

Distributes traffic across inference workers using real-time metrics.
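One way to picture metric-driven worker selection is a "least estimated wait" rule. The sketch below is an assumption about how such a policy might be written, not a description of ZeptoR8R's balancer; `WorkerStats`, its fields, and the scoring formula are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkerStats:
    name: str
    queue_depth: int        # requests currently waiting on this worker
    p50_latency_ms: float   # recent median latency per request
    healthy: bool = True    # result of the latest health check


def pick_worker(workers: List[WorkerStats]) -> Optional[WorkerStats]:
    """Pick the healthy worker with the lowest estimated wait, or None."""
    candidates = [w for w in workers if w.healthy]
    if not candidates:
        return None
    # Estimated wait: everything already queued plus this request,
    # each costing roughly the worker's median latency.
    return min(candidates, key=lambda w: (w.queue_depth + 1) * w.p50_latency_ms)
```

Multiplying queue depth by latency keeps a fast but busy worker from always winning over a slower idle one; a real balancer would likely smooth these metrics over time.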

Queue Management

Smart queuing with priority levels and timeout handling.
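Priority levels with timeouts can be sketched with a binary heap plus a per-request deadline. This is an illustrative model only; the `PriorityRequestQueue` class, the convention that lower numbers mean higher priority, and the drop-on-expiry behaviour are assumptions rather than documented ZeptoR8R behaviour.

```python
import heapq
import itertools
import time
from typing import Any, Callable, List, Optional, Tuple


class PriorityRequestQueue:
    """Min-heap of requests; lower priority number is served first."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._heap: List[Tuple[int, int, float, Any]] = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order
        self._clock = clock

    def push(self, item: Any, priority: int, timeout_s: float) -> None:
        deadline = self._clock() + timeout_s
        heapq.heappush(self._heap, (priority, next(self._counter), deadline, item))

    def pop(self) -> Optional[Any]:
        """Return the next unexpired item, silently dropping expired ones."""
        while self._heap:
            _, _, deadline, item = heapq.heappop(self._heap)
            if deadline >= self._clock():
                return item
        return None
```

The clock is injectable so that timeout handling can be exercised without sleeping. A production queue would probably also report expired requests back to the client instead of discarding them.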

Low Latency

Sub-millisecond routing overhead with efficient connection pooling.

Architecture

Clients: HTTP/gRPC API consumers
Router: Request classification and routing logic
Load Balancer: Worker selection and health tracking
Workers: AI inference endpoints
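The four layers above can be read as a simple pipeline: a client call enters the router, the router classifies it, the load balancer picks a worker, and the worker produces the response. The Python sketch below models that flow under heavy simplification. All class and function names are hypothetical, round-robin stands in for the real selection logic, and health tracking is omitted.

```python
from typing import Callable, Dict, List

# A worker is modelled as a function from prompt to response.
Worker = Callable[[str], str]


def make_worker(name: str) -> Worker:
    """Build a stand-in inference endpoint that echoes which worker ran."""
    return lambda prompt: f"{name} handled: {prompt}"


class LoadBalancer:
    """Round-robin selection within a pool (a stand-in policy)."""

    def __init__(self, pools: Dict[str, List[Worker]]):
        self._pools = pools
        self._next = {pool: 0 for pool in pools}

    def select(self, pool: str) -> Worker:
        workers = self._pools[pool]
        index = self._next[pool]
        self._next[pool] = (index + 1) % len(workers)
        return workers[index]


class Router:
    """Classifies a request, then delegates worker choice to the balancer."""

    def __init__(self, balancer: LoadBalancer):
        self._balancer = balancer

    def classify(self, model: str) -> str:
        return "embed" if model.startswith("embed") else "chat"

    def handle(self, model: str, prompt: str) -> str:
        pool = self.classify(model)
        worker = self._balancer.select(pool)
        return worker(prompt)
```

A client would then call something like `Router(LoadBalancer({...})).handle("chat-large", "hi")`. Keeping classification and worker selection in separate objects mirrors the Router and Load Balancer split in the list above.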