Running as a Service¶
This guide covers how to run EmbedCache as a REST API service.
Starting the Service¶
Basic Start¶
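Run the binary with no arguments:
embedcache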
This starts the service with default configuration on 127.0.0.1:8081.
With Configuration File¶
# Create .env file first
cat > .env << EOF
SERVER_HOST=0.0.0.0
SERVER_PORT=8080
DB_PATH=/var/lib/embedcache/cache.db
ENABLED_MODELS=BGESmallENV15,AllMiniLML6V2
EOF
# Start the service
embedcache
With Environment Variables¶
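Configuration can also be supplied inline when launching the binary. A minimal sketch, using the same variables shown in the .env example above:
SERVER_HOST=0.0.0.0 SERVER_PORT=8080 embedcache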
API Endpoints¶
Generate Embeddings¶
POST /v1/embed
Generate embeddings for a list of text strings.
curl -X POST http://localhost:8081/v1/embed \
  -H "Content-Type: application/json" \
  -d '{
    "text": ["Hello, world!", "Another text to embed."],
    "config": {
      "chunking_type": "words",
      "chunking_size": 512,
      "embedding_model": "BGESmallENV15"
    }
  }'
Response:
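The exact response body is not reproduced here. As a rough sketch, assuming it mirrors the /v1/process response shown below (embeddings keyed by input index, with the echoed config and an error field), it might look like:
{
  "config": {
    "chunking_type": "words",
    "chunking_size": 512,
    "embedding_model": "BGESmallENV15"
  },
  "embeddings": {
    "0": [0.123, -0.456, ...],
    "1": [0.234, -0.567, ...]
  },
  "error": null
}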
Process URL¶
POST /v1/process
Fetch content from a URL, chunk it, and generate embeddings.
curl -X POST http://localhost:8081/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/article",
    "config": {
      "chunking_type": "words",
      "chunking_size": 256,
      "embedding_model": "BGESmallENV15"
    }
  }'
Response:
{
  "url": "https://example.com/article",
  "config": {
    "chunking_type": "words",
    "chunking_size": 256,
    "embedding_model": "BGESmallENV15"
  },
  "chunks": {
    "0": "First chunk of text...",
    "1": "Second chunk of text..."
  },
  "embeddings": {
    "0": [0.123, -0.456, ...],
    "1": [0.234, -0.567, ...]
  },
  "error": null
}
List Supported Features¶
GET /v1/params
Get a list of supported chunking types and embedding models.
Response:
{
  "chunking_types": ["words", "llm-concept", "llm-introspection"],
  "embedding_models": [
    "AllMiniLML6V2",
    "BGESmallENV15",
    "BGEBaseENV15",
    ...
  ]
}
Health Monitoring¶
Check if the service is running by making a request to the params endpoint:
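For example:
curl -f http://localhost:8081/v1/params
The -f flag makes curl exit with a non-zero status on HTTP errors, which is convenient for scripts and container health checks.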
Logging¶
EmbedCache uses Actix's built-in logging. Logs are written to stdout. To increase verbosity:
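A minimal sketch, assuming log filtering follows the usual RUST_LOG convention used by Actix applications:
RUST_LOG=debug embedcache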
Log levels: error, warn, info, debug, trace
Graceful Shutdown¶
The service handles SIGTERM gracefully, completing in-flight requests before shutting down.
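For example, to stop a running instance from a shell (assuming a single embedcache process):
kill -TERM $(pidof embedcache)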