REST API Reference

Complete reference for the EmbedCache REST API.

Base URL

http://localhost:8081

Authentication

EmbedCache does not require authentication by default. For production deployments, use a reverse proxy to add authentication.


POST /v1/embed

Generate embeddings for a list of text strings.

Request

POST /v1/embed
Content-Type: application/json

Body:

{
  "text": ["string1", "string2", ...],
  "config": {
    "chunking_type": "words",
    "chunking_size": 512,
    "embedding_model": "BGESmallENV15"
  }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | array[string] | Yes | List of texts to embed |
| config | object | No | Processing configuration |
| config.chunking_type | string | No | Chunking strategy (default: "words") |
| config.chunking_size | integer | No | Chunk size in words (default: 512) |
| config.embedding_model | string | No | Model to use (default: "BGESmallENV15") |

Response

Success (200 OK):

[
  [0.123, -0.456, 0.789, ...],
  [0.234, -0.567, 0.890, ...]
]

Array of embedding vectors, one per input text.

Error (400 Bad Request):

{
  "error": "Unsupported embedding model: InvalidModel"
}

Example

curl -X POST http://localhost:8081/v1/embed \
  -H "Content-Type: application/json" \
  -d '{
    "text": [
      "Machine learning is a subset of AI.",
      "Natural language processing enables text understanding."
    ],
    "config": {
      "embedding_model": "BGESmallENV15"
    }
  }'
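
The same request can be sketched in Python using only the standard library. The helper names `build_embed_request` and `embed` are illustrative and not part of EmbedCache; the sketch assumes the server is reachable at the base URL above and returns one vector per input text, as described in the Response section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8081"  # base URL from this reference


def build_embed_request(texts, embedding_model=None, base_url=BASE_URL):
    """Build (but do not send) a POST /v1/embed request. Hypothetical helper."""
    payload = {"text": list(texts)}
    if embedding_model is not None:
        # "config" is optional; omitted fields fall back to the server defaults.
        payload["config"] = {"embedding_model": embedding_model}
    return urllib.request.Request(
        url=f"{base_url}/v1/embed",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, embedding_model=None, timeout=30.0):
    """Send the request and return the list of embedding vectors."""
    texts = list(texts)
    request = build_embed_request(texts, embedding_model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        vectors = json.load(response)
    # The reference documents one vector per input text; fail loudly otherwise.
    if len(vectors) != len(texts):
        raise ValueError(f"expected {len(texts)} vectors, got {len(vectors)}")
    return vectors
```

With a server running, `embed(["Machine learning is a subset of AI."], "BGESmallENV15")` would return a list containing a single vector.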

POST /v1/process

Fetch content from a URL, chunk it, and generate embeddings. Results are cached.

Request

POST /v1/process
Content-Type: application/json

Body:

{
  "url": "https://example.com/article",
  "config": {
    "chunking_type": "words",
    "chunking_size": 256,
    "embedding_model": "BGESmallENV15"
  }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | URL to fetch and process |
| config | object | No | Processing configuration |

Response

Success (200 OK):

{
  "url": "https://example.com/article",
  "config": {
    "chunking_type": "words",
    "chunking_size": 256,
    "embedding_model": "BGESmallENV15"
  },
  "chunks": {
    "0": "First chunk of extracted text...",
    "1": "Second chunk of extracted text...",
    "2": "Third chunk of extracted text..."
  },
  "embeddings": {
    "0": [0.123, -0.456, ...],
    "1": [0.234, -0.567, ...],
    "2": [0.345, -0.678, ...]
  },
  "error": null
}

| Field | Type | Description |
| --- | --- | --- |
| url | string | Processed URL |
| config | object | Configuration used |
| chunks | object | Map of chunk index to text |
| embeddings | object | Map of chunk index to embedding |
| error | string\|null | Error message if processing failed |

Scraping Failed:

{
  "url": "https://example.com/article",
  "config": {...},
  "chunks": {},
  "embeddings": {},
  "error": "Failed to scrape content"
}

Example

curl -X POST http://localhost:8081/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://en.wikipedia.org/wiki/Machine_learning",
    "config": {
      "chunking_type": "words",
      "chunking_size": 200,
      "embedding_model": "AllMiniLML6V2"
    }
  }'
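
Because `chunks` and `embeddings` are JSON objects, their keys arrive as strings, and a plain string sort would place "10" before "2". The sketch below shows one way to consume an already-decoded `/v1/process` response: it checks the `error` field first, then pairs each chunk with its embedding in numeric order. The function name `ordered_chunks` is hypothetical, and the sketch assumes every chunk index also appears in `embeddings`.

```python
def ordered_chunks(result):
    """Return [(index, text, embedding), ...] sorted by numeric chunk index.

    `result` is the decoded JSON body of a /v1/process response.
    """
    # A failed scrape reports a non-null "error" with empty maps.
    if result.get("error"):
        raise RuntimeError(
            f"processing failed for {result.get('url')}: {result['error']}"
        )
    chunks = result.get("chunks", {})
    embeddings = result.get("embeddings", {})
    rows = []
    # Sort keys as integers so that "10" comes after "2".
    for key in sorted(chunks, key=int):
        # Assumes a matching embedding exists; raises KeyError if it does not.
        rows.append((int(key), chunks[key], embeddings[key]))
    return rows
```

Raising on a non-null `error` keeps callers from silently treating a failed scrape as a document with no chunks.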

GET /v1/params

List supported chunking types and embedding models.

Request

GET /v1/params

Response

Success (200 OK):

{
  "chunking_types": [
    "words",
    "llm-concept",
    "llm-introspection"
  ],
  "embedding_models": [
    "AllMiniLML6V2",
    "AllMiniLML6V2Q",
    "AllMiniLML12V2",
    "AllMiniLML12V2Q",
    "BGEBaseENV15",
    "BGEBaseENV15Q",
    "BGELargeENV15",
    "BGELargeENV15Q",
    "BGESmallENV15",
    "BGESmallENV15Q",
    "NomicEmbedTextV1",
    "NomicEmbedTextV15",
    "NomicEmbedTextV15Q",
    "ParaphraseMLMiniLML12V2",
    "ParaphraseMLMiniLML12V2Q",
    "ParaphraseMLMpnetBaseV2",
    "BGESmallZHV15",
    "MultilingualE5Small",
    "MultilingualE5Base",
    "MultilingualE5Large",
    "MxbaiEmbedLargeV1",
    "MxbaiEmbedLargeV1Q"
  ]
}

Example

curl http://localhost:8081/v1/params
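
A client can use this response to check a `config` object before sending it, rather than waiting for a 400 from the server. The sketch below is a minimal illustration: `validate_config` is a hypothetical helper, `params` is the decoded `/v1/params` response, and the messages simply mirror the error strings shown in this reference.

```python
def validate_config(config, params):
    """Return a list of problems; an empty list means the config looks supported.

    `params` is the decoded JSON body of GET /v1/params.
    """
    problems = []

    chunking_type = config.get("chunking_type")
    if chunking_type is not None and chunking_type not in params["chunking_types"]:
        problems.append(f"Unsupported chunking type: {chunking_type}")

    model = config.get("embedding_model")
    if model is not None and model not in params["embedding_models"]:
        problems.append(f"Unsupported embedding model: {model}")

    # Fields left out of the config are not checked; the server applies defaults.
    return problems
```

This check is advisory only; the server remains the source of truth for what it accepts.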

Error Responses

All endpoints may return error responses:

400 Bad Request:

{
  "error": "Unsupported chunking type: invalid-type"
}

500 Internal Server Error:

{
  "error": "Internal server error message"
}
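
On the client side, these bodies can be reduced to a single readable message. The sketch below assumes error bodies follow the `{"error": "..."}` shape shown above and falls back to a generic message when they do not (for example, if a reverse proxy returns an HTML error page). `error_message` is a hypothetical helper name.

```python
import json


def error_message(status, body):
    """Extract the "error" field from an error response body.

    `body` may be bytes or str. Falls back to a generic message when the
    body is not a JSON object with a string "error" field.
    """
    try:
        parsed = json.loads(body)
    except ValueError:
        # Covers invalid JSON and undecodable bytes.
        parsed = None
    if isinstance(parsed, dict) and isinstance(parsed.get("error"), str):
        return f"HTTP {status}: {parsed['error']}"
    return f"HTTP {status}: unrecognized error body"
```

With `urllib.request`, a non-2xx response raises `urllib.error.HTTPError`; its `code` attribute and `read()` method supply the two arguments.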

OpenAPI Specification

The full OpenAPI specification is available at:

GET /openapi.json
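
As a rough illustration of how the specification might be used, the sketch below lists the operations in an already-decoded specification. It assumes the document follows the standard OpenAPI layout, with a top-level `paths` object keyed by path and then by HTTP method; `list_operations` is a hypothetical helper, not part of EmbedCache.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec):
    """Return sorted "METHOD /path" strings from a decoded OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append(f"{key.upper()} {path}")
    return sorted(operations)
```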

Rate Limiting

EmbedCache does not implement rate limiting. For production use, configure rate limiting in your reverse proxy.


CORS

CORS is not configured by default. Enable it via a reverse proxy if needed.