Chunking Strategies

EmbedCache provides multiple text chunking strategies to break down documents into smaller pieces for embedding generation.

Why Chunking Matters

Embedding models have token limits and work best with focused, coherent text segments. Chunking strategies help:

  • Stay within model limits - Avoid truncation
  • Improve embedding quality - More focused embeddings
  • Enable semantic search - Find specific passages
  • Optimize storage - Index at appropriate granularity

Available Strategies

Word Chunking

Type: words

The simplest strategy: it splits text by whitespace into fixed-size word chunks.

curl -X POST http://localhost:8081/v1/embed \
  -H "Content-Type: application/json" \
  -d '{
    "text": ["Your long text here..."],
    "config": {
      "chunking_type": "words",
      "chunking_size": 512
    }
  }'

Characteristics:

  • Fast and deterministic
  • May split mid-sentence or mid-concept
  • Good for general-purpose use
  • Always available
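Conceptually, the "words" strategy amounts to splitting on whitespace and grouping words into fixed-size buckets. A minimal sketch (the function name is illustrative, not part of EmbedCache's API):

```python
def word_chunks(text: str, chunking_size: int) -> list[str]:
    """Split text on whitespace and group into fixed-size word chunks."""
    words = text.split()
    return [
        " ".join(words[i:i + chunking_size])
        for i in range(0, len(words), chunking_size)
    ]

chunks = word_chunks("one two three four five", 2)
# chunks == ["one two", "three four", "five"]
```

Note that the last chunk may be shorter than `chunking_size`, and that nothing stops a chunk boundary from landing mid-sentence.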

LLM Concept Chunking

Type: llm-concept

Uses an LLM to identify semantic concept boundaries in the text.

curl -X POST http://localhost:8081/v1/embed \
  -H "Content-Type: application/json" \
  -d '{
    "text": ["Your long text here..."],
    "config": {
      "chunking_type": "llm-concept",
      "chunking_size": 256
    }
  }'

Characteristics:

  • Semantically coherent chunks
  • Respects topic boundaries
  • Slower than word chunking
  • Requires LLM configuration
  • Falls back to word chunking on failure
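The fallback behavior can be pictured as a simple try/except around the LLM call. This is an illustrative sketch, not EmbedCache's internals; `call_llm` is a hypothetical stub standing in for the configured LLM:

```python
def word_chunks(text: str, size: int) -> list[str]:
    """Plain whitespace chunking, used as the fallback."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def call_llm(text: str) -> list[str]:
    # Hypothetical stub: a real setup would ask the configured LLM
    # for concept boundaries. Here it simulates an unreachable LLM.
    raise ConnectionError("LLM unavailable")

def concept_chunks(text: str, size: int) -> list[str]:
    try:
        return call_llm(text)
    except Exception:
        # Graceful degradation: fall back to word chunking
        return word_chunks(text, size)
```

With the LLM unreachable, `concept_chunks("a b c d", 2)` still returns word-based chunks instead of failing the request.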

LLM Introspection Chunking

Type: llm-introspection

Uses a two-step LLM process: first analyzes document structure, then creates optimized chunks.

curl -X POST http://localhost:8081/v1/embed \
  -H "Content-Type: application/json" \
  -d '{
    "text": ["Your long text here..."],
    "config": {
      "chunking_type": "llm-introspection",
      "chunking_size": 256
    }
  }'

Characteristics:

  • Best semantic quality
  • Document-aware chunking
  • Slowest option (2 LLM calls)
  • Requires LLM configuration
  • Falls back to word chunking on failure
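The two-step flow can be sketched as two sequential calls: one to analyze the document, one to chunk it using that analysis. Both helpers below are hypothetical stand-ins returning canned output, shown only to make the pipeline shape concrete:

```python
def analyze_structure(text: str) -> dict:
    # Step 1 (stub): a real implementation would ask the LLM for an
    # outline of the document's structure.
    return {"sections": ["intro", "body"]}

def chunk_with_outline(text: str, outline: dict, size: int) -> list[str]:
    # Step 2 (stub): a real implementation would pass the outline back
    # to the LLM; here we split on sentence boundaries as a stand-in.
    return [s.strip() + "." for s in text.split(".") if s.strip()]

def introspection_chunks(text: str, size: int) -> list[str]:
    outline = analyze_structure(text)
    return chunk_with_outline(text, outline, size)
```

The cost characteristics follow directly from this shape: two LLM round trips per document instead of one.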

Choosing a Strategy

  Use Case                     Recommended Strategy
  ---------------------------  --------------------
  High throughput processing   words
  Semantic search quality      llm-concept
  Document analysis            llm-introspection
  Limited LLM budget           words
  Best retrieval accuracy      llm-introspection

Chunk Size Guidelines

  Content Type     Recommended Size
  ---------------  ----------------
  Short documents  128-256 words
  Articles         256-512 words
  Long documents   512-1024 words
  Technical docs   256-512 words

Finding Optimal Size

Start with 256-512 words and adjust based on the quality of your search results. Smaller chunks give more precise retrieval; larger chunks give more context.
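When tuning, it helps to know how many chunks a given size produces for a document of a given length. This is pure arithmetic, no server call needed (the helper name is illustrative):

```python
import math

def chunk_count(word_count: int, chunking_size: int) -> int:
    """Number of chunks produced for a document of word_count words."""
    return math.ceil(word_count / chunking_size)

# A 3000-word document at the recommended sizes:
for size in (128, 256, 512, 1024):
    print(size, chunk_count(3000, size))
# 128 -> 24 chunks, 256 -> 12, 512 -> 6, 1024 -> 3
```

More chunks means more embeddings to generate and store, which is the trade-off behind the "Optimize storage" point above.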

Configuring LLM Chunking

To use LLM-based chunking, configure an LLM provider:

# In .env file
LLM_PROVIDER=ollama
LLM_MODEL=llama3
LLM_BASE_URL=http://localhost:11434

See LLM Chunking for detailed setup.

Custom Chunking

You can implement custom chunking strategies. See Custom Chunkers.

Example: Comparing Strategies

import requests

text = """
Machine learning is a subset of artificial intelligence that enables
computers to learn from data. Deep learning, a type of machine learning,
uses neural networks with many layers. Natural language processing (NLP)
allows computers to understand human language.
"""

for strategy in ["words", "llm-concept"]:
    response = requests.post(
        "http://localhost:8081/v1/embed",
        json={
            "text": [text],
            "config": {
                "chunking_type": strategy,
                "chunking_size": 20
            }
        }
    )
    response.raise_for_status()
    # Assumes the response body is a list with one embedding per chunk
    print(f"{strategy}: {len(response.json())} embeddings")