
Architecture Overview

Polymathy is designed as a high-performance, async web service built with Rust.

System Architecture

graph LR
    A[Client] --> B[Polymathy API]
    B --> C[SearxNG]
    B --> D[Content Processor]
    D --> E[Embedding Model]
    B --> F[USearch Index]

Components

API Layer

The HTTP server accepts incoming search requests, coordinates the downstream services, and returns the processed results. It is built with Actix-web for high performance.

Responsibilities:
  • Accept search queries
  • Coordinate with external services
  • Return processed results
  • Serve API documentation
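
As a rough illustration, a search handler built on Actix-web could look like the sketch below. The handler name, query struct, and bind address are placeholders, not the actual implementation.

use actix_web::{get, web, App, HttpServer, Responder};
use serde::Deserialize;

// Query parameters for GET /v1/search (struct name is illustrative).
#[derive(Deserialize)]
struct SearchParams {
    q: String,
}

#[get("/v1/search")]
async fn search(params: web::Query<SearchParams>) -> impl Responder {
    // In the real service this is where SearxNG and the content
    // processor would be coordinated before returning results.
    web::Json(format!("results for: {}", params.q))
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().service(search))
        .bind(("0.0.0.0", 8080))? // bind address is a placeholder
        .run()
        .await
}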

Search Integration

Polymathy uses SearxNG as a meta-search engine to retrieve initial search results.

Flow:
  1. Receive query from client
  2. Forward to SearxNG requesting JSON output (sketched below)
  3. Extract URLs from results (max 10)
  4. Pass URLs to content processor
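
A hedged sketch of this flow using Reqwest (with its json feature) is shown below. The SearxNG base URL is a placeholder; the q and format=json parameters and the results/url fields follow SearxNG's JSON API.

use serde::Deserialize;

// Only the fields we need from SearxNG's JSON response.
#[derive(Deserialize)]
struct SearxResult {
    url: String,
}

#[derive(Deserialize)]
struct SearxResponse {
    results: Vec<SearxResult>,
}

async fn search_urls(client: &reqwest::Client, query: &str) -> Result<Vec<String>, reqwest::Error> {
    let resp: SearxResponse = client
        .get("http://searxng:8080/search") // placeholder SearxNG instance URL
        .query(&[("q", query), ("format", "json")])
        .send()
        .await?
        .json()
        .await?;

    // Keep at most 10 URLs, as described in the flow above.
    Ok(resp.results.into_iter().take(10).map(|r| r.url).collect())
}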

Content Processor

An external service handles the heavy lifting of content processing.

Functions:
  • Fetch web page content
  • Clean and extract text
  • Split into semantic chunks
  • Generate vector embeddings
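
Since the processor is an external service, its exact API is not documented here; the sketch below assumes a hypothetical HTTP endpoint and response shape purely for illustration.

use serde::{Deserialize, Serialize};

// Hypothetical request/response types; the real processor contract may differ.
#[derive(Serialize)]
struct ProcessRequest<'a> {
    url: &'a str,
}

#[derive(Deserialize)]
struct Chunk {
    text: String,
    embedding: Vec<f32>, // 384-dimensional vector (see Vector Index below)
}

async fn process_url(client: &reqwest::Client, url: &str) -> Result<Vec<Chunk>, reqwest::Error> {
    client
        .post("http://processor:3000/process") // placeholder service URL
        .json(&ProcessRequest { url })
        .send()
        .await?
        .json()
        .await
}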

Vector Index

USearch provides fast vector similarity search capabilities.

Configuration (applied in the sketch below):
  • 384 dimensions (AllMiniLML6V2)
  • Inner Product metric
  • F32 quantization
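
The sketch below applies this configuration with the usearch crate; option and function names may differ slightly between crate versions, so treat it as an outline rather than the project's actual setup code.

use usearch::{new_index, IndexOptions, MetricKind, ScalarKind};

fn build_index() -> usearch::Index {
    let mut options = IndexOptions::default();
    options.dimensions = 384;               // AllMiniLML6V2 embedding size
    options.metric = MetricKind::IP;        // Inner Product
    options.quantization = ScalarKind::F32; // F32 storage

    let index = new_index(&options).expect("failed to create USearch index");
    index.reserve(1_000).expect("failed to reserve capacity");

    // Chunk embeddings are then stored with index.add(key, &vector)
    // and queried with index.search(&query_vector, k).
    index
}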

Data Flow

sequenceDiagram
    participant C as Client
    participant P as Polymathy
    participant S as SearxNG
    participant PR as Processor

    C->>P: GET /v1/search?q=query
    P->>S: Search request
    S-->>P: Search results (URLs)
    loop For each URL (max 10)
        P->>PR: Process URL
        PR-->>P: Chunks + Embeddings
    end
    P-->>C: Processed chunks map
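
The "processed chunks map" returned to the client is, roughly, a mapping from each source URL to the chunks and embeddings produced for it. The type below only illustrates that shape; it is not the actual response schema.

use std::collections::HashMap;

// Illustrative only: one entry per processed URL.
struct ProcessedChunk {
    text: String,
    embedding: Vec<f32>, // 384-dimensional embedding
}

type ProcessedChunksMap = HashMap<String, Vec<ProcessedChunk>>;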

Technology Choices

Component       Technology   Reason
Language        Rust         Performance, safety, async support
Web Framework   Actix-web    High performance, mature ecosystem
Async Runtime   Tokio        Industry standard for Rust async
Vector DB       USearch      Fast, lightweight, Rust bindings
HTTP Client     Reqwest      Async-friendly, well-maintained

Concurrency Model

Polymathy processes multiple URLs concurrently using Tokio's async runtime:

// Simplified concurrent processing (runs inside an async context)
use futures::future::join_all;

// Create one future per URL without awaiting any of them yet.
let futures: Vec<_> = urls
    .iter()
    .map(|url| process_url(url))
    .collect();

// Await all futures concurrently; results arrive in the same order as `urls`.
let results = join_all(futures).await;

This allows efficient handling of I/O-bound operations like fetching web content.

Next Steps

  • Modules - Detailed module documentation
  • Data Flow - In-depth data flow analysis