Architecture Overview
Polymathy is designed as a high-performance, async web service built with Rust.
System Architecture
    graph LR
        A[Client] --> B[Polymathy API]
        B --> C[SearxNG]
        B --> D[Content Processor]
        D --> E[Embedding Model]
        B --> F[USearch Index]
Components
API Layer
The HTTP server, built with Actix-web for high performance, accepts incoming requests, coordinates their processing, and returns responses.
Responsibilities:
- Accept search queries
- Coordinate with external services
- Return processed results
- Serve API documentation
Search Integration
Polymathy uses SearxNG as a meta-search engine to retrieve initial search results.
Flow:
1. Receive query from client
2. Forward to SearxNG with JSON format
3. Extract URLs from results (max 10)
4. Pass URLs to content processor
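
The URL-handling steps of this flow can be sketched as follows. This is a minimal illustration, not Polymathy's actual code: `SearchResult`, `searxng_url`, `extract_urls`, and `MAX_URLS` are hypothetical names, and dropping duplicate URLs is an added assumption that the flow above does not state. The `q` and `format=json` query parameters are those of SearxNG's `/search` endpoint.

```rust
use std::collections::HashSet;

/// Upper bound on URLs handed to the content processor (the "max 10" above).
pub const MAX_URLS: usize = 10;

/// Hypothetical, trimmed-down view of one SearxNG result; only `url` is used here.
pub struct SearchResult {
    pub url: String,
}

/// Build the SearxNG request URL. `base` is an assumed instance address and
/// `encoded_query` must already be percent-encoded by the caller.
pub fn searxng_url(base: &str, encoded_query: &str) -> String {
    format!(
        "{}/search?q={}&format=json",
        base.trim_end_matches('/'),
        encoded_query
    )
}

/// Keep the first `MAX_URLS` distinct, non-empty URLs in result order.
/// De-duplication is an assumption of this sketch.
pub fn extract_urls(results: &[SearchResult]) -> Vec<String> {
    let mut seen = HashSet::new();
    results
        .iter()
        .map(|r| r.url.trim())
        .filter(|u| !u.is_empty())
        .filter(|u| seen.insert(u.to_string()))
        .take(MAX_URLS)
        .map(str::to_string)
        .collect()
}
```

Applying the cap after filtering means empty or repeated entries do not use up any of the ten slots.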
Content Processor
The content processor is an external service that does the heavy lifting of content processing for each URL.
Functions:
- Fetch web page content
- Clean and extract text
- Split into semantic chunks
- Generate vector embeddings
Vector Index
USearch provides fast vector similarity search capabilities.
Configuration:
- 384 dimensions (AllMiniLML6V2)
- Inner Product metric
- F32 quantization
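
To make the metric concrete, here is a small sketch of inner-product scoring over `f32` vectors. It is a brute-force illustration and does not use the USearch API, which builds an approximate index rather than scanning every vector. The function names are hypothetical. If the embeddings are unit-normalized (an assumption, not something stated above), the inner product equals cosine similarity.

```rust
/// Embedding width listed in the configuration above.
pub const DIMENSIONS: usize = 384;

/// Inner (dot) product of two equal-length vectors.
pub fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Scale a vector to unit length; a zero vector is returned unchanged.
pub fn normalize(v: &[f32]) -> Vec<f32> {
    let norm = inner_product(v, v).sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Brute-force top-k: score every stored vector against the query and
/// return (index, score) pairs, highest score first.
pub fn top_k(query: &[f32], vectors: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (i, inner_product(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

With this metric a larger score means a closer match, so results are sorted in descending order.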
Data Flow
    sequenceDiagram
        participant C as Client
        participant P as Polymathy
        participant S as SearxNG
        participant PR as Processor
        C->>P: GET /v1/search?q=query
        P->>S: Search request
        S-->>P: Search results (URLs)
        loop For each URL (max 10)
            P->>PR: Process URL
            PR-->>P: Chunks + Embeddings
        end
        P-->>C: Processed chunks map
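The final step, merging per-URL results into one map for the client, might look like the sketch below. The response schema is not specified here, so everything in this block is an assumption: the `Chunk` struct, its field names, and the use of sequential `u64` ids as map keys are illustrative only.

```rust
use std::collections::HashMap;

/// Hypothetical shape of one processed chunk; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub url: String,
    pub text: String,
}

/// Merge per-URL chunk lists into a single map keyed by a sequential id.
/// Ids are assigned in input order, starting at 0.
pub fn build_chunk_map(per_url: Vec<(String, Vec<String>)>) -> HashMap<u64, Chunk> {
    let mut map = HashMap::new();
    let mut next_id: u64 = 0;
    for (url, chunks) in per_url {
        for text in chunks {
            map.insert(next_id, Chunk { url: url.clone(), text });
            next_id += 1;
        }
    }
    map
}
```

Keeping the source URL on every chunk lets a client trace each passage back to the page it came from.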
Technology Choices
| Component | Technology | Reason |
|---|---|---|
| Language | Rust | Performance, safety, async support |
| Web Framework | Actix-web | High performance, mature ecosystem |
| Async Runtime | Tokio | Industry standard for Rust async |
| Vector DB | USearch | Fast, lightweight, Rust bindings |
| HTTP Client | Reqwest | Async-friendly, well-maintained |
Concurrency Model
Polymathy processes multiple URLs concurrently using Tokio's async runtime:
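    use futures::future::join_all; // assumed source of `join_all`

    // Simplified concurrent processing: create one future per URL,
    // then drive them all to completion together.
    let futures: Vec<_> = urls
        .iter()
        .map(|url| process_url(url))
        .collect();
    let results = join_all(futures).await;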
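Because fetching web content is I/O-bound, the requests overlap while each one waits on the network, so total latency is closer to that of the slowest URL than to the sum of all of them.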
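One practical consequence of awaiting all URLs together is that some may fail while others succeed. The sketch below shows one way to separate the two, assuming `process_url` returns a `Result` (not confirmed above) and that results arrive in the same order as the input URLs, as they do with `futures::future::join_all`. `BatchOutcome` and `split_outcomes` are hypothetical names.

```rust
/// Outcome of processing a batch of URLs (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct BatchOutcome<T, E> {
    pub succeeded: Vec<(String, T)>,
    pub failed: Vec<(String, E)>,
}

/// Pair each URL with its result, relying on matching order, and split
/// successes from failures so one bad URL does not sink the whole request.
pub fn split_outcomes<T, E>(urls: &[String], results: Vec<Result<T, E>>) -> BatchOutcome<T, E> {
    let mut outcome = BatchOutcome {
        succeeded: Vec::new(),
        failed: Vec::new(),
    };
    for (url, result) in urls.iter().zip(results) {
        match result {
            Ok(value) => outcome.succeeded.push((url.clone(), value)),
            Err(err) => outcome.failed.push((url.clone(), err)),
        }
    }
    outcome
}
```

The failures can then be logged or reported alongside the successful chunks instead of aborting the search.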