Code Style Guide¶

Coding conventions for Polymathy contributors.

Rust Style¶

General Rules¶

Follow the official Rust Style Guide
Use cargo fmt before every commit
Address all cargo clippy warnings

Formatting¶

Run before committing:

cargo fmt

Linting¶

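Ensure Clippy reports no warnings. The -D warnings flag turns every warning into an error, so the command fails if any remain: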

cargo clippy -- -D warnings

Naming Conventions¶

Functions and Variables¶

Use snake_case:

fn process_search_results() { }
let chunk_counter = 0;

Types and Traits¶

Use PascalCase:

struct SearchQuery { }
trait ContentProcessor { }
enum ProcessingError { }

Constants¶

Use SCREAMING_SNAKE_CASE:

const MAX_URLS: usize = 10;
const DEFAULT_CHUNK_SIZE: u32 = 100;

Modules¶

Use snake_case:

mod search_processing;
mod content_handler;

Documentation¶

Module Documentation¶

Every module should have a doc comment:

//! Search query processing and data structures.
//!
//! This module provides types for handling search queries
//! and processing their results.

Public Items¶

Document all public functions, structs, and traits:

/// Creates a new vector index for similarity search.
///
/// # Returns
///
/// A configured `Index` with 384 dimensions and Inner Product metric.
///
/// # Example
///
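/// ```rust
/// // Doc tests compile as a separate crate, so import the item by its public path.
/// use polymathy::index::create_index;
///
/// let index = create_index();
/// ```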
pub fn create_index() -> Index {
    // ...
}
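
The same pattern applies to structs and traits. The sketch below is illustrative only: the doc comment wording, the process method on ContentProcessor, and the LineProcessor type are assumptions made for the example, not the project's actual definitions.

```rust
/// A search request submitted by a user.
pub struct SearchQuery {
    /// The raw query string.
    pub q: String,
}

/// Converts raw page text into chunks ready for embedding.
pub trait ContentProcessor {
    /// Splits `text` into chunks.
    ///
    /// # Returns
    ///
    /// One `String` per chunk, in document order.
    fn process(&self, text: &str) -> Vec<String>;
}

/// A `ContentProcessor` that treats each non-empty line as one chunk.
pub struct LineProcessor;

impl ContentProcessor for LineProcessor {
    fn process(&self, text: &str) -> Vec<String> {
        text.lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(str::to_string)
            .collect()
    }
}
```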

Struct Fields¶

Document non-obvious fields:

use std::collections::HashMap;

/// Processed content from a URL.
pub struct ProcessedContent {
    /// The source URL of the content.
    pub url: String,

    /// Map of chunk IDs to chunk text.
    pub chunks: HashMap<String, String>,

    /// Map of chunk IDs to their embedding vectors (384 dimensions).
    pub embeddings: HashMap<String, Vec<f32>>,
}

Error Handling¶

Use Result and Option¶

// Prefer Result for operations that can fail
fn process_url(url: &str) -> Result<ProcessedContent> { }

// Use Option for optional values
fn get_embedding(id: &str) -> Option<Vec<f32>> { }
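
A minimal, std-only sketch of the distinction. The function names parse_chunk_size and find_chunk are hypothetical and exist only to illustrate when each return type fits.

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

/// Parsing can fail, so return a Result and let the caller decide what to do.
fn parse_chunk_size(raw: &str) -> Result<u32, ParseIntError> {
    raw.trim().parse::<u32>()
}

/// A missing key is not an error, so return an Option.
fn find_chunk<'a>(chunks: &'a HashMap<String, String>, id: &str) -> Option<&'a str> {
    chunks.get(id).map(String::as_str)
}
```

A rough rule of thumb: if the caller would want to know why there is no value, use Result; if absence is a normal outcome, use Option.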

Use anyhow for Application Code¶

use anyhow::{Context, Result};

fn fetch_content(url: &str) -> Result<String> {
    let response = reqwest::blocking::get(url)
        .context("Failed to fetch URL")?;

    response.text()
        .context("Failed to read response body")
}

Don't Panic in Library Code¶

// Bad
fn get_item(&self, index: usize) -> &Item {
    &self.items[index]  // Panics if out of bounds
}

// Good
fn get_item(&self, index: usize) -> Option<&Item> {
    self.items.get(index)
}
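
For context, here is a self-contained sketch of the non-panicking version together with a caller. The Item and Inventory types and the item_name_or_default helper are hypothetical stand-ins; the point is that the caller, not the library, decides how to handle a missing item.

```rust
struct Item {
    name: String,
}

struct Inventory {
    items: Vec<Item>,
}

impl Inventory {
    /// Returns None instead of panicking when `index` is out of bounds.
    fn get_item(&self, index: usize) -> Option<&Item> {
        self.items.get(index)
    }
}

/// The caller chooses the fallback; here a missing item becomes a placeholder.
fn item_name_or_default(inventory: &Inventory, index: usize) -> &str {
    inventory
        .get_item(index)
        .map(|item| item.name.as_str())
        .unwrap_or("<missing>")
}
```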

Async Code¶

Use async/await¶

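async fn process_url(url: &str) -> Result<ProcessedContent> {
    // reqwest::get builds a one-off client; reuse a shared reqwest::Client
    // when making many requests.
    let response = reqwest::get(url).await?;
    // Requires ProcessedContent to implement serde::Deserialize.
    let content = response.json::<ProcessedContent>().await?;
    Ok(content)
}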

Prefer join_all for Concurrent Operations¶

use futures::future::join_all;

let futures: Vec<_> = urls
    .iter()
    .map(|url| process_url(url))
    .collect();

let results = join_all(futures).await;
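
join_all resolves to a Vec with one output per future, in the same order as the input, so results above holds one Result per URL. One failed URL need not discard the rest. The sketch below shows one way to separate the two groups using only the standard library; split_results is a hypothetical helper, not part of the futures crate or of Polymathy.

```rust
/// Splits a batch of results into successes and failures, keeping input
/// order within each group.
fn split_results<T, E>(results: Vec<Result<T, E>>) -> (Vec<T>, Vec<E>) {
    let mut successes = Vec::new();
    let mut failures = Vec::new();
    for result in results {
        match result {
            Ok(value) => successes.push(value),
            Err(error) => failures.push(error),
        }
    }
    (successes, failures)
}
```

The failures can then be logged or retried while the successes continue through the pipeline.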

Code Organization¶

Import Order¶

Standard library
External crates
Local modules

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use actix_web::{web, HttpResponse};
use serde::{Deserialize, Serialize};

use crate::index::create_index;
use crate::search::SearchQuery;

Module Structure¶

Keep modules focused and cohesive:

// Good: search.rs contains search-related items
pub struct SearchQuery { }
pub struct ProcessedContent { }

// Bad: mixing unrelated concerns
pub struct SearchQuery { }
pub struct DatabaseConnection { }  // Should be in a different module

Testing¶

Test Module Placement¶

Place unit tests in the same file as the code they test:

pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(2, 2), 4);
    }
}

Integration Tests¶

Place integration tests in the tests/ directory, where they can use only the crate's public API:

// tests/integration_tests.rs
use polymathy::search::SearchQuery;

#[test]
fn test_search_query_creation() {
    let query = SearchQuery { q: "test".to_string() };
    assert_eq!(query.q, "test");
}

Test Naming¶

Use descriptive names:

#[test]
fn test_process_url_returns_error_for_invalid_url() { }

#[test]
fn test_create_index_with_correct_dimensions() { }
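
A descriptive name states the function under test, the scenario, and the expected outcome, so a failing test is understandable from its name alone. A small sketch, assuming a hypothetical parse_max_urls helper that exists only to give the tests something to exercise:

```rust
/// Hypothetical helper: parses a positive URL limit, rejecting zero and
/// non-numeric input.
pub fn parse_max_urls(raw: &str) -> Option<usize> {
    raw.trim().parse::<usize>().ok().filter(|&count| count > 0)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_max_urls_accepts_positive_number() {
        assert_eq!(parse_max_urls("10"), Some(10));
    }

    #[test]
    fn test_parse_max_urls_returns_none_for_zero() {
        assert_eq!(parse_max_urls("0"), None);
    }

    #[test]
    fn test_parse_max_urls_returns_none_for_non_numeric_input() {
        assert_eq!(parse_max_urls("ten"), None);
    }
}
```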

Comments¶

When to Comment¶

Explain why, not what
Document non-obvious behavior
Add context for complex algorithms

// Good: explains why
// Skip URLs that returned server errors to avoid poisoning the index
// with error page content
if response.status().is_server_error() {
    continue;
}

// Bad: states the obvious
// Check if status is server error
if response.status().is_server_error() {
    continue;
}

TODO Comments¶

Use TODO comments to mark future improvements, and reference the tracking issue when one exists:

// TODO: Add retry logic for transient failures
// TODO(#123): Implement caching for repeated queries