Code Style Guide

Coding conventions for Polymathy contributors.

Rust Style

General Rules

  • Follow the official Rust Style Guide
  • Use cargo fmt before every commit
  • Address all cargo clippy warnings

Formatting

Run before committing:

cargo fmt

Linting

Treat every Clippy warning as an error so none slip through:

cargo clippy -- -D warnings

Naming Conventions

Functions and Variables

Use snake_case:

fn process_search_results() { }
let chunk_counter = 0;

Types and Traits

Use PascalCase:

struct SearchQuery { }
trait ContentProcessor { }
enum ProcessingError { }

Constants

Use SCREAMING_SNAKE_CASE:

const MAX_URLS: usize = 10;
const DEFAULT_CHUNK_SIZE: u32 = 100;

Modules

Use snake_case:

mod search_processing;
mod content_handler;

Documentation

Module Documentation

Every module should have a doc comment:

//! Search query processing and data structures.
//!
//! This module provides types for handling search queries
//! and processing their results.

Public Items

Document all public functions, structs, and traits:

/// Creates a new vector index for similarity search.
///
/// # Returns
///
/// A configured `Index` with 384 dimensions and Inner Product metric.
///
/// # Example
///
/// ```rust
/// let index = create_index();
/// ```
pub fn create_index() -> Index {
    // ...
}

Struct Fields

Document non-obvious fields:

/// Processed content from a URL.
pub struct ProcessedContent {
    /// The source URL of the content.
    pub url: String,

    /// Map of chunk IDs to chunk text.
    pub chunks: HashMap<String, String>,

    /// Map of chunk IDs to their embedding vectors (384 dimensions).
    pub embeddings: HashMap<String, Vec<f32>>,
}

Error Handling

Use Result and Option

// Prefer Result for operations that can fail
// (the one-parameter `Result<T>` here is the anyhow::Result alias)
fn process_url(url: &str) -> Result<ProcessedContent> { }

// Use Option for optional values
fn get_embedding(id: &str) -> Option<Vec<f32>> { }
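
As a minimal, std-only sketch of this split: the fallible parser below returns `Result`, while the lookup returns `Option` because a missing entry is a normal outcome rather than an error. The function names `parse_chunk_size` and `find_embedding` and the `InvalidChunkSize` variant are hypothetical, chosen only for illustration. `ProcessingError` borrows the enum name used under Naming Conventions, but its contents here are assumed.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type for this sketch; the variant is an assumption.
#[derive(Debug, PartialEq)]
pub enum ProcessingError {
    InvalidChunkSize(String),
}

impl fmt::Display for ProcessingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProcessingError::InvalidChunkSize(raw) => {
                write!(f, "invalid chunk size: {raw:?}")
            }
        }
    }
}

impl std::error::Error for ProcessingError {}

/// Parsing can fail, so the caller gets a `Result`.
/// Zero is rejected as well as non-numeric input.
pub fn parse_chunk_size(raw: &str) -> Result<u32, ProcessingError> {
    match raw.trim().parse::<u32>() {
        Ok(size) if size > 0 => Ok(size),
        _ => Err(ProcessingError::InvalidChunkSize(raw.to_string())),
    }
}

/// A missing embedding is not an error, so the caller gets an `Option`.
pub fn find_embedding<'a>(
    embeddings: &'a HashMap<String, Vec<f32>>,
    id: &str,
) -> Option<&'a [f32]> {
    embeddings.get(id).map(Vec::as_slice)
}
```

Callers can propagate the `Result` with `?` and handle the `Option` with `match`, `if let`, or a combinator such as `unwrap_or_default`.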

Use anyhow for Application Code

use anyhow::{Context, Result};

fn fetch_content(url: &str) -> Result<String> {
    let response = reqwest::blocking::get(url)
        .context("Failed to fetch URL")?;

    response.text()
        .context("Failed to read response body")
}

Don't Panic in Library Code

// Bad
fn get_item(&self, index: usize) -> &Item {
    &self.items[index]  // Panics if out of bounds
}

// Good
fn get_item(&self, index: usize) -> Option<&Item> {
    self.items.get(index)
}
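
The fragments above omit the surrounding type. The self-contained sketch below assumes a hypothetical `ItemStore` container and `Item` struct (neither appears in the original). It shows how an `Option`-returning accessor leaves the out-of-bounds decision to the caller:

```rust
/// Hypothetical item type for this sketch.
#[derive(Debug, PartialEq)]
pub struct Item {
    pub name: String,
}

/// Hypothetical container; only the accessor pattern matters here.
pub struct ItemStore {
    items: Vec<Item>,
}

impl ItemStore {
    pub fn new(items: Vec<Item>) -> Self {
        Self { items }
    }

    /// Returns `None` instead of panicking when `index` is out of bounds.
    pub fn get_item(&self, index: usize) -> Option<&Item> {
        self.items.get(index)
    }
}

/// The caller picks the fallback; the library code never aborts the process.
pub fn item_name_or_default(store: &ItemStore, index: usize) -> &str {
    store
        .get_item(index)
        .map(|item| item.name.as_str())
        .unwrap_or("<missing>")
}
```

Application code that truly cannot continue without the item can still call `expect` at its own boundary, where the panic message can describe the broken invariant.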

Async Code

Use async/await

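// Sketch: `client` stands for a shared async HTTP client (e.g. a
// `reqwest::Client`) that is not defined in this snippet.
async fn process_url(url: &str) -> Result<ProcessedContent> {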
    let response = client.get(url).send().await?;
    let content = response.json().await?;
    Ok(content)
}

Prefer join_all for Concurrent Operations

use futures::future::join_all;

let futures: Vec<_> = urls
    .iter()
    .map(|url| process_url(url))
    .collect();

let results = join_all(futures).await;

Code Organization

Import Order

  1. Standard library
  2. External crates
  3. Local modules

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use actix_web::{web, HttpResponse};
use serde::{Deserialize, Serialize};

use crate::search::SearchQuery;
use crate::index::create_index;

Module Structure

Keep modules focused and cohesive:

// Good: search.rs contains search-related items
pub struct SearchQuery { }
pub struct ProcessedContent { }

// Bad: mixing unrelated concerns
pub struct SearchQuery { }
pub struct DatabaseConnection { }  // Should be in a different module

Testing

Test Module Placement

Place unit tests in the same file as the code they test:

pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(2, 2), 4);
    }
}

Integration Tests

Place integration tests in the tests/ directory:

// tests/integration_tests.rs
use polymathy::search::SearchQuery;

#[test]
fn test_search_query_creation() {
    let query = SearchQuery { q: "test".to_string() };
    assert_eq!(query.q, "test");
}

Test Naming

Use descriptive names:

#[test]
fn test_process_url_returns_error_for_invalid_url() { }

#[test]
fn test_create_index_with_correct_dimensions() { }

Comments

When to Comment

  • Explain why, not what
  • Document non-obvious behavior
  • Add context for complex algorithms

// Good: explains why
// Skip URLs that returned server errors to avoid poisoning the index
// with error page content
if response.status().is_server_error() {
    continue;
}

// Bad: states the obvious
// Check if status is server error
if response.status().is_server_error() {
    continue;
}

TODO Comments

Use TODO comments to mark future improvements, optionally tagged with an issue number:

// TODO: Add retry logic for transient failures
// TODO(#123): Implement caching for repeated queries