CLI Tools¶

Reference for the numaperf-bench command-line tool.

Installation¶

cargo install numaperf-bench

# Or build from source
cargo build --release -p numaperf-bench

Commands¶

numaperf-bench¶

Main entry point for NUMA benchmarking and system information.

numaperf-bench [COMMAND]

If no command is specified, defaults to info.

info¶

Display system topology, capabilities, and current state.

numaperf-bench info [SECTION] [OPTIONS]

Sections¶

Section	Description
`all`	Show all information (default)
`topology`	NUMA topology (nodes, CPUs, memory)
`capabilities`	System capabilities for hard mode
`affinity`	Current thread CPU affinity
`distances`	NUMA node distance matrix

Options¶

Option	Description
`-f, --format <FORMAT>`	Output format: `text` or `json` (default: text)
`-v, --verbose`	Verbose output with recommendations

Examples¶

# Show all system info
numaperf-bench info

# Show only topology
numaperf-bench info topology

# Show capabilities in JSON format
numaperf-bench info capabilities --format json

# Verbose output with recommendations
numaperf-bench info --verbose

Sample Output¶

=== numaperf System Information ===

Topology
--------
NUMA Nodes: 2

  Node 0:
    CPUs: 0-7 (8 total)

  Node 1:
    CPUs: 8-15 (8 total)

Capabilities
------------
Hard mode supported: yes
  Strict memory binding: yes
  Strict CPU affinity: yes
  Memory locking: yes
  NUMA balancing disabled: yes

Current Affinity
----------------
Allowed CPUs: 0-15 (16 total)
Currently on CPU: 5

Distance Matrix
---------------
       Node 0  Node 1
Node 0     10      20
Node 1     20      10

bench¶

Run NUMA locality and performance benchmarks.

numaperf-bench bench [OPTIONS]

Options¶

Option	Description
`-c, --category <CATEGORY>`	Benchmark category (default: all)
`-f, --format <FORMAT>`	Output format: `text` or `json` (default: text)
`-i, --iterations <N>`	Number of iterations (default: 1000)
`-t, --threads <N>`	Number of threads (default: all CPUs)

Categories¶

Category	Description
`all`	Run all benchmarks
`sharded`	Per-node sharded data structures
`memory`	NUMA-aware memory allocation
`scheduler`	Executor and work distribution
`affinity`	Thread pinning operations

Examples¶

# Run all benchmarks
numaperf-bench bench

# Run only memory benchmarks
numaperf-bench bench --category memory

# JSON output for CI/CD
numaperf-bench bench --format json

# Custom iterations
numaperf-bench bench --iterations 10000

# Limit thread count
numaperf-bench bench --threads 8

Sample Output¶

=== numaperf Benchmark Suite ===

System: 2 NUMA nodes, 16 CPUs
Hard mode supported: true

Running sharded benchmarks...
  sharded_counter_increment     : 45,678,901 ops/sec
  sharded_counter_sum           :  5,432,109 ops/sec
  numa_sharded_local_access     : 23,456,789 ops/sec

Running memory benchmarks...
  numa_region_alloc_1mb         :      1,234 ops/sec
  numa_region_alloc_bind        :      1,198 ops/sec
  numa_region_write_throughput  :  8,765,432 MB/sec

Running scheduler benchmarks...
  executor_submit_local         :  1,234,567 ops/sec
  executor_submit_remote        :    987,654 ops/sec (locality: 55.6%)
  work_distribution             :  2,345,678 ops/sec (locality: 92.3%)

Running affinity benchmarks...
  scoped_pin_create             :    345,678 ops/sec
  get_affinity                  :  5,678,901 ops/sec

=== Summary ===
Overall locality ratio: 89.2%
Health: Good

Recommendations:
  - Consider using LocalOnly steal policy for latency-sensitive workloads

JSON Output Format¶

{
  "system": {
    "numa_nodes": 2,
    "cpus": 16,
    "hard_mode_supported": true
  },
  "benchmarks": [
    {
      "name": "sharded_counter_increment",
      "ops_per_sec": 45678901,
      "duration_ns": 21893,
      "operations": 1000,
      "locality_ratio": null
    },
    {
      "name": "work_distribution",
      "ops_per_sec": 2345678,
      "duration_ns": 426331,
      "operations": 1000,
      "locality_ratio": 0.923
    }
  ],
  "locality_health": {
    "ratio": 0.892,
    "status": "good"
  }
}

Criterion Benchmarks¶

For detailed benchmarking with statistical analysis, use Criterion:

# Run all Criterion benchmarks
cargo bench -p numaperf-bench

# Run specific benchmark group
cargo bench -p numaperf-bench -- sharded
cargo bench -p numaperf-bench -- memory
cargo bench -p numaperf-bench -- scheduler
cargo bench -p numaperf-bench -- affinity

# Generate HTML report
cargo bench -p numaperf-bench -- --plotting-backend gnuplot

Criterion reports are saved to target/criterion/.

Exit Codes¶

Code	Meaning
0	Success
1	Runtime error (topology discovery failed, etc.)
2	Invalid arguments

Environment Variables¶

Variable	Description
`RUST_LOG`	Log level (e.g., `numaperf=debug`)
`NO_COLOR`	Disable colored output

CLI Tools¶

Installation¶

Commands¶

numaperf-bench¶

info¶

Sections¶

Options¶

Examples¶

Sample Output¶

bench¶

Options¶

Categories¶

Examples¶

Sample Output¶

JSON Output Format¶

Criterion Benchmarks¶

Exit Codes¶

Environment Variables¶

See Also¶