Skip to content

Profiling Guide

This guide documents how to profile mrrc performance using various tools available in the Rust ecosystem.

Prerequisites

cargo-flamegraph

Install flamegraph for CPU profiling:

cargo install flamegraph

macOS Requirements: - Full Xcode (not just Command Line Tools) - required for xctrace - If you only have Command Line Tools, flamegraph will fail to sample

Linux Requirements: - perf package (usually included with build tools) - perl for flamegraph script processing

CPU Profiling with Flamegraph

Single-threaded Benchmark

Profile the profiling_harness benchmark (single-threaded file reading):

cargo flamegraph --bench profiling_harness -o flamegraphs/profiling_harness.svg

This produces an interactive SVG showing: - Time spent in each function - Call stack depth - Percentage of total runtime per function

Concurrent Benchmark

Profile the parallel_benchmarks benchmark (rayon-based concurrent reading):

cargo flamegraph --bench parallel_benchmarks -o flamegraphs/parallel_benchmarks.svg

Detailed Profiling

Profile the detailed_profiling benchmark with memory and phase analysis:

cargo flamegraph --bench detailed_profiling -o flamegraphs/detailed_profiling.svg

Custom Flamegraph Options

Generate flamegraph with full call stacks and all events:

cargo flamegraph --bench profiling_harness -- --verbose \
  -o flamegraphs/profiling_harness_verbose.svg

Frequency (sampling rate in Hz, default 99):

cargo flamegraph --bench profiling_harness -- -F 1000 \
  -o flamegraphs/profiling_harness_1khz.svg

Interpreting Flamegraphs

  1. Width = time spent in function (wider = more time)
  2. Height = call stack depth (taller = deeper nesting)
  3. Color = random (for visual distinction)
  4. X-axis = execution order
  5. Y-axis = call stack

Common Patterns to Look For

  • Flat top = pure computation or tight loop
  • Sawtooth pattern = recursive calls or repeated function sequences
  • Thin slices = low-overhead functions called many times
  • Wide base = time spent in main execution path

Quick Profiling Checklist

For performance optimization work:

# Build debug symbols (required for flamegraph on release builds)
# Already configured in Cargo.toml [profile.bench] section

# Profile single-threaded baseline
cargo flamegraph --bench profiling_harness -o /tmp/single.svg

# Profile concurrent implementation
cargo flamegraph --bench parallel_benchmarks -o /tmp/concurrent.svg

# Compare outputs in browser:
# - Wider functions = optimization opportunities
# - Identify bottlenecks not visible in benchmarks

Integration with CI/CD

Flamegraphs can be: - Committed to git (flamegraphs/ directory) - Uploaded to artifact storage - Compared across releases for regression detection

Troubleshooting

macOS: "tool 'xctrace' requires Xcode"

  • Install full Xcode: xcode-select --install (installs Command Line Tools)
  • Full Xcode required: Download from App Store or developer.apple.com

Linux: "perf not found"

sudo apt-get install linux-tools  # Debian/Ubuntu
sudo yum install perf              # RHEL/Fedora

Flamegraph output empty

  • Ensure binary is running long enough to collect samples
  • Increase bench duration or iterations
  • Use -F flag to increase sampling frequency