Profiling Guide¶
This guide documents how to profile mrrc performance using various tools available in the Rust ecosystem.
Prerequisites¶
cargo-flamegraph¶
Install flamegraph for CPU profiling:
macOS Requirements:
- Full Xcode (not just Command Line Tools) - required for xctrace
- If you only have Command Line Tools, flamegraph will fail to sample
Linux Requirements:
- perf package (usually included with build tools)
- perl for flamegraph script processing
CPU Profiling with Flamegraph¶
Single-threaded Benchmark¶
Profile the profiling_harness benchmark (single-threaded file reading):
This produces an interactive SVG showing: - Time spent in each function - Call stack depth - Percentage of total runtime per function
Concurrent Benchmark¶
Profile the parallel_benchmarks benchmark (rayon-based concurrent reading):
Detailed Profiling¶
Profile the detailed_profiling benchmark with memory and phase analysis:
Custom Flamegraph Options¶
Generate flamegraph with full call stacks and all events:
cargo flamegraph --bench profiling_harness -- --verbose \
-o flamegraphs/profiling_harness_verbose.svg
Frequency (sampling rate in Hz, default 99):
Interpreting Flamegraphs¶
- Width = time spent in function (wider = more time)
- Height = call stack depth (taller = deeper nesting)
- Color = random (for visual distinction)
- X-axis = execution order
- Y-axis = call stack
Common Patterns to Look For¶
- Flat top = pure computation or tight loop
- Sawtooth pattern = recursive calls or repeated function sequences
- Thin slices = low-overhead functions called many times
- Wide base = time spent in main execution path
Quick Profiling Checklist¶
For performance optimization work:
# Build debug symbols (required for flamegraph on release builds)
# Already configured in Cargo.toml [profile.bench] section
# Profile single-threaded baseline
cargo flamegraph --bench profiling_harness -o /tmp/single.svg
# Profile concurrent implementation
cargo flamegraph --bench parallel_benchmarks -o /tmp/concurrent.svg
# Compare outputs in browser:
# - Wider functions = optimization opportunities
# - Identify bottlenecks not visible in benchmarks
Integration with CI/CD¶
Flamegraphs can be:
- Committed to git (flamegraphs/ directory)
- Uploaded to artifact storage
- Compared across releases for regression detection
Troubleshooting¶
macOS: "tool 'xctrace' requires Xcode"¶
- Install full Xcode:
xcode-select --install(installs Command Line Tools) - Full Xcode required: Download from App Store or developer.apple.com
Linux: "perf not found"¶
Flamegraph output empty¶
- Ensure binary is running long enough to collect samples
- Increase bench duration or iterations
- Use
-Fflag to increase sampling frequency