Benchmark
This page describes a repeatable way to benchmark Nestipy adapters (FastAPI vs BlackSheep) and track changes over time. The goal is to keep the setup simple, deterministic, and close to how Nestipy is used in real projects while still isolating adapter performance.
What This Measures
The benchmark focuses on:
- End-to-end request latency (p50/p95/p99).
- Throughput (requests per second).
- Error rate.
It runs a small Nestipy app with three endpoints:
- `/` for a fast, mostly empty handler.
- `/cpu` for a short CPU-bound loop.
- `/io` for a short async sleep.
This gives you a mix of fast-path, CPU-bound, and async I/O behavior.
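The handlers behind these routes are, roughly, the following. This is a minimal sketch of the workload only: the Nestipy routing and controller wiring is omitted, and the `ms` parameters are assumptions mirroring the defaults used later on this page.

```python
# Minimal sketch of the three workloads; framework wiring is omitted.
import asyncio
import time


async def root() -> dict:
    # "/": fast path, almost no work.
    return {"ok": True}


async def cpu(ms: int = 2) -> dict:
    # "/cpu": busy-loop for roughly `ms` milliseconds of CPU-bound work.
    deadline = time.perf_counter() + ms / 1000
    while time.perf_counter() < deadline:
        pass
    return {"ok": True, "ms": ms}


async def io(ms: int = 5) -> dict:
    # "/io": non-blocking sleep for `ms` milliseconds to simulate async I/O.
    await asyncio.sleep(ms / 1000)
    return {"ok": True, "ms": ms}
```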
Goals
- Same machine, same Python, same dependencies.
- Same endpoints and load profile.
- Multiple trials with warmup.
- Compare median p50/p95/p99 + RPS + failures.
Requirements
- `uv` or `python3`
- `granian` (used by `bench.py serve`)
If you already have servers running, you can skip `granian` and use `--no-start`.
Quick Start
There are two modes:
- Manual: start one server, benchmark it.
- Comparison runner: start both servers, warmup, run multiple trials, save summary JSON.
1) Run a single server
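The exact serve command depends on your local `bench.py`; a typical invocation, assuming a port flag in addition to the `--is-bs` switch mentioned below, looks like this:

```bash
# Only --is-bs is documented on this page; the other flags are assumptions,
# so check the script's help output for the real names.
uv run python bench.py serve --port 8000                # FastAPI adapter
uv run python bench.py serve --port 8001 --is-bs True   # BlackSheep adapter
```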
Then benchmark it:
```bash
uv run python bench.py bench --url http://127.0.0.1:8000/ -c 200 -d 20 --out bench_out/fastapi_root
uv run python bench.py bench --url "http://127.0.0.1:8000/cpu?ms=2" -c 200 -d 20 --out bench_out/fastapi_cpu
uv run python bench.py bench --url "http://127.0.0.1:8000/io?ms=5" -c 200 -d 20 --out bench_out/fastapi_io
```
Repeat for BlackSheep by switching to `--is-bs True` and using another port.
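For example, assuming the BlackSheep server listens on port 8001:

```bash
uv run python bench.py bench --url http://127.0.0.1:8001/ -c 200 -d 20 --out bench_out/blacksheep_root
```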
2) Automated comparison
Use the comparison runner to do warmup + multiple trials and save a JSON report.
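A typical invocation of the runner might look like the following. Apart from `--endpoint` and `--no-start` (documented below), the flag names here are assumptions; adjust them to the actual `bench_compare.py` interface.

```bash
# --trials and --out are hypothetical flag names used for illustration.
uv run python bench_compare.py --trials 5 --out bench_out/bench_compare.json
```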
A convenience script that wraps the comparison runner is also available.
By default, these endpoints are benchmarked:
- `/`
- `/cpu?ms=2`
- `/io?ms=5`
Override endpoints with --endpoint (repeatable):
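For example, to benchmark only the root and I/O endpoints (assuming `bench_compare.py` is invoked directly, as in the sketch above):

```bash
uv run python bench_compare.py --endpoint / --endpoint "/io?ms=5"
```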
If you already started servers yourself, add --no-start and set ports.
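A sketch of that case; the port flag names are hypothetical and only illustrate the idea:

```bash
# --no-start is documented above; the port flags below are hypothetical names.
uv run python bench_compare.py --no-start --fastapi-port 8000 --bs-port 8001
```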
Output
The comparison runner writes:
- `bench_out/bench_compare.json` with median stats across trials (a quick way to inspect it is sketched below).
The single benchmark run (`bench.py bench`) writes charts:
- `rps.png`: requests per second over time
- `response_times.png`: median and p95 latency over time
- `users.png`: concurrent users over time
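To eyeball the JSON report, a small script like the one below works. The report layout (adapter name mapped to a metric dict) and the key names are assumptions; adjust them to what `bench_compare.py` actually writes.

```python
# Quick side-by-side view of the comparison report.
# The structure and key names below are assumptions about the JSON layout.
import json

with open("bench_out/bench_compare.json") as f:
    report = json.load(f)

for adapter, stats in report.items():
    picked = {key: stats.get(key) for key in ("rps", "p50", "p95", "p99", "failures")}
    print(adapter, picked)
```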
How To Read The Results
- Prefer the median across multiple trials instead of a single run (see the sketch after this list).
- Look at p95 and p99 for tail latency, not just the average.
- Compare failures as well as speed. Fast but error-prone is not a win.
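"Median across multiple trials" just means taking the per-metric median of the trial summaries; the numbers below are placeholders, not measured results.

```python
# Placeholder trial summaries; real values come from your own runs.
from statistics import median

trials = [
    {"rps": 10000, "p50": 10.0, "p95": 30.0, "p99": 55.0, "failures": 0},
    {"rps": 9500, "p50": 11.0, "p95": 33.0, "p99": 60.0, "failures": 0},
    {"rps": 9800, "p50": 10.5, "p95": 31.0, "p99": 58.0, "failures": 0},
]

# Aggregate by taking the median of each metric across trials.
summary = {metric: median(t[metric] for t in trials) for metric in trials[0]}
print(summary)
```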
Reproducibility Tips
- Always run benchmarks on an idle machine.
- Pin the same Python version, dependencies, and OS (one way to record these is sketched after this list).
- Keep the same concurrency and duration between runs.
- Keep the same CPU governor and power profile when possible.
- Restart servers between large test changes to avoid warm caches.
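One lightweight way to record the environment is to write it next to each report. This is a sketch; adapt the commands and paths to your setup.

```bash
# Capture the interpreter, dependencies, and kernel alongside the results.
python -V > bench_out/env.txt
uv pip freeze >> bench_out/env.txt
uname -a >> bench_out/env.txt
```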
Common Pitfalls
- Comparing one-off runs. Always use multiple trials.
- Changing concurrency or duration mid-comparison.
- Running other CPU-heavy tasks in parallel.
- Comparing results from different hardware.
Extending The Benchmark
If you add endpoints to your app:
- Update `bench.py` with the new handlers.
- Pass `--endpoint` to `bench_compare.py` to include them.
Example:
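Assuming you added a hypothetical `/db` handler to `bench.py` (the endpoint name is only for illustration):

```bash
uv run python bench_compare.py --endpoint / --endpoint "/cpu?ms=2" --endpoint "/io?ms=5" --endpoint "/db"
```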