Apibench
Benchmark APIs
I like to work with FastAPI. “FastAPI is a modern, fast (high-performance), web framework for building APIs with Python based on standard Python type hints.” I also know that scripting languages like Python are way slower than compiled languages like C, C++, Rust, …
So why not build a little “hello world” API, run it on localhost, and then do a benchmark …
Rust
Let’s start with Rust.
Code
src/main.rs
use actix_web::{get, post, web, App, HttpResponse, HttpServer, Responder};

// GET /: the endpoint the benchmark below hits
#[get("/")]
async fn hello() -> impl Responder {
    HttpResponse::Ok().body("Hello world!")
}

// POST /echo: returns the request body unchanged
#[post("/echo")]
async fn echo(req_body: String) -> impl Responder {
    HttpResponse::Ok().body(req_body)
}

// a handler registered manually via route() instead of a macro
async fn manual_hello() -> impl Responder {
    HttpResponse::Ok().body("Hey there!")
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .service(hello)
            .service(echo)
            .route("/hey", web::get().to(manual_hello))
    })
    .bind(("127.0.0.1", 8080))?
    .run()
    .await
}
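The code assumes actix-web 4, so Cargo.toml needs something like:

[dependencies]
actix-web = "4"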
run it
cargo run
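Note that cargo run produces an unoptimized debug build by default. For a benchmark you would normally compile in release mode, which is typically much faster:

cargo run --release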
benchmark it
localhost % wrk http://localhost:8080 -d 10 -t 1 -c 200
Running 10s test @ http://localhost:8080
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   653.52us  162.23us   7.01ms   86.31%
    Req/Sec   164.30k     7.16k  177.05k    78.00%
  1634817 requests in 10.01s, 137.20MB read
  Socket errors: connect 0, read 20, write 0, timeout 0
Requests/sec: 163341.96
Transfer/sec:     13.71MB
More than 160k requests/s, I’m impressed.
But what about Python?
Python / FastAPI
Code
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def read_root():
    return {"message": "Hello, World!"}
run it
poetry run uvicorn main:app --port 8080
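This assumes the code above lives in main.py (main:app points at the app object in the main module) and that FastAPI and Uvicorn are project dependencies, added with something like:

poetry add fastapi uvicorn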
benchmark it
localhost % wrk http://localhost:8080 -d 10 -t 1 -c 200
Running 10s test @ http://localhost:8080
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    21.24ms    1.03ms  38.16ms   84.93%
    Req/Sec     9.45k    284.31   10.05k    78.00%
  94042 requests in 10.01s, 13.63MB read
  Socket errors: connect 0, read 5, write 0, timeout 0
Requests/sec:   9395.51
Transfer/sec:      1.36MB
About 9.4k requests/s.
Let’s tune that and disable the console output.
run it
poetry run uvicorn main:app --port 8080 --log-level warning
benchmark it
localhost % wrk http://localhost:8080 -d 10 -t 1 -c 200
Running 10s test @ http://localhost:8080
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    16.04ms    0.90ms  38.94ms   87.60%
    Req/Sec    12.52k    344.90   13.51k    84.00%
  124637 requests in 10.01s, 18.07MB read
Requests/sec:  12452.41
Transfer/sec:      1.81MB
12.4k requests/s.
Better, but still way worse than Rust. I did some research and found further optimisations.
run it
poetry run uvicorn main:app --port 8080 --log-level warning --workers 4 --loop uvloop
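Two things changed here: --workers 4 spawns four worker processes, so requests are no longer handled by a single Python process, and --loop uvloop replaces asyncio’s default event loop with uvloop, a faster implementation written in Cython. uvloop ships as a separate package and has to be installed first:

poetry add uvloop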
benchmark it
localhost % wrk http://localhost:8080 -d 10 -t 1 -c 200
Running 10s test @ http://localhost:8080
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.24ms    1.05ms  44.19ms   73.91%
    Req/Sec    61.47k     1.42k   63.29k    86.00%
  611696 requests in 10.01s, 88.67MB read
Requests/sec:  61122.86
Transfer/sec:      8.86MB
Five times faster! What about Gunicorn?
run it
gunicorn -b localhost:8080 -w 4 -k uvicorn.workers.UvicornWorker main:app
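Gunicorn is a separate package as well:

poetry add gunicorn

In this setup Gunicorn only acts as the process manager; each of the four workers is still a regular Uvicorn event loop (uvicorn.workers.UvicornWorker), so the numbers should land close to the previous run.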
benchmark it
localhost % wrk http://localhost:8080 -d 10 -t 1 -c 200
Running 10s test @ http://localhost:8080
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.17ms    0.89ms  37.76ms   74.04%
    Req/Sec    62.72k     1.25k   65.11k    91.00%
  624335 requests in 10.01s, 90.50MB read
  Socket errors: connect 0, read 22, write 0, timeout 0
Requests/sec:  62378.97
Transfer/sec:      9.04MB
Around the same as the last test. All tests were run on a MacBook Pro with an Apple M3 Max.
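To sum up:

Rust / actix-web: ~163.3k requests/s
FastAPI (uvicorn, defaults): ~9.4k requests/s
FastAPI (--log-level warning): ~12.5k requests/s
FastAPI (4 workers + uvloop): ~61.1k requests/s
FastAPI (Gunicorn + UvicornWorker): ~62.4k requests/s

Rust still wins by a factor of roughly 2.6, but a few flags shrank FastAPI’s gap from about 17x to under 3x.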
Any comments?