
A fast async python request engine

How I made a generic async sdk engine in python: https://github.com/ryukyi/async-requester

TL;DR

A simplified version of the source code is here: https://github.com/ryukyi/async-requester. NOTE: it is cut down from the closed-source version and doesn't include auth, custom errors or rate limiting.

Making an SDK

I work for a large consultancy with many employees. One of its existing APIs was old and crummy. Despite leveraging Swagger v3, the routes were inconsistent, enum params weren't reused and the documentation was poor. What's worse, the errors were abstract, unhelpful and occasionally misleading.

We weren't able to rewrite the API. But we were able to make a nice python sdk!

The strategy

Before firing off lots of requests we needed to handle request info and minimise bad requests being sent. To do this we set out to:

  • bundle request information into structured lists
  • make requests async without contributors needing to write async/await blocks
  • validate all requests before sending in order to minimise load on the API

Below is a simplified, generic version of how I implemented this at the consultancy.

Lightweight request objects

namedtuples let you look fields up by name, much like dictionaries, but they are immutable and carry no per-instance dictionary, so they require no more memory than regular tuples.
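
As a small illustration of how namedtuples behave, here is a sketch using a hypothetical two-field `Point` type (it is not part of the SDK, it only exists for this example):

```python
from collections import namedtuple

# Hypothetical example type, not part of the SDK.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=1, y=2)

# Fields are readable by name or by position.
by_name = p.x
by_index = p[0]

# Instances carry no per-instance __dict__, which is what keeps them small.
has_dict = hasattr(p, "__dict__")

# They are immutable: assigning to a field raises AttributeError.
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False

# _replace returns a modified copy, _asdict converts to a plain dict.
moved = p._replace(x=10)
as_dict = p._asdict()
```

The trade-off is that a request can't be tweaked in place once built; `_replace` is the way to derive a changed copy.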

RequestInfo objects
from collections import namedtuple
from typing import Any, Dict, Optional, Union


class RequestInfo(namedtuple("RequestInfo", ["method", "path", "params", "body"])):
    """
    A named tuple to represent request information.

    Parameters:
    - method (str): The HTTP method (GET, POST, PUT, PATCH, DELETE).
    - path (str): The URL path for the request.
    - params (Optional[Union[Dict[str, Any], str]]): Query parameters for the request.
    - body (Any): Data to be sent in the request (for POST, PUT, and PATCH).

    Setting __slots__ = () keeps instances free of a per-instance __dict__, so
    they stay as small as a plain tuple and cannot grow extra attributes.
    This is helpful when making many async requests.
    """

    __slots__ = ()

    def __new__(
        cls,
        method: str,
        path: str,
        params: Optional[Union[Dict[str, Any], str]] = None,
        body: Optional[Any] = None,
    ):
        """RequestInfo dto for all methods"""
        # GET and DELETE requests are sent without a body, so drop any that was passed
        if method.upper() in {"GET", "DELETE"}:
            body = None
        return super().__new__(cls, method, path, params, body)
Accessing RequestInfo fields is simple. Like dictionary keys they are looked up by name, but as attributes:

python3.10 interactive terminal
>>> from src.http_requests import RequestInfo
>>> r = RequestInfo(method="GET", path="endpoint", params="latest")
>>> r.method
'GET'
>>> r.path
'endpoint'

A list of requests follows naturally from here, e.g.

python3.10 interactive terminal
>>> requests = [
    RequestInfo(method="POST", path="endpoint", body={"here": "now"}),
    RequestInfo(method="GET", path="endpoint", params="latest"),
    # and so on...
]
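
Two behaviours of the class are worth seeing in action: the body is dropped for GET and DELETE, and instances can't pick up stray attributes. The sketch below repeats a condensed copy of `RequestInfo` so that it runs on its own; the `retries` attribute is made up purely to trigger the error.

```python
from collections import namedtuple
from typing import Any, Optional


class RequestInfo(namedtuple("RequestInfo", ["method", "path", "params", "body"])):
    """Condensed copy of the class above, repeated so this sketch runs alone."""

    __slots__ = ()

    def __new__(cls, method: str, path: str, params: Optional[Any] = None,
                body: Optional[Any] = None):
        if method.upper() in {"GET", "DELETE"}:
            body = None
        return super().__new__(cls, method, path, params, body)


batch = [
    RequestInfo(method="POST", path="endpoint", body={"here": "now"}),
    RequestInfo(method="GET", path="endpoint", params="latest", body={"ignored": True}),
]

# The body passed to the GET request was discarded in __new__.
get_body = batch[1].body

# _asdict() gives a plain dict, handy for logging a batch before sending it.
as_dicts = [r._asdict() for r in batch]

# Because __slots__ is empty, instances cannot grow extra attributes.
# "retries" is a made-up attribute used only to demonstrate this.
try:
    batch[0].retries = 3
    extra_allowed = True
except AttributeError:
    extra_allowed = False
```

Note that the body is dropped silently, so a caller who passes one with a GET gets no warning.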

Abstracting away async await

Pairing httpx's AsyncClient with trio, a modern async runtime, meant contributors no longer needed to write async/await blocks when adding endpoint methods to src/api.py. Note that nursery.start_soon works much like collecting tasks in asyncio: it takes make_request and its arguments separately and calls it once the task is scheduled, and the nursery block only exits when every task has finished:

src/http_requests.py
    # https://github.com/ryukyi/async-requester/blob/main/src/http_requests.py
    async def send_requests(self, requests: List[RequestInfo]):
        """
        Send a list of asynchronous HTTP requests.

        Args:
            requests (List[RequestInfo]): A list of RequestInfo objects
            representing the requests to send.
        """
        async with AsyncClient(
            base_url=self.base_url,
            headers=self.headers,
            verify=self.verify,
            timeout=self.timeout,
        ) as client:
            async with trio.open_nursery() as nursery:
                for request in requests:
                    method = self.get_method_by_name(client, request.method)
                    nursery.start_soon(
                        self.make_request,
                        method,
                        request.path,
                        request.params,
                        request.body,
                    )
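
The part that keeps contributors away from async/await is a synchronous entry point that starts the event loop itself. That wrapper isn't shown in this post, so what follows is only a sketch of the idea under some loud assumptions: it swaps trio's nursery for the standard library's `asyncio.gather` so it runs with no installs, it fakes the HTTP call with a short sleep, and the body of `client_requests` is guessed from how it is called later on. With trio, the equivalent entry point would be `trio.run`.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple

# (method, path, params, body), standing in for RequestInfo.
Request = Tuple[str, str, Optional[Any], Optional[Any]]


class Requests:
    """Hypothetical sync facade over a concurrent async sender."""

    def __init__(self) -> None:
        self.responses: List[Optional[Dict[str, Any]]] = []

    async def make_request(self, index: int, method: str, path: str,
                           params: Any, body: Any) -> None:
        # Stand-in for an awaited HTTP call; no network is touched here.
        await asyncio.sleep(0.01)
        # Store by index so results line up with the input order.
        self.responses[index] = {"method": method, "path": path,
                                 "params": params, "body": body}

    async def send_requests(self, requests: List[Request]) -> None:
        self.responses = [None] * len(requests)
        # gather plays the role of the nursery: all tasks run concurrently
        # and this line only returns once every one of them has finished.
        await asyncio.gather(
            *(self.make_request(i, *request) for i, request in enumerate(requests))
        )

    def client_requests(self, requests: List[Request]) -> List[Optional[Dict[str, Any]]]:
        # The only place an event loop is started; callers stay synchronous.
        asyncio.run(self.send_requests(requests))
        return self.responses


sender = Requests()
results = sender.client_requests([
    ("POST", "endpoint", None, {"here": "now"}),
    ("GET", "endpoint", "latest", None),
])
```

Tasks started concurrently can finish in any order, which is why the sketch writes each result into a slot keyed by its position rather than appending.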

Validating requests before sending

I love pydantic and have since I first saw it used extensively in the opennem schema source code. It is more famously known for playing a crucial role in FastAPI. A quick shoutout to Samuel Colvin: everything he touches turns to gold, and his work is a huge win for open source communities.

Say, for example, you have an endpoint with a really nasty, complicated body that leaves plenty of room for user input mistakes:

[
    {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 0]}}}}}},
    {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 1]}}}}}},
    {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 2]}}}}}},
]

We can define this schema and lean on pydantic's excellent built-in errors to do the heavy lifting:

src/schema.py
# https://github.com/ryukyi/async-requester/blob/main/src/schema.py
from __future__ import annotations

from typing import List, Optional, Union

from pydantic import BaseModel, RootModel


class Bloody(BaseModel):
    hell: List[Union[int, str]]


class The(BaseModel):
    bloody: Bloody


class How(BaseModel):
    the: The


class Mate(BaseModel):
    how: How


class Gday(BaseModel):
    mate: Mate


class GdayBody(BaseModel):
    gday: Optional[Gday] = None


class GdayBodyList(RootModel):
    root: List[GdayBody]

Now anything invalid is stopped before the request is even sent. For an async client, which can fire off many requests at once, this really matters: it keeps our servers from working hard to respond to dumb requests.
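
To give a feel for what a failure looks like, here is a minimal sketch assuming pydantic v2 (the same API the schema above uses). The models are trimmed-down stand-ins with two levels of nesting instead of six, and `GdayList` is a name made up for this example.

```python
from typing import List, Union

from pydantic import BaseModel, RootModel, ValidationError


# Trimmed-down stand-ins for the schema above (two levels instead of six).
class Bloody(BaseModel):
    hell: List[Union[int, str]]


class Gday(BaseModel):
    bloody: Bloody


class GdayList(RootModel):
    root: List[Gday]


good = [{"bloody": {"hell": ["are", "ya", 0]}}]
# The second item is missing the required "hell" field.
bad = [{"bloody": {"hell": ["are", "ya", 0]}}, {"bloody": {}}]

GdayList.model_validate(good)  # valid input passes without raising

try:
    GdayList.model_validate(bad)
    errors = []
except ValidationError as exc:
    # Each error records where in the payload the problem is and what it was.
    errors = exc.errors()
    for err in errors:
        print(err["loc"], err["msg"])
```

Each entry's `loc` is a path into the payload, so a user can see which item and which nested key was at fault rather than guessing from an abstract API error.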

Declaring api endpoints with validation

The schema has been defined and now the last step is describing the endpoint class and including validation.

from typing import List, Dict, Any

from httpx import Response

from src.http_requests import Requests, RequestInfo
from src.schema import GdayBodyList


class Anything:
    """Anything endpoint requests with validation"""

    def __init__(self, requests: Requests):
        self.requests = requests.client_requests

    def get_anything(self, gday_body_list: List[Dict[str, Any]]) -> List[Response]:
        """GET request of /anything endpoint

        https://httpbin.org/anything

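        Args:
            gday_body_list: Request bodies to validate, one request per item.

        Returns:
            List[Response]: The httpx responses, one per request.

        Raises:
            ValueError: If the input body doesn't comply (pydantic's
                ValidationError is a ValueError subclass).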

        Example Usage:

        ```python
        >>> responses = client.get_anything(
            [
                {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 0]}}}}}},
                {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 1]}}}}}},
                {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 2]}}}}}},
            ]
        )
        ```
        """
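        # Validate first: this raises before any request is built if the input is invalid
        GdayBodyList.model_validate(gday_body_list)
        # Build one request per body. Note that RequestInfo drops the body for GET requests.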
        request_info = [
            RequestInfo(method="GET", path="anything", params=None, body=body)
            for body in gday_body_list
        ]
        return self.requests(requests=request_info)

Don't forget that the client class inherits from the Anything api class, so all of its methods are easily accessible:

class ApiClient(Anything):
    ...  # client setup omitted here

Typically inheritance is a really bad design choice, unless you really have to use it. The alternative is composition, which is the way to go 99% of the time, but here it would mean users accessing methods like this:

client = ApiClient("https://baseurl.com")
responses = client.anything.get_anything(
    [
        # request bodies (dicts)
    ]
)

instead of:

client = ApiClient("https://baseurl.com")
responses = client.get_anything(
    [
        # request bodies (dicts)
    ]
)

Using inheritance here gives users a cleaner call and, most importantly, won't increase maintenance overheads later.
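
The two call styles can be compared side by side with stand-in classes. This is only a sketch: `InheritedClient` and `ComposedClient` are made-up names, and the stand-in `Anything` returns the paths it would request instead of sending anything.

```python
from typing import Any, Dict, List


class Anything:
    """Stand-in endpoint group; returns the paths it would request."""

    def get_anything(self, bodies: List[Dict[str, Any]]) -> List[str]:
        return ["anything" for _ in bodies]


class InheritedClient(Anything):
    """Inheritance: endpoint methods sit directly on the client."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url


class ComposedClient:
    """Composition: endpoint methods live one attribute deeper."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url
        self.anything = Anything()


flat = InheritedClient("https://baseurl.com").get_anything([{}])
nested = ComposedClient("https://baseurl.com").anything.get_anything([{}])
```

Both calls do the same work. The difference is only in how far users have to reach, at the cost of every endpoint group sharing one method namespace on the client.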

Making requests

import json

from loguru import logger

from src.client import ApiClient

client = ApiClient("https://httpbin.org/")

responses = client.get_anything(
    [
        {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 0]}}}}}},
        {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 1]}}}}}},
        {"gday": {"mate": {"how": {"the": {"bloody": {"hell": ["are", "ya", 2]}}}}}},
    ]
)

for response in responses:
    tidy_json_str_response = json.dumps(response.json(), indent=2)
    logger.debug(tidy_json_str_response)