Justus Perlwitz

Using Async Functions in Python 3.5

With the new async syntax in Python 3.5, defining asynchronous functions has become a lot simpler. In this article, I will demonstrate a simple example for this new feature. It will involve pulling a set of homepages of popular websites and displaying the first 10 characters of every HTTP response. The example will utilize the awesome aiohttp library. Make sure that your machine has aiohttp and Python 3.5 installed.

A Traditional, Synchronous Approach

First, let’s take a look at how this would have been solved in a naive, synchronous fashion. Let’s define our set of URLs that we want to retrieve.

sites = [
    'https://www.google.com',
    'https://www.yahoo.com',
    'https://www.bbc.co.uk',
    'https://en.wikipedia.org',
    'https://de.wikipedia.org',
    'https://news.ycombinator.com',
    'https://www.tagesschau.de',
]
FORMAT_STRING ="{site.url:<30.30}: {site.text:.10} in {site.elapsed}s"

We also need some logic to retrieve the pages.

from requests import get

def get_site_snippet(site):
    return FORMAT_STRING.format(site=get(site))

def main():
    for site in sites:
        print(get_site_snippet(site))

main()

Once we run our example, we will see the following response

https://www.google.com/       : <!doctype  in 0:00:00.798940s
https://www.yahoo.com/        : <?xml vers in 0:00:00.883083s
http://www.bbc.co.uk/         : <!DOCTYPE  in 0:00:01.479646s
https://en.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.172367s
https://de.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.166793s
https://news.ycombinator.com/ : <html op=" in 0:00:01.464526s
https://www.tagesschau.de/    : <!DOCTYPE  in 0:00:00.991291s

So far so good. But knowing how this kind of code works, we quickly realize that a lot of time is being spent waiting for and being blocked by external resources. Usually, HTTP servers take their time to respond, and depending on a multitude of factors, results are almost always not being delivered instantly.

Since one HTTP request has to be finished in order to execute the next HTTP request, our program will lose a considerable amount of time being blocked by an external resource. This is also known as an I/O bound computation.

By interleaving the execution of several function calls at once, we will be able to save a considerable amount of time. In our example, this would mean that while one function call is busy retrieving HTTP results, another function call can already get the next site name and initiate the next HTTP request.

Async All The Things

We are going to execute the same task, but with a twist. Instead of synchronously executing the get_site_snippet function, we are going to asynchronously get all website results and join the results in the end. Let’s take a look at how to achieve that.

from asyncio.client import get

async def async_get_site_snippet(site):
    response = await get(site)
    content = await response.read()
    return FORMAT_STRING.format(site=content)

A keen observer will immediately notice the usage of async and await. I won’t try to get too much into the details of how these are being handled in CPython internally. Let’s just say that this means that the functions we’re calling will not return the desired result immediately. Instead, calling an async function, will return a promise to eventually calculate a result. To be more precise, calling async_get_site_snippet(‘http://www.google.com’) will return a coroutine object. The same applies for the two await calls, as get(site) returns a promise, as well as response.read().

Now, by itself the coroutine object will not do anything. Getting to the result of every coroutine call involves some extra functionality that we are going to add now.

First, a list of tasks needed to be created, containing all the coroutine objects that need to be run.

from asyncio import get_event_loop, wait

def async_main():
    tasks = [async_get_site_snippet(site) for site in sites]

Then, we create a BaseEventLoop object, that runs all our tasks until they all have completed. In order to execute all coroutines concurrently, the list of tasks needs to be wrapped with asyncio.wait. The BaseEventLoop object can then run the wrapped tasks until all results are returned.

    loop = get_event_loop()
    # We can safely discard pending futures
    result, _ = loop.run_until_complete(wait(tasks))
    loop.close()

The results of loop.run_until_complete are now contained in a list and are ready to be retrieved.

    for task in result:
        print(task.result())

async_main()

Pretty neat!

The benefits immediately become apparent when we compare execution times:

print("Running synchronous example")
start = time()
main()
duration = time() - start

print("Running asynchronous example")
async_start = time()
async_main()
async_duration = time() - async_start

print("Synchronous example took {} seconds".format(duration))
print("Asynchronous example took {} seconds".format(async_duration))

This outputs the following on my trusty laptop:

Running synchronous example
https://www.google.com/       : <!doctype  in 0:00:00.798940s
https://www.yahoo.com/        : <?xml vers in 0:00:00.883083s
http://www.bbc.co.uk/         : <!DOCTYPE  in 0:00:01.479646s
https://en.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.172367s
https://de.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.166793s
https://news.ycombinator.com/ : <html op=" in 0:00:01.464526s
https://www.tagesschau.de/    : <!DOCTYPE  in 0:00:00.991291s
Running asynchronous example
https://www.google.com/       : <!doctype  in 0:00:00.618827s
http://www.bbc.co.uk/         : <!DOCTYPE  in 0:00:00.501347s
https://en.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.169479s
https://www.yahoo.com/        : <?xml vers in 0:00:00.762460s
https://news.ycombinator.com/ : <html op=" in 0:00:00.711696s
https://www.tagesschau.de/    : <!DOCTYPE  in 0:00:00.645607s
https://de.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.167020s
Synchronous example took 12.025413990020752 seconds
Asynchronous example took 6.950876951217651 seconds

While the result will vary slightly depending on the network and server load it becomes clear that we can shave off a few seconds of execution time by not letting HTTP requests block us. In this run the execution time was halved!

Further reading

If you want to read more about Python’s asyncio refer to the following sources:

Python Documentation

Tutorials

Date created:
17 Sep 2015

You are more than welcome to share your thoughts via email