.. _a-conceptual-overview-of-asyncio:
****************************************
A Conceptual Overview of :mod:`!asyncio`
****************************************
This :ref:`HOWTO <how-tos>` article seeks to help you build a sturdy mental
model of how :mod:`asyncio` fundamentally works, helping you understand the
how and why behind the recommended patterns.
You might be curious about some key :mod:`!asyncio` concepts.
You'll be comfortably able to answer these questions by the end of this
article:

- What's happening behind the scenes when an object is awaited?
- How does :mod:`!asyncio` differentiate between a task which doesn't need
  CPU time (such as a network request or file read) and a task that
  does (such as computing n-factorial)?
- How would you write an asynchronous variant of an operation, such as
  an async sleep or database request?

.. seealso::

   * The `guide <https://github.com/anordin95/a-conceptual-overview-of-asyncio/
     tree/main>`_ that inspired this HOWTO article, by Alexander Nordin.
   * This in-depth `YouTube tutorial series <https://www.youtube.com/
     watch?v=Xbl7XjFYsN4&list=PLhNSoGM2ik6SIkVGXWBwerucXjgP1rHmB>`_ on
     ``asyncio`` created by Python core team member, Łukasz Langa.
   * `500 Lines or Less: A Web Crawler With asyncio Coroutines <https://
     aosabook.org/en/500L/a-web-crawler-with-asyncio-coroutines.html>`_ by A.
     Jesse Jiryu Davis and Guido van Rossum.

--------------------------------------------
A conceptual overview part 1: the high-level
--------------------------------------------
In part 1, we'll cover the main, high-level building blocks of :mod:`!asyncio`:
the event loop, coroutine functions, coroutine objects, tasks and ``await``.
==========
Event Loop
==========
Everything in :mod:`!asyncio` happens relative to the event loop.
It's the star of the show.
It's like an orchestra conductor.
It's behind the scenes managing resources.
Some power is explicitly granted to it, but a lot of its ability to get things
done comes from the respect and cooperation of its worker bees.
In more technical terms, the event loop contains a collection of jobs to be run.
Some jobs are added directly by you, and some indirectly by :mod:`!asyncio`.
The event loop takes a job from its backlog of work and invokes it (or "gives
it control"), similar to calling a function, and then that job runs.
Once it pauses or completes, it returns control to the event loop.
The event loop will then select another job from its pool and invoke it.
You can *roughly* think of the collection of jobs as a queue: jobs are added and
then processed one at a time, generally (but not always) in order.
This process repeats indefinitely with the event loop cycling endlessly
onwards.
If there are no more jobs pending execution, the event loop is smart enough to
rest and avoid needlessly wasting CPU cycles, and will come back when there's
more work to be done.
Effective execution relies on jobs sharing well and cooperating; a greedy job
could hog control and leave the other jobs to starve, rendering the overall
event loop approach rather useless.

::

   import asyncio

   # This creates an event loop and indefinitely cycles through
   # its collection of jobs.
   event_loop = asyncio.new_event_loop()
   event_loop.run_forever()

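To make the queue analogy concrete, here is a minimal sketch that hands three plain callbacks to an event loop with ``loop.call_soon`` and records the order in which they run. The function ``record`` and the list ``order`` are names invented for this illustration, and a final job stops the loop so that ``run_forever()`` returns.

```python
import asyncio

order = []

def record(label):
    # A plain callback: the simplest kind of job the event loop can run.
    order.append(label)

loop = asyncio.new_event_loop()
try:
    # Jobs scheduled with call_soon() run in the order they were added.
    loop.call_soon(record, "first")
    loop.call_soon(record, "second")
    loop.call_soon(record, "third")
    # Without this last job, run_forever() would never return.
    loop.call_soon(loop.stop)
    loop.run_forever()
finally:
    loop.close()

print(order)
```

The loop invokes each job in turn and gets control back when the job returns, which is the cycle described above.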
=====================================
Asynchronous functions and coroutines
=====================================
This is a basic, boring Python function::

   def hello_printer():
       print(
           "Hi, I am a lowly, simple printer, though I have all I "
           "need in life -- \nfresh paper and my dearly beloved octopus "
           "partner in crime."
       )

Calling a regular function invokes its logic or body::

   >>> hello_printer()
   Hi, I am a lowly, simple printer, though I have all I need in life --
   fresh paper and my dearly beloved octopus partner in crime.

The :ref:`async def <async def>`, as opposed to just a plain ``def``, makes
this an asynchronous function (or "coroutine function").
Calling it creates and returns a :ref:`coroutine <coroutine>` object.

::

   async def loudmouth_penguin(magic_number: int):
       print(
           "I am a super special talking penguin. Far cooler than that printer. "
           f"By the way, my lucky number is: {magic_number}."
       )

Calling the async function, ``loudmouth_penguin``, does not execute the print statement;
instead, it creates a coroutine object::

   >>> loudmouth_penguin(magic_number=3)
   <coroutine object loudmouth_penguin at 0x104ed2740>

The terms "coroutine function" and "coroutine object" are often conflated
as coroutine.
That can be confusing!
In this article, coroutine specifically refers to a coroutine object, or more
precisely, an instance of :data:`types.CoroutineType` (native coroutine).
Note that coroutines can also exist as instances of
:class:`collections.abc.Coroutine` -- a distinction that matters for type
checking.
A coroutine represents the function's body or logic.
A coroutine has to be explicitly started; again, merely creating the coroutine
does not start it.
Notably, the coroutine can be paused and resumed at various points within the
function's body.
That pausing and resuming ability is what allows for asynchronous behavior!
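As a small sketch of the points above, the snippet below creates a coroutine, confirms that its body has not run yet, checks both of the types just mentioned, and only then starts it. The coroutine function ``greet`` and the list ``events`` are hypothetical names used purely for illustration.

```python
import asyncio
import collections.abc
import inspect
import types

events = []

async def greet():
    events.append("body ran")
    return "done"

coroutine = greet()

# Creating the coroutine object did not run the function's body.
ran_on_creation = bool(events)
state_before = inspect.getcoroutinestate(coroutine)

# A native coroutine satisfies both the concrete and the abstract type.
is_native = isinstance(coroutine, types.CoroutineType)
is_abstract = isinstance(coroutine, collections.abc.Coroutine)

# Handing the coroutine to asyncio.run() starts it and runs it to completion.
result = asyncio.run(coroutine)
state_after = inspect.getcoroutinestate(coroutine)

print(state_before, state_after, result)
```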
Coroutines and coroutine functions were built by leveraging the functionality
of :term:`generators <generator iterator>` and
:term:`generator functions <generator>`.
Recall, a generator function is a function that :keyword:`yield`\s, like this
one::

   def get_random_number():
       # This would be a bad random number generator!
       print("Hi")
       yield 1
       print("Hello")
       yield 7
       print("Howdy")
       yield 4
       ...

Similar to a coroutine function, calling a generator function does not run it.
Instead, it creates a generator object::

   >>> get_random_number()
   <generator object get_random_number at 0x1048671c0>

You can proceed to the next ``yield`` of a generator by using the
built-in function :func:`next`.
In other words, the generator runs, then pauses.
For example::

   >>> generator = get_random_number()
   >>> next(generator)
   Hi
   1
   >>> next(generator)
   Hello
   7

=====
Tasks
=====
Roughly speaking, :ref:`tasks <asyncio-task-obj>` are coroutines (not coroutine
functions) tied to an event loop.
A task also maintains a list of callback functions whose importance will become
clear in a moment when we discuss :keyword:`await`.
The recommended way to create tasks is via :func:`asyncio.create_task`.
Creating a task automatically schedules it for execution (by adding a
callback to run it in the event loop's to-do list, that is, collection of jobs).
Since there's only one event loop (in each thread), :mod:`!asyncio` takes care of
associating the task with the event loop for you. As such, there's no need
to specify the event loop.

::

   coroutine = loudmouth_penguin(magic_number=5)
   # This creates a Task object and schedules its execution via the event loop.
   task = asyncio.create_task(coroutine)

Earlier, we manually created the event loop and set it to run forever.
In practice, it's recommended to use (and common to see) :func:`asyncio.run`,
which takes care of managing the event loop and ensuring the provided
coroutine finishes before advancing.
For example, many async programs follow this setup::

   import asyncio

   async def main():
       # Perform all sorts of wacky, wild asynchronous things...
       ...

   if __name__ == "__main__":
       asyncio.run(main())
       # The program will not reach the following print statement until the
       # coroutine main() finishes.
       print("coroutine main() is done!")

It's important to be aware that the task itself is not added to the event loop,
only a callback to the task is.
This matters if the task object you created is garbage collected before it's
called by the event loop.
For example, consider this program:

.. code-block::
   :linenos:

   async def hello():
       print("hello!")

   async def main():
       asyncio.create_task(hello())
       # Other asynchronous instructions which run for a while
       # and cede control to the event loop...
       ...

   asyncio.run(main())

Because there's no reference to the task object created on line 5, it *might*
be garbage collected before the event loop invokes it.
Later instructions in the coroutine ``main()`` hand control back to the event
loop so it can invoke other jobs.
When the event loop eventually tries to run the task, it might fail and
discover the task object does not exist!
This can also happen even if a coroutine keeps a reference to a task but
completes before that task finishes.
When the coroutine exits, local variables go out of scope and may be subject
to garbage collection.
In practice, ``asyncio`` and Python's garbage collector work pretty hard to
ensure this sort of thing doesn't happen.
But that's no reason to be reckless!
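One common way to stay on the safe side is to keep a strong reference to every background task until it finishes. The sketch below does that with a module-level set; the helper ``spawn`` and the names ``background_tasks`` and ``finished`` are invented for this example and are not part of :mod:`!asyncio`.

```python
import asyncio

background_tasks = set()
finished = []

async def hello(n):
    await asyncio.sleep(0)
    finished.append(n)

def spawn(coro):
    # Hypothetical helper: create the task and hold a reference to it.
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    # Drop the reference once the task is done so the set doesn't grow forever.
    task.add_done_callback(background_tasks.discard)
    return task

async def main():
    for n in range(3):
        spawn(hello(n))
    # A real program would do other work here; gather() simply lets this
    # sketch wait for the background tasks deterministically.
    await asyncio.gather(*list(background_tasks))

asyncio.run(main())
print(sorted(finished))
```

Because the set holds each task until its done-callback removes it, the task cannot be garbage collected while it is still pending.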
=====
await
=====
:keyword:`await` is a Python keyword that's commonly used in one of two
different ways::

   await task
   await coroutine

In a crucial way, the behavior of ``await`` depends on the type of object
being awaited.
Awaiting a task will cede control from the current task or coroutine to
the event loop.
In the process of relinquishing control, a few important things happen.
We'll use the following code example to illustrate::

   async def plant_a_tree():
       dig_the_hole_task = asyncio.create_task(dig_the_hole())
       await dig_the_hole_task

       # Other instructions associated with planting a tree.
       ...

In this example, imagine the event loop has passed control to the start of the
coroutine ``plant_a_tree()``.
As seen above, the coroutine creates a task and then awaits it.
The ``await dig_the_hole_task`` instruction adds a callback (which will resume
``plant_a_tree()``) to the ``dig_the_hole_task`` object's list of callbacks.
And then, the instruction cedes control to the event loop.
Some time later, the event loop will pass control to ``dig_the_hole_task``
and the task will finish whatever it needs to do.
Once the task finishes, it will add its various callbacks to the event loop,
in this case, a call to resume ``plant_a_tree()``.
Generally speaking, when the awaited task finishes (``dig_the_hole_task``),
the original task or coroutine (``plant_a_tree()``) is added back to the event
loop's to-do list to be resumed.
This is a basic, yet reliable mental model.
In practice, the control handoffs are slightly more complex, but not by much.
In part 2, we'll walk through the details that make this possible.
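The callback mechanism can be observed from the outside with ``Task.add_done_callback``. In this rough sketch, a callback of our own is registered on the task before it is awaited; the ``events`` list and the body of ``dig_the_hole()`` are assumptions made for illustration.

```python
import asyncio

events = []

async def dig_the_hole():
    events.append("digging the hole")

async def plant_a_tree():
    dig_the_hole_task = asyncio.create_task(dig_the_hole())
    # Register our own callback. Awaiting the task below registers another
    # one internally, which resumes plant_a_tree().
    dig_the_hole_task.add_done_callback(
        lambda task: events.append("done-callback ran")
    )
    await dig_the_hole_task
    events.append("plant_a_tree() resumed")

asyncio.run(plant_a_tree())
print(events)
```

When the task finishes, its callbacks are handed to the event loop in the order they were registered, so our callback runs just before ``plant_a_tree()`` is resumed.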
**Unlike tasks, awaiting a coroutine does not hand control back to the event
loop!**
Wrapping a coroutine in a task first, then awaiting that would cede
control.
The behavior of ``await coroutine`` is effectively the same as invoking a
regular, synchronous Python function.
Consider this program::

   import asyncio

   async def coro_a():
       print("I am coro_a(). Hi!")

   async def coro_b():
       print("I am coro_b(). I sure hope no one hogs the event loop...")

   async def main():
       task_b = asyncio.create_task(coro_b())
       num_repeats = 3
       for _ in range(num_repeats):
           await coro_a()
       await task_b

   asyncio.run(main())

The first statement in the coroutine ``main()`` creates ``task_b`` and schedules
it for execution via the event loop.
Then, ``coro_a()`` is repeatedly awaited. Control never cedes to the
event loop which is why we see the output of all three ``coro_a()``
invocations before ``coro_b()``'s output:

.. code-block:: none

   I am coro_a(). Hi!
   I am coro_a(). Hi!
   I am coro_a(). Hi!
   I am coro_b(). I sure hope no one hogs the event loop...

If we change ``await coro_a()`` to ``await asyncio.create_task(coro_a())``, the
behavior changes.
The coroutine ``main()`` cedes control to the event loop with that statement.
The event loop then proceeds through its backlog of work, calling ``task_b``
and then the task which wraps ``coro_a()`` before resuming the coroutine
``main()``.

.. code-block:: none

   I am coro_b(). I sure hope no one hogs the event loop...
   I am coro_a(). Hi!
   I am coro_a(). Hi!
   I am coro_a(). Hi!

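For reference, here is a sketch of the modified program with the task-wrapping change applied. To make the ordering easy to inspect, it appends to a list named ``output`` (an invented name) instead of printing each message.

```python
import asyncio

output = []

async def coro_a():
    output.append("coro_a")

async def coro_b():
    output.append("coro_b")

async def main():
    task_b = asyncio.create_task(coro_b())
    for _ in range(3):
        # Wrapping coro_a() in a task makes this await yield to the event
        # loop, which gives task_b a chance to run first.
        await asyncio.create_task(coro_a())
    await task_b

asyncio.run(main())
print(output)
```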
This behavior of ``await coroutine`` can trip a lot of people up!
That example highlights how using only ``await coroutine`` could
unintentionally hog control from other tasks and effectively stall the event
loop.
:func:`asyncio.run` can help you detect such occurrences via the
``debug=True`` flag which accordingly enables
:ref:`debug mode <asyncio-debug-mode>`.
Among other things, it will log any coroutines that monopolize execution for
100ms or longer.
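As a hedged illustration of debug mode, the sketch below runs a coroutine that blocks for 0.2 seconds without ever awaiting, and collects what the ``asyncio`` logger reports. The ``ListHandler`` class and the ``hog`` coroutine are hypothetical helpers written for this example, and the exact wording of the log message may vary between Python versions.

```python
import asyncio
import logging
import time

class ListHandler(logging.Handler):
    """Collect log messages in a list (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

handler = ListHandler()
asyncio_logger = logging.getLogger("asyncio")
asyncio_logger.addHandler(handler)

async def hog():
    # A blocking call with no await: the event loop cannot run anything
    # else until this returns.
    time.sleep(0.2)

asyncio.run(hog(), debug=True)
asyncio_logger.removeHandler(handler)

# Debug mode reports steps that exceeded the slow-callback threshold.
slow_reports = [message for message in handler.messages if "took" in message]
print(slow_reports)
```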
The design intentionally trades off some conceptual clarity around usage of
``await`` for improved performance.
Each time a task is awaited, control needs to be passed all the way up the
call stack to the event loop.
That might sound minor, but in a large program with many ``await``\ s and a deep
call stack that overhead can add up to a meaningful performance drag.
------------------------------------------------
A conceptual overview part 2: the nuts and bolts
------------------------------------------------
Part 2 goes into detail on the mechanisms :mod:`!asyncio` uses to manage
control flow.
This is where the magic happens.
You'll come away from this section knowing what ``await`` does behind the scenes
and how to make your own asynchronous operators.
================================
The inner workings of coroutines
================================
:mod:`!asyncio` leverages four components to pass around control.
:meth:`coroutine.send(arg) <generator.send>` is the method used to start or
resume a coroutine.
If the coroutine was paused and is now being resumed, the argument ``arg``
will be sent in as the return value of the ``yield`` statement which originally
paused it.
If the coroutine is being used for the first time (as opposed to being resumed)
``arg`` must be ``None``.

.. code-block::
   :linenos:

   class Rock:
       def __await__(self):
           value_sent_in = yield 7
           print(f"Rock.__await__ resuming with value: {value_sent_in}.")
           return value_sent_in

   async def main():
       print("Beginning coroutine main().")
       rock = Rock()
       print("Awaiting rock...")
       value_from_rock = await rock
       print(f"Coroutine received value: {value_from_rock} from rock.")
       return 23

   coroutine = main()
   intermediate_result = coroutine.send(None)
   print(f"Coroutine paused and returned intermediate value: {intermediate_result}.")

   print(f"Resuming coroutine and sending in value: 42.")
   try:
       coroutine.send(42)
   except StopIteration as e:
       returned_value = e.value
   print(f"Coroutine main() finished and provided value: {returned_value}.")

:ref:`yield <yieldexpr>`, like usual, pauses execution and returns control
to the caller.
In the example above, the ``yield``, on line 3, is called by
``... = await rock`` on line 11.
More broadly speaking, ``await`` calls the :meth:`~object.__await__` method of
the given object.
``await`` also does one more very special thing: it propagates (or "passes
along") any ``yield``\ s it receives up the call-chain.
In this case, that's back to ``... = coroutine.send(None)`` on line 16.
The coroutine is resumed via the ``coroutine.send(42)`` call on line 21.
The coroutine picks back up from where it ``yield``\ ed (or paused) on line 3
and executes the remaining statements in its body.
When a coroutine finishes, it raises a :exc:`StopIteration` exception with the
return value attached in the :attr:`~StopIteration.value` attribute.
That snippet produces this output:

.. code-block:: none

   Beginning coroutine main().
   Awaiting rock...
   Coroutine paused and returned intermediate value: 7.
   Resuming coroutine and sending in value: 42.
   Rock.__await__ resuming with value: 42.
   Coroutine received value: 42 from rock.
   Coroutine main() finished and provided value: 23.

It's worth pausing for a moment here and making sure you followed the various
ways that control flow and values were passed. A lot of important ideas were
covered and it's worth ensuring your understanding is firm.
The only way to yield (or effectively cede control) from a coroutine is to
``await`` an object that ``yield``\ s in its ``__await__`` method.
That might sound odd to you. You might be thinking:

1. What about a ``yield`` directly within the coroutine function? The
   coroutine function becomes an
   :ref:`async generator function <asynchronous-generator-functions>`, a
   different beast entirely.
2. What about a :ref:`yield from <yieldexpr>` within the coroutine function to a (plain)
   generator?
   That causes the error: ``SyntaxError: yield from not allowed in a coroutine.``
   This was intentionally designed for the sake of simplicity -- mandating only
   one way of using coroutines.
   Initially ``yield`` was barred as well, but was re-accepted to allow for
   async generators.
   Despite that, ``yield from`` and ``await`` effectively do the same thing.

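To see how a ``yield`` inside ``__await__`` travels up through nested ``await``\ s, consider this small sketch of a hand-written driver. The names ``Pause``, ``inner``, ``outer`` and ``drive`` are all invented for illustration; an event loop does considerably more than this.

```python
class Pause:
    """Hypothetical awaitable that yields once, pausing whoever awaits it."""

    def __await__(self):
        received = yield "paused"
        return received

async def inner():
    return await Pause()

async def outer():
    value = await inner()
    return value * 2

def drive(coroutine):
    """Run a coroutine to completion, answering every yield with 21."""
    yielded = []
    to_send = None  # The first send() must pass None.
    while True:
        try:
            yielded.append(coroutine.send(to_send))
        except StopIteration as stop:
            # The coroutine's return value rides on the exception.
            return yielded, stop.value
        to_send = 21

yielded, result = drive(outer())
print(yielded, result)
```

The single ``yield`` in ``Pause.__await__`` passes through both ``await`` expressions before it reaches ``drive``, and the value sent back in takes the same route in reverse.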
=======
Futures
=======
A :ref:`future <asyncio-future-obj>` is an object meant to represent a
computation's status and result.
The term is a nod to the idea of something still to come or not yet happened,
and the object is a way to keep an eye on that something.
A future has a few important attributes. One is its state, which can be
"pending", "cancelled" or "done".
Another is its result, which is set when the state transitions to done.
Unlike a coroutine, a future does not represent the actual computation to be
done; instead, it represents the status and result of that computation, kind of
like a status light (red, yellow or green) or indicator.
:class:`asyncio.Task` subclasses :class:`asyncio.Future` in order to gain
these various capabilities.
The prior section said tasks store a list of callbacks, which wasn't entirely
accurate.
It's actually the ``Future`` class that implements this logic, which ``Task``
inherits.
Futures may also be used directly (not via tasks).
Tasks mark themselves as done when their coroutine is complete.
Futures are much more versatile and will be marked as done when you say so.
In this way, they're the flexible interface for you to make your own conditions
for waiting and resuming.
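Here is a minimal sketch of using a future directly. It creates a future with ``loop.create_future()``, asks the event loop to call ``future.set_result`` a little later, and awaits the future in the meantime; the result string and the ``states`` list are arbitrary choices made for this example.

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    states = [future.done()]

    # Nothing marks the future as done until we say so. Here we ask the
    # event loop to do it on one of its next iterations.
    loop.call_soon(future.set_result, "ready")

    # Awaiting the pending future cedes control to the event loop.
    result = await future
    states.append(future.done())
    return states, result

states, result = asyncio.run(main())
print(states, result)
```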
========================
A homemade asyncio.sleep
========================
We'll go through an example of how you could leverage a future to create your
own variant of asynchronous sleep (``async_sleep``) which mimics
:func:`asyncio.sleep`.
This snippet registers a few tasks with the event loop and then awaits a
coroutine wrapped in a task: ``async_sleep(3)``.
We want that task to finish only after three seconds have elapsed, but without
preventing other tasks from running.

::

   import asyncio
   import datetime
   import time

   async def other_work():
       print("I like work. Work work.")

   async def main():
       # Add a few other tasks to the event loop, so there's something
       # to do while asynchronously sleeping.
       work_tasks = [
           asyncio.create_task(other_work()),
           asyncio.create_task(other_work()),
           asyncio.create_task(other_work())
       ]
       print(
           "Beginning asynchronous sleep at time: "
           f"{datetime.datetime.now().strftime('%H:%M:%S')}."
       )
       await asyncio.create_task(async_sleep(3))
       print(
           "Done asynchronous sleep at time: "
           f"{datetime.datetime.now().strftime('%H:%M:%S')}."
       )
       # asyncio.gather effectively awaits each task in the collection.
       await asyncio.gather(*work_tasks)

Below, we use a future to enable custom control over when that task will be
marked as done.
If :meth:`future.set_result() <asyncio.Future.set_result>` (the method
responsible for marking that future as done) is never called, then this task
will never finish.
We've also enlisted the help of another task, which we'll see in a moment, that
will monitor how much time has elapsed and, accordingly, call
``future.set_result()``.

::

   async def async_sleep(seconds: float):
       future = asyncio.Future()
       time_to_wake = time.time() + seconds
       # Add the watcher-task to the event loop.
       watcher_task = asyncio.create_task(_sleep_watcher(future, time_to_wake))
       # Block until the future is marked as done.
       await future

Below, we'll use a rather bare object, ``YieldToEventLoop()``, to ``yield``
from ``__await__`` in order to cede control to the event loop.
This is effectively the same as calling ``asyncio.sleep(0)``, but this approach
offers more clarity, not to mention it's somewhat cheating to use
``asyncio.sleep`` when showcasing how to implement it!
As usual, the event loop cycles through its tasks, giving them control
and receiving control back when they pause or finish.
The ``watcher_task``, which runs the coroutine ``_sleep_watcher(...)``, will
be invoked once per full cycle of the event loop.
On each resumption, it'll check the time and if not enough has elapsed, then
it'll pause once again and hand control back to the event loop.
Eventually, enough time will have elapsed, and ``_sleep_watcher(...)`` will
mark the future as done, and then itself finish too by breaking out of the
infinite ``while`` loop.
Given this helper task is only invoked once per cycle of the event loop,
you'd be correct to note that this asynchronous sleep will sleep *at least*
three seconds, rather than exactly three seconds.
Note this is also true of ``asyncio.sleep``.

::

   class YieldToEventLoop:
       def __await__(self):
           yield

   async def _sleep_watcher(future, time_to_wake):
       while True:
           if time.time() >= time_to_wake:
               # This marks the future as done.
               future.set_result(None)
               break
           else:
               await YieldToEventLoop()

Here is the full program's output:

.. code-block:: none

   $ python custom-async-sleep.py
   Beginning asynchronous sleep at time: 14:52:22.
   I like work. Work work.
   I like work. Work work.
   I like work. Work work.
   Done asynchronous sleep at time: 14:52:25.

You might feel this implementation of asynchronous sleep was unnecessarily
convoluted.
And, well, it was.
The example was meant to showcase the versatility of futures with a simple
example that could be mimicked for more complex needs.
For reference, you could implement it without futures, like so::

   async def simpler_async_sleep(seconds):
       time_to_wake = time.time() + seconds
       while True:
           if time.time() >= time_to_wake:
               return
           else:
               await YieldToEventLoop()

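Both variants above poll the clock on every cycle of the event loop. As a contrasting sketch, a future can be combined with ``loop.call_later`` so that the event loop itself marks the future as done once the delay has passed. The name ``timer_sleep`` is invented for this example, and this is only an illustration of the idea, not a copy of how ``asyncio.sleep`` is written.

```python
import asyncio
import time

async def timer_sleep(seconds):
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # Ask the event loop to mark the future as done after the delay,
    # instead of checking the clock on every cycle.
    handle = loop.call_later(seconds, future.set_result, None)
    try:
        await future
    finally:
        # If this coroutine is cancelled early, make sure the timer does not
        # later try to set a result on an already-cancelled future.
        handle.cancel()

async def main():
    start = time.monotonic()
    await timer_sleep(0.05)
    return time.monotonic() - start

elapsed = asyncio.run(main())
print(f"Slept for about {elapsed:.3f} seconds.")
```

With this approach no watcher task needs to run while the sleeper waits.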
But, that's all for now. Hopefully you're ready to more confidently dive into
some async programming or check out advanced topics in the
:mod:`rest of the documentation <asyncio>`.