We discussed basic use of the for loop in a previous article. Here we cover the internals of how for loops work in Python using iterators, and how you can use the iterator pattern to create your own objects that can be consumed in loops and throughout the language.
The Iterator Pattern in Python
These are the steps that take place when consuming, then exhausting, an iterator. They are explained below:
- Call the __iter__() method to receive an iterator.
- Call the __next__() method to receive individual items from the iterator.
- Catch a StopIteration exception to end iteration.
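The three steps above can be sketched as a hand-written loop. This is only a rough equivalent of what a for loop does, not the interpreter's actual implementation; the helper name manual_for and the list numbers are made up for illustration.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using the protocol directly."""
    collected = []
    iterator = iter(iterable)          # step 1: calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # step 2: calls iterator.__next__()
        except StopIteration:          # step 3: the iterator is exhausted
            break
        collected.append(item)         # stands in for the loop body
    return collected


numbers = [10, 20, 30]
print(manual_for(numbers))  # [10, 20, 30]
```

The built-in functions iter() and next() are the usual way to trigger __iter__() and __next__() rather than calling the methods by name.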
For an object var1, Python will call var1.__iter__() to receive an iterator returned by this method. An iterator is an object that implements the __next__() method. An example of this is shown below:
class NextMethod:
    def __next__(self):
        ...

class ReturnIter:
    def __iter__(self):
        return NextMethod()
We discuss __next__() below.
This example has two objects: one with an __iter__() method and a second with a __next__() method. In practice we usually implement both __iter__() and __next__() in the same class. This is what we will stick with going forward.
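To make the two-object version concrete, here is a minimal runnable sketch. The class names UpToThree and UpToThreeIterator are hypothetical, and counting to 3 is chosen only for illustration. One visible effect of keeping the objects separate is that every loop receives a fresh iterator.

```python
class UpToThreeIterator:
    """Iterator: holds the counting state and implements __next__()."""

    def __init__(self):
        self.counter = 0

    def __next__(self):
        if self.counter >= 3:
            raise StopIteration        # signals that there are no more items
        self.counter += 1
        return self.counter


class UpToThree:
    """Iterable: keeps no counting state, it only hands out new iterators."""

    def __iter__(self):
        return UpToThreeIterator()


numbers = UpToThree()
first_pass = list(numbers)
second_pass = list(numbers)      # a new iterator is created for this pass
print(first_pass, second_pass)   # [1, 2, 3] [1, 2, 3]
```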
An iterator can return itself with self.
class CompleteIter:
    def __iter__(self):
        return self

    def __next__(self):
        ...
The __next__() method is called repeatedly, once at the beginning of each iteration. The result returned by __next__() is the next output of the iterator.
In a for loop, each result of __next__() becomes the loop variable for the next iteration of the loop. To stop iterating we raise a StopIteration exception. This example simply counts to 3 and stops.
class CompleteIter:
    def __init__(self):
        self.counter = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.counter += 1
        # Use > rather than == so that later calls keep raising StopIteration.
        if self.counter > 3:
            raise StopIteration
        else:
            return self.counter

for i in CompleteIter():
    print(i)

# Output:
# 1
# 2
# 3
We see above a complete example of a working iterator.
An iterator is a class that implements certain methods, so it can use the __init__() method for one-time setup.
The __iter__() method simply returns the instance itself. CompleteIter features both __iter__(), to return an iterator (itself), and __next__(), to produce the values.
Each call to __next__() provides the value for the next iteration. In this case the numbers 1, 2, 3 are returned over 3 iterations. A StopIteration exception is raised on the 4th call, stopping the loop and not producing any further results.
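One consequence of an iterator returning itself is that it can be consumed only once. The following is a minimal sketch of that behaviour; the class name CountToThree is hypothetical, and its __next__() checks with >= so that it keeps raising StopIteration after it has been exhausted.

```python
class CountToThree:
    def __init__(self):
        self.counter = 0            # one-time setup

    def __iter__(self):
        return self                 # the object is its own iterator

    def __next__(self):
        if self.counter >= 3:
            raise StopIteration     # raised on every call once exhausted
        self.counter += 1
        return self.counter


counter = CountToThree()
first_pass = list(counter)          # consumes the iterator
second_pass = list(counter)         # same object, already exhausted
fresh_pass = list(CountToThree())   # a new instance starts again

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(fresh_pass)   # [1, 2, 3]
```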
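We saw how we create iterators by implementing __iter__() and __next__(). We now focus on what the Python runtime does with our class and how the iterator is initialised. The session below appears to use a version of CompleteIter in which each method prints its own name when it is called.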
>>> instance = CompleteIter()
__init__
>>> iterator = iter(instance)
__iter__
>>> first = next(iterator)
__next__
>>> second = next(iterator)
__next__
>>> third = next(iterator)
__next__
>>> fourth = next(iterator)
__next__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in __next__
StopIteration
>>>
In the example above we first create an instance of CompleteIter. This calls the __init__() method to initialise the instance, setting its initial variables. Note that we call iter() on the instance, which has the required methods, not on the class.
The first iterator-related step is the call to iter() on the instance. This calls __iter__(), returning the iterator.
Since CompleteIter returns itself this step is superfluous (we could continue to use instance in place of the iterator). We see this below where instance and iterator both have the same address[1], demonstrating that they are the same object. We use the above example to show the general case.
Each call to next() executes the __next__() method on the object. The last call to next() raises the StopIteration exception, so the variable fourth is never set.
>>> instance
<__main__.CompleteIter object at 0x7f8679c13e20>
>>> iterator
<__main__.CompleteIter object at 0x7f8679c13e20>
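The same sequence of calls can be checked in a script. The sketch below is an assumption about how such tracing might be set up, not the class used in the session above: TracedIter is a hypothetical name, and it records each special-method call in a list instead of printing. It also compares the two objects with is, which is a more direct identity test than comparing addresses by eye.

```python
class TracedIter:
    """Counts to 3 and records every special-method call (hypothetical)."""

    def __init__(self):
        self.calls = ["__init__"]
        self.counter = 0

    def __iter__(self):
        self.calls.append("__iter__")
        return self

    def __next__(self):
        self.calls.append("__next__")
        self.counter += 1
        if self.counter > 3:
            raise StopIteration
        return self.counter


instance = TracedIter()
iterator = iter(instance)
same_object = iterator is instance   # __iter__() returned self

values = [next(iterator), next(iterator), next(iterator)]
try:
    fourth = next(iterator)          # raises, so the assignment never happens
except StopIteration:
    fourth_was_set = False
else:
    fourth_was_set = True

print(same_object)      # True
print(values)           # [1, 2, 3]
print(fourth_was_set)   # False
print(instance.calls)
```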
Generators and the yield statement
There is another way to create iterators that has a much simpler syntax. We can define a function as we usually do, with the exception that we replace the return statement with yield. That’s it. Every time the function yields, it produces the output for that iteration. In our for loop, each value yielded becomes the loop variable for the next iteration.
We recreate our example as a generator.
def as_generator():
    yield 1
    yield 2
    yield 3

for i in as_generator():
    print(i)
The output from this example is the same as before. Note how much clearer and less verbose this is. The function pauses at each yield statement, and the next call resumes execution from the point where it last yielded. Note the example below:
>>> def as_generator():
...     print("Before first yield.")
...     yield 1
...     print("After 1")
...     yield 2
...     print("After 2")
...     yield 3
...     print("After 3")
...
>>> gen = as_generator()
>>> next(gen)
Before first yield.
1
>>> next(gen)
After 1
2
>>> next(gen)
After 2
3
>>> next(gen)
After 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
When a generator function is first called it does not run any of the code defined in the function. It merely returns an iterator. We see this with gen = as_generator(). The first result is yielded when next() is first run on this instance of the generator. Execution runs from the top of the function up to the first yield. After that, execution always resumes from the previous yield.
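The lazy start can be verified without reading printed output. In this minimal sketch the generator appends to a list instead of printing; the names traced_generator and log are made up for illustration. It also checks that the object returned by the call is its own iterator, just like the class-based iterators earlier.

```python
log = []

def traced_generator():
    log.append("started")
    yield 1
    log.append("resumed after 1")
    yield 2


gen = traced_generator()
log_after_call = list(log)          # still empty: no body code has run yet
is_own_iterator = iter(gen) is gen  # a generator returns itself from __iter__()

first = next(gen)                   # runs from the top to the first yield
log_after_first = list(log)

second = next(gen)                  # resumes just after the first yield
log_after_second = list(log)

print(log_after_call)    # []
print(log_after_first)   # ['started']
print(log_after_second)  # ['started', 'resumed after 1']
```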
We need not always use a separate yield for each result. The following wraps the range() function to return only even values.
def even_range(end):
    for i in range(0, end, 2):
        yield i

list(even_range(10))
# Output:
# [0, 2, 4, 6, 8]
Notice in the above example that we loop over the iterable created by range() and yield each result. This is consumed by list(). If we are yielding every item from another iterable we can use yield from directly, reducing the amount of code to write.
The example above is repeated using yield from.
def even_range(end):
    yield from range(0, end, 2)

list(even_range(10))
# [0, 2, 4, 6, 8]
The result is the same with slightly less code. This solution is cleaner when we want to yield from an iterator.
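A generator may also contain more than one yield from, each delegating to a different iterable in turn. The sketch below uses hypothetical function names to compare that form with the explicit loops it replaces.

```python
def evens_then_odds(end):
    # Delegate to one iterable until it is exhausted, then to the next.
    yield from range(0, end, 2)
    yield from range(1, end, 2)


def evens_then_odds_with_loops(end):
    # The same behaviour written with explicit loops, for comparison.
    for i in range(0, end, 2):
        yield i
    for i in range(1, end, 2):
        yield i


result = list(evens_then_odds(6))
print(result)  # [0, 2, 4, 1, 3, 5]
```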
Throughout this discussion about generators we have only covered iterators as objects that produce values. It is also possible to send values back into a generator. This is done by using yield as an expression and assigning its value to a variable.
def add_to_counter():
    counter = 0
    input_value = 0
    while True:
        counter += 1
        input_value = yield input_value + counter
In this example we have a generator, add_to_counter(), which will:
- receive a value at each iteration,
- assign that value to input_value,
- then increment a counter and yield the sum of that counter and the input value.
>>> gen = add_to_counter()
>>> gen.send(None)
1
>>> gen.send(5)
7
>>> gen.send(10)
13
>>> gen.send(16)
20
>>> gen.send(16)
21
We start the first iteration using .send(None). This is equivalent to next(). Since a generator begins at the top of the function, where no yield is yet waiting to receive a value, we can only send None. Any other value would raise a TypeError.
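Both claims can be checked with a short script. This sketch repeats the add_to_counter() generator so that it is self-contained; the variable names around it are made up for illustration.

```python
def add_to_counter():
    counter = 0
    input_value = 0
    while True:
        counter += 1
        input_value = yield input_value + counter


# Sending a non-None value to a generator that has not started fails.
fresh = add_to_counter()
try:
    fresh.send(5)
except TypeError:
    raised_type_error = True
else:
    raised_type_error = False

# next(gen) and gen.send(None) start a generator in the same way.
first_from_next = next(add_to_counter())
first_from_send = add_to_counter().send(None)

print(raised_type_error)                 # True
print(first_from_next, first_from_send)  # 1 1
```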
When Python arrives at a line like variable = yield value it will:
- first yield value,
- end the current iteration,
- assign the new value, sent to the generator using .send(new_value) at the start of the next iteration, to variable.
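The sequence above can be traced with a small generator. This is a minimal sketch with a hypothetical name, doubler(), which yields twice whatever was last sent to it; the initial value of 0 is an arbitrary choice.

```python
def doubler():
    value = 0
    while True:
        variable = yield value   # yield value, pause, then receive the sent value
        value = variable * 2     # used by the yield on the next pass


gen = doubler()
first = next(gen)      # runs to the first yield and produces 0
second = gen.send(3)   # variable becomes 3, the loop repeats and yields 6
third = gen.send(10)   # variable becomes 10, the loop repeats and yields 20

print(first, second, third)  # 0 6 20
```

Each .send() call therefore does two things: it delivers a value to the paused yield expression, and it returns whatever the generator yields next.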
This behaviour is similar to regular functions which receive parameters to change the behaviour of the function. In this case we can send a parameter to change the behaviour of a single iteration. It is key to how coroutines are implemented in Python.