Behind the scenes of Python closures
In this article, we will learn what a Python closure is, how to use it and what it can be used for. We will also explain some related terms, such as nonlocal and free variables, variable scope, and nested and first-class functions.
First, let’s look at the definition from Wikipedia:
(…) closure, (…) is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function (variables that are used locally but defined in an enclosing scope) with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure’s copies of their values or references, even when the function is invoked outside their scope.
This might be a bit difficult to understand for now, so let’s explain some terms and look at some examples.
Nested functions
A function that is defined inside another function is called a nested function. For example:
def outer(variable):
    var = variable

    def nested():
        print(var)

    nested()

outer(1)  # 1
Variable scope
In Python, a variable can be accessed only from inside the scope in which it was created. In the example above, we can call the nested function only inside the outer function. The same goes for the var variable: we can’t access it outside the outer function.
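A minimal sketch of this scoping rule, reusing outer and nested from the example above. Looking up nested or var at module level is expected to raise a NameError; the try/except blocks and the caught list are only there, as an illustration, so that the snippet runs to the end.

```python
def outer(variable):
    var = variable

    def nested():
        print(var)

    nested()  # works: nested is visible inside outer


outer(1)  # prints 1

caught = []  # hypothetical helper list, used only to record the errors

# Outside outer, the name nested does not exist.
try:
    nested()
except NameError as error:
    caught.append(type(error).__name__)

# The same is true for the local variable var.
try:
    print(var)
except NameError as error:
    caught.append(type(error).__name__)

print(caught)  # ['NameError', 'NameError']
```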
First-class functions
In Python, functions are first-class objects. This means that a function can be treated like any other object: it can be passed as an argument to another function, assigned to a variable, or returned from a function.
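The three uses mentioned above can be sketched as follows. The names shout, whisper, apply and pick are hypothetical and exist only for this illustration.

```python
def shout(text):
    return text.upper() + "!"


def whisper(text):
    return text.lower() + "..."


def apply(func, value):
    # A function received as an argument is called like any other.
    return func(value)


def pick(loud):
    # A function can be returned from another function.
    return shout if loud else whisper


alias = shout  # a function can be assigned to a variable

print(alias("hi"))            # HI!
print(apply(shout, "hello"))  # HELLO!
print(pick(False)("Hello"))   # hello...
```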
Closure example compared to a class-based example
Let’s say we want to get the maximum number from a collection to which we can keep adding elements.
Class-based example
class Maximum:
    def __init__(self):
        self.collection = []

    def __call__(self, new_value):
        self.collection.append(new_value)
        return max(self.collection)

maximum = Maximum()
print(maximum(1))  # 1
print(maximum(3))  # 3
print(maximum(2))  # 3
Closure example
def make_maximum():
    collection = []

    def maximum(new_value):
        collection.append(new_value)
        return max(collection)

    return maximum

maximum = make_maximum()
print(maximum(1))  # 1
print(maximum(3))  # 3
print(maximum(2))  # 3
As we can see, both examples are very similar. It is obvious where the class Maximum keeps collection: it is an attribute of its instance. But where does the closure keep collection? When we call maximum, make_maximum finished executing long ago. Note that collection is a local variable of make_maximum, initialized there with collection = []. We can even extend this example by deleting make_maximum, and it will still work.
maximum = make_maximum()
del make_maximum
print(maximum(1)) # 1
print(maximum(3)) # 3
print(maximum(2)) # 3
So how can we access collection? Well, it is a free variable. This means that maximum uses it but does not define it in its own local scope. The closure keeps the binding to the free variable that existed when the function was defined, so it can still be used later, even when make_maximum no longer exists.
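One way to check which names are free variables is to inspect the function’s code object. The sketch below assumes the closure-based make_maximum from earlier; __code__.co_freevars and __code__.co_varnames are introspection attributes that list the free and local variable names of a function.

```python
def make_maximum():
    collection = []

    def maximum(new_value):
        collection.append(new_value)
        return max(collection)

    return maximum


maximum = make_maximum()
maximum(1)
maximum(3)

# Names that maximum uses but that are defined in an enclosing scope.
print(maximum.__code__.co_freevars)  # ('collection',)

# Names that are local to maximum itself (here only its parameter).
print(maximum.__code__.co_varnames)  # ('new_value',)
```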
Nonlocal variable
Our example code can easily be made more efficient. Every time we add a new value to collection, we compute the maximum of the whole collection. The max function has O(n) time complexity because it always goes through all the elements. Instead, we can store the maximum from the previous state, compare it with the new value and return the higher number. It could look like this:
def make_maximum():
    highest = 0

    def maximum(new_value):
        highest = new_value if new_value > highest else highest
        return highest

    return maximum

maximum = make_maximum()
print(maximum(1))
print(maximum(3))
print(maximum(2))
We replaced collection with highest, and it should work fine. Well, not so fast: it raises an error.
UnboundLocalError: local variable 'highest' referenced before assignment
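The problem here is that previously we had a mutable list and we only called append on it, so the name collection was never reassigned. Now highest holds an immutable int, and the only way to update it is to assign to the name. An assignment such as highest = new_value or highest = highest makes Python treat highest as a local variable of maximum throughout the function body, so reading it on the right-hand side fails because the local has no value yet. It is not a free variable anymore.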
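The effect of the assignment can be observed without calling the function. This is a small sketch that reuses the failing make_maximum from above; co_varnames and co_freevars are attributes of the function’s code object that list its local and free variable names, and the error_name variable is a hypothetical helper for the illustration.

```python
def make_maximum():
    highest = 0

    def maximum(new_value):
        # The assignment below makes highest a local variable of maximum.
        highest = new_value if new_value > highest else highest
        return highest

    return maximum


maximum = make_maximum()

# highest is listed as a local name, and there are no free variables.
print(maximum.__code__.co_varnames)  # ('new_value', 'highest')
print(maximum.__code__.co_freevars)  # ()

# Calling the function reads the local before it has a value.
try:
    maximum(1)
    error_name = None
except UnboundLocalError as error:
    error_name = type(error).__name__

print(error_name)  # UnboundLocalError
```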
To fix this, we need to use the nonlocal keyword. It declares that the name refers to the variable in the enclosing function’s scope, so it stays a free variable and an assignment rebinds the value stored in the closure. A correct example should look like this:
def make_maximum():
    highest = 0

    def maximum(new_value):
        nonlocal highest
        highest = new_value if new_value > highest else highest
        return highest

    return maximum

maximum = make_maximum()
print(maximum(1))  # 1
print(maximum(3))  # 3
print(maximum(2))  # 3
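One caveat of starting with highest = 0 is that the function returns 0 when only negative numbers are passed in, unlike the collection-based version. A possible variation, offered here only as a sketch and not as part of the original example, starts from None as a "nothing seen yet" marker:

```python
def make_maximum():
    highest = None  # assumption: None means no value has been seen yet

    def maximum(new_value):
        nonlocal highest
        # The first value always wins; later values must be larger.
        if highest is None or new_value > highest:
            highest = new_value
        return highest

    return maximum


maximum = make_maximum()
print(maximum(-5))  # -5
print(maximum(-2))  # -2
print(maximum(-7))  # -2
```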
Requirements for closure
There are three requirements to create a closure:
- We need to have a nested function.
- The nested function needs to refer to a free variable from the enclosing function (collection or highest in our examples).
- The enclosing function needs to return the nested function.
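The second requirement can be checked with the __closure__ attribute, which is covered later in this article. In this sketch, make_plain and make_closure are hypothetical names: both return a nested function, but only the one that refers to a variable of its enclosing function ends up with a closure.

```python
def make_plain():
    def nested():
        return "no free variables here"

    return nested


def make_closure():
    message = "captured"

    def nested():
        return message  # message is a free variable of nested

    return nested


plain = make_plain()
closure = make_closure()

print(plain.__closure__)                # None
print(closure.__closure__ is not None)  # True
print(closure())                        # captured
```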
When closures can be used
There are some cases in which closures might be a better choice:
- When we want to reduce the use of global variables. This is a form of data hiding.
- When the behavior we need to implement is simple. For more complex tasks, it is better to go with classes and OOP.
Here is a simple example:
elements = list(range(20))

def stepper_of(n):
    def every(seq):
        return seq[::n]

    return every

every_second = stepper_of(2)
every_third = stepper_of(3)
print(every_second(elements))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
print(every_third(elements))  # [0, 3, 6, 9, 12, 15, 18]
print(every_third(every_second(elements)))  # [0, 6, 12, 18]
__closure__ attribute
What if someone asked us to get the highest number without calling maximum? There is a way to get the values enclosed by a closure. We need to use the __closure__ attribute. It returns a tuple of cell objects, one for each free variable. For us:
print(maximum.__closure__) # (<cell at 0x7fe3fdb9b5b8: int object at 0x55f6a5f0b040>,)
To get the value itself, we need to read the cell_contents attribute of the cell:
print(maximum.__closure__[0].cell_contents) # 3
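The same attribute also shows that every call to make_maximum creates a separate cell, so closures do not share state with each other. A short sketch, assuming the nonlocal version of make_maximum; the names first and second are hypothetical:

```python
def make_maximum():
    highest = 0

    def maximum(new_value):
        nonlocal highest
        highest = new_value if new_value > highest else highest
        return highest

    return maximum


first = make_maximum()
second = make_maximum()

first(10)
second(4)
print(first.__closure__[0].cell_contents)   # 10
print(second.__closure__[0].cell_contents)  # 4

# Updating one closure leaves the other untouched.
first(25)
print(first.__closure__[0].cell_contents)   # 25
print(second.__closure__[0].cell_contents)  # 4
```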