Python 3.7 was released on June 27, 2018. It came with a few interesting features and many performance improvements. What exactly does the new Python version bring? Is it as cool as we all expected? Let’s see! In short, you will find there:
- new breakpoint() function
- async/await as reserved keywords
- module-level __getattr__ and __dir__
- nanosecond resolution in time functions
- postponed evaluation of type hints (annotations)
- context variables
- data classes
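To make two of the items above a bit more concrete, here is a minimal sketch of the nanosecond-resolution time function and of context variables. The names request_id, handler and show are hypothetical and chosen only for illustration.

```python
import contextvars
import time

# time.time_ns() (new in 3.7) returns an int number of nanoseconds,
# which avoids the precision loss of the float returned by time.time().
now_ns = time.time_ns()
print(type(now_ns).__name__)  # int

# A context variable with a default value (hypothetical name).
request_id = contextvars.ContextVar("request_id", default="none")


def show():
    # Reads the value visible in the currently active context.
    return request_id.get()


def handler():
    # This assignment is only visible inside the context it runs in.
    request_id.set("abc-123")
    return show()


ctx = contextvars.copy_context()
inside = ctx.run(handler)   # runs handler inside the copied context
outside = show()            # the outer context is unaffected
print(inside, outside)      # abc-123 none
```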
In this note I’m going to take a deep dive into data classes. In my opinion, this is the most important change in the latest Python version.
Why data classes are great
Data classes let you write more consistent and meaningful code. Like other cool Python features such as generators, they provide new syntax for expressing ideas more compactly and with fewer lines of code. This is what makes Python awesome (import this).
Let’s get to the point
Data classes provide a new class decorator that can be applied to classes that represent data structures. They came to the standard library with the new dataclasses module:
from dataclasses import dataclass
Simple data class
@dataclass
class City:
    citizens: int
    area: float

krakow = City(767, 326)
print(krakow)
> City(citizens=767, area=326)
With this four-line class declaration, we automatically get generated __init__ and __repr__ methods (and, by default, __eq__ as well).
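To see what the decorator saves us, here is a rough hand-written counterpart of the same class. ManualCity is a hypothetical name, and the methods are only an approximation of what the decorator generates, not its exact output.

```python
from dataclasses import dataclass


@dataclass
class City:
    citizens: int
    area: float


class ManualCity:
    """Hand-written approximation of the dataclass above."""

    def __init__(self, citizens: int, area: float):
        self.citizens = citizens
        self.area = area

    def __repr__(self):
        return f"ManualCity(citizens={self.citizens!r}, area={self.area!r})"

    def __eq__(self, other):
        # Only instances of exactly the same class are compared field by field.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.citizens, self.area) == (other.citizens, other.area)


print(City(767, 326) == City(767, 326))              # True
print(ManualCity(767, 326) == ManualCity(767, 326))  # True
print(ManualCity(767, 326))  # ManualCity(citizens=767, area=326)
```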
Ordered data classes
Moreover, we can also define ordered data structures and compare them when needed:
from dataclasses import dataclass, field

@dataclass(order=True)
class City:
    name: str = field(compare=False)
    citizens: int
    area: float

zamosc = City('Zamość', 65, 30)
krakow = City('Kraków', 767, 326)
ochock = City('Ochock', 3, 400)
bigger = zamosc if zamosc > krakow else krakow
print(f'Bigger city: {bigger}')
print(sorted([zamosc, krakow, ochock]))
> Bigger city: City(name='Kraków', citizens=767, area=326)
> [City(name='Ochock', citizens=3, area=400), City(name='Zamość', citizens=65, area=30), City(name='Kraków', citizens=767, area=326)]
This example introduces the new field function (from the dataclasses module). It lets you override the default behavior of the class for a single field. In this case, it excludes the city name from comparisons, so the name plays no part in ordering.
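Besides compare, field accepts a few other per-field options. The sketch below shows default_factory and repr; the districts and mayor_phone fields are hypothetical and exist only to illustrate the options.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class City:
    name: str
    citizens: int
    # default_factory builds a fresh list per instance (hypothetical field).
    districts: List[str] = field(default_factory=list)
    # repr=False hides the field from the generated __repr__ (hypothetical field).
    mayor_phone: str = field(default="", repr=False)


a = City("Kraków", 767)
b = City("Zamość", 65)
a.districts.append("Nowa Huta")

print(a)            # City(name='Kraków', citizens=767, districts=['Nowa Huta'])
print(b.districts)  # [] (the list is not shared between instances)
```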
As you can see, we can also sort data objects with sorted. What’s interesting here is that the order of the class attributes matters: instances are compared field by field in definition order, so in this case citizens is the most significant field and area only breaks ties.
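A small sketch of how the comparison behaves, using made-up cities: citizens decides first, area is consulted only on a tie, and name is ignored because of compare=False.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class City:
    name: str = field(compare=False)
    citizens: int
    area: float


small = City("A", 100, 10.0)
large = City("B", 100, 20.0)
tiny = City("C", 99, 500.0)

# Same number of citizens, so area breaks the tie.
print(small < large)  # True
# Fewer citizens wins regardless of the much larger area.
print(tiny < small)   # True
# compare=False also removes name from equality checks.
print(City("X", 1, 1.0) == City("Y", 1, 1.0))  # True
```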
Default dataclass parameters
By default, the dataclass decorator is set up with the following parameters:
@dataclass(init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
They correspond to autogenerated methods such as __init__, __repr__, __eq__, __hash__ and the ordering helpers (__lt__, __le__, __gt__, and __ge__), so you have control over what’s actually being generated in a specific case. The frozen parameter allows you to create immutable objects.
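As a minimal sketch of the frozen parameter, the hypothetical Point class below rejects attribute assignment. With the default eq=True, a frozen data class also gets a generated __hash__, so instances can be used in sets and as dict keys.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)

try:
    p.x = 10          # assignment on a frozen instance raises
    mutated = True
except FrozenInstanceError:
    mutated = False

print(mutated)  # False

# eq=True together with frozen=True makes instances hashable.
visited = {Point(1, 2)}
print(Point(1, 2) in visited)  # True
```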
Inheritance
Data classes can be subclassed. Let’s see how the fields of the resulting class are ordered.
@dataclass
class A:
    x: int = 1
    y: int = 2

@dataclass
class B(A):
    z: int = 3
    x: int = 5

print(A())
print(B())
print(B(0, 1, 2))
> A(x=1, y=2)
> B(x=5, y=2, z=3)
> B(x=0, y=1, z=2)
Field ordering from the base class is preserved: B redefines x after z, yet x keeps its original position and only its default value changes.
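If you want to check the resulting order without reading printed output, the fields function from the dataclasses module lists the fields in the order the generated __init__ expects them. A short sketch using the same two classes:

```python
from dataclasses import dataclass, fields


@dataclass
class A:
    x: int = 1
    y: int = 2


@dataclass
class B(A):
    z: int = 3
    x: int = 5


# fields() returns Field objects in definition order, base class first.
names = [f.name for f in fields(B)]
defaults = {f.name: f.default for f in fields(B)}

print(names)     # ['x', 'y', 'z']
print(defaults)  # {'x': 5, 'y': 2, 'z': 3}
```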
Last but not least
from dataclasses import asdict, dataclass

@dataclass
class Product:
    size: int
    quantity: int
    price: float

p = Product(10, 1, 1.23)
print(asdict(p))
> {'size': 10, 'quantity': 1, 'price': 1.23}
Whoa! We’ve just turned a Python object into a JSON-ready dict with just a few lines of code; it wasn’t that trivial before. With asdict, a module-level function from dataclasses, you get a dict representation of the data structure held by a given data class instance. Pass it to json.dumps or send it through your favourite API framework.
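The remaining step to actual JSON is a call to json.dumps from the standard library. The sketch below serializes the Product from above and, assuming a flat class with JSON-friendly field types, rebuilds it from the decoded dict.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Product:
    size: int
    quantity: int
    price: float


p = Product(10, 1, 1.23)

# asdict() gives a plain dict, which json.dumps can encode directly.
payload = json.dumps(asdict(p))
print(payload)  # {"size": 10, "quantity": 1, "price": 1.23}

# For a flat data class, the decoded dict maps straight onto __init__.
restored = Product(**json.loads(payload))
print(restored == p)  # True
```

Nested data classes are converted recursively by asdict, but rebuilding them from JSON needs extra code, because json.loads returns plain dicts.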
Conclusion about Python 3.7 features
Data classes can be a very useful improvement to your daily coding style. It’s nice to keep the data model separate from business logic services, and with this new feature that is easier to achieve. The new module lets you write data stores very explicitly. Even if it doesn’t provide any new functionality, it’s definitely worth adjusting your code base. When updating from recent versions of Python, it doesn’t bring any backwards-incompatible changes, because it lives in a brand new module. However, when upgrading to Python 3.7, it can pay off to refactor existing classes to make use of dataclasses. Just identify your data objects and reduce your lines of code with the @dataclass decorator. “Readability counts”. Stay tuned!
Reference
https://www.python.org/dev/peps/pep-0557/
https://docs.python.org/3/whatsnew/3.7.html