How to automate your coding style? A short guide to code formatting, pre-commit and CI.

Piotr Gołdyś • Aug 10

Readability counts

Code is communication. We are used to thinking about it this way – using some language to tell a computer what to do, translating your thoughts into instructions. But it’s not about computers only. 

Code is a way of communicating with your colleagues, both current and future. Sometimes with your own future self. 

This little guide, I hope, will help to make this communication a little more understandable and effortless, by using automated code formatting and other useful code styling checks.

Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code.
...[Therefore,] making it easy to read makes it easier to write.

Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

Why does code readability matter?

  • Code is read much more often than it is written.
  • Better readability means easier maintainability.
  • Readable code makes onboarding easier.
  • It prevents or delays “code rot”.
  • It simply makes everyone’s life easier.

What you will learn

  • How to use a code formatter. I used black as an example.
  • How to automate it for yourself and your team.
  • And how to forget about it.

Code formatting in Python

As an example I’m using black. It’s a Python-specific, open-source tool used successfully by many great companies and projects.

Visit an online playground for a quick glance at its formatting capabilities.

Install it with the package manager of your choice:

$ pip3 install black

Use it right away with the default settings (the dot stands for the current working directory):

$ black .

And that’s it – pretty straightforward. Our code should be formatted automatically.

Settings

To tweak the settings, put them in the pyproject.toml file in the repository’s root, under the [tool.black] section.

More on that in the documentation.
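As a minimal sketch, the snippet below writes such a file from the shell. The three values (a line length of 100, a Python 3.8 target and a skipped migrations directory) are assumptions chosen for illustration, so pick whatever your team agrees on.

```shell
# Writes a minimal black configuration to pyproject.toml in the current directory.
# NOTE: this overwrites an existing pyproject.toml; in a real project, merge by hand.
# All values are illustrative assumptions, not recommended settings.
cat > pyproject.toml <<'EOF'
[tool.black]
# Maximum line length (black's default is 88).
line-length = 100
# Python versions the formatted code should stay compatible with.
target-version = ["py38"]
# Regular expression for extra paths that black should skip.
extend-exclude = "/migrations/"
EOF
```

Running `black .` afterwards should pick these settings up automatically, because black looks for pyproject.toml in the project root.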

Automation

Cool, we know how to format code with black. But now every developer needs to remember to do it before every commit. There’s a way to automate it, however.

Git has a way to fire off custom scripts when certain important actions occur, like pre-commit. These scripts are called “hooks”.

pre-commit also happens to be the name of a framework that helps to manage those.

It’s language-agnostic, which means you can use it in any project, and I highly encourage you to.

Install pre-commit with your preferred package manager:

$ pip3 install pre-commit

Configure it in a .pre-commit-config.yaml file in the repository’s root. I put the black code formatter in mine, along with a few other useful hooks.

Check out this repository for more useful plug&play pre-commit scripts.
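A configuration along these lines could look like the sketch below, written out from the shell. The selection of extra hooks and the pinned `rev` versions are assumptions for illustration; `pre-commit autoupdate` can bump the pins to current releases.

```shell
# Writes a minimal pre-commit configuration to the current directory.
# The chosen hooks and the pinned versions are illustrative assumptions.
cat > .pre-commit-config.yaml <<'EOF'
repos:
  # A few general-purpose hooks maintained by the pre-commit project.
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
  # The black code formatter.
  - repo: https://github.com/psf/black
    rev: 23.7.0
    hooks:
      - id: black
EOF
```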

Set up the git hook scripts (this is repository-specific):

$ pre-commit install

Now the scripts from the config should run automatically with `git commit`.

If you’re committing through your IDE’s UI: in PyCharm you can enable the hooks with a checkbox in the Commit dialog options, which appears after installing pre-commit (enabled by default).

Committing with pre-commit

When trying to commit unformatted code, two things should happen.

Firstly, we should get an error, both in the IDE and in the terminal.

Secondly, our code should get formatted automatically.

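No commit has been made yet: the original changes are still staged, and the reformatted files show up as modified.

To commit the now-formatted changes, run git commit one more time. On the command line, stage the reformatted files first (for example with `git add`). The commit should succeed now.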

More automation

Let’s go one step further. We have our pre-commit convention in place for our team. But how do we make sure that everyone actually runs it locally when committing, and that messy code won’t make it into, for example, our master branch?

Continuous Integration is the answer.

A simple configuration file for GitLab CI/CD (.gitlab-ci.yml) is enough. I put pre-commit in as a part of the pipeline that starts on certain trigger events, like Merge Requests.

If that’s a new subject to you, here are some resources to help you get started. CI configuration looks different for GitHub or other hosting services, so follow your provider’s documentation.

Pre-commit gets installed on the runner machine just like it would on a local computer and then it runs on all files within the repository (by default pre-commit only checks files staged for commit). If it fails, we shouldn’t be able to merge our changes.
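A pipeline like the one described above can be sketched as follows. The job name, the stage name and the Docker image are assumptions made for illustration, so adapt them to your own runner setup.

```shell
# Writes a minimal GitLab CI configuration to the current directory.
# Job name, stage name and image are illustrative assumptions.
cat > .gitlab-ci.yml <<'EOF'
stages:
  - lint

pre-commit:
  stage: lint
  image: python:3.11
  rules:
    # Run this job in merge request pipelines only.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    # Install pre-commit on the runner, then check every file in the repository.
    - pip install pre-commit
    - pre-commit run --all-files
EOF
```

The `--all-files` flag is what makes the job check the whole repository instead of only staged files.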

MR with unformatted code is rejected (pipeline fails)

After pushing fixes, the pipeline starts again. It takes about 1 minute, but there’s room for optimisations here, for example using caching mechanisms.

Pipeline passed for formatted code, MR is ready to merge.

What about git blame?

Let’s talk about one more issue.

Oftentimes we want to make changes to an existing project.

Imagine the following situation:

You join a project with a large, existing codebase. Multiple authors of the code, multiple coding styles colliding with each other. Of course you want to be a saviour and fix this ASAP. You run black (or Prettier, or any other code formatter for that matter) on the whole repository, everyone cheers you – good job.

A few days pass and you stumble upon some unfamiliar code fragment. You don’t quite understand it, so you’d like to check who wrote those lines and ask them for help (or just find out who came up with such a stupid solution). You fire the `git blame` command or use PyCharm’s Annotate feature and to your surprise – it is you who is the author of the last change, and it was just a few days ago.

Yes – of course the formatting commit messed up the whole change history in the repository.

ignore-revs-file

Since version 2.23 (August 2019), Git can ignore selected commits in the `git blame` command.

More on that in the Git documentation.

In order to use this to our advantage during repository-wide formatting and to avoid obfuscating change history, we need to split our formatting procedure into two separate commits.

The first commit just runs our code formatter on the repository, like we would normally do.

The magic happens in the second one. We need to copy the full hash of the first commit (the formatting one) and put it into a file, here .git-blame-ignore-revs. Then we pass a command to git, so that it knows that `git blame` should ignore the commits listed in that file.

Example file with revisions to ignore

Command:

$ git config blame.ignoreRevsFile .git-blame-ignore-revs

From now on, `git blame` should ignore the specified commits and show the previous changes instead. For PyCharm’s Annotate, an IDE restart is required.
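The whole procedure can be tried out safely in a throwaway repository, as in the sketch below. Everything in it is made up for the demo: the calc.py file, the author identity, and the formatting commit, which is edited by hand here instead of running black.

```shell
# Throwaway demo repository; file names and identities are made up.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q .
git config user.name "Original Author"
git config user.email "author@example.com"

# Commit 1: the "real" change, written in a messy style.
printf 'def add(a,b):\n    return a+b\n' > calc.py
git add calc.py
git commit -q -m "Add calc module"
original_rev="$(git rev-parse HEAD)"

# Commit 2: formatting only (hand-edited stand-in for running black).
printf 'def add(a, b):\n    return a + b\n' > calc.py
git commit -q -am "Reformat code"
format_rev="$(git rev-parse HEAD)"

# Commit 3: list the formatting commit's full hash in the ignore file.
{
  echo "# Repository-wide formatting"
  echo "$format_rev"
} > .git-blame-ignore-revs
git add .git-blame-ignore-revs
git commit -q -m "Ignore formatting commit in git blame"

# One-time, local-only setting that every developer runs for their clone.
git config blame.ignoreRevsFile .git-blame-ignore-revs

# The commit that git blame now reports for line 1 of calc.py.
blamed_rev="$(git blame --porcelain calc.py | head -n 1 | cut -d' ' -f1)"
echo "original commit:   $original_rev"
echo "formatting commit: $format_rev"
echo "blamed commit:     $blamed_rev"
```

Lines starting with `#` in the ignore file are comments, which makes it easy to note why each commit is listed.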

There’s a small cost to that, however.

The command above only changes the local repository configuration, so every developer needs to run it themselves.

The good news is that it is a one-time action, so we can put this info into our README.md or CONTRIBUTING.md file along with the other instructions on how to set up the project. That shouldn’t be a big deal after all.

Example CONTRIBUTING.md file containing setup instructions for project using pre-commit and repo-wide formatting commits.
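Such a file could be as short as the sketch below, written out from the shell. The headings and wording are assumptions; only the commands come from the steps covered earlier.

```shell
# Writes a minimal CONTRIBUTING.md to the current directory.
# Headings and wording are illustrative assumptions.
cat > CONTRIBUTING.md <<'EOF'
# Contributing

## One-time setup

Install the git hooks so that formatting checks run on every commit:

    pip3 install pre-commit
    pre-commit install

Make `git blame` skip repository-wide formatting commits:

    git config blame.ignoreRevsFile .git-blame-ignore-revs
EOF
```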

Bonus

Don’t forget to show how cool you are by adding badges to your repo’s README 🙂
