Indenting Python Pythonically

According to the Zen of Python, "There should be one — and preferably only one — obvious way to do it." Also, "simple is better than complex." By these criteria, Python's PEP8 indentation rules are unpythonic.

First, there is a choice of using a so-called hanging indent

    def long_function_name(
            var_one, var_two,
            var_three,
            var_four):
        print(var_one)

or of aligning with the opening parenthesis

    def long_function_name(var_one, var_two,
                           var_three,
                           var_four):
        print(var_one)

With the latter, what if the name of the function changes? Every continuation line then has to be realigned. Perhaps you have an editor that automatically reindents the following lines to maintain alignment, or you run all your code through a formatter, but why should that be necessary?

The latter choice would also mean allowing

    if (this_is_one_thing and
        that_is_another_thing):
        do_something()

even though the length of "if (" coincides with that of the indentation unit, so that the indentation fails to distinguish the line on which the nested statements begin. And so PEP8 has to treat this as a special case

    if (this_is_one_thing
            and that_is_another_thing):
        do_something()

But in cases other than an 'if' statement, that extra indentation is not allowed unless a line break is added after the '('.

Then there is the choice that PEP8 offers between

    my_list = [
        1, 2, 3,
        4, 5, 6,
        ]
    result = some_function_that_takes_arguments(
        'a', 'b', 'c',
        'd', 'e', 'f',
        )

and

    my_list = [
        1, 2, 3,
        4, 5, 6,
    ]
    result = some_function_that_takes_arguments(
        'a', 'b', 'c',
        'd', 'e', 'f',
    )

The second option entails adding another special case to the rules, i.e., treating closing parentheses differently from other code. There is not even any particularly compelling reason for giving the closing square or round bracket a line of its own.

PEP8 indentation rules even allow for continuation lines to be indented by fewer than four spaces.

Choice is appealing, but short and simple rules are what give real freedom. It's because all drivers must stop at every red light that we can all get across town relatively quickly. The PEP8 rules, in contrast, invite questions and discussion and time spent deciding between the relative importance of personal preferences and consistency with existing code.

Here then, I submit, is the one obvious way to do it:

  1. Nested statements are indented four spaces.
  2. Continuation lines are indented by eight spaces relative to the first line of the statement.

That's all.
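The two rules are compact enough to check mechanically. What follows is only a minimal sketch, not a real linter: the function name `check_indentation` and its messages are made up for illustration, and it assumes space-only indentation with no comments or multiline string literals.

```python
def check_indentation(source):
    """Return (line_number, message) pairs for lines that break the two rules.

    Hypothetical helper: assumes spaces only, no comments and no
    multiline string literals.
    """
    problems = []
    statement_indent = 0  # indentation of the line that began the current statement
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip(" ")
        if not stripped:
            continue  # blank lines carry no indentation information
        indent = len(line) - len(stripped)
        if indent <= statement_indent + 4:
            # Rule 1: this line begins a statement, nested by at most one level.
            if indent % 4:
                problems.append(
                    (number, "statement indent is not a multiple of four"))
            statement_indent = indent
        elif indent != statement_indent + 8:
            # Rule 2: a continuation line sits exactly eight spaces in.
            problems.append(
                (number, "continuation line is not indented by eight spaces"))
    return problems


SAMPLE = """\
def f(a,
        b):
    x = [
            1, 2]
    y = g(a,
          b)
"""

print(check_indentation(SAMPLE))
# [(6, 'continuation line is not indented by eight spaces')]
```

Only the last line of the sample is reported: it is aligned with the opening parenthesis rather than indented by eight spaces.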

Here's how it looks:

    def long_function_name(var_one,
            var_two, var_three,
            var_four):
        my_list = [
                1, 2, 3,
                4, 5, 6]
        nest_dict = {
                a: [
                (1, 'a'),
                (2, 'b'),
                ],
                b: [
                (3, 'c'),
                (4, 'd'),
                ]}
        if (var_one > len(my_list) and
                nest_dict[a][0] == 1):
            do_something()
        print var_one
        result = some_function('a',
                'b', 'c', 'd', 'e', 'f')

That example is not PEP8-compliant, but you're free to add line breaks to satisfy PEP8's "When using a hanging indent ... there should be no arguments on the first line". The pep8 checker will also complain about the lack of indentation for the nested data structure — though in fact PEP8 itself says nothing about intra-statement indentation. If you prefer to liberalize the above rules to include indentation for data structures, you can satisfy the pep8 tool (at least version 1.6.2) with code like

    def long_function_name(
            var_one,
            var_two, var_three,
            var_four):
        my_list = [
                1, 2, 3,
                4, 5, 6]
        nest_dict = {
                a: [
                    (1, 'a'),
                    (2, 'b'),
                ],
                b: [
                    (3, 'c'),
                    (4, 'd'),
                ]}
        if (var_one > len(my_list) and
                nest_dict[a][0] == 1):
            do_something()
        print var_one
        result = some_function(
                'a',
                'b', 'c', 'd', 'e', 'f')

For pylint, on the other hand, that's still not enough: it will tut-tut loudly at the universal eight-space indentation for continuation lines. To silence it, I suggest simply disabling 'bad-continuation' in your pylintrc.
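As a sketch, the relevant pylintrc fragment might look like the following. It assumes a pylint version that still ships the 'bad-continuation' check; the check has been dropped from more recent releases, where no such setting is needed.

```ini
[MESSAGES CONTROL]
disable=bad-continuation
```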

Now, here's a useful and pragmatic test of an indentation scheme: does the indentation alone allow all statements to be distinguished from one another?

Put another way, could a simple program reliably pick out the statements by considering only the leading whitespace? For this, we exclude comments and multiline string literals. Pipe that last example above through

    perl -ne'{/^( *)(\S*.*)$/; if (!$2) {print} else {
        $l = length $1;
        if ($l <= $last + 4) {$last = $l; $c = ($c + 1) % 10};
        $i = $1; $m = $2; $m =~ s/./$c/g; print "$i$m\n"}}'

to see the result:

    11111111111111111111111
            11111111
            1111111111111111111
            1111111111
        22222222222
                22222222
                22222222
        3333333333333
                3333
                    333333333
                    333333333
                33
                3333
                    333333333
                    333333333
                33
        444444444444444444444444444444
                4444444444444444444444
            55555555555555
        6666666666666
        77777777777777777777777
                7777
                777777777777777777777777
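For readers more at home in Python than Perl, the same test can be sketched as a small function. This is a rough port rather than a line-for-line translation; the name `mark_statements` is invented for illustration, and like the one-liner it assumes space-only indentation and input free of comments and multiline string literals.

```python
def mark_statements(source):
    """Replace each line's text with the digit of the statement it belongs to.

    Hypothetical helper: only leading spaces are inspected, never the code.
    """
    marked = []
    last_indent = 0  # indentation of the most recent statement's first line
    counter = 0      # statement number, cycling through the digits 0-9
    for line in source.splitlines():
        text = line.lstrip(" ")
        if not text:
            marked.append(line)  # blank lines pass through unchanged
            continue
        indent = len(line) - len(text)
        if indent <= last_indent + 4:
            # Not indented far enough to be a continuation line,
            # so a new statement starts here.
            last_indent = indent
            counter = (counter + 1) % 10
        marked.append(" " * indent + str(counter) * len(text))
    return "\n".join(marked)


SAMPLE = """\
if (a and
        b):
    f()
"""

print(mark_statements(SAMPLE))
# 111111111
#         111
#     222
```

The whole decision rests on one comparison: a line indented by more than four spaces beyond the start of the current statement is treated as a continuation, and anything else begins a new statement.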

—  Michael Breen, 2017