Rational Java Indentation and Style

Michael Breen, 2004. $Revision: 1.8 $

Preface

When I began writing Java some time back I didn't find any standard I was completely happy with. The following has been added to whenever I've had a chance and just might prove useful to someone with similar prejudices (or even an open mind). As far as possible, I've tried to include the reasoning behind various conventions.

Note that while many coding standards address issues such as variable naming standards, sometimes leaving freedom on "religious" issues like brace style, this is a style guide which takes the opposite approach and concentrates on visual form and structure.

Comments

Prologue Comments

The single most valuable goal for a coding style is to minimize wasted time. Readability follows as a sub-goal: code that is easy to read and understand saves time spent in maintenance and debugging. However, the time spent writing code must also be considered. Years ago, it was not unknown for C programmers to spend as much time on borders as the average customs officer:

bad
   /***********************************
    * A beautifully-framed comment    *
    * about a very important          *
    * C function or module.           *
    ***********************************/

Sun's widely-followed convention of using only a left-hand border is certainly an improvement. However, putting an asterisk on each line still takes time; perhaps relatively little per line, but it all adds up (arguments that begin, "But my editor / macro / tool..." must be disregarded).

bad
   /**
    * Constructs a list containing the elements of
    * the specified collection, in the order they
    * are returned by the collection's iterator.
    *
    * @param c the collection whose elements are to
    * be placed into this list.
    * @throws NullPointerException if the specified
    * collection is null.
    */
   public void myMethod(...

An improvement in readability would perhaps be worth the extra effort. However, prologue comment text is not difficult to distinguish from code: Even at a glance, the general absence of indentation and the relatively uniform starting position and line length give these comments a blockish shape quite different from ragged code.

An exception is where code is included in a comment, for example, to show how the methods of a class should be used. But such comments are rare. Moreover, any potential confusion for a maintainer is brief, being resolved as soon as they scroll to the top or bottom of the sample code (which is of course the first thing you do with code anyway, so that you know what you're looking at).

Conclusion: the advantage of delimited comments is not that you are free to put fancy borders between the "/*" and the "*/", it is that, unlike line comments, you are free not to put anything extra on each line. Neither is any indentation relative to the comment delimiters necessary:

OK
   /**
   Constructs a list containing the elements of
   the specified collection, in the order they
   are returned by the collection's iterator.

   @param c the collection whose elements are to
   be placed into this list.
   @throws NullPointerException if the specified
   collection is null.
   */
   public void myMethod(...

Normally, comment delimiters each occupy a line of their own. However, it is acceptable to put the entire comment on one line if length permits.

OK
   /** @return the only instance of this class */
   public static MyClass getInstance()
   {
      return onlyInstance;
   }

Use only delimited comments (not line comments) for module and method prologues and descriptions of class and instance variables.
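
As an illustration, the sketch below applies this rule to a small class. EventCounter and its members are hypothetical names invented for the example; only the placement and form of the comments matter.

```java
/**
Counts events. A hypothetical class (not from any library),
used only to show where delimited comments go.
*/
class EventCounter
{
   /** Number of events recorded since the last reset. */
   private int count = 0;

   /**
   Records one event.

   @return the number of events recorded since the last reset.
   */
   int record()
   {
      return ++count;
   }

   /** Discards all recorded events. */
   void reset()
   {
      count = 0;
   }
}
```

The class, the field and both methods each get a delimited comment, on one line where length permits.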

Code Comments

The reverse arguments apply for comments within code blocks, where only line comments should be used. In this context, comments should be brief (usually one line) and, being indented to various levels together with the code, are more easily distinguished from it by the leading "//" on each line. At the same time, two lines are saved that would otherwise be needed for the "/*" and "*/" of delimited multiline comments, thus helping to minimize separation of the code (though one may choose to add a blank line before the comment for readability).

OK
   private TokenBuffer getTokenBuffer()
   {
      if (tokBuffer == null)
         tokBuffer = new TokenBuffer(BUF_SIZE);
      // If the parser is reinitialised the token
      // manager may change - so make sure we're
      // buffering the current one.
      if (bufferedSource != tokenSource)
      {
         bufferedSource = tokenSource;
         tokenSource.addTokenListener(tokBuffer);
      }
      return tokBuffer;
   }

Avoid trailing comments, that is, comments following code on the same line - this makes little difference when reading the code sequentially but the reduced salience of the leading "//" when it is not at the beginning of the line tends to slow down scanning (the eye takes a little longer to separate code from comment). Also, put any comment on the line before the corresponding statement, never within a statement, where it gives a false visual cue by separating code that should be kept together.

bad
      if (args[i].charAt(0) == '-')
      {
         ...
      }
      else // args[i] must be a file to open
      {
         ...
      }

bad
      if (args[i].charAt(0) == '-')
      {
         ...
      }
      else
      // args[i] must be a file to open
      {
         ...
      }

OK
      if (args[i].charAt(0) == '-')
      {
         ...
      }
      else
      {
         // args[i] must be a file to open
         ...
      }

Inline comments should be used sparingly for brief explanations or references to longer ones elsewhere, and not to repeat information already present in the code. Most methods may require no inline comments but be sure to include them if they're needed for a maintainer (including you, when you've forgotten the code in a couple of months) to understand why things are as they are and so reduce not only the time spent trying to understand the code but also the danger of a bug being introduced with a future change.

If an algorithm needs to be explained, consider putting the explanation in a delimited comment at the end of the file with a reference to this comment within the method. Separating the description from the implementation avoids breaking up the code with a multitude of comments, which could just make an already complex control structure even longer and more difficult to follow. It also allows more freedom in writing and formatting the description (which, in turn, is freed from the repeated intrusion of code fragments).
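
A minimal sketch of this arrangement follows, assuming Euclid's algorithm as the thing to be explained. The class name Gcd and the wording of the explanation are illustrative choices only.

```java
class Gcd
{
   /** @return the greatest common divisor of a and b */
   static int gcd(int a, int b)
   {
      // See "Euclid's algorithm" at the end of this file.
      while (b != 0)
      {
         int remainder = a % b;
         a = b;
         b = remainder;
      }
      return Math.abs(a);
   }
}

/*
Euclid's algorithm

Any common divisor of a and b also divides a % b, and any
common divisor of b and a % b also divides a. So the pair
(a, b) can be replaced by (b, a % b) without changing the
greatest common divisor. The second value shrinks in
magnitude on every step, so the loop ends with b == 0, at
which point the answer is the magnitude of a.
*/
```

The method body stays short enough to take in at a glance, while the explanation can run to whatever length it needs.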

Line Length

Maximum line length is 69 visible characters (i.e., not including carriage return or line-feed characters).

There are good reasons for a conservative line length policy. First, it helps minimize wasted white space: It is often useful to be able to see at least two (and sometimes three) files side-by-side on the screen at once - or to view different parts of the same file in separate windows. It is also convenient to configure the default window size of an editor to be exactly wide enough to allow you to do this. Having to continually resize editor windows or do horizontal scrolling is distracting and vexing.

Second, it makes code easier to read. This is the main reason that ordinary text is printed in two or more columns when the paper width is large relative to the font size, as in a newspaper or a dictionary for example.

Shorter lines also allow headroom for unforeseen future requirements. For example, while you may currently find that an 80 character width is fine for printing two columns of code per page in landscape format, you may wish in future to precede each line in the printout with its number, or a coverage instrumentation tool might be used to annotate the start of each line with the number of times it's executed in regression tests. You might even need to submit the code to some standards body or authority which demands a shorter maximum line length.
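
The limit is easy to check mechanically. The sketch below is one possible checker, not part of this standard: LineLengthCheck and longLines are hypothetical names, and String.length() is assumed to be a good enough measure of visible characters (it is for plain ASCII source without tabs).

```java
import java.util.ArrayList;
import java.util.List;

class LineLengthCheck
{
   /** Maximum number of visible characters per line. */
   static final int MAX_LENGTH = 69;

   /**
   @param text source text, with lines ended by "\n" or "\r\n"
   @return the 1-based numbers of the lines that are too long
   */
   static List<Integer> longLines(String text)
   {
      List<Integer> result = new ArrayList<Integer>();
      // The line terminators are consumed by the split and
      // so are not counted towards the length.
      String[] lines = text.split("\r?\n", -1);
      for (int i = 0; i < lines.length; ++i)
      {
         if (lines[i].length() > MAX_LENGTH)
            result.add(i + 1);
      }
      return result;
   }
}
```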

Indent Unit

The unit of indentation is 3 spaces (do not use tabs). Less than this causes the indentation levels to come close to a continuum, making it a little too hard to judge what code is at the same level when it encompasses longer stretches of more deeply-nested blocks. More is unnecessary and the wasted space quickly accumulates with each level of nesting, also forcing the eye to traverse larger spaces.

Intra-statement Indentation

Indentation invariably serves two main functions:

(it is because the first of these is the more important that an "else" appears at the same level of indentation as the corresponding "if").

Additional and varying levels of indentation for other purposes only distract the eye and thus weaken its usefulness for these, the most important functions. From this observation follows our rule:

No progressive indentation within a statement.

bad
               firstResult =
                  alpha.conjoin(gamma).
                    disjoin(theta.conjoin(
                            omega.disjoin(kappa)));
               DedicatedQTValues dedVals = new
                  DedicatedQTValues((Variable)
                                    vars.get(k),
                                    (BitField)
                                    col.get(k));

OK
               firstResult = alpha.conjoin(gamma).
                  disjoin(theta.conjoin(
                  omega.disjoin(kappa)));
               DedicatedQTValues dedVals = new
                  DedicatedQTValues(
                  (Variable) vars.get(k),
                  (BitField) col.get(k));

Not progressively indenting saves useful space and unnecessary additional line breaks, especially if the statement is already deeply indented. Also, it is very difficult to define conventions which can be consistently followed. For example, it is often impractical to place all the arguments of a method beyond the corresponding opening parenthesis, so that one may agonize over the best way to arrange things, iteratively changing indentation and the positions of line breaks; choosing the line breaks alone is enough to spend time thinking about.

Further, the benefits to be derived from progressive intra-statement indentation are relatively limited: all the code involved is together on a few successive lines so the eye does not need to be aided in navigation; and, unlike when scanning larger sections, one must at the level of the individual statement read the code in sequence anyway. These reflections (and the experience of the author) indicate that the improvement in overall readability comes at little cost to that of individual statements. In cases where this seems not to be so, it may be that the statement would be difficult to format alternatively anyway (see above) so that it would be better broken down into two or more statements regardless of this convention.

Brace Style

The main bone of contention among proponents of C-like languages concerns the position of the opening braces of code blocks. In the Allman style, the "{" gets a line of its own, and lines up with the "}", whereas according to the K&R Style, it goes at the end of the preceding line, thus saving a line and allowing more code to be seen on screen at once (a legitimate concern, though the K&R style dates back to the days of primitive terminals rather than modern displays).

A problem with the K&R approach is that, if the part of the statement or declaration coming before the "{" has a line break, you can no longer immediately tell by the indentation alone (that is, without reading the code) where the contents of the block begin. Typically, you must look first to the right to find the brace and then go one line down and to the left. Of course, this is not difficult and is what one does anyway if reading sequentially through the code.

However, the point is that most of the time code is not being read sequentially. Instead, a programmer's eye frequently jumps from one part of the code to another, examining the flow of control, finding the locations where a given variable is used, and so on. When indentation is no longer sufficient for the eye to find the right spot, this activity is slowed (in the same way that reading ordinary text with excessively long lines is slower than reading text in a narrower column, though one might very well not be consciously aware of it).

bad
   public static void execute(String command)
      throws SyntacticException, SemanticException {
      commandHandler.reinitialise();
      if (command == null ||
         command.trim().length() == 0) {
         System.out.println();
         throw new SyntacticException(
            "Empty command",
            commandHandler.getIdentifier());
      }
      // more code...
   }

To overcome this, K&R advocates typically use extra indentation for continuation lines. If you've been reading carefully so far, you might already guess that this refinement is not going to find support here. With this solution, it is no longer sufficient to consider whether a line is indented or not; one must also consider how much it is indented (or else look some lines further up to see the relative indentation of the first line of the statement in question). Again, this is not at all difficult, in the sense that, say, adding two one-digit numbers is not difficult. However, while trying to understand code, it does demand a little more conscious attention than merely scanning the aligned braces down the left-hand side of the code.

An irony is that indenting continuation lines by more than one unit makes it more likely that additional line breaks will be needed. Thus, one may end up using as many lines as the Allman style would use, without any of the benefit deriving from the "{" being in a prominent position and an almost empty line to help distinguish the code in the block from that preceding it.

bad
   public static void execute(String command)
         throws SyntacticException,
         SemanticException {
      commandHandler.reinitialise();
      if (command == null ||
            command.trim().length() == 0) {
         System.out.println();
         throw new SyntacticException(
            "Empty command",
            commandHandler.getIdentifier());
      }
      // more code...
   }

OK
   public static void execute(String command)
      throws SyntacticException, SemanticException
   {
      commandHandler.reinitialise();
      if (command == null ||
         command.trim().length() == 0)
      {
         System.out.println();
         throw new SyntacticException(
            "Empty command",
            commandHandler.getIdentifier());
      }
      // more code...
   }

A variant of the Allman style is the GNU approach of indenting the braces as well. From the point of view of logical elegance and consistency, this convention has much to recommend it. However, it actually aids readability only where local blocks may be defined as isolated statements, that is, blocks which are not part of a loop or other flow of control (perhaps in order to limit the scope of variables declared within them?) - or, in practice, effectively never. It also means using more indentation, leaving less space for code. This is partly addressed by reducing the indent unit to two spaces. However, the width left for code in a block is just as if four spaces were used in the Allman or K&R styles while there are more levels of indentation with fewer columns separating them.

The examples below show the indentation to be followed for the various flow-of-control statements.

OK
      if (condition)
      {
         code
      }
      else if (condition)
      {
         code
      }
      else
      {
         code
      }

OK
      switch (x)
      {
      case A:
         code
      case B:
         code
      default:
         code
      }

(On blank lines and label indentation, see below.)

Where the syntax and semantics of the language allow, the braces in the above templates may be omitted - but only provided every part of the statement fits on one line.

OK
      for (int i = 0; i < j; buf[i++] = 0)
         ;

OK
      if (initialized)
         return buffer[bufIndex];
      else
         return null;

bad
      if (initialized && (bufIndex > 0) &&
         (bufIndex < buffer.length))
         return buffer[bufIndex];
      else
         return null;

bad
      if (initialized)
         return buffer[bufIndex];
      else
         for (int i = 0; i < buffer.length; ++i)
            buffer[i] = null;

bad
      if (initialized)
         return buffer[bufIndex];
      else
      {
         for (int i = 0; i < buffer.length; ++i)
            buffer[i] = null;
      }

OK
      if (initialized)
      {
         return buffer[bufIndex];
      }
      else
      {
         for (int i = 0; i < buffer.length; ++i)
            buffer[i] = null;
      }

It is permissible to put a declaration including a code block, or a statement containing such a declaration, on one line (provided this violates no other rule, in particular, provided the code block does not contain two or more statements).
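
For example, a simple accessor or a small anonymous class may be written on one line. The sketch below assumes a hypothetical class Counter, whose name and methods are invented for illustration.

```java
class Counter
{
   private int count = 0;

   /** @return the number of times the task has run */
   int getCount() { return count; }

   /** @return a task which adds one to the count each time */
   Runnable incrementTask()
   {
      return new Runnable() { public void run() { ++count; } };
   }
}
```

Each one-line block holds a single statement; a block with two or more statements would need the full layout, since every statement in such a block goes on its own line.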

The labels of a switch statement align with the "switch" (as seen above); any other label is indented the same amount as the loop or block which it labels (see the next example below).

A statement containing a declaration with a code block is indented first as a statement.

OK
      SteppingTransform mySteps =
         new SteppingTransform()
         {
            public void fillSteps(int steps[][],
               int x, int y)
            {
               eachRow:
               for (int i = 0; i < x; ++i)
               {
                  for (int j = 0; j < y; ++j)
                  {
                     if (steps[i][j] < 0)
                        break eachRow;
                     steps[i][j] = i + j;
                  }
               }
               contentsChanged();
            }
         };
      allSteps.add(mySteps);

Spaces

A space never follows another space except in indentation (and strings or comments).

No space follows an opening bracket, "(" or "[", or precedes a closing bracket, ")" or "]".

Each brace, "{" or "}", whether delimiting a code block or the elements of an array, is preceded by a space; a space also follows an opening brace "{".

Commas "," and semicolons ";" are always followed by a space; except to observe this rule, no space precedes them. Colons ":" are not preceded by a space.

Assignments ("=", "%=", ...), tests for equality/inequality ("==", ">", ...), binary operators ("+", "&&", ...), and the "?" and ":" of the conditional assignment operation are all preceded and followed by a space.

Unary operators ("++", "!", ...) are not separated from their operands.

No space precedes the opening bracket of an array index "[", or the opening bracket, "(", of a method argument list, either in declaration or invocation.

No space appears either before or after the access operator ".".
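
The following sketch, using the hypothetical names SpacingSample, SAMPLE and largest, applies the spacing rules above to the braces of an array initializer, commas and semicolons, assignment, comparison and conditional operators, a unary operator, array indices and the access operator.

```java
class SpacingSample
{
   /** Example values, used only for demonstration. */
   static final int[] SAMPLE = { 3, 9, 4 };

   /** @return the largest of the values, or fallback if none */
   static int largest(int[] values, int fallback)
   {
      int best = (values.length == 0) ? fallback : values[0];
      for (int i = 1; i < values.length; ++i)
      {
         if (values[i] > best)
            best = values[i];
      }
      return best;
   }
}
```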

Line Breaks

A label always appears on a line on its own. Every statement in a block of two or more statements separated by ";"s appears on its own line (note this does not include the expressions in a "for (a; b; c)" statement).

A line break (followed by appropriate indentation) can replace any space.

A line break may also be inserted after a "[" or the access operator ".". It is also possible to break a line after a "(" but only provided no space precedes it. (Note, for example, that the "(" enclosing the type of a cast operation is what gives it its meaning and therefore attaches to the type rather than anything preceding it. However, the "(" enclosing the arguments of a method call helps to identify the method as a method and therefore should appear with it. It is for the same reason that in ordinary printed text the hyphen "-" within words broken across lines always appears at the end of one line, never the beginning of the next: this means there is no transient confusion about what one is looking at.)
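
A sketch of the permitted break positions follows; LineBreakSample and describe are hypothetical names, and the lines are broken earlier than the length limit requires purely to show the positions. The first continuation breaks after the access operator, the second after the "(" of a method call, and the third at the space before the "(" of a cast, so that the bracket stays with the type.

```java
class LineBreakSample
{
   /** @return a short description of the given range */
   static String describe(int first, int last)
   {
      StringBuilder text = new StringBuilder();
      text.append("from ").append(first).
         append(" to ").append(last);
      Object boxed = String.format(
         "[%s]", text);
      String result =
         (String) boxed;
      return result;
   }
}
```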

Blank Lines

When a label in a switch follows a "break" statement, the two are separated by a blank line. If the preceding statement is not a break, there is no blank line.

OK
      switch (nextChar)
      {
      case 's':
         echo = false;
         break;

      case 'l':
         logging = true;
      default:
         if (echo)
            System.out.println(nextChar);
      }

Blank lines separate each of the main sections in a file, including prologue comment (with copyright or version information, etc.), package declaration, imports and interface or class. Within a class, at least all constructors, methods and inner classes should be separated from each other and from any other fields by blank lines (always before the prologue comment, of course).

A blank line may optionally be inserted anywhere to improve readability but the use of two or more contiguous blank lines is unnecessary and should be avoided.

Imports

Import statements should be arranged alphabetically in a contiguous block, uninterrupted either by blank lines or comments. General imports (those ending in ".*") and static imports ("import static ...") should not be used.
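
A sketch of an import block arranged this way, in a hypothetical class SortedNames that needs three types from java.util. Each type is named explicitly and the block is alphabetical and unbroken; a blank line then separates it from the class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedNames
{
   /** @return a new list holding the given names in order */
   static List<String> sorted(List<String> names)
   {
      List<String> copy = new ArrayList<String>(names);
      Collections.sort(copy);
      return copy;
   }
}
```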

Miscellaneous

Where it does not matter whether the post- or pre- version of an increment or decrement operator is used, use the pre- version. (The point here is not so much that "increment x" reads more naturally than "x increment" - rather, the increment operator is visually much more prominent at the beginning of a long variable name, improving readability.)
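
For instance, in the sketch below (IncrementSample and its variable names are hypothetical) the value of each increment expression is never used, so either form would do and the pre- version is chosen.

```java
class IncrementSample
{
   /** @return the number of negative entries in the array */
   static int countNegatives(int[] measurements)
   {
      int negativeMeasurementCount = 0;
      for (int i = 0; i < measurements.length; ++i)
      {
         if (measurements[i] < 0)
            ++negativeMeasurementCount;
      }
      return negativeMeasurementCount;
   }
}
```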