Updated 2015-02-08 13:06:46 by dkf

GPS Oct 14, 2003 - Factoring code is something that people all too often seem reluctant to do.

Here are some reasons why you should factor your code:

  • fewer bugs
  • the code is more flexible
  • information is divided into more usable sections
  • you won't generally need to use things like goto
  • you may be able to reuse more parts and reduce code size

Let us address the topic of fewer bugs first. The human brain can only handle so much information at once. If we see information divided into sections, we can more easily manage and understand it. Potential problems are also much easier to see, because each piece we look at is less complex. As Charles Moore once said: "Write code so simple and clear that bugs simply can't happen."

The code becomes more flexible because we can modify one module without having to touch the others. We are also insulated from what other routines do internally (unless we use many global variables).

By having information divided into sections we can more easily understand it. This relates to the flexibility of the code addressed previously.

Goto isn't needed very often, because each module/function handles what is needed and returns a result. For example, if function F calls A, B, and C to initialize a command, and any of A, B, or C may fail, we can encapsulate those calls in a new function called InitABC(). This function merely returns TCL_OK or TCL_ERROR. Now we have reduced the complexity involved in F: it only needs to worry about the result of InitABC, not the results of A, B, and C.
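The InitABC() idea can be sketched in C roughly as follows. This is only an illustration: the bodies of A, B, and C and their "should this step succeed" flags are made up so that failures can be simulated, and TCL_OK / TCL_ERROR are defined locally with the values tcl.h uses so the sketch compiles without the Tcl headers.

```c
/* Stand-ins for the status codes from tcl.h (assumption: no Tcl headers here). */
#define TCL_OK    0
#define TCL_ERROR 1

/* Hypothetical initialization steps. Each takes a flag that says whether
   it should succeed, purely so that a failure can be simulated. */
int A(int ok) { return ok ? TCL_OK : TCL_ERROR; }
int B(int ok) { return ok ? TCL_OK : TCL_ERROR; }
int C(int ok) { return ok ? TCL_OK : TCL_ERROR; }

/* Runs the three steps in order and stops at the first failure.
   Early returns do the job a goto-to-cleanup would do inside F. */
int InitABC(int a_ok, int b_ok, int c_ok) {
    if (A(a_ok) != TCL_OK) return TCL_ERROR;
    if (B(b_ok) != TCL_OK) return TCL_ERROR;
    if (C(c_ok) != TCL_OK) return TCL_ERROR;
    return TCL_OK;
}

/* F only has to look at a single result. */
int F(int a_ok, int b_ok, int c_ok) {
    if (InitABC(a_ok, b_ok, c_ok) != TCL_OK) {
        return TCL_ERROR;
    }
    /* ... the real work of the command would go here ... */
    return TCL_OK;
}
```

With this shape, F(1, 1, 1) reports TCL_OK, while a failure in any one step, such as F(1, 0, 1), reports TCL_ERROR without F knowing which step failed.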

Code reuse is an often-cited reason for factoring. Its advantage is that we can modify a single area of a program and thereby change the behavior of every function that calls it.

What is a good size for a function or proc? Many experts agree that 3-10 lines is a good size for languages like Tcl and C. As RS says "... a good proc fits between thumb and middle finger... ... even O(N**N) may be ok, for sufficiently small N ..." Charles Moore and countless other respected engineers believe in this principle as well. Some complex switch statements may be an exception, but when you think about it, the arms of a switch statement are like mini-procedures.

Now that you know why factoring is important, here are some tools that may help:

  • a dictionary
  • think about what the block of code is doing
  • name functions based on what they perform if possible
  • structures that allow sharing information between procedures more easily than proc arguments
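One possible reading of the last item, sketched in C: instead of threading several separate arguments through every small function, the shared state lives in one structure that is passed around by pointer. The struct Counts type and both function names are invented for this illustration.

```c
/* Hypothetical shared state for a tiny line/word counter. */
struct Counts {
    int lines;    /* newlines seen so far */
    int words;    /* words seen so far */
    int in_word;  /* nonzero while we are inside a word */
};

/* Updates the shared counts for a single character. */
void CountChar(struct Counts *c, char ch) {
    if (ch == '\n') {
        c->lines++;
    }
    if (ch == ' ' || ch == '\t' || ch == '\n') {
        c->in_word = 0;
    } else if (!c->in_word) {
        c->in_word = 1;
        c->words++;
    }
}

/* Feeds a whole string through CountChar; the state travels in one argument. */
void CountText(struct Counts *c, const char *s) {
    for (; *s != '\0'; s++) {
        CountChar(c, *s);
    }
}
```

Starting from a zeroed struct Counts, each small function stays short because it does not need its own lines, words, and in_word parameters.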

In closing, would you prefer a 300 page technical book written in 100 paragraphs, or the same information written with at least 2 paragraphs per page? How would you feel about a book that was labelled only every 5 pages? What if it was every 10 or 15? How would this affect your search for information that you only vaguely remember?

3-10 seems a little too small sometimes. A maximum of 50-60 lines, or whatever fits on a page in your editor, is probably good for a language like C. In my own C code I've given up on the idea of keeping everything extremely short (like 10 lines). Now I target a maximum of 50-60 lines (which fits on a single page for me).

In a recent project I had a rather nasty large switch for event handling. At first I was apprehensive about restructuring it into more functions and using a table of function pointers. Overall, however, the code is now easier to understand. I noticed 2 bugs after the restructuring and fixed them more easily than I could have before. It has also been easier to add more event handlers.
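The code from that project is not shown here, so the following is only a minimal sketch of the general technique in C: a table of function pointers indexed by event type takes the place of a large switch. The event types, handler names, and return codes are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical event types; EVENT_COUNT is the number of real types. */
enum EventType { EVENT_KEY, EVENT_MOUSE, EVENT_RESIZE, EVENT_COUNT };

struct Event {
    enum EventType type;
    int data;  /* payload, unused in this sketch */
};

typedef int (*EventHandler)(const struct Event *ev);

/* Each handler returns a distinct code so the dispatch is observable. */
int HandleKey(const struct Event *ev)    { (void)ev; return 1; }
int HandleMouse(const struct Event *ev)  { (void)ev; return 2; }
int HandleResize(const struct Event *ev) { (void)ev; return 3; }

/* One slot per event type, indexed by the enum value. */
static const EventHandler handlers[EVENT_COUNT] = {
    [EVENT_KEY]    = HandleKey,
    [EVENT_MOUSE]  = HandleMouse,
    [EVENT_RESIZE] = HandleResize,
};

/* Replaces the switch: look up the handler and call it.
   Returns -1 for an out-of-range type or an empty slot. */
int Dispatch(const struct Event *ev) {
    if ((unsigned)ev->type >= EVENT_COUNT || handlers[ev->type] == NULL) {
        return -1;
    }
    return handlers[ev->type](ev);
}
```

Adding an event handler then means writing one small function and adding one line to the table, and Dispatch itself does not change.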

See also