For the sake of argument, let’s consider the smallest useful unit of software to be the humble function. Now let’s consider how adding a function to your program can ruin it:
1. The function changes non-local state when evaluated. Thus, the behaviour of all software artefacts that depend on that state cannot be predicted.
2. The function may return values that are not deterministic according to its inputs. Thus, the state of the program after executing the function cannot be predicted.
3. The function causes the program to enter an anomalous control flow. Thus, the function introduces the risk that the program state will become invalid.
4. The function can return a bottom value (i.e. null). Thus, the function introduces the risk that the program’s state will be only partially represented.
Of course, no one would ever intentionally write such a function. Yet it is not uncommon to find methods that change their object/receiver’s state (1.) in response to a value read from a non-local source (2.), throwing an exception if the read fails (3.) and returning a null value if the operation could not be completed (4.). Most functions don’t exhibit all four of the above problems, but many exhibit at least one.
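To make the four problems concrete, here is a minimal sketch of such a method in Python. The `Basket` class, the `apply_discount` method and the `DISCOUNT_PERCENT` environment variable are all hypothetical names invented for this illustration; the comments map each line to the numbered points above.

```python
import os
from typing import Optional


class Basket:
    """Hypothetical shopping basket, used only to illustrate the four problems."""

    def __init__(self, subtotal: float) -> None:
        self.subtotal = subtotal
        self.total = subtotal

    def apply_discount(self) -> Optional[float]:
        # (2.) The result depends on a value read from a non-local source.
        raw = os.environ.get("DISCOUNT_PERCENT")
        if raw is None:
            # (3.) A failed read leaves local control flow via an exception.
            raise RuntimeError("DISCOUNT_PERCENT is not set")
        percent = float(raw)  # malformed input raises ValueError here as well
        if not 0 <= percent <= 100:
            # (4.) "Could not complete" is signalled with a bottom value.
            return None
        # (1.) The receiver's state is changed as a side effect.
        self.total = self.subtotal * (1 - percent / 100)
        return self.total
```

Nothing in the signature `apply_discount(self) -> Optional[float]` tells a caller about the environment read, the mutation or the exceptions, so each call site has to discover and handle them independently.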
Why do we write such code? Partly because many imperative languages encourage the creation of such functions. However, we also believe we can manage the complexity these functions introduce: when the function is created, when the surrounding artefacts change, when the function’s implementation is modified, and when components of the system are replaced. We write unit, integration and system tests to check that our complexity management efforts are succeeding, and then expend further effort managing those tests. Alternatively, the potential defects described above can be eliminated at the source, using two concepts central to functional programming: purity and totality.
A function that does not exhibit points 1. and 2. is a pure function: its output is deterministic according to its inputs and its evaluation causes no side-effects. A function that does not exhibit points 3. and 4. is a total function: it returns a defined output (a non-bottom, or ‘non-null’, value) for every input. The value of functions that are both pure and total (PT hereafter) cannot be overstated. A PT function always produces a valid output, maintains local control flow, and does not modify program state. If a PT function contains a defect, the defect can only reside in the function’s body, as the function does not rely on non-local state. For the same reason, adding PT functions to a program does not cause a greater-than-linear increase in the overall maintenance burden. Due to these benefits, a number of functional languages enforce purity and verify totality at the function level.
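As a rough illustration of what a PT function can look like, the sketch below computes a discounted total in Python. All names (`apply_discount`, `Discounted`, `InvalidPercent`) are hypothetical, and the function is assumed to receive ordinary float arguments. Python does not check purity or totality, so the properties hold here by convention only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Discounted:
    """The discount was applied; `total` is the new price."""
    total: float


@dataclass(frozen=True)
class InvalidPercent:
    """The requested percentage was outside the range 0 to 100."""
    percent: float


DiscountResult = Union[Discounted, InvalidPercent]


def apply_discount(subtotal: float, percent: float) -> DiscountResult:
    # Pure: the result depends only on the arguments, and nothing is mutated.
    # Total: every float input maps to a defined value; the failure case is
    # an ordinary return value rather than an exception or None.
    if 0 <= percent <= 100:
        return Discounted(subtotal * (1 - percent / 100))
    return InvalidPercent(percent)
```

For example, `apply_discount(200.0, 25.0)` evaluates to `Discounted(total=150.0)` every time it is called, and `apply_discount(200.0, 150.0)` evaluates to `InvalidPercent(percent=150.0)`. The caller sees both outcomes in the return type instead of discovering them at runtime.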
Of course, it is possible to write functions in imperative code that exhibit purity and totality. However, imperative code cannot guarantee function purity and totality, and this is an extremely important distinction. If a language does not enforce purity and check totality at the function level, then impurity and partiality can creep into functions despite best intentions: the function may depend on library code that is impure and partial, or the code may accidentally lose purity and totality during maintenance. Depending on developers to manage the complexity introduced by impure and partial functions is as flawed as assuming that testing is unnecessary because developers don’t intentionally create defects. We don’t indulge the latter foolishness, yet we promote the use of imperative language features and practices that make it more difficult to create correct code. The value of the functional paradigm is not its headline features such as folds, transducers and monads; it is the guarantees that can be made about program behaviour, at the function level, by enforcing purity and verifying totality.
Header image courtesy of josh james.
Thanks to Rhys Adams and Ewouldt Kellerman for proofreading and providing suggestions.