Two types of ADT

Algebraic Data Types and Abstract Data Types share the same acronym but are very different concepts. Both are clever ideas but with opposite aims. In this post I will use F# to show abstract data types hinting at old and fascinating techniques for their algebraic specification.

Algebraic Data Types

Algebraic Data Types are the workhorses of statically typed functional languages like F#. They are also known as sum types or discriminated unions, so I will refer to them as DUs here, reserving the ADT acronym for abstract data types which is the main subject of the post.

There's no need to introduce DUs because I assume they are well known since they're often praised as a great tool for domain modeling and making illegal states unrepresentable. In this post we will see that they are not completely unrelated to abstract data types: in fact DUs are useful as internal representation of ADT values and they also play a role in the algebraic specification of ADTs (as the algebraic adjective may suggest).

Abstract Data Types

While DUs are concrete data structures, abstraction is (unsurprisingly) the defining trait of abstract data types. ADTs are the ultimate tool for encapsulation and information hiding: their behavior can be described only implicitly, relating values obtained applying the ADT operations without referring to their internal representation.

Objects and ADTs

There are important differences between objects and ADTs but even Bertrand Meyer advocates conflating the two concepts and I sympathize a lot with this view because for me the similarities outweigh the differences.

It's a fact of (computing) history that objects won over ADTs and even a functional first language like F# embraces objects. That's why I'm using F# objects to show ADTs (modules are used in the ML tradition). This is also an occasion to show that immutable objects are very nice and close to the mathematical notion of abstract data type.

The obligatory stack example

The standard pedagogical example of ADT is the stack, whose signature features the operations New, Push, Pop and Top:

1: 
2: 
3: 
4: 
5: 
    type IStack<'a> =
        // abstract static New: unit -> IStack<'a>
        abstract Push: 'a -> IStack<'a>
        abstract Pop: unit -> IStack<'a>
        abstract Top: 'a Option

Unfortunately neither constructors nor static members are allowed in .NET interfaces (Java 8 allows static methods) so we cannot express the New operation in the above signature. Let's then move to a proper class, although with a dummy implementation:

1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
    type private DummyRepresentation = TODO

    type Stack<'a> private(repr) =

        static member New() = Stack(TODO)
        member __.Push(x: 'a) = Stack(TODO)
        member __.Pop() = Stack(TODO)
        member __.Top: 'a Option = None

The class constructor is private and takes a private representation object. For now all the ADT constructors (New, Push and Pop) necessarily use the same dummy representation value (TODO).

A side note on terminology: in ADT parlance a constructor is any operation returning a value of the ADT type. That's why in our stack all operations (except Top) qualify as constructors.

The above implementation is clearly wrong but, to claim that, we first need to specify what is the correct behavior and we can do this with axioms stating the properties of the ADT operations.

We can use FsCheck to verify the correctness of our implementation (hence we won't prove it but at least test it thoroughly with many random values).

Stack Axioms

Here are the stack axioms (we use int for the generic parameter but any type will do):

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
    let axiom1 () =
        Stack<int>.New().Pop() = Stack<int>.New()

    let axiom2 () =
        Stack<int>.New().Top = None

    let axiom3 (s: Stack<int>) x =
        s.Push(x).Pop() = s

    let axiom4 (s: Stack<int>) x =
        s.Push(x).Top = Some x


    open FsCheck

    Check.Quick axiom1
    Check.Quick axiom2
    Check.Quick axiom3
    Check.Quick axiom4

The meaning of the axioms is fairly readable:

  1. popping an empty stack should leave it empty
  2. there's no top of an empty stack
  3. pop undo the effect of a push
  4. pushing an item make it the top

Different API's are possible, but this one has the advantage of defining total functions: Top is defined (albeit with value None) also on the empty stack, and popping an empty stack is possible although useless because it will leave it empty.

Equality

Before running FsCheck we need to fix a couple of things: the first one is that since axioms are expressed as equations comparing ADT values, to compare stacks we have to override Equals so that stack values with the same internal representation (whatever it is) are equivalent:

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
    type private DummyRepresentation = TODO

    type Stack<'a> private(repr) =

        static member New() = Stack(TODO)
        member __.Push(x: 'a) = Stack(TODO)
        member __.Pop() = Stack(TODO)
        member __.Top: 'a Option = None

        member private __.repr = repr
        override __.Equals(obj) =
            match obj with
            | :? Stack<'a> as x -> x.repr.Equals(repr)
            | _ -> false
        override __.GetHashCode() = repr.GetHashCode()

Term Algebra

Now we are equipped with stack equality but the remaining issue is that FsCheck can't generate instances of the Stack type (because it is not a concrete data type). The trick is to define the so called term algebra as a DU:

1: 
2: 
3: 
4: 
    type StackTerm<'a> =
        | New
        | Push of 'a * StackTerm<'a>
        | Pop of StackTerm<'a>

Notice how data constructors in StackTerm correspond to ADT constructors in the Stack API and how straightforward is to create stack instances from terms:

1: 
2: 
3: 
4: 
5: 
    let rec stack term =
        match term with
        | New -> Stack.New()
        | Push(x, s) -> (stack s).Push(x)
        | Pop(s) -> (stack s).Pop()

The stack function highlights the correspondence between the term algebra and the stack API. Terms are just stack expressions, for example term1 corresponds to stack1:

1: 
2: 
    let term1 = Push(5, Pop(Push(42, New)))
    let stack1 = Stack.New().Push(42).Pop().Push(5)

and applying the stack function to term1 we expect to get a stack instance equivalent to stack1:

1: 
    stack term1 = stack1

Random terms can now be generated by FsCheck, and we just need to adapt our properties a little:

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
    let axiom0 (s: StackTerm<int>) =
        stack s = stack s

    let axiom1 () =
        Stack<int>.New().Pop() = Stack<int>.New()

    let axiom2 () =
        Stack<int>.New().Top = None

    let axiom3 (s: StackTerm<int>) x =
        let s' = stack s
        s'.Push(x).Pop() = s'

    let axiom4 (s: StackTerm<int>) x =
        let s' = stack s
        s'.Push(x).Top = Some x

We added also a new axiom for equality (axiom0). This is not specific for stacks, in every ADT we expect two values to be equivalent if they are constructed with the same sequence of operations.

We can finally run FsCheck and see our tests fail. Now it's time to get back to the implementation and make it compliant with the axioms.

Initial Algebra

There are of course multiple implementation options, but first we will pursue one of theoretical interest only. We use terms also for the internal representation:

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
26: 
27: 
    type Stack<'a when 'a : equality> private(repr: StackTerm<'a>) =

        let rec reduce term =
            match term with
            | New -> New
            | Push(x, s) -> Push(x, reduce s)
            | Pop(New) -> New
            | Pop(Push(_, s)) -> reduce s
            | Pop(s) -> Pop(reduce s)

        static member New() = New |> Stack

        member __.Push(x: 'a) = Push(x, repr) |> Stack

        member __.Pop() = Pop(repr) |> Stack

        member __.Top: 'a Option =
            match reduce repr with
            | Push(x, _) -> Some x
            | _ -> None

        member private __.repr = repr
        override __.Equals(obj) =
            match obj with
            | :? Stack<'a> as x -> (reduce x.repr).Equals(reduce repr)
            | _ -> false
        override __.GetHashCode() = (reduce repr).GetHashCode()

Clearly this implementation is not good from a performance point of view: the representation object always keeps growing, even when Pop is called, and a lot of processing is needed (the reduce function) for Top, Equals and GetHashCode. But it works, and it's derived almost mechanically from the axioms.

In fact there are specification languages in which this kind of implementation can often be automatically derived from axioms.

Axioms determine an equivalence relation on terms (two terms are equivalent if the axioms allow to reduce them to the same term) and the ADT values are conceptually the equivalence classes of this relation. This theoretical model is a sort of reference implementation (quotient term algebra) and this approach (initial algebra) has strong connections with category theory (so it has to be fancy!). There is also the dual approach of final algebra, where two terms are considered equivalent unless the axioms forbid it. There's a lot of beautiful theory behind abstract data types.

Canonical Constructors

The key insight to improve the implementation is nicely explained here. The idea is to identify canonical constructors, the ones intuitively sufficient to construct all values. In our example New and Push are enough because every stack involving Pop can be expressed in a simpler form using only New and Push. For example:

1: 
    Stack<int>.New().Push(42).Pop().Push(5) = Stack<int>.New().Push(5)

This approach provides a precise guideline to specify axioms: one axiom is needed for each combination of a non-canonical operation applied to the result of a canonical operation. Our stack axioms happen to follow this pattern:

  1. Pop(New) = ...
  2. Top(New) = ...
  3. Pop(Push(x, s)) = ...
  4. Top(Push(x, s)) = ...

The first two axioms specify the behavior of the non-canonical operations (Pop and Top) when applied to the empty stack; the other two axioms specify the behavior of the non-canonical operations when applied to a stack obtained with a push operation. There are no more cases to cover.

Since only New and Push are needed to build stacks, we can simplify our representation type:

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
    type private Representation<'a> = New | Push of 'a * Representation<'a>

    type Stack<'a when 'a : equality> private(repr: Representation<'a>) =

        static member New() = New |> Stack

        member __.Push(x: 'a) = Push(x, repr) |> Stack

        member __.Pop() =
            match repr with
            | New -> New
            | Push(_, s) -> s
            |> Stack

        member __.Top: 'a Option =
            match repr with
            | New -> None
            | Push(x, _) -> Some x

        member private __.repr = repr
        override __.Equals(obj) =
            match obj with
            | :? Stack<'a> as x -> x.repr.Equals(repr)
            | _ -> false
        override __.GetHashCode() = repr.GetHashCode()

Also this implementation just mirrors the axioms, but this is a realistic one. If you squint a bit you notice that our final representation type is isomorphic to the good old F# (linked) list, which is in fact a reasonable internal representation for a stack.

Conclusion

The purpose of this post is to present a few old but interesting concepts about data abstraction. I'm not claiming that specifying and implementing an ADT is always as nice and simple as in the stack example, nor I'm advocating formal methods; but I think these ideas may still provide a useful intellectual guideline.

type IStack<'a> = abstract member Pop : unit -> IStack<'a> abstract member Push : 'a -> IStack<'a> abstract member Top : Option<'a>
type unit = Unit
<summary>The type 'unit', which has only one value "()". This value is special and always uses the representation 'null'.</summary>
<category index="1">Basic Types</category>
module Option from Microsoft.FSharp.Core
<summary>Contains operations for working with options.</summary>
<category>Options</category>
type private DummyRepresentation = | TODO
union case DummyRepresentation.TODO: DummyRepresentation
Multiple items
type Stack<'a> = private new : repr:DummyRepresentation -> Stack<'a> member Pop : unit -> Stack<'a> member Push : x:'a -> Stack<'a> static member New : unit -> Stack<'a> member Top : Option<'a>

--------------------
private new : repr:DummyRepresentation -> Stack<'a>
val repr : DummyRepresentation
val x : 'a
val __ : Stack<'a>
union case Option.None: Option<'T>
<summary>The representation of "No value"</summary>
val axiom1 : unit -> bool
type Stack<'a> = private new : repr:DummyRepresentation -> Stack<'a> member Pop : unit -> Stack<'a> member Push : x:'a -> Stack<'a> static member New : unit -> Stack<'a> member Top : Option<'a>
Multiple items
val int : value:'T -> int (requires member op_Explicit)
<summary>Converts the argument to signed 32-bit integer. This is a direct conversion for all primitive numeric types. For strings, the input is converted using <c>Int32.Parse()</c> with InvariantCulture settings. Otherwise the operation requires an appropriate static conversion method on the input type.</summary>
<param name="value">The input value.</param>
<returns>The converted int</returns>


--------------------
[<Struct>] type int = int32
<summary>An abbreviation for the CLI type <see cref="T:System.Int32" />.</summary>
<category>Basic Types</category>


--------------------
type int<'Measure> = int
<summary>The type of 32-bit signed integer numbers, annotated with a unit of measure. The unit of measure is erased in compiled code and when values of this type are analyzed using reflection. The type is representationally equivalent to <see cref="T:System.Int32" />.</summary>
<category>Basic Types with Units of Measure</category>
val axiom2 : unit -> bool
val axiom3 : s:Stack<int> -> x:int -> bool
val s : Stack<int>
val x : int
member Stack.Push : x:'a -> Stack<'a>
val axiom4 : s:Stack<int> -> x:int -> bool
union case Option.Some: Value: 'T -> Option<'T>
<summary>The representation of "Value of type 'T"</summary>
<param name="Value">The input value.</param>
<returns>An option representing the value.</returns>
namespace FsCheck
type Check = static member All : config:Config * test:Type -> unit + 1 overload static member Method : config:Config * methodInfo:MethodInfo * ?target:obj -> unit static member One : config:Config * property:'Testable -> unit + 1 overload static member Quick : property:'Testable -> unit + 1 overload static member QuickAll : test:Type -> unit + 1 overload static member QuickThrowOnFailure : property:'Testable -> unit static member QuickThrowOnFailureAll : test:Type -> unit + 1 overload static member Verbose : property:'Testable -> unit + 1 overload static member VerboseAll : test:Type -> unit + 1 overload static member VerboseThrowOnFailure : property:'Testable -> unit ...
static member Check.Quick : property:'Testable -> unit
static member Check.Quick : name:string * property:'Testable -> unit
Multiple items
type Stack<'a> = private new : repr:DummyRepresentation -> Stack<'a> override Equals : obj:obj -> bool override GetHashCode : unit -> int member Pop : unit -> Stack<'a> member Push : x:'a -> Stack<'a> static member New : unit -> Stack<'a> member Top : Option<'a> member private repr : DummyRepresentation

--------------------
private new : repr:DummyRepresentation -> Stack<'a>
Multiple items
val obj : obj

--------------------
type obj = System.Object
<summary>An abbreviation for the CLI type <see cref="T:System.Object" />.</summary>
<category>Basic Types</category>
val x : Stack<'a>
property Stack.repr: DummyRepresentation with get
System.Object.Equals(obj: obj) : bool
System.Object.GetHashCode() : int
module Module2 from 2018-05-26-The lost art of data abstraction
type StackTerm<'a> = | New | Push of 'a * StackTerm<'a> | Pop of StackTerm<'a>
union case StackTerm.New: StackTerm<'a>
union case StackTerm.Push: 'a * StackTerm<'a> -> StackTerm<'a>
union case StackTerm.Pop: StackTerm<'a> -> StackTerm<'a>
val stack : term:StackTerm<'a> -> Stack<'a>
val term : StackTerm<'a>
type Stack<'a> = private new : repr:DummyRepresentation -> Stack<'a> override Equals : obj:obj -> bool override GetHashCode : unit -> int member Pop : unit -> Stack<'a> member Push : x:'a -> Stack<'a> static member New : unit -> Stack<'a> member Top : Option<'a> member private repr : DummyRepresentation
static member Stack.New : unit -> Stack<'a>
val s : StackTerm<'a>
val term1 : StackTerm<int>
val stack1 : Stack<int>
val axiom0 : s:StackTerm<int> -> bool
val s : StackTerm<int>
val axiom3 : s:StackTerm<int> -> x:int -> bool
val s' : Stack<int>
val axiom4 : s:StackTerm<int> -> x:int -> bool
module Module3 from 2018-05-26-The lost art of data abstraction
Multiple items
type Stack<'a (requires equality)> = private new : repr:StackTerm<'a> -> Stack<'a> override Equals : obj:obj -> bool override GetHashCode : unit -> int member Pop : unit -> Stack<'a> member Push : x:'a -> Stack<'a> static member New : unit -> Stack<'a0> (requires equality) member Top : Option<'a> member private repr : StackTerm<'a>

--------------------
private new : repr:StackTerm<'a> -> Stack<'a>
val repr : StackTerm<'a> (requires equality)
val reduce : (StackTerm<'b> -> StackTerm<'b>)
val term : StackTerm<'b>
val x : 'b
val s : StackTerm<'b>
val x : 'a (requires equality)
val __ : Stack<'a> (requires equality)
val x : Stack<'a> (requires equality)
property Stack.repr: StackTerm<'a> with get
module Module5 from 2018-05-26-The lost art of data abstraction
module Module4 from 2018-05-26-The lost art of data abstraction
type Stack<'a (requires equality)> = private new : repr:StackTerm<'a> -> Stack<'a> override Equals : obj:obj -> bool override GetHashCode : unit -> int member Pop : unit -> Stack<'a> member Push : x:'a -> Stack<'a> static member New : unit -> Stack<'a0> (requires equality) member Top : Option<'a> member private repr : StackTerm<'a>
type private Representation<'a> = | New | Push of 'a * Representation<'a>
union case Representation.New: Representation<'a>
union case Representation.Push: 'a * Representation<'a> -> Representation<'a>
Multiple items
type Stack<'a (requires equality)> = private new : repr:Representation<'a> -> Stack<'a> override Equals : obj:obj -> bool override GetHashCode : unit -> int member Pop : unit -> Stack<'a> member Push : x:'a -> Stack<'a> static member New : unit -> Stack<'a0> (requires equality) member Top : Option<'a> member private repr : Representation<'a>

--------------------
private new : repr:Representation<'a> -> Stack<'a>
val repr : Representation<'a> (requires equality)
val s : Representation<'a> (requires equality)
property Stack.repr: Representation<'a> with get