9.10. Pattern Match Compiler

Mikael Pettersson (Sweden)

9.10.1. Basics

A pattern match expression has the general form:

case
    \ p11 p12 ... := e1
    \ p21 p22 ... := e2
    ...
    \ pn1 pn2 ... := en

We convert the case expression step by step into a case tree.

The first m rows begin with a variable pattern:

(assume 0 < m < n) In that case all rows above m are not reachable. This is an error for the whole pattern match expression, if the not reachable rows do not occur in other branches.

All pattern in the first column are variable pattern:

We have to rename the variable such that the same variable appears in all rows and then collapse the first column into one case.

\ x :=
    case
        \ p12 ... := e1
        \ p22 ... := e2
        ...
        \ pn2 ... := en
The first column contains constructor pattern:

Pattern with the same constructor have to be renamed such that the same variables occur in the corresponding positions. If the first row has variable pattern, then the same variable has to be chosen and this variable has to appear in all constructor pattern. Assume we have

case
    \ []        p12 ... := e1
    \ x         p22 ... := e2
    \ (x :: xs) p32 ... := e3

-- rename
case
    \ (a := [])         p12 ... := e1
    \ a                 p22 ... := e2
    \ (a := x :: xs)    p32 ... := e3

If the constructors are exhaustive, no default case is needed. Otherwise a default case is needed. The rows with a variable pattern belong to all cases.

case
    \ (a := []) :=
        case
            \ p12 ... := e1
            \ p22 ... := e2     -- duplicated
    \ (a := x :: xs) :=
        case
            \ p22 ... := e2     -- duplicated
            \ p32 ... := e3

If the constructors are not exhaustive and the column has no variable pattern, then the whole pattern match is not exhaustive.

9.10.2. ?

Renaming of the variables ensures that each pattern in different rows which matches the same part of the input object has the same variable name. Algebraic data types construct tree like objects. Therefore we can use the Dewey decimal notation to have a canonical naming of the variables.

Canonical naming: xijk…

  • i: constructor (with some arity)

  • j: j-th argument of the constructor

Other namings are possible as long as the same part of the input object gets the same variable. E.g. in lists the constructor nil does not have arguments. The constructor cons has 2 arguments, the head and the tail. Therefore x1 can be the name of the head and x2 the name of the tail, if x ist the root.

9.10.3. Example nodups

nodups: List Int -> List Int := case
    \ (x :: y :: ys)    := e1
    \ xs                := xs


    e1 := if x =? y then res else x :: res
    where
        res := nodups (y :: ys)

Introduce root variables and rename the variables consistently.

nodups:
    all (x: List Int): List Int
:=
    \ (x = x1 :: (x2 = (x21 :: x22)))               := q1

    \ x                                             := q2

Now we have two rows and one pattern column and one result column:

match0: {x = x1 :: (x2 = (x21 :: x22))}     {q1}
        {x}                                 {q2}

The constructor :: matches row1 and row2. It is not exhaustive, therefore we need a default case. The first case distinction is between the :: constructor and otherwise.

q0: inspect x case
        \ (x1 :: x2)    := match1
        \ _             := match2

match1 can be entered if x has been constructed by the :: constructor. The remaining pattern is x2 which is either the constructor pattern x2 = x21 :: x22 pattern or the variable pattern x2.

match1: {x2 = x21 :: x22}       {q1}
        {x2}                    {q2}

match1 is not final i.e. we need a new state q3.

match2 can be entered if x has not been constructed by the :: constructor. There is no remaining pattern.

match2: {}  {q2}                -- reduces to q2

match2 immediately reduces to q2 i.e. it is a final state.

The complete q0 state looks like:

q0: inspect x case
        \ (x1 :: x2)    := q3
        \ _             := q2

Now we have to construct q3 from match1. :: is the only constructor in match1, but it is not exhaustive (i.e. we need a default case.

q3: inspect x2 case
        \ (x21 :: x22) :=   match3
        \ _            :=   match4

match3 comes from match1 where x2 has been constructed by ::. This is possible for both rows. There is no more pattern to match. Therefore we get:

match3: {}  {q1}
        {}  {q2}

which reduces to q1.

match3 comes from match1 where x2 has been constructed by anything different from ::. This is possible only for the second row:

match4: {}  {q2}

which reduces to q2.

For q3 we get:

q3: inspect x2 case
        \ (x21 :: x22)  := q1
        \ _             := q2

q2 occurs in q0 and q3. In our simple case q2 = x but it might be a more complicated expression which should not be repeated and the compiler should generate a local abbreviation.

nodups:
    all (x: List Int): List Int
:= \ x :=
    inspect x case
        \ (x1 :: x2)    :=
            inspect x2 case
                \ (x21 :: x22) :=
                    if x1 =? x21 then
                        nodups x2
                    else
                        x1 :: nodups x2
                \ _ :=
                    abbr x
        \ _ :=
            abbr x
   where
    abbr x := x

9.10.4. Example: Eliminate Duplicates

nodups: List Int -> List Int := case
    \ (a := x :: (b := y :: c))  :=  q1
    \ a                          :=  a

    -- where q1:
    --  if x =? y then
    --      nodups b
    --  else
    --      x :: nodups b

:: is the only constructor, it is not exhaustive.

nodups: List Int -> List Int := case
    \ (a := x :: b) :=
        -- rows which fall into this case:
        -- \ (a := x :: (b := y :: c)) := q1
        -- \ (a := x :: b)             := a
        inspect b case
            \ (y :: c)  := q1
            \ _         := a
    \ a :=
        a

9.10.5. Examply unwieldy

unwieldy {A: Any}: List A -> List A -> R := case
    \ [] [] := c
    \ a  b  := f a b

[] is the only constructor in the first column, it is not exhaustive.

\ (a := []) :=
    -- \ (a := []) (b := []) := c
    -- \ (a := []) b         := f a b
    case
        \ (b := [])     := c
        \ b             := f a b

\ a :=
    -- \ a b := f a b
    case            -- 'case' not strictly needed, only variable pattern
        \ b := f a b

9.10.6. Example: demo

demo {A: Any}: List A -> List A -> R := case
    \ []        b           := f b
    \ a         []          := g a
    \ (x :: xs) (y :: ys)   := h x xs y ys

First columns has both constructors [] and ::, it is exhaustive.

\ (a := []) :=
    -- \ (a := [])  b   := f b
    -- \ (a := [])  []  := g a
    -- second case is shadowed
    case \ b := f b
\ (a := x :: xs) :=
    -- \ (a := x :: xs)     []          := g a
    -- \ (a := x :: xs)     (y :: ys)   := h x xs y ys
    case
        \ [] := g a
        \ (y :: ys) := h x xs y ys

9.10.7. Example: less equal

(<=?): Natural -> Natural -> Bool := case
    \ zero      _           := true
    \ _         zero        := false
    \ (succ n)  (succ m)    := n <=? m
\ zero :=
    -- \ (a := zero)    _       := true
    -- \ (_ := zero)    zero    := false
    -- second case is shadowed
    case \ _ := true

\ (succ n) :=
    -- \ (_ := succ n)  zero            := false
    -- \ (a := succ n)  (b := succ m)   := n <=? m
    case
        \ zero          := false
        \ succ m        := n <=? m

9.10.8. Example: greater equal

(>=?): Natural -> Natural -> Bool := case
    \ _             zero        := true
    \ zero          _           := false
    \ (succ n)      (succ m)    := n >=? m
\ zero :=
    -- \ (_ := zero)    zero    := true
    -- \ zero           _       := false
    case
        \ zero :=
            -- \ (b := zero)    := true
            -- \ (_ := zero)    := false
            true
        \ succ m :=
            -- \ (_ := succ m)  := false
            false

\ succ n :=
    -- \ (_ := succ n)      zero        := true
    -- \ (a := succ n)      (succ m)    := n >=? m
    case
        \ zero      := true
        \ succ m    := n >=? m

After clean up:

(>=?): Natural -> Natural -> Bool := case
    \ zero :=
        case \ zero     := true
             \ (succ m) := false
    \ succ n :=
        case \ zero     := true
        case \ (succ m) := n >=? m

9.10.9. Example: map2

map2 {A B C: Any} (f: A -> B -> C): List A -> List B -> List C
:= case
    \ []        _           := []
    \ _         []          := []
    \ (x :: xs) (y :: ys)   := f x y :: map2 xs ys


-- analysis:
\ [] :=
    -- \ []         _       := []
    -- \ (_ := [])  []      := []
    -- first row shadows second row
    case \ _ := []

\ (x :: xs) :=
    -- \ (_ := _ :: _)  []          := []
    -- \ (x :: xs)      (y :: ys)   := f x y :: map2 xs ys
    case \ []           := []
         \ (y :: ys)    := f x y :: map2 xs ys

-- cleanup:
case
    \ [] :=
        \ _ := []
    \ (x :: xs) :=
        case
            \ []        := []
            \ (y :: ys) := f x y :: map2 xs ys

The same with vectors:

map2
    {A B C: Any} (f: A -> B -> C)
    : all {n}: Vector A n -> Vector B n -> Vector C n
:= case
    \ []        []          := []
    \ (x :: xs) (y :: ys)   := f x y :: map2 xs ys

-- Analysis:
\ [] :=
    -- \ [] [] := []
    -- unification: n := 0, therefore second column is exhaustive
    case \ [] := []
\ (x :: xs) :=
    -- \ (x :: xs) (y :: ys) := f x y :: map2 xs ys
    -- unification: n := succ m, therefore second column is exhaustive
    case \ (y :: ys) := f x y :: map2 xs ys

-- cleanup:
case
    \ [] :=
        \ _ := []
    \ (x :: xs) :=
        case \ (y :: ys) := f x y :: map2 xs ys