Safe, polykinded universally quantified constraints (Part 2 of 3)

This is Part 2 of 3. See Part 1 for context, and Part 3 for conclusion.

I’m going to simplify for the rest of the discussion. Instead of instF and its friends from the constraints package, we’ll just use the following. I’m pretty sure it’s equipotent.

data Dict :: Constraint -> * where
  Dict :: c => Dict c

-- yields a Constraint
type Forall (p :: * -> Constraint) = ??? -- the $64K question

inst :: forall p a. Forall p => Dict (p a)
inst = unsafeCoerce (Dict :: Dict (Forall p)) -- forgive us
  -- NB: the ascription on Dict may need to be adjusted
  --     depending on the definition of Forall

The first thing to notice is that I’m using Max Bolingbroke’s ConstraintKinds language extension. Second is the use of unsafeCoerce — this blasphemy isn’t the first concern only because the code doesn’t even compile without ConstraintKinds.

If Forall p were a adequate representation of (forall a. p a) then the “unsafe” coercion would actually be safe. Unfortunately, GHC Haskell does not support that external syntax, so we can’t use it in place of the ??? syntax. So how can we emulate it? We must come up with some constraint that is satisfied iff p a is satisfiable for any a. Since a constraint is satisfiable iff GHC can synthesize a dictionary or cast (ie a proof of propositonaly equality in System FC) for it, this means we need a dictionary or cast that is parametric in a.

In our Box/Monoid example above, there is an instance that results in a suitably parametric dictionary.

instance Monoid [a] -- Defined in `Data.Monoid'

This instance matches our p — which is akin to \a -> Monoid [a] — and is parametric in its argument.

The ??? question becomes: how can we write a constraint that is satisfied iff there is a dictionary or coercion for p a that is parametric in a?

The naïve approach

My first guess for ??? was as follows.

type Bad0_Forall (p :: * -> Constraint) = p Skolem

data Skolem -- not exported

Since Skolem is not exported, I figured the only way an instance could be declared that matched p Skolem is if the instance were parametric. Sound reasonable? Here’s where things start getting informative.

That intuition is wrong. It would be correct (I think) if GHC instances had no support for testing type equality. But, fortunately for strongly-typed programmers everywhere, GHC does. Here’s an example instance that violates the incorrect intuition.

class Same (a :: *) (b :: *) where safeCoerce :: a -> b
instance Same a a where safeCoerce = id

At first glance, that instance of Same does look suitably parametric. In particular, its dictionary can evidence Same Skolem Skolem. But we can use the naïve definition of Forall to create evidence for Int ~ Char!

-- this class let's us nest uses of Forall
class Forall (c a) =>
  Forall_Nested_Once (c :: * -> * -> Constraint) (a :: *)
instance Forall (c a) => Forall_Nested_Once c a

-- this type should look like trouble...
ohnoes :: forall a b. a -> b
ohnoes = case inst :: Dict (Forall_Nested_Once Same a) of
  Dict -> case inst :: Dict (Same a b) of
    Dict -> safeCoerce

Well this is exciting! I can use ohnoes to finally figure out the value of that pesky NaN!

*NaiveForall> ohnoes "NaN" :: Double
-5.075883694985077e-116
*NaiveForall> ohnoes "NaN" :: Double
-1.727233717929506e-77
*NaiveForall> ohnoes "NaN" :: Double
1.4916681522467867e-154

I bet it’s such a special number because it’s nondeterministic.

How’d that happen?

That naïve definition of Forall assumes that not exporting Skolem justifies our use of unsafeCoerce, and clearly it doesn’t: forall a b. a -> b should only be inhabited by undefined. So what happened there?

It helps to understand the relationship between Same and GHC’s type equality predicate ~.

class    (a ~ b) => Same' (a :: *) (b :: *)
instance            Same' a a

ohnoes' :: forall (a :: *) (b :: *). Dict (a ~ b)
ohnoes' = case inst :: Dict (Forall_Nested_Once Same' a) of
  Dict -> case inst :: Dict (Same' a b) of
    Dict -> (Dict :: Dict (a ~ b))

Now, Same' a b is satisfied iff GHC determines a and b to be equivalent. That is effectively also true of Same a b, but this class makes explicit the relationship between GHC’s type equality and the non-linear binding allowed in instance heads.

That relationship is one half of the problem. The other half involves the nested use of Forall. Note that in ohnoes we built a dictionary for (forall a. forall b. Same a b). The double quantification is key, and it was the auxiliary Forall_Nested_Once class that let us nest one use of Forall inside of another. The naïve Forall represents both of those type variables with the same actual type: Skolem. But if a and b are both represented by Skolem, then our use of unsafeCoerce ultimately uses a dictionary for Same Skolem Skolem as a dictionary for Same a b. That’s what went wrong.

Here’s how Neil breaks down what goes wrong when a is Int and b is Char. The annotations are mine.

   Skolem ~ Skolem
-- => explicit partial application of ~
   (Skolem ~) Skolem
-- => defn of Forall
   Forall (Skolem ~)
-- => our naive and unsound assumption:
--      for any p, p Skolem implies forall a. p a
   forall a. (Skolem ~ a)
-- => a couple instantiations
   Skolem ~ Int and Skolem ~ Char
-- => ~ is transitive
   Int ~ Char

Kmett’s fix for this

So the problem was that our emulation of universal quantification uses the same skolem type (ie Skolem) to represent any number of distinct type variables introduced by nested quantifiers. If we quantify over a and b with our naïve Forall, then the GHC instance selection mechanism sees a and b as the same type: Skolem.

Edward Kmett defends against this abuse of the naïve Forall by using the following definition of Forall.

data Skolem1 -- not exported
data Skolem2 -- not exported
type Forall (p :: * -> Constraint) = (p Skolem1, p Skolem2)

This is robust against our above abuses; the ohnoes functions are ill-typed. In terms of my explanation of the abuse, the GHC instance selection is now invoked with all four possible representations of the a and b type variables.

a ~ Skolem1, b ~ Skolem1 -- bad representation
a ~ Skolem1, b ~ Skolem2 -- good
a ~ Skolem2, b ~ Skolem1 -- good
a ~ Skolem2, b ~ Skolem2 -- bad

And in order for the nested Forall constraints to be accepted, the constraint p a b must hold for every combination of a and b possible representations. Since there are two representations in there in which a and b have different representations, our ohnoes functions won’t type-check. That’s a pretty cool idea.

End of Part 2

Kmett’s technique is a neat defense mechanism and I think it is actually strong enough to justify the unsafeCoerce. While it’s not totally sound, the key point is that it’s extremely unlikely that anyone would subvert it accidentally. As far as I know, to do so requires XOverlappingInstances, which can alone be abused to do unsafe things even without involving something like unsafeCoerce.

{-# LANGUAGE OverlappingInstances #-}

import Data.Constraint
import qualified Data.Constraint.Forall

class    ObviousAbuse a b where obviousAbuse :: Maybe (Dict (a ~ b))
instance ObviousAbuse a a where obviousAbuse = Just Dict
instance ObviousAbuse a b where obviousAbuse = Nothing

-- thin wrappers around Kmett's Data.Constraint.Forall module
type Forall p = Data.Constraint.Forall.Forall p
inst :: forall p a. Forall p => Dict (p a)
inst = case Data.Constraint.Forall.inst :: Forall p :- p a of
  Sub Dict -> Dict

-- the obvious abuse
attack :: forall a b. a -> b
attack x = case inst :: Dict (Forall_Nested_Once ObviousAbuse a) of
  Dict -> case inst :: Dict (ObviousAbuse a b) of
    Dict -> case obviousAbuse :: Maybe (Dict (a ~ b)) of
      Just Dict -> x

*KmettForall> attack "NaN" :: Double
3.105036197105827e231
*KmettForall> attack "NaN" :: Double
2.6815615967173103e154
*KmettForall> attack "NaN" :: Double
2.681561596787666e154

Like I said, this abuse is clearly an attack — it is unlikely that someone would come up with this trick simply by reacting to the error messages generated by their attempted direct use of inst. But I supposed a million monkeys generating Safe Haskell programs might come up with it within a million years. Yet another reason not to hire monkeys for programming Haskell…

It would still be nice to be even more robust, perhaps even totally sound, wouldn’t it? Just in case you only had monkeys for employees. Furthermore, the need for XOverlappingInstances in attack will soon be supplanted by just overlapping type families, which I believe Richard Eisenberg is in the middle of proving to be a safe language extension. It’s still unlikely that someone could accidentally subvert Data.Constraint.Forall using overlapping type families, but I think we can still do better.

See Part 3 for the conclusion.

Leave a comment