This is Part 3 of 3. See Part 1 for the introduction and Part 2 for the previous details.
I’m going to explain my idea for a sound Forall in two steps. The first step, which was inspired by SPJ’s idea for “Fixing Any”, turns out to still be susceptible to the same ohnoes attack above. When we understand why, we’ll see that GHC type equality is in one sense rather mischievous. To avoid its cleverness, I use a new technique in the second step, which Kmett noticed looks a lot like Hilbert’s ε operator. I suspect this final version completely justifies the unsafeCoerce, but I have no plans to show it rigorously; I’ll just sketch my argument.
(This first step can also be added to Kmett’s double-skolem method. The second step is instead an alternative to Kmett’s approach. However, Kmett’s double-skolem plus the first step renders attack ill-typed; that combination might also be sound, even without the second step.)
In step one, we switch from using a data type for the skolem to using an uninstantiable type family.
type family Skolem :: k -- not exported
type Bottom = Skolem -- can be exported
type Forall (p :: k -> Constraint) = p (Bottom :: k)
Since this type family is not exported, it cannot be instantiated. This is crucial; otherwise it would be effectively just as susceptible to attack as the Skolem data type from the naïve attempt. Interestingly, the synonym Bottom can be exported. This is because applications of type families are not allowed in the arguments of instances, so exposing Bottom to the user gives them no capabilities that compromise our assumptions: it is still the case that only parametric instances can support Bottom. This is one advantage of type-family-as-skolem over data-type-as-skolem.
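As a quick illustration, even a client that imports Bottom cannot write a non-parametric instance for it, because GHC rejects type family applications in instance heads. (The Show instance below is hypothetical, shown only to demonstrate the rejection.)

```haskell
{-# LANGUAGE TypeFamilies, PolyKinds #-}

type family Skolem :: k   -- not exported
type Bottom = (Skolem :: *)

-- GHC rejects the following with an "illegal type synonym family
-- application in instance" error, so even with Bottom in scope a
-- client can only reach it through parametric instances:
--
-- instance Show Bottom where show _ = "bottom"
```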
Also notice that we’ve switched to polykinds. Since our skolem was previously a data type, it had to end in *. So Kmett was forced to declare skolem data types at each kind of interest: *, * -> *, and so on. This is another big advantage of skolem type families over skolem data types.
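Concretely, a single poly-kinded Skolem serves every kind where the data-type encoding needed one declaration per kind. (The synonym names below are just for illustration.)

```haskell
{-# LANGUAGE TypeFamilies, PolyKinds, KindSignatures #-}

type family Skolem :: k   -- not exported

-- One poly-kinded skolem replaces a whole family of data types,
-- one per kind of interest:
type SkolemType   = (Skolem :: *)
type SkolemUnary  = (Skolem :: * -> *)
type SkolemBinary = (Skolem :: * -> * -> *)
```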
But alas, we are still susceptible to the same ohnoes attack as before. The ultimate problem is that GHC’s type equality supports unrestricted reflexivity: it considers Skolem ~ Skolem regardless of whether Skolem is a data type or an uninstantiated type family.
I do think that the general reflexive rule (a ~ a for all a) makes total sense, but I also think it might be more powerful than necessary. And since that “extra” power confounds our otherwise very tidy approach to this interesting problem of universally quantified constraints, I have at least a pet cause to motivate explaining my concerns.
Bottom types
My studies did not include a thorough treatment of domain theory. I’m wielding terminology here to the best of my understanding, and am probably wrong in parts — please correct me if so. Whatever correct knowledge about domain theory I have is due to some unlikely retention of past readings and a recent conversation with Neil Sculthorpe.
I called the synonym Bottom because it inhabits every kind, and if type checking involves inspecting it, there’s a type error. This is akin to the value level, where undefined is the bottom of the domain (as in domain theory): it inhabits every type, and if your program forces evaluation of undefined, you get a (mostly) fatal run-time error: divergence. The same thing happens at compile time with Bottom: if you force it, perhaps with Int ~ Bottom, you get a type error (saying Bottom isn’t defined enough), ie a fatal compile-time error (“compile-time divergence”?).
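A small sketch of this compile-time divergence: reflexivity on Bottom succeeds without inspecting it, but demanding a coercion to a concrete type forces the stuck family, and GHC rejects the program.

```haskell
{-# LANGUAGE TypeFamilies, PolyKinds #-}

type family Skolem :: k   -- not exported
type Bottom = (Skolem :: *)

-- Fine: reflexivity never forces Bottom.
refl :: Bottom ~ Bottom => ()
refl = ()

-- Rejected at compile time: GHC cannot synthesize Int ~ Bottom,
-- since Bottom is a stuck type family application.
--
-- oops :: Int ~ Bottom => ()
-- oops = ()
```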
(The deferred type errors of Vytiniotis, Peyton Jones, and Magalhães are too clever to not mention here!)
The Bottom type resembles an actual domain-theoretic bottom element, since it inhabits every kind and it’s a normal form but not a value. Yet it is not an actual bottom, because type-level reduction — ie simplification via type family and type class instances — can manipulate Bottom in what would be a discontinuous way. For example, GHC synthesizes a coercion for Bottom ~ Bottom, but not for Bottom ~ Int. If you turn an argument (like Int) into a less defined type (like Bottom) and get more information out of a function, then it is discontinuous. Continuous means “more info in, more info out”, crudely. I get the feeling that, in domain theory, discontinuous things are suspicious. Is there already a known story about this discontinuous-seeming behavior in System FC? Does it matter?
Given XOverlappingTypeFamilies, I think we’ll also be able to define a discontinuous type family, Equal.
type family Equal (a :: k) (b :: k) :: Bool
type instance where
Equal a a = True
Equal a b = False
I’m assuming that Richard’s overlapping type family instances use general reflexivity just like the non-linear patterns of instance heads, so that Equal Bottom Int is a normal form, but Equal Bottom Bottom reduces to True.
GHC’s type equality ~ is actually the propositional (type) equality of System FC pro, which indicates why it seems discontinuous: it is not defined solely in terms of reduction. One of the rules in that paper defining ~ is reflexivity, which is purely syntactic, so Bottom is determined equal to Bottom without reducing either argument. Only if the terms are not syntactically equivalent does the equality test need to do something like weak-head normal form reduction on the types and recur. In other words, an equality test is only strict if the argument types are syntactically distinct.
Again, this is far from nonsensical. Some notion of syntactic reflexivity is in fact required for a type variable to be equal to itself, since type variables do not reduce. However, perhaps we could restrict reflexivity to disallow its invocation at type terms like Bottom, which do not involve type variables. Or perhaps add a primitive bottom type term that is not ~-equivalent to anything, even itself. I’m unaware of any programs that would be ill-typed under this restriction. But I’m also only aware of one program that would be well-typed because of it 🙂 — and it might also complicate and harm the efficiency of the GHC implementation of type equality.
(If someone can predict what lies down this path — good or bad — I’d appreciate it.)
Enough about bottoms…
Let’s get back on topic.
We can still write a robust Forall constraint even with GHC’s fully reflexive type equality. Because Bottom ~ Bottom, the skolem representatives of distinct type variables are considered equivalent. In particular, this equivalence judgment follows from the skolem terms’ syntactic equivalence: they’re all just Bottom. So, given what we have, how can we make them syntactically distinct? I’m going to discuss two options.
The first option is to use Kmett’s double-skolem technique again.
-- do not instantiate
type family Skolem1 :: k
type family Skolem2 :: k
type Forall (p :: k -> Constraint) = (p Skolem1, p Skolem2)
With two uninstantiable type families, even the attack trick above is ill-typed! The overlapping instances of the ObviousAbuse class still handle the scenarios where a and b are both represented by the same type family. But when they are represented by syntactically distinct type families, the equality test then tries to reduce them, and fails because they are uninstantiated! Success.
Recall that we’re just trying to make the term for each skolem syntactically distinct. Kmett’s approach does that combinatorially: if you have N nested forall quantifiers, it checks that the quantified constraint holds for all 2^N representations, many of which use syntactically distinct skolem terms for each type variable.
The exponential complexity of checking the double-skolem technique is currently the only reason I suspect that the second option might be preferable. I don’t know where to start in an attempt to subvert this first option. It seems just as robust as the second option I’m about to discuss.
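To make the combinatorics concrete for N = 2, here is a hand expansion of two nested double-skolem quantifiers over a binary predicate. (Forall2 is a hypothetical helper; nesting the real Forall via an auxiliary class produces the same four constraints.)

```haskell
{-# LANGUAGE TypeFamilies, PolyKinds, ConstraintKinds #-}

import Data.Kind (Constraint)

-- do not instantiate
type family Skolem1 :: k
type family Skolem2 :: k

-- 2^2 = 4 representations for two nested quantifiers; the mixed
-- pairs use syntactically distinct skolems for the two variables,
-- which is what defeats the skolem-confusion attack.
type Forall2 (p :: k -> k -> Constraint) =
  ( p Skolem1 Skolem1, p Skolem1 Skolem2
  , p Skolem2 Skolem1, p Skolem2 Skolem2 )
```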
The second option simply includes the p variable itself in the skolem term.
-- do not instantiate
type family Skolem :: (k -> Constraint) -> k
type Forall (p :: k -> Constraint) = p (Skolem p)
This is my final definition for Forall. I think it might even completely protect the use of unsafeCoerce. Here’s my proof sketch.
The only attack vector I’m aware of comes from skolems being confused for one another when nesting Foralls — that’s what I used to build my ohnoes functions. To even nest Foralls in the first place requires auxiliary classes like Forall_Nested_Once, because Haskell has no type-level lambda. Note that such auxiliary classes become part of the quantified predicate p. In our attacks above, the first type variable is introduced by a call to Forall with the predicate Forall_Nested_Once Same, and the second is introduced by the predicate Forall (Same SKO1), where SKO1 is the skolem for the first variable. So, if SKO1 and SKO2 are syntactically the same, we’re in an unsafe scenario. But by tucking the argument to Forall into the skolem as an argument, SKO1 and SKO2 can never be the same. As a result, I think we have the lemma: two skolem type terms representing type variables bound by quantifiers in the same nested sequence are tagged with the same constraint p iff they represent the same type variable. This stymies the attack vector we used, which is the only one I know of.
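For reference, here is a rough reconstruction of the shape of such an auxiliary class; the real one is defined in the earlier parts, so treat this as an assumption.

```haskell
{-# LANGUAGE TypeFamilies, PolyKinds, ConstraintKinds,
             MultiParamTypeClasses, FlexibleInstances,
             UndecidableInstances, UndecidableSuperClasses #-}

import Data.Kind (Constraint)

-- do not instantiate
type family Skolem :: (k -> Constraint) -> k
type Forall (p :: k -> Constraint) = p (Skolem p)

-- Haskell has no type-level lambda, so a class stands in for
-- \a -> Forall (p a); nesting is then Forall (Forall_Nested_Once p).
class    Forall (p a) => Forall_Nested_Once p a
instance Forall (p a) => Forall_Nested_Once p a
```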
Of course, that’s just a sketch. Please let me know if you have a counterexample in mind.
Look familiar?
When I sent this idea to Kmett, he noticed that the Skolem type family with the new argument now resembles Hilbert’s ε operator. However, there are some important distinctions.
Hilbert’s ε is characterized by the “transfinite axiom”: p (εx. p x), which formalizes the notion that εx. p x is some thing that satisfies p, if one exists. We don’t have that axiom for Skolem — that’s a big difference.
However, what happens when no such thing exists? Then εx. p x is “some nonsensical thing” — ideally a bottom type, for our purposes! This is (I’m presuming) the reason that universal quantification can be realized using ε as follows.
forall a. p a = p (epsilon x. not (p x))
(This next sentence is a dense doozy; here’s a fuller discussion.) In words, a universally quantified predicate is true iff it is true for the term that is either some thing for which the predicate does not hold or else some nonsensical thing if no such thing exists. This is like parametricity: a function (eg in System F) is parametrically polymorphic iff it does not inspect its type argument.
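Spelled out in the usual notation, the two facts at work are the transfinite axiom and the derived encoding of the universal quantifier:

```latex
\exists x.\, p(x) \;\rightarrow\; p(\varepsilon x.\, p(x))
\qquad\qquad
\forall x.\, p(x) \;\Longleftrightarrow\; p(\varepsilon x.\, \neg p(x))
```

If ¬p has a witness, then εx. ¬p(x) picks one and the right-hand side fails; if it has none, then p holds everywhere, including at whatever arbitrary thing the ε term denotes.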
So, I think it’d be accurate, for our uses here, to consider the Skolem type family with an argument to coincide with a composition of Hilbert’s ε and the hypothetical Not constraint. Neat.