Safe, polykinded universally quantified constraints (Part 3 of 3)

This is Part 3 of 3. See Part 1 for the introduction and Part 2 for the previous details.

I’m going to explain my idea for a sound Forall in two steps. In the first step, which was inspired by SPJ’s idea for “Fixing Any, we’ll see that it’s still susceptible to the same ohnoes attack above. When we understand why, we’ll see that GHC type equality is in one sense kind of mischevious. To avoid it’s cleverness, I use a new technique in the second step, which Kmett noticed looked a lot like Hilbert’s ε operator. I suspect this final version completely justifies the unsafeCoerce, but I have no plans to show it rigorously. I’ll just sketch my argument.

(This first step can also be added to Kmett’s double-skolem method. The second step is instead an alternative to Kmett’s approach. However, Kmett’s double-skolem plus the first step renders attack ill-typed; that combination might also be sound, even without the second step.)

In step one, we switch from using a data type for the skolem to using an uninstantiable type family.

type family Skolem :: k -- not exported

type Bottom = Skolem -- can be exported

type Forall (p :: k -> Constraint) = p (Bottom :: k)

Since this type family is not exported, it cannot be instantiated. This is crucial; otherwise it would effectively be just as susceptible to attack as the Skolem data type from the naïve attempt. Interestingly, the synonym Bottom can be exported. This is because applications of type families are not allowed in the arguments of instances, so exposing Bottom to the user gives them no capabilities that compromise our assumptions: it is still the case that only parametric instances can support Bottom. This is one advantage of type-family-as-skolem over data-type-as-skolem.

Also notice that we’ve switched to polykinds. Since our skolem was previously a data type, it had to end in *. So Kmett was forced to declare skolem data types at each kind of interest: *, * -> *, and so on. This is another big advantage of skolem type families over skolem type data types.

But alas, we are still susceptible to the same ohnoes attack as before. The ultimate problem is that GHC’s type equality supports unrestricted reflexivity: it considers Skolem ~ Skolem regardless if Skolem is a data type or an uninstantiated type family.

I do think that the general reflexive rule (a ~ a for all a) makes total sense, but I also think it might be more powerful than necessary. And since that “extra” power confounds our otherwise very tidy approach to this interesting problem of universally quantified constraints, I have at least a pet cause to motivate explaining my concerns.

Bottom types

My studies did not include a thorough treatment of domain theory. I’m weilding terminology here to my best understanding, and am probably wrong in parts — please correct me if so. Whatever correct knowledge about domain theory I have is due to some unlikely retention of past readings and a recent conversation with Neil Sculthorpe.

I called the synonym Bottom because it inhabits every kind and if type checking involves inspecting it, then there’s a type error. This is akin to the value-level, where undefined is the bottom of the domain (as in domain theory). It inhabits every type and if your program forces evaluation of undefined, then you get a (mostly) fatal run-time error: divergence. The same thing happens at compile-time with Bottom: if you force it, perhaps with Int ~ Bottom, you get a type error (saying Bottom isn’t defined enough), ie a fatal compile-time error (“compile-time divergence”?).

(The deferred type errors of Vytiniotis, Peyton Jones, and Magalhães are too clever to not mention here!)

The Bottom type resembles an actual domain theoretic bottom element, since it inhabits every kind and it’s a normal form but not a value. Yet it is not an actual bottom because type-level reduction — ie simplification via type family and type class instances — can manipulate Bottom in what would be a discontinuous way. For example, GHC synthesizes a coercion for Bottom ~ Bottom, but not for Bottom ~ Int. If you turn an argument (like Int) into a less defined type (like Bottom) and get more information out of a function, then it is discontinuous. Continuous means “more info in, more info out”, crudely. I get the feeling that, in domain theory, discontinuous things are suspicious. Is there already a known story about this discontinuous-seeming behavior in System FC? Does it matter?

Given XOverlappingTypeFamilies, I think we’ll also be able to define a discontinuous type family, Equal.

type family Equal (a :: k) (b :: k) :: Bool
type instance where
  Equal a a = True
  Equal a b = False

I’m assuming that Richard’s overlapping type family instances uses general reflexivity just like the non-linear patterns of instance heads, so that Equal Bottom Int is a normal form, but Equal Bottom Bottom reduces to True.

GHC’s type equality ~ is actually the propositional (type) equality of System FC pro, which indicates why it seems discontinuous: it is not defined soley in terms of reduction. One of the rules in that paper defining ~ is reflexivity, which is purely syntactic, so Bottom is determined equal to Bottom without reducing either argument. Only if the terms are not syntactically equivalent does the equality test need to do something like weak-head normal form reduction on the types and recur. In other words, an equality test is only strict if the argument types are syntactically distinct.

Again, this is far from non-sensical. Some notion of syntactic reflexivity is in fact required for a type variable to be equal to itself since type variables do not reduce. However, perhaps we could restrict reflexivity to disallow its invocation at type terms like Bottom, which do not involve type variables. Or perhaps add a primitive bottom type term that is not ~-equivalent to anything, even itself. I’m unaware of any programs that would be ill-typed under this restriction. But I’m also only aware of one program that would be well-typed because of it 🙂 — and it might also complicate and harm the efficiency of the GHC implemention of type equality.

(If someone can predict what lies down this path — good or bad — I’d appreciate it.)

Enough about bottoms…

Let’s get back on topic.

We can still write a robust Forall constraint even with GHC’s fully reflexive type equality. Because Bottom ~ Bottom, the skolem representatives of distinct type variables are considered equivalent. In particular, this equivalence judgment follows from the skolem terms’ syntactic equivalence: they’re all just Bottom. So, given what we have, how can we make them syntactically distinct? I’m going to discuss two options.

The first option is to use Kmett’s double-skolem technique again.

-- do not instantiate
type family Skolem1 :: k
type family Skolem2 :: k

type Forall (p :: k -> Constraint) = (p Skolem1, p Skolem2)

With two uninsantiable type families, even the attack trick above is ill-typed! The overlapping instances of the ObviousAbuse class still handles the scenarios where a and b are both represented by the same type family. But, when they are represented by the syntactically distinct type families, the equality test then tries to reduce them, and fails because they are uninstantiated! Success.

Recall that we’re just trying to make the term for each skolem be syntactically distinct. Kmett’s approach does that combinatorially: if you have N nested forall quantifiers, it checks that the quantified constraint holds for all 2^N representations, many of which use syntantically distinct skolem terms for each type variable.

The exponential complexity of checking the double-skolem technique is currently the only reason I suspect that the second option might be preferrable. I don’t know where to start in an attempt to subvert this first option. It’s seems just as robust as the second option I’m about to discuss.

The second option simply includes the p variable itself in the skolem term.

-- do not instantiate
type family Skolem :: (k -> Constraint) -> k

type Forall (p :: k -> Constraint) = p (Skolem p)

This is my final definition for Forall. I think it might even completely protect the use of unsafeCoerce. Here’s my proof sketch.

The only attack vectors I’m aware of comes from skolems being confused for one another when nesting Foralls — that’s what I used to build my ohnoes functions. To even nest Foralls in the first place requires auxiliary classes like Forall_Nested_Once, because Haskell has no type-level lambda. Note that such auxiliary classes become part of the quantified predicate p. In our attacks above, the first type variable is introduced by a call to Forall with the predicate Forall_Nested_Once Same and the second is introduced by the predicate Forall (Same SKO1), where SKO1 is the skolem for the first variable. So, if SKO1 and SKO2 are syntactically the same, we’re in an unsafe scenario. But by tucking the argument to Forall into the skolem as an argument, SKO1 and SKO2 can never be the same. As a result, I think we have the lemma: two skolem type terms representing type variables bound by quantifiers in the same nested sequence are tagged with the same constraint p iff they represent the same type variable. This stymies the attack vector we used, which is the only one I know of.

Of course, that’s just a sketch. Please let me know if you have a counterexample in mind.

Look familiar?

When I sent this idea to Kmett, he noticed that the Skolem type family with the new argument now resembles Hilbert’s ε operator. However, there’s some important distinctions.

Hilbert’s ε is characterized by the “transfinite axiom”: p (εx. p x), which formalizes the notion that εx. p x is some thing that satisfies p, if one exists. We don’t have that axiom for Skolem — that’s a big difference.

However, what happens when no such thing exists? Then εx. p x is “some nonsensical thing” — ideally a bottom type, for our purposes! This (I’m presuming) the reason that universal quantification can be realized using ε as follows.

forall a. p a = p (epsilon x. not (p x))

(This next sentence is a dense doozy; here’s a fuller discussion.) In words, a universally quantified predicate is true iff it is true for the term that is either some thing for which the predicate does not hold or else some nonsensical thing if no such thing exists. This is like parametricity: a function (eg in System F) is parametrically polymorphic iff it does not inspect its type argument.

So, I think it’d be accurate, for our uses here, to consider the Skolem type family with an argument to coincide with a composition of Hilbert’s ε and the hypothetical Not constraint. Neat.


One response to “Safe, polykinded universally quantified constraints (Part 3 of 3)

  1. If you’re familiar with real analysis, then the suspicion of discontinuity should be familiar. To put a finer point on it: consider a sequence of real numbers which appears to be “converging” on some point. That is, each time we take a step along the sequence it looks like we’re getting closer to some consistent goal (even if we can’t yet figure out where that goal is, per se). Sequences like these are in principle infinitely long; if someone gives us a finite sequence, just repeat the last element forever. The problem is, even if every local step is “converging”, there are infinitely many steps. Thus, the whole sequence could be converging on some value A, but then at the infinity-th step it could jump over to B instead. A local definition of convergence can’t rule out these sorts of sequences, but if we allowed them then we couldn’t reason about “the point” that a given convergent sequence converges on. This is why actual real-analysis requires a more intricate definition of a “converging sequence”. It’s to ensure you don’t have discontinuities which break the ability to reason.

    In domain theory the situation is similar. Though, in particular, the need for continuity tends to have more to do with ensuring that things are valid domain homomorphisms etc. If I have a bunch of partial orders and map between them, I can’t draw any conclusions unless I know those maps are monotonic (or antitonic). The idea for domain theory is similar, you just have lubs and infinities being thrown about too.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s