Redshift phase 2 is about classes for types with an arbitrary number of
*-kinded parameters. So one class, for example, that can be instantiated by
(,,,), and so on. Because I want to offer generic functionality that works on all these types, I need classes instantiable at all of their kinds.
So far, I’ve generalized over the sequence of type classes
Functor3 etc. Even without concern for a generic definition, this turns out to pose some interesting design decisions and be another motivation for GADT promotion. I share here one exploration of the design space and a little hack for simulating promotion of vectors today in GHC 7.6.1.
Full code in this gist.
If you’ve ever used anything above
BiFunctor/Functor2, please comment on this post! I’m guessing they’ve been used/desired for big tuples, but nothing else obvious comes to mind.
Lead with Examples!
Here’s a taste of the user interface to my generalization. Note that these instances reuse the same
CovariantN class at various kinds (ie
* -> * and
* -> * -> * here). The instances realize the fact that, for example, the
Either types are both independently covariant in both their first and second arguments.
-- intuitive pseudo-declaration -- class CovariantN (t :: k) where -- covmapN :: (a -> z) -> (t a) b c d ... -> (t z) w y x ... instance CovariantN  where covmapN = underUncurry . map instance CovariantN (,) where covmapN f = underUncurry $ f Control.Arrow.*** f instance CovariantN ((,) b) where covmapN f = underUncurry $ (\b, a) -> (b, f a) instance CovariantN Either where covmapN f = underUncurry $ \case -- LambdaCase FTW Left x -> Left (f x) Right x -> Right x instance CovariantN (Either b) where covmapN f = underUncurry $ \case Left x -> Left x Right x -> Right (f x)
I’m guessing independently covariant set off your jargon alarm. By independently covariant, I mean a type is covariant in some type parameter with no regard for its other parameters — they could be contravariant or even indexed. I chose this interface for
CovariantN because it’s more general than the path extrapolated from the the normal definition of
Bifunctor, which requires that the class parameter be covariant in all of its arguments. This precludes supporting datatypes like the following, where one argument is covariant but the others are instead, say, indexed.
data G :: * -> * -> * where G :: a -> G a Int instance CovariantN G where covmapN f = underUncurry $ \(G a) -> G (f a)
covmapN method applies a single function to the type’s first argument, it doesn’t quite match up to the conventional
BiFunctor/Functor2. I’ll show how to recover that interface at the end of the post. For now, I want to show how
covmapN is used.
-- inferred exampleA :: (Bool -> b) -> (Char, b) exampleA f = covmapN f `asUC1` ('c', False) -- inferred exampleC :: (Char -> b) -> (b, Bool) exampleC f = covmapN f `asUC2` ('c', False) -- inferred exampleD :: (Num n1, Num n2) => (n2 -> b) -> (n1, b) exampleD f = covmapN f `asUC1` (0, 0) -- inferred exampleG :: GADT [Char] Int exampleG = covmapN show `asUC2` GADT ()
That’s pretty nifty, right? It ends up being a nice interface with only one wrinkle: the user has to explicitly invoke functions like
asUC2, etc. But it’s an essential wrinkle; these functions indicate to which parameter of the argument you’re applying the mapped function. I say wrinkle because I don’t think these are actually much of a burden. If the user has even the simplest understanding of kinds, it becomes easy to know which function to use — as I explain below. And everything else works today in GHC 7.6.1.
While the instances and uses of
covmapN are tidy, the class is a little less so. But do realize that the instances and invocations of
covmapN are what the user deals with! And those two things are both very formulaic; the semantics is simple. The user doesn’t necessarily have to deal with the class declaration.
Now, about that class declaration… brace yourselves.
The Actual Class Declaration
class CovariantN (t :: k) where covmapN :: VecR (NumArgs (t a)) ps => -- hack! explained below :) (a -> b) -> Uncurry t (a ': ps) -> Uncurry t (b ': ps)
We can get through this. Start by comparing it to the intuitive pseudo-declaration I showed earlier.
-- class CovariantN t where -- covmapN :: (a -> z) -> (t a) b c d ... -> (t z) w y x ...
The key idea is that
covmapN works on the first parameter of
t, no matter how many other parameters it has (all of which must be of kind
*). The actual class attains this flexibility by fully uncurrying
t at the type-level. We apply the uncurried
t to a type-level list of all its arguments. In the domain and range of
covmapN f, all of
t’s arguments are in the lists
(a ': ps) and
(b ': ps), respectively.
I’ll explain the
VecR constraint in the
covmapN signature below, when it becomes relevant. The piece to focus on now is the
Uncurry data family.
Uncurrying at the Type-Level
Uncurry data family fully uncurries a type: it converts a type expecting
N parameters into one that expects just one parameter: a tuple of
data family Uncurry (t :: k) :: [*] -> *
Note that the
N-tuple of type is unfortunately modeled as a list. Without promoted GADTs and kind-indexed kind families, we cannot specify that the length of the list is the number of arguments in
k — that’s what we’ll workaround later via
We can write two instances of
Uncurry that together handle any number of
* arguments. I unfortunately don’t see a tidy way to lift the restriction to
*. I know how to support a fixed set of kinds as arguments, but I don’t know how to make that set extensible. However, this restriction to
* still permits use of
CovariantN with very many common types.
newtype instance Uncurry (t :: *) ps = UCZ t newtype instance Uncurry (t :: * -> k) ps = UCS (Uncurry (t (Head ps)) (Tail ps))
It is also crucial to the user interface to define the following functions.
asUCZ f = (\(UCZ x) -> x) . f . UCZ asUCS f = (\(UCS x) -> x) . f . UCS asUC1 f = asUCZ (asUCS f) asUC2 f = asUC1 (asUCS f) asUC3 f = asUC2 (asUCS f) -- ... more as needed in practice
I consider these functions part of
CovariantN’s user interface. They’d be just as crucial for any other class that uses
Uncurry for type-level variadism. The way these functions are used is so intuitive that I think the benefits of
CovariantN at the type-level outweigh having these functions in the interface — they more than break even, in my opinion.
The user chooses between
asUC2, … by using the formula
N = <number-of-args> - <which-arg> + 1. Note that we’re just counting the arguments from the right-to-left, starting at one instead of zero. Consider again the example uses I showed above.
exampleA f = covmapN f `asUC1` ('c', False) -- f is applied to False exampleC f = covmapN f `asUC2` ('c', False) -- f is applied to 'c'
Since the tuple has two arguments, the user thinks one of the following.
“I want to apply
fto the 2nd argument of the tuple and
2 - 2 + 1 = 1, so I apply via
“I want to apply
fto the last argument of the tuple, so I apply via
“I want to apply
fto the 1st argument of the tuple and
2 - 1 + 1 = 2, so I apply via
“I want to apply
fto the second to last argument of the tuple, so I apply via
To hit the 4th element in an 6 tuple, you’d use
asUC3. We count backwards – right-to-left — because the
asUCN is how many more arguments the type expects. For the 4th element, we’re relying on
CovariantN ((,,,,,) a b c), and that type expects 3 more arguments; the 4th element is the third to last.
That’s all there is to using the
covmapN method. But manys user will also need to declare new instances of
CovariantN. For that, they need the inverse of the
asUCN fuctions. The inverses are called
underUC1, and so on. I used a generalization of these in the instances at the top of the post:
underUncurry. If the type
t is determined by the context, we can can implicitly choose the
N. For details, see
RecurryIso in the gist. That’s where
underCurry comes from.
OK. That’s it for the user interface.
Some Related Work
Before I start explaining
VecR, I’d like to briefly share some connections.
Ralf Hinze’s Polytypic values possess polykinded types uses some type features that Haskell does not currently support in order to implement a generic function for covariant mapping. Like
Functor2, it requires covariance in all parameters as opposed to the one-at-a-time of
CovariantN, but it also supports arbitrarily kinded parameters. The major correspondence with this post is that the paper also uses
foralls on the RHS of type family instances, which my use of
José Pedro Magalhães’ The Right Kind of Generic Programming shows that allowing
foralls on the RHS of type family instances would help to make the generic representation of a datatype more strongly kinded.
Uncurry could be a type family if type families could be injective.
Weirich, Hsu, and Eisenberg recently submitted Down with kinds, which, among many neat things, would enable promotion of GADTs. In light of that paper — the content of which is probably imminent for implemention in GHC — the rest of the technical content in this post is, in some sense, me picking up pennies in front of a steam-roller.
So what is it that
VecR constraint on
covmapN all about?
If you remove the
VecR constraint from the type of
exampleC becomes ill-typed.
-- intended to be exampleC :: (Char -> b) -> (b, Bool) exampleC f = covmapN f `asUC2` ('c', False) -- without the VecR constraint on covmapN, -- exampleC raises this error -- -- Couldn't match type `Head * ps0' with `Bool' -- The type variable `ps0' is ambiguous
The key bit is that the
ps0 type is ambiguous. In the type of
asUC2 (covmapN f), there are two type-level lists of parameters, one for the input and one for the output. In
exampleC, the input one gets fixed when we provide the tuple argument, but the output one doesn’t. The error message indicates GHC inferred that the first element in the list should be
Bool. However, GHC has no information about the spine of the list, so it can’t put that constraint to use (ie it can’t simplify
Head ps0). This is the problem
VecR solves: it determines the spine of the parameter list from the kind of
t, which is often statically known. We could have instead provided annotations determining the spine of
ps, but they get overwhelming quickly — inference is necessesary for usability here.
Ideally, we could use a promoted GADT and kind families to specify this relation between the kind of
t and the length of
ps in a type
Uncurry t ps. If
Vec is the sized-list GADT and
NumArgs returns a promoted
Uncurry could be declared as follows.
data family Uncurry (t :: k) :: Vec (NumArgs k) * -> *
I think this would solve the inference issue rather directly: the length of the list (vector) of arguments is now explicitly determined by the kind of
t. I’m not sure how the other code might need to change, since there’s no implementation for me to bang on. But an implementation of these ideas in GHC seems likely to me. Peruse the UPenn group’s submitted draft, if you’re interested.
VecR Solves the Problem
Instead of appealing to promotion of GADTs, we can put type equality constraints to work. In particular, type equalities “are enough” to produce a type-level list’s spine from a type-level natural.
Recall the type of
covmapN :: VecR (NumArgs (t a)) ps => -- explained now! (a -> b) -> Uncurry t (a ': ps) -> Uncurry t (b ': ps)
NumArgs is simple; it just counts the number of expected arguments. We get a
Nat from a kind.
type family Bottom :: k -- never to be instantiated type family NumArgs (t :: k) :: Nat type instance NumArgs (t :: *) = Z type instance NumArgs (t :: kd -> kr) = S (NumArgs (Bottom :: kr))
VecR; it’s just a constraint alias for a type equality.
class (ps ~ NonStrictTakeN n ps) => VecR (n :: Nat) (ps :: [k]) instance (ps ~ NonStrictTakeN n ps) => VecR n ps
The important part is the
NonStrictTakeN type family. It’s functionally equivalent to a promotion of
take, but it crucially produces the result list’s spine without forcing that of the input list.
type family NonStrictTakeN (n :: Nat) (ts :: [k]) :: [k] type instance NonStrictTakeN Z ts = ' type instance NonStrictTakeN (S n) ts = Head ts ': NonStrictTakeN n (Tail ts)
The superclass constraint of
VecR basically ties the knot around
That’s all there is to the
VecR hack. It’s only interesting because
it enables type inference on uses of
covmapNin GHC 7.6.1 today, and
and it suffices for one possible use of a promoted GADT.
The second bullet suggests that there may be an alternative route to the expressiveness that we expect from promoted GADTs. That could be interesting to explore. In particular, that route could possibly be more viable for other compilers of Haskell-like languages. Or it could just be a cute little device, totally subsumed by GADT promotion — time will tell.
I have two reasons for generalizing over
Functor2, and so on.
I’m experimenting with a variadic version of
generic-deriving, so I need a single class of which to provide a generic definition that supports covariance in any argument.
This is the only approach to Haskell type classes of types with varying number of arguments (and also with methods) that I’m aware of. It’s an interesting space to explore, especially when considering various variances and indexing.
Uncurry family is the key to manipulating the values of variadic types. Combined with the
Uncurry is a way to squeeze multiple universally quantified type variables (each parameter) out of one universally quantified type variable (the list
ps) — that’s how I came up with the approach. There’s plenty of design decisions around this ingredient to be revisited, but I think more applications might need to be identified. A similar generalization of the
EqT class mentioned in my previous Redshift post might be one.
A final word about modularity. Recall that
covmapN can be reused at any kind, which is a clear improvement over the sequence
fmap2, … This is inherited from the polykindedness of
Uncurry. However, the example uses of
covmapN I showed required
asUC1, and so on, so
covmapN is in a way only half reusable. I think it is interesting to consider that the
asUC* functions in a way factor out the difference between
fmap2, … They can play the same role for any other variadic type class like
CovariantN — though I don’t have any obvious candidates in mind (other than
Since I claimed in the intro that I generalized over that sequence of type classes, I’ll say a bit about how to get that interface from
CovariantN. We can recover the
Functor2 interface as follows:
Functor2 t iff (CovariantN t, forall a. CovariantN (t a)). One safe way to encode that forall in valid GHC Haskell would be to use a generalization of the machinery in Edward Kmett’s constraints package. I’ve been talking to him about it, and will blog about it very soon.
The single-parameter-at-a-time approach I’ve taken here is relatively straight-forward, and I think it’ll be pretty workable. But it is just one point in a rather rich design space. In particular, an alternative to the
asUC2, etc might be to add a
Nat as a class parameter to specify which argument we’re mapping over. I haven’t explored that option yet: the user interface would be a bit more approachable, but I’m not sure of the cost to the internal defintions (like