Redshift Phase 2: types classes for types with arbitrary numbers of * parameters

Redshift phase 2 is about classes for types with an arbitrary number of *-kinded parameters. So one class, for example, that can be instantiated by [], (,), Either, (,,), (,,,), and so on. Because I want to offer generic functionality that works on all these types, I need classes instantiable at all of their kinds.

So far, I’ve generalized over the sequence of type classes Functor, BiFunctor/Functor2, Functor3 etc. Even without concern for a generic definition, this turns out to pose some interesting design decisions and be another motivation for GADT promotion. I share here one exploration of the design space and a little hack for simulating promotion of vectors today in GHC 7.6.1.

Full code in this gist.

If you’ve ever used anything above BiFunctor/Functor2, please comment on this post! I’m guessing they’ve been used/desired for big tuples, but nothing else obvious comes to mind.

Lead with Examples!

Here’s a taste of the user interface to my generalization. Note that these instances reuse the same CovariantN class at various kinds (ie * -> * and * -> * -> * here). The instances realize the fact that, for example, the (,) and Either types are both independently covariant in both their first and second arguments.

-- intuitive pseudo-declaration
-- class CovariantN (t :: k) where
--  covmapN :: (a -> z) -> (t a) b c d ... -> (t z) w y x ...

instance CovariantN [] where covmapN = underUncurry . map

instance CovariantN (,) where
  covmapN f = underUncurry $ f Control.Arrow.*** f
instance CovariantN ((,) b) where
  covmapN f = underUncurry $ (\b, a) -> (b, f a)

instance CovariantN Either where
  covmapN f = underUncurry $ \case -- LambdaCase FTW
    Left  x -> Left  (f x)
    Right x -> Right x
instance CovariantN (Either b) where
  covmapN f = underUncurry $ \case
    Left  x -> Left  x
    Right x -> Right (f x)

I’m guessing independently covariant set off your jargon alarm. By independently covariant, I mean a type is covariant in some type parameter with no regard for its other parameters — they could be contravariant or even indexed. I chose this interface for CovariantN because it’s more general than the path extrapolated from the the normal definition of Bifunctor, which requires that the class parameter be covariant in all of its arguments. This precludes supporting datatypes like the following, where one argument is covariant but the others are instead, say, indexed.

data G :: * -> * -> * where G :: a -> G a Int

instance CovariantN G where
  covmapN f = underUncurry $ \(G a) -> G (f a)

Because the covmapN method applies a single function to the type’s first argument, it doesn’t quite match up to the conventional BiFunctor/Functor2. I’ll show how to recover that interface at the end of the post. For now, I want to show how covmapN is used.

-- inferred exampleA :: (Bool -> b) -> (Char, b)
exampleA f = covmapN f `asUC1` ('c', False)

-- inferred exampleC :: (Char -> b) -> (b, Bool)
exampleC f = covmapN f `asUC2` ('c', False)

-- inferred exampleD :: (Num n1, Num n2) => (n2 -> b) -> (n1, b)
exampleD f = covmapN f `asUC1` (0, 0)

-- inferred exampleG :: GADT [Char] Int
exampleG = covmapN show `asUC2` GADT ()

That’s pretty nifty, right? It ends up being a nice interface with only one wrinkle: the user has to explicitly invoke functions like asUC1, asUC2, etc. But it’s an essential wrinkle; these functions indicate to which parameter of the argument you’re applying the mapped function. I say wrinkle because I don’t think these are actually much of a burden. If the user has even the simplest understanding of kinds, it becomes easy to know which function to use — as I explain below. And everything else works today in GHC 7.6.1.

While the instances and uses of CovariantN/covmapN are tidy, the class is a little less so. But do realize that the instances and invocations of covmapN are what the user deals with! And those two things are both very formulaic; the semantics is simple. The user doesn’t necessarily have to deal with the class declaration.

Now, about that class declaration… brace yourselves.

The Actual Class Declaration

class CovariantN (t :: k) where
  covmapN :: VecR (NumArgs (t a)) ps => -- hack! explained below :)
             (a -> b) -> Uncurry t (a ': ps) -> Uncurry t (b ': ps)

We can get through this. Start by comparing it to the intuitive pseudo-declaration I showed earlier.

-- class CovariantN t where
--  covmapN :: (a -> z) -> (t a) b c d ... -> (t z) w y x ...

The key idea is that covmapN works on the first parameter of t, no matter how many other parameters it has (all of which must be of kind *). The actual class attains this flexibility by fully uncurrying t at the type-level. We apply the uncurried t to a type-level list of all its arguments. In the domain and range of covmapN f, all of t’s arguments are in the lists (a ': ps) and (b ': ps), respectively.

I’ll explain the VecR constraint in the covmapN signature below, when it becomes relevant. The piece to focus on now is the Uncurry data family.

Uncurrying at the Type-Level

The Uncurry data family fully uncurries a type: it converts a type expecting N parameters into one that expects just one parameter: a tuple of N types.

data family Uncurry (t :: k) :: [*] -> *

Note that the N-tuple of type is unfortunately modeled as a list. Without promoted GADTs and kind-indexed kind families, we cannot specify that the length of the list is the number of arguments in k — that’s what we’ll workaround later via VecR.

We can write two instances of Uncurry that together handle any number of * arguments. I unfortunately don’t see a tidy way to lift the restriction to *. I know how to support a fixed set of kinds as arguments, but I don’t know how to make that set extensible. However, this restriction to * still permits use of CovariantN with very many common types.

newtype instance Uncurry (t :: *)      ps = UCZ t
newtype instance Uncurry (t :: * -> k) ps =
  UCS (Uncurry (t (Head ps)) (Tail ps))

It is also crucial to the user interface to define the following functions.

asUCZ f = (\(UCZ x) -> x) . f . UCZ
asUCS f = (\(UCS x) -> x) . f . UCS

asUC1 f = asUCZ (asUCS f)
asUC2 f = asUC1 (asUCS f)
asUC3 f = asUC2 (asUCS f)
-- ... more as needed in practice

I consider these functions part of CovariantN’s user interface. They’d be just as crucial for any other class that uses Uncurry for type-level variadism. The way these functions are used is so intuitive that I think the benefits of CovariantN at the type-level outweigh having these functions in the interface — they more than break even, in my opinion.

The user chooses between asUC1, asUC2, … by using the formula N = <number-of-args> - <which-arg> + 1. Note that we’re just counting the arguments from the right-to-left, starting at one instead of zero. Consider again the example uses I showed above.

exampleA f = covmapN f `asUC1` ('c', False) -- f is applied to False
exampleC f = covmapN f `asUC2` ('c', False) -- f is applied to 'c'

Since the tuple has two arguments, the user thinks one of the following.

  • “I want to apply f to the 2nd argument of the tuple and 2 - 2 + 1 = 1, so I apply via asUC1.”

  • “I want to apply f to the last argument of the tuple, so I apply via asUC1.”

  • “I want to apply f to the 1st argument of the tuple and 2 - 1 + 1 = 2, so I apply via asUC2.”

  • “I want to apply f to the second to last argument of the tuple, so I apply via asUC2.”

To hit the 4th element in an 6 tuple, you’d use asUC3. We count backwards – right-to-left — because the N in asUCN is how many more arguments the type expects. For the 4th element, we’re relying on CovariantN ((,,,,,) a b c), and that type expects 3 more arguments; the 4th element is the third to last.

That’s all there is to using the covmapN method. But manys user will also need to declare new instances of CovariantN. For that, they need the inverse of the asUCN fuctions. The inverses are called underUC0, underUC1, and so on. I used a generalization of these in the instances at the top of the post: underUncurry. If the type t is determined by the context, we can can implicitly choose the N. For details, see Recurry and RecurryIso in the gist. That’s where underCurry comes from.

OK. That’s it for the user interface.

Before I start explaining VecR, I’d like to briefly share some connections.

Ralf Hinze’s Polytypic values possess polykinded types uses some type features that Haskell does not currently support in order to implement a generic function for covariant mapping. Like Functor2, it requires covariance in all parameters as opposed to the one-at-a-time of CovariantN, but it also supports arbitrarily kinded parameters. The major correspondence with this post is that the paper also uses foralls on the RHS of type family instances, which my use of Uncurry emulates.

José Pedro Magalhães’ The Right Kind of Generic Programming shows that allowing foralls on the RHS of type family instances would help to make the generic representation of a datatype more strongly kinded.

Uncurry could be a type family if type families could be injective.

Weirich, Hsu, and Eisenberg recently submitted Down with kinds, which, among many neat things, would enable promotion of GADTs. In light of that paper — the content of which is probably imminent for implemention in GHC — the rest of the technical content in this post is, in some sense, me picking up pennies in front of a steam-roller.

So what is it that VecR constraint on covmapN all about?

What VecR Does

If you remove the VecR constraint from the type of covmapN, then exampleC becomes ill-typed.

-- intended to be exampleC :: (Char -> b) -> (b, Bool)
exampleC f = covmapN f `asUC2` ('c', False)

-- without the VecR constraint on covmapN,
-- exampleC raises this error
--
--   Couldn't match type `Head * ps0' with `Bool'
--   The type variable `ps0' is ambiguous

The key bit is that the ps0 type is ambiguous. In the type of asUC2 (covmapN f), there are two type-level lists of parameters, one for the input and one for the output. In exampleC, the input one gets fixed when we provide the tuple argument, but the output one doesn’t. The error message indicates GHC inferred that the first element in the list should be Bool. However, GHC has no information about the spine of the list, so it can’t put that constraint to use (ie it can’t simplify Head ps0). This is the problem VecR solves: it determines the spine of the parameter list from the kind of t, which is often statically known. We could have instead provided annotations determining the spine of ps, but they get overwhelming quickly — inference is necessesary for usability here.

Ideally, we could use a promoted GADT and kind families to specify this relation between the kind of t and the length of ps in a type Uncurry t ps. If Vec is the sized-list GADT and NumArgs returns a promoted Nat kind, Uncurry could be declared as follows.

data family Uncurry (t :: k) :: Vec (NumArgs k) * -> *

I think this would solve the inference issue rather directly: the length of the list (vector) of arguments is now explicitly determined by the kind of t. I’m not sure how the other code might need to change, since there’s no implementation for me to bang on. But an implementation of these ideas in GHC seems likely to me. Peruse the UPenn group’s submitted draft, if you’re interested.

How VecR Solves the Problem

Instead of appealing to promotion of GADTs, we can put type equality constraints to work. In particular, type equalities “are enough” to produce a type-level list’s spine from a type-level natural.

Recall the type of covmapN.

covmapN :: VecR (NumArgs (t a)) ps => -- explained now!
           (a -> b) -> Uncurry t (a ': ps) -> Uncurry t (b ': ps)

First off, NumArgs is simple; it just counts the number of expected arguments. We get a Nat from a kind.

type family Bottom :: k -- never to be instantiated

type family NumArgs (t :: k) :: Nat
type instance NumArgs (t :: *) = Z
type instance NumArgs (t :: kd -> kr) = S (NumArgs (Bottom :: kr))

And here’s VecR; it’s just a constraint alias for a type equality.

class    (ps ~ NonStrictTakeN n ps) => VecR (n :: Nat) (ps :: [k])
instance (ps ~ NonStrictTakeN n ps) => VecR  n          ps

The important part is the NonStrictTakeN type family. It’s functionally equivalent to a promotion of take, but it crucially produces the result list’s spine without forcing that of the input list.

type family NonStrictTakeN (n :: Nat) (ts :: [k]) :: [k]
type instance NonStrictTakeN Z ts = '[]
type instance NonStrictTakeN (S n) ts =
  Head ts ': NonStrictTakeN n (Tail ts)

The superclass constraint of VecR basically ties the knot around NonStrictTaken.

That’s all there is to the VecR hack. It’s only interesting because

  • it enables type inference on uses of covmapN in GHC 7.6.1 today, and

  • and it suffices for one possible use of a promoted GADT.

The second bullet suggests that there may be an alternative route to the expressiveness that we expect from promoted GADTs. That could be interesting to explore. In particular, that route could possibly be more viable for other compilers of Haskell-like languages. Or it could just be a cute little device, totally subsumed by GADT promotion — time will tell.

Recap

I have two reasons for generalizing over Functor, Functor2, and so on.

  1. I’m experimenting with a variadic version of generic-deriving, so I need a single class of which to provide a generic definition that supports covariance in any argument.

  2. This is the only approach to Haskell type classes of types with varying number of arguments (and also with methods) that I’m aware of. It’s an interesting space to explore, especially when considering various variances and indexing.

The Uncurry family is the key to manipulating the values of variadic types. Combined with the VecR hack, Uncurry is a way to squeeze multiple universally quantified type variables (each parameter) out of one universally quantified type variable (the list ps) — that’s how I came up with the approach. There’s plenty of design decisions around this ingredient to be revisited, but I think more applications might need to be identified. A similar generalization of the EqT class mentioned in my previous Redshift post might be one.

A final word about modularity. Recall that CovariantN and covmapN can be reused at any kind, which is a clear improvement over the sequence Functor/fmap, Functor2/fmap2, … This is inherited from the polykindedness of Uncurry. However, the example uses of covmapN I showed required asUC0, asUC1, and so on, so covmapN is in a way only half reusable. I think it is interesting to consider that the asUC* functions in a way factor out the difference between fmap, fmap2, … They can play the same role for any other variadic type class like CovariantN — though I don’t have any obvious candidates in mind (other than InvariantN and ContravariantN).

Appendix: Recovering Functor2

Since I claimed in the intro that I generalized over that sequence of type classes, I’ll say a bit about how to get that interface from CovariantN. We can recover the Functor2 interface as follows: Functor2 t iff (CovariantN t, forall a. CovariantN (t a)). One safe way to encode that forall in valid GHC Haskell would be to use a generalization of the machinery in Edward Kmett’s constraints package. I’ve been talking to him about it, and will blog about it very soon.

The single-parameter-at-a-time approach I’ve taken here is relatively straight-forward, and I think it’ll be pretty workable. But it is just one point in a rather rich design space. In particular, an alternative to the asUC1, asUC2, etc might be to add a Nat as a class parameter to specify which argument we’re mapping over. I haven’t explored that option yet: the user interface would be a bit more approachable, but I’m not sure of the cost to the internal defintions (like underCurry eg).

About these ads

2 responses to “Redshift Phase 2: types classes for types with arbitrary numbers of * parameters

  1. Really nice, the user interface is actually pretty!

    (Two typos though, covmapN should be :: (a -> z) -> (t a) b c d … -> (t z) w y x … and exampleC should be :: (Char -> b) -> (b, Bool))

  2. Thanks much @Sjoerd!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s