Project Redshift: expanding the generic-deriving/GHC.Generics universe, Phase 1

Summary: the :.: type from generic-deriving/GHC.Generics precludes the use of indexed datatypes inside of the represented datatypes; I propose a way to lift that restriction, but don’t have a compelling example that needs the added exressivity.

In generic-deriving — and hence in GHC.Generics — the :.: type (read “compose”) is used to represent internal applications. I’ll explain how in a bit.

newtype (f :.: g) p = Comp1 (f (g p))

In this post, I’m proposing an alternative to :.:. This alternative is designed to allow indexed datatypes as the f argument of :.:; currently such use of indexed datatypes results in either ill-typed or unusable instances of Generic1. I explain why they’re currently not supported, stop to wonder the benefit of adding this support for indexed datatypes, and then demonstrate how to do it.

The Problem with :.:

In generic-deriving, f and g parameters of :.: are both supposed to be representations. In GHC.Generics, f is a normal type (ie an unrepresented type like [], Maybe, or IO) and g is a representation. That discrepancy is my fault — I talked Pedro into it. So if it’s ultimately the wrong move… mea culpa. (If you’re interested in why it’s beneficial, please ask in the comments and I’ll blog about it.)

To set the stage, let’s see :.: in action. Consider representing the MyList type.

data MyList f a = MyNil | MyCons [f a] (MyList f a)

The :.: representation type handles applications of types (excepting recursive occurrences — that’s what Rec1 is for). In the MyCons constructor, two types are applied: [] and f. These applications are what I call internal applications; they occur inside the datatype’s fields. Since there’s two internal applications in MyList, there are two resulting uses of :.: in the representation.

instance Functor f => Generic1 (MyList f) where
  type Rep1 (MyList f) =
    U1     :+:     [] :.: (f :.: Par1)   :*:   Rec1 (MyList f)

In order to define the methods of Generic1 for representations involving :.:, we use fmap — hence the Functor f constraint. In general, any internally applied type must be a Functor and results in a corresponding constraint in the Generic1 instance.

  from1 MyNil = L1 U1
  from1 (MyCons x y) =
    R1 $ Comp1 (fmap (Comp1 . fmap Par1) x) :*: Rec1 y
  to1 (L1 U1) = MyNil
  to1 (R1 (Comp1 x :*: Rec1 y)) =
    MyCons (fmap (fmap unPar1 . unComp1) x) y

I’ll let you convince yourself that that instance declaration type-checks in GHC 7.6.1. It’s not the full GHC.Generics representation of MyList, but it’s sufficiently characteristic for this post.

That Functor f constraint is the focus of this post. In this example, it prevents us from using generic functions on MyList f when f is an indexed datatype, because indexed datatypes generally cannot be Functors. My proposed alternative to :.: eliminates the requirement that internally applied types be Functors.

… Why?

I actually don’t have a devastatingly compelling reason to accommodate internal applications of indexed datatypes. I figured out the stuff in this post before realizing that the generic functions I had in mind all themselves (ie independent of the jump through Generic1) ultimately require that internally applied types be Functors. There is, though, still the slightest hope for usefulness.

I have come up with one class whose generic definition would work with internally applied indexed datatypes: the EqT class. And indeed, instances of EqT are tedious to write. However, two things render this class a mediocre motivator.

  • It’s not terribly common.

  • I don’t know how to write a generic definition of EqT in generic-deriving/GHC.Generics. (I am, however, making promising progress on one with my polyvariadic variant of generic-deriving.)

So…

  • What are the generic functions used on indexed datatypes?

  • Do any of them not require the internally applied types to be covariant?

  • Is EqT the only one? Is it compelling to you?

If you’re still interested in how to remove that Functor constraint, read on.

My proposal: :@: in place of :.:

Full code in this gist.

We can eliminate that constraint by using :@: in place of :.:.

-- newtype (f :.: g  ) p = Comp1 (f (       g   p))

newtype    (t :@: rep) p = App1  (t (UnRep1 rep p))

The UnRep1 type family is the linchpin. It is only ever applied to representations of constructor arguments: Par1, K1, Rec1, and :@:.

type family UnRep1 (rep :: * -> *) (p :: *) :: *

type instance UnRep1 Par1     p = p
type instance UnRep1 (K1 i c) p = c
type instance UnRep1 (Rec1 f) p = f p
type instance UnRep1 (t :@: rep) p = t (UnRep1 rep p)

This family — as you may have guessed — in a sense does the opposite of Rep1: it converts a representation of a constructor argument back into the actual constructor argument. Note well that the names I’ve given are a bit misleading, since UnRep1 is not fully inverse to Rep1. The Rep1 family describes the structure of an entire datatype declaration, whereas the UnRep1 family forgets the structure of a type term — so it’s more so an inversion of the spine view. What UnRep1 is actually undoing is the reification of a constructor argument’s structure performed by the generic-deriving Template Haskell code or by the -XDeriveGeneric extension of GHC.

Since I’m showing that :@: totally supplants :.:, I need to show that generic definitions for :.: can also be defined for :@:. I’ll just do Functor — classes like Eq and Show can be derived via -XStandaloneDeriving. Enum also works.

We’ll need the UnGeneric1 class in order to do this modularly. It supports UnRep1 in the same way that the Generic1 class supports Rep1, and it’s not otherwise remarkable.

class UnGeneric1 (rep :: * -> *) where
  fromUn1 ::        rep p -> UnRep1 rep p
  toUn1   :: UnRep1 rep p ->        rep p
instance UnGeneric1 Par1     where
  fromUn1 = unPar1 ; toUn1 = Par1
instance UnGeneric1 (K1 i c) where
  fromUn1 = unK1   ; toUn1 = K1
instance UnGeneric1 (Rec1 f) where
  fromUn1 = unRec1 ; toUn1 = Rec1
instance UnGeneric1 (t :@: rep) where
  fromUn1 = unApp1 ; toUn1 = App1

Using fromUn1, toUn1, and a small type ascription, the Functor instance for :@: is just a tweak of the conventional instance for :.:.

-- instance (Functor f, Functor g) => Functor (f :.: g) where
--   fmap f = Comp1 . fmap (     (fmap f)) . unComp1

instance (Functor t, Functor rep, UnGeneric1 rep) =>
  Functor (t :@: rep) where
     fmap f = App1  . fmap (wrap (fmap f)) . unApp1 where
       wrap g = fromLeaf1 .
         g . (id :: forall x. rep x -> rep x) . toLeaf1

It’s also a drop-in replacement for :.: in Rep1 instances, and it simplifies the method definitions, to boot.

instance Generic1 (MyList f) where
  type Rep1 (MyList f) =
    U1      :+:      [] :@: (f :@: Par1)   :*:   Rec1 (MyList f)
  from1 MyNil = L1 U1
  from1 (MyCons x y) = R1 $ App1 x :*: Rec1 y
  to1 (L1 U1) = MyNil
  to1 (R1 (App1 x :*: Rec1 y)) = MyCons x y

Note well the absence of the Functor f constraint — that’s basically the whole point.

Recap

So, using :@: in place of :.: does two things.

  • It weakens the context of Generic1 instances. As a particular result, datatypes that contain internal applications of indexed datatypes can be instances of Generic1.

  • It slightly complicates the instances of generic classes for type application. In particular, it requires a bit of additional coercion; compare and contrast the Functor instance for :.: to the one for :@: to get a feel for it.

I am not currently aware of an existing generic definition that actually requires this weakening of the Generic1 instances’ contexts. I’m working on one for EqT, but it’s going to be rather complicated and isn’t the most compelling example. It’d be really nice to have an existing example that benefits from :@: — you don’t by chance know of one, do you?

I do have other possibile generic definitions in mind that would benefit from :@:, but they’re rather experimental. They are thankfully enough, however, for me to believe that :@: might actually end up being useful!

About these ads

3 responses to “Project Redshift: expanding the generic-deriving/GHC.Generics universe, Phase 1

  1. I’m not aware of a distinct treatment of :.: between GHC.Generics and generic-deriving; if this is the case, then generic-deriving is wrong and should be fixed. Anyway, from what I can see, the GFunctor instance for :.: in generic-deriving is `instance (GFunctor f, GFunctor’ g) => GFunctor’ (f :.: g)`, which seems to be in line with considering `f` a regular type constructor, and `g` a representation.

    Also, in your representation of MyList, you use a `Functor f` constraint, but you could as well use a `GFunctor f` constraint (which basically translates into a `Generic1 f` constraint). This has the advantage of, at least, not requiring “external” constraints, and, if Generic1 is to ever support indexed datatypes, then the composition case would also work for indexed datatypes.

  2. I spoke loosely when discussing the “discrepancy”. Concretely, I see two differences.

    • When GHC derives Generic1, it populates the instance context with Functor constraints on internally applied types. The generic-deriving methodology/paper used Generic1 constraints.
    • Your suggestion is actually case-in-point for the other difference. In instances of type classes like Functor for the :.: type, the context could require Functor or Generic1 for the left argument of :.:. This choice should probably mirror the choice made at the Generic1 instances.

    So, the only hard discrepancy would be in the Template Haskell of the generic-deriving package versus the GHC deriver for Generic1. I haven’t actually checked the generic-deriving Template Haskell code; I’m assuming its behavior matches the methodology in the paper.

    • I may have replied a bit too quickly.

      I should first mention that since the Generic1 methods are (ideally) an isomorphism, we should use Invariant instead of Functor, since it is a strictly weaker constraint. (As far as I understand, being invariant wrt to some argument is equivalent to being parametrically polymorphic wrt that argument — that’s the meaning I intend in the rest of this discussion.)

      And one more thing before the meat of my response. I operate under the assumption that (λt. (GFunctor (Rep1 t), Generic1 t)) is — and always will be — no weaker of a constraint than (λt. Functor t). In particular, if indexed datatypes become representable, I’m not seeing how their representations could possibly instantiate GFunctor. So I don’t follow your suggestion; did you have that issue in mind?

      OK; here’s my better thought out response. I’m presuming that indexed datatypes — either directly indexed or indexed by internal application of some other indexed datatype — will eventually be representable. I think a key observation is that once indexed datatypes are representable, then the invariant types and the representable types will be incomparable. So if we use either one for the constraints on internally applied types inside the contexts of generated Generic1 instances, then those generated instances may feasibly exclude the use of certain generic functions (such as EqT eg) solely because we chose to use one constraint instead of the other. Whichever we choose, we’re disallowing the use of generic programming on one of the two crescents on the sides of the incomparability Venn diagram.

      I lobbied for Generic1 to use Functor (and someday Invariant) instead of Generic1 because I believe Functor is currently a strictly weaker constraint than is Generic1. IE In Haskell today, it’s the case that all representable types (with generic-deriving/GHC.Generics) are invariant. But when indexed datatypes become representable, that will no longer be the case. So, :@: is designed to make the choice between the two moot.

      One last qualification: while there currently are types that are invariant but not representable, they 1) are not terribly common and 2) could feasibly become representable in the future, as we expand the universe. The day all invariant types are representable is the day that Generic1 is a strictly weaker constraint than Invariant.

      My current motivation is to prevent the case where a design choice we made unintentionally prohibits a user from applying generic-deriving/GHC.Generics to their data type today.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s