syslog
31Aug/140

ICFP 2014: Day 0 (WGP)

Posted by Leonhard Markert

The 19th International Conference on Functional Programming will begin tomorrow. Two affiliated events, the 10th Workshop on Generic Programming (WGP) and the Workshop on Higher-Order Programming with Effects (HOPE), take place today.

WGP Session 1

Invited talk: Functional Programming, Object-Oriented Programming and Algebras

Bruno Oliveira (University of Hong Kong)

How to achieve modularity, type-safety and reuse without requiring a PhD in type theory? Bruno Oliveira proposes algebras as an alternative to algebraic datatypes and OO hierarchies, more specifically: variants of F-Algebras. Fold Algebras (using products of functions) and F-Algebras (using sums of products) are isomorphic.

An example of encoding generalized algebraic datatypes (GADTs) using type classes in a way similar to Church encodings due to Hinze from 2003, when there were no GADTs in Haskell.

Similarly, in OO, Church encodings were used to model the visitor pattern.

The Expression Problem: a problem of modularity. Given an existing type of expressions, how do you extend it (e.g. how do you extend data Exp = Lit Int | Add Exp Exp with multiplication)?

Solution 1: encode algebras as type classes, extend using subclasses.

Solution 2: use F-Algebras (Datatypes à la Carte)

In OO languages, can also solve the Expression Problem with algebras: use interface inheritance.

All current solutions using algebras still require heavy encodings. In the future: programming language support for algebras?

WGP Session 2

Generic Constructors and Eliminators from Descriptions

Larry Diehl (presenting), Tim Sheard (Portland State University)

Abstract: Dependently typed languages with an “open” type theory introduce new datatypes using an axiomatic approach. Each new datatype introduces axioms for constructing values of the datatype, and an elimination axiom (which we call the standard eliminator) for consuming such values. In a “closed” type theory a single introduction rule primitive and a single elimination rule primitive can be used for all datatypes, without adding axioms to the theory.

Abstract continued: We review a closed type theory, specified as an A GDA program, that uses descriptions for datatype construction. Descriptions make datatype definitions first class values, but writing programs using such datatypes requires low-level understanding of how the datatypes are encoded in terms of descriptions. In this work we derive constructors and standard eliminators, by defining generic functions parameterized by a description. Our generic type theory constructions are defined as generic wrappers around the closed type theory primitives, which are themselves generic functions in the A GDA model. Thus, we allow users to write programs in the model without understanding the details of the description-based encoding of datatypes, by using open type theory constructions as an internal domain-specific language (IDSL).

Ornaments in Practice

Thomas Williams (presenting), Pierre-Évariste Dagand, Didier Rémy (INRIA)

We often write very similar functions for structurally similar data types, e.g. "add" for natural numbers and "append" for lists. Natural numbers and lists have the same recursive structure and a coherent property, a "length" attribute:

add (length ml) (length nl) = length (append ml nl)

A simple projection function exists from lists to natural numbers: the "length" function.

The motivation for this paper was to add ornaments to ML. Naturals/lists example: an ornament is defined by a projection function from the ornamented type to the bare type, e.g. "length". Declare an ornament as

ornament from length : a list -> nat

The system checks whether recursive structure is preserved.

Syntactic lifting is how "append" can be derived from "add" almost automatically. Given

let rec add m n = match m with
  | Z -> n
  | S m' -> S (add m' n)

and the lifting expression

let lifting append from add with
  {length} -> {length} -> {length}

the system can derive

let rec append ml nl = match ml with
  | Nil -> nl
  | Cons(x, ml') -> Cons(?, append ml' nl)

But the ? is ambiguous (it's the "creative part") and needs to be completed manually, by code inference (heuristically) or a "patch".

A patch would look like

let append from add
  with {length} -> {length} -> {length}
  patch fun _ -> match _ with Cons(x, _) -> Cons({x}, _)

Applications of this: for refactoring (see question further down), or to derive an evaluation function (uniquely) from "conv", a bijective function between expression types.

More complex data structures can be lifted too (e.g. Sets and Maps); so can higher-order functions. Lifting for GADTs is also possible and is automatic for some invariants given the expected type of the function.

Conclusions: describing ornaments by projection is a good fit for ML, syntactic lifting gives good predictable results. Future work: better patches, integration into ML, combining ornaments.

Q: Projections seem very similar to abstraction functions in abstract interpretation.
A: Could be seen as abstract interpretation.

Q: Is this weaker than what was presented in the 2012 paper? There you could compute ornaments out of indices.
A: Indices flow in the other direction in the presented approach.

Q: Refactoring example. Normally want to throw old code away and keep new code. Here we get some conversion between old and new.
A: Yes, could remove old code.

Q: How about code maintenance?
A: Patches are part of the answer.

Q: Can the patch language also talk about the logical parts?
A: That would not be easy.

Type Inference for the Spine View of Data

Matthew Roberts (presenting), Tony Sloane (Macquarie University)

Context: rho-Calculus (theory of term rewriting), Pattern Calculus and bondi, Stratego, "Scrap your Boilerplate" (SYB paper), kiama, Reloaded/Revolutions.

Motivation: precisely characterize the type inference machinery needed to support the spine view of data.

Result: need to add "ispair" expression (see further down) to first-class polymorphism (FCP), which is a small extension of Hindley-Milner.

Fully applied constructor view means data equals constructor with all arguments (fully applied). When arguments are missing this is a function, so we can't pattern match against it.

Spine view only knows nodes of arity two or zero. This means that all pattern matching can be done using

ispair d bind (x, y) in e1 else e2

which is morally equivalent to

case c of o x y -> e1
          _     -> e2

Here x refers to the constructor and all its arguments except the last one, and y is the last argument.

Examples: generic map and fold.

Implementations: the "decidedly generic compiler" dgen and "kiama".

Q: Why is complete type inference so important?
A: The semantics of doing spine view with only one construct took many attempts; this one allows for type inference so it was chosen over the others.

Q: Have you tried to use FCP on other spine view representations?
A: With FCP, you trade off type annotations with data annotations.

WGP Session 3

First-class Isomorphic Specialization by Staged Evaluation

Alexander Slesarenko (presenting), Alexander Filippov, Alexey Romanov (Huawei Technologies)

There are three types of program specialization: partial evaluation, supercompilation and isomorphic specialization. Idea: we have A, B, P such that P : A -> B; also A', B' which are isomorphic to A and B, respectively, and P' : A' -> B' which can be computed more efficiently than P. Potentially there are multiple options for A', B' and P'. How can we use P' on A and B? (A and B: 'domain' data; A', B': 'core' data)

The example used is matrix-vector multiplication. Matrices, for example, can be represented as dense (vector of vectors), This is implemented as an EDSL (embedded domain specific language) in Scala. Here we can abstract over dense and sparse vectors and matrices. The specialization can take place at runtime, and we can allow for alternative specialized versions. Specialization can lead to faster code.

Generic composition of isomorphisms -- idea: build isomorphisms for each constructor in the core language. These isomorphisms are first class citizens of the framework.

How it works: staged method invocation with graphs.

Take-home agenda:

  1. Develop algorithm in EDSL
  2. Identify isomorphic representations in the domain language
  3. Implement ADTs using a core language and isomorphic representations
  4. Generate representation-specific implementations in the core language

Implementation can be found at github.com/scalan.

Q: How did you proceed? Paper includes theory and implementation notes
A: Original idea was to use graphs, that was the starting point

Q: You talk about large performance increases compared to original Scala implementation. Have you compared with C/C++?
A: No.

Algebraic Effects and Effect Handlers for Idioms and Arrows

Sam Lindley (University of Edinburgh)

Idea due to Plotkin: algebraic effects describe abstract computations. Abstract computations are trees (modulo equations).

Monad trees are unrestricted abstract computation trees (dynamic control and data flow). Arrow trees are monad trees where only static control flow is allowed (static control flow, dynamic data flow). Idiom trees (idiom = applicative functor) are arrow trees where only static data flow is allowed (static control and data flow).

Examples are given for each monad, arrow, and idiom; the control flow and data flow trees show the dynamic / static nature.

Static computations are useful as they allow optimisations and low-level implementations (arrows correspond approximately to circuits, for example).

Key contribution: using Flow Effects, we can write monad/arrow/idiom code in a single style -- this is very different from Haskell, where different notations exist for each.

A handler is an interpreter for an abstract computation. It is defined as a fold over a computation tree, specifying how return values and operations are interpreted. Given a monad handler, we can derive an arrow and idiom handler.

Q: Can it be inferred from a Flow Effects expression whether the computation it defines is a monad, an arrow or an idiom?
A: Yes.

Q: What's the status of the Flow Effects syntax?
A: Not finalised yet -- the syntax shown in the paper is a "mathematical" syntax with braces, indices etc.

A short discussion followed on whether a unifying notation is a good idea -- maybe it is a good thing that we can tell from the syntax whether a computation is a monad, an arrow or an idiom?

Scoping Rules on a Platter -- A Framework for Understanding and Specifying Name Binding

Larisse Voufo, Marcin Zalewski (presenting), Andrew Lumsdaine (Indiana University)

Name binding: for a given reference (use), where is the corresponding declaration (what it refers to)?

In order to understand and unambiguously define name binding rules, a number of name binding combinators has been developed.

Scope combinators are: hiding, merging, opening, weak hiding.

Together they provide a language that describes how scopes are combined to look up a reference. These are then used to describe how a specific language does name binding. The example is C++: the standard devotes over 60 pages in total to name binding rules (and it turns out both Clang and GCC get operator resolution wrong).

Discussant: Ilya Sergey

Q: Could we use attribute grammars for name binding?
A: Finding a grammar for C++ has been attempted many times... It's hard to take a principled approach and applying to C++.

Q: Did you find inconsistencies in the C++ standard when you tried to collect the scoping rules?
A: No, to our surprise -- it's complex but consistent.

Q: Did you come up with any ideas to simplify the rules along the way?
A: There don't seem to be any ways to simplify the rules (except using a different language).

WGP Session 4

Composing and Decomposing Data Types -- A Closed Type Families Implementation of Data Types à la Carte

Patrick Bahr (University of Copenhagen)

This paper was inspired by the recent addition of closed type families to the Glasgow Haskell Compiler (GHC). Closed type families were used to implement Data Types à la Carte, specifically the subtyping constraint :<:

Data Types à la Carte -- idea: decompose data types into two-level types. Express recursive data types as the fixpoint of a functor. These functors can be combined by coproduct construction :+:.

Example:

-- recursive data type
data Exp = Val Int | Add Exp Exp

-- as fixpoint of a functor
data Arith a = Val Int | Add a a
type Exp = Fix Arith

-- combine by coproduct construction
data Mul a = Mul a a
type Exp' = Fix (Arith :+: Mul)

The subtyping constraint is easily expressed as a type class with an injection function inj and a projection function prj. But this approach treats :+: asymmetrically, doesn't inspect the left-hand side and is ambiguous.

Contribution of this paper: a re-implementation of :<: which behaves as expected (f :<: g <==> there exists a unique injection from f to g), avoids ambiguous subtyping (by rejecting multiple occurrences of signatures) and allows expressing isomorphism.

Implementation: Have a type-level function Embed taking signatures f, g as arguments. Produces proof object p for f :<<: g -- a relaxed version of :<: allowing ambiguity -- and checks whether p proves f :<: g. We can then derive implementations of inj and prj using the proof object as an oracle in instance declarations.

The code for Embed, Sub and helper type classes is shown. It's all contained in the "compdata" package.

Caveats: unhelpful error messages; compile time performance is unpredictable.

Discussant: Wouter Swierstra

With this approach we gain lots of expressive power, e.g. control over how the search proceeds, at the price of more type level programming.

C: A solution for the error messages apparently exists. Also, Idris does this really nicely (it's also much more complicated).

Q: Motivation: want to use sums of products. Interesting tradeoff by maybe focusing on coproducts of products?
A: ?

Q: Would instance chains have solved your problem?
A: Yes.

Q: All this type hackery for deriving injections and projections -- how about just providing them every time?
A: Of course you can do that, but it's inconvenient and obvious from the type. If you need more flexibility, you can still do it manually.

Q: No canonical ordering for injection (eg sum of int and bool) -- is this a problem?
A: Yes, at the moment you have to annotate.

True Sums of Products

Edsko de Vries, Andres Löh (presenting) (Well-Typed LLP)

A new approach to datatype-generic programming in Haskell; notable features: faithful representation of datatypes as n-ary sums of n-ary products; a library of high-level combinators to encourage concise and reusable functions; separation of metadata in the representation.

Available on Hackage as "generics-sop". "basic-sop", "lens-sop" and "json-sop" provide examples, generic lenses and generic JSON (de)serialization respectively.

Motivation was to improve on existing models: GHC.Generics representation is too constrained (binary sums and products) but at the same time too flexible (arbitrary nesting). Also has noisy metadata everywhere. Functions end up making implicit assumptions about the structure of the data which is not guaranteed by the type system.

Discussant: Patrik Janssen

Q: What kind of code does this generate?
A: Not looked at yet. But lots of implicit arguments are potentially being passed around at runtime.

Q: Can this be done better in Agda?
A: Library was completely developed in Haskell. Can probably be done more nicely in Agda, but in this case the development was driven by an actual need to deal with certain generics problems in Haskell.

Q: n-ary lists are left-biased: don't you run into similar performance problems as with binary representation?
A: List-based representation gives uniform encoding for all constructors. But probably not ideal for data types with thousands of constructors.