
ICFP 2014: Day 2

This is day 2 of the 19th International Conference on Functional Programming. Leo and I (Heidi) are here bright and early to bring you the highlights from Gothenburg, Sweden. We are not the only livebloggers here at ICFP; check out Edward Yang's excellent notes.

Keynote (Chair: Jesse Tov)

Behavioral Software Contracts
Speaker: Robert Bruce Findler (Northwestern University)

Intro: Robby's work on behavioural software contracts is beautiful :)

Instead of an abstract, the keynote has a bibliography of interesting work in the area.

Contracts are infectious

Example of Racket code (we're working in a call-by-value language here): a point-in function. Demonstration of a contract violation, including blame, from applying booleans instead of integers. Robby adds a contract to guard access to point-in, and the blame now points to the top level.

A graph demonstrates how the number of contracts in the Racket git repository has steadily increased. Robby identifies the outliers, e.g. points where big libraries with lots of contracts were introduced. Most commits add one contract, if they add any.

Everyone in the audience is thinking: why use contracts when you can use types? Robby points this out himself.

Contracts versus Types: the previous example is used with types instead of contracts, and negative integers demonstrate how types are insufficient. He introduces a dependent contract, hinting at a comparison to dependent types. Unlike with dependent types, we have the arguments at checking time, hence this is actually feasible in real languages.
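
To fix ideas, here is a minimal Haskell sketch of a flat contract and a dependent function contract with blame (the talk uses Racket's real contract system; everything below is invented for illustration):

    -- Flat contract: check a predicate, blame a named party on failure.
    type Party = String

    flat :: Show a => String -> (a -> Bool) -> Party -> a -> a
    flat name ok blamed x
      | ok x      = x
      | otherwise = error ("contract " ++ name ++ " violated, blaming "
                           ++ blamed ++ " on value " ++ show x)

    -- Dependent function contract: the result check may inspect the
    -- argument, which is available at call time (unlike dependent types).
    depFun :: (Show a, Show b)
           => (a -> Bool) -> (a -> b -> Bool)
           -> Party -> Party -> (a -> b) -> (a -> b)
    depFun pre post client server f x =
      let x' = flat "pre" pre client x        -- bad input: blame the caller
          y  = f x'
      in  flat "post" (post x') server y      -- bad output: blame the function

E.g. wrapping a (hypothetical) integer square root as depFun (>= 0) (\n r -> r * r <= n) "caller" "isqrt" isqrt blames the caller for negative inputs and isqrt for wrong outputs.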

Switch to an example of a function called dc, which takes a drawing function and a bounding box and draws as a side effect. Robby introduces a broken drawing function that pollutes state. We could check that the state after the call equals the state before, but by then it's too late. His contract example generates a random state and tests for state pollution; this doesn't give us complete correctness, but it's good enough in this context.

Don't think about contracts in terms of projections, but in terms of boundaries (work by Christos Dimoulas)

Addition of contracts to the simply typed lambda calculus, in particular how monitors attach contracts to boundaries. Example: given the contract (x ≤ 3), values of x > 3 cause the function to abort. Given a function contract checking that f's arguments and return values satisfy x ≤ 3, the function is passed through with extra monitors on its arguments and return values.

What happens if we have a bug in the monitor?

Applications: how can contracts work if you don't have access to the code on the other side of the boundary? We can use the contract exercise function to generate random functions to test monitors. We have a notion of whether one contract is stronger than another.

Gradual typing: how we interoperate between typed and untyped languages; see how we do it with Typed Racket.
Static analysis: Havic, exhaustive boundary checking

This presentation is actually written in Racket; all output is real output.

Takeaways: Contracts are not just crappy types, contracts infect code bases, boundaries are vital.

ICFP knows how to prove specifications. Contract infectiousness is our opportunity to share that knowledge with the world.

Q: Randomisation would break formal tools; why not use NaN?
A: Just setting to a default may not find errors

Q: Contracts are checked at runtime; what about the overhead? Is there a switch to turn it off?
A: There's no kill switch, but there are option contracts

Q: Effects are alarming!
A: Well, it doesn't alarm me; it's vital for practical code and we need to look into this further

Q: Why can't we introduce randomness to type checking...?
A: That's the point

Q: Thank you for the cross community impact !

Q: How is the blaming system implemented? Can we add it to Python or C?
A: Fundamentally we could do this; we would love to see more of it.

Q: The opposite of gradual typing: often checks aren't just type-based. Can we use types to check the placement of contracts?
A: Contracts are specifications; they can be wrong, like normal programs.

Q: Dependent contracts seem to explicitly pass all arguments; it looks a bit strange?
A: Explicitly labelling dependencies makes it cheaper

Session 6: Contracts & Scheme (Chair: Michael Sperber)

Soft Contract Verification
Phuc C. Nguyen (University of Maryland); Sam Tobin-Hochstadt (Indiana University); David Van Horn (University of Maryland)

This is the snake game with contracts; most of the computation time is spent executing contracts rather than game code. There is serious overhead.

There is a cycle of contracts and blame avoidance; how, then, can we introduce static contract verification?

We will consider soft contract verification:
- it can analyse first-class contracts
- it uses an SMT solver
- it is competitive
- it works even when contracts are not apparent

CPCF

Introduces the PCF language and contract notation, giving CPCF.

How can we verify CPCF? We abstract CPCF functions to types; symbolic values are sets of contracts.
Soundness: all concrete executions are approximated by the abstract CPCF semantics.

Values are refined by contracts as they flow through; these refinements can influence the computation.
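
The flavour of the abstraction, as a toy Haskell sketch (invented types, not the paper's formalism): an opaque value carries the set of contracts it is known to satisfy, and some checks can be decided from those refinements alone.

    -- A symbolic value is "anything satisfying these contracts".
    data Contract = Pos | NonZero deriving (Eq, Show)

    data SymVal = Concrete Int       -- a known value
                | Opaque [Contract]  -- unknown, refined as it flows through
                deriving Show

    -- A primitive check that can sometimes be decided symbolically:
    safeDiv :: SymVal -> Bool
    safeDiv (Concrete n) = n /= 0
    safeDiv (Opaque cs)  = NonZero `elem` cs || Pos `elem` cs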

Theorem: verified modules can't be blamed. Proved.
We compare well against soft typing, occurrence typing, higher-order recursion schemes and dependent refinements, and against the video game from the beginning.

Most contract failures are first order, so we only need a first-order solver for verification.

Q: If we replace pos? with (> 4), were you shortcutting the SMT solver?
A: Yes, the name was just a shortcut

Q: How do you handle recursion?
A: This is described in the paper

On Teaching How to Design Programs: Observations from a Newcomer
Norman Ramsey (Tufts University)

This is a course in problem solving, not functional programming.

Process:
1) Describe the data and inputs, making data examples
2) Describe the functions, making function examples
3) Create code templates from the types, then fill in the templates
4) Test, review and refactor

Example of 2D-tree data definition and writing a function to find the nearest point.
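
The recipe, transliterated into a small Haskell sketch (HtDP itself uses Racket's teaching languages, and the talk's example is a 2D-tree; a plain list keeps the illustration short):

    -- Step 1: describe the data. A 2D point; nearest works over a
    -- non-empty list of points.
    data Point = Point { px :: Double, py :: Double } deriving Show

    dist :: Point -> Point -> Double
    dist a b = sqrt ((px a - px b)^2 + (py a - py b)^2)

    -- Step 2: signature and purpose statement.
    -- nearest q ps: the point of the non-empty list ps closest to q.
    nearest :: Point -> [Point] -> Point
    -- Step 3: the template follows the shape of the data (one clause
    -- per case); Step 4: fill it in, then test, review, refactor.
    nearest _ [p]    = p
    nearest q (p:ps) = let c = nearest q ps
                       in if dist q p <= dist q c then p else c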

What's hard for students?

- Type-directed development is difficult even though there are mechanical rules to apply
- "Purpose statements" are hard; students just ignore them and read the code instead, which breaks down when they get to recursion

The Tech:
- You don't need 5 language levels
- Don't be fooled by the DrRacket IDE: it's designed for full Racket, and every time you compile, untested code is thrown in your face
- Go deep into "world programs": interactive apps built by composing pure functions

Open Problem: how can we evaluate students' code?
- Is this code the product of systematic design?
- Experienced instructors use point deductions

HtDP: low cost, high reward

Q: Design recipes suggest how to evaluate student assignments; is this the solution to your problem?
A: We'll sit down and talk about it


Session 7: Tools Used in Anger (Chair: Jacques Garrigue)

SML# in Industry: A Practical ERP System Development (Experience Report)
Atsushi Ohori (Tohoku University); Katsuhiro Ueno (Tohoku University); Kazunori Hoshi (NEC Software Tohoku, Ltd.); Shinji Nozaki (NEC Software Tohoku, Ltd.); Takashi Sato (NEC Software Tohoku, Ltd.); Tasuku Makabe (NEC Software Tohoku, Ltd.); Yuki Ito (NEC Software Tohoku, Ltd.)

An experience report on using SML# in production systems.

With SML# we can produce robust systems, rigorous specifications, and high-quality, flexible code. We conducted a pilot study of SML# for industry software: a 15-month project with two man-months of effort.

Features of SML# for practical software development:
the usual ML features plus record polymorphism, a direct C interface, seamless SQL integration, and a fully concurrent non-moving GC => multi-core is simple

The system represents a typical business application.

Lessons learned: we had various issues with ML as a production language, and it's essential to select best-of-breed components.

To deploy ML we need to train people and establish software quality control systems.

Record polymorphism is very useful for data-intensive applications.
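
There is no direct Haskell analogue, but GHC's HasField class gives a rough approximation of the idea of writing one function over any record with the required field (a loose sketch, not SML#'s actual feature):

    {-# LANGUAGE DataKinds, DuplicateRecordFields, FlexibleContexts, TypeApplications #-}
    import GHC.Records (HasField, getField)

    data Employee = Employee { name :: String, dept   :: String }
    data Customer = Customer { name :: String, region :: String }

    -- Works for any record type r with a "name" field of type String,
    -- much like an SML#-style record-polymorphic function would.
    greet :: HasField "name" r String => r -> String
    greet r = "Hello, " ++ getField @"name" r

Both greet (Employee "Ada" "R&D") and greet (Customer "Bo" "EU") type-check; in SML#, the analogous polymorphism also underpins its typed SQL integration.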

Q: What's the evidence that it's reliable/productive? So, no quantifiable evidence?
A: We need to work on this

Q: Is accounting interesting?
A: Yes, I enjoyed it

Lem: Reusable Engineering of Real-World Semantics
Dominic P. Mulligan (University of Cambridge); Scott Owens (University of Kent); Kathryn E. Gray (University of Cambridge); Tom Ridge (University of Leicester); Peter Sewell (University of Cambridge)

Sorry for the bombastic title

The CS community has built many models of real-world artifacts that were built by engineers rather than computer scientists, e.g. memory models, processor architectures, etc. These take many postdoc-years to build.

Models are developed inside a theorem prover or a PL and then hand-ported; porting may be machine-assisted (Perl scripts :)). This language-to-language translation is ugly. We get theorem prover lock-in. Porting requires experiments in both systems. There is no library of off-the-shelf components.

Example of C++11 relaxed memory models

Can we have a write-once, prove-anywhere model?

Example model: Power and ARM operational model of relaxed memory (Sarkar et al.)
Built from a multiprocessor memory subsystem plus thread semantics
Another Example model: Axiomatic model for C/C++11 concurrency
Another Example model: CakeML language semantics & verified compiler (Kumar et al.)

Introducing Lem:
Source -> Lem -> HOL, OCaml, documentation
Lem's syntax is similar to OCaml's
Lem is a pragmatic tool; we are happy to accept code that causes problems for theorem provers
Example of Lem, including highlighting of how whitespace and comments are presented

Get Lem from cl.cam.ac.uk/~pes20/lem

Q: How do you support naming and binding?
A: We would like to add nominal naming

Session 8: Type Systems (Chair: Geoffrey Mainland)

Safe Zero-Cost Coercions for Haskell
Joachim Breitner (Karlsruhe Institute of Technology); Richard A. Eisenberg (presenting, University of Pennsylvania); Simon Peyton Jones (Microsoft Research); Stephanie Weirich (University of Pennsylvania)

Generative type abstractions like newtype HTML = MkH String are useful to prevent programmer errors. The constructor has no runtime overhead. But stringList :: [HTML] -> [String] requires a "map", taking linear time!

Answer: a new equivalence relation, ≈ (written in Haskell syntax as Coercible a b instead of a ≈ b): coerce :: Coercible a b => a -> b which should do nothing at runtime!

Want: Coercible a b then Coercible [a] [b]

Need instances of Coercible for newtype ("wrapping") and data ("lifting") declarations.

Naive approach is too permissive: can derive Int ≈ Bool -> Bool when type families are used. Reason: have two equalities, nominal (compile time) and representational (run time) equality. Type families do not respect representational equality.

Answer: assign roles to type parameters -- nominal, representational, or phantom. Example for phantom: instance Proxy p1 ≈ Proxy p2
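
In GHC 7.8 this looks like the following (the interface lives in Data.Coerce; the role annotation syntax is from the paper):

    {-# LANGUAGE RoleAnnotations #-}
    import Data.Coerce (coerce)

    newtype HTML = MkH String

    -- O(1), no map: Coercible HTML String lifts through [] because
    -- the list type's parameter is representational.
    stringList :: [HTML] -> [String]
    stringList = coerce

    -- A set ordered by its elements must not let its parameter be
    -- coerced, since a newtype may install a different Ord instance:
    data Set a = Set [a]
    type role Set nominal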

Role Inference. Goal is to determine the most permissive yet safe role for type parameters.

Application: GeneralizedNewtypeDeriving has been safely reimplemented in terms of coerce. Still only want to allow newtype (un)wrapping when constructor is in scope.

Abstraction: when do we want to allow coercion? The Default Debate: "preserve abstraction! Make roles default to nominal!" vs. "Be backward compatible! Allow GeneralizedNewtypeDeriving!" (latter was chosen).

Trouble on the horizon? Problem with Monads including "join".

Conclusion: efficient, safe, straightforward interface, implemented and released in GHC 7.8.

Q: Miranda dealt with ADTs rather differently. It later turned out that the Haskell and Miranda approach can be translated into each other.
A: Miranda's ADT system might run into trouble with type families...

Q: How many packages would need role annotations?
A: Not clear how to assess that. Probably very few (2?).

Q: How about Safe Haskell?
A: Coercible is not yet in the safe subset.

Q: Why not put these role annotations in the kind syntax?
A: Too big a change to the kind system. Makes it less flexible.

Hindley-Milner Elaboration in Applicative Style (Functional Pearl)
Francois Pottier (INRIA)

The story of ML type inference and how it has been explained over the years.

70s: Milner invents ML type system and polymorphism. Publishes declarative presentation, Algorithm W, and imperative one, Algorithm J. Contains global variable for "current substitution". Both compose substitutions produced by unification, and create new variables as needed.

80s: Cardelli, Wand formulate type inference as a two-stage process: generating and solving a conjunction of equations. Benefits: higher-level thinking: equations and conjunction instead of substitutions and composition; also greater modularity: constraint solving in library, generation by user. But new variables still created via global side effects.

90s: Kirchner and Jouannaud, Rémy explain new variables as existential quantification and constraint solving as rewriting -- necessary step on the road towards explaining polymorphic inference.

2000s: Gustavsson and Svenningsson explain polymorphic type inference using constraint satisfaction.

Constraint solving: on paper, every constraint can be rewritten until it is either solved or you get an error.

A problem: submitting a closed ML term to the generator yields a closed constraint which the solver rewrites to false or true (is well typed or is not well typed).

Question: can one perform elaboration without compromising the modularity and elegance of the constraint-based approach?

A "low level" solution: the generator could produce a pair of a constraint and a template for an elaborated term, sharing mutable placeholders for evidence so that after the constraint is solved, the template can be "solidified" into an elaborated term.

This approach has three stages: generation, solving, solidification; each user construct is dealt with twice -- not very elegant.

Can we describe generation and solidification in a unified manner? Yes: give the user a DSL to express computations which generate constraints and read their solutions simultaneously. The implementation of this is simple, and allows the user to define inference and elaboration in one inductive function.
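
Pottier's implementation is in OCaml; here is a rough Haskell sketch of the shape of that DSL (all names hypothetical):

    -- A computation of type Co a both emits a constraint and, once the
    -- solver has produced a solution, reads it back to build its result
    -- (e.g. an elaborated, explicitly-typed term).
    type Var = Int
    data Ty = TVar Var | TArrow Ty Ty deriving Show
    data Constraint = CTrue | CAnd Constraint Constraint | CEq Ty Ty
    type Solution = Var -> Ty            -- produced by the solver

    newtype Co a = Co (Constraint, Solution -> a)

    instance Functor Co where
      fmap f (Co (c, k)) = Co (c, f . k)

    instance Applicative Co where
      pure x = Co (CTrue, const x)
      Co (c1, k1) <*> Co (c2, k2) = Co (CAnd c1 c2, \s -> k1 s (k2 s))

    -- emit an equation; later, read off the solved type of a variable
    equal :: Ty -> Ty -> Co ()
    equal t1 t2 = Co (CEq t1 t2, const ())

    solved :: Var -> Co Ty
    solved v = Co (CTrue, \s -> s v)

Because generation and read-back are written together, each construct of the source language is handled in just one place.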

Conclusion: solution is simple, modular, elegant, performant. Could be used in other settings potentially -- e.g. higher-order pattern unification?

Q: Would this work with lazy evaluation?
A: Not sure.

C: Notation of the algorithm is awkward.
A: Yes, this shows that we need a nicer way to express computations with applicative functors.

Q: How about a constructive proof instead of returning true or false?
A: Possible but then the proof tree would have to be inspected -- would still need to do work twice.

Session 9: Incremental Computing (Chair: Tiark Rompf)

Settable and Non-Interfering Signal Functions for FRP
Daniel Winograd-Cort (presenting), Paul Hudak (Yale University)

FRP: programming with continuous values and streams of events, like signal processing diagrams.

Used in Yampa, Nettle, Euterpea.

Event-based vs. continuous: introduce notation for differentiating the two from each other. Standard arrow operators: partial application, feedback loops, composition; stateful arrows (delay), arrow choice -- running the signal function becomes a dynamic decision.

Higher order arrows (with switch) are inherently dynamic, can replace arrow function. Arrows with switch are equivalent to Monad. Makes it harder to optimize.

Why arrows with switch? -- "power start and stop".

Contribution: introduce resettability and non-interfering choice. Get general resettability and arrowized recursion.

Example: IntegralReset. Want to be able to reset it with an event. Easy to do with switch: just replace the integral function. Without switch we can simulate a reset but we can't modify the function itself. Not nice: doesn't generalize.
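
A hand-rolled sketch of the switch-free simulation (not Yampa's actual API): thread the reset event into the signal function and re-zero its internal state.

    -- Minimal signal functions: step dt input = (output, next SF).
    newtype SF a b = SF { step :: Double -> a -> (b, SF a b) }

    -- Integral that re-zeroes its accumulator on a reset event.
    integralReset :: SF (Double, Maybe ()) Double
    integralReset = go 0
      where
        go acc = SF $ \dt (x, ev) ->
          let acc' = case ev of
                       Just () -> 0             -- reset event fired
                       Nothing -> acc + x * dt  -- keep integrating
          in (acc', go acc')

The reset handling is baked into this one function, which is exactly the non-generalizing workaround the talk describes; settability makes the same transformation generic over any signal function.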

Resetting state: state is in delays and loops.

General resettability: can take any signal function and transform it into a settable signal function.

Presents "settable laws": identity, uniformity, default.

Other example: pausing an arrow. IntegralWhen, a signal function that performs an integral only under a given condition. Can't do this with arrow choice. Solution: use non-interfering choice.

Non-interfering choice gives recursion! Two kinds: fixpoint-like recursion and structural recursion (which is static).

Arrowized recursion is "predictably dynamic" which nicely allows optimization.

Causal Commutative Arrows (CCA) can be heavily optimized. Allows choice but not switch. Reduces arrows to one of two forms.

Summary: introduced settability, a new model for controlling FRP state; non-interfering choice allows new forms of expression and arrowized recursion. Switch is really only needed for true higher order expressions!

Q: Is settability like a transistor?
A: Yes.

Q: Are there other circuit design ideas that we could re-use?
A: Not sure. Interesting idea.

Q: Can you use CCA factorization with GHC e.g. using rewrite rules?
A: Unresolved so far... There is a Template Haskell library for doing something like this.

Functional Programming for Dynamic and Large Data with Self-Adjusting Computation
Yan Chen (Max Planck Institute for Software Systems); Umut Acar (Carnegie Mellon University); Kanat Tangwongsan (Mahidol University)

Context: Big Data. Challenges are variety, volume, velocity -> requirements: expressive, parallel, incremental.

Why not use FP for computing big data? Parallelism can be done. How about incremental data?

Implicit self-adjusting computation: in addition to normal output, the program creates a dependency graph so when an input changes, only the necessary recomputation is done.
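
A toy Haskell rendering of the dependence idea (the paper's system extends SML; this is only the flavour): a modifiable cell remembers its readers, and a change re-runs just those readers.

    import Data.IORef

    -- A modifiable cell plus the computations that depend on it.
    data Mod a = Mod { cell :: IORef a, readers :: IORef [IO ()] }

    newMod :: a -> IO (Mod a)
    newMod x = Mod <$> newIORef x <*> newIORef []

    -- Read the cell and register the continuation as a dependent,
    -- so it is re-run whenever the cell changes.
    readMod :: Mod a -> (a -> IO ()) -> IO ()
    readMod m k = do
      let run = readIORef (cell m) >>= k
      modifyIORef (readers m) (run :)
      run

    -- Change an input: propagate to the registered dependents only.
    change :: Mod a -> a -> IO ()
    change m x = do
      writeIORef (cell m) x
      readIORef (readers m) >>= sequence_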

Result: faster but needs more memory to keep dependency graph -- time vs. space trade-off. This paper: controlling memory use.

Speedup as a function of memory usage has diminishing returns. Want to find the sweet spot on the speedup vs. memory usage curve.

Type-directed translation: extend SML types with Changeable / Stable.

Need a technique to control granularity of the dependency graph. Treat blocks of data as a single reference cell. Fixed block size leads to disproportionately slow update time -- so instead use probabilistic chunking scheme.

Applications and Evaluation: incremental PageRank, graph connectivity, social circles. Very high speedups achieved in prototype implementations.

Future work: parallel self-adjusting computation.

Q: Have you done experiments with larger changes?
A: Yes.

Q: Is it realistic for companies to use 10x more memory in order to speed up computations?
A: It's a trade-off.

Q: What would be the speedup when you don't use additional memory?
A: Zero (?)

Q: This looks like it's intrinsically sequential. How do you plan to parallelize this?
A: It's all about dependency tracking -- dependencies can be used to decide where to parallelize.

Q: How would a non-pure merge sort perform?
A: Need to check.

Q: Can you incrementalize the whole or parts of SML?
A: Interesting question. Non-pure parts of SML would be problematic.

ICFP Contest Presentation by Duncan Coutts and Nicolas Wu

Created by members of the Department of Computer Science at Oxford University and Well-Typed.

The ICFP Programming Contest started in 1998 (in response to the ACM programming contest allowing only mainstream imperative programming languages).

Open to anyone, any PL, teams of any size, takes place on the internet, 24h lightning round, 3-day full round.

Certain similarities between the problem ten years ago and this year's. But this year: two compilers instead of only one!

Setting the scene for this year's contest: 80s pop culture, 80s PL technology, 70s PL research.

LambdaMan runs around in a maze eating pills and evading ghosts ... striking resemblance to a more well-known game!

LamCo (producer of LambdaMan) had some interesting but weird technology and a spectacularly dysfunctional development process: a team of 8-bit microcontroller fanatics combined with a team of LISP fans.

Lightning round: LambdaMan AI running on a LISP CPU
Full round: additionally, a ghost AI running on an 8-bit microcontroller

Task description: many pages of CPU specs.

Simulator and debugger given -- in the "lingua franca of the internets", JavaScript (actually written in Haskell and then compiled using GHCJS).

Before the contest, pointless hints were tweeted.

Major panic as the git-based site deployment system went haywire just when the contest started, resolved manually. Next 5 hours spent with all hands on deck. Participants could see specification diffs on GitHub.

Judging process: collected and verified all submissions. Had enough resources to run all-vs-all rather than a multiple-round tournament (one team's LambdaMan against another team's ghosts and the other way round). Ran 200,220 games.

First prize: Supermassive Black Hom-set (Pavel Lepin) -- Haskell is the PL of choice for discriminating hackers!

Second prize: UnaGi -- C++ is a fine tool for many applications.

Third prize: DiamondPrincess -- Perl is also not too shabby.

Winner lightning round: jabber.ru (Alexey Shchepin) -- OCaml is very suitable for rapid prototyping.

Judge's prize: gagallium -- Team gagallium are an extremely cool bunch of hackers! Used OCaml. Best ghost. They compiled their code from an intermediate stage of the OCaml compiler.

Pavel Lepin gives a short presentation of his approach to solving the contest.

ICFP 2004 Most Influential Paper Award presented by Derek Dreyer

Goes to "Scrap More Boilerplate": Reflections, Zips, and Generalised Casts" by Ralf Lämmel and Simon Peyton Jones.

Significantly extended the authors' earlier "Scrap Your Boilerplate" paper and library.

Simon's observations: 1. an unequal partnership made this possible; 2. good titles are powerful!

Ralf: used inspiration from different fields.
