{"id":1634,"date":"2013-11-03T17:24:21","date_gmt":"2013-11-03T17:24:21","guid":{"rendered":"http:\/\/www.syslog.cl.cam.ac.uk\/?p=1634"},"modified":"2013-11-04T12:18:00","modified_gmt":"2013-11-04T12:18:00","slug":"liveblog-from-programming-languages-and-operating-systems-2013","status":"publish","type":"post","link":"https:\/\/www.syslog.cl.cam.ac.uk\/2013\/11\/03\/liveblog-from-programming-languages-and-operating-systems-2013\/","title":{"rendered":"Liveblog from Programming Languages and Operating Systems 2013"},"content":{"rendered":"
<\/a>I'm here in the Nemacolin Woodlands<\/a> at SOSP 2013 with a huge gaggle of the SRG (both old and new), in a room packed to capacity for the PLOS 2013 workshop! I'm co-chairing it with Tim Harris, so I'll be liveblogging talks when I'm not coordinating sessions. The keynote is from Russ Cox from Google, and you can find all the papers on the ACM digital library<\/a>.<\/p>\n <\/p>\n Go is an open-source programming language that makes it easy to build simple, reliable and efficient software. Started in 2007, open sourced in Nov 2009, and developed in the open since then. It started as an answer to software problems at Google: multicore, networked systems, massive clusters, and working at scale: 10^7 LoC, 10^3 programmers, 10^6(+) machines.<\/a><\/p>\n Go is a simple but fun language. Start with C and remove the complex parts -- then add interfaces, concurrency and GC\/closures\/reflection\/strings.<\/p>\n \"Less is exponentially more\".<\/p><\/blockquote>\n The joke is that Go started while waiting for a large C++ program to compile.<\/p>\n import \"fmt\" is guaranteed to read exactly one file. Metadata in that package describes what else may be needed in the future.<\/p>\n Q from audience: what doesn't scale with 1000 imports instead of one? Still guaranteed to read exactly one file, but the import namespace is decentralized.<\/p>\n There's a mechanical, mandated format that Go requires for syntax.<\/p>\n Prints the same program but with the Info line from before changed to Error. Very useful for refactoring large code bases.<\/p>\n In C\/C++, too much programming and API design is about memory management. Go has garbage collection only. This is fundamental for interfaces.<\/p>\n Go lets you limit allocation by controlling memory layout -- for example, a ring buffer.<\/p>\n The GC is still an area of active research. 
Design decision: they cannot reuse a Java GC directly due to an early design decision to permit interior pointers, as well as foreign pointers. The collector is implemented as a mark-and-sweep collector. It's idiomatic in Go to plan your memory layout up front, unlike (for example) in Java where you are encouraged to treat garbage collection as \"free\".<\/p>\n An interface defines a set of methods.<\/p>\n A type implements the interface by implementing the methods (structural typing).<\/p>\n An implementation of an interface can be assigned to a variable of that interface type.<\/p>\n Reader is an obvious counterpart, and this lets you write a function Copy that operates purely over Reader and Writer interfaces.<\/p>\n A more complex example is adapters.<\/p>\n MultiWriter creates a writer that duplicates its writes to all the provided writers, like Unix tee.<\/p>\n Interfaces tend to be fairly small, but one bigger example is in the networking library. Errors are explicitly returned.<\/p>\n You can build other things such as SSL on top of this interface.<\/p>\n There's no dependence between the interface and implementation, unlike in Java where implementing an interface is very explicit. This gives expressive composition and easy testing, and avoids overdesign and the rigid hierarchy of inheritance-based OO.<\/p>\n Interfaces are the source of all generality in Go.<\/p>\n Concurrency is about dealing with a lot of things at once. 
Parallelism is about doing a lot of things at once. Concurrency is about structure, and parallelism is about execution.<\/p>\n Examples of concurrency: keyboard, display, disk drivers in an OS.<\/p>\n Parallelism: vector dot product, matrix multiply, which can often be sped up almost linearly with cores.<\/p>\n Concurrency enables parallelism, but is useful on its own since modern programs must deal with many things at once.<\/p>\n Go provides two important concepts: a goroutine is a thread of control within the program, with its own local variables and stack. They are cheap and easy to create.<\/p>\n Channels provide a mechanism for goroutines to communicate via typed messages.<\/p>\n The func() runs concurrently with the main() which consumes those messages in the Println.<\/p>\n The model is taken from Tony Hoare's CSP; it is orthogonal to the rest of the language, so you can keep a familiar model of computation and focus on composition, switching to CSP primitives only when you want to reason about communication. On the other hand, something like Erlang requires you to learn a new way of working (actors) and is less easy to move to from C\/C++.<\/p>\n Caveat: goroutines are not purely memory safe and sharing is legal. Passing a pointer over a channel is idiomatic, and it's been working well in practice.<\/p>\n Sequential network address resolution given a work list.<\/p>\n This is a simple version that does it step by step. To do it in parallel, let's use channels.<\/p>\n for _, w := range worklist { for i := 0; i < len(worklist); i++ { Channels are being used here to build a pattern, which can be put in the stdlib. This is actually a sync.WaitGroup from the standard library. 
Parallelism can also be bounded by using a channel to limit the number of outstanding goroutines.<\/p>\n If you are using mutexes, you are almost certainly creating bugs. Channels seem easier for people to get right in practice.<\/p>\n Example: replicated storage with read\/write quorums.<\/b> (Eventually consistent)<\/p>\n (Shows an example of this using channels and writes)<\/p>\n Q: How are slow messages cancellable?<\/p>\n A: the channel itself will be GCed since the writer stuffs the request into the channel. However, cancellation is a common pattern in distributed systems, so a more advanced version can return a channel to be used for interruptions (which can in turn be watched by a network RPC engine and invoked if something takes too long).<\/p>\n Select allows choosing between multiple channel operations. For example, a chat program.<\/p>\n This is a *key* feature for building distributed systems, and is supported by closures and garbage collection. Message passing inside the program, and also outside the program. This helps you not to change your thinking when moving from a single program to a distributed system (which has to be message passing).<\/p>\n Most importantly, do not communicate by sharing memory; instead, share memory by communicating.<\/p>\n Q: are closures deep copies? Q: what about cache coherence across domains? What the type system chooses not to express is as important as what it does.<\/p>\n Interfaces are most powerful when they are least restrictive. A Writer, for example, doesn't have a flush call, just a write. Restrictions that could bifurcate interfaces, and that Go doesn't have, include:<\/p>\n Concurrency details restrict implementations:<\/p>\n In Go, blocking is assumed, and the runtime can interleave goroutine execution. 
For concurrency, use a goroutine! This makes stacks much more useful.<\/p>\n Is Go used in production? Yes, with a couple of examples within Google that are public.<\/p>\n This is all available at http:\/\/golang.org<\/p>\n Q: in Haskell, the pain of the IO monad is much felt. The type system is a tool for finding technical debt. Haskell prevents shortcuts and forces you to do the right thing. Q: are there any fundamental things you can do in Go vs (e.g.) Erlang? Q: Singularity had channels that were safe, but in that case channels are statically guaranteed. Q: there's a certain complexity to Java now with generics and so on. How do you stay simple? Oracle Labs, University of Cambridge <\/a> What kind of thing is $d, and what does grep do? $d is a tree node (directory).<\/p>\n grep -r will recursively explore (readdir), work on leaf nodes, read characters (bytes) and, if matched, output a reference. What stops it generalising further?<\/p>\n It turns out that if you grep the grep docs, they are gzipped and we hit a semantic wall for composability. We are hitting a limit of Unix (semantics-free) byte streams.<\/p>\n In the Smalltalk world, Dan Ingalls suggested that an \"operating system is a collection of things that don't fit into a language. There shouldn't be one\". Specific grumbles: \"depart from an otherwise consistent framework\", \"very primitive\". This never quite took off for some reason though...<\/p>\n A key part of an OS's value is communicating abstractions. Unix is tending towards Smalltalk -- but its meta-abstractions are lurking. 
Smalltalk is all about a programming abstraction with pervasive late binding, with uniformity and big, well-designed libraries.<\/p>\n What is a \"metasystem\"? A \"system\" is a collection of behaviours modelling a domain. A \"metasystem\" models the domain of systems. For this talk, a machine-readable specification is of interest. Metasystems enable value-added orthogonal functionality such as persistence, REPLs, debugging and so forth.<\/p>\n In the land of UNIX, \"since we are programmers, we naturally designed the system to make it easy to write, test and run programs\". It has multiple programming abstractions, primitive meta-abstraction from I\/O and some late binding. But it's missing a semantic meta-abstraction (e.g. classes) and, in modern UNIX, uniformity...<\/p>\n In Plan 9, \"we build a UNIX out of little systems, not a system out of little Unixes\".<\/p>\n Files and objects are some subdivision of a computation, implementing a basic protocol, consisting of messages that are network transparent -- onto which additional meaning can be layered.<\/p>\n It's still missing documented semantics for files (e.g. control files), so in this world, what can I grep or cp -r?<\/p>\n There is a lurking Smalltalk in UNIX, but also metasystems:<\/p>\n What Stephen wants to do is to \"build a Smalltalk out of lots of fragments, not a big fragmented system out of little [insert PL here]s\".<\/p>\n Smalltalk is one view among many! Languages appear to be more ephemeral than OSes, and language implementations should be \"a view on the wider world\".<\/p>\n What are the benefits if we join the dots on metasystems? Via minor expansions to UNIX-like interfaces, unify the metasystem and not just the system.<\/p>\n Concrete use:<\/b> How can an IPC metasystem have a provide\/require metasystem? 
Can we have something similar to DWARF to interpret files and directories with more semantic information?<\/p>\n [paper link]<\/a><\/p>\n <\/a>Causes of filesystem bugs can be broken down into semantic confusion (58%), memory handling (14%), error handling (10%) and finally concurrency. For now, they put concurrency aside and worry about the rest.<\/p>\n They start with an Isabelle functional specification and a domain-specific language (DDSL) which contains the data layout spec.<\/p>\n The DDSL compiler compiles serialisation code which is then passed through to a lower-level Isabelle spec. Finally all this is extracted to C code along with proofs that it is equivalent to the specification.<\/p>\n The current implementation contains a CDSL core design and has static and dynamic semantics formally defined. A case study involves a low-level filesystem.<\/p>\n The DDSL is a key part of the system. There are a lot of existing data description languages (how to specify a data structure and layout, how to express constraints on the values, and how to check them statically). A lot of work, but mostly engineering.<\/p>\n The control language, CDSL, was more interesting from a research perspective. It involved a tussle between the systems people, who demand efficiency, destructive updates and expressiveness, and the theory people, who want controlled side effects, memory safety and static reasoning.<\/p>\n Therefore the design of CDSL ended up using two key ideas: a linear type system allows interpretation via value semantics as well as update semantics. The main idea is that the values of a linear type have to be used exactly once.
This can be statically checked.<\/p>\n The main problem with linear types is that they make programming more difficult. Therefore CDSL has syntactic sugar to make it easier to break apart values and deal with them separately.<\/p>\n Error handling is dealt with by encoding it in the type. There is only a single error code type, and results are propagated in a linear-friendly way (i.e. on success the same object is returned as a fresh one).<\/p>\n CDSL ends up having sound value semantics and update semantics. Given a well-typed CDSL program, it can be interpreted with functional semantics (i.e. equational) and also with sound update semantics for efficient implementation. The results of evaluation are the same! In addition to the final value, every single evaluation step has a correspondence in the value and update semantics.<\/p>\n Linear types have been around for quite a while, but they aren't used widely despite their nice properties. The type of programs that can be written is quite restrictive, but this is still a good tradeoff to achieve the correctness goals of this project.<\/p>\n The results of this project are very promising, and it's really a question of how cheap it'll be. How much reuse of proofs will be possible, and how far can the abstraction level of CDSL be pushed towards proof automation?<\/p>\n Q: (Butler Lampson) you already built a verified kernel. What's the difference here? [paper link]<\/a><\/p>\n What's a trustworthy system? 
It's about DO-178C -- they check development process, testing, config management, certification and eventually formal verification.<\/a><\/p>\n The issue right now is the expense of scaling up formal verification as LoC grows. A component-based development system offers an opportunity to build layers on top of a verified seL4 kernel and eventually grow it into a large-scale system at much cheaper cost than is currently possible (with monolithic verification).<\/p>\n The end-to-end guarantees involve starting from an architecture description and generating glue code that combines component code into a userspace image. The architecture description also yields an architecture semantics, which lets them show the strong separation of multiple components. The user provides an architecture spec, a trusted component specification and component code.<\/p>\n It's all done in Isabelle\/HOL and they have a lot of code built using it now (including the previous filesystem talk), so it's heading towards being a \"real\" system.<\/p>\n CAmkES is a component platform which has abstractions for procedures, events and shared memory. (shows the architecture semantics) <\/p>\n The component process language is a concurrent imperative language. It consists of a UserStep to change a local value, a SKIP to do nothing, a composition double-semicolon to make a request, and a similar if\/then\/else to handle non-deterministic responses.<\/p>\n Q: safety-critical but not real-time systems? [paper link]<\/a><\/p>\n <\/a> There is a Physical Resource Graph (the PRG) that captures hardware functions and configuration, and a Logical Protocol Graph (the LPG) that has protocol state and packet processing. The PRG is combined with the LPG to embed into a real system.<\/p>\n Unicorn is a programming language to talk about NICs: a concrete syntax for writing PRGs and LPGs. 
The building blocks are function nodes: a single computation with a single input and ports with potentially multiple output edges. When computation is done, one port is activated and subsequently the nodes connected to that port are activated. Operator nodes provide more useful functionality. In the LPG (shows a packet forwarding example), there is a very loose partial ordering to enable more flexibility when embedding. This allows sw\/hw resources to be used interchangeably.<\/p>\n Modelling NIC configuration is quite difficult -- there are lots of options that drastically change the behaviour of the NIC. Thus configuration has to be first-class in the embedding to get high performance.<\/p>\n The implementation is in Haskell -- an EDSL that embeds algorithms into PRGs\/LPGs and then into Unicorn (language+model). At the very bottom is the Intel DPDK which executes it on an Intel NIC (i82599). The stack responds to ping (so ARP\/ICMP\/etc.) and they are continuing to develop it.<\/p>\n Q: Does this system have any modularity? [paper link]<\/a><\/p>\n <\/a>File-system-aware applications include a file system checker (fsck) to ensure that it's consistent. For example, two files with the same name and directory will result in fsck renaming one of them so that the unique naming invariant is preserved.<\/p>\n Filesystem tools are strongly coupled with fs-specific details -- imagine n file systems and k file system tools.<\/p>\n The goal is to decouple a file system format from file system tools. It's analogous to compilers decoupling the frontend and backend formats.<\/p>\n The challenge is that we need a specification of the filesystem format to traverse the fs graph. However, nodes are under-specified and so we don't have enough information to build a tool.<\/p>\n The on-disk format of this has a variable-length array (which can't easily be declared in C). <\/p>\n The second problem is that edges are implicit. For instance, does struct foo point to struct bar? 
The on-disk format doesn't reflect this strongly.<\/p>\n Their approach is to create an annotation language to identify on-disk data structures which aren't currently specified in the C code. This involves specifying the pointer information, the type of structure referenced, and the conversion to disk addresses. Next, conditional interpretation of fields (tagged unions) increases flexibility.<\/p>\n They are building a simple ext2-style file system to evaluate their language.<\/p>\n POINTER (type=dir_block, when = self.i_type == I_DIR) These annotations are detailed in the paper, but essentially map on-disk data to the C struct. Their tool can now use these to generate code that (e.g.) forms iterators. They can annotate real file systems (e.g. ext3, btrfs etc.) and improve code generation. <\/p>\n Q: (Russ Cox) this seems to be focussing on serialization, so why not just provide libraries? Q: can you handle corrupted data structures? Q: the idea is to write one bit of code in fscheck, and then run it against all file systems. What would this look like? Q: could you write a database like this? cf. the internal contents of files, which are quite complicated now (e.g. a sqlite database stored as a file) [paper link]<\/a><\/p>\n <\/a>A future is a placeholder for a result that may not exist yet. A Future[Int] is an integer-valued future, for example. Futures are used for any sort of async operation, for example pending RPCs, disk I\/O, or pending timeouts. Generally they're agnostic to the source of the operation, and consumers and producers are decoupled. A promise is a writable future used to fill in its value.<\/p>\n It is possible to view futures as an at-most-once element container. You can transform them by applying a map function.<\/p>\n The most important way to use futures is to compose futures as a function of other futures. 
<\/p>\n Futures recover from errors by another form of dependent composition.<\/p>\n It's possible to build up a collection of futures and wait for all the futures to succeed, returning the sequence of returned values. The returned future fails if any of the constituent futures fail.<\/p>\n def crawl(url: String): Future[Seq[Buf]] = He shows a simple webcrawler (for DAGs only!) that uses all these abstractions. The observation is that futures, services and filters form a good orthogonal basis on which to build server software of the sort found in production at Twitter. The style of programming encourages good modularity and separation of concerns. Most of their systems are just future transformers.<\/p>\n As presented though, futures are purely read-only constructs which form a DAG where consumers are separated from producers. For example, a producer should know when a value is no longer needed to avoid needless work and avoid queueing. They therefore added an interrupt model to cancel a future. <\/p>\n Interrupts make futures a little less clean; for example caching futures becomes a problem and it's not fully composable anymore. But in practice it's been fine.<\/p>\n Futures, services and filters form an orthogonal basis for writing server software. The strategy has paid off: picking a few composable abstractions has led to the emergence of small, reusable components. This is the software tooling approach for building server software now. Separating semantics from mechanics has been hugely beneficial too.<\/p>\n Q: when it breaks, how do you expose a useful explanation to the user? Q: Mothy: in the past, people have talked about futures a lot :) How are these futures different? [paper link]<\/a><\/p>\n <\/a>Motivated by the need to keep track of what's going on in a complex Xen stack. There are 3-4 layers of scheduling and I\/O, and generally a complex distributed system. 
Xentrace provides a large execution trace from a Xen setup.<\/p>\n Weir is a new language that takes an event trace and combines stages together into a pipeline for processing. The analysis algorithm is a composition of stages, and the logic of the algorithm consists of the topology of the stream. He then shows off the basic pipeline operators such as the pipe, sequence and if operators. A join operator lets you count the time between a pair of events, for example.<\/p>\n An example is to build a histogram from the Xentrace trace, which consists of a foreach and a join, and shows that there is a high number of a particular event (exits). Then they figure out why the exit happens (an rdtsc timestamp event). The next Weir trace figures out the total time spent in Xen due to rdtsc events, which is built using the gate constructs (to open and close portions of the pipeline).<\/p>\n [paper link]<\/a><\/p>\n <\/a> MBrace is a new programming model for the cloud that provides an elastic fault-tolerant programming framework. It uses a monad for composing distributed workflows. It's essentially a continuation monad that admits distribution through the use of primitive combinators. It is based on F#'s computation expressions (the F# equivalent of monads, but including useful keywords such as while loops). The whole thing is inspired by F# asynchronous workflows.<\/p>\n The cloud keyword shows that it's a monad, and the return value is a future (in the parlance of Marius' talk).<\/p>\n let downloadParallel () = cloud { The basic operators are a binary parallel operator, a variadic parallel combinator and a non-deterministic parallel combinator for selecting one. F# provides useful support for these monads -- monadic for and while loops and exception handling.<\/p>\n Execution happens with a scheduler\/worker organization and a symbolic execution stack (free monads and trampolines), and the scheduler interprets the \"monadic skeleton\". 
The native leaf expressions are dispatched to workers. The symbolic stack winds across multiple machines, so if a distributed computation raises an exception, the stack can be unwound across multiple machines.<\/p>\n There's no storage service built into MBrace; instead it relies on third-party storage services. There is out-of-the-box support for SQL, Azure and filesystems, and planned support for Amazon S3. Storage services are interfaced through data primitives which act as references to distributed resources that can only be accessed through the monad. A \"cloud ref\" is a data reference, conceptually similar to ML ref cells.<\/p>\n He shows benchmarks for a distributed grep (a straightforward mapreduce implementation) and finds that MBrace is slightly faster than Hadoop, with roughly equivalent performance. Moving on to iterative mapreduce (kmeans), they compared against Apache Mahout; MBrace pulls ahead, an order of magnitude faster.<\/p>\n [paper link]<\/a><\/p>\n <\/a>The motivational example is repeated insertion and deletion on a bounded linked list. The desired invariant is that the total length of the linked list is less than a million. While debugging, they discover the list is too big. In the debug history they have a list of its length. How do we find the fault in the lifetime of the program? Ideally, they want to do a binary search for the desired invariant over the process lifetime.<\/p>\n The first issue is determining how to find a halfway point while debugging, which is essential for a binary search. By looking at the debug history, they can attempt a binary search over it and determine which of the debugging commands caused the invariant to be violated. Each debug command might be a breakpoint (next, continue), but the continue consists of lots of commands. 
They want to decompose the continue into a series of smaller nexts to home in on the problem quickly.<\/p>\n The other issue is the slow speed of repeated execution of the debug history. There is a lot of repeated computation when falling back through a series of debug commands to find the bug. They use process checkpointing to speed up the binary search, by storing a common checkpoint and restarting from there in a future binary search through the history.<\/p>\n The final main issue is how to deal with non-deterministic programs (such as calls to gettimeofday). They need to record interesting events and ensure they replay those events faithfully. In multithreaded programs, there are locks involved for synchronization, and on restart, locks are acquired\/released in a different order. They keep an event log and use it to deterministically re-execute the program after a restart. The assumption is that a well-structured program only communicates across threads via locks.<\/p>\n What if a fault is triggered in a secondary thread? Threads are not kept synchronous, so background threads continue to execute. Kapil describes the algorithm that does the scheduler-locking tricks to find the problem (details in the paper, my fingers hurt now!).<\/p>\n They did the debugging on MySQL and found a couple of bugs (an atomicity violation and a data race; MySQL bugs 12228 and 32319 if anyone wants to look them up). The checkpoint image size for MySQL was around 60MB with 1.2 million entries in the event log, and the average log entry size was around 4 bytes, so not a huge amount of storage is required.<\/p>\n Q: Tim Harris: binary search is over an increasing value. In the linked list example, it's not monotonic, as it goes over and under the desired list length. Russ Cox: Using Go to build distributed systems<\/h2>\n
Engineering: fast compilation<\/h3>\n
\r\npackage main\r\n\r\nimport \"fmt\"\r\n\r\nfunc main() {\r\n    fmt.Printf(\"\")\r\n}\r\n<\/code><\/pre>\n
\nA: the compilation time doesn't scale -- not runtime issues.<\/p>\n\r\n$ go get github.com\/golang\/glog\r\n\r\npackage main\r\n\r\nimport (\r\n    \"flag\"\r\n\r\n    \"github.com\/golang\/glog\"\r\n)\r\n\r\nfunc main() {\r\n    flag.Set(\"logtostderr\", \"true\")\r\n    glog.Infof(\"hello world\")\r\n}\r\n<\/code><\/pre>\n
Engineering: Program rewrites<\/h3>\n
\r\n$ gofmt -r 'glog.Infof -> glog.Errorf' hello1.go\r\n<\/code><\/pre>\n
Engineering: garbage collection<\/h3>\n
Interfaces<\/h3>\n
\r\npackage io\r\n\r\ntype Writer interface {\r\n    Write(data []byte) (n int, err error)\r\n}\r\n<\/code><\/pre>\n
\r\nfunc MultiWriter(writers ...Writer) Writer\r\n<\/code><\/pre>\n
\r\nfunc NewClient(conn net.Conn, host string) (*Client, error)\r\n<\/code><\/pre>\n
Interface lessons<\/h3>\n
Concurrency vs Parallelism<\/h3>\n
\r\npackage main\r\n\r\nimport \"fmt\"\r\n\r\nfunc main() {\r\n    c := make(chan string)\r\n    go func() {\r\n        c <- \"Hello\"\r\n        c <- \"World\"\r\n    }()\r\n    fmt.Println(<-c, <-c)\r\n}\r\n<\/code><\/pre>\n
\n
\nfor _, w := range worklist {
\nw.addrs, w.err = LookupHost(w.host)
\n}
\n<\/code><\/p>\n
\ndone := make(chan bool, len(worklist))<\/code><\/p>\n
\ngo func (w *Work) {
\nw.addrs, w.err = LookupHost(w.host)
\ndone <- true
\n}(w)
\n}<\/p>\n
\n<-done
\n}<\/p>\n
\nconst (
\nF = 2
\nN = 5 \/\/ >= 2F + 1
\nReadQuorum = F + 1
\nWriteQuorum = N - F
\n)
\n<\/code><\/p>\n\r\nfor {\r\n select {\r\n case event := <- ui:\r\n \/\/process user interface event\r\n case msg := <- server:\r\n \/\/ process server message\r\n case t := <- tick:\r\n \/\/ time has elapsed\r\n }\r\n}\r\n<\/code><\/pre>\n
Concurrency lessons<\/h3>\n
\nA: no, they are capturing references to variables.<\/p>\n
\nA: being worked on, but not there yet.<\/p>\nType systems<\/h3>\n
\n
\n
\n
\nA: Loves Haskell, but is unable to write programs in it. There's a real tradeoff between the type system's complexity and expressivity. Go hits a spot in the middle that doesn't force you to be honest all the time (like Haskell).<\/p>\n
\nA: in Erlang, there's no concept of a channel and you pass around process ids.\u00c2\u00a0 In Go, it's just a nice solid model based on CSP that is easy to pick up.<\/p>\n
\nA: they don't make this static guarantee, but there is a dynamic analyser called the Race Detector which records all reads and writes and reports bad interleavings with backtraces.<\/p>\n
\nA: not claiming we'll stay simple forever. Go offers a balance: Java doesn't trust you with anything, whereas Go makes things clear.<\/p>\n
\n[paper link]<\/a><\/p>\n
\nRecursive search for a pattern. What is grep doing?<\/p>\n
\n
\nA new tool called pmirror: it knows Linux \/proc, ELF and DWARF (more wanted), and explains any byte in a system, enabling more semantic introspection.<\/p>\n
\ncreateObj :: Int -> Object
\nupdateObj :: (.Object, Int) -> .Object
\n<\/code>
\n(where the dots represent linear types).<\/p>\n
\nA: Heavy emphasis on proof automation. Gernot Heiser observes that it took 22 person-years.<\/p>\n
\nA: not modelling time at the moment.<\/p>\nModelling NICs with Unicorn<\/h2>\n
\nBuilds on a paper at HotOS -- increasing complexity, diversity and functionality in network cards (checksum and segmentation offload, lots of hardware queues, hw filtering, virtualization, direct cache access). Cores are not getting faster, but networks are. Current operating systems fail to utilize NIC hardware resources though.
\nThe Dragonet stack breaks up the problem by modelling the NIC and the network stack as dataflow graphs (c.f. the x-kernel or Click).<\/p>\n
\nfunctionality such as supporting multiple inputs.<\/p>\n
\nA: Yes! (I missed how though, sorry)<\/p>\nAnnotation for automation: Rapid prototyping<\/h2>\n
\nstruct baz {
\n short name_len;
\n \/* name goes here *\/
\n};
\n<\/code><\/p>\n
\nFSSTRUCT inode {
\n u32 i_type;
\n u32 i_mod_time;
\n u32 i_size;<\/p>\n
\n u32 i_block_nr[NR_DIRECT_BLOCKS];
\n ... (see paper)
\n<\/code><\/p>\n
\nA: the libs are provided by the filesystem authors and only spottily available. The goal is to support all styles of applications (both online and offline).<\/p>\n
\nA: this would be considered part of the application logic. They currently generate code which would be embedded in the template.<\/p>\n
\nA: You need to write the fschecker, but also the filesystem specific consistency rules. In addition to annotation, there is also the fs-specific policies and consistency invariants which also need to be specified. <\/p>\n
\nA: yes, you would treat the database file as a filesystem and specify the metadata that's part of the db definition.<\/p>\nMarius Eriksen: your server as a function, Twitter Inc<\/h2>\n
\nFuture[T].map[U](f: T => U): Future[U]
\n<\/code><\/p>\n
\nFuture[T].flatMap[U]( f: T => Future[U]): Future[U]
\n<\/code><\/p>\n
\nval f = auth(id, pw) rescue {
\n case Timeout => auth(id,pw)
\n}
\n<\/code> (which is similar to stack based catching of exceptions)<\/p>\n
\ndef fetch(url:String): Future[Buf]
\ndef parse(buf: Buf): Seq[String]<\/p>\n
\n fetch (url) flatMap { buf =>
\n val ls = Future.collect(parse(buf) map crawl)
\n ls map(_.flatten)
\n}
\n<\/code><\/p>\n
\nThen shows how services are just functions (e.g. memcached, redis and so on). A service can have filters which transforms a set of requests over a service and constructs the pipeline of changes. Simple example is a timeout filter which can either timeout a request or map through the result. Another filter is an authentication one that ensures that a given request has been authenticated.<\/p>\n
\nval timeout = new TimeoutFilter
\nval auth = new AuthFilter
\nval authAndTimeout = auth andThen timeout
\n<\/code><\/p>\n
\nval p = new Promise[T]
\np setInterruptHandler {
\n case exc => if (p.updateIfEmpty(Throw(exc))) tearDown ()
\n}
\n<\/code><\/p>\n
\nA: have a virtual stack trace, exposing the graph. When evaluating the graph they record the stack trace and show it to the user. A separate approach is a bytecode rewriting plugin for the JVM which rewrites all the spots in the code where promises are included to include provenance information about what they're waiting for.<\/p>\n
\nA: The traditional notion of a future in Java\/C++ have precluded the type of composition shown here. For example a java.util.Future can't be composed; you can only wait for the result (same is true for i-structured variables). OCaml's Lwt is most similar.<\/p>\nWeir: a streaming language for performance analysis, Anton Burtsev <\/h2>\n
\nWeir constructs these pipelines -- it is an imperative streaming language with arbitrary topologies and dynamic scheduling.<\/p>\n
\n
\nvm(0) | {
\n match(HYPERCALL);
\n match(EXCEPTION);
\n} | count ()
\n<\/code><\/p>\n
\nforeach(vm_id()) {
\n match(EXCEPTION) | count ();
\n}
\n<\/code><\/p>\nMBrace: Cloud Computing with Monads<\/h2>\n
\nMost distributed frameworks are tied to specific programming models: map reduce, dataflow or actors, for example. They tend to restrict specific distribution patterns, and aren't expressive enough for certain classes of algorithms. It is difficult to influence task granularity, and it is time-consuming to deploy, manage and debug code to and from the cloud.<\/p>\n
\nlet download url : string = cloud {
\n let client = new System.Net.WebClient ()
\n let content = client.DownloadString(url)
\n return content
\n} : Cloud
\n<\/code><\/p>\n
\nlet downloadSequential () = cloud {
\n let! c1 = download \"http:\/\/foo\"
\n let! c2 = download \"http:\/\/bar\"
\n let c = c1 + c2
\n return c
\n}<\/p>\n
\n let! c1,c2 =
\n download \"http:\/\/foo\"
\n <||>
\n download \"http:\/\/bar\"
\n let c = c1 + c2
\n return c
\n<\/code><\/p>\nSemi-Automated Debugging via Binary Search through a Process Lifetime. Kapil Arya<\/h2>\n
\nA: the goal is to find at least one bug and then go back from there.<\/p>\nUnderstanding the genetic makeup of device drivers<\/h2>\n