syslog
8 Nov 2017

SOSP’17 Trip Report

Posted by Jianxin Zhao

I'm glad to have had the opportunity to present our poster at SOSP, one of the top conferences in the field of systems research. Here is a brief review of the papers from this conference that I found interesting.

Debugging

One of this year's best paper awards goes to "DeepXplore", a whitebox framework for systematically testing real-world deep learning (DL) systems. Rather than systems, the focus of this paper is more on machine learning, IMO. This framework, or algorithm, tries to automatically find corner cases that invalidate your deep learning method. For example, suppose you have an image classification DL model. Even if it works perfectly in 99% of cases, there are always corner cases that make your model behave abnormally, e.g. recognising a panda as, say, a dinosaur. And obviously a user cannot find these potentially numerous cases and label them manually one by one.

What this algorithm proposes is to use a set of models built for the same purpose, and to use gradient descent to find an input example that maximises the disagreement among all these models. It also tries to maximise the activation of an inactive neuron to push it above a threshold - the newly proposed "neuron coverage" idea. The whole problem is formalised as an optimisation problem, and it is shown to be more efficient than random testing and adversarial testing. I'm very interested in the execution time to find a corner-case input - only seconds, as reported! Amazing.
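
To make the joint objective concrete, here is a tiny sketch (my own illustration, not the DeepXplore code): two toy linear "models" stand in for the real DNNs, and gradient ascent on the input (equivalently, descent on the negated objective) maximises their disagreement plus the activation of a chosen neuron.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two toy linear "models of the same purpose"; the real system uses DNNs.
    W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))

    def scores(W, x):
        return W @ x                      # class scores for a 4-dimensional input

    def objective(x, target_class=0, neuron=2, lam=0.5):
        s1, s2 = scores(W1, x), scores(W2, x)
        disagreement = s1[target_class] - s2[target_class]   # make the models disagree
        coverage = s1[neuron]             # push a chosen (inactive) neuron's activation up
        return disagreement + lam * coverage

    def numeric_grad(f, x, eps=1e-5):
        g = np.zeros_like(x)
        for i in range(x.size):
            d = np.zeros_like(x)
            d[i] = eps
            g[i] = (f(x + d) - f(x - d)) / (2 * eps)
        return g

    x = rng.normal(size=4)                # seed input
    for _ in range(100):                  # gradient ascent on the input itself
        x += 0.1 * numeric_grad(objective, x)
    print("objective after ascent:", objective(x))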

Resource Management

The paper "Monotasks" is about performance reason-ability of data analytics system. By "reason-ability" I mean questions like what hardware/software configuration should a user use and possible reasons to poor system performance. Currently data analytics systems such as Spark provide fin-grained pipelining to parallelise use of CPU, network, and disk within each task. But this kind of pipelining is the exact reason that we cannot reason the system performance, because tasks have non-uniform resource use, concurrent tasks on a machine may contend, and resource use may occurs outside the control of the system.

The basic idea of monotasks is that jobs are decomposed into units of work that each use exactly one resource (CPU, network, or disk), so that resource usage is uniform and predictable. These work units form a DAG, and each kind of resource has its own scheduler. The decomposition happens when jobs arrive on the worker machine.
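
As a rough sketch of that decomposition (my own toy code, not the paper's scheduler), each monotask names exactly one resource, the dependencies form a DAG, and a trivial per-resource queue runs one task at a time:

    from collections import deque

    class Monotask:
        def __init__(self, name, resource, deps=()):
            self.name, self.resource, self.deps = name, resource, list(deps)
            self.done = False

    # Decompose one "map task": read a block from disk -> compute on CPU -> send over network.
    read    = Monotask("read block",  "disk")
    compute = Monotask("compute",     "cpu",     deps=[read])
    send    = Monotask("send result", "network", deps=[compute])
    tasks = [read, compute, send]

    queues = {"cpu": deque(), "network": deque(), "disk": deque()}

    def schedule(tasks):
        remaining = list(tasks)
        while remaining:
            for t in list(remaining):
                if all(d.done for d in t.deps):    # DAG dependencies satisfied
                    queues[t.resource].append(t)
                    remaining.remove(t)
            for res, q in queues.items():          # one scheduler per resource,
                if q:                              # running one monotask at a time
                    t = q.popleft()
                    t.done = True
                    print(f"{res}: ran {t.name}")

    schedule(tasks)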

What I find most interesting in this paper is the idea of "going back": eliminating pipelining and sacrificing some performance for reasonability. The implementation also doesn't require modifying user code.

Security

The other best paper, "Efficient Server Audit", revisits the classic topic of execution integrity with a newly-defined efficient server audit problem: since you have no visibility into AWS, how can you be sure that AWS (the executor) is executing the actual application you've written? Specifically, the verification algorithm (the verifier) is given an accurate trace of the executor's inputs and delivered outputs. The executor gives the verifier reports, but these are untrusted. The verifier must somehow use the reports to determine whether the outputs in the trace are consistent with having actually executed the program. The problem is to design the verifier and the reports.

In the proposed solution, requests that induce the same control flow in the executor are grouped, and the verifier re-executes each control flow group as a batch. Re-executed read operations are fed from the most recent write entry in the logs, and the verifier checks logged write operations opportunistically, with the assurance that the alleged operations can be ordered consistently with the observed requests and responses. Finally, the verifier compares the output produced for each request with that request's output in the trace, across all control flow groups.
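
The following is only a schematic sketch of that verifier loop (hypothetical names, nothing from the paper's implementation): requests are grouped by the reported control flow, each group is re-executed with reads answered from the logged writes, and the produced outputs are compared against the trace.

    def verify(trace, reports, program):
        # trace: list of (request, observed_output); reports: request -> control-flow id
        groups = {}
        for req, out in trace:
            groups.setdefault(reports[req], []).append((req, out))

        log = {}                                   # key -> most recent logged write
        for flow_id, batch in groups.items():      # re-execute one control-flow group at a time
            for req, observed in batch:
                produced = program(req,
                                   read=lambda k: log.get(k),                # feed reads from the log
                                   write=lambda k, v: log.__setitem__(k, v))
                if produced != observed:           # output inconsistent with the trace
                    return False
        return True

    # Toy "program": records the request and echoes it back.
    prog = lambda req, read, write: (write("last", req), "echo:" + req)[1]
    print(verify([("a", "echo:a"), ("b", "echo:b")], {"a": 0, "b": 0}, prog))   # True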

Data Analytics

Stores for temporal analytics are crucial to many important data analytics applications, such as recommender systems, financial analysis, smart homes, etc. "SummaryStore" is for large-scale time-series analysis. This approximate time-series store aims to provide high query accuracy while keeping storage extremely cost-effective.

Mainly three techniques are used here: 1) it maintains compact summaries through aggregates, samples, sketches, and other probabilistic data structures (e.g. Bloom filters) instead of raw data; 2) based on the observation that newer data often conveys more information, it defines a novel time-decayed approximation scheme; 3) some special events are stored separately, as-is, in raw format rather than being summarised. The appendix provides a comparison of different decay functions and an estimation of the query error bounds.
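
As a very rough sketch of the decay idea (my own toy code, not SummaryStore's): keep only aggregates per window, and merge adjacent windows of equal size so that window sizes grow exponentially toward the past (1, 1, 2, 4, ...), i.e. older data is summarised more coarsely.

    class Window:
        def __init__(self, values):
            self.count = len(values)
            self.total = sum(values)
            self.members = set(values)            # stand-in for a Bloom filter

        def merge(self, other):
            merged = Window([])
            merged.count = self.count + other.count
            merged.total = self.total + other.total
            merged.members = self.members | other.members
            return merged

    windows = []                                  # oldest window first, newest last

    def append(value):
        windows.append(Window([value]))
        # Whenever two adjacent windows have equal size, merge them, so window
        # sizes grow exponentially toward the past.
        i = len(windows) - 1
        while i > 0 and windows[i - 1].count == windows[i].count:
            windows[i - 1] = windows[i - 1].merge(windows.pop(i))
            i -= 1

    for v in range(11):
        append(v)
    print([w.count for w in windows])             # [8, 2, 1]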

Privacy

A common theme in the Privacy session is "shuffle".

Atom is an anonymous messaging system that both scales horizontally like Tor and provides clear security properties under precise assumptions, like DC-net-based systems. The threat model assumes a global adversary and malicious servers. The main techniques are onion messages and random output permutation.

Interestingly, the next paper, Stadium, focuses on the exact opposite application area: one-to-one communication, compared with the broadcast style of Atom. The main problem this work wants to solve is how to hide communication metadata, and it scales horizontally. The work is based on the SOSP'15 paper Vuvuzela, whose problem is that every server has to handle all messages; the challenge here is how to distribute the work across untrusted servers. The main design combines collaborative noise generation, message shuffling, and a verifiable parallel mixnet processing pipeline.

Prochlo from Google proposes an ESA system architecture (encode, shuffle, analyse) for large-scale online monitoring of client software behaviour, which may entail systematic collection of user data. One interesting point is that, to strengthen the proposed ESA, this work proposes Stash Shuffle, a shuffling algorithm based on Intel's SGX. One of the questions raised at the conference was: "Why is trusting Intel better than trusting Google?" :)
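
A toy end-to-end view of the encode-shuffle-analyse split might look like the following (purely my illustration; the real pipeline does much more, e.g. batching and running the shuffler inside SGX): the shuffler's random permutation is what breaks the link between a report and the identity or arrival order of its sender.

    import random

    def encode(user_id, value):
        return {"value": value}            # strip the identifier at the edge (E)

    def shuffle(reports):
        batch = list(reports)
        random.shuffle(batch)              # break linkage between sender and report (S)
        return batch

    def analyse(batch):                    # the analyser only ever sees the shuffled batch (A)
        counts = {}
        for r in batch:
            counts[r["value"]] = counts.get(r["value"], 0) + 1
        return counts

    reports = [encode(uid, v) for uid, v in [(1, "a"), (2, "b"), (3, "a")]]
    print(analyse(shuffle(reports)))       # e.g. {'a': 2, 'b': 1}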

Scalability

Facebook's SVE system is deployed in production for uploading and processing videos at large scale, with low latency to support interactive applications, reliability, and robustness. The key idea is very simple: process one part of a video while another part is still uploading, in a pipeline. Even so, it has shown good performance compared to Facebook's previous MES system. This is indeed one of the advantages of "being large scale".
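
The overlap can be pictured with a small sketch (hypothetical code, not Facebook's): while one chunk of the video is being processed, the next chunk's upload is already in flight.

    import concurrent.futures as cf
    import time

    def upload(chunk):
        time.sleep(0.1)                    # pretend network transfer
        return chunk

    def process(chunk):
        time.sleep(0.1)                    # pretend re-encoding
        return f"encoded-{chunk}"

    chunks = ["c0", "c1", "c2", "c3"]
    results = []
    with cf.ThreadPoolExecutor() as pool:
        pending = pool.submit(upload, chunks[0])
        for nxt in chunks[1:] + [None]:
            uploaded = pending.result()
            if nxt is not None:
                pending = pool.submit(upload, nxt)   # next upload overlaps...
            results.append(process(uploaded))        # ...with this chunk's processing
    print(results)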

Kernels

Here is a paper quite related to unikernels: "My VM is lighter than your container". This paper "find[s] and eliminate[s] bottlenecks (such as image size) when launching large numbers of lightweight VMs (both unikernels and minimal Linux VMs)". The results show a 4 ms boot time for a unikernel-based VM, while a Docker container takes 150 ms to boot. This system provides tools to automatically build minimised images, and it also replaces the Xen control plane so that front-end and back-end drivers can communicate directly through shared memory. This paper reminds me of "Mosaic" from EuroSys'17, which also tries to fit more computation units into one computer.

Other Sessions

Obviously we cannot (or, more specifically, I cannot) cover the essence of a SOSP paper in one or two paragraphs, and there are many other great papers not mentioned here. If you're interested, please visit the program to see the full list of accepted papers.

One more thing...

SOSP this year also hosted the ACM Student Research Competition. 16 entries were selected from 42 submitted abstracts, and 6 were further selected to give a 5-minute presentation. The presentations were surprisingly interesting. Please see the list of winners (actually all 6 participants, divided into graduate and undergraduate groups) here.

Also, this year's female participation rate was 6.6% for authors and 7% for the Program Committee, both slightly lower than the previous three years' numbers.

23 Apr 2016

Eurosys 2016

Posted by Jon Crowcroft

Some papers that caught my eye include:
STRADS: A Distributed Framework for Scheduled Model Parallel Machine Learning.
Looks quite clever - not sure how it would work for a Bayesian inferencer, but it made me think.

Increasing Large-Scale Data Center Capacity by Statistical Power
Uses MORE servers to reduce power - a clever control-theory approach - evaluated on Baidu traces from a real, large system.

A High Performance File System for Non-Volatile Main Memory.
seems solid

Crayon: Saving Power through Shape and Color Approximation on Next-Generation Displays.
neat, but niche - OLED laptop displays consume less power if you render stuff cleverly - nice bit of human factors driven algorithm design to minimise impact on perceived image quality - gets 56% power saving on tablet with little subjective impact

A Study of Modern Linux API Usage and Compatibility: What to Support When You're Supporting.
Best paper award - nice talk - fun....

BB: Booting Booster for Consumer Electronics with Modern OS.
Basically, Samsung's smart TV boots a lot faster because they hacked it a lot. (I have one, and replaced an LG with it, and it's true :)

TFC: Token Flow Control in Data Center Networks.
Is basically isarithmic flow control (an idea from Donald Davies' 1960s packet-switched networking) done right :)

JUGGLER: A Practical Reordering Resilient Network Stack for Datacenters.
Uses an offload engine and other stuff to do a very solid job of putting packets in the right order for TCP (where out-of-order delivery was caused by load balancers).

Flash Storage Disaggregation.
what it says on the tin

Shared Address Translation Revisited. evil question about reverse page->structure mapping in linux - how to figure out which process to go to with shared stuff...

POSIX Abstractions in Modern Operating Systems: The Old, the New, and the Missing. - hopelessly optimistic but engaging speaker:)

All findable via
http://eurosys16.doc.ic.ac.uk/program/program/

28 Aug 2015

Sigcomm 2015

Posted by Jon Crowcroft

A number of us attended the ACM Sigcomm 2015 conference in London, which was a very well managed affair - hopefully next year (in Brazil) will be as good.

Two things of note here:
1/ Heidi Howard won the Student Research Competition.
2/ There was an interesting debate around net ethics, which George Danezis et al. blogged about.

14 Jan 2015

Programming Languages Mentoring Workshop (PLMW)

Posted by hh360

Good morning from POPL 2015 in Mumbai, India.

Throughout this blog post, * denotes a notation-heavy segment of the talk. These can be difficult to summarise quickly without typesetting. See the author's paper online for now, and I'll try to find the speaker's slides at a later time.

PLMW Organizers - Intro

This is the 4th PLMW workshop and we are very pleased to see you all. Mentoring workshops are becoming much more common. This year we have funded 75 students. We would like to thank the speakers, sponsors and organisers (including Annabel :)

Video from Derek

POPL proceedings are now available online


 

Nate Foster (Cornell University) - You and Your Graduate Research

Prelude:

- doing a PhD is not easy, and many do not finish. This is a survivor's story
- some of my comments may be US-specific
- acknowledgements (including Jorge Cham from PhD comics :))

Motivation

This is useful for everyone, not just undergraduates.

Why you shouldn't do it:
- money: Nate shows a graph of university salaries; football coaching is clearly the right way to go for a big salary.
- start-ups: Tech start-ups are now a hot topic (Oculus VR is the example), and you don't need a PhD for this.
- respect: Education is highly valued and degrees bring respect, but no one will call you doctor except your mom.
- to stay in school: Undergrad was fun so let's stay in school - but a PhD programme is totally different.
- being a professor: Let's all become professors - a demonstration of the number of academic posts versus other careers.

Why you should?
- opens opportunities
- cool jobs in industry: leadership, work on cool problems, you can work on large systems and with real users
- freedom: applies on any level, freedom to choose interesting problems, opportunities to work on the big problems

What is a PhD?

Comparing degrees at different levels:
- high school: basics for life
- undergrad: broad knowledge in a field
- professional: advanced knowledge and practical skills
- PhD: advanced knowledge and a research contribution.

A PhD is open-ended.

A PhD is a transformation - we start with intelligent people who aren't researchers. It's an apprenticeship-based route to becoming a researcher. It's not an easy or painless process.

PhD Success

Pick an institution
The community that you're in will shape your work.
Very important factors include advisor, opportunities and peers; typically less important factors are finances, institution and location.

Dive into Research
Don't be distracted by other things e.g. classes and teaching
Pick great problems

"It is better to do the right problem the wrong way than the wrong problem the right way" - Hamming

Stay Engaged
Working with others makes it easier, as you can motivate each other.
Know when it's time to switch topics - good researchers are versatile and able to switch quickly.
Independence - you need to know when the time is right to go rogue and ignore your adviser.
Reaching escape velocity to graduate


 

Peter Müller (ETH Zürich) - Building automatic program verifiers

verification: given a program P and a specification S, prove that all executions of P satisfy S
automatic verification: develop an algorithm that decides whether P satisfies S

We can't prove halting => we are finished :)

Lets start again ...

Peter demonstrates a verification tool called Viper, using the typical account balance and transfer example. The verifier gives two errors; we add preconditions and the verifier succeeds. We then use the verifier for thread safety, again fixing a concurrency bug with locking.
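
This is not Viper syntax - just a Python-flavoured sketch of the kind of contract the demo adds. Without the preconditions, a verifier can report that the transfer may violate its postcondition (e.g. for a negative amount or insufficient funds).

    def transfer(source, target, amount):
        assert amount >= 0, "precondition: amount is non-negative"
        assert source["balance"] >= amount, "precondition: sufficient funds"
        old_total = source["balance"] + target["balance"]
        source["balance"] -= amount
        target["balance"] += amount
        # postcondition: money is conserved across the two accounts
        assert source["balance"] + target["balance"] == old_total

    a, b = {"balance": 100}, {"balance": 20}
    transfer(a, b, 30)
    print(a, b)                       # {'balance': 70} {'balance': 50}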

Automatic verification is not black and white. We can have semi-automatic verification, where we guide the verifier. The complexity of the code varies greatly; most successes in the area focus on a small scope. We can build verifiers that give weaker guarantees. Verification may or may not work modularly. The complexity of the properties also varies.

7 Stage process

Define the research goals
What is the state of the art?
Find examples in the problem space that are not currently handled (well) by current verifiers.

Find Reasoning Principle
How would you explain the correctness of the code to a friend? For a loop, we would use a loop invariant.
This is very different to techniques like model checking; I wouldn't say to a friend "I have checked 15 billion possible execution paths and they are all valid".

Break Down Arguments
Decompose the correctness argument: what do I need to prove for each program, and what can I reuse between programs? How can I modularise the program?

Design Specification Language
Designing a language to express intended behaviour, we must know who the user will be and their level of experience. Find the compromise between expressiveness and simplicity that is right for the intended audience.

Design Program Logic
Determine which properties need to be checked and which properties may be assumed.

Automate Proof Search
Utilize existing infrastructure like SAT and SMT solvers. Develop decision procedures for aspects not already covered by existing infrastructure.

Evaluate the Solution
Firstly, an experimental evaluation: does the solution work on the example code you were targeting? Then the meta-theory, like the soundness proof.

Research Directions
Either develop new techniques for new languages and features (e.g. the recent interest in event-based programming due to Android), or work on general infrastructure for use in many areas.


 

Frank Pfenning (Carnegie Mellon University) - Proof theory and its role in programming language research

I will show you why every PL researcher needs a bit of proof theory.

How do we write correct programs ?
We don't :)

In practice, programming and informal reasoning go hand in hand: we use mental logical assertions, e.g. after calling sort, the list is sorted. It's vital to decompose the problem into parts so we can reason locally.

We need to develop programming languages with the logic of the programmer in mind. Think about your least favourite programming language and why its operational or logical model doesn't agree with you. Reasoning is an integral part of programming. We need to co-design the language and the logic for reasoning about it.

Logic is computation, so the key is to design coherent logical and operational semantics.

Examples:

Computation first...
- runtime code generation => IS4
- partial eval => temporal logic
- dead code elimination => modal logic
- distributed computation => IS5
- message-passing
- concurrency => linear logic
- generic effects => lax logic

... and the logics first
- lax logic => ??
- temporal logic => ??
- epistemic logic => ??
- ordered logic => ??

Key ingredients.
Understand the difference between judgements and propositions, and the basic styles of proof systems, e.g. natural deduction, sequent calculus, axiomatic proof systems and binary entailment.

Example: Hypothetical Judgement*

Example: Runtime Code Generation

We have the source expression at runtime. We distinguish ordinary variables, which are bound to values, from expression variables, which are bound to source code.*


 

Stephanie Weirich (University of Pennsylvania) - How to write a good research paper

Stephanie is giving the popular talk "How to write a good research paper" from Simon Peyton Jones.

Start by writing the paper - it focuses us. Write a paper about any idea, no matter how insignificant it seems; then you develop the idea.

Identify your key idea; the paper's main goal is to convey your idea. Be explicit about the main idea of the paper - the reviewer shouldn't need to be a detective.

The paper flow:
- Here is a problem
- It's an interesting problem
- It's an unsolved problem
- Here is my idea
- My idea works

The introduction - describe the problem and what your paper contributes towards the problem. Don't describe the problem too broadly too quickly. It's vital to nail the exact contributions.

The related work - it belongs after the main body of the paper, not straight after the introduction. Be nice in the related work. Be honest about the weaknesses of your approach.

Use simple, direct language, putting the reader first. Listen to readers.


Damien Pous (CNRS, LIP, ENS Lyon) - Coinductive techniques, from automata to coalgebra

Checking language equivalence of finite automata

A demonstration of the naive algorithm to check equivalence by walking through the states and comparing. This algorithm is looking for a bisimulation, and it has quadratic complexity. Instead we can use the HK (Hopcroft and Karp) algorithm.
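
Roughly, the naive algorithm explores pairs of states reachable on the same inputs and fails as soon as a pair disagrees on acceptance; the set of visited pairs is the candidate bisimulation. Below is a small sketch of my own, for deterministic automata; HK speeds this up by keeping states in a union-find structure so that pairs already known to be equivalent are not re-explored.

    def equivalent(dfa1, dfa2, alphabet):
        # each dfa: (start state, set of accepting states, dict (state, symbol) -> state)
        start1, accept1, delta1 = dfa1
        start2, accept2, delta2 = dfa2
        visited, todo = set(), [(start1, start2)]
        while todo:
            p, q = todo.pop()
            if (p, q) in visited:
                continue
            if (p in accept1) != (q in accept2):
                return False                      # this pair disagrees on acceptance
            visited.add((p, q))
            for a in alphabet:
                todo.append((delta1[(p, a)], delta2[(q, a)]))
        return True                               # visited is a bisimulation

    # Both automata accept strings containing an even number of 'a's.
    d1 = ("e", {"e"}, {("e", "a"): "o", ("o", "a"): "e"})
    d2 = ("0", {"0", "2"}, {("0", "a"): "1", ("1", "a"): "2",
                            ("2", "a"): "3", ("3", "a"): "0"})
    print(equivalent(d1, d2, ["a"]))              # True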

[I got a bit lost in the theory of coalgebras for the rest of the talk, sorry]*

25 Nov 2014

New Directions in Operating Systems

Posted by dwws2

Notes from the New Directions in Operating Systems one-day conference at Shoreditch Village Hall in London.

Rump kernels and {why,how} we got here

Antti Kantee, Fixup Software

Writing drivers and filesystems is hard, so we want to reuse this existing code and continue to benefit from updates. A rump kernel runs on a hypervisor: a normal OS kernel with most of the code removed. It does not provide threads, a scheduler, exec, or virtual memory, and can therefore run anywhere and integrate with anything. Some glue code connects it to the hypervisor. The application can add libc, etc., if desired. The syscall interface contains much useful logic (e.g. path resolution), so we want to keep that too.

Was able to compile to JavaScript and debug driver using FireBug. Fetched a web-page in the Linux kernel. Also runs on Xen (Mini-OS) and bare metal.

http://rumpkernel.org/ (BSD license)

Where are they useful?

  • e.g. in embedded systems: reduces complexity

Genode -- OS security by design

Norman Feske, Genode Labs

Laptop boots to GENODE desktop environment. Aims:

  • Reduce TCB (Linux has a very large attack surface, lots of code running with high privilege).
  • Accountability and utilisation. Linux tries to provide an illusion of unlimited resources, with hacks like the OOM killer.
  • Usable security. e.g. SE-Linux is unusable.

A GENODE system is structured as a tree, with parents owning their children. Parents provide resources to their children from their own resources and control what the children can do. Each application only needs to trust those components it depends on (e.g. its ancestors and services it uses).

Combined with virtualisations, can run e.g. Linux and BSD as processes. Also takes components from many places: Linux TCP stack, Rump kernel filesystems, etc, each running as a GENODE process.

Security is based on object capabilities. E.g. a child can tell its parent that it provides a service (e.g. the GUI), and other children can retrieve it. For example, xterm asks for access to the GUI. Its parent, the user session, tags the request with the application name and passes it to its parent (init), which forwards it to the GUI. Once the capability is granted, the application can use it directly without going via its parents.
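
A toy model of that routing (not Genode code; all names are made up) could look like this: each node tags the request with the requester's name and forwards it to its parent until some node can answer with a session capability, which the client then uses directly.

    class Service:
        def __init__(self, name):
            self.name = name
        def session(self, label):
            return f"capability<{self.name}, label='{label}'>"

    class Node:
        def __init__(self, name, parent=None):
            self.name, self.parent, self.services = name, parent, {}
        def announce(self, service):
            self.services[service.name] = service
        def request(self, service_name, label=""):
            # each hop tags the request with the requester's name before forwarding
            label = f"{self.name} -> {label}" if label else self.name
            if service_name in self.services:
                return self.services[service_name].session(label)
            return self.parent.request(service_name, label)

    init = Node("init")
    init.announce(Service("GUI"))
    user_session = Node("user session", parent=init)
    xterm = Node("xterm", parent=user_session)
    print(xterm.request("GUI"))       # the returned capability is then used directly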

Processes can trade resources. For example, a client can make a request on a server and pass the server the resources needed for the request.

VirtualBox has been ported (shows it booting Windows). Not optimised yet, but was able to play a video under it (slightly jerky).

Using a modified libc, can run a Unix environment without needing a virtualised kernel. Uses Vi to edit the GENODE background, showing integration with the rest of the system.

Runs a web browser (Qt app). The File Open dialog shows only a few files - the browser can't access anything else. Browser can run GENODE processes as (isolated) plugins. Uses a nested GENODE GUI to run a 3D demo. The plugin is also isolated from the browser - the browser can't get key presses from it (not sure how this works).

Current focus is making it usable as a dev system, adding a capability GUI.

Eating own dog food
- noux (gcc, vim, bash coreutils)
- wireless networking

Capability-based UI

seL4 as base platform

ARM virtualization

package management

Q: How to revoke capabilities?
A: Destroy subsystem or destroy cap referent object.

New ideas about old OS security

Robert Watson, Cambridge University, FreeBSD

1990s: MAC and DAC. Orange book. Auditing. Slow, poor usability.

computer security in 2014

- embedded
- ubiquitous
- many computers per person rather than many users per computer
- security still neglected
- software liability still disclaimed
- lots of malware

Lots of access control models. Hard for OS vendors to decide what to implement. => Linux LSM, FreeBSD MAC, etc. LSM not very composable, but MAC is. If policies disagree, it uses the most restrictive one.

Filtering system calls is not a good approach, due to concurrency. Instead every kernel component talks to the MAC framework. Attach labels to many kernel objects. They are useful for many different security frameworks.

MAC originally controlled access by users, but on modern devices it’s applications that are the key subjects.

Debugging security is hard. Need good tracing. DTrace useful here.

Capsicum. Applications get compromised; we want to contain the attacker. Object capabilities for FreeBSD (extends POSIX by removing ambient authority). File descriptors become capabilities. Now a production feature in FreeBSD 10. Linux patches have been posted to linux-kernel. Linux seccomp required 11,300 lines of code, compared to 100 with Capsicum (doing what?).

Libraries can use sandboxing even when their application doesn’t.

Compartmentalization is Neat!
- but different ways to break things up
- has overhead
- hardware does this badly
- so, put capability support into the architecture!
- now, userspace sandboxing with little os intervention
- BERI/CHERI
- can we do fine-grained cpu-supported compartments and not touch kernel?
- How does tagged memory interact with virtual-memory systems?
- How should syscalls work? Should we block them?
- How do signals work?

CHERI downloadable now!

Conclusion
- Blurring boundaries to support application needs
- FreeBSD OS book (second edition)
- Saltzer and Schroeder

CHERI: Extending processors with better support for security. Open Source. Can be run on FPGA at home.

Managing configuration for future operating systems

Gareth Rushgrove, Puppet Labs

Any input to infrastructure is configuration

Configuration management is about managing those inputs over time

History
Emerging Patterns
- immutable infrastructure
- infra APIs
- Autonomous systems
- Simpler hosts

Future infrastructure as code

History

50s research, 60s 480 series, 1991 MIL-HDBK-61 (conf mgmt guidance),
1998 ANSI-EIA-649?

Identification
Control
Status accounting
Verification and audit

Configuration management verifies that the system is identified and documented in detail and performs as intended.

Infrastructure as code

Ruby puppet SSH config example

Immutable Infrastructure
- Build once, run many times
- Amazon Machine Images
- E2E automation to avoid the golden image problem (heavyweight build requirements but no tracing)
- AMI build, audit, and trace cycle is faster - a fundamental change

Containers
- Docker is the UI
- LXC was painful
- Usability matters
- We have the primitives for immutable infrastructure but toolchains are coming up

How immutable are your Docker containers?
- Not at all, by default. You must impose immutability.

Infrastructure with APIs
- Cloud providers
- OSv machine APIs
- Network and storage, too (not just compute)
- Not just *nix, windows, too
- I love PowerShell.

Configuration at a distance
- No physical access.
- No remote access.
- Not user-like. Programmatic.

Increasingly managing higher-level systems
- Not individual machines
- Servers are cattle not pets (fields and farms)
- Autoscaling groups (AWS)
- Mesos is a system which does this aggregation
- An operating system for a data center
- Kubernetes
- An ocean of user containers
- Scheduled and dynamically packed into nodes

Simpler hosts
- Combinatorial package explosion
- Management of this package configuration is hard
- Project Atomic addresses this
- OSTree "git for operating system binaries"
- atomic config example
- CoreOS
- For container operating system (etcd, initd)
- "firmware for running containers" ~ John Vincent lusis.org

Moving configuration from host to network
- etcd, consul, zookeeper

Future infrastructure as code
- From
- host centric
- localized
- executable for integration
- To
- Cluster centric
- Distributed
- HTTP for integration

Going from Puppet to etcd
- Puppet is a DSL for describing infrastructure
- provides a graph abstraction
- where similar interfaces exist, provides abstractions
- e.g. key value store
- garethr/key_value_config

Going from etcd to Puppet
- e.g. writing to disk

Software installation done
More interesting to control installed software
- start a docker container
- run app on mesos
- set up security groups on AWS
- digitalOcean

I want a pony
- Managing an autoscaling CoreOS/Atomic cluster in AWS
- with etcd/consul
- immutable instances
- with the network in VPC/Weave
- with docker container arranged by Kubernetes
- All from Puppet DSL

Conclusions
- Future here but not evenly distributed
- These tools look like the future
- Manage not just provision
- In Search of Certainty

Exploring a new way to manage systems with ostree and Atomic project

Michael Scherer, Red Hat

Ostree: read-only system in /usr (with symlinks to /var for mutable stuff). To upgrade, you must reboot. Can also reboot to old version. "Git for filesystem" (can branch, but not diff).

Want to protect the base system from containers for OpenShift. Also, protect containers from each other. Uses SE-Linux for this (svirt).

Example: setting up a static web service, plus wiki and cache. Will put everything in a different container. Web designer used Fedora 20, so want to use the same. Want to isolate mediawiki from the network, except for the cache service. Using S3 storage for mediawiki. Want to make sure that updates are tracked, not just hacks on the production server. Everything is described in a JSON file (not shown).

Tribblix: adventures with illumos

Peter Tribble, Tribblix

Good features: ZFS, DTrace, Zones, compatibility.

Some cruft in the code base, bits not open sourced, things that need to be brought up-to-date.

Various distros: OpenIndiana, OmniOS (servers), SmartOS, Delphix (storage), etc.

But didn't like them. Wanted to understand how an OS really works. => Tribblix.

Focus on being retro, lightweight. Uses zones for app deployment.

Problems: not enough time or people. Fragmentation (work only being done at the distro level), SPARC port (about 3 users), cgo (C Go bindings) not working.

Rumprun for Rump Kernels: Instant Unikernels for POSIX applications

Martin Lucina, Lucina & Associates

Rump kernel plus:

  • threads, scheduling, MM, locking, events (from hypervisor)
  • libc
  • build system to make existing code work without changes
  • deployment: rumprun (configure network, mount FSs, platform-specific launching)

e.g. "rumprun xen ...". Runs as PV Xen guest using Mini-OS. Bare metal in progress.

Demo: compiles to 20MB hello world (debug build).

Demo: builds the same project once for Unix and once for Xen. Passes parameters for network and to map host files to guest NS.

Lots of changes to Mini-OS; needs upstreaming.

An introduction to userland networking

Franco Fichtner, Packetwerk

Currently, there are no standards in this area. Reasons to do it include: speed, keeping complex code out of the kernel, using user-space libraries, binary libraries from vendors, and debugging.

gettimeofday slows things down. So, push the timestamp into the packet metadata and pass it through the stack.

zero copy: share memory between kernel and user. DPDK is stable. PF_RING has nice API. Linux only / GPL. Also, Intel only. Netmap (FreeBSD; patches exist for Linux).

Hard to play around with transport layer if you have to recompile the kernel every time.

Example: switch the positions of the TCP port and sequence number header fields. If both endpoints are modified they will still work, but monitoring systems on the path will find it difficult to track them (it looks like the port keeps increasing).
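
For illustration only (my own sketch, not the talk's code): a userland stack could rewrite the header layout like this. Both modified endpoints parse the same non-standard layout, while an on-path monitor reading standard offsets sees a "source port" that is really the sequence number.

    import struct

    def swap_port_and_seq(tcp_header: bytes) -> bytes:
        # standard layout: src port (2), dst port (2), seq (4), then the rest
        src, dst, seq = struct.unpack_from("!HHI", tcp_header, 0)
        rest = tcp_header[8:]
        # non-standard layout: 4-byte sequence number first, then the two ports
        return struct.pack("!IHH", seq, src, dst) + rest

    # src=1234, dst=80, seq=42, ack=0, offset/flags, window, checksum, urgent ptr
    header = struct.pack("!HHIIHHHH", 1234, 80, 42, 0, 0x5010, 65535, 0, 0)
    print(swap_port_and_seq(header).hex())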

Weave: Myths of the New OS

Alexis Richardson, Zettio

Weave: software defined network, for networking containers. Focus is ease of use.

Docker has changed things: now a container is what you ship.

New myths (like Sun's old networking myths):

  • App lifecycle is indep of OS. An app is a single-purpose OS.
  • Distributed systems need new tools and skills. We can now actually go back to older tools. e.g. DNS, DHCP.
  • Apps must be rewritten.
  • Platforms are better at building your app.
  • You need an all-inclusive system. People want pluggability.
  • Dev/test tools are different from production tools.
  • Coordination is a platform concern. Transactions should be handled by the app.

Network Stack in Userspace

Hajime Tazaki, University of Tokyo

It takes a long time for a new network feature (e.g. a TCP option) to get deployed and widely used. In the filesystem world, FUSE makes it easy to test new features, even though performance isn't great.

Options:

  • Containers: but, the network stack is still shared with the host
  • Library OS

NUSE is a library OS version of a normal network stack. Patch to the kernel tree adds new arch "arch/sim". Hijacks application's system calls to redirect to NUSE so that existing apps don't need to be changed.

Possible uses for NUSE: developing new protocols, process-level network virtualisation. Currently, very few applications work unmodified (ping etc).

Jitsu: Just-in-Time Summoning of Unikernels

Anil Madhavapeddy, OCaml Labs

In distributed systems, complexity is the killer. We need to get rid of all these layers. We all depend on infrastructure that suffers security breaks again and again. Let’s get rid of POSIX and just keep network protocols as the standards.

Example: the slide stack is being served from a little ARM box (attached to the laptop by a network cable).

Although Mirage often runs on Xen, Mirage is more general than that; e.g. there is a JavaScript backend. You can start by building against Unix with Linux sockets, then remove this and use direct stacks.

Mirage aims to be contained, compact and efficient. And type-safe.

Each Xen guest has a flat memory layout and uses a single core. We trust all the code running in the unikernel, relying on type-safety, and try to protect against external threats.

Some common unikernels (DNS server, web server, etc) are around half a MB (smaller with dead-code elimination). Unikernels boot fast enough that they can be used like processes. Small enough to store the binaries in Git. Example: git pull to update web-site. Can git bisect to track down problems. Can detect a commit to a Git repository fixing a security bug and recompile within seconds.

Mirage defines abstract module types for devices, network flows, etc. There are about 80 libraries implementing these protocols.

OCaml is a good choice because of its module system. Like C++ templates but with more type safety.

http://nymote.org is a project building various services using Mirage.

Example: OCaml video decoder compiled to JavaScript. The Mirage JavaScript port replaces the entire HTTP library with JavaScript requests.
Example: TLS echo service running on ARM board, all implemented in OCaml.

CentOS Linux: A Continuously integrating platform

Karanbir Singh, CentOS

CentOS has thousands of extra tests, performed on every build. Built by ops people, not developers.

It's the MySQL developers' responsibility to check their code, but CentOS's responsibility to make sure it links against the right versions of its libraries.

They have a custom build system called Reimzul. A system of triggers will e.g. run the tests for MySQL when openssl changes. Can automatically find code that changed and map it to tests that failed.

Also test external uses of CentOS. e.g. check that WordPress and Drupal still work after a CentOS change.

Also sanity checks: e.g. new version number is higher than previous.

128 physical machines, 768 cores.

How to program computers (kos)

Geo Carncross, Telemetry

Kos is a tiny OS written in K. Very fast by avoiding bloat. Terse syntax. KDB database written in K. Violates what is generally considered best practices in programming - food for thought.

DIOS, a distributed operating system for your data centre

Malte Schwarzkopf, University of Cambridge

Too many abstractions are bad for scalability, scheduling, data locality, data-flow tracking, security, and optimisations. Get away from POSIX - make an API for datacentres. A distributed OS.

DIOS adds 11 syscalls to POSIX. UUIDs for global object names. Kernel provides objects, but provides additional data about them. e.g. where stored. Can be used for optimisations. DIOS runs in the Linux kernel, but is portable enough that it could be ported to BSD.

Example: loads dios kernel module and uses it for map reduce (word counting). Could run across machines, but for demo all running on one laptop.

Status: alpha

Currently no libc; must write to syscall API. Working on a Rust API.

seL4 microkernel: Leave nothing to chance: building high-assurance software systems

Bernard Blackham

Aim: eliminate all bugs through formal methods. Decompose system into small pieces and verify them. This requires strong isolation.

seL4 is a high performance ARM+x86 microkernel. ~10 KLOC. Also suitable for real-time systems.

Kernel is written in C and compiled with gcc. Every C function has an abstract specification: will terminate, will not crash, will not modify other memory, etc.

  • Proof that the spec provides integrity, confidentiality and availability.
  • Proof that the C implements the spec.
  • Proof that the machine code implements the C. Could use a certified C compiler, but instead they use a solver to prove the binary is correct (requires human help).

Can generate driver code automatically from specs (done for disk controllers, network cards, etc). Should be correct by construction (currently, no proof that generated drivers match spec).

Cost: about $200-$400 per line of code. 25 person years for the kernel.

Isolation should now let us scale up.

Code, specs and proofs are on github.

Trying to make an OS popular: a cautionary tale

Gerry Carr

Experience doing marketing for Ubuntu (to 2012). Ubuntu's founding goal was originally to fix bug #1 - the dominance of Windows. Wanted to centralise everything around Ubuntu/launchpad and bring open source to the masses.

Plan was Canonical would work closely with manufacturers.

Problems:

  • OSS communities suspicious as launchpad not open source.
  • Git beat bzr.
  • PC makers scared of MS.
  • Intel had nothing to gain.
  • Public indifferent. Hard to interest people outside the core enthusiasts.
  • Ubuntu diversification confused people.

Lessons:

  • Focus on a clear message.
  • Beware of big companies setting the direction. e.g. Ubuntu-for-TV is not a community driven idea.
  • Don't rely on other people to finish your project when it's not their key focus.

What should have been done differently?

Ubuntu cloud and server could have been split out.
