syslog
13 Apr 2011

EuroSys 2011, day three

Posted by Derek Murray

Session 7: Better Clouds

Kaleidoscope: Cloud Micro-Elasticity via VM State Coloring

The problem is that load on internet services fluctuates wildly throughout the day, but the bursts are very short (median around 20 minutes). Cloud providers are becoming "less elastic" (bigger VMs that stay up for longer) and cannot support such short bursts, because VMs are too heavyweight. The approach builds on VM cloning (SnowFlock), but the lazy propagation of state in SnowFlock leads to a lot of blocking after the clone (for TPC-H). Kaleidoscope's fix is to color pages according to their probable role (code vs. data, kernel vs. user, etc.), and then tune the prefetching by color (such as read-ahead for cached files). Kaleidoscope also reduces the footprint of cloned VMs by allocating memory on demand and performing de-duplication. Most server apps tolerate cloning (the only change is a new IP address for the clones), and SPECweb, MySQL and httperf work fine. The experiments involved running Apache and TPC-H. Blocking decreases from 2 minutes to 30 seconds. TPC-H takes 80 seconds on a cold Xen VM, 20 seconds on a warm one, 130 seconds on a SnowFlock clone, and 30 seconds on a Kaleidoscope clone. Based on a simulation of an AT&T hosting service, Kaleidoscope achieved 98% less overhead using a 50% smaller data center. - dgm36
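To make the idea of color-directed prefetching concrete, here is a minimal sketch in Python. It is not Kaleidoscope's implementation: the color names, the window sizes and the `pages_to_fetch` helper are all assumptions made up for illustration. The only things taken from the talk are that pages are colored by probable role, that prefetching is tuned per color (with read-ahead for cached files), and that memory is allocated on demand.

```python
from enum import Enum


class Color(Enum):
    # Hypothetical color set; the talk only mentions code vs. data,
    # kernel vs. user, and cached files.
    KERNEL_CODE = "kernel_code"
    USER_CODE = "user_code"
    FILE_CACHE = "file_cache"
    USER_DATA = "user_data"
    FREE = "free"


# Illustrative prefetch windows (in pages) per color. The numbers are
# invented; only "read-ahead for cached files" comes from the talk.
PREFETCH_WINDOW = {
    Color.KERNEL_CODE: 4,
    Color.USER_CODE: 4,
    Color.FILE_CACHE: 16,  # aggressive read-ahead for cached file data
    Color.USER_DATA: 1,    # fetch only the faulting page
    Color.FREE: 0,         # assumed: allocate locally, fetch nothing
}


def pages_to_fetch(faulting_page, colors):
    """Return the page numbers a clone would request from its parent.

    `colors` maps page number -> Color. Starting at the faulting page,
    the window extends forward while neighbouring pages share its color.
    """
    color = colors[faulting_page]
    window = PREFETCH_WINDOW[color]
    fetch = []
    page = faulting_page
    while len(fetch) < window and colors.get(page) == color:
        fetch.append(page)
        page += 1
    return fetch


colors = {
    0: Color.FILE_CACHE,
    1: Color.FILE_CACHE,
    2: Color.FILE_CACHE,
    3: Color.USER_DATA,
    4: Color.FREE,
}
print(pages_to_fetch(0, colors))  # run of file-cache pages
print(pages_to_fetch(3, colors))  # single data page
print(pages_to_fetch(4, colors))  # free page: nothing to fetch
```

The point of the sketch is only that one fault can pull in a run of same-colored neighbours, which is how prefetching could reduce the blocking that a purely lazy clone suffers.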

12 Apr 2011

EuroSys 2011, day two

Posted by Derek Murray

Session 4: Joules and Watts

Energy Management in Mobile Devices with the Cinder Operating System

Cinder is a new mobile device OS that aims to let users control their energy use and to let applications become more energy-efficient. The first abstraction is throttling, which limits the power that a particular application may draw. However, energy use is bursty, so Cinder adds a reserve buffer that allows an application to use more energy if it has been running below its maximum for a while. A process with an empty reserve will not be scheduled. To prevent hoarding of energy, the reserve drains with multiplicative decrease (e.g. 10%/sec). Reserves may be nested to, for example, isolate the energy usage of a plugin like Adobe Flash. Energy may also be ring-fenced in "virtual batteries" for uses such as emergency calls. The OS abstraction is a process launcher called "enwrap", which launches an application with an allocation of power consumption. Background applications draw power from a smaller virtual battery, to prevent unexpected power draw from applications you can't see; this is managed via a custom window manager. Development issues arose from the implementation of the HTC Dream, which uses a binary-blob shared object to interact with the secure ARM9 core and exposes the battery level only as an integer from 0 to 100. This led to concerns that future mobile phones will be harder to develop research OSs for, as there is a move towards more use of secure cores and signed code. As a result of these frustrations, they moved to implement their abstractions in Linux, giving Cinder-Linux. One challenge was IPC: it was necessary to attribute energy use in daemons to the process making the IPC request. (This was easier in Cinder due to the use of gates, based on the same mechanism in HiStar.) One application developed was an energy-aware photo gallery, which modulated its download rate depending on energy properties.
The next step is working out how to use these primitives, in terms of UI design (presenting a breakdown of energy use to users), energy modeling (they currently use a simple energy model based on offline profiling, but could use something more sophisticated, such as the approach described in the following talk), userspace code instrumentation, and running Android (Dalvik) on Cinder.
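The interplay of throttling, reserves and decay described above can be sketched in a few lines of Python. This is a toy model, not Cinder's API: the `Reserve` class, its method names and the one-second tick are assumptions for illustration. Only the behaviour is taken from the talk: energy flows in at a throttled rate, a process with an empty reserve is not scheduled, and the reserve drains multiplicatively (10%/sec here) to prevent hoarding.

```python
class Reserve:
    """Toy energy reserve (hypothetical names, not Cinder's interface)."""

    def __init__(self, decay_per_sec=0.10):
        self.level = 0.0            # joules currently banked
        self.decay = decay_per_sec  # fraction lost per second

    def tick(self, tap_watts, dt=1.0):
        """Advance time by dt seconds: decay first, then refill."""
        self.level *= (1.0 - self.decay) ** dt
        self.level += tap_watts * dt

    def runnable(self):
        """An empty reserve means the process is not scheduled."""
        return self.level > 0.0

    def consume(self, joules):
        """Spend banked energy; refuse if the reserve cannot cover it."""
        if joules > self.level:
            return False
        self.level -= joules
        return True


reserve = Reserve()
for _ in range(500):
    reserve.tick(tap_watts=1.0)

# With a 1 W tap and 10%/sec decay, the banked energy settles where
# inflow equals decay: 0.10 * level = 1 J, so level is about 10 J.
print(round(reserve.level, 3))
```

In this model the decay puts a ceiling on how much an idle application can save up, so a burst is possible but bounded, which is the trade-off the multiplicative decrease is meant to achieve.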

11 Apr 2011

EuroSys 2011, day one

Posted by Derek Murray

Session 1: Data, Data, Data

Keypad: An Auditing File System for Theft-Prone Devices

The challenge is that mobile devices are prone to theft and loss, and encryption alone is not sufficient: people have a habit of attaching the password to the device on a post-it, and encryption is vulnerable to social and hardware attacks. The aim is to know what (if any) data has been compromised in the event of a loss, and to prevent future compromises. The solution is to force remote auditing on every file access (with encryption) by storing keys on the auditing server; this is done in the file system. File system metadata are stored on the trusted server. There are significant challenges in making this performant: caching, prefetching and preallocation are used to optimize key requests, but file creation is harder to optimize because of file system semantics. Blocking filename registrations have correct semantics but poor performance, and vice versa for non-blocking registrations. To reconcile this, Keypad forces a thief to use blocking semantics while allowing the user to use non-blocking semantics (as much as possible), which is based on using filenames as public keys. The second challenge is allowing disconnected access: the idea is to use multiple devices carried by the user to cross-audit file accesses, which still requires devices to hoard keys before going disconnected. - dgm36
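The core auditing idea can be sketched as follows. This is a simplified model and not Keypad's protocol: the `AuditServer` class, its method names and the integer timestamps are hypothetical, and it omits caching, filename registration and disconnected access entirely. It only shows why keeping keys on the server yields both an access log and a way to cut off a lost device.

```python
import secrets


class AuditServer:
    """Toy audit server (hypothetical interface, not Keypad's)."""

    def __init__(self):
        self._keys = {}       # (device, file_id) -> per-file key
        self.log = []         # (timestamp, device, file_id) per request
        self.revoked = set()  # devices reported lost or stolen

    def register(self, device, file_id):
        """Create and store a fresh key for a file on a device."""
        self._keys[(device, file_id)] = secrets.token_bytes(16)

    def request_key(self, device, file_id, now):
        """Log every request, then return the key unless revoked."""
        self.log.append((now, device, file_id))
        if device in self.revoked:
            return None
        return self._keys.get((device, file_id))

    def report_lost(self, device):
        """Stop serving keys to a device from now on."""
        self.revoked.add(device)

    def accesses_since(self, device, t):
        """List the files a device asked for at or after time t."""
        return [f for (ts, d, f) in self.log if d == device and ts >= t]


server = AuditServer()
server.register("phone", "taxes.pdf")
server.request_key("phone", "taxes.pdf", now=1)  # owner reads the file
server.request_key("phone", "taxes.pdf", now=5)  # access after the theft
server.report_lost("phone")
print(server.request_key("phone", "taxes.pdf", now=9))  # key now refused
print(server.accesses_since("phone", 4))  # what may be compromised
```

Because the file cannot be decrypted without a round trip to the server, the log is a complete record of what was opened after the loss, at the cost of the latency that the caching and prefetching optimizations are meant to hide.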

10 Apr 2011

Systems for Future Multi-Core Architectures

Posted by Derek Murray

Today was workshop day at EuroSys 2011, and I spent the day at the inaugural SFMA workshop. The aim of the workshop was to bring together practitioners from the fields of operating system, programming language and computer architecture research, and to provoke discussion about new trends in parallel computing. The most notable thing about the workshop was the number of practitioners that it attracted: it started off with standing room only at 9am, and maintained a respectable audience of 35 people through to 5pm. I was on the program committee for the workshop, and Ross did a great job of organising the whole thing.

6 Apr 2011

Supporting control flow in the CIEL execution engine

Posted by Derek Murray

How do you write a program that runs on hundreds or thousands of computers? Over the last decade, this has become a real concern for many companies that must be able to handle ever-growing data sets in order to stay in business. When those data sets grow to terabytes or petabytes in size, a single disk (or even a RAID array) can't deliver the data fast enough, so a solution is needed to exploit the throughput of hundreds or thousands of disks in parallel. In this post, I'll introduce various solutions to this problem, and explain how our CIEL execution engine supports a larger class of algorithms than existing systems.