syslog
14 Apr 2013

Eurosys 2013 – Doctoral Workshop

I am at the Doctoral Workshop at EuroSys in Prague today, ahead of the main conference. Below are some notes on the 5-minute presentations in the workshop. There are no notes on the first quarter of the workshop because I presented my own work then.

Ludi Akue - Online Configuration Checking for Network and Service Management
- Network management as a sequence of operations (control loop): observe -> decide -> adjust.
- The changes must be autonomous and validated.
- Add runtime validation to the control loop: observe -> decide <-> validate -> adjust (with an arc from observe to validate as well).
- Lots of generic configurations => validation must be generic, enabled at runtime and flexible (modifications at runtime).
- Designed a higher-level specification language (the MeCSV metamodel) to design a reference model.
- Reference model - the administrator defines the configuration management of the system (e.g. config structure, state parameters, constraints).
- The reference model is used at runtime by the validator (also known as online checker).
- MeCSV implemented in UML profile and tested in Eclipse MDT.
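The reference-model idea can be illustrated with a toy sketch (the constraint format and field names below are hypothetical; MeCSV itself is a UML-profile metamodel and is not shown): the administrator declares structure and constraints once, and an online checker validates live configurations against them at runtime.

```python
# Toy sketch of online configuration validation against a reference model.
# The constraint encoding and parameter names are illustrative, not MeCSV.

def validate(config, constraints):
    """Return the names of violated constraints (empty list = valid)."""
    return [name for name, predicate in constraints.items()
            if not predicate(config)]

# A "reference model" for a hypothetical middleware node: per-parameter
# ranges plus a cross-parameter constraint.
constraints = {
    "mtu_in_range": lambda c: 576 <= c["mtu"] <= 9000,
    "threads_positive": lambda c: c["worker_threads"] > 0,
    "queue_fits_memory": lambda c: c["queue_size"] * c["msg_bytes"] <= c["mem_bytes"],
}

good = {"mtu": 1500, "worker_threads": 8, "queue_size": 1000,
        "msg_bytes": 512, "mem_bytes": 10_000_000}
bad = dict(good, mtu=100)  # violates the MTU range constraint

print(validate(good, constraints))  # -> []
print(validate(bad, constraints))   # -> ['mtu_in_range']
```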
Q1(Dushyanth): What is the domain? Network config (e.g. switches)?
A: Network & middleware configuration. We are focusing on middleware. Tested the system on Common Information Model (CIM) standard.

Q2(Kim Keeton): What's the form of the constraints? Could you please give some specific examples?
A: We want to avoid bad configurations. We're not working on generating good configurations. Just want to check that the system is reliable.

Q3(Allen Clement): Comments on the presentation. The big thing that was really missing is: why was this technical work necessary? What's the problem that necessitated this technique? Framing the specific technical problem is important.

--------------------------------------------------------------------------------------
Petr Hosek - MX: Safe Software Updates Via Multi-Version Execution
- Updates are hard. Sometimes one step forward and two steps backward. Easy to introduce bugs.
- Users are very often reluctant to update their software to a new version.
- Lighttpd - a fix in 04/2009 introduced a bug that was only identified 11 months later.
- Goal: provide the benefits of the newer version while keeping the stability of the old version.
- Solution: run both versions side by side. Pick the correct solution at runtime.
- System completely transparent to users. At the moment running two versions with small differences.
- Lighttpd 1.4.22 vs Lighttpd 1.4.23. Check every syscall and its arguments.
- Have sync points. If the new version fails at a sync point, it can continue executing using the result (for that section) obtained by running the old version.
- Future work: improving performance overhead, support for more complex code changes, support for non-crashing type of divergences.
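The failover idea can be illustrated at the level of plain functions (a toy sketch only; the real MX synchronises two whole processes at the system-call level, which this does not attempt):

```python
# Toy illustration of multi-version failover: run old and new versions of
# a function, detect divergence, and survive a crash in the new version by
# falling back to the old version's result. The functions are hypothetical.

def old_version(x):
    return x * 2          # stable behaviour

def new_version(x):
    if x == 3:
        raise RuntimeError("regression introduced by the update")
    return x * 2

def run_multi_version(x):
    """Prefer the new version; on a crash, continue with the old result."""
    old_result = old_version(x)
    try:
        new_result = new_version(x)
    except Exception:
        return old_result            # survive the new version's bug
    assert new_result == old_result  # divergence check ("sync point")
    return new_result

print([run_multi_version(i) for i in range(5)])  # -> [0, 2, 4, 6, 8]
```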

Q1: How do you know when to make snapshots?
A: We make them at almost every syscall.

Q2(Dushyanth): On a crash, you run one version with the other version's state. Does that work?
A: It's fragile, but it works. We're looking at reducing its fragility.

Q3(Allen Clement): How do you do state comparisons without creating a nasty bottleneck?
A: Using syscalls.
Allen Clement: So you're saying that as long as the functions are the same and the arguments are the same, then it's fine?
A: Yes.
Allen Clement: If you treat syscalls as black-box then you can still have your state diverging in a subtle way.
--------------------------------------------------------------------------------------
Stefan Wigert - Advanced Persistent Threat (APT) Detection
- Goal: detect stealthy, target-oriented, internet-based attacks by just looking at Internet communication logs
- Why is it hard to detect? Social component (employees).
- If you know for each company all the subnets the company owns, then you can construct its IP-space and determine top-k communication peers
- How can you find the IP-space of a company automatically? Using community detection algorithms...
- Came up with an iterative approach, start with a seed-set and crawl around it.
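The iterative seed-set idea can be sketched as follows (toy adjacency data and a made-up membership threshold; the actual work combines random walks with label propagation, which is more involved):

```python
# Toy seed-set expansion over a communication graph: starting from known
# company IPs, repeatedly admit nodes that communicate mostly with the
# current set. Graph, threshold and round count are illustrative.

graph = {  # adjacency: who communicates with whom
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e"}, "e": {"d", "f"}, "f": {"e"},
}

def expand(seed, graph, threshold=0.6, max_rounds=5):
    """Grow the member set from a seed until no new nodes qualify."""
    members = set(seed)
    for _ in range(max_rounds):
        frontier = {
            node for node, neighbours in graph.items()
            if node not in members
            and len(neighbours & members) / len(neighbours) >= threshold
        }
        if not frontier:
            break
        members |= frontier
    return members

print(sorted(expand({"a", "b"}, graph)))  # -> ['a', 'b', 'c']
```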

Q(Allen Clement): What community detection are you using?
A: Random walks. We now use an approach that combines the seed-set with label propagation.
Allen Clement: How connected do you see this work with civil identities?

Q(Valentin Dalibard): How much better do you perform than "cheap" techniques? E.g. considering nodes with many edges.
A: I think the algorithm we're using now is not that difficult. The algorithm is good because it allows us to process iteratively. In the paper, we took 10%, analyzed it, then we took another random 10% and analyzed that.
--------------------------------------------------------------------------------------
Thomas Hruby - NewtOS - Reliable and efficient system for multicores
- High performance fork of MINIX 3.
- Smaller components with less state.
- Performs better because exploiting multicores.
- Developed on top of a microkernel.
- No data sharing, no synchronization between components. Only uses message passing for components to interact.
- A component crash does not kill the system.
- The original system is extremely slow: too many context switches => bad cache usage.
- NewtOS - switched from kernel communication to user-level async point-to-point communication channels (e.g. shared memory queues).
- Proof of concept: developed a new network stack. Chopped the single monolithic stack into multiple components/pieces. Increased MINIX 3 net performance from 200 Mbps to 10 Gbps.
- On multicore the components can be migrated depending on the load.
- We're looking at how to use the cores efficiently. We're focusing on the scheduler and on placing these components.
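The share-nothing, message-passing structure can be sketched with threads and queues (a rough analogy only; NewtOS uses user-level shared-memory channels between isolated OS components, not Python threads):

```python
import queue
import threading

# Toy sketch of components that share no state and interact purely via
# message queues, in the spirit of the chopped-up network stack.

requests, replies = queue.Queue(), queue.Queue()

def component(inbox, outbox):
    # A component owns no shared state: it only reads its inbox and
    # writes its outbox.
    while True:
        msg = inbox.get()
        if msg is None:  # shutdown sentinel
            break
        outbox.put(msg.upper())  # stand-in for real packet processing

worker = threading.Thread(target=component, args=(requests, replies))
worker.start()
for pkt in ["syn", "ack"]:
    requests.put(pkt)
requests.put(None)
worker.join()
results = [replies.get() for _ in range(2)]
print(results)  # -> ['SYN', 'ACK']
```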

Q1(Dushyanth): There are many ideas here (e.g. microkernel, Barrelfish). Are they all tied together like that? Where do you see your contribution? Do I have to use Minix to buy into your contribution?
A: We need a reliable system. We showed that chopping the system into smaller pieces does that.
Dushyanth: Sorry, but do you have to relate your project so closely to Minix? Nobody really uses Minix.
A: Well, nobody really uses Barrelfish either.

Allen Clement: The vehicle you're using is not a driving factor for the problem.

--------------------------------------------------------------------------------------
Thomas Knauth - Web service consolidation
- Greenpeace published a report (2009). If cloud computing were rated as a country, it would come 6th. It uses more energy than the whole of Germany? (Comment: this looks extremely sketchy to me.)
- There are lots of websites that don't get much traffic, so there's potential for switching them off.
- Modified Apache to track request inter-arrival times. If request rates are low, VMs are shut down.
- How fast can we resume virtual machines? Depending on the state (1GB to 4GB of memory), it can take from 2 secs to 10 secs.
- Looking at techniques to do lazy resume. For some queries you can just initially resume part of the state.
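The consolidation policy can be sketched as a simple idle-time check (the 300-second threshold and the request trace below are made up for illustration, not taken from the actual Apache modification):

```python
# Toy sketch of the consolidation policy: a backend VM should be suspended
# once the gap since the last request exceeds an idle threshold.

IDLE_THRESHOLD = 300  # seconds of inactivity before suspending (illustrative)

def vm_states(request_times, check_times, threshold=IDLE_THRESHOLD):
    """For each check time, report whether the VM should still be up."""
    states = []
    for t in check_times:
        last = max((r for r in request_times if r <= t), default=None)
        states.append(last is not None and t - last < threshold)
    return states

# Requests arrive at t=0 and t=1000; we check at t=100, 500 and 1100.
print(vm_states([0, 1000], [100, 500, 1100]))  # -> [True, False, True]
```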

Q1(Allen Clement): I feel I've heard this story before (DCs are inefficient). I also think I've heard that powering machines off and back on is less efficient than keeping them on at low utilisation.
A: For Google it doesn't really work because their idle times are low. We say that there are other kinds of workloads where this may work.

Q2(Dushyanth): There are many operational reasons why people running DCs don't switch machines off (e.g. machines not coming back properly). Be careful about the Greenpeace statement; they actually use a different metric with which you can get those results.

Q3(Allen Clement): What was missing is the problem/challenge this mechanism solves. Even if I don't remember how the mechanism works, I can still remember the challenge.

--------------------------------------------------------------------------------------
Marius Vlad - Detecting and analysing multi-stage payload attacks
Missed this one. Didn't really get what it was about.

--------------------------------------------------------------------------------------
Martin Nowack - Reducing runtime overhead of software fault checks using symbolic execution
- Software typically has errors (integer overflows, off-by-one accesses) => security risks, downtime.
- How to analyze? Add instrumentation or analyze beforehand.
- There are code paths which are executed most often. The idea is to use symbolic execution to go through those execution paths and try to remove unneeded checks.
- Implementation based on KLEE.

Q1: Do you do it offline?
A: I'm doing it offline.

Q2(Allen Clement): Short description of what you're proposing: it's expensive to do all the checks, so you remove checks; hence, just use a heuristic.
A: Yes.

Q: KLEE has lots of heuristics. How can you discard a check statically when even at runtime you can't reason about it?
A: ...

Q: Would it work for other optimizations?
A: Yes, sure.

Q: Can your optimization only work for very specific workflows? It looks like your optimization can only optimize specific paths (e.g. only the second request of a web server and not the fourth one).
A: (Ran out of time.)

--------------------------------------------------------------------------------------
Natacha Crooks
Didn't take notes on this one.

--------------------------------------------------------------------------------------
Qian Ge - Eliminating Timing Channels from OS Kernel
- Covert channels - allow barriers enforced by system protection mechanisms to be bypassed.
- Timing channel - ...
- Eliminating timing channels from seL4 (formally verified microkernel).
- Evaluate the bandwidth of timing channels in seL4 and of the solutions to be proposed.

Q1: Are you going to enumerate all the timing channels? Will seL4 somehow help you do that?
A: ...

Q2: How far is seL4 from being a general-purpose OS on general-purpose hardware? seL4 is more appropriate for use in constrained environments. It seems that in these environments you should have more control over what programs are doing. I'm wondering if the environment can help you.
A: It's a good suggestion to think about it.

--------------------------------------------------------------------------------------
Jens Kehne - BeeHive: A distributed operating system for cloud platforms
- Heavily virtualized, coarse-grain allocation, duplicated state, difficult to split/merge VMs and distributed applications. All of these waste resources.
- Processes as basic unit of management instead of VMs
- Assign resource on a much finer grain, faster migration, less state duplication
- Beehive model: microkernel on multiple machines, cross-node communication, transparent process migration, continuous reassignment of resources.
- Challenges: How to make IPC fast enough? How to get the IPC and migration to be transparent? How to make good decisions about migration? How to make all this compatible with existing applications?

Q1(Dushyanth): Processes were around before VMs. There must be some reasons why people chose to use VMs instead of processes?
A: One large problem used to be kernel state. It's difficult to extract the state of a process and move it to a different machine.

Q2(Allen Clement): VMs were introduced to get easy isolation between processes on different VMs. What are we giving up if we drop VMs?
A: Using a microkernel you can still provide fairly strong isolation between processes. Somehow weaker than VMs, but still good enough.

Q3(Steve Hand): What's your killer experiment to show that the system is good?
A: I want to make reconfiguration as fast as possible (i.e. move a process to a different machine when its current location gets busy).
Steve/Dushyanth: You can do that very fast today with VMs.

--------------------------------------------------------------------------------------
Simon Gerber - Memory management for heterogeneous multicores
- Memory management on a single core is well understood, but we don't know what to do on heterogeneous systems.
- Want to find a way of building fast MM for homogeneous systems, extend it to simple heterogeneous systems and finally to complex heterogeneous systems (e.g. x86 and ARM).

Q(Nickolai Zeldovich): What is memory management?
A: The most basic way of isolating processes. => (Zeldovich) It basically is virtual memory.

Q(Steve Hand): What is your target hardware?
A: Probably an x86 that talks to an ARM core.

--------------------------------------------------------------------------------------
James Snee - Cross-layer instrumentation for deep layered software stacks
- Energy consumption for phones.
- Looking at outliers of measurements. Are our measurements ok?
- Must identify the source of the outliers.
- How do we trace application behaviour through the stack and detect the source/reason of the outlier.
- Detect anomalies with runtime tracking.
- Call graph built from a sliding window of event characteristics. With this graph we can say where the divergence happens.
- Working on how fine-grained to go with the trace. Iterative process: start coarse and in each iteration go more fine-grained.
- Similar to ftrace but for Android. ftrace can be turned on at runtime.

Q: So you haven't figured what the outliers are?
A: No, not yet. People usually say it's JVM GC, but something else is going on.

Q(Steve Hand): The screen backlight and the radio are the primary sources of energy drain. Are you looking at that?
A: Sure, but we can also look at other aspects (e.g. whether 3G is an aggressive battery user).
--------------------------------------------------------------------------------------
Stanko Novakovic - Scale-Out NUMA Systems
- In large-scale graphs work done per node is small and there's a lot of communication.
- Scale-up: single machine. Low-latency, but does not scale.
- Scale-out: multiple machines. Scalability, but high remote access latencies.
- We want Scale-out that can provide Scale-up performance.
- Scale-out system based on NUMA, where nodes can directly r/w access each other's local memory.
- Working on a prototype based on Xen and ccNUMA.

Q(Steve Hand): Why has nobody done this yet? Why doesn't RackScale do this?
A: There's something similar: microserver clusters. However, they're different because they rely on special interconnects and TCP/IP.
--------------------------------------------------------------------------------------
Stefan Kaestle - Distributed Computing on Hybrid Multicore Machines
- Multicores will have more and more hybrid interconnects.
- To overcome problems with NUMA awareness and architecture, we need to program the machine as a distributed system.
- Must think how distributed algorithms change on multicore.
- Ideas: model multicore machines, find relevant characteristics of an algorithm and map them on multicore architectures.

Q(Allen Clement): Which of the classical problem statements matter? Which of the traditional distributed-systems problems still make sense on these multicore systems?
A: It depends on how much you want to scale. If you only have one machine then it doesn't matter (assuming no failures). However, we have the same problems if we scale across multiple machines.
--------------------------------------------------------------------------------------
Tomasz Kuchta - Document Recovery
- Imagine the case when a file gets corrupted.
- Why is the application crashing? It may be that the program is using an unusual execution path.
- Symbolic execution to obtain constraints on the input, then create a new document that will not crash the application.
- How to obtain it? Modify and solve the constraints and choose the document which is closest to the initial file.
- Challenges: What heuristics should be used? (i.e. which constraints do we want to negate/remove; focus on the bytes and constraints that trigger the bug)
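A brute-force stand-in for the recovery idea (the parser and candidate set below are hypothetical; the actual approach derives path constraints via symbolic execution and negates the ones implicated in the crash, rather than enumerating candidates):

```python
# Toy sketch of document recovery: among non-crashing candidate documents,
# pick the one closest (in byte distance) to the corrupted original.

def parse(doc):
    """A tiny hypothetical 'application': crashes on a bad header byte."""
    if doc[0] != 0x7F:
        raise ValueError("bad magic byte")
    return doc[1:]

def recover(doc, candidates):
    """Return the non-crashing candidate with the fewest byte changes."""
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))
    survivors = []
    for cand in candidates:
        try:
            parse(cand)
            survivors.append(cand)
        except ValueError:
            pass
    return min(survivors, key=lambda c: distance(doc, c))

corrupted = bytes([0x00, 10, 20, 30])
candidates = [bytes([0x7F, 10, 20, 30]),   # one byte changed
              bytes([0x7F, 0, 0, 0])]      # four bytes changed
print(recover(corrupted, candidates))  # -> b'\x7f\n\x14\x1e'
```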
Q(Allen Clement): What's the difference between a buggy file and corrupted text?
A: My focus is on the files that cause crashes/abnormal program termination.
Allen Clement: Do you see any possibilities for tackling corrupted data, when you don't realize that something is wrong and just accept the file?
A: Difficult because we base our work on constraints.
--------------------------------------------------------------------------------------
Valter Balegas - Improving Data Consistency in Cloud Infrastructure
- The problem is that the service has to provide low latency and serve many requests at the same time.
- Many consistency levels have been proposed. Mostly force execution respecting a total order.
- Proposes: Add a reservation manager. Nothing can happen unless you request a reservation from the manager.
- A consistency model based on data semantics.
- Causal+ consistency with invariants.
--------------------------------------------------------------------------------------
Valentin Dalibard - Optimising Graph Computation: Trading Computation for Communication
- Assume you have a graph, if small enough then you just put it in the memory of a machine and run the computation.
- If graph too big then you probably use distributed BSP.
- The problem is that in BSP there are the synchronization steps which take a lot of time.
- PowerGraph spends about 90% of its time on synchronization.
- Lesson: CPU time is very cheap, but communication time is very expensive.
- New approach: Don't need to go through BSP, just want to converge as fast as possible using all the resources you have.
- Have constant computation and constant communication.
- Changes to the computation model of BSP:
  - allow stale inputs to vertex functions => iterate more often;
  - assign different priorities to messages => send data from more important vertices;
  - allow vertices to be iterated at different frequencies.
- Challenges: Doesn't work for all computations. Check which fixed-point computations fit this model.
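One fixed-point computation that fits this model is residual-based PageRank, where vertices are updated in priority order of their pending change instead of in BSP supersteps. A minimal sketch with an illustrative two-node graph (this is a standard textbook technique, not the speaker's system):

```python
import heapq

# Priority-driven fixed-point iteration: repeatedly process the vertex
# with the largest pending change ("residual"), pushing its contribution
# to its out-neighbours. Graph and damping factor are illustrative.

def prioritized_pagerank(graph, d=0.85, tol=1e-6):
    rank = {v: 0.0 for v in graph}
    residual = {v: 1 - d for v in graph}
    heap = [(-(1 - d), v) for v in graph]
    heapq.heapify(heap)
    while heap:
        _, v = heapq.heappop(heap)
        r = residual[v]
        if r <= tol:          # stale or negligible entry
            continue
        rank[v] += r          # absorb the pending change
        residual[v] = 0.0
        for u in graph[v]:    # push v's change downstream
            delta = d * r / len(graph[v])
            residual[u] += delta
            heapq.heappush(heap, (-residual[u], u))
    return rank

graph = {"a": ["b"], "b": ["a"]}  # two-node cycle: both ranks converge to 1
ranks = prioritized_pagerank(graph)
print(round(ranks["a"], 3), round(ranks["b"], 3))  # -> 1.0 1.0
```

Stale heap entries are simply skipped when popped, which is what makes the asynchronous, priority-ordered schedule safe without any superstep barrier.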

Q: Is that going to be more difficult for the person writing these jobs?
A: Yes, it's going to be more difficult. It's part of the problem of finding the computations that work.
Q(Dushyanth): Are you talking about bandwidth and latency? It's a good idea to run some back-of-the-envelope numbers.