syslog
30Mar/110

4th Winter School in Hot Topics in Distributed Computing

Last week I attended the 4th Winter School in Hot Topic in Distributed Computing organised by INRIA. The winter school was located in La Plagne (a classic ski resort in the French Alps) so good food and astonishing landscapes were guaranteed.

The winter school program was structured in 2 sessions per day. This year, the invited researchers were Shafi Goldwasser from MIT, Timothy Roscoe from ETH Zurich, Nir Shavit from Oracle and Tel Aviv University, Luis Rodrigues from INESC-ID, André Seznec from INRIA, Pablo Rodríguez from Telefónica Research and Serge Abitebou from INRIA. The talks covered different topics of distributed computing and the slides might be available soon in the winter school website. Nevertheless, I strongly recommend checking the webpage of each speaker for further details about their work. In addition to these sessions, two doctoral sessions allowed  PhD students to show their work and to get some valuable feedback.


The first session was by Shafi Goldwasser about "Secrets and Proofs", or in other words, how cryptography and interactive proofs can be used to make more secure distributed computing systems. She explained the concepts behind secure multi-party computation and some considerations that must be taken into account for resource-limited devices such as sensors.

Timothy Roscoe described the design considerations behind an OS suitable for multi-core machines that led his group to design and implement Barrelfish in collaboration with Microsoft Research. Barrelfish is a micro-kernel OS that uses message passing techniques instead of shared memory. Multi-core machines tend to look more like a network (see Intel's Single-chip cloud computer -SCC , a microcosmos of cloud datacenters). However, as Roscoe said, despite adding more cores brings more cycles, memory and bus are the main bottlenecks in multi-core computers. More chips do not necessarily imply more off-chip bandwidth, more cache or more RAM capacity and in this kind of scenario, a monolithic OS presents scalability problems. Barrelfish solves this issue by using a distributed memory architecture (replicating the state at each core increases performance) and message passing. This technique reduces the latency compared to Shared Memory and it makes possible scaling to hundreds or thousands of cores.

On Wednesday morning, Nir Shavit showed how Flat Combining can be used to optimise data structures for parallelism. One of the main problems are locks: they make execution sequential. Moreover, being sophisticated in synchronisation can be counter-productive and it can cause a considerable memory overhead. Flat-combining uses a single thread to combine all the requests to the data structure from several threads thus improves the cache behaviour by reducing the number of cache misses. Moreover, Flat-combining requires little synchronisation and scales much better than systems based on locks. He described how this technique improves the performance of LIFO stacks, priority queues and hash tables (Nir introduced Hopscotch HasTables). In his opinion, in the future we will see data structures more randomised (hash tables, skiplists, randomised collections), relaxed (unordered pools instead of stacks and queues) and flat.

In the afternoon, Luis Rodrigues highlighted the benefits of DSTM. He is currently involved on a european project called Transactional Memory in the Cloud. In his talk, Rodrigues reviewed some case studies and he compared their performance. He show how a technique does not outperform the others because of its dependence on network latency. The use-cases he covered were:

  • STM for clusters (STM-Cluster)
  • Partitioned/Replicated (Static transactions, Sinfonia)
  • Single/Non-replicated (Distributed Transactions Locking II (D-TL2))
  • Asynchronous Lease Certification (ALC).
  • Active replication via speculation

Andre Seznez gave a talk entitled "Real men don't eat quiche" (If you wonder why his talk was entitled in this way, check this term in wikipedia :-)). He described the hardware architecture of multi-core machines and their bottlenecks in a lecture-fashion. At the end of his talk he mentioned some of the architecture trends that we will have in the years to come. He highlighted that we need energy-aware architectures and techniques to predict performance problems. In fact temperature in cores is a major problem. He proposes migrate tasks to cores depending on their temperature to reduce its negative effects. Nevertheless, he thinks that parallel applications are still expensive and a niche market (e.g. graphics). In fact, most legacy code is sequential and we still have a lack of expert programmers, debug and maintenance tools for parallel programming in the years to come. He thinks that we need to understand better what sequential programs can do efficiently.

Pablo Rodriguez focused his talk on the lessons we can get to make distributed and cloud services more scalable. He explained the case of Facebook and how it relies on Content Distribution Networks. This led his talk to their SPAR Algorithm that leverage social networks to find optimal ways of providing data locality. In their system, they distribute the data to different servers by identifying clusters of users and replicating bridges to others at each one of them. SPAR minimises the number of replicas at each machine rather than minimising the number of edges in the cluster. The second part of Pablo's talk was about "Bulk Data Transfer" or how to deal with the time windows at which data centers located worldwide are at low utilisation to move vast amounts of data between them. In his opinion, network economics and data placement will be issues that will drive system design in the future.

The closing session was by Serge Abiteboul . He talked about web search, web data management and distribution. He showed some of the issues that must be taken into account for web crawling and indexing such as duplications, ethics and language diversity. After that, he described link analysis algorithms such as PageRank, Hits (by Kleinberg) and his work on distributed PageRank. In his opinion, the web is becoming more social and recommendation is getting more importance in the web so he thinks there is a lot of ground to work on corroboration algorithms using simple methods such as voting.

During the doctoral sessions, the presentations were mainly given by students from INRIA and EPFL but there were also some people from Yale University, TU-Berlin, Poznan University, University do Minho, Technion Israel, University of Basque Country and University of Cambridge. Some of the topics are quite related with some ongoing activities we are doing in netos group. I won't describe all of them because it might extend considerably this post (drop me an email if you want more details). The most popular topics were transactional memory, concurrency and solutions for multi-core machines. Nevertheless, there were also some presentations about privacy, decentralized youtube system, P2P Storage, decentralized news recommend engines using gossiping algorithms, recommendation engines, cryptography-based distributed polling in social networks, declarative languages for cloud computing, consensus protocols (UbiTrust project) and also mobile computing.

Mount Blanc from Bellecote Glacier

Mount Blanc from Bellecote Glacier