MobiCom'11, Day 1
MobiCom'11 is being held in the (always interesting) city of Las Vegas. On this first day, the talks were mainly about wireless technologies, and several techniques to avoid congestion were proposed.
Plenary Session
Keynote Speaker: Rajit Gadh (Henry Samueli School of Engineering and Applied Science at UCLA)
Prof. Gadh talked about UCLA's "SmartGrid" project, a topic which is gaining momentum in California. The project is motivated by the fact that electricity comes from a grid that spreads across the whole country yet still runs on technology deployed 100 years ago: the grid is rigid, fixed and large. In fact, Rajit Gadh sees a clear parallelism between data networks and power networks. Based on that observation, his group aims to create a Smart Grid infrastructure with the following characteristics: self-healing, active participation of consumers, the ability to accommodate all energy sources and storage options, eco-friendliness, etc. More information can be found on the project website.
SESSION 1. Enterprise Wireless
FLUID: Improving Throughputs in Enterprise Wireless LANs through Flexible Channelization, Shravan Rayanchu (University of Wisconsin-Madison, USA); Vivek Shrivastava (Nokia Research Center, Palo Alto); Suman Banerjee (University of Wisconsin-Madison, USA); and Ranveer Chandra (Microsoft, USA)
One of the problems in current 802.11 technologies is that channel width is fixed. However, many advantages arise from replacing fixed-width channels with flexible-width ones. The goal of this paper is to build a model that captures flexible channel conflicts, and then use this model to improve the overall throughput of a WLAN.
One of the difficulties in wireless channels is that, depending on the interference, different approaches are needed to avoid conflicts. Moreover, the interference itself depends on the channel configuration: for example, narrowing the width helps to reduce interference. The authors also tried to better understand the impact of power levels.
They showed that, given an SNR, nodes can predict the delivery ratio for a specific channel width. As a result, a receiver can measure the SNR and autonomously predict the delivery ratio as a function of it. With this model, channel assignment and scheduling becomes a flexible channel assignment and scheduling problem.
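As a back-of-the-envelope illustration of this idea, a receiver could map its measured SNR to a predicted delivery ratio per width and pick the width with the best expected goodput. The logistic curves and SNR thresholds below are entirely made up, not the paper's model:

```python
import math

# Illustrative delivery-ratio curves: logistic in SNR, with wider
# channels assumed to need a higher SNR threshold (offsets are made up).
WIDTHS_MHZ = [5, 10, 20, 40]
SNR_THRESHOLD_DB = {5: 8, 10: 11, 20: 14, 40: 17}

def delivery_ratio(snr_db, width_mhz):
    return 1.0 / (1.0 + math.exp(-(snr_db - SNR_THRESHOLD_DB[width_mhz])))

def best_width(snr_db):
    """Pick the width maximising predicted goodput ~ width * delivery ratio."""
    return max(WIDTHS_MHZ, key=lambda w: w * delivery_ratio(snr_db, w))

print(best_width(10.0))   # low SNR -> narrow channel (5 MHz)
print(best_width(25.0))   # high SNR -> wide channel (40 MHz)
```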
SmartVNC: An Effective Remote Computing Solution for Smartphones, Cheng-Lin Tsao, Sandeep Kakumanu, and Raghupathy Sivakumar (Georgia Institute of Technology, USA)
In our opinion, this paper was a great example of how to improve the user experience (UX) of certain applications, in this case mobile VNC. This kind of service was designed for desktops and laptops, so it does not take the nature of smartphones into account. The goal is to allow users to access a remote PC (here, Windows) from a smartphone (Android) in a friendly way. They evaluated the UX of 22 users (experienced users, students between 20 and 30 years old) across 9 applications running on VNC. They defined metrics such as the mean opinion score (the higher the complexity, the lower the score) and the task effort (the number of operations required for a task, such as mouse clicks, keystrokes, etc.). They then correlated both metrics for users running apps over VNC, and the results showed that when the task effort is high, the UX is poorer.
They proposed aggregating repetitive sequences of operations in user activity, removing this redundancy in a harmless way. One of the main difficulties is that application macros (like those in Excel) are not application-agnostic but are extensible, whereas raw macros (e.g. AutoHotkey) have exactly the opposite properties.
Their answer is smart macros: they record events, build macros from them, and provide a tailored interface on the remote-computing client with collapsible overlays, macros grouped by application, automatic zooming, etc. For the applications tested with those 22 users, the task effort dropped from 100 operations to 3, while the time to perform a task was also greatly reduced. In the subjective evaluation, all users expressed satisfaction with the new VNC. The talk ended with a recorded video demo of the system.
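As a rough sketch of the redundancy these smart macros exploit (my own toy illustration, not the authors' code), one can mine a recorded event log for repeated operation sequences and propose each one as a macro candidate:

```python
from collections import Counter

def candidate_macros(events, min_len=3, min_count=2):
    """Return repeated sub-sequences of events, longest first."""
    counts = Counter()
    for n in range(min_len, len(events) // 2 + 1):
        for i in range(len(events) - n + 1):
            counts[tuple(events[i:i + n])] += 1
    repeated = [(seq, c) for seq, c in counts.items() if c >= min_count]
    return sorted(repeated, key=lambda sc: -len(sc[0]))

# Hypothetical event log recorded on the VNC client.
log = ["click:menu", "click:export", "key:Enter",
       "click:menu", "click:export", "key:Enter", "scroll:down"]
for seq, count in candidate_macros(log):
    print(count, "x", seq)   # the 3-step export sequence repeats twice
```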
FERMI: A FEmtocell Resource Management System for Interference Mitigation in OFDMA Networks, Mustafa Yasir Arslan (University of California Riverside, USA); Jongwon Yoon (University of Wisconsin-Madison, USA); Karthikeyan Sundaresan (NEC Laboratories America, USA); Srikanth V. Krishnamurthy (University of California Riverside, USA); Suman Banerjee (University of Wisconsin-Madison, USA)
Femtocells are small cellular base stations that use a cable backhaul and can extend network coverage. In this scenario interference can be a problem, but it differs from the interference problems found in the WiFi literature. OFDMA (WiMAX, LTE) uses sub-channels at the PHY layer and schedules multiple users in the same frame, whereas WiFi uses OFDM (sequential units of symbols transmitted at a specific frequency in time). Moreover, OFDMA has a synchronous MAC (there is no carrier sensing as in WiFi). As a consequence, WiFi solutions cannot be applied to femtocells: interference leads to throughput loss, and many clients coexist in the same frame.
As a consequence, the solution must take both the time domain and the frequency domain into account. FERMI gathers load- and interference-related information and operates at a coarse granularity (in the order of minutes). This is not a drawback, since interference does not change much at that time scale. Moreover, a per-frame solution is not feasible: interference patterns change on each retransmission, whereas aggregate interference and load change only at coarse time scales.
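For intuition on the frequency-domain side, here is a minimal sketch under my own assumptions (not FERMI's actual algorithm): greedily colour the measured interference graph and give each colour class a disjoint slice of OFDMA sub-channels, so interfering femtocells never share a sub-channel while non-interfering ones reuse freely:

```python
import itertools

def allocate_subchannels(cells, conflicts, num_subchannels=30):
    """Hypothetical sub-channel isolation via greedy graph colouring."""
    colour = {}
    for cell in cells:  # assign the smallest colour unused by neighbours
        used = {colour[n] for n in conflicts.get(cell, []) if n in colour}
        colour[cell] = next(c for c in itertools.count() if c not in used)
    num_colours = max(colour.values()) + 1
    share = num_subchannels // num_colours
    return {cell: list(range(c * share, (c + 1) * share))
            for cell, c in colour.items()}

cells = ["F1", "F2", "F3"]   # F2 interferes with both neighbours
conflicts = {"F1": ["F2"], "F2": ["F1", "F3"], "F3": ["F2"]}
print(allocate_subchannels(cells, conflicts))
# F1 and F3 share sub-channels 0-14; F2 is isolated on 15-29.
```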
The system was evaluated on a WiMAX testbed and in simulations. In both cases they obtained a 50% throughput gain over pure sub-channel isolation solutions. The core results are applicable to LTE as well.
SESSION 2. Wireless Access
WiFi-Nano: Reclaiming WiFi Efficiency through 800ns Slots, Eugenio Magistretti (Rice University, USA); Krishna Kant Chintalapudi (Microsoft Research, India); Bozidar Radunovic (Microsoft Research, U.K.); and Ramachandran Ramjee (Microsoft Research, India)
WiFi data rates have increased, but throughput has not seen a similar level of growth: throughput is much lower than the data rate because of high per-frame overhead. The overhead is 45% at 54 Mbps, and it dominates at higher bitrates, reaching around 80% at 300 Mbps. Things get worse when multiple links come into play.
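The arithmetic behind these percentages is easy to reproduce: the per-frame overhead (DIFS, back-off slots, preamble, SIFS, ACK) is roughly fixed in time, while the payload airtime shrinks as the rate grows. The 180 microsecond overhead figure below is my own assumption, chosen because it reproduces the numbers quoted in the talk:

```python
OVERHEAD_US = 180.0       # assumed fixed per-frame overhead (illustrative)
PAYLOAD_BITS = 1500 * 8   # one 1500-byte frame

for rate_mbps in (54, 300):
    payload_us = PAYLOAD_BITS / rate_mbps   # Mbit/s and us cancel out
    overhead_share = OVERHEAD_US / (OVERHEAD_US + payload_us)
    print(f"{rate_mbps} Mbps: overhead = {overhead_share:.0%} of airtime")
# 54 Mbps: overhead = 45% of airtime
# 300 Mbps: overhead = 82% of airtime
```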
This observation motivated WiFi-Nano, a technology that can double the throughput of WiFi networks by cutting slot overhead by 10x. The standard slot in 802.11a/n is 9 microseconds, which is almost the minimum achievable with conventional designs; WiFi-Nano instead uses nano slots of just 800 ns. In addition, it exploits speculative preambles, so that preamble detection and transmission occur in parallel: as soon as its back-off expires, a node transmits its preamble, but while transmitting it continues to detect incoming preambles, even in the presence of self-interference. Their empirical results show that the slightly longer speculative preambles improve throughput by up to 100%, and frame aggregation increases those figures even further, raising efficiency from 17% to almost 80%.
XPRESS: A Cross-Layer Backpressure Architecture for Wireless Multi-Hop Networks, Rafael Laufer (University of California at Los Angeles, USA); Theodoros Salonidis; Henrik Lundgren and Pascal Leguyadec (Technicolor, Corporate Research Lab, France)
Multihop networks operate below capacity due to poor coordination across layers and among transmitting nodes. The authors propose backpressure scheduling together with cross-layer optimisations: at each slot, the system selects the optimal set of links for transmission. In their view, multihop networks pose several challenges:
1- Time slots
2- Link sets (e.g. knowing which links do not interfere)
3- Protocol overhead
4- Computation overhead
5- Link scheduling
6- Hardware constraints (e.g. memory limitations in wireless cards)
XPRESS addresses all these challenges. It has two main components: the mesh controller (MC) and the mesh access point (MAP). The MC receives flow-queue information, computes the schedule and disseminates it, while the MAPs execute the schedules and process the queues. The key challenge is that computing the optimal schedule for each slot takes a lot of time.
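For intuition, here is a textbook backpressure (max-weight) sketch, assuming the classic formulation rather than XPRESS's actual optimiser: weight each link by its queue differential times its rate, then pick the heaviest set of mutually non-interfering links. The brute-force enumeration of link sets is exponential in the number of links, which is precisely why computing the optimal schedule per slot is so expensive:

```python
from itertools import combinations

def schedule(links, queues, rates, conflicts):
    """links: (src, dst) pairs; conflicts: set of frozensets of link pairs."""
    def weight(link):
        src, dst = link
        return max(queues[src] - queues[dst], 0) * rates[link]

    best, best_w = (), 0.0
    for k in range(1, len(links) + 1):      # exponential search over link sets
        for subset in combinations(links, k):
            if any(frozenset(p) in conflicts for p in combinations(subset, 2)):
                continue                    # skip sets with interfering links
            w = sum(weight(l) for l in subset)
            if w > best_w:
                best, best_w = subset, w
    return best

# Toy 4-node chain: adjacent links interfere, A-B and C-D can fire together.
links = [("A", "B"), ("B", "C"), ("C", "D")]
queues = {"A": 10, "B": 4, "C": 1, "D": 0}
rates = {l: 1.0 for l in links}
conflicts = {frozenset([("A", "B"), ("B", "C")]),
             frozenset([("B", "C"), ("C", "D")])}
print(schedule(links, queues, rates, conflicts))  # (('A','B'), ('C','D'))
```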
The MAP nodes use a cross-layer protocol stack to compute the schedules. Traffic from applications running on the node enters the kernel, which classifies the flows and places each one in its own flow queue, followed by a congestion controller; a packet scheduler then puts each packet into the proper link queue. Somehow this reminds me of the work on Active Networks, as they dynamically change the behaviour of the network, in this case in a mesh scenario. The proposed scheme achieves 63% and 128% gains over 802.11 at 24 Mbps and over auto-rate schemes, respectively. They also performed a scalability evaluation.
CRMA: Collision-Resistant Multiple Access, Tianji Li, Mi Kyung Han, Apurva Bhartia, Lili Qiu, Eric Rozner, and Ying Zhang (University of Texas at Austin, USA); Brad Zarikoff (Hamilton Institute, Ireland)
FDMA, TDMA, FTDMA and CSMA are the traditional MAC approaches to avoiding collisions, but these techniques incur significant overhead. The authors therefore move from collision avoidance to collision resistance, based on a new encoding/decoding scheme that allows multiple signals to be transmitted simultaneously.
In CRMA, every transmitter views the OFDM physical layer as multiple orthogonal but sharable channels and randomly selects a subset of them for transmission. When multiple transmissions overlap on a channel, their signals naturally add up in the wireless medium.
In this system, ACKs are sent as data frames. Misaligned collisions are a problem, which they handle with cyclic prefixes (CPs) that force the collided symbols to fall in the same FFT window. The number of overlapping transmissions is kept in check with exponential back-off.
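A toy model of the collision-resistance idea (my illustration, not the paper's encoder/decoder): each sender spreads a symbol over a random subset of OFDM channels, overlapping transmissions add up per channel, and the receiver recovers the symbols by solving the resulting linear system:

```python
import numpy as np

rng = np.random.default_rng(0)
num_channels, num_senders = 8, 3
symbols = np.array([1.0, -1.0, 1.0])    # one symbol per sender

# 0/1 selection matrix: A[c, s] = 1 iff sender s transmits on channel c.
# Redraw in the (unlikely) case the random choice is not decodable.
A = (rng.random((num_channels, num_senders)) < 0.5).astype(float)
while np.linalg.matrix_rank(A) < num_senders:
    A = (rng.random((num_channels, num_senders)) < 0.5).astype(float)

received = A @ symbols                  # overlapping signals add per channel
decoded, *_ = np.linalg.lstsq(A, received, rcond=None)
print(np.round(decoded, 3))             # recovers [ 1. -1.  1.]
```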
The evaluation was done on a testbed, with CRMA running on top of a default OFDM implementation on USRPs. They also used Qualnet simulations to evaluate the efficiency at network scale.
The San Diego Trip: An Overview of this year’s SIGKDD Conference
This year's SIGKDD conference returned, after 12 years, to San Diego, California, to host the meeting of Data Mining and Knowledge Discovery experts from around the world. The elite of heavyweight data scientists was hosted at the largest hotel on the West Coast and, together with industry experts and government technologists, numbered more than 1,100 attendees, a record in the conference's history.
The gathering kicked off with tutorials and two classics running in parallel: David Blei's topic models and Jure Leskovec's extensive work on social media analytics. Blei offered a refreshing talk that stretched from the very basics of text-based learning to the most up-to-date extensions of his work, with applications in streaming data and the online version of the paradigm that allows one to scale the model up to huge datasets, satisfying the requirements of modern data analysis. Leskovec elaborated on a large spectrum of his past work, covering a wide range of topics including the temporal dynamics of news articles, sentiment polarisation analysis in social networks, and information diffusion in graphs by modelling the influence of participating nodes. The first day's menu on the social front was completed by Lada Adamic's presentation on the relationship between structure and content in social networks. Her talk at the Mining and Learning with Graphs workshop provided an empirical analysis of a variety of online domains, describing how the flow of novel content in those systems reflected variations in the patterns of interaction amongst individuals. The day closed with the conference's plenary open session, which featured submission and reviewing highlights and the usual KDD award ceremonies: the latter honoured the decision-trees man, Ross Quinlan, who presented a historical overview of his work, and a data mining legion of 25 students from NTU who won this year's KDD Cup on music recommendations.
After a second night of sleep punctuated by jetlag-induced wake-ups, Monday rolled in and the conference opened with sessions on user classification and web user modelling. The afternoon followed up with the presentation of the student-award-winning work on the application of topic models to scientific article recommendation, which attracted the interest of many. The conference's dedicated session on online social networks also signalled the Data Mining community's interest in this nowadays hot domain. It opened with an interesting work on predicting semantic annotations in location-based social networks, in particular the prediction of missing labels for venues that lacked user-generated semantic information. While the machine learning part of the work was sound, its applicability to a real problem was doubted, suggesting the need to identify the essential challenges in a relatively new application area. Nonetheless, the keyword of the day was scalability: two talks focused on an ever-classic machine learning problem, clustering, introduced in the context of the trendy MapReduce model. Alina Ene from the University of Illinois introduced the basics, whereas the Brazilian Robson Cordeiro offered novel insights with a cutting-edge algorithm for clustering huge graphs. The latter work, driven by the guru Christos Faloutsos, combined the elegance of simplicity with the virtues of effectiveness, showing that for some size does not matter and petabytes of data can be crunched in minutes. A poster session came to shut the curtains on another day. The crowd was not discouraged by the organisers' only-one-free-drink offer, and a vibrant set of interactions took place: some discussed techniques, some looked for new datasets, while social cliques formed in the corners of the hotel's huge Douglas Pavilion.
Day 3 drove the conference participants into the dark technical depths of the well-established topic of matrix factorisation, followed by the user modelling session. Yahoo!'s Bee-Chung Chen gave an intriguing presentation on user reputation in a comment-rating environment, followed by the lucid talk of Panayiotis Tsaparas on selecting a useful subset of reviews for Amazon products plagued by tons of reviews. The Boston-based Greek gang of Microsoft Research also showed how Mechanical Turk can be used to assess the effectiveness of review selection in such systems. Poster session number 2 closed the day, and the group's work on link prediction in location-based social networks was up. The three-hour, exhausting but fruitful interaction with location-based enthusiasts, agnostics and doubters was a good opportunity to get the vibe of the community on an up-and-coming hot topic. For application developers and online service providers the work was an excellent example of how location-based data could be used to drive personalised and geo-temporally aware content to users; for data mining geeks it presented an unexplored territory where existing techniques could be tested and novel ones devised. At the end of the poster session many participants headed for a taste of San Diego's downtown, whereas the relaxing boat trips on the local gulf were also highly preferred.
The final day of the conference was marked by Kaggle's visionary entrepreneur Jeremy Howard and a panel of experts on data mining competitions. The panel aimed to analyse the problems that arose during previous competitions and the lessons learned for creating new, successful ones. Howard presented radical views, suggesting that the future of data mining and problem solving will be delivered in the form of competitions. Not only could competitions attract an army of approximately 10 million data analysts around the globe, but their design could promise a sustainable economic model that would bring money to all participants (even non-winners) and would perhaps put at stake a respectable number of PhD careers. His philosophy was driven by the idea that to solve challenging problems effectively, you need to awaken the diverse pool of minds out there that can constitute an infinite source of innovation.
But KDD attracted not only the interest of scientists and corporate experts, but also that of politicians. Ahead of the 2012 elections, the Obama data mining team is here and hiring! Rayid Ghani, chief scientist at Obama for America, highlighted the important role of predictive analytics and optimisation problems in the battle for an electorate that traditionally announces winners by only small margins. It remains to be seen whether science will beat Tea Party-style propaganda and maximise positive votes in a bumpy and complex socio-political landscape. The political world was also (quietly) represented by government data scientists and secret service analysts seeking to catch up with the state of the art in data mining and knowledge discovery, a vital survival requirement in a world overflowing with data and subsequent leaks...
The full proceedings of KDD 2011 can be found here.
Web Science Summer School
till tomorrow, i'm at the web science summer school. i was invited to give a talk on privacy in mobile-social networking applications. my talk was a re-mix of blog posts and papers (including spotme, "what we geeks don't get about social media privacy", and "location-related privacy in geo-social networks" - pdf). unfortunately i could not attend the whole summer school, but you can check the schedule here; my notes on a couple of talks follow.
marcel karnstedt gave a great presentation on the effects of user features on churn in social networks. he presented a nice empirical study of the mechanisms by which a web forum maintains a viable user base. he found that different forums show different behavioural patterns, and also spotted a few interesting regularities. have a go at his paper (pdf).
bernie hogan wondered what kind of mental models people have of their Facebook personal (ego) networks. to answer this question, he collected the mental models that a number of Facebook users have of their personal networks, collected the actual personal networks from Facebook, clustered them using a community detection algorithm, and looked at the extent to which the mental maps overlapped with the actual networks. he found that people are good at identifying the clusters they are involved in, but not good at identifying which of their social contacts act as `brokers' in the network. this finding has interesting implications - eg, since opportunities/new ideas tend to come from brokers and people find it difficult to identify brokers, it follows that people do not know where to look for new ideas, right? ;) bernie also said that neurotics tend to have brokered networks, while extroverts tend to have clustered networks. check bernie's publications here!
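a rough sketch of the method as i understood it (hypothetical data, and greedy modularity standing in for whatever algorithm bernie actually used): cluster the real ego network, then measure how well it overlaps with the clusters the user recalled from memory:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# toy ego network: two triangles bridged by the cat-dan tie.
ego_net = nx.Graph([("ann", "bob"), ("bob", "cat"), ("ann", "cat"),
                    ("dan", "eve"), ("eve", "fay"), ("dan", "fay"),
                    ("cat", "dan")])
detected = [set(c) for c in greedy_modularity_communities(ego_net)]

# clusters the user recalled from memory (the 'mental map').
perceived = [{"ann", "bob", "cat"}, {"dan", "eve", "fay"}]
for mental in perceived:
    best = max(detected, key=lambda c: len(c & mental) / len(c | mental))
    overlap = len(best & mental) / len(best | mental)
    print(sorted(mental), "-> jaccard overlap", round(overlap, 2))
```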
the student projects look very interesting. they include collaborative filtering, sentiment analysis, and community detection.