in between reading SOSP liveblogging notes, I'm still trying to think up how one might implement a "proof of deletion" service for cloud storage - here's the latest
a user stores data in the cloud - the data is encrypted so the cloud provider cannot simply read it, but is amenable to privacy-preserving queries on some keys.
the user wants to delete a record, contacts a trusted third party (the TTP - the grim reaper?), and gives them the keys of the records. the third party tells the cloud service to delete the data and then, using an anonymous service (via Tor etc), queries the record - they should get a 404 response.
of course, the cloud provider can squirrel data away but not in any useful way, as the TTP can do the query at any time
why not just let the user run the query? well, they might want to go away and rely on the TTP, who might also be persistent and might have bigger Tor guns....
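here is a minimal sketch of that flow, just to make the moving parts concrete. everything in it is made up for illustration: CloudStore, LazyStore and GrimReaper are hypothetical names, the store is an in-memory dict rather than a real cloud API, and the anonymity layer is not modelled at all (in the real thing the audit queries would have to be indistinguishable from ordinary anonymous queries).

```python
class CloudStore:
    """Toy stand-in for the cloud provider: opaque key -> encrypted blob."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, blob):
        self._blobs[key] = blob

    def delete(self, key):
        self._blobs.pop(key, None)

    def query(self, key):
        """Answer like an HTTP endpoint would: (status, blob)."""
        if key in self._blobs:
            return 200, self._blobs[key]
        return 404, None


class LazyStore(CloudStore):
    """A provider that acknowledges deletions but keeps serving the data."""

    def delete(self, key):
        pass  # pretend to delete, keep everything


class GrimReaper:
    """Hypothetical third party that requests deletions and audits them later."""

    def __init__(self, store):
        self.store = store
        self.watched = set()

    def request_deletion(self, keys):
        for key in keys:
            self.store.delete(key)
            self.watched.add(key)

    def audit(self):
        """Re-query every watched key; return the ones that still answer."""
        return sorted(k for k in self.watched if self.store.query(k)[0] != 404)


honest = CloudStore()
honest.put("record-1", b"ciphertext")
reaper = GrimReaper(honest)
reaper.request_deletion(["record-1"])
print(reaper.audit())  # keys that failed the audit, if any
```

the point of keeping the watched set around is that audit() can be re-run at any later time, which is what stops a provider from restoring the data once it thinks nobody is looking.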
Chris Smowton, Stephen Smith, Derek Murray, Steve Hand and myself are in the beautiful Cascais near Lisbon to attend the 23rd Symposium on Operating Systems Principles. Hopefully, we are going to find some time to write up some of the presentations for syslog over the next couple of days. In the meantime, there is a semi-live blog of semi-structured notes here.
Last week I was in (the other) Cambridge, attending the "Second conference on the Analysis of Mobile Phone Datasets and Networks", or NetMob, held at the MIT Media Lab together with SocialCom 2011. NetMob has an interesting format: there is only one track of short contributed talks, which may present recent results or results submitted elsewhere. Speakers have about 10-12 minutes to present their work, and then there is plenty of time to discuss ideas and network with other people over the 2 days. I gave two talks: one on our research on the effect of geographic distance on online social networks and another on our recent work on universal patterns in urban human mobility.
The unifying theme of the workshop is the analysis of mobile phone datasets: as people use mobile devices more, and to do more things, these datasets help us to understand complex processes such as the spread of information, human mobility, the usage of urban geography and so on. Indeed, the range of talks presented at the workshop was impressive and fascinating, spanning two main themes: the first day focused more on studying user mobility, while the second day featured works on social behaviour.
Among the most innovative works of the first day was a talk by people at MIT & Berkeley on using mobile phone CDRs to make sense of urban roads, proposing to use the Gini coefficient to measure the diversity of individual traffic carried by each street. Individual user mobility was the main theme of several talks: I particularly liked one on the seasonal patterns of user movements, presented by Northeastern University researchers, and one by a large team led by Vincent Blondel on exploring the spatio-temporal properties of human mobility and the regular home-work routine of many users. Laszlo Barabasi gave an invited talk on mobility and predictability, presenting much of his latest work and trying to connect the statistical properties of human mobility to the performance limits of many related applications that rely on user regularity. Finally, AT&T Labs presented their results on why it is impossible to anonymize location data.
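For readers who have not met the Gini coefficient outside of economics, here is a rough sketch of how it could be computed for a street. The exact definition used in the talk is not given here, so this assumes the simplest setup: for each street we have a list of trip counts, one per traffic source (the function name and the toy numbers are purely illustrative).

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means the traffic is spread evenly over all sources; values close
    to 1 mean that a few sources account for almost all of it.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed form on sorted data, with ranks i = 1..n:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Toy inputs (not real data): trips per source on two imaginary streets.
evenly_used_street = [25, 25, 25, 25]
dominated_street = [0, 0, 0, 100]
print(gini(evenly_used_street), gini(dominated_street))
```

With n sources the largest possible value is (n - 1) / n, so streets with very different numbers of sources are not directly comparable without some normalisation.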
The second day featured works on the social properties of mobile phone communication between users. Researchers at CMU presented their results on quantifying how social influence might compel users to adopt some products by using randomization techniques. Another interesting talk by a joint team from UC3M and Telefonica presented how time allocation in social networks has strong constraints that are likely to affect and be affected by the social structure itself: well-connected hubs have a lower importance for information transmission than less connected users, with important consequences for many dynamic social processes. Sandy Pentland gave another invited talk, offering a wide overview of how mobile devices are changing the technological landscape with their ubiquitous sensing capabilities. Another interesting talk discussed the economic value of mobile location data, presenting scenarios in which user actions can be monetized and the profit shared among different service providers.
Overall NetMob provided an insightful venue for discussions and potential collaborations, always revolving around the idea that as mobile devices become more and more ubiquitous they will offer new fascinating research opportunities.
Many more details about all the talks are in the book of abstracts.
here are a few papers presented at socialcom (our two papers on personality and language are summarized here)
Funf: Open Android Sensing Framework. One tutorial at socialcom was dedicated to Funf. This is an open source set of functionalities running on phones and servers that enable the collection (sensing), uploading, and configuration of a wide range of data types (location, movement, usage, social proximity). This framework has been built by a professional developer within Sandy Pentland's group (thanks to a Google grant), and has just been made publicly available on the Android Market (well done!) (download link). The conference featured a considerable number of papers that made use of the framework. A case in point is [1]. This paper is about predicting "who installs which (mobile) app" based on one's social network (here the term network refers to a composite graph made of different types of phone-sensed networks). It turns out that one has more apps in common with familiar strangers than with friends (I'm not 100% sure though, you need to check the paper). A cute bit of the framework is its fun dashboard - this allows researchers to run studies in which personal data is shown to the participants and consequential changes of behaviour can be automatically traced. The ubicomp paper [2] highlights the vision behind the framework.
[1] Composite Social Network for Predicting Mobile Apps Installation
[2] the social fMRI: Measuring, Understanding and Designing Social Mechanism in the Real World. Ubicomp 11.
Another "special" session was dedicated to cyber-bullying - an extremely interesting topic in need of research (pdf overview). Folks at the media lab built an initial model to spot cyber-bullying from conversation in social media. Interestingly, they trained the model using the data from this site. The paper will soon be published and will be titled "Commonsense Reasoning for Detection, Prevention, and Mitigation of Cyberbullying"
Predicting Reciprocity in Social Networks. This paper studied the factors that are associated with the probability that a node w reciprocates and links to a node v in a social network. The most important factor is the difference in status between the two nodes v and w: status(v)/status(w), where status(v)=in_degree(v)/out_degree(v).
The larger that fraction, the more likely w is to reciprocate the link. That is because a large fraction indicates that v has many in-links and few out-links, while w has many out-links and few in-links. This suggests that when v has higher "status" than w, w will be more likely to reciprocate.
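To make the definition concrete, here is a small sketch that computes status(v)/status(w) from a list of directed edges. It only implements the formula as summarized above; the function name, the edge-list representation and the toy graph are my own assumptions, and the paper may well handle nodes with zero in- or out-degree differently (this version simply fails with a ZeroDivisionError on them).

```python
from collections import Counter


def status_ratio(edges, v, w):
    """Return status(v) / status(w), where status(x) = in_degree(x) / out_degree(x).

    `edges` is a list of directed (source, target) pairs. Both nodes are
    assumed to have non-zero in- and out-degree.
    """
    in_degree = Counter(target for _, target in edges)
    out_degree = Counter(source for source, _ in edges)
    status_v = in_degree[v] / out_degree[v]
    status_w = in_degree[w] / out_degree[w]
    return status_v / status_w


# Toy graph: "celeb" is followed by three accounts and follows only "fan";
# "fan" follows three accounts and is followed only by "celeb".
edges = [
    ("a", "celeb"), ("b", "celeb"), ("c", "celeb"),
    ("celeb", "fan"),
    ("fan", "a"), ("fan", "b"), ("fan", "c"),
]
print(status_ratio(edges, "celeb", "fan"))  # large: v has the higher status
print(status_ratio(edges, "fan", "celeb"))  # small: v has the lower status
```

Here status(celeb) is 3/1 and status(fan) is 1/3, so the first ratio comes out at about 9 and the second at about 1/9.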
Link Prediction in Social Networks using Computationally Efficient Topological Features. Using the Katz measure, these researchers effectively predicted social ties in a variety of networks. This isn't a very novel work, yet it's interesting that the Katz measure performed best.
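As a reminder, the Katz measure scores a pair of nodes by counting the walks between them, with a walk of length l weighted by beta^l so that short connections count most. Below is a minimal sketch, not the paper's implementation: it truncates the sum at a maximum walk length instead of using the closed-form matrix inverse, and the function name and default parameters are arbitrary choices of mine.

```python
def katz_scores(adj, beta=0.1, max_len=4):
    """Truncated Katz index for every pair of nodes.

    adj is a square adjacency matrix given as a list of lists.
    scores[i][j] = sum over l = 1..max_len of beta**l * (number of walks
    of length l from i to j).
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    walks = [row[:] for row in adj]  # walks of length 1, i.e. A itself
    weight = beta
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * walks[i][j]
        # Extend every walk by one more step: walks <- walks * A
        walks = [
            [sum(walks[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        weight *= beta
    return scores


# Toy undirected path graph 0 - 1 - 2 - 3.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = katz_scores(path)
print(scores[0][2], scores[0][3])  # the closer pair gets the higher score
```

To predict links one would rank the currently unconnected pairs by score and take the top ones. For the untruncated sum to converge, beta has to be smaller than the reciprocal of the largest eigenvalue of the adjacency matrix.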
The new director of the media lab, Joi Ito, gave an interesting keynote on "Open Standards and Open Networks". He recounted his involvement in a post-disaster radiation monitoring effort in Japan. During his talk, I also learned that a large number of governments are releasing their data (not pictures or videos, but data) under Creative Commons licences.
Fortune Monitor or Fortune Teller: Understanding the Connection between Interaction Patterns and Financial Status. This paper studied the relationship between interactions monitored using mobile phones and financial status. Apparently people with high income don't talk longer but their meeting patterns (mobility) tend to be more diverse than those of people on low income. They also studied people's personality traits and found that people high in
1) Agreeableness tend to have more friends and interact with diverse users (as per face-to-face interactions monitored with bluetooth)
2) Happiness [I hope they measured satisfaction with life] tend to have more diverse contacts (it would be cool to double-check the measure of diversity used here)
The workshop NetMob was running in parallel and featured a lot of interesting talks that used mobile phone data to answer very interesting societal questions. The full program is in pdf. Salvo fully attended it, so he might be able to tell you more about it ;)
I presented a couple of papers at this year's Socialcom. While I was presenting, the twittersphere was offering encouraging and puzzling feedback:
- I love the way @danielequercia introduces a book to read in each of his talk :D
- I really like @danielequercia style in making slides and presenting! minimal, cool and fun :D
The irony is that, during the coffee break right before my talk, I received some constructive feedback on how to structure my presentations and avoid having, as I often do, superficial and high-level slides for a *scientific* talk. Well, that's not the first time I get this feedback, and I accept it. However, I feel that many talks at conferences suffer from powerpoint karaoke syndrome - to look "right" (like a proper scientist/professional dude), one needs to recast a paper into slide format. Bad mistake, as The Great Simon L Peyton Jones would tell us. Since I apparently like to suggest books, then let me say that, despite the title, "Presenting to Win" is the best book on how to prepare and deliver presentations (it's for a business audience, but you can easily adapt it to your needs). Ideally, one should be able to give a talk without any slides - this way, I bet that karaoke presenters will be more likely to reach enlightenment and enter nirvana (provided that they spend 3 days to prepare a 15-minute presentation). If a smooth transition between powerpoint karaoke and nirvana is needed, then karaoke presenters might well try the "Takahashi Method" - Lawrence Lessig has successfully used it (link to one of his talks) and Steve Jobs was doing something similar for his keynotes.
Anyhooow :) this post isn't about presentation styles but about the two papers I presented :) Here is a quick abstract that summarizes them. Enjoy ;)
In the first paper [pdf paper slides], we tested whether Twitter users can be reduced to look-alike nodes (as most of the spreading models would assume) or, instead, whether they show individual differences that impact their popularity and influence. One aspect that may differentiate users is their character and personality. The problem is that personality is difficult to observe and quantify on Twitter. It has been shown, however, that personality is linked to what is unobtrusively observable in tweets: the use of language. We thus carry out a study of tweets and show that popular and influential users linguistically structure their tweets in specific ways. This suggests that the popularity and influence of a Twitter account cannot be simply traced back to the graph properties of the network within which it is embedded, but also depends on the personality and emotions of the human being behind it. Also, in the second paper [pdf paper slides], for a limited number of 335 users, we are able to gather personality data, analyze it, and find that both popular users and influentials are extroverts and emotionally stable (low in the trait of Neuroticism). Interestingly, we also find that popular users are "imaginative" (high in Openness), while influentials tend to be "organised" (high in Conscientiousness). We then show a way of accurately predicting a user's personality simply based on three counts publicly available on profiles: following, followers, and listed counts. Knowing these three quantities about an active user, one can predict the user's five personality traits with a root-mean-squared error below 0.88 on a [1,5] scale. Based on these promising results, we argue that being able to predict user personality goes well beyond our initial goal of informing the design of new personalized applications as it, for example, expands current studies on privacy in social media.