Fifth International AAAI Conference on Weblogs and Social Media
17-21 July 2011, Barcelona, Spain


Data Challenge

The ICWSM 2011 Data Challenge introduces a brand-new dataset, the 2011 ICWSM Spinn3r dataset. This dataset includes blogs from Spinn3r over a 33-day period, from January 13th, 2011 through February 14th, 2011. See here for details on how to obtain the collection.

Since the new collection spans some rather extraordinary world events, this year introduces a specific task: to locate significant posts in the collection that are relevant to the revolutions in Tunisia and Egypt. The criterion for "significant relevance" is that the post is worthy of being shared by you, an observer, with a friend. To participate in the task, we ask that you submit a ranked list of items from the collection; we will perform some form of relevance judgment and scoring in time for the conference.

The data challenge will culminate at ICWSM 2011 with a special workshop. To participate in the workshop, you must submit a 3-page short paper in PDF format and bring a poster to present at the workshop. The short papers will not be reviewed, but the workshop organizers will select a small panel of speakers based on the submissions. The short paper/poster can describe your participation in the shared task, OR ALTERNATIVELY other compelling work you have performed WITH THE 2011 DATASET.

Submissions are due on May 6th, 2011 (extended from April 22nd) and are made through EasyChair. For 3-page short papers, submit your paper in the usual way, in PDF format only. For a task submission, submit a plain ASCII text file in which each line has the following format:

Score ID

where 'Score' is a real number and 'ID' is the identifier of an item you judge to be of significant relevance. An item identifier is the filename of the protostream file, followed by a colon and the sequence number of the item, for example: 2011/01/13/MAINSTREAM_NEWS/en/0.protostream:1
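As an illustration of the line format, the following Python sketch builds an item identifier and checks that a submission line is well formed. The function names are hypothetical, and the sketch assumes that 'Score' and 'ID' are separated by whitespace, which the format above suggests but does not state explicitly.

```python
from typing import Tuple


def make_item_id(protostream_path: str, sequence_number: int) -> str:
    """Join a protostream filename and an item sequence number with a colon."""
    return f"{protostream_path}:{sequence_number}"


def parse_submission_line(line: str) -> Tuple[float, str]:
    """Split one 'Score ID' line into a float score and an item identifier.

    Assumption: the two fields are separated by whitespace.
    """
    fields = line.strip().split(maxsplit=1)
    if len(fields) != 2:
        raise ValueError(f"expected 'Score ID', got: {line!r}")
    score_text, item_id = fields

    # The identifier must end in ':<sequence number>'.
    path, _, sequence = item_id.rpartition(":")
    if not path or not sequence.isdigit():
        raise ValueError(f"malformed item identifier: {item_id!r}")

    return float(score_text), item_id


if __name__ == "__main__":
    item = make_item_id("2011/01/13/MAINSTREAM_NEWS/en/0.protostream", 1)
    print(parse_submission_line(f"0.87 {item}"))
```

A check like this can be run over a finished file before upload to catch lines with a missing score or a malformed identifier.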

The 'Score' values should be such that higher scores correspond to a higher likelihood or probability of relevance, as in a ranked ordering. If your approach creates an order without scores, use a sequence of decreasing integers in the Score field. If your approach to the task does not generate a score or an order, use the same score for every item. Finally, in EasyChair, briefly describe the approach taken by your system in the Abstract field of the form.
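The three scoring cases above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the function names are hypothetical, the single-space separator is an assumption, and sorting the scored lines from highest to lowest is a convenience for readability rather than a stated requirement.

```python
from typing import Iterable, List, Sequence, Tuple


def lines_from_scores(scored: Iterable[Tuple[str, float]]) -> List[str]:
    """Case 1: the system produces a score per item (higher = more relevant)."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [f"{score} {item_id}" for item_id, score in ranked]


def lines_from_order(item_ids: Sequence[str]) -> List[str]:
    """Case 2: the system produces an order only (best item first).

    Scores are decreasing integers: n, n-1, ..., 1.
    """
    total = len(item_ids)
    return [f"{total - rank} {item_id}" for rank, item_id in enumerate(item_ids)]


def lines_from_set(item_ids: Iterable[str], score: int = 1) -> List[str]:
    """Case 3: no score and no order, so every item gets the same score."""
    return [f"{score} {item_id}" for item_id in item_ids]


def write_submission(lines: Iterable[str], path: str) -> None:
    """Write one 'Score ID' entry per line as plain ASCII text."""
    # encoding="ascii" raises an error if any line contains non-ASCII text.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")


if __name__ == "__main__":
    ordered = [
        "2011/01/13/MAINSTREAM_NEWS/en/0.protostream:1",
        "2011/01/13/MAINSTREAM_NEWS/en/0.protostream:7",
    ]
    write_submission(lines_from_order(ordered), "submission.txt")
```

The second identifier in the usage example is made up for illustration; real identifiers come from the protostream files in the collection.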


Google Engineering

Microsoft Research/Bing

Yahoo Research

Church and Duncan Group, Inc.

University of Michigan School of Information

Find more about sponsoring ICWSM-11 over here