The ICWSM-14 Committee is pleased to present the Tutorials Day program for the Eighth International Conference on Weblogs and Social Media (ICWSM-14) in Ann Arbor, MI. The Tutorials Day gives junior and senior researchers an opportunity to spend a day freely exploring exciting advances in disciplines outside their normal focus.
9AM-12PM Sunday, June 1:
T1: Online Experiments for Computational Social Science
T2: Social Media Threats and Countermeasures
1-4PM Sunday, June 1:
T3: Route Planning and Visualization Using Geo-Social Media Data
T4: Large Scale Network Analytics with SNAP
T1: Online Experiments for Computational Social Science
Presenters: Eytan Bakshy and Sean Taylor
Taught by two researchers on the Facebook Data Science team, this
tutorial teaches attendees how to design, plan, implement, and analyze
online experiments. First, we review basic concepts in causal
inference and motivate the need for experiments. Next, we cover
basic statistical tools for planning experiments: exploratory
analysis, power calculations, and the use of simulation in R. We
then discuss statistical methods for estimating causal quantities of
interest and constructing appropriate confidence intervals. Particular
attention is given to scalable methods suitable for "big
data", including working with weighted data and clustered
bootstrapping. After that, we show how to design and implement online
experiments using PlanOut, an open-source toolkit for advanced online
experimentation used at Facebook, covering basic "A/B
tests", within-subjects designs, and more sophisticated
experiments. We demonstrate how experimental
designs from the social computing literature can be implemented, and
review in detail two very large field experiments conducted at
Facebook using PlanOut. Finally, we discuss issues with
logging and common errors in the deployment and analysis of
experiments. Attendees will be given code examples and participate in
the planning, implementation, and analysis of a Web application using
Python, PlanOut, and R.
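As a rough illustration of the clustered bootstrapping mentioned in the abstract, the sketch below resamples whole clusters (for example, users) rather than individual observations when building a confidence interval for a treatment effect. It is a minimal sketch under stated assumptions, not the presenters' material: the function name `cluster_bootstrap_ci`, the data layout, and the synthetic data are all hypothetical, and it uses plain Python rather than PlanOut or R.

```python
import random
from statistics import mean


def cluster_bootstrap_ci(outcomes_by_cluster, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for mean(treatment) - mean(control).

    outcomes_by_cluster maps a cluster id (hypothetically, a user id) to a
    list of (arm, outcome) pairs, where arm is "treatment" or "control".
    Whole clusters are resampled with replacement, so correlated
    observations from the same cluster stay together in every replicate.
    """
    rng = random.Random(seed)
    cluster_ids = list(outcomes_by_cluster)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(cluster_ids, k=len(cluster_ids))
        treated, control = [], []
        for cid in sample:
            for arm, y in outcomes_by_cluster[cid]:
                (treated if arm == "treatment" else control).append(y)
        # Skip the rare replicate that contains only one arm.
        if treated and control:
            estimates.append(mean(treated) - mean(control))
    estimates.sort()
    low = estimates[int((alpha / 2) * len(estimates))]
    high = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return low, high


# Synthetic data (an assumption for illustration): 40 users, 5 observations
# each, a user-level random effect, and a true treatment effect of 1.0.
data_rng = random.Random(42)
data = {}
for user in range(40):
    arm = "treatment" if user % 2 == 0 else "control"
    user_effect = data_rng.gauss(0, 1)
    effect = 1.0 if arm == "treatment" else 0.0
    data[user] = [(arm, effect + user_effect + data_rng.gauss(0, 0.5))
                  for _ in range(5)]

ci_low, ci_high = cluster_bootstrap_ci(data)
print(f"95% clustered bootstrap CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

Resampling at the cluster level matters because repeated observations from the same user are not independent; resampling individual rows would typically produce an interval that is too narrow.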
Eytan Bakshy is a researcher and senior member of the
Facebook Data Science Team. He has been conducting field experiments
at Facebook for over three years, focusing on peer effects in networks.
Eytan holds a Ph.D. in information from the University of
Michigan and a B.S. in mathematics from UIUC.
Sean J. Taylor is a research scientist on the Facebook Data Science Team specializing in field experiments on Web and mobile platforms. His research interests include causal inference, social influence, information credibility, and evaluation of predictions. Sean holds a Ph.D. in information systems from NYU and a B.S. in economics from UPenn.
T2: Social Media Threats and Countermeasures
Presenters: Kyumin Lee, James Caverlee, and Calton Pu
The past few years have seen the rapid rise of many successful
social systems - from Web-based social networks (e.g., Facebook,
LinkedIn) to online social media sites (e.g., Twitter, YouTube) to
large-scale information sharing communities (e.g., reddit, Yahoo!
Answers) to crowd-based funding services (e.g., Kickstarter,
IndieGoGo) to Web-scale crowdsourcing systems (e.g., Amazon MTurk).
However, with this success has come a commensurate wave of new threats, including bot-controlled accounts in social media systems for disseminating malware and commercial spam messages, adversarial propaganda campaigns designed to sway public opinion, collective attention spam targeting popular topics and memes, and the propagation of manipulated content.
This tutorial will introduce peer-reviewed research on social media threats and countermeasures. Specifically, we will address new threats such as social spam, campaigns, misinformation, and crowdturfing, and survey countermeasures that mitigate and resolve these threats by revealing and detecting malicious participants (e.g., social spammers, content polluters, and crowdturfers) and low-quality content. The tutorial will also give an overview of available tools for detecting these participants.
Kyumin Lee is an Assistant Professor in the Department of Computer Science at Utah State University. His primary research interests are in information quality and data analytics over large-scale networked information systems like the Web, social media systems, and other emerging distributed systems. His current work focuses on both a negative and a positive dimension. On one hand, he focuses on threats to these systems and designs methods to mitigate negative behaviors; on the other, he looks for positive opportunities to mine and analyze these systems for developing next-generation algorithms and architectures that can empower decision makers. He received a highly competitive Google Faculty Research Award in 2013. He has published 30 peer-reviewed research papers in top journals and conferences such as TIST, SIGIR, WWW, CIKM and ICWSM. His work has been featured in MIT Technology Review. Lee received his Ph.D. from Texas A&M in 2013.
James Caverlee is an Associate Professor in the Department of Computer Science and Engineering at Texas A&M University. His research focuses on web-scale information management, distributed data-intensive systems, and social computing. Most recently, he has been working on (i) spam and crowdturfing threats to social media and web systems; and (ii) geo-social systems that leverage large-scale spatio-temporal footprints in social media. Caverlee is a recipient of the 2010 Defense Advanced Research Projects Agency (DARPA) Young Faculty Award, the 2012 Air Force Office of Scientific Research (AFOSR) Young Investigator Award, and a 2012 NSF CAREER Award, and was named a Texas A&M Center for Teaching Excellence Montague-CTE Scholar for 2011-2012. Caverlee received his Ph.D. from Georgia Tech in 2007.
Calton Pu is a Professor in the College of Computing at the Georgia Institute of Technology. His research interests are in the areas of distributed computing, Internet data management, and operating systems. He has published more than 250 papers in journals, book chapters, conference proceedings, and refereed workshops in several system-related areas, including operating systems, transaction processing, systems reliability, security, and Internet data management. He has worked on spam and denial of information (with several academic and industry partners), service computing (with IBM Research), and automated system management (with HP Labs). He has served on more than 100 program committees for more than 50 international conferences and workshops, including as PC co-chair of SRDS, ICDE, CoopIS, and DOA, and as general co-chair of CIKM, ICDE, CEAS, and SCC. The sponsors of his research include government funding agencies such as DARPA and NSF and companies such as IBM, Intel, and HP. He is an affiliated faculty member of the Center for Experimental Research in Computer Systems (CERCS), the Georgia Tech Information Security Center (GTISC), and the Tennenbaum Institute. Pu received his Ph.D. from the University of Washington in 1986.
T3: Route Planning and Visualization Using Geo-Social Media Data
Presenters: Hsun-Ping Hsieh, Thomas Sandholm, and Cheng-Te Li
Geo-social media data, produced by GPS-enabled devices,
location-based services, and digital
cameras, are ubiquitous thanks to the maturity of
mobile and Web technologies. Geographical activities of human
beings are tracked in the form of trajectories.
User-generated geo-social trajectory data enable a novel application,
route planning, which aims to recommend travel routes
satisfying trip requirements. In this tutorial, we aim to
introduce two popular topics related to the analysis of geo-social
media data: route planning and geo-data
visualization. The first part provides a broad review of recent
advances in the route planning problem using GPS
trajectories and uncertain trajectories that come from different
sources and possess diverse properties and problems. Given
geo-social query requirements depicting the desired routes, which are
divided into three categories (location, context, and
social), we elaborate on three mainstream approaches to route planning:
graph search, pattern mining, and inference/learning. The second
part gives a technical introduction and practical advice on how to
visualize geo-social data using various tools, including Google Maps,
D3, Google Fusion Tables, Google Earth, Tableau Public,
OpenStreetMap, Python Heatmap, Stamen, and MongoLab. Hands-on examples are
provided to illustrate techniques for cloud data storage, scalable
geo-marker positioning, and interactive maps for visualization.
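To make the graph-search approach to route planning concrete, the sketch below runs Dijkstra's algorithm over a tiny location graph. This is only an illustrative assumption about what such a search could look like, not the presenters' method: the location names, the edge costs (imagined here as travel costs estimated from trajectories), and the function name `shortest_route` are all hypothetical.

```python
import heapq


def shortest_route(graph, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts graph.

    graph[a][b] is the non-negative cost of travelling from a to b.
    Returns (total_cost, [locations on the route]), or (inf, []) if the
    goal cannot be reached from the start.
    """
    queue = [(0.0, start, [start])]
    best = {start: 0.0}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for nxt, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt, path + [nxt]))
    return float("inf"), []


# Hypothetical location graph; costs are made up for illustration.
locations = {
    "hotel": {"museum": 4.0, "park": 2.0},
    "park": {"museum": 1.0, "market": 7.0},
    "museum": {"market": 3.0},
    "market": {},
}

route_cost, route = shortest_route(locations, "hotel", "market")
print(route_cost, route)  # 6.0 ['hotel', 'park', 'museum', 'market']
```

A real geo-social planner would layer location, context, and social requirements on top of a search like this, for example by filtering candidate nodes or adjusting edge costs.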
Hsun-Ping Hsieh is a Ph.D. candidate at National Taiwan
University with research interests in geo-social and urban computing.
He worked as a research intern at Microsoft Research Asia and received
the "Excellent Stars of Tomorrow" award in 2013. His representative
recognition includes ACM KDD Cup 2010 First Prize, and Garmin
Thomas Sandholm is a Principal Research Scientist at HP Labs in Palo Alto, CA, USA. He holds a Ph.D. in Computer Science from the Royal Institute of Technology in Sweden, and has worked as research staff on distributed systems and geo-social media analysis at Argonne National Labs, Lund University, and KAIST.
Cheng-Te Li is a Postdoctoral Researcher
at the Institute of Information Science, Academia Sinica, with
research interests in social networks, big data mining, and geo-social
computing. He holds a Ph.D. in computer science from National
Taiwan University. His representative international
recognition includes the Facebook Fellowship Finalist Award 2012 and ACM
KDD Cup 2012 First Prize.
T4: Large Scale Network Analytics with SNAP
Presenters: Jure Leskovec and Rok Sosic
Techniques for social media modeling, analysis and optimization
are based on studies of large scale networks, where a network can
contain hundreds of millions of nodes and billions of edges. Network
analysis tools must provide not only extensive functionality, but also
high performance in processing these large networks.
The tutorial will present the Stanford Network Analysis Platform (SNAP), a general-purpose, high-performance system for analysis and manipulation of large networks. SNAP is widely used in studies of the Web and social media. SNAP consists of open-source software, which provides a rich set of functions for performing network analytics, and a popular repository of publicly available real-world network datasets. SNAP software APIs are available in Python and C++.
The tutorial will cover all aspects of SNAP, including SNAP APIs and SNAP datasets. The tutorial is targeted toward an entry-level audience with some programming background, so the Python API will be presented in more detail than the C++ API. The tutorial will include a hands-on component, where participants will have the opportunity to use SNAP on their own computers.
Jure Leskovec is an assistant professor of Computer Science at Stanford University. His research focuses on mining and modeling large social and information networks, their evolution, and diffusion of information and influence over them. Problems he investigates are motivated by large scale data, the Web and on-line media.
Rok Sosic is a Research Associate in the Department of Computer Science at Stanford University.