The ICWSM-14 Committee is pleased to present the Tutorials Day program for the Eighth International Conference on Weblogs and Social Media (ICWSM-14) in Ann Arbor, MI. The Tutorials Day gives junior and senior researchers an opportunity to spend a day freely exploring exciting advances in disciplines outside their normal focus.
9AM-12PM Sunday, June 1:
T1: Online Experiments for Computational Social Science
T2: Social Media Threats and Countermeasures
1-4PM Sunday, June 1:
T3: Route Planning and Visualization Using Geo-Social Media Data
T4: Large Scale Network Analytics with SNAP
T1: Online Experiments for Computational Social Science
Presenters: Eytan Bakshy and Sean Taylor
Taught by two researchers on the Facebook Data Science team, this
tutorial teaches attendees how to design, plan, implement, and analyze
online experiments. First, we review basic concepts in causal
inference and motivate the need for experiments. Then we will discuss
basic statistical tools to help plan experiments: exploratory
analysis, power calculations, and the use of simulation in R. We
then discuss statistical methods to estimate causal quantities of
interest and construct appropriate confidence intervals. Particular
attention will be given to scalable methods suitable for "big
data", including working with weighted data and clustered
bootstrapping. We then discuss how to design and implement online
experiments using PlanOut, an open-source toolkit for advanced online
experimentation used at Facebook. We will show how basic "A/B
tests", within-subjects designs, as well as more sophisticated
experiments can be implemented. We demonstrate how experimental
designs from social computing literature can be implemented, and also
review in detail two very large field experiments conducted at
Facebook using PlanOut. Finally, we will discuss issues with
logging and common errors in the deployment and analysis of
experiments. Attendees will be given code examples and participate in
the planning, implementation, and analysis of a Web application using
Python, PlanOut, and R.
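As a rough illustration of the clustered bootstrapping mentioned above, the sketch below resamples whole users (clusters) with replacement and recomputes a difference in means each time. It is not the presenters' material and does not use PlanOut. The function names, the `(cluster_id, condition, outcome)` row layout, and the simulated data are all assumptions made for this example.

```python
import random
from collections import defaultdict


def diff_in_means(clusters):
    """Treatment mean minus control mean over a list of clusters.

    Each cluster is a list of (condition, outcome) pairs. Returns None
    if either condition is absent from the sample.
    """
    sums = {"control": 0.0, "treatment": 0.0}
    counts = {"control": 0, "treatment": 0}
    for observations in clusters:
        for condition, outcome in observations:
            sums[condition] += outcome
            counts[condition] += 1
    if counts["control"] == 0 or counts["treatment"] == 0:
        return None
    return (sums["treatment"] / counts["treatment"]
            - sums["control"] / counts["control"])


def clustered_bootstrap(rows, n_boot=1000, seed=0):
    """Point estimate and 95% percentile interval, resampling by cluster.

    rows: iterable of (cluster_id, condition, outcome) tuples
    (a hypothetical layout chosen for this sketch).
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for cluster_id, condition, outcome in rows:
        by_cluster[cluster_id].append((condition, outcome))
    clusters = list(by_cluster.values())

    estimates = []
    for _ in range(n_boot):
        # Draw whole clusters with replacement, keeping each cluster's
        # observations together so within-cluster correlation is preserved.
        resample = [rng.choice(clusters) for _ in clusters]
        estimate = diff_in_means(resample)
        if estimate is not None:
            estimates.append(estimate)
    if not estimates:
        raise ValueError("no bootstrap replicate contained both conditions")

    estimates.sort()
    low = estimates[int(0.025 * len(estimates))]
    high = estimates[int(0.975 * len(estimates)) - 1]
    return diff_in_means(clusters), (low, high)


# Simulated data: 40 users, 5 correlated observations each, with a
# true treatment effect of 1.0 (all values are made up for illustration).
data_rng = random.Random(1)
rows = []
for user in range(40):
    condition = "treatment" if user % 2 else "control"
    user_effect = data_rng.gauss(0, 1)
    for _ in range(5):
        base = 1.0 if condition == "treatment" else 0.0
        rows.append((user, condition, base + user_effect + data_rng.gauss(0, 0.5)))

estimate, (low, high) = clustered_bootstrap(rows)
print(f"estimate={estimate:.2f}, 95% interval=({low:.2f}, {high:.2f})")
```

Resampling users rather than individual observations matters here because repeated measurements from the same user are correlated. Treating them as independent would typically give intervals that are too narrow.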
Eytan Bakshy is a researcher and senior member of the
Facebook Data Science Team. He has been conducting field experiments
at Facebook for over three years, focusing on peer effects in networks.
Eytan holds a Ph.D. in information from the University of
Michigan and a B.S. in mathematics from UIUC.
Sean J. Taylor is a research scientist on the Facebook
Data Science Team specializing in field experiments on Web and mobile
platforms. His research interests include causal inference, social
influence, information credibility, and evaluation of predictions.
Sean holds a Ph.D. in information systems from NYU and a B.S. in
economics from UPenn.
T2: Social Media Threats and Countermeasures
Presenters: Kyumin Lee, James Caverlee, and Calton Pu
The past few years have seen the rapid rise of many successful
social systems - from Web-based social networks (e.g., Facebook,
LinkedIn) to online social media sites (e.g., Twitter, YouTube) to
large-scale information sharing communities (e.g., reddit, Yahoo!
Answers) to crowd-based funding services (e.g., Kickstarter,
IndieGoGo) to Web-scale crowdsourcing systems (e.g., Amazon MTurk,
Crowdflower).
However, with this success has come a commensurate wave of new
threats, including bot-controlled accounts in social media systems for
disseminating malware and commercial spam messages, adversarial
propaganda campaigns designed to sway public opinion, collective
attention spam targeting popular topics and memes, and the propagation
of manipulated content.
This tutorial will introduce peer-reviewed research work on social
media threats and countermeasures. Specifically, we will address new
threats such as social spam, campaigns, misinformation and
crowdturfing, and survey countermeasures to mitigate and resolve
these threats by revealing and detecting malicious participants (e.g.,
social spammers, content polluters, and crowdturfers) and low-quality
content. This tutorial will also survey available tools to detect
these participants.
Kyumin Lee is an Assistant Professor, Department of
Computer Science, Utah State University, kyumin.lee@usu.edu. Kyumin Lee's primary
research interests are in information quality and data analytics over
large-scale networked information systems like the Web, social media
systems, and other emerging distributed systems. His current work
focuses on both a negative and a positive dimension. On one hand, he
focuses on threats to these systems and designs methods to mitigate
negative behaviors; on the other, he looks for positive opportunities
to mine and analyze these systems for developing next generation
algorithms and architectures that can empower decision makers. He
received a highly-competitive Google Faculty Research Award in 2013.
He has published 30 peer-reviewed research papers in top journals and
conferences such as TIST, SIGIR, WWW, CIKM, and ICWSM. His work has been
covered by MIT Technology Review. Lee received his Ph.D. from
Texas A&M in 2013.
James Caverlee is an Associate Professor, Department of
Computer Science and Engineering, Texas A&M University, caverlee@cse.tamu.edu. James
Caverlee's research focuses on web-scale information management,
distributed data-intensive systems, and social computing. Most
recently, he's been working on (i) spam and crowdturfing threats to
social media and web systems; and (ii) geo-social systems that
leverage large-scale spatio-temporal footprints in social media.
Caverlee is a recipient of the 2010 Defense Advanced Research Projects
Agency (DARPA) Young Faculty Award, the 2012 Air Force Office of
Scientific Research (AFOSR) Young Investigator Award, a 2012 NSF
CAREER Award, and has been named a Texas A&M Center for Teaching
Excellence Montague-CTE Scholar for 2011-2012. Caverlee received his
Ph.D. from Georgia Tech in 2007.
Calton Pu is a Professor, College of Computing, Georgia
Institute of Technology, calton.pu@cc.gatech.edu.
Calton Pu's research interests are in the areas of distributed
computing, Internet data management, and operating systems. He has
published more than 250 papers in journals, book chapters, conference
proceedings, and refereed workshops in several system-related areas,
including operating systems, transaction processing, systems
reliability, security, and Internet data management. He worked
on spam and denial of information (with several academic and industry
partners), service computing (with IBM Research), and automated system
management (with HP Labs). He has served on more than 100
program committees for more than 50 international conferences and
workshops, including serving as PC co-chair of SRDS, ICDE, CoopIS, and DOA, and
general co-chair for CIKM, ICDE, CEAS, and SCC. The sponsors of
Calton Pu's research include government funding agencies such as
DARPA and NSF, and companies such as IBM, Intel, and HP.
He is an affiliated faculty member of the Center for Experimental Research in
Computer Systems (CERCS), the Georgia Tech Information Security Center
(GTISC), and the Tennenbaum Institute. Pu received his Ph.D. from
the University of Washington in 1986.
T3: Route Planning and Visualization Using Geo-Social Media Data
Presenters: Hsun-Ping Hsieh, Thomas Sandholm, and Cheng-Te Li
Geo-social media data, produced by GPS-enabled devices,
location-based services, and digital
cameras, are ubiquitous thanks to the maturity of
mobile and Web technologies. Geographical activities of human
beings are tracked in the form of trajectories.
User-generated geo-social trajectory data enable a novel application,
route planning, which aims to recommend travel routes
satisfying trip requirements. In this tutorial, we aim to
introduce two popular topics related to the analysis of geo-social
media data: route planning and geo-data
visualization. The first part provides a broad review of recent
advances on the route planning problem using GPS
trajectories and uncertain trajectories that come from different
sources and possess diverse properties and problems. Given
geo-social query requirements depicting the desired routes, which are
divided into three categories, i.e., location, context, and
social, we elaborate on three mainstream approaches to route planning:
graph search, pattern mining, and inference/learning. The second
part gives a technical introduction and practical advice on how to
visualize geo-social data using various tools, including Google Maps,
D3, Google Fusion Tables, Google Earth, Tableau Public, OpenStreetMap,
Python Heatmap, Stamen, and Mongolabs. Hands-on examples are
provided to illustrate techniques for cloud data storage, scalable
geo-marker positioning, and interactive maps for visualization.
Hsun-Ping Hsieh is a Ph.D. candidate at National Taiwan
University with research interests in geo-social and urban computing.
He worked as a research intern at Microsoft Research Asia and received
the "Excellent Stars of Tomorrow" award in 2013. His honors
include the ACM KDD Cup 2010 First Prize and the Garmin
Fellowship 2014.
Thomas Sandholm is a Principal Research Scientist at HP
Labs in Palo Alto, CA, USA. He holds a Ph.D. in Computer Science from
the Royal Institute of Technology in Sweden and has worked as research
staff on distributed systems and geo-social media analysis at Argonne
National Labs, Lund University and KAIST.
Cheng-Te Li is a Postdoctoral Researcher
at the Institute of Information Science, Academia Sinica, with
research interests in social networks, big data mining, and geo-social
computing. He holds a Ph.D. in computer science from National
Taiwan University. His international
honors include the Facebook Fellowship Finalist Award 2012 and the ACM
KDD Cup 2012 First Prize.
T4: Large Scale Network Analytics with SNAP
Presenters: Jure Leskovec and Rok Sosic
Techniques for social media modeling, analysis and optimization
are based on studies of large scale networks, where a network can
contain hundreds of millions of nodes and billions of edges. Network
analysis tools must provide not only extensive functionality, but also
high performance in processing these large networks.
The tutorial will present the Stanford Network Analysis Platform (SNAP), a
general purpose, high performance system for analysis and manipulation
of large networks. SNAP is being used widely in studies of web and
social media. SNAP consists of open source software, which provides a
rich set of functions for performing network analytics, and a popular
repository of publicly available real world network datasets. SNAP
software APIs are available in Python and C++.
The tutorial will cover all aspects of SNAP, including SNAP APIs and
SNAP datasets. The tutorial is targeted toward an entry-level audience
with some programming background, so the Python API will be presented
in more detail than the C++ API. The tutorial will include a hands-on
component, where the participants will have the opportunity to use
SNAP on their computers.
Jure Leskovec is an assistant professor of Computer
Science at Stanford University. His research focuses on mining and
modeling large social and information networks, their evolution, and
diffusion of information and influence over them. Problems he
investigates are motivated by large scale data, the Web and on-line
media.
Rok Sosic is a Research Associate in the Department of
Computer Science at Stanford University.