The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2023) is right around the corner, bringing together experts in data science and machine learning to share their latest research. At this premier data mining and machine learning conference, Snap Research is contributing workshops, tutorials, accepted papers and invited talks, advancing work in graph mining, causal inference, and related fields. Snap Research is also supporting KDD this year through sponsorship and Organizing Committee service. See below for the highlights.
Workshops
Mining and Learning with Graphs Workshop
Neil Shah, Shobeir Fakhraei, Da Zheng, Bahare Fatemi, Leman Akoglu
This workshop provides a platform for researchers and practitioners to discuss the latest
developments in graph-based machine learning via keynotes, invited talks and poster
presentations.
Causal Inference and Machine Learning in Practice
Chu Wang, Yingfei Wang, Xinwei Ma, Zeyu Zheng, Jing Pan, Yifeng Wu, Huigang Chen, Totte
Harinen, Paul Lo, Jeong-Yoon Lee, Zhenyu Zhao, Fabio Vera, Eleanor Dillon, Keith Battocchi
This workshop brings together researchers and practitioners to share experiences and insights
from applying causal inference and machine learning techniques to real-world problems in areas
of product, brand, policy and beyond.
Tutorials
Large-Scale Graph Neural Networks: The Past and New Frontiers
Rui Xue, Haoyu Han, Tong Zhao, Neil Shah, Jiliang Tang, Xiaorui Liu
This tutorial surveys a body of work on training and inference of graph neural networks at scale,
including lazy propagation, piecewise training, condensation, distillation, pre-training and model
pruning.
Accepted Papers
CARL-G: Clustering-Accelerated Representation Learning on Graphs
William Shiao, Uday Saini, Yozen Liu, Tong Zhao, Neil Shah, Evangelos Papalexakis
We propose a new framework for graph self-supervised learning that adapts clustering validation
indices as loss functions, achieving over 79x training speedup with no performance degradation.
Semi-supervised Graph Imbalanced Regression
Gang Liu, Tong Zhao, Eric Inae, Tengfei Luo, Meng Jiang
We propose a semi-supervised framework for graph regression tasks, which uses
pseudo-labeling and latent space augmentation to achieve better data balance and reduce model
bias, with promising results on seven benchmarks.
Sketch-based Anomaly Detection in Streaming Graphs
Siddharth Bhatia, Mohit Wadhwa, Kenji Kawaguchi, Neil Shah, Philip Yu, Bryan Hooi
We propose a first-of-its-kind constant-time and constant-space approach for detecting graph
anomalies in the streaming setting using higher-order sketching.
Balancing Approach for Causal Inference at Scale
Sicheng Lin, Meng Xu, Xi Zhang, Shih-Kang Chao, Ying-Kai Huang, Xiaolin Shi
We present two scalable algorithms for balancing approaches that solve causal inference problems
at a scale of 10 million units. Both are deployed in an end-to-end system at Snap and significantly
reduce bias and variance in causal effect estimation.
Organization
Hands-On Tutorials
Neil Shah, Lei Li, Huan Sun
Hands-On Tutorials feature in-depth use of cutting-edge systems and tools relevant to the data
mining and machine learning community.
Invited Talks
Challenges of Online Measurement for Mobile Apps: A Causal Inference Perspective
Xiaolin Shi
We present an overview of causal inference as it applies to measurement in a mobile app
setting, and discuss how to handle several challenges: going beyond randomization when
A/B tests are not feasible, observing heterogeneous treatment effects when averages are
insufficient, and treating app performance metrics as a focal point instead of only a guardrail. This
is a KDD Applied Data Science Invited Talk.
Graph Learning Benchmarks Panel
Neil Shah
We will discuss challenges and pain points in benchmarking algorithms and models in the graph
machine learning community, and touch on avenues to improve benchmarking going forward.
This is a panel discussion at the KDD Workshop on Graph Learning Benchmarks.
The Importance of "In the Moment" Knowledge (KnowledgeNLP Workshop)
Francesco Barbieri
We will discuss why "in the moment" knowledge, like time, location, and weather data, is crucial
for improving NLP models. Given that posts on social platforms often contain short and
challenging-to-understand text, integrating contextual information helps us better interpret
these posts and represent the user's current state.
Sponsorship
KDD Undergraduate Consortium
Snap Research will sponsor the KDD Undergraduate Consortium, which supports the
participation of undergraduate students, providing them with an opportunity to engage with
experts, attend workshops, and expand their knowledge in the field of data science.
Many folks from Snap will be at KDD this year -- please don’t be shy and come check out our work!