Snap Research's Contributions at KDD 2023

The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2023) is right around the corner, bringing together experts in data science and machine learning to share their latest research. At this premier data mining and machine learning conference, Snap Research is contributing workshops, tutorials, accepted papers and invited talks that advance graph mining, causal inference, and related fields. This year, Snap Research is also supporting KDD through sponsorship and Organizing Committee service. See below for the highlights.

Workshops

Mining and Learning with Graphs Workshop
Neil Shah, Shobeir Fakhraei, Da Zheng, Bahare Fatemi, Leman Akoglu
This workshop provides a platform for researchers and practitioners to discuss the latest developments in graph-based machine learning via keynotes, invited talks and poster presentations.

Causal Inference and Machine Learning in Practice
Chu Wang, Yingfei Wang, Xinwei Ma, Zeyu Zheng, Jing Pan, Yifeng Wu, Huigang Chen, Totte Harinen, Paul Lo, Jeong-Yoon Lee, Zhenyu Zhao, Fabio Vera, Eleanor Dillon, Keith Battocchi
This workshop brings together researchers and practitioners to share experiences and insights from applying causal inference and machine learning techniques to real-world problems in areas of product, brand, policy and beyond.

Tutorials

Large-Scale Graph Neural Networks: The Past and New Frontiers
Rui Xue, Haoyu Han, Tong Zhao, Neil Shah, Jiliang Tang, Xiaorui Liu
This tutorial surveys a body of work on training and inference of graph neural networks at scale, including lazy propagation, piecewise training, condensation, distillation, pre-training and model pruning.

Accepted Papers

CARL-G: Clustering-Accelerated Representation Learning on Graphs
William Shiao, Uday Saini, Yozen Liu, Tong Zhao, Neil Shah, Evangelos Papalexakis
We propose a new framework for graph self-supervised learning that adapts clustering validation indices as loss functions, achieving a training speedup of over 79x with no performance degradation.

Semi-supervised Graph Imbalanced Regression
Gang Liu, Tong Zhao, Eric Inae, Tengfei Luo, Meng Jiang
We propose a semi-supervised framework for graph regression tasks, which uses pseudo-labeling and latent space augmentation to achieve better data balance and reduce model bias, with promising results on 7 benchmarks.

Sketch-based Anomaly Detection in Streaming Graphs
Siddharth Bhatia, Mohit Wadhwa, Kenji Kawaguchi, Neil Shah, Philip Yu, Bryan Hooi
We propose a first-of-its-kind constant-time and constant-space approach for detecting graph anomalies in the streaming setting using higher-order sketching.

Balancing Approach for Causal Inference at Scale
Sicheng Lin, Meng Xu, Xi Zhang, Shih-Kang Chao, Ying-Kai Huang, Xiaolin Shi
We present two scalable algorithms for balancing approaches to causal inference that handle problems at a scale of 10 million units. They are deployed in an end-to-end system at Snap and significantly reduce both bias and variance in causal effect estimation.

Organization

Hands-On Tutorials
Neil Shah, Lei Li, Huan Sun
Hands-On Tutorials feature in-depth use of cutting-edge systems and tools relevant to the data mining and machine learning community.

Invited Talks

Challenges of Online Measurement for Mobile Apps: A Causal Inference Perspective
Xiaolin Shi
We present an overview of causal inference as it applies to measurements in a mobile app setting, and discuss how we can handle several challenges: going beyond randomization when A/B tests are not feasible, observing heterogeneous treatment effects when averages are insufficient, and treating app performance metrics as a focal point instead of only a guardrail. This is a KDD Applied Data Science Invited Talk.

Graph Learning Benchmarks Panel
Neil Shah
We will discuss challenges and pain points in benchmarking algorithms and models in the graph machine learning community, and touch on avenues to improve benchmarking going forward. This is a panel discussion at the KDD Workshop on Graph Learning Benchmarks.

The Importance of "In the Moment" Knowledge (KnowledgeNLP Workshop)
Francesco Barbieri
We will discuss why "in the moment" knowledge, like time, location, and weather data, is crucial for improving NLP models. Given that posts on social platforms often contain short and challenging-to-understand text, integrating contextual information helps us better interpret these posts and represent the user's current state.

Sponsorship

KDD Undergraduate Consortium
Snap Research will sponsor the KDD Undergraduate Consortium, which supports the participation of undergraduate students, providing them with an opportunity to engage with experts, attend workshops, and expand their knowledge in the field of data science.

Many folks from Snap will be at KDD this year -- please don’t be shy and come check out our work!