Theories, Applications, and Cross Modality for
Self-Supervised Learning Models

ICPR 2022 Workshop

Workshop Overview

Self-supervised learning (SSL) has recently seen remarkable progress across various domains. The goal of SSL methods is to learn useful semantic features without any human annotations. In the absence of human-defined labels, the deep network is expected to learn richer feature structures explained by the data itself. Many works across different modalities have demonstrated the empirical success of SSL approaches. However, many questions still linger: Is it possible to view self-supervised learning across different modalities in a unifying way? Is it possible to find the inherent connection between successful vision architectures and NLP architectures? How should these connections be interpreted? What mechanism essentially matters during the SSL feature learning procedure when the data modality changes?

Some emerging works have recently begun to pay attention to these issues, yet complete answers to the above questions remain elusive and challenging. This workshop aims to approach the mysteries behind SSL from both theoretical and practical perspectives. We will invite experts from different communities to share their thoughts on how self-supervised learning approaches across different domains are connected and how they can potentially improve each other.

Topics Covered:

  • Theories of SSL models
  • SSL for computer vision tasks
  • SSL for NLP tasks
  • SSL for time series
  • SSL for graph models
  • Deep architecture design for SSL tasks
  • Cross modality models for SSL approaches
  • Other relevant SSL topics

Call for papers

Important Dates

  • Workshop paper submission deadline: May 31, 2022, 03:00 pm Pacific Time (extended from May 15, 2022, 03:00 pm Pacific Time).
  • Notification to authors: June 6, 2022, 03:00 pm Pacific Time (extended from May 23, 2022, 03:00 pm Pacific Time).
  • Camera-ready deadline: June 21, 2022, 03:00 pm Pacific Time.
Note: In case of rejection from the ICPR main conference, authors may submit their work to our SSL workshop by May 31, 2022. Authors should address all ICPR reviewers' comments in the submitted paper and include the ICPR reviews as supplementary material.

Submission instructions

We are following the ICPR 2022 paper format:

LaTeX/Word Templates: LaTeX / Word.

All papers will be reviewed by at least two reviewers under a single-blind peer review policy. Papers will be selected based on relevance, significance and novelty of results, technical merit, and clarity of presentation. Accepted papers will be published in the ICPR proceedings.

Submission website

Please submit your papers through

Schedule (August 21, 2022, Eastern Time)

9:00 - 9:05    Initial greetings from the organizing committee
9:05 - 9:40    Invited Talk: Prof. Tengyu Ma (Stanford University), "Toward Understanding Self-Supervised Pre-training"
9:40 - 10:15   Invited Talk: Dr. Yue Cao (Beijing Academy of Artificial Intelligence), "Masked Image Modeling as Vision Pretraining: Methodology, Understanding and Data Scaling Capability"
10:15 - 10:50  Invited Talk: Prof. Zuxuan Wu (Fudan University), "Self-Supervised Learning for Image and Video Understanding"
10:50 - 11:25  Invited Talk: Dr. Antoine Miech (DeepMind), "Flamingo: a Visual Language Model for Few-Shot Learning"
11:25 - 11:35  Coffee Break
11:35 - 11:50  Oral presentation: "Joint Masked Autoencoding with Global Reconstruction for Point Cloud Learning"
11:50 - 12:05  Oral presentation: "Enhancing the Linear Probing Performance of Masked Auto-Encoders"
12:05 - 12:20  Oral presentation: "Involving Density Prior for 3D Point Cloud Contrastive Learning"
12:20 - 12:35  Oral presentation: "Understanding the Properties and Limitations of Contrastive Learning for Out-of-Distribution Detection"
12:35          Closing remarks

Invited Speakers

Tengyu Ma is an assistant professor of computer science and statistics at Stanford University. His research interests broadly include topics in machine learning and algorithms, such as deep learning and its theory, (deep) reinforcement learning and its theory, representation learning, robustness, non-convex optimization, distributed optimization, and high-dimensional statistics. Dr. Ma has been awarded the ACM Doctoral Dissertation Award Honorable Mention (2018), the COLT Best Paper Award (2018), and the NIPS Best Student Paper Award (2016). He received his PhD from Princeton University in 2017.

Yue Cao is currently a researcher at the Beijing Academy of Artificial Intelligence (BAAI). Prior to that, he was a senior researcher at Microsoft Research Asia between 2019 and 2022, in the group headed by Baining Guo, closely collaborating with Han Hu, Zheng Zhang, and Steve Lin. He received his Bachelor's and Doctoral degrees from the School of Software at Tsinghua University in 2014 and 2019, respectively, supervised by Prof. Jianmin Wang and Prof. Mingsheng Long. He was awarded the Microsoft Ph.D. Scholarship in 2017 and the Top Scholarship of Tsinghua University in 2018. His work on Swin Transformer won the Best Paper Award (Marr Prize) at ICCV 2021, and four of his research papers are included in the PaperDigest Most Influential Paper List. His papers have received more than 11,000 citations on Google Scholar. His recent research interests include foundation models, self-supervised learning, and multimodal learning.

Zuxuan Wu received his Ph.D. in Computer Science from the University of Maryland under Prof. Larry Davis in 2020. He is currently an Associate Professor in the School of Computer Science at Fudan University. His research interests are in computer vision and deep learning. His work has been recognized by an AI 2000 Most Influential Scholars Award in 2022, a Microsoft Research PhD Fellowship (10 recipients worldwide) in 2019, and a Snap PhD Fellowship (10 recipients worldwide) in 2017.

Antoine Miech is a Research Scientist in DeepMind's vision group. He completed his computer vision Ph.D. in the WILLOW project-team, part of Inria and École Normale Supérieure, working with Ivan Laptev and Josef Sivic. His main research interests are video understanding and weakly-supervised machine learning; more generally, he is interested in everything related to computer vision, machine learning, and natural language processing. In summer 2018, he collaborated with Du Tran, Heng Wang, and Lorenzo Torresani at Facebook AI. He was awarded the Google Ph.D. Fellowship in 2018.


Organizers

Yu Wang
JD AI Research
Yingwei Pan
JD AI Research
Jingjing Zou
University of California San Diego
Angelica I. Aviles-Rivero
University of Cambridge
Carola-Bibiane Schönlieb
University of Cambridge
John Aston
University of Cambridge
Ting Yao
JD AI Research


To contact the organizers please use

