I am Yimu Wang, a final-year Ph.D. student at the University of Waterloo.
I obtained my master’s degree under the supervision of Prof. Lijun Zhang in the LAMDA Group, led by Prof. Zhi-Hua Zhou, at Nanjing University.
I was honored to spend a wonderful time as a research assistant at Tsinghua University with Prof. Jingjing Liu and Prof. Yang Liu, and to have had amazing experiences at Amazon, Sony AI, Borealis AI, Tencent Lightspeed & Quantum Studios, Alibaba, NetEase Games, and Megvii.
My main research interests are multi-modal learning and 3D understanding.
Yimu Wang
PhD Student
University of Waterloo CS
News
[2025/09] One paper was accepted to JMLR 2025.
[2025/09] One paper was accepted to NeurIPS 2025.
[2025/08] One survey paper was accepted to TMLR 2025.
[2025/08] One paper was accepted to EMNLP 2025.
[2025/06] One paper was accepted to ICCV 2025.
[2025/05] One paper was accepted to ACL 2025.
[2025/01] Two papers were accepted to NAACL 2025.
[2024/10] One paper was accepted to WACV 2025!
[2024/09] One paper was accepted to a NeurIPS 2024 workshop!
[2023/12] Two papers were accepted to AAAI 2024!
[2023/10] Three papers were accepted to EMNLP 2023 (one main-conference paper and two findings)!
[2023/09] One paper was accepted to NeurIPS 2023!
[2023/04] I received a CVPR DEI travel award to attend the conference in Vancouver!
Lexicographic Lipschitz Bandits: New Algorithms and a Lower Bound
Bo Xue, Ji Cheng, Fei Liu, Yimu Wang, Lijun Zhang, and Qingfu Zhang Journal of Machine Learning Research (JMLR), 2025.
HAWAII: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models
Yimu Wang, Mozhgan Nasr Azadani, Sean Sedwards, Krzysztof Czarnecki Annual Conference on Neural Information Processing Systems (NeurIPS), 2025.
@misc{wang2025hawaiihierarchicalvisualknowledge,
title={HAWAII: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models},
author={Yimu Wang and Mozhgan Nasr Azadani and Sean Sedwards and Krzysztof Czarnecki},
year={2025},
eprint={2506.19072},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2506.19072},
}
Survey of Video Diffusion Models: Foundations, Implementations, and Applications
Yimu Wang, Xuye Liu, Wei Pang, Li Ma, Shuai Yuan, Paul Debevec, Ning Yu Transactions on Machine Learning Research (TMLR), 2025.
@article{
wang2025survey,
title={Survey of Video Diffusion Models: Foundations, Implementations, and Applications},
author={Yimu Wang and Xuye Liu and Wei Pang and Li Ma and Shuai Yuan and Paul Debevec and Ning Yu},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=2ODDBObKjH},
note={Survey Certification}
}
LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
Yimu Wang, Mozhgan Nasr Azadani, Sean Sedwards, Krzysztof Czarnecki Empirical Methods in Natural Language Processing (EMNLP), 2025.
@misc{wang2025leominiefficientmultimodallarge,
title={LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts},
author={Yimu Wang and Mozhgan Nasr Azadani and Sean Sedwards and Krzysztof Czarnecki},
year={2025},
eprint={2504.04653},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2504.04653},
}
OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection
Adrian Chow, Evelien Riddell, Yimu Wang, Sean Sedwards, Krzysztof Czarnecki International Conference on Computer Vision (ICCV), 2025.
@misc{chow2025ovscansemanticallyconsistentalignment,
title={OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection},
author={Adrian Chow and Evelien Riddell and Yimu Wang and Sean Sedwards and Krzysztof Czarnecki},
year={2025},
eprint={2503.06435},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.06435},
}
NBDESCRIB: A Dataset for Text Description Generation from Tables and Code in Jupyter Notebooks with Guidelines
Xuye Liu, Tengfei Ma, Yimu Wang, Fengjie Wang, Jian Zhao Annual Meeting of the Association for Computational Linguistics (Findings of ACL), 2025.
ELIOT: Zero-Shot Video-Text Retrieval through Relevance-Boosted Captioning and Structural Information Extraction
Xuye Liu, Yimu Wang, Jian Zhao NAACL Student Research Workshop (SRW of NAACL), 2025.
@inproceedings{liu-etal-2025-eliot,
title = "{ELIOT}: Zero-Shot Video-Text Retrieval through Relevance-Boosted Captioning and Structural Information Extraction",
author = "Liu, Xuye and Wang, Yimu and Zhao, Jian",
booktitle = "Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)",
year = "2025",
pages = "381--391",
}
DREAM: Improving Video-Text Retrieval Through Relevance-Based Augmentation Using Large Foundation Models
Yimu Wang, Shuai Yuan, Bo Xue, Xiangru Jian, Wei Pang, Mushi Wang, Ning Yu Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025.
@inproceedings{wang-etal-2025-dream,
title = "{DREAM}: Improving Video-Text Retrieval Through Relevance-Based Augmentation Using Large Foundation Models",
author = "Wang, Yimu and Yuan, Shuai and Xue, Bo and Jian, Xiangru and Pang, Wei and Wang, Mushi and Yu, Ning",
booktitle = "Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
year = "2025",
pages = "3037--3056",
}
AIDE: Improving 3D Open-Vocabulary Semantic Segmentation by Aligned Vision-Language Learning
Yimu Wang, Krzysztof Czarnecki IEEE Winter Conference on Applications of Computer Vision (WACV), 2025.
@InProceedings{Wang_2025_WACV,
author = {Wang, Yimu and Czarnecki, Krzysztof},
title = {AIDE: Improving 3D Open-Vocabulary Semantic Segmentation by Aligned Vision-Language Learning},
booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
month = {February},
year = {2025},
pages = {2674--2685}
}
Pretext Training Algorithms for Event Sequence Data
Yimu Wang, He Zhao, Ruizhi Deng, Frederick Tung, Greg Mori Conference on Neural Information Processing Systems Workshop (NeurIPS workshop), 2024.
@misc{wang2024pretexttrainingalgorithmsevent,
title={Pretext Training Algorithms for Event Sequence Data},
author={Yimu Wang and He Zhao and Ruizhi Deng and Frederick Tung and Greg Mori},
year={2024},
eprint={2402.10392},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2402.10392},
}
Lost Domain Generalization Is a Natural Consequence of Lack of Training Domains
Yimu Wang, Yihan Wu, Hongyang Zhang Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2024.
@article{Wang_Wu_Zhang_2024,
title={Lost Domain Generalization Is a Natural Consequence of Lack of Training Domains},
volume={38},
url={https://ojs.aaai.org/index.php/AAAI/article/view/29497},
DOI={10.1609/aaai.v38i14.29497},
abstractNote={We show a hardness result for the number of training domains required to achieve a small population error in the test domain. Although many domain generalization algorithms have been developed under various domain-invariance assumptions, there is significant evidence to indicate that out-of-distribution (o.o.d.) test accuracy of state-of-the-art o.o.d. algorithms is on par with empirical risk minimization and random guess on the domain generalization benchmarks such as DomainBed. In this work, we analyze its cause and attribute the lost domain generalization to the lack of training domains. We show that, in a minimax lower bound fashion, any learning algorithm that outputs a classifier with an ε excess error to the Bayes optimal classifier requires at least poly(1/ε) number of training domains, even though the number of training data sampled from each training domain is large. Experiments on the DomainBed benchmark demonstrate that o.o.d. test accuracy is monotonically increasing as the number of training domains increases. Our result sheds light on the intrinsic hardness of domain generalization and suggests benchmarking o.o.d. algorithms by the datasets with a sufficient number of training domains.},
number={14},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
author={Wang, Yimu and Wu, Yihan and Zhang, Hongyang},
year={2024},
month={Mar.},
pages={15689--15697}
}
Multiobjective Lipschitz Bandits under Lexicographic Ordering
Bo Xue, Ji Cheng, Fei Liu, Yimu Wang, Qingfu Zhang Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2024.
@article{Xue_Cheng_Liu_Wang_Zhang_2024,
title={Multiobjective Lipschitz Bandits under Lexicographic Ordering},
volume={38},
url={https://ojs.aaai.org/index.php/AAAI/article/view/29558},
DOI={10.1609/aaai.v38i15.29558},
abstractNote={This paper studies the multiobjective bandit problem under lexicographic ordering, wherein the learner aims to simultaneously maximize m objectives hierarchically. The only existing algorithm for this problem considers the multi-armed bandit model, and its regret bound is O((KT)^(2/3)) under a metric called priority-based regret. However, this bound is suboptimal, as the lower bound for single objective multi-armed bandits is Omega(KlogT). Moreover, this bound becomes vacuous when the arm number K is infinite. To address these limitations, we investigate the multiobjective Lipschitz bandit model, which allows for an infinite arm set. Utilizing a newly designed multi-stage decision-making strategy, we develop an improved algorithm that achieves a general regret bound of O(T^((d_z^i+1)/(d_z^i+2))) for the i-th objective, where d_z^i is the zooming dimension for the i-th objective, with i in {1,2,...,m}. This bound matches the lower bound of the single objective Lipschitz bandit problem in terms of T, indicating that our algorithm is almost optimal. Numerical experiments confirm the effectiveness of our algorithm.},
number={15},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
author={Xue, Bo and Cheng, Ji and Liu, Fei and Wang, Yimu and Zhang, Qingfu},
year={2024},
month={Mar.},
pages={16238--16246}
}
Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards
Bo Xue, Yimu Wang, Yuanyu Wan, Jinfeng Yi, and Lijun Zhang Conference on Neural Information Processing Systems (NeurIPS), 2023.
@inproceedings{
xue2023efficient,
title={Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards},
author={Bo Xue and Yimu Wang and Yuanyu Wan and Jinfeng Yi and Lijun Zhang},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=Vbm5UCaYeh}
}
Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks
Yimu Wang, Xiangru Jian, Bo Xue Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP Oral), 2023.
@inproceedings{wang-etal-2023-balance,
title = "Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks",
author = "Wang, Yimu and
Jian, Xiangru and
Xue, Bo",
editor = "Bouamor, Houda and
Pino, Juan and
Bali, Kalika",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.emnlp-main.652/",
doi = "10.18653/v1/2023.emnlp-main.652",
pages = "10542--10567"
}
Video-Text Retrieval by Supervised Sparse Multi-Grained Learning
Yimu Wang, Peng Shi Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP), 2023.
@inproceedings{wang-shi-2023-video,
title = "Video-Text Retrieval by Supervised Sparse Multi-Grained Learning",
author = "Wang, Yimu and
Shi, Peng",
editor = "Bouamor, Houda and
Pino, Juan and
Bali, Kalika",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.findings-emnlp.46/",
doi = "10.18653/v1/2023.findings-emnlp.46",
pages = "633--649"
}
InvGC: Robust Cross-Modal Retrieval by Inverse Graph Convolution
Xiangru Jian, Yimu Wang Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP), 2023.
@inproceedings{jian-wang-2023-invgc,
title = "{I}nv{GC}: Robust Cross-Modal Retrieval by Inverse Graph Convolution",
author = "Jian, Xiangru and
Wang, Yimu",
editor = "Bouamor, Houda and
Pino, Juan and
Bali, Kalika",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.findings-emnlp.60/",
doi = "10.18653/v1/2023.findings-emnlp.60",
pages = "836--865"
}
Cooperation or Competition: Avoiding Player Domination for Multi-Target Robustness via Adaptive Budgets
Yimu Wang, Dinghuai Zhang, Yihan Wu, Heng Huang, Hongyang Zhang IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
@INPROCEEDINGS{10203542,
author={Wang, Yimu and Zhang, Dinghuai and Wu, Yihan and Huang, Heng and Zhang, Hongyang},
booktitle={2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
title={Cooperation or Competition: Avoiding Player Domination for Multi-Target Robustness via Adaptive Budgets},
year={2023},
volume={},
number={},
pages={20564-20574},
keywords={Training;Deep learning;Adaptation models;Computer vision;Games;Benchmark testing;Robustness;Adversarial attack and defense},
doi={10.1109/CVPR52729.2023.01970}}
Multimodal Federated Learning via Contrastive Representation Ensemble
Qiying Yu, Yang Liu, Yimu Wang, Ke Xu, Jingjing Liu International Conference on Learning Representations (ICLR), 2023.
@inproceedings{
yu2023multimodal,
title={Multimodal Federated Learning via Contrastive Representation Ensemble},
author={Qiying Yu and Yang Liu and Yimu Wang and Ke Xu and Jingjing Liu},
booktitle={The Eleventh International Conference on Learning Representations },
year={2023},
url={https://openreview.net/forum?id=Hnk1WRMAYqg}
}
Deep Unified Cross-Modality Hashing by Pairwise Data Alignment
Yimu Wang, Bo Xue, Quan Cheng, Yuhui Chen, and Lijun Zhang International Joint Conference on Artificial Intelligence (IJCAI), 2021.
@inproceedings{ijcai2021p156,
title = {Deep Unified Cross-Modality Hashing by Pairwise Data Alignment},
author = {Wang, Yimu and Xue, Bo and Cheng, Quan and Chen, Yuhui and Zhang, Lijun},
booktitle = {Proceedings of the Thirtieth International Joint Conference on
Artificial Intelligence, {IJCAI-21}},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
editor = {Zhi-Hua Zhou},
pages = {1129--1135},
year = {2021},
month = {8},
note = {Main Track},
doi = {10.24963/ijcai.2021/156},
url = {https://doi.org/10.24963/ijcai.2021/156},
}
Searching Privately by Imperceptible Lying: A Novel Private Hashing Method with Differential Privacy
Yimu Wang, Shiyin Lu, and Lijun Zhang ACM International Conference on Multimedia (ACM MM), 2020.
@inproceedings{10.1145/3394171.3413882,
author = {Wang, Yimu and Lu, Shiyin and Zhang, Lijun},
title = {Searching Privately by Imperceptible Lying: A Novel Private Hashing Method with Differential Privacy},
year = {2020},
isbn = {9781450379885},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3394171.3413882},
doi = {10.1145/3394171.3413882},
abstract = {In the big data era, with the increasing amount of multi-media data, approximate nearest neighbor~(ANN) search has been an important but challenging problem. As a widely applied large-scale ANN search method, hashing has made great progress, and achieved sub-linear search time with low memory space. However, the advances in hashing are based on the availability of large and representative datasets, which often contain sensitive information. Typically, the privacy of this individually sensitive information is compromised. In this paper, we tackle this valuable yet challenging problem and formulate a task termed as private hashing, which takes into account both searching performance and privacy protection. Specifically, we propose a novel noise mechanism, i.e., Random Flipping, and two private hashing algorithms, i.e., PHashing and PITQ, with the refined analysis within the framework of differential privacy, since differential privacy is a well-established technique to measure the privacy leakage of an algorithm. Random Flipping targets binary scenarios and leverages the "Imperceptible Lying" idea to guarantee ε-differential privacy by flipping each datum of the binary matrix (noise addition). To preserve ε-differential privacy, PHashing perturbs and adds noise to the hash codes learned by non-private hashing algorithms using Random Flipping. However, the noise addition for privacy in PHashing will cause severe performance drops. To alleviate this problem, PITQ leverages the power of alternative learning to distribute the noise generated by Random Flipping into each iteration while preserving ε-differential privacy. Furthermore, to empirically evaluate our algorithms, we conduct comprehensive experiments on the image search task and demonstrate that proposed algorithms achieve equal performance compared with non-private hashing methods.},
booktitle = {Proceedings of the 28th ACM International Conference on Multimedia},
pages = {2700--2709},
numpages = {10},
keywords = {large-scale multimedia retrieval, hashing, differential privacy},
location = {Seattle, WA, USA},
series = {MM '20}
}
Nearly Optimal Regret for Stochastic Linear Bandits with Heavy-Tailed Payoffs
Bo Xue, Guanghui Wang, Yimu Wang, Lijun Zhang International Joint Conference on Artificial Intelligence (IJCAI), 2020.
@inproceedings{ijcai2020p406,
title = {Nearly Optimal Regret for Stochastic Linear Bandits with Heavy-Tailed Payoffs},
author = {Xue, Bo and Wang, Guanghui and Wang, Yimu and Zhang, Lijun},
booktitle = {Proceedings of the Twenty-Ninth International Joint Conference on
Artificial Intelligence, {IJCAI-20}},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
editor = {Christian Bessiere},
pages = {2936--2942},
year = {2020},
month = {7},
note = {Main track},
doi = {10.24963/ijcai.2020/406},
url = {https://doi.org/10.24963/ijcai.2020/406},
}
An Adversarial Domain Adaptation Network for Cross-Domain Fine-Grained Recognition
Yimu Wang, Ren-Jie Song, Xiu-Shen Wei, and Lijun Zhang IEEE Winter Conference on Applications of Computer Vision (WACV), 2020.