I am pursuing a Ph.D. degree at Nanjing University under the supervision of Professor Qing Gu (顾庆) and Assistant Professor Zhiwei Jiang (蒋智威). Additionally, I am currently a visiting Ph.D. student at Singapore Management University (SMU), where I am guided by Associate Professor Qianru Sun and Assistant Professor Jiannan Li, with funding from the China Scholarship Council (CSC).

I have a broad interest in computer vision and deep learning, with a current focus on controllable and consistent generation in AIGC, including audio-driven video generation and text-to-image generation. My earlier research covered automated essay scoring and ordinal classification.

From May 2023 to May 2024, I was a research intern at Tencent AI Lab, where I worked under the mentorship of Kuan Tian (田宽) and Jun Zhang (张军), concentrating on research in AIGC.

Research Experience

  • 2024.10 - Present, Visiting Ph.D. Student,
    School of Computing and Information Systems, Singapore Management University, Singapore.
  • 2021.09 - Present, Ph.D. Student,
    Department of Computer Science and Technology, Nanjing University, Nanjing, China.
  • 2023.05 - 2024.05, Research Intern,
    Tencent AI Lab, Technology & Engineering Group (TEG), Tencent, Shenzhen, China.

Honors and Awards

  • Outstanding Graduate Student, Nanjing University, 2023.
  • Huawei Scholarship, Nanjing University, 2023.
  • Yingcai Scholarship, Nanjing University, 2021.

Selected Publications

arXiv

V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation;
Cong Wang*, Kuan Tian*, Jun Zhang, Yonghang Guan, Feng Luo, Fei Shen, Zhiwei Jiang, Qing Gu, Xiao Han, Wei Yang;
arXiv:2406.02511.
[code] [project page] [arXiv] [models]

TL;DR: V-Express aims to generate a talking head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.


arXiv

Ensembling Diffusion Models via Adaptive Feature Aggregation;
Cong Wang*, Kuan Tian*, Yonghang Guan, Jun Zhang, Zhiwei Jiang, Fei Shen, Xiao Han, Qing Gu, Wei Yang;
arXiv:2405.17082.
[code] [arXiv]

TL;DR: We propose Adaptive Feature Aggregation (AFA) to ensemble multiple diffusion models dynamically based on different states like prompts, noises, and spatial locations.

ACL 2023

Aggregating Multiple Heuristic Signals as Supervision for Unsupervised Automated Essay Scoring;
Cong Wang, Zhiwei Jiang, Yafeng Yin, Zifeng Cheng, Shiping Ge, Qing Gu;
Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
[paper] [code] [poster] [slides] [video]

TL;DR: We propose ULRA for unsupervised automated essay scoring, which utilizes multiple heuristic quality signals to train a neural network using Deep Pairwise Rank Aggregation loss.

AAAI 2023

Controlling Class Layout for Deep Ordinal Classification via Constrained Proxies Learning;
Cong Wang, Zhiwei Jiang, Yafeng Yin, Zifeng Cheng, Shiping Ge, Qing Gu;
AAAI Conference on Artificial Intelligence (AAAI), 2023.
[paper] [code] [poster] [slides] [arXiv]

TL;DR: We propose Constrained Proxies Learning for deep ordinal classification, which learns proxies for ordinal classes and adjusts their layout in feature space to capture ordinal relationships.

All Publications

2024

  • AP-Adapter: Improving Generalization of Automatic Prompts on Unseen Text-to-Image Diffusion Models; Y. Fu, Z. Jiang, Y. Liu, C. Wang, Z. Deng, Z. Chen, Q. Gu; Annual Conference on Neural Information Processing Systems (NeurIPS).
  • Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models; F. Shen*, H. Ye*, J. Zhang, C. Wang, X. Han, W. Yang; International Conference on Learning Representations (ICLR).

2023

2022

Preprints

* denotes equal contribution. denotes the corresponding author.

Academic Services

  • Journal Reviewer: TNNLS, TOMM;
  • Conference Reviewer: ICLR (25), ICIC (24), MM (23), EMNLP (23).