Zipeng Xu

I am a third-year PhD student at the Multimedia and Human Understanding Group, University of Trento (Italy). My advisor is Prof. Nicu Sebe. I am also a visiting PhD student at the Torr Vision Group, University of Oxford.

Previously, I received a bachelor's degree in Communication Engineering and a master's degree in Intelligent Science and Technology from the Beijing University of Posts and Telecommunications (BUPT), where I was advised by Prof. Xiaojie Wang.

Email: zipeng.xu@unitn.it
Personal: [Google Scholar] [GitHub] [LinkedIn]

Publications

StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model [arXiv] [code] [bilibili]
Zipeng Xu, Enver Sangineto, Nicu Sebe.
In Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model [paper] [arXiv] [code]
Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc Van Gool, Errui Ding.
In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer from a Spectral Perspective [arXiv] [code]
Zipeng Xu*, Songlong Xing*, Enver Sangineto, Nicu Sebe. (* Equal Contribution)
In Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024.

Modeling Explicit Concerning States for Reinforcement Learning in Visual Dialogue [paper] [arXiv] [code]
Zipeng Xu, Fandong Meng, Xiaojie Wang, Duo Zheng, Chenxu Lv and Jie Zhou.
In Proceedings of British Machine Vision Conference (BMVC), 2021.

Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser [paper] [arXiv] [code]
Duo Zheng*, Zipeng Xu*, Fandong Meng, Xiaojie Wang, Jiaan Wang and Jie Zhou. (* Equal Contribution)
In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.

Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue [paper] [arXiv] [code]
Zipeng Xu, Fangxiang Feng, Xiaojie Wang, Huixing Jiang, Yushu Yang and Zhongyuan Wang.
In Proceedings of ACM Multimedia, 2020.

Internships

NAVER LABS Europe, Grenoble, France. 05/2023-11/2023.
Research Internship.
Mentor: Riccardo Volpi.

VIS, Baidu Inc., Beijing, China. 08/2021-11/2021.
Research Internship, focusing on text-guided image manipulation.
Mentor: Tianwei Lin.

WeChat AI, Tencent Inc., Beijing, China. 04/2020-06/2021.
Research Internship, focusing on visually-grounded natural language generation.
Mentor: Fandong Meng.

Activities

Reviewer: CVPR 2023, ICCV 2023, WACV 2024, BMVC (2022-2023).