Yongjie Zhu

I am now an Algorithm Engineer at Alibaba Group (Beijing). I obtained my Master's degree from the School of Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT), co-supervised by Prof. Si Li and Prof. Boxin Shi.

I was a research intern at Microsoft, where I worked on computer vision and natural language processing. I also worked as a research intern at Tencent WXG, where I worked on 3D Human Body Estimation and Face Appearance Modeling under the guidance of Dr. Chen Li. Before that, I received my Bachelor's degree from BUPT, advised by Prof. Si Li. I have received the China National Scholarship and the JJWorld (Beijing) Network Technology Scholarship (top 3%) at BUPT.

Email  /  CV  /  Bio  /  Google Scholar  /  LinkedIn  /  Github

(* indicates equal contribution)

My research interests lie at the intersection of computer vision, computer graphics, and computational photography. Recently, I have focused on physics-based computer vision and multimodal natural language processing, including inferring the physical world (shape, color, light, etc.) and linguistic text from images. Representative papers are highlighted.

SPLiT: Single Portrait Lighting Estimation via a Tetrad of Face Intrinsics
Fan Fei*, Yean Cheng*, Yongjie Zhu, Qian Zheng, Si Li, Gang Pan, Boxin Shi
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
[paper]   [project]  

This paper proposes a novel pipeline to estimate a non-parametric environment map with high dynamic range from a single human face image.

Complementary intrinsics from neural radiance fields and CNNs for outdoor scene relighting
Siqi Yang*, Xuanning Cui*, Yongjie Zhu, Jiajun Tang, Si Li, Zhaofei Yu, Boxin Shi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[paper]   [supp]   [bibtex]  

This paper proposes to complement the intrinsic estimation from volume rendering using NeRF with that from inverting the photometric image formation model using convolutional neural networks (CNNs).

Estimating Spatially-Varying Lighting in Urban Scenes with Disentangled Representation
Jiajun Tang, Yongjie Zhu, Haoyu Wang, Jun Hoong Chan, Si Li, Boxin Shi
European Conference on Computer Vision (ECCV), 2022   (Oral Presentation)
[paper]   [bibtex]   [project]

Given a single image and a 2D pixel location, our method can estimate the local lighting that is disentangled into ambient sky light, sun light and lighting-independent local contents.

AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search
Yongjie Zhu, Chunhui Han, Yuefeng Zhan, Bochen Pang, Zhaoju Li, Hao Sun, Si Li, Boxin Shi, Nan Duan, Ruofei Zhang, Liangjie Zhang, Weiwei Deng, Qi Zhang
ACM International Conference on Multimedia (ACM MM), 2022
[paper]   [bibtex]   [data]

We propose a multi-modal relevance modeling approach for sponsored search, and boost the performance via contrastive learning that naturally extends the transformer encoder with the complementary multi-modal inputs.

Hybrid Face Reflectance, Illumination, and Shape from a Single Image
Yongjie Zhu, Chen Li, Si Li, Boxin Shi, Yu-Wing Tai
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021  
[paper]   [bibtex]   [video]  

We propose a self-supervised deep learning framework that estimates a hybrid reflection model and detailed normals of the human face. The proposed hybrid reflectance and illumination representation ensures photo-realistic face reconstruction.

Spatially-Varying Outdoor Lighting Estimation from Intrinsics
Yongjie Zhu, Yinda Zhang, Si Li, Boxin Shi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021   (Oral Presentation)
[arXiv]   [bibtex]   [video]   [poster]

Collecting high quality paired intrinsic and lighting data in a virtual city lets you train a model that estimates spatially-varying lighting from a single outdoor image.

DeRenderNet: Intrinsic Image Decomposition of Urban Scenes with Shape-(In)dependent Shading Rendering
Yongjie Zhu, Jiajun Tang, Si Li, Boxin Shi
International Conference on Computational Photography (ICCP), 2021
[arXiv]   [bibtex]

Decomposing a single RGB image into its reflectance, shading (caused by direct lighting), and shadow (caused by occlusion) images.

Academic Services
  • Conference Reviewer, CVPR 2021
  • Journal Reviewer, IJCV 2020, IJCV 2022
Selected Honors
  • [2022] Excellent Graduation Thesis, BUPT.
  • [2021] China National Scholarship (国家奖学金).
  • [2020] First Prize, Outstanding Student Scholarship of BUPT.
  • [2019] First Prize, Outstanding Student Scholarship of BUPT.
  • [2018] JJWorld (Beijing) Network Technology Scholarship.

Homepage of Yongjie Zhu [朱勇杰]
© Yongjie Zhu. All Rights Reserved.