Xinhua Cheng (程鑫华)

I am a third-year master's student in computer applications technology at the School of Electronic and Computer Engineering, Peking University, advised by Prof. Jian Zhang. Before this, I received a B.E. degree in computer science from the College of Computer Science, Sichuan University.

My recent research interests are 3D computer vision, especially text-to-3D creation and editing.

Email  |  Google Scholar  |  DBLP  |  GitHub

Selected Publications (* indicates equal contribution)
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts
Xinhua Cheng, Tianyu Yang, Jianan Wang, Yu Li, Lei Zhang, Jian Zhang, Li Yuan
arXiv, 2023
[Paper] [Code] [Page]

We introduce progressive local editing to create precise 3D content consistent with prompts that describe multiple interacting objects bound to different attributes.

Null-Space Diffusion Sampling for Zero-Shot Point Cloud Completion
Xinhua Cheng*, Nan Zhang*, Jiwen Yu, Yinhuai Wang, Ge Li, Jian Zhang
International Joint Conference on Artificial Intelligence (IJCAI), 2023

We propose a zero-shot point cloud completion framework that refines only the null-space content during the reverse process of a pre-trained diffusion model.

Panoptic Compositional Feature Field for Editable Scene Rendering with Network-Inferred Labels via Metric Learning
Xinhua Cheng, Yanmin Wu, Mengxi Jia, Qian Wang, Jian Zhang
Conference on Computer Vision and Pattern Recognition (CVPR), 2023

We introduce metric learning to leverage 2D network-inferred labels and obtain discriminative feature fields, enabling 3D segmentation and editing.

EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu, Xinhua Cheng, Renrui Zhang, Zesen Cheng, Jian Zhang
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper] [Code]

We explicitly decouple the textual attributes and conduct dense alignment between this fine-grained language and point cloud objects for 3D visual grounding.

More is better: Multi-source Dynamic Parsing Attention for Occluded Person Re-identification
Xinhua Cheng*, Mengxi Jia*, Qian Wang, Jian Zhang
ACM International Conference on Multimedia (ACM MM), 2022

We introduce a multi-source knowledge ensemble in occluded re-ID to effectively leverage external semantic cues learned from different domains.

A Simple Visual-Textual Baseline for Pedestrian Attribute Recognition
Xinhua Cheng*, Mengxi Jia*, Qian Wang, Jian Zhang
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022
[Paper] [Code]

We model pedestrian attribute recognition as a multimodal problem and build a simple visual-textual baseline to capture the intra- and cross-modal correlations.

Last updated: Oct 2023