Xinhua Cheng (程鑫华)

I am a third-year master's student in computer applications technology at the School of Electronic and Computer Engineering, Peking University, advised by Prof. Jian Zhang. Before this, I received a B.E. degree in computer science from the College of Computer Science, Sichuan University.

My recent research interests are 3D computer vision, especially text-to-3D creation and editing.

Email  |  Google Scholar  |  DBLP  |  GitHub

Selected Publications (* indicates equal contribution)
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts
Xinhua Cheng, Tianyu Yang, Jianan Wang, Yu Li, Lei Zhang, Jian Zhang, Li Yuan
arXiv, 2023
[Paper] [Code] [Page]

We introduce progressive local editing to create precise 3D content consistent with prompts that describe multiple interacting objects bound to different attributes.

Null-Space Diffusion Sampling for Zero-Shot Point Cloud Completion
Xinhua Cheng*, Nan Zhang*, Jiwen Yu, Yinhuai Wang, Ge Li, Jian Zhang
International Joint Conference on Artificial Intelligence (IJCAI), 2023
[Paper]

We propose a zero-shot point cloud completion framework that refines only the null-space content during the reverse process of a pre-trained diffusion model.

Panoptic Compositional Feature Field for Editable Scene Rendering with Network-Inferred Labels via Metric Learning
Xinhua Cheng, Yanmin Wu, Mengxi Jia, Qian Wang, Jian Zhang
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper]

We introduce metric learning to leverage 2D network-inferred labels and obtain discriminative feature fields, enabling 3D segmentation and editing.

EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu, Xinhua Cheng, Renrui Zhang, Zesen Cheng, Jian Zhang
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper] [Code]

We explicitly decouple the textual attributes and conduct dense alignment between this fine-grained language and point cloud objects for 3D visual grounding.

More is better: Multi-source Dynamic Parsing Attention for Occluded Person Re-identification
Xinhua Cheng*, Mengxi Jia*, Qian Wang, Jian Zhang
ACM International Conference on Multimedia (ACM MM), 2022
[Paper]

We introduce a multi-source knowledge ensemble in occluded re-ID to effectively leverage external semantic cues learned from different domains.

A Simple Visual-Textual Baseline for Pedestrian Attribute Recognition
Xinhua Cheng*, Mengxi Jia*, Qian Wang, Jian Zhang
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022
[Paper] [Code]

We model pedestrian attribute recognition as a multimodal problem and build a simple visual-textual baseline to capture the intra- and cross-modal correlations.


Last updated: Oct 2023