Single-image 3D face reconstruction is a core problem in computer vision, with important clinical applications such as cephalometric landmark analysis in orthodontics. Traditionally, this analysis relies on lateral X-ray imaging; however, frequent X-ray exposure is impractical due to radiation concerns. While recent research has explored detecting landmarks from lateral RGB images as an alternative, existing methods typically rely on 2D features such as the eyes, mouth, ears, and boundary silhouettes, failing to fully exploit the underlying 3D facial geometry spanning the facial profile and jawline, which is essential for accurate diagnosis.
Meanwhile, although 3D face reconstruction from frontal views has seen significant progress, most learning-based 3D morphable model (3DMM) regressors are developed and benchmarked on near-frontal images, where appearance cues are abundant. In extreme profile views (yaw ≈ 90°), much of the face is occluded, and the available signal is dominated by boundary cues, making accurate 3D reconstruction challenging.
In this paper, we bridge this gap with geometry-conditioned synthetic data and a simple profile-specific FLAME regression baseline for single lateral images. We introduce ProfileSynth, a dataset created by sampling FLAME shape and pose parameters in extreme yaw ranges and generating photorealistic profile images using a diffusion model conditioned on depth and normal maps. We further study a profile-specific baseline with visibility-aware jawline regularization. Our framework provides a practical baseline for “profile × 3DMM” reconstruction and a promising foundation for more accurate, non-invasive cephalometric analysis from lateral RGB images.
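The visibility-aware jawline regularization mentioned above is not spelled out here; a minimal sketch of one plausible formulation (the function name, the normal-based visibility test, and the squared-distance form are our assumptions for illustration, not the paper's exact loss) would penalize only those jawline vertices whose normals face the camera:

```python
import numpy as np

def jawline_reg(proj_2d, normals_cam, target_2d, eps=0.0):
    """Hypothetical visibility-aware jawline loss (illustrative sketch only).

    proj_2d     : (N, 2) projected jawline vertices of the predicted mesh
    normals_cam : (N, 3) vertex normals in camera coordinates (+z toward camera)
    target_2d   : (N, 2) target jawline contour points detected in the image
    """
    # Treat a vertex as visible when its normal has a positive camera-facing
    # component; self-occluded jawline vertices are excluded from the loss.
    visible = normals_cam[:, 2] > eps
    if not visible.any():
        return 0.0
    # Mean squared 2D distance over visible jawline vertices only.
    diff = proj_2d[visible] - target_2d[visible]
    return float(np.mean(np.sum(diff**2, axis=1)))
```

Gating the loss by visibility keeps the occluded half of the jaw, which carries no image evidence in a strict profile view, from dragging the contour fit.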
The project is organized as a profile-focused benchmark and baseline. ProfileSynth supplies paired RGB images and exact FLAME labels for strict lateral views, while Profile3DMM keeps the regressor intentionally simple so that the effects of profile-specific data and evaluation can be studied directly.
ProfileSynth contains 100,000 synthetic profile samples with yaw in the 85–95° range. It samples FLAME shape and pose parameters, renders geometry cues including depth, normals, silhouettes, and landmarks, and uses geometry-conditioned image generation to synthesize photorealistic profile RGB images with known 3D labels.
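The parameter-sampling stage can be sketched as follows; the shape dimensionality (300, FLAME's identity space) and the uniform yaw prior over the lateral band are our assumptions for illustration, not the dataset's exact configuration:

```python
import numpy as np

def sample_profile_params(n, rng=None, n_shape=300):
    """Sample FLAME-style parameters for strict profile views (illustrative sketch).

    Shape coefficients are drawn from the standard-normal prior of the 3DMM;
    yaw is drawn uniformly from the 85-95 degree lateral band used by ProfileSynth.
    """
    rng = np.random.default_rng(rng)
    shape = rng.standard_normal((n, n_shape))          # identity coefficients
    yaw = np.deg2rad(rng.uniform(85.0, 95.0, size=n))  # strict lateral yaw (radians)
    return shape, yaw

shape, yaw = sample_profile_params(4, rng=0)
```

Each sampled parameter set would then be rendered into depth, normal, silhouette, and landmark maps that condition the diffusion-based image generator.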
The baseline uses an ImageNet-pretrained ResNet-50 and an MLP head to regress FLAME shape and head pose. It is designed as a compact reference model for the strict lateral setting, rather than a new architecture-heavy reconstruction system.
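The regression head can be sketched in isolation; the dimensions here (ResNet-50's 2048-d pooled feature, a 512-d hidden layer, 300 shape coefficients plus a 6-d pose of 3 rotation and 3 translation values) are our assumptions for illustration and may differ from the actual model:

```python
import numpy as np

class FlameHead:
    """Illustrative MLP head mapping pooled backbone features to FLAME parameters.

    A sketch of the regressor on top of a ResNet-50 backbone; the true head's
    depth, widths, and output split are not specified on this page.
    """
    def __init__(self, d_in=2048, d_hidden=512, n_shape=300, n_pose=6, rng=0):
        rng = np.random.default_rng(rng)
        scale = lambda fan_in: np.sqrt(2.0 / fan_in)  # He init for ReLU layers
        self.w1 = rng.standard_normal((d_in, d_hidden)) * scale(d_in)
        self.b1 = np.zeros(d_hidden)
        self.w2 = rng.standard_normal((d_hidden, n_shape + n_pose)) * scale(d_hidden)
        self.b2 = np.zeros(n_shape + n_pose)
        self.n_shape = n_shape

    def __call__(self, feats):
        # feats: (B, d_in) global-average-pooled backbone features.
        h = np.maximum(feats @ self.w1 + self.b1, 0.0)        # ReLU hidden layer
        out = h @ self.w2 + self.b2
        return out[:, :self.n_shape], out[:, self.n_shape:]  # (shape, pose)
```

Keeping the head this simple is deliberate: any gains then come from the profile-specific data and losses rather than from architectural novelty.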
Representative reconstructions show the behavior of the profile-specific baseline on synthetic profile inputs and a small real-image profile subset.
On the ProfileSynth test split, the baseline is evaluated under shared rigid alignment to focus on shape quality and profile contour fidelity. On NoW, the real-image study is intentionally treated as indicative because the subset is small and the official metric is a global scan-based error rather than a contour-specific benchmark.
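"Shared rigid alignment" means the predicted mesh is brought into the ground truth's rigid frame before vertex error is measured, so residual error reflects shape rather than pose. A standard way to do this (a sketch of the Kabsch/Procrustes solution; the function name is ours) is:

```python
import numpy as np

def rigid_align(pred, gt):
    """Rigidly align pred (N, 3) to gt (N, 3) via the Kabsch/Procrustes solution.

    Solves only for rotation and translation, so the remaining per-vertex
    error after alignment measures shape and contour fidelity, not pose.
    """
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    P, G = pred - mu_p, gt - mu_g
    U, _, Vt = np.linalg.svd(P.T @ G)
    # Reflection guard: force a proper rotation with det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return (R @ P.T).T + mu_g
```

Evaluating after such an alignment is what lets the benchmark focus on profile contour quality instead of head-pose estimation error.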
Qualitative comparison on ProfileSynth. All predicted meshes are rigidly aligned to the ground truth and rendered from the same canonical profile view.
We also examine real-image transfer on a small NoW profile subset. Quantitative results and qualitative examples are reported in the paper. NoW subject images are not redistributed on this page in accordance with the NoW dataset license.
The authors thank C. Tachiki and Y. Nishii (Department of Orthodontics, Tokyo Dental College) for providing the clinical profile photographs used in the supplementary evaluation.
@article{kanaya2026profile3dmm,
  author  = {Kanaya, Taiki and Saito, Hideo},
  title   = {Profile-Specific {3DMM} Regression from a Single Lateral Face Image},
  journal = {arXiv preprint arXiv:2605.01746},
  year    = {2026},
}
This BibTeX entry will be updated once the official CVPR 2026 Workshops proceedings entry is available.