VEGS : View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors
ECCV 2024

Sungwon Hwang*
KAIST
Min-Jung Kim*
KAIST
Taewoong Kang
KAIST
Jayeon Kang
Ghent University
Jaegul Choo
KAIST
(*: equal contribution)
Paper
arXiv
Code
Data (KITTI-360)

Our method achieves high-fidelity renderings from views distanced from the training camera distribution.

Our method jointly reconstructs the static scene and dynamic objects such as cars.

Paper Summary at a Glance


Problem Statement

We tackle the Extrapolated View Synthesis (EVS) problem on views such as looking left, right, or downward from the training camera distribution.


Overall Pipeline

We initialize Gaussian means using a dense LiDAR map and a point cloud from SfM. We leverage prior scene knowledge, such as surface normal estimation and large-scale diffusion models, to improve rendering quality for EVS.
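A minimal sketch of this initialization step, assuming hypothetical file names and a simple voxel-grid downsampling of the LiDAR map; the paper's exact merging strategy may differ:

```python
# Sketch only: file names, the voxel size, and the downsampling strategy
# are illustrative assumptions, not the paper's exact configuration.
import numpy as np

def load_points(path: str) -> np.ndarray:
    """Load an (N, 3) array of XYZ points saved with np.save (hypothetical format)."""
    return np.load(path).astype(np.float32)

def voxel_downsample(points: np.ndarray, voxel: float) -> np.ndarray:
    """Keep one representative point per voxel to thin the dense LiDAR map."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(idx)]

# Merge the dense LiDAR map with the SfM point cloud, then use the result
# as the initial 3D Gaussian means (one Gaussian per point).
lidar_pts = load_points("lidar_map.npy")  # dense accumulated LiDAR sweeps
sfm_pts = load_points("sfm_points.npy")   # sparse SfM reconstruction
gaussian_means = np.concatenate(
    [voxel_downsample(lidar_pts, voxel=0.1), sfm_pts], axis=0)
```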



Covariance Guidance with Surface Normal Prior


The Lazy Covariance Optimization (LCO) Problem

The LCO problem refers to the case where a covariance is trained to cover the frustum of a training pixel with minimal optimization effort. As a result, these covariances are prone to produce unwanted cavities on the underlying scene surface.

Covariance Guidance Loss

Our key idea is to guide the orientation and shape of each covariance so that it conforms to the underlying scene surface. Specifically, we propose \(\mathcal{L}_{cov} = \mathcal{L}_{axis} + \mathcal{L}_{scale}\), where \(\mathcal{L}_{axis}\) aligns a covariance axis to the surface normal vector and \(\mathcal{L}_{scale}\) minimizes the scale along the covariance axis aligned with the surface normal.
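A hedged PyTorch sketch of such a loss, assuming each Gaussian's rotation matrix and per-axis scales are available together with estimated unit surface normals; selecting the axis already closest to the normal and weighting the two terms equally are our assumptions, not necessarily the paper's exact formulation:

```python
import torch

def covariance_guidance_loss(rotations: torch.Tensor,  # (N, 3, 3) rotation matrices R
                             scales: torch.Tensor,     # (N, 3) per-axis scales (diag of S)
                             normals: torch.Tensor):   # (N, 3) unit surface normals
    # Rows of R^T are the three principal axes of each covariance.
    axes = rotations.transpose(1, 2)                        # (N, 3, 3)
    cos = torch.einsum('nij,nj->ni', axes, normals).abs()   # |<axis_k, n>| per axis, (N, 3)
    k = cos.argmax(dim=1)                                   # axis closest to the normal
    # L_axis: push the selected axis into full alignment with the normal.
    l_axis = (1.0 - cos.gather(1, k[:, None]).squeeze(1)).mean()
    # L_scale: shrink the scale along that (near-)normal axis, flattening
    # the Gaussian onto the surface.
    l_scale = scales.gather(1, k[:, None]).squeeze(1).abs().mean()
    return l_axis + l_scale
```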

Comparison: rendering without \(\mathcal{L}_{cov}\) vs. VEGS.


Score Distillation from Large-scale Diffusion Model


The noise predicted by a diffusion model \( \textbf{s}_{\theta} \) approximates the negative score, i.e., the gradient of the log-density of a prior distribution \( p(\textbf{x}) \): \( \textbf{s}_{\theta}(\textbf{x}_{\tau}, \tau) \approx - \nabla_{\textbf{x}} \log p(\textbf{x}) \). Thus, optimizing the rendering \( \textbf{x} \) to reduce the predicted noise pushes it toward high-density regions of the prior \( p(\cdot) \). We model our prior distribution using Stable Diffusion fine-tuned with LoRA.
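A minimal sketch of a score-distillation update under this view, assuming an epsilon-prediction UNet `unet(x_t, t, cond)` and its noise schedule `alphas_cumprod`; the timestep range and the weighting `w` are common choices assumed here, and the LoRA fine-tuning of Stable Diffusion is taken as given:

```python
import torch

def sds_loss(latents, unet, alphas_cumprod, cond, device="cuda"):
    """One score-distillation step; d(loss)/d(latents) == w * (eps_hat - eps)."""
    t = torch.randint(50, 950, (latents.shape[0],), device=device)  # random timestep
    noise = torch.randn_like(latents)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    noisy = a_t.sqrt() * latents + (1 - a_t).sqrt() * noise  # forward diffusion
    with torch.no_grad():
        eps_hat = unet(noisy, t, cond)  # predicted noise ~ negative score (up to scale)
    w = 1 - a_t                         # a common SDS weighting (an assumption here)
    grad = w * (eps_hat - noise)
    # Detached-gradient trick: this loss backpropagates exactly `grad` to `latents`.
    return (grad.detach() * latents).sum()
```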
Comparison: rendering without the score loss vs. with the score loss.


Comparison to Baseline


EVS-D and EVS-LR refer to extrapolated views facing downward and left/right, respectively.

Qualitative comparison of MARS, BlockNeRF++, 3DGS, and VEGS on EVS-D (first two rows) and EVS-LR (last row).