NeRF--: Neural Radiance Fields Without Known Camera Parameters

Zirui Wang, Shangzhe Wu, Weidi Xie, Min Chen, Victor Adrian Prisacariu
University of Oxford

arXiv | Colab Notebook | LLFF Data | BLEFF Data | Code

Abstract

Considering the problem of novel view synthesis (NVS) from only a set of 2D images, we simplify the training of Neural Radiance Fields (NeRF) on forward-facing scenes by removing the requirement of known or pre-computed camera parameters, including both intrinsics and 6DoF poses. To this end, we propose NeRF−−, with three contributions. First, we show that the camera parameters can be jointly optimised with NeRF training as learnable parameters, through a photometric reconstruction loss. Second, to benchmark both camera parameter estimation and the quality of novel view renderings, we introduce a new dataset of path-traced synthetic scenes, termed the Blender Forward-Facing Dataset (BLEFF). Third, we conduct extensive analyses to understand the training behaviour under various camera motions, and show that in most scenarios the joint optimisation pipeline recovers accurate camera parameters and achieves novel view synthesis quality comparable to models trained with COLMAP pre-computed camera parameters.

Visualisation of Joint Optimisation

We visualise our joint optimisation below. At the beginning of training, apart from initialising a NeRF model as usual, we initialise all camera poses to 4x4 identity matrices and initialise the focal lengths, which are shared by all input images, to the image resolution; a sketch of this parameterisation follows.
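For concreteness, the snippet below is a minimal PyTorch sketch of this initialisation and of how the camera parameters are optimised jointly with the NeRF weights through the photometric reconstruction loss. Names such as LearnableCameras, axis_angle_to_matrix and the stand-in nerf module are illustrative placeholders rather than identifiers from our released code, and the axis-angle rotation parameterisation is one convenient choice whose zero vector corresponds to the identity pose.

    import torch
    import torch.nn as nn

    def axis_angle_to_matrix(v, eps=1e-8):
        """Rodrigues' formula: axis-angle vector (3,) -> rotation matrix (3, 3)."""
        theta = v.norm() + eps
        k = v / theta
        zero = v.new_zeros(())
        K = torch.stack([torch.stack([zero, -k[2], k[1]]),
                         torch.stack([k[2], zero, -k[0]]),
                         torch.stack([-k[1], k[0], zero])])
        return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

    class LearnableCameras(nn.Module):
        """Per-image 6DoF poses plus focal lengths shared by all images."""
        def __init__(self, num_images, width, height):
            super().__init__()
            # Zero axis-angle and translation vectors <=> the 4x4 identity pose.
            self.rot = nn.Parameter(torch.zeros(num_images, 3))
            self.trans = nn.Parameter(torch.zeros(num_images, 3))
            # Shared (fx, fy), initialised to the image resolution.
            self.focal = nn.Parameter(torch.tensor([float(width), float(height)]))

        def pose(self, i):
            """Camera-to-world matrix (4, 4) for image i."""
            c2w = torch.eye(4)
            c2w[:3, :3] = axis_angle_to_matrix(self.rot[i])
            c2w[:3, 3] = self.trans[i]
            return c2w

    # Camera parameters and NeRF weights share one optimiser, so both are
    # updated by gradients of the photometric (MSE) reconstruction loss.
    nerf = nn.Linear(63, 4)  # stand-in for the actual NeRF MLP
    cams = LearnableCameras(num_images=34, width=504, height=378)
    optimiser = torch.optim.Adam(
        list(nerf.parameters()) + list(cams.parameters()), lr=1e-3)

Because gradients from the rendering loss flow into rot, trans and focal, the camera parameters are refined at every step alongside the radiance field.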

Results

We show novel view rendering results on the LLFF-NeRF dataset. Our method produces results comparable to COLMAP-enabled NeRF, while requiring only RGB images as input. From left to right: COLMAP-enabled NeRF results, our results, and comparisons between our camera pose estimations and COLMAP estimations. The trajectories are aligned using this ATE toolbox; a sketch of the alignment follows the figure below.

Left: COLMAP-enabled NeRF. Middle: Ours. Right: Camera pose comparison.
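Since our joint optimisation recovers poses only up to scale and a global rigid motion, the two trajectories must be aligned by a similarity transform before comparison. Below is a minimal NumPy sketch of such an alignment via the Umeyama method, the standard computation behind ATE; umeyama_align and ate_rmse are illustrative names, and the linked toolbox is what we actually use.

    import numpy as np

    def umeyama_align(src, dst):
        """Similarity transform (s, R, t) minimising ||dst - (s * R @ src + t)||.
        src, dst: (N, 3) arrays of corresponding camera centres."""
        mu_s, mu_d = src.mean(0), dst.mean(0)
        xs, xd = src - mu_s, dst - mu_d
        cov = xd.T @ xs / len(src)
        U, D, Vt = np.linalg.svd(cov)
        S = np.eye(3)
        if np.linalg.det(U) * np.linalg.det(Vt) < 0:
            S[2, 2] = -1.0  # avoid reflections
        R = U @ S @ Vt
        s = np.trace(np.diag(D) @ S) / xs.var(0).sum()
        t = mu_d - s * R @ mu_s
        return s, R, t

    def ate_rmse(est, gt):
        """Absolute trajectory error (RMSE) after similarity alignment."""
        s, R, t = umeyama_align(est, gt)
        aligned = (s * (R @ est.T)).T + t
        return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))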

Blender Forward-Facing Dataset (BLEFF)

Two main reasons motivated us to create BLEFF: 1) we need to evaluate camera parameter estimation accuracy and image rendering quality at the same time; and 2) analysing the robustness of our method requires a dataset with progressively increasing pose perturbation levels. To that end, we introduce a synthetic dataset, BLEFF, containing 14 path-traced scenes, each rendered at multiple levels of rotation and translation perturbation. The scenes are modified and rendered from open-source Blender files on blendswap; the licence information can be found in our supplementary file.

Acknowledgement

Shangzhe Wu is supported by Facebook Research. Weidi Xie is supported by Visual AI (EP/T028572/1). The authors would like to thank Tim Yuqing Tang for insightful discussions and proofreading.

BibTeX

    @article{wang2021nerfmm,
      title={Ne{RF}$--$: Neural Radiance Fields Without Known Camera Parameters},
      author={Zirui Wang and Shangzhe Wu and Weidi Xie and Min Chen and Victor Adrian Prisacariu},
      journal={arXiv preprint arXiv:2102.07064},
      year={2021}
    }