Reconstructing Dynamic Scenes with Topologically-Varying Neural Implicit Fields
Abstract
This paper addresses the challenge of reconstructing dynamic scenes from monocular videos with dynamic neural implicit fields. Previous methods either combine a single canonical template with a per-frame deformation field, or optimize a hyperspace of templates to cover the variation of the scene with a large number of templates. However, single-template methods cannot recover scenes with topology changes, while hyperspace approaches may fail to optimize under large-scale topology changes or motions, since the high-dimensional space is left unconstrained. To address these issues, we propose a topologically-varying neural implicit field that deforms the sum of multiple canonical templates and recovers large-scale motion and topology changes with a video segmentation strategy. Specifically, multiple templates are defined in canonical space: a primary template and sparse auxiliary templates. The primary template represents parts that do not change significantly during movement, while the auxiliary templates are combined linearly to represent topological changes. To reconstruct scenes with large-scale topological variation, a dynamic segmentation strategy divides the entire video into multiple segments, each modeled with its own templates. Compared to existing reconstruction methods for topologically changing objects, ours uses sparse auxiliary templates to form a continuous hyperspace, which is easier to optimize even under large topology changes and motions. Experimental results on various datasets demonstrate that our method outperforms previous work in both geometric reconstruction and novel-view synthesis.
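The core idea of the abstract can be illustrated in code. The sketch below is a minimal PyTorch illustration, not the authors' actual implementation: the module names (`TopoVaryingField`, `MLP`), network sizes, and the per-frame embedding are all assumptions. It shows a deformation field warping observed points into canonical space, where the SDF is the primary template plus a per-frame linear combination of sparse auxiliary templates.

```python
# Minimal sketch of a topologically-varying implicit field. All names and
# architectures here are illustrative assumptions, not the paper's exact model.
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Small fully-connected network used for every field in this sketch."""
    def __init__(self, in_dim, out_dim, hidden=128, depth=4):
        super().__init__()
        layers, dim = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(dim, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

class TopoVaryingField(nn.Module):
    def __init__(self, num_aux=4, frame_feat=32):
        super().__init__()
        self.deform = MLP(3 + frame_feat, 3)   # per-frame deformation field
        self.primary = MLP(3, 1)               # primary canonical template (SDF)
        self.aux = nn.ModuleList([MLP(3, 1) for _ in range(num_aux)])  # sparse auxiliary templates
        self.to_weights = MLP(frame_feat, num_aux)  # per-frame blending weights

    def forward(self, x, frame_code):
        # Warp observation-space points x into canonical space.
        x_canon = x + self.deform(torch.cat([x, frame_code], dim=-1))
        # The primary template models parts that stay stable during motion.
        sdf = self.primary(x_canon)
        # Auxiliary templates are combined linearly to express topology changes.
        w = torch.softmax(self.to_weights(frame_code), dim=-1)
        for i, aux in enumerate(self.aux):
            sdf = sdf + w[..., i:i + 1] * aux(x_canon)
        return sdf

# Usage: query the SDF for 1024 points at one frame.
model = TopoVaryingField()
pts = torch.rand(1024, 3) * 2 - 1
code = torch.zeros(1024, 32)   # a learned per-frame embedding in practice
print(model(pts, code).shape)  # torch.Size([1024, 1])
```

Under the abstract's segmentation strategy, one such model would be instantiated per video segment, so each segment optimizes its own primary and auxiliary templates.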