Improving Unsupervised Visual Program Inference with Code Rewriting Families

Brown University

ICCV, 2023 (oral)

Compared to prior state-of-the-art, domain-specific methods, our domain-agnostic approach infers programs that are significantly more concise, interpretable, and editable.

Abstract

Programs offer compactness and structure that make them an attractive representation for visual data. We explore how code rewriting can be used to improve systems for inferring programs from visual data. We first propose Sparse Intermittent Rewrite Injection (SIRI), a framework for unsupervised bootstrapped learning. SIRI sparsely applies code rewrite operations over a dataset of training programs, injecting the improved programs back into the training set. We design a family of rewriters for visual programming domains: Parameter Optimization (PO), Code Pruning (CP), and Code Grafting (CG). For three shape programming languages in 2D and 3D, we show that using SIRI with our family of rewriters improves performance: better reconstructions and faster convergence rates, compared with bootstrapped learning methods that do not use rewriters or use them naively. Finally, we demonstrate that our family of rewriters can be effectively used at test time to improve the output of SIRI predictions. For 2D and 3D CSG, we outperform or match the reconstruction performance of recent domain-specific neural architectures, while producing more parsimonious programs that use significantly fewer primitives.

Inferred Visual Programs


Unsupervised Visual Program Inference

Visual data is often highly structured: manufactured shapes are produced by assembling parts; vector graphics images are built from layers of primitives; detailed textures can be created via intricate compositions of noise functions. Visual programs, i.e. programs that produce visual outputs when executed, are a natural approach to capturing this complexity in a structure-aware fashion. Access to well-written visual programs supports downstream applications across visual computing domains, including editing, generative modeling, and structural analysis.

However, crafting these visual programs for a given visual datum can be laborious, even for skilled visual programmers. Therefore, it is desirable to create systems that perform Visual Program Inference (VPI), i.e., the task of automatically inferring programs that represent an input visual datum. Here, we focus on unsupervised neural VPI: we are given examples of shapes from a target distribution (such as 3D chairs) without their associated programs, and our objective is to train a neural network that aids in inferring programs for new shapes from the same target distribution.

In this work, we demonstrate that code rewriting techniques can be effectively integrated into neural VPI techniques, both for training the neural network and for performing neurally guided inference. By integrating a family of code rewriters into bootstrapped learning methods, we achieve state-of-the-art performance on a variety of visual programming domains while also inferring highly parsimonious programs. We propose: 1) Sparse Intermittent Rewrite Injection (SIRI), a framework for unsupervised visual program inference that leverages a family of code rewriters, and 2) a family of code rewriters applicable to multiple DSLs that benefits VPI learning methods and can be used in a test-time rewriting scheme.

Sparse Intermittent Rewrite Injection (SIRI)

We propose Sparse Intermittent Rewrite Injection (SIRI), a framework for unsupervised bootstrapped learning. SIRI sparsely applies code rewrite operations over a dataset of training programs, and injects the improved programs back into the training set in a controlled manner to ensure diversity of training programs. Please refer to the paper for more details.
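To make the rewrite-and-inject idea concrete, here is a minimal Python sketch of a single round. It is not the implementation from the paper. The function `siri_round`, the `rewrite_fraction` parameter, the toy "programs" (tuples of integers), and the toy rewriters are all hypothetical names chosen for illustration, and the sketch keeps only one program per shape, so it does not model the diversity control described above.

```python
import random


def siri_round(programs, targets, rewriters, score, rewrite_fraction=0.2, rng=None):
    """One hypothetical rewrite-injection round (illustrative assumption only).

    programs:  dict mapping shape id -> best program found so far
    targets:   dict mapping shape id -> target datum
    rewriters: list of callables (program, target) -> program
    score:     callable (program, target) -> float, higher is better
    """
    rng = rng or random.Random(0)
    ids = sorted(programs)
    n_pick = max(1, int(rewrite_fraction * len(ids)))
    updated = dict(programs)
    for sid in rng.sample(ids, n_pick):  # sparse: only a subset is rewritten
        rewrite = rng.choice(rewriters)
        candidate = rewrite(programs[sid], targets[sid])
        # Inject the rewritten program only if it improves the objective.
        if score(candidate, targets[sid]) > score(programs[sid], targets[sid]):
            updated[sid] = candidate
    return updated


# Toy domain: a "program" is a tuple of integers that should sum to the target,
# and shorter programs are preferred.
def score(program, target):
    return -abs(sum(program) - target) - 0.01 * len(program)


def drop_zeros(program, target):
    # Pruning-like rewrite: remove terms that contribute nothing.
    return tuple(x for x in program if x != 0)


def nudge_last(program, target):
    # Parameter-like rewrite: adjust the last term to match the target.
    if not program:
        return program
    return program[:-1] + (program[-1] + target - sum(program),)


programs = {"a": (1, 0, 2), "b": (0, 0, 5), "c": (3, 3)}
targets = {"a": 3, "b": 7, "c": 6}
new_programs = siri_round(
    programs, targets, [drop_zeros, nudge_last], score, rewrite_fraction=0.67
)
```

In a full bootstrapped setup, a round like this would alternate with retraining the inference network on the updated program set. The sketch covers only the rewrite step.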

Code Rewriters

We identify three rewrite operators that generalize across multiple shape-program domains, namely Parameter Optimization (PO), Code Pruning (CP), and Code Grafting (CG). Please refer to the paper for more details.

Parameter Optimization (PO)

The PO rewriter updates continuous parameters of the program while keeping its discrete structural parameters fixed. Here, we show an example of using PO on a 3D CSG program. See the supplementary material for more details.
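The split between fixed structure and tunable continuous parameters can be sketched on a toy 2D CSG language. The code below is an assumption-laden illustration, not the paper's PO rewriter: the tuple-based program encoding, the helpers `execute`, `iou`, and `optimize_parameters`, and the 16x16 grid are hypothetical, and a simple gradient-free coordinate search stands in for whatever optimizer the authors use.

```python
# 16x16 sample grid over the unit square (hypothetical resolution).
GRID = [(x / 15.0, y / 15.0) for y in range(16) for x in range(16)]


def execute(structure, params):
    """Evaluate a program on GRID; params is an iterator of continuous values."""
    op = structure[0]
    if op == "circle":
        cx, cy, r = next(params), next(params), next(params)
        return [(x - cx) ** 2 + (y - cy) ** 2 <= r * r for x, y in GRID]
    left = execute(structure[1], params)
    right = execute(structure[2], params)
    if op == "union":
        return [a or b for a, b in zip(left, right)]
    if op == "diff":
        return [a and not b for a, b in zip(left, right)]
    raise ValueError(f"unknown op: {op}")


def iou(a, b):
    inter = sum(x and y for x, y in zip(a, b))
    union = sum(x or y for x, y in zip(a, b))
    return inter / union if union else 1.0


def optimize_parameters(structure, params, target, step=0.2, min_step=0.01):
    """Coordinate search over continuous parameters; the structure never changes."""
    best = list(params)
    best_score = iou(execute(structure, iter(best)), target)
    while step >= min_step:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_score = iou(execute(structure, iter(trial)), target)
                if trial_score > best_score:
                    best, best_score, improved = trial, trial_score, True
        if not improved:
            step /= 2  # refine the search once no move at this scale helps
    return best, best_score


# Discrete structure: a union of two circles. Leaves carry no numbers;
# all continuous values live in the flat parameter list.
structure = ("union", ("circle",), ("circle",))
target = execute(structure, iter([0.3, 0.5, 0.2, 0.7, 0.5, 0.2]))
init = [0.4, 0.4, 0.1, 0.6, 0.6, 0.1]
start_score = iou(execute(structure, iter(init)), target)
tuned, tuned_score = optimize_parameters(structure, init, target)
```

Keeping the parameters in a flat list separate from the expression tree makes the invariant explicit: the optimizer can only move numbers, never add, remove, or reorder operations.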

Code Grafting (CG)

The CG rewriter replaces sub-expressions of a program with better ones from a cache of previously discovered sub-expressions. To accelerate this search, we derive the desired execution of sub-expressions w.r.t. the target shape by performing a masked function inversion. Here, we show the CG rewriter applied to a 3D CSG program.
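One way to read "masked function inversion" for boolean CSG operators is sketched below, under stated assumptions. Given the target occupancy and the execution of a sibling sub-expression, we can work out what a child should output and in which cells its output matters at all. The function names (`invert_for_child`, `masked_agreement`, `best_graft`), the occupancy-list representation, and the agreement score are hypothetical and are not taken from the paper.

```python
def invert_for_child(op, child_index, target, sibling):
    """Return (desired, mask) for one child of a boolean CSG node.

    target, sibling: lists of bool occupancy values over the same cells.
    mask[i] is True where the child's value at cell i affects the parent's output.
    """
    if op == "union":
        # Where the sibling is already filled, the child cannot change the result.
        return list(target), [not s for s in sibling]
    if op == "intersection":
        # Where the sibling is empty, the result is empty regardless of the child.
        return list(target), list(sibling)
    if op == "difference" and child_index == 0:  # A in (A - B)
        return list(target), [not s for s in sibling]
    if op == "difference" and child_index == 1:  # B in (A - B)
        # B must be empty where the target is filled, and filled where it is empty.
        return [not t for t in target], list(sibling)
    raise ValueError(f"unsupported op/child: {op}/{child_index}")


def masked_agreement(candidate, desired, mask):
    """Fraction of cared-about cells where the candidate matches the desired output."""
    cared = [i for i, m in enumerate(mask) if m]
    if not cared:
        return 1.0
    return sum(candidate[i] == desired[i] for i in cared) / len(cared)


def best_graft(cache, desired, mask):
    """Pick the cached sub-expression whose execution best fits the masked target."""
    return max(cache, key=lambda name: masked_agreement(cache[name], desired, mask))


def bits(s):
    return [c == "1" for c in s]


# Toy 1D example: the program is union(A, B) and we look for a better A.
target = bits("111100")
sibling_b = bits("001100")
cache = {
    "left_block": bits("110000"),
    "everything": bits("111111"),
    "right_block": bits("000011"),
}
desired, mask = invert_for_child("union", 0, target, sibling_b)
choice = best_graft(cache, desired, mask)
grafted = [a or b for a, b in zip(cache[choice], sibling_b)]
```

Because the mask removes cells the sibling already determines, candidates can be compared directly against the desired execution instead of re-executing the whole program for every cache entry.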

Qualitative Results

Comparison to Domain-Specific Approach (CSGStump)

Comparison to Bootstrapped Learning Approach (PLAD)

BibTeX

@inproceedings{ganeshan2023coref,
      title={Improving Unsupervised Visual Program Inference with Code Rewriting Families},
      author={Ganeshan, Aditya and Jones, R. Kenny and Ritchie, Daniel},
      booktitle = {Proceedings of the International Conference on Computer Vision ({ICCV})},
      year={2023}
    }

Acknowledgement

We would like to thank the anonymous reviewers for their helpful suggestions. This work was funded in part by NSF award #1941808 and a Brown University Presidential Fellowship. Daniel Ritchie is an advisor to Geopipe and owns equity in the company. Geopipe is a start-up that is developing 3D technology to build immersive virtual copies of the real world with applications in various fields, including games and architecture.