Many-Worlds Inverse Rendering

ZIYI ZHANG, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland

NICOLAS ROUSSEL, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland

WENZEL JAKOB, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland

Geometry optimization states

Occlusion Shading

Occlusion

(

a) Local perturbation of

(

b) Non-local perturbation of

Light path

Derivative domain

Iter 6002015105

Shading

Fig. 1. Le: (a) Prior dierentiable rendering methods compute how surface deformations impact light transport through occlusion and shading changes.

The resulting gradients drive local geometric adjustments. (b) Our method instead considers adding hypothetical surface patches anywhere in 3D space.

We simultaneously evaluate many such patches as independent, competing explanations of the input data. This approach, termed many-worlds derivatives,

extends gradient computation from surfaces into the surrounding space. This combines the robustness of volumes with the eiciency of surface rendering:

our method does not require an initial mesh and can be started from an empty scene, while avoiding the expense of transmiance and multiple scaering

computations. Right: An example reconstruction using many-worlds derivatives: a triangle mesh embedded in glass and observed through a mirror.

Discontinuous visibility changes remain a major bottleneck when optimiz-

ing surfaces within a physically based inverse renderer. Many previous

works have proposed sophisticated algorithms and data structures to sample

visibility silhouettes more eciently.

Our work presents another solution: instead of evolving a surface locally,

we extend dierentiation to hypothetical surface patches anywhere in 3D

space. We refer to this as a “many-worlds” representation because it models

a superposition of independent surface hypotheses that compete to explain

the reference images. These hypotheses do not interact through shadowing

or scattering, leading to a new transport law that distinguishes our method

from prior work based on exponential random media.

The complete elimination of visibility-related discontinuity handling

bypasses the most complex and costly component of prior inverse rendering

methods, while the extended derivative domain promotes rapid convergence.

Authors’ Contact Information: Ziyi Zhang, École Polytechnique Fédérale de Lausanne

(EPFL), Lausanne, Switzerland, ziyi.zhang@ep.ch; Nicolas Roussel, École Polytech-

nique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, nicolas.roussel@ep.ch;

Wenzel Jakob, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland,

wenzel.jakob@ep.ch.

Permission to make digital or hard copies of all or part of this work for personal or

classroom use is granted without fee provided that copies are not made or distributed

for prot or commercial advantage and that copies bear this notice and the full citation

on the rst page. Copyrights for components of this work owned by others than the

author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or

republish, to post on servers or to redistribute to lists, requires prior specic permission

and/or a fee. Request permissions from permissions@acm.org.

ACM 1557-7368/2025/9-ART

https://doi.org/10.1145/nnnnnnn.nnnnnnn

We demonstrate that the resulting Monte Carlo algorithm solves physically

based inverse problems with both reduced per-iteration cost and fewer total

iterations.

CCS Concepts: • Computing methodologies → Rendering.

Additional Key Words and Phrases: dierentiable rendering

ACM Reference Format:

Ziyi Zhang, Nicolas Roussel, and Wenzel Jakob. 2025. Many-Worlds Inverse

Rendering. ACM Trans. Graph. 1, 1 (September 2025), 17 pages. https://doi.

org/10.1145/nnnnnnn.nnnnnnn

1 Introduction

Dramatic progress in the area of inverse rendering has led to meth-

ods that can fully reverse the process of image formation to recon-

struct 3D scenes from 2D images.

This is usually formulated as an optimization problem: given a

loss function

and a rendering

R(𝜋)

of tentative parameters

𝜋

, we

seek to minimize their composition

𝜋

∗

= argmin

𝜋 ∈Π

L(R (𝜋 )). (1)

The details greatly vary depending on the application, but usually

will quantify the dierence between the rendering and one or

more reference images.

Recent work on this problem is largely based on emissive volume

reconstruction [Mildenhall et al

2021; Kerbl et al

2023]. The term

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

2 • Zhang et al.

“emissive” highlights that this approach does not rely on simulating

light physics or material reectance; instead, the volume is treated

as if it was a natural emitter of light and stores color values repre-

senting this emitted radiation. These methods are popular because

the direct mapping between stored color and observed appearance

results in well-behaved optimization problems.

Physically based methods instead seek a more complicated expla-

nation: they simulate emission and scattering inside a general scene

that can contain essentially anything: surfaces, volumes, physically

based BRDFs, etc. Interreection adds dense nonlinear parameter

dependencies that make this a signicantly harder optimization

problem. It goes without saying that these two approaches do not

compete: when an emissive volume is an acceptable answer, it should

always be preferred. This is because it encapsulates material, light-

ing, geometry and inter-reection in a unied eld representation

that promotes speedy and stable convergence.

That said, many applications rely on reconstructions that account

for indirect cues—such as shadows and interreections—and aim

to produce physically meaningful results suitable for downstream

tasks like relighting and editing. Unfortunately, algorithms designed

for this harder physical problem are often surprisingly brittle.

A default strategy is to use gradient-based descent to evolve a

scene representation (e.g., a triangle mesh) in a domain

contain-

ing the target

𝜋

∗

. It seems only natural that we try to reach this

goal by evolving compatible models 𝜋

𝑖

∈ Π in this domain, e.g., by

deforming a tentative surface starting from an initial guess.

However, a signicant complication arises when targeting the

physically based avor of this problem: computing

∇

𝜋

L(R (𝜋

𝑖

))

with automatic dierentiation produces incorrect gradients due to

parameter-dependent discontinuities caused by the visibility func-

tion. Incorporating specialized techniques to x this problem simply

leads to the next one: the algorithm now outputs sparse gradients

on object silhouettes that cause convergence to bad local minima.

Preconditioning and regularization can help, but even with all these

xes in place, the optimization often still does not work all that well.

To avoid these problems, we consider optimizations on an ex-

tended parameter space:

𝜋

∗

𝜋

∗

= argmin

𝜋 ∈Π,

𝜋 ∈

R(𝜋,

𝜋)), (2)

where

is furthermore parameterized by

𝜋 ∈

representing fea-

tures that we are unwilling to accept in the nal solution. If the role

of these “perpendicular dimensions” diminishes over time, then we

can simply discard

𝜋

∗

at the end and take

𝜋

∗

to be the solution of

the original optimization problem.

The parameter space extension serves two purposes: rst, it turns

a circuitous trajectory through a non-convex energy landscape into

a more direct route by using the extra degrees of freedom:

Second, we will choose the extension so that visibility changes in the

original problem cease to be discontinuous in the extended space,

which greatly reduces algorithmic complexity.

Seen from a high level, this isn’t a new idea: numerous works in

emissive surface reconstruction [Yariv et al

2021; Wang et al

2021;

Miller et al

2024] have shown that the problem becomes tractable

when starting from an emissive volume like NeRF. However, tak-

ing this idea to the world of physically based rendering (e.g., by

optimizing a scattering microake volume) leads to several serious

problems:

(1)

Speed. Introducing random volumes into a physically based

renderer requires solving the radiative transfer equation (RTE),

which involves free-ight sampling and transmittance estima-

tion along every ray segment. In heterogeneous volumes, both

are computationally expensive iterative processes, making this

approach signicantly slower than surface-based methods.

(2)

Extraction. A separate optimization is required to convert

converged volumes into optically similar surfaces with BRDFs,

and the resulting surface is often only an approximation of the

volume appearance. Prior work related the bulk properties of

microfacets and microakes [Dupuy et al

2016; Miller et al

2024], but this insight cannot be used to convert a general

microake volume into a surface, nor does it generalize to the

richness of reectance models in modern rendering systems.

Instead of introducing an exponential volume that blends all

possible surfaces multiplicatively, we retain a surface (denoted

𝑆

)

and augment it with a spatial representation

𝑆

to model non-local

perturbations of that surface.

Consider placing a new surface patch above the original sur-

face. Standard dierentiation of a path tracer does not account

for this type of change, yet it is a valid way to evolve the scene

representation—one that evolves the surface in a non-local manner.

Importantly, these non-local perturbations can occur simultane-

ously along a light path: the optimization of a potential surface

patch at one location is independent of that at another location.

This is because these hypothetical surfaces aren’t real (or at least,

not yet). They constitute dierent surface possibilities that may or

may not become part of the nal reconstruction. It makes little sense

for them to aect each other.

This leads to a new transport law that we refer to as many-worlds

derivative transport. The term “many-worlds” not only suggests

a distribution of surfaces but also emphasizes the non-interacting

nature of dierent perturbations (“worlds”).

Dierentiation of this model produces dense surface derivatives

in an extended domain. Since we avoid diusing values into an

exponential volume, the optimization remains as ecient as surface

evolution methods. Finally, mesh extraction is a simple projection

that discards

𝑆

, and all of this is easily incorporated into a physically

based path tracer.

Although this paper sometimes refers to the spatial representa-

tion

𝑆

as a “volume”, our method should not be interpreted as volume

reconstruction. Specically, we never optimize this “volume”

𝑆

match input images; instead, we use it to rene the surface

𝑆

. More-

over, our method interacts with BRDFs rather than phase functions,

and has no concept of transmittance in its radiative transfer.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 3

Our main contribution is a solution to the classic discontinuity

problem that is algorithmically simpler and computationally more

ecient than prior methods. We present two derivations that alter-

natively formulate the central algorithm as either

•

a non-local perturbation of an evolving surface (Section 3), or

•

an extension of the surface derivative domain that removes the

need for silhouette sampling (Section 4 and Appendix A).

2 Related work and background

This section reviews related prior work on inverse rendering of solid

geometry. The simultaneous disentanglement of geometry, material

and lighting is orthogonal to the topic of this work.

2.1 Physically based inverse rendering

The physically based approach tries to model all available infor-

mation in input images by accounting for lighting, materials, and

interreection. This leads to several challenges.

Discontinuities. Analytic dierentiation of the rendering equation

produces a boundary derivative term that is not sampled by primal

rendering strategies. Consequently, applying automatic dierentia-

tion to a rendering algorithm produces incorrect gradients.

Estimating the missing derivative requires sampling rays along

shape boundaries (i.e., silhouettes). Prior works [Li et al

2018; Zhang

et al

2020; Yan et al

2022; Zhang et al

2023a; Xu et al

2023, 2024]

have extensively studied the theory and practice of estimating this

boundary term. Another approach employs reparameterizations to

convert the boundary contribution to neighboring regions, enabling

simpler area-based formulations. These methods [Loubet et al

2019;

Bangaru et al

2020] often require tracing auxiliary rays to detect the

presence of neighboring silhouettes. For geometry representations

based on signed distance elds (SDFs), specialized algorithms can

alleviate some of this cost [Vicini et al

2022; Bangaru et al

2022;

Wang et al. 2024].

While the theory of the boundary term continues to evolve, prac-

tical eciency and robustness remain challenging. Current methods

still have limitations concerning bias (e.g., specular surfaces) and

variance (e.g., when multiple silhouettes are in close proximity).

These methods are dicult to implement, and the cost of boundary

handling is usually the main bottleneck of the entire optimization.

Sparse gradients. The derivative of the rendering process can be

decomposed into two types: the continuous part and the discontinu-

ous part. We refer readers to Figure 4 in Zhang et al

’s work [2023a]

for an illustration. Since our paper focuses on geometry optimiza-

tion, we refer to the continuous part as the shading derivative and

the discontinuous part as the boundary derivative in the following.

The shading derivative is dened everywhere on the surface

and primarily aects the surface through its normal and heavily

depends on the lighting. Adjoint rendering methods [Nimier-David

et al

2020; Vicini et al

2021b] exploit symmetries to compute the

shading derivative with linear time complexity. Zeltner et al

[2021]

analyze dierent strategies for variance reduction in this process.

The boundary derivative, on the other hand, is only dened on the

visibility silhouette of the surface. These surface derivatives are

sparse in two senses:

(1)

A single view only observes a limited subset of all possible

silhouettes, and a derivative step is likely to introduce a kink

at this subset. Nicolet et al

[2021] reparameterize meshes on a

space that promotes smoothness to mitigate this issue.

(2)

The derivatives are only dened on the surface, not through-

out the entire 3D space. Although light transport is simulated

in this larger space, only the neighborhood of a surface can

use the computed gradients. Regions far from the initial guess,

must wait until the surface extends to them to receive gradi-

ents. Mehta et al

[2023] propose an elegant solution for vector

graphics in 2D; however, they do not extend their approach to

dene surface derivatives in the entire 3D space.

The sparsity has two implications: (1) the optimization is slow

because it needs many iterations to deform the surface to the target

shape, and (2) a good initial guess is needed, especially when the

target shape has a complicated topology or self-occlusions.

Approximations. There is a substantial body of literature on ap-

proximating the PBR process to make optimization easier. These

methods often bypass the discontinuity problem by assuming knowl-

edge of a shape mask (implicitly assuming the shape is directly ob-

servable), approximating boundary derivatives [Laine et al

2020], or

diusing the rendering process into a higher-dimensional space [Fis-

cher and Ritschel 2023]. Additionally, many methods approximate

the light transport for eciency [Zhang et al

2021a; Jin et al

2023;

Zhang et al. 2021b; Boss et al. 2021]. These methods are physically

inspired and can handle complex scenes, while our theory focuses

on dierentiating a fully physically based rendering process.

Nimier-David et al

[2022] demonstrate that a physically based

non-emissive volume can be optimized to t the input images. These

methods are more physically accurate than the emissive volume

methods but do not yield a surface representation in the end.

2.2 Emissive volume inverse rendering

A large body of work based on variants of the NeRF [Mildenhall

et al

2021] technique replaces the expensive rendering process with

an emissive volume, which leads to a fog-like volume that does not

accurately represent a solid surface.

Later work [Yariv et al

2021; Wang et al

2021; Li et al

2023;

Miller et al

2024] added a surface prior to the volume model by

introducing an SDF and deriving the volume density from it. These

methods map SDF values to volume densities using a distribution

function, assuming that a ray intersects a stochastic object following

a Markov process. The variance of the distribution is learnable

during optimization: it begins with a high-variance distribution,

mimicking the NeRF model, and gradually reduces to ensure that a

surface can condently be extracted.

These SDF-parameterized volume reconstructions can be seen as

a blurry version of surface-based methods without indirect eects.

When the variance is zero, these methods simplify to surface-based

methods

. With moderate variance, the eect of a solid object, which

we can interpret as “terminating a ray at a position with probability

1”, is diused to the neighborhood, interpretable as “terminating a

ray in this neighborhood according to a probability density”.

This equivalence is theoretical. In practice, algorithms designed for volume optimiza-

tion cannot handle such a Dirac-delta density distribution due to discontinuities.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

4 • Zhang et al.

2.3 Surfaces versus volumes

Several works studied the relationship of surface and volume ren-

dering: Heitz et al

[2016] and Dupuy et al

[2016] describe micro-

facet surfaces within the framework of random volumes. Vicini

et al

[2021a] incorporated non-exponential transmittance to ac-

count for correlations arising from opaque surfaces. Miller et al

[2024]

formalize microake volumes as the relaxation of a stochastic sur-

face model and solve inverse problems using this representation.

Seyb et al

[2024] recently incorporated a richer set of spatial correla-

tions, producing a forward rendering framework that spans the full

continuum ranging from pure volume to surface-like interactions.

3 Method

3.1 Motivation

Extended parameter space. Our algorithm optimizes a surface

𝑆

in an indirect manner via a distribution

𝑆

of potential surfaces that

denes an associated density eld in 3D space. The surface

𝑆

is a

product of this eld, for example by extracting a level set. By modi-

fying the distribution

𝑆

rather than

𝑆

directly, we enable gradient

propagation throughout the entire space, not just on the surface

itself. The details of how these spaces are dened are orthogonal to

our method, and we describe them in Appendix C.

Figure 1b demonstrates the eect of adding a hypothetical surface

patch (drawn from

𝑆

) into a scene containing the surface

𝑆

. This

patch modies how light propagates in the scene, thereby changing

the surface rendering of

𝑆

. By adjusting the existence and properties

(e.g., normals, BRDFs) of such patches, we iteratively improve

𝑆

better match the target image.

We refer to this a non-local perturbation of

𝑆

: optimizing such hy-

pothetical patches propagates derivatives across the entire domain

of 𝑆, rather than conning updates locally on the surface.

There is a noteworthy connection to prior work: in the limiting

case where the perturbation position coincides with

𝑆

itself, our

derivatives exactly match standard surface derivatives in physical

light simulation [Zhang et al

2020, Zhang et al

2023a]. Appendix A

provides details on this equivalence, showing that our method is

a generalization of local surface evolution, extending its domain

while preserving its geometric meaning.

The optimization converges when no further perturbation im-

proves the match between

𝑆

and the target image. At this point, we

discard 𝑆 and keep

𝑆.

Within each iteration of the optimization,

𝑆

serves as a static

background, providing base colors for perturbations without being

optimized itself. Thus, we also refer to

𝑆

as the background surface.

Conicting possibilities. Consider a simple case of two non-local

perturbations along a ray. We think of them as competing candidates

for improving the agreement between the radiance arriving from

𝑆

and a reference. For example, suppose that

𝜕loss

𝜕𝐿

indicates that the

current pixel’s radiance is too high—in this case, the same informa-

tion should be propagated to both positions without weighting.

Perturbation 1

Perturbation 2

Here, it might seem natural to distribute the target update between

the two positions — say, by scaling it by

. However, this weighting

implicitly assumes the two perturbations compound to rene the

same surface

𝑆

. In reality, they are mutually exclusive possibilities —

once optimization converges, only one will contribute to the nal

radiance, without any kind of blending.

This principle extends to more than two perturbations: all candi-

date positions along the ray should receive the same target update,

as if existing in

many worlds

that do not interact. Ultimately, the ray

will intersect only one of these possibilities, which becomes part of

the nal surface.

The above discussion explains the motivation behind our method

at an intuitive level. Our next goals are therefore to quantify non-

local perturbations and derive the derivative transport law.

3.2 Many-worlds derivative transport

Single perturbation case. Consider a scene containing only the

background surface

𝑆

, where the radiance propagating along ray

(y, x) with direction −𝝎 remains constant:

𝐿

(0) = 𝐿

(𝑠),

where

𝐿

(0)

= 𝐿

(x, 𝝎)

(incident radiance at x) and

𝐿

(𝑠)

= 𝐿

(y, −𝝎)

(outgoing radiance at y) are parameterized by distance.

For a non-local perturbation candidate at distance

𝑡

, we model

its impact on radiative transport as:

𝐿

(0) = 𝛼 (𝑡) 𝐿

(𝑡)

|{z}

perturbed

[

1 − 𝛼 (𝑡)

]

𝐿

(𝑠)

|{z}

original

, (3)

where

𝛼 (𝑡) ∈ [

]

is the probability of a hypothetical surface

patch existing at

𝑡

. The perturbed radiance

𝐿

(𝑡)

computes reected

radiance as if the patch were inserted at

𝑡

, while the rest of the

scene remains

𝑆.

While this formulation is of little use for physically based render-

ing, its derivative (with respect to any parameter

𝜋

) provides the

means to optimize on the extended parameter space:

𝜕

𝜋

𝐿

(0) = 𝜕

𝜋

𝛼 (𝑡) [𝐿

(𝑡) − 𝐿

(𝑠)]

(i) Occlusion

+ 𝛼 (𝑡 ) 𝜕

𝜋

𝐿

(𝑡)

(ii) Shading

. (4)

If 𝜕

𝜋

𝐿

(0) is positive, we can increase the incident radiance by

(1)

Raising the occupancy at

𝛼 (𝑡)

if this perturbation is favorable,

i.e., 𝐿

(𝑡) > 𝐿

(𝑠), or lower it otherwise.

(2)

Increasing the reected radiance

𝐿

(𝑡)

, e.g., by altering the

normal or BRDF of the hypothetical surface.

Both operations locally update the distribution 𝑆 at 𝑡.

Multiple perturbation case. We now extend to the situation where

every point along the ray can be considered a potential perturbation.

As discussed in Section 3.1, we aim to distribute the target update

uniformly across all possibilities along one segment.

Similar to the reverse-mode derivative of an addition

𝑐 = 𝑎 + 𝑏

that simply forwards the derivative towards its operands

𝑎

and

𝑏

without weighting (

𝜕𝑐

𝜕𝑎

1, not

), we sum radiative contributions

from distinct perturbations using a continous integral to achieve an

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 5

exponential transmiance

many-worlds

linear transmiance

Fig. 2. Many-worlds derivative transport. The propagated derivative

at distance

𝑡

is only weighted by the local radiance dierence and the

local occupancy respectively (Equation 6). In contrast to exponential or

non-exponential (e.g., linear [Vicini et al

2021a]) volumes, the notion of

transmiance disappears as it models nonsensical inter-world shadowing.

Occupancy field Orientation field

empty

filled

surface

Fig. 3. Occupancy and orientation fields. The above images visualize

the contents of an occupancy (

𝛼

) and orientation (

𝜷

) field following an

optimization. The former models the probability of a surface existing at a

position, while the laer assigns normal directions.

analogous propagation behavior:

𝐿

(0) =

∫

𝑠



𝛼 (𝑡) 𝐿

(𝑡)

|{z}

candidate at 𝑡

[

1 − 𝛼 (𝑡)

]

𝐿

(𝑠)

|{z}

background



d𝑡. (5)

Dierentiating it yields the many-worlds derivative transport law:

𝜕

𝜋

𝐿

(0) =

∫

𝑠



𝜕

𝜋

𝛼 (𝑡) [𝐿

(𝑡) − 𝐿

(𝑠)] + 𝛼 (𝑡) 𝜕

𝜋

𝐿

(𝑡)



d𝑡. (6)

The derivative in Equation 6 lacks transmittance terms that would

ordinarily model attenuation along the ray (Figure 2). This stems

from the core principle that distinct worlds must not interact.

The only way in which the background surface

𝑆

manifests in this

equation is to provide a single baseline radiance value

𝐿

(𝑠)

needed

to compute a dierence of radiance values. As a result,

𝑆

does not

directly receive gradients; yet, it still evolves during optimization

as a result of changes in the distribution 𝑆 that generates

𝑆.

Parameterization. Equation 6 requires derivative propagation to-

wards two quantities in the extended parameter space: the prob-

ability of encountering a surface within the distribution, and the

outgoing radiance 𝐿

(x, 𝝎) determined by its properties.

Any

𝑆

with dierentiable realizations of these quantities is in

principle suitable—we use an occupancy eld

𝛼 (x) : R

→ [0, 1]

and an orientation eld 𝜷 (x) : R

→ S

(Figure 3).

The orientation eld

𝜷 (

)

assigns a normal direction to the sur-

face patch at x. The occupancy eld

𝛼 (

)

[Mescheder et al

2019;

Iteration 20

Iteration 200

Ground truth geometry

Without orientationWith orientation

Optimization views

Fig. 4. Importance of the orientation field. We compare reconstructions

done with and without an orientation field

𝜷

. The top row without

𝜷

uses an

isotropic normal distribution. A lack of orientation information dramatically

slows down convergence and produces incorrect meshes.

Niemeyer et al. 2020] models the probability

𝛼 (x)

= Pr{x is inside of 𝑆}, (7)

and is zero when encountering the back side (𝝎 · 𝜷 (x) < 0).

Anisotropy is crucial for physically based inverse rendering. Fig-

ure 4 demonstrates this: assuming a uniform normal distribution for

surface patches produces incorrect results. This happens because

the reference scene is surface-based, where light reection is highly

anisotropic—a property that isotropic distributions fail to capture.

This concludes our derivation via non-local surface perturbations.

To reinforce these results and gain a deeper understanding:

(1)

Appendix A re-derives Equation 6 by analyzing standard sur-

face derivatives and extending them into space.

(2)

Appendix B frames our approach using the mathematical lan-

guage of random volumes.

3.3 Primal rendering

Dierentiable rendering pipelines normally repeatedly render a pri-

mal image, dierentiate a loss, and then backpropagate derivatives.

The previous discussion was only concerned with derivative prop-

agation, and there is thus a question of how to generate a primal

image of our extended parameter space

A natural choice is to only use the background surface

𝑆

for

primal rendering. (Note that primal images serve solely to compute

the adjoint radiance

𝜕loss

𝜕𝐿

—we still employ Equation 6 for derivative

propagation.) Unfortunately, this approach fails catastrophically in

our framework: regions beyond the surface

𝑆

are excluded from

the loss computation yet still receive gradient updates, causing

optimization to become unstable, divergent, and eectively random.

We thus use a primal rendering that incorporates both the sur-

face

𝑆

and the distribution

𝑆

. Recall that the many-worlds principle

In many applications, derivative propagation arises naturally from automatic dieren-

tiation of the primal computation. However, this coupling introduces bias in physically

based rendering [Nimier-David et al

2020, Section 3.2], which necessitates separate,

uncorrelated Monte Carlo simulation of primal rendering and derivative transport.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

6 • Zhang et al.

manifests in Equation 5 as a direct summation of non-local perturba-

tions. This summation is not directly suitable for primal rendering

as it yields unbounded values. For all experiments in this work,

we compute an average over all perturbations (dierent from an

average over all possible surface renderings) by scaling Equation 5

by a factor of 1/𝑠:

𝐿

(0) =

𝑠

∫

𝑠



𝛼 (𝑡) 𝐿

(𝑡) +

[

1 − 𝛼 (𝑡)

]

𝐿

(𝑠)



d𝑡. (8)

The scaling is an empirically motivated choice rather than a theo-

retically unique solution. Our experiments demonstrate that this

normalization is straightforward to implement and produces mean-

ingful adjoint radiances (

𝜕loss

𝜕𝐿

) that enable rapid convergence.

Relation to “radiance eld loss”. A recent work [Zhang et al

2025]

instantiated our many-worlds framework for radiance eld recon-

struction, specializing it to the simplied setting of pure emission

without scattering. This enables several simplications: (1) due to

the lack of BRDF and light integration, noise-free radiance values

can be retrieved from an appropriate representation (typically a

neural network), (2) ray marching replaces Monte Carlo sampling,

and (3) rays are traced from xed pixel centers, so the pixel footprint

integral also disappears.

As a result, there is a 1:1 mapping between surface radiance and

reference pixel values, allowing individual losses to be dened for

each potential surface, which is unattainable in our setting. Since ra-

diance contributions from dierent perturbations are never summed,

their method sidesteps the scaling factor in Equation 8. In contrast,

this work addresses the more dicult nested integral problem in-

herent to physically based rendering, where such simplications do

not apply.

4 Discussion

Pseudocode. We present one possible implementation of our many-

worlds framework in Mitsuba 3 [Jakob et al

2022]. Algorithm 1 uses

a variable

mode

to distinguish between the primal rendering pass

and the derivative propagation pass.

Relation to surface derivatives. Prior work on geometry dierenti-

ation have proposed local derivative formulations [Zhang et al

2020,

Zhang et al

2023a] to quantify how small perturbations of a surface

aect radiative transport across the entire scene in physical light

simulation.

Appendix A extends such formulation to measure how tiny changes

of any hypothetical surface patch within

𝑆

inuence radiative trans-

port. This oers a quantitative way to re-derive the many-worlds

derivative transport law using established theory. We made the

following observations:

(1)

Our method does the right thing near the surface: the opti-

mization behavior matches surface dierentiation algorithms

without requiring explicit silhouette sampling.

(2)

Visibility and shading derivatives are combined into a unied

expression in the extended parameter space. This unication is

impossible on the surface

𝑆

, as the two derivatives are dened

on dierent domains. Unlike prior work—which required sepa-

rate algorithms to compute these two types of derivatives—our

method drastically reduces algorithmic complexity.

Listing 1. Pseudocode of the Many-Worlds primal/backward pass

1 mode = "Primal" # Or "Backward"

3 def Li(x, 𝝎):

4 # Pick a segment to interact with a surface patch

5 𝑘

= ⌊𝑘

max

* rand()⌋

6 return Li_k(x, 𝝎, 𝑘

, 0)

8 def Li_k(x, 𝝎, 𝑘

, 𝑘):

9 if 𝑘 > 𝑘

max

: # Path length exceeds limit

10 return 0

12 # Radiance estimate from background surface

13 s = ray_intersect(

𝑆, x, 𝝎)

14 x

′

= x + s * 𝝎 # Advance to surface

15 𝝎

′

, 𝑤

brdf

= sample_brdf(x

′

, -𝝎)

16 L_bg = Le(x

′

,-𝝎) + Li_k(x

′

, 𝝎

′

, 𝑘

, 𝑘 + 1) * 𝑤

brdf

17 if 𝑘 != 𝑘

: # Segment is before/after the sampled segment

18 return L_bg

20 # Radiance estimate from sampled surface patch

21 t, 𝑤

surf

= sample_surface(rand(), s)

22 if mode == "Primal":

23 weight = 𝑤

surf

/ s # Primal pass: not differentiated

24 else:

25 weight = 𝑤

surf

# Backward pass: derivative propagation

26 x

′

= x + t * 𝝎 # Advance to sampled surface

27 𝝎

′

, 𝑤

brdf

= sample_brdf(x

′

, -𝝎)

28 occupancy = 𝛼(x

′

) # Occupancy at sampled point

29 L_fg = Le(x

′

,-𝝎) + Li_k(x

′

, 𝝎

′

, 𝑘

, 𝑘 + 1) * 𝑤

brdf

30 return lerp(L_bg, L_fg, occupancy) * weight

Relation to volume rendering. At a high level, both inverse vol-

ume rendering [Nimier-David et al

2022] and our method can be

interpreted as optimizing a distribution of surfaces. However, the

two approaches dier fundamentally in how they interact with this

distribution: when multiple potential surfaces exist along a ray, how

do we model their interplay?

•

Exponential volume rendering treats interactions as statisti-

cally independent events, leading to a memoryless Poisson

process where multiple interactions occur, weighted by rela-

tive occlusion probabilities [Bitterli et al. 2018].

•

Our method treats interactions as mutually exclusive events,

ensuring potential surfaces along a ray do not interact via

shadowing or scattering.

Consider a distribution

𝑆

modeling a nearly empty scene con-

taining a single innite plane with an uncertain oset. Within any

realization of this distribution, light traveling towards this plane

will scatter exactly once. In contrast, the volume model diuses the

ray-surface interaction into a band of microakes that cause arbi-

trarily long scattering chains. This not only leads to a signicant

increase in computational cost, but also discombobulates the optical

interpretation of these interactions: the eect of multiple scattering

must later be approximated as a surface BRDF, a process that in

general admits no exact solution and relies on approximation.

Traditional inverse rendering pipeline (which renders the scene to

compute a loss) collapses under the mutually exclusive assumption,

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 7

Fig. 5. Our method refines a surface

𝑆

by optimizing a distribution

𝑆

possible surfaces. Each hypothetical surface patch drawn from

𝑆

is indepen-

dently adjusted to improve the matching between

𝑆

and the target image.

To define the behavior of possible surfaces in space,

𝑆

is parameterized by

an occupancy field 𝛼 (x) and an orientation field 𝜷 (x).

because it would render each potential surface as a separate scene,

thus unrealistically requiring every potential surface to match the

reference image. Instead, we optimize potential surfaces by rening

the rendering of a background surface

𝑆

, giving the algorithm a clear

convergence target (Figure 5). Such comparative adjustments lead

to the notion of non-local surface perturbations, and simultaneous

optimization of all perturbations leads to our many-worlds derivative

transport.

5 Results

Prior work on physically based inverse rendering often includes

comparisons of forward derivatives to other competing methods or

nite dierence-based reference derivatives. They reveal the change

in rendered pixels when perturbing a single scene parameter. How-

ever, such a comparison is neither applicable nor meaningful in

our method: rst, our method computes the extended derivative

on a higher-dimensional domain, which therefore cannot be com-

pared to existing methods. Second, the need for such comparisons is

motivated by the complexity of formulations that deal with discon-

tinuous visibility, but dierentiation our representation is “trivial”

and therefore not interesting. Because of this, we demonstrate the

correctness and performance of our method solely through end-to-

end optimizations.

All results presented in this paper exclusively use many-worlds

derivatives and explicitly disable surface derivatives on

𝑆

, even for

the albedo optimization demonstrated in Figure 7.

Multi-view reconstructions. Figure 6 presents multi-view object

reconstructions of a known material in various settings using our

method. All experiments use the Adam optimizer [Kingma and

Ba 2014]. We found that disabling momentum in the early itera-

tions helped avoid excessive changes while the occupancy eld

was still very far from convergence. Alternatively, a stochastic gra-

dient descent optimizer produces comparable results. During the

reconstruction, each view is rendered at a 512×512 resolution.

The last four rows of Figure 6 show reconstructions involving

perfectly specular surfaces. No prior PBR method could handle such

scenes: reparameterization-based methods would need to account

for the extra distortion produced by specular interactions, while

silhouette segment sampling methods would need to nd directional

emitters through specular chains. Both are complex additional re-

quirements that would be dicult to solve in practice, while the

problem simply disappears with the many-worlds formulation.

Albedo reconstruction. This paper primarily focuses on geometry,

but our method can also be used to optimize materials or lighting.

Figure 7 presents experiments where we jointly optimize the geom-

etry and albedo texture of an object. We store this spatially varying

albedo on an additional 3D volume that parameterizes the BSDF of

the many-worlds representation.

Benets of assumption-free geometry priors. Figure 8 compares

our method to a technique that evolves an SDF using reparameteri-

zations [Vicini et al

2022]. These experiments demonstrate that our

method requires signicantly fewer iterations because new surfaces

can be materialized from the very rst iteration.

We use 20 random optimization views, with each view rendered

at 256

256 pixels. Both methods utilize a geometry grid resolution

of 128

. The time required to perform one iteration of the opti-

mization for one view is 0

25 seconds for our method, 0

38 seconds

for the large sphere initialization and 0

32 seconds for the small

sphere initialization. Our method benets from the eciency of

ray-triangle intersections, whereas the SDF representation relies on

a more costly iterative sphere tracing algorithm. These measure-

ments also show how sphere tracing slows down with smaller steps

due to complicated geometry near a ray, as evidenced by the slower

performance of the larger sphere initialization.

Optimizing all positions at once. Figure 9 replicates the chair re-

construction experiment of a recent work by Zhang et al

[2023a]

to demonstrate the benets of the extended parameter space.

Interior topological changes. Figure 10 illustrates a limitation of

our method: it does not robustly handle interior topological changes.

This issue arises because many-worlds derivatives extend only to

the exterior of

𝑆

, leaving the interior unsupervised, much like tradi-

tional surface evolution methods. When attempting to create a hole,

we rely on shading derivatives to bend the surface inward, which

sometimes produces a hole as desired (top row). However, optimiza-

tion like this is sensitive to lighting conditions; as shown in the

bottom row, a dierent lighting can cause the

optimization to stall.

To address this limitation, Mehta et al

[2023] proposed explicitly

testing whether creating a cone-shaped hole is benecial, and Zhang

et al

[2025] suggested sampling the background surface stochastic-

ity to occasionally permit visibility through high-occupancy regions.

We leave the exploration of these extensions for future work.

The sphere initialization is used solely to demonstrate this limita-

tion; with an assumption-free initialization, our method converges

correctly and more quickly in both lighting conditions.

Subtractive changes. Figure 11 shows the optimization states for

an initialization with random geometry. We used 16 optimization

views and the same setup as in Figure 6. Since the many-worlds deriv-

ative extends the surface derivative domain without approximations,

our method naturally address scenarios that surface derivatives can

handle—in this case, removing superuous geometry and deforming

the rest to reconstruct the desired object.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

8 • Zhang et al.

P

D D HNF

Optimization views Optimization states (re-lit)

Ground truth

geometry

24 views24 views16 views

16 views

24 views

8 views

10 20 30 50 100 Iter 200

10 15 25 40 100 Iter 200

15 20 30 40 100 Iter 400

10 15 20 25 100 Iter 200

Iter 300

200

100

Fig. 6. Multi-view geometry reconstruction. For the Deer scene, all 8 views are behind the object and we only see the front side in the mirror. For

Polyhedra, Neptune and Fertility, the object is inside a spherical or cubic smooth glass container. The material is known during optimization: the Dragon

has a rough gold material, Polyhedra and Neptune are made of copper oxide, and others are diuse.

Ground truth

geometry and texture

Optimization views Optimization states (re-lit)

Iter 40010010 15 30 300

Iter 4005010 20 30 100

Fig. 7. Material optimization. This experiment demonstrates joint optimization of geometry and an albedo texture. Our method focuses on geometric

optimization, but it is compatible with more general inverse rendering pipelines that furthermore target material and lighting.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 9

Ours

SDF reparam

Ground truth geometry Optimization views

Geometry

initialization

Optimization states (re-lit)

1024

768

Ours SDF reparam Ground truth

Fig. 8. Assumption-free optimization. Prior surface optimization methods require an initial guess, which aects the quality of their output. We compare

our method to an unbiased surface derivative method (SDF reparameterization [Vicini et al

2022]) using an initialization with a sphere that is either bigger

(middle) or smaller (boom) than the overall size of the target. The experiment shows that the method struggles to reconstruct the target from the smaller

sphere, while carving it out of the bigger sphere seems more reliable. Our method (top) can optimize without such assumptions, i.e., starting from empty space.

Single optimization view

Optimization states (re-lit)

Geometry

initialization

Ours

Projective sampling

Iter 400

Iter 100

100

160 280

15 20

220

Ground truth geometry

Fig. 9. The benefit of simultaneously optimizing all positions. We replicate an experiment of Zhang et al

[2023a] with our method to reconstruct a chair

from a single reference image. The baseline employs preconditioned gradient descent [Nicolet et al

2021] to locally deform a triangle mesh. In each iteration,

the pixels rendering the legs propagate gradients to the scene, causing their progressive extrusion in a thin region of overlap between tentative object and the

chair in the reference image. Derivatives outside of the region of overlap are discarded. Our method observes and uses all gradients starting from the very first

iteration, enabling faster convergence.

Comparison with volume reconstructions. Figure 12 presents re-

sults and equal-iteration timings comparing our method to volume-

based inversion. For the latter, we set the maximum path depth of

the underlying volumetric path tracer to 3 interactions, as larger

values signicantly degrade performance without improving visual

delity. All experiments use the same number of samples per pixel,

and each optimization iteration uses all 12 optimization views. The

anisotropic volume model uses the SGGX phase function [Heitz et al

2015]. The volume models are initially as fast as ours but slow down

signicantly as the volume thickens, which is a consequence of

iterative steps needed to resolve coupling between dierent parts of

the exponential volume. Besides reducing speed, this coupling leads

to degraded reconstruction quality at equal iteration count. The

many-worlds approach is algorithmically simpler, produces a better

result in less time, and directly outputs a mesh with materials that

are ready to be relit, without the need for additional optimization

to extract a surface BRDF from phase functions.

Timings. Table 1 lists the average computation time per gradient

step and view for several scenes. These timings were measured on an

AMD Ryzen 3970X Linux workstation with an NVIDIA RTX 3090

graphics card, which our implementation uses to accelerate ray

tracing. We limited the maximum path depth for every scene to

a reasonable value. For certain scenes, we also disabled gradient

estimation for some path segments. For example, in the shape re-

constructions inside a glass object, the rst ray segment (from the

camera to the glass interface) and the last ray segment cannot inter-

sect the many-worlds representation.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

10 • Zhang et al.

Optimization states

Geometry

initialization

Ours

Fig. 10. Limitation on handling interior topological changes. Our method extends surface perturbations only to the exterior and does not accommodate

interior topological changes — such as transforming a sphere into a donut. We demonstrate this using a sphere initialization under two lighting conditions.

While shading derivatives can sometimes produce correct holes (top row), it could fail under alternative lighting conditions (boom row).

6 Conclusion and future work

Correct handling of discontinuous visibility during physically based

inversion of geometry has presented a formidable challenge in re-

cent years. Instead of proposing yet another method to solve this

challenging problem, many-worlds inverse rendering shows that

there are completely dierent ways to approach it. Using a notion

of non-local perturbations of a surface, our method successfully

synthesizes complex geometries from an initially empty scene.

Our work lays the theoretical foundation and validates this theory

with a rst implementation. However, the details of this implemen-

tation are still far from optimal and could benet from various

enhancements. For example, we derive the local orientation from a

eld that also governs occupancy, which is straightforward but also

introduces a dicult-to-optimize nonlinear coupling. Extending the

model with a distribution of orientations would further allow it to

consider multiple conicting explanations at every point.

Reconstructing an object involves a balance between exploration

to consider alternative explanations in

𝑆

and exploitation to rene

parameters of the current explanation

𝑆

. Our method aggressively

pursues the exploration phase using a uniform sampling strategy

but lacks a mechanism to eectively exploit its knowledge. We nd

that it often reconstructs a good approximation of a complex shape

in as little as 20 iterations, impossibly fast compared to existing

methods, but then requires 500 iterations for the seemingly trivial

task of smoothing out little kinks.

Our implementation of the many-worlds derivative is based on a

standard physically based path tracer, but previous works on dier-

entiable rendering have shown the benets of moving derivative

computation into a separate phase using a local formulation. This

phase starts the Monte Carlo sampling process where derivatives

locally emerge (e.g., at edges of a triangle mesh in the case of visi-

bility discontinuities). Integrating the local form of our model with

occupancy-based sampling could rene the background surface with

a more targeted optimization of its close neighborhood.

Optimization statesInitialization

Fig. 11. Subtractive changes. Many-worlds derivatives match surface

derivatives when sampled close to the surface. We initialize the scene with

dense geometry to demonstrate the robustness of our method against sub-

tractive changes.

Table 1. We measure the average time needed to optimize one 512

pixel

image for dierent scenes. The reported time covers all overheads including

the primal rendering pass, the uncorrelated derivative propagation pass,

the optimizer step and scene update.

Path

depth

spp grad spp time (s)

Heptoroid 2 2 128 32 0.41

Dragon 2 2 128 32 0.42

Deer 4 2 128 32 0.75

Polyhedra 4 2 64 32 1.10

Neptune 5 3 256 32 2.56

Fertility 5 3 256 32 2.06

Acknowledgments

This project has received funding from the European Research Coun-

cil (ERC) under the European Union’s Horizon 2020 research and

innovation program (grant agreement No 948846).

References

Sai Bangaru, Michael Gharbi, Tzu-Mao Li, Fujun Luan, Kalyan Sunkavalli, Milos Hasan,

Sai Bi, Zexiang Xu, Gilbert Bernstein, and Fredo Durand. 2022. Dierentiable

Rendering of Neural SDFs through Reparameterization. In ACM SIGGRAPH Asia 2022

Conference Proceedings (Daegu, Republic of Korea) (SIGGRAPH Asia ’22). Association

for Computing Machinery, New York, NY, USA, Article 22, 9 pages. doi:10.1145/

3550469.3555397

Sai Praveen Bangaru, Tzu-Mao Li, and Frédo Durand. 2020. Unbiased warped-area

sampling for dierentiable rendering. ACM Transactions on Graphics (TOG) 39, 6

(2020), 1–18.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 11

Optimization states

Ground truth

(1/12 views)

Iter 20 Iter 50 Iter 100 Iter 200

Ours

11s 29s 59s 1m59s

Isotropic

22s 1m24s 3m48s 10m35s

Anisotropic

2m10s 6m5s 16m41s32s

Fig. 12. Comparison with volume reconstructions. We compare re-

construction results using exponential volume inversion and the proposed

many-worlds approach. Isotropic volumes (top) cannot replicate the surface

appearance, while anisotropic micro-flakes (middle) perform somewhat

beer. However, simulating for interactions between worlds (e.g. to model

exponential aenuation) introduces substantial computational overheads

in both volume reconstruction methods. Extracting the final BSDF and

surface from the volume poses additional challenges. Our method (boom)

produces a high-fidelity surface reconstruction in a fraction of the time.

Benedikt Bitterli, Srinath Ravichandran, Thomas Müller, Magnus Wrenninge, Jan Novák,

Steve Marschner, and Wojciech Jarosz. 2018. A radiative transfer framework for

non-exponential media. ACM Trans. Graph. 37, 6, Article 225 (Dec. 2018), 17 pages.

doi:10.1145/3272127.3275103

Mark Boss, Raphael Braun, Varun Jampani, Jonathan T Barron, Ce Liu, and Hendrik

Lensch. 2021. Nerd: Neural reectance decomposition from image collections. In

Proceedings of the IEEE/CVF International Conference on Computer Vision. 12684–

12694.

Jonathan Dupuy, Eric Heitz, and Eugene d’Eon. 2016. Additional Progress Towards the

Unication of Microfacet and Microake Theories. In EGSR (EI&I). 55–63.

Michael Fischer and Tobias Ritschel. 2023. Plateau-Reduced Dierentiable Path Tracing.

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

4285–4294.

Eric Heitz, Jonathan Dupuy, Cyril Crassin, and Carsten Dachsbacher. 2015. The SGGX

microake distribution. ACM Transactions on Graphics (TOG) 34, 4 (2015), 1–11.

Eric Heitz, Johannes Hanika, Eugene d’Eon, and Carsten Dachsbacher. 2016. Multiple-

scattering microfacet BSDFs with the Smith model. ACM Transactions on Graphics

(TOG) 35, 4 (2016), 1–14.

Wenzel Jakob, Adam Arbree, Jonathan T. Moon, Kavita Bala, and Steve Marschner. 2010.

A radiative transfer framework for rendering materials with anisotropic structure.

ACM Trans. Graph. 29, 4, Article 53 (jul 2010), 13 pages. doi:10.1145/1778765.1778790

Wenzel Jakob, Sébastien Speierer, Nicolas Roussel, Merlin Nimier-David, Delio Vicini,

Tizian Zeltner, Baptiste Nicolet, Miguel Crespo, Vincent Leroy, and Ziyi Zhang. 2022.

Mitsuba 3 renderer. https://mitsuba-renderer.org.

Haian Jin, Isabella Liu, Peijia Xu, Xiaoshuai Zhang, Songfang Han, Sai Bi, Xiaowei Zhou,

Zexiang Xu, and Hao Su. 2023. Tensoir: Tensorial inverse rendering. In Proceedings

of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 165–174.

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 2023.

3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Transactions

on Graphics 42, 4 (July 2023).

Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization.

CoRR abs/1412.6980 (2014). https://api.semanticscholar.org/CorpusID:6628106

Samuli Laine, Janne Hellsten, Tero Karras, Yeongho Seol, Jaakko Lehtinen, and Timo

Aila. 2020. Modular Primitives for High-Performance Dierentiable Rendering.

ACM Transactions on Graphics 39, 6 (2020).

Tzu-Mao Li, Miika Aittala, Frédo Durand, and Jaakko Lehtinen. 2018. Dierentiable

monte carlo ray tracing through edge sampling. ACM Transactions on Graphics

(TOG) 37, 6 (2018), 1–11.

Zhaoshuo Li, Thomas Müller, Alex Evans, Russell H Taylor, Mathias Unberath, Ming-

Yu Liu, and Chen-Hsuan Lin. 2023. Neuralangelo: High-delity neural surface

reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and

Pattern Recognition. 8456–8465.

William E. Lorensen and Harvey E. Cline. 1987. Marching cubes: A high resolution

3D surface construction algorithm. SIGGRAPH Comput. Graph. 21, 4 (aug 1987),

163–169. doi:10.1145/37402.37422

Guillaume Loubet, Nicolas Holzschuch, and Wenzel Jakob. 2019. Reparameterizing

Discontinuous Integrands for Dierentiable Rendering. Transactions on Graphics

(Proceedings of SIGGRAPH Asia) 38, 6 (Dec. 2019). doi:10.1145/3355089.3356510

Guillaume Loubet and Fabrice Neyret. 2018. A new microake model with microscopic

self-shadowing for accurate volume downsampling. Computer Graphics Forum 37, 2

(2018), 111–121.

Ishit Mehta, Manmohan Chandraker, and Ravi Ramamoorthi. 2023. A Theory of

Topological Derivatives for Inverse Rendering of Geometry. In Proceedings of the

IEEE/CVF International Conference on Computer Vision. 419–429.

Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas

Geiger. 2019. Occupancy networks: Learning 3d reconstruction in function space.

In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.

4460–4470.

Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ra-

mamoorthi, and Ren Ng. 2021. NeRF: representing scenes as neural radiance elds

for view synthesis. Commun. ACM 65, 1 (dec 2021). doi:10.1145/3503250

Bailey Miller, Hanyu Chen, Alice Lai, and Ioannis Gkioulekas. 2024. Objects as Volumes:

A Stochastic Geometry View of Opaque Solids. In Proceedings of the IEEE/CVF

Conference on Computer Vision and Pattern Recognition (CVPR). 87–97.

Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. 2022. Instant

neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph.

41, 4, Article 102 (July 2022), 15 pages. doi:10.1145/3528223.3530127

Baptiste Nicolet, Alec Jacobson, and Wenzel Jakob. 2021. Large Steps in Inverse Ren-

dering of Geometry. ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia)

40, 6 (Dec. 2021).

Michael Niemeyer, Lars Mescheder, Michael Oechsle, and Andreas Geiger. 2020. Dif-

ferentiable volumetric rendering: Learning implicit 3d representations without 3d

supervision. In Proceedings of the IEEE/CVF conference on computer vision and pattern

recognition. 3504–3515.

Merlin Nimier-David, Thomas Müller, Alexander Keller, and Wenzel Jakob. 2022. Unbi-

ased Inverse Volume Rendering with Dierential Trackers. ACM Trans. Graph. 41,

4, Article 44 (July 2022), 20 pages. doi:10.1145/3528223.3530073

Merlin Nimier-David, Sébastien Speierer, Benoît Ruiz, and Wenzel Jakob. 2020. Radiative

Backpropagation: An Adjoint Method for Lightning-Fast Dierentiable Rendering.

Transactions on Graphics (Proceedings of SIGGRAPH) 39, 4 (July 2020). doi:10.1145/

3386569.3392406

Silvia Sellán and Alec Jacobson. 2022. Stochastic Poisson surface reconstruction. ACM

Transactions on Graphics (TOG) 41, 6 (2022), 1–12.

Silvia Sellán and Alec Jacobson. 2023. Neural Stochastic Screened Poisson Reconstruc-

tion. ACM Transactions on Graphics (Proc. SIGGRAPH Asia) (2023).

Dario Seyb, Eugene d’Eon, Benedikt Bitterli, and Wojciech Jarosz. 2024. From micro-

facets to participating media: A unied theory of light transport with stochastic

geometry. ACM Transactions on Graphics (TOG) 43, 4 (2024), 1–17.

J Kenneth Shultis and RB Myneni. 1988. Radiative transfer in vegetation canopies with

anisotropic scattering. Journal of Quantitative Spectroscopy and Radiative Transfer

39, 2 (1988), 115–129.

Jos Stam and Ryan Schmidt. 2011. On the velocity of an implicit surface. ACM

Transactions on Graphics (TOG) 30, 3 (2011), 1–7.

Eric Veach. 1997. Robust Monte Carlo Methods for Light Transport Simulation. Ph. D.

Dissertation. Stanford University, Stanford, CA.

Delio Vicini, Wenzel Jakob, and Anton Kaplanyan. 2021a. A non-exponential transmit-

tance model for volumetric scene representations. ACM Transactions on Graphics

(TOG) 40, 4 (2021), 1–16.

Delio Vicini, Sébastien Speierer, and Wenzel Jakob. 2021b. Path Replay Backpropagation:

Dierentiating Light Paths using Constant Memory and Linear Time. Transactions

on Graphics (Proceedings of SIGGRAPH) 40, 4 (Aug. 2021), 108:1–108:14. doi:10.1145/

3450626.3459804

Delio Vicini, Sébastien Speierer, and Wenzel Jakob. 2022. Dierentiable signed distance

function rendering. ACM Transactions on Graphics (TOG) 41, 4 (2022), 1–18.

Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping

Wang. 2021. NeuS: Learning Neural Implicit Surfaces by Volume Rendering for

Multi-view Reconstruction. In Advances in Neural Information Processing Systems,

M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (Eds.),

Vol. 34. Curran Associates, Inc., 27171–27183. https://proceedings.neurips.cc/paper_

les/paper/2021/le/e41e164f7485ec4a28741a2d0ea41c74-Paper.pdf

Zichen Wang, Xi Deng, Ziyi Zhang, Wenzel Jakob, and Steve Marschner. 2024. A Simple

Approach to Dierentiable Rendering of SDFs. In SIGGRAPH Asia 2024 Conference

Papers (SA ’24). Association for Computing Machinery, New York, NY, USA, Article

119, 11 pages. doi:10.1145/3680528.3687573

Oliver Williams and Andrew Fitzgibbon. 2006. Gaussian process implicit surfaces. In

Gaussian Processes in Practice.

Peiyu Xu, Sai Bangaru, Tzu-Mao Li, and Shuang Zhao. 2023. Warped-area reparame-

terization of dierential path integrals. ACM Transactions on Graphics (TOG) 42, 6

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

12 • Zhang et al.

(2023), 1–18.

Peiyu Xu, Sai Bangaru, Tzu-Mao Li, and Shuang Zhao. 2024. Markov-Chain Monte

Carlo Sampling of Visibility Boundaries for Dierentiable Rendering. In SIGGRAPH

Asia 2024 Conference Papers (Tokyo, Japan) (SA ’24). Association for Computing

Machinery, New York, NY, USA, Article 118, 11 pages. doi:10.1145/3680528.3687622

Kai Yan, Christoph Lassner, Brian Budge, Zhao Dong, and Shuang Zhao. 2022. Ecient

estimation of boundary integrals for path-space dierentiable rendering. ACM

Transactions on Graphics (TOG) 41, 4 (2022), 1–13.

Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. 2021. Volume rendering of

neural implicit surfaces. In Thirty-Fifth Conference on Neural Information Processing

Systems.

Tizian Zeltner, Sébastien Speierer, Iliyan Georgiev, and Wenzel Jakob. 2021. Monte Carlo

Estimators for Dierential Light Transport. Transactions on Graphics (Proceedings

of SIGGRAPH) 40, 4 (Aug. 2021). doi:10.1145/3450626.3459807

Cheng Zhang, Bailey Miller, Kai Yan, Ioannis Gkioulekas, and Shuang Zhao. 2020. Path-

space dierentiable rendering. ACM Trans. Graph. 39, 4, Article 143 (Aug. 2020),

19 pages. doi:10.1145/3386569.3392383

Kai Zhang, Fujun Luan, Qianqian Wang, Kavita Bala, and Noah Snavely. 2021a. PhySG:

Inverse Rendering with Spherical Gaussians for Physics-based Material Editing and

Relighting. In The IEEE/CVF Conference on Computer Vision and Pattern Recognition

(CVPR).

Xiuming Zhang, Pratul P. Srinivasan, Boyang Deng, Paul Debevec, William T. Free-

man, and Jonathan T. Barron. 2021b. NeRFactor: neural factorization of shape and

reectance under an unknown illumination. ACM Trans. Graph. 40, 6, Article 237

(Dec. 2021), 18 pages. doi:10.1145/3478513.3480496

Youjia Zhang, Teng Xu, Junqing Yu, Yuteng Ye, Junle Wang, Yanqing Jing, Jingyi Yu,

and Wei Yang. 2023b. NeMF: Inverse Volume Rendering with Neural Microake

Field. 2023 IEEE/CVF International Conference on Computer Vision (ICCV) (2023).

Ziyi Zhang, Nicolas Roussel, and Wenzel Jakob. 2023a. Projective Sampling for Dieren-

tiable Rendering of Geometry. Transactions on Graphics (Proce edings of SIGGRAPH

Asia) 42, 6 (Dec. 2023). doi:10.1145/3618385

Ziyi Zhang, Nicolas Roussel, Thomas Müller, Tizian Zeltner, Merlin Nimier-David,

Fabrice Rousselle, and Wenzel Jakob. 2025. Radiance Surfaces: Optimizing Surface

Representations with a 5D Radiance Field Loss. Proceedings of the ACM SIGGRAPH

Conference Papers (SIGGRAPH Conference Papers ’25) (2025), 10 pages. doi:10.1145/

3721238.3730713

A Extended surface derivatives

Our method extends surface optimization by introducing non-local

perturbations to a surface

𝑆

. Along each light path, the method

tests how potential surfaces could exist at sampled positions as

perturbations to

𝑆

. By optimizing these hypothetical surfaces, we

eectively explore an extended parameter space 𝑆 to rene

𝑆.

Previous work [Zhang et al

2020, Zhang et al

2023a] established

the local formulation of surface derivatives, which measures how

an innitesimal surface change aects radiative transport in the

entire scene. We extend their formulation to measure how innites-

imal changes in any hypothetical surface patch within

𝑆

inuence

radiative transport. This provides a quantitative way to rederive

our method. In this section, we show that the resulting derivatives

match the many-worlds derivative transport.

This analysis leads to two critical results:

•

Many-worlds derivatives reduce to conventional surface deriva-

tives (shading and boundary) when evaluated on

𝑆.

•

We unify shading and boundary derivatives into a single term

in the extended domain, enabling an algorithmically simpler

implementation.

A.1 Conventional surface derivatives

In physically based rendering, surface dierentiation involves two

types of derivatives: the shading derivative and the boundary deriv-

ative. The shading derivative applies to the entire visible surface for

a given viewpoint, primarily aecting appearance through the shad-

ing normal and local material properties. In contrast, the boundary

derivative (corresponding to “occlusion” in Figure 1) is dened only

Fig. 13. A light path with

𝑛

segments connecting the camera to the emier.

along the visibility silhouette curve as seen from a viewpoint. It

adjusts the object’s visibility contour and is the main driver of shape

optimization.

Both derivatives can be expressed as integrals over path space,

a high-dimensional domain encompassing all possible light paths.

To simplify our derivation, we write them in the three-point form

([Veach 1997, Section 8.1]) that isolates the derivative contribu-

tion from a single ray segment. While the integral remains high-

dimensional, this segment-specic form encapsulates most of the

dimensions within the incident radiance term

𝐿

𝑖

and the incident

importance term 𝑊

𝑖

, making the formulation more tractable.

A.1.1 Shading derivative. Consider the radiance contribution of a

𝑛

-segment light path as illustrated in Figure 13. The measurement

contribution function is given by:

𝑓 (

x) = 𝑊

𝑒

, x

)𝐺 (x

, x

) (9)



𝑛−1

𝑖=1

𝑓

𝑠

𝑖−1

, x

𝑖

, x

𝑖+1

)𝐺 (x

𝑖

, x

𝑖+1

)



𝐿

𝑒

𝑛

, x

𝑛−1

where

𝑓

𝑠

is the BSDF function,

𝐺

is the standard geometric term

including visibility

, and

𝐿

𝑒

and

𝑊

𝑒

refer to the emitted radiance

and importance. This equation contributes to a measurement—for

instance an image pixel color

𝐼

—following the path space integral

[Veach 1997]:

𝐼 =

∫

𝑓 (

x) d𝜇,

where d𝜇 =

𝑛

𝑖=0

d𝐴(x

𝑖

) is the area product element.

Without loss of generality, we assume that the emitted radiance

𝐿

𝑒

and the camera model do not depend on the scene parameter

𝜃

(i.e., detached). Dierentiation of the contribution function with

respect to 𝜃 yields:

𝜕𝑓 (

𝜕𝜃

𝑛−1



𝑘=1

𝑊

𝑖

k−1

, x

)𝐺 (x

k−1

, x

) (10)

𝜕

𝜃

𝑓

𝑠

k−1

, x

k+1

)𝐺 (x

, x

k+1

)

𝐿

𝑖

𝑘+1

, x

𝑘

where functions

𝐿

𝑖

and

𝑊

𝑖

denote incident radiance and importance:

𝐿

𝑖

𝑘+1

, x

𝑘

) = 𝐿

𝑒

𝑛

, x

𝑛−1

)Π

𝑛−1

𝑖=𝑘+1

[𝐺(x

𝑖

, x

𝑖+1

)𝑓

𝑠

𝑖−1

, x

𝑖

, x

𝑖+1

)],

𝑊

𝑖

𝑘 −1

, x

𝑘

) = 𝑊

𝑒

, x

)Π

𝑘 −1

𝑖=1

[𝐺(x

𝑖−1

, x

𝑖

)𝑓

𝑠

𝑖−1

, x

𝑖

, x

𝑖+1

)].

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 13

Fig. 14. Surface boundary derivative. The boundary derivative contains

a motion term that tracks how fast the boundary segment

(

)

moves

along the normal direction n

, which is orthogonal to the viewing direction

𝝎 and the silhouee curve direction t

This gives the three-point form shading derivative of 𝐼:



𝜕𝐼

𝜕𝜃



𝑠

∫

𝜕

𝜃

𝑓 (

x) d𝜇

∫

M×M

𝑊

𝑖

𝑘 −1

, x

𝑘

) 𝐺 (x

𝑘 −1

, x

𝑘

)

∫

𝜕

𝜃

𝐺 (x

, x

k+1

)𝑓

𝑠

k−1

, x

k+1

)

𝐿

𝑖

k+1

, x

) d𝐴(x

k+1

)

| {z }

𝜕

𝜃

𝐿

𝑜

, x

k−1

)

d𝐴(x

) d𝐴(x

k−1

). (11)

The term in the curly bracket is abbreviated as

𝜕

𝜃

𝐿

𝑜

(

k−1

)

, which

captures the derivative stemming from only the current interaction

at x

. The derivatives from later interactions along the path are

incorporated in their respective three-point form integrals.

A.1.2 Boundary derivative. We start from Equation (43) in Zhang

et al

’s work [2020] to derive the segment-specic boundary deriv-

ative integral. The derivation is similar to the one shown in the

supplementary material of [Zhang et al

2023a] with a dierent

objective—we aim to express the integral such that it is local to

the boundary segment, rather than to the silhouette point. One can

alternatively derive the same result starting from Equation (4) in

[Zhang et al. 2023a].

The original boundary derivative integral, without reparameteri-

zation, states that (see Figure 14 for illustration):



𝜕𝐼

𝜕𝜃



𝑏

∫

B (x

)

𝐿

𝑑

, x

) 𝐺 (x

, x

) 𝑊

𝑖

, x

) (12)

(𝜕

𝜃

· n

) d𝑙 (x

) d𝐴(x

where

(

)

form a boundary segment that makes contact with

the boundary point x

, and the domain

)

is the boundary curve

on the x

side surface as seen from x

. The function

𝐿

𝑑

denes

radiance dierence between the foreground and the background as

𝐿

𝑑

(

) = 𝐿

𝑜

(

, 𝝎) − 𝐿

𝑖

(

, −𝝎)

. The inner product measures

the motion of x

along its in-surface normal direction n

This integral is already specic to a ray segment, but the motion

term

𝜕

𝜃

is a derived quantity from

𝜕

𝜃

. They are related

as follows [Zhang et al. 2023a, Supplementary Eq. (9)]:

𝜕

𝜃

· n

𝜕

𝜃

· n

𝑙

∥n

× N

∥

Fig. 15. Shading derivatives in many-worlds transport. Along a ray

segment

(

)

, we consider potential interactions with a hypothetical

surface at x

as a perturbation to the background surface. The shading

derivatives from all such interactions along the ray segment are accumu-

lated.

We also change the length measure d

𝑙 (

)

to d

𝑙 (

)

[Zhang et al

2023a, Supplementary Eq. (3)]:

d𝑙 (x

)

d𝑙 (x

)

𝑙

∥𝝎 × t

∥

𝑙

∥𝝎 × t

∥

Futhermore, we use the following identity ([Zhang et al

2023a,

Supplementary Eq. (10)]):

∥𝝎 × t

∥ ∥n

× N

∥ = |𝝎 · N

Combining these equations, we obtain:



𝜕𝐼

𝜕𝜃



𝑏

∫

C (x

)

𝐿

𝑑

, x

)

|𝝎 · N

| ∥𝝎 × t

∥

𝑙

𝑊

𝑖

, x

) (13)

(𝜕

𝜃

· n

) d𝑙 (x

) d𝐴(x

where the boundary domain

)

is the visibility silhouette curve

on the occluder as seen from x

. Rewriting this result by converting

the length measure from scene space to hemisphere space, we get:



𝜕𝐼

𝜕𝜃



𝑏

∫

C (x

)

𝐿

𝑑

, −𝝎) 𝑊

𝑖

, 𝝎) |𝝎 · N

| (14)

(𝜕

𝜃

𝝎 · n

) d𝑙 (𝝎) d𝐴(x

Intuitively, the motion term

𝜕

𝜃

𝝎 ·

measures how rapidly the

occluder moves in the direction perpendicular to the viewing ray.

This motion is weighted by the radiance dierence between the

foreground and background, and the derivative arising from this

motion is transported to the sensor akin to regular radiance [Nimier-

David et al. 2020, Section 3.1].

A.2 Many-worlds derivatives

We now extend surface derivatives into the extended domain to also

quantify how potential surface patches along a ray segment aect

the radiative transport.

A.2.1 Shading derivative. We write the shading derivative as an in-

tegral over potential surface patches along the ray segment

(

)

weighted by the probability of each patch’s existence:

∫

𝑙

𝛼 (𝑡)



𝜕𝐼

𝜕𝜃



𝑠

d𝑡

(11)

∫

𝑙

∫

M×M

𝑡

(15)

𝛼 (𝑡)

𝑊

𝑖

, x

) 𝐺 (x

, x

) 𝜕

𝜃

𝐿

𝑜

, x

)

d𝐴(x

) d𝐴(x

) d𝑡,

where x

is on a hypothetical surface

𝑡

(Figure 15). Note that

the ray direction may not be perpendicular to the surface normal

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

14 • Zhang et al.

Fig. 16. Extended form of the surface boundary derivative. For con-

ventional boundary derivatives, the silhouee direction t

is only defined

on the visibility silhouee. In the extended case, we define it for any sur-

face, allowing silhouee occlusion to be interpreted as probabilistic surface

existence.

d𝐴(x

) d𝑡 = d𝑉 (x

)/|𝝎 · N

|, allowing us to write the integral as:

∫

𝛼 (x

) 𝑊

𝑖

, x

) (16)

|𝝎 · N

|V (x

↔ x

)

𝑙

𝜕

𝜃

𝐿

𝑜

, x

) d𝐴(x

) d𝑉 (x

By chaging from an area measure to a solid angle measure, we

can cancel the remaining geometric term and absorb the visibility

function V to obtain:

∫

𝑆

𝛼 (x

) 𝑊

𝑜

, −𝝎) 𝜕

𝜃

𝐿

𝑜

, −𝝎) d𝝎 d𝑉 (x

). (17)

This derivative has a simple form from which we can identify the

following properties:

• It is local to the position x

where the interaction happens.

•

The shading derivative originating from this interaction is

weighted by the probability of the surface’s existence and

is uniformly radiated in all directions to be received by the

sensor.

A.2.2 Boundary derivative. A challenge in analyzing the boundary

derivative for hypothetical surfaces is that existing theory from

prior work only analyzes visibility silhouettes on occluders. This

becomes problematic as a potential surface patch is not always on

such a silhouette: its surface normal N

may not be orthogonal to the

viewing direction

𝝎

. Such a patch still inuences radiative transport

as it occludes radiance traveling from the background. However, this

type of occlusion does not occur in local surface evolution, where

occlusion is restricted to the visibility silhouette.

We therefore introduce an extended form of the boundary deriva-

tive whose value matches the conventional boundary derivative on

a visibility silhouette curve, but is also well-dened for any surface.

Does such an extension make sense? Intuitively, the boundary

derivative examines the color dierence between the foreground

and the background, and the motion term determines if more of

the foreground or the background should be visible. This concept

closely mirrors the derivative of occupancy, where adjusting the

probability of the surface existence decides the visibility balance

between the surface and the background. From this perspective,

the “boundary” derivative logically extends to hypothetical surfaces

with non-tangential ray intersections. Although “boundary” may

be a misnomer in these cases, we retain the term for consistency

with existing literature.

Extended form of surface boundary derivatives. For any surface

located at x

with normal N

, we dene the extended silhouette

direction t

as the in-surface direction along which the dot product

𝝎 ·

remains constant. The conventional silhouette direction is a

special case where 𝝎 · N

= 0.

Let n

be the viewing direction normal, which is orthogonal to

both the viewing direction

𝝎

and the extended silhouette direction

(Figure 16). In the extended case, it is important to distinguish

between the surface normal N

and the viewing direction normal

. This subtle dierence is not present in the conventional case, as

= n

on the visibility silhouette.

We denote the local motion at x

𝜃

) := 𝜕

𝜃

· N

(18)

since it quanties the rate at which occlusion changes at the inter-

action point.

Let m

be a unit vector in the surface such that

{

}

forms an orthonormal basis. the motion relates to ray direction

𝝎 as [Zhang et al. 2023a, Supplementary Eq. (9)]:

𝜕

𝜃

𝝎 · n

∥n

× m

∥

𝑙

𝜕

𝜃

· N

. (19)

The length measure d

𝑙 (𝝎)

is in the solid angle domain, and it

relates to the length measure d𝑙 (x

) on the surface as

d𝑙 (𝝎) =

∥𝝎 × t

∥

𝑙

d𝑙 (x

). (20)

Using these equations, the extended boundary derivative from

Equation 14 can be rewritten as:



𝜕𝐼

𝜕𝜃



𝑏

(14)

∫

C (x

)

𝐿

𝑑

, −𝝎) 𝑊

𝑖

, 𝝎) |𝝎 · N

(𝜕

𝜃

𝝎 · n

) d𝑙 (𝝎) d𝐴(x

)

(18,19)

∫

C (x

)

𝐿

𝑑

, −𝝎) 𝑊

𝑖

, 𝝎) |𝝎 · N

∥n

× m

∥

𝑙

𝜃

) d𝑙 (𝝎) d𝐴(x

)

(20)

∫

C (x

)

𝐿

𝑑

, −𝝎) 𝑊

𝑖

, 𝝎) |𝝎 · N

∥n

× m

∥ ∥𝝎 × t

∥

𝑙

𝜃

) d𝑙 (x

) d𝐴(x

). (21)

Boundary derivative in many-worlds transport. At each interac-

tion position x

, let d

𝑙 (

)

be the length measure along the ex-

tended silhouette direction t

and d

𝑛(

)

be the length measure

along the surface normal N

. Together, they dene an area element

𝐴(

) =

𝑙 (

)

𝑛(

)

which quanties how this patch occludes

the background.

Analogous to shading derivatives, we write the many-worlds

boundary derivative of

𝐼

as an integral over all possible surface

interactions along the ray segment (x

, x

∬



𝜕𝐼

𝜕𝜃



𝑏

d𝑛 d𝑡

(21)

∫

𝑙

∫

M×M

𝑡

𝐿

𝑑

, −𝝎) 𝑊

𝑖

, 𝝎) (22)

|𝝎 · N

∥n

× m

∥ ∥𝝎 × t

∥

𝑙

𝜃

) d𝐴(x

) d𝑡,

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 15

The basis {t

, N

, 𝝎} leads to a volume element

d𝑉 (x

) = |𝝎 · m

| d𝑙(x

) d𝑛(x

) d𝑡. (23)

Note that m

, n

and N

are co-planar since they are all orthogonal

to t

. We therefore obtain an identity [Zhang et al

2023a, Supple-

mentary Eq. (10)]:

∥𝝎 × t

∥ ∥n

× m

∥ = |𝝎 · m

|. (24)

Using these equations, we simplify the boundary derivative as:

(22)

(23)

∫

𝑊

𝑖

, 𝝎) 𝐿

𝑑

, −𝝎)

|𝝎 · N

𝑙

∥𝝎 × t

∥ ∥n

× m

∥

|𝝎 · m

𝜃

) d𝑉 (x

) d𝐴(x

)

(24)

∫

𝑊

𝑜

, −𝝎) 𝐿

𝑑

, −𝝎) v

𝜃

) d𝝎 d𝑉 (x

). (25)

We dene the local motion v

𝜃

) within its canonical space:

𝜃

) = 𝜕

𝜃

𝛼 (x

). (26)

This denition contrasts with prior work that computes the surface

boundary derivative using an implicit eld representation. In those

works, the motion is dened in the scene space, resulting in the

following Jacobian determinant [Stam and Schmidt 2011]:

𝜕

𝜃

p · N =

𝜕

𝜃

𝛼 (p)

∥∇𝛼(p) ∥

, (27)

that relates the normal velocity of a surface point p to the deriva-

tive of its implicit representation. This Jacobian determinant is not

required since our method optimize the surface patch directly by

modifying its occupancy value, rather than locally deforming it. The

scaling factor in Equation 27 can bias the many-worlds derivative

by erroneously considering neighboring occupancy values. This is

more apparent when considering the fact that the gradient norm

∥∇𝛼∥

could be innitely large near the mean surface or the oppo-

site, it could drop to zero in the case of an initialization where all

positions share the same occupancy value.

This yields the nal form of the boundary derivative in many-

worlds transport:

(25)

(26)

∫

𝑊

𝑜

, −𝝎) 𝐿

𝑑

, −𝝎) 𝜕

𝜃

𝛼 (x

) d𝝎 d𝑉 (x

(28)

A.2.3 Many-worlds derivative. Combining Equation 17 and Equa-

tion 28, we derive a derivative integral that captures the overall

impact of non-local surface perturbations:

𝜕𝐼

𝜕𝜃

∫

𝑊

𝑜

, −𝝎)

𝛼 (x

) 𝜕

𝜃

𝐿

𝑜

, −𝝎) +

𝜕

𝜃

𝛼 (x

) 𝐿

𝑑

, −𝝎)

d𝝎 d𝑉 (x

)

∫

𝑊

𝑜

, 𝝎) 𝜕

𝜃

𝐿

𝑑

, 𝝎) 𝛼 (x

)

d𝝎 d𝑉 (x

), (29)

where the background surface in the radiance dierence function

𝐿

𝑑

is detached. Equation 29 represents the nal form of the many-

worlds derivative, expressed in an alternative domain but equivalent

to Equation 6.

This derivation suggests that our approach is an extension of stan-

dard surface evolution methods. Conventional techniques evolve

surfaces by locally modifying shading properties or adjusting ge-

ometry to change occlusion relationships, with such perturbations

restricted to the surface itself. Instead, we generalize this concept

by analyzing hypothetical surface patches in space, leading to the

same well-dened shading and occlusion-based perturbations in an

extended domain.

A key challenge in this extension is that any position along a ray

can now contribute to surface evolution, rather than being restricted

to a single intersection point. To address this, we adopt the “many-

worlds” perspective: treating all possible interactions as mutually

exclusive events. This ensures that a desired change is uniformly

applied across all potential interactions.

B A random volume perspective

B.1 Transport in an exponential random volume

We begin by reviewing light transport in an exponential random

volume as described by the radiative transfer equation. This equation

gives the incident radiance

𝐿

along a ray

(

, 𝝎)

, accounting for

radiative gains (in-scattering, emission) and losses (out-scattering,

absorption) along the segment reaching up to the nearest ray-surface

intersection at 𝑠 = inf {𝑠

′

| x + 𝑠

′

𝝎 ∈ M} or 𝑠 = ∞ if none exists:

𝐿

(0) =

∫

𝑠

𝑇 (𝑡)

[

𝜇

(𝑡) 𝐿

(𝑡) + 𝜇

(𝑡) 𝐿

(𝑡)

]

d𝑡 + 𝑇 (𝑠) 𝐿

(𝑠). (30)

The rst term models contributions of the volume, while the second

accounts for a potential surface at

𝑡 = 𝑠

, whose outgoing radiance

𝐿

(𝑠) is attenuated by the medium.

The functions

𝜇

(𝑡)

and

𝜇

(𝑡)

specify the volume’s absorption and

scattering coecient, whose sum gives the extinction

𝜇

= 𝜇

+ 𝜇

The in-scattered radiance

𝐿

(𝑡) =

∫

𝐿

𝑖

(𝑡, 𝝎

′

) 𝑓

𝑝

(𝑡, 𝝎, 𝝎

′

) d𝝎

′

(31)

is the product integral of incident radiance and the

phase function 𝑓

𝑝

Finally, the transmittance

𝑇 (𝑡) = exp



−

∫

𝑡

𝜇

𝑡

(𝑡

′

) d𝑡

′



(32)

ties everything together: it establishes the connection to particle

distributions and ensures energy conservation by accounting for

self-shadowing (or more generally, self-extinction).

B.2 Many-worlds transport

Building on the discussion in Section 3.1, we aim to prevent nonsen-

sical scattering and shadowing between multiple potential surfaces

along a ray. This can be achieved by modeling the interaction with

a tenuous volume, whose density is diluted by a factor of

𝜀

, which

reduces the scattering and extinction coecients

𝜇

and

𝜇

. This

low amount of extinction ensures at most one interaction with the

volume along a ray segment, leading to the following transmittance

approximation:

𝑇 (𝑡) ≈ 1 −

∫

𝑡

𝜇

𝑡

(𝑡

′

) d𝑡

′

, (33)

which follows from

𝑒

−𝑥

≈

− 𝑥 + 𝑂 (𝑥

)

. Plugging this into the

RTE

(30)

and discarding a second-order term (

𝜇

∼ 𝜀

) yields the

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

16 • Zhang et al.

approximation 𝐿

(0) ≈ 𝐿

𝜀

(0), where the latter is dened as

𝐿

𝜀

(0)

∫

𝑠

[

𝜇

(𝑡) 𝐿

(𝑡) + 𝜇

(𝑡) 𝐿

(𝑡) − 𝜇

(𝑡) 𝐿

(𝑠)

]

d𝑡 + 𝐿

𝑜

(𝑠), (34)

In other words, a tenuous volume is governed by a linearized RTE,

where higher-order terms vanish. The superscript

𝜀

refers to quan-

tities with a dilution factor 𝜀.

To complete this model, we must instill meaning into the terms

𝜇

, and

𝑓

𝑝

. Prior work often did so using volumetric analogs of

microfacet surface models known as microakes [Jakob et al

2010].

Applications include optimization, volumetric level of detail, energy-

conserving random walks, and neural elds [Heitz et al

2015, 2016;

Vicini et al. 2021a; Loubet and Neyret 2018; Zhang et al. 2023b].

It is worth noting that the theory of microakes actually builds

on a more general notion of anisotropic radiative transport [Shultis

and Myneni 1988; Jakob et al

2010], which adopts directionally

varying extinction and scattering coecients arising from a random

distribution of oriented particles:

𝜇

(x, 𝝎) = 𝜌(x)

∫

𝑆

𝜎 (𝝎, 𝝎

′

) 𝐷 (x, 𝝎

′

) d𝝎

′

(35)

𝜇

(x, 𝝎) = 𝜌(x)

∫

𝑆

𝑎(x, 𝝎, 𝝎

′

) 𝜎 (𝝎, 𝝎

′

) 𝐷 (x, 𝝎

′

) d𝝎

′

, (36)

where 𝜌 (x) is the particles’ number density at x, 𝐷 (𝝎) models the

density of their directional orientations,

𝜎 (𝝎, 𝝎

′

)

gives the cross-

sectional area of a single particle with orientation

𝝎

′

observed from

direction

𝝎

, and

𝑎(

, 𝝎, 𝝎

′

)

models the particles’ scattering albedo

(

∈ [

]

). Microake theory can then be derived from these expres-

sions by setting

𝜎 (𝝎, 𝝎

′

) = 𝜎 |𝝎 · 𝝎

′

to model the projected area

of a facet with surface area

𝜎

, and by constructing a phase function

𝑓

𝑝

based on the principle of specular reection from such a ake.

However, we do not model a directional distribution

𝐷 (𝝎)

in this

work. Instead, we adopt a far simpler particle model that associates

a single particle orientation 𝜷 (x) with every point x (Figure 3).

Since there is only one orientation at x, the distribution

𝐷

col-

lapses to a Dirac delta function:

𝐷 (

, 𝝎) = 𝛿 (𝝎 − 𝜷 (

))

, which in

turn reduces the extinction to a simple product of number density

and cross-sectional area

𝜇

(x, 𝝎) = 𝜌(x) 𝜎 (𝝎, 𝜷 (x)). (37)

Here, we do not specically model the split into a separate number

density and cross-section. Instead, we dene an extinction function

that directly evaluates their product:

𝜇

(x, 𝝎) =

(

𝜀 𝑜 (x), if 𝝎 · 𝜷 (x) > 0,

0, otherwise,

(38)

where

𝑜 (

)

is the occupancy, i.e., the point-wise discrete probability

that x is inside

𝑆

(Equation 7). The branch condition encodes an

important nonreciprocal

behavior: light interacting with a back-

facing surface presents a nonsensical case, and this term masks

those parts of the volume.

The extinction

𝜇

models the stopping power of the volume. For

example, the akes of a microake volume obstruct light to a lesser

degree when it propagates nearly parallel to them, and this manifests

This behavior is non-reciprocal because when the medium interacts along

𝝎

(i.e., if

𝜇

𝑡

(x, 𝝎 ) > 0), then the opposite direction is non-interacting (i.e., 𝜇

𝑡

(x, −𝝎) = 0).

via a foreshortening term

|𝝎 · 𝝎

′

in the denition of

𝜇

. This is

important to obtain a sensible (energy-conserving, reciprocal) model

because these particles mutually interact. On the other hand, when

focusing on a single world in a many-worlds representation, light

stops with probability 1 when it encounters a surface regardless of

how it is oriented (except for mentioned back-facing case). This is

why our model does not include a similar foreshortening term. The

term

𝛼 (

)

models the discrete probability of the world within the

ensemble.

We dene the phase function 𝑓

𝑝

of the volume as

𝑓

𝑝

(x, 𝝎, 𝝎

′

) =

𝑓

𝑠

(x, 𝝎, 𝝎

′

) |𝜷 (x) · 𝝎

′

𝑎(x, 𝝎, 𝜷 (x))

which wraps

𝑓

𝑠

, the bidirectional scattering distribution function

(BSDF) of the many-worlds surface at the point x. This denition

follows the standard convention that the phase function integrates

to 1, with absorption handled by other terms. The albedo

𝑎

provides

the necessary normalization constant:

𝑎(x, 𝝎, 𝜷 (x)) =

∫

𝑆

𝑓

𝑠

(x, 𝝎, 𝝎

′

) |𝜷 (x) · 𝝎

′

| d𝝎

′

. (39)

With these denitions,

𝜇

reduces to the

product of extinction and 𝑎:

𝜇

(x, 𝝎) = 𝑎(x, 𝝎, 𝜷 (x)) 𝜇

(x, 𝝎). (40)

The product of this scattering coecient and in-scattered radiance

𝐿

(Equation 31) can be seen to compute an extinction-weighted

integral of the many-worlds BSDF over projected solid angles:

𝜇

(x, 𝝎)𝐿

(x, 𝝎) = 𝜇

(x, 𝝎)

∫

𝑆

𝐿

(x, 𝝎

′

) 𝑓

𝑠

(x, 𝝎, 𝝎

′

) |𝛽 (x) · 𝝎

′

| d𝝎

′

We further use the sum of the above expression with the absorption-

weighted emission to dene an extended outgoing radiance function

𝐿

for points x that lie within the many-worlds representation:

𝜇

(x, 𝝎)𝐿

(x, 𝝎) + 𝜇

(x, 𝝎)𝐿

(x, 𝝎) =

𝜇

(x, 𝝎)𝐿

(x, 𝝎). (41)

Substituting all of these expressions into Equation 34, we obtain

𝐿

𝜀

(0) = 𝜀

∫

𝑠

𝛼 (𝑡)

[

𝐿

(𝑡) − 𝐿

(𝑠)

]

d𝑡 + 𝐿

𝑜

(𝑠), (42)

where

𝛼

gives the occupancy at

𝑡

and is zero in the back-facing

(𝜷 (𝑡 ) · 𝝎 < 0) case.

This equation quanties how the many-worlds representation

modies light transport along individual path segments. For multi-

segment paths, we do not model multiple interactions, which would

introduce second-order derivative terms beyond our scope.

While Equation 42 captures innitesimal volumetric eects, our

many-worlds framework treats each perturbation as a standalone

alternative world (Section 3.1). Removing the

𝜀

scaling reveals the

unit perturbation eect:

𝐿

(0) =

∫

𝑠

𝛼 (𝑡)

[

𝐿

(𝑡) − 𝐿

(𝑠)

]

d𝑡 + 𝐿

𝑜

(𝑠). (43)

Under dierentiation this gives the many-worlds derivative trans-

port in Equation 6.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.

Many-Worlds Inverse Rendering • 17

C Technical Details

C.1 Stochastic surface model

Our method evolves a surface

𝑆

by propagating gradients to a sto-

chastic surface model

𝑆

. We outline one possible parameterization of

a stochastic surface model here [Williams and Fitzgibbon 2006; Sel-

lán and Jacobson 2022; Sellán and Jacobson 2023; Miller et al

2024;

Seyb et al

2024], noting that this is not our contribution and other

formulations can be used. Our work focuses on the theoretical foun-

dation for optimizing a distribution of surfaces without relying on

exponential volumes (i.e., ray-surface interactions are mutually ex-

clusive events, rather than statistically independent events), which

is largely independent of the specic model used.

An implicit representation

Φ(

)

of a shape determines whether

a position x is inside (

Φ(

) <

)

, outside (

Φ(

) >

)

, or on the

boundary (

Φ(

) =

)

of a solid. This geometric classication is

independent of optical properties—for example, points x within a

refractive material also count as inside (Φ(x) < 0).

In our case, the scene models a distribution of surfaces, which

turns

Φ(

)

into a random variable. The occupancy

𝛼 (

)

then gives

the probability of x being on or inside an object:

𝛼 (x) = Pr{Φ(x) ≤ 0}. (44)

We model

Φ(𝑥)

as a Gaussian process (GP). Pointwise evaluations

of the implicit function are normally distributed:

Φ(x) ∼ N (𝜇(x), 𝜎

(x)), (45)

which yields an explicit form of the occupancy:

𝛼 (x) =

1 − erf

𝜇(x)



2𝜎

(x)

. (46)

We assume a constant variance 𝜎

(x) = 𝜎

for all x in this work.

We also use this representation to assign an orientation to every

point based on the expected gradient of the implicit function, i.e.,

𝜷 (x) =

𝐸

[

∇Φ(x)

]

∥

𝐸

[

∇Φ(x)

]∥

∇𝜇 (x)

∥

∇𝜇 (x)

∥

Modeling orientation has a crucial impact on the robustness and

speed of optimizations (Figure 4).

A GP also has the property that evaluations

Φ(

), Φ(

), . . .

fol-

low a joint multivariate normal distribution. Their auto-correlation

is often described using a kernel that depends on distance, e.g.:

corr(Φ(x), Φ(y)) = exp



−𝛾 ∥x − y∥



(47)

where

corr()

refers to Pearson’s correlation coecient. This ensures

that nearby points become increasingly correlated, making sudden

jumps in realizations of Φ(x) unlikely.

Autocorrelation would be a crucial property if our method in-

volved steps such as sampling concrete surface realizations from

𝑆

, or if we consider interaction with potential surfaces at multiple

dierent locations, requiring careful modeling of the correlation

between them. However, our method does not depend on such

steps. Its sole interaction with

𝑆

is through pointwise evaluations

of probabilities, and the extraction of the background surface

𝑆:

𝑆 = {x ∈ R

| 𝜇(x) = 0} = {x ∈ R

| 𝛼 (x) =

/2}. (48)

Spatial correlation is therefore not a detail that must be modeled in

our implementation of the algorithm. Other variants of the many-

worlds algorithm, for instance those that involve a more general

directional distribution, require more parameterization of the GP.

C.2 Implementation details

Occupancy and background surface. We query

𝜇(

)

from a grid-

based texture using cubic interpolants in our implementation. Al-

ternatively, a neural representation [Müller et al

2022] could be

employed for greater exibility and resolution.

To simulate interactions with the detached background surface

𝑆

, we use Marching Cubes [Lorensen and Cline 1987] to extract a

triangle mesh from

𝜇(

)

, enabling the use of ecient ray-triangle in-

tersection routines. Any other isosurface extraction algorithm could

in principle be used, since the extraction step and the extracted sur-

face do not need to be dierentiable. Our pipeline does not involve

sphere tracing, delta tracking or other iterative algorithms.

Alternative pamameterization. The derivation in Section 3 dis-

cusses the propagation of derivatives to some parameterization of

the extended parameter space. Our choice of parameterization is

simple but unnecessarily restrictive.

For instance, it should be possible to construct versions of this

framework to model

𝜷

as a normal distribution rather than a single

normal direction. This could accelerate convergence by allowing

the optimizer to simultaneously explore a wider range of surface

orientations. Upon surface extraction, the normal distribution could

be interpreted as micro-scale surface roughness, which we leave for

future work.

Sampling strategy. A detail that must be addressed is how Equa-

tion 5 should be sampled in a Monte Carlo renderer. Since we don’t

know the value of the radiance, a natural choice is to sample other

terms proportional to their known contribution—in this case, draw-

ing positions proportional to

𝛼 (𝑡)

. While this is a sensible choice for

primal rendering, it leads to a chicken-and-egg problem during op-

timization. Regions with

𝛼 (

) ≈

0 are essentially never sampled, but

we must clearly visit them sometimes to even consider the possibility

of placing a surface there.

Recall the many-worlds derivative transport in Equation 6:

𝜕

𝜋

𝐿

(0) =

∫

𝑠



𝜕

𝜋

𝛼 (𝑡) [𝐿

(𝑡) − 𝐿

(𝑠)]

(i)

+ 𝛼 (𝑡 ) 𝜕

𝜋

𝐿

(𝑡)

(ii)



d𝑡.

Term (i) of this expression states that the derivative of occupancy

along the ray is proportional to

𝐿

(𝑡) − 𝐿

(𝑠)

, which is an unknown

quantity that must be estimated. Given that there are no remaining

known factors that could be used to sample

𝑡

, the best strategy for

this derivative is to uniformly pick points along the ray.

This isn’t a new discovery: Nimier-David et al

[2022] discuss the

same issue in inverse volume rendering and address it by introducing

generalizations of volume trackers that sample proportional to pure

transmittance

𝑇 (𝑡)

instead of the default extinction-weighted trans-

mittance

𝜇

(𝑡)𝑇 (𝑡)

. Many-worlds transport has no transmittance,

so the solution ends up being even simpler: a uniform sampling

strategy suces.

ACM Trans. Graph., Vol. 1, No. 1, Article . Publication date: September 2025.