SIMULTANEOUS SIMULATION OF MARKOV CHAINS USING QUASI-MONTE CARLO TECHNIQUES

Title:

SIMULTANEOUS SIMULATION OF MARKOV CHAINS USING QUASI-MONTE CARLO TECHNIQUES

Document Type and Number:

WIPO Patent Application WO/2008/022173

Kind Code:

Abstract:

Methods, systems, apparatus and computer software/computer code products operable to enable computer graphics systems to more efficiently simulate Markov chains (and thus trajectories of photons and the like) comprise simulating, and/or means for simulating, Markov chains using a quasi-Monte Carlo methodology, wherein the simulating of Markov chains comprises sorting states, and wherein the sorting comprises proximity sorting.

Inventors:

KELLER ALEXANDER (DE)
WAECHTER CARSTEN (DE)

Application Number:

PCT/US2007/075966

Publication Date:

December 04, 2008

Filing Date:

August 15, 2007

Export Citation:

Click for automatic bibliography generation Help

Assignee:

MENTAL IMAGES INC (US)
KELLER ALEXANDER (DE)
WAECHTER CARSTEN (DE)

International Classes:

G06T15/06; H04N13/02

Foreign References:

US20040051783A1	2004-03-18
US20030052874A1	2003-03-20
US20040125103A1	2004-07-01

Attorney, Agent or Firm:

JACOBS, David (1050 Winter StreetSuite 1000, #108, Waltham MA, US)

Download PDF:

View/Download PDF PDF Help

Claims:

We cSaJiTi:

1. In a computer graphics system for generating a pixel value for a pixel in an image, the pixel value being representative of points in a scene as recorded on an image

5 plane of a simulated cam era, the computer graphics system being configured to generate the pixel value for an image using a selected ray -tracing methodology comprising the simulating of at least one ray shot from the pixel into a scene along a selected direction, the ray -tracing methodology further comprising the calculating of the intersections of rays and surfaces of objects in the scene and the simulating of K) trajectories of rays illuminating objects in the scene, the simulating of trajectories using Markov chains, the improvement comprising: simulating the Markov chains using a quasi-Monte Carlo methodology, wherein the simulating of Markov chains comprises sorting states, and wherein the sorting comprises proximity sorting. 15

2. hi the computer graphics system of ciaim I, the further improvement comprising simulating a plurality of Markov chains simultaneously.

3. In the computer graphics system of claim 2, the further improvement 20 wherein the simulating of a plurality of Markov chains simultaneously comprises utilizing quasi -Monte Carlo points.

4. In the computer graphics system of claim 3, the further improvement wherein the quasi-Monte Carlo points are selected using any of a Halton sequence, a

25 variant of a Halton sequence, a lattice sequence or a (/, .v)-sequence.

5. ϊn the computer graphics system of claim 2, the further improvement wherein the simulating of a plurality of Markov chains simultaneously comprises utilizing any number of chains.

6. In the computer graphics system of claim 2, the further improvement wherein the simulating of a plurality of Markov chains simultaneously further comprises adding additional trajectories on the fly.

- 46 -

7. In the computer graphics system of claim 2, the further improvement wherein the simulating of a plurality of Markov chains simultaneously further comprises trajectory splitting.

5 8. In the computer graphics system of claim 2, the further improvement wherein the simulating of a plurality of Markov chains simultaneously comprises utilizing a technique simulating particle absorption,

9. In the computer graphics system of claim 8, the further improvement K) wherein the simulating of a plurality of Markov chains simultaneously further comprises utilizing a Russian Roulette technique.

10. In the computer graphics system of claim 2, the further improvement wherein the simulating of a plurality of Markov chains simultaneously for an s-

15 dimensional problem comprises:

H := 0 initialize Xo _. i using quasi-Monte Carlo points xi loop: 20 - sort state vector using a suitable order

- n := n + 1

- continue chain by computing X» _M - using subsequent quasi-Monte Carlo points Xi ,

25 1 1. Sn the computer graphics system of claim ill the further improvement comprising; ordering slates by proximity in state space.

12. In the computer graphics system of claim 11 , the further improvement 30 comprising randomization.

i 3. In the computer graphics system of claim 12, the further improvement comprising utilizing any selected set or sequence of points with any selected randomization.

- 47 -

14. In the computer graphics system of claim ! 1, the further improvement comprising randomization, a.nd applying the following technique'

5 n : ^{■■■■■■} 0, select point sequence x* instanti ate randomization initialize Xy _. / using rand(x? ) loop

• sort state vector using some order 10 • n := n + i

• instantiate randomization

• continue chairs by computing X^ using randfa _? )

.15. In the computer graphics system of claim i.1. the further improvement ! 5 comprising; randomizing the qυasi-Monte Carlo points in each time step «, the randomizing comprising: selecting a (t, s)-sequence in base b ~ 2, from which subsequent samples are drawn, and

20 XOR-iπg the samples by an >v-dimensionaS random vector, the random vector being generated after each transition step.

16. In the computer graphics system of claim i 5, the further improvement comprising adding a selected number of additional trajectories on the fly 25

1 7 In the computer graphics system of claim 16, the further improvement comprising trajectory splitting.

18. In the computer graphics system of claim 17, the further improvement 30 comprising utilizing a technique simulating particle absorption.

19. Sn the computer graphics system of claim 18, the further improvement comprising utilizing a Russian Roulette technique.

- 48 -

20 In the computer graphics system of claim 2, the further improvement comprising: sorting by luminosity, or the norm of state; sorting using a spatial hierarchy; and 5 sorting using buckets enumerated by at. least one space filling curve.

21. In the computer graphics system of claim 20, the further improvement wherein: sorting using a spatial hierarchy comprises utilizing a binary hierarchy. K)

22. In the computer graphics system of claim 21, the further improvement wherein: the spatial hierarchy comprises any of a BSP -tree, A'D-tree or other axis-aligned subset of a BSP-tree structure, or bounding volume hierarchies, or regular voxels. 15

23. In the computer graphics system of claim 20, the further improvement wherein; a binary hierarchy is constructed by recursively subdividing using planes selected by a selected heuristic; and

20 then traversing the hierarchy in in-order to enumerate the leaves of the hierarchy in an order of proximity.

24. In the computer graphics system of claim 23, the further improvement comprising left-balancing the tree and storing the tree in an array data structure to

25 enable efficient traversal of the hierarchy.

25. In the computer graphics system of claim 24, the further improvement wherein: the spatial hierarchy comprises leaves; and

30 sorting using a spatial hierarchy further comprises utilizing bucket sorting and a selected .space-ill ling curve.

26. In the computer graphics system of claim 24, the further improvement wherein sorting using a spatial hierarchy further comprises using regular voxels.

„ 49 .

27. In the computer graphics system of claim 25, the further improvement corn prising. bounding an object to be rendered by an axis-aligned bounding box; 5 recursively or iteraiiveiy dividing the bounding box into successive ieft and right branches, each terminating in a leaf, by applying splitting planes passing through selected points of the object, thereby generating a plurality of leaves; ordering the leaves hierarchically according to their respective luminosities; performing a bucket sort on a matrix of selected size, divided into a selected 10 number of buckets, and using a selected space-fill ing curve to proceed through {he matrix, wherein within an individual bucket, a smaller space-filling curve is used to proceed through individual cells in the matrix.

i 5 28. In the computer graphics system of claim 2, the further improvement wherein the simulation is operable for configuration to solve integral equations.

29. In the computer graphics system of claim 28, the further improvement wherein the simulation is operable to enable light transport simulation for synthesizing 20 reali sti c-appearing i m ages .

30 In the computer graphics system of claim 29, the further improvement wherein: an underlying Sight transport simulating integral equation is reformulated as a 25 path integral, such that sampling path space corresponds to simulating Markov chains, in which paths are established by ray tracing and scattering events; wherein an initial distribution is determined by emission characteristics of light sources or sensors, and transition probabilities are determined by bi-directional reflectance distribution functions and bi-directional subsurface scattering distribution 30 functions for a surface in the image.

31. In the computer graphics system of claim 30, the further improvement comprising.

- 50 -

processing transparency effects by utilizing the path integral, initial distribution and transition probabilities .

32. In the computer graphics system of claim 30, the further improvement 5 wherein. the path integral is solved by utilizing either high dimensional qυasi-Monte Carlo points or by padding low dimensional qυasi-Monte Carlo points.

33 In the computer graphics system of claim 32, the further improvement K) wherein.

(he integral equation underlying light transport simulation is considered either as a Fredboim or a Vo! terra imegra! equation.

34. hi a computer graphics system for generating a pixel value for a pixel in an i 5 image, the pixel value being representative of points in a scene as recorded on an image plane of a simulated camera, the computer graphics system being configured to generate the pixel value for an image using a selected ray-tracing methodology comprising the simulating of at least one ray shot from the pixel into a scene along a selected direction, the ray-tracing methodology further comprising the calculating of

20 the intersections of rays and surfaces of objects in the scene and the simulating of trajectories of rays illuminating objects in the scene, the simulating of trajectories using Markov chains, a method comprising: simulating the Markov chains using a quasi-Monte Carlo methodology, wherein the simulating of Markov chains comprises sorting states, and

25 wherein the sorting comprises proximity sorting.

35. In a computer graphics system for generating a pixel value for a pixel in an image, the pixel value being representative of points in a scene as recorded on an image plane of a simulated camera, the computer graphics system being configured to

30 generate, the pixel value for an image using a selected ray-tracing methodology comprising the simulating of at least one ray shot from the pixel into a scene along a selected direction, the ray-tracing methodology further comprising the calculating of the intersections of rays and surfaces of objects in the scene and the simulating of

- 51 -

trajectories of rays illuminating objects in the scene, the simulating of trajectories using Markov chains, a system comprising: means for simulating the Markov chains using a quasi -Monte Carlo methodology, 5 wherein the means for simulating of Markov chains comprises means for sorting states, and wherein the means for sorting comprises means for proximity sorting.

36, In a computer graphics system for generating a pixel value for a pixel in an image, the pixel value being representative of points in a scene as recorded on an image

K) plane of a simulated camera, the computer graphics system being configured to generate the pixel value for an image using a selected ray-tracing methodology comprising the simulating of at least one ray shot from the pixel into a scene along a selected direction, the ray-tracing methodology further comprising the calculating of the intersections of rays and surfaces of objects in the scene and the simulating of

15 trajectories of rays illuminating objects in the scene, the simulating of trajectories using Markov chains, a computer program product comprising computer-executable program code stored on a computer-readable medium, the computer-executable program code comprising: computer code means for simulating the Markov chains using a quasi -Monte

20 Carlo methodology, wherein the computer code means for simulating of Markov chains comprises computer code means for sorting states, and wherein the computer code means for sorting comprises computer code means for proximity sorting

37, In the computer graphics system of claim 19, the further improvement comprising: simulating inhomogeneoυs Markov chains.

- 52 -

Description:

SIMULTANEO[IS SIMULATION OF MARKOV CHAINS USING QUASI-MONTE CARLO TECHNIQUES

CrossiltefCT

This application for patent (Attorney Docket MENT- 101 -B) claims the priority benefit of United States Provisional Patent App. Serial No 60/822,417, filed August 15. 2006. This application is also a Continuation-! n-Part of United Sates Patent App

10 Serial No 1 1/474,517, filed June 23, 2006 (Attorney Docket MENT- IOl-US), which claims the benefit of United States Provisional Patent App. Serial No 60/693,231 filed June 23, 2005 {Attorney Docket MENT-IO 1 -PR), and which is also a Cσntinuatiou-m- Part of United States Patent App. Serial No. 10/299,574, filed November 19, 2002 {Attorney Docket MENT-075).

15 United States Patent App. Serial No. 10/299,574 (Attorney Docket MfϊNT-075) is a Continuation-in-Part of U.S. Serial No. 09/884,861, filed June 19, 2001 (Attorney Docket MENT-061 ), which claims priority benefit from United States Provisional Patent App. Serial No. 60/265,934, filed February 1 , 2001, and 60/212,286 filed June 19, 2000

20 Each of the above-listed patent applications, including, but not limited to,

Attorney Docket Nos. MENT-061, MENT-075, MENT- IOl-PR, and MENT-IOl-US, as well as their provisional counterparts, is incorporated by reference herein in its entirety as if set forth in its entirety herein.

Reference. To Computer .Program . Listing

25 Attached to this application below is a source code listing. The source code listing is referred to herein as the ""Computer Program Listing," and is organized into sections identified by a three-digit reference number in the format '"1.1.1 ."

FJeId _n Of The Jnvenf ion 30 The present invention relates generally to methods and systems for image rendering in and by digital computing systems, such as for motion pictures and other applications, and in particular, relates to methods, systems, devices, and computer software for real-time, precision ray tracing

Background . of the Invention

The term ''ray tracing" describes a technique for synthesizing photorealistic images by identifying all light paths that connect light sources with cameras and 5 summing up these contributions. The simulation traces rays along the line of sight to determine visibility, and traces rays from tine Sight sources in order to determine illumination.

Ray tracing has become mainstream in motion pictures and other applications. However, current ray tracing techniques suffer from a number of known limitations and

K) weaknesses, including numerical problems, limited capabilities to process dynamic scenes, slow setup of acceleration data structures, and large memory footprints. Thus, current ray tracing techniques lack the capability to deal efficiently with fully animated scenes, such as wind blowing through a forest or a person's hair. Overcoming the limitations of current ray tracing systems would also enable the rendering of, for

15 example, higher quality motion blur in movie productions.

Current attempts to improve the performance of ray tracing systems have fallen short for a number of reasons. For example, current real-time ray tracing systems generally use 3D-trees as their acceleration structure, which are based on axis-aligned binary space partitions. Because the main focus of these systems is on rendering static

20 scenes, they typically fail to address the significant amount of setup time required to construct the required data structures in connection with fully animated scenes. Along these lines, one manufacturer has improved real-time ray tracing by building efficient 3D-trees and developing a technique able to shorten the time needed to traverse the tree. However, it can be shown that the expected memory requirement for the system

25 increases quadratically with an increase in the number of objects to be ray-traced.

Another manufacturer has designed a ray tracing integrated circuit that uses bounding volume hierarchies to improve system performance. However, it has been found that the architecture's performance breaks down if too many incoherent secondary rays are traced.

30 In addition, attempts have made to improve system performance by implementing 3D~tree traversal techniques using field-programmable gate arrays (FPGAs). The main increase in processing speed in these systems is obtained by tracing bundles of coherent rays and exploiting the capability of FPGAs to perform rapid hardwired computations.

. 9 .

The construction of acceleration structures has not yet been implemented in hardware. The FPGA implementations typically use floating point techniques at reduced preci si on .

A number of different techniques can he used to simulate photon trajectories, 5 including Markov chains. The use of quasi -Monte Carlo techniques in simulating Markov chains is described, for example, in; Lεcυyer et al., Randomized Quasi- Monte Carlo Simulation of Markov Chains with an Ordered Stale Space (2004} ("L'Ecuyer"), which is incorporated herein by reference as if set forth in its entirety,

L'Ecuyer discusses the use of a randomized quasi-Monte Carlo method for K) estimating the state distribution at each step of a Markov chain with totally ordered {discrete or continuous) state space. The number of steps in the chain can be random and unbounded. The method can be used in particular to get a low-variance unbiased estimator of the expected totai cost up to some random stopping time, when state dependent costs are paid at each step. That paper provides numerical illustrations i 5 where the variance reduction with respect to standard Monte Carlo is substantial,

Lεcυyer notes that a deterministic quasi-Monte Carlo method for estimating transient measures over a fixed number of steps, for discrete-time and discrete-state Markov chains with a totally ordered state space, was proposed and studied in the following papers, which axe also incorporated herein by reference in their entireties: 20 Lecot et a!., "Quasi-Random Walk Methods," Monte Carlo and Quasi-Monte

Carlo Methods 200(1 pp. 63-85, Springer-VerJag (Berlin, 2002).

Lecot et a!., "Quasi-Monte Carlo Methods for Estimating Transient Measures of Discrete Time Markov Chains," Mυnte Carlo and Quasi ^: -Monte Carlo Methods 2002, pp. 329-43, Springer-Veiiag (Berlin, 2004) f Lecot 2004"). 25 L'Ecuyer describes a technique that simulates n ^::: 2 ^A copies of the chain in parallel, for the same number of steps, using a (0, 2}-set]uence in base 2, At step / of the chain, it reorders the n copies according to their states and simulates the transitions (i.e., next, states) for the ti copies by employing the elements nj to nj -f « - 1 of the (0, 2)- sequeπce in place of uniform random .numbers to drive the simulation. It assumes that 30 simulating each transition of the chain requires a single uniform random van ate.

Convergence to the correct value was proved in Lecot 2004 under a condition on the structure of the transition matrix of the Markov chain.

L'Ecuyer attempts to generalize the technique to Markov chains with continuous state space, with a random and unbounded number of steps (and it is claimed that such

- j ⁾ -

a technique permits one to cover regenerative simulation, in particular), and for which the number ii of uniform random variates that are required to generate the next state in one step of the Markov chain can be larger than 1.

However, while various researchers have postulated the use of Markov chains to 5 simulate photon trajectories and the like, previous techniques of using quasi -Monte Carlo methods to simulate Markov chains, e.g., trajectories of photons, provided only moderate improvements over random sampling. However, QMC methods in accordance with the present invention, as described below, can exploit the intrinsic low-dimensional structure of the solution operator, yielding much more efficient K) techniques.

Thus, prior to the present invention, there was still a need for more efficient methods, systems, apparatus, devices and software program/code products for simultaneously simulating Markov chains.

Ii would be desirable, therefore, to provide improved and more efficient i 5 methods, systems, apparatus, devices and software program/code products for simulating Markov chains., using Quasi-Monte Carlo techniques and sorting strategies.

S»iϊimj|Q; _ιιι of _ι Uie _ιι |iiv _ι e _ι » _ι tion The present invention provides methods, systems, apparatus, and computer

20 software/computer code products suitable for use in a computer graphics system for generating a pixel value for a pixel in an image, the pixel value being representative of points in a scene as recorded on an image plane of a simulated camera, the computer graphics system being configured to generate the pixel value for an image using a selected ray-tracing methodology comprising the simulating of at least one ray shot

25 from the pixel into a scene along a selected direction, the ray-tracing methodology further comprising the calculating of the intersections of rays and surfaces of objects in the scene and the simulating of trajectories of rays illuminating objects in the scene, the simulating of trajectories using Markov chains.

In one aspect of the invention, methods, systems, apparatus and computer

30 software/computer code products in accordance with the invention comprise simulating (and/or means for simulating) the Markov chains using a quasi-Monte Carlo methodology, wherein the simulating of Markov chains comprises soiling states, and wherein the sorting comprises proximity sorting.

. 4 -

A further aspect of the invention comprises simulating a plurality of Markos chains simultaneously, and the simulating of a plurality of Markov chains simultaneously comprises utilizing qυasi-Monte Carlo points, which can be selected using any of a Halton sequence., a variant of a Ha! too sequence, a lattice sequence or a 5 (/, .?)- sequence. The simulating of a plurality of Markov chains simultaneously can utilize any number of chains. tn a further aspect of the invention, the simulating of a plurality of Markov chains simultaneously comprises adding additional trajectories on the fly, such as by trajectory splitting The simulating can also comprise utilizing a technique simulating K) particle absorption, and can include using a Russian Roulette technique. lii one aspect of the invention, the simulating of a plurality of Markov chains simultaneously for an s-dimensional problem comprises the following technique

fi -= 0

15 initialize Xo / using qυasi-Monte Carlo points x _t loop.

- sort state vector using a suitable order - n :~ n í ]

-- continue chain by computing X,. _. / using 20 subsequent quasi -Monte Carlo points X ₁ ,

The invention can also comprise ordering states by proximity in state space, and techniques of randomization. Any selected set or sequence of points with any selected randomization can be employed in conjunction with the methods of the present 25 invention.

In another aspect the invention further comprises the following technique:

n .- 0, select point sequence x, instantiate randomization 30 initialize X _t >> using rand(.r> ) loop

• sort state vector using some order

• n . ^:: n * I

• instantiate randomization

35 • continue chain by computing X _{n t} using rand(.v? )

The randomizing technique can comprise randomizing the quasi -Monte Carlo points in each time step w, wherein the randomizing more particularly comprises: selecting a (/, .^-sequence in base b ^:::: 2, from which subsequent samples are drawn, and XOR-ing the samples by an .v-dimensional random vector, the random vector being 5 generated after each transition step.

Another aspect of the invention comprises sorting by luminosity, or the norm of state; sorting using a spatial hierarchy; and sorting using buckets enumerated by at least one space filling curve. The sorting using a spatial hierarchy can comprise utilizing a binary hierarchy, and the spatial hierarchy can comprise any of a BSP-tree, M>-tree or K) other axis-aligned subset of a BSP-tree structure, or bounding volume hierarchies, or regular voxels. The binary hierarchy can be consiaicted by recursively subdividing using planes selected by a selected heuristic; and then traversing the hierarchy in in- order to enumerate the leaves of the hierarchy in an order of proximity. More efficient traversal can be provided by 1 eft-balancing the tree and storing the tree in an array data 15 structure. Sorting using a spatial hierarchy can also comprise utilizing bucket sorting and a selected space-filling curve; and in one aspect of the invention, using regular voxels.

In another aspect of the invention, the simulation and/or sorting processing comprises: 20 bounding an object to be rendered by an axis-aligned bounding box; recursively or iterative!}' dividing the bounding box into successive left and right branches, each terminating in a leaf, by applying splitting planes passing through selected points of the object, thereby generating a plurality of leaves; ordering the leaves hierarchically according to their respective luminosities; 25 performing a bucket sort on a matrix of selected size, divided into a selected number of buckets; and using a selected space-filling curve to proceed through the matrix, wherein within an individual bucket a smaller space-filling curve is used to proceed through individual cells in the matrix. 30 The simulation processes of the present invention can be adapted for configuration to solve integral equations The simulation is operable to enable light, transport simulation for synthesizing realistic-appearing images. In accordance with this aspect of the invention, an underlying light transport simulating integral equation can be reformulated as a path integral, such that sampling path space corresponds to

- 6 -

simulating Markov chains, in which paths are established by ray tracing and scattering events; wherein an initial distribution is determined by emission characteristics of light sources or sensors, and transition probabilities are determined by bi-directional reflectance distribution functions and bi-directional subsurface scattering distribution 5 functions for a surface in. the image. Transparency effects can be processed by utilizing the path integral, initial distribution and transition probabilities. In related aspects of the invention, the path integral can be solved by utilizing either high dimensional quasi- Monte Carlo points or by padding low dimensional quasi -Monte Carlo points, and the integral equation underlying light transport simulation is considered either as a 10 Fredhohn or a VoI terra integral equation.

Additional details, embodiments, practices, and examples are described in the following Detailed Description, taken in conjunction with the attached drawing figures, or will become readily apparent to those skilled in the an from a reading of the Detailed Description in conjunction with the drawings 15 jjftjgf IjgscrijMjjQij M Ijhjg ; Or^Mffgs

FIG. 1 is a schematic diagram of a conventional digital processing system in which the present invention can be deployed.

FiG. 2 is a schematic diagram of a conventional personal computer, or like 20 computing apparatus, in which the present, invention can be deployed.

FIG. 3 is a diagram illustrating an overall method in accordance with a first aspect of the present invention.

FIG. 4 is diagram of a ray tracing procedure, illustrating the problem of self-intersection. 25 FIG. 5 shows a diagram, in elevation view, of a partitioned axis-aligned bounding box that is itsed as an acceleration data structure in accordance with a further aspect of the invention.

FIGS. 6-8 are a series of diagrams, in isometric view, of the axis-aligned bounding box shown in FlG 5, illustrating the partitioning of the bounding box with 30 L- and R-planes

FKϊS. 9 and 10 are flowcharts of ray tracing methods according to further aspects of the invention.

FIG. 1 1 is diagram illustrating the shift φs(n).

_ 7 _

FSG. 12 shows an illustrative Halton sequence along with its stratification properties.

FIG. 13 is a diagram illustrating a sampling transport path space by bidirectional path tracing.

5 FSGS. 14A-B and 1 5A-B are rendered images illustrating differences between simulations using a Monte Carlo technique and simulations using an array-randomized quasi-Monte Carlo technique.

FIGS. 16A-D are a series of diagrams illustrating photon trajectories from a plurality of starting points on a light source.

K) FSGS. ϊ7A-G, 18A-B, and 19A-C are a series of diagrams illustrating the sorting of states into leaves of a spatial hierarchy, thereby defining an order of proximity by traversing the hierarchy in in-order.

FIGS. 20A-D are, respectively, a test scene "Shirley 6," a graph showing average RMS error to a master solution, a graph showing global variance, and a graph 15 showing pixel -based variance.

FIGS. 2 IA-D are, respectively, a test scene " ^" Invisible Date/' a graph showing average RMS error to a master solution, a graph showing global variance, and a graph showing pixel-based variance.

FIGS. 22 A-C are a visual comparison of the test scene "invisible Date" 20 rendered, respectively, using Monte Carlo, randomized quasi-Monte Carlo, and randomized quasi-Monte Carlo with sorting,

FIGS. 23 A-D axe., respectively, a schematic view of a labyrinth test scene, a graph showing the average amount of total radiance received by the camera using each technique, an enlarged portion of the graph, and a table displaying the deviation from a 25 master solution.

FIGS, 24-30 are flowcharts of a number of techniques and sub-techniques according to various described aspects of the invention.

Detailed Description

30 The present invention provides improved techniques for ray tracing, and for the efficient construction of acceleration data structures required for fast ray tracing. The following discussion describes methods, structures and systems in accordance with these techniques.

It vviil be understood b> those skilled in the art that the desαibed methods and systems can be implemented in software, hardware, or a combination of software and hardware, usirm conventional compute? appajatus such as a personal compute! (PO 01 equivalent device operating in accordance with, or emulating, a conventional operating 5 system such as Microsoft Windows, ϊ .inuv or Unix, either in a standalone configuration or across a network The various processing means and computational means described below and ieeited in the claims may therefore be implemented in the software and/or hardware elements of a properly configured digital processing device or network of devices Processing may be performed sequentially or in parallel, and K) may be implemented using, special purpose or reconfigurable hardware

Methods, devices or software- products in accoi dance with the invention can opeiate on any of a wide range of con semi ona! computing devices and systems, such as those depicted by way of example in FlG 1 (c g , network system 100), whether standalone, networked, portable or fixed, including com cnlionaj PCs 102, laptops 104, 15 handheld or mobile computers 10o, or across the Internet or other networks 108, which may in turn include servers 1 IO and storage 1 12

Sn line with conventional computer software and hardware practice, a software application configured in accordance with the invention can operate within, e g . a PC 102 like that shown in HCi 2, in which piogiam instructions can be icad fiom a 20 CD-ROVl 1 1(>, magnetic disk or other storage 120 and loaded into RAM 1 14 for execution by CPU 1 18 Data can be input into the system via any known device or means including a conventional keyboard, scanner, mouse or other elements lCϊ

T he present description of the invention is dhided into tv\o major sections

S Real-time Precision Ray Tracing

25 f i Sorting Strategies for Rays Simulated Using Markov Chains arid

Quasi -\1oute Carlo Techniques

I. Real-time Precision Ray Tracing

MG 3 is a diagram depicting an overall method 200 in accordance with an 30 aspect of the present invention The method is practiced in the context of a computer graphics system, in which a pixel value is generated for each pixel in an image Each generated pixel value is representative of a point in a scene as recorded on an image plane of a simulated camera The computer graphics system is configured to generate the pixel \ alue for an image using a selected ray -tracing methodology The selected

- 9 -

ray-tracing methodology includes the use of a ray tree that includes at least one ray shot from the pixel into a scene along a selected direction, and further includes calculations of the intersections of rays a.nd objects (and/or surfaces of objects) in the scene.

In the FIG. 3 method 200 , bounding volume hierarchies are used to calculate the 5 intersections of rays and surfaces in the scene, In step 201 , a bounding box of a scene is computed, in step 202, it is determined whether a predetermined termination criterion is met. If not, then in step 203 the axis-aligned bounding box is refined. The process continues recursively until the termination criterion is met. According to an aspect of the invention, the termination criterion is defined as a condition at which the

K) bounding box coordinates differ only in one unit of resolution from a floating point representation of the ray/surface intersection point. However, the scope of the present invention extends to other termination criteria.

The use of hounding volume hierarchies as an acceleration structure is advantageous for a number of reasons. The memory requirements for bounding

15 volume hierarchies can be linearly bounded in the number of objects to be ray traced. Also, as described below, bounding volume hierarchies can be constructed much more efficiently than 3D-trees, which makes them very suitable for an amortized analysis, as required for fully animated scenes.

The following discussion describes in greater detail certain issues in ray tracing

20 technology, and particular aspects of the invention that address those issues, FlG. 4 is a diagram illustrating the "seif-intersectionf problem. FIG. 4 shows a ray tracing procedure 300., including an image surface 302. an observation point 304, and a light source 306. in order to synthesize an image of the surface, a series of computations are performed in order to locate rays extending between the observation point 304 and the

25 surface 302. FlG. 4 shows one such ray 308. Ideally, there is then calculated the exact point of intersection 3 10 between the ray 308 and the surface 302.

However, due to floating point arithmetic computations on computers, it is sometimes possible for the calculated ray/surface intersection point 312 to be different from the actual intersection point 310. Further, as illustrated in FIG. 4, it is possible for

30 the calculated point 312 to be located on the "wrong" side of the surface 302. Tn that case, when computations are performed to locate a secondary ray 314 extending from the calculated ray/surface intersection point 3 12 to the light source 306, these computations indicate that the secondary ray 3 14 hits the surface 302 at a second

- 10 -

intersection point 316 rather than extending directly to the light source 306. thus resulting in an imaging error

One known solution to the self-inteisectiυn problem is to start each jay 308 at a safe distance from the surface 302 I his safe distance is typically expressed as a global 5 floating point < However, the determination of the global floating point t depends heavih on the scene, and the particular location within the scene itself, for which an image is being s\ πthesi/ed

An aspect of the invention pro\ ides a more precise alternative After arriving at a calculated myvsuiface intersection point 3 12, the calculated point 312 and the

10 direction of the ray 308 are then used to re-compute an intersection point that is closer to the actual intersection point 310 This re-coniputation of the intersection point is incorporated into the ray Uacing technique as an iteration that increases piecision If the iterativcly computed intersection point turns out to be on the "wrong"' side of the surface 302, it is moved to the "correct" side of the surface 302 The iteratively

15 computed intersection point can be moved along the surface normal, or along the axis determined by the longest component of the norma! Instead of using a global floating point u the point ss moved by an integer < to the last bus of the floating point mantissas

The described procedure avoids computations _* m double precision and has the advantage that it implicitly adapts to the scale of the floating point number, which is 20 determined by its exponent Thus, in this implementation, all secondary rays directly start from these modified points making an t -offset unnecessary During intersection computation, it can lheiefose be assumed that the ray interval of validity to begin at 0 rather than some offset

K'lodif) ing the integer representation, of the mantissa also avoids numerical

25 problems when intersecting a triangle and a plane in order to decide v> hieh points are on what side

Hxploiting the convex hull property of convex combinations, intersections of iays and free-form surfaces can be found by refining an axis-aligned bounding box, which contains the point of intersection nearest to the rav origin This refinement can

30 be continued until the resolution of floating point numbers is reached, i e , until the bounding box coordinates differ only in one unit of resolution from the floating point representation The self-intersection problem then is a\ oided by selecting the bounding box cof nef that is closest to the surface normal in the center of the bounding box This

- 1 1 -

comer point then is used to start the secondary ray. This "ray object intersection test" is very efficient and benefits from the avoidance of the self-intersection problem.

After constructing the acceleration data structure, the triangles are transformed in-piace. The new representation encodes degenerate triangles so that the intersection 5 test can handle them without extra effort. If of course is also possible to just prevent degenerate triangles to enter the graphics pipeline. Sections 2,2.1 and 2.2.2 of the Computer Program listing set. forth listings of source code in accordance with the present aspect of the invention.

The test first determines the intersection of the ray and the plane of the triangle

K) and then excludes intersections outside the valid interval ]0, result. tfarj on the ray This is achieved by only one integer test. Note that the Hj is excluded from the valid interval. This is important if denormalized floating point numbers are not supported. If this first determination is successful, the test proceeds by computing the Bary centric coordinates of the intersection. Note that again only an integer test, i.e., more i 5 specifically only testing two hits, is required to perform the complete inclusion test. Thus the number of branches is minimal, in order to enable this efficient test, the edges and the normal of the triangle are scaled appropriately in the transformation step.

The precision of the test is sufficient to avoid wrong or missed ray intersections. However, during traversal situations may occur in which it is appropriate to extend the

20 triangles for a robust intersection test. This can be done before transforming the triangles. Since the triangles are projected along the axis identified by the longest component of their normal, this projection case has to be stored. This is achieved by counters in the leaf nodes of the acceleration data structure: The triangle references are sorted by the projection case and a leaf contains a byte for the number of triangles in

25 each class.

A further aspect of the present invention provides an improved approach for constructing acceleration data structures for ray tracing. Compared with prior software implementations that follow a number of different optimizations, the approach described herein yields significantly flatter trees with superior ray tracing performarsee

30 Candidates for splitting planes are given by the coordinates of the triangle vertices inside the axis-aligned bounding box to be partitioned. Note that this includes vertices that actually lie outside the bounding box, but have at least one coordinate that lies in one of the three intervals defined by the bounding box. Out of these candidates, there is selected the plane closest to middle of the longest side of the current axis-

- 12 -

aligned bounding box A further optimization selects only cooidinates of triangles whose longest component of the surface normal matches the normal of the potential splitting plane This pioceduie yields much flatte? tteεs, since placing splitting planes through the triangle vertices implicitly reduces the number of triangles split by splitting 5 planes In addition, the surface is approximated tightly and empt> space is maximized if the number of triangles is higher than a specified threshold and there are no more candidates for splitting planes, the box is split in the middle along its longest side This avoids inefficiencies of other approaches, including the use, for example, of long diagonal objects

K) I he recursive procedure of deciding which triangles belong to the left and right child of a node in the hierarchy has typically i cquπcd extcnsh e bookkeeping and memory allocation There is a much simpler approach that onh fails in exceptional cases Only tv\ o arrays of references to the obj ects to be ray traced are allocated The first arra> is initialized with the object ieferences During recursive space partition, a

15 stack of the elements on the left is grown from the beginning of the array, while the elements, which are classified right, are kept on a stack growing from the end of the array towards the middle In order to be able to quickly restore the elements that are intersecting a split plane, i e , are both left and right, the second airay keeps a stack of them fhus backtiacking is efficient and simple

20 Instead of pruning branches of the tree b\ using the suiface area heuristic, tree depth is pruned by approximately left-balancing the binary space partition starting from a fixed depth As observed by exhaustive experimentation, a global fixed depth parameter can be specified across a vast variety of scenes This can be understood by observing that after a certain amount of binarj space partitions usually theie remain

25 connected components that a;e relatively flat in space Section 2 3 1 of the Computer Program Listing sets forth a listing of source code in accordance with this aspect of the invention

Using bounding volume hierarchies, each object to be ray traced is referenced exactly once λs a consequence, and in contrast with 3D~trees, no mailbox if ⁾ mechanisms are required to prevent the multiple intersection of an object with a ray during the U a versa! of the hiεiarchy This is a significant adv antage from the view point of system performance and makes implementations on a shared memory system much simpler A second important consequence is that there cannot be more inner nodes in the tree of a bounding volume hierarchy than the total number of objects to be ray-

- 13 -

traced Thus the memory footprint of the acceleration data structure can be ϊineaiϊy bounded in the number of objects before construction Such an a prion bound is not a\ ailable foj the const! uctitm of a 3P~tτee, wheje the memoiy complexity is expected to increase quadrats caliy with the number of objects to be ray -traced 5 Thus, there is now described a new concept of bounding volume hieraiehies that aie signifieanth faster than current 3D-tree ray tracing techniques, and in which the memory requirements grow Hneari) , rather than expected quadraticall} , with the number of objects to be ray-traced The core concept that allows bounding volume hierarchies to outperform 3D-trecs is to focus on how t>pace can be partitioned, father K) than focusing on the bounding volumes themselves

In a 3D~tree, a bounding box is partitioned by a single plane Accoiding to the present aspect of the invention, two parallel planes are used to define two axis-aligned bounding boxes FIG 5 is a diagram illustrating the principal data structure 400

FiG 5 shows an axis-aligned bounding box 402. in del ation view An L-plane 15 402 and an R-plaiie 404, which are axiis-aligned and parallel with each other, are used to partition bounding box 402 into left and right axis-aligned bounding box The left bounding box extends from die left wall 406 of the original bounding box 402 to the 1. -plane 402 The right bounding box extend;> from the R-pϊanc 404 to the right wall 408 of the original bounding box 402 llius, the left and right bounding boxes may 20 overlap each other The trav ersal of ray 412 is determined by the positions of intersection with the L- and R-planes 402 and 404 relative to the inters al of \alidtty JLY, F\ 412 of the ray 410

In the FIG 5 data structure 400, the L- and R-planes 402 and 404 are positioned w ith respect to each other to partition the set of objects contained within the original 25 bounding box 402, rather than the space contained within the bounding box 402 In contrast with a 30-tree partition, having two planes offers the possibility of maximizing the empty space between the two planes Consequently the boundary of the scene can be approximated much faster

HGS (>-8 are a series of three-dimensional diagrams further illustrating data iθ structure 400 FIG ό shows a diagram of bounding bo\ 402 For purposes of illustration, v irtual objects within bounding bo\ 402 are depicted as abstract circles 41o As shown in FIGS 7 and 8, L-plane 404 and R-plane 40b are then used to partition bounding box 402 into a left bounding box 402a and a right bounding box 402b The L- and R-planes are selected such that the empty space between them is maximized

- 14 -

Each virtual object 416 ends up in either the left bounding box 402a or the right bounding box 402b. As shown at the bottom of FlG. 8, the virtual objects 416 are partitioned into 'left" objects 416a and "right" objects 416b Each of the resulting bounding boxes 402a and 402b are themselves partitioned, and so on, until a 5 termination criterion has been satisfied.

FlG. 9 is a flowchart of the described method 500. In step 501 , a bounding box of a scene is computed. In step 502, parallel L- and R-planes are used to partition the axis-aligned bounding box left and right axis-aligned bounding boxes, which may overlap. In step 503, the left and right bounding boxes are used to partition the set of

K) virtual objects contained with the original axis-aligned bounding box into a set of left objects and a set of right objects. In step 504, the left and right objects are processed recursively until a termination criterion is met.

Instead of one split parameter, used in earlier implementations, two split parameters are stored within a node. Since the number of nodes is linearly bounded by i 5 the number of objects to be ray traced, an array of all nodes can be allocated once. Thus, the costly memory management of 3D-trees during construction becomes unnecessary.

The construction technique is much simpler than the analog for 3D~lree construction and is easily implemented in a recursive way, or by using an iterative

20 version and a stack. Given a list of objects and an axis-aligned bounding box, the L- and R-planes are determined, and the set of objects is determined accordingly. The left and right objects are then processed recursively until some termination criterion is met. Since the number of inner nodes is bounded, it is safe to rely on termination when there is only one object left.

25 It should be noted that the partition only relies on soiling objects along planes that are perpendicular to the x-, y-, and z-axes, which is very efficient and numerically absolutely stable. In contrast with 3D-trees, no exact intersections of objects with splitting planes need to be computed, which is more costly and hard to achieve in a numerically robust way. Numerical problems of 3D-trees, such as missed triangles at

30 vertices and along edges, can be avoided by extending the triangles before the construction of the bounding volume hierarchy. Also, in a 3D-tree, overlapping objects have to be sorted both into the left and right axis-aligned bounding boxes, thereby causing an expected quadratic growth of the tree.

- 15 -

Various techniques may be used to determine the L- and R-planes, and thus the actual tree layout. Returning to FIGS. 6-S, one technique is to determine a plane M 418 using the 3D-t.ree construction technique described above and partition the objects such that the overlap of the resulting L-pSane and R-piane of the new axis-aligned bounding 5 boxes minimally overlaps the suggested splitting plane M 418, The resulting tree is very similar to the corresponding 3D-tree, however, since the object sets are partitioned rather than space, the resulting tree is much flatter. Another approach is to select the R -plane and L-plane in such a way that the overlap of child boxes is minima! and the empty space is maximized if possible. It should be noted that for some objects axis-

10 aligned bounding boxes are inefficient. A.n example of such a situation is a long cylinder with small radius on the diagonal of an axis-aligned bounding box,

FIG. 10 is a flowchart of a method 600 according to this aspect of the invention In step 601, a bounding box of a scene is computed, In step 602, a 3D-tree construction is executed to determine a splitting plane M ϊn step 603, parallel L- and R»planes are

15 used to partition the axis-aligned bounding box into left and right axis-aligned bounding boxes that minimally overlap the splitting plane M, ϊn step 604, the left and right bounding boxes are used to partition the set of virtual objects contained within the original axis-aligned bounding box into a set of left objects and a set αf right objects. In step 605, the left and right objects are processed recursively until a termination

20 criterion is met. It should be noted that the method 600 illustrated in FlG. 10, as well as the method 200 illustrated in FlG. 3, may be combined with other techniques described herein, including techniques relating to 3D-tree construction, real-time processing, bucket sorting, self-intersection, and the like.

Sn the case of the 3D-tree, the spatial subdivision is continued so as to cut off

25 the empty portions of the space around the object, ϊn the case of the described bounding volume hierarchy, partitioning such objects into smaller ones results in a similar behavior, ϊn order to maintain the predictability of the memory requirements, a maximum bounding box size is defined. Ail objects with an extent that exceeds the maximum bounding box size are split into smaller portions to meet the requirement

30 The maximum allowed size can be found by scanning the data set for the minimal extent among all objects.

The data structure described herein allows the transfer of the principles of fast 3D~tree traversal to bounding volume hierarchies. The cases of traversal are similar: ( I ) only the left child; (2) only the right child; (3) the left child and then the right child;

- 16 -

(4) the right child and then the IeIl child, or (5) the ray is between split planes (i e . empty space) Since one node in the described technique is split by two parallel planes, the ojdej of how to u averse the boxes is determined by the iay direction FlG o sets forth a source code listing incorporating the techniques described above 5 Previous bounding \ olume hierarchy techniques could not efficiency determine the order of how to traverse the child nodes or required additional eiYort, such as updating a heap data structure In addition a whole bounding \ olume had to be loaded and tested against the ray, while the present approach only requires the two plane distances Checking the ray against the two planes in software seems to be more

K) expensive however The tra\ ersal is the bottie neck in 30-trees, and doing some mote computation here better hides the latencies of memory access In addition, the bounding volume hierarchy Uees tend to be much smaller than corresponding 31)-trees of same performance

Although there is herein desαϊbed a new bounding volume hierarchy, there Ls a i 5 strong link to traversing 3D-trees Setting L — R , the classical binary space partition is obtained, and the traversal technique collapses to the traversal technique for 3D-trees

The described bounding volume hierarchy also can be applied to efficienth find iay freetbrm surface intersections by subdividing the freefoπn surface Doing so allows the intersection of a fiecfoim surface with a convex hull ρiopeit> and a

20 subdivision technique efficiently to be computed up to floating point precision. depending on the actual floating point arithmetic A subdivision step is performed, for example, for polynomial surfaces, rational surfaces, and approximating subdivision surfaces I- or each axis in space the possibly overlapping bounding boxes are determined as discussed above (n case of a binary subdivision, the intersection of the

25 L-bo\es and the intej section of the R-boxes for new bounding boxes of the new meshes Ncn\ the above-described traversal can be efficiently performed, since the spatial order of the boxes is known Instead of pie-computing the hieiaiehy of bounding volumes, it can be computed on the fly This procedure is efficient for free-form surfaces and allows one to save the memory for the acceleration data structure, which is replaced by if) a small stack of the bounding volumes that have to be traversed by backtracking The subdivision is continued until the ra\ surface intersection lies in a bounding volume that collapsed to a point in floating point precision or an interval of a small size Section 2 1 1 of the Computer Program Listing sets forth a code listing in accordance with this aspect of the im entioπ

- 17 -

Using regular grids as an acceleration data structure in ray tracing is simple, but efficiency suffers from a lack of spatial adaptivity and the subsequent traversal of many empty grid cells. Hierarchical regular grids can improve on the situation, but still are inferior as compared to bounding volume hierarchies and 3D~trees. However, regular 5 grids can be used to improve on the construction speed of acceleration data structures. The technique for constructing the acceleration data structures are similar to quick sorting and are expected to run in CXn log n). An improvement can be obtained by applying bucket sorting, which runs in linear time. Therefore the axis-aligned bounding box of the objects is partitioned into fi _x ^■ « _v ^x fh axis-aligned boxes. Each

1.0 object then is sorted into exactly one of these boxes by one selected point, e.g., the center of gravity or the first vertex of each triangle could be used. Then the actual axis- aligned bounding box of the objects in each grid cell is determined. These axis-aligned bounding boxes are used instead of the objects they contain as long as the box does not intersect one of the division planes, in that case the box is unpacked and instead the

15 objects in the box will be used directly. This procedure saves a lot of comparisons and memory accesses, noticeably improves the constant of the order of the construction techniques, and also can be applied recursively. The above technique is especially appealing to hardware implementations, since it can be realized by processing a stream of objects.

20 The acceleration data structures cars be built on demand, i.e , at the time when a ray is traversing a specific axis-aligned bounding box with its objects. Then on the one hand the acceleration data structure never becomes refined in regions of space, which are invisible to the rays, and caches are not polluted by data that is never touched. On the other hand after refinement the objects possibly intersected by a ray are already in

25 the caches.

From the above discussion, it will be seen that the present invention addresses long known issues in ray tracing and provides techniques for ray tracing having improved precision, overall speed and memory footprint of the acceleration data structures. The improvements in numerical precision transfer to other number systems

30 as well as, for example, to the logarithmic number system used in the hardware of the ART ray tracing chips. It is noted that the specific implementation of the IEEE floating point standard on a processor or a dedicated hardware can severely influence performance. For example, on a Pentium 4 chip deπorraalized numbers can degrade

- 18 -

performance by a facto* of 100 and more As discussed abo\ e. an implementation of the invention avoids these exceptions The view of bounding volume hierarchies described beiεin makes them suited for tealtime iay tracing In an amortized analysis, the described techniques outperform the previous state of the art. thus allowing more 5 precise techniques to be used, for example, for computing motion blur in full) animated scene, as in a production setting or the like It will be apparent from the above discussion that the described bounding volume hierarchies have significant ach antages when compared with 3D-trce& and other techniques, particular!) _' in hardware implementations and fot huge scenes In an amortized analysis, the described bounding K) volume hierarchies outperform cυirεnt 3D-trees fa) at least a factor of two In addition, the merπoη, footpπnt can be determined beforehand and is lincai in the numbct of objects

II. Sorting Strategies for Rays Simulated Using Markov Chains and 15 Quasi-λionie Carlo Techniques

The following described aspect of the invention provides methods, techniques and systems for efficient photorealistic simulation of light transport, suitable for use in computer graphics and other applications in which a computer or other digital processor

20 is used to calculate pixel \alues foi pixels in an image, using iav-uacing techniques to calculate rav-swface intersections

In particular, the present invention provides improvements to conventional ray -tracing techniques, including methods for making ray-surface intersection testing more efficient, by applying sorting strategies that enable rapid processing of objects

25 w ϊthin a scene illuminated by packets or units of light, such as photons, the trajectories of which aie simulated using Markov chains, the Markov chains being simulated using a suitable quasi -Monte Carlo (QMC) methodology The sorting strategies include pjoximitv sorting, and the i m ention advantageous!) _' exploits the ϊov. dimensionality of the QMC methodology using such proximity sorting

30 The described techniques and systems can be used to solve the integral equation in close to real-time, a task that has presented significant difficult) in conventional systems

Previously, using qua si -Monte Carlo methods to simulate Markov chains, e g , trajectories of photons, provided only moderate improvements ovci random sampling

35 However, QMC methods in accordance with the present invention can exploit the

- 19 -

intrinsic low-dimensioua! structure of the solution operator, yielding much more efficient techniques.

The description of this aspect of the invention is organized as follows

5 1. Markov Chains and Quasi-Monte Carlo Techniques

2 , Simultaneous Si mυlati on of Markov Chains

2 I Analysis of a Deterministic Technique hi One Dimension 2.1 1 Simplification of the Technique 2. ! .2 When and Why It Works 10 2.2 Simplified Technique in v Dimensions

2.3 Randomization

2.4 Array -Randomized Quasi-Monte Carlo 3 Application to Light Transport Simulation

3.1 Fredholm and VoI terra Integrals 15 4. Sorting Techniques and Systems

4 1 Norm of State 4 2 Spatial Hierarchy 4 3 Bucket Sorting and Space-Filling Curves S Results

20 6 Conclusion

7. General Techniques

J, Markov Chains and Quasi-Monte Carlo Techniques

In mathematics, a Markov chain is a discrete-time stochastic sequence of system

25 states. A Markov chain is a "memory less" process, in which previous states axe irrelevant for predicting the probability of subsequent, states, given knowledge of the current state. Markov chains can be used to model and analyze a number of different phenomena For example, the English language has been modeled using a Markov chain, based upon the calculation of the relative frequencies of one word following

30 another. Using these transition probabilities, it is possible to generate English sentences. Many other evolution processes can also be described using Markov chains, including, for example: the analysis of queues, Brownian motion, particle transport, and, of particular relevance to the present discussion, light transport simulation.

More particularly, Markov chains can be used to simulate the trajectories of

35 photons illuminating objects in a scene in a simulated image. Light transport simulation is characterized by complicated domains and discontinuities, and reduction to high-dimensional integrals. In addition, Sight transport simulation is characterized by a low-dimensional structure, due to repeated application of identical operators.

Markov chains may themselves be simulated using quasi -Monte Carlo (QMC)

40 techniques. The general idea behind QMC techniques is that, rather than attempting to

- 20 -

condition random! y generated numbers in some way, one starts with a deterministic sequence that is known to provide very uniform coverage of a selected interval. In randomized QMC, the sequence is randomized in such a way that the numbers have a random appearance, but as a group retain the desirable property of covering the selected 5 interval in a regular way.

There are a number of approaches that can be employed for using QMC to simulate Markov chains. One approach is to apply a high-dimensional, low- discrepancy point set. However, this approach provides significant advantages only for moderate dimensions. Another approach is padding of a low-dimensional, randomized

K) point set. This second approach is simpler to implement, and achieves results that are almost as good. A third approach combines sorting states and padding. This third approach results in a performance boost, in which the power of QMC is visible, and may be used in conjunction with both deterministic and randomized techniques. The implementation of this approach is relatively straightforward. i 5 Properties of processes can be estimated by averaging the contributions of multiple Markov chains. Instead of simulating each trajectory independently, it has been found that simultaneously simulating Markov chains using correlated samples and sorting the ensemble of states after each transition step can notably improve convergence.

20 Given an order on the state space, these approaches can be improved by sorting the ensemble of Markov chains. According to aspects of the invention described below, the use of deterministic approaches results in simplifications in the described techniques, and provides useful indicia as to when and why the described sorting results in increased performance. As further described below, some or all of the described

25 sorting strategies may be advantageously used to increase the efficiency of light transport si mul ati on ,

2. Simultaneous Simulation of Markov Chains

In order to increase efficiency, Markov chains may be simulated simultaneously. 30 The idea of improving the simultaneous simulation of Markov chains with quasi -Monte Carlo methods by an intermediate sorting step was originally introduced by Le ^' cot in a series of papers dealing with the Boltzmann equation and, later on, the heat equation. This idea was then used and refined for solving the heat equation on a grid by Morokoff and Cafiisch and recently extended by L'Ecuyer, Tuffin, Demers et a! , to incorporate

- 21 -

randomized versions of the technique and splitting for rare event simulation. Independent research, in a way related to the above approaches, lias been conducted in the fieid of computer graphics.

In the following discussion, there are described techniques embodying improvements 5 over the above schemes. Background for these improved techniques is provided, in part, by techniques described in Lecot et al., "Qυasi-Monte Carlo Methods for Estimating Transient Measures of Discrete Time Markov Chains," in H. Niederreiter, editor, Mom? Carlo and Ouasi-Monfe Carlo Methods in Scientific Computing 2002, pp. 329-44, Springer Verlag (2004), which is incorporated herein by reference as if set K) forth in its entirety.

The following discussion provides a conceptual framework as to when and why the described techniques are superior to prior approaches There are also described advantageous implementations of the described techniques.

} 5 2.1 Analysis of a Deterministic Technique in One Dimension

There is now described a deterministic technique for a one-dimensional problem. The technique presented in Lecot simultaneously simulates Markov chains on a discrete state space E with an initial distribution μ :— [μn^E 20 and a transition matrix

Using a (t, 2)-sequence {.< ^• , }>.?■ _. $.. in base h, λ' ^" ~ b ^m chains are simulated in parallel, where X _f ,j is the state of chain / at time step n and NQ is the set of natural numbers including zero. Further, the techniques requires that for m > t the N- b ^m subsequent components x _h\ form a (0. m, ])-πet in base b. As shown in Lecot, the technique 25 converges if the following criterion i s met: h ~ l > ^{■■ ■••} I

Vfr € E : ■ YlCw.'iJiJ<* 4L~f t - ¹ H- - Pi .

« ~ i. }> ~ i

X,..ι is the state of chain / at time step n. Thus, the simulation may be performed as follows:

n 0 initiate X,- v , usmg x _v » for 0 / >. N loop

For ,Y b ^' " chains, there is selected a {/, 2)~seqυenec x' in bat>e ft, t>ueh that for m > t, the h'" subsequent components r, _\ form a (0, w, 1 )~net

It will be seen that the Soboi' (0 _s 2}-sequence (φj(/). φ _> {/}} in base δ = 2 fulfills 5 the following rcqυitements

For m 3 > / 0, we !ia\ e λ ^r 2^ 8 and

^S . '.(/ ^ n B) }'\ ₍ {0 4 2 C 1 S 3. ?; fot π _* > _t > All pej mutations a?e identical

10 2JJ Simplification of the Technique

It will be seen that, assuming uniform distributions, stabie sorting does not change the state order In fad, all permutations, can be omitted because the convergence condition remains unchanged Thus, the above-described technique can be simplified 15 Consider the SoboP sequence

which is a (0, 2)-sequence in base b ~ 2 and fuϊfdls the assumptions required for the convergence condition to hold The original Sobol ^' sequence is defined in 1 Sobol\ "On the Distribution of Points in a Cube and the Approximate Evaluation of Integrais," 20 Zk I Yt hisL \ UiL i Mat. Fι~ „ 7(4) 784-802 f 1967), which h incorporated herein by reference as if set forth in its entirety A simple code example for

is provided in T KoHig and A Keller, "Efficient Multidimensional Sampling," ( Ompuier (inφhia horum. 21(3) 557 Se>3 (September 2002 >, which is incorporated 25 herein by reference as if set forth in iti, entirety

T he simulation itself starts at time step n ~ 0 initializing state

for 0 < / < λ ^' using .v, ; for the realization ofμ The technique then continues by sorting the states (described in detail in Section 3. below } and continues the chains by 30 computing

X λ

- 23 -

using

for O s / ^ λ' to realize transitions according to P The index

5 for selecting the next state for transition in fact uses the van der Corput sequence <f ^j ? in base 2, which is a (0, I )-sequence and thus a sequence of (0. »κ i)~nets For example, choosing m ~ 3 "- / ~ 0 we have <V ~ 2* ~ S and

for w _* ? γ« _O I fence all indices used during the different time steps « are in fact identical

IO for ail m

Assuming υnifoim probabifitics i ^■ J ^~ IT the convergence theorem still applies, but more important, stable sorting docs not change the state order It thus follows that, in fact, the index permutation can be chosen 15 as identity without touching the convergence conditions The same applies for selecting the initial states Xv , and it results in the following simplified, but equivalent, technique

♦ ii : 'J

« k)U(t

using oniy the second component of the {(). 2)-seqtιence x,- 20

2.1.2 When and Why It W orks

It will be seen that the improved comergence of the scheme, which has been observed in many applications, i^ caused by the structure of the samples

</M '' ^* }> ^■ - '"

25 used to realize the transitions of X,, , according to /' Hits can be understood by decomposing the radical inverse

- 24 -

which reveals an implicit stratification φ^(!) is an offset with spacing _κ depending

on the state number /, while the shift φs(ti) is identical for all the intervals at time step //

FlG i 1 shows a two-dimensional array 700 illustrating the shift <J\-(«)

Here, the low dimensional setting may result in a misleading interpretation of 5 the samples being a shifted lattice or stratified samples, as the entirety of the φ^.{f) for

/ 0, , . 2 ^' - S in fact must be an {0, ni, l)-nct and thus an equidistant set of samples. However, the good performance stems from the property that φ\{J) is a (/, v)-sequence

and thus a sequence of {/, m\ .v}-πets for any m' with / < m' ± m This means that /> ^'" states, that are similar in state space and therefore subsequent by order after sorting, !0 will sample their transition by a (t. m\ sj-net, which guarantees for good discrete density approximation The maximum impros cment would be obtained if all 2 ^m chains were in the same state The more the states of the chains are separated in state space, the smaller the performance improvements "will be

15 2.2 Simplified Technique in * Dimensions

Using a (t, s)-sequence in base h. which is a sequence of (λ m, y)-nets, the scheme also works in 5 dimensions Marko\ chains, whose states are similar after sorting are guaranteed to sample the transition probability by low disciepancy samples The simplified technique in s dimensions now looks like the following

• >; - f i

• mϊ u.thx X ₁ ; S M )", φ uk~-i \ fι -pfι CtU< < jxunts >

• ϊoo| >

:i . » !

( ^■ . ψi miic « hftijt by o> <^spfn inς X, , »^n,^ _" "filx-vqurni ^nii tph^ <-< froi o .>

Some simulations require trajectory splitting in order to capture certain local subtle effects While this alieady has been addressed using a more complicated approaches, it in fact can be achieved in a simpler way by just drawing more samples out of the (t, s)-sequence for states to be split

25 This is a consequence of the fact that it is not e\cπ neces^ai y to simultaneously simulate exactly b' ⁿ chains It is only important to draw subsequent samples from the (/. v)-sequence and to minimize the number b ^n> of points in the subsequent (/, m, .s)-nets

- 25 -

in order to enable the maxima) performance gain Hie choice of the (/, >s)~sequence. however, is restricted by the condition that (0. ^-sequences only exist for b ≥ s and that ni s t

Note that other sadscal inv ersion based points sets like tSie Hal ton sequence ot 5 its scrambled \ ariants fulfill properties similar to (λ "> )-scquences and will result in .similar performance gains FlG 12 shows a giaph 720 illustrating a Ualton sequence, points x", = \}\i{i\ φHO) 2 ^J1 the depicted Halton sequence, the radical imcrsion implies stratification, with subsequent points that fall into the biggest "holes ^" This properly is good for an> starting point. (/. ,s)~scqucnces have the same property

23 Randomisation

While there currently exists no general proof for convergence of the deterministic technique in higher dimensions, the technique becomes unbiased bv freshly randomizing the quasi-Monte Carlo points in each time step // Since this is. in

15 fact, an instance of padded replications sampling, the argument for unbiasedness becomes simpler. Randomization, howe\er, deserves special attention

The most efficient implementation consists of choosing a (λ .v)-sequence in base h 2, from which subsequent samples are drawn, which are XOR~ed by an .y-dimcnsional random v ector This random vector is freshly drawn after each transition

20 step ϊ I ou ever, as random scrambling changes the order in which the points are enumerated, the local propeπies of the sequences of (/, m. 0-nets arc changed, too

This observation can be taken as an explanation for .some of the effects seen using earlier approaches Sobol and Korobov points used in the array -(R)QM C simulation are worse up to an order of magnitude in variance reduction lhan their

25 transformed (Gray-code foi SoboS, Baker transform for Korobov) counteipaits The explanation for this is found in the structure of the points The sequence of (/, m. s}~ nets extracted from the bobol sequence is locally worse than its (irav-code variant 1 he same goes for the Korobov lattice and its tiansformcd variant

30 2,4, Array-Raiieloniiwd Quasi-iYlonte Carlo

The following is a randomized technique for an λ~dimensional problem

- 25 -

n :~ 0. select point sequence x _: instantiate randomization initialize X _$ j using rand(x<) for 0 < i < N loop

_* serf stsfs vector using some order

_* n ; ~ n 4- 1

_* continue chain by computing X,« _, .< using rar>d{jfj) foe 0 < / < N

The technique has a number of interesting properties. First the technique is unbiased for one tabulated set of x> and for a continuous stream of x>. Second, splitting and absorption are intrinsic when using a stream. This is tαte in a Russian roulette 5 scheme, and is also true when splitting by taking more continuous samples from x,<.

3, Light Transport Simulation

For numerical evidence, the above-described techniques are applied to light transport simulation for synthesizing realistic images. The underlying integral equation 10 can be reformulated as a path integral .

FIG. 13 is a diagram illustrating sampling transport path space 740 by bidirectional path tracing. Trajectories from the eye 742 and the light sources 744 to scene objects 746 axe generated by Markov chains and connected to determine the transported amount of light. The sampling path space, shown in FIG. 13, corresponds 15 to simulating Markov chains, where the paths are established by ray tracing and scattering events. The initial distribution is determined by the emission characteristics of the light sources 744 and the transition probabilities are given by bidirectional reflectance distribution functions on the hit surfaces.

To solve the path integral, there are at least two basic strategies, which are 20 either using high dimensional low discrepancy points or padding low dimensional, low discrepancy points. The latter approach fits the above-discussed findings, where high dimensional events are composed as subsequent transitions of a Markov chain, As measured, the difference to using high dimensional sequences for Markov chain simulations is small to none, but using padded sequences is computationally faster and 25 requires less implementation effort. It. is also simpler for practitioners in rendering industry.

In addition, the low dimensional approach allows for much better results, because the stratification properties of (t, s)-sequences or the Hal ton sequence and its scrambled variants are much better for small dimensions, as discussed above.

FSGS 14A-B and 15A-B are rendered images 760. 770, 780 and 790 illustrating differences between simulations using a Monte Carlo technique and simulations using an array-randomized quasi -Monte Carlo (A-RQMC) technique. The simulations using a Monte Carlo technique are shown in images 760 and 780 in FiGS, HA and 15A. The 5 simulations using an array -randomized quasi-Monte Carlo technique are shown in images 770 and 790 in FIGS 14B and 15B, Arrows at the top of each rendered image 762, 772, 782, and 792 illustrate emission location and direction ϊt will be seen from FIGS. ϊ4A-B and !5A-B that, in these examples, the light paths simulated using an array-randomized QMC technique are significantly more uniformly distributed than the K) light paths simulated using a Monte Carlo technique.

3.1 Fredholm and Volterra Integrals

The integral equation underlying light transport can be considered either as a Fredholra or Vo! terra integral equation, which matters for implementation

l . U'. w ) -- / , f1 1, { * »^.\ . .r, ^) L{ ι\ „ ^< /; ^v {, 'n < { \ ^■ .r } ^■ ~ , \ h,li

S~ s ,- i

15 is the radiance reflected off a surface in point x in direction ,.c, where the domain

S ^' i {.v) of the integral operator is the hemisphere aligned to the normal n(x) in .v.

FSGS 16A-D show a series of diagrams 800, 8 KJ, 820, and 830 illustrating 20 photon trajectories 802, 812, 822, and 832 that are started from a plurality of starting points 804, 814. 824, and 834 on a light source. Upon hitting a surface 806, 8 !6, 826, and 836 after tracing a ray, the bidirectional reflectance distribution function is sampled to determine a direction of scattering to continue the path

/, is the bidirectional reflectance distribution function describing the optical 25 surface properties and /. is the incident radiance Using this integral operator results in a VoI terra integral equation of the second kind, as the integration domain depends on x

on the other iiaud results in a Fredhoim integral equation of the second kind, as integration is being performed over all directions of the unit sphere <S ^' " independent of x. 30 Using the latter approach of generating global directions and rejecting directions with negative scalar product with respect to the surface normaJ nix) is computationally

attractive, while the first approach requites the generation of directions in the hemisphere that have to be transformed into the local surface frame, which is more expensive. Mappings from the unit square to S ^* or 6'£ Lϊ } are known in the art.

An even more important argument for generating global directions is related to the presently described approach: By sotting, it cart happen that two close-by surface

5 points with different surface normals, for example, in a corner, use subsequent samples of a (/, λ)-sequence to generate scattering directions. Generating global directions now works fine, whereas generating directions in the two different local frames using subsequent samples will destroy the low discrepancy properties. These discontinuities become clearly visible. Using global directions, however, does not allow for

10 importance sampling according ιof _r or the cosine term, which often is a disadvantage and deserves further investigation,

4. Sorting Techniques and Systems

It is possible to use sorting to increase the efficiency of the above-described ! 5 techniques As discussed herein, {/, x)~sequences % ^r( ϊ sequences of (/, m, *)~nets in base 6, and are similar to Halton sequences arsd their variants. Similar states X _M j use successive points. Therefore, states should be ordered by proximity in state space. The closer the states are in the state space, the more uniformly it will be explored.

In order to have the states as closely together as possible, they have to be 20 enumerated in an order such that the sum of the distances of neighboring states is minimal. This in fact relates to the traveling salesman problem, where for a given set of cities and the costs of traveling from one city to another city, the cheapest round trip is sought that visits each city exactly once and then returns to the starting city. This problem already was investigated by Euler in the early IStli century and unfortunately 25 can be shown to be NP-hard.

Our problem is very similar except for it is not necessary to return from the last state of the route to the first state Some techniques are already available to efficiently calculate approximate, but close to optimal, solutions for the traveling salesman problem. However, running times of these techniques have not been acceptable in

30 simulations, as the calculation of the distance matrix alone exhibits an 0(N ^" )

complexity, while it is desired to keep the technique as close as possible to the O{N) complexity of classic Monte Carlo methods.

- 29 -

In the following there are discussed a number of orders according to aspects of the present invention to achieve fast sorting for high-dimensional state spaces.

4. J Norm of State

The average complexity of quicksort is θ(N\og λ ^f ), but for certain scenarios

even O(N) techniques exist, such as radix sort, which, however, requires additional temporary memory. In order to use these one-dimensionaJ sorting techniques, the multi-dimensional state must be reduced to one dimension. Amongst many choices often some norm.

I X,...< j| is used to define an order on the state space. However; similar norms do not necessarily indicate proximity in state space. A simple example for this is similar energy of particles in a transport simulation that are located far away in space.

4.2 Spatial Hierarchy i 5 A second possibility to enable multidimensional sorting is the usage of a spatial hierarchy to define an order on the states. Efficient data structures for this purpose are the BSP-tree, its specialized axis aligned subset, the £D~tree, or bounding volume hierarchies. The construction of such binary hierarchies is simple. The space is recursively subdivided using planes selected by some heuristic. The construction runs

20 in 0(λ ^r Iog λ) on the average. Traversing the hierarchy in in-orcJer enumerates the leaves in an order of proximity. This traversal becomes trivial, if the tree is left- balanced and in consequence can be stored in an array

If a spatial hierarchy must be used anyway, for example to accelerate ray tracing, there is no additional construction time for the hierarchy The particles then are stored 25 as linked lists attached to the leaves of the hierarchy.

FIGS. ! 7A-G. 18A-B, and 19A-C are a series of diagrams illustrating that sorting the states into the leaves of a spatial hierarchy defines an order of proximity by traversing the hierarchy in in-order.

FlG. 17 A shows an exemplary scene 840, including a light source 850 and an 30 object 860 to he rendered. The object 860 is a five-pointed star. Sn FIG, 17B, the object 860 is bounded by an axis-aligned bounding box 870 In FIGS 17C-G, the bounding box 870 is iteratively subdivided into branches. Each branch terminates in a leaf.

- 30 -

In FiG. 1 7C, the bounding box 870 is divided into a first-generation left branch L and a first-generation right branch R by a splitting plane 87 ! passing through a first vertex 861 of the object 860.

In FIG. I 7D, the left branch L is divided by a second splitting plane 872 passing 5 through, a second vertex 862 of the object 860, resulting In a second -generation left branch LL and a second-generation right branch LR. tii FϊG. 17E, a third splitting plane 87.1 passes through a third vertex S63 of the object 860, splitting branch LR into third-generation left and right branches LRL and LRR

K) In FϊG. 17F, fourth and fifth splitting planes 874 and 875 pass through fourth and fifth vertices 864 and 865 of the object 860, As shown in FlG. 17F. the result is five leafs LRL, LRR, RR, LLL, LLR and RL, one for each of the five points of the star 860.

FiG. 17G illustrates the illumination of individual points within each leaf. In 15 FIG. ϊ 8A, each leaf has been shaded to indicate its luminosity. In FSG, 18B, the leaves are hierarchical ly ordered according to their respective luminosities. From "darkest" to "brightest," the leaves are arranged as follows: LRL, LRR, LLR, RR ₅ RL

Unfortunately the quality of this order is strongly determined by the quality of the spatial hierarchy used for simulation, which is especially problematic if the number 20 of leafs in the hierarchy is much smaller than the number of chains N as this results in several states being mapped to the same leaf

43 Bucket Sorting and Space Filling Curves

In order to guarantee linear time complexity, bucket sorting can be used In the 25 λ-dimensiona! extension of the simple technique sketched In Section 2.1 , multidimensional states were sorted into buckets by the first dimension of the state, then the states of each bucket were sorted into buckets according to the second dimension, and so forth. This procedure works well, but has the problem that states close in state space can be separated by the sorting procedure. In addition, a 30 stratification of each dimension has to be used, which induces the curse of dimension in the number of Markov chains to be simulated simultaneously

The state space is therefore into equal volume cells, or "'voxels," which serve as buckets. The bucket of each slate is found by truncating the state coordinates according to the resolution of the voxel grid. Note that this applies for continuous as well as

- 31 -

discrete state spaces. Enumerating the voxels by proximity yields the desired order on the states and can be done in linear time in the number of voxels.

Orders that enumerate the voxels by proximity are given by space tilling curves, such as the Peano curve, Hubert curve, or H4ndexing. These curves guarantee ever)' 5 voxel to be visited exactly once and an overall path length being relatively short. For problems with large geometry, which is the case in our own simulations, this can be even one of the few possibilities to generate fast and memory' efficient approximate solutions to tlie traveling salesman problem. However, these curves are rather costly to evaluate, need to be tabulated to be efficient or are not available for higher dimensions.

K) Fortunately, the Z-curve, also known as Lefaesgue-curve or Morton order, avoids these problems. Given integer coordinates of a bucket in multidimensional space, its one dimensional Z-curve index is simply calculated by bitwise interleaving the coordinate values, as shown in FIGS. 19A-C.

FiGS. 19A-C show a series of diagrams 890, 892, and 894 illustrates the Z-

15 curve 891, 893, and 895 in two dimensions for 2^2 (FSG, 19A), 4'4 (FIG. 19B), and 16* 16 (FIG. 19C) buckets. With the origin (0, 0) top left the point marked by ^x has the integer coordinates (3, 4), which corresponds to (01 1 , 10Qb i« the binary system. Its binary Z-curve index 10010 ϊ χ is computed by bitwise interleaving the binary coordinates.

20 This is straightforward and fast to compute for any dimension and problem size, and requires no additional memory. It has been found that the results are not as good as, for example., the Hubert curve in a global context. However, the average case partitioning quality and average/worst case logarithmic index ranges are comparably good Problems can arise in highly symmetrical scenes like Shirley's "Scene 6," shown

25 in FIG 2OA, used for numerical experiments ^' States on the walls parallel to the (λγ, > ^! )-ρlane will be sorted very well, but the states located on the other two walls parallel to the (y, ~)-piane will be visited by the curve in an alternating manner, which can lead to correlation artifacts in some scenarios.

30 5, Numerical Evidence

Following the above discussion, the Halton sequence was used, with permutations by Faure, randomized by a Cranley-Patterson rotation in order to have unbiased error estimates. For sorting, the Z-curve order produced the best results for the particular setting and was used for the following experiments. It is further noted

- 32 -

that it has been numerically verified that omitting the randomization has no notable effect on the precision of the results. In numerical experiments, four approaches to simulate Markov chains were compared.

MC; Uniform random numbers generated by the Mersenne Twister were used for classical Monte Carlo sampling.

RQMC: Used the iiigh-dimensionaS Fϊakon sequence with permutations by Faure randomized by a Cranley -Patterson rotation, where pairs of components were used to sample the two-dimensional emission and scatterin Og events. lo-dim Used the two-dimensional Halton sequence randomized by a

RQMCS: Cranley-Pattersαn rotation. The Z -curve was used to enumerate the bucket-sorted states, hi -dim Used the high-dimensional Haiton sequence with permutations

RQMCS: by Faure randomized by a Cranley-Patterson rotation. The Z -curve was used to enumerate the bucket-soiled states.

5 Sn a first experiment, the robust global illumination technique was used to compute the path integrals. FIG. 2OA shows a test scene "Shirley 6," which is a relatively simple light transport setting, and FlG, 2IA shows a test scene "Invisible Date," which is a more complicated light transport setting, FIGS. 2OB and 2J..B are graphs 901 and 91 S showing average RMS error to a master solution; FIGS, 20C and

10 21C are graphs 902 and 912 showing global variance (averaged over the whole image); and FIGS. 20D and 21 D are graphs 903 and 913 showing pixel-based variance. The numbers were obtained by averaging 10 independent runs for a varying number of Markov chains. The measured numbers only convince for the simple test scene, in even the complicated cases, no performance gain over Monte Carlo sampling can be

15 measured, because the number of independent runs is too small , More experiments were not possible due to excessive running times.

However, the visual error tells a dramatically different story as can be seen in FIGS. 22 A-C, where a clear superiority of the new technique in even very difficult settings becomes obvious. FIGS. 22A-C show a. visual comparison for the test scene

20 'invisible Date" using 300 chains for simulation. FlG, 22A shows the test scene rendered using Monte Carlo; FIG. 22B shows the test scene rendered using randomized quasi-Monte Carlo; and FlG. 22C shows the test scene rendered using randomized quasi -Monte Carlo with sorting.

- 33 -

In FiGS. 22A-C, the only light source Is not visible as it is placed on the ceiling of the neighboring room. Due to the better distribution of the photons randomized quasi-Monte Carlo (HQMC) (FIG. 22B) outperforms Monte Carlo (MC) (FIG. 22A) visually, as can be seen by the reduced shadow artifacts, RQMC with sorting (RC)MCS) 5 (FIG . 22C) (using 256 ^λ voxels for the bucket sort) is even superior as more photons made it into the second room and even the back of the door is Sit very well.

This ease is not an exception, but. can be observed for many test cases. It only emphasizes that standard error measures axe not appropriate error measures for visual quality, which is a known but unsolved problem in computer graphics.

K) FiGS. 23A-O shows measurements for a very difficult Sight transport problem, where photons were traced directly from the light sources and the final path segment was connected to the camera (one technique of bidirectional path tracing) FIG. 23 A shows a schematic view of the labyrinth test scene 940, where floor and roof have been removed for illustration. Tile camera is situated in the room at the bottom, while a light i 5 source is located on the other end of the connecting tunnel. The graph 950 shown in FIG. 23B is a "Box-and-Whisker" plot, showing the average amount (marked by - ) of the total radiance received by the camera for 1048576 simulated light paths for 50 independent realizations using each technique. The graph 960 shown in FIG, 23C enlarges the interesting part of the graph shown in FlG. 23 S. The table 970 shown in

20 FIG. 23D finally displays the deviation from a master solution using the Li - and L2~ norm (variance) for the total radiance received by the camera and received by each pixel (256x256 pixels resolution) averaged over the 50 independent realizations. In this difficult setting the new method (RQMCS) is clearly superior.

Opposite to the above measurements only one number of simultaneously

25 simulated Markov chains is considered Now a sufficient amount of experiments was computationally feasible and the superiority of the new technique became clearly visible.

6, Conclusion

30 It will be. seen that the above described systems and techniques have a number of advantages over the prior art. These include the following' simplification of the technique, which is intuitively focused on sequence properties; improved convergence; linear complexity; and application to light transport simulation. The above described

- 34 -

systems a.nd techniques move one step closer to formulating the technique as an integra!, at which point, it will be relatively straightforward to demonstrate consistency.

We successfully simplified the techniques to simultaneously simulate Markov chains and provided intuition when and why sorting the states can improve 5 convergence. In addition the technique no longer is bounded by the curse of dimension and there is no restriction to homogeneous Markov chains, because the simulation just can use transition probabilities P ≡P» that can change over time. Thus, the techniques described herein can be used to simulate inhomogeneous Markov chains, i.e , where P ≡ F ₁ , depends on the time step «.

IO Experiments have indicated that not ali (/, .^-sequences or radical inversion based points sequences are equally good. This deserves further characterization

The techniques described herein would be even simpler if rank- 1 lattice sequences couid be applied. The constructions thus far, however, lack the properties of (/, 5}-sequenccs that arc required for the improved performance. According to a further 15 aspect of the invention, it may be possible to construct suitable rank-1 lattice sequences to achieve the desired results.

As we are using global directions, i.e. integrate over products of spheres, it is also Interesting to establish connections to recent research in that direction.

20 7. General Techniques

FIGS. 24-30 are flowcharts of a number of techniques and sub- techniques according to the above-described aspects of the invention It should be noted that some or all of the following techniques and sub-techniques, including individual boxes and groups of boxes thereof, may be combined as desired to practice the invention, as 25 deseri bed herei n .

FIG. 24 shows a flowchart of a first general technique 1000 according to the present invention. The technique i000 includes the following:

Box 1001 : Simulate Markov chains using Quasi-Monte Carlo methodology .

Box 1002: Sort the states, 30 (May include sorting by state comprising proximity sorting.)

Box 1003: Simulate plurality of Markov chains simultaneously.

(Can utilize quasi-Monte Carlo points; can select quasi-Monie Carlo points using any of a Hal ton sequence, a variant of a Halton sequence, a lattice sequence or a {t,-s)-secjuence; can utilize any number of chains; can comprise adding trajectories on

- 35 -

the fly. as by trajectory splitting; can utilize Russian Roulette technique or other technique simulating particle absorption.)

FIG. 25 shows a flowchart of a further technique 1 ! 00 according to the present invention for simulating a plurality of Markov chains simultaneously for an 5 s-dimensionai problem. The technique 1 100 includes the following

Box I iOl : n : == 0

(Set n equal to 0 )

Box 1 102: Initialize Xm using quasi-Monte Carlo points x\

Box ! 103: Sort state vector using a suitable order K) Box 1 104. n ^::: n 4- 1

(Increment n by 1.)

Box { K)S' Continue chain by computing X _n J using subsequent quasi -Monte Carlo points x*.

(Can also utilize any selected set or sequence of points with any selected 15 randomization, and ordering of states by proximity in state space.)

Boxes 1 103-1 105 are performed in a loop.

FIG. 26 is a flowchart of additional techniques 1200, one or more of which may be performed in conjunction with the technique 1 100 illustrated in FlG 25. These techniques 1200 include the following. 20 Box { 201 . Sort by luminosity, or the norm of state.

Box 1202' Sort using a spatial hierarchy.

(Can include utilizing a binary hierarchy, or any of a BSP-tree, &D-tree, or other axis-aligned subset of a BSP-tree structure, or hounding volume hierarchies, or regular voxels; binary hierarchy can be constructed by recursively subdividing using planes 25 selected by a selected heuristic, and then traversing the hierarchy in in -order to enumerate the leaves of the hierarchy in an order of proximity; can also include left- balancing the tree and storing the tree in an array data structure to enable efficient traversal of the hierarchy )

Box 1203 ' Sort using buckets enumerated by at least one space fil ling curve 30 FIG 27 iβ a flowchart of a further technique 1300 according to the invention

The technique 1300 includes the following:

Box 1301 : n := 0, select point sequence .Y ₁

Box 1302 Instantiate randomization.

Box 1303 ' Initialize X^ using rand(x> ).

- 36 -

Box 1304. Sort state vector using some order.

Box 1305: n := n + 1

Box 1306: Instantiate randomization.

Box 1307: Continue chain by computing X _; ,,,: using rand(.v> ). 5 Boxes 1304-1307 are performed in a loop.

FlG. 28 is a flowchart of an additional technique 1400 that may be performed in conjunction with the FlG. 27 technique 1300, The additional technique 1400 randomizes Quasi-Monte Carlo points in each time step ικ and includes the following:

Box HOl : Select a U, .*)~sequence in base b ^::: 2, from which subsequent 10 samples are drawn.

Box 1402. XOR samples by .v-dimensional random vector, the random vector being generated after each transition step

FIG. 29 is a flowchart of a further technique 1500 according to the invention. The technique 1500 includes the following: i 5 Box i 501 : Bound object to be rendered by axis-aligned bounding box.

Box 1502: Recursively or iteratively divide bounding box into successive left and right branches, each terminating in a leaf, by applying splitting planes passing through selected points of object, thereby generating plurality of leaves.

Box i 503 : Order leaves hierarchically according to their respective 20 luminosities.

Box 1504: Perform bucket sort on matrix of selected size, divided into selected number of buckets.

Box 1505; Use a selected space-filling curve to proceed through matrix, wherein within an individual bucket, smaller space-filling curve used to proceed 25 through individual cells in matrix.

FIG. 30 is a flowchart of a further technique 1600 according to the invention. The technique 1600 configures a simulation to enable light-transport simulation for synthesizing realistic-appearing images, and includes the following:

Box 1601 : Reformulate underlying light transport-simulating integral equation 30 as path integral, such that sampling path space corresponds to simulating Markov- chains, in which paths are established by ray tracing and scattering events.

Box 1602: Determine initial distribution by emission characteristics of light sources or sensors, and determine transition probabilities by bi-directional reflectance

- 37 -

distribution functions and bi-directional subsurface scattering distribution functions for a surface in image.

Box 1603: Process transparency effects by utilizing path integral, initial distribution and transition probabilities.

5 Box 1604. Solve path integral with or without any kind of randomization, by utilizing either high-dimensional Quasi-Monte Carlo points or by padding low- dimensional Quasi-Monte Carlo points.

(Consider integral equation underlying light transport simulation either as a Fredholm or a VoI terra integral equation.)

K) While the foregoing description includes details which will enable those skilled in the art to practice the invention, it. should be recognized that the description is illustrative in nature and that many modifications and variations thereof will be apparent to those skilled in the art having the benefit, of these teachings. Tt is accordingly intended that the invention herein be defined solely by the claims appended

15 hereto and that the claims be interpreted as broadly as permitted by the prior art.

- 38 -

COMPUTER PROGRAM LISTING

CODE LISTING 2.2.1 voxel Triangl e : : T tra ns form f }

Point *ρ = (Point "Jthis; Vector ri3d;

Vector Ei abs ^::: Ei3d ^::: ;'p [ ¹ I J --p [ Q] } i (p[2]-pfOJ •;

10 / /// search largest component tor projection !ϋ~x, 1—y,2= uint€asi (n_abs.dx) &~ 0x7FFFFFFF; uintcast n abs . dys S= 0x7FFFFfFF; ij iπtCa&t 15 abs . dz) 0x7 FFFFFFI

// Degenerated TriaEigXes rciust be handled fset edge-signs)

15 if!!>{n abs.dx+n abs.dy-Hn abs.dz; > DEGEN TRI EPS ILON.) } //■!(...} > EPS; to handle KaN' s

!i cS - p [G] .x; pO.u - -p [O] . y;

20 pθ. v - -p [O] ,z; n.u=n.v = 0,Of; e f G] . u ^::: e f 1 ^' ] , u ^::: OJ .v ^: == ef :i J .v 1.0 !: ; return;

25 U32 axis = 2; i f {n .abs . dx .> n abs . dy ; if! n abs.dx n abs.dz) axis - 0 ;

30 else !:!: (ri ab.=s.dγ > n atas;.dκ;

.sz is ^::: 1;

Point p03d ~ piO] ;

Point ρi3d - PiI] ;

35 Point ρ23d = float. •: iπv - 2. Of/nSd ; axis e[0 .ϋ ~ ;p23d r PlusOneMod ^' ϊ [axis ^■■ pO3d [ Pl ^sOneKocl3 [ ax is ISV! e[0 ,v - (p23d r PlusOneHod3 [axis -1 ] I --pOad : PlυsOneModS [ ax e[I (pi3d f PIusOrieMod ^' i [ &y.ϊs ^■■ p 03 d [ P I vi s On eMocS 3 [ <α >: i s } > ^■

40 e[l . v (pI3diPl-usOneMod3;azis-*-lj } ~ρ03d[PlusOneMod3 [axis+Ij ! ; ⁴ t inv; t: i O.Sf; i: . u ~ n3d [ ^' PIus.0πeMod3 [axis] _m Xi v; n.v = n3d[PlusOneKod3 [axisíl ] ] *t_inv ρθ . u = -ρQ3d [ I-lu3θneMod3 [ axis ] } ;

45 p0. v ^::: -pO3d [ F1 u3θneMod3 [ax.i. d = pO3d[axis] + n.u*p03d[P.: ,.us:OtteMod3 [sy.i s ] j + ■i. v ^í p03d[Plus0neHod3 [axis t i.

- 39 -

CODE LISTING 2.2.2

IJ32 *idx ^::: pointer to face indices; Ej 32 ofs ~ projection case;

5 f:or t ^' U3.?. .Li ^™ nust t:riDat:a; ii; i.i.— , idx-hí) { float t = ^■ itriData (*idxj ,d - ray . from[oi:£ j

-- triDatό; f * ^' idxj , n . u*ray , from F Plυs0aeMod3 [ofs ] ]

-- triDatό; f * ^' idxj , n . v*ray , from F Piυs0aeMod3 [ofs-f1 ] j } SO / Jray.dfofs] + ^■ triData [*idxj . n.u* ray.cS ( PlusOneMod3 [ ofs] }

^■ + ^■ triData [*id.κ] .;i,v ^í >:ay.d [PiusGneMod ^' 3 [of3 KLj ] } ; i f (uii3-;Cast ft) -1 > uii3'.;Cast ( result , t v.sr. ) } //--1 tor t(>.Of

•":, ^r ont:j.i3ue; float, hi ■- t^ray.dJPIusOneModS [ofs] ] + ray . f r am [ Pis.jsOneMod3 [ofs j j 15 + triData [*idx1. pO.u; float h2 = !:-*ray.d[PIusOπeMod3 [ofsíI] ] í ray. f ro∑nf PiusOneMod3 [of s+I ]

^■ ( ^■ triDatό; f * ^' idxj .pO.v; float u ~ hl*triData[*idz] .ef 0] .v - h2 ^triData [*idx] . e [0] ,u; float v = h2*tciData[*idx] .e[l] ,u - hl*triθata [*idxj - e [1} . v; 20 float: uv ^::: u-f-v; if: ( (uiot-Cast !\ι) \ uim:Caat.{v) S υλ ntCas •: iuv) } > 0x^0000000) cont ϊ.ΏOθ; result. tfar - t; result. tri index = ⁺ idx; 25 )

CODE LISTING 2.3.1

30 Point ⁺ ρ = {Point * ■ S V. r. i Data [ t: r. i index]; int boxMinldK, boxMai-vIάx.;

// boxMinldz and bo/:Ma;-:Idx index the smallest and largest vertex of the triangle

.// in. the componerst; d.lr[0.] of the split plane 35 IHp[Oj [dir [OJ ] < ppj [di?: [0] ] } { if (p [2 j [ dir [0 j j <: p [0] [air [ 0] ] }

! S boxMinϊdx -~ 2; 40 boxlfexldx ~ 1;

} else

{ bo:-iMinTdκ ~ 0;

45 bo:iMax!dκ == p [2. } [όϊr. [Oj ] < p [ !i J [ <d i >: [ 0] ] ? 1 : 2; } } else

I )

50 if (p[2: [dir [0] ; < p [ 1.1 [dϊ r [ 0 j ] } { bttκMinI.dκ ~ 2; boH.MaxI.dx ~ 0;

55 else

3 boxMinϊdx - 1; boxM.szIdx === p [Z] [dir [0] ] < p[0Hd:tr[0]] ? 0 : 2; } 60 j

Z* If t:h<:; -irianq.is? is in t:he split: plana oκ cαEt;pJ.«;t:eϊ y on one si.de; o the split plane

-40-

is decided without any numerical errors, i.e. at the precision the triangle is ent«:ϊ:ed to the ienderring system. Using epsiloKS her* i.s wrong and not: necessary, 5 */

IJ: i ;p[boxMirϊXdxj [dir [Oj j ~~ split! &f, Cp [boxMaxIdx] ( di r [ Oj j --■ spli.':)} // in spli ^! J plane ?

Oω _^ split I terns + + ;

SO if (split < middie_split ) // put to smaller volume left naai divl beiπs t + ; el&fc s !

15 U32 r. - itemslist [υnsortecMoorder] ; right num divltems— ; it eras List fright nurα divlterαsl ~ itexαs List [left nurc ¹ , divltenas] ; itsKisList f Ieft _m r3uκi. _m CiivIteκi.sl ~ t;

} 20 } else i £ i p [boxMazIdx j [dir[O] j <~ split! // t.rianqϊ e complet.elv L«?fϊ ?

1 e1:t' _m ω0.x« _m divI t eτ!is++ ; else if (p [bo;-;KinX<:Jx. HdIr[O] ] >= split) // triangle core.ρle-tc ^j ly right ?

{ 25 i j Ώ s o r t -t: d b o r d e r - - ;

U32 t ~ it eras List [υtisorted border] ; r:icjht num di v.Tt&ϊns ; itsHisList [ri.ght_rnλia__divIteras ] ""' iterasLiβt [lef ^' t__r!uiτι_divlt ^" eiicιs] ; i terns Lis t. [ left; nαiti divltems i ~ u ; 30 } else

// and now detailed, decision, triangle raust intersect split plane . , ,

{

/ * .Vn. the sequel x»?e detftrraine whether a triatigie sbo;.ι.ld gc left and/or 35 ?:iqht, v:hej;e ws: already know that it: must intersect: the sp3.it: plane in.

.s i i lie segxcient.

All computations are ordered so that the more precise computations are done first. Scalar products and cross products are evaluated last.

In EOKK? situations it: rαay he πeceasary ;:o expand, the boundinq ho:-* by 40 an epsiion. This, however, will blow up the required merαory by large amounts ,

If such a situation is encountered, it may be better to analyze it πiiraericaily in order not to Else any epsiions .. ,

Arriving here we know that: p [ boxMa:-:ϊdx] [di.κ[O] j < split. < 45 pfboxMaxTdii] [dii: [0] ] and that ρffooκMidldx] [dir [0] ] Mn [p [boxMaxIdx] [dir f 0] } , p [boxMazIdx] {dir (0I ] ] ,

VJe also know, that the triangle has a non-empty intersection with the cur.reii;; 50 voxfc.i.. The t r: ianql* also caππob l.t<s in the sp3.it plarse, and its v e r t i c e s c a n n o t

J. }.<? on oris? side: only.

*/ iϊi'j boxMidldx = 3 - boxM«xϊdz - boxMinldx; // misainq index, found by 55 3 - 0 í l í 2

/* We now determine the vertex that is aioae on one side of the split plane .

D&peπdiEig on whether bhe lors&3. y vertex is on the !l<si:t or. right side, we h.sve to iaS^ei swap i^he dfc<;i.sion, whether the 60 triόiriqlfi should be go i tier to the left Oϊ: right,

*/ int Alone = (split < pffooκMidldx1 [dir (011) ? boxMinldx : boκMaxϊdx;

-41-

in KotAione = - Alone - boxMidldx; lit < p [fooxMiάldκj [di. r. TO] J } ^" i boxMaκldx : bozMinldx; s urn o f i dx ^::: 3 ^::: 0 -f- 1 + 2 f.loat dist split -- p fAIonej [dir [Oj j ; U32 swap.LR uintCast {dist: J »31; /'/ ~~ p [Alotie] [di. r. [ 0] j > split;

Z* Now the line: seqπtents connect ..ϊiq t.he .Lonely vertex with the remaining ^! :wo vertec.es

<sre intersected with the split p aae. ai and a2 are the intersection points ,

The special case "if (p [bαxHidϊdx fdir[O]j == split}" (yields a K / x, which could b& opbirtά £«hd.] does not help aI: a 1 since .it only can happen as; often as the highest: valence of a vertex of the mesh

15 v / float at — dist / (p [boxMidldxj diriϋ] j - p Alone] [dir[0] j ! float at2 - dist / (p[NotAlonej di r [ 01 j ^■■ p Al on e ] \ di r [ 0 ] ] ) float aiz - (» ifooxMidldz] [άir[i ] - p [Al one [di£[ll * at; rloat a Iy piboxKidldz] [dir[2 j - p [Alone [dic[23 * at;

20 float a2x p f Not.α.i.oi38 ] [ d.i. E: [ 1 ] -- pLAloEiej d i. r. [ 1 ] j float: i\2γ ~ (p [Kot.Aiorje] [di r [2 ] ] - p [Aloriej [di. r. [ 2] I ) ^k ' ar.2; // n is a vector, normal to the line or: intersection ala2 ot the triangle

// and the split plane

25 float nx — a2y - «ly; f1oat ny - a2x ^■■ a 1s;

.// The s.i.gns indicate the qαadraπl: of the vector riorasal to the intersection line

U32 nxs ^~ uintCast {nx) >>3i; // ^~~ (nx < 0.01}

30 U32 nys ~ uintCast {r.-y} »31; // ~~ {ny <" 0.0 E:)

/ ^' * Nuraexicai precision: Due to cancel1acion, floats of approximately same exponent- should foe subtracted first, before adding something of a different- order of

35 jTUKjtii t.ude . All k:-?:.:!cket.:s .L:n tfiB sequel are; ESSSHTIAL for πisπierieai preci sion .

Change them and. you will see more errors in the BSP... pMin is t.he lonely point in the coordinate system with the origin at b33ox.bMir.Hax [01

40 pM&x is the lonely point in the coordinate system with the origin at bBoκ.bMinMaκϋl] float pMirsK p [Aloriej [d.i. r. - bEoz.bM.: ..IiMcSX[Oj [dir [.1 j float pM.i.rsy p [Aloriej [d.i. r. - bBoz.bH.: ..IiMcSx[Oj [dir {2\

45 float pM&zα P [Alofie j [d.i. r. -- bBox.bM. ,.ϊiKax[lj [dir [ !i j float p Ma xy p [Alone] (air ] - bBox.bMinMax[l] [dir [2} ] ; // Determine coordinates of the bounding box, however, with respect to p í ai being the origin, float: boxx [2] ;

50 float: boxy [2] ; boxx [Q] ^::: (pMinx + a Ix * nz; b OKy[Oj ~ fpMiny í a. Iy * ny; bo:»::-:.[lj ~ fpMaxx í a. Ix * ox; bcκy[ij = ipMaxy í a Iy

55 f* Test, whether line of intersection of the triangle and the split plane passes by the bounding box. This is done by indexing the coordinates of the bounding box by the quadrant of the vector tiormai to th<s line oϊ: i.:ri:.*srs<h<;kic>Ei. In fact

60 this is the nifty implementation of t.he 3d !:est. iiit: ^oduced by in t:he book vcii:h

Haines t

-42-

" Rea1-Time Renάe ring"

Bv the ind«hx.3.i3<} th<= veπ:.:tc6s ar.e selected, wS'n.oh ar% farthest i: ϊ: OEγE the 1 i.nt. ,

Rote that, the tri angle CANNOT completely pass the current voxel, since 5 .Lt: must: have

A nonempty intersect:..! on wi;:h. it..

U32 results; if (pMinx -: ^■ MAXϊalx,a2x; < O.Ofi /./ Irne segment of intersection ala2 IO left of box r&au.LtS ^::: uiπbCast (pMinx) >>31; el&«: i J: { pMir-y -t- MAX ( alγ, a2γ ■ < 0.0^:} /./ I .in* segment of intfe.t:s;ection ala? below bo>: results ~ uint.Cast: fpM±nyi »3.1 ; 15 else if{ρMazz í MItHaIx, a2x] > O.Ofϊ // line segment of intersection ala2 right of box results ~ (pMaxz > 0. Of; ; else ifi ^' pMaxy + MI ^' H (aiy, s2y) > 0,Of) // line segment of intersection

3.ia2 above box 20 results ^::: [pMaay > ii.ϋi); else i i. i boxx [ I ' ^" ' nxs j > boxy [nys } J

// .1. iϊse passes beyond bbox ? ~> triangle can only b«- on one side results = (aly-a2x > alx*a2y; ;

// sign of cross product al z a2 is checked to determine side 25 c:lse if (foc:-:x[rιX£.3 <' boxy [ I ^λ nys] } results ~ {a:[y*a2.x < aJ.x*a2y) ; else

// Ok, Eow the triangle must be both Ielit SλVA right s 3D βtackList [ c.u rr S t.β- ck.Tt:«?ras+í j ~ i ;:emsL.i.si t: [ 1 eft ϊHJϊϊI dlvlt:«?ras] ; iinsor teά _^ border ; items Lis t [left _^ tiutT^cii vl terns] ~ itemsList \ unsorted _^ border] ; continue;

! 35 IHswapLS !==: /* ^λ */ χesuits;

{ uns orteci__foo rcie r- - ;

U32 t = items Li st [unsorted__border ] ; right: otim d.i.vT;:ems — ;

40 11 ems L i s t [ r i gh cjiusi _^ d i vl t ems j - it e;τis L i s t f.1 e i t _^ nuro _^ cii vl z ems j ; items Li s t [left _^ uutr^clivl terns] ~ t;

1 else

.Left nust άϊ vlt.α-.msíí ;

45 }

-43 -

CODE LISTING 2.4.1

Intfc r:sectior- Boundary : : Int.* csect {Ray SMy! / / r.ay .tf.At: is "hanged J j

5 // Optimised inverse Ca-LCrJIaLiOn (saves 2 of 3 divisions) 'ileal:, ir.v -::mp -~ ( Ka y.d . dx* ray . d . dyϊ *ra_y . ci ,dκ ; if ( iiiintCast (inv _^ trφ) ά0x7FFFFFFF; > uintCsai; ξ DI V _^ EFSILOR) !

3 inv__ttnρ ~ I. Of / inv _^ tπψ; i 0 ray . inv_d . άx - i ray . d . dy* ray. d . άz S ⁴ inv_tmρ; ray..3.13v d.dv ^::: i ray .d . dx*ray .d . ci:.;} *iπv tc«p; ray..3.13v d.dx ^::: {ray .d . d>x*ray .d . dy) *iπv tc«p;

1 else

15 { ray. inv d , d;-i = { (uiπtCast f ray . d . dx } & 0x7FFFFFFF} > uintCast (DXV EPS I LOR) } ?

(X. Of / ray.d.dx) : INVDϊR _m LfJT[υintCast uay.d.dx} » 31] ; ray.inv d.dy - ( ^' fuir.tCast (ray, cS.dy} liOx^FFf FFE-F! > 20 -ri.ntCast (DIV EPSILON) } ?

(l.ϋ£ / ray. d.dy) : INVDIR IATT [ui nt.Cas f: i ray . d .dy) » 3I] ; ray. i.ωv _^ ci.d?: " ( (uiπtCaβt ir-ay -d . cizi SO^VIfFFFFFFf > uintCast (DIV _^ EPSILON) } ?

(I. Of / ray.d.άξ! ; INVDϊR LUT[uintCast (ray.d.dz) >ϊ- 311; 25 )

In t e r s ft c t i o n result; r: e s Ei i t .11: a J: ^::: r a y . ■; fa r ; res u 11. t £ i__i rid e >: ^■•• - 1 ;

// 30 //BBoi-:-Check

// float tnear ~ 0.Of; worldBBox, Clip { ray, tnear) ; if iuintc.ss" (ray.tfar) ^ 0jv?fr7θ43305 //ray , r,far- ^~ 3.3e3θf //M j 5 r: e !: tirn ( re s u 11. } ;

U32 current__bspStack = 1; //wegen empty stack case -- 0

U32 node - 0;

// 40 //BSP-Travϋrsal

// const U32 whatnode ( 3] ~ { {υintCast {ray , ir.v d.dx!>>27) & siseoJ: {B3PϊIODELξAF) ,

(υiπitCaKt (ray. inv d.dy;»27) S siseof (BSPMODELFAF! ., 45 [UiKtC-ASt ir.ay.±r-v ^" ά.dz)»?/?) s s.i.jceof (BSPKOOELEαF) } ;

U32 bspStackKodε [ϊ28] ; float bspStackFar [ 126 } ; float bspstackNεar f 128 ] ; bspStack.N&ar[01 == ^: --3.4838f; // sentinel

50 do ^' {

//ϊst. Node ein L«af (t.yρ«;<0) od;E: r our ne \/c-?r zwe.l.qimq (t.yp«>~0) while ■ { (ESPNODξLEAFSihspNodea [node] ; -type >-■ 0) i 55 / /Split-Dimerision { x j y I z »

U32 proj - ( (BSPNOOSLEAFi tbspKodes [node] ; . type & 3; float distl - ( { (B3PϊJ0DELEAF& bspϊiodes [node] } . split 3. r. [ whatrsode I pro ^; s ] >>-1 i

- ray . f ϊOEπ [pro") 5 *ray . i tiv ci [pro"; ] ; 60 float distr ^: == i { (BSPKOωELEAF& h SpF; ode. E [:oode] } .Eplit.lrj _. {vjbaLnode (pro j ] >>4 ) ^λ i]

- ray , f rom(ρroj )*ray,inv d[ρroj1;

-44-

node = i ( {BSPNODELEAFS; bspNodes [node 1 ) , type - proj} \ whatnode (pro j 1 ; //type & OXFFFFFFFO i f (tπeai <=== distl } j

5 J. f i iaγ. t f: ,;s r >= d.Ls-:r;

{ bspSt ackNear [ current; _^ bspS S;ac.k.] -~ MAX i tnear, distr) ; bspStackNode [cur rent _^ bspStack] ~ node ^γλ siseof (BSPNODELEAF) ; bspStackFar [ curreαt _^ bspStackl ~ ray.tfar; SO currerst_bspStack-fí; } ray.tfax == ^: MIH i ray. t ;:a r. , dist.l.) ;

1 else

15 if ( ray . tfar >= dist r » { cosar ~ MAX (tnea >:, distr ■ ; node ^λ ~ si^eof (BSPNODELHAF; ;

20 } else goto s;:βck?op;

)

// 25 / / Fa ces - Inters <sct;

, , . code oral tt fed , ..

//Hit gefunden? 30 /7 do //!! NEEDS bspStackNeariOj - -3,4e38i; !!!!! f stackPop: c\3 r r ent fosp5 tack ;

35 t:i3«?tir ^::: bspSt.ackKear [ _' ::u.r: ϊ:eot:. bspSlϋ-et] ;

] villi 1 & { resul t , t: !:.s r. < tnea r. } ; if (Cvirrεnt__taspStack — = 0» return ( result ) ; node ~ bspStcsckϊJodi:: [current: b,«pS*:.β.ck: ; 40 ray.tfar ~ bspS _^ ackFsr i ^' cu.rrent- _m bspStack] ;

} while (true) ;

-45-

Previous Patent: APPARATUS AND METHOD FOR THE TREATMENT OF TISSUE WITH ULTRASOUND BY DIRECT CONTACT

Next Patent: A BRACKET FOR ENGAGING AN EMERGENCY COT WITHIN AN EMERGENCY VEHICLE