## Abstract

Distance field representation of objects in 3D space has several applications such as shape manipulation, graphics rendering, path planning, etc. Distance transforms (DTs) are discrete representations of distance fields in a regular voxel grid. The two main limitations of using distance transforms are that they are compute-intensive, and there are errors introduced while representing the object using DTs. In this work, we develop a hybrid graphics processing unit (GPU)-accelerated marching wavefront method for computing DTs of models composed of trimmed non-uniform rational B-splines (NURBS) surfaces with theoretical bounds. Our hybrid marching approach eliminates the error due to calculating approximate distances by marching. We also calculate the bounds on the error introduced due to the tessellation of the trimmed NURBS surfaces and calculate the propagation of these bounds in computing the DT. Finally, we present computation times for both 2D and 3D GPU DTs of test objects. We show that our GPU-accelerated approach is significantly faster than existing CPU-based methods.

## 1 Introduction

Distance fields have been used extensively for a wide range of applications such as shape manipulation [1], shape and volume representation [2], constructive solid geometry (CSG) Boolean operations [3], graphics rendering [4], path planning [5,6], collision detection [7–9], morphing and sculpting of shapes [10,11], surface offset computations [12,13], etc. Computing and representing the distance field everywhere in 3D space might not be computationally tractable. However, one of the key properties of the distance fields is that it is *C*^{0} continuous. Hence, a discretized distance field or distance transform (DT), where the minimum distance is computed in a regularly spaced grid, is more common in applications using distance fields. The discrete representation of distance fields is analogous to a digital representation of a continuous field or a voxelized representation of a smooth object. Due to the continuous nature of distance fields, the distance field value at any point can be approximated using interpolation from the regular grid of values with bounded error.

**S**(of dimensionality < = 2) and a point

*P*in $R3$, the minimum distance from

*P*to

**S**can be defined as

**S**can consist of any topological entity such as points, edges, or faces that are homologous to $R0$, $R1$, and $R2$, respectively, in 3D Euclidean space. The distance function

**D**can be any distance norm; most standard applications make use of the Euclidean distance norm. Given a closed regular set in $R3$ (an R-set), the boundary of the set can be considered as the set

**S**of objects. The distance field (DF) represents a field of minimum distances to

**S**from all points in $R3$. A similar definition holds true for distance fields in $R2$, where the set

**S**can consist of points or edges that are homologous to $R0$ or $R1$, respectively.

Significant work has been done to compute the DF and DT of polygons, triangle soups, and parametric surfaces by using a multitude of methods such as fast marching methods [14], graphic rasterization methods [15], axis-aligned bounding box tree-based search [16], scanning based methods [17,18], and jump flooding [19]. However, most methods to compute the distance fields are computationally expensive, restricting their efficient use. We discuss several related works for computing distance transforms with and without any acceleration in Sec. 2. However, to the best of our knowledge, these works do not consider bounds on the computed distance values due to the CAD model representation. Specifically, the de facto representation for CAD models, trimmed non-uniform rational B-splines (NURBS), is not directly used for computing the DTs. These are usually tessellated into triangles, which are then used for computing the DT. In this work, we quantify the error in the DT caused by tessellating the original trimmed NURBS representation of a CAD model.

Computing the distance fields of CAD models made of trimmed NURBS surfaces is challenging, and most existing methods do not compute the distance fields directly. A standard approach to deal with such CAD models is to tessellate the trimmed surfaces into triangles and perform the distance computations. The major limitation of computing exact-DFs of trimmed NURBS through tessellation is the presence of cracks [20] between triangles of adjacent trimmed NURBS surfaces. These cracks or gaps in the tessellated model render the model non-manifold or not watertight. Methods based on ray-tracing usually fail for non-watertight models. Several previous works have a focus on creating a watertight tessellation of CAD models made of trimmed NURBS surfaces by tessellating the common edges or healing the cracks between adjacent surfaces using thin triangle strips [21,22]. We deal with gaps in our approach by first voxelizing the model and then computing the DT of the voxelized model. Our initial NURBS tessellation is computed to be smaller than the voxel size, which effectively deals with small gaps. In addition, we compute the bounds based on the resolution of the NURBS surface tessellation and use it to compute the bounds of the DT.

One popular approach to compute the DT is to use a marching algorithm starting from the boundary of the object. However, there are two main disadvantages of the marching approach. First, the marching process is inherently serial, which is slow in the absence of any acceleration approaches (such as bounding volume hierarchy). Second, the marching introduces an error in computing the distances in the absence of any correction (Please see the Appendix for a detailed explanation of the marching error). In this work, we perform the marching operation in parallel on the GPU. We devise a GPU parallel algorithm that accounts for the race condition during marching. In addition, we use a “hybrid” approach to marching, where we store the distance to the closest boundary and the index of the closest boundary voxel while marching. This hybrid approach overcomes the error due to marching. Finally, we also compute the error due to tessellating the NURBS surfaces and use it to calculate the distance bounds. Thus, our approach is both fast and provides theoretical bounds for the distances, making it practical to compute the DT of real-world CAD models represented using trimmed NURBS surfaces.

In this paper, we use graphics processing units (GPUs) to compute DT for CAD models consisting of parametric surfaces (specifically, trimmed NURBS surfaces) as shown in Fig. 1. We first explain the GPU accelerated computation of distance transforms using our hybrid marching wavefront method (see Sec. 3). We then analyze the distance bounds for a trimmed NURBS surface representation of the CAD model. The specific contributions of this paper include the following:

Hybrid GPU-accelerated distance transforms for trimmed NURBS objects using a marching wavefront algorithm that eliminates the marching error. The error in the final DT is only due to tessellation of the surface.

Theoretical bounds for the computation of distances from tessellation of trimmed NURBS surfaces, which can accurately bound the distance transform.

The rest of the paper is arranged as follows. We discuss some research works directly related to our approach in Sec. 2. We then explain our hybrid marching algorithm in Sec. 3. We show the calculation of the bounds on the DT based on the NURBS representation of the object in Sec. 4. Finally, in Sec. 5, we provide some experimental validation of the bounds and computational time for DTs of different objects.

## 2 Related Work on Distance Fields

The work done by Rosenfeld et al. [23] provided the preliminary idea of computing distances or performing distance transforms in a digitized picture. Initially, distance transforms were performed in two-dimensional image space [24–26]. Recently, with the advancement in computing power and the need to represent volume data, researchers are focused on performing distance transforms in a *volumetric* or three-dimensional grid [27]. To accomplish this task, several algorithms have been developed such as chamfer and vector distance methods [28–31], adaptive and complete methods [2,32], level sets and fast marching methods [14]. In the chamfer distance transforms, the minimum distance from the boundary of the object is computed from the center of a voxel to the minimum distance from its neighboring voxels, and then, this value is marched according to a marching scheme (such as wavefront or sweeping schemes). In vector distance transforms, a Euclidean distance from the voxel center to the boundary of the object is computed, and the minimum of those distances is considered for the transform. This results in increased accuracy of the computed distance field. Sethian [14] developed a fast marching method to compute distance transforms that use Eikonal equation solutions at the boundary for advancing fronts. Frisken et al. [2] used the adaptive subdivision of voxels employing an octree structure to develop an adaptive distance field representation. The method developed by Xu et al. [33] uses winding numbers to compute signed distance field for triangular meshes that also includes non-manifold meshes by offsetting the non-manifold surface distance transforms to fill small gaps and holes in the meshes. Further, several other works use tree-based traversal for computing distances quickly and efficiently [16,34]. In this paper, we implement a combination of chamfer and vector distance transforms, with a hybrid wavefront marching scheme that stores the indices of the closest boundary voxel rather than the distance itself to compute the minimum distances.

Additionally, there have been considerable efforts in accelerating the distance field computations using optimized algorithms and graphics hardware (GPUs) [35,36]. Sud et al. [15] presented a fast distance field computation algorithm that uses graphics hardware features such as culling and clamping to effectively compare the distance fields. Cao et al. [37] and Bastos and Celes [38] used GPUs to compute vector distance fields and adaptively sampled distance fields, respectively. However, accelerating chamfer and vector distance transforms is challenging on the GPU, since they involve sorting the voxels according to priority queues, which is difficult to implement in data-parallel GPU architectures. Recently, Zampirolli and Filipe [17], Man et al. [39], and Manduhu and Jones [40] perform exact distance transforms using GPU acceleration for 2D images. However, the algorithms described are limited to application on 2D images and are relatively difficult to implement. In this work, we accelerate the process by computing each march of the voxels from the object boundary, as parallel operations, using the GPU. Our algorithm naturally maps to the GPU architecture, which makes implementing our method relatively straight-forward.

To the best of our knowledge, there is little on computing the bounds for distance fields. Schneider et al. [41], Rong and Tan [19], and Cignoni and Sochor[42] compute approximate distance fields using GPU algorithms. They provide theoretical bounds on the computation of approximate distance transforms as compared to actual distance fields. However, they do not take into consideration the error due to CAD model representation. In this paper, we account for both the errors by using a hybrid approach to correct the error due to marching and quantifying the error due to the CAD model representation. To the best of our knowledge, this is the first work to compute distance bounds and computation of distance fields for trimmed NURBS representation of CAD models.

## 3 Hybrid Marching Algorithm

We compute the distance transforms of manifolds (of dimension < = 2) embedded in 3D space or a curve embedded in 2D space. Without loss of generality, we assume that these are the boundary-representation (B-rep) of a solid model in 3D or the closed boundary of a shape in 2D. The indicator function value (0 or 1) is used to represent if any part of the object is contained in a voxel. Our marching algorithm to compute the distance transforms starts from the boundary voxels. Boundary voxels can formally be defined as follows:

*For given a set of voxels V in a grid with each grid cell cubic in size g, the set of boundary voxels V_{B} is defined as*$VB\u2282V:{vb\u2208VB|v=1\u2203v\u2032\u2208neighbor(vb)|v\u2032=0}$.

The function *neighbor*(*v*) depends on the definition of neighborhood of a voxel in a voxel grid. In 3D, this could be defined as the 6 neighboring voxels that share a face with the current voxel, or the 26 neighboring voxels that share a face, edge, or vertex with the current voxel. We use all the 26 neighboring voxels for calculating the distance transforms in 3D. Similarly in 2D, we use the eight neighboring pixels. We define the distance transform function as follows:

*The distance transform function*$f:{0,1}d\u2192{Rd}\u2200V\u2208{0,1}d$

*, where d refers to total number of voxels in the grid, such that*

In other words, the distance between each boundary voxel and the voxel *v* ∈ **V** is computed, and the minimum of these distances is assigned as the distance transform of the voxel *v*. This is done for each voxel in the grid. A naive implementation of this algorithm has a work complexity of $O(d.|VB|)$, |**V _{B}**| refers to the size (cardinality) of the set of boundary voxels.

We use an hybrid marching wavefront scheme to compute the distance transform. Note that we are only computing the unsigned distance transforms here; hence, we do not use any more information relating to the inside-outside of the B-rep solid model as computed in Ref. [43] (thus giving us the ability to compute the unsigned distance fields for a voxel grid corresponding to non-manifold entities that need not represent a solid model).

### 3.1 Algorithm

#### Distance Transforms (DT)

**Input:**$V$, $nx$, $ny$, $nz$, $g$

**Output:**$D$

1 Initialize: $D0\u2192\u221e$

2 Tessellate NURBS

3 Compute bounds on boundary voxels

/* below for loop is the several marching kernel calls */

4 **for**$m=1:floor(nx2+ny2+nz2)$**do**

/* below for loops are executed in parallel in GPU as a kernel */

5 **for**$i=1:nx$**do**

6 **for**$j=1:ny$**do**

7 **for**$k=1:nz$**do**

8 Boundary Check:

9 **if**$V(i,j,k):$**then**

10 $Dm(i,j,k)=BoundaryDist(i,j,k)$

11 $idx(i,j,k)=(i,j,k)$

12 Marching Wavefront:

13 $dists$ = ComputeDistances(i,j,k)

14 $idx(i,j,k)=argmin(dists)$

/* compute actual euclidean distance from the boundary voxel to (i,j,k) */

15 $Dm(i,j,k)=||(i,j,k)\u2212idx(i,j,k)||\xd7g+$$Dm(idx(i,j,k))$

16 **if**$||Dm\u2212Dm\u22121||<tol$**then**

17 return $Dm$

The algorithm to compute the bounded distance transforms consists of two steps: (i) computing the distance values and the bounds for the boundary voxels and then (ii) propagating that distance value and bounds to other voxels. The minimum distance from the object boundary to the voxel center in case of a boundary voxel is computed based on the boundary representation. For a voxel-based representation, the bounds in distance to object boundary are $(\u2212g/2,g/2)$. The nominal distance value of the boundary voxels is considered zero, and hence, all the boundary voxels are assigned this value. However, we account for the error in computing the distance from the object boundary to the boundary voxel center for different representations. We cover the case of trimmed NURBS representation in Sec. 4. We have provided the bounds for a triangle soup representation in Appendix B.

Once the distance for the boundary voxels is computed, we propagate the distances from the boundary voxels by one neighborhood of voxels (based on the neighborhood definition) in each pass (see Fig. 2). For example, in 2D, the marching can be performed in two directions, orthogonal and diagonal. Examples of cases where both directions of marching is possible are shown in Fig. 2. Although the distance obtained from the orthogonal marches might be temporarily assigned to a voxel, in some cases, the minimum distance might be in the diagonal direction, thereby incurring a temporary error in the distance fields. We correct this error by propagating the closest boundary voxel index instead of the distance value. At each pass, the field value computation marches by one neighborhood ring. The algorithm stops if the total sum of the distance field values do not change for two successive iterations or a maximum number of marches of $floor(2\xd7max(nx,ny,nz))$ is performed.

One of the significant disadvantages of performing marching based approaches is the error accumulated because of performing orthogonal and diagonal marches (see Appendix A for details). To circumvent this issue, instead of storing the minimum distance computed at every step, we store the *argmin* of the distances (i.e., the location of boundary voxels leading to the minimum distance). This approach allows us to correct the error accumulated due to the orthogonal and diagonal marches. In Sec. 4.2, we prove that this approach computes the correct DT.

### 3.2 Graphics Processing Unit Implementation.

The individual computations for each voxel can be computed independently in parallel in our marching algorithm. We map the voxel grid to the 3D block and grid structure in NVIDIA CUDA [44]. The GPU kernel computes the distances for each *i*, *j*, and *k* voxel. The kernel is called once for each pass with multiple passes for the marching. Several neighboring voxels might contribute to the minimum distance of a specific voxel. Still, not all the contributing neighboring voxels have the correct minimum distance value computed yet, which causes a race condition shown in Fig. 3. To overcome this race condition, we make use of a *ping-pong* approach where, we store two copies of distance fields at each pass, the current and previous distance fields (or *DTM-A* and *DTM-B* in Fig. 3). We use *DTM-A* as the reference for performing all the distance comparisons and assign the computed distance values in *DTM-B* (hence, avoiding the race condition). We then synchronize the threads after each marching pass and then swap the previous and current pointers to the distance fields. In each kernel call, the distance values march by one layer of voxels, and we check if a particular neighboring voxel distance value has changed by the march and recompute the minimum distance using the newly computed distances for the neighbors and update the distance value for the voxel in *DTM-B*. After each march, we check for the change in the sum of the field values from the previous march and stop if the change is smaller than a user-defined tolerance. The complete algorithm is shown in Algorithm 1.

Another important aspect of the implementation is the memory complexity of the proposed algorithm. To avoid the race conditions, we store two copies of the distance field matrix. If the voxel grid resolution is (*n*_{x}, *n*_{y}, *n*_{z}), then the total number of values that need to be stored in GPU memory is 2*n*_{x}*n*_{y}*n*_{z}. If the maximum dimension of the voxel grid is *n* = max(*n*_{x}, *n*_{y}, *n*_{z}), the memory complexity of Algorithm 1 is $O(n3)$.

## 4 Bounded Distance Transforms for Trimmed NURBS

### 4.1 Tessellation of Trimmed NURBS Surfaces.

Since a NURBS surface is a mapping from a two-dimensional parametric space to a three-dimensional Euclidean space, any tessellation (triangular, quadrilateral, etc.) applied to the NURBS surface in the parametric space has a direct equivalent tessellated structure in the three-dimensional Euclidean space. Performing the tessellation in the parametric space is an efficient method for spline surfaces. It eliminates the complexity of the three-dimensional triangulation algorithms while ensuring that the shape of the surface is maintained during the tessellation operation [45]. There are several ways to tessellate trimmed surfaces directly in the 3-dimensional Euclidean space [46]; however, this approach would disconnect the NURBS surface from the parametrization, adding another layer of complexity. Although it is also possible to apply quadrilateral tessellation for surfaces, we use triangular tessellation for better compatibility with triangle soup representation. Figure 4 illustrates our tessellation algorithm for trimmed and non-trimmed surfaces.

Before the triangular tessellation, we evaluate the surface using a relatively small evaluation interval delta, *δ* = 0.01 in both parametric dimensions to generate a large number of surface points to triangulate. We use *δ* value to change the number of triangles generated. In addition, we use the vertex spacing, *δ*_{v}, to control the size of the triangles. Vertex spacing can be considered an integer multiplier to the *δ* value, which increases the jump distance between the surface points; therefore, making the edges of the triangles longer. Using two separate variables for triangulation allows us more flexibility and control over the resolution of the final shape. Nonetheless, using a smaller *δ* value with *δ*_{v} = 1 would allow us to generate more triangles. The tessellation algorithm considers four parametrically adjacent points at a time, which forms a quadrilateral structure in a counter-clockwise direction and creates triangles depending on the intersection of the quadrilateral structure with the trim curve (indicated with green dashed line).

When there are no trim curves attached to the surface, the algorithm continuously triangulates the surface using the user-specified *δ* and *δ*_{v} values. If there are any trim curves, the algorithm first determines which points are inside or outside the trim curve with the point-in-polygon algorithm implemented using a winding number approach [47]. The algorithm then uses a ray intersection test to determine which points intersect with the quadrilateral structure and the trim curve. After obtaining the outside and intersection points, a closed polygon is constructed and triangulated. The barycentric coordinate of each new triangle is also checked with the point-in-polygon algorithm to ensure that all triangles are outside the trimming region. This last step is only required if the trim curve is rational, and the weights are different from 1.

Our tessellation algorithm provides a mechanism to control and change the tessellation density and the number of triangles. We directly generate triangles using a user-specified specified *δ* and *δ*_{v} variables in one pass. Our trimming and tessellation algorithms are easily parallelizable since each set of four points can be processed in parallel. Once the tessellation of the trimmed NURBS surface is obtained, computing the distance transforms follows the procedure mentioned in Sec. 3.

### 4.2 Analysis of Hybrid Algorithm for Distance Transforms.

We provide the following three results: (i) the correctness of the marching (Theorem 1), (ii) the guarantee for computing the DT in $\u230a(2max(nx,ny,nz))\u230b$ passes (Lemma 1), and (iii) the bounds for the distance computation from trimmed NURBS representation of the CAD geometry (Theorem 3).

*For computing distance field at a voxel*

*v*from a set of boundary voxels**, the site***V*_{B}*v*_{b}from the set**contributing to minimum Euclidean distance to the voxel v is the same site which contributes to the minimum DT(***V*_{B}**,***V*_{B}*v*) computed from Algorithm 1. More formally, we can say thatLet *v*_{b}_{1} and *v*_{b}_{2} be two boundary voxels in consideration and let us assume that ‖*v*_{b}_{1} − *p*‖ < ‖*v*_{b}_{2} − *p*‖ and let {*d*_{1}, *d*_{2}…*d*_{N}} be the distances for neighborhood voxels where *N* is the number of neighbors for which the distance value is computed. Since ‖*v*_{b}_{1} − *p*‖ < ‖*v*_{b}_{2} − *p*‖, ‖*v*_{b}_{1} − *p*‖ − *sg* < ‖*v*_{b}_{2} − *p*‖ − *sg*, where $s\u2208{1,2}$.

Let *d*_{b}_{1} and *d*_{b}_{2} be the distance field computed for any two neighboring voxels, chosen for the sake of analysis. Then, using triangle inequality, we get *d*_{p} ≤ *d*_{b}_{1} + *sg* and *d*_{p} ≤ *d*_{b}_{2} + *sg*, where $s\u2208{1,2}$. For *d*_{b}_{1}, *d*_{b}_{2} > >*sg*, since ‖*v*_{b}_{1} − *p*‖ − *sg* ≤ ‖*v*_{b}_{2} − *p*‖ − *sg*, we can prove *d*_{b}_{1} ≤ *d*_{b}_{2}. When *d*_{b}_{1}, *d*_{b}_{2} are comparable to *sg* (i.e., in the first few marches), we can prove using the Theorem 4 that *d*_{b}_{1} < *d*_{b}_{2} with the exception case of *d*_{b}_{1} = *d*_{b}_{2}. The equality happens when the angle formed by the line segments connecting *v*_{b} and *d*_{b} and connecting *d*_{b} and *p* is 22.5 deg (see Fig. 11). Therefore, we can generalize that *d*_{b}_{1} ≤ *d*_{b}_{2} for all cases; this inequality is further extended to the neighbors of *d*_{b}_{1} and *d*_{b}_{2}. Since, $DT(VB,p)=min{d1,d2\u2026dN}$ and since *d*_{b}_{1} ≤ *d*_{b}_{2}, $argminvbDT(VB,p)=vb1=argminvb\Vert VB\u2212p\Vert $. Hence, the update by Algorithm 1 does not change the site of the minimum distance, *v*_{b}.

Since the location of the boundary voxel contributing to the actual minimum distance field is the same as the one computed from Algorithm 1, as a corollary, it is easy to prove that the properties of the object such as the voronoi regions and medial axis computed using the distance transform as starting point remain unaffected. In addition, we do not need to store multiple locations of the closest boundary voxels for the voxels on the medial axis since these multiple distances would be equal. Further, the distance transforms computed using our method even satisfies the eiqonal equation. Therefore, this method, although being discrete, is still suitable for the applications of distance fields.

Now, we look at the guarantees on convergence of this marching wavefront method.

*The maximum number of marches required to compute the distance transforms using Algorithm 1 is*$\u230anx2+ny2+nz2\u230b$.

Assuming that the object consists of only one boundary voxel at the corner of the voxel grid *V*, then the Euclidean distance to the most distant corner is $gnx2+ny2+nz2$. With each step of the march, we cover a distance of *g*, $g2$, or $g3$. Thus, considering the smallest marching distance *g*, the maximum number of marches, $m\u2264\u230anx2+ny2+nz2\u230b$.

*(Bounds for voxel representation) The minimum distance from the voxel center to the object boundary in a voxel grid is*$\u03f5v\u2208[\u2212(g3/2),g3/2]$

Let the distance of voxel *v* to the set of boundary voxels **V _{B}** be obtained as

*DT*(

**V**,

_{B}*v*). The distance for an object represented by voxel grid

*V*can be represented as

*DF*(

**V**,

_{B}*v*). Then using the triangle inequality, we can say that

*DF*(

**V**,

_{B}*v*) ≤

*DT*(

**V**,

_{B}*v*) +

*ε*

_{v}, where

*ε*

_{v}is the bound for minimum distance from the object boundary in a voxel grid. Thus, the distance computations is bounded above by

*DT*(

**V**,

_{B}*v*) +

*ε*

_{v}. Similarly, it is bounded from below as well by max (

*DT*(

**V**,

_{B}*v*) −

*ε*

_{v}, 0). Note that, the max function is required because unsigned distances cannot be negative.

From the bounds computed above, we can compute maximum distance field and a minimum distance field, and the actual distance field is guaranteed to be within these bounds.

We extend Theorem 2 to obtain a bound on the distance transforms for trimmed NURBS representation. For this work, we use key results from the theorems in Krishnamurthy et al. [48]. This work theoretically computes the bounds for distance computation between a point and the trimmed NURBS surface tessellation. We could formally state the theorem as follows.

*(Bounds for trimmed NURBS tessellation) The bounds in computing the distances for a trimmed NURBS tessellation is*

*ε*

_{n}∈ [ −

*d*

_{max},

*d*

_{max}],

*where d*

_{max}

*is computed as follows. The trimmed NURBS is evaluated using a*

*n*×

*m*

*grid of uniformly spaced points in the parametric space*.

*Here, d is the nominal distance from a point to the triangle. This distance can be easily computed by taking the perpendicular distance from the voxel center to the triangle as shown in Fig. 5.*

The choice of *δ* and *δ*_{v} is important for getting a proper bound. If *K* is very large, then the bound on the distance transforms is worse than that obtained for the voxel grid.

## 5 Results

In this section, we provide some anecdotal results and comparisons for the proposed algorithms and bounds in the above sections. We implemented the DT algorithm in python*3.6* [49]. The backend for the GPU-accelerated code is written in *c++* using the Pybind11 API [50] and CUDA toolkit [44] for GPU acceleration. We also use *numba* [51] for performing a just-in-time compilation of the CPU version of the same code to perform a fair comparison between the CPU and GPU performance. The Trimesh library^{2} is used for the read/write operation of triangle files and subdivision of triangular meshes for performing the surface voxelization. For NURBS surfaces, we implemented the tessellation algorithm using c++ on the CPU. We make use of the OpenCASCADE library to read NURBS models from STEP files. We used the GPU-accelerated voxelization code developed by Young and Krishnamurthy [52] to compute the initial boundary voxelization and bounds computation.

To the best knowledge of the authors, there are no publicly available source codes with CPU or GPU implementation for distance fields computation of trimmed NURBS surfaces. Therefore, before we demonstrate the performance of our proposed approach on trimmed NURBS surfaces, we first show our results for voxel-based representation (in 2D pixels and 3D voxels) and triangle soup representation and perform comparisons with publicly available source codes. Then, we show the performance of our approach for trimmed NURBS surfaces.

### 5.1 Bounded Distance Transforms for Voxelization.

Examples of the 2D distance transforms are shown in Fig. 6. They also show the maximum and minimum bounds of the distance transforms for the two binary images. The error bound shown in the figure is the difference between the maximum bound DT and the minimum bound DT. It can be seen the maximum values of the bounds in the DTs occur in the neighborhood of the boundary voxels.

### 5.2 Bounded Distance Transforms for Triangle Soups.

The 3D distance transforms (unsigned and signed) for the tessellated model of an Armadillo and Bunny are shown on the left of Figs. 7 and 8, respectively. The unsigned DTs are cross-sectional views, and the signed distance fields are volume renderings. The actual distance transform is the minimum distance to the triangle soup computed at each voxel center using a brute force approach. This is computed by using a KDTree-based representation of the triangles implemented in the *Trimesh*^{3} library to calculate the distances efficiently on the CPU. The distance of each voxel center to the triangles present inside the voxel is computed using the CPU, while the marching to compute the DTs is performed on the GPU. The rightmost images in Fig. 8 shows the minimum and maximum bounds of the DTs. It can be seen that the minimum bound DT has a considerable gap (lower values) at the boundary of the object, while this gap is not present in the maximum bound DT. This is in accordance with the min-max bounds computed.

### 5.3 Bounded Distance Transforms for Trimmed NURBS.

The DTs for trimmed NURBS representations of the Ducky, Hammer, and Scooby are shown in Figs. 9 and 7, respectively. The NURBS models are tessellated using a *δ* of 0.05 and a voxel grid resolution of 128^{3}. The minimum and maximum bounds of the computed distance transforms are shown in Fig. 10 for the ducky and hammer models and in Fig. 7 for the Scooby model.

### 5.4 Timings.

The CPU and GPU timings for the computation of the DTs for voxels, tessellated 3D models, and NURBS representations are shown in Table 1. The timings are an average of three runs for all the models. We also provide a comparison with an implementation of distance transform computation using Scipy [53].^{4} We observe that our GPU implementation performs better than CPU implementation and the Scipy implementation at higher resolutions. At lower resolutions, the speedup is smaller but still substantially faster. The smaller speedup at lower resolutions is due to the overhead involved in transferring the CPU data to GPU and performing the computations on the GPU. The GPU implementation performs well for even very high resolutions. The computational time scales almost linearly with the resolution as the resolution increases. We also report the GPU memory usage in Table 1 ^{5} the GPU memory for even very high-resolution voxel grids is approximately 250 MB, allowing our algorithm to run efficiently on current-generation consumer GPUs.

3D Model | Representation | Resolution | SciPy Time (ms) | CPU Time (ms) | GPU Time (ms) | GPU Memory (MB) | Speedup |
---|---|---|---|---|---|---|---|

Cessna | Voxels | 60 × 15 × 65 | 177.125 | 50.343 | 4.848 | 1.12 | 36x |

121 × 31 × 129 | 3362.717 | 681.029 | 66.570 | 9.23 | 50x | ||

239 × 61 × 257 | 80857.726 | 9433.467 | 811.769 | 72.00 | 100x | ||

Triangles | 121 × 31 × 129 | 3312.983 | 1827.047 | 66.064 | 18.39 | 50x | |

Bunny | Voxels | 65 × 64 × 50 | 7933.000 | 248.810 | 21.246 | 10.24 | 373x |

129 × 128 × 100 | 245197.011 | 3477.55 | 295.751 | 31.49 | 829x | ||

257 × 255 × 200 | 8262842.943 | 58884.615 | 3304.442 | 250.00 | 2500x | ||

Triangles | 129 × 128 × 100 | 236800.517 | 6928.724 | 298.225 | 32.00 | 794x | |

Armadillo | Voxels | 55 × 65 × 39 | 2225.283 | 152.657 | 13.469 | 2.66 | 165x |

109 × 129 × 77 | 59111.559 | 1801.178 | 171.761 | 20.65 | 344x | ||

217 × 257 × 152 | 1823702.939 | 30004.649 | 1923.305 | 161.68 | 948x | ||

Triangles | 109 × 129 × 77 | 58578.174 | 4460.805 | 174.192 | 21.79 | 326x | |

Ducky | Trimmed NURBS | 129 × 90 × 105 | 206156.772 | 5261.200 | 220.166 | 24.39 | 936x |

Hammer | Trimmed NURBS | 51 × 129 × 15 | 991.645 | 82.127 | 10.468 | 3.03 | 94x |

Scooby | Trimmed NURBS | 55 × 129 × 73 | 29871.143 | 638.254 | 78.933 | 11.02 | 378x |

3D Model | Representation | Resolution | SciPy Time (ms) | CPU Time (ms) | GPU Time (ms) | GPU Memory (MB) | Speedup |
---|---|---|---|---|---|---|---|

Cessna | Voxels | 60 × 15 × 65 | 177.125 | 50.343 | 4.848 | 1.12 | 36x |

121 × 31 × 129 | 3362.717 | 681.029 | 66.570 | 9.23 | 50x | ||

239 × 61 × 257 | 80857.726 | 9433.467 | 811.769 | 72.00 | 100x | ||

Triangles | 121 × 31 × 129 | 3312.983 | 1827.047 | 66.064 | 18.39 | 50x | |

Bunny | Voxels | 65 × 64 × 50 | 7933.000 | 248.810 | 21.246 | 10.24 | 373x |

129 × 128 × 100 | 245197.011 | 3477.55 | 295.751 | 31.49 | 829x | ||

257 × 255 × 200 | 8262842.943 | 58884.615 | 3304.442 | 250.00 | 2500x | ||

Triangles | 129 × 128 × 100 | 236800.517 | 6928.724 | 298.225 | 32.00 | 794x | |

Armadillo | Voxels | 55 × 65 × 39 | 2225.283 | 152.657 | 13.469 | 2.66 | 165x |

109 × 129 × 77 | 59111.559 | 1801.178 | 171.761 | 20.65 | 344x | ||

217 × 257 × 152 | 1823702.939 | 30004.649 | 1923.305 | 161.68 | 948x | ||

Triangles | 109 × 129 × 77 | 58578.174 | 4460.805 | 174.192 | 21.79 | 326x | |

Ducky | Trimmed NURBS | 129 × 90 × 105 | 206156.772 | 5261.200 | 220.166 | 24.39 | 936x |

Hammer | Trimmed NURBS | 51 × 129 × 15 | 991.645 | 82.127 | 10.468 | 3.03 | 94x |

Scooby | Trimmed NURBS | 55 × 129 × 73 | 29871.143 | 638.254 | 78.933 | 11.02 | 378x |

Note: Timings are in milliseconds and are average running times of three runs. CPU distance transform time and GPU distance transform time are the time taken for marching using the CPU and GPU implementations. We compare our results with well-optimized implementation of distance transform computation using SciPy [53].

We compare our distance transforms timings with actual computation of the distance fields for triangle soups in Tables 2 and 3. All the computations were performed using an Intel Xeon 2.40 GHz CPU with 384GB RAM and an NVIDIA TITAN RTX GPU with 24GB GPU RAM. We perform actual distance field computations using a CPU implementation provided^{6} in Tables 2 and 3.^{7} For a fair comparison, we only compare the timings of the CPU implementation of DT. While the comparison here is performed only on CPU, we observe a good speed-up for more complex models like Bunny and Armadillo. However, with simpler models such as the Cessna, the time taken by libigl [43] is lower than our DT. Note that for these comparisons, we compute the DT on the CPU. However, our marching algorithm is essentially devised, keeping the GPU acceleration in mind. Whereas the other methods do not use a marching-based algorithm for performing the distance computations.

3D Model | Resolution | DT Time (ms) | SDFGen Time (ms) | Speedup |
---|---|---|---|---|

Cessna | 239 × 61 × 257 | 13591.700 | 27544.800 | 2.027x |

Bunny | 257 × 254 × 199 | 81387.640 | 112601.010 | 1.384x |

Armadillo | 217 × 257 × 151 | 42460.100 | 109251.083 | 2.573x |

3D Model | Resolution | DT Time (ms) | SDFGen Time (ms) | Speedup |
---|---|---|---|---|

Cessna | 239 × 61 × 257 | 13591.700 | 27544.800 | 2.027x |

Bunny | 257 × 254 × 199 | 81387.640 | 112601.010 | 1.384x |

Armadillo | 217 × 257 × 151 | 42460.100 | 109251.083 | 2.573x |

Note: Timings are in milliseconds and are average running times of three runs. We compare our results (faster in bold) with an implementation of distance field computation using the code provided online.

3D Model | Resolution | DT Time (ms) | libigl Time (ms) | Speedup |
---|---|---|---|---|

Cessna | 239 × 61 × 257 | 13591.700 | 9231.010 | 0.679x |

Bunny | 257 × 254 × 199 | 81387.640 | 94812.548 | 1.165x |

Armadillo | 217 × 257 × 151 | 42460.100 | 101584.834 | 2.392x |

3D Model | Resolution | DT Time (ms) | libigl Time (ms) | Speedup |
---|---|---|---|---|

Cessna | 239 × 61 × 257 | 13591.700 | 9231.010 | 0.679x |

Bunny | 257 × 254 × 199 | 81387.640 | 94812.548 | 1.165x |

Armadillo | 217 × 257 × 151 | 42460.100 | 101584.834 | 2.392x |

Note: Timings are in milliseconds and are average running times of three runs. While libigl takes less time (in bold) for simpler CAD models, our method is faster for more voxels.

## 6 Conclusions

In this paper, we have developed an hybrid marching method to compute the bounded distance transform for trimmed NURBS models. Our algorithm eliminates the error due to marching by storing the propagating the closest boundary voxel instead of the distance values. We compute the theoretical bounds for the distance transforms obtained using our algorithm. We empirically show that the computed distance transforms are accurate in replicating the accurate distance fields within our computed bounds. We have a GPU implementation and provide running time comparisons between the CPU and GPU implementations.

Bounded distance transforms provide a better object representation, especially for applications such as object recognition and collision detection. Future work will involve improving our GPU implementation and experimenting with volumetric data structures such as Octrees to reduce memory requirements of storing the distance transforms.

## Footnotes

See Note ^{2}.

See Note ^{4}.

## Acknowledgment

This work was partly supported by the National Science Foundation under Grant Nos. CMMI:1644441 and OAC-1750865.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The authors attest that all data for this study are included in the paper. Data provided by a third party listed in Acknowledgment.

### Error Due to Marching

Directly propagating the minimum distance value in a marching algorithm leads to an accumulation of the error.

*t*= 0, $o=2d$. This case corresponds to a 2D marching in a plane. In that case, the maximum error is equal to $(2\u22122+2)/2+2\u22480.08239=8.24%$. A similar solution can be obtained for $t\u22600$ in which case, the value for maximum error is 12.8%. The two cases are shown pictorially in Fig. 11.

### Bounded Distance Transform for Triangle Soups

In this section, we details on the distance computations for objects represented using triangle soups. Since we only compute the unsigned distance fields, there are no specific requirements for the object representation to be two-manifold. If the geometry is manifold, we can compute the signed distance field by postprocessing the distance transforms using the inside-outside information from methods such as Barill et al. [43] and Jacobson et al. [54]. For computing the unsigned DTs, we first need to identify the boundary voxels (i.e., the voxels that intersect with the triangle soup).

We use a subdivision-based method for identifying the boundary voxels that intersect with the triangle soup. We subdivide the triangle soup until the maximum edge length is less than the resolution of the voxel grid. Thus, identifying the boundary voxel grid is simplified to a point membership classification of a point (vertices of the triangle) in an axis-aligned bounding-box (AABB). Then, we compute the distance for all the points in the voxel and check for the maximum and minimum distances. Similar to Sec. 4.2, we compute the maximum and minimum bounds of the distance fields.

The algorithm remains the same as the Sec. 4.2. The only difference is computing the bounds of the boundary voxels using triangle soups. Once the precomputed distance field *D*_{0} is obtained, we perform the marching algorithm, which computes DTs for all the other voxels. Using the unsigned distance field, we compute the signed distance field by assigning the sign to the unsigned distance value based on the inside-outside test on manifold geometries.

#### Bounds on Distance Transforms for Triangle Soups

Now, we compute bounds for DTs using the triangle soup representation. Theorem 2 can be extended to this case with *ε*_{t} = [−*d*_{max}, *d*_{max}] instead of $\u03f5v\u2208[\u2212(g3/2),g3/2]$, where *ε*_{t} represents the bound in computing the minimum distance from the triangle soup and *d*_{max} represents the distance from the voxel center to the object represented using triangle soup. *d*_{max} is maximum $g3/2$, when the nearest vertex of the triangle is at the corner of the voxel. However, in general, the *d*_{max} would be the distance to the closest vertices of the triangle (distance to closest vertices of the triangle ensures that the gaps in closing the object do not affect the minimum distance and hence bounds the distance loosely). With this, the bound in computing the distance field from a triangle to the boundary voxel is *ε*_{t} ∈ [−*d*_{max}, *d*_{max}].

## References

*Nature Methods*,

**17**(3), pp.