## Abstract

In the Monte Carlo ray-trace (MCRT) method, millions of rays are emitted and traced throughout an enclosure following the laws of geometrical optics. Each ray represents the path of a discrete quantum of energy emitted from surface element $i$ and eventually absorbed by surface element $j$. The distribution of rays absorbed by the $n$ surface elements making up the enclosure is interpreted in terms of a radiation distribution factor matrix whose elements represent the probability that energy emitted by element $i$ will be absorbed by element $j$. Once obtained, the distribution factor matrix may be used to compute the net heat flux distribution on the walls of an enclosure corresponding to a specified surface temperature distribution. Achieving high accuracy in the heat transfer calculation is computationally very expensive when high spatial resolution is required. This is especially true when a range of emissivities is to be considered in a parametric study, since each value of surface emissivity requires a new ray trace to determine the corresponding distribution factor matrix. Artificial neural networks (ANNs) offer an alternative approach at a small fraction of the computational cost of the traditional MCRT method. Significant computational efficiency is realized by eliminating the need to perform a new ray trace for each value of emissivity. The current contribution introduces, and demonstrates through case studies, the estimation of radiation distribution factor matrices using ANNs and their subsequent use in radiation heat transfer calculations.

## 1 Motivation

The Monte Carlo ray-trace (MCRT) method [1–3] has emerged as the dominant tool for formulating high-fidelity models of radiation heat transfer processes. This is because of its universal applicability to problems involving radiant exchange among surfaces and within participating media, the ease with which it conforms to complicated irregular geometries, and its ability to treat directional and wavelength-dependent optical properties. Recently, its value has been significantly enhanced by contributions which establish a statistically meaningful paradigm for estimating the uncertainty, to a stated level of confidence, of predicted heat transfer results [4,5]. A widely lamented disadvantage of the MCRT method is the excessive computational cost associated with achieving high accuracy when fine spatial resolution is required. The fact that rays are mutually independent entities permits massive parallelization, with a proportionate reduction in processor time; however, associated cost, power, volume, and weight penalties exclude massive parallelization in applications where real-time results are required for data interpretation and decision-making on board autonomous space probes [6] and fire-and-forget weapons [7]. The alternative to a slow or computationally ponderous high-fidelity model in such applications would be a reduced-order model that provides comparable accuracy and spatial resolution but in real-time and with significantly reduced hardware requirements [8]. The current contribution describes such an alternative.

## 2 Brief Review of the Monte Carlo Ray-Trace Method

The radiation distribution factor is defined as

$$D_{ijk} \equiv \frac{Q_{a,ijk}}{Q_{e,ik}}, \qquad 1 \le i, j \le n, \quad 1 \le k \le K \tag{1}$$

In Eq. (1), $n$ is the number of surface and volume elements making up the enclosure; $K$ is the number of wavelength intervals, or bands; $Q_{e,ik}$ is the power emitted by surface or volume element $i$ in band $k$; and $Q_{a,ijk}$ is the power emitted by surface or volume element $i$ that is absorbed by surface or volume element $j$ in band $k$. Equation (1) is completely general in that it holds whether the radiative interchange is among surface elements, volume elements, or a combination of the two, and for spectral directional radiation as well as for gray diffuse radiation.

It is perhaps worth noting that the distribution factor considered in the current contribution is distinctly different from the *geometrical factor*—also referred to as the *angle factor*, the *view factor*, and the *configuration factor*—which dominated radiation heat transfer pedagogy and practice in the second half of the 20th century [9–17]. Several such factors have been defined and used down through the years to calculate radiant exchange, but the distribution factor defined by Eq. (1) lies at the heart of the MCRT method. The earliest mention of this quantity is attributed to Gebhart, who refers to it as the *absorption factor* in a 1961 article [18]. Gebhart showed that, for the special case of a gray diffuse enclosure, the elements of his absorption factor matrix could be constructed from surface properties and angle factors. In 1968, Howell [19] introduced the term *exchange fraction* for the version of the absorption factor evaluated using the Monte Carlo method. Later, Mahan and Eskin [20,21] refer to this same quantity as the radiation distribution factor because of its essential role in distributing radiation emitted by surface or volume element $i$ to surface or volume element $j$. While this latter term is in common usage, other authorities refer to the distribution factor as the exchange factor [22–24], although the exchange factor used by Yuen [24] is more directly akin to Gebhart's absorption factor since it is evaluated analytically without recourse to ray tracing. Finally, Larsen and Howell [25] attribute the term exchange factor to a family of auxiliary factors that, when used together, describe radiative exchange in the zonal method. It should also be noted that Lin and Sparrow [26] use the term exchange factor to describe radiant interchange among a mixture of diffuse and specular surfaces.

$$D_{ij} \approx \frac{N_{ij}}{N_i} \tag{7}$$

where $N_i$ is the number of equal-strength rays diffusely emitted from surface or volume element $i$, and $N_{ij}$ is the number of those rays absorbed in surface or volume element $j$ [2,3]. Equation (5) may be thought of as an approximation because Eq. (7) produces an estimate of $D_{ij}$ whose accuracy increases with the number of rays traced for a given number of surface elements $n$. The elements of the radiation distribution factor matrix are created by following the life cycles of a large number of rays whose behavior is governed by application of statistical principles to the laws of geometrical optics. The details of ray tracing are widely available in the literature, including in Refs. [2] and [3].
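The estimator of Eq. (7) can be illustrated with a minimal two-dimensional ray trace. The following Python sketch is ours, not the code of Refs. [2], [3], or [43]: it estimates the 4-by-4 distribution factor matrix for a gray diffuse duct of unit square cross section, treating each wall as a single element. Each ray is absorbed with probability $\epsilon$ at every hit and is otherwise diffusely re-emitted.

```python
import numpy as np

# Walls of the unit square: (origin, edge vector, inward unit normal)
WALLS = [
    (np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])),   # bottom
    (np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, 0.0])),  # right
    (np.array([1.0, 1.0]), np.array([-1.0, 0.0]), np.array([0.0, -1.0])), # top
    (np.array([0.0, 1.0]), np.array([0.0, -1.0]), np.array([1.0, 0.0])),  # left
]

def diffuse_direction(normal, rng):
    """Sample a 2-D Lambertian direction about the inward normal."""
    s = 2.0 * rng.random() - 1.0          # sin(theta) is uniform on [-1, 1]
    c = np.sqrt(1.0 - s * s)
    tangent = np.array([-normal[1], normal[0]])
    return c * normal + s * tangent

def first_hit(p, d, src):
    """Return (wall index, hit point) of the first wall struck by p + t*d, t > 0."""
    best_t, best_j = np.inf, -1
    for j, (o, e, nrm) in enumerate(WALLS):
        if j == src:
            continue
        denom = d @ nrm
        if abs(denom) < 1e-12:
            continue                      # ray parallel to this wall
        t = ((o - p) @ nrm) / denom
        if t <= 1e-9:
            continue
        u = ((p + t * d - o) @ e) / (e @ e)
        if -1e-9 <= u <= 1.0 + 1e-9 and t < best_t:
            best_t, best_j = t, j
    return best_j, p + best_t * d

def distribution_factors(eps, n_rays, rng):
    """Estimate D[i, j] = N_ij / N_i for a gray diffuse square duct, Eq. (7)."""
    N = np.zeros((4, 4))
    for i in range(4):
        for _ in range(n_rays):
            o, e, nrm = WALLS[i]
            p = o + rng.random() * e      # uniform emission point on wall i
            d = diffuse_direction(nrm, rng)
            src = i
            while True:
                j, hit = first_hit(p, d, src)
                if rng.random() < eps:    # ray absorbed at wall j
                    N[i, j] += 1
                    break
                p = hit                   # ray diffusely reflected from wall j
                d = diffuse_direction(WALLS[j][2], rng)
                src = j
    return N / n_rays

D = distribution_factors(0.75, 4000, np.random.default_rng(0))
```

Because every emitted ray is eventually absorbed somewhere, each row of the estimated matrix sums to unity exactly, and for equal-area walls with uniform emissivity the matrix is symmetric to within statistical uncertainty—the reciprocity and conservation properties discussed below.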

Concerns expressed elsewhere [23] about the perceived need to “smooth” the exchange or distribution factor matrix are unwarranted when its elements are estimated using the MCRT method. The use of Eq. (7) to compute $D_{ij}$ ensures that both conservation of energy and reciprocity are satisfied to a high level of accuracy. Uncertainties in the values of individual elements resulting from a finite number of rays being traced are equivalent to slight local distortions of the enclosure geometry or minor local variations in the surface properties. The related surface heat flux uncertainties resulting from application of Eq. (5) are quantified in Refs. [4] and [5].

When the number of surface elements $n$ is large and high accuracy is required, an exceedingly large number of rays must be traced, as already remarked in Sec. 1. Furthermore, because $D_{ij}$ depends on the emissivity $\epsilon_i$, computational costs can become excessive for optimization processes in which $\epsilon_i$ is a parameter. This motivates the search for a computationally less intensive approach.

## 3 Introduction to Artificial Neural Networks

### 3.1 Background.

Artificial neural networks (ANNs) are nonlinear mapping systems with structures based on principles inspired by the human biological nervous system. They provide an approach to prediction fundamentally different from that of other numerical solution methods. Artificial neural networks can accurately model the inherent relationship between sets of input and output data without reference to the underlying physical system, and yet they are able to consider all the parameters affecting the physical system. Various considerations such as nonlinearity, multiplicity of variables and parameters, and noisy and uncertain input and output values are easily dealt with. Artificial neural networks depend on neither prior knowledge of correlations nor recourse to iterative methods, but rather require only a population of input/output samples. These samples are used to train the neural network which, once trained, is able to produce meaningful outputs in response to the introduction of test inputs not used in training. Artificial neural networks consist of a large number of processing units, which run in parallel to achieve results whose accuracy is comparable to that obtained using computationally more expensive traditional approaches. They are also able to perform dynamic modeling and adaptive control tasks in the presence of abrupt changes in system parameters and imposed control signals. Complexities not easily treated by traditional approaches to thermal system analysis can be accurately modeled with significantly less computing time using an ANN.

Artificial neural networks have been under development for about four decades. They have been widely used in many engineering applications because of their ability to obtain solutions more easily, frequently with an accuracy comparable to that of higher-order models [27]. In recent years, ANNs have been used in various thermal applications describing heat transfer in solar energy systems, design of steam generating plants, estimation of heating loads of buildings, waste heat recovery heat exchangers, and related performance prediction and dynamic control applications. Thibault and Grandjean [28] used an ANN for heat transfer data analysis. Pacheco-Vega et al. [29] applied ANNs for modeling the heat transfer phenomena in fin-tube refrigerating heat exchanger systems. An ANN algorithm was used by Bechtler et al. [30] to model the steady-state performance of a vapor–compression liquid heat pump. Lazrak et al. [31] modeled a dynamic absorption chiller using artificial neural networks. An ANN model was developed to predict the convective heat transfer coefficient during condensation of R134a in inclined tubes [32]. Chang et al. [33] predicted heat transfer of supercritical water using ANNs. Ye et al. [34] proposed a novel ANN model for predicting convective heat transfer in supercritical CO$_2$. Kaya and Hajimirza designed a two-layer ANN surrogate model to estimate the optical absorptivity of ultrathin organic solar cells [35,36]. Additional investigations of heat transfer using ANNs have also been reported [37,38].

The cited applications demonstrate that ANNs are often well suited to thermal analysis of engineering systems. This is especially true when performing a parametric study involving repetitive solution of a complex model, in which case it is desirable to accelerate the analysis without compromising the underlying physics.

Although a variety of analytical and numerical approaches have been employed in radiation heat transfer analysis, to the best of our knowledge, ANN methods have yet to be applied in this area. This further motivates the present work, which investigates the applicability of ANNs to the radiation heat transfer analysis of diffuse gray enclosures. As an introductory exercise, the computationally intensive MCRT method is used to compute the radiation distribution factors among the surface elements of a two-dimensional diffuse gray enclosure for a range of surface emissivities. Then a back-propagation algorithm is used to train an ANN based on these limited results. The ability of the much faster artificial neural network to accurately predict the distribution factor matrices corresponding to values of emissivity not used in the training cases is then evaluated. Various network configurations are investigated in a search for the optimal network. Once introduced, the method is then extended to increasingly complex problems.

### 3.2 Description of the Artificial Neural Network.

An artificial neural network is an information processing paradigm consisting of a large number of simple processing elements called neurons, or nodes, organized in layers [39]. The layers are organized into three groups: the input layer, one or more hidden layers, and an output layer. Each layer is occupied by a number of nodes, as illustrated in Fig. 1. All the nodes of each hidden layer are connected to all nodes of the previous and following layers by means of synaptic connectors. Each connector is characterized by a synaptic weight. The input layer is used to designate the parameters for the problem under consideration, while the output layer corresponds to the unknown variables characterizing the performance of the system. The weights of the connectors determine the relative importance of the signals from all the nodes in the previous layer. At each hidden-layer node, the node input consists of a sum of all the outputs of the nodes in the previous layer, each modified by an individual interconnector weight. At each hidden node, the node output is determined by an activation function, which performs nonlinear input–output transformations. The information treated by the connector and node operations is introduced at the input layer, and it propagates forward toward the output layer [40]. Such ANNs are known as feed-forward networks, which is the type used in this study. Figure 1 is a schematic representation of a typical feed-forward architecture. The configuration shown has one input layer, two hidden layers, and one output layer.

The error at each output node can be determined by comparing the calculated feed-forward result with the results obtained from conducting the original MCRT-based numerical experiments. Training of the network adjusts its weights to minimize the errors between the ANN result and known output. The training procedure for feed-forward networks is known as the supervised back propagation (BP) learning scheme, where the weights and biases are adjusted layer by layer from the output layer toward the input layer [41]. The mathematical basis, the procedures for training and testing the ANNs, and more descriptions of the BP algorithm can be found elsewhere [42].
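The forward pass and supervised back-propagation scheme described above can be sketched as follows. This is a generic illustration, not the study's code: the tanh hidden activation, the mean-squared-error loss, and the small layer sizes are all our assumptions for the sake of a runnable example.

```python
import numpy as np

def init_layers(sizes, rng):
    """Small random weights and zero biases per layer, e.g. sizes = [1, 8, 8, 16]."""
    return [[rng.standard_normal((m, n)) * 0.1, np.zeros(n)]
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Feed-forward pass: tanh hidden layers, linear output; activations cached."""
    acts = [x]
    for W, b in layers[:-1]:
        acts.append(np.tanh(acts[-1] @ W + b))
    W, b = layers[-1]
    return acts[-1] @ W + b, acts

def backprop(x, y, layers):
    """Propagate the output-layer error back toward the input layer."""
    out, acts = forward(x, layers)
    delta = 2.0 * (out - y) / y.size           # gradient of the MSE loss at the output
    grads = []
    for (W, b), a in zip(reversed(layers), reversed(acts)):
        grads.append((a.T @ delta, delta.sum(axis=0)))
        delta = (delta @ W.T) * (1.0 - a * a)  # tanh'(z) = 1 - tanh(z)**2
    return out, grads[::-1]

def sgd_step(layers, grads, lr=0.5):
    """Adjust weights and biases against the gradient to reduce the error."""
    for layer, (gW, gb) in zip(layers, grads):
        layer[0] -= lr * gW
        layer[1] -= lr * gb

# One training step on illustrative data: the loss decreases
rng = np.random.default_rng(0)
layers = init_layers([1, 8, 8, 16], rng)
x = rng.random((32, 1))        # emissivity-like inputs (illustrative)
y = rng.random((32, 16))       # target distribution-factor rows (illustrative)
out0, grads = backprop(x, y, layers)
loss_before = float(((out0 - y) ** 2).mean())
sgd_step(layers, grads)
out1, _ = backprop(x, y, layers)
loss_after = float(((out1 - y) ** 2).mean())
```

In practice the plain gradient step shown here is replaced by the Adam optimizer used in the study (Sec. 4.1), but the backward sweep of the error from the output layer toward the input layer is the same.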

An a priori selection of ANN hyperparameters such as network topology, training algorithm, and network size is usually made based on experience. After training, the final sets of weights and biases trained by the network can be used for prediction purposes, and the corresponding ANN becomes a model of the input/output relation of the given problem. Because the ANN is to be trained to interpret the relationship between input and output data, the data used for training must be sufficient to capture the dynamics of the process being modeled. The MCRT method described in Sec. 2 is used to generate the training and test data needed to create and validate the ANN.

## 4 Implementation and Results

As a demonstration of the approach advanced in this contribution, we consider three case studies of increasing complexity. All three cases involve radiant exchange within an enclosure consisting of gray diffuse surfaces in the absence of a participating medium; that is, radiant exchange is governed by Eqs. (5)–(7). However, once the distribution factors have been computed using the MCRT method, the ANN approach advanced here is expected to work equally well in the presence of a participating medium and with directional spectral surface models.

### 4.1 Case Study 1: A Long Box Channel With Uniform Emissivity.

Figure 2 represents a long square-cross-section box channel having uniform wall emissivity and prescribed wall temperatures. The walls have been subdivided into 40 equal-area segments in anticipation of an MCRT analysis. The corresponding ANN will have a single input node representing the emissivity $\epsilon $, and 1600 output nodes representing the 1600 elements of the 40-by-40 radiation distribution factor matrix.

The first author has created a convenient Windows application [43] that uses the MCRT method to compute the radiation distribution factors among any number of surface elements making up any two-dimensional diffuse gray enclosure representing a long duct or channel. The MCRT method for two-dimensional geometries is demonstrated in Ref. [44], and the uncertainties associated with the method are thoroughly established in Refs. [4] and [5]. In the current effort, we have used the application described in Ref. [43] to compute the distribution factors for the long square-cross-section duct illustrated in Fig. 2. The duct walls are maintained at uniform temperatures of 300 and 500 K as shown in the figure, and the corresponding net heat flux distribution on the walls is sought. The duct has been subdivided into $n = 40$ longitudinal surface elements, and 100 numerical experiments were carried out covering the emissivity range $0.01 \le \epsilon \le 1$. For each value of emissivity, two million rays were traced per surface element to obtain estimates of the corresponding distribution factor matrices $D_{ij}$, where $1 \le i \le 40$ and $1 \le j \le 40$. The resulting dataset was then randomly divided into training and test datasets. The training dataset used to adjust the weights of the ANN contained only 10% of the available data. The test dataset, consisting of the remaining 90% of the data, was used to evaluate the predictive ability of the ANN. While we recognize that it is more common to use the majority of the available data for training and a minority for testing, the ratio used here was found to give excellent results across the test data.
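The 10%/90% split described above can be reproduced schematically as follows; the even spacing of the 100 emissivity values is our assumption, since the sampling scheme is not stated.

```python
import numpy as np

rng = np.random.default_rng(42)

# 100 numerical experiments spanning 0.01 <= eps <= 1 (even spacing assumed)
eps = np.linspace(0.01, 1.0, 100)

# Random 10% / 90% train/test split of the experiment indices
idx = rng.permutation(eps.size)
n_train = eps.size // 10
train_idx, test_idx = idx[:n_train], idx[n_train:]
```

Each index selects one emissivity value and its associated 40-by-40 distribution factor matrix computed by the MCRT application.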

An Adam optimization algorithm is used in this study to converge the ANN output with the target data during the training process. This stochastic optimization method is straightforward to implement, is computationally efficient, has modest memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems involving large amounts of data. Further details about Adam optimization can be found in Ref. [45].
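The Adam update of Ref. [45] can be sketched as follows. This is the standard published algorithm with its usual default constants, demonstrated on a simple quadratic as a stand-in for the training loss; it is not code from the study.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moment estimates with bias correction."""
    m = b1 * m + (1.0 - b1) * grad         # first moment (mean of gradients)
    v = b2 * v + (1.0 - b2) * grad ** 2    # second moment (uncentered variance)
    m_hat = m / (1.0 - b1 ** t)            # bias correction for early steps
    v_hat = v / (1.0 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = ||theta - c||^2 as a stand-in for the training loss
c = np.array([0.3, -0.7])
theta, m, v = np.zeros(2), np.zeros(2), np.zeros(2)
for t in range(1, 2001):
    grad = 2.0 * (theta - c)
    theta, m, v = adam_step(theta, grad, m, v, t)
```

The division by the square root of the second-moment estimate is what makes the method invariant to diagonal rescaling of the gradients, as noted above.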

The mean relative flux difference ($MFD$) between the net heat fluxes computed using the ANN-based and MCRT-based distribution factor matrices is also evaluated.

Generalization is a term used to describe the ability of an ANN to provide accurate output results when input data that have not been used for training are introduced into the trained network. Generalization is an essential property of any ANN. The network topology and size, as determined by the number of hidden layers and the number of hidden nodes, will affect the predicted performance. The performance of the trained network is evaluated by comparing its predicted results with data set aside for testing. In this study, in order to facilitate the search for a configuration producing relatively good prediction, the ten different ANN configurations listed in Table 1 were considered.

Note that in Table 1, mean $MR$ and mean $MS$ are averaged over all 10 training datasets and all 90 test datasets, with different random weight initialization for each input. Both quantities are important for assessing the relative success of the ANN analysis. Inspection of the table shows that almost any configuration produces adequate results; however, some of them generalize poorly. For example, the 1-10-1600 configuration produces a mean $MRE$ of about 15% for the test data despite the low error of 2.5% for the training data. All of the configurations yield the required reciprocity and conservation-of-energy properties of radiation distribution factors. For the three-layer ANN, when the number of hidden nodes is increased from five to ten, improvements in mean $MRE$ and mean $MFD$ are insignificant, indicating that increasing the number of nodes does not necessarily lead to better performance. In selecting the best configuration, the mean $MFD$ for the test data and the mean $MRE$ for the training and test data are all taken into consideration, leading in the current example to selection of the 1-20-20-1600 configuration (shown in bold type in the table).
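The reciprocity and conservation checks referenced in Table 1 can be sketched as follows. The exact residual definitions of $MR$ and $MS$ used in the study are not reproduced here; the forms below (mean row sum, and mean reciprocity residual for gray diffuse surfaces) are our assumptions, chosen to be consistent with the roles described in the text.

```python
import numpy as np

def mean_summation(D):
    """MS: mean of the row sums of D; energy conservation requires each to equal 1."""
    return D.sum(axis=1).mean()

def mean_reciprocity(D, eps, A):
    """MR: mean residual of gray diffuse reciprocity, eps_i A_i D_ij = eps_j A_j D_ji."""
    w = eps * A
    lhs = w[:, None] * D
    return np.abs(lhs - lhs.T).mean()

# A perfectly symmetric factor matrix for four equal-area walls of equal emissivity
D = np.full((4, 4), 0.25)
eps = np.full(4, 0.75)
A = np.ones(4)
```

For such an idealized matrix, $MS$ is exactly 1 and $MR$ is exactly 0; finite ray counts in the MCRT estimates and approximation error in the ANN predictions move both metrics slightly away from these ideal values, as seen in Tables 1–3.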

The predicted radiation distribution factor matrix for a sample of the test data corresponding to $\epsilon = 0.75$ is represented in Fig. 3. The printed values of $D_{ij}$ are too small to read in the image, but the color shading, for which bright red indicates the maximum value ($D_{1,40} = D_{40,1} = D_{10,11} = D_{11,10} = D_{20,21} = D_{21,20} = D_{30,31} = D_{31,30} \approx 0.2256$) and dark green represents the minimum value ($D_{1,10} = D_{10,1} = D_{11,20} = D_{20,11} = D_{21,30} = D_{30,21} = D_{31,40} = D_{40,31} \approx 0.0049$), very clearly reveals the expected symmetry in the matrix.

Finally, Fig. 4 compares the MCRT-based and ANN-based net heat flux distributions on the four surfaces of the enclosure depicted in Fig. 2 corresponding to the same ANN test case whose distribution factor matrix is shown in Fig. 3. The expected symmetry in the net heat flux distribution is evident, and excellent agreement is exhibited between the two approaches, with the relative difference between them typically of the order of 0.1%. It is clear that the ANN approach is a potentially powerful alternative to costly ray tracing in radiation heat transfer analysis requiring a parametric study of surface emissivity. For example, once the investment in creating and training the ANN model has been made, the time required to create the data in Fig. 3 is measured in seconds as opposed to hours on a typical desktop computer using the MCRT method.

### 4.2 Case Study 2: A Long Box Channel With Nonuniform Emissivity.

where $\epsilon_d$ is the design wall emissivity and rand is a uniformly distributed random number between zero and unity. In this case, four ANN input nodes are used corresponding to the four emissivities $\epsilon_i$, while the 1600 output nodes still correspond to the 1600 elements of the 40-by-40 radiation distribution factor matrix. Again, 100 numerical experiments were carried out to produce data. Twenty percent of the data was used to train the neural network. The remaining 80% of the data was used as the test data to validate the predictive power of the network.
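Generating the perturbed wall emissivities can be sketched as follows. The study's exact perturbation formula is not reproduced here; the form below, a uniform perturbation of up to ±5% about the design value, is our assumption motivated by the manufacturing-tolerance example in Sec. 5.

```python
import numpy as np

def perturbed_emissivities(eps_d, n_walls, tol, rng):
    """Draw wall emissivities uniformly within +/- tol of the design value eps_d.

    One plausible perturbation model, not the study's exact formula.
    """
    return eps_d * (1.0 + tol * (2.0 * rng.random(n_walls) - 1.0))

# Four walls, design emissivity 0.75, 5% tolerance
eps = perturbed_emissivities(0.75, 4, 0.05, np.random.default_rng(1))
```

Each numerical experiment then supplies the four perturbed emissivities as the ANN input features, paired with the MCRT-computed distribution factor matrix as the target output.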

Table 2 shows the ANN results for the 4-100-100-1600 configuration selected for the case under consideration. The results are not as satisfactory as in the case with only one emissivity as the input feature. We can see that the mean test error associated with predicting the radiation distribution factors is quite high; however, the predicted factors still allow accurate prediction of the net heat fluxes. Also, a mean $MR$ of $3.2 \times 10^{-4}$ and a mean $MS$ of 1.0006 show that the model yields the required reciprocity and conservation of energy. The relatively high mean $MRE$ in the test data is due to a small number of large errors in relatively few matrix elements for a small number of samples.

Figure 5 shows the element-by-element relative errors for relatively “good” and relatively “bad” distribution factor matrix predictions using the ANN. Both results are drawn from the test dataset used in constructing Table 2. The red-tinted cells in the bottom (bad) matrix of Fig. 5, which correspond to errors exceeding 2%, reveal that some elements of the radiation distribution factor matrix are predicted with relatively poor accuracy. However, these large relative errors correspond to small values of $D_{ij}$ (see Fig. 3), so that the relative errors are disproportionately amplified by division by small numbers. Although this produces a large value of mean $MRE$ for the test dataset in Table 2, the small values of these distribution factors themselves minimize their effect on the heat flux analysis, thereby yielding a small value of mean $MFD$.

Figure 6(a) shows good agreement between the net heat fluxes predicted using the ANN-based and MCRT-based distribution factor matrices corresponding to the upper panel of Fig. 5, and Fig. 6(b) confirms that the local net heat flux errors are less than 1.5% in this case. Figure 7(a) also reveals good agreement between the ANN-based and MCRT-based net heat flux distributions even though the distribution factor matrix prediction is relatively bad, and the local net heat flux errors shown in Fig. 7(b), though somewhat larger than those in Fig. 6(b), are generally well under 2% in this case. We may conclude that the ANN approach works well for the case of a nonuniform emissivity distribution. Furthermore, while minimizing mean $MRE$ is a valid strategy for defining the ANN hyperparameters, its value should not be interpreted as a measure of the ability of the ANN-produced distribution factor matrix to predict local net heat flux.

### 4.3 Case Study 3: A Long Box Channel With an Interior Obstruction.

In cases 1 and 2, we considered a geometry in which all wall segments have a direct view of all other wall segments. We now consider a more complex geometry involving an interior obstruction, which partially blocks the direct view of some surfaces from other surfaces. In such cases, the MCRT method is the only practical approach for analyzing the radiation heat transfer. Howell was among the first to predict the emerging dominance of the Monte Carlo method for treating radiative heat transfer [1] in such cases. Figure 8 represents a benchmark two-dimensional enclosure that has been used in previous radiation heat transfer studies [48,49]. In this study, we have divided it into 40 equal-area longitudinal surface elements.

The flexibility of the Monte Carlo method to accommodate complex geometries comes at a significant computational cost when the code must be executed many times in the context of a parametric study; e.g., when searching for an optimum value of emissivity for a given application. This cost can be significantly reduced by replacing the high-fidelity MCRT model with a reduced-order ANN model of comparable accuracy in the search algorithm.

Once again assuming that the emissivity is uniform across all the walls of the enclosure, the ANN has only a single input node, corresponding to the emissivity, while 1600 output nodes are required to represent the 1600 elements of the radiation distribution factor matrix. One hundred numerical experiments were carried out to produce training and test datasets and, as before, 10% of the data were used to train the neural network, with the remaining 90% used as the test data to validate the predictive power and generality of the ANN.

Table 3 shows the ANN results for the 1-100-100-1600 configuration selected for this case study. We again see that the mean $MRE$ for the test dataset, about 16.6% in this case, is not a measure of the ability of the distribution factors to predict the net heat flux distribution. The ANN model yields the required reciprocity and obeys conservation of energy to a high degree of accuracy.

Figure 9(a) compares the MCRT-based and ANN-based net heat flux distributions on the surfaces of the enclosure depicted in Fig. 8 for a uniform emissivity of 0.75, and Fig. 9(b) shows the relative difference between the values calculated for the net heat fluxes by the two methods. The accuracy—generally better than 1%—is quite acceptable.

## 5 Conclusions and Recommendations

Artificial neural networks are investigated as an alternative to ray tracing in radiation heat transfer applications involving diffuse gray enclosures in the absence of a participating medium. Specifically, they are used to predict the radiation distribution factor matrix and corresponding net heat flux distribution on the walls of long box structures. In each case, a feed-forward back-propagation algorithm is used to train and test the ANN. Net heat flux results obtained using the ANN approach are shown to agree well with those obtained using the standard Monte Carlo ray-trace method for the three cases studied: (1) uniform emissivities on all walls of a square-cross section duct, (2) differing emissivities from wall to wall perturbed about a design value for the same unobstructed duct, and (3) a rectangular duct containing a rectangular obstruction with uniform emissivity on all walls. The authors recommend the approach introduced here when a parametric study is required to determine the optimum value of emissivity for a given application. For example, the results for case study 2, obtained with much less computational effort than would have been required using the MCRT method alone, could be used in a quality-control scheme to determine the variability in the net wall heat flux corresponding to a 5% manufacturing tolerance in wall emissivity. The ANN approach would be the same for the case of a nondiffuse, nongray enclosure filled with a participating medium as for the case of a diffuse gray enclosure in the absence of a participating medium demonstrated in the current effort. This encourages the idea that the approach advanced here would be equally applicable—and even more useful—in these far more complex situations.

## Acknowledgment

The authors gratefully acknowledge NASA's Langley Research Center for its partial financial support for this effort under NASA Contract NNL16AA05C with Science Systems and Applications, Inc., and Subcontract No. 21606-16-036, Task Assignment M.001C (CERES) with Virginia Tech.

## Funding Data

NASA's Langley Research Center (NNL16AA05C; Funder ID: 10.13039/100006199).

Science Systems and Applications, Inc., (21606-16-036, Task Assignment M.001C).

## Nomenclature

- $A$ = area (m$^{2}$)
- $D$ = radiation distribution factor
- $e$ = emissive power (W m$^{-2}$)
- $i$ = intensity (W m$^{-2}$ sr$^{-1}$)
- $K$ = number of wavelength intervals
- $MFD$ = mean relative flux difference
- $MR$ = mean reciprocity
- $MRE$ = mean relative error
- $MS$ = mean summation
- $n$ = number of surface and volume elements making up the enclosure
- $N$ = number of rays
- $q$ = flux (W m$^{-2}$)
- $Q$ = power (W)
- $rand$ = random number
- $RE$ = relative error
- $T$ = temperature (K)
- $V$ = volume (m$^{3}$)