Least squares spectral element methods formulate a partial differential equation (PDE) as an optimization problem. One advantage of this approach is that boundary conditions can be imposed weakly, with little effort, by adding a penalty equation to the cost function. In this study a discontinuous methodology is used; that is, each element has its own set of degrees-of-freedom. Compared to its continuous counterpart, this formulation yields a sparser Jacobian matrix with a smaller bandwidth. However, these attributes come at the expense of an increased number of degrees-of-freedom for a given discretization. In the current work, the conventional discontinuous approach is modified to convert the equations to a matrix free system in which the global system never needs to be assembled. Continuity between two neighboring elements is imposed weakly through a penalty equation added to the original PDE in each element. This penalty term minimizes the integral of the square of the difference between the unknown state-vectors of neighboring elements along each shared edge. The conventional discontinuous approach evaluates this integral at the current time iterate, which requires assembly of the system and is therefore not matrix free. It is shown in this study that modifying this equation makes it possible to obtain a matrix free system. Additionally, each element becomes independent of the other elements, so a direct solution for each element is possible. The system matrix obtained by this least squares method is symmetric positive definite and can be solved effectively by Cholesky decomposition. This solution procedure is well suited for parallelization using Pthreads and CUDA.
This is because no communication is needed: each element only reads data from its neighboring elements while solving for its own unknowns. Another advantage of the matrix free approach is that adaptation is easily implemented: only the new state-vectors need to be introduced into the data structures and the neighbor connectivity updated. The value of the cost function in the formulation may be used to select the elements to be refined. Each tagged element is then divided by h-refinement, which results in a nonconformal mesh. Using a nonconformal mesh avoids increasing the resolution where it is not needed; requiring conformality of the mesh would increase both the extent of refinement and the number of degrees-of-freedom. In the current work, quintic quadrilateral elements are used in the simulations, and a C++ vector class is used for updating the mesh refinement data structures.
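
The element-by-element solve described above can be illustrated on a deliberately small model problem. The sketch below is not the implementation used in this work. It assumes a one-dimensional equation u' = f with u(0) = 0, linear two-node discontinuous elements rather than quintic quadrilaterals, a single hypothetical penalty weight `w`, and, as one plausible reading of the modified penalty equation, neighbor values frozen at the previous iterate. All function names are illustrative. Under those assumptions each element minimizes its own functional, which leads to a 2x2 symmetric positive definite system solved by Cholesky decomposition, and no global matrix is ever formed.

```python
import math

def cholesky_solve(A, b):
    """Solve A x = b for a small symmetric positive definite A via A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def solve_element(h, f, w_left, g_left, w_right, g_right):
    """Minimize the element functional
        (uR - uL - h*f)^2 / h + w_left*(uL - g_left)^2 + w_right*(uR - g_right)^2
    over the element's own unknowns (uL, uR). g_left and g_right are frozen
    neighbor (or boundary) data, so the normal equations are purely local."""
    a = 1.0 / h
    A = [[a + w_left, -a], [-a, a + w_right]]
    b = [w_left * g_left - f, w_right * g_right + f]
    return cholesky_solve(A, b)

def sweep(u, h, f, w, u_inflow):
    """One matrix free sweep: every element is solved independently and reads
    neighbor values only from the previous iterate u (assumed lagging)."""
    n = len(u)
    new = []
    for e in range(n):
        g_left = u_inflow if e == 0 else u[e - 1][1]
        if e + 1 < n:
            new.append(solve_element(h, f, w, g_left, w, u[e + 1][0]))
        else:
            # Last element: no penalty at the outflow end of a first-order problem.
            new.append(solve_element(h, f, w, g_left, 0.0, 0.0))
    return new

def run(n_elem=4, w=1.0, sweeps=2000):
    """Iterate sweeps for u' = 1, u(0) = 0 on [0, 1]; the exact solution is u = x."""
    h = 1.0 / n_elem
    u = [[0.0, 0.0] for _ in range(n_elem)]
    for _ in range(sweeps):
        u = sweep(u, h, 1.0, w, 0.0)
    return u
```

Calling `run()` returns the left and right values of each element, which for this model problem approach the nodal values of u = x. Because the loop body in `sweep` touches only the element's own unknowns and read-only neighbor data, each iteration could in principle be assigned to a separate thread; the serial loop here only stands in for that.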
