CUDA Fortran for Scientists and Engineers shows how high-performance application developers can leverage the power of GPUs using Fortran. This post is the first in a series on CUDA Fortran, the Fortran interface to the CUDA parallel computing platform. If you are familiar with CUDA C, then you are already well on your way to using CUDA Fortran, as it is based on the CUDA C runtime API. There are a few differences in how CUDA concepts are expressed using Fortran 90 constructs, but the programming model for both CUDA Fortran and CUDA C is the same. If you are familiar with Fortran but new to CUDA, this series will cover the basic concepts of parallel computing on the CUDA platform. CUDA Fortran is essentially Fortran with a few extensions that allow one to execute subroutines on the GPU by many threads in parallel.

CUDA Programming Model Basics

Before we jump into CUDA Fortran code, those new to CUDA will benefit from a basic description of the CUDA programming model and some of the terminology used. (Those familiar with CUDA C or another interface to CUDA can jump to the next section.)

The CUDA programming model is a heterogeneous model in which both the CPU and GPU are used. In CUDA, the host refers to the CPU and its memory, while the device refers to the GPU and its memory. Code running on the host manages memory on both the host and the device, and also launches kernels, which are subroutines executed on the device. These kernels are executed by many GPU threads in parallel.

Given the heterogeneous nature of the CUDA programming model, a typical sequence of operations for a CUDA Fortran code is:

1. Declare and allocate host and device memory.
2. Transfer data from the host to the device.
3. Execute one or more kernels.
4. Transfer results from the device to the host.
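To make the sequence concrete, here is a minimal CUDA Fortran sketch of a SAXPY computation (y = a*x + y). The module and program names and the problem size are illustrative choices; the code assumes NVIDIA's CUDA Fortran compiler (nvfortran, formerly PGI) and a CUDA-capable GPU. The `device` attribute marks arrays resident in GPU memory, assignment between host and device arrays performs the transfers, and the `<<<grid, tBlock>>>` launch configuration runs the kernel across many threads in parallel.

```fortran
module mathOps
contains
  ! Kernel: runs on the device, one thread per array element
  attributes(global) subroutine saxpy(x, y, a)
    implicit none
    real :: x(:), y(:)
    real, value :: a
    integer :: i, n
    n = size(x)
    ! Global thread index (Fortran arrays are 1-based)
    i = blockDim%x * (blockIdx%x - 1) + threadIdx%x
    if (i <= n) y(i) = y(i) + a * x(i)
  end subroutine saxpy
end module mathOps

program testSaxpy
  use mathOps
  use cudafor
  implicit none
  integer, parameter :: N = 40000
  real :: x(N), y(N), a
  real, device :: x_d(N), y_d(N)   ! arrays in device memory
  type(dim3) :: grid, tBlock

  tBlock = dim3(256, 1, 1)
  grid   = dim3(ceiling(real(N) / tBlock%x), 1, 1)

  x = 1.0; y = 2.0; a = 2.0
  x_d = x                          ! host-to-device transfer via assignment
  y_d = y
  call saxpy<<<grid, tBlock>>>(x_d, y_d, a)  ! kernel launch
  y = y_d                          ! device-to-host transfer
  write(*,*) 'Max error: ', maxval(abs(y - 4.0))
end program testSaxpy
```

Note how each step of the sequence above appears in the host code: declaration and allocation (the array declarations), host-to-device transfer, kernel execution, and device-to-host transfer, with the transfers expressed as simple array assignments rather than explicit API calls.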