Advanced Imaging


Advanced Imaging Magazine

Updated: January 12th, 2011 09:49 AM CDT

Stylin’ and Profilin’ at Very High Speeds

Virtual prototyping of fashion designs cuts product development time from 190 days to as little as 35

By Barry Hochfelder

NVIDIA’s Gupta explains that “CUDA is the massively parallel computing architecture of NVIDIA GPUs and the associated parallel programming model. NVIDIA’s GPUs consist of hundreds of processor cores that operate in tandem to crunch through complex mathematical computations. Software programmers can use C, C++, and other high-level programming languages to map the computationally intensive portions or functions of their software applications onto the GPU. They must express these functions in a parallel way so as to break [them] down into thousands of smaller computations. In this way, the GPU works in conjunction with the CPU; whereas the CPU runs the sequential portion of the application, the GPU runs the computationally intensive and parallel portions of the application. This enables speedups ranging all the way up to 100 to 250 times over using just a CPU.”

The CUDA Software Development Environment supports two programming interfaces. With the device-level programming interface, the application uses DirectX Compute, OpenCL or the CUDA driver API directly to configure the GPU, launch compute kernels and read back results. With the language integration programming interface, the application uses the C Runtime for CUDA, and developers use a small set of extensions to indicate which compute functions should be performed on the GPU instead of the CPU.

When using the device-level programming interface, developers write compute kernels in separate files using the kernel language supported by their API of choice. DirectX Compute kernels (aka “compute shaders”) are written in HLSL. OpenCL kernels are written in a C-like language called “OpenCL C”. The CUDA Driver API accepts kernels written in C or PTX assembly.
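As a rough illustration of the device-level route, the sketch below uses the CUDA driver API to load a kernel as PTX, launch it and read back the result. Several things here are assumptions made for the example and do not come from the article: the kernel name scale, the helper names, and the use of NVRTC to turn the kernel source into PTX at run time. The article describes kernels kept in separate files; the source is embedded as a string here only to keep the listing self-contained.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda.h>    // CUDA driver API
#include <nvrtc.h>   // run-time compiler, assumed here as the way to obtain PTX

// Hypothetical kernel; in the workflow described above it would live in its own file.
static const char *kKernelSrc =
    "extern \"C\" __global__ void scale(float *data, float factor, int n)\n"
    "{\n"
    "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
    "    if (i < n) data[i] *= factor;\n"
    "}\n";

// Abort with a readable message if a driver API call fails.
static void check_cu(CUresult r, const char *what)
{
    if (r != CUDA_SUCCESS) {
        const char *msg = nullptr;
        cuGetErrorString(r, &msg);
        std::fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown error");
        std::exit(EXIT_FAILURE);
    }
}

// Abort with a readable message if an NVRTC call fails.
static void check_nvrtc(nvrtcResult r, const char *what)
{
    if (r != NVRTC_SUCCESS) {
        std::fprintf(stderr, "%s failed: %s\n", what, nvrtcGetErrorString(r));
        std::exit(EXIT_FAILURE);
    }
}

// Compile the kernel string to PTX text.
static std::vector<char> compile_to_ptx()
{
    nvrtcProgram prog;
    check_nvrtc(nvrtcCreateProgram(&prog, kKernelSrc, "scale.cu", 0, nullptr, nullptr),
                "nvrtcCreateProgram");
    check_nvrtc(nvrtcCompileProgram(prog, 0, nullptr), "nvrtcCompileProgram");
    size_t size = 0;
    check_nvrtc(nvrtcGetPTXSize(prog, &size), "nvrtcGetPTXSize");
    std::vector<char> ptx(size);
    check_nvrtc(nvrtcGetPTX(prog, ptx.data()), "nvrtcGetPTX");
    nvrtcDestroyProgram(&prog);
    return ptx;
}

// Configure the GPU, launch the kernel, read back the result.
// Returns true if every element was multiplied by `factor`.
bool run_scale(int n, float factor)
{
    check_cu(cuInit(0), "cuInit");
    CUdevice dev;
    check_cu(cuDeviceGet(&dev, 0), "cuDeviceGet");
    CUcontext ctx;
    check_cu(cuDevicePrimaryCtxRetain(&ctx, dev), "cuDevicePrimaryCtxRetain");
    check_cu(cuCtxSetCurrent(ctx), "cuCtxSetCurrent");

    std::vector<char> ptx = compile_to_ptx();
    CUmodule mod;
    check_cu(cuModuleLoadData(&mod, ptx.data()), "cuModuleLoadData");
    CUfunction fn;
    check_cu(cuModuleGetFunction(&fn, mod, "scale"), "cuModuleGetFunction");

    std::vector<float> host(n, 2.0f);
    size_t bytes = host.size() * sizeof(float);
    CUdeviceptr dptr;
    check_cu(cuMemAlloc(&dptr, bytes), "cuMemAlloc");
    check_cu(cuMemcpyHtoD(dptr, host.data(), bytes), "cuMemcpyHtoD");

    // Kernel arguments are passed as an array of pointers to each value.
    void *args[] = { &dptr, &factor, &n };
    unsigned threads = 256;
    unsigned blocks = (n + threads - 1) / threads;
    check_cu(cuLaunchKernel(fn, blocks, 1, 1, threads, 1, 1, 0, nullptr, args, nullptr),
             "cuLaunchKernel");

    // The synchronous copy waits for the kernel on the default stream to finish.
    check_cu(cuMemcpyDtoH(host.data(), dptr, bytes), "cuMemcpyDtoH");

    cuMemFree(dptr);
    cuModuleUnload(mod);
    cuDevicePrimaryCtxRelease(dev);

    bool ok = true;
    for (float v : host) ok = ok && (v == 2.0f * factor);
    return ok;
}

int main()
{
    bool ok = run_scale(1 << 16, 1.5f);
    std::printf("scale kernel %s\n", ok ? "succeeded" : "produced unexpected values");
    return ok ? 0 : 1;
}
```

A build along the lines of nvcc scale_demo.cu -lnvrtc -lcuda should work on a machine with a recent toolkit and driver, though the file name and exact flags are assumptions. The point of the sketch is how much setup the application does by hand: initializing the device, loading the module, marshalling arguments and copying data.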

When using the language integration programming interface, developers write compute functions in C, and the C Runtime for CUDA automatically handles setting up the GPU and executing the compute functions. This programming interface lets developers take advantage of native support for high-level languages such as C, C++, Fortran, Java, Python and more, reducing code complexity and development costs through type integration and code integration.
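A minimal sketch of the language integration interface might look like the following. The kernel name saxpy, the helper names and the problem size are assumptions chosen for illustration. The __global__ qualifier and the <<<...>>> launch syntax are the language extensions that mark which function runs on the GPU; the runtime sets up the device behind the first API call.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each GPU thread handles one of the n small computations.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a * x[i] + y[i];
    }
}

// Abort with a readable message if a runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Runs saxpy on the GPU and returns the largest absolute difference
// from the value the CPU expects for every element.
float run_saxpy(int n, float a)
{
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    size_t bytes = x.size() * sizeof(float);

    float *dx = nullptr, *dy = nullptr;
    check(cudaMalloc((void **)&dx, bytes), "cudaMalloc x");
    check(cudaMalloc((void **)&dy, bytes), "cudaMalloc y");
    check(cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice), "copy x");
    check(cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice), "copy y");

    // The CPU runs the sequential setup; the GPU runs the parallel portion.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, a, dx, dy);
    check(cudaGetLastError(), "kernel launch");

    // cudaMemcpy waits for the kernel to finish before copying.
    check(cudaMemcpy(y.data(), dy, bytes, cudaMemcpyDeviceToHost), "copy back");
    cudaFree(dx);
    cudaFree(dy);

    float expected = a * 1.0f + 2.0f;
    float max_err = 0.0f;
    for (int i = 0; i < n; ++i) {
        max_err = std::fmax(max_err, std::fabs(y[i] - expected));
    }
    return max_err;
}

int main()
{
    float err = run_saxpy(1 << 20, 3.0f);
    std::printf("max error: %g\n", err);
    return err < 1e-6f ? 0 : 1;
}
```

Compared with the device-level route, there is no explicit context, module or argument marshalling here; the kernel is called almost like an ordinary C function.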

Type integration allows standard types as well as vector types and user-defined types (including structs) to be used seamlessly across functions that are executed on the CPU and functions that are executed on the GPU. Code integration allows the same function to be called from functions that will be executed on the CPU and functions that will be executed on the GPU.
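Both ideas can be sketched in a few lines. In the example below, the Pixel struct, the luminance function and the weighting constants are assumptions made for illustration. The struct combines a built-in vector type (float3) with a plain float and is used unchanged on both sides, which is the type integration described above. The luminance function carries the __host__ __device__ qualifiers, so the same source is called from a GPU kernel and from ordinary CPU code, which is the code integration.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// User-defined type shared by CPU and GPU code (type integration).
struct Pixel {
    float3 rgb;    // built-in CUDA vector type
    float  alpha;
};

// One function compiled for both CPU and GPU (code integration).
// The weights are an illustrative choice, not taken from the article.
__host__ __device__ inline float luminance(const Pixel &p)
{
    return p.alpha * (0.299f * p.rgb.x + 0.587f * p.rgb.y + 0.114f * p.rgb.z);
}

// GPU side: one thread per pixel, calling the shared function.
__global__ void luminance_kernel(const Pixel *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = luminance(in[i]);
    }
}

// Abort with a readable message if a runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Computes luminance on the GPU and on the CPU with the same function
// and returns the largest absolute difference between the two results.
float max_cpu_gpu_difference(int n)
{
    std::vector<Pixel> pixels(n);
    for (int i = 0; i < n; ++i) {
        float v = (i % 256) / 255.0f;
        pixels[i].rgb = make_float3(v, 1.0f - v, 0.5f);
        pixels[i].alpha = 1.0f;
    }

    Pixel *d_in = nullptr;
    float *d_out = nullptr;
    check(cudaMalloc((void **)&d_in, n * sizeof(Pixel)), "cudaMalloc in");
    check(cudaMalloc((void **)&d_out, n * sizeof(float)), "cudaMalloc out");
    check(cudaMemcpy(d_in, pixels.data(), n * sizeof(Pixel), cudaMemcpyHostToDevice),
          "copy pixels");

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    luminance_kernel<<<blocks, threads>>>(d_in, d_out, n);
    check(cudaGetLastError(), "kernel launch");

    std::vector<float> gpu(n);
    check(cudaMemcpy(gpu.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost),
          "copy results");
    cudaFree(d_in);
    cudaFree(d_out);

    // CPU side: the very same luminance() function, called from host code.
    float max_diff = 0.0f;
    for (int i = 0; i < n; ++i) {
        max_diff = std::fmax(max_diff, std::fabs(gpu[i] - luminance(pixels[i])));
    }
    return max_diff;
}

int main()
{
    float diff = max_cpu_gpu_difference(1 << 16);
    std::printf("largest CPU/GPU difference: %g\n", diff);
    return diff < 1e-5f ? 0 : 1;
}
```

The comparison uses a small tolerance because the CPU and GPU may round intermediate results slightly differently, even when both run the same source.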
