CUDA 5 - Others (Library, OpenACC, Dynamic parallelism)

Udacity course: Intro to Parallel Programming

  • Basics on GPU, CUDA, Memory Model
  • Parallel Algorithms (Reduce, Scan, Histogram, Sort)
  • Optimize Parallel GPU Programs
  • Others (Library, OpenACC, Dynamic parallelism)

1. GPU-Accelerated Libraries:

cuBLAS (dense linear algebra), ArrayFire (general-purpose array computing), OpenCV (computer vision), etc.

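Thrust <==> C++ STL: Thrust provides STL-style containers and algorithms that run on the GPU.

Algorithms include sort, sort_by_key, scan, and reduce (see the Thrust wiki on GitHub).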

Sort example
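Because Thrust mirrors the STL, the correspondence can be sketched with plain STL calls, which compile without the CUDA toolkit. This is a host-only illustration, not the example from the course; the helper names (sorted_copy, sum, prefix_sums) are made up here, and the Thrust calls named in the comments are the usual counterparts, operating on a thrust::device_vector instead of a std::vector.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Sort ascending.
// Thrust counterpart: thrust::sort(v.begin(), v.end()).
std::vector<int> sorted_copy(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Sum all elements (a reduce with operator+ and initial value 0).
// Thrust counterpart: thrust::reduce(v.begin(), v.end(), 0).
int sum(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Inclusive prefix sum (a scan): out[i] = v[0] + ... + v[i].
// Thrust counterpart: thrust::inclusive_scan(v.begin(), v.end(), out.begin()).
std::vector<int> prefix_sums(const std::vector<int>& v) {
    std::vector<int> out(v.size());
    std::partial_sum(v.begin(), v.end(), out.begin());
    return out;
}
```

For the input {3, 1, 2}, sorted_copy returns {1, 2, 3}, sum returns 6, and prefix_sums returns {3, 4, 6}.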

2. Other platforms

PyCUDA (CUDA from Python) and MATLAB (GPU arrays via the Parallel Computing Toolbox)

3. Cross-Platform Solutions

[Figure: gpu-cross]

OpenACC is directive-based: a pragma asks the compiler to offload the loop that follows.

#pragma acc kernels loop
for (i = 0; i < n; ++i)
	r[i] = a[i] * 2.0f;
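The loop above can be wrapped into a complete function as a minimal sketch. The function names (scale_by_two, doubled) and the copyin/copyout data clauses are additions for illustration and are not part of the original snippet. A compiler without OpenACC support ignores the unknown pragma and runs the loop serially, so the result is the same either way.

```cpp
#include <cassert>
#include <vector>

// Writes r[i] = 2 * a[i] for i in [0, n).
// With an OpenACC compiler the pragma requests offloading of the loop;
// the data clauses (an assumption added here) state which array sections
// are copied to and from the accelerator.
void scale_by_two(const float* a, float* r, int n) {
    #pragma acc kernels loop copyin(a[0:n]) copyout(r[0:n])
    for (int i = 0; i < n; ++i)
        r[i] = a[i] * 2.0f;
}

// Convenience wrapper (hypothetical helper) for use with std::vector.
std::vector<float> doubled(const std::vector<float>& a) {
    std::vector<float> r(a.size());
    scale_by_two(a.data(), r.data(), static_cast<int>(a.size()));
    return r;
}
```

Offloading typically has to be enabled with a compiler flag, for example -fopenacc with GCC or -acc with the NVIDIA HPC compilers.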

4. Dynamic parallelism

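Dynamic parallelism lets a kernel launch other kernels from the device, which enables nested and recursive parallelism.

It is supported on devices of compute capability 3.5 (compute_35) or higher, and requires compiling with relocatable device code (-rdc=true) and linking against cudadevrt.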

Quicksort example:

[Figure: gpu-quicksort]
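The recursive structure that makes quicksort a natural fit for dynamic parallelism can be sketched on the CPU. This is a host-side analogy in plain C++, not CUDA code and not the course's implementation: each recursive call below marks the point where a dynamic-parallelism version would launch a child kernel from device code. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sorts data[lo, hi) in place.
void quicksort(std::vector<int>& data, int lo, int hi) {
    if (hi - lo < 2) return;  // ranges of 0 or 1 element are already sorted

    int pivot = data[lo + (hi - lo) / 2];
    auto first = data.begin() + lo;
    auto last = data.begin() + hi;

    // Three-way partition: [< pivot | == pivot | > pivot].
    auto mid1 = std::partition(first, last,
                               [pivot](int x) { return x < pivot; });
    auto mid2 = std::partition(mid1, last,
                               [pivot](int x) { return x == pivot; });

    // On the GPU, each of these two calls would be a child kernel launch,
    // so the two halves could be sorted concurrently.
    quicksort(data, lo, static_cast<int>(mid1 - data.begin()));
    quicksort(data, static_cast<int>(mid2 - data.begin()), hi);
}

// Convenience wrapper (hypothetical helper) that returns a sorted copy.
std::vector<int> quicksorted(std::vector<int> v) {
    quicksort(v, 0, static_cast<int>(v.size()));
    return v;
}
```

The pivot itself always lands in the middle group, so both sub-ranges are strictly smaller than the input and the recursion terminates. Without dynamic parallelism, the host would have to launch a kernel for every sub-range, returning to the CPU at each level of the recursion.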
