Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA

Transpose 

When the input is stored as a row-major matrix and the output is needed in column-major form, we use the transpose communication pattern. It is particularly useful when you have a structure of arrays and want to convert it into an array of structures. It is also a one-to-one operation: each input element is written to exactly one output location. The code for the transpose pattern, for a matrix with a row width of 128, looks as follows:

out[i + j*128] = in[j + i*128];
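
To show where that one-line mapping sits in a complete program, here is a minimal sketch of a transpose kernel. It assumes a square 128 x 128 matrix of floats and a 16 x 16 thread block; the names transposeKernel and transposeOnDevice, the block size, and the test data are illustrative choices and do not come from the book.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int N = 128;  // assumed square matrix; 128 matches the index in the text

// One thread per element: reads in(i, j) and writes it to out(j, i).
__global__ void transposeKernel(float* out, const float* in)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;  // column index in the input
    int i = blockIdx.y * blockDim.y + threadIdx.y;  // row index in the input
    if (i < N && j < N) {
        out[i + j * N] = in[j + i * N];  // the transpose pattern from the text
    }
}

// Hypothetical host helper: copies the matrix to the GPU, launches the
// kernel, and copies the result back. Returns false if a CUDA call fails.
bool transposeOnDevice(const std::vector<float>& h_in, std::vector<float>& h_out)
{
    const size_t bytes = static_cast<size_t>(N) * N * sizeof(float);
    float* d_in = nullptr;
    float* d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaError_t err = cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(16, 16);  // assumed block size: 256 threads per block
        dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
        transposeKernel<<<grid, block>>>(d_out, d_in);
        err = cudaGetLastError();  // catches launch configuration errors
    }
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}

int main()
{
    std::vector<float> in(N * N), out(N * N, -1.0f);
    for (int k = 0; k < N * N; ++k) in[k] = static_cast<float>(k);

    if (!transposeOnDevice(in, out)) {
        std::printf("CUDA error while running the transpose\n");
        return 1;
    }
    // Check every element: out(j, i) must equal in(i, j).
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            if (out[i + j * N] != in[j + i * N]) {
                std::printf("mismatch at (%d, %d)\n", i, j);
                return 1;
            }
        }
    }
    std::printf("transpose verified\n");
    return 0;
}
```

In this sketch, reads from in are contiguous across neighboring threads while writes to out are strided by N, so it illustrates the communication pattern rather than a tuned implementation.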

This section discussed the various communication patterns that CUDA programs follow. It is useful to identify the pattern that matches your application and to use the example code shown for that pattern as a starting point.