struct AllocPinnedMemory
{
    // Allocate width*height bytes of page-locked (pinned) host memory.
    static uint8_t* Alloc( int width, int height )
    {
        uint8_t *p = NULL;
        // Return NULL on failure so an error is not mistaken for a valid buffer.
        if ( cudaHostAlloc( (void**)&p, (size_t)width * height, cudaHostAllocDefault ) != cudaSuccess )
            return NULL;
        return p;
    }

    // Pinned memory must be released with cudaFreeHost, not delete.
    static void Free( uint8_t* p ) { cudaFreeHost( p ); }
};

typedef Image< uint8_t, AllocPinnedMemory > PinnedImage;

BOOST_AUTO_TEST_CASE( TestProcess2 )
{
    // ...
    PinnedImage src( width, height );
    PinnedImage dst( width, height );
    // ...
}

This makes the memory copies about twice as fast as with the previous host memory, which was allocated by 'new'. According to the Visual Profiler screenshot, the non-pinned memory copies take 1.24 ms (HtoD) and 1.27 ms (DtoH), while the pinned memory copies take 628 us in both directions (1.24 ms / 0.628 ms is roughly 1.97). As expected, there is no difference in the kernel time.
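The post does not show the Image template or the earlier 'new'-based allocator, so the following is only a rough sketch of how such an allocator policy might plug in. The names AllocHostMemory and HostImage, and the whole body of Image, are assumptions made for illustration, not the author's actual code. It uses plain host memory so that it compiles without the CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pageable-host allocator with the same interface as
// AllocPinnedMemory, but backed by new[]/delete[].
struct AllocHostMemory {
    static std::uint8_t* Alloc(int width, int height) {
        return new std::uint8_t[static_cast<std::size_t>(width) * height];
    }
    static void Free(std::uint8_t* p) { delete[] p; }
};

// Hypothetical Image: owns a width*height buffer obtained from Allocator.
// The allocators here hand out bytes, so this sketch only supports uint8_t.
template <typename T, typename Allocator>
class Image {
    static_assert(std::is_same<T, std::uint8_t>::value,
                  "this sketch only supports uint8_t pixels");

public:
    Image(int width, int height)
        : width_(width), height_(height),
          data_(Allocator::Alloc(width, height)) {}

    // The buffer is released through the same policy that allocated it.
    ~Image() { Allocator::Free(data_); }

    // Non-copyable, so the buffer is never freed twice.
    Image(const Image&) = delete;
    Image& operator=(const Image&) = delete;

    T* data() { return data_; }
    int width() const { return width_; }
    int height() const { return height_; }
    std::size_t size_bytes() const {
        return static_cast<std::size_t>(width_) * height_ * sizeof(T);
    }

private:
    int width_;
    int height_;
    T* data_;
};

// Switching between pageable and pinned memory is then a one-line typedef change.
typedef Image<std::uint8_t, AllocHostMemory> HostImage;
```

With a design like this, the test body stays the same and only the typedef decides where the host buffer lives, which fits the post's point that nothing but the allocation was changed.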
Essays on programming languages, computer science, information technologies and all.
Wednesday, January 30, 2013
CUDA Study - Pinned Memory
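CUDA recommends pinned (page-locked) host memory for host-device transfers, so to see the effect only the host memory allocation has been modified; the rest of the test is unchanged. The code in this post shows the change.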