Essay on programming languages, computer science, information technologies and all.

Wednesday, June 19, 2013

OpenCL - Connecting pixels

Sometimes a number of 1-pixel defects are scattered close together. When the pixels are close enough, it is better to treat them as a single defect rather than as separate defects.

To merge a number of close pixels, we can use closing, that is, dilation followed by erosion. This adds pixels between pixels that are separated horizontally or vertically, but it does not fill the gap between pixels that are separated diagonally. Refer to the two images below, which show the original image and the image after closing.


This is because the dilation spreads only to the 8 neighboring pixels, as below.

To fill the gap between diagonally separated pixels, the dilation should cover more: the 8 neighbors plus the pixels 2 away in the horizontal and vertical directions, as below.

When this 2-pixel-distance dilation is used, the diagonal gap is filled, as below.
The code for this 2-pixel dilation is below.
__kernel
void Dilation2PixelDistance(
  __global uchar *d_src, 
  __global uchar *d_dst,
  int width, int height,
  int roiLeft, int roiTop, int roiRight, int roiBottom )
{ 
  int x = (roiLeft & (~0x07)) + ( get_global_id(0) << 3 );
  int y = roiTop + get_global_id(1);
  int stride = width >> 3;

  if( x <= roiRight ) 
  {
    int idx = x + (y-2) * width;
    int bidx = idx >> 3;

    uchar dilated = 0, C;

    if( y > roiTop+1 )
    {
      C = d_src[bidx];
      dilated |= C;                                    // North, 2 rows up
    }

    bidx += stride;
    if( y > roiTop )
    {
      C = d_src[bidx];
      dilated |= C;                                    // North
      dilated |= (C >> 1) | ( d_src[bidx-1] << 7);     // North West
      dilated |= (C << 1) | ( d_src[bidx+1] >> 7);     // North East
    }

    bidx += stride;
    if( y <= roiBottom )
    {
      C = d_src[ bidx ];
      dilated |= C;                                    // Center
      dilated |= (C >> 2) | (d_src[ bidx - 1] << 6);   // West, 2 pixels away
      dilated |= (C >> 1) | (d_src[ bidx - 1] << 7);   // West
      dilated |= (C << 1) | (d_src[ bidx + 1] >> 7);   // East
      dilated |= (C << 2) | (d_src[ bidx + 1] >> 6);   // East, 2 pixels away
    }

    bidx += stride;
    if( y < roiBottom )
    {
      C = d_src[ bidx ];
      dilated |= C;                                    // South
      dilated |= (C >> 1) | (d_src[ bidx - 1] << 7);   // South West
      dilated |= (C << 1) | (d_src[ bidx + 1] >> 7);   // South East
    }

    bidx += stride;
    if( y < roiBottom -1 )
    {
      C = d_src[bidx];
      dilated |= C;                                    // South, 2 rows down
    }

    bidx -= (stride<<1);
    if( y <= roiBottom )
    {
      d_dst[ bidx ] = dilated;
    }
  }
}   
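The effect of the wider footprint can be checked on the CPU with a small reference. The following is a minimal sketch in plain C, not the kernel itself: it stores one byte per pixel instead of packed bits, treats pixels outside the image as 0, and the names `px`, `dilate`, `erode3x3` and `gap_after_closing` are made up for this illustration. It places two defects at (2,2) and (4,4) and reports whether the gap pixel (3,3) survives the closing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { W = 7, H = 7 };   /* small test image, one byte per pixel (0 or 1) */

/* Pixels outside the image read as 0 (an assumption of this sketch). */
static int px(const unsigned char *img, int x, int y)
{
    return (x >= 0 && x < W && y >= 0 && y < H) ? img[y * W + x] : 0;
}

/* Dilation. radius 1 gives the 3x3 square (8 neighbors).
   radius 2 keeps only offsets with |dx| + |dy| <= 2, which is the
   footprint read by Dilation2PixelDistance. */
static void dilate(const unsigned char *src, unsigned char *dst, int radius)
{
    assert(radius == 1 || radius == 2);
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            int v = 0;
            for (int dy = -radius; dy <= radius; dy++)
                for (int dx = -radius; dx <= radius; dx++)
                    if (abs(dx) + abs(dy) <= 2)
                        v |= px(src, x + dx, y + dy);
            dst[y * W + x] = (unsigned char)v;
        }
}

/* Erosion with the 3x3 square: a pixel survives only if all 9 pixels are set. */
static void erode3x3(const unsigned char *src, unsigned char *dst)
{
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            int v = 1;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    v &= px(src, x + dx, y + dy);
            dst[y * W + x] = (unsigned char)v;
        }
}

/* Closing of two defects at (2,2) and (4,4); returns the gap pixel (3,3). */
static int gap_after_closing(int radius)
{
    unsigned char src[W * H], tmp[W * H], out[W * H];
    memset(src, 0, sizeof src);
    src[2 * W + 2] = 1;
    src[4 * W + 4] = 1;
    dilate(src, tmp, radius);
    erode3x3(tmp, out);
    return out[3 * W + 3];
}
```

Calling `gap_after_closing(1)` uses the 8-neighbor dilation and leaves the gap empty, because the corners (2,4) and (4,2) of the 3x3 window around (3,3) are never reached. Calling `gap_after_closing(2)` reaches those corners and keeps the gap pixel set.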
The closing operation (2-pixel dilation followed by erosion) runs at 1.84 GB/s overall, including the copies to and from the device. Refer to the profiling result below.
Step 1 : start 0 ns, end 103840 ns, duration 103840 ns, 38243.76 MB/s              : copying source image to device
Step 2 : start 323584 ns, end 837440 ns, duration 513856 ns, 7728.30 MB/s          : 2 pixel dilation
Step 3 : start 1730560 ns, end 2041888 ns, duration 311328 ns, 12755.78 MB/s       : erosion
Step 4 : start 2072288 ns, end 2153088 ns, duration 80800 ns, 49148.92 MB/s        : copying result image from device
Total : duration 2153088 ns, 1844.44 MB/s
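The total is measured from the start of step 1 to the end of step 4, so it also contains the time between steps. As a rough way to separate the two, the sketch below sums the step durations and compares them with the whole span. The timestamps are copied from the listing above; the `Step` type and the function names are invented for this illustration.

```c
#include <assert.h>

/* Start/end timestamps in ns, copied from the profiling output above. */
typedef struct { long long start, end; } Step;

static const Step steps[4] = {
    {       0,  103840 },   /* copy source image to device */
    {  323584,  837440 },   /* 2-pixel dilation */
    { 1730560, 2041888 },   /* erosion */
    { 2072288, 2153088 },   /* copy result image from device */
};

/* Sum of the individual step durations. */
static long long busy_ns(const Step *s, int n)
{
    assert(n > 0);
    long long sum = 0;
    for (int i = 0; i < n; i++)
        sum += s[i].end - s[i].start;
    return sum;
}

/* Span from the first start to the last end. */
static long long span_ns(const Step *s, int n)
{
    assert(n > 0);
    return s[n - 1].end - s[0].start;
}
```

With these numbers the four steps add up to 1009824 ns of the 2153088 ns span, which leaves 1143264 ns (about 53%) between steps rather than inside them.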

Saturday, June 15, 2013

OpenCL - Dilation and Erosion in packed bit

With packed binary pixels (1 bit per pixel), a morphology operation can process a number of bits together, i.e. 8 pixels (1 byte) can be read, dilated, and eroded in one go. When morphology is done in units of a byte, the number of threads needed is the pixel count divided by 8. Below is an OpenCL kernel that does dilation over the 8-neighborhood.
__kernel void Dilation( 
  __global uchar *d_src, 
  __global uchar *d_dst,
  int width, int height,
  int roiLeft, int roiTop, int roiRight, int roiBottom )
{ 
  int x = (roiLeft & (~0x07)) + ( get_global_id(0) << 3 );
  int y = roiTop + get_global_id(1);
  int stride = width >> 3;

  if( x <= roiRight ) 
  {
    int idx = x + (y-1) * width;
    int bidx = idx >> 3;

    uchar dilated = 0, C;

    if( y > roiTop ) 
    {
      C = d_src[bidx];
      dilated |= C;                                   // North
      dilated |= (C >> 1) | ( d_src[bidx-1] << 7);    // North West
      dilated |= (C << 1) | ( d_src[bidx+1] >> 7);    // North East
    }

    bidx += stride;
    if( y <= roiBottom )
    {
      C = d_src[ bidx ];
      dilated |= C;                                    // Center
      dilated |= (C >> 1) | (d_src[ bidx - 1] << 7);   // West
      dilated |= (C << 1) | (d_src[ bidx + 1] >> 7);   // East
    }

    bidx += stride;
    if( y < roiBottom )
    {
      C = d_src[ bidx ];
      dilated |= C;                                    // South
      dilated |= (C >> 1) | (d_src[ bidx - 1] << 7);   // South West
      dilated |= (C << 1) | (d_src[ bidx + 1] >> 7);   // South East
    }

    bidx -= stride;
    if( y <= roiBottom )
    {
      d_dst[ bidx ] = dilated;
    }
  }
}   
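The West and East lines in the kernel imply that bit 7 of each byte is the leftmost pixel: shifting right by one moves every pixel one position to the right, and the bit that falls off the previous byte is brought in with `<< 7`. The sketch below isolates that step for a single row in plain C. It is an illustration only: the function name `dilate_row_packed` is made up, and unlike the kernel, which reads `d_src[bidx-1]` and `d_src[bidx+1]` unconditionally, it treats bytes outside the row as 0.

```c
#include <assert.h>
#include <stddef.h>

/* Horizontal 3-pixel dilation of one packed row (8 pixels per byte,
   bit 7 = leftmost pixel). Bytes outside the row are taken as 0,
   which is an assumption of this sketch. */
static void dilate_row_packed(const unsigned char *src, unsigned char *dst,
                              size_t nbytes)
{
    assert(src != dst);   /* in-place use would read already-dilated bytes */
    for (size_t i = 0; i < nbytes; i++) {
        unsigned char c     = src[i];
        unsigned char left  = (i > 0)          ? src[i - 1] : 0;
        unsigned char right = (i + 1 < nbytes) ? src[i + 1] : 0;
        /* value of pixel x-1 moved to position x */
        unsigned char west  = (unsigned char)((c >> 1) | (left << 7));
        /* value of pixel x+1 moved to position x */
        unsigned char east  = (unsigned char)((c << 1) | (right >> 7));
        dst[i] = (unsigned char)(c | west | east);
    }
}
```

For example, a row `{ 0x01, 0x00 }` has only pixel 7 set. After the call the row is `{ 0x03, 0x80 }`, i.e. pixels 6, 7 and 8 are set, so the dilation crosses the byte boundary correctly.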
Below is an OpenCL kernel that does erosion over the 8-neighborhood.
__kernel
void Erosion(
  __global uchar *d_src, 
  __global uchar *d_dst,
  int width, int height,
  int roiLeft, int roiTop, int roiRight, int roiBottom )
{ 
  int x = (roiLeft & (~0x07)) + ( get_global_id(0) << 3 );
  int y = roiTop + get_global_id(1);
  int stride = width >> 3;

  if( x <= roiRight ) 
  {
    int idx = x + (y-1) * width;
    int bidx = idx >> 3;

    uchar eroded = 0xFF, C;

    if( y > roiTop ) 
    {
      C = d_src[bidx];
      eroded &= C;
      eroded &= (C >> 1) | ( d_src[bidx-1] << 7);
      eroded &= (C << 1) | ( d_src[bidx+1] >> 7);
    }

    bidx += stride;
    if( y <= roiBottom )
    {
      C = d_src[ bidx ];
      eroded &= C;
      eroded &= (C >> 1) | (d_src[ bidx - 1] << 7);
      eroded &= (C << 1) | (d_src[ bidx + 1] >> 7);
    }

    bidx += stride;
    if( y < roiBottom )
    {
      C = d_src[ bidx ];
      eroded &= C;
      eroded &= (C >> 1) | (d_src[ bidx - 1] << 7);
      eroded &= (C << 1) | (d_src[ bidx + 1] >> 7);
    }

    bidx -= stride;
    if( y <= roiBottom )
    {
      d_dst[ bidx ] = eroded;
    }
  }
}
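The host code is not shown in this post, so the following is only a sketch of how the launch size could be derived from the index math in the kernels (`x = (roiLeft & ~7) + (gid0 << 3)`, `y = roiTop + gid1`, with inclusive ROI bounds). The helper name `roi_global_size` is hypothetical. The resulting array would be what gets passed as `global_work_size` to `clEnqueueNDRangeKernel`.

```c
#include <assert.h>
#include <stddef.h>

/* Global work size for the Dilation / Erosion kernels:
   one work-item per packed byte in x, one per image row in y. */
static void roi_global_size(int roiLeft, int roiTop, int roiRight, int roiBottom,
                            size_t global[2])
{
    assert(roiLeft >= 0 && roiLeft <= roiRight && roiTop <= roiBottom);
    int firstByteX = roiLeft & ~0x07;   /* ROI start rounded down to a byte boundary */
    global[0] = (size_t)((roiRight - firstByteX) >> 3) + 1;
    global[1] = (size_t)(roiBottom - roiTop) + 1;
}
```

For a ROI of (0, 0) to (2047, 2047) this gives 256 x 2048 work-items, which is the pixel count divided by 8. A ROI from x = 5 to x = 20 needs 3 work-items per row, covering the bytes that hold pixels 0 to 23.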