To merge a number of nearby pixels, we can use closing: dilation followed by erosion. This fills the gap between pixels that are separated horizontally or vertically, but it does not fill the gap between pixels that are separated diagonally. The two images below show the original image and the image after closing.
This is because the dilation spreads each pixel only to its 8 neighboring pixels, as shown below.
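To fill the gap between diagonally separated pixels, the dilation has to cover a larger area, as shown below: up to 2 pixels away horizontally and vertically, and 1 pixel away diagonally.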
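The difference between the two dilation shapes can be checked on a tiny image. The following is a minimal host-side sketch in plain C, not the author's OpenCL code: it uses one byte per pixel instead of bit packing, a hypothetical 7x7 image, and hypothetical helper names (`dilate`, `erode3x3`, `closing_fills_gap`). It also assumes the erosion step is an ordinary 3x3 erosion, which the text does not state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define W 7   /* hypothetical test image size, one byte per pixel */
#define H 7

/* Dilation. radius 1: 3x3 square (the 8 neighbors).
   radius 2: diamond |dx| + |dy| <= 2. Pixels outside the image count as 0. */
static void dilate(const unsigned char *src, unsigned char *dst, int radius)
{
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            unsigned char v = 0;
            for (int dy = -radius; dy <= radius; dy++) {
                for (int dx = -radius; dx <= radius; dx++) {
                    int xx = x + dx, yy = y + dy;
                    if (radius == 2 && abs(dx) + abs(dy) > 2) continue;
                    if (xx < 0 || xx >= W || yy < 0 || yy >= H) continue;
                    v |= src[yy * W + xx];
                }
            }
            dst[y * W + x] = v;
        }
    }
}

/* 3x3 erosion (assumed): a pixel survives only if all 9 pixels of its
   3x3 neighborhood are set. Pixels outside the image count as 0. */
static void erode3x3(const unsigned char *src, unsigned char *dst)
{
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            unsigned char v = 1;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int xx = x + dx, yy = y + dy;
                    if (xx < 0 || xx >= W || yy < 0 || yy >= H) v = 0;
                    else v &= src[yy * W + xx];
                }
            }
            dst[y * W + x] = v;
        }
    }
}

/* Sets two seed pixels, applies dilation (radius 1 or 2) and 3x3 erosion,
   and returns 1 if the gap pixel (gx, gy) is set afterwards. */
int closing_fills_gap(int x0, int y0, int x1, int y1, int gx, int gy, int radius)
{
    unsigned char img[W * H], dil[W * H], out[W * H];
    assert(radius == 1 || radius == 2);
    memset(img, 0, sizeof img);
    img[y0 * W + x0] = 1;
    img[y1 * W + x1] = 1;
    dilate(img, dil, radius);
    erode3x3(dil, out);
    return out[gy * W + gx];
}
```

Under these assumptions, `closing_fills_gap(2, 2, 4, 2, 3, 2, 1)` returns 1 (a horizontal gap is filled by the 3x3 dilation), `closing_fills_gap(2, 2, 4, 4, 3, 3, 1)` returns 0 (the diagonal gap is not), and `closing_fills_gap(2, 2, 4, 4, 3, 3, 2)` returns 1 (the diamond-shaped dilation fills it).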
The code for this 2-pixel dilation is shown below.
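The kernel works on a bit-packed image (8 pixels per byte), so moving a row by one pixel has to carry a bit across the byte boundary. The following is a minimal host-side sketch of that idiom in plain C for a single row. The function name `dilate_row_1px` is hypothetical, and the MSB-first pixel order (bit 7 is the leftmost pixel of a byte) is inferred from the way the kernel combines `C >> 1` with `d_src[bidx-1] << 7`. Unlike the kernel, the sketch treats bytes beyond the ends of the row as 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dilate one bit-packed row by one pixel to the left and to the right.
   Assumes MSB-first packing: bit 7 of each byte is its leftmost pixel. */
void dilate_row_1px(const uint8_t *src, uint8_t *dst, size_t nbytes)
{
    assert(src != dst);   /* not in-place: neighbors must stay unmodified */
    for (size_t i = 0; i < nbytes; i++) {
        uint8_t c    = src[i];
        uint8_t prev = (i > 0) ? src[i - 1] : 0;           /* byte to the left  */
        uint8_t next = (i + 1 < nbytes) ? src[i + 1] : 0;  /* byte to the right */
        uint8_t out  = c;
        /* Row moved right by one pixel: the last pixel of the left byte
           (its bit 0) enters as bit 7. */
        out |= (uint8_t)((c >> 1) | (prev << 7));
        /* Row moved left by one pixel: the first pixel of the right byte
           (its bit 7) enters as bit 0. */
        out |= (uint8_t)((c << 1) | (next >> 7));
        dst[i] = out;
    }
}
```

For example, a row of two bytes `{0x01, 0x00}` (a single pixel in the last position of the first byte) becomes `{0x03, 0x80}`: the pixel spreads one position left within its own byte and one position right into the next byte.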
__kernel void Dilation2PixelDistance( __global uchar *d_src, __global uchar *d_dst,
                                      int width, int height,
                                      int roiLeft, int roiTop, int roiRight, int roiBottom )
{
    // Each work-item produces one byte (8 bit-packed pixels) of row y.
    int x = (roiLeft & (~0x07)) + ( get_global_id(0) << 3 );
    int y = roiTop + get_global_id(1);
    int stride = width >> 3;   // bytes per row

    if( x <= roiRight )
    {
        int idx = x + (y-2) * width;
        int bidx = idx >> 3;   // byte index in row y-2
        uchar dilated = 0, C;

        // Row y-2: center column only.
        if( y > roiTop+1 )
        {
            C = d_src[bidx];
            dilated |= C;
        }
        bidx += stride;
        // Row y-1: center and 1 pixel to each side.
        if( y > roiTop )
        {
            C = d_src[bidx];
            dilated |= C;
            dilated |= (C >> 1) | ( d_src[bidx-1] << 7);
            dilated |= (C << 1) | ( d_src[bidx+1] >> 7);
        }
        bidx += stride;
        // Row y: center and up to 2 pixels to each side.
        if( y <= roiBottom )
        {
            C = d_src[ bidx ];
            dilated |= C;
            dilated |= (C >> 2) | (d_src[ bidx - 1] << 6);
            dilated |= (C >> 1) | (d_src[ bidx - 1] << 7);
            dilated |= (C << 1) | (d_src[ bidx + 1] >> 7);
            dilated |= (C << 2) | (d_src[ bidx + 1] >> 6);
        }
        bidx += stride;
        // Row y+1: center and 1 pixel to each side.
        if( y < roiBottom )
        {
            C = d_src[ bidx ];
            dilated |= d_src[ bidx ];
            dilated |= (C >> 1) | (d_src[ bidx - 1] << 7);
            dilated |= (C << 1) | (d_src[ bidx + 1] >> 7);
        }
        bidx += stride;
        // Row y+2: center column only.
        if( y < roiBottom -1 )
        {
            C = d_src[bidx];
            dilated |= C;
        }
        // Step back two rows to row y and store the result.
        bidx -= (stride<<1);
        if( y <= roiBottom )
        {
            d_dst[ bidx ] = dilated;
        }
    }
}

The closing operation (2-pixel dilation followed by erosion) runs at 1.84 GB/s. See the profiling result below.
Step 1 : start 0 ns, end 103840 ns, duration 103840 ns, 38243.76 MB/s : copying source image to device
Step 2 : start 323584 ns, end 837440 ns, duration 513856 ns, 7728.30 MB/s : 2 pixel dilation
Step 3 : start 1730560 ns, end 2041888 ns, duration 311328 ns, 12755.78 MB/s : erosion
Step 4 : start 2072288 ns, end 2153088 ns, duration 80800 ns, 49148.92 MB/s : copying result image from device
Total : duration 2153088 ns, 1844.44 MB/s
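The total figure covers the whole span from the first start to the last end, including the time between steps. As a rough sketch of that arithmetic in plain C (the `Step` type and both function names are hypothetical, and the timestamps are taken from the profiling output above), the busy time and the overall span can be separated like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one profiled step, in nanoseconds. */
typedef struct {
    long long start_ns;
    long long end_ns;
} Step;

/* Sum of the individual step durations. */
long long busy_ns(const Step *steps, size_t n)
{
    long long total = 0;
    for (size_t i = 0; i < n; i++)
        total += steps[i].end_ns - steps[i].start_ns;
    return total;
}

/* Span from the start of the first step to the end of the last one,
   assuming the steps are listed in chronological order. */
long long span_ns(const Step *steps, size_t n)
{
    assert(n > 0);
    return steps[n - 1].end_ns - steps[0].start_ns;
}
```

For the four steps listed, the busy time comes to 1009824 ns out of a span of 2153088 ns, so a little over half of the measured total is spent between steps rather than inside them.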