essay on programming languages, computer science, information technologies and all.

Sunday, February 17, 2013

Next Move

CUDA
  • Use the GT 640. Study PTX instructions. VLIW-aware kernel programming

SSE
  • Any benefit from using stores that bypass the cache (non-temporal stores)?
  • Calculate throughput.
    • EB = ( Br + Bw ) / T, where EB: effective bandwidth, Br: bytes read, Bw: bytes written, T: elapsed time
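
The throughput formula above can be sketched as a small helper. This is only a minimal sketch, assuming bytes and seconds as units; the function name `effective_bandwidth` and the example figures are made up for illustration and are not measured results.

```python
def effective_bandwidth(bytes_read, bytes_written, seconds):
    """Return EB = (Br + Bw) / T in bytes per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return (bytes_read + bytes_written) / seconds


# Hypothetical example: copying a 64 MiB buffer reads 64 MiB and writes 64 MiB.
size = 64 * 1024 * 1024
eb = effective_bandwidth(size, size, 0.5)  # 0.5 s is an invented timing
print(f"{eb / 1e9:.3f} GB/s")
```

Counting both the read and the written bytes matters when comparing a plain copy against a cache-bypassing store, since both variants move the same amount of data and only T should differ.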

HW
  • Study hardware: caches, memory banks and so on

OpenCL
  • Use NVIDIA OpenCL with the GT 640. Compare results with CUDA, using buffers and using images.
  • Use NVIDIA OpenCL with the GT 640 and vector types.
  • Use Intel OpenCL. Compare results with the SSE code.
  • Run kernels on the GT 640 GPU and the G2120 CPU at the same time.
  • Buy a 3rd-generation processor (i3-3220) to get access to its HD 2500 GPU.

Inspection algorithm
  • Image preprocessing
    • Geometric distortion correction - radial distortion in area cameras
    • Shading correction - area and line scan. How to define the compensation?
  • periodic pattern inspection - 2 points (horizontal, vertical), 4 points, 8 points, many points in horizontal and vertical
  • threshold - constant, adaptive threshold, ...
  • binary image handling
  • morphology - open, close, kernel size, iteration count, iteration order
  • segmentation - contour based, region based, bug follower, area, bounding box calculation
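
One possible reading of the 2-point periodic pattern inspection combined with a constant threshold can be sketched as follows. This is an illustration under assumptions, not a settled design: the function name `periodic_defects`, the rule that a pixel must differ from both neighbours one period away, and the choice to skip border pixels are all hypothetical.

```python
def periodic_defects(image, pitch, threshold):
    """2-point horizontal comparison for a pattern repeating every `pitch` pixels.

    A pixel is flagged (1) when it differs by more than `threshold` from BOTH
    the pixel one period to the left and the pixel one period to the right.
    Pixels that lack one of the two neighbours are left unflagged
    (an assumption of this sketch).
    """
    result = []
    for row in image:
        width = len(row)
        out = [0] * width
        for x in range(pitch, width - pitch):
            diff_left = abs(row[x] - row[x - pitch])
            diff_right = abs(row[x] - row[x + pitch])
            # Requiring both differences avoids a "ghost" one period away
            # from the real defect.
            if min(diff_left, diff_right) > threshold:
                out[x] = 1
        result.append(out)
    return result


# Hypothetical test row: the pattern 10, 50, 90 repeats, with one defect at x = 4.
row = [10, 50, 90] * 4
row[4] = 200
mask = periodic_defects([row], pitch=3, threshold=20)
print(mask)
```

The resulting binary mask is what the later steps (morphology, segmentation) would operate on. A vertical or 4-point variant would add the neighbours one period above and below in the same way.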

Distributed system
  • Group communication to manage multiple inspection processors
  • Join/leave the group
  • Send packets to all group members quickly and reliably using broadcast or multicast.
  • A designated member can receive the burst of REPs from all the group members.
  • A processor can join more than one group
  • Stream images to a member, or to a designated member
  • Message types: REQ, REP, ACK, AYA (Are you alive?), IAA (I am alive), TA (Try again), AU (Address unknown)
  • Make the API flexible and easy to use? Allow different types of REQ? Refer to other network APIs?
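
To get a feel for what such an API might look like, here is a rough in-memory sketch of the message types and of group join/leave. Nothing here does real networking: the `Group` class, its method names, the handler signature, and the reading of REQ/REP/ACK as request/reply/acknowledge are all assumptions made for illustration, and AU is used for a send to a member that is not in the group.

```python
from enum import Enum


class MsgType(Enum):
    REQ = "request"          # assumed expansion
    REP = "reply"            # assumed expansion
    ACK = "acknowledge"      # assumed expansion
    AYA = "are you alive?"
    IAA = "I am alive"
    TA = "try again"
    AU = "address unknown"


class Group:
    """In-memory stand-in for a broadcast/multicast group (hypothetical API)."""

    def __init__(self, name):
        self.name = name
        self.members = {}  # member id -> handler(msg_type, payload)

    def join(self, member_id, handler):
        self.members[member_id] = handler

    def leave(self, member_id):
        self.members.pop(member_id, None)

    def send_to_all(self, msg_type, payload=None):
        """Deliver to every member and collect the burst of replies."""
        replies = {}
        for member_id, handler in list(self.members.items()):
            replies[member_id] = handler(msg_type, payload)
        return replies

    def send_to(self, member_id, msg_type, payload=None):
        handler = self.members.get(member_id)
        if handler is None:
            return (MsgType.AU, None)  # no such member in this group
        return handler(msg_type, payload)


def inspection_processor(msg_type, payload):
    """Hypothetical member: answers liveness checks and requests."""
    if msg_type is MsgType.AYA:
        return (MsgType.IAA, None)
    if msg_type is MsgType.REQ:
        return (MsgType.REP, payload)
    return (MsgType.ACK, None)


group = Group("line-1")
group.join("ip-1", inspection_processor)
group.join("ip-2", inspection_processor)
alive = group.send_to_all(MsgType.AYA)   # burst of IAA replies
group.leave("ip-2")
missing = group.send_to("ip-2", MsgType.REQ, "status")
```

Because a member is just an id plus a handler, the same processor can join several `Group` instances, and different kinds of REQ could be told apart by the payload. A real version would still have to deal with lost packets and retries (TA), which this sketch ignores.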
