The algorithm
SIFT is quite an involved algorithm. It has a lot going on and can become confusing, So I’ve split up the entire algorithm into multiple parts. Here’s an outline of what happens in SIFT.- Constructing a scale space
- This is the initial preparation. You create internal representations of the original image to ensure scale invariance. This is done by generating a “scale space”.
- LoG Approximation
- The Laplacian of Gaussian is great for finding interesting points (or key points) in an image. But it’s computationally expensive. So we cheat and approximate it using the representation created earlier.
- Finding keypoints
- With the super fast approximation, we now try to find key points. These are maxima and minima in the Difference of Gaussian image we calculate in step 2
- Get rid of bad key points
- Edges and low contrast regions are bad keypoints. Eliminating these makes the algorithm efficient and robust. A technique similar to the Harris Corner Detector is used here.
- Assigning an orientation to the keypoints
- An orientation is calculated for each key point. Any further calculations are done relative to this orientation. This effectively cancels out the effect of orientation, making it rotation invariant.
- http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/