
OpenCV. Automatic cropping and image warping.

This technique is widely used for preprocessing images in photo retouching datasets.

Open in Google Colab

Today we will use the same idea as in the lecture "Points matching with SVD in 3D space", but instead of SVD we will use the RANSAC estimation method on points matched with the KAZE descriptor (any descriptor can be used).

We want to find the warping parameters between the original image and the retouched one (you will need this for supervised training).

Our task is to find keypoints on the Input and Target images and then find the transformation matrix T

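\begin{align}T\begin{bmatrix} x_{i} \\ y_{i} \\ 1 \end{bmatrix} \sim s_i\begin{bmatrix} x_{i}^{\prime} \\ y_{i}^{\prime} \\ 1 \end{bmatrix}\end{align}

Here (x_i, y_i) and (x_i', y_i') are matched keypoints on the Input and Target images, and s_i is a nonzero scale factor for point i.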
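
To make the relation above concrete, here is a minimal sketch of mapping one point through a 3x3 matrix in homogeneous coordinates. The helper name `apply_homography` and the matrix `T_example` are made up for illustration and are not part of the lecture code.

```python
import numpy as np

def apply_homography(T, point):
    """Map a 2D point through a 3x3 homography T."""
    x, y = point
    # Lift to homogeneous coordinates and multiply by T.
    xh, yh, w = T @ np.array([x, y, 1.0])
    # Divide by the third component to return to pixel coordinates.
    return xh / w, yh / w

# Hypothetical matrix: scale by 2, then shift by (10, 5).
T_example = np.array([[2.0, 0.0, 10.0],
                      [0.0, 2.0, 5.0],
                      [0.0, 0.0, 1.0]])
mapped = apply_homography(T_example, (3.0, 4.0))
```

With this matrix the point (3, 4) lands on (16, 13). The division by the third component is what the "~" in the relation hides: both sides are equal only up to scale.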

Solution

To do this we need to find the best keypoints in both images. For this, we can use one of the classic descriptors like SIFT, ORB, KAZE, etc. We will use KAZE because it is free and ships with OpenCV.

Points matching

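    import math

    import cv2
    import matplotlib.pyplot as plt
    import numpy as np

    # ench and orig are the paths to the retouched and original images.
    ench_image = cv2.imread(ench, cv2.IMREAD_GRAYSCALE)
    orig_image = cv2.imread(orig, cv2.IMREAD_GRAYSCALE)
    orig_image_rgb = cv2.imread(orig)  # color copy (OpenCV loads it in BGR order)

    # Detect keypoints and compute KAZE descriptors on both images.
    kaze = cv2.KAZE_create()
    kp1, des1 = kaze.detectAndCompute(ench_image, None)
    kp2, des2 = kaze.detectAndCompute(orig_image, None)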

        

After we get the keypoints, we need to find the ones that correspond to each other in both images. For this task we will use the KNN matcher from OpenCV.
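
Before the OpenCV version, here is a rough NumPy sketch of what a brute-force 2-nearest-neighbour match followed by Lowe's ratio test does. The function name `ratio_test_matches` and the toy descriptors are assumptions for illustration; it also assumes the second descriptor set has at least two rows.

```python
import numpy as np

def ratio_test_matches(des_a, des_b, ratio=0.7):
    """Return (index_a, index_b) pairs that pass Lowe's ratio test."""
    # Pairwise Euclidean distances, shape (len(des_a), len(des_b)).
    dists = np.linalg.norm(des_a[:, None, :] - des_b[None, :, :], axis=2)
    pairs = []
    for i, row in enumerate(dists):
        best, second = np.argsort(row)[:2]
        # Accept only if the best match is clearly closer than the runner-up.
        if row[best] < ratio * row[second]:
            pairs.append((i, int(best)))
    return pairs

# Toy 2D "descriptors": the third row of des_a is ambiguous on purpose.
des_a = np.array([[0.0, 0.0], [5.0, 5.0], [7.0, 7.0]])
des_b = np.array([[0.1, 0.0], [5.0, 5.2], [9.0, 9.0]])
pairs = ratio_test_matches(des_a, des_b)
```

The first two descriptors each have one clearly closest neighbour and are kept. The third sits almost halfway between two candidates, so the ratio test rejects it.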


    bf = cv2.BFMatcher()
    matches = bf.knnMatch(des1, des2, k=2)

    # Keep only the good matches according to Lowe's ratio test.
    good = []
    for m, n in matches:
        if m.distance < 0.7 * n.distance:
            good.append(m)

    # If fewer than 10 points matched, the images are not the same or are highly distorted.
    MIN_MATCH_COUNT = 10
    if len(good) >= MIN_MATCH_COUNT:
        src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

        kp1_matched = [kp1[m.queryIdx] for m in good]
        kp2_matched = [kp2[m.trainIdx] for m in good]

        # flags=2 draws only the matched pairs and skips unmatched keypoints.
        matches_image = cv2.drawMatches(ench_image, kp1, orig_image, kp2, good, None, flags=2)
        plt.figure(figsize=(20, 10))
        plt.axis('off')
        plt.imshow(matches_image)
        plt.show()
        

On big images you should limit the number of keypoints passed to the matcher, because too many points lead to long matching times. Running the code draws the matched keypoints between the two images.
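
One possible way to limit the number of points is to keep only the keypoints with the strongest detector response. The sketch below is an assumption, not the lecture's method: `keep_strongest` and the default of 2000 points are hypothetical, and `FakeKeyPoint` only stands in for `cv2.KeyPoint`, which also exposes a `response` attribute.

```python
from collections import namedtuple

import numpy as np

def keep_strongest(keypoints, descriptors, max_points=2000):
    """Keep the max_points keypoints with the highest detector response."""
    responses = np.array([kp.response for kp in keypoints])
    # Indices of the strongest responses, strongest first.
    order = np.argsort(-responses)[:max_points]
    return [keypoints[i] for i in order], descriptors[order]

# Stand-in for cv2.KeyPoint so the sketch runs without OpenCV.
FakeKeyPoint = namedtuple("FakeKeyPoint", ["pt", "response"])
kps = [FakeKeyPoint((0, 0), 0.1), FakeKeyPoint((1, 1), 0.9), FakeKeyPoint((2, 2), 0.5)]
des = np.arange(6, dtype=np.float32).reshape(3, 2)
kps_top, des_top = keep_strongest(kps, des, max_points=2)
```

The descriptor rows are reordered together with the keypoints, so `queryIdx` and `trainIdx` from the matcher still point at the right entries. Downscaling the images before detection is another option, at the cost of rescaling the keypoint coordinates afterwards.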

Finding transformation

We now have input points and target points, and we need to estimate the transformation between them. The error function for that transformation will be:

\begin{equation} \sum_{\mathrm{i}}\left(\mathrm{x}_{\mathrm{i}}^{\prime}-\frac{\mathrm{t}_{11} \mathrm{x}_{\mathrm{i}}+\mathrm{t}_{12} \mathrm{y}_{\mathrm{i}}+\mathrm{t}_{13}}{\mathrm{t}_{31} \mathrm{x}_{\mathrm{i}}+\mathrm{t}_{32} \mathrm{y}_{\mathrm{i}}+\mathrm{t}_{33}}\right)^{2}+\left(\mathrm{y}_{\mathrm{i}}^{\prime}-\frac{\mathrm{t}_{21} \mathrm{x}_{\mathrm{i}}+\mathrm{t}_{22} \mathrm{y}_{\mathrm{i}}+\mathrm{t}_{23}}{\mathrm{t}_{31} \mathrm{x}_{\mathrm{i}}+\mathrm{t}_{32} \mathrm{y}_{\mathrm{i}}+\mathrm{t}_{33}}\right)^{2} \end{equation}
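
The error function above can be written in a few lines of NumPy. This is a sketch for checking a candidate matrix by hand; the name `reprojection_error` and the toy points are assumptions. It accepts both (N, 2) arrays and the (N, 1, 2) shape used in this lecture.

```python
import numpy as np

def reprojection_error(T, src, dst):
    """Sum of squared distances between T-projected src points and dst."""
    src = np.asarray(src, dtype=float).reshape(-1, 2)
    dst = np.asarray(dst, dtype=float).reshape(-1, 2)
    # Homogeneous coordinates: one row [x, y, 1] per point.
    src_h = np.hstack([src, np.ones((len(src), 1))])
    projected = src_h @ T.T
    # Divide by the third coordinate (the denominator in the formula).
    projected = projected[:, :2] / projected[:, 2:3]
    return float(np.sum((dst - projected) ** 2))

# Hypothetical transform: a pure shift by (2, 3).
T_shift = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 3.0],
                    [0.0, 0.0, 1.0]])
src = [[0, 0], [1, 1]]
dst = [[2, 3], [3, 5]]  # the second target is off by 1 pixel in y
error = reprojection_error(T_shift, src, dst)
```

Here the first pair is mapped exactly and the second misses by one pixel, so the total error is 1.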

To do so we will use the RANSAC algorithm, because it is fast and robust to wrong matches. We will also calculate the rotation and scale difference.
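
To show the idea behind RANSAC, here is a deliberately simplified sketch that estimates only a 2D shift instead of a full homography. A shift needs one point pair per sample, whereas a homography needs four. Everything here (`ransac_translation`, the parameter defaults, the synthetic data) is an assumption made for illustration and is not how `cv2.findHomography` is implemented.

```python
import numpy as np

def ransac_translation(src, dst, threshold=5.0, iterations=100, seed=0):
    """Estimate a 2D shift from src to dst while ignoring outlier pairs."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        # 1. Fit the model to a minimal random sample (one pair for a shift).
        i = rng.integers(len(src))
        shift = dst[i] - src[i]
        # 2. Count the pairs that agree with this candidate model.
        residuals = np.linalg.norm(src + shift - dst, axis=1)
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # 3. Refit on all inliers of the best candidate.
    shift = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return shift, best_inliers

# Synthetic data: 30 pairs shifted by (12, -7), five of them corrupted.
data_rng = np.random.default_rng(1)
src = data_rng.uniform(0, 100, size=(30, 2))
dst = src + np.array([12.0, -7.0])
dst[:5] += 50.0  # five deliberately wrong matches
shift, inliers = ransac_translation(src, dst)
```

The five corrupted pairs are excluded from the final fit, so the recovered shift is (12, -7). The `threshold` plays the same role as the 5.0 passed to `cv2.findHomography` in this lecture: the maximum reprojection error, in pixels, for a pair to count as an inlier.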


    # Find a perspective transformation between the two planes.
    # 5.0 is the RANSAC reprojection threshold in pixels.
    T, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

    # Approximate scale and rotation from the first row of T.
    ss = T[0, 1]
    sc = T[0, 0]
    scaleRecovered = math.sqrt(ss * ss + sc * sc)
    thetaRecovered = math.atan2(ss, sc) * 180 / math.pi
    print("Calculated scale difference: %.2f\n"
          "Calculated rotation difference: %.2f" % (scaleRecovered, thetaRecovered))
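
As a sanity check of the scale and rotation formulas, the sketch below applies them to a matrix whose scale and angle are known in advance. The matrix `T_known` is a hypothetical similarity transform. The formulas are exact only for such transforms; with perspective or shear they give an approximation.

```python
import math

import numpy as np

# Hypothetical transform: scale 1.5, rotation 30 degrees, shift (20, -10).
scale, theta = 1.5, math.radians(30)
T_known = np.array([
    [scale * math.cos(theta), -scale * math.sin(theta), 20.0],
    [scale * math.sin(theta), scale * math.cos(theta), -10.0],
    [0.0, 0.0, 1.0],
])

# Same formulas as in the lecture code.
ss = T_known[0, 1]
sc = T_known[0, 0]
scale_recovered = math.sqrt(ss * ss + sc * sc)
theta_recovered = math.atan2(ss, sc) * 180 / math.pi
```

The scale comes back as 1.5. The angle comes back as -30 degrees for this construction, so the sign of the recovered rotation depends on the rotation convention you adopt.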
        

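Now we only need to apply the transformation to the original image so that it lines up with the retouched one. `cv2.warpPerspective` fills every output pixel using the formula below, where M is the inverse of the matrix passed to the function (unless the `WARP_INVERSE_MAP` flag is set):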

\begin{equation} \operatorname{dst}(x, y)=\operatorname{src}\left(\frac{M_{11} x+M_{12} y+M_{13}}{M_{31} x+M_{32} y+M_{33}}, \frac{M_{21} x+M_{22} y+M_{23}}{M_{31} x+M_{32} y+M_{33}}\right) \end{equation}
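
The formula can be sketched directly in NumPy with nearest-neighbour sampling. This is only an illustration of the mapping: `warp_inverse_nearest` is a made-up helper, it assumes the denominator is never zero, and `cv2.warpPerspective` uses bilinear interpolation by default rather than rounding.

```python
import numpy as np

def warp_inverse_nearest(src, M, out_shape):
    """Fill each output pixel by sampling src at M applied to (x, y)."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous coordinates of every output pixel, shape (3, h*w).
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    mx, my, mw = M @ coords
    sx = np.rint(mx / mw).astype(int)
    sy = np.rint(my / mw).astype(int)
    # Pixels that map outside the source stay black (zero).
    valid = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    flat = np.zeros(h * w, dtype=src.dtype)
    flat[valid] = src[sy[valid], sx[valid]]
    return flat.reshape(h, w)

# Hypothetical example: dst(x, y) = src(x + 1, y) on a tiny 3x4 image.
image = np.arange(12).reshape(3, 4)
M_shift = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
warped = warp_inverse_nearest(image, M_shift, (3, 4))
```

Every output pixel takes the value one column to its right in the source, and the last column, which maps outside the image, stays zero.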


    im_out = cv2.warpPerspective(
        orig_image, 
        np.linalg.inv(T), 
        (ench_image.shape[1], ench_image.shape[0])
    )
        

And that's all. Don't forget that you can run this code in Google Colab by clicking the "Open in Google Colab" button.

References

  1. Basic concepts of the homography explained with code
  2. Find Image Rotation and Scale Using Automated Feature Matching