File: pyramids.markdown

package info (click to toggle)
opencv 4.6.0%2Bdfsg-12
links: PTS, VCS
area: main
in suites: bookworm
size: 276,172 kB
sloc: cpp: 1,079,020; xml: 682,526; python: 43,885; lisp: 30,943; java: 25,642; ansic: 7,968; javascript: 5,956; objc: 2,039; sh: 1,017; cs: 601; perl: 494; makefile: 179
file content (208 lines) | stat: -rw-r--r-- 7,411 bytes
Image Pyramids {#tutorial_pyramids}
==============

@tableofcontents

@prev_tutorial{tutorial_morph_lines_detection}
@next_tutorial{tutorial_threshold}

|    |    |
| -: | :- |
| Original author | Ana Huamán |
| Compatibility | OpenCV >= 3.0 |

Goal
----

In this tutorial you will learn how to:

-   Use the OpenCV functions **pyrUp()** and **pyrDown()** to downsample or upsample a given
    image.

Theory
------

@note The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.

-   Usually we need to convert an image to a size different than its original. For this, there are
    two possible options:
    -#  *Upsize* the image (zoom in) or
    -#  *Downsize* it (zoom out).
-   Although there is a *geometric transformation* function in OpenCV that -literally- resize an
    image (**resize** , which we will show in a future tutorial), in this section we analyze
    first the use of **Image Pyramids**, which are widely applied in a huge range of vision
    applications.

### Image Pyramid

-   An image pyramid is a collection of images - all arising from a single original image - that are
    successively downsampled until some desired stopping point is reached.
-   There are two common kinds of image pyramids:
    -   **Gaussian pyramid:** Used to downsample images
    -   **Laplacian pyramid:** Used to reconstruct an upsampled image from an image lower in the
        pyramid (with less resolution)
-   In this tutorial we'll use the *Gaussian pyramid*.

#### Gaussian Pyramid

-   Imagine the pyramid as a set of layers in which the higher the layer, the smaller the size.

    ![](images/Pyramids_Tutorial_Pyramid_Theory.png)

-   Every layer is numbered from bottom to top, so layer \f$(i+1)\f$ (denoted as \f$G_{i+1}\f$ is smaller
    than layer \f$i\f$ (\f$G_{i}\f$).
-   To produce layer \f$(i+1)\f$ in the Gaussian pyramid, we do the following:
    -   Convolve \f$G_{i}\f$ with a Gaussian kernel:

        \f[\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1  \\ 4 & 16 & 24 & 16 & 4  \\ 6 & 24 & 36 & 24 & 6  \\ 4 & 16 & 24 & 16 & 4  \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}\f]

    -   Remove every even-numbered row and column.

-   You can easily notice that the resulting image will be exactly one-quarter the area of its
    predecessor. Iterating this process on the input image \f$G_{0}\f$ (original image) produces the
    entire pyramid.
-   The procedure above was useful to downsample an image. What if we want to make it bigger?:
    columns filled with zeros (\f$0 \f$)
    -   First, upsize the image to twice the original in each dimension, with the new even rows and
    -   Perform a convolution with the same kernel shown above (multiplied by 4) to approximate the
        values of the "missing pixels"
-   These two procedures (downsampling and upsampling as explained above) are implemented by the
    OpenCV functions **pyrUp()** and **pyrDown()** , as we will see in an example with the
    code below:

@note When we reduce the size of an image, we are actually *losing* information of the image.

Code
----

This tutorial code's is shown lines below.

@add_toggle_cpp
You can also download it from
[here](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp)
@include samples/cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp
@end_toggle

@add_toggle_java
You can also download it from
[here](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/java/tutorial_code/ImgProc/Pyramids/Pyramids.java)
@include samples/java/tutorial_code/ImgProc/Pyramids/Pyramids.java
@end_toggle

@add_toggle_python
You can also download it from
[here](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/python/tutorial_code/imgProc/Pyramids/pyramids.py)
@include samples/python/tutorial_code/imgProc/Pyramids/pyramids.py
@end_toggle

Explanation
-----------

Let's check the general structure of the program:

#### Load an image

@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp load
@end_toggle

@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java load
@end_toggle

@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py load
@end_toggle

#### Create window

@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp show_image
@end_toggle

@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java show_image
@end_toggle

@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py show_image
@end_toggle

#### Loop

@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp loop
@end_toggle

@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java loop
@end_toggle

@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py loop
@end_toggle

Perform an infinite loop waiting for user input.
Our program exits if the user presses **ESC**. Besides, it has two options:

-   **Perform upsampling - Zoom 'i'n (after pressing 'i')**

    We use the function **pyrUp()** with three arguments:
        -   *src*: The current and destination image (to be shown on screen, supposedly the double of the
            input image)
        -   *Size( tmp.cols*2, tmp.rows\*2 )* : The destination size. Since we are upsampling,
            **pyrUp()** expects a size double than the input image (in this case *src*).

@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp pyrup
@end_toggle

@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java pyrup
@end_toggle

@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py pyrup
@end_toggle

-   **Perform downsampling - Zoom 'o'ut (after pressing 'o')**

    We use the function **pyrDown()** with three arguments (similarly to **pyrUp()**):
            -   *src*: The current and destination image  (to be shown on screen, supposedly half the input
                image)
            -   *Size( tmp.cols/2, tmp.rows/2 )* : The destination size. Since we are downsampling,
                **pyrDown()** expects half the size the input image (in this case *src*).

@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp pyrdown
@end_toggle

@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java pyrdown
@end_toggle

@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py pyrdown
@end_toggle

Notice that it is important that the input image can be divided by a factor of two (in both dimensions).
Otherwise, an error will be shown.

Results
-------

-   The program calls by default an image [chicky_512.png](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/data/chicky_512.png)
    that comes in the `samples/data` folder. Notice that this image is \f$512 \times 512\f$,
    hence a downsample won't generate any error (\f$512 = 2^{9}\f$). The original image is shown below:

    ![](images/Pyramids_Tutorial_Original_Image.jpg)

-   First we apply two successive **pyrDown()** operations by pressing 'd'. Our output is:

    ![](images/Pyramids_Tutorial_PyrDown_Result.jpg)

-   Note that we should have lost some resolution due to the fact that we are diminishing the size
    of the image. This is evident after we apply **pyrUp()** twice (by pressing 'u'). Our output
    is now:

    ![](images/Pyramids_Tutorial_PyrUp_Result.jpg)