File: vision.dox

/**
\addtogroup arrayfire_func
@{

\defgroup cv_func_fast fast
\ingroup featdetect_mat

\brief FAST feature detector

A circle of radius 3 pixels, covering a total of 16 pixels, is checked for
sequential segments of pixels that are much brighter or much darker than the
central one. For a pixel p to be considered a feature, there must exist a
sequential segment of arc_length pixels in the circle around it such that all
of them are greater than (p + thr) or all of them are smaller than (p - thr).
After all features in the image are detected, if nonmax is true, non-maximal
suppression is applied: each detected feature is compared against the features
detected in its 8-neighborhood and discarded if its score is not maximal among
them.
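A minimal usage sketch is shown below, assuming the af::fast overload from the
ArrayFire 3.3 C++ API; the image file name and parameter values are only
illustrative:

\code
#include <arrayfire.h>
#include <cstdio>

// "lena.png" is a hypothetical grayscale input image
af::array img = af::loadImage("lena.png", false);

// Threshold thr = 20, arc_length = 9, non-maximal suppression enabled
af::features feat = af::fast(img, 20.0f, 9, true);

printf("Detected %u FAST features\n", (unsigned)feat.getNumFeatures());
\endcode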

=======================================================================

\defgroup cv_func_harris harris
\ingroup featdetect_mat

\brief Harris corner detector

Compute corners using the Harris corner detector approach. For each pixel, the
determinant and trace of the second-order moment matrix over a small window
around it are calculated, from which a response is computed. Pixels are
considered corners if they are local maxima and have a high positive response.
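A minimal usage sketch is shown below, assuming the af::harris overload from
the ArrayFire 3.3 C++ API; the file name and thresholds are illustrative:

\code
#include <arrayfire.h>

// "checkerboard.png" is a hypothetical grayscale input image
af::array img = af::loadImage("checkerboard.png", false);

// Keep at most 500 corners whose response exceeds 1e5
af::features corners = af::harris(img, 500, 1e5f);

af_print(corners.getX());  // x coordinates of the detected corners
af_print(corners.getY());  // y coordinates of the detected corners
\endcode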

=======================================================================

\defgroup cv_func_susan susan
\ingroup featdetect_mat

\brief SUSAN corner detector

SUSAN is an acronym standing for *Smallest Univalue Segment Assimilating Nucleus*. This method
places a circular disc over the pixel to be tested (a.k.a. the nucleus) to compute the corner measure
of that corresponding pixel. The region covered by the circular disc is **M**, and a pixel in this
region is represented by \f$\vec{m} \in M\f$ where \f$\vec{m}_0\f$ is the nucleus. Every pixel in the region
is compared to the nucleus using the following comparison function:

\f$ c(\vec{m}) = e^{-\left(\frac{I(\vec{m}) - I(\vec{m}_0)}{t}\right)^6}\f$

where *t* is the brightness difference threshold and *I* is the brightness of the pixel.

The response of the SUSAN operator is given by the following equation:

\f$ R(M) = \begin{cases} g - n(M) \quad \text{if } n(M) < g \\ 0 \quad \text{otherwise} \end{cases}\f$

where \f$ n(M) = \sum\nolimits_{\vec{m} \in M} c(\vec{m})\f$, *g* is called the *geometric threshold* and *n* is
the number of pixels in the mask that are within **t** (in brightness) of the nucleus.

The importance of the parameters **t** and **g** is explained below:

- *t* determines how similar points have to be to the nucleus before they are considered to
  be part of the univalue segment
- *g* determines the minimum size of the univalue segment. For a large enough *g*, the SUSAN
  operator becomes an edge detector.
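A minimal usage sketch is shown below, assuming the af::susan overload from the
ArrayFire 3.3 C++ API; the parameter order (radius, differential threshold *t*,
geometric threshold *g*) and the image file name are illustrative assumptions:

\code
#include <arrayfire.h>

// "corners.png" is a hypothetical grayscale test image
af::array img = af::loadImage("corners.png", false);

// Disc radius 3, brightness difference threshold t = 32, geometric threshold g = 10
// (parameter order assumed from the 3.3 C++ header)
af::features feat = af::susan(img, 3, 32.0f, 10.0f);
\endcode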

=======================================================================

\defgroup cv_func_orb orb
\ingroup featdescriptor_mat

\brief ORB Feature descriptor

Extracts ORB descriptors from the FAST features with the highest Harris
responses. Since FAST does not compute orientation, the orientation of each
feature is calculated using the intensity centroid. As FAST is also not
multi-scale, a multi-scale pyramid is built by repeatedly downsampling the
input image, followed by FAST feature detection on each scale.
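A minimal usage sketch is shown below, assuming the af::orb overload from the
ArrayFire 3.3 C++ API; the file name and parameter values are illustrative:

\code
#include <arrayfire.h>

// "scene.png" is a hypothetical grayscale input image
af::array img = af::loadImage("scene.png", false);

af::features feat;  // detected FAST features
af::array desc;     // one ORB descriptor per feature

// FAST threshold 20, at most 400 features, scale factor 1.5, 4 pyramid levels
af::orb(feat, desc, img, 20.0f, 400, 1.5f, 4);
\endcode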

=======================================================================

\defgroup cv_func_sift sift
\ingroup featdescriptor_mat

\brief SIFT feature detector and descriptor extractor

Detects features and extracts descriptors using the Scale Invariant Feature
Transform (SIFT), by David Lowe.

Lowe, D. G., "Distinctive Image Features from Scale-Invariant Keypoints",
International Journal of Computer Vision, 60, 2, pp. 91-110, 2004.

WARNING: The SIFT algorithm is patented by the University of British Columbia.
Before using it, make sure you have the appropriate permission to do so.
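A minimal usage sketch is shown below, assuming the af::sift overload from the
ArrayFire 3.3 C++ API with its default parameters; the file name is
illustrative:

\code
#include <arrayfire.h>

// "scene.png" is a hypothetical grayscale input image
af::array img = af::loadImage("scene.png", false);

af::features feat;  // detected SIFT keypoints
af::array desc;     // SIFT descriptor per keypoint

af::sift(feat, desc, img);
\endcode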

=======================================================================

\defgroup cv_func_gloh gloh
\ingroup featdescriptor_mat

\brief SIFT feature detector and GLOH descriptor extractor

Detects features using the Scale Invariant Feature Transform (SIFT),
by David Lowe. Descriptors are extracted using Gradient Location and
Orientation Histogram (GLOH).

Lowe, D. G., "Distinctive Image Features from Scale-Invariant Keypoints",
International Journal of Computer Vision, 60, 2, pp. 91-110, 2004.

Mikolajczyk, K., and Schmid, C., "A performance evaluation of local
descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence,
10, 27, pp. 1615-1630, 2005.

WARNING: Although GLOH is free of patents, the SIFT algorithm, which is used to
detect the features later described by GLOH descriptors, is patented by the
University of British Columbia. Before using it, make sure you have the
appropriate permission to do so.
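A minimal usage sketch is shown below, assuming the af::gloh overload from the
ArrayFire 3.3 C++ API with its default parameters; the file name is
illustrative:

\code
#include <arrayfire.h>

// "scene.png" is a hypothetical grayscale input image
af::array img = af::loadImage("scene.png", false);

af::features feat;  // SIFT keypoints
af::array desc;     // GLOH descriptor per keypoint

af::gloh(feat, desc, img);
\endcode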

=======================================================================

\defgroup cv_func_hamming_matcher hammingMatcher
\ingroup featmatcher_mat

\brief Hamming Matcher

Calculates Hamming distances between two 2-dimensional arrays containing
features, one of the arrays containing the training data and the other the
query data. One of the dimensions of both arrays must be equal, identifying
the length of each feature. The other dimension indicates the total number of
features in the training and query arrays, respectively. Two 1-dimensional
arrays are created as results, one containing the smallest N distances for
each query feature and the other containing the indices of these matches in
the training array. The resulting 1-dimensional arrays have length equal to
the number of features contained in the query array.
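A minimal usage sketch is shown below, assuming the af::hammingMatcher overload
from the ArrayFire 3.3 C++ API; the random data only illustrates the expected
shapes:

\code
#include <arrayfire.h>

// 100 query and 5000 training features of length 8, as unsigned integers
// (e.g. binary descriptors such as ORB output); feature length along dim 0
af::array query = (af::randu(8, 100)  * 255).as(u32);
af::array train = (af::randu(8, 5000) * 255).as(u32);

af::array idx, dist;
// Best (N = 1) match per query feature, measuring distance along dimension 0
af::hammingMatcher(idx, dist, query, train, 0, 1);
\endcode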

=======================================================================

\defgroup cv_func_nearest_neighbour nearestNeighbour
\ingroup featmatcher_mat

\brief Nearest Neighbour

Calculates the nearest distances between two 2-dimensional arrays containing
features, based on the chosen distance computation. Currently, \ref
AF_SAD (sum of absolute differences), \ref AF_SSD (sum of squared differences)
and \ref AF_SHD (Hamming distance) are supported.
One of the arrays contains the training data and the other the query data.
One of the dimensions of both arrays must be equal, identifying the length of
each feature. The other dimension indicates the total number of features in
the training and query arrays, respectively. Two 1-dimensional arrays are
created as results, one containing the smallest N distances for each query
feature and the other containing the indices of these matches in the training
array. The resulting 1-dimensional arrays have length equal to the number of
features contained in the query array.
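A minimal usage sketch is shown below, assuming the af::nearestNeighbour
overload from the ArrayFire 3.3 C++ API; the random data only illustrates the
expected shapes:

\code
#include <arrayfire.h>

// 100 query and 5000 training features of length 64; feature length along dim 0
af::array query = af::randu(64, 100);
af::array train = af::randu(64, 5000);

af::array idx, dist;
// Single (N = 1) nearest neighbour per query feature, using sum of squared differences
af::nearestNeighbour(idx, dist, query, train, 0, 1, AF_SSD);
\endcode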

=======================================================================

\defgroup cv_func_dog DoG
\ingroup featdetect_mat

\brief Difference of Gaussians

Given an image, this function computes two smoothed versions of the input
image using different smoothing parameters, subtracts one from the other and
returns the result.
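A minimal usage sketch is shown below, assuming the af::dog overload from the
ArrayFire 3.3 C++ API; the file name and radii are illustrative:

\code
#include <arrayfire.h>

// "input.png" is a hypothetical grayscale input image
af::array img = af::loadImage("input.png", false);

// Difference of two Gaussian-smoothed copies, using kernel radii 1 and 2
af::array edges = af::dog(img, 1, 2);
\endcode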

=======================================================================

\defgroup cv_func_match_template matchTemplate
\ingroup match_mat

\brief Template Matching

Template matching is an image processing technique for finding small patches of
an image that match a given template image. This function does not yet support
the following three metrics:
- \ref AF_NCC
- \ref AF_ZNCC
- \ref AF_SHD

A more in-depth discussion of template matching can be found [here](http://en.wikipedia.org/wiki/Template_matching).
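A minimal usage sketch is shown below, assuming the af::matchTemplate overload
from the ArrayFire 3.3 C++ API; the file names are illustrative:

\code
#include <arrayfire.h>

// Hypothetical grayscale search image and template patch
af::array search = af::loadImage("scene.png", false);
af::array tmpl   = af::loadImage("patch.png", false);

// Per-pixel dissimilarity score using the sum of absolute differences
af::array score = af::matchTemplate(search, tmpl, AF_SAD);

// The best match is the location of the minimum score
float minVal;
unsigned minIdx;
af::min(&minVal, &minIdx, af::flat(score));  // minIdx is a linear (column-major) index
\endcode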

=======================================================================

\defgroup cv_func_homography homography
\ingroup homography_mat

\brief Homography Estimation

Homography estimation finds a perspective transform between two sets of 2D
points. Currently, two estimation methods are supported: RANSAC (RANdom SAmple
Consensus) and LMedS (Least Median of Squares). Both methods work by randomly
selecting a subset of 4 points from the set of source points, computing the
eigenvectors of that set and finding the perspective transform. The process is
repeated several times. For RANSAC, the maximum number of repetitions is the
value passed to the iterations argument (the CPU backend usually performs
fewer iterations, depending on the quality of the dataset, while the CUDA and
OpenCL backends compute the transformation exactly as many times as passed via
the iterations parameter). The returned transform is the one that yields the
largest number of inliers, i.e. the points that fall within a maximum L2
distance given by the inlier_thr argument. For LMedS, the number of iterations
is currently hardcoded to satisfy the following equation:

\f$ m = \frac{\log(1 - P)}{\log\left[1 - (1 - \epsilon)^{p}\right]}\f$,

where \f$ P = 0.99\f$, \f$ \epsilon = 40\%\f$ and \f$ p = 4\f$.
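A minimal usage sketch is shown below; the af::homography signature (coordinate
arrays followed by method, inlier threshold, iteration count and output type)
is assumed from the ArrayFire 3.3 C++ API, and the toy correspondences are only
illustrative:

\code
#include <arrayfire.h>

// Toy correspondences: 50 matched 2D points in each image (illustrative data)
af::array x_src = af::randu(50) * 100;
af::array y_src = af::randu(50) * 100;
af::array x_dst = x_src + 5.0f;   // a pure translation, for illustration
af::array y_dst = y_src + 3.0f;

af::array H;      // 3x3 perspective transform
int inliers = 0;  // number of points within the inlier threshold

// RANSAC estimation, 3-pixel inlier threshold, up to 1000 iterations, f32 output
af::homography(H, inliers, x_src, y_src, x_dst, y_dst,
               AF_HOMOGRAPHY_RANSAC, 3.0f, 1000, f32);
\endcode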



@}
*/