1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112
|
#
# Pairwise Distances Reductions
# =============================
#
# Authors: The scikit-learn developers.
# License: BSD 3 clause
#
# Overview
# --------
#
# This module provides routines to compute pairwise distances between a set
# of row vectors of X and another set of row vectors of Y and apply a
# reduction on top. The canonical example is the brute-force computation
# of the top k nearest neighbors by leveraging the arg-k-min reduction.
#
# The reduction takes a matrix of pairwise distances between rows of X and Y
# as input and outputs an aggregate data-structure for each row of X. The
# aggregate values are typically smaller than the number of rows in Y, hence
# the term reduction.
#
# For computational reasons, the reduction are performed on the fly on chunks
# of rows of X and Y so as to keep intermediate data-structures in CPU cache
# and avoid unnecessary round trips of large distance arrays with the RAM
# that would otherwise severely degrade the speed by making the overall
# processing memory-bound.
#
# Finally, the routines follow a generic parallelization template to process
# chunks of data with OpenMP loops (via Cython prange), either on rows of X
# or rows of Y depending on their respective sizes.
#
#
# Dispatching to specialized implementations
# ------------------------------------------
#
# Dispatchers are meant to be used in the Python code. Under the hood, a
# dispatcher must only define the logic to choose at runtime to the correct
# dtype-specialized :class:`BaseDistancesReductionDispatcher` implementation based
# on the dtype of X and of Y.
#
#
# High-level diagram
# ------------------
#
# Legend:
#
# A ---⊳ B: A inherits from B
# A ---x B: A dispatches to B
#
#
# (base dispatcher)
# BaseDistancesReductionDispatcher
# ∆
# |
# |
# +------------------+---------------+---------------+------------------+
# | | | |
# | (dispatcher) (dispatcher) |
# | ArgKmin RadiusNeighbors |
# | | | |
# | | | |
# | | (float{32,64} implem.) | |
# | | BaseDistancesReduction{32,64} | |
# | | ∆ | |
# (dispatcher) | | | (dispatcher)
# ArgKminClassMode | | | RadiusNeighborsClassMode
# | | +----------+----------+ | |
# | | | | | |
# | | | | | |
# | x | | x |
# | +-------⊳ ArgKmin{32,64} RadiusNeighbors{32,64} ⊲---+ |
# x | | ∆ ∆ | | x
# ArgKminClassMode{32,64} | | | | RadiusNeighborsClassMode{32,64}
# ===================================== Specializations ============================================
# | | | |
# | | | |
# x | | x
# EuclideanArgKmin{32,64} EuclideanRadiusNeighbors{32,64}
#
#
# For instance :class:`ArgKmin` dispatches to:
# - :class:`ArgKmin64` if X and Y are two `float64` array-likes
# - :class:`ArgKmin32` if X and Y are two `float32` array-likes
#
# In addition, if the metric parameter is set to "euclidean" or "sqeuclidean",
# then some direct subclass of `BaseDistancesReduction{32,64}` further dispatches
# to one of their subclass for euclidean-specialized implementation. For instance,
# :class:`ArgKmin64` dispatches to :class:`EuclideanArgKmin64`.
#
# Those Euclidean-specialized implementations relies on optimal implementations of
# a decomposition of the squared euclidean distance matrix into a sum of three terms
# (see :class:`MiddleTermComputer{32,64}`).
#
from ._dispatcher import (
ArgKmin,
ArgKminClassMode,
BaseDistancesReductionDispatcher,
RadiusNeighbors,
RadiusNeighborsClassMode,
sqeuclidean_row_norms,
)
__all__ = [
"BaseDistancesReductionDispatcher",
"ArgKmin",
"RadiusNeighbors",
"ArgKminClassMode",
"RadiusNeighborsClassMode",
"sqeuclidean_row_norms",
]
# ruff: noqa: E501
|