Source: libstatistics-topk-perl
Maintainer: Debian Perl Group <pkg-perl-maintainers@lists.alioth.debian.org>
Uploaders: Mason James <mtj@kohaaloha.com>
Section: perl
Testsuite: autopkgtest-pkg-perl
Priority: optional
Build-Depends: debhelper-compat (= 13)
Build-Depends-Indep: libtest-simple-perl <!nocheck>,
                     perl
Standards-Version: 4.6.2
Vcs-Browser: https://salsa.debian.org/perl-team/modules/packages/libstatistics-topk-perl
Vcs-Git: https://salsa.debian.org/perl-team/modules/packages/libstatistics-topk-perl.git
Homepage: https://metacpan.org/release/Statistics-TopK
Rules-Requires-Root: no

Package: libstatistics-topk-perl
Architecture: all
Depends: ${misc:Depends},
         ${perl:Depends}
Description: implementation of the top-k streaming algorithm
 The Statistics::TopK module implements the top-k streaming algorithm, also
 known as the "heavy hitters" algorithm. It is designed to process data streams
 and probabilistically calculate the k most frequent items while using limited
 memory.
 .
 A typical example would be to determine the top 10 IP addresses listed in an
 access log. A simple solution would be to hash each IP address to a counter
 and then sort the resulting hash by the counter size. But the hash could
 theoretically require over 4 billion keys.
 .
 The top-k algorithm only requires storage space proportional to the number of
 items of interest. It accomplishes this by sacrificing precision, as it is
 only a probabilistic counter.