File: control

boilerpipe 1.2.0-2

area: main
in suites: bookworm, bullseye, forky, sid, trixie
size: 520 kB
sloc: java: 4,298; xml: 187; makefile: 8

Source: boilerpipe
Section: java
Priority: optional
Maintainer: Debian Java Maintainers <pkg-java-maintainers@lists.alioth.debian.org>
Uploaders: Emmanuel Bourg <ebourg@apache.org>
Build-Depends:
 ant (>= 1.6.5),
 debhelper-compat (= 13),
 default-jdk,
 javahelper,
 libnekohtml-java,
 libxerces2-java,
 maven-repo-helper
Standards-Version: 4.5.1
Vcs-Git: https://salsa.debian.org/java-team/boilerpipe.git
Vcs-Browser: https://salsa.debian.org/java-team/boilerpipe
Homepage: https://github.com/kohlschutter/boilerpipe

Package: libboilerpipe-java
Architecture: all
Depends: libnekohtml-java, libxerces2-java, ${misc:Depends}
Description: Boilerplate removal and fulltext extraction from HTML pages
 The boilerpipe library provides algorithms to detect and remove the surplus
 "clutter" (boilerplate, templates) around the main textual content of a web
 page.
 .
 The library already provides specific strategies for common tasks (for example,
 news article extraction) and can easily be extended to individual problem
 settings.
 .
 Extraction is very fast (milliseconds), needs only the input document (no
 global or site-level information required) and is usually quite accurate.