Package: pypdf2 | Debian Sources

Package: pypdf2 / 1.26.0-4+deb11u1

Metadata

Package	Version	Patches format
pypdf2	1.26.0-4+deb11u1	3.0 (quilt)

Patch series



Patch

File delta

Description

Prevent_infinite_loop_in_readObject.patch

PyPDF2/generic.py | 4 4 + 0 - 0 !
1 file changed, 4 insertions(+)

 [patch] prevent infinite loop in readObject() function. patch by
 dhudson1. Closes mstamy2/PyPDF2#184

CVE 2022 24859.patch

PyPDF2/pdf.py | 32 22 + 10 - 0 !
1 file changed, 22 insertions(+), 10 deletions(-)

 cve-2022-24859

Bug-Debian: https://bugs.debian.org/1009879

0001 MAINT Quadratic runtime while parsing reduced to lin.patch

PyPDF2/pdf.py | 8 4 + 4 - 0 !
1 file changed, 4 insertions(+), 4 deletions(-)

 maint: quadratic runtime while parsing reduced to linear  (#808)

When PdfFileReader tries to find the xref marker, the readNextEndLine method builds a so-called line by reading the file byte by byte. Every time a new byte is read, it is concatenated with the line read so far. Because Python strings (including byte strings) are immutable and must be copied on each concatenation, this leads to quadratic runtime, O(n^2), where n is the size of the file.
For files where the xref marker cannot be found at the end, this takes an enormous amount of time:

* 1 MB of zeros at the end: 45.54 seconds
* 2 MB of zeros at the end: 357.04 seconds
(measured on a laptop made in 2015)

This pull request changes the relevant section of the code to run in linear time, O(n), bringing the run time to less than a second for both cases mentioned above. Furthermore, this PR adds a regression test.
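
The difference between the two approaches can be illustrated with a minimal sketch. This is not the code from the patch: the function names below are hypothetical, and the sketch only assumes, as the description states, that a line is built by reading a stream backwards one byte at a time. The first variant prepends each byte to an immutable bytes object, which copies the whole accumulated line on every step; the second collects the bytes in a list and joins them once at the end.

```python
import io


def read_line_backwards_quadratic(stream):
    """Hypothetical illustration: prepend each byte to a bytes object.

    Every `byte + line` allocates a new object and copies the old line,
    so reading n bytes costs O(n^2) in total.
    """
    line = b""
    while stream.tell() > 0:
        stream.seek(-1, io.SEEK_CUR)   # step back onto the previous byte
        byte = stream.read(1)          # read it (moves forward by one)
        stream.seek(-1, io.SEEK_CUR)   # step back again
        if byte in b"\r\n":
            break
        line = byte + line             # full copy on every iteration
    return line


def read_line_backwards_linear(stream):
    """Hypothetical illustration: collect bytes in a list, join once.

    list.append is amortized O(1) and the final join copies each byte
    once, so reading n bytes costs O(n) in total.
    """
    parts = []
    while stream.tell() > 0:
        stream.seek(-1, io.SEEK_CUR)
        byte = stream.read(1)
        stream.seek(-1, io.SEEK_CUR)
        if byte in b"\r\n":
            break
        parts.append(byte)
    parts.reverse()                    # bytes were collected back to front
    return b"".join(parts)


# Both variants return the same line; only the running time differs.
data = b"%PDF-1.4\n" + b"0" * 1000
for reader in (read_line_backwards_quadratic, read_line_backwards_linear):
    stream = io.BytesIO(data)
    stream.seek(0, io.SEEK_END)
    assert reader(stream) == b"0" * 1000
```

With a trailing run of a megabyte or more that contains no line break, the first variant copies on the order of n^2/2 bytes in total, which is consistent with the long run times quoted above.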