From: Ken Sharp <Ken.Sharp@artifex.com>
Date: Mon, 2 Sep 2024 15:14:01 +0100
Subject: PDF interpreter - sanitise W array values in Xref streams
Origin: https://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=1fb76aaddac34530242dfbb9579d9997dae41264
Bug: https://bugs.ghostscript.com/show_bug.cgi?id=708001
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2024-46952

Bug #708001 "Buffer overflow in PDF XRef stream"

See bug report. I've chosen to fix this by checking the values in the
W array; these can (currently at least) only have certain relatively
small values.

As a future-proofing fix I've also updated field_width in
read_xref_stream_entries() to be a 64-bit integer. This is far bigger
than required, but matches the W array values and so prevents the
mismatch which could lead to a buffer overrun.
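
For illustration only, the two halves of the fix can be sketched as a
standalone fragment. The helper names xref_widths_ok() and read_be_field()
are hypothetical and do not exist in pdf_xref.c; the sketch assumes the
fields are stored high-order byte first and that the width is held in the
same 64-bit type as the W array, so a value such as 0x100000001 cannot be
silently truncated to 1 as it could be in a 32-bit variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in Ghostscript): accept only the field widths
 * the reader can represent, mirroring the W[0] > 1 || W[1] > 8 || W[2] > 8
 * test in the patch. */
static int xref_widths_ok(const uint64_t W[3])
{
    return W[0] <= 1 && W[1] <= 8 && W[2] <= 8;
}

/* Hypothetical helper: decode one big-endian field of 'width' bytes.
 * The width is kept as uint64_t so it has the same type as the W entries. */
static uint64_t read_be_field(const unsigned char *p, uint64_t width)
{
    uint64_t value = 0;
    uint64_t i;

    assert(width <= 8);          /* caller must have validated W first */
    for (i = 0; i < width; i++)
        value = (value << 8) | p[i];
    return value;
}
```

A caller would run xref_widths_ok() once on the W array and only then
decode entries with read_be_field(); a zero width simply yields 0.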

CVE-2024-46952
---
 pdf/pdf_xref.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

--- a/pdf/pdf_xref.c
+++ b/pdf/pdf_xref.c
@@ -53,7 +53,7 @@ static int resize_xref(pdf_context *ctx,
 static int read_xref_stream_entries(pdf_context *ctx, pdf_c_stream *s, uint64_t first, uint64_t last, uint64_t *W)
 {
     uint i, j;
-    uint field_width = 0;
+    uint64_t field_width = 0;
     uint32_t type = 0;
     uint64_t objnum = 0, gen = 0;
     byte *Buffer;
@@ -298,6 +298,24 @@ static int pdfi_process_xref_stream(pdf_
     }
     pdfi_countdown(a);
 
+    /* W[0] is either:
+     * 0 (no type field) or a single byte with the type.
+     * W[1] is either:
+     * The object number of the next free object, the byte offset of this object in the file or the object number of the object stream where this object is stored.
+     * W[2] is either:
+     * The generation number to use if this object is used again, the generation number of the object or the index of this object within the object stream.
+     *
+     * Object and generation numbers are limited to unsigned 64-bit values, as are byte offsets in the file and indexes of objects within the stream (actually
+     * most of these are generally 32-bit max). So we can limit the field widths to 8 bytes, enough to hold a 64-bit number.
+     * Even if a later version of the spec makes these larger (which seems unlikely!) we still can't cope with integers > 64-bits.
+     */
+    if (W[0] > 1 || W[1] > 8 || W[2] > 8) {
+        pdfi_close_file(ctx, XRefStrm);
+        pdfi_countdown(ctx->xref_table);
+        ctx->xref_table = NULL;
+        return code;
+    }
+
     code = pdfi_dict_get_type(ctx, sdict, "Index", PDF_ARRAY, (pdf_obj **)&a);
     if (code == gs_error_undefined) {
         code = read_xref_stream_entries(ctx, XRefStrm, 0, size - 1, (uint64_t *)W);