1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
|
From: Ryan Grove <ryan@wonko.com>
Date: Thu, 26 Jan 2023 13:42:39 -0800
Subject: Always remove `<noscript>` elements
Origin: https://github.com/rgrove/sanitize/commit/ec14265e530dc3fe31ce2ef773594d3a97778d22
Bug-Debian: https://bugs.debian.org/1030047
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2023-23627
...even if `noscript` is in the allowlist.
A `<noscript>` element's content is parsed differently in browsers
depending on whether or not scripting is enabled. Since Nokogiri doesn't
support scripting, it always parses `<noscript>` elements as if
scripting is disabled. This results in edge cases where it's not
possible to reliably sanitize the contents of a `<noscript>` element
because Nokogiri can't fully replicate the parsing behavior of a
scripting-enabled browser. The safest thing to do is to simply remove
all `<noscript>` elements.
Fixes GHSA-fw3g-2h3j-qmm7
---
HISTORY.md | 27 ++++++++++++++++++++++
README.md | 14 +++++++----
lib/sanitize/transformers/clean_element.rb | 10 ++++++++
test/test_clean_element.rb | 7 ++++++
test/test_malicious_html.rb | 20 +++++++++++++++-
5 files changed, 73 insertions(+), 5 deletions(-)
diff --git a/README.md b/README.md
index 0f6efb56fde0..5f5e73816f68 100644
--- a/README.md
+++ b/README.md
@@ -12,10 +12,10 @@ properties, @ rules, and URL protocols in elements or attributes containing CSS.
Any HTML or CSS that you don't explicitly allow will be removed.
Sanitize is based on the [Nokogumbo HTML5 parser][nokogumbo], which parses HTML
-exactly the same way modern browsers do, and [Crass][crass], which parses CSS
-exactly the same way modern browsers do. As long as your allowlist config only
-allows safe markup and CSS, even the most malformed or malicious input will be
-transformed into safe output.
+the same way modern browsers do, and [Crass][crass], which parses CSS the same
+way modern browsers do. As long as your allowlist config only allows safe markup
+and CSS, even the most malformed or malicious input will be transformed into
+safe output.
[](http://badge.fury.io/rb/sanitize)
[](https://github.com/rgrove/sanitize/actions?query=workflow%3ATests)
@@ -427,6 +427,12 @@ elements not in this array will be removed.
>
> By default, Sanitize will remove all MathML and SVG elements. If you add MathML or SVG elements to a custom element allowlist, you must assume that any content inside them will be allowed, even if that content would otherwise be removed or escaped by Sanitize. This may create a security vulnerability in your application.
+> **Note**
+>
+> Sanitize always removes `<noscript>` elements and their contents, even if `noscript` is in the allowlist.
+>
+> This is because a `<noscript>` element's content is parsed differently in browsers depending on whether or not scripting is enabled. Since Nokogiri doesn't support scripting, it always parses `<noscript>` elements as if scripting is disabled. This results in edge cases where it's not possible to reliably sanitize the contents of a `<noscript>` element because Nokogiri can't fully replicate the parsing behavior of a scripting-enabled browser.
+
#### :parser_options (Hash)
[Parsing options](https://github.com/rubys/nokogumbo/tree/master#parsing-options) to be supplied to `nokogumbo`.
diff --git a/lib/sanitize/transformers/clean_element.rb b/lib/sanitize/transformers/clean_element.rb
index a2bacd8198f7..98208a7c15f3 100644
--- a/lib/sanitize/transformers/clean_element.rb
+++ b/lib/sanitize/transformers/clean_element.rb
@@ -252,6 +252,16 @@ class Sanitize; module Transformers; class CleanElement
node['content'] = node['content'].gsub(/;\s*charset\s*=.+\z/, ';charset=utf-8')
end
+
+ # A `<noscript>` element's content is parsed differently in browsers
+ # depending on whether or not scripting is enabled. Since Nokogiri doesn't
+ # support scripting, it always parses `<noscript>` elements as if scripting
+ # is disabled. This results in edge cases where it's not possible to
+ # reliably sanitize the contents of a `<noscript>` element because Nokogiri
+ # can't fully replicate the parsing behavior of a scripting-enabled browser.
+ # The safest thing to do is to simply remove all `<noscript>` elements.
+ when 'noscript'
+ node.unlink
end
end
diff --git a/test/test_clean_element.rb b/test/test_clean_element.rb
index 1293cea17ba2..916b0f934251 100644
--- a/test/test_clean_element.rb
+++ b/test/test_clean_element.rb
@@ -541,5 +541,12 @@ describe 'Sanitize::Transformers::CleanElement' do
)).must_equal "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html;charset=utf-8\"></head><body>Howdy!</body></html>"
end
+ it 'always removes `<noscript>` elements even if `noscript` is in the allowlist' do
+ assert_equal(
+ '',
+ Sanitize.fragment('<noscript>foo</noscript>', elements: ['noscript'])
+ )
+ end
+
end
end
diff --git a/test/test_malicious_html.rb b/test/test_malicious_html.rb
index 377e6deb0fb5..7b33f68880d3 100644
--- a/test/test_malicious_html.rb
+++ b/test/test_malicious_html.rb
@@ -244,7 +244,6 @@ describe 'Malicious HTML' do
unescaped_content_elements = %w[
noembed
noframes
- noscript
plaintext
script
xmp
@@ -255,6 +254,7 @@ describe 'Malicious HTML' do
]
removed_elements = %w[
+ noscript
style
]
@@ -318,4 +318,22 @@ describe 'Malicious HTML' do
end
end
end
+
+ describe 'sanitization bypass by exploiting scripting-disabled <noscript> behavior' do
+ before do
+ @s = Sanitize.new(
+ Sanitize::Config.merge(
+ Sanitize::Config::RELAXED,
+ elements: Sanitize::Config::RELAXED[:elements] + ['noscript']
+ )
+ )
+ end
+
+ it 'is prevented by removing `<noscript>` elements regardless of the allowlist' do
+ assert_equal(
+ '',
+ @s.fragment(%[<noscript><div id='</noscript><img src=x onerror=alert(1)> '>])
+ )
+ end
+ end
end
--
2.39.1
|