File: 2.17.0-interpolation.md

package info (click to toggle)
apache-log4j2 2.17.1-1~deb11u1
links: PTS, VCS
area: main
in suites: bullseye
size: 21,328 kB
sloc: java: 174,836; xml: 26,222; sh: 97; jsp: 21; javascript: 7; makefile: 5
file content (156 lines) | stat: -rw-r--r-- 9,303 bytes
parent folder | download | duplicates (2)
I'd like to go into detail on some of the changes in 2.17.0, why they're so important, and how they relate to both [CVE-2021-45046](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45046) and [CVE-2021-45105](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45105).

The substitution of untrusted log data allowed access to code that was never meant to be exposed. Lookups should be triggered only by configuration and the logging framework (including custom layout/appender/etc plugins). Not by user-provided inputs.

## 1. PatternLayout rendered message substitution

Substitution within the contents of a rendered log message provided the largest opportunity for untrusted inputs to be evaluated by the string replacement system. This was [removed (by default) in 2.15.0](https://github.com/apache/logging-log4j2/commit/001aaada7dab82c3c09cde5f8e14245dc9d8b454), and [removed entirely in 2.16.0](https://github.com/apache/logging-log4j2/commit/27972043b76c9645476f561c5adc483dec6d3f5d) after public disclosure.

## 2. Recursive Substitution Within Lookups

Despite message replacements being removed in 2.15.0, other pathways still exist (based on the configuration) which can evaluate untrusted user-provided values. For example, a `PatternLayout` using the pattern `%p %t %c $${ctx:userAgent} %m%n`[1][2] would re-evaluate `config.getStrSubstitutor().replace(event, " ${ctx:userAgent} ")` for each log-event due to the [LiteralPatternConverter](https://github.com/apache/logging-log4j2/blob/cffe58f6a433ea1ab60ceb129d4c9b3377acda1d/log4j-core/src/main/java/org/apache/logging/log4j/core/pattern/LiteralPatternConverter.java#L62-L65). The substitutor would evaluate recursively, and invoke unexpected lookups. At worst triggering something like jndi or an exception preventing logging, but even resulting in logging that doesn't match the literal data input is a very serious bug. For example if `MDC.put("userAgent", "${lower:FOO}")` then `${ctx:userAgent}` would evaluate to literal `foo` instead of literal `${lower:FOO}`

This is not isolated to the PatternLayout, rather anywhere lookups in which the result contained user-provided data. e.g. `${ctx:USER_PROVIDED}`, `${event:Message}`. For example `<Routes pattern="$${ctx:userAgent}">` would be impacted as well.

In 2.17.0 we resolved this class of issues with a simple idea:
Recursive evaluation is allowed while parsing the configuration (no user-input/LogEvent data is present, and configuration breaks are to be avoided) however when log-events themselves are being evaluated we _never_ recursively evaluate substitutions. [That's it](https://github.com/apache/logging-log4j2/commit/806023265f8c905b2dd1d81fd2458f64b2ea0b5e). That means when `${lower:${event:Message}}` is evaluated for message `Hello, World!` the result is `hello, world!`, and when evaluated for `${java:version}` the result is the string literal `${java:version}` which itself is not evaluated.

-----

1. Note that I used `$${ctx:userAgent}` so the value itself is not evaluated, however even `${ctx:userAgent}` may be vulnerable because it's not likely that a thread-context value is set when the configuration itself is evaluated, in which case no replacement occurs and the value is equivalent to the `$${ctx:userAgent}` example.
2. Please use `%X{userAgent}` in the PatternLayout instead of `${ctx:userAgent}`. It's safer, more obvious, and more efficient. LogEventPatternConverters are pluggable, don't rely on stringly typed data, and are created when the pattern is parsed, not re-scanned on a per-LogEvent basis.

## 3. Impact Upon Configuration

There are certainly cases in which this change impacts existing configurations, however I believe that the safety is well worth the trade-off.
For example, [this test configuration](https://github.com/apache/logging-log4j2/commit/806023265f8c905b2dd1d81fd2458f64b2ea0b5e#diff-f13a31d919bf2e7169ca936948aeef1cda6089f295be684d71f2bd5709248475) had to be updated because routes are created based on a log-event, thus we cannot allow recursive evaluation. An argument can be made that the inputs must already be sanitized to avoid path traversal, however the filename isn't the only place these lookups can be evaluated, and the type of sanitization differs dramatically between validating against string replacement data, and validating against unexpected path-based attacks.

In a perfect world I'd love to swap the types used for substitution to avoid stringly matching anything, using the type system to differentiate between trusted lookups and user-input data, however in such a large, widely-used framework, that's not something we can do, certainly not quickly or easily.

### Configuration Impact: Examples

**Complex Lookups :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:**

This example uses complex lookups based on the [ThreadContext](https://logging.apache.org/log4j/2.x/manual/thread-context.html) (aka MDC), however it continues to work as expected **and** prevents user-provided data from `key1` or `key2` from being substituted.

```xml
<--
Note that lookups are all escaped to prevent them from being evaluated
immediately when the Pattern is constructed
-->
<PatternLayout pattern="%d %p %t %c $${lower:$${ctx:key1:-$${ctx:key2}}} %m%n"/>
```


**Pattern Reuse Via Lookups :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:**

This is not impacted either:

```xml
<Properties>
  <Property name="defaultPattern">%d %p %t %c %m%n</Property>
</Properties>
<Appenders>
  <Console name="console">
    <PatternLayout pattern="${defaultPattern}"/>
  </Console>
</Appenders>
```

**Lazy Pattern Lookups Which include Lazy Lookups :x::x::x:**

However, in this case, evaluation depends on a lazily-replaced property which itself is lazily evaluated:

```xml
<Properties>
  <Property name="messageFormat">$${event:Message}</Property>
</Properties>
<Appenders>
  <Console name="console">
    <PatternLayout pattern="%c $${messageFormat}%n"/>
  </Console>
</Appenders>
```

The upgrade to 2.17.0 would change the output for the following log line:

```java
LogManager.getLogger(com.example.Main.class).info("example");
```

```diff
- com.example.Main example
+ com.example.Main ${event:Message}
```

This can be resolved by unescaping the property reference in the pattern, allowing the pattern itself to be fully evaluated when the configuration is loaded, but preventing message contents from being further interpolated due to the runtime-substitution protections.

Note that this example is contrived, and in practice one should use the `%m` pattern here rather than a lookup, however this example applies to a general shape of usage and some extrapolation may be required for your configuration.


```diff
<Properties>
  <Property name="messageFormat">$${event:Message}</Property>
</Properties>
<Appenders>
  <Console name="console">
-    <PatternLayout pattern="%c $${messageFormat}%n"/>
+    <PatternLayout pattern="%c ${messageFormat}%n"/>
  </Console>
</Appenders>
```

**RoutingAppender with lazy property references in routes :x::x::x:**

RoutingAppender routes are lazily evaluated when they're needed, which means a LogEvent (thus potentially user-provided data) is available. Everything that could be represented before can still be done, however it's no longer possible to extract some pieces to shared properties.

Let's take the [example from the test liked above](https://github.com/apache/logging-log4j2/commit/806023265f8c905b2dd1d81fd2458f64b2ea0b5e#diff-f13a31d919bf2e7169ca936948aeef1cda6089f295be684d71f2bd5709248475)

```xml
<Configuration status="WARN" name="RoutingTest">
  <Properties>
    <Property name="filename">target/routing1/routingtest-$${sd:type}.log</Property>
  </Properties>
  <Appenders>
    <Routing name="Routing">
      <Routes>
        <Route>
          <RollingFile name="Routing-${sd:type}" fileName="${filename}"
                       filePattern="target/routing1/test1-${sd:type}.%i.log.gz">
            <PatternLayout>
              <Pattern>%d %p %C{1.} [%t] %m%n</Pattern>
            </PatternLayout>
            <SizeBasedTriggeringPolicy size="500" />
          </RollingFile>
        </Route>
      </Routes>
    </Routing>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="Routing"/>
    </Root>
  </Loggers>
</Configuration>
```

The reference to `fileName="${filename}"` is no longer evaluated recursively to `fileName="target/routing1/routingtest-${sd:type}.log"` however the `${sd:type}` is left as a string literal, not expanded as it was before.

The fix in this case is to inline the `fileName` property value into the `RollingFile` `filename` attribute, removing the extra`$` escaping the `${sd:type}` lookup like so:

```diff
    <Routing name="Routing">
      <Routes>
        <Route>
-          <RollingFile name="Routing-${sd:type}" fileName="${filename}"
+          <RollingFile name="Routing-${sd:type}" fileName="target/routing1/routingtest-${sd:type}.log"
                       filePattern="target/routing1/test1-${sd:type}.%i.log.gz">
            <PatternLayout>
              <Pattern>%d %p %C{1.} [%t] %m%n</Pattern>
            </PatternLayout>
            <SizeBasedTriggeringPolicy size="500" />
          </RollingFile>
        </Route>
      </Routes>
    </Routing>
```