MathJax

Thursday, January 31, 2019

Mitigating log injection in log4j--with tests!

Problem: Log injection
Log injection is similar to SQL injection or cross-site scripting, but for log messages. Let's say your application's log entries conform to a pattern like this:

<date> [<severity>] <class>: <message><newline>

This might be represented as a log4j pattern layout such as

%d [%p] %c: %m%n

...and you might log in your application with code such as


public GreetResponse handleGreet(final GreetRequest request) {
    log.info("Greeting {}", request.name());
    // ...etc.
}


What can happen in this case, if you're not careful, is that external input incorporated to the log message could lead to forged log entries.

If you're greeting somebody with a normal name, like "Samantha"/"Abhishek"/"Zhou" no big deal. Your log messages look normal:


30 Jan 2019 06:29:02,857 [INFO] com.example.GreetingServer: Greeting Samantha
30 Jan 2019 06:29:06,517 [INFO] com.example.GreetingServer: Greeting Abhishek
30 Jan 2019 06:29:10,112 [INFO] com.example.GreetingServer: Greeting Zhou

But what if my "name" contains a newline character? This can lead to mischievous things like stealing your limited attention by faking errors:

Name: "Austin
30 Jan 2019 06:29:10,112 [ERROR] com.example.GreetingServer: HACKERS ARE PWNING ME"

Resulting log lines:


30 Jan 2019 06:29:09,712 [INFO] com.example.GreetingServer: Greeting Austin
30 Jan 2019 06:29:10,112 [ERROR] com.example.GreetingServer: HACKERS ARE PWNING ME


So now in the logs there is no way to distinguish between normal entries and a forged entry. If you're scanning for error-classified logs, you're going to get a false positive. Great deflection tactic to fill the logs with noise if an attacker is doing something bad to you in this contrived example, but with some creativity I'm sure you can come up with even more devious entries to forge in your own application. Best to avoid.
The mitigation: Escaping
Like SQLi or XSS, the proper way to mitigate log injection is to escape at the point of output. How you format your logs determines exactly what this means, but in many cases the implicit assumption in log entries is that they are separated by newline characters, as in our example.

If this is the case with your application, Log4j has a special layout pattern, %enc{}{CRLF}, which encodes the part of an entry between the first {}s to replace newline characters with their C-style "\n" and "\r" counterparts. Our updated pattern layout would be

%enc{%d [%p] %c: %m}{CRLF}%n

If our friend Austin comes back, the log entry for his greeting would now instead be


30 Jan 2019 06:29:09,712 [INFO] com.example.GreetingServer: Greeting Austin\n30 Jan 2019 06:29:10,112 [ERROR] com.example.GreetingServer: HACKERS ARE PWNING ME

Interesting name, but he's not fooling anyone this time.
Testing the mitigation
Like all changes to our software, this one deserves automated tests. We're encoding an assumption about the behavior of our log entries, and the sooner we know about that assumption being violated, the cheaper it will be to fix and the less damage there will be to repair.

After several false starts1, I put together a minimal set of code to assert the behavior of various log4j pattern layout strings. In the example below, the layout string under test is hardcoded, but for your own code you will want the test to directly read the same layout string you use in production: otherwise your test is giving you information about itself, not about your system.

import static org.junit.Assert.assertEquals;

import java.util.Collections;
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.LoggerContext;
import org.apache.logging.log4j.core.impl.Log4jLogEvent;
import org.apache.logging.log4j.core.layout.PatternLayout;
import org.apache.logging.log4j.message.SimpleMessage;
import org.junit.BeforeClass;
import org.junit.Test;

/**
 * Mimics log4j unit test cases to test our pattern layouts.
 *
 * Inspired by https://gitbox.apache.org/repos/asf?p=logging-log4j2.git;a=blob;f=log4j-core/src/test/java/org/apache/logging/log4j/core/layout/PatternLayoutTest.java;h=d6f3a806be2c1d068adc413497a2795ee09b265b;hb=HEAD
 */

public class LoggingTest {
    private static LoggerContext ctx;


    @BeforeClass
    public static void setupClass() {
        ctx = LoggerContext.getContext();
        ctx.reconfigure();
    }

    @Test
    public void logPatternPreventsNewlineInjection() {
        final PatternLayout brazilConfigLogPattern = PatternLayout.newBuilder()
            .withPattern("%enc{%d [%p] %c: %m}{CRLF}%n")
            .withConfiguration(ctx.getConfiguration())
            .build();


        //noinspection deprecation
        final LogEvent event = Log4jLogEvent.newBuilder()
            .setLoggerName(this.getClass().getName())
            .setLoggerFqcn("it.does.not.Matter")
            .setLevel(Level.INFO)
            .setMessage(new SimpleMessage("Hel\nlo&@Goodbye"))
            .build();


        assertSingleNewline(renderedLog(brazilConfigLogPattern, event));
    }


    private static String renderedLog(final PatternLayout layout, final LogEvent event) {
        return new String(layout.toByteArray(event));

    }


    private static void assertSingleNewline(final String msg) {
        int newlinesSeen = 0;
        for (char renderedCharacter : msg.toCharArray()) {
            if (renderedCharacter == '\n') {
                newlinesSeen++;
            }
        }
        assertEquals("Rendered log messages should have exactly 1 newline character to prevent injection",
                1, newlinesSeen);
    }
}
1It must not be very popular to make unit test assertions against particular log4j layout strings, because finding resources on how to do so was maddening! Log4j, in particular, seems to resist simple configuration for unit tests. This may be part of why it's so common to see otherwise test-aware devs callous towards as critical an output of their application as logging. Log4j may be among the last survivors of the Service Locators purge in an era of dependency injection and inversion of control.
References