Audit Log Integrity

An audit log is only as good as the guarantees you can make about it under adversarial review. "We log every query" is a starting point; "we can prove these records were written in order and have not been modified or deleted since" is the standard a compliance auditor or opposing counsel will actually apply. The techniques below — append-only storage, hash-chained entries, independent signing, and external notarization — turn logs from a convenient debug artifact into evidence.



1. What an Audit Log Must Record

For an AI document pipeline, each event should capture:


2. Append-Only Storage


3. Hash-Chained Entries

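Each log entry includes the hash of the previous entry, so altering any record breaks the link stored in the entry that follows it. A verifier that walks the chain from the first entry can detect insertion, deletion, or modification anywhere before the head. The chain alone does not protect the most recent entry or reveal truncation of the tail; signing and external notarization cover those cases.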
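To make the detection property concrete, here is a minimal unsigned sketch. The dict layout, the "GENESIS" sentinel, and the helper names (entry_hash, build_chain, first_broken_link) are assumptions chosen for illustration, not a prescribed format.

```python
import hashlib
import json

GENESIS = "GENESIS"  # assumed sentinel for the first entry's prev_hash


def entry_hash(entry: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry."""
    data = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


def build_chain(payloads: list) -> list:
    """Link each payload to its predecessor via prev_hash."""
    chain = []
    prev_hash = GENESIS
    for payload in payloads:
        entry = {"payload": payload, "prev_hash": prev_hash}
        chain.append(entry)
        prev_hash = entry_hash(entry)
    return chain


def first_broken_link(chain: list):
    """Return the index of the first entry whose prev_hash mismatches, or None."""
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev_hash"] != prev_hash:
            return i
        prev_hash = entry_hash(entry)
    return None


chain = build_chain(["query A", "query B", "query C", "query D"])

modified = [dict(e) for e in chain]
modified[1]["payload"] = "query B (edited)"  # change a record in place
deleted = chain[:1] + chain[2:]              # drop a record from the middle
truncated = chain[:2]                        # drop the tail

print(first_broken_link(chain))      # None: intact
print(first_broken_link(modified))   # 2: the entry after the edit no longer links
print(first_broken_link(deleted))    # 1: the entry after the gap no longer links
print(first_broken_link(truncated))  # None: tail truncation goes unnoticed
```

The last case is the reason a chain is normally combined with signing and an external anchor: a shortened chain is still internally consistent.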


4. Example: Signed Hash Chain

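from __future__ import annotations  # lets `AuditEntry | None` parse on Python < 3.10

import hashlib
import hmac
import json
import time
from dataclasses import asdict, dataclass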


@dataclass
class AuditEntry:
    seq: int
    ts: float
    actor: str
    action: str
    subject: dict
    prev_hash: str
    mac: str = ""


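def _canonical(e: AuditEntry) -> bytes:
    # Serialize every field except the MAC itself. Sorted keys and fixed
    # separators make the byte encoding deterministic for a given entry.
    d = asdict(e)
    d.pop("mac")
    return json.dumps(d, sort_keys=True, separators=(",", ":")).encode()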


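def append(prev: AuditEntry | None, actor: str, action: str,
           subject: dict, sign_key: bytes) -> AuditEntry:
    """Create the next entry, linked to `prev` and authenticated with `sign_key`."""
    entry = AuditEntry(
        seq=(prev.seq + 1) if prev else 1,
        ts=time.time(),
        actor=actor,
        action=action,
        subject=subject,
        # The link covers the previous entry's canonical fields (not its MAC).
        prev_hash=hashlib.sha256(_canonical(prev)).hexdigest() if prev else "GENESIS",
    )
    # The MAC covers every field of this entry, including prev_hash.
    entry.mac = hmac.new(sign_key, _canonical(entry), hashlib.sha256).hexdigest()
    return entry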


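def verify_chain(entries: list[AuditEntry], sign_key: bytes) -> bool:
    """Walk the chain from the first entry; fail on the first broken link or bad MAC."""
    prev = None
    for e in entries:
        # Recompute the hash the entry should have recorded for its predecessor.
        expected_prev = (hashlib.sha256(_canonical(prev)).hexdigest()
                         if prev else "GENESIS")
        if e.prev_hash != expected_prev:
            return False
        # Recompute the MAC and compare in constant time.
        mac = hmac.new(sign_key, _canonical(e), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(mac, e.mac):
            return False
        prev = e
    return True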

For production, replace HMAC with an asymmetric signature (Ed25519 or ECDSA P-256) whose private key is held in a KMS or HSM. The HMAC version is illustrative: because HMAC is symmetric, anyone who can verify the log also holds the key needed to forge it, whereas asymmetric signing lets an auditor verify with only the public key.


5. External Notarization

To defend against silent truncation, periodically publish the chain's current head hash to an external, append-only medium.

Cadence is a trade-off: anchor every minute and at most about a minute of entries can be truncated or rewritten before the next anchor exposes it, but cost rises; anchor daily and the detection window grows to a day.
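The anchoring check can be sketched as follows. This is a minimal illustration under stated assumptions: the Anchor record, the dict-based entries, and the helper names are hypothetical, and actually publishing the anchor to an external medium is left out entirely.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    seq: int        # number of entries covered when the anchor was taken
    head_hash: str  # hash of the entry at position seq


def entry_hash(entry: dict) -> str:
    data = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


def build_chain(payloads: list) -> list:
    """Hypothetical helper: link each payload to its predecessor."""
    chain, prev_hash = [], "GENESIS"
    for payload in payloads:
        entry = {"payload": payload, "prev_hash": prev_hash}
        chain.append(entry)
        prev_hash = entry_hash(entry)
    return chain


def make_anchor(chain: list) -> Anchor:
    """Snapshot the current head of a non-empty chain for external publication."""
    return Anchor(seq=len(chain), head_hash=entry_hash(chain[-1]))


def consistent_with(chain: list, anchor: Anchor) -> bool:
    """True if the log still holds the anchored head at the anchored position."""
    if len(chain) < anchor.seq:
        return False  # the log is shorter than what was anchored: truncation
    return entry_hash(chain[anchor.seq - 1]) == anchor.head_hash


anchor = make_anchor(build_chain(["a", "b", "c"]))

grown = build_chain(["a", "b", "c", "d"])      # normal growth after the anchor
truncated = build_chain(["a", "b"])            # tail removed
rewritten = build_chain(["a", "b", "X", "d"])  # anchored entry replaced

print(consistent_with(grown, anchor))      # True
print(consistent_with(truncated, anchor))  # False
print(consistent_with(rewritten, anchor))  # False
```

This check only compares the anchored position; it should be run together with a full chain walk, since the anchored head commits to earlier entries only through the prev_hash links.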


6. Access Controls & Retention

