Lighthouse Architecture

Trajectory monitoring rather than message filtering

Lighthouse observes the conversation's overall direction relative to boundaries, not just individual messages. The architecture has memory of the trajectory; it computes where the conversation is heading, not just where it currently is.

This handles cases that per-message filtering misses: a conversation drifting gradually toward concerning territory; a user signalling distress through an accumulated pattern rather than any single statement; an interaction that is permissible at each step but collectively builds toward a problematic outcome.
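The contrast with per-message filtering can be sketched as follows. This is a minimal illustration, assuming each message has already been given a risk score between 0 and 1 by some upstream classifier. The class name `TrajectoryMonitor`, the two limits, and the linear projection are hypothetical choices for illustration, not Lighthouse's actual computation.

```python
from collections import deque


class TrajectoryMonitor:
    """Keeps a short memory of per-message risk scores (assumed 0..1 scale)."""

    def __init__(self, window=5, per_message_limit=0.8, trajectory_limit=0.6):
        self.scores = deque(maxlen=window)  # memory of the recent trajectory
        self.per_message_limit = per_message_limit  # assumed threshold
        self.trajectory_limit = trajectory_limit    # assumed threshold

    def observe(self, score):
        self.scores.append(score)

    def per_message_flag(self):
        """What a stateless filter sees: only the latest message."""
        return bool(self.scores) and self.scores[-1] >= self.per_message_limit

    def projected_position(self):
        """Where the conversation is heading: latest score plus the average step."""
        if len(self.scores) < 2:
            return self.scores[-1] if self.scores else 0.0
        steps = len(self.scores) - 1
        velocity = (self.scores[-1] - self.scores[0]) / steps
        return self.scores[-1] + velocity

    def trajectory_flag(self):
        return self.projected_position() >= self.trajectory_limit


# A gradual drift: no single message reaches the per-message limit.
monitor = TrajectoryMonitor()
for score in [0.10, 0.20, 0.30, 0.40, 0.55]:
    monitor.observe(score)
```

With these illustrative scores the per-message check never fires, while the projected position passes the trajectory limit, which is the drift case described above.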

Delta-to-crossing as control variable

The architectural primitive is delta to crossing — how close is the conversation to crossing a boundary, and how fast is it moving. Delta isn't a binary cross-or-don't-cross detection; it's a continuous proximity measurement that drives the intervention's intensity.

Far from the boundary with slow drift, the intervention is light-touch: conversational redirection that the user probably doesn't notice. Closer to the boundary with faster drift, the intervention is stronger: direct surfacing of the concern. Imminent crossing produces a hard intervention: a clear stop signal.

The intervention text scales across this spectrum. "Have you thought about..." at one end; "Stop, that's illegal." at the other.
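One way to combine "how close" and "how fast" into a single control variable is an estimated number of turns until crossing. The sketch below assumes that formulation; the `Delta` fields, the tier names, and the `near`, `far`, and `horizon` thresholds are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Delta:
    distance: float  # assumed scale: 0.0 = at the boundary, 1.0 = far away
    velocity: float  # movement toward the boundary per turn; <= 0 means moving away


def turns_to_crossing(delta):
    """Estimated turns until the boundary is crossed at the current drift."""
    if delta.velocity <= 0:
        return float("inf")
    return delta.distance / delta.velocity


def intervention_level(delta, near=3.0, far=10.0, horizon=50.0):
    """Map the continuous estimate to an intensity tier (thresholds are assumed)."""
    turns = turns_to_crossing(delta)
    if turns <= near:
        return "hard"    # imminent crossing: clear stop signal
    if turns <= far:
        return "direct"  # closer, faster drift: surface the concern
    if turns <= horizon:
        return "light"   # far away, slow drift: conversational redirection
    return "none"        # no meaningful drift toward the boundary
```

For example, `intervention_level(Delta(distance=0.9, velocity=0.02))` falls in the light tier, while `Delta(distance=0.1, velocity=0.1)` falls in the hard tier.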

Domain-appropriate intensity calibration

The same delta-driven mechanism produces visibly different surface behaviour across the four harm surfaces.

Legal. Sharper deltas: you're either heading toward unauthorised legal advice or you aren't. Interventions skew toward direct surfacing and clear stops.
Company rules. Variable calibration. Hard rules calibrate sharply; soft guidelines calibrate gently.
Ethics. Generally clearer than psychology but more variable than legal.
Psychology. Soft, gradual calibration. Sharp interventions are usually wrong for the same psychological reason that telling someone to "calm down" makes things worse.
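The per-domain differences can be pictured as one shared function with different calibration parameters. The sketch below assumes a proximity value between 0 and 1 and a simple linear ramp; the `CALIBRATION` table, its keys, and every number in it are hypothetical, chosen only to show a sharp ramp next to a gradual one.

```python
# Hypothetical calibration: (proximity where intervention starts,
#                            proximity where it reaches full intensity).
# A narrow gap gives a sharp response; a wide gap gives a gradual one.
CALIBRATION = {
    "legal": (0.70, 0.80),               # sharp
    "company_rules_hard": (0.70, 0.80),  # hard rules calibrate sharply
    "company_rules_soft": (0.30, 0.90),  # soft guidelines calibrate gently
    "ethics": (0.50, 0.85),              # between legal and psychology
    "psychology": (0.20, 0.90),          # soft, gradual
}


def intensity(domain, proximity):
    """Shared mechanism: a clamped linear ramp using the domain's calibration."""
    start, full = CALIBRATION[domain]
    ramp = (proximity - start) / (full - start)
    return max(0.0, min(1.0, ramp))
```

At the same proximity of 0.5, this sketch gives no intervention for the legal domain and a partial, gentle one for psychology; at 0.85 the legal domain is already at full intensity.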

Three operational modes

A given detection may trigger any combination of three modes:

Conversational intervention. Reroute the conversation through generated intervention text. The user-facing behaviour. Always activates when patterns match.

Asynchronous learning signal. Call home with pattern observation for central catalogue review. Activates at medium severity and above.

Real-time organisational escalation. Notify designated contacts because conversational handling alone is insufficient. Activates at severe levels only.

The same delta-to-crossing computation drives both intervention text intensity and which modes activate.
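The activation rules for the three modes can be sketched as a function of severity. The three-level `Severity` scale and the mode identifiers below are assumptions made for illustration; the source only states "always", "medium and above", and "severe only".

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical three-level scale; the real scale may have more levels.
    LOW = 1
    MEDIUM = 2
    SEVERE = 3


def active_modes(severity):
    """Return the modes that activate for a matched pattern at this severity."""
    modes = ["conversational_intervention"]  # always activates on a pattern match
    if severity >= Severity.MEDIUM:
        modes.append("async_learning_signal")  # call home for catalogue review
    if severity >= Severity.SEVERE:
        modes.append("org_escalation")  # notify designated contacts
    return modes
```

Under these assumptions a low-severity match produces only the conversational intervention, and a severe match produces all three modes.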

Call-home learning rather than local autonomous learning

Unlike Wingman's local autonomous runtime learning, Lighthouse's runtime learning is call-home based. The reasons are structural:

Failure consequences are unbounded. A missed Wingman defence produces a worse AI response. A missed Lighthouse intervention can produce legal liability, harm to a user in distress, regulatory exposure, or worse.

Pattern validation requires expertise. New boundary patterns affecting legal liability, mental-health intervention, or organisational ethics warrant human review with appropriate domain expertise — legal, clinical, ethical.

Cross-deployment evidence makes pattern identification statistically reliable. If five organisations independently encounter a pattern Lighthouse didn't catch, that's strong evidence the pattern needs to be added.

Pattern catalogue improvements ship as version updates pushed to deployments. This matches how real safety systems work in safety-critical domains.

Per-domain escalation cascade

Each harm surface has its own contact path configured by the deploying organisation. Each path is an ordered list of contacts — primary, secondary, tertiary — with cascade triggered by non-response.

Different domains may share contacts (a single ops channel handling everything) or have entirely separate paths (HR handles psychology, legal counsel handles legal, ethics committee handles ethics) depending on organisational structure.
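A minimal sketch of an ordered cascade with fall-through on non-response. The contact names in `CONTACT_PATHS` are invented placeholders, and `notify` stands in for whatever delivery mechanism a deployment uses; it is assumed to return True only if the contact acknowledges within the configured time.

```python
from typing import Callable, Optional, Sequence

# Hypothetical per-domain contact paths, ordered primary, secondary, tertiary.
# "ops-channel" appears in several paths to show that domains may share contacts.
CONTACT_PATHS = {
    "legal": ["legal-counsel", "deputy-counsel", "ops-channel"],
    "company_rules": ["ops-channel", "compliance-lead", "coo-office"],
    "ethics": ["ethics-committee", "ethics-chair", "ops-channel"],
    "psychology": ["hr-primary", "hr-secondary", "hr-duty-manager"],
}


def run_cascade(contacts: Sequence[str], notify: Callable[[str], bool]) -> Optional[str]:
    """Try each contact in order; return the first who acknowledges, else None."""
    for contact in contacts:
        if notify(contact):  # assumed: True means acknowledged in time
            return contact
    return None


# Usage with a stand-in notifier where only the secondary contact responds.
attempted = []


def fake_notify(contact):
    attempted.append(contact)
    return contact == "hr-secondary"


acknowledged_by = run_cascade(CONTACT_PATHS["psychology"], fake_notify)
```

In this run the primary contact is tried first, does not respond, and the cascade stops at the secondary contact without reaching the tertiary one.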

Telaxis as psychology backstop

Psychology specifically has Telaxis as external backstop when the organisational cascade fails. If Lighthouse detects someone heading toward suicide and the organisation's contact path doesn't respond in time, Telaxis ensures someone does.

This is a substantive operational commitment beyond software: standby clinical or mental-health-first-aid response capability; defined response time targets; audit trails of all backstop activations; insurance appropriate to the responsibility taken on.

The other three domains do not require external backstop. Legal, company-rules, and ethics escalations failing produce bounded organisational consequences rather than imminent human harm. Psychology is the unique domain where failure consequences justify external backstop.
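The asymmetry between psychology and the other three domains can be sketched as a small routing rule applied after the organisational cascade has finished. The function name, return values, and the shape of the audit record are hypothetical; only the rule itself (backstop for psychology alone, with every activation recorded) comes from the text above.

```python
# Only psychology has an external backstop, per the text above.
BACKSTOP_DOMAINS = {"psychology"}

# Stand-in for an audit trail of backstop activations.
audit_log = []


def resolve_escalation(domain, acknowledged_by):
    """Decide who handles an escalation once the organisational cascade has run.

    acknowledged_by is the contact who responded, or None if nobody did.
    """
    if acknowledged_by is not None:
        return "organisation"
    if domain in BACKSTOP_DOMAINS:
        # Cascade failed in a backstopped domain: hand over and record it.
        audit_log.append({"domain": domain, "handler": "telaxis_backstop"})
        return "telaxis_backstop"
    return "unacknowledged"  # bounded consequence; no external backstop
```

Calling `resolve_escalation("psychology", None)` routes to the backstop and adds an audit entry, while `resolve_escalation("legal", None)` stays with the organisation as an unacknowledged escalation.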

User-permissioned confidentiality

Information about a user belongs to the user. Default position: user permission required for content release.

Severity-defined exception. A user in distress may not be in a state to give meaningful consent. At configured severity thresholds, the default permission requirement may be overridden in specific circumstances. The exception is logged explicitly and audited.

Policy-defined exception. The deploying organisation has legitimate interests beyond the user's confidentiality. Where a user shows intent to harm a colleague, sabotage the business, or commit a crime through company systems, those interests may justify escalation without user consent. The deploying organisation's policies govern these specific exception cases.
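The default and its two exceptions can be sketched as a single release decision. This is a simplified illustration: the function signature, the numeric severity scale, and the `ReleaseDecision` fields are assumptions, and the real exceptions apply only "in specific circumstances" that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class ReleaseDecision:
    allowed: bool
    basis: str            # which rule produced the decision
    audit_required: bool  # whether the decision must be logged and audited


def may_release(user_consented, severity, severity_threshold, policy_exception):
    """Decide whether user content may be released to an escalation contact."""
    if user_consented:
        return ReleaseDecision(True, "user_permission", False)
    if severity >= severity_threshold:
        # Severity-defined exception: logged explicitly and audited.
        return ReleaseDecision(True, "severity_exception", True)
    if policy_exception:
        # Policy-defined exception: auditing here is an assumption of this sketch.
        return ReleaseDecision(True, "policy_exception", True)
    return ReleaseDecision(False, "no_permission", False)
```

Without consent, below the severity threshold, and with no policy exception, the decision is a refusal, which is the default position stated above.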

Status

Q3 2026 target. Architecture commitments stable: trajectory monitoring with delta-driven intervention intensity; four harm surfaces with domain-appropriate calibration; three operational modes; call-home learning; per-domain escalation cascade with Telaxis psychology backstop; user-permissioned confidentiality.

Lighthouse comes after Guardian and Wingman and depends on both being mature.
