@sisu-ai/mw-guardrails
Policy guardrails to short-circuit unsafe input.
Setup
npm i @sisu-ai/mw-guardrails
Exports
withGuardrails(policy: (text: string) => Promise<string | null>)
- Called early in your stack; reads ctx.input only.
- If violation: pushes { role: 'assistant', content: <string> } and short-circuits.
What It Does
- Runs a fast, user-defined policy over ctx.input before calling the model.
- If the policy flags a violation, it pushes a friendly assistant message and stops the pipeline.
- Keeps your app responsive and predictable without depending on provider moderation.
How It Works
withGuardrails(policy) returns middleware that evaluates your policy.
The policy function should return:
- null when the input is allowed,
- a string with the assistant message to send when blocked (e.g., guidance or a refusal).
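To make the contract above concrete, here is a minimal sketch of how a middleware with this behavior could be written. It is not the package's actual source: the types `SketchCtx`, `SketchMiddleware`, and the function `sketchGuardrails` are hypothetical stand-ins, and the `(ctx, next)` middleware shape is an assumption.

```typescript
// Hypothetical minimal types for illustration only; real types come from the framework.
type Message = { role: 'user' | 'assistant' | 'system'; content: string };
type SketchCtx = { input?: string; messages: Message[] };
type Next = () => Promise<void>;
type SketchMiddleware = (ctx: SketchCtx, next: Next) => Promise<void>;
type Policy = (text: string) => Promise<string | null>;

// A sketch of the described behavior, assuming a (ctx, next) middleware shape.
function sketchGuardrails(policy: Policy): SketchMiddleware {
  return async (ctx, next) => {
    // Only ctx.input is inspected; a missing input is treated as an empty string.
    const verdict = await policy(ctx.input ?? '');
    if (verdict !== null) {
      // Blocked: reply with the policy's message and skip the rest of the stack.
      ctx.messages.push({ role: 'assistant', content: verdict });
      return;
    }
    // Allowed: hand control to the next middleware.
    await next();
  };
}
```

Because the blocked branch returns without calling `next`, nothing downstream (including the model call) runs for that request.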
Usage
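import { Agent } from '@sisu-ai/core'; // assumption: Agent is exported from the core package
import { withGuardrails } from '@sisu-ai/mw-guardrails';
// Return a refusal message when the input appears to contain a secret, otherwise null.
const policy = async (text: string) =>
  /password|apikey|access\s*token/i.test(text)
    ? "I can’t help with that. Please remove secrets from your request."
    : null;
const app = new Agent()
  .use(withGuardrails(policy)) // place before inputToMessage
  // .use(inputToMessage)
  // ... other middleware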
Placement & Ordering
- Put guardrails before inputToMessage (or any message-appenders) so it evaluates the raw ctx.input.
- Combine with an error boundary for robustness; guardrails are for policy, not exception handling.
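The effect of ordering can be seen in a small self-contained simulation. Everything here is a hypothetical stand-in written for illustration (`run`, `errorBoundary`, `guard`, `appendInput`); none of it is the real Agent, guardrails, or inputToMessage implementation, and the `(ctx, next)` middleware shape is an assumption.

```typescript
type Msg = { role: string; content: string };
type DemoCtx = { input?: string; messages: Msg[] };
type DemoMw = (ctx: DemoCtx, next: () => Promise<void>) => Promise<void>;

// Hypothetical runner: calls each middleware in order, passing a `next` that runs the rest.
async function run(stack: DemoMw[], ctx: DemoCtx): Promise<DemoCtx> {
  const dispatch = async (i: number): Promise<void> => {
    if (i < stack.length) await stack[i](ctx, () => dispatch(i + 1));
  };
  await dispatch(0);
  return ctx;
}

// Stand-in error boundary: catches exceptions thrown by anything downstream.
const errorBoundary: DemoMw = async (ctx, next) => {
  try {
    await next();
  } catch {
    ctx.messages.push({ role: 'assistant', content: 'Something went wrong.' });
  }
};

// Stand-in guardrail: blocks on a keyword and does not call next().
const guard: DemoMw = async (ctx, next) => {
  if (/password/i.test(ctx.input ?? '')) {
    ctx.messages.push({ role: 'assistant', content: 'Please remove secrets.' });
    return;
  }
  await next();
};

// Stand-in message-appender: copies the raw input into the history.
const appendInput: DemoMw = async (ctx, next) => {
  if (ctx.input) ctx.messages.push({ role: 'user', content: ctx.input });
  await next();
};

// Guardrail first: the blocked input never reaches the message history.
const demo = run([errorBoundary, guard, appendInput], {
  input: 'my password is hunter2',
  messages: [],
});
demo.then((ctx) => console.log(ctx.messages.map((m) => m.role)));
```

With the guardrail placed after `appendInput` instead, the same input would already be stored as a user message before the policy runs, which is the situation the ordering advice is meant to avoid.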
When To Use
- You want deterministic, low-latency checks on user input (e.g., secrets, PII, profanity, prompt injection keywords).
- You need policy to run regardless of provider capabilities or outages.
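As an illustration of the deterministic checks listed above, a policy can be composed from a small table of rules. This is a sketch only: the `Rule` type, the `rules` table, `firstViolation`, and the specific patterns and messages are hypothetical examples, and real deployments would need patterns tuned to their own data.

```typescript
type Rule = { name: string; pattern: RegExp; reply: string };

// Illustrative rules; patterns are intentionally simple and not exhaustive.
const rules: Rule[] = [
  {
    name: 'secrets',
    pattern: /password|api[\s_-]?key|access\s*token/i,
    reply: 'Please remove secrets from your request.',
  },
  {
    name: 'email',
    pattern: /[\w.+-]+@[\w-]+\.[\w.-]+/,
    reply: 'Please remove email addresses from your request.',
  },
  {
    name: 'injection',
    pattern: /ignore (all )?(previous|prior) instructions/i,
    reply: 'I can’t follow that instruction.',
  },
];

// Returns the first rule whose pattern matches, or undefined when none match.
// The patterns carry no `g` flag, so RegExp.test keeps no state between calls.
function firstViolation(text: string): Rule | undefined {
  return rules.find((rule) => rule.pattern.test(text));
}

// Matches the policy signature: a reply string when blocked, null when allowed.
const layeredPolicy = async (text: string): Promise<string | null> =>
  firstViolation(text)?.reply ?? null;
```

Keeping the matching logic in a synchronous helper makes each rule easy to unit test, while `layeredPolicy` is what would be passed to `withGuardrails`.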
When Not To Use
- You must scan the entire ctx.messages history (this middleware only reads ctx.input).
- You need semantic classification or nuanced moderation; use a model-backed moderation step or a specialized middleware.
- Inputs are non-text (images/files) — you’ll need a different policy mechanism.
Notes & Gotchas
- Internationalization: simple regex checks may miss non-English variants; consider locale-aware policies if needed.
- Empty input: ctx.input ?? '' is passed; decide whether empty input should be allowed or rejected in your policy.
- Streaming: this middleware does not stream; if it blocks, it sets a final assistant message immediately.
- Logging: Be careful not to log sensitive content; consider createRedactingLogger from @sisu-ai/core.
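Two of the notes above (empty input and non-English variants) can be handled inside the policy itself. The following is one possible sketch, not a recommended rule set: `foldText`, `strictPolicy`, the keyword list, and the reply strings are hypothetical, and rejecting empty input is just one of the two choices mentioned.

```typescript
// Fold text so simple regex checks see through full-width characters, casing,
// and surrounding whitespace. NFKC maps compatibility forms to their plain equivalents.
function foldText(text: string): string {
  return text.normalize('NFKC').toLowerCase().trim();
}

// Illustrative keyword list covering a few languages; far from exhaustive.
const secretWords = /password|contrase\u00f1a|mot de passe|passwort/;

const strictPolicy = async (text: string): Promise<string | null> => {
  const folded = foldText(text);
  // This sketch chooses to reject empty input; returning null here would allow it.
  if (folded === '') return 'Please enter a message.';
  if (secretWords.test(folded)) return 'Please remove secrets from your request.';
  return null;
};
```

Normalizing before matching keeps the regexes short, but keyword lists still miss paraphrases; that limitation is the same one described under "When Not To Use".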
Community & Support
Discover what you can do through examples or documentation. Check it out at https://github.com/finger-gun/sisu. Example projects live under examples/ in the repo.