Reference Middleware¶
Module: piighost.middleware
PIIAnonymizationMiddleware¶
LangChain middleware that transparently anonymizes personal data around the LLM / tools boundary.
Extends AgentMiddleware from LangChain and intercepts the agent loop at 3 points:
| Hook | When | Operation |
|---|---|---|
abefore_model |
Before each LLM call | Anonymizes all messages |
aafter_model |
After each LLM response | Deanonymizes for the user |
awrap_tool_call |
Around each tool | Deanonymizes args, reanonymizes response |
Constructor¶
| Parameter | Type | Description |
|---|---|---|
pipeline |
ThreadAnonymizationPipeline |
Configured conversation pipeline with memory |
Usage¶
from piighost.middleware import PIIAnonymizationMiddleware
from piighost.pipeline import ThreadAnonymizationPipeline
from langchain.agents import create_agent
middleware = PIIAnonymizationMiddleware(pipeline=conv_pipeline)
agent = create_agent(
model="openai:gpt-5.4",
tools=[...],
middleware=[middleware],
)
Hooks in detail¶
abefore_model(state, runtime) -> dict | None (async)¶
Called before each LLM call. Anonymizes all messages in the conversation via pipeline.anonymize().
Behavior by message type:
HumanMessage→ full NER detection and anonymizationAIMessage→ anonymization (re-anonymizes any deanonymized values)ToolMessage→ anonymization (re-anonymizes tool responses)
# Before abefore_model:
# [HumanMessage("Send an email to Patrick in Paris")]
# After abefore_model:
# [HumanMessage("Send an email to <<PERSON:1>> in <<LOCATION:1>>")]
Returns: {"messages": [...]} if modifications were made, None otherwise.
aafter_model(state, runtime) -> dict | None (async)¶
Called after each LLM response. Deanonymizes all messages so the user sees real values. First tries pipeline.deanonymize() (cache-based), falls back to pipeline.deanonymize_with_ent() (entity-based) on CacheMissError.
# Before aafter_model:
# [AIMessage("Done! Email sent to <<PERSON:1>>.")]
# After aafter_model:
# [AIMessage("Done! Email sent to Patrick.")]
Returns: {"messages": [...]} if modifications were made, None otherwise.
awrap_tool_call(request, handler) -> ToolMessage | Command (async)¶
Wraps each tool call in 3 steps:
- Deanonymizes
strarguments → the tool receives real values - Executes the tool via
handler(request) - Reanonymizes the tool response via
pipeline.anonymize()→ the LLM never sees personal data
# LLM calls : send_email(to="<<PERSON:1>>", subject="Hello")
# ↓ deanonymize args
# Tool gets : send_email(to="Patrick", subject="Hello")
# Tool returns: "Email successfully sent to Patrick."
# ↓ reanonymize response
# LLM sees : "Email successfully sent to <<PERSON:1>>."
Only str arguments are deanonymized. Non-string types are passed through unchanged.
Full flow¶
sequenceDiagram
participant U as User
participant M as PIIAnonymizationMiddleware
participant L as LLM
participant T as Tool
U->>M: User message (plain text)
M->>M: abefore_model()<br/>anonymize HumanMessage via NER
M->>L: Anonymized message (placeholders)
L->>M: Tool call with anonymized args
M->>M: awrap_tool_call()<br/>deanonymize args
M->>T: Tool call with real values
T->>M: Tool response (real values)
M->>M: awrap_tool_call()<br/>reanonymize response
M->>L: Anonymized tool response
L->>M: Final response (placeholders)
M->>M: aafter_model()<br/>deanonymize for user
M->>U: Final response (plain text)
LangChain dependency¶
PIIAnonymizationMiddleware requires langchain to be installed. If not, an ImportError is raised on import:
Installation:
Full example¶
import asyncio
from gliner2 import GLiNER2
from langchain.agents import create_agent
from langchain_core.tools import tool
from piighost.anonymizer import Anonymizer
from piighost.detector.gliner2 import Gliner2Detector
from piighost.linker.entity import ExactEntityLinker
from piighost.resolver import MergeEntityConflictResolver, ConfidenceSpanConflictResolver
from piighost.middleware import PIIAnonymizationMiddleware
from piighost.pipeline import ThreadAnonymizationPipeline
from piighost.placeholder import LabelCounterPlaceholderFactory
@tool
def get_info(person: str) -> str:
"""Returns information about a person."""
return f"{person} is a software engineer in Paris."
model = GLiNER2.from_pretrained("fastino/gliner2-multi-v1")
entity_linker = ExactEntityLinker()
entity_resolver = MergeEntityConflictResolver()
span_resolver = ConfidenceSpanConflictResolver()
ph_factory = LabelCounterPlaceholderFactory()
anonymizer = Anonymizer(ph_factory=ph_factory)
detector = Gliner2Detector(
model=model,
threshold=0.5,
labels=["PERSON", "LOCATION"],
)
pipeline = ThreadAnonymizationPipeline(
detector=detector,
span_resolver=span_resolver,
entity_linker=entity_linker,
entity_resolver=entity_resolver,
anonymizer=anonymizer,
)
middleware = PIIAnonymizationMiddleware(pipeline=pipeline)
agent = create_agent(
model="openai:gpt-5.4",
system_prompt="You are a helpful assistant. Treat placeholders as real values.",
tools=[get_info],
middleware=[middleware],
)
async def main():
result = await agent.ainvoke({
"messages": [{"role": "user", "content": "Who is Patrick?"}]
})
print(result["messages"][-1].content)
asyncio.run(main())