FAQ¶
Is it really necessary to anonymize PII before calling an LLM?
Yes, and this holds regardless of piighost. The stakes (exfiltration to providers, legal requisition, training on conversations, GDPR compliance, leaks via RAG and tools) are covered in Why anonymize?. The page is library-agnostic: it explains why the problem exists before justifying a solution like piighost.
Do my placeholders have to look like <<PERSON:1>>?
No. The format is driven by AnyPlaceholderFactory. By default LabelCounterPlaceholderFactory produces <<LABEL:N>>, but LabelHashPlaceholderFactory yields <<LABEL:hash>>, LabelPlaceholderFactory produces <<LABEL>> without a counter, and you can write your own factory. See Placeholder factories.
Does the LLM see raw PII when it calls a tool?
No. The middleware deanonymizes arguments right before the tool executes, then re-anonymizes the tool response before it flows back to the LLM. The tool sees real values; the LLM only sees placeholders. See the sequence diagram in Architecture.
How do I control what a tool sees: placeholder or real value?
The tool_strategy parameter of PIIAnonymizationMiddleware exposes three modes (FULL, INBOUND_ONLY, PASSTHROUGH) via the ToolCallStrategy enum. The right choice depends on whether the tool may emit new PII and how strict the privacy boundary needs to be. See Tool-call strategies for the trade-offs and the decision tree, and Placeholder factories for the factory constraint that forces PreservesIdentity in every mode but PASSTHROUGH.
What happens if the LLM hallucinates a PII that was not in the input?
It is not anonymized by piighost: entity linking works on detections coming from the input, not on invented values. To cover that case, add a post-response detection pass at the application layer. See Limitations.
Is the cache shared across threads or conversations?
No. The aiocache cache is scoped by thread_id. Two parallel conversations never see each other's placeholders, preventing cross-user leaks. The thread_id is extracted automatically from the LangGraph config.
Can I use piighost without LangChain?
Yes. AnonymizationPipeline and ThreadAnonymizationPipeline are usable standalone, without the middleware. See Basic usage.
Does piighost encrypt cached data?
No. The cache stores the placeholder → value mapping in memory (or in the aiocache backend you configured). See Security for the full list of things out of scope.