BestAIFor.com

Mask

Overview / Description

Mask is an open-source AI Data Loss Prevention (DLP) tool that intercepts and encrypts sensitive data flowing through LLM agent pipelines. It is designed for engineering teams building compliant AI applications. Mask sits between the LLM context window and tool execution environments and applies Format-Preserving Encryption (FPE), so that PII is replaced with format-identical ciphertext tokens before it reaches the model and is restored only when an authorized backend function needs the real value.

The core mechanism is a three-phase, local-first, just-in-time (JIT) workflow: masking replaces detected entities with HMAC-derived tokens; a pre-tool decryption hook unmasks values before downstream tools are called; and a post-tool re-masking hook catches any new PII returned in tool output before it flows back to the LLM. Because tokenization is HMAC-based and deterministic within a session, the LLM's reasoning context stays coherent without ever seeing raw personal data.
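The three phases can be illustrated with a minimal Python sketch. It is not Mask's API: every name here (`SESSION_KEY`, `vault`, `mask`, `unmask`, `call_tool_with_jit`, `lookup_account`) is hypothetical, only email addresses are detected, and an in-memory dictionary stands in for the vault. The tokens are HMAC-derived and email-shaped, but the original values are recovered by vault lookup rather than by real format-preserving decryption.

```python
import hashlib
import hmac
import re

SESSION_KEY = b"demo-session-key"  # assumption: a per-session secret, hard-coded for the demo
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TOKEN_RE = re.compile(r"tok_[0-9a-f]{12}@masked\.invalid")

vault = {}  # token -> original value; in-memory stand-in for a vault backend


def tokenize(value):
    """Derive a deterministic, email-shaped token from the value with HMAC-SHA256."""
    digest = hmac.new(SESSION_KEY, value.encode(), hashlib.sha256).hexdigest()[:12]
    token = f"tok_{digest}@masked.invalid"
    vault[token] = value
    return token


def _mask_match(match):
    # Tokens are email-shaped too, so leave already-masked values alone.
    found = match.group()
    return found if TOKEN_RE.fullmatch(found) else tokenize(found)


def mask(text):
    """Phase 1: replace detected emails with tokens before the text reaches the LLM."""
    return EMAIL_RE.sub(_mask_match, text)


def unmask(text):
    """Phase 2: restore real values just before an authorized tool call."""
    return TOKEN_RE.sub(lambda m: vault.get(m.group(), m.group()), text)


def call_tool_with_jit(tool, masked_args):
    """Run the tool on real values, then re-mask its output (phase 3)."""
    return mask(tool(unmask(masked_args)))


def lookup_account(query):
    # Hypothetical backend tool: it sees the real email and returns a new one.
    return f"Account for {query.split()[-1]} is managed by bob@example.org"


prompt = mask("Find the account for alice@example.com")
reply = call_tool_with_jit(lookup_account, prompt)
print(prompt)
print(reply)
```

Both printed strings contain only tokens. The tool received the real address, and the new address in its output was masked before it could reach the model.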

Detection uses a two-tier waterfall: a fast deterministic tier (registry lookups, checksums, context rules) handles high-confidence structured PII such as SSNs, IBANs, credit card numbers, and passport IDs; a slower probabilistic tier using transformer-based Named Entity Recognition (NER) catches fuzzy entities like names, locations, and organizations. The system supports 50+ PII entity types across financial, contact, identity, healthcare, and vehicle categories in English and Spanish.
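A rough Python sketch of such a waterfall follows, under stated assumptions: the deterministic tier is reduced to a card-number pattern plus a Luhn checksum, and a tiny hard-coded name list (`KNOWN_NAMES`) stands in for the transformer NER model. The `Entity` class and all function names are illustrative and are not taken from Mask.

```python
import re
from dataclasses import dataclass


@dataclass
class Entity:
    kind: str
    start: int
    end: int
    tier: str


# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
KNOWN_NAMES = ("Maria Lopez", "John Smith")  # stand-in for a transformer NER model


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def deterministic_tier(text):
    """Tier 1: pattern match confirmed by checksum, so confidence is high."""
    return [
        Entity("CREDIT_CARD", m.start(), m.end(), "deterministic")
        for m in CARD_RE.finditer(text)
        if luhn_valid(m.group())
    ]


def probabilistic_tier(text, claimed):
    """Tier 2: fuzzy entities, skipping spans already claimed by tier 1."""
    found = []
    for name in KNOWN_NAMES:
        for m in re.finditer(re.escape(name), text):
            overlaps = any(m.start() < e.end and e.start < m.end() for e in claimed)
            if not overlaps:
                found.append(Entity("PERSON", m.start(), m.end(), "probabilistic"))
    return found


def detect(text):
    first = deterministic_tier(text)
    second = probabilistic_tier(text, first)
    return sorted(first + second, key=lambda e: e.start)


sample = "Maria Lopez paid with 4111 1111 1111 1111, ref 1234 5678 9012 3456."
entities = detect(sample)
for e in entities:
    print(e.kind, e.tier, sample[e.start:e.end])
```

The second 16-digit number fails the Luhn check, so the deterministic tier does not flag it. That is the kind of false positive the checksum step is meant to filter out.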

SDKs are available for Python (with LangChain, LlamaIndex, and Google ADK integrations) and TypeScript (Node.js). Vault state can be synchronized across clusters via Redis, DynamoDB, or Memcached. Structured JSON audit logs are emitted asynchronously for ingestion into SIEM platforms such as Datadog and Splunk. The library is released under the Apache-2.0 license and is intended to help teams meet SOC2, HIPAA, and PCI-DSS obligations.
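Asynchronous structured audit logging of this kind can be approximated with the Python standard library alone. The sketch below is an assumption about the general pattern, not Mask's implementation: the field names (`entity_type`, `action`), the logger name, and the `JsonFormatter` class are invented for illustration, and a `StringIO` buffer stands in for the file or socket a SIEM agent would read.

```python
import io
import json
import logging
import logging.handlers
import queue


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "event": record.getMessage(),
            "entity_type": getattr(record, "entity_type", None),
            "action": getattr(record, "action", None),
        }
        return json.dumps(payload)


sink = io.StringIO()  # stand-in for a log file or socket tailed by a SIEM agent
stream_handler = logging.StreamHandler(sink)
stream_handler.setFormatter(JsonFormatter())

# The calling thread only enqueues records; a background thread does the I/O.
log_queue = queue.Queue()
listener = logging.handlers.QueueListener(log_queue, stream_handler)
listener.start()

audit = logging.getLogger("mask.audit.sketch")  # hypothetical logger name
audit.setLevel(logging.INFO)
audit.addHandler(logging.handlers.QueueHandler(log_queue))
audit.propagate = False

# Log the entity type and the action taken, never the raw value itself.
audit.info("entity masked", extra={"entity_type": "IBAN", "action": "mask"})

listener.stop()  # drains the queue before returning
lines = sink.getvalue().splitlines()
print(lines[0])
```

Keeping serialization and I/O on the listener thread means a slow log sink does not add latency to the masking path.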

Used For

  • Preventing PII leakage in LLM agent pipelines
  • Achieving SOC2 and HIPAA compliance for AI applications
  • PCI-DSS compliant credit card handling in AI workflows
  • Encrypting sensitive data before it enters LLM context windows
  • Just-in-time decryption for authorized tool calls in agentic systems
  • Audit logging for AI data flows into Datadog or Splunk
  • Building GDPR-aware AI agents handling EU personal data
  • Protecting healthcare identifiers and medical IDs in AI pipelines
  • Securing financial data (IBANs, SSNs, routing numbers) in LLM applications
  • Multi-language PII detection (English and Spanish) in AI systems

Pricing

Plan: Free

Pricing not published; contact support for details.

Pros & Cons

Pros

  • Format-Preserving Encryption (FPE) with HMAC-based deterministic tokens keeps the same PII mapping consistent within a session, preserving LLM reasoning without exposing raw data
  • Supports 50+ PII entity types including SSNs, IBANs, credit cards (with Luhn check preservation for 6+4 masking), passport IDs, medical IDs, and VINs across financial, identity, healthcare, and contact categories
  • Two-tier detection waterfall combines a fast deterministic registry tier and a slower transformer NER tier, with already-tokenized data skipped by the neural tier to prevent entity collisions
  • Python SDK integrates natively with LangChain, LlamaIndex, and Google ADK; TypeScript SDK targets Node.js; pluggable vault backends include Redis, DynamoDB, and Memcached
  • Asynchronous JSON audit logging designed for SIEM ingestion into Datadog and Splunk, supporting SOC2, HIPAA, and PCI-DSS compliance workflows
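The 6+4 masking with Luhn preservation mentioned in the list above can be sketched as follows. This is one plausible reading of the feature, not Mask's algorithm: the first six and last four digits are kept, the middle digits are replaced with HMAC-derived digits, and the last replaced digit is adjusted so the whole number still passes the Luhn check. `KEY` and `mask_card_6_4` are hypothetical names, and the scheme is keyed substitution rather than a standard FPE mode such as FF1.

```python
import hashlib
import hmac

KEY = b"demo-key"  # assumption: a secret key, hard-coded only for the demo


def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card_6_4(pan):
    """Keep the first 6 and last 4 digits; replace the middle, preserving Luhn."""
    if not pan.isdigit() or not 13 <= len(pan) <= 19:
        raise ValueError("expected 13 to 19 digits")
    head, middle, tail = pan[:6], pan[6:-4], pan[-4:]
    digest = hmac.new(KEY, pan.encode(), hashlib.sha256).digest()
    fake = [str(b % 10) for b in digest[: len(middle)]]  # slightly biased; fine for a sketch
    # A digit's Luhn contribution is a one-to-one mapping in either position,
    # so exactly one value for the last replaced digit makes the checksum valid.
    for candidate in "0123456789":
        fake[-1] = candidate
        masked = head + "".join(fake) + tail
        if luhn_valid(masked):
            return masked
    raise AssertionError("unreachable: one candidate always satisfies Luhn")


masked = mask_card_6_4("4111111111111111")
print(masked)
```

The output keeps the issuer prefix and the last four digits and still validates in downstream tools that run a Luhn check. In this sketch, recovering the original number would require a vault lookup, which is not shown.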

Cons

  • Early-stage project with only 1 GitHub star and 20 commits as of mid-2026, meaning production maturity and community support are unproven
  • PII detection limited to English and Spanish; teams processing data in other languages are not currently supported
  • No managed cloud service or hosted option is documented; self-hosting and infrastructure setup (vault backends, SIEM integration) are entirely the user's responsibility
  • Pricing not published; contact support for details

Alternatives

AWS Macie, Microsoft Presidio, Private AI, Nightfall AI, Skyflow