Open Source PII Redaction

Your Cloud.
Your Data.
Zero Trust.

Philterd builds open-source tools and consulting expertise for sensitive data redaction — deployed entirely within your infrastructure, with no third-party data exposure.

50+ PII Types Detected
100% Open Source
0 Third-Party Calls

// Redact PII using philter-sdk-java
import ai.philterd.philter.sdk.PhilterClient;
import ai.philterd.philter.sdk.model.FilterResponse;

String text = "His name is John Smith and he lives at 456 Main St.";

// Point the client at a Philter instance running inside your own infrastructure.
PhilterClient client = new PhilterClient.PhilterClientBuilder()
  .withEndpoint("https://127.0.0.1:8080")
  .build();

// Send the text to Philter; the response carries the redacted result.
FilterResponse filterResponse = client.filter(text);

→ Redacted Output
His name is NAME and he lives at ADDRESS.

Designed for
HIPAA Compliance
GDPR / CCPA
Self-Hosted Deployment
REST API + SDKs
On-Prem or Cloud

Meet Philter —
PII redaction built for your cloud.

Philter is the open-source engine at the heart of Philterd. It identifies, classifies, and redacts PII and PHI from unstructured text — entirely within your own infrastructure. No SaaS subscriptions. No vendor lock-in. No sending sensitive data anywhere.

Configure redaction behavior precisely using policy files: choose between replacement, masking, encryption, or anonymization for each entity type.
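
As a rough illustration of per-entity configuration, the sketch below models a policy as a map from entity type to handling strategy, with a default for unlisted types. The class, enum, and method names are hypothetical and do not reflect Philter's actual policy file schema.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical in-memory model of a redaction policy; not Philter's policy format.
class PolicySketch {

    // The handling options named in the text above.
    enum Strategy { REPLACE, MASK, ENCRYPT, ANONYMIZE }

    // A few illustrative entity types.
    enum EntityType { PERSON, SSN, EMAIL, ADDRESS }

    private final Map<EntityType, Strategy> rules = new EnumMap<>(EntityType.class);
    private final Strategy fallback;

    PolicySketch(Strategy fallback) {
        this.fallback = fallback;
    }

    // Registers the strategy for one entity type and returns this for chaining.
    PolicySketch rule(EntityType type, Strategy strategy) {
        rules.put(type, strategy);
        return this;
    }

    // Looks up the strategy for a type, falling back to the policy default.
    Strategy strategyFor(EntityType type) {
        return rules.getOrDefault(type, fallback);
    }

    public static void main(String[] args) {
        PolicySketch policy = new PolicySketch(Strategy.REPLACE)
            .rule(EntityType.SSN, Strategy.MASK)
            .rule(EntityType.EMAIL, Strategy.ENCRYPT);

        for (EntityType type : EntityType.values()) {
            System.out.println(type + " -> " + policy.strategyFor(type));
        }
    }
}
```

Keeping the default explicit means an entity type that nobody wrote a rule for is still handled rather than passed through untouched.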

Entity Detection

Philter detects names, dates, SSNs, phone numbers, emails, addresses, IPs, credit card numbers, NPI numbers, and dozens more — with custom model support for domain-specific data.

PERSON SSN DATE EMAIL PHONE ADDRESS CREDIT_CARD NPI IP_ADDR +40 more
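
To make the pattern-based side of detection concrete, here is a minimal regex-only sketch that reports typed spans with character offsets. It is an illustration under simplifying assumptions (US-style formats, three hand-written patterns, hypothetical class and record names), not Philter's detection code, which also relies on NLP models.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative regex-only detector with hypothetical names.
class RegexDetectorSketch {

    // One detected entity: its type label, offsets in the input, and matched text.
    record Span(String type, int start, int end, String text) {}

    // Deliberately simple patterns; production filters need far more care.
    private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("SSN", Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b"));
        PATTERNS.put("EMAIL", Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}"));
        PATTERNS.put("PHONE", Pattern.compile("\\b\\d{3}-\\d{3}-\\d{4}\\b"));
    }

    // Runs every pattern over the text and returns spans ordered by start offset.
    static List<Span> detect(String text) {
        List<Span> spans = new ArrayList<>();
        for (Map.Entry<String, Pattern> entry : PATTERNS.entrySet()) {
            Matcher matcher = entry.getValue().matcher(text);
            while (matcher.find()) {
                spans.add(new Span(entry.getKey(), matcher.start(), matcher.end(), matcher.group()));
            }
        }
        spans.sort(Comparator.comparingInt(Span::start));
        return spans;
    }

    public static void main(String[] args) {
        String text = "Reach Jane at jane@example.com or 555-867-5309; SSN 123-45-6789.";
        detect(text).forEach(System.out::println);
    }
}
```

Note that the name "Jane" goes undetected here: names have no fixed shape, which is why they call for a model rather than a pattern.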

Flexible Redaction Modes

Choose how each entity type is handled — per policy, per document type, or per pipeline stage.

REDACT REPLACE MASK ENCRYPT HASH RANDOM
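
The modes differ in what they leave behind. The sketch below shows one plausible reading of three of them (REDACT, MASK, HASH) applied to a single value. The output formats and all names here are assumptions made for illustration, not Philter's behavior.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical interpretations of three redaction modes.
class RedactionModesSketch {

    enum Mode { REDACT, MASK, HASH }

    // Transforms one detected value according to the chosen mode.
    static String apply(Mode mode, String entityType, String value) {
        return switch (mode) {
            case REDACT -> "[" + entityType + "]";          // swap in a type label
            case MASK -> "*".repeat(value.length());        // keep length, hide content
            case HASH -> sha256Hex(value);                  // same input, same token
        };
    }

    // Returns the lowercase hex SHA-256 digest of the UTF-8 bytes of the value.
    static String sha256Hex(String value) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                .digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        String ssn = "123-45-6789";
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + apply(mode, "SSN", ssn));
        }
    }
}
```

Hashing keeps records joinable because equal inputs map to equal tokens, but an unsalted hash of a low-entropy value such as an SSN can be reversed by brute force, so a keyed or salted variant is usually the safer choice.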

High-Volume Processing

REST API and Kafka integration for streaming pipelines — handles millions of documents at scale.
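
For the REST side, a request can be built with nothing more than the JDK's own HTTP client. The sketch below only constructs the request and does not send it. The `/api/filter` path and the `text/plain` content type are assumptions, so check them against the API documentation for your Philter version.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Builds (but does not send) a filter request; the endpoint path is an assumption.
class FilterRequestSketch {

    static HttpRequest buildFilterRequest(String baseUrl, String text) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/api/filter"))   // assumed path
            .timeout(Duration.ofSeconds(10))
            .header("Content-Type", "text/plain")       // assumed content type
            .POST(HttpRequest.BodyPublishers.ofString(text))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildFilterRequest(
            "https://127.0.0.1:8080", "His name is John Smith and he lives at 456 Main St.");
        System.out.println(request.method() + " " + request.uri());
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```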

From raw text to redacted output
in milliseconds

Philter's pipeline is designed to be transparent, auditable, and fast — processing text in stages with full policy control at each step.

1

Ingest

Send text via REST API, SDK, or stream it through Kafka / Phirestream directly into Philter.

2

Detect

NLP models, regex patterns, and custom filters scan the text and classify every sensitive entity.

3

Apply Policy

Your policy file defines exactly how each entity type is handled: replace, mask, encrypt, or remove.

4

Return

Receive clean, redacted text with an optional audit trail — all within your own infrastructure.
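
The stages above can be sketched end to end in a few lines. This toy version detects two entity types with one combined regex, applies a policy given as a map of replacement tokens, and returns the redacted text together with an audit trail. Every name in it is hypothetical, and the regex stands in for the detection stage only as a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy detect -> apply policy -> return pipeline with hypothetical names.
class PipelineSketch {

    // Audit entries store offsets and the replacement, never the original value.
    record AuditEntry(String type, int start, int end, String replacement) {}

    record Result(String redactedText, List<AuditEntry> audit) {}

    // Detect stage stand-in: one named group per entity type.
    private static final Pattern ENTITIES = Pattern.compile(
        "(?<SSN>\\b\\d{3}-\\d{2}-\\d{4}\\b)"
        + "|(?<EMAIL>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,})");

    private static final List<String> TYPES = List.of("SSN", "EMAIL");

    // policy maps an entity type to its replacement token; others get a default.
    static Result run(String text, Map<String, String> policy) {
        List<AuditEntry> audit = new ArrayList<>();
        StringBuilder out = new StringBuilder();
        Matcher matcher = ENTITIES.matcher(text);
        int last = 0;
        while (matcher.find()) {
            // Find which named group produced this match.
            String type = TYPES.stream()
                .filter(t -> matcher.group(t) != null)
                .findFirst()
                .orElseThrow();
            String replacement = policy.getOrDefault(type, "[REDACTED]");
            out.append(text, last, matcher.start()).append(replacement);
            audit.add(new AuditEntry(type, matcher.start(), matcher.end(), replacement));
            last = matcher.end();
        }
        out.append(text, last, text.length());
        return new Result(out.toString(), audit);
    }

    public static void main(String[] args) {
        Result result = run("SSN 123-45-6789, mail a@b.co", Map.of("SSN", "[SSN]"));
        System.out.println(result.redactedText());
        result.audit().forEach(System.out::println);
    }
}
```

Recording offsets into the original text, rather than the matched values, lets the audit trail show what was changed without becoming a second copy of the sensitive data.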

Expert guidance from the people who built it.

Work directly with the creators of Philter. We bring deep expertise in NLP, data engineering, and compliance to your most complex data protection challenges.

Unlike vendors who resell third-party tools, we've built every layer of the Philterd stack ourselves. That depth means we can solve problems other consultants can't even diagnose.

Custom NLP Model Training

Train domain-specific models for unique entity types in your data — medical codes, internal IDs, proprietary formats.

Policy Design & Review

Build airtight redaction policies mapped to your specific compliance requirements and data structures.

Pipeline Integration

Embed Philter into your existing data workflows — Spark, Flink, Kafka, or custom ETL pipelines.

Compliance Architecture

Design end-to-end data protection strategies that satisfy auditors and survive security reviews.

Open Source Philter

Privacy tools must be transparent. Full stop.

You shouldn't have to trust a black box with your most sensitive data. That's why we built Philter open source — so anyone can read the code, audit the behavior, and verify what's happening to their data.

We don't just wrap someone else's API and resell it. We wrote the core engine, every NLP component, and every supporting tool in the Philterd ecosystem. We own the full stack — and so can you.

Auditable by design

Every line of code is on GitHub. No hidden telemetry, no proprietary black boxes.

Community-driven improvement

Bugs get fixed faster. Edge cases surface sooner. The whole community benefits.

No vendor lock-in, ever

Deploy today, fork tomorrow. Your data protection strategy belongs to you.

Ready to protect your sensitive data?

Start with the open-source code, or bring us in to accelerate your compliance program. Either way, we're here to help you get it right.

Clone on GitHub