Philterd builds open-source tools and provides consulting expertise for sensitive data redaction, deployed entirely within your infrastructure with no third-party data exposure.
Philter is the open-source engine at the heart of Philterd. It identifies, classifies, and redacts PII and PHI in unstructured text, entirely within your own infrastructure. No SaaS subscriptions. No vendor lock-in. No sensitive data leaving your environment.
Configure redaction behavior precisely using policy files: choose between replacement, masking, encryption, or anonymization for each entity type.
Philter detects names, dates, SSNs, phone numbers, email addresses, physical addresses, IP addresses, credit card numbers, NPI numbers, and dozens more, with custom model support for domain-specific data.
Choose how each entity type is handled — per policy, per document type, or per pipeline stage.
REST API and Kafka integration for streaming pipelines, built to handle millions of documents at scale.
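To give a feel for the REST integration mentioned above, here is a minimal Python sketch that builds (but does not send) a request carrying plain text to a Philter instance. The host and port, the `/api/filter` path, and the `p` query parameter for the policy name are assumptions made for illustration, and `build_filter_request` is a hypothetical helper. Check the API documentation for your Philter version before relying on any of them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_filter_request(base_url: str, text: str, policy: str = "default") -> Request:
    """Build, without sending, a POST request whose body is the text to redact."""
    # Assumption: the endpoint path and the "p" (policy name) parameter are
    # illustrative. Confirm both against the docs for your Philter version.
    query = urlencode({"p": policy})
    url = f"{base_url.rstrip('/')}/api/filter?{query}"
    return Request(
        url,
        data=text.encode("utf-8"),  # raw text goes in the request body
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


# Hypothetical local deployment; nothing is sent over the network here.
request = build_filter_request(
    "https://localhost:8080",
    "Call Jane Doe at 555-867-5309.",
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running instance inside your own network, so the text never has to leave your infrastructure.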
Philter's pipeline is designed to be transparent, auditable, and fast — processing text in stages with full policy control at each step.
Send text via REST API, SDK, or stream it through Kafka / Phirestream directly into Philter.
NLP models, regex patterns, and custom filters scan the text and classify every sensitive entity.
Your policy file defines exactly how each entity type is handled — replace, mask, encrypt, or remove.
Receive clean, redacted text with an optional audit trail — all within your own infrastructure.
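The four stages above can be sketched in a few lines of Python. This is a toy illustration of the detect, apply-policy, and audit flow, not Philter's implementation: the `DETECTORS` patterns, the `POLICY` dictionary, and every function name are hypothetical, and the policy shown here is not Philter's policy file format.

```python
import re
from dataclasses import dataclass

# Hypothetical detectors: two regex patterns stand in for the NLP models,
# regex patterns, and custom filters of a real deployment.
DETECTORS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}

# Hypothetical policy: one handling strategy per entity type.
POLICY = {"ssn": "mask", "email": "replace"}


@dataclass
class Finding:
    """One audit-trail entry: what was found, where, and how it was handled."""
    entity_type: str
    start: int
    end: int
    strategy: str


def detect(text: str) -> list[Finding]:
    """Scan the text with every detector and return findings in document order."""
    findings = [
        Finding(entity_type, m.start(), m.end(), POLICY[entity_type])
        for entity_type, pattern in DETECTORS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(findings, key=lambda f: f.start)


def apply_policy(text: str, findings: list[Finding]) -> str:
    """Rewrite each detected span according to its strategy."""
    out, cursor = [], 0
    for f in findings:
        if f.start < cursor:  # skip spans that overlap one already handled
            continue
        out.append(text[cursor:f.start])
        if f.strategy == "mask":
            out.append("*" * (f.end - f.start))
        elif f.strategy == "replace":
            out.append(f"[{f.entity_type.upper()}]")
        elif f.strategy != "remove":  # "remove" emits nothing
            raise ValueError(f"unknown strategy: {f.strategy}")
        cursor = f.end
    out.append(text[cursor:])
    return "".join(out)


def redact(text: str) -> tuple[str, list[Finding]]:
    """Run the full pipeline: detect, apply policy, return text plus audit trail."""
    findings = detect(text)
    return apply_policy(text, findings), findings


clean, audit = redact("SSN 123-45-6789, contact jane@example.com today.")
print(clean)  # SSN ***********, contact [EMAIL] today.
```

The audit trail in this sketch records entity types, offsets, and strategies but never the original values, so the trail itself carries no sensitive data.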
Work directly with the creators of Philter. We bring deep expertise in NLP, data engineering, and compliance to your most complex data protection challenges.
Unlike vendors who resell third-party tools, we've built every layer of the Philterd stack ourselves. That depth means we can solve problems other consultants can't even diagnose.
Train domain-specific models for unique entity types in your data — medical codes, internal IDs, proprietary formats.
Build airtight redaction policies mapped to your specific compliance requirements and data structures.
Embed Philter into your existing data workflows — Spark, Flink, Kafka, or custom ETL pipelines.
Design end-to-end data protection strategies that satisfy auditors and survive security reviews.
You shouldn't have to trust a black box with your most sensitive data. That's why we built Philter open source — so anyone can read the code, audit the behavior, and verify what's happening to their data.
We don't just wrap someone else's API and resell it. We wrote the core engine, every NLP component, and every supporting tool in the Philterd ecosystem. We own the full stack — and so can you.
Every line of code is on GitHub. No hidden telemetry, no proprietary black boxes.
Bugs get fixed faster. Edge cases surface sooner. The whole community benefits.
Deploy today, fork tomorrow. Your data protection strategy belongs to you.
Start with the open-source code, or bring us in to accelerate your compliance program. Either way, we're here to help you get it right.