Personally Identifiable Information (PII) Detection

This detector is designed to catch all PII instances in a prompt so that your data stays confidential and is not exposed to third parties such as LLMs.

Tip

Check prerequisites before proceeding further.

Policies

Using the PII Detector policy, you can select PII types and the action to perform when each type is detected.

Actions:

  • Block: completely prevent the prompt with this PII type from being propagated to an LLM.
  • Warn: alert that the prompt contains this PII. The prompt is propagated to the LLM.
  • Redact: alter the PII within the prompt. The redacted prompt is propagated to the LLM.
  • Passthrough: the PII will be allowed to proceed without any restrictions or changes.

Available PII types:

  • Name: The individual's full name or any part that could identify the individual.
  • Email: An individual's email address.
  • Phone: Telephone numbers associated with an individual.
  • Address: Physical address information, including residential, business, or mailing addresses.
  • Credit Card: Credit card numbers that could be used for financial fraud.
  • SSN: Social Security Numbers, unique numbers assigned to U.S. citizens and eligible residents for tracking Social Security benefits and for other identification purposes.
  • Location: Geographical information that can pinpoint the location of an individual.
  • IP Address: A unique address that identifies a device on the Internet or a local network, which can be traced back to an individual.
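
To make the relationship between PII types and actions concrete, here is a minimal local sketch in Python. It is not part of the ZenGuard API: the Action enum, the policy dictionary, and the reaches_llm helper are hypothetical names used only for illustration, the "email" key is an assumed spelling, and treating unlisted types as Passthrough is an assumption. In ZenGuard itself the policy is configured in the Policy tab.

```python
from enum import Enum


class Action(Enum):
    """The four actions described above (hypothetical local model)."""
    BLOCK = "block"
    WARN = "warn"
    REDACT = "redact"
    PASSTHROUGH = "passthrough"


# Hypothetical policy mapping PII types to actions.
# "email" is an assumed key name, used here only for illustration.
policy = {
    "credit_card": Action.BLOCK,
    "name": Action.REDACT,
    "email": Action.WARN,
}


def reaches_llm(detected_types, policy):
    """Return True if a prompt containing these PII types is still propagated.

    Only Block stops the prompt; Warn, Redact, and Passthrough let it through.
    Assumption: a type missing from the policy is treated as Passthrough.
    """
    return all(
        policy.get(pii_type, Action.PASSTHROUGH) is not Action.BLOCK
        for pii_type in detected_types
    )


print(reaches_llm(["name"], policy))
print(reaches_llm(["name", "credit_card"], policy))
```

With this sample policy, a prompt containing only a name would still be propagated (after redaction), while one that also contains a credit card number would not.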

API

Example:

import os
import requests

endpoint = "https://api.zenguard.ai/v1/detect/pii"

headers = {
    "x-api-key": os.getenv("ZEN_API_KEY"),
    "Content-Type": "application/json",
}

data = {
    "messages": ["My credit card number is 1234-5679-1234-1234 and my name is John Smith."]
}

response = requests.post(endpoint, json=data, headers=headers)
response.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error body
if response.json()["is_detected"]:
    print("PII detected. ZenGuard: 1, big brother: 0.")
else:
    print("No PII detected: your data is safe to feed into any LLM.")

assert response.json()["is_detected"], "Error detecting pii"

The same request with curl:

curl -X POST https://api.zenguard.ai/v1/detect/pii \
    -H "x-api-key: $ZEN_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "messages": ["My credit card number is 1234-5679-1234-1234 and my name is John Smith."]
    }'

Response Example:

{
    "is_detected": true,
    "block": {
        "credit_card": [
            "1234-5679-1234-1234"
        ]
    },
    "warn": {},
    "redact": {
        "name": [
            "John Smith -> William Gamble"
        ]
    },
    "sanitized_message": "My credit card number is 1234-5679-1234-1234 and my name is William Gamble."
}

Note that the response depends on the Policy configuration for the PII detector. In this example, all messages containing credit card numbers are blocked and all names are redacted.

  • is_detected (boolean): Indicates whether the prompt contains PII. In this example it is true because the prompt contains a credit card number and a name.
  • block (dict): Maps each blocked PII type to the list of matching string values. In this example, the Credit Card PII type is set to block in the Policy tab.
  • warn (dict): Maps each PII type that should trigger a user warning to the list of matching string values. In this example, no PII types fall into the warn category.
  • redact (dict): Maps each redacted PII type to a list of entries, each pairing a redacted value with its substitution. In this example, the Name PII type is set to redact in the Policy tab, so the list contains the redacted value "John Smith" and its randomly generated substitution "William Gamble".
  • sanitized_message (string or null): The prompt redacted in accordance with the Policy. Credit card numbers are not redacted because the block action blocks the entire prompt. Names are replaced with randomly generated ones because the Name PII type is set to redact in the Policy.
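
The fields above can be combined to decide what, if anything, is sent to the LLM. The following is a minimal sketch, not an official client: prompt_for_llm is a hypothetical helper, the second sample response is invented for illustration, and falling back to the original prompt when sanitized_message is null is an assumption about how you may want to handle that case.

```python
def prompt_for_llm(result, original_prompt):
    """Return the text that may be sent to the LLM, or None if it is blocked."""
    # Any entry under "block" means the whole prompt must not be propagated.
    if result.get("block"):
        return None
    # "warn" entries do not stop the prompt; here they are only reported.
    for pii_type, values in (result.get("warn") or {}).items():
        print(f"Warning: prompt contains {len(values)} {pii_type} value(s).")
    # Assumption: use the original prompt when sanitized_message is null.
    return result.get("sanitized_message") or original_prompt


# The response example above, written as a Python dict.
blocked = {
    "is_detected": True,
    "block": {"credit_card": ["1234-5679-1234-1234"]},
    "warn": {},
    "redact": {"name": ["John Smith -> William Gamble"]},
    "sanitized_message": "My credit card number is 1234-5679-1234-1234 "
                         "and my name is William Gamble.",
}

# Hypothetical response for a prompt that contains only a name.
redacted = {
    "is_detected": True,
    "block": {},
    "warn": {},
    "redact": {"name": ["John Smith -> William Gamble"]},
    "sanitized_message": "My name is William Gamble.",
}

print(prompt_for_llm(blocked, "My credit card number is ... and my name is John Smith."))
print(prompt_for_llm(redacted, "My name is John Smith."))
```

Checking block first mirrors the policy semantics: a blocked PII type stops the entire prompt even when other types in the same prompt were redacted.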

Error Codes:

  • 401 Unauthorized: API key is missing or invalid.
  • 400 Bad Request: Request body is malformed.
  • 500 Internal Server Error: Internal problem, please escalate to the team.
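
One way to surface these codes in calling code is to translate the HTTP status into a descriptive exception before parsing the body. This is a sketch under stated assumptions: ZenGuardAPIError and raise_for_zenguard_status are hypothetical names, not part of the zenguard package, and only the three codes listed above are mapped.

```python
# Messages for the status codes documented above.
ERROR_HINTS = {
    400: "Bad Request: the request body is malformed.",
    401: "Unauthorized: the API key is missing or invalid.",
    500: "Internal Server Error: internal problem, escalate to the team.",
}


class ZenGuardAPIError(Exception):
    """Hypothetical exception type for failed detection requests."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code} {message}")
        self.status_code = status_code


def raise_for_zenguard_status(status_code):
    """Raise ZenGuardAPIError for any non-2xx status; do nothing otherwise."""
    if 200 <= status_code < 300:
        return
    # Codes not listed in the documentation get a generic message.
    hint = ERROR_HINTS.get(status_code, "Unexpected status code.")
    raise ZenGuardAPIError(status_code, hint)


try:
    raise_for_zenguard_status(401)
except ZenGuardAPIError as error:
    print(f"Request failed: {error}")
```

With the requests example above, the helper could be called as raise_for_zenguard_status(response.status_code) before response.json() is read.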

Client

Example:

import os
from zenguard import Credentials, Detector, ZenGuard, ZenGuardConfig

api_key = os.environ.get("ZEN_API_KEY")
config = ZenGuardConfig(credentials=Credentials(api_key=api_key))
zenguard = ZenGuard(config=config)

message = "My credit card number is 1234-5679-1234-1234 and my name is John Smith."
response = zenguard.detect(detectors=[Detector.PII], prompt=message)
if response.get("is_detected"):
    print("PII detected. ZenGuard: 1, big brother: 0.")
else:
    print("No PII detected: your data is safe to feed into any LLM.")

assert response.get("is_detected"), "Error detecting pii"