Keywords Detection

This detector is designed to catch all occurrences of specific words or phrases.

Tip

Check prerequisites before proceeding further.

Policies

With the Keywords Detector policy, you can add any words or phrases along with the action to perform when each keyword is detected.

Actions:

Block: completely prevent the prompt containing the keyword from being propagated to the LLM.
Warn: alert that the prompt contains the keyword. The prompt is still propagated to the LLM.
Redact: alter the keyword within the prompt. The redacted prompt is propagated to the LLM.
Passthrough: the keyword is allowed to proceed without any restrictions or changes.
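To make the four actions concrete, here is a minimal local sketch of their semantics. It is not ZenGuard's implementation: the `Action` enum, the `apply_policy` function, the `[REDACTED]` placeholder, and the case-insensitive substring matching are all assumptions made for illustration.

```python
import re
from enum import Enum


class Action(Enum):
    """Hypothetical local model of the four policy actions."""
    BLOCK = "block"
    WARN = "warn"
    REDACT = "redact"
    PASSTHROUGH = "passthrough"


def apply_policy(prompt, policy, placeholder="[REDACTED]"):
    """Apply a keyword -> Action mapping to a prompt.

    Returns (prompt_to_forward, warnings). prompt_to_forward is None
    when any matched keyword is configured with Action.BLOCK.
    Matching is case-insensitive substring matching (an assumption).
    """
    matched = {
        keyword: action
        for keyword, action in policy.items()
        if re.search(re.escape(keyword), prompt, re.IGNORECASE)
    }

    # Block wins over everything else: nothing is forwarded.
    if Action.BLOCK in matched.values():
        return None, []

    # Warn: record the keyword, leave the prompt untouched.
    warnings = [kw for kw, action in matched.items() if action is Action.WARN]

    # Redact: replace each redacted keyword with the placeholder.
    forwarded = prompt
    for keyword, action in matched.items():
        if action is Action.REDACT:
            forwarded = re.sub(
                re.escape(keyword),
                lambda _match: placeholder,
                forwarded,
                flags=re.IGNORECASE,
            )

    # Passthrough keywords need no handling: they are forwarded as-is.
    return forwarded, warnings
```

For example, `apply_policy("You know, batman is my favorite hero.", {"Batman": Action.WARN, "hero": Action.REDACT})` forwards the prompt with "hero" replaced by the placeholder and reports "Batman" as a warning.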

API

Usage

Python:

import os
import requests

endpoint = "https://api.zenguard.ai/v1/detect/keywords"

headers = {
    "x-api-key": os.getenv("ZEN_API_KEY"),
    "Content-Type": "application/json",
}

data = {
    "messages": ["You know, batman is my favorite hero."]
}

response = requests.post(endpoint, json=data, headers=headers, timeout=10)
response.raise_for_status()  # raise on 4xx/5xx before parsing the body

# Parse the JSON body once and reuse it.
result = response.json()
if result["is_detected"]:
    print("Keywords detected.")
else:
    print("No keywords detected.")

cURL:

curl -X POST https://api.zenguard.ai/v1/detect/keywords \
    -H "x-api-key: $ZEN_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "messages": ["You know, batman is my favorite hero."]
    }'

Response Example:

{
    "is_detected": true,
    "block": [],
    "warn": [
        "Batman"
    ],
    "redact": [
        "hero -> mango"
    ],
    "sanitized_message": "You know, Batman is my favorite mango."
}

Note that the response depends on the Policy configuration for the Keywords Detector. In this example, the policy issues a warning for any prompt containing the word "Batman" and replaces the word "hero" with "mango".

is_detected (boolean): Indicates whether the prompt contains keywords specified in the Keywords Detector Policy. In this example it is true, since the prompt contains "Batman" and "hero".
block (list): A list of strings containing the blocked keywords. In this example no blocked keywords were detected.
warn (list): A list of strings containing the keywords for which a warning should be issued. In this example the word "Batman" falls into the warn category.
redact (list): A list of strings containing the redacted keywords and their substitutions. In this example the word "hero" was replaced with the word "mango".
sanitized_message (string or null): Contains the prompt redacted according to the Policy. The word "Batman" triggers a warning but is not replaced, whereas the word "hero" is replaced with "mango".
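The fields above can be combined to decide what, if anything, to forward to the LLM. The following is a minimal sketch, assuming the caller is responsible for acting on the response; the helper names `prompt_to_forward` and `parse_redactions` are hypothetical, and the `"keyword -> substitute"` format of `redact` entries is inferred from the single example above rather than from a documented contract.

```python
def prompt_to_forward(original_prompt, result):
    """Return the text to send to the LLM, or None if the prompt is blocked.

    `result` is the parsed JSON response of the Keywords Detector.
    """
    # Any blocked keyword means the prompt must not reach the LLM.
    if result.get("block"):
        return None

    # If something was redacted, prefer the sanitized version.
    sanitized = result.get("sanitized_message")
    if result.get("redact") and sanitized is not None:
        return sanitized

    # Warn-only or no detection: forward the prompt unchanged.
    return original_prompt


def parse_redactions(result):
    """Split "keyword -> substitute" entries into (keyword, substitute) pairs.

    Entries that do not follow the assumed format are skipped.
    """
    pairs = []
    for entry in result.get("redact", []):
        keyword, separator, substitute = entry.partition(" -> ")
        if separator:
            pairs.append((keyword, substitute))
    return pairs


# The response example from this page, as a Python dict.
example_response = {
    "is_detected": True,
    "block": [],
    "warn": ["Batman"],
    "redact": ["hero -> mango"],
    "sanitized_message": "You know, Batman is my favorite mango.",
}
```

With `example_response`, `prompt_to_forward` returns the sanitized message, and the `warn` list can be logged or surfaced to the user separately.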

Error Codes:

- `401 Unauthorized`: API key is missing or invalid.
- `400 Bad Request`: Request body is malformed.
- `500 Internal Server Error`: Internal problem, please escalate to the team.
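A caller may want to turn these status codes into explicit errors before parsing the body. The sketch below is one way to do that; `KeywordsApiError` and `check_status` are hypothetical names, and treating every other non-2xx code as an unexpected error is an assumption, since only the three codes above are documented.

```python
# Messages taken from the error code list above.
ERROR_MESSAGES = {
    400: "Request body is malformed.",
    401: "API key is missing or invalid.",
    500: "Internal problem, please escalate to the team.",
}


class KeywordsApiError(Exception):
    """Hypothetical exception carrying the HTTP status code."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def check_status(status_code):
    """Raise KeywordsApiError for any non-2xx status code; return None otherwise."""
    if 200 <= status_code < 300:
        return
    # Undocumented codes fall back to a generic message (an assumption).
    message = ERROR_MESSAGES.get(status_code, "Unexpected status code.")
    raise KeywordsApiError(status_code, message)
```

With the `requests` library, this would be called as `check_status(response.status_code)` right after the POST and before `response.json()`.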

Client

Detect Keywords:

import os
from zenguard import Credentials, Detector, ZenGuard, ZenGuardConfig

api_key = os.environ.get("ZEN_API_KEY")
config = ZenGuardConfig(credentials=Credentials(api_key=api_key))
zenguard = ZenGuard(config=config)

message = "You know, batman is my favorite hero."
response = zenguard.detect(detectors=[Detector.KEYWORDS], prompt=message)

# The outcome depends on the configured Keywords Detector policy.
if response.get("is_detected"):
    print("Keywords detected.")
else:
    print("No keywords detected.")