Toxicity Detection

The Toxicity Detector is designed to evaluate the presence of toxic elements in the prompt. Its main goal is to detect and neutralize potentially harmful or offensive material, helping to uphold a safe and positive online environment.

Tip

Check prerequisites before proceeding further.

Policies

There are currently no policies to tweak for the Toxicity Detector. It works automagically.

API

Usage

import os
import requests

endpoint = "https://api.zenguard.ai/v1/detect/toxicity"

headers = {
    "x-api-key": os.getenv("ZEN_API_KEY"),
    "Content-Type": "application/json",
}

data = {
    "messages": ["I think its crap that the link to roggenbier is to this article. Somebody that knows how to do things should change it."]
}

response = requests.post(endpoint, json=data, headers=headers)
response.raise_for_status()  # fail fast on 4xx/5xx before reading the body
if response.json()["is_detected"]:
    print("Toxicity detected. Let's be civilized.")
else:
    print("No toxicity detected: carry on with the conversation.")
Equivalent cURL request:

curl -X POST https://api.zenguard.ai/v1/detect/toxicity \
    -H "x-api-key: $ZEN_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "messages": ["I think its crap that the link to roggenbier is to this article. Somebody that knows how to do things should change it."]
    }'

Response Example:

{
    "is_detected": true,
    "score": 1.0,
    "sanitized_message": null
}

  • is_detected (boolean): Whether toxicity was detected; true in this example.
  • score (float: 0.0 - 1.0): The likelihood that the prompt is toxic; 1.0 in this example.
  • sanitized_message (string or null): Always null for the Toxicity Detector.
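A minimal sketch (not part of the ZenGuard SDK) of how a caller might act on these fields. The `handle_toxicity_response` helper and its 0.5 threshold are illustrative assumptions; the sample payload mirrors the Response Example above.

```python
def handle_toxicity_response(result: dict, threshold: float = 0.5) -> str:
    """Map the detector's verdict and score to an action."""
    if result["is_detected"] and result["score"] >= threshold:
        return "block"
    if result["is_detected"]:
        return "review"  # flagged, but below the blocking threshold
    return "allow"

sample = {"is_detected": True, "score": 1.0, "sanitized_message": None}
print(handle_toxicity_response(sample))  # block
```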

Error Codes:

- `401 Unauthorized`: API key is missing or invalid.
- `400 Bad Request`: Request body is malformed.
- `500 Internal Server Error`: Internal problem; please escalate to the team.
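One hedged way to surface these error codes before parsing the JSON body is a small status check; the `check_status` helper and its messages below are illustrative, not part of the API.

```python
# Map the documented error codes to actionable messages.
ERROR_MESSAGES = {
    401: "Unauthorized: check that ZEN_API_KEY is set and valid.",
    400: "Bad Request: verify the JSON body matches the expected schema.",
    500: "Internal Server Error: retry later, or escalate to the team.",
}

def check_status(status_code: int) -> None:
    """Raise with a descriptive message for known error codes."""
    if status_code in ERROR_MESSAGES:
        raise RuntimeError(ERROR_MESSAGES[status_code])

# Usage: check_status(response.status_code) before response.json().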

Client

Detect toxicity:

import os
from zenguard import Credentials, Detector, ZenGuard, ZenGuardConfig

api_key = os.environ.get("ZEN_API_KEY")
config = ZenGuardConfig(credentials=Credentials(api_key=api_key))
zenguard = ZenGuard(config=config)

message = "I think its crap that the link to roggenbier is to this article. Somebody that knows how to do things should change it."
response = zenguard.detect(detectors=[Detector.TOXICITY], prompt=message)
if response.get("is_detected"):
    print("Toxicity detected. Let's be civilized.")
else:
    print("No toxicity detected: carry on with the conversation.")
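In practice the detect call often sits in front of an LLM as a gate. The sketch below assumes that pattern; `detect_fn` stands in for a call like `zenguard.detect(...)` and is injected so the gate can be exercised without a network round trip.

```python
def gate_prompt(message: str, detect_fn):
    """Return the message unchanged if it is clean, otherwise None."""
    response = detect_fn(message)
    if response.get("is_detected"):
        return None  # drop or replace the prompt before it reaches the LLM
    return message

# Stub detector that flags one obviously rude word, for illustration only:
stub = lambda m: {"is_detected": "crap" in m}
print(gate_prompt("How do I brew a roggenbier?", stub))  # passes through
```

Injecting the detector as a callable keeps the gating logic testable and lets you swap in the real client in production.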