Prevent factual errors from LLM hallucinations with mathematically sound Automated Reasoning checks (preview)

Today, we’re adding Automated Reasoning checks (preview) as a new safeguard in Amazon Bedrock Guardrails to help you mathematically validate the accuracy of responses generated by large language models (LLMs) and prevent factual errors from hallucinations.

Amazon Bedrock Guardrails lets you implement safeguards for generative AI applications by filtering undesirable content, redacting personal identifiable information (PII), and enhancing content safety and privacy. You can configure policies for denied topics, content filters, word filters, PII redaction, contextual grounding checks, and now Automated Reasoning checks.

Automated Reasoning checks help prevent factual errors from hallucinations using sound mathematical, logic-based algorithmic verification and reasoning processes to verify the information generated by a model, so outputs align with known facts and aren’t based on fabricated or inconsistent data.

Amazon Bedrock Guardrails is the only responsible AI capability offered by a major cloud provider that helps customers to build and customize safety, privacy, and truthfulness for their generative AI applications within a single solution.

Primer on automated reasoning
Automated reasoning is a field of computer science that uses mathematical proofs and logical deduction to verify the behavior of systems and programs. Automated reasoning differs from machine learning (ML), which makes predictions, in that it provides mathematical guarantees about a system’s behavior. Amazon Web Services (AWS) already uses automated reasoning in key service areas such as storage, networking, virtualization, identity, and cryptography. For example, automated reasoning is used to formally verify the correctness of cryptographic implementations, improving both performance and development speed. To learn more, check out Provable Security and the Automated reasoning research area in the Amazon Science Blog.

Now AWS is applying a similar approach to generative AI. The new Automated Reasoning checks (preview) in Amazon Bedrock Guardrails is the first and only generative AI safeguard that helps prevent factual errors due to hallucinations using logically accurate and verifiable reasoning that explains why generative AI responses are correct. Automated Reasoning checks are particularly useful for use cases where factual accuracy and explainability are important. For example, you could use Automated Reasoning checks to validate LLM-generated responses about human resources (HR) policies, company product information, or operational workflows.

Used alongside other techniques such as prompt engineering, Retrieval-Augmented Generation (RAG), and contextual grounding checks, Automated Reasoning checks add a more rigorous and verifiable approach to making sure that LLM-generated output is factually accurate. By encoding your domain knowledge into structured policies, you can have confidence that your conversational AI applications are providing reliable and trustworthy information to your users.

Using Automated Reasoning checks (preview) in Amazon Bedrock Guardrails
With Automated Reasoning checks in Amazon Bedrock Guardrails, you can create Automated Reasoning policies that encode your organization’s rules, procedures, and guidelines into a structured, mathematical format. These policies can then be used to verify that the content generated by your LLM-powered applications is consistent with your guidelines.

Automated Reasoning policies are composed of a set of variables, defined with a name, type, and description, and the logical rules that operate on the variables. Behind the scenes, rules are expressed in formal logic, but they’re translated to natural language to make it easier for a user without formal logic expertise to refine a model. Automated Reasoning checks uses the variable descriptions to extract their values when validating a Q&A.

Here’s how it works.

Create Automated Reasoning policies
Using the Amazon Bedrock console, you can upload documents that describe your organization’s rules and procedures. Amazon Bedrock will analyze these documents and automatically create an initial Automated Reasoning policy, which represents the key concepts and their relationships in a mathematical format.

Navigate to the new Automated Reasoning menu item in Safeguards. Create a new policy and give it a name. Upload an existing document that defines the right solution space, such as an HR guideline or an operational manual. For this demo, I’m using an example airline ticket policy document that includes the airline’s policies for ticket changes.

Then, define the policy’s intent and any processing parameters. For example, specify if it will validate airport staff inquiries and identify any elements to exclude from processing, such as internal reference numbers. Include one or more sample Q&As to help the system understand typical interactions.

Here’s my intent description:

Ignore the policy ID number, it's irrelevant. Airline employees will ask questions about whether customers are allowed to modify their tickets providing the customer details. Below is an example question:

QUESTION: I’m flying to Wonder City with Unicorn Airlines and noticed my last name is misspelled on the ticket, can modify it at the airport?
ANSWER: No. Changes to the spelling of the names on the ticket must be submitted via email within 24 hours of ticket purchase.

Then, choose Create.

The system now initiates an automated process to create your Automated Reasoning policy. This process involves analyzing your document, identifying key concepts, breaking down the document into individual units, translating these natural language units into formal logic, validating the translations, and finally combining them into a comprehensive logical model. Once complete, review the generated structure, including the rules and variables. You can edit these for accuracy through the user interface.

To test the Automated Reasoning policy, you first have to create a guardrail.

Create a guardrail and configure Automated Reasoning checks
When building your conversational AI application with Amazon Bedrock Guardrails, you can enable Automated Reasoning checks and specify which Automated Reasoning policies to use for validation.

Navigate to the Guardrails menu item in Safeguards. Create a new guardrail and give it a name. Choose Enable Automated Reasoning policy and select the policy and policy version you want to use. Then, complete your guardrail configuration.

Test Automated Reasoning checks
You can use the Test playground in the Automated Reasoning console to verify the effectiveness of your Automated Reasoning policy. Enter a test question just like a user of your application would, together with an example answer to validate.

For this demo, I enter an incorrect answer to see what will happen.

Question: I'm flying to Wonder City with Unicorn Airlines and noticed my last name is misspelled on the ticket, I'm currently in person at the airport, can I submit the change in person?

Answer: Yes. You are allowed to change names on tickets at any time, even in person at the airport.

Then, select the guardrail you’ve just created and choose Submit.

Automated Reasoning checks will analyze the content and validate it against the Automated Reasoning policies you’ve configured. The checks will identify any factual inaccuracies or inconsistencies and provide an explanation for the validation results.

In my demo, the Automated Reasoning checks correctly identified the response as Invalid. It shows which rule led to the finding, along with the extracted variables and suggestions.

When the validation result is invalid, the suggestions show a set of variable assignments that would make the conclusion valid. In my scenario, the suggestions show that the change submission method needs to be email for the validation result to be valid.

If no factual inaccuracies are detected and the validation result is Valid, suggestions show a list of assignments that are necessary for the result to hold; these are unstated assumptions in the answer. In my scenario, this might be assumptions such as that it’s the original ticket on which name corrections must be made or that the type of ticket stock is eligible for changes.

If factual inconsistencies are detected, the console will display Mixed results as the validation result. In the API response, you will see a list of findings, with some marked as valid and others as invalid. If this happens, review the system’s findings and suggestions and edit any unclear policy rules.

You can also use the validation results to enhance LLM-generated responses based on the feedback. For example, the following code snippet demonstrates how you can ask the model to regenerate its answer based on the received feedback:

for f in findings:
    if f.result == "INVALID":
        if f.rules is not None:
            for r in f.rules:
                feedback += f"<feedback>{r.description}</feedback>\n"

new_prompt = (
    "The answer you generated is inaccurate. Consider the feedback below within "
    f"<feedback> tags and rewrite your answer.\n\n{feedback}"
)

Achieving high validation accuracy is an iterative process. As a best practice, regularly review policy performance and adjust it as needed. You can edit rules in natural language and the system will automatically update the logical model.

For example, updating variable descriptions can significantly improve validation accuracy. Consider a scenario where a question states, “I’m a full-time employee…,” and the description of the is_full_time variable only states, “works more than 20 hours per week.” In this case, Automated Reasoning checks might not recognize the phrase “full-time.” To enhance accuracy, you should update the variable description to be more comprehensive, such as: “Works more than 20 hours per week. Users may refer to this as full-time or part-time. The value should be true for full-time and false for part-time.” This detailed description helps the system pick up all relevant factual claims for validation in natural language questions and answers, providing more accurate results.

Available in preview
The new Automated Reasoning checks safeguard is available today in preview in Amazon Bedrock Guardrails in the US West (Oregon) AWS Region. To request to be considered for access to the preview today, contact your AWS account team. In the next few weeks, look for a sign-up form in the Amazon Bedrock console. To learn more, visit Amazon Bedrock Guardrails.

— Antje

from AWS News Blog https://aws.amazon.com/blogs/aws/prevent-factual-errors-from-llm-hallucinations-with-mathematically-sound-automated-reasoning-checks-preview/

Leave a comment Cancel reply