News

Azure AI Security: Prompt Shields & Content Safety

Source: azure.microsoft.com

Published on June 6, 2025

Azure AI Security: Protecting AI Models with Prompt Shields and Content Safety

The landscape of AI security is evolving rapidly, with prompt injection attacks emerging as a significant threat to generative AI applications. These attacks occur when adversaries manipulate inputs to large language models (LLMs) to alter their behavior or access unauthorized information. According to the Open Worldwide Application Security Project (OWASP), prompt injection is the top threat facing LLMs today. To combat this growing concern, Azure has introduced Azure AI Content Safety, featuring Prompt Shields—a unified API designed to analyze inputs and guard against both direct and indirect threats.

Prompt injection attacks can be categorized into two main types: direct and indirect. In direct attacks, malicious actors input deceptive prompts to provoke unintended or harmful responses from AI models. Indirect attacks, on the other hand, are more subtle and often involve manipulating the context or sequence of inputs to achieve similar outcomes. Prompt Shields, integrated with Azure OpenAI content filters, provides a robust defense against these attacks by leveraging advanced machine learning algorithms and natural language processing.

Prompt Shields offers numerous benefits, including defending against jailbreaks, prompt injections, and document attacks. It ensures that LLMs behave as designed by blocking prompts that attempt to circumvent rules and policies defined by developers. This capability is crucial for maintaining the security and integrity of AI applications, safeguarding them against malicious attempts at manipulation or exploitation.

Prompt Shields Capabilities

Azure AI Content Safety's Prompt Shields is designed to identify and mitigate potential threats in user prompts and third-party data. By seamlessly integrating with Azure OpenAI content filters, it provides a comprehensive solution for defending against various types of prompt injection attacks. The system is regularly updated with new defenses as new attack types are uncovered, ensuring that AI applications remain secure and reliable.

Customer Successes

AXA, a global leader in insurance, uses Azure OpenAI to power its Secure GPT solution. By integrating Azure's content filtering technology and adding its own security layer, AXA prevents prompt injection attacks and ensures the reliability of its AI models. Secure GPT is based on Azure OpenAI in Foundry Models, which have been fine-tuned using human feedback reinforcement learning. Additionally, AXA relies on Azure content filtering technology, to which the company has added its own security layer to prevent any jailbreaking of the model using Prompt Shields, ensuring an optimal level of reliability.

Wrtn Technologies, a leading enterprise in Korea, relies on Azure AI Content Safety to maintain compliance and security across its products. At its core, Wrtn's flagship technology compiles an array of AI use cases and services localized for Korean users, integrating AI into their everyday lives. The platform fuses elements of AI-powered search, chat functionality, and customizable templates, empowering users to interact seamlessly with an "Emotional Companion" AI-infused agent. These AI agents have engaging, lifelike personalities, interacting in conversation with their creators. The vision is a highly interactive personal agent that is unique and specific to each user, their data, and their memories. Because the product is highly customizable, the built-in ability to toggle content filters and Prompt Shields is highly advantageous, allowing Wrtn to efficiently customize its security measures for different end users.

For IT decision-makers looking to enhance the security of their AI deployments, integrating Azure's Prompt Shields is a strategic imperative. Azure's Prompt Shields and built-in AI security features offer an unparalleled level of protection for AI models, helping organizations harness the power of AI without compromising security. Microsoft, a leader in identifying and mitigating prompt injection attacks, uses best practices developed with decades of research, policy, product engineering, and learnings from building AI products at scale, enabling organizations to achieve their AI transformation with confidence.

By integrating these capabilities into your AI strategy, you can help safeguard your systems from prompt injection attacks and maintain the trust and confidence of your users. Organizations across industries are using Azure AI Foundry and Microsoft 365 Copilot capabilities to drive growth, increase productivity, and create value-added experiences. We are committed to helping organizations use and build AI that is trustworthy, meaning it is secure, private, and safe. Trustworthy AI is only possible when combining our commitments, such as our Secure Future Initiative and Responsible AI principles, with our product capabilities to unlock AI transformation with confidence.