
Cloudflare Announces A Firewall For Malicious AI Prompts

Drew Conry-Murray

AI applications built on Large Language Models (LLMs), such as virtual assistants that help workers perform business tasks, and public-facing applications such as customer service chatbots, open fresh territory for exploitation by malicious actors. They also open fresh territory for security products.

Case in point, Cloudflare has announced a new Web Application Firewall (WAF) to detect malicious user prompts that attempt to exploit LLM-based AI applications. Called “Firewall for AI,” the new service can be deployed for models that run in Cloudflare’s Workers AI platform, or in front of models hosted in third-party clouds. It can also be used alongside Cloudflare’s AI Gateway service.

Firewall for AI currently addresses two risks associated with LLMs:

1. Exposure of sensitive data

The Cloudflare Firewall for AI WAF will scan outbound responses from models for financial information and secrets such as API keys, based on rulesets developed by Cloudflare. Custom fingerprints are on the product roadmap. Cloudflare says customers can review matches, but doesn’t specify whether matches can be blocked outright.
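Cloudflare hasn't published the internals of its rulesets, but the general technique is pattern matching against outbound responses. The sketch below illustrates the idea with a few hypothetical regex rules of my own; the rule names and patterns are assumptions for illustration, not Cloudflare's actual detections.

```python
import re

# Hypothetical detection rules illustrating how a response scanner
# might flag secrets and financial data. Cloudflare's real rulesets
# are not public; these patterns are illustrative only.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(r"\bapi[_-]?key\s*[:=]\s*\S{20,}"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan_response(text: str) -> list[str]:
    """Return the names of the rules a model response matches.

    A WAF sitting in front of the model would run a check like this
    on every outbound response and surface the matches for review.
    """
    return [name for name, pattern in SECRET_PATTERNS.items()
            if pattern.search(text)]
```

In practice a product like this would combine many such rules with context-aware checks (checksums for card numbers, entropy tests for keys) to keep false positives down, which is presumably why Cloudflare ships curated rulesets rather than asking customers to write regexes.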

2. Prompt injection

Attackers may submit requests intended to produce inaccurate, offensive, or inappropriate results.

The Cloudflare Firewall for AI WAF will examine inbound prompts to look for potential abuse. The firewall will assign each prompt a score, from 1 to 99, to indicate the likelihood of an injection attack. Customers can use the score to decide how to handle each prompt. Cloudflare says customers will be able to combine this score with other signals provided by other Cloudflare services such as bot scores and attack scores.
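To make the scoring model concrete, here's a minimal sketch of how a customer-side policy might combine the injection score with a bot score. All of it is assumption on my part: the threshold values, the action names, and the convention that a lower score means higher risk (which matches how Cloudflare's bot scores work, but the announcement doesn't specify the direction for injection scores). Note too that blocking isn't available in the product yet, so the "block" branch represents the option Cloudflare says is coming later.

```python
def handle_prompt(injection_score: int, bot_score: int) -> str:
    """Choose an action for an inbound prompt from two risk signals.

    Assumes scores run 1-99 with lower meaning riskier, per
    Cloudflare's bot-score convention. Thresholds are illustrative,
    not Cloudflare defaults.
    """
    if injection_score < 20 and bot_score < 30:
        return "block"          # likely injection from a likely bot
    if injection_score < 50:
        return "log_and_alert"  # suspicious: flag for human review
    return "allow"              # looks like a benign prompt
```

The point of combining signals is precision: a borderline injection score from a confirmed human might merit only logging, while the same score from automated traffic justifies a harder response.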

At present, the firewall can only alert you to suspicious prompts, not block them. “The first step is to provide visibility – e.g., flagging suspicious requests,” said Daniele Molteni, Group Product Manager at Cloudflare in an email exchange. “The ability to prevent requests reaching the model will be another option introduced down the line.”

Questions About Firewall for AI

I emailed Cloudflare some questions about the announcement. Here are two. The responses again come from Daniele Molteni.

Q. How will Cloudflare distinguish potentially exploitative prompts from safe prompts?

A. Firewall for AI will leverage a combination of heuristics and additional AI layers to evaluate prompts and identify abuses and threats.

Q. Will Cloudflare have to train its own AI firewall against every individual customer LLM to be effective at detecting threats?

A. The first release of AI Firewall will not include ad-hoc training, rather, it will identify attacks that are broadly applicable to common LLM models. As the product evolves, we expect to introduce solutions that take into account the specificity of customer models.

Source: Cloudflare

 

Early Days

Traditional firewall rules can be complicated, but at least they operate within defined parameters (ports, protocols, applications, and so on). Human language is an amalgam of dictionary definitions, shaded meanings, idiom, intent, syntax, slang, and other characteristics that allow for almost infinite variation. Trying to write rules to account for that variation seems a Sisyphean undertaking.

Given that, the Firewall for AI service feels experimental to me. That's not to fault Cloudflare, or other products that are likely to emerge in this space: right now, everything to do with generative AI feels experimental.

That said, I can see the appeal of such a service to companies that are making LLM-based AI applications available to the public. End users are going to find all kinds of ways to probe AI applications. And they will find exploits whether through malice, mischief, curiosity, or just by accident. Putting some kind of filter in front of your public-facing LLM may address the most egregious exploits. And it provides, if nothing more, some PR cover for executives and lawyers if a prompt creates a problem.

We’ve had many years of development of security products that incorporate AI. Now it seems like a new category is emerging in which security products are being designed to defend AI. It’s a new angle, but the same challenges remain. The successful deployment and operation of these tools still require work on your part: risk analysis, planning, monitoring, training, incident response, and all the other practices necessary to manage risk.

About Drew Conry-Murray: Drew Conry-Murray has been writing about information technology for more than 15 years, with an emphasis on networking, security, and cloud. He's co-host of The Network Break podcast and a Tech Field Day delegate. He loves real tea and virtual donuts, and is delighted that his job lets him talk with so many smart, passionate people. He writes novels in his spare time.