Enterprises are starting to use LLMs, but we’re still in the early days.
We’re all familiar with the basic use cases: search over documents, customer support, and so on. But the harder problems come with regulated enterprises dealing with large amounts of sensitive data — where you have to deal with thorny technical issues like data integration, prompt injections, permissions, and auditability.
This post lays out a case study of how LLMs can be used in an AML (anti money-laundering) context, and some of the gotchas along the way.
This is the first in a series of case studies of how businesses are unlocking these more complex LLM-based workflows with regulated, or sensitive, data. Hopefully it can inspire you too!
Suppose you’re a fintech company.
The part of the problem we’ll focus on for this post is screening: making sure the businesses you serve are not going to be high risk for you. This means screening your customers to make sure they aren’t doing things that would force you to comply with additional regulations. Fintechs that serve industries like gambling, adult businesses, etc carry a lot of additional risk, so many fintechs choose to simply avoid serving such businesses.
How this typically works:
This part of the process can be summarized in the following diagram:
LLMs can:
You’d still need a human in the loop to ensure the decision is sound. But the LLM can do a lot of the manual work for you. To do that, though, the LLM needs access to the data sources you’re querying.
Recall that we’re enriching the customer-provided data with additional contextual data (from data sources like S3, BigQuery, Snowflake, or DataBricks, along with external APIs and possibly even web scraping). Say you’ve connected an LLM to those data sources via an enterprise RAG platform (such as Credal).
Once you’ve done that, you face three big questions:
LLMs can enable more sophisticated queries than were previously accessible: e.g. you can imagine an agent asking the LLM, “what decision did we make for similar businesses to [X]” and the LLM having to run a search to answer this. In order to do this, they require text (or image) inputs, and that presents our first challenge: what text should we be sending the LLM, and how should it be represented in the data?
In this use case, we have structured data in a database (such as Snowflake or Databricks) saying “Business Name'', “Business Address”, “ID”, “Description of Business”, “Website” etc, and then potentially hundreds of rows per day. We need to make sure that data is represented properly: as something that can be used in a prompt, both for this usecase, and for future cases which may need to touch this data.
In order to run that “fuzzy”, or semantic, search, certain key fields need to be embedded. So those fields — such as website description or website content — need to be classified as “Text” data.
Other fields can be classified as “Metadata” and are usually prepended to the prompt in classic RAG style, sometimes paired with the description depending on the query. For example, if you’re just asking the agent to assess a single business, you can just put everything into one prompt.
You can see an example of this workflow below in Credal:
Regulated enterprises very often have to be able to defend decisions to their regulator, so a well governed system, that’s compliant is vital. That means careful audit logging of every request, easy deletion when customers ask for their data to be deleted, and easy observability that’s exportable to companies’ existing systems for storing audit logs in case regulators come asking for details in the future.
Equally important is the need to think through the permissions of who should be able to access the data we’ve connected from internal systems.
Most large enterprises will have a lot of teams which operate on the basis of the “The principle of least privilege (PoLP)”, which is an information security concept which maintains that a user or entity should only have access to the specific data, resources and applications needed to complete a required task. This is not how most startups operate, but typically makes sense in the context of a larger company.
In this case, some of the data is quite sensitive - the photos of individual people’s IDs, so we may need to restrict this to just the analyst dealing with this specific case, and their direct supervisors. In the simplest case, we can do this by making sure that the underlying data source in Snowflake, GCP etc has a field for allowed users, and then we can just point Credal’s allowed users at that field — inheriting role-based access controls.
There are more sophisticated strategies we can use than this (such as saying this object is tied to this record in Salesforce/Zendesk etc, and then inheriting the permissions of that record), but in many cases this will do. It’s worth noting that permissions for embeddings can be complex, and vary with what those embeddings represent; and we’ve written more about other data security risks with LLMs elsewhere.
Prompt injections are one of the biggest security risks with LLMs. (OWASP listed it as #1). In short, it is possible to put malicious content on a website that, when ingested, instructs the LLM to do something the prompter didn’t intend. We need to manage this risk.
For example, here is a prompt we might use:
Now suppose a malicious business embeds the following instruction in its website HTML:
This type of attack is something that regulated companies will need guardrails around; there are many highly sophisticated attackers looking for ways to exploit the KYC processes of financial institutions, and this example was one that took me about 2 minutes to construct.
Credal provides a number of out-of-the-box guardrails to prevent this kind of mistake. For obvious reasons, we’re not going to discuss all of them here, but one simple part of the offering which our regulated customers use is Credal’s acceptable use policy enforcement. This lets a person write a natural language acceptable use policy, and if that policy is triggered, Credal will automatically flag or block the request.
In this case, we can see the exact same query that tripped up ChatGPT getting flagged as potentially suspicious in Credal’s UI:
This is a simple example; in real life we might have multiple overlapping policies, which can be applied to all use cases or just specific ones. Here, if Credal detects something suspicious in the prompt, we’ll automatically flag that in our API response (or in the Credal UI, if the end-user is using Credal’s chat UI), and the customer can surface that in their application. We’ve also written a whole guide to dealing with prompt injections that you can find here.
Enterprises in regulated industries that deploy AI to improve their business metrics and employee productivity will gain an advantage; thus, we will see widespread AI deployment in these areas in the next few years. Businesses will elevate their use of AI from simple Q&A tools to advanced systems that drive decision-making and safeguard operations. If you’re in a regulated industry or want advice on dealing with these issues, or want to arrange a demo of our platform, feel free to contact us and we’d be happy to help: founders@credal.ai
Credal gives you everything you need to supercharge your business using generative AI, securely.