It seems like an obvious move: train your firm's AI system on SEC filings and then let it perform regulatory tasks. But new research from Patronus AI should give you pause.
When even the most advanced large language model (GPT-4) was put to the test, it answered only 79% of questions about an SEC document correctly. In other words: a C+ grade. As CNBC explains, the large language models tended either to refuse to respond or to “hallucinate” an answer with incorrect data points.
"That type of performance rate is just absolutely unacceptable," said Anand Kannappan, Patronus’s co-founder, of the findings. "It has to be much much higher for it to really work in an automated and production-ready way."
The findings cast doubt on both how quickly AI systems can be integrated into firms, and what tasks they’ll perform. Still, it’s not slowing Wall Street down.
In April, Bloomberg announced it was working on an internal AI system in the mold of ChatGPT. The company says the model (dubbed BloombergGPT) will help “in improving existing financial NLP (natural language processing) tasks, such as sentiment analysis, named entity recognition, news classification, and question answering, among others.” And fellow financial giant JPMorgan is reportedly working on an internal model too, having filed a trademark application for a product called IndexGPT that may be able to provide investment advice.
But doling out financial advice or judging investor sentiment, while morally dubious, is not the same as performing regulatory and compliance tasks.
"There just is no margin for error that's acceptable, because, especially in regulated industries, even if the model gets the answer wrong 1 out of 20 times, that's still not high enough accuracy," said Rebecca Qian, Patronus’s other co-founder. Kannappan added, “Models will continue to get better over time. We're very hopeful that in the long term, a lot of this can be automated. But today, you will definitely need to have at least a human in the loop to help support and guide whatever workflow you have."
A Word of Caution
In a July assessment of implementing AI in regulatory tasks, consulting and accounting firm EY cautioned that "as AI tools start to become viable from a cost and value perspective, compliance professionals should perform top-down and bottom-up assessments of their operating model to proactively identify areas for potential enhancement. Designing a future-state compliance framework with key operational objectives — risk outcomes, cost reduction, process improvement — can help govern the implementation journey."
THE VERDICT:
Silicon Valley is no stranger to the hype cycle. We have been here before with cryptocurrencies, self-driving cars, and even online grocers. AI is clearly a revolutionary technology that will profoundly reshape our world, but for now, it still needs some work. Regulatory compliance departments would be wise to begin thinking about how AI can integrate into their workflows and processes, but they should not put the proverbial cart before the horse at the risk of creating violation nightmares.