Patronus AI, a groundbreaking startup created by two former AI experts from Meta, is making waves with its innovative approach to evaluating and testing large language models. With a specific focus on regulated industries where errors are not tolerated, Patronus AI has developed a security and analysis framework designed to identify problematic areas, particularly the potential for hallucinations.
The founding duo, CEO Anand Kannappan and CTO Rebecca Qian, bring extensive experience in responsible AI to the table. Kannappan was responsible for overseeing responsible machine learning frameworks at Meta Labs, while Qian spearheaded responsible NLP research at Meta AI. Together, they have developed a managed service that automates and scales the process of model evaluation, alerting users to identified issues.
The evaluation process consists of three key steps. First, Patronus AI’s product enables users to score models in real-world scenarios, assessing criteria such as hallucinations in finance and other applicable fields. Next, the product generates adversarial test suites and stress tests the models against these tests, building test cases automatically. Finally, the models are benchmarked using various criteria to determine the most suitable model for a specific use case.
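The three steps above can be sketched as a minimal evaluation pipeline. This is an illustrative assumption, not Patronus AI's actual product or API; every name here (`score_model`, `build_adversarial_suite`, `benchmark`) is hypothetical, and the "models" are stand-in functions:

```python
# Hypothetical sketch of the three-step flow: score, stress test, benchmark.
# None of these names correspond to Patronus AI's real API.

def score_model(model, cases):
    """Step 1: score a model on real-world cases (fraction of checks passed)."""
    return sum(1 for prompt, check in cases if check(model(prompt))) / len(cases)

def build_adversarial_suite(cases, perturb):
    """Step 2: automatically generate adversarial variants of each test case."""
    return [(perturb(prompt), check) for prompt, check in cases]

def benchmark(models, cases, perturb):
    """Step 3: rank candidate models on real-world plus adversarial cases."""
    suite = cases + build_adversarial_suite(cases, perturb)
    results = {name: score_model(m, suite) for name, m in models.items()}
    return max(results, key=results.get), results

# Toy usage: plain functions stand in for language models,
# and upper-casing the prompt stands in for a real perturbation.
cases = [("What is 2+2?", lambda out: "4" in out)]
models = {
    "model_a": lambda prompt: "The answer is 4.",
    "model_b": lambda prompt: "I am not sure.",
}
best, scores = benchmark(models, cases, perturb=lambda p: p.upper())
```

A production version would replace the stand-in functions with API calls to actual models and use learned adversarial generators rather than a fixed perturbation, but the shape of the loop is the same.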
The focus of Patronus AI is primarily on highly regulated industries, where incorrect answers can have significant consequences. By verifying the safety and reliability of large language models, the startup helps companies detect leaks of business-sensitive information and flag inappropriate outputs.
Kannappan emphasizes that Patronus AI aims to be a trusted third party in model evaluation. The startup provides an unbiased, independent perspective on the performance of language models, offering credibility and peace of mind to organizations utilizing these models.
In terms of pricing, Patronus AI adopts a usage-based model, tailored to the volume of evaluations and samples required. The company currently employs six full-time professionals, but with the rapid growth of the space, expansion is on the horizon. Qian emphasizes the importance of diversity within the company, pledging to institute programs and initiatives to foster an inclusive workspace as the team grows.
Patronus AI’s $3 million seed funding round, led by Lightspeed Venture Partners with participation from Factorial Capital and other industry angels, solidifies the startup’s position for future success as it continues to revolutionize the evaluation and testing of large language models.
1. What is Patronus AI?
Patronus AI is a startup focused on developing a security and analysis framework to evaluate and test large language models. It aims to ensure the safety and reliability of these models, particularly in regulated industries.
2. What is the purpose of Patronus AI’s evaluation tool?
The evaluation tool automates and scales the process of model evaluation, alerting users to potential issues and identifying the best model for a specific use case.
3. Why does Patronus AI focus on regulated industries?
Regulated industries have little tolerance for errors, making it crucial to ensure the accuracy and reliability of large language models used in these contexts.
4. How does Patronus AI provide an unbiased perspective?
Patronus AI aims to be a trusted third party by offering an unbiased, independent evaluation of language models. This independent stamp of credibility sets it apart from vendors' own claims of model superiority.
5. How does Patronus AI plan to grow?
The company plans to hire additional talent in the coming months to keep up with the fast-paced growth of the industry. Diversity is a key pillar for Patronus AI, and the company intends to foster an inclusive workspace as it expands its team.