
Vontive Sets Mortgage Industry’s First Benchmark for AI Performance on Data Processing Tasks

Company's proprietary dataset becomes industry gold standard for evaluating AI accuracy in mortgage underwriting

Vontive, the technology company standardizing the business-purpose mortgage, announced the release of the mortgage industry’s first LLM benchmark study today. Using the company's expert-annotated dataset covering 23 document types essential to mortgage underwriting, this benchmark sets the industry’s new baseline for measuring AI accuracy and effectiveness in mortgage document analysis. Its release will help other organizations develop greater trust in their AI initiatives and execute more informed decision-making across their businesses.

"AI produces compelling results that often look 'pretty good,' but in mortgage lending, you need to actually be right," said Wolf Rendall, Director of Data Products at Vontive. "Before our benchmark, companies had no reliable way to distinguish between AI that appears to work well and AI that actually performs accurately on mortgage-specific tasks."

AI benchmarking addresses a critical gap in the mortgage industry. Prior to this study, there had been no way to measure whether AI systems could reliably and accurately handle the volumes and complexity of data required for mortgage lending decisions.

"We invested heavily in building out our data pipeline to ingest structured data from very dense, nuanced mortgage-related documents, and it was imperative that we could validate our results, even amid a rapidly evolving AI landscape," added Rendall. "We knew if we had this challenge, it would eventually become a problem across the industry, potentially limiting AI adoption on a greater scale, so we decided to establish the benchmark ourselves."

Study Results: Claude 3.7 Sonnet Best Suited for Vontive’s Complex Document Processing Needs

To create this benchmark, Vontive provided its partner organization, Vals AI, with a dataset of human-annotated documents, including county tax certificates, insurance binders with complete policy endorsements and coverages, rental leases with detailed terms and dates, comprehensive title reports, LLC formation documents, operating agreements, and regulatory HUD forms required for property transactions. Vontive’s proprietary AI systems were developed to process these documents and demonstrate particular strength in mathematical reasoning, which is crucial for tasks such as calculating annualized tax amounts from quarterly or semi-annual county assessments.
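As an illustration of the annualization task mentioned above, the following is a minimal Python sketch, not Vontive's implementation. It assumes each installment in the year is equal and that the billing frequency has already been extracted from the tax certificate; the `PERIODS_PER_YEAR` mapping and the `annualize_tax` function are hypothetical names introduced only for this example.

```python
from decimal import Decimal

# Hypothetical mapping from billing frequency to installments per year.
PERIODS_PER_YEAR = {"annual": 1, "semi-annual": 2, "quarterly": 4}


def annualize_tax(installment_amount: str, frequency: str) -> Decimal:
    """Return the annual tax implied by one installment.

    Assumes every installment in the year is the same amount.
    Decimal is used so currency values are not subject to float rounding.
    """
    periods = PERIODS_PER_YEAR[frequency.lower()]
    return Decimal(installment_amount) * periods


quarterly_total = annualize_tax("1250.50", "quarterly")
semi_annual_total = annualize_tax("3100.25", "semi-annual")
```

In practice, county installments are not always equal, so a production system would likely sum the individual installments when they are listed rather than multiply a single one.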

Through Vals AI’s rigorous testing, Vontive determined that Anthropic's Claude 3.7 Sonnet provides optimal performance for the company’s document processing functions, balancing accuracy with cost-effectiveness. The company's analysis also revealed that upgrading to more expensive AI models yielded minimal performance gains, while smaller models showed meaningful accuracy decreases.

According to the study, which used the prompts Vontive provided in May, Anthropic’s Claude 3.7 Sonnet (Nonthinking) was the best-performing model at 80.6% accuracy, excelling in both semantic and numerical extraction tasks. For comparison, Claude 4.0 (Thinking) achieved only a 62.5% accuracy rate, while Meta’s Llama 3 achieved a 55.3% accuracy rate. These results show that in the AI race, bigger is not always better: performance also depends on the prompts and the specific problem space. With the benchmark results in hand, Vontive iterated quickly, re-evaluating results with Vals AI at each step, and improved the prompts for its extraction tasks to reach 90% accuracy, in line with human data entry.
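To make the accuracy figures above concrete, here is a minimal Python sketch of how field-level extraction accuracy could be scored against human annotations. It is an assumption-laden illustration, not the methodology used by Vals AI or Vontive: the matching rules (a numeric tolerance and case-insensitive text comparison), the function names, and the sample documents are all hypothetical.

```python
def normalize(value):
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(str(value).lower().split())


def field_matches(predicted, expected, tolerance=0.01):
    """Compare one extracted value with its annotation.

    Numeric annotations match within `tolerance`; everything else is
    compared as normalized text. Both rules are assumptions for this sketch.
    """
    if isinstance(expected, (int, float)) and not isinstance(expected, bool):
        try:
            return abs(float(predicted) - expected) <= tolerance
        except (TypeError, ValueError):
            return False
    return normalize(predicted) == normalize(expected)


def extraction_accuracy(predictions, annotations):
    """Fraction of annotated fields that the model extracted correctly."""
    total = 0
    correct = 0
    for doc_id, expected_fields in annotations.items():
        predicted_fields = predictions.get(doc_id, {})
        for name, expected in expected_fields.items():
            total += 1
            if name in predicted_fields and field_matches(predicted_fields[name], expected):
                correct += 1
    return correct / total if total else 0.0


# Hypothetical annotated documents and model output.
annotations = {
    "tax_cert_1": {"annual_tax": 5002.00, "county": "Travis County"},
    "lease_1": {"monthly_rent": 1800, "tenant": "Jane Doe"},
}
predictions = {
    "tax_cert_1": {"annual_tax": "5002.00", "county": "travis  county"},
    "lease_1": {"monthly_rent": 1850, "tenant": "Jane Doe"},
}
accuracy = extraction_accuracy(predictions, annotations)
```

Here three of the four annotated fields match, so `accuracy` is 0.75. Scoring semantic and numerical fields with separate rules mirrors the distinction the study draws between the two kinds of extraction task.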

Establishing Rigor Across the Industry

By making the benchmark dataset available through its partnership with Vals AI, Vontive created a standardized measurement system that other organizations can use to evaluate their own AI implementations. The benchmark functions similarly to established standards in other domains, such as the Stanford Question Answering Dataset (SQuAD) for natural language processing, providing the mortgage industry with its first rigorous framework for AI evaluation.

The benchmark addresses broader industry challenges around AI reliability. Unlike creative AI applications where subjective results are acceptable, mortgage document processing requires precise extraction of specific data points with verifiable accuracy. Through their partnership, Vontive and Vals AI can provide the quantitative rigor necessary to build trust in automated underwriting systems.

“We wanted to work with Vontive on this important initiative because of the company’s clear commitment to AI innovation, accuracy, and quality,” said Rayan Krishnan, CEO of Vals AI. “Now others can benefit as we introduce a new standard for rigor in the industry.”

Vontive's internal AI systems currently achieve 95% accuracy across all supported document types, demonstrating the company's leadership in applying artificial intelligence to complex mortgage workflows.

Scaling AI-Driven Mortgage Operations

In addition to its benchmark, Vontive disclosed that it now supports more than 40 document types in its AI-powered processing pipeline, covering the common document types typically required in mortgage underwriting workflows. The expansion lets the company build on its existing success in identifying and analyzing vital information for more accurate underwriting and pricing while significantly reducing loan processing time. Vontive also automatically validates data across thousands of criteria, checking that all relevant data points agree at every stage of the loan.
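The kind of cross-document validation described above might be sketched as a set of named rules applied to the data extracted for a loan. This is a hypothetical illustration only: the field names, the two rules, and the `failed_rules` helper are assumptions made for the example and do not describe Vontive's actual criteria.

```python
# All field names and rules below are hypothetical illustrations.

def names_agree(loan):
    """The borrowing entity on the application should match the operating agreement."""
    applied = loan["application"]["borrower"].strip().lower()
    formed = loan["operating_agreement"]["entity_name"].strip().lower()
    return applied == formed


def coverage_sufficient(loan):
    """Dwelling coverage on the insurance binder should be at least the loan amount."""
    return loan["insurance_binder"]["dwelling_coverage"] >= loan["application"]["loan_amount"]


RULES = {
    "names_agree": names_agree,
    "coverage_sufficient": coverage_sufficient,
}


def failed_rules(loan, rules=RULES):
    """Return the names of every rule the loan's extracted data violates."""
    return [name for name, rule in rules.items() if not rule(loan)]


loan = {
    "application": {"borrower": "Maple Street LLC", "loan_amount": 300_000},
    "operating_agreement": {"entity_name": "maple street llc"},
    "insurance_binder": {"dwelling_coverage": 250_000},
}
failures = failed_rules(loan)
```

For this sample loan, `failures` contains only `"coverage_sufficient"`. Keeping each criterion as a small named function makes it straightforward to grow the rule set and to report exactly which check failed.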

Vontive’s AI Underwriter was first introduced into production workflows on July 23, 2024. Since then, it has parsed 35,000 documents to fill out datasets for 1,700 borrowers on 6,845 loans with requested balances totaling $1.713B.

By embracing, refining, and testing the validity of its AI systems, and persistently innovating to incorporate AI into its data products, Vontive underscores the operational scalability that robust AI implementation can provide.

To learn more about the benchmark study, please visit Vals Benchmarks.

About Vontive

Founded by credit industry and technology veterans, Vontive is the leading embedded mortgage platform for investment real estate with best-in-class technology that standardizes private credit mortgages. Vontive enables any bank, credit union, property technology company, or B2C brand serving real estate investors to launch its own investment-mortgage business with ease. Please visit www.vontive.com or follow us at linkedin.com/company/vontive.

