The report is a collaboration between some of the leading minds in this space, particularly the Legaltech Hub, courtesy of Nicola Shaver and Jeroen Plink; and project lead Tara L. Waters, who I’m excited to say is joining our next webinar to discuss which AI solutions fit best for which tasks (sign up here - Agents vs Copilots: which works best for in-house legal?).
Let’s dig into the report. It evaluates the performance of four tools - CoCounsel, Vincent AI, Harvey Assistant from Harvey, and Oliver from Vecflow - across seven legal tasks:
AI performance was benchmarked against a ‘Lawyer Baseline’ - a control group assembled by Cognia Law and including Reed Smith LLP, Fisher Phillips, McDermott Will & Emery, and Ogletree Deakins (plus four more anonymous firms).
It’s great to see pioneering firms take the lead here and grasp the nettle of evaluating people vs AI. I believe that the firms and in-house teams who see this wave coming and ride it will thrive compared to those who pretend it won’t affect them.
So what are the toplines? Here are the summarised results:
The first thing to note is that AI is already outperforming the human lawyer baseline in a whole range of cases.
For data extraction, document Q&A, summarization and transcript analysis, you can already buy multiple solutions (including Juro) that outperform a baseline set by some of the world’s best law firms.
This underlines what many (including us) have been saying for some time, since Goldman Sachs first dropped that dramatic report a year or two ago. But to see it in black and white is still quite something.
It’s hardly surprising though. If I think back to my time as an M&A lawyer, sitting in a windowless room with piles of paper documents and a clipboard, performing what we’d now call data extraction - finding relevant dates, values, change of control clauses and so on - I would have been astonished to see what is possible today.
It makes sense that an AI so well-suited to text comprehension should outperform a single tired, inexperienced, junior lawyer like me. Our customers’ adoption of AI Extract validates this too - usage of that feature alone is growing more than 100% every month.
Similarly for document Q&A and summarization, these are exemplary applications of what generative AI can do.
We saw in our webinar last week (The limits of AI: are there any legal tasks AI should never do? - watch on demand) that there are still pockets of healthy skepticism regarding AI in legal.
But if AI is decisively outperforming top law firms, in a matter of moments, for a fraction of the cost (and without value being measured in six-minute increments), I don’t see how that skepticism can hold for much longer.
That said, we are clearly at the bottom of the maturity curve for some applications of AI. For redlining, for those solutions brave enough to throw their hat in the ring (in this study, Harvey and Vincent), performance is some way behind the law firm baseline. And given the potential consequences of bad redlining, this is definitely a challenge for AI adoption.
Why is redlining so hard? The big difference vs text interpretation tasks is really the amount of context required to do that job well. Compare extraction: if AI is asked to find an effective date on a vendor contract, it doesn’t really need any context other than the document itself.
There are probably some numbers and letters that look like a date, near to the words ‘effective’ and ‘date’, or some simple reasoning that relates back to a date, and AI can figure it out.
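To make that intuition concrete, here is a toy sketch in Python. It is not how any of the tools in the report work (they use language models, not pattern matching), and the function name, the date formats and the 80-character window are all assumptions made for illustration. It simply looks for something date-shaped shortly after the words 'effective date', using nothing but the document itself:

```python
import re

# Illustrative date shapes only: "1 March 2024" or "March 1, 2024".
DATE_PATTERN = re.compile(
    r"\b(?:\d{1,2}\s+[A-Z][a-z]+\s+\d{4}|[A-Z][a-z]+\s+\d{1,2},\s+\d{4})\b"
)
KEYWORD_PATTERN = re.compile(r"effective\s+date", re.IGNORECASE)


def find_effective_date(text, window=80):
    """Return the first date-like string shortly after 'effective date', or None.

    Hypothetical helper: the window size is an arbitrary assumption.
    """
    for keyword in KEYWORD_PATTERN.finditer(text):
        # Only look at the text immediately following the keyword.
        nearby = text[keyword.end(): keyword.end() + window]
        date_match = DATE_PATTERN.search(nearby)
        if date_match:
            return date_match.group(0)
    return None


contract = (
    "This Agreement is dated 3 January 2024. The Effective Date of this "
    "Agreement is 1 March 2024, and it renews annually."
)
print(find_effective_date(contract))
```

Everything this sketch needs is on the page in front of it, which is the point: nothing about the parties, the deal or the negotiation history is required to answer the question.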
But to redline a contract well, there’s so much you need to know:
… and so on. Even nuanced factors like the real or perceived bargaining power of each side can have a material impact on how you mark up that document.
A lawyer who’s navigated not just that document but that professional scenario dozens or hundreds of times still has the edge…
… for now. Redlining is just at a different point on the maturity curve. As models get more powerful and solutions become more integrated, it’s not hard to imagine the lawyer advantage eroding.
With integrations and APIs, it would be possible for AI to understand:
… and so on. Ultimately the context that AI is missing is just data it doesn’t have yet. Given the pace of development, I would bet on it having that data sooner than we think. It’s just a question of time, and then what will the delivery of legal advice look like?
Check out the Vals.ai report in the comments and do share your experiences of tackling these tasks with AI - which do you like, and which aren’t quite there yet?
ICYMI - we're hosting a webinar featuring results from this survey and our own, which you can sign up for by clicking below.
Richard Mabey is the CEO and co-founder of Juro, the intelligent contract automation platform. Under his leadership, Juro has scaled rapidly, backed by $38 million in venture funding from prominent investors including Eight Roads, USV, Point Nine Capital and Seedcamp, and the founders of companies like Indeed, Gumtree and Wise.
Richard trained and qualified at Freshfields Bruckhaus Deringer, working as an M&A associate in London and New York. He gained an MBA from INSEAD, and then spent time at LegalZoom, learning to build legal tech products.
Frustrated by the manual legal processes that slow down businesses, Richard co-founded Juro in 2016, with a mission to help the world agree contracts faster. Beyond Juro, he hosts the "Brief Encounters" podcast, makes angel investments, and supports other ambitious ventures from the boardroom. Richard is a Fellow of the RSA, an adviser to The Entrepreneurs Network and sits as a Non-executive Director of Bright Blue.