🤖 AI Summary
A benchmarking report (July–Aug 2025) that evaluated 13 AI systems against a baseline of human in-house lawyers (450 task outputs, 72 survey responses, 12 interviews) finds that modern AI can match or exceed lawyers at producing reliable first drafts of contracts. Using a three-dimension framework of Output Reliability (instruction compliance, factual accuracy, legal adequacy; pass/fail), Output Usefulness (clarity, helpfulness, length; scored 1–3 each, max 9), and Platform Workflow Support (generation and quality-assurance features; max 10), the study shows the top models (Gemini 2.5 Pro at 73.3% reliability, plus GPT-5, GC AI, Brackets, August, and SimpleDocs) outperforming the human baseline (56.7% reliability, rising to 61.5% with AI assistance). Notably, legal-specific AI flagged material risks far more often than general-purpose models (83% vs. 55%), while the human reviewers flagged none.
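The rubric is simple enough to express directly. A minimal Python sketch follows; the field names, the all-checks-must-pass reliability rule, and the simple summing of usefulness sub-scores are illustrative assumptions, not the report's actual methodology.

```python
from dataclasses import dataclass

@dataclass
class Evaluation:
    # Output Reliability: three pass/fail checks
    instruction_compliance: bool
    factual_accuracy: bool
    legal_adequacy: bool
    # Output Usefulness: each sub-score ranges 1-3, max total 9
    clarity: int
    helpfulness: int
    length: int
    # Platform Workflow Support: generation + QA features, max 10
    workflow_support: int

    def reliable(self) -> bool:
        """Assumed rule: an output is reliable only if all three checks pass."""
        return all([self.instruction_compliance,
                    self.factual_accuracy,
                    self.legal_adequacy])

    def usefulness_score(self) -> int:
        """Sum of the three 1-3 usefulness sub-scores (max 9)."""
        return self.clarity + self.helpfulness + self.length


def reliability_rate(evals: list[Evaluation]) -> float:
    """Share of outputs judged reliable, e.g. 0.733 for the top model."""
    return sum(e.reliable() for e in evals) / len(evals)
```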
Technically significant findings: general-purpose LLMs slightly edged legal-specific tools on raw output reliability, while legal platforms scored higher on usefulness and, crucially, on workflow integration (66.7% of tools integrate into Microsoft Word). Platform Workflow Support (context handling, template/playbook grounding, and verification features) emerged as the main differentiator for adoption, rather than pure model accuracy. Implications: legal teams should prioritize tools that combine strong reliability with seamless Word integration and QA features; AI can materially reduce drafting time and surface overlooked risks, but outputs still require lawyer oversight and continuous verification as capabilities evolve.