Analyzing AI for the Enterprise: Current and Future States — Part 1 — Open Source CXO Ep. 15 | Active Logic
The gap between AI hype and AI reality is wide, and it’s growing. In this first of two episodes with John Keddy, CEO of Lazarus AI, the conversation focuses on the practical foundations that enterprise organizations need before they can meaningfully adopt AI — and why most companies are skipping steps that will cost them later.
Lazarus AI specializes in document automation for regulated industries including insurance and healthcare, where data accuracy isn’t optional and the consequences of errors are measured in dollars and compliance violations. John’s perspective is informed by building AI systems that have to work reliably at scale, not just demo well in a conference room. That distinction matters more than most enterprise leaders realize.
Key Insight: Understanding the AI Landscape — Large vs. Small Language Models
John starts with a foundational distinction that many business leaders gloss over: the difference between large language models (LLMs) and small language models (SLMs), and why the choice between them has significant implications for cost, accuracy, and deployment strategy.
Large language models like GPT-4 are general-purpose — they know a lot about a lot. That breadth makes them impressive in demos and useful for general tasks, but it also makes them expensive to run and difficult to fine-tune for specific domains. Small language models are trained on narrower datasets and optimized for specific tasks. They’re cheaper, faster, and often more accurate within their domain.
For enterprise applications, the right answer is frequently a small model trained on proprietary data rather than a large model prompted with general instructions. An insurance company processing claims doesn’t need a model that can write poetry — it needs a model that can accurately extract data from insurance forms. John describes how Lazarus AI uses this principle to deliver higher accuracy at lower cost than organizations attempting to force general-purpose models into specialized roles.
Key Insight: AI in Regulated Industries
Insurance, healthcare, and financial services share a common challenge: the data they process is sensitive, the accuracy requirements are high, and the regulatory consequences of errors are severe. These constraints fundamentally change how AI can be deployed.
John describes the reality that many AI vendors don’t advertise: out-of-the-box AI solutions rarely meet the accuracy thresholds required in regulated industries. A 90% accuracy rate sounds impressive until you realize that means one in ten documents is processed incorrectly — an unacceptable error rate when those documents are insurance claims, medical records, or financial statements.
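The arithmetic behind that observation is worth making concrete. A quick sketch (the volume figure is illustrative, not from the episode):

```python
# Expected errors at a given accuracy rate and document volume.
# The monthly volume here is a hypothetical illustration.

def expected_errors(accuracy: float, volume: int) -> int:
    """Documents processed incorrectly, on average."""
    return round((1 - accuracy) * volume)

monthly_volume = 100_000  # hypothetical claims volume

for accuracy in (0.90, 0.99, 0.999):
    print(f"{accuracy:.1%} accuracy -> "
          f"{expected_errors(accuracy, monthly_volume):,} errors/month")
# 90.0% accuracy -> 10,000 errors/month
# 99.0% accuracy -> 1,000 errors/month
# 99.9% accuracy -> 100 errors/month
```

At regulated-industry volumes, each step toward the accuracy threshold removes an order of magnitude of downstream remediation work.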
Lazarus AI’s approach involves proprietary OCR (optical character recognition) technology specifically designed for the types of documents their clients process. Generic OCR tools struggle with the formatting inconsistencies, handwritten annotations, and poor scan quality that are common in real-world document workflows. By building domain-specific ingestion pipelines, Lazarus achieves the accuracy rates that regulated industries require.
Key Insight: Human-in-the-Loop Quality Control
The concept of human-in-the-loop (HITL) AI is central to how Lazarus AI operates, and John makes a compelling case for why it should be central to any enterprise AI deployment in high-stakes domains.
The idea is straightforward: AI handles the high-volume, routine processing, but human reviewers verify outputs that fall below a confidence threshold. This creates a system that captures most of the efficiency gains of full automation while maintaining the accuracy standards that pure automation can’t guarantee.
The nuance is in the implementation. How you route items for human review, how you set confidence thresholds, and how you use human corrections to improve the model over time all determine whether HITL is a genuine quality system or just a checkbox. John describes the feedback loops Lazarus has built, where human corrections are systematically fed back into model training, creating continuous improvement that narrows the gap between automated and human accuracy over time. Organizations investing in custom software for document processing should consider this architecture pattern from the outset.
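The routing and feedback-loop pattern described above can be pictured in a few lines. This is a minimal sketch of the general architecture, not Lazarus AI's actual implementation; the threshold value, class names, and fields are all hypothetical:

```python
# Sketch of confidence-threshold routing for human-in-the-loop review.
# Threshold and field names are hypothetical illustrations.

from dataclasses import dataclass, field

REVIEW_THRESHOLD = 0.95  # below this, a human verifies the extraction


@dataclass
class Extraction:
    document_id: str
    fields: dict       # extracted field -> value
    confidence: float  # model's confidence in the extraction


@dataclass
class Pipeline:
    auto_approved: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)
    corrections: list = field(default_factory=list)  # fed back into training

    def route(self, item: Extraction) -> str:
        """Send high-confidence items straight through; queue the rest."""
        if item.confidence >= REVIEW_THRESHOLD:
            self.auto_approved.append(item)
            return "auto"
        self.review_queue.append(item)
        return "human_review"

    def record_correction(self, item: Extraction, corrected: dict) -> None:
        # Human corrections become labeled training data, closing the loop.
        self.corrections.append((item.fields, corrected))


pipeline = Pipeline()
pipeline.route(Extraction("claim-001", {"amount": "1,200.00"}, confidence=0.98))
pipeline.route(Extraction("claim-002", {"amount": "12O0.00"}, confidence=0.71))
```

The design decision that separates a genuine quality system from a checkbox lives in `record_correction`: whether those human corrections actually flow back into model training, or just get filed away.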
Key Insight: Cost-Effectiveness and Practical AI Adoption for Mid-Size Companies
One of the most valuable parts of the conversation addresses a question that mid-size companies struggle with: how do you capture AI’s benefits without the budget of a Fortune 500?
John’s advice is pragmatic. Start with a specific, well-defined problem — not “implement AI across the organization.” Identify a process that is high-volume, repetitive, and currently expensive (either in labor cost or error cost). Build or buy an AI solution for that specific process. Measure the results. Then expand.
The mistake most mid-size companies make is trying to boil the ocean — launching broad AI initiatives without clear use cases, measurable success criteria, or realistic timelines. The companies that succeed start small, prove value, and scale incrementally. This is the same discipline that makes any software development initiative successful: define the problem, build the solution, measure the outcome.
John also addresses the build-vs-buy question directly. For most mid-size companies, building custom AI models from scratch is neither practical nor necessary. The better approach is leveraging existing platforms and APIs, customized with proprietary data and integrated into existing web applications and workflows.
Key Insight: Risk Management in AI-Driven Operations
Deploying AI in production isn’t just a technology decision — it’s a risk management decision. John walks through the risk categories that enterprise leaders need to evaluate: accuracy risk (the model gets it wrong), availability risk (the model or API goes down), compliance risk (the model processes data in ways that violate regulations), and vendor risk (your AI provider changes pricing, terms, or capabilities).
Each of these risks requires specific mitigation strategies. Accuracy risk demands validation layers and human-in-the-loop review. Availability risk requires fallback processing paths. Compliance risk needs audit trails and data governance. Vendor risk calls for architecture that isn’t locked into a single provider.
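The availability and vendor mitigations in particular translate into an architectural pattern: put an abstraction between your application and any one provider, with a fallback path behind it. A minimal sketch under assumed interfaces (the provider classes are stand-ins, not real vendor SDKs):

```python
# Sketch of a provider abstraction with a fallback chain, addressing
# availability risk (a provider goes down) and vendor risk (easy to swap).
# Provider classes are hypothetical stand-ins, not real vendor SDKs.

from abc import ABC, abstractmethod


class ModelProvider(ABC):
    @abstractmethod
    def extract(self, document: str) -> dict:
        """Extract structured fields from a document."""


class PrimaryProvider(ModelProvider):
    def extract(self, document: str) -> dict:
        raise ConnectionError("provider unavailable")  # simulate an outage


class FallbackProvider(ModelProvider):
    def extract(self, document: str) -> dict:
        return {"source": "fallback", "text": document}


def extract_with_fallback(document: str, providers: list) -> dict:
    """Try each provider in order; fail only when all of them do."""
    errors = []
    for provider in providers:
        try:
            return provider.extract(document)
        except ConnectionError as exc:
            errors.append(exc)  # production: log for the compliance audit trail
    raise RuntimeError(f"all providers failed: {errors}")


result = extract_with_fallback(
    "claim form text", [PrimaryProvider(), FallbackProvider()]
)
```

Because callers depend only on the `ModelProvider` interface, swapping a vendor that changes pricing or terms means writing one adapter class, not rewriting the application.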
The organizations that treat AI deployment as a risk management exercise, in addition to an opportunity, are the ones that build sustainable AI capabilities. Those that focus only on the upside tend to discover the downside at the worst possible time.
Takeaways
- Small, domain-specific models often outperform large general-purpose models for enterprise use cases — and cost less to run.
- Generic AI tools rarely meet regulated industry accuracy requirements. Domain-specific ingestion and processing pipelines close the gap.
- Human-in-the-loop isn’t a compromise — it’s a feature. Well-designed HITL systems capture efficiency gains while maintaining accuracy standards.
- Start with one specific, measurable problem. Mid-size companies that try to implement AI broadly waste budget. Those that start narrow build real capabilities.
- Treat AI deployment as a risk management exercise. Accuracy, availability, compliance, and vendor risks each need explicit mitigation strategies.