Building AI Tools With Guardrails — Peter Holcomb Pt. 2

Show Notes

Every founder building AI tools right now is, in some sense, running haphazardly into the night. The opportunity is obvious, the tools are accessible, and the governance frameworks are anything but. Peter Holcomb returns for Part 2 to walk through the practical framework every AI builder needs before they ship - covering pre-purchase governance questions, what it actually costs to train on proprietary data, prompt injection as an attack vector, red team and blue team testing resources, and why security bolted on after the fact is how companies end up with nine-figure fines.

Peter is the founder of Optimal IT, an AI governance and compliance consultancy working with startups and SMBs under 100 people. He also consults for HackerVerse and Tesseon, two companies on the front lines of AI security testing and guardrail implementation. This is the most practically actionable episode in the series for founders building on top of LLM APIs.

Before You Buy or Build: The Governance Questions First

The most common mistake Peter sees is companies procuring AI tools - or building them - before establishing a governance framework. If you can't answer these questions, you're not ready to purchase or ship:

How are employees ensured to use AI securely?
What data will vendors be allowed to train on?
Where will the AI models be hosted?
What is the risk appetite for hallucinations - when the model confidently fabricates outputs?
How do we account for unwanted biases baked into the model?
What AI security mechanisms are in place against prompt injection, jailbreaking, and evasion?
How are copyright infringement risks being managed?
What regulatory and compliance requirements apply - HIPAA and HITRUST for healthcare, SEC rules for financial services?

These aren't abstract enterprise concerns. They apply to a solo founder with $100 in OpenAI credits who wants to sell a tool to other businesses. The questions scale down; the risks don't disappear.

Building on Top of LLM APIs: What to Actually Consider

If you're building a novel tool on top of OpenAI, Gemini, or another provider's API, the first document you need is an acceptable use policy - what the tool is for, what it's not for, and what users are agreeing to when they use it.

If your tool is training on proprietary data, understand the compute cost before you commit. Fine-tuning and RAG implementations on large proprietary datasets can run hundreds of thousands of dollars. Free API credits are a starting point, not a cost model. Know your retrieval strategy, your chunking strategy, and your parsing strategy before you build - they all affect both cost and output quality.

When you expose a homegrown tool to paying customers, the security burden shifts significantly. The tool needs access controls, output filtering, and hardened endpoints. The model provider's terms of service offer some indemnification, but contracts only cover you so far. At the end of the day, if your user's data is mishandled, it's your liability - not OpenAI's.

Peter's framing on the legal vs. technical investment tradeoff: you need both. Spend on legal to establish indemnification and user risk acknowledgment. Spend on technical security to actually protect the system. Neither replaces the other.

Prompt Injection: What It Is and Why It's #1 on the OWASP Top 10

Prompt injection is the LLM equivalent of SQL injection - using crafted inputs to overwrite or circumvent the system instructions that govern how a model behaves. When Ryan tried to get Lovable to add sarcastic insults to his dietary mindfulness app by telling the model to "ignore what you've been told," that's a textbook direct prompt injection attempt.

In SQL injection, a malicious input in a form field executes a database command - drop table, extract data - if the backend doesn't sanitize inputs. In prompt injection, a malicious or cleverly crafted user message manipulates the model's system prompt, causing it to behave in ways the developer didn't intend and the model's provider prohibits.

Prompt injection is the #1 attack vector on the OWASP LLM Top 10 - the open standard maintained by the Open Web Application Security Project that catalogs the most critical vulnerabilities in LLM-based applications. Red teamers and pen testers use these attack vectors systematically to evaluate whether a model's guardrails are hardened or porous.

Testing Your Tool Before Market: Red Teams and Blue Teams

Once you've built a tool, you need to know how it holds up before real users try to break it. Peter consults for two companies that handle this:

HackerVerse (CEO: Mariana Padilla, CTO: Craig) - a platform where software is evaluated by real red teamers conducting live pen tests. They examine the underlying model, the APIs, and the access controls in real time, and surface what's hardened versus what needs work. This is the attack side.

Tesseon - provides both the red team (attack) and blue team (defense) functions. After identifying vulnerabilities, Tesseon can implement the guardrails directly - baking in the system instructions and filtering layers that harden the model against the attack vectors that were found. They operate on-prem and via SaaS, with a browser plugin for individuals in development. Pricing is compute-based, tied to token usage and container resource loads.

OWASP also maintains an intentionally insecure LLM that developers can use to practice all 10 attack vectors in a safe environment - useful for founders who want to understand the attack surface before hiring external testers.

Who Should Own AI Governance Inside a Company

As a company scales and builds its own AI tools, the question of who owns AI governance internally becomes urgent - and the answer is almost never "one team." Peter advocates for a shared responsibility model built around a cross-functional AI committee or subcommittee.

Here's how the stakeholders break down:

Security team - Pros: CIA triad expertise (confidentiality, integrity, availability), existing tooling. Cons: risk aversion can slow AI deployment below the business's acceptable pace.
Legal team - Pros: fluent in regulatory landscape, privacy law, contracts. Cons: less technical, can be even more risk averse than security.
Privacy team - Pros: aligned with GDPR, CCPA, and data subject rights. Cons: design requirements may exceed the business's actual risk appetite and slow product decisions.
Engineering / Data Science / CTO office - Pros: deep technical fluency on AI capabilities and limitations. Cons: may lack security, compliance, and business-outcome orientation.

No single team has the full picture. Siloing AI governance in any one of them introduces the blind spots of that team while losing the strengths of the others. A dedicated cross-functional committee cuts through the bureaucratic biases and aligns the organization's AI decisions with both technical reality and business goals.

The Real Cost of Skipping Governance

Security bolted on after the fact is the pattern Peter sees repeatedly - and it's consistently the most expensive version of AI governance there is. Companies move fast to market, discover a vulnerability after launch, and face fines and breach assessments that dwarf what proactive governance would have cost.

The math Peter offers: pay a meaningful amount upfront for proactive security controls. Or pay 10x that - possibly a million, a hundred million, or more - in post-breach fines, remediation, and reputational damage. The ROI calculation isn't complex. The discipline to do it before shipping is the hard part.

His recommendation before any AI investment: calculate the return on the tool or program first. If the business case is sound, the governance investment becomes the cost of doing it safely rather than an optional add-on to be deprioritized.

About Optimal IT

Peter's company, Optimal IT (opto-it.io), serves startups and SMBs under 100 people that are adopting AI tools, working with proprietary data, and building novel internal or external software. The typical engagement starts with a risk assessment to understand the company's current security posture and AI usage patterns, then moves into policy development, security control implementation, and compliance certification.

Certifications Optimal IT can guide companies through: SOC 2 Type 2, HIPAA, HITRUST, NIST, and ISO. Current clients are concentrated in technology and health tech. The company's value proposition is giving smaller companies the governance infrastructure that enterprise companies have in-house - at a scale and cost appropriate for early-stage businesses.

Tools & Resources

Optimal IT - AI governance, compliance, and security consultancy for startups and SMBs; risk assessments, policy writing, security controls, SOC 2/HIPAA/HITRUST/NIST/ISO certification (opto-it.io)
HackerVerse - Red team pen testing platform; evaluates AI tools and APIs against real security testers in real time to surface vulnerabilities before launch
Tesseon - Combined red team (attack) and blue team (guardrail implementation) platform; on-prem and SaaS; compute-based pricing; browser plugin in development
OWASP LLM Top 10 - Open Web Application Security Project's catalog of the 10 most critical vulnerabilities in LLM-based applications; prompt injection is #1; includes a sandbox insecure LLM for practice testing
SOC 2 Type 2 - The most common security certification for SaaS companies; signals to enterprise customers that security controls are in place and have been independently audited
Lovable - Referenced as Ryan's vibe-coding tool; its built-in guardrails prevented generation of harmful content even when the user attempted creative prompt workarounds - illustrating both why guardrails exist and why they can frustrate legitimate edge cases

Key Frameworks from This Episode

Governance Before Procurement: If you can't answer the eight core governance questions - on data handling, model hosting, hallucination tolerance, bias risk, security, copyright, and regulatory compliance - you are not ready to buy or build an AI tool. The governance framework has to exist before the software decision, not after.
Prompt Injection as SQL Injection Analogue: SQL injection exploits unsanitized database inputs to execute unauthorized commands. Prompt injection exploits the LLM's instruction-following behavior to overwrite system prompts and extract unauthorized outputs. Both are input-manipulation attacks on systems that trust their inputs too much. Both are the top vulnerabilities in their respective OWASP categories.
Red Team Then Blue Team: Red team = attack: pen testers attempt to break your AI tool using known attack vectors (OWASP LLM Top 10) to surface vulnerabilities. Blue team = defense: security engineers implement the guardrails and controls that harden the system against those vectors. Both steps are necessary. Skipping red team means you don't know what you're defending against.
Shared Responsibility for AI Governance: No single internal team should own AI governance. Security has the right mindset but excessive risk aversion. Legal has regulatory expertise but limited technical fluency. Privacy ensures compliance but may over-constrain design. Engineering has technical depth but may underweight compliance. A cross-functional AI committee captures all four perspectives and avoids the blindspots of any single team.
Proactive vs. Reactive Security Economics: Proactive governance has a known, bounded cost. Reactive governance - responding to a breach or regulatory action - costs a multiple of what prevention would have. Peter's observed ratio: roughly 10x. For companies that get it badly wrong, the fine alone can end the business. The ROI calculation for security investment is almost always favorable when modeled honestly.
Indemnification as a Starting Point, Not a Solution: Acceptable use policies, terms of service, and liability limitations reduce exposure but don't eliminate it. If your platform mishandles user data, the contract protects you to a point - but the data is still yours to answer for. Legal documentation is necessary but not sufficient. Technical security controls do the work that legalese cannot.

FAQ

I'm a solo founder vibe-coding with OpenAI credits. Do I really need AI governance?

If you're building for personal use only, governance is minimal. The moment you charge a user, collect their data, or expose a tool to the public, you have liability. The eight governance questions Peter outlines scale down to startups - they just cost less to answer at small scale than at enterprise scale. Answer them early and cheaply, or answer them expensively after something goes wrong.

What is prompt injection and how do I defend against it?

Prompt injection is an attack where a user crafts an input that overwrites or circumvents the system instructions you've given the model. Defense mechanisms include: input sanitization and validation before passing to the model, output filtering on the response, using models with built-in instruction hierarchy (system prompt > user prompt), and red team testing specifically targeting your system prompts. Tesseon and HackerVerse both provide this testing as a service.

What's the difference between red team and blue team in AI security?

Red team = offense. These are the testers who attempt to break your system using known attack vectors - prompt injection, jailbreaking, data extraction, evasion techniques. Blue team = defense. These are the engineers who implement the controls and guardrails that harden the system against red team findings. Good security practice runs red team first to find the gaps, then blue team to close them.

How much does it actually cost to train an AI model on proprietary data?

It depends enormously on the data volume and the model architecture, but Peter's warning is that founders routinely underestimate it. Full fine-tuning on large proprietary datasets can run hundreds of thousands of dollars before you see usable output. RAG (retrieval-augmented generation) approaches are often more cost-effective for proprietary data use cases. Get real compute cost estimates before committing to an architecture.

Which team should own AI governance in my company?

No single team should own it exclusively. Peter's recommendation is a cross-functional AI committee or subcommittee that includes representatives from security, legal, privacy, and engineering. Each team has a critical perspective and a critical blind spot. The committee structure captures all four and distributes accountability rather than concentrating it in one team's biases.

When should a startup engage Optimal IT?

The right moment is before you have a problem - ideally when you're building your first AI tool or adopting your first LLM-based software. Optimal IT works with companies under 100 people to run a risk assessment, establish policies and security controls, and guide them through compliance certifications (SOC 2, HIPAA, HITRUST, NIST, ISO). The earlier the engagement, the less retrofitting is required later.

Building AI Tools with Guardrails (Part 2 w/ Peter Holcomb)