Unknown Behavior

Independent Research Organization

Researching how AI systems fail, adapt, and exceed their intended boundaries.

Unknown Behavior studies AI behavior under uncertainty. The work is healthcare-focused and includes cross-industry runtime control infrastructure for autonomous AI in any high-stakes domain.

Case studies

Operational engagements.

Recent consulting work in healthcare operations and data quality.

Digitizing intake at a primary-care clinic in Lima.

Pulled scheduling and patient-flow data, identified intake as the throughput constraint, and redesigned the workflow into a digital pre-arrival system. Wait time down 50 percent, no-show rate down 40 percent, schedule density restored. The hidden engineering was patient-ID reconciliation across Excel and the Peruvian double-surname system, which had to be done before any throughput analysis could run.

Reconciling 500K+ transactions across a multi-vendor integration.

Diagnosed cross-system breaks with SQL, built per-batch validation checks at every join point, and translated the findings into a single Tableau view designed for leadership to read. Mismatches down 25 percent, manual reporting effort down 40 percent. Breaks are now caught at ingest instead of weeks later.
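
A per-batch check at a join point can be as small as comparing the keys on each side before the join runs. The engagement used SQL; the sketch below shows the same idea in plain Python, and the function name, the row shape, and the `txn_id` key are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def check_join(left_rows: Iterable[dict], right_rows: Iterable[dict], key: str) -> Dict[str, List]:
    """Report keys that appear on only one side of a join (hypothetical helper)."""
    left_keys = {row[key] for row in left_rows}
    right_keys = {row[key] for row in right_rows}
    return {
        "missing_in_right": sorted(left_keys - right_keys),
        "missing_in_left": sorted(right_keys - left_keys),
    }


# Example batch: one transaction never arrived in the downstream system.
vendor_batch = [{"txn_id": 1}, {"txn_id": 2}, {"txn_id": 3}]
ledger_batch = [{"txn_id": 1}, {"txn_id": 3}]
report = check_join(vendor_batch, ledger_batch, "txn_id")
```

Running a check like this at ingest means an orphaned key is flagged on the batch that introduced it.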

Projects

Applied research and infrastructure.

Live demos and source for each project. Healthcare AI applications and the control layer beneath them.

Symptom Triage

Pre-visit triage from a sentence and an optional photo. Plain language in, schema-validated structure out: body systems involved, ranked possible causes, red flags, and the questions a clinician is likely to ask. v1 was a LoRA fine-tune of Qwen2.5-1.5B. v2 uses multimodal Claude Sonnet 4.6 with vision input. Both versions share the same schema, so hallucination containment stays structural rather than dependent on training accuracy.
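
Structural containment can be pictured as a strict parser that rejects any model output not matching the schema. This is a minimal sketch, not the project's actual schema: the four field names are assumptions derived from the outputs listed above.

```python
import json

# Hypothetical field names mirroring the four outputs described above.
REQUIRED_FIELDS = ("body_systems", "possible_causes", "red_flags", "clinician_questions")


def parse_triage(raw: str) -> dict:
    """Parse model output and raise ValueError unless it matches the schema exactly."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    extra = set(data) - set(REQUIRED_FIELDS)
    missing = set(REQUIRED_FIELDS) - set(data)
    if extra or missing:
        raise ValueError(f"schema mismatch: extra={sorted(extra)} missing={sorted(missing)}")
    for name in REQUIRED_FIELDS:
        value = data[name]
        if not isinstance(value, list) or not all(isinstance(item, str) for item in value):
            raise ValueError(f"{name} must be a list of strings")
    return data
```

Because the check sits outside the model, the same parser can guard either model version.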

Care Gap Engine

Population-health outreach prioritization for primary-care teams on HEDIS or value-based-care contracts. A synthetic 1,000-patient panel, USPSTF and HEDIS rule application, and a three-component prioritization scorer (clinical urgency, response likelihood, equity priority). Personalized outreach drafted with Claude using prompt caching, so per-message cost stays roughly flat across hundreds of drafts.
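
One plausible shape for a three-component scorer is a weighted sum over normalized inputs. The weights, the field names, and the 0-to-1 scale below are assumptions for illustration, not the engine's actual parameters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patient:
    patient_id: str
    clinical_urgency: float     # assumed to be scaled 0..1
    response_likelihood: float  # assumed to be scaled 0..1
    equity_priority: float      # assumed to be scaled 0..1


# Hypothetical weights for urgency, response likelihood, and equity priority.
WEIGHTS = (0.5, 0.3, 0.2)


def priority_score(p: Patient) -> float:
    """Weighted sum of the three components."""
    w_urgency, w_response, w_equity = WEIGHTS
    return (w_urgency * p.clinical_urgency
            + w_response * p.response_likelihood
            + w_equity * p.equity_priority)


def rank_panel(panel: List[Patient]) -> List[Patient]:
    """Order the panel from highest to lowest outreach priority."""
    return sorted(panel, key=priority_score, reverse=True)
```

Keeping the components separate until the final sum makes it straightforward to show a care team why a patient ranked where they did.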

Sentra

A runtime execution control layer for autonomous AI agents. Sentra sits between agent decision-making and tool execution. Every proposed action is evaluated against deterministic policy rules, cumulative risk is tracked across a session, and a three-strike rule shuts down agents that drift before they cause harm. Model-agnostic by design. Cross-industry: claims processing, customer communications, internal tooling, developer agents, clinical decision support.
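
The control loop described above can be sketched in a few lines. This is an illustrative model only, not Sentra's implementation: the blocked-tool rule, the risk budget, and every name below are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Action:
    tool: str
    risk: float  # assumed per-action risk estimate


@dataclass
class Session:
    risk_budget: float = 1.0  # hypothetical cap on cumulative session risk
    max_strikes: int = 3
    cumulative_risk: float = 0.0
    strikes: int = 0
    halted: bool = False


# Hypothetical deterministic policy: tools the agent may never call.
BLOCKED_TOOLS = {"delete_records"}


def evaluate(session: Session, action: Action) -> str:
    """Return 'allow', 'deny', or 'halted' for a proposed action."""
    if session.halted:
        return "halted"
    over_budget = session.cumulative_risk + action.risk > session.risk_budget
    if action.tool in BLOCKED_TOOLS or over_budget:
        session.strikes += 1
        if session.strikes >= session.max_strikes:
            session.halted = True
            return "halted"
        return "deny"
    # Only allowed actions consume the session's risk budget.
    session.cumulative_risk += action.risk
    return "allow"
```

The check depends only on the proposed action and the session state, which is what keeps a layer like this independent of the model that proposed the action.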

Cortex

Dual-LLM verification. One model runs as a worker, another stress-tests its output against user-defined rules before acceptance. Failures return as structured feedback so the worker can revise; persistent failures are blocked rather than silently passed downstream. A three-strike shutdown spawns a replacement agent with memory of why the last one failed.
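
The worker and verifier exchange can be sketched as a bounded retry loop. Both models are represented here by plain callables standing in for LLM calls, and all names are assumptions; the replacement-agent step is left out of this sketch.

```python
from typing import Callable, List, Optional

# Stand-ins for the two models: the worker takes the task plus prior feedback,
# the verifier returns a list of rule failures (empty means accepted).
Worker = Callable[[str, List[str]], str]
Verifier = Callable[[str], List[str]]


def run_verified(task: str, worker: Worker, verifier: Verifier,
                 max_strikes: int = 3) -> Optional[str]:
    """Return accepted output, or None if the worker fails max_strikes times."""
    feedback: List[str] = []
    for _ in range(max_strikes):
        output = worker(task, feedback)
        failures = verifier(output)
        if not failures:
            return output
        feedback = failures  # structured feedback for the next attempt
    return None  # blocked rather than passed downstream
```

Returning `None` on persistent failure forces the caller to handle the blocked case explicitly instead of receiving unverified output.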

Practice

What I study, and why it matters in clinical settings.

I study how AI and decision systems behave under uncertainty: where automation drifts past its mandate, where incentives distort outcomes, where a model that looked reliable in evaluation produces unsafe actions in production.

My current focus is healthcare AI (clinical triage, population health prioritization, decision support), and the runtime control infrastructure required to deploy autonomous agents in any high-stakes domain.

Founded 2022.

Contact

Get in touch.

For collaborations, research conversations, and product engagements.