Unconfound Labs

Curiosity · Clarity · Connection

Rigor for data that's messy, regulated, and high-stakes.

A Baltimore deep-tech R&D lab applying epidemiology and causal-inference rigor to messy, real-world data, and to the systems that teach people to reason through it. We build privacy-first, agentic-AI engines that turn chaotic, high-stakes data into decision-ready intelligence.

What we're building

Proprietary engines for reasoning under data chaos.

Two product engines share one core, privacy-first agentic AI for messy, real-world data. One is in development; the other is already in pilot.

Untia · In development

A semantic routing engine that untangles work spread across too many parallel threads, directing the right context to the right place.

untia.app →

Innopath · In pilot

A learning program for reasoning inside messy, realistic data. Live now as a founding pilot, building toward a simulation-based engine.

innopathlearning.com →
The technical core

What every engine is built on.

Data Architecture

Relational schemas, multi-system integration, and pipelines built for regulated data, so analytics and AI sit on a foundation that holds.

Data Linkage & Synthetic Data

Reconstructing fragmented records and generating synthetic datasets to develop and stress-test methods safely, without exposing sensitive data.

Causal Inference & Methodological Rigor

Isolating confounding, controlling bias, and validating model inputs: the difference between a correlation that ships and a result that survives scrutiny.
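As a minimal sketch of what "isolating confounding" can mean in practice, the Python below compares a crude odds ratio with a Mantel-Haenszel odds ratio adjusted for one stratifying variable. The counts are invented for illustration, and the `Stratum` class and both function names are hypothetical, not part of any Unconfound Labs engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    """One 2x2 table within a level of the confounder (illustrative counts)."""
    a: int  # exposed, outcome present
    b: int  # exposed, outcome absent
    c: int  # unexposed, outcome present
    d: int  # unexposed, outcome absent


def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s.a for s in strata)
    b = sum(s.b for s in strata)
    c = sum(s.c for s in strata)
    d = sum(s.d for s in strata)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(s.a * s.d / (s.a + s.b + s.c + s.d) for s in strata)
    denominator = sum(s.b * s.c / (s.a + s.b + s.c + s.d) for s in strata)
    return numerator / denominator


# Hypothetical data: the odds ratio is exactly 1 inside each stratum, but the
# exposed group is concentrated in the high-risk stratum.
strata = [
    Stratum(a=10, b=90, c=40, d=360),    # low-risk stratum
    Stratum(a=200, b=200, c=50, d=50),   # high-risk stratum
]

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)
print(f"crude OR: {crude:.2f}, stratum-adjusted OR: {adjusted:.2f}")
```

With these made-up counts the crude odds ratio is about 3.3 while the adjusted estimate is 1.0: the apparent association comes entirely from how exposure is distributed across strata.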

Privacy-First & Agentic AI

Local LLM orchestration and agentic workflows that run behind internal data firewalls. Advanced automation on open-weights models, without the data-leakage and compliance risk of sending records to externally hosted APIs.

How we work
One method runs through all of it: take a tangled, confounded problem and make it legible, without flattening what makes it real.

That means starting from a world model, not a quick correlation: mapping what drives the data before trusting what it appears to say. It's why the work holds up under audit, regulatory review, and the kind of messy, high-stakes data that breaks generic AI tools.

Notes

Methods, linkage, and the messy data behind them.

Working notes from our own builds, across public-health, pharma, and learning data. The build log is live; the methods write-ups take the time they need.

Public Health · Record Linkage
Probabilistic linkage of related records in incomplete data
Forthcoming

Education · Learning Analytics
Heterogeneous learner profiles in national survey data
Forthcoming

Pharmacoepidemiology · Signal Detection
Separating signal from reporting artifact in FAERS
Forthcoming

Lab Notes · Build log · Live
How we named the submersible
June 10, 2026
Browse all notes
Methodological grounding

Hands-on experience with record linkage and confounding control on messy, real-world data: administrative and adverse-event records in public-health and pharma/biotech settings. That includes fragmented records, maternal-infant linkages that don't resolve cleanly, and systems too outdated for probabilistic matching. The result is working knowledge of where data integration actually breaks, which is the exact friction an agentic approach targets.
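For readers unfamiliar with probabilistic matching, here is a minimal sketch of Fellegi-Sunter style match weights in Python. It is a textbook illustration, not the Labs' method: the field names, the m and u probabilities, the example records, and the choice to skip missing fields are all assumptions made for the example.

```python
import math

# Hypothetical parameters per field:
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "zip_code": (0.90, 0.05),
}


def field_weight(m, u, agrees):
    """Log2 likelihood ratio contributed by one compared field."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum field weights for a record pair.

    Assumption: a field missing on either side contributes no evidence
    (weight 0), one common convention for incomplete data.
    """
    total = 0.0
    for field, (m, u) in params.items():
        value_a, value_b = rec_a.get(field), rec_b.get(field)
        if value_a is None or value_b is None:
            continue
        total += field_weight(m, u, value_a == value_b)
    return total


# Invented records for illustration only.
record_a = {"surname": "rivera", "birth_date": "1990-04-12", "zip_code": "21201"}
record_b = {"surname": "rivera", "birth_date": "1990-04-12", "zip_code": None}
record_c = {"surname": "rivers", "birth_date": "1985-11-02", "zip_code": "21201"}

print(f"a vs b: {match_weight(record_a, record_b):.2f}")
print(f"a vs c: {match_weight(record_a, record_c):.2f}")
```

Under these assumed parameters the first pair scores well above zero despite the missing ZIP code, while the second pair scores below zero even though the ZIP codes agree. In a real pipeline the m and u values would be estimated from data, and thresholds for match, non-match, and clerical review would be set separately.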

Grounded in doctoral epidemiology training (Johns Hopkins), paired with a modern data science and architecture stack: Python, SQL, AWS, agentic-AI tooling (Claude Code, MCP), and data visualization (Tableau, Flourish, Streamlit).

Applied to learning systems

The same instinct shapes how the Labs builds learning systems: investigative environments on realistic, messy data, where answers aren't clean and learners reason under uncertainty the way analysts do.

Explore the learning systems work