Unconfound Labs · A Studio in Motion
Summer 2026

Curiosity · Clarity · Connection

Rigor for data that's messy, regulated, and high-stakes.

A Baltimore studio applying epidemiology and causal-inference rigor to messy, real-world data, and to the systems that teach people how to reason through it. We design data architecture, build data-linkage and synthetic-data pipelines, and engineer privacy-first, agentic-AI workflows that turn chaotic data into authoritative, decision-ready intelligence.

What we do

Data Architecture & Enterprise Modeling

End-to-end blueprints, relational schemas, and multi-system integrations built for regulated environments, so analytics and AI sit on a foundation that holds.

Data Linkage & Synthetic Data

Reconstructing fragmented records and generating synthetic datasets to develop and stress-test methods safely, without exposing sensitive data.

Causal Inference & Methodological Rigor

Identifying and adjusting for confounding, controlling bias, and validating model inputs: the difference between a correlation that ships and a result that survives scrutiny.

Privacy-First & Agentic AI

Local LLM orchestration and agentic workflows that run behind internal data firewalls: advanced automation on open-weights models, without the data-leakage and compliance risk of sending records to externally hosted APIs.

How we work

One method runs through all of it: take a tangled, confounded problem and make it legible, without flattening what makes it real.

That means starting from a world model, not a quick correlation: mapping what drives the data before trusting what it appears to say. It's why the work holds up under audit, regulatory review, and the kind of messy, high-stakes data that breaks generic AI tools.

Technical Notes

Methods, linkage, and the messy data behind them.

Working notes from the studio's own builds, across public-health, pharma, and learning data. First entries publishing soon.

Public Health · Record Linkage
Probabilistic linkage of related records in incomplete data
Forthcoming

Education · Learning Analytics
Heterogeneous learner profiles in national survey data
Forthcoming

Pharmacoepidemiology · Signal Detection
Separating signal from reporting artifact in FAERS
Forthcoming

Methodological grounding

Hands-on experience with record linkage and confounding control on messy, real-world data, including administrative and adverse-event records in public-health and pharma/biotech settings. Fragmented records, maternal–infant linkages that don't resolve cleanly, and legacy systems too outdated to support probabilistic matching: this is working knowledge of where data integration actually breaks, and of the exact friction an agentic approach targets.

Grounded in doctoral epidemiology training (Johns Hopkins), paired with a modern data science and architecture stack: Python, SQL, AWS, agentic-AI tooling (Claude Code, MCP), and data visualization (Tableau, Flourish, Streamlit).

Applied to learning systems

The same instinct shapes how the Labs builds learning systems: investigative environments on realistic, messy data, where answers aren't clean and learners reason under uncertainty the way analysts do.

Explore the learning systems work