Curiosity · Clarity · Connection
A Baltimore deep-tech R&D lab applying epidemiology and causal-inference rigor to messy, real-world data, and to the systems that teach people to reason through it. We build privacy-first, agentic-AI engines that turn chaotic, high-stakes data into decision-ready intelligence.
Two product engines run on the same core, privacy-first agentic AI for messy, real-world data. One is in development; the other is already in pilot.
A semantic routing engine that untangles work spread across too many parallel threads, directing the right context to the right place.
untia.app →
A learning program for reasoning inside messy, realistic data. Live now as a founding pilot, building toward a simulation-based engine.
innopathlearning.com →
Relational schemas, multi-system integration, and pipelines built for regulated data, so analytics and AI sit on a foundation that holds.
Reconstructing fragmented records and generating synthetic datasets to develop and stress-test methods safely, without exposing sensitive data.
Isolating confounding, controlling bias, and validating model inputs: the difference between a correlation that ships and a result that survives scrutiny.
Local LLM orchestration and agentic workflows that run behind internal data firewalls. Advanced automation on open-weights models, without the leakage and compliance risk of sending data to external web APIs.
One method runs through all of it: take a tangled, confounded problem and make it legible, without flattening what makes it real.
That means starting from a world model, not a quick correlation: mapping what drives the data before trusting what it appears to say. It's why the work holds up under audit, regulatory review, and the kind of messy, high-stakes data that breaks generic AI tools.
Working notes from our own builds, across public-health, pharma, and learning data. The build log is live; the methods write-ups take the time they need.
Hands-on experience with record linkage and confounding control on messy, real-world data: administrative and adverse-event records in public-health and pharma/biotech settings. Fragmented records, maternal-infant linkages that don't resolve cleanly, and systems too outdated for probabilistic matching have given us working knowledge of where data integration actually breaks, and of the exact friction an agentic approach targets.
Grounded in doctoral epidemiology training (Johns Hopkins), paired with a modern data science and architecture stack: Python, SQL, AWS, agentic-AI tooling (Claude Code, MCP), and data visualization (Tableau, Flourish, Streamlit).
The same instinct shapes how the Labs builds learning systems: investigative environments on realistic, messy data, where answers aren't clean and learners reason under uncertainty the way analysts do.
Explore the learning systems work