Selected work

Eleven years.
Fortune 100 + government.
Production at scale.

Senior data & AI engagements at United Airlines, Cisco Systems, Spotify, UHG Optum, Verizon, Wellstar, and the State of Connecticut. Plus three independent IP builds — DocSensei, ORCA, and Setup Score.

Engagements

Where I have done
the heavy lifting.

2023 — Now

Principal Scientist, Analytics, AI and Data

Lead enterprise AI, data, and analytics initiatives across 100+ state agencies, delivering measurable impact for 3.7 million citizens. Define the data strategy, ML roadmap, and governance posture that informs senior leadership and drives agency-level operating decisions. Stand up an enterprise LLM platform for document triage, policy lookup, and citizen-facing assistance.

State of Connecticut, DAS · Hartford, CT

2015 — 2023

Managing Partner & Principal AI / ML Engineer

Founded and led a specialized AI / ML consulting practice. Embedded senior data scientists and applied ML engineers inside Fortune 100 enterprises to deliver production ML, NLP, anomaly detection, fraud detection, content moderation, recommender systems, and enterprise analytics. Selected client engagements include United Airlines, Cisco Systems, Spotify, UHG Optum, Verizon, and Wellstar Health System.

Solution Cabin LLC · Atlanta, GA

2022 — 2023

Senior Data Scientist (embedded engagement)

Personalization and Marketplace teams. Designed A/B and switchback tests for podcast and audiobook discovery, with power analysis under heavy-tailed engagement distributions. Built feature pipelines in PySpark on Databricks ingesting hundreds of millions of daily events. Prototyped an embedding-based content similarity service for cold-start podcast recommendations on FastAPI + FAISS.

Spotify · New York, NY

2021 — 2022

Senior Data Scientist (embedded engagement)

Network Assurance group. Unsupervised anomaly detection on streaming router telemetry using isolation forests and seasonal-hybrid ESD over Kafka topics. Designed a Bayesian change-point layer that reduced false-positive alert volume by an estimated 40% in a pilot. Built a capacity-planning forecasting service combining Prophet and LightGBM, deployed via SageMaker behind an internal gRPC API.

Cisco Systems · San Jose, CA

2018 — 2021

Senior Data Scientist (embedded engagement)

Advanced analytics team supporting fleet operations and MRO. Built predictive maintenance models for line-replaceable units across narrow- and wide-body fleets, informing parts provisioning and turnaround scheduling at major hubs. Designed a digital-twin-style supply chain simulation in SimPy and PySpark, plus a Gurobi integer-programming model for crew and gate optimization under irregular operations.

United Airlines · Chicago, IL

Selected independent builds

Three production systems,
built end-to-end.

DocSensei · 2023→

Document intelligence and RAG platform

End-to-end RAG system for high-volume document workflows in insurance and legal. Hybrid retrieval (BM25 + dense embeddings), cross-encoder re-ranking, page-faithful citations, and a hallucination guardrail scoring each generated span against retrieved evidence. Architecture is directly applicable to commercial real estate loan files: rent rolls, operating statements, leases, intercreditor agreements, and trustee reports. Currently in beta at a U.S. insurance brokerage under a perpetual source-code license.

RAG · Hybrid retrieval · Real estate · Licensed IP

ORCA · 2024

Asset-level risk and liability modeling

Asset-level liability and exposure model for a national operator. Combined satellite-derived signals, regulatory schedules, and operator-submitted data into a single risk index. Quantified expected exposure under alternative regulatory regimes and surfaced under-reporting patterns. Geospatial layer integrates raster signals with operator records using PostGIS and Tableau map layers.

Geospatial · PostGIS · Risk modeling · Regulatory

Setup Score · 2022→

Quantitative equity and options signal research

Multi-factor signal generator (Setup Score v6 and v7) blending momentum, volatility regime, options flow, and earnings drift. Pine Script and Python research stack with live tracking across selected positions. Hands-on financial-markets and instrument-level research that informs how I deploy personal trading capital.

Quant · Pine Script · Options flow · Signal research

Private repositories. Demos available on request.

Capabilities

Stack and discipline.

Modeling

Statistical and ML modeling

Risk and propensity models, time series forecasting, survival analysis, NLP classification, structured prediction, and experimentation. Python first, with R where it earns its place.

scikit-learn · XGBoost · PyTorch

LLM systems

Applied LLM and agent systems

Production RAG, structured extraction, evaluation harnesses, guardrails, multi-agent orchestration, and cost and latency observability. Vendor-agnostic, biased toward what survives in production.

Anthropic · OpenAI · Self-hosted

Data platform

Modern data platforms

Ingestion, transformation, semantic layer, governance, and the dashboard surface. Built around dbt, Airflow, Snowflake, BigQuery, and Postgres, with a clear contract layer between them.

dbt · Snowflake · Airflow

Engineering

Backend and platform

FastAPI, SQLAlchemy, Celery, Redis, Postgres, Docker. Patterns I have shipped to production at scale, with the operational telemetry that lets a team sleep through the weekend.

FastAPI · Docker · CI/CD

Healthcare

Clinical and regulated AI

HIPAA controls, BAA scoping, PHI handling, audit logs, and the documentation a compliance officer will actually sign off on. Predictive readmission and care management models.

HIPAA · FHIR · CMS data

Leadership

Function design and hiring

Hiring rubrics, interview loops, IC ladders, performance frameworks, and the operating cadence that gets a team to predictable shipping inside ninety days.

Hiring · Org design · Coaching

A working note

I take on a small number of engagements a year. Fit matters more than fee.