Concepts

A/B Experiments

Humane ships a deterministic A/B routing layer that overlays engine config per-variant and produces real statistical output — two-proportion z-test with p-value and a plain-English conclusion. Not decorative.

Create an experiment

```bash
curl -X POST http://localhost:8000/api/experiments \
  -H "X-API-Key: hx_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Empathy Lift for Postoperative Cohort",
    "variant_a_name": "Standard (50%)",
    "variant_b_name": "High Empathy (95%)",
    "variant_a_config": { "empathy_weighting": 50 },
    "variant_b_config": { "empathy_weighting": 95 }
  }'
```

Deterministic routing

Every end_user hashes to variant a or b via SHA-256(experiment_id + end_user_id). This means:

  • The same user always gets the same variant — no sticky-cookie layer needed.
  • 50/50 traffic split is guaranteed without a coordinator.
  • Assignment is recorded in experiment_assignments and surfaced in the SDK response.
```json
"experiments": [
  {
    "experiment_id": "ae31...7f",
    "experiment_name": "Empathy Lift for Postoperative Cohort",
    "variant": "b",
    "variant_name": "High Empathy (95%)",
    "config_overlay": { "empathy_weighting": 95 }
  }
]
```
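In code terms, assignment is just a stable hash plus a parity check. A minimal sketch — the concatenation order and bucketing rule here are assumptions for illustration, not the exact production implementation, but any stable hash-to-parity scheme yields the same properties (sticky assignment without cookies, an even split without a coordinator):

```python
import hashlib

def assign_variant(experiment_id: str, end_user_id: str) -> str:
    """Deterministic 50/50 assignment via SHA-256(experiment_id + end_user_id).

    Assumed details: simple string concatenation and last-byte parity
    for bucketing. The real routing layer may differ in these specifics.
    """
    digest = hashlib.sha256((experiment_id + end_user_id).encode("utf-8")).digest()
    # Even last byte -> variant "a", odd -> variant "b".
    return "a" if digest[-1] % 2 == 0 else "b"

# The same user always lands in the same bucket:
assert assign_variant("ae31...7f", "user-123") == assign_variant("ae31...7f", "user-123")
```

Because SHA-256 output is uniformly distributed, the parity of any byte splits traffic evenly across a large user population with no shared state.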

Config overlay

Variant configs override engine config for that single call — only whitelisted knobs are honored (empathy_weighting, formality_index, output_sharpness, base_delay, max_jitter, response_filter).
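A minimal sketch of that overlay behavior — the function and constant names are hypothetical, but the whitelist matches the knobs listed above:

```python
# Hypothetical names for illustration; only these knobs are honored.
ALLOWED_KNOBS = {
    "empathy_weighting", "formality_index", "output_sharpness",
    "base_delay", "max_jitter", "response_filter",
}

def apply_overlay(engine_config: dict, variant_config: dict) -> dict:
    """Overlay a variant's config for a single call, dropping non-whitelisted keys."""
    overlay = {k: v for k, v in variant_config.items() if k in ALLOWED_KNOBS}
    return {**engine_config, **overlay}

merged = apply_overlay(
    {"empathy_weighting": 50, "formality_index": 30},
    {"empathy_weighting": 95, "not_a_knob": True},  # unknown key is ignored
)
# merged == {"empathy_weighting": 95, "formality_index": 30}
```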

Ordering matters
Experiments run after the tenant's engine config but before any user-authored policy. So your policies can override an active experiment when needed, e.g. forcing high empathy regardless of variant when a crisis signal fires.
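That layering can be pictured as three merges where later layers win (names hypothetical; a sketch of the precedence, not the actual resolver):

```python
def effective_config(engine: dict, experiment_overlay: dict, policy_overrides: dict) -> dict:
    """Precedence sketch: engine config < experiment variant < user-authored policy."""
    return {**engine, **experiment_overlay, **policy_overrides}

cfg = effective_config(
    {"empathy_weighting": 50},   # tenant engine config
    {"empathy_weighting": 95},   # active experiment, variant B overlay
    {"empathy_weighting": 100},  # crisis policy forces high empathy
)
# cfg["empathy_weighting"] == 100: the policy wins over the active experiment
```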

Significance testing

Ask the platform for cohort analytics on a running experiment:

```bash
curl http://localhost:8000/api/cohort/experiments/ae31...7f \
  -H "Authorization: Bearer $TOKEN"
```
```json
{
  "experiment": { "id": "ae31...", "name": "...", "status": "running" },
  "variant_a": {
    "enrolled": 184, "avg_mood": 54.2,
    "retained_14d": 98, "retention_pct": 53.3,
    "avg_interactions": 7.1
  },
  "variant_b": {
    "enrolled": 191, "avg_mood": 61.8,
    "retained_14d": 131, "retention_pct": 68.6,
    "avg_interactions": 9.4
  },
  "lift_points": 15.3,
  "z": 3.11,
  "p_value": 0.0019,
  "conclusion": "Variant B shows a 15.3-point retention difference (p<0.01, strong evidence)."
}
```
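The z and p_value fields come from a standard two-proportion z-test on the retention counts. A sketch using the pooled standard error — exact figures can differ slightly from a given response depending on pooled vs. unpooled SE and any continuity correction:

```python
import math

def two_proportion_ztest(successes_a: int, n_a: int,
                         successes_b: int, n_b: int) -> tuple[float, float]:
    """Two-sided two-proportion z-test with pooled standard error.

    A textbook formulation, not necessarily the platform's exact code path.
    """
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Retention counts from the example response above:
z, p = two_proportion_ztest(98, 184, 131, 191)
# z ≈ 3.04, p ≈ 0.002 — strong evidence the retention difference is real
```

At p < 0.01 the split is far outside what chance alone would produce, which is what the plain-English conclusion string summarizes.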

Dashboard visualization

The cohort analysis table is also rendered in the Ambuja reference client at the bottom of the dashboard — monthly admission cohorts with retention curves, color-coded by outcome. Clinicians read it without a statistics degree.