How is this different from NPS or CSAT?
- NPS and CSAT measure what users say. Behavior Score measures what users do. If someone rates your AI 10/10 but never returns, NPS says great and Behavior Score says the relationship is dying. The gap is usually where the truth lives.
Can I A/B test against Behavior Score?
- Yes. Every Humane experiment reports per-variant Behavior Score with statistical significance. Growth and Scale tiers include this in the dashboard; Community users can pull the raw per-variant metrics via the SDK.
How often is it calculated?
- The dashboard score refreshes hourly. The API endpoint computes on demand over a configurable window (default 7 days). Trends are smoothed across rolling windows to avoid noise from individual outliers.
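To make the rolling-window smoothing concrete, here is a minimal sketch. It assumes a simple trailing mean over daily scores; the FAQ does not say which smoothing method Humane uses, and the function name and `window` parameter are illustrative, not part of the SDK.

```python
from collections import deque
from typing import Iterable, List


def rolling_mean(scores: Iterable[float], window: int = 7) -> List[float]:
    """Trailing moving average (an assumed smoothing method, not Humane's).

    Each output value is the mean of the current score and up to
    `window - 1` scores before it, so a single outlier is diluted.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    buf = deque(maxlen=window)  # holds the scores currently in the window
    total = 0.0                 # running sum of the scores in `buf`
    smoothed = []
    for score in scores:
        if len(buf) == window:
            total -= buf[0]     # this value is evicted by the append below
        buf.append(score)
        total += score
        smoothed.append(total / len(buf))
    return smoothed


# One spike of 90 among steady 60s is pulled back toward the baseline.
print(rolling_mean([60, 60, 90, 60], window=2))
```

A mean only dilutes an outlier; a rolling median would ignore it entirely, at the cost of reacting more slowly to real shifts.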
Why a weighted sum instead of a neural net?
- Because investors, CTOs, and compliance officers need to audit the number. A weighted sum is transparent; a neural net on five inputs is theatre. The weights were set by reviewing which components correlated most strongly with 30-day retention across the pilot cohort.
What if my app has no proactive messages?
- The proactive_response_rate component contributes zero, so your ceiling drops from 100 to 80. Most Humane customers turn on proactive messages within the first two weeks once they see how much headroom that 20% contribution represents.
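The ceiling arithmetic follows directly from the weighted sum. The sketch below takes two things from this FAQ: there are five inputs, and proactive_response_rate carries 20% of the score. Everything else is an assumption for illustration: the other four component names are placeholders, their equal 20-point weights are invented, and each component is assumed to be normalized to the range 0 to 1.

```python
from typing import Dict

# Weights in score points, summing to 100.
WEIGHTS: Dict[str, int] = {
    "proactive_response_rate": 20,  # from the FAQ: 20% contribution
    "component_a": 20,              # hypothetical name and weight
    "component_b": 20,              # hypothetical name and weight
    "component_c": 20,              # hypothetical name and weight
    "component_d": 20,              # hypothetical name and weight
}


def behavior_score(components: Dict[str, float]) -> float:
    """Weighted sum of normalized components, on a 0 to 100 scale.

    A component missing from `components` contributes zero, which is
    how an app without proactive messages ends up capped at 80.
    """
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = components.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        score += weight * value
    return score


# A perfect app with no proactive messages tops out at the reduced ceiling.
perfect_without_proactive = {
    name: 1.0 for name in WEIGHTS if name != "proactive_response_rate"
}
print(behavior_score(perfect_without_proactive))
```

Because every term is a visible weight times a visible input, anyone auditing the number can recompute it by hand.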
My app is brand new — what do I see?
- The score publishes once you have at least 10 end users, 100 messages, and 14 days of history. Before that, the dashboard shows a progress bar with what's left until the metric becomes trustworthy. We'd rather show an honest 'not yet' than flatter you with thin data.
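The publishing rule above amounts to three minimums that must all be met. A minimal sketch of that check follows; the thresholds come from the answer above, while the function names and return shape are hypothetical and do not describe the dashboard's actual implementation.

```python
from typing import Dict

# Thresholds stated in the FAQ.
MIN_END_USERS = 10
MIN_MESSAGES = 100
MIN_HISTORY_DAYS = 14


def remaining_until_publish(end_users: int, messages: int, history_days: int) -> Dict[str, int]:
    """Return how much of each requirement is still missing (0 means met)."""
    return {
        "end_users": max(0, MIN_END_USERS - end_users),
        "messages": max(0, MIN_MESSAGES - messages),
        "history_days": max(0, MIN_HISTORY_DAYS - history_days),
    }


def is_publishable(end_users: int, messages: int, history_days: int) -> bool:
    """The score publishes only when every requirement is met."""
    remaining = remaining_until_publish(end_users, messages, history_days)
    return all(gap == 0 for gap in remaining.values())


# Plenty of messages and history, but only 4 end users so far.
print(remaining_until_publish(end_users=4, messages=250, history_days=14))
print(is_publishable(end_users=4, messages=250, history_days=14))
```

Returning the per-requirement gap rather than a bare boolean is what would let a progress bar show exactly what is left.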