Concepts

Relational Memory

Raw LLMs are memoryless. They greet your patient on session 47 the same way they greet a stranger on session 1. Humane stores a semantic, per-end-user memory plus a small set of relational state variables that persist across sessions, so the AI knows who it's talking to and how it stands with them.

MemPalace

MemPalace is the tenant-scoped vector store (Chroma under the hood today; swappable) keyed by (user_id, end_user_id). Every user message + AI response is embedded and stored with metadata: channel, detected emotions, intent, timestamp.

What gets stored

Each interaction writes one memory record:

json
{
  "text": "User: I'm worried about the surgery tomorrow.\nAssistant: Let's talk through what's on your mind...",
  "similarity": 0.0,
  "timestamp": "2026-04-17T14:22:10Z",
  "end_user_id": "patient_42",
  "channel": "ios_app",
  "metadata": {
    "detected_emotions": ["anxiety"],
    "intent": "emotional_support"
  }
}

How retrieval works

On every process() call, the incoming message is embedded and the top-K relevant memories (default K = 3, similarity threshold 0.3) are stitched into the LLM system prompt as context. They're echoed back in the response:

json
"memory": {
  "relevant_memories": [
    { "text": "User said their surgery is tomorrow morning...",
      "similarity": 0.78, "timestamp": "..." },
    { "text": "User mentioned they struggle with medical anxiety...",
      "similarity": 0.61, "timestamp": "..." }
  ],
  "total_stored": 47,
  "interaction_count": 12,
  "known_since": "2025-12-03T09:14:00Z",
  "last_seen": "2026-04-16T18:41:32Z"
}
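
The retrieval step can be sketched in a few lines. This is a minimal illustration, not the MemPalace implementation: it assumes cosine similarity over in-memory vectors and an inclusive threshold, neither of which is stated above. The function names and the "embedding" field are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_memories(query_vec, records, top_k=3, threshold=0.3):
    """Return up to top_k records whose similarity to query_vec meets the threshold.

    Each record is a dict with "text", "timestamp" and a hypothetical
    "embedding" field. Defaults mirror the documented ones (3 and 0.3).
    """
    scored = []
    for rec in records:
        sim = cosine_similarity(query_vec, rec["embedding"])
        if sim >= threshold:  # assumption: the threshold is inclusive
            scored.append((sim, rec))
    # Most similar first, then keep only the top K.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"text": rec["text"], "similarity": round(sim, 2), "timestamp": rec["timestamp"]}
        for sim, rec in scored[:top_k]
    ]
```

Records below the threshold are dropped before ranking, so a call can return fewer than K memories, or none at all for a brand-new end-user.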

Relational state (Table 14)

Alongside the vector store, every (user_id, end_user_id) pair carries four scalar variables. They're what make follow-on conversations feel like the agent remembers you, not just your messages.

Variable | Range | Dynamics
--- | --- | ---
trust | 0–1 | Monotonic; +0.01 per interaction up to a 0.95 ceiling. No time decay. Lost on broken-promise events via the policy engine (−0.10 / −0.05 / −0.02 by prior trust tier).
sentiment | 0–1 (0.5 neutral) | EMA via blend(); each interaction nudges by mood_delta × 0.7. Half-life 6h back toward 0.5.
grudge | 0–1 | Accumulates on negative signals: +0.05 frustration, +0.10 anger, +0.15 anger + shouting, +0.20 crisis flag. Half-life 12h, so grudges persist twice as long as sentiment swings.
familiarity | 0–1 | Monotonic; +0.005 per interaction. No decay. Drives the ContextBuilder's relation_bias = trust × familiarity.
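
The dynamics in the table can be sketched as a small state object. This is an illustration under stated assumptions, not the production code: the class and method names, the signal keys, the starting values, and the order of operations (decay first, then the interaction update) are all hypothetical. The sentiment nudge is modelled as a plain additive step rather than the actual blend() call, and the broken-promise trust penalties are omitted because the trust tiers are not defined here.

```python
from dataclasses import dataclass

# Grudge increments from the table; the dictionary keys are hypothetical labels.
GRUDGE_INCREMENTS = {
    "frustration": 0.05,
    "anger": 0.10,
    "anger_shouting": 0.15,
    "crisis": 0.20,
}

def clamp(x, lo=0.0, hi=1.0):
    return max(lo, min(hi, x))

def decay_toward(value, target, hours, half_life_hours):
    """Exponential decay: the gap to target halves every half_life_hours."""
    return target + (value - target) * 0.5 ** (hours / half_life_hours)

@dataclass
class RelationalState:
    trust: float = 0.0        # assumed starting value
    sentiment: float = 0.5    # neutral
    grudge: float = 0.0       # assumed starting value
    familiarity: float = 0.0  # assumed starting value

    @property
    def relation_bias(self):
        return self.trust * self.familiarity

    def update(self, hours_since_last, mood_delta=0.0, signal=None):
        # 1. Time decay since the previous interaction.
        self.sentiment = decay_toward(self.sentiment, 0.5, hours_since_last, 6.0)
        self.grudge = decay_toward(self.grudge, 0.0, hours_since_last, 12.0)
        # 2. Per-interaction updates.
        self.trust = min(0.95, self.trust + 0.01)
        self.familiarity = clamp(self.familiarity + 0.005)
        self.sentiment = clamp(self.sentiment + mood_delta * 0.7)
        self.grudge = clamp(self.grudge + GRUDGE_INCREMENTS.get(signal, 0.0))
        return self
```

With these half-lives, twelve quiet hours shrink a sentiment swing to a quarter of its size but only halve a grudge.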

Grudge as a tone signal

grudge maps to a tone override the ContextBuilder attaches to the system prompt:

  • grudge ≥ 0.4 → tone_grudge_modifier = "cautious"
  • grudge ≥ 0.7 → tone_grudge_modifier = "cautious_defensive"

Both modifiers surface on the /process response under context.tone_grudge_modifier so you can render the agent state in your own UI.
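
The threshold mapping can be written as a small function. This is a sketch only: the function name is hypothetical, and returning None below 0.4 is an assumption, since the value used when no modifier applies is not specified here.

```python
def tone_grudge_modifier(grudge):
    """Map a grudge value in [0, 1] to the documented tone modifier.

    The higher threshold is checked first so that values of 0.7 and above
    resolve to "cautious_defensive" rather than "cautious".
    """
    if grudge >= 0.7:
        return "cautious_defensive"
    if grudge >= 0.4:
        return "cautious"
    return None  # assumption: no modifier below 0.4
```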

Where these live on the response
Every /api/sdk/process response carries the current values under user.trust, user.sentiment, user.grudge, and user.familiarity.

Secondary relations (stub)

A second table tracks per-(end_user, third_party_person) relations: what the end-user thinks of other people they mention (their mother, their boss, a colleague). The schema is live; the entity extraction that populates it is still in progress, so the table currently records only what other parts of the pipeline write explicitly.

Why it matters

Longitudinal memory + relational state is the single biggest differentiator between Humane and raw LLM calls. For chat-once use cases (search, Q&A) none of this matters. For longitudinal use cases (coaching, therapy, elder care, CS) it's the whole product.

Explore it in the UI
Open any patient in the Ambuja reference client. The Patient Journey page has a live memory explorer — type a query, see which stored memories match, with similarity scores and human-readable match strength (strong / moderate / weak). Trust / sentiment / grudge / familiarity render as live gauges.

Retention policy

Memory is capped per plan tier: Community keeps the most recent 500 messages per end-user, Clinic 5,000, Enterprise unlimited. The cap is enforced automatically after every interaction and by a nightly retention sweep. The four scalar variables are never capped — they're tiny and load-bearing.
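
The per-tier cap can be sketched as a trim to the most recent N records. The names below are hypothetical, and the sketch assumes every timestamp is a UTC ISO 8601 string in the same format as the record example, so that string order matches chronological order.

```python
# Caps from the retention policy; None means unlimited.
PLAN_CAPS = {"community": 500, "clinic": 5000, "enterprise": None}

def apply_retention(records, plan):
    """Return the records that survive the cap for the given plan tier.

    Keeps the most recent `cap` records by timestamp, in chronological order.
    """
    cap = PLAN_CAPS[plan]
    ordered = sorted(records, key=lambda r: r["timestamp"])
    if cap is None or len(ordered) <= cap:
        return ordered
    return ordered[-cap:]  # drop the oldest records beyond the cap
```

Only the message records are trimmed in this sketch; the four scalar variables are held separately and are never capped.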

Privacy & deletion

Right-to-erasure is a single API call:

python
# Python
client.clear_history("patient_42")

# Or via the platform privacy API
# DELETE /api/privacy/erase/{external_id}  →  wipes messages, memories,
#                                              relational state, triggers, assignments

The privacy API is audit-logged — your compliance officer can prove the deletion happened.

Related API

  • GET /api/sdk/memory/{user_id}?query=... — memory explorer (returns hits + match reasons)
  • POST /api/sdk/memory/{user_id} — manual insertion (backfill / ambient context)
  • GET /api/sdk/history/{user_id} — raw conversation history
  • DELETE /api/privacy/erase/{external_id} — full erasure
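
As a rough sketch of how these endpoints might be addressed from Python's standard library, the helpers below only build the request objects. BASE_URL is a placeholder and the function names are hypothetical; authentication headers are not covered here and are left out, so consult your deployment's settings before sending anything.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://humane.example"  # placeholder host, not a real endpoint

def memory_search_request(user_id, query):
    """Build a GET request for the memory explorer endpoint."""
    path = f"/api/sdk/memory/{quote(user_id, safe='')}"
    return Request(f"{BASE_URL}{path}?{urlencode({'query': query})}", method="GET")

def erase_request(external_id):
    """Build a DELETE request for the full-erasure endpoint."""
    path = f"/api/privacy/erase/{quote(external_id, safe='')}"
    return Request(f"{BASE_URL}{path}", method="DELETE")
```

A built request can be sent with urllib.request.urlopen once the required authentication headers have been added.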