A contextual digital avatar and dialogue engine to bring humanity to digital humans, at scale.
The challenge
Giving digital avatars truly expressive, context-aware behaviors remains a major bottleneck: most systems deliver canned responses or a flat emotional range. Coordinating animation, speech, facial acting, and branching decision logic in real time typically requires custom integrations that are brittle and expensive to maintain.
I was COO of the company that built DELPHIC, responsible for operational leadership and the production architecture that made the system possible.
The solution
DELPHIC is a full-stack toolkit combining:
- Context engine managing state, histories, user attributes (no heavy ML training/inference).
- Animation orchestration (full body + facial + "punch moment" effects) tied to dialogue.
- Branching conversation system synchronized with expressive animation.
- Seamless support for both hand-crafted and AI-generated dialogue lines.
The engine ingests environmental cues, user history, emotional tone, and real-time interaction context, then dynamically selects the best animation-and-dialogue pairing.
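As a rough illustration of this kind of rule-driven selection (all names, tags, and scoring logic here are hypothetical, not DELPHIC's actual internals), a minimal sketch might tag each candidate pairing with the context it requires and pick the most specific eligible match:

```python
from dataclasses import dataclass

@dataclass
class Context:
    # Flattened context cues, e.g. {"returning_user", "tone:skeptical", "env:quiet"}
    tags: set

@dataclass
class Pairing:
    animation: str
    line: str
    requires: frozenset = frozenset()  # context tags this pairing is written for

def select(pairings, ctx):
    """Pick the most context-specific pairing whose requirements are all met.

    No ML inference involved: eligibility is a subset test, and specificity
    is simply how many context tags a pairing demands.
    """
    eligible = [p for p in pairings if p.requires <= ctx.tags]
    return max(eligible, key=lambda p: len(p.requires))

pairings = [
    Pairing("wave_neutral", "Hello."),
    Pairing("lean_in", "Fair point last time -- here's what changed.",
            frozenset({"returning_user", "tone:skeptical"})),
]
# A skeptical returning user matches the more specific pairing;
# an unknown user falls back to the generic one.
assert select(pairings, Context({"returning_user", "tone:skeptical"})).animation == "lean_in"
assert select(pairings, Context(set())).animation == "wave_neutral"
```

The key property is that authors add expressiveness by adding tagged pairings, not by retraining anything.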
Key capabilities
- Expressive avatar orchestration: full face + body + contextual "punch moments."
- Dynamic branching: responses vary by user profile, tone, conversation history.
- Lightweight context engine: no heavy ML stack required; system logic drives choice.
- Modular integration: works with human-authored text or LLM-generated lines.
- Synchronized behavior: avatar response and animation are tightly coupled and context-aware.
- Massive diversity from minimal data: hours of unique motion from roughly 140 seconds of source animation.
Prototype capabilities
- DELPHIC is the second major revision of the HABTIC system, built on prior experience in procedural avatar animation.
- Its branching context logic enables differentiated responses. For example, a first-time user gets a friendly explanation, while a previously skeptical returning user receives a more tailored response.
- Delivers highly personalized avatar interaction without needing full AI training per user or scenario.
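The first-time-vs-skeptical-returner example above can be sketched as plain branching over a stored user profile; the profile keys and response names below are hypothetical stand-ins for whatever attributes the prototype actually tracks:

```python
def respond(profile: dict) -> str:
    """Branch on stored user attributes instead of per-user model training."""
    if not profile.get("visits", 0):
        # First contact: lead with the friendly explanation.
        return "friendly_explainer"
    if profile.get("last_sentiment") == "skeptical":
        # Returning user who pushed back before: address the objection directly.
        return "address_objection"
    return "warm_callback"

assert respond({}) == "friendly_explainer"
assert respond({"visits": 3, "last_sentiment": "skeptical"}) == "address_objection"
assert respond({"visits": 1}) == "warm_callback"
```

Because personalization lives in cheap profile lookups, adding a new user or scenario costs an authoring decision, not a training run.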
Outcomes
- Delivered avatar interactions with character-level expressiveness, without requiring per-user AI training.
- Built a system that works above the dialogue layer, compatible with hand-authored text or future LLM integration.
- Proved that contextual, expressive digital humans can operate at scale without cloud streaming or heavy ML infrastructure.