Tavus broadcasts the conversation.replica.started_speaking and conversation.replica.stopped_speaking events when the replica's speaking state changes: conversation.replica.started_speaking means the replica has just started speaking, and conversation.replica.stopped_speaking means the replica has just stopped speaking. When the conversation.replica.stopped_speaking event is sent, the event’s properties object includes:
  • A duration field indicating how long the replica spoke, in seconds. This value may be null.
  • An interrupted field (true/false) indicating whether the replica was interrupted by the user while speaking, or finished speaking naturally.
These events are intended to act as triggers for actions within your application. For instance, you may want to start a video or show a slide when the replica starts or stops speaking. Use the inference_id to correlate this event with related events, such as conversation.utterance or tool_call. Each event also includes a seq field for global ordering and a turn_idx field that identifies which conversational turn the speaking state change belongs to. See Event Ordering and Turn Tracking for details.
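As a rough illustration of using these events as triggers, the sketch below dispatches on event_type and reads the duration and interrupted fields described above. It assumes the event has already been decoded into a Python dict. How the event reaches your application is not covered here, and the function name describe_speaking_event is hypothetical.

```python
from typing import Any, Optional

STARTED = "conversation.replica.started_speaking"
STOPPED = "conversation.replica.stopped_speaking"


def describe_speaking_event(event: dict[str, Any]) -> Optional[str]:
    """Return a short description of a speaking event, or None for other events.

    Assumes `event` is an already-decoded event payload (hypothetical helper).
    """
    event_type = event.get("event_type")
    if event_type == STARTED:
        return "replica started speaking"
    if event_type == STOPPED:
        # properties may be missing on malformed input; fall back to an empty dict.
        props = event.get("properties") or {}
        duration = props.get("duration")  # documented as possibly null
        interrupted = bool(props.get("interrupted"))
        how = "was interrupted" if interrupted else "finished naturally"
        if duration is None:
            return f"replica {how} (duration unknown)"
        return f"replica {how} after {duration:.1f}s"
    return None
```

In a real application, the returned description would typically be replaced by the action you want to trigger, such as showing a slide once the replica stops speaking.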
message_type
string

Indicates which product the event is used for. In this case, the message_type is conversation.

Example:

"conversation"

event_type
enum<string>

This event occurs when the replica's speech audio actually starts or stops.

Available options:
conversation.replica.started_speaking,
conversation.replica.stopped_speaking
Example:

"conversation.replica.started_speaking"

seq
integer

A globally monotonic sequence number assigned to each event. Use it to determine the ordering of events: a higher seq means the event was sent later. This is useful for reconciling events that may arrive out of order.

Example:

42
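
One possible way to reconcile events that arrive out of order is to buffer them and sort by seq before processing. This is a minimal sketch that assumes events are decoded dicts carrying the seq field. The helper name order_by_seq and the sample payloads are illustrative only.

```python
def order_by_seq(events):
    """Return the events sorted from earliest to latest sent, using seq."""
    return sorted(events, key=lambda event: event["seq"])


# Sample events received out of order (illustrative payloads).
received = [
    {"seq": 44, "event_type": "conversation.replica.stopped_speaking"},
    {"seq": 42, "event_type": "conversation.replica.started_speaking"},
]
ordered = order_by_seq(received)
```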

turn_idx
integer

The conversation turn index. This value increments each time a conversation.respond interaction is received, and groups all events that belong to the same conversational turn. Use this to correlate events (utterances, tool calls, speaking state changes, etc.) that are part of the same turn.

Example:

3
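
To illustrate correlating events within a turn, the sketch below groups event types by turn_idx, ordering each group by seq. It assumes decoded dict events that carry both fields. The helper name group_by_turn and the sample payloads are hypothetical.

```python
from collections import defaultdict


def group_by_turn(events):
    """Map each turn_idx to its event types, ordered by seq within the turn."""
    turns = defaultdict(list)
    for event in sorted(events, key=lambda e: e["seq"]):
        turns[event["turn_idx"]].append(event["event_type"])
    return dict(turns)


# Illustrative events spanning two turns, received out of order.
sample = [
    {"seq": 12, "turn_idx": 3, "event_type": "conversation.replica.stopped_speaking"},
    {"seq": 10, "turn_idx": 3, "event_type": "conversation.replica.started_speaking"},
    {"seq": 15, "turn_idx": 4, "event_type": "conversation.replica.started_speaking"},
]
by_turn = group_by_turn(sample)
```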

properties
object