User Started/Stopped Speaking Event

{
  "message_type": "conversation",
  "event_type": "conversation.user.started_speaking",
  "seq": 42,
  "inference_id": "83294d9f-8306-491b-a284-791f56c8383f",
  "turn_idx": 3
}

Tavus broadcasts this event whenever the user's speaking state changes:

conversation.user.started_speaking means the user has just started speaking.

conversation.user.stopped_speaking means the user has just stopped speaking.

These events are intended to act as triggers for actions within your application. For instance, you may want to take a user-facing action or start a backend process when the user starts or stops speaking.

The inference_id can be used to correlate this event with other events, such as conversation.utterance or tool_call. Keep in mind that with speculative_inference, the inference_id changes frequently while the user is speaking, so the inference_id on user.started_speaking will not usually match the inference_id on conversation.utterance.

This event includes a seq field for global ordering and a turn_idx field that identifies which conversational turn the speaking state change belongs to. See Event Ordering and Turn Tracking for details.
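As a rough illustration of using these events as triggers, the sketch below dispatches on event_type and calls one of two callbacks. It assumes the event reaches your application as a JSON string shaped like the sample payload above. The function name handle_speaking_event and the callbacks on_started and on_stopped are hypothetical and not part of any Tavus SDK.

```python
import json
from typing import Any, Callable, Dict

STARTED = "conversation.user.started_speaking"
STOPPED = "conversation.user.stopped_speaking"

Event = Dict[str, Any]


def handle_speaking_event(
    raw: str,
    on_started: Callable[[Event], None],
    on_stopped: Callable[[Event], None],
) -> bool:
    """Dispatch a speaking event to a callback.

    Hypothetical helper: returns True if the payload was a user
    started/stopped speaking event, and False for anything else.
    """
    event = json.loads(raw)

    # Ignore messages that are not conversation events.
    if event.get("message_type") != "conversation":
        return False

    event_type = event.get("event_type")
    if event_type == STARTED:
        on_started(event)
        return True
    if event_type == STOPPED:
        on_stopped(event)
        return True
    return False
```

Keeping the callbacks separate from the parsing makes it straightforward to attach a UI update to one and a backend job to the other.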

message_type

string

The message type indicates which product this event is used for. For this event, the message_type is conversation.

Example:

"conversation"

event_type

enum<string>

This event occurs when the user either starts or stops speaking.

Available options:

conversation.user.started_speaking

conversation.user.stopped_speaking

Example:

"conversation.user.started_speaking"

seq

integer

A globally monotonic sequence number assigned to each event. Use this to determine the order of events: a higher seq means the event was sent later. This is useful for reconciling events that may arrive out of order.

Example:

42
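One possible way to use seq for reconciliation is sketched below. It assumes events have already been parsed into dictionaries that each carry a seq field. The helpers order_by_seq and user_is_speaking are hypothetical names chosen for illustration. The second helper derives the current speaking state from the speaking event with the highest seq, regardless of arrival order.

```python
from typing import Any, Dict, Iterable, List, Optional

STARTED = "conversation.user.started_speaking"
STOPPED = "conversation.user.stopped_speaking"

Event = Dict[str, Any]


def order_by_seq(events: Iterable[Event]) -> List[Event]:
    """Return events in the order they were sent (ascending seq)."""
    return sorted(events, key=lambda e: e["seq"])


def user_is_speaking(events: Iterable[Event]) -> Optional[bool]:
    """Derive the speaking state from the latest speaking event.

    Returns None when no started/stopped speaking event has been seen.
    """
    speaking = [e for e in events if e.get("event_type") in (STARTED, STOPPED)]
    if not speaking:
        return None
    # The highest seq was sent last, even if it arrived first.
    latest = max(speaking, key=lambda e: e["seq"])
    return latest["event_type"] == STARTED
```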

inference_id

string

This is a unique identifier for a given utterance. In this case, it will be the utterance the user is speaking.

Example:

"83294d9f-8306-491b-a284-791f56c8383f"

turn_idx

integer

The conversation turn index. This value increments each time a conversation.respond interaction is received, and groups all events that belong to the same conversational turn. Use this to correlate events (utterances, tool calls, speaking state changes, etc.) that are part of the same turn.

Example:

3
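To correlate events within a turn, one approach is to bucket them by turn_idx, as in the sketch below. It assumes every event passed in is a dictionary carrying both turn_idx and seq, as the sample payload for this event does. Whether other event types include these fields should be checked against their own reference pages. The function name group_by_turn is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]


def group_by_turn(events: Iterable[Event]) -> Dict[int, List[Event]]:
    """Group events by turn_idx, ordered by seq within each turn."""
    turns: Dict[int, List[Event]] = defaultdict(list)
    # Sorting by seq first keeps each turn's events in send order.
    for event in sorted(events, key=lambda e: e["seq"]):
        turns[event["turn_idx"]].append(event)
    return dict(turns)
```

Grouping by turn_idx rather than inference_id avoids the mismatch that speculative_inference can introduce between a speaking event and the utterance that follows it.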
