Date: October 16, 2025
Topic: What is Task-Oriented Dialogue?
Recall
User interacts with the system for a common goal.
Notes
What is Task-Oriented Dialogue?
Examples:
- Understanding user utterances to perform tasks such as providing info, manipulating objects, and navigating
- Generating responses to user requests in a consistent and coherent manner while maintaining the dialogue history
- Managing user and system turns effectively and recovering from errors caused by misunderstanding user utterances
- Engaging the user with adaptive and interesting conversations while progressing towards the common goal
The dialogue context is important, and we need to properly capture the back-and-forth to get all necessary info for the common goal.
Why is Task-Oriented Dialogue Challenging?
- The context of the dialogue history is paramount to task success
- The system must capture back-and-forth interaction, estimate the belief state of user goals, and resolve anaphora during discourse
- It must piece together all relevant and necessary info to complete the task
- The domain-specific nature of dialogue makes adaptation and generalization harder
- API calls to web services may change over time, so dialogue managers require frequent maintenance
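The belief-state idea above can be sketched in a few lines. This is a minimal illustration, not a real dialogue state tracker: the function name `update_belief_state`, the slot names, and the per-turn values are all hypothetical, and the sketch assumes an upstream component has already extracted slot values from each user turn.

```python
def update_belief_state(belief_state, turn_slots):
    """Return a new belief state with this turn's slot values merged in.

    A value of None is treated (by assumption) as the user retracting a slot.
    """
    updated = dict(belief_state)
    for slot, value in turn_slots.items():
        if value is None:
            updated.pop(slot, None)  # drop a constraint the user withdrew
        else:
            updated[slot] = value    # later turns override earlier ones
    return updated


# Hypothetical slot values extracted from three consecutive user turns.
turns = [
    {"destination": "Denver"},
    {"date": "Friday"},
    {"destination": "Austin"},  # the user changes their mind
]

state = {}
for turn_slots in turns:
    state = update_belief_state(state, turn_slots)

print(state)
```

The final state only makes sense in light of the whole history: the last turn alone says nothing about the date, which is why dialogue context matters for task success.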
Task dialogue consists of:
- Turn taking
- Speech acts
- Grounding
Task-Oriented Dialogue Characteristics
Turn Taking
- Dialogue is often interleaved between at least 2 speakers (incl. agent)
- Each speaker holds the floor for a given time and takes the opportunity to speak/inform/request
Speech Acts
- Assertives: Speaker suggests or puts forward info to the agent
- Directives: Speaker asks or requests info from agent
- Commissives: Speaker agrees with or opposes the agent (yes/no)
- Expressives: Speaker greets the agent (hello/goodbye)
Grounding
- Whether the user and agent mutually believe or agree on the set of info exchanged
- The agent or user acknowledges the other’s info
System

Within a domain, the user might have an intent. To fulfill the intent, the information from the slots is needed.
Terminology

- From Domains we can get Intents
- Slots are spans contained in a given utterance that lead to a certain intent
- E.g., if we want to book a flight, the slots are the necessary info for booking that flight
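As a rough sketch of how domain, intent, and slots relate, the flight-booking example could be represented as below. The domain, intent, and slot names, the `Frame` class, and the `REQUIRED_SLOTS` table are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical schema: which slots each (domain, intent) pair needs.
REQUIRED_SLOTS = {
    ("flights", "book_flight"): ["origin", "destination", "date"],
}


@dataclass
class Frame:
    domain: str
    intent: str
    slots: dict = field(default_factory=dict)

    def missing_slots(self):
        """List the required slots that the user has not provided yet."""
        required = REQUIRED_SLOTS[(self.domain, self.intent)]
        return [name for name in required if name not in self.slots]


# "Book me a flight from Boston to Denver" fills two of the three slots.
frame = Frame(
    domain="flights",
    intent="book_flight",
    slots={"origin": "Boston", "destination": "Denver"},
)
print(frame.missing_slots())
```

The system can then ask about whatever `missing_slots()` returns before trying to fulfill the intent.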
<aside>
📌 SUMMARY:
Task-oriented dialogue involves a user interacting with the system to achieve a goal.
This is challenging because the interaction is in natural language, and the system has to correctly identify the context to fulfill the user’s goal.
</aside>
Date: October 16, 2025
Topic: Automatic Speech Recognition
<aside>
📌 SUMMARY: ASR models transcribe spoken utterances into words that a language model can understand, and are evaluated by Slot Error Rate
</aside>
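The notes name Slot Error Rate but do not define it. One common formulation, assumed here rather than taken from the notes, counts substituted, deleted, and inserted slots and divides by the number of reference slots. The function name and example slots below are hypothetical.

```python
def slot_error_rate(reference, hypothesis):
    """Assumed definition: (substitutions + deletions + insertions) / reference slots."""
    if not reference:
        raise ValueError("reference must contain at least one slot")
    # Slot present in both, but with a different value.
    substitutions = sum(
        1 for slot, value in reference.items()
        if slot in hypothesis and hypothesis[slot] != value
    )
    # Slot in the reference that the system missed entirely.
    deletions = sum(1 for slot in reference if slot not in hypothesis)
    # Slot the system produced that is not in the reference.
    insertions = sum(1 for slot in hypothesis if slot not in reference)
    return (substitutions + deletions + insertions) / len(reference)


# Hypothetical example: the recognizer mishears "Denver" and drops the date.
reference = {"origin": "Boston", "destination": "Denver", "date": "Friday"}
hypothesis = {"origin": "Boston", "destination": "Dover"}

ser = slot_error_rate(reference, hypothesis)
print(round(ser, 3))
```

Here one substitution and one deletion out of three reference slots give a rate of 2/3.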