Mert Inan, Anthony Sicilia, Suvodip Dey, Vardhan Dongre, Tejas Srinivasan, Jesse Thomason, Gökhan Tür, Dilek Hakkani-Tür, Malihe Alikhani (2025). Better Slow than Sorry: Introducing Positive Friction for Reliable Dialogue Systems. arXiv:2501.17348.
While theories of discourse and cognitive science have long recognized the value of unhurried pacing, recent dialogue research tends to minimize friction in conversational systems. Yet frictionless dialogue risks fostering uncritical reliance on AI outputs, which can obscure implicit assumptions and lead to unintended consequences. To meet this challenge, we propose integrating positive friction into conversational AI, which promotes user reflection on goals, critical thinking about system responses, and subsequent re-conditioning of AI systems. We hypothesize that systems can improve goal alignment, modeling of user mental states, and task success by deliberately slowing down conversations at strategic moments to ask questions, reveal assumptions, or pause. We present an ontology of positive friction and collect expert human annotations on multi-domain and embodied goal-oriented corpora. Experiments on these corpora, along with simulated interactions using state-of-the-art systems, suggest that incorporating friction not only fosters accountable decision-making, but also enhances machine understanding of user beliefs and goals, and increases task success rates.
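The friction strategies the abstract names (asking questions, revealing assumptions, pausing) can be pictured as a small decision rule applied before the system answers. The sketch below is purely illustrative and not from the paper; the trigger heuristics and strategy names are hypothetical stand-ins for what a real system would do with an LLM-based classifier.

```python
# Illustrative sketch (names and triggers hypothetical) of a positive-friction
# policy: before answering, decide whether to slow the conversation down with
# a clarifying question, a surfaced assumption, or a confirmation pause.

FRICTION_STRATEGIES = {
    "ambiguous_goal": "ask_clarifying_question",
    "hidden_assumption": "reveal_assumption",
    "high_stakes": "pause_and_confirm",
}

def choose_friction(user_turn):
    """Toy trigger detection; a real system would classify the turn with an LLM."""
    # Very short, non-question requests are treated as underspecified goals.
    if "?" not in user_turn and len(user_turn.split()) < 4:
        return FRICTION_STRATEGIES["ambiguous_goal"]
    # Keywords stand in for a real high-stakes-action detector.
    if "delete" in user_turn or "transfer" in user_turn:
        return FRICTION_STRATEGIES["high_stakes"]
    return None  # answer directly; no friction needed

print(choose_friction("book flight"))             # underspecified goal
print(choose_friction("transfer all my savings")) # high-stakes action
```

The key design point is that friction is selective: most turns pass through unchanged, and the system intervenes only at the strategic moments the ontology identifies.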
Vardhan Dongre, Xiaocheng Yang, Emre Can Acikgoz, Suvodip Dey, Gökhan Tür, Dilek Hakkani-Tür (2024). ReSpAct: Harmonizing Reasoning, Speaking, and Acting Towards Building Large Language Model-Based Conversational AI Agents. arXiv:2411.00927.
@article{dongre2024respact,
title={ReSpAct: Harmonizing Reasoning, Speaking, and Acting Towards Building Large Language Model-Based Conversational AI Agents},
author={Dongre, Vardhan and Yang, Xiaocheng and Acikgoz, Emre Can and Dey, Suvodip and Tur, Gokhan and Hakkani-T{\"u}r, Dilek},
journal={arXiv preprint arXiv:2411.00927},
year={2024}
}
Large language model (LLM)-based agents are increasingly employed to interact with external environments (e.g., games, APIs, world models) to solve user-provided tasks. However, current frameworks often lack the ability to collaborate effectively with users in fully conversational settings. Conversations are essential for aligning on task details, achieving user-defined goals, and satisfying preferences. While existing agents address ambiguity through clarification questions (Li et al., 2023; Zhang and Choi, 2023; Chen et al., 2023), they underutilize the broader potential of an LLM's conversational capabilities. In this work, we introduce ReSpAct, an LLM-based agent designed to seamlessly integrate reasoning, decision-making, and dynamic dialogue for task-solving. Expanding on reasoning-first approaches like ReAct (Yao et al., 2022b), ReSpAct employs active, free-flowing dialogues to interpret instructions, clarify goals, provide status updates, resolve sub-task failures, and refine plans based on user inputs without any explicit dialogue schema. By alternating between task-solving actions and interactive conversations, ReSpAct demonstrates improved performance across diverse environments. We evaluate ReSpAct in user-interactive settings, including task-oriented dialogue systems (MultiWOZ) and decision-making tasks (ALFWorld, WebShop). ReSpAct outperforms ReAct with absolute success rate improvements of 6% and 4% in ALFWorld and WebShop, respectively, and achieves a 5.5% gain in Inform and a 3% gain in Success scores in MultiWOZ. These results highlight the value of integrating dynamic user-agent collaboration for more effective task resolution.
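The abstract's core mechanism is an agent that interleaves reasoning, user-directed speech, and environment actions in a single control loop. The sketch below is a hypothetical illustration of that loop with a stub in place of the LLM policy; the function names, the stubbed decisions, and the MultiWOZ-flavored content are all assumptions, not the paper's implementation.

```python
# Minimal sketch of a ReSpAct-style control loop (names hypothetical):
# the agent alternates between reasoning ("think"), talking to the
# user ("speak"), and acting on the environment ("act").

def stub_policy(history):
    """Stand-in for the LLM: pick the next step from the history so far."""
    if not history:
        return ("think", "The user wants a cheap hotel; I should confirm the area.")
    last_kind, _ = history[-1]
    if last_kind == "think":
        return ("speak", "Which part of town would you like to stay in?")
    if last_kind == "user":
        return ("act", "search_hotels(price='cheap', area='centre')")
    return ("finish", "Booked a cheap hotel in the centre.")

def respact_loop(policy, user_reply="the centre, please", max_steps=10):
    history = []
    for _ in range(max_steps):
        kind, content = policy(history)
        history.append((kind, content))
        if kind == "speak":
            # A user turn conditions the agent's subsequent reasoning and actions.
            history.append(("user", user_reply))
        if kind == "finish":
            break
    return history

for kind, content in respact_loop(stub_policy):
    print(f"{kind:>6}: {content}")
```

The design point the abstract emphasizes is that "speak" is a first-class step alongside "think" and "act", so clarifications and status updates arise mid-task rather than through a fixed dialogue schema.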
Daniel Philipov, Vardhan Dongre, Gökhan Tür, Dilek Hakkani-Tür (2024). Simulating User Agents for Embodied Conversational AI. arXiv:2410.23535. [Accepted at NeurIPS Open-World Agents Workshop 2024] [Best Application Award, Michigan AI Symposium 2024]
@article{philipov2024simulating,
title={Simulating User Agents for Embodied Conversational-AI},
author={Philipov, Daniel and Dongre, Vardhan and Tur, Gokhan and Hakkani-T{\"u}r, Dilek},
journal={arXiv preprint arXiv:2410.23535},
year={2024}
}
Embodied agents designed to assist users with tasks must possess the ability to engage in natural language interactions, interpret user instructions, execute actions to complete tasks, and communicate effectively to resolve issues. However, collecting large-scale, diverse datasets of situated human-robot dialogues to train and evaluate such agents is expensive, labor-intensive, and time-consuming. To address this challenge, we propose building a large language model (LLM)-based user agent that can simulate user behavior during interactions with an embodied agent in a virtual environment. Given a specific user goal (e.g., make breakfast), at each time step during an interaction with an embodied agent (or a robot), the user agent may "observe" the robot's actions or "speak" to either proactively intervene with the robot's behavior or reactively answer the robot's questions. Such a user agent assists in improving the scalability and efficiency of embodied dialogue dataset generation and is critical for enhancing and evaluating the robot's interaction and task completion ability, as well as for future research such as reinforcement learning using AI feedback. We evaluate our user agent's ability to generate human-like behaviors by comparing its simulated dialogues with the benchmark TEACh dataset. We perform three experiments: zero-shot prompting to predict the dialogue act from history, few-shot prompting, and fine-tuning on the TEACh training subset. Our results demonstrate that the LLM-based user agent can achieve an F-measure of 42% in mimicking human speaking behavior with simple zero-shot prompting and 43.4% with few-shot prompting. Through fine-tuning, we achieved similar success in deciding when to speak but much greater success in deciding what to say, improving from 51.1% to 62.5%. These findings showcase the feasibility and promise of the proposed approach for assessing and enhancing the effectiveness and reliability of robot task completion through natural language communication.
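The simulated user's per-step choice, "observe" or "speak", given the goal and the interaction history, can be sketched as a zero-shot prompting call. The snippet below is an illustrative assumption, not the paper's code: the prompt format and the stub standing in for the LLM are hypothetical.

```python
# Hypothetical sketch of the user-agent decision described above: at each
# step the simulated user either OBSERVEs the robot or SPEAKs. A stub
# stands in for the LLM's zero-shot dialogue-act prediction.

def build_prompt(goal, history):
    """Format the user goal and interaction history for a zero-shot LLM call."""
    turns = "\n".join(f"{who}: {what}" for who, what in history)
    return (f"User goal: {goal}\n{turns}\n"
            "Next user act (OBSERVE or SPEAK), and if SPEAK, the utterance:")

def stub_llm(prompt):
    """Stand-in for the LLM: speak only when the robot's last turn is a question."""
    last_robot_turn = prompt.rstrip().splitlines()[-2]
    if last_robot_turn.endswith("?"):
        return "SPEAK: The mug is in the cabinet above the sink."
    return "OBSERVE"

history = [("robot", "picked up a mug"), ("robot", "Where should I fill it?")]
print(stub_llm(build_prompt("make coffee", history)))
```

In the paper's actual experiments, the LLM's predicted acts are compared against the human user turns in TEACh to score when-to-speak and what-to-say accuracy.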
Nalin Tiwary*, Vardhan Dongre*, Sanil Arun Chawla, Ashwin Lamani, Dilek Hakkani-Tür (2024). From Context to Action: Analysis of the Impact of State Representation and Context on the Generalization of Multi-Turn Web Navigation Agents. arXiv:2410.23555. [Accepted at NeurIPS Open-World Agents Workshop 2024]
@inproceedings{tiwary2024context,
title={From Context to Action: Analysis of the Impact of State Representation and Context on the Generalization of Multi-Turn Web Navigation Agents},
author={Tiwary, Nalin and Dongre, Vardhan and Chawla, Sanil Arun and Lamani, Ashwin and Tur, Dilek Hakkani},
booktitle={NeurIPS 2024 Workshop on Open-World Agents},
year={2024}
}
Recent advancements in Large Language Model (LLM)-based frameworks have extended their capabilities to complex real-world applications, such as interactive web navigation. These systems, driven by user commands, navigate web browsers to complete tasks through multi-turn dialogues, offering both innovative opportunities and significant challenges. Despite the introduction of benchmarks for conversational web navigation, a detailed understanding of the key contextual components that influence the performance of these agents remains elusive. This study aims to fill this gap by analyzing the various contextual elements crucial to the functioning of web navigation agents. We investigate the optimization of context management, focusing on the influence of interaction history and web page representation. Our work highlights improved agent performance across out-of-distribution scenarios, including unseen websites, categories, and geographic locations through effective context management. These findings provide insights into the design and optimization of LLM-based agents, enabling more accurate and effective web navigation in real-world applications.
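The context-management question the study analyzes, how much interaction history and how compact a page representation to feed the agent, can be pictured as a prompt-assembly step under a fixed budget. The sketch below is hypothetical: the truncation rules and element format are illustrative stand-ins, not the paper's actual state representation.

```python
# Hypothetical sketch of the context-management knobs studied above:
# assembling a web-navigation agent's prompt from a trimmed interaction
# history and a compacted page representation.

def compact_page(elements, max_elems=3):
    """Keep only the first few page elements (a stand-in for pruning a
    full DOM or accessibility tree down to salient interactive elements)."""
    return "\n".join(f"[{i}] {e}" for i, e in enumerate(elements[:max_elems]))

def build_context(history, page_elements, max_turns=2):
    recent = history[-max_turns:]  # drop stale turns beyond the budget
    turns = "\n".join(f"{who}: {what}" for who, what in recent)
    return f"{turns}\nPAGE:\n{compact_page(page_elements)}\nNext action:"

history = [("user", "find running shoes"), ("agent", "clicked 'Shoes'"),
           ("user", "under $50 please")]
page = ["link 'Running Shoes'", "button 'Filter by price'", "link 'Sale'",
        "text 'footer'"]
print(build_context(history, page))
```

Varying `max_turns` and `max_elems` here mirrors the paper's question of which contextual elements actually drive generalization to unseen websites, categories, and locations.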
Vardhan Dongre, Gurpreet Singh Hora (2023). Evaluating Uncertainty Quantification approaches for Neural PDEs in scientific applications. arXiv:2311.04457. [Accepted at NeurIPS AI4Science and DLDE-III Workshops 2023]
@article{dongre2023evaluating,
title={Evaluating Uncertainty Quantification approaches for Neural PDEs in scientific applications},
author={Dongre, Vardhan and Hora, Gurpreet Singh},
journal={arXiv preprint arXiv:2311.04457},
year={2023}
}
The accessibility of spatially distributed data, enabled by affordable sensors, field, and numerical experiments, has facilitated the development of data-driven solutions for scientific problems, including climate change, weather prediction, and urban planning. Neural Partial Differential Equations (Neural PDEs), which combine deep learning (DL) techniques with domain expertise (e.g., governing equations) for parameterization, have proven effective at capturing valuable correlations within spatiotemporal datasets. However, sparse and noisy measurements coupled with modeling approximations introduce aleatoric and epistemic uncertainties. Therefore, quantifying uncertainties propagated from model inputs to outputs remains a challenge and an essential goal for establishing the trustworthiness of Neural PDEs. This work evaluates various Uncertainty Quantification (UQ) approaches for both forward and inverse problems in scientific applications. Specifically, we investigate the effectiveness of Bayesian methods, such as Hamiltonian Monte Carlo (HMC) and Monte Carlo Dropout (MCD), and a more conventional approach, Deep Ensembles (DE). To illustrate their performance, we take two canonical PDEs: Burgers' equation and the Navier-Stokes equations. Our results indicate that Neural PDEs can effectively reconstruct flow systems and predict the associated unknown parameters. Notably, however, the Bayesian methods in our experiments display a higher degree of certainty in their predictions than DE does, suggesting that Bayesian techniques may underestimate the true underlying uncertainty and thus appear overconfident relative to the DE approach.
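The two main UQ recipes compared above differ in where the predictive spread comes from: Deep Ensembles average over independently trained models, while MC Dropout averages over stochastic forward passes of a single model. The toy sketch below is not the paper's setup; `toy_model` is a hypothetical stand-in for a trained Neural-PDE surrogate, and the dropout mechanism is reduced to a single random mask for illustration.

```python
import random
import statistics

def toy_model(x, seed):
    """Stand-in for one trained network: y = 2x plus a seed-dependent bias,
    mimicking the variation between independently trained ensemble members."""
    rng = random.Random(seed)
    return 2.0 * x + rng.gauss(0.0, 0.1)

def deep_ensemble_predict(x, n_members=5):
    """Predictive mean/spread across independently 'trained' members."""
    preds = [toy_model(x, seed=m) for m in range(n_members)]
    return statistics.mean(preds), statistics.stdev(preds)

def mc_dropout_predict(x, n_passes=50, p_drop=0.2, seed=0):
    """Predictive mean/spread across stochastic forward passes of ONE model,
    with a single inverted-dropout mask standing in for per-unit masks."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_passes):
        keep = 0.0 if rng.random() < p_drop else 1.0 / (1.0 - p_drop)
        preds.append(keep * toy_model(x, seed=0))
    return statistics.mean(preds), statistics.stdev(preds)

mu_de, sd_de = deep_ensemble_predict(1.0)
mu_mc, sd_mc = mc_dropout_predict(1.0)
print(f"Deep Ensemble: mean={mu_de:.2f} std={sd_de:.2f}")
print(f"MC Dropout:    mean={mu_mc:.2f} std={sd_mc:.2f}")
```

The paper's observation maps onto this structure: if the Bayesian methods' predictive spread is consistently narrower than the ensemble's for the same inputs, they may be understating the true uncertainty.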
Patents
G. Puthumanaillam, V. Dongre, and P. Renkert (2023). "Estimation of environmental disturbance for trailering". Patent Application No. ENT-2023-023, filed with Brunswick Corporation; under review.
This patent focuses on estimating environmental disturbances that affect the performance and stability of trailering vehicles. Using advanced sensor data and computational techniques, the system aims to improve safety and efficiency in vehicle navigation under varying environmental conditions.