Chat
Express

```python
from shiny.express import ui

ui.page_opts(
    title="Hello Shiny Chat",
    fillable=True,
    fillable_mobile=True,
)

# Create a chat instance and display it
chat = ui.Chat(id="chat")
chat.ui()

# Define a callback to run when the user submits a message
@chat.on_user_submit
async def _():
    # Simply echo the user's input back to them
    await chat.append_message(f"You said: {chat.user_input()}")
```
Core

```python
from shiny import App, ui

app_ui = ui.page_fillable(
    ui.panel_title("Hello Shiny Chat"),
    ui.chat_ui("chat"),
    fillable_mobile=True,
)

def server(input):
    # Create a chat instance (its UI is rendered by ui.chat_ui() above)
    chat = ui.Chat(id="chat")

    # Define a callback to run when the user submits a message
    @chat.on_user_submit
    async def _():
        # Simply echo the user's input back to them
        await chat.append_message(f"You said: {chat.user_input()}")

app = App(app_ui, server)
```
Relevant Functions
- `chat = ui.Chat()`: `ui.Chat(id, messages=(), on_error="auto", tokenizer=MISSING)`
- `chat.ui()`: `chat.ui(placeholder="Enter a message...", width="min(680px, 100%)", height="auto", fill=True)`
- `@chat.on_user_submit`: `chat.on_user_submit(fn)`
- `chat.messages()`: `chat.messages(format=MISSING, token_limits=(4096, 1000), transform_user="all", transform_assistant=False)`
- `chat.append_message()`: `chat.append_message(message)`
- `chat.append_message_stream()`: `chat.append_message_stream(message)`
Generative AI quick start
Pick one of the providers below to get started with generative AI in your Shiny app. Once you've chosen a provider, copy/paste the `shiny create` terminal command to get the relevant source files on your machine.

Once the `app.py` file is on your machine, open it and follow the instructions at the top of the file. These instructions should help with signing up for an account with the relevant provider, obtaining an API key, and finally getting that key into your Shiny app.

Note that all these examples roughly follow the same pattern, with the only real difference being the provider-specific code for generating responses. If we abstract away the provider-specific code, we're left with the pattern shown below. Most of the time, providers will offer a `stream=True` option for generating responses, which is preferable for more responsive and scalable chat interfaces. Just make sure to use `.append_message_stream()` instead of `.append_message()` when using this option.
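The sketch below spells out that shared pattern. The `my_client.generate_response()` call is a hypothetical stand-in for whichever provider-specific API a given template uses; the surrounding Shiny code is the same in every template:

```python
from shiny.express import ui

chat = ui.Chat(id="chat")
chat.ui()

@chat.on_user_submit
async def _():
    # Gather the conversation so far, in whatever format the provider expects
    messages = chat.messages()
    # Hypothetical provider call; with stream=True it yields chunks as they arrive
    response = await my_client.generate_response(messages, stream=True)
    # Stream the response into the chat as it's generated
    await chat.append_message_stream(response)
```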
Appending messages to a chat is always an async operation. This means that you should `await` the `.append_message()` or `.append_message_stream()` method when calling it, and also make sure that the callback function is marked as `async`.
The templates above are a great starting point for building a chat interface with generative AI. And, out of the box, `Chat()` provides some nice things like error handling and code highlighting. However, for richer and more bespoke experiences, you'll want to know more about things like message formats, startup messages, system messages, retrieval-augmented generation (RAG), and more.
Message format
When calling `chat.messages()` to retrieve the current messages, you'll generally get a tuple of dictionaries following the format below. This format also generally works when adding messages to the chat.
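That format is a dictionary with `content` and `role` keys, the same structure used for the startup and system messages later in this article:

```python
message = {
    "content": "Hello, how can I help?",
    "role": "assistant",  # or "user" / "system", depending on who authored it
}
```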
Unfortunately, this format is not universal across providers, and so it may not be directly usable as input to a generative AI model. Fortunately, `chat.messages()` has a `format` argument to help with this. That is, if you're using a provider like OpenAI, you can pass `format="openai"` to `chat.messages()` to get the proper format for generating responses with OpenAI.
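As a rough sketch of what that looks like inside the submit callback (the client setup and model name here are illustrative; the OpenAI quick start template has the full, working version):

```python
from openai import AsyncOpenAI

client = AsyncOpenAI()  # assumes OPENAI_API_KEY is available in the environment

@chat.on_user_submit
async def _():
    # Messages in the shape the OpenAI API expects
    messages = chat.messages(format="openai")
    response = await client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=messages,
        stream=True,
    )
    await chat.append_message_stream(response)
```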
Similarly, the return type of generative AI models can also differ. Fortunately, `chat.append_message()` and `chat.append_message_stream()` "just work" with most providers, but if you're using a provider that isn't yet supported, you should be able to reshape the response object into the format above.
Startup messages
To show message(s) when the chat interface is first loaded, you can pass a sequence of `messages` to `Chat`. Note that assistant messages are interpreted as markdown by default.1
```python
message = {
    "content": "**Hello!** How can I help you today?",
    "role": "assistant",
}
chat = ui.Chat(id="chat", messages=[message])
chat.ui()
```
In addition to providing instructions or a welcome message, you can also use this feature to provide system message(s).
System messages
Different providers have different ways of working with system messages. If you're using a provider like OpenAI, you can have message(s) with a `role` of `system`. However, other providers (e.g., Anthropic) may want the system message to be provided to the `.generate_response()` method. To help standardize how system messages interact with `Chat`, we recommend using LangChain's chat models. This way, you can just pass system message(s) on startup (just like you would with a provider like OpenAI):
```python
system_message = {
    "content": "You are a helpful assistant",
    "role": "system",
}
chat = ui.Chat(id="chat", messages=[system_message])
```
Just make sure, when using LangChain, to use `format="langchain"` to get the proper format for generating responses with LangChain:
```python
@chat.on_user_submit
async def _():
    messages = chat.messages(format="langchain")
    # .astream() returns an async generator of response chunks (it isn't awaited)
    response = my_model.astream(messages)
    await chat.append_message_stream(response)
```
Remember that you can get a full working template from the Generative AI quick start section above. The Shiny documentation also includes a more advanced example of dynamic system messages.
Message trimming
When the conversation becomes excessively long, it's often desirable to discard "old" messages to prevent errors and/or costly response generation. To help with this, `chat.messages()` only keeps the most recent messages that fit within a conservative token limit (set via the `token_limits` argument). See the documentation for more information on how to adjust this limit. Note that trimming can be disabled by setting `.messages(token_limits=None)` or `Chat(tokenizer=None)`.
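For instance, assuming `token_limits` is a (conversation, reserve) pair as in the signature shown earlier, adjusting or disabling trimming looks roughly like this:

```python
# Keep roughly 4096 tokens of history, reserving 1000 tokens for the response
messages = chat.messages(token_limits=(4096, 1000))

# Disable trimming for a single call...
messages = chat.messages(token_limits=None)

# ...or for the whole chat by not providing a tokenizer
chat = ui.Chat(id="chat", tokenizer=None)
```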
Error handling
When errors occur in the `@on_user_submit` callback, the app displays a dismissible notification about the error. When running locally, the actual error message is shown, but in production only a generic message is shown (i.e., the error is sanitized since it may contain sensitive information). If you'd prefer to have errors stop the app, that can also be done through the `on_error` argument of `Chat` (see the documentation for more information).
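As a sketch (check the `ui.Chat()` reference for the accepted values, which may vary by version; `"unsafe"` here is an assumption about the value that re-raises errors):

```python
# Assumption: "unsafe" re-raises errors from the submit callback instead of
# showing a notification, so an unhandled error stops the app
chat = ui.Chat(id="chat", on_error="unsafe")
```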
Code highlighting
When a message response includes code, it’ll be syntax highlighted (via highlight.js) and also include a copy button.
Custom response display
By default, message strings are interpreted as (GitHub-flavored) markdown. To customize how assistant responses are interpreted and displayed, define a `@chat.transform_assistant_response` function that returns `ui.HTML`. For a basic example, you could use `ui.markdown()` to customize the markdown rendering:
```python
chat = ui.Chat(id="chat")

@chat.transform_assistant_response
def _(content: str) -> ui.HTML:
    return ui.markdown(content)
```
When streaming, the transform is called on each iteration of the stream and gets passed the accumulated `content` of the message received thus far. For more complex transformations, you might want access to each chunk and a signal of whether the stream is done. See the documentation for more information.
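As a sketch of that richer form (assuming, per the reference docs, that the transform may also accept `chunk` and `done` arguments):

```python
@chat.transform_assistant_response
def _(content: str, chunk: str, done: bool) -> ui.HTML:
    # `content` is the accumulated message, `chunk` the latest piece, and `done`
    # signals that the stream has finished (signature assumed from the reference)
    if done:
        content += "\n\n*End of response.*"
    return ui.markdown(content)
```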
`chat.messages()` defaults to `transform_assistant=False`

By default, `chat.messages()` doesn't apply `transform_assistant_response` to the messages it returns. This is because the messages are intended to be used as input to the generative AI model, and so should be in a format that the model expects, not in a format that the UI expects. So, although you can do `chat.messages(transform_assistant=True)`, what you might actually want to do is "post-process" the response from the model before appending it to the chat.
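For instance, instead of transforming messages on the way out, you could post-process the raw response before it ever reaches the chat. Both `my_client.generate_response()` and `add_citations()` below are hypothetical stand-ins:

```python
@chat.on_user_submit
async def _():
    messages = chat.messages()
    # Hypothetical provider call that returns the full response text
    response = await my_client.generate_response(messages)
    # Post-process the model output before appending (and displaying) it
    await chat.append_message(add_citations(response))
```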
Transforming user input
Transforming user input before passing it to a generative AI model is a fundamental part of more advanced techniques like retrieval-augmented generation (RAG). A very basic transform might simply prepend a message to the user input before passing it to the model.
```python
chat = ui.Chat(id="chat")

@chat.transform_user_input
def _(input: str) -> str:
    return f"Translate this to French: {input}"
```
A more compelling transform would be to allow the user to enter a URL to a website, and then pass the content of that website to the LLM along with some instructions on how to summarize or extract information from it. For a concrete example, this template allows you to enter a URL to a website that contains a recipe, and the assistant will then extract the ingredients and instructions from that recipe in a structured format.

In addition to providing a helpful startup message, the app above also improves UX by gracefully handling errors that happen in the transform. That is, when an error occurs, it appends a useful message to the chat and returns `None` from the transform.
```python
@chat.transform_user_input
async def try_scrape_page(input: str) -> str | None:
    try:
        return await scrape_page_with_url(input)
    except Exception:
        await chat.append_message(
            "I'm sorry, I couldn't extract content from that URL. Please try again."
        )
        return None
```
The default behavior of `chat.messages()` is to apply `transform_user_input` to every user message (i.e., it defaults to `transform_user="all"`). In some cases, like the recipes app above, the LLM doesn't need every user message to be transformed, just the last one. In these cases, you can use `chat.messages(transform_user="last")` to only apply the transform to the last user message (or simply `chat.user_input()` if the model only needs the most recent user message).
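In code, the difference is just the argument passed when assembling the model's input:

```python
# Apply the user-input transform to every user message (the default)
messages = chat.messages(transform_user="all")

# Apply it only to the most recent user message
messages = chat.messages(transform_user="last")
```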
Footnotes

1. The interpretation and display of assistant messages can be customized.