Connecting an AI Assistant to Discord Without Losing Your Mind
The obvious way to connect an AI assistant to Discord is to wire up a bot, pipe messages to the LLM, and send responses back. That takes about twenty minutes. It also takes about twenty minutes to regret.
The first version of Karl's Discord integration did exactly that. Every message in every channel the bot could see became an LLM prompt. Every response went straight back to the channel. It worked in the sense that messages went in and responses came out. It was unusable in every other sense.
The bot responded to channels it should not have. It replied to messages that were not directed at it. It produced a wall of text in active channels where nobody asked for its opinion. It ignored DMs because those were not in the message handler's scope. And it had no concept of proactive updates — if a long-running task finished, the bot said nothing unless someone asked.
The Three Problems
Every Discord integration problem I hit fell into one of three buckets:
1. When should the bot respond? Mention rules. Channel scope. DM handling. The trigger conditions.
2. What should the bot know? Channel-specific context. Which project lives in which channel. What the conversation is about.
3. When should the bot speak unprompted? Task completion, error states, status updates. The proactive layer.
Get any of these wrong and the integration degrades quickly. Get all three wrong and people mute the bot within a day.
Self-Enforcing Mention Rules
The first instinct for "when should the bot respond" is to put it in the system prompt. "Only respond when mentioned."
That does not work, and the reason is the same as for any prompt-based safety rule. The model follows the instruction when it is easy and ignores it when it has something it wants to say.
The fix is to enforce mention rules in code, before the LLM ever sees the message. The message handler checks three conditions:
    import discord

    def should_respond(message, bot_user):
        """Decide in code, before any LLM call, whether the bot should answer."""
        # 1. Direct mention (@Karl)
        if bot_user in message.mentions:
            return True
        # 2. DM channel (always respond)
        if isinstance(message.channel, discord.DMChannel):
            return True
        # 3. Reply to one of the bot's own messages
        ref = message.reference
        # resolved can be None or a DeletedReferencedMessage, which has no author
        if ref is not None and isinstance(ref.resolved, discord.Message):
            if ref.resolved.author == bot_user:
                return True
        return False
If none of these conditions are met, the message never reaches the LLM. The bot does not "decide" to stay quiet. It never sees the message. That means no wasted API calls, no context pollution from irrelevant conversations, and no risk of the bot injecting itself into a discussion where it is not wanted.
A mention rule enforced in code does not degrade on hard tasks. It works the same at 2 AM as it does at noon.
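To make the "never sees the message" point concrete, here is a minimal sketch of a gate wrapped around the model call. It is not Karl's actual handler: `make_gated_handler`, `fake_llm`, and the string-based trigger are hypothetical stand-ins so the example runs without a Discord connection.

```python
from typing import Callable, Optional, TypeVar

M = TypeVar("M")


def make_gated_handler(
    should_respond: Callable[[M], bool],
    generate_reply: Callable[[M], str],
) -> Callable[[M], Optional[str]]:
    """Wrap a model call so it only runs for messages that pass the gate."""

    def handle(message: M) -> Optional[str]:
        # The check runs first; a rejected message never reaches the model.
        if not should_respond(message):
            return None
        return generate_reply(message)

    return handle


# Stand-in for the real model call; it records every prompt it receives.
seen_by_model: list[str] = []


def fake_llm(text: str) -> str:
    seen_by_model.append(text)
    return f"reply to: {text}"


# Hypothetical trigger: treat any text containing "@Karl" as a mention.
handler = make_gated_handler(lambda text: "@Karl" in text, fake_llm)
handler("lunch anyone?")          # rejected, the model is not called
handler("@Karl what is queued?")  # passes the gate
```

In a real bot the predicate would be the `should_respond` check and `generate_reply` would be the LLM call. The structure is the point: the model function is only reachable through the gate.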
DMs as the Planning Layer
Public channels are for short exchanges. Someone asks a question, Karl answers. Someone requests a task, Karl starts it. The interaction is visible to everyone in the channel, which is the point — it creates shared context.
But complex work needs a planning layer. Multi-turn conversations about approach, trade-offs, requirements. Those conversations are useful, but they clutter public channels fast.
The pattern that emerged: DMs are for planning, channels are for execution.
When a task needs discussion — scope, approach, what "done" means — that conversation happens in DMs with Karl. The back-and-forth stays between the user and the assistant. Once the plan is settled, the task gets queued, and the relevant channel gets a brief status update when it completes.
This separation is not a Discord feature. It is a convention. But it maps naturally to how people already use Discord. DMs feel like a private working session. Channels feel like a team room. The bot behavior follows that existing mental model instead of fighting it.
The implementation detail that makes this work: DMs get a different system prompt than channel messages. In DMs, Karl has the full project context loaded: task queue, recent work history, relevant code files. In public channels, the system prompt is leaner. It assumes the conversation is quick and scoped to one question.
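A rough sketch of that split, assuming the prompt is assembled from plain strings. `ProjectContext`, `build_system_prompt`, and both prompt texts are placeholders I made up for illustration; the real prompts are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectContext:
    """Hypothetical container for the context that DMs load."""
    task_queue: str = ""
    recent_work: str = ""
    code_files: list[str] = field(default_factory=list)


# Placeholder wording, not the real prompts.
CHANNEL_PROMPT = "You are Karl. Answer the one question asked, briefly."
DM_PROMPT = "You are Karl. This is a private planning session."


def build_system_prompt(is_dm: bool, ctx: ProjectContext) -> str:
    """Return the lean prompt for channels and the full-context prompt for DMs."""
    if not is_dm:
        return CHANNEL_PROMPT
    sections = [
        DM_PROMPT,
        "Task queue:\n" + ctx.task_queue,
        "Recent work:\n" + ctx.recent_work,
        "Relevant files:\n" + "\n".join(ctx.code_files),
    ]
    return "\n\n".join(sections)
```

The branch is on the channel type, which the handler already knows, so the model is never asked to guess which mode it is in.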
Channel-Specific Behavior
Different channels serve different purposes. A #tts-pipeline channel is about audiobook generation work. A #blog channel is about article drafts. A general #dev channel is for code work that does not fit elsewhere.
Karl's behavior changes per channel. Not through prompt engineering — through configuration.
Each channel has a small config block:
    {
      "tts-pipeline": {
        "context": "TTS pipeline status, worker health, audio generation tasks",
        "relevant_files": ["TASK_QUEUE.json", "tts-pipeline-status.json"],
        "proactive_updates": true,
        "mention_only": false
      },
      "blog": {
        "context": "Article drafting, blog maintenance, SEO",
        "relevant_files": ["article-ideas.json"],
        "proactive_updates": true,
        "mention_only": true
      }
    }
When a message arrives in #tts-pipeline, Karl loads the TTS context files and knows about pipeline state. When a message arrives in #blog, it loads article ideas and draft status. The bot does not need every piece of project context in every channel. It loads what is relevant.
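Loading per-channel context could look something like the sketch below. It assumes the config has the shape shown above and that the relevant files are plain text on disk. `load_channel_context` and the injectable `read_file` parameter are my own illustrative names, not part of any library.

```python
import json
from pathlib import Path
from typing import Callable

# Same shape as the config block above, trimmed to one channel.
CHANNEL_CONFIG = json.loads("""
{
  "blog": {
    "context": "Article drafting, blog maintenance, SEO",
    "relevant_files": ["article-ideas.json"],
    "proactive_updates": true,
    "mention_only": true
  }
}
""")


def read_from_disk(name: str) -> str:
    """Default reader; assumes file names are relative to the working directory."""
    return Path(name).read_text(encoding="utf-8")


def load_channel_context(
    channel: str,
    config: dict,
    read_file: Callable[[str], str] = read_from_disk,
) -> str:
    """Build the context text for one channel from its config entry."""
    entry = config.get(channel)
    if entry is None:
        return ""  # unconfigured channel: load nothing extra
    parts = [entry["context"]]
    for name in entry["relevant_files"]:
        parts.append(f"--- {name} ---\n{read_file(name)}")
    return "\n\n".join(parts)
```

Passing the reader in as a parameter is only there to make the function easy to exercise without touching the filesystem.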
The mention_only flag is a per-channel override of the default trigger rules, and it is still checked in the message handler, not by the model. In #blog, Karl only responds when mentioned. In #tts-pipeline, where ongoing work discussions happen, Karl can respond to any message, because the channel is scoped enough that false triggers are rare.
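One way the flag might combine with the trigger check, as a sketch. `passes_channel_gate` is a hypothetical name, and defaulting unconfigured channels to mention-only is my assumption rather than something stated above.

```python
CHANNEL_CONFIG = {
    "tts-pipeline": {"mention_only": False},
    "blog": {"mention_only": True},
}


def passes_channel_gate(channel: str, explicitly_triggered: bool, config: dict) -> bool:
    """Combine the trigger check with the per-channel mention_only flag.

    explicitly_triggered is the result of the mention/DM/reply check.
    """
    if explicitly_triggered:
        return True
    entry = config.get(channel)
    # Assumption: channels missing from the config behave as mention-only,
    # so an unconfigured channel can never open the bot up by accident.
    if entry is None:
        return False
    return not entry.get("mention_only", True)
```

With this shape, a message in #tts-pipeline passes without a mention, while the same message in #blog or in an unlisted channel is dropped before the model is involved.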
Channel-specific configuration is more reliable than asking the model to infer context from channel names. Inference works most of the time. Configuration works every time, because nothing is left for the model to guess.
Proactive Updates
The last piece is the one that surprised me. The bot was responsive — it answered when spoken to. But it was silent during long-running work. A TTS generation job that takes forty minutes? No updates. A task queue stall detected and recovered? Nobody knew.
People do not stare at a task queue. They want to be told when something finishes or fails.
The solution was a proactive update system. When a task completes, fails, or hits a milestone, Karl posts a brief update to the relevant channel. Not a wall of text. One or two lines.
✅ TTS generation complete for Chapter 14 (3m 42s). 3 chapters remaining in queue.
⚠️ Task code-review-auth-module has been in progress for 4h 12m. Marked stale and returned to queue.
📝 Article drafted: "Connecting an AI Assistant to Discord Without Losing Your Mind" — ready for review in #blog.
These updates are triggered by events in the task system, not by the LLM deciding to announce something. The task processor calls a notification function when a task changes state. The notification function checks which channel cares about this task type and posts the update.
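The event-driven flow can be sketched roughly like this. The routing table, the templates, the `Task` fields, and the `post` callback are all hypothetical; in a real bot `post` would send to the Discord channel, and the ❌ line is my own addition for the failure case.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical routing table: task type -> channel that cares about it.
ROUTES = {"tts": "tts-pipeline", "article": "blog"}

# One short line per reportable state; other states produce no message.
TEMPLATES = {
    "completed": "✅ {title} complete.",
    "failed": "❌ {title} failed.",
    "stale": "⚠️ {title} marked stale and returned to queue.",
}


@dataclass
class Task:
    task_type: str
    title: str
    state: str


def notify_state_change(task: Task, post: Callable[[str, str], None]) -> Optional[str]:
    """Called by the task processor on a state change; no LLM is involved."""
    channel = ROUTES.get(task.task_type)
    template = TEMPLATES.get(task.state)
    if channel is None or template is None:
        return None  # no channel subscribed, or not a reportable state
    post(channel, template.format(title=task.title))
    return channel
```

Because the messages come from fixed templates, an update cannot turn into a wall of text, and a state that is not in the table stays silent.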
This is what makes the bot feel like a team member rather than a search interface. It speaks when it has something to report. It stays quiet when it does not.
The General Principle
Discord integrations fail when the bot's behavior is governed by prompt instructions instead of code. "Only respond when mentioned" in a system prompt is a suggestion. The same rule in a message handler is a guarantee.
The patterns that work:
- Enforce trigger rules in code. The LLM should never see messages it is not supposed to respond to.
- Match context to channel scope. Loading everything into context for every channel wastes tokens and confuses the model.
- Use DMs for complex planning. Keep public channels clean for status and quick exchanges.
- Send proactive updates for long-running work. Silence during a forty-minute task is not a feature.
An AI assistant on Discord should be like a good colleague in a team chat. It speaks when it has something useful to say. It does not interject in conversations that are not its business. It gives you a heads-up when work is done. And when you need to think through a hard problem together, you take it to a private channel.
None of that requires sophisticated AI. It requires a message handler that enforces rules before the model gets involved, and a notification system that speaks when work completes. The model's job is generating good responses. The infrastructure's job is making sure those responses happen at the right time, in the right place.