Imagine the scene: it’s late on a Friday, and a critical server monitoring tool posts an automated alert into the #devops channel. The message reads, “CRITICAL: CPU utilization at 98% on primary database cluster.” But that channel is a constant firehose of CI/CD pipeline notifications, bot messages, and casual discussion. The critical alert, just one more block of black-and-white text in an ocean of updates, gets missed. Hours later, what could have been a five-minute fix becomes a full-blown weekend outage, all because the right information didn’t break through the noise. This isn’t a hypothetical scenario; it’s a daily reality in thousands of organizations that rely on Slack as their central nervous system. The platform’s greatest strength—concentrating all communication in one place—can also be its greatest weakness. The sheer volume of information creates a constant battle for attention, where truly urgent messages have the same visual weight as a new GIF.
The core challenge is signal versus noise. How do you ensure that time-sensitive, high-impact information not only gets delivered but is immediately recognized and acted upon? Standard notifications, even with @-mentions, are often insufficient. They become part of the background hum of digital work, easily dismissed or overlooked during a busy day. We need a way to elevate specific, critical signals so they transcend the limitations of text-based communication. This requires a system that doesn’t just add to the noise but cuts through it with an entirely different sensory input, one that’s impossible to ignore.
The solution lies in combining Slack’s evolution into an intelligent, proactive platform with the power of generative AI voice technology. Recently, Salesforce rebranded Slack as “the new agentic OS for work,” signaling a strategic shift. Slack is no longer just a messaging app; it’s an open platform designed for AI agents to operate within. With the introduction of its new Agent-Ready APIs, particularly the real-time search API, developers can now build applications that listen and react to conversations as they happen. In this technical walkthrough, we will show you exactly how to harness these new capabilities. You will learn, step by step, how to build a real-time AI assistant that monitors Slack for specific keywords and uses the ElevenLabs API to generate a clear, audible voice notification, transforming a silent text alert into an unmissable event.
The Foundation: Slack’s Shift to an “Agentic OS”
To fully grasp the power of the integration we’re about to build, it’s crucial to understand the fundamental shift happening within the Slack ecosystem. The platform is moving beyond passive communication and becoming an active environment where automated agents can perform complex tasks. This evolution is powered by a new suite of developer tools designed for the AI era.
What is an Agentic OS?
The term “agentic OS for work,” which Slack and Salesforce have begun to use, describes a system where software agents (bots, AIs, etc.) have the context and permissions to act on behalf of users. Instead of a user manually searching for information and then executing a task, an AI agent can monitor conversations, understand intent, gather necessary data, and initiate actions autonomously. It’s the difference between a simple notification bot and a true digital assistant that participates in workflows.
This framework allows for proactive problem-solving. An agent can detect the sentiment of a customer complaint in a support channel, automatically retrieve the customer’s history from Salesforce, and draft a response for a human agent to approve—all without being explicitly commanded for each step.
Introducing the New Agent-Ready APIs
At the heart of this new paradigm are Slack’s Agent-Ready APIs. While Slack has had a robust API for years, these new additions are specifically designed for real-time, context-aware applications.
- Real-Time Search API: This is the cornerstone of our project. Unlike traditional search, which is user-initiated, this API allows an application to subscribe to a continuous stream of events as messages are created and updated across public channels. It essentially gives your AI a live feed of the conversation, enabling it to react within moments of a message being posted.
- Model Context Protocol (MCP): MCP is an open protocol for giving AI models structured access to external context, and Slack’s MCP server uses it to share conversational context with AI models securely. When an AI agent needs to understand the history of a channel or thread before taking action, it can access that information without exposing sensitive data. While we won’t implement MCP in this guide, it’s a key part of the broader agentic vision.
By leveraging the real-time search API, we can build a listener that acts as the ears of our AI assistant, waiting for the trigger that sets our workflow in motion.
Prerequisites: Assembling Your Toolkit
Before we start coding, let’s gather the necessary tools and credentials. This is a technical walkthrough, but we’ve streamlined the process to make it as accessible as possible. Here’s what you’ll need:
- A Slack Workspace with Admin Rights: You’ll need to be able to create and install a new Slack App. If you don’t have a workspace for development, you can create a new one for free.
- An ElevenLabs Account and API Key: ElevenLabs is the text-to-speech engine we’ll use to generate our voice notifications. They offer a free tier that is more than sufficient for this project. Once you sign up, navigate to your profile to find your API key.
- A Development Environment: We’ll be using Node.js for this guide. To simplify setup and hosting, we recommend using a cloud-based environment like Glitch or Replit. This saves you from having to configure a local server and manage webhooks. You can simply remix a starter Node.js project.
With these three components ready, you have everything you need to begin building.
Step-by-Step Guide: Building Your Voice-Enabled Slack Assistant
Now we get to the hands-on portion. We’ll break this down into four clear steps: setting up the Slack app, building the listener, integrating ElevenLabs for voice generation, and posting the audio file back into Slack.
Step 1: Creating and Configuring Your Slack App
First, your assistant needs an identity within your Slack workspace.
- Create the App: Go to the Slack API dashboard and click “Create New App.” Choose “From scratch,” give your app a name (e.g., “Voice Notifier”), and select the workspace you want to install it in.
- Enable Socket Mode: This is what allows our app to receive real-time events without a public-facing URL. In the app’s settings, go to “Socket Mode” and enable it. Slack will generate an app-level token (starting with xapp-) with the connections:write scope. Save this; it’s like a password for your app.
- Subscribe to Events: Navigate to “Event Subscriptions.” Enable events and, under “Subscribe to workspace events,” add the search.public.updated event. This tells Slack to send a notification to your app whenever a message is indexed for search in a public channel, effectively in real time.
- Set Bot Token Scopes: Go to “OAuth & Permissions.” Under “Scopes,” add the following permissions for your bot token: search:read (to access search data), files:write (to upload the audio file), and chat:write (to post a message with the file).
- Install the App: At the top of the “OAuth & Permissions” page, click “Install to Workspace.” Authorize the installation. Slack will now generate a Bot User OAuth Token (starting with xoxb-). Save this token as well.
You now have two critical tokens: the app-level token (xapp-) for connecting via Socket Mode and the Bot User OAuth token (xoxb-) for making API calls like uploading files.
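Because a swapped or missing token is one of the easiest setup mistakes to make, it can help to check both values before the app starts. The following is a minimal sketch, not part of Slack’s SDK: checkSlackTokens is a hypothetical helper, and the environment variable names SLACK_BOT_TOKEN and SLACK_APP_TOKEN are assumed to match the ones used in the listener code in Step 2.

```javascript
// Sketch: validate the two Slack tokens before starting the app.
// The prefixes (xoxb-, xapp-) come from the setup steps above; the function
// name and the shape of the return value are illustrative assumptions.
function checkSlackTokens(env) {
  const problems = [];
  const expected = [
    { name: "SLACK_BOT_TOKEN", prefix: "xoxb-" }, // Bot User OAuth token
    { name: "SLACK_APP_TOKEN", prefix: "xapp-" }, // App-level token for Socket Mode
  ];
  for (const { name, prefix } of expected) {
    const value = env[name];
    if (!value) {
      problems.push(`${name} is not set`);
    } else if (!value.startsWith(prefix)) {
      problems.push(`${name} should start with "${prefix}"`);
    }
  }
  return problems;
}

// Example with the two tokens accidentally swapped for the same type:
// the result reports that SLACK_APP_TOKEN has the wrong prefix.
console.log(
  checkSlackTokens({ SLACK_BOT_TOKEN: "xoxb-123", SLACK_APP_TOKEN: "xoxb-456" })
);
```

Calling this with process.env at startup lets the app stop early with a readable message instead of failing with an authentication error on the first API call.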
Step 2: Listening for Real-Time Events with the Search API
With the app configured, it’s time to write the code that listens for messages. In your Node.js environment (like Glitch), install the @slack/bolt package (npm install @slack/bolt), which simplifies interacting with the Slack API.
// Import the App class from the @slack/bolt package
const { App } = require("@slack/bolt");
// Initialize your app with your tokens
const app = new App({
  token: process.env.SLACK_BOT_TOKEN, // Your xoxb- token
  appToken: process.env.SLACK_APP_TOKEN, // Your xapp- token
  socketMode: true,
});
// Define the keywords you want to monitor
const KEYWORDS = ["urgent", "critical", "server down", "brand mention"];
// Listen for the 'search.public.updated' event
app.event("search.public.updated", async ({ event, client }) => {
  console.log(`Received a search event: ${JSON.stringify(event)}`);
  // The message details are in event.item.message
  const message = event.item.message;
  // The text may be absent, so default to an empty string before lowercasing
  const messageText = (message.text || "").toLowerCase();
  // Check if the message contains any of our keywords
  const hasKeyword = KEYWORDS.some(keyword => messageText.includes(keyword));
  if (hasKeyword) {
    console.log(`Keyword detected in message: ${message.text}`);
    // Construct the text for our voice notification
    const notificationText = `Alert in channel ${event.item.channel.name}. The message says: ${message.text}`;
    // In the next step, we'll call the function to generate and post audio
    // await generateAndPostVoiceNotification(notificationText, client);
  }
});
(async () => {
  // Start your app
  await app.start();
  console.log('⚡️ Bolt app is running!');
})();
This code sets up a listener. When a message containing one of your keywords is posted in a public channel, it will log a confirmation and construct the text to be converted to speech.
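One limitation of plain substring matching is that “critical” also matches “noncritical” and “urgent” matches “insurgent.” As a hedged alternative, the sketch below matches keywords only as whole words or phrases using a single regular expression. The helper names (escapeRegExp, buildKeywordPattern, findKeyword) are our own and are not part of Bolt.

```javascript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one case-insensitive pattern that matches any keyword as a whole
// word or phrase (\b marks a word boundary).
function buildKeywordPattern(keywords) {
  const alternatives = keywords.map(escapeRegExp).join("|");
  return new RegExp(`\\b(?:${alternatives})\\b`, "i");
}

// Return the matched keyword in lowercase, or null when nothing matches.
function findKeyword(text, pattern) {
  const match = pattern.exec(text || "");
  return match ? match[0].toLowerCase() : null;
}

const pattern = buildKeywordPattern([
  "urgent",
  "critical",
  "server down",
  "brand mention",
]);
console.log(findKeyword("CRITICAL: CPU utilization at 98%", pattern)); // "critical"
console.log(findKeyword("This is a noncritical cleanup task", pattern)); // null
```

Inside the listener, findKeyword(message.text, pattern) could stand in for the KEYWORDS.some(...) check. Note that this sketch expects multi-word phrases to be separated by exactly one space.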
Step 3: Generating Audio with the ElevenLabs API
Now, when a keyword is detected, we’ll call the ElevenLabs API to generate the audio. You’ll need a package like axios to make the HTTP request (npm install axios).
const axios = require('axios');
const fs = require('fs');
const os = require('os');
const path = require('path');
async function textToSpeech(text) {
  const ELEVENLABS_API_KEY = process.env.ELEVENLABS_API_KEY;
  // This is a sample Voice ID. Choose one you like from the ElevenLabs API documentation.
  const VOICE_ID = '21m00Tcm4TlvDq8ikWAM';
  const response = await axios({
    method: 'post',
    url: `https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}`,
    headers: {
      'Accept': 'audio/mpeg',
      'xi-api-key': ELEVENLABS_API_KEY,
      'Content-Type': 'application/json',
    },
    data: {
      text: text,
      model_id: 'eleven_monolingual_v1',
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.5
      }
    },
    responseType: 'stream'
  });
  // Save the audio stream to a timestamped temp file so that
  // near-simultaneous alerts do not overwrite each other's audio
  const filePath = path.join(os.tmpdir(), `notification-${Date.now()}.mp3`);
  const writer = fs.createWriteStream(filePath);
  response.data.pipe(writer);
  return new Promise((resolve, reject) => {
    writer.on('finish', () => resolve(filePath));
    writer.on('error', reject);
    // Also reject if the download itself fails partway through
    response.data.on('error', reject);
  });
}
This function takes text, sends it to ElevenLabs, and saves the resulting MP3 audio stream to a temporary file.
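Raw Slack message text often contains markup such as <@U024BE7LH> or <https://example.com|label>, which sounds awkward when read aloud, and long messages consume more of a character-metered text-to-speech quota. The sketch below cleans and shortens the text before it is passed to textToSpeech. The function name toSpeakableText, the replacement wording, and the 300-character default are all assumptions made for illustration.

```javascript
// Sketch: turn raw Slack message text into something that reads well aloud.
function toSpeakableText(text, maxChars = 300) {
  const cleaned = (text || "")
    // <https://example.com|label> -> label
    .replace(/<(?:https?:\/\/|mailto:)[^|>]*\|([^>]+)>/g, "$1")
    // <https://example.com> -> "a link"
    .replace(/<(?:https?:\/\/|mailto:)[^>]*>/g, "a link")
    // <#C123|general> -> "channel general"
    .replace(/<#[A-Z0-9]+\|([^>]+)>/g, "channel $1")
    // <#C123> -> "a channel"
    .replace(/<#[A-Z0-9]+>/g, "a channel")
    // <@U123> -> "a user" (resolving real names would need an extra API call)
    .replace(/<@[A-Z0-9]+(?:\|[^>]+)?>/g, "a user")
    // Drop simple formatting markers: *bold*, ~strike~, `code`
    .replace(/[*~`]/g, "")
    // Collapse runs of whitespace, including newlines
    .replace(/\s+/g, " ")
    .trim();
  if (cleaned.length <= maxChars) return cleaned;
  // Cut at the limit and say so, so that listeners know to open Slack
  return cleaned.slice(0, maxChars).trimEnd() + " (message truncated)";
}

console.log(
  toSpeakableText("Deploy failed, see <https://ci.example.com/42|build 42> cc <@U024BE7LH>")
);
// "Deploy failed, see build 42 cc a user"
```

A call such as textToSpeech(toSpeakableText(notificationText)) would keep the spoken alert short and free of angle-bracket markup.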
Step 4: Posting the Voice Notification Back to Slack
Finally, let’s upload the generated MP3 file to a specific channel, like #critical-alerts. We’ll create one last function to tie everything together. 
async function generateAndPostVoiceNotification(text, client) {
  try {
    // 1. Generate the audio file from text
    const audioFilePath = await textToSpeech(text);
    console.log(`Audio file generated at: ${audioFilePath}`);
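    // 2. Upload the file to Slack. Slack has deprecated files.upload in favor of
    //    files.uploadV2, which takes a channel ID (e.g. "C0123456789"), not a name.
    await client.files.uploadV2({
      // ALERT_CHANNEL_ID is an assumed env var holding the ID of #critical-alerts;
      // make sure the channel exists and the bot has been added to it
      channel_id: process.env.ALERT_CHANNEL_ID,
      initial_comment: `🚨 New Voice Alert! Play the attached audio. 🚨`,
      file: fs.createReadStream(audioFilePath),
      filename: 'alert.mp3'
    });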
    console.log('Voice notification posted to Slack.');
  } catch (error) {
    console.error('Error in processing notification:', error);
  }
}
Now, simply uncomment the function call in your app.event listener from Step 2, and your assistant is complete!
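Since the listener reacts to messages being created and updated, the same message could trigger more than one voice alert, for example if the event fires again after an edit. One possible safeguard, sketched below under that assumption, is a small in-memory deduplicator. The class name AlertDeduplicator, the five-minute default window, and the key format are hypothetical choices.

```javascript
// Sketch: suppress repeat alerts for the same key within a time window.
class AlertDeduplicator {
  constructor(windowMs = 5 * 60 * 1000) {
    this.windowMs = windowMs;
    this.lastSeen = new Map(); // key -> timestamp (ms) of the last alert
  }

  // Returns true if an alert should fire for this key, false if it is a repeat.
  // The `now` parameter exists so the behavior can be tested without waiting.
  shouldAlert(key, now = Date.now()) {
    // Forget entries older than the window so the map cannot grow without bound
    for (const [k, t] of this.lastSeen) {
      if (now - t >= this.windowMs) this.lastSeen.delete(k);
    }
    if (this.lastSeen.has(key)) return false;
    this.lastSeen.set(key, now);
    return true;
  }
}

const dedupe = new AlertDeduplicator();
console.log(dedupe.shouldAlert("devops:1700000000.000100")); // true (first time)
console.log(dedupe.shouldAlert("devops:1700000000.000100")); // false (repeat)
```

In the listener, the key could combine the channel name with the message timestamp, assuming the event payload exposes one, and the call to generateAndPostVoiceNotification would run only when shouldAlert returns true. Because the state lives in memory, it resets whenever the app restarts.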
Enterprise Use Cases: Beyond Simple Notifications
While our example focuses on a general alert system, this architecture can be adapted for powerful, specific enterprise workflows.
Critical Incident Response
For DevOps and SRE teams, text alerts for system failures can become background noise. An unmissable voice notification delivered to a dedicated channel or even directly to an on-call engineer can dramatically reduce response times. It can serve as a powerful final step in an automated escalation policy.
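To make the escalation idea concrete, here is a rough sketch of routing an alert by severity before any audio is generated. Everything in it is a placeholder: the ROUTES table, the channel names, and the pageOnCall flag are invented for illustration and would need to reflect your own escalation policy.

```javascript
// Sketch: pick a destination per severity. Routes are ordered from most to
// least severe, and the first route with a matching keyword wins.
const ROUTES = [
  {
    severity: "critical",
    keywords: ["critical", "server down"],
    channel: "critical-alerts",
    pageOnCall: true,
  },
  {
    severity: "warning",
    keywords: ["urgent"],
    channel: "devops",
    pageOnCall: false,
  },
];

// Returns the routing decision for a message, or null if no keyword matches.
function routeAlert(text) {
  const lower = (text || "").toLowerCase();
  for (const route of ROUTES) {
    if (route.keywords.some((keyword) => lower.includes(keyword))) {
      return {
        severity: route.severity,
        channel: route.channel,
        pageOnCall: route.pageOnCall,
      };
    }
  }
  return null;
}

console.log(routeAlert("CRITICAL: CPU utilization at 98%"));
// { severity: 'critical', channel: 'critical-alerts', pageOnCall: true }
```

Keeping the policy in a plain data table means the thresholds can change without touching the listener or the audio code.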
Real-Time Brand Monitoring
Marketing and social media teams can use this to monitor public channels or shared Slack Connect channels for mentions of their brand, products, or key competitors. Hearing a brand mention in real-time allows for faster engagement and sentiment analysis, turning a passive monitoring process into an active one.
Enhanced Accessibility
This system can be a game-changer for accessibility. Important company-wide announcements can be automatically converted into audio format and posted in a dedicated channel, ensuring that visually impaired team members have equal and immediate access to critical information, making the digital workspace more inclusive for everyone.
It was late on a Friday when the alert came. But this time, it wasn’t a silent line of text buried in a channel. Instead, a clear, calm, AI-generated voice spoke from the speakers in the operations room: “Alert in channel #devops. The message says: CRITICAL: CPU utilization at 98% on primary database cluster.” The on-call engineer heard it instantly, addressed the issue in minutes, and the weekend outage never happened.
By combining Slack’s real-time APIs with the clarity of AI-generated voice, you can build systems that don’t just add to the conversation but command attention when it matters most. You’ve seen how to build the listener, generate the audio, and deliver the notification, so you have the blueprint to cut through the noise. Ready to build your own AI voice assistant in Slack? Try ElevenLabs for free now and bring your conversations to life.



