Do you have a favorite fictional or non-fictional character? Let’s build an AI model that speaks just like your favorite character. Then we’ll deploy that model as a chatbot on Discord, a popular messaging platform.
What’s great about this tutorial is that all the coding and deployment happen in the cloud (for free!). So you don’t even need a local IDE to follow along. You can follow along using the code in my GitHub repository:
You can also check out this tutorial on YouTube:
Outline of this Tutorial
There are four steps that we need to tackle:
How to Prepare the Data
For our chatbot to learn to converse, we need text data in the form of dialogues. This data is what teaches our chatbot how to respond to different exchanges and contexts.
There are a lot of interesting datasets on Kaggle for popular cartoons, TV shows, and other media. For example:
I chose a character named Joshua from my favorite adventure video game, The World Ends With You (TWEWY). The game has a ton of rich and quirky dialogue, so I am confident that my chatbot will develop a unique personality 🙂
With some simple parsing and text cleaning, I turned the game’s transcript into a dataset similar to the other Kaggle datasets.
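As a rough sketch of what that parsing can look like, suppose each line of the transcript is written as "Speaker: text". That layout, the `name` and `line` column names, and the placeholder sample lines are all assumptions for illustration, so adapt them to your own transcript.

```python
import csv
import io
import re

# Assumed layout: one utterance per line, written as "Speaker: text".
LINE_PATTERN = re.compile(r"^\s*([^:]{1,40}):\s*(.+?)\s*$")


def parse_transcript(raw_text):
    """Return a list of {"name", "line"} dicts, skipping lines that don't match."""
    rows = []
    for raw_line in raw_text.splitlines():
        match = LINE_PATTERN.match(raw_line)
        if match is None:
            continue  # stage directions, blank lines, headings, and so on
        name, line = match.groups()
        # Collapse runs of whitespace left over from the source page.
        rows.append({"name": name.strip(), "line": re.sub(r"\s+", " ", line)})
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "line"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder text, not real game dialogue.
sample = """Speaker A: Hello   there.
(Speaker A walks away)
Speaker B: Wait, come back!
"""
rows = parse_transcript(sample)
csv_text = to_csv(rows)
```

Lines that do not fit the pattern are dropped silently, so it is worth spot-checking the output before training on it.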
As a rule of thumb, 1000+ lines of dialogue for your character should be enough. Now let’s proceed to the exciting part – training the model.
How to Train the Model
This step is largely inspired by a Medium post on training a conversational model using transcripts from Rick and Morty. We will use its Google Colab notebook for training as well.
Under the hood, the model is a Generative Pre-trained Transformer (GPT), one of the most popular kinds of language model these days.
Instead of training from scratch, we will load Microsoft’s pre-trained GPT, DialoGPT-small, and fine-tune it using our dataset. For more theory on how GPT works, refer back to the Medium post.
In the notebook, all we need to change is the dataset we use. To prepare the data for training, the code converts our data into a contexted format, in which each response is paired with the lines of dialogue that precede it.
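To make the contexted format concrete, here is a minimal sketch of the idea. It is not the notebook's own code: the function name, the dictionary layout, and the default window of 7 previous lines are assumptions for illustration.

```python
def build_contexted_rows(lines, n_context=7):
    """Pair each response with the n_context lines that precede it."""
    rows = []
    for i in range(n_context, len(lines)):
        # Previous lines, most recent first.
        context = [lines[j] for j in range(i - 1, i - 1 - n_context, -1)]
        rows.append({"response": lines[i], "context": context})
    return rows


dialogue = ["line 0", "line 1", "line 2", "line 3"]
rows = build_contexted_rows(dialogue, n_context=2)
```

With a window of 2, the four placeholder lines above produce two rows: "line 2" with context ["line 1", "line 0"], and "line 3" with context ["line 2", "line 1"]. The first `n_context` lines never appear as responses because they lack a full window of context.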
Running through the entire notebook using Google Colab’s free GPU will create a folder named output-small that contains the model file in /content/drive/My Drive/Colab Notebooks. I have about 700 lines and the training takes less than ten minutes.
Feel free to train a larger and smarter model like DialoGPT-medium or even DialoGPT-large, as well as increase the number of training epochs by searching for num_train_epochs in the notebook.
How to Host the Model
We will host the model on Hugging Face, which provides a free API for us to query the model. Note that the free account has a limit of 30k input characters per month.
Sign up for Hugging Face and create a new model repository by clicking on New model. Obtain your API token by going to Edit profile > API Tokens.
Back in Google Colab, in the same notebook, we will push the model and the tokenizer to the Model Hub.
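A minimal sketch of the push step might look like the following. It assumes the `transformers` and `huggingface_hub` packages are available in Colab and that the fine-tuned files live in the output-small folder; the helper names and the placeholder repository name are hypothetical.

```python
def login_to_hub(token):
    """Authenticate this session with the Hugging Face Hub."""
    # Imported lazily so push_artifacts can be used without the package installed.
    from huggingface_hub import login

    login(token=token)


def push_artifacts(model, tokenizer, repo_name):
    """Upload the model weights and the tokenizer files to the same repository."""
    model.push_to_hub(repo_name)
    tokenizer.push_to_hub(repo_name)
    return repo_name


# Assumed usage in Colab (placeholders in angle brackets are yours to fill in):
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   model = AutoModelForCausalLM.from_pretrained("output-small")
#   tokenizer = AutoTokenizer.from_pretrained("output-small")
#   login_to_hub("<your API token>")
#   push_artifacts(model, tokenizer, "<your-model-repo>")
```

Pushing the tokenizer alongside the model matters because the hosted model needs both to turn text into tokens and back.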
Then you should be able to go to your model page in a web browser and see that the model is pushed.
For our chatbot model to be recognized as a conversational model, we need to add a tag to the Model Card (README.md) before any markdown.
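The tag goes in a YAML metadata block at the very top of the file. The sketch below prepends such a block to existing README text. The tag name `conversational` is my assumption about how the Hub labels this kind of model, so check it against the current Hugging Face documentation; the function name is hypothetical.

```python
# Assumed metadata block; the tag name may differ on the current Hub.
FRONT_MATTER = "---\ntags:\n- conversational\n---\n"


def add_conversational_tag(readme_text):
    """Prepend the metadata block unless the README already starts with one."""
    if readme_text.startswith("---"):
        return readme_text  # existing metadata: safer to edit it by hand
    return FRONT_MATTER + "\n" + readme_text


tagged = add_conversational_tag("# My Chatbot\n")
```

You can also simply paste the four metadata lines at the top of README.md in the browser editor, which is all this helper does.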
You can start chatting with the model in the browser! Try to see if the chatbot has learned the name of your chosen character 🙂
Now let’s move beyond chatting in the browser and deploy the chatbot model as a Discord bot.
How to Build the Chatbot
Go to the Discord Developers page, create an application, and add a bot to it. Since our chatbot is only going to respond to user messages, checking General Permissions > View Channels and Text Permissions > Send Messages in the Bot Permissions Setting is sufficient. Copy the bot’s API token for later use.
Sign up for Repl.it and create a new Python Repl.
Let’s store our API tokens for Hugging Face and Discord as environment variables, named HUGGINGFACE_TOKEN and DISCORD_TOKEN respectively. This helps keep them secret.
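Reading the tokens back in Python can be sketched like this. The helper name `load_tokens` is hypothetical; the variable names are the ones chosen above.

```python
import os


def load_tokens(environ=os.environ):
    """Return (huggingface_token, discord_token), failing early if either is unset."""
    names = ("HUGGINGFACE_TOKEN", "DISCORD_TOKEN")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ["HUGGINGFACE_TOKEN"], environ["DISCORD_TOKEN"]
```

Failing at startup with a clear message is easier to debug than an authentication error that only shows up on the first request.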
In main.py, we will create our Discord bot. I will break down the code step by step.
Imports
Hugging Face API Endpoint
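Here is a minimal sketch of querying the hosted model using only the standard library. The base URL and the payload shape are assumptions based on the Hugging Face Inference API as I understand it, and both may have changed, so verify them against the current documentation. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL of the hosted inference API.
API_BASE = "https://api-inference.huggingface.co/models/"


def build_request(model_name, token, user_text):
    """Create the POST request that carries the user's message to the model."""
    payload = {"inputs": {"text": user_text}}  # assumed payload shape
    return urllib.request.Request(
        API_BASE + model_name,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(model_name, token, user_text, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(model_name, token, user_text)
    # urlopen raises urllib.error.HTTPError on non-2xx replies,
    # for example while the model is still loading.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting the request construction from the network call keeps the first part easy to inspect without spending any of the monthly character quota.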
Discord Bot Client Definition
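The client can be sketched as follows, assuming discord.py 2.x and Python 3.9 or later. The names `format_reply`, `make_client`, and `query_model` are hypothetical, and the response keys `generated_text` and `error` are assumptions about what the hosted model returns.

```python
import asyncio


def format_reply(api_response):
    """Turn the model's JSON response into the text the bot should send."""
    text = api_response.get("generated_text")  # assumed response key
    if text:
        return text
    if "error" in api_response:
        return "`Error: {}`".format(api_response["error"])
    return "Hmm... something is not right."


def make_client(query_model):
    """Build a client that answers each user message with query_model(text)."""
    import discord  # discord.py, imported lazily so format_reply works without it

    intents = discord.Intents.default()
    intents.message_content = True  # required to read message text in discord.py 2.x

    class ChatClient(discord.Client):
        async def on_ready(self):
            print("Logged in as", self.user)

        async def on_message(self, message):
            if message.author.id == self.user.id:
                return  # never reply to our own messages
            async with message.channel.typing():
                # Run the blocking HTTP call in a thread so the event loop stays free.
                response = await asyncio.to_thread(query_model, message.content)
            await message.channel.send(format_reply(response))

    return ChatClient(intents=intents)
```

`query_model` is any function that takes the message text and returns the decoded JSON from the model. With discord.py 2.x you may also need to enable the Message Content intent for the bot on the Discord developer site. Starting the bot is then a matter of calling `run` on the returned client with the Discord token.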
Main Function
How to Deploy the Bot to Discord
With that, our bot is ready to go! Start the Repl script by hitting Run, add the bot to a server, type something in the channel, and enjoy the bot’s witty response.
Well, as expected from my favorite quirky video game character 😀 See the end of this post for a 15-minute video of a real-time chat among me, my friend, and my bot.
How to Keep the Bot Online
One little problem with our bot is that it halts as soon as we close the Repl.it script. There are two ways around this.
If you have a Repl.it Hacker plan, which is available with university email addresses, then you can set the Repl project to Always On.
Otherwise, you can wrap your Python script in a Flask web app and use a service like Uptime Robot to ping your web app every five minutes to keep it awake.
You can read the last section in this freeCodeCamp Discord bot tutorial to learn more.
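To illustrate the idea behind the second option, here is a minimal keep-alive server. Note that it swaps Flask for Python's built-in `http.server` so that it runs without extra dependencies; the handler name, the `keep_alive` function, and port 8080 are assumptions for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Answer every ping with a tiny plain-text page."""

    BODY = b"Bot is alive"

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(self.BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(self.BODY)

    def do_HEAD(self):
        # Some monitoring services send HEAD instead of GET.
        self._send_headers()

    def log_message(self, format, *args):
        pass  # avoid printing one log line per ping


def keep_alive(port=8080):
    """Start the ping server in a background thread and return it."""
    server = HTTPServer(("0.0.0.0", port), PingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Call `keep_alive()` just before starting the Discord client. The server then runs in the background while the bot's blocking call keeps the process alive, and the monitoring service has a URL to ping.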
Final Thoughts and Resources
Cheers! You have reached the end of this tutorial. Hope you had fun creating Discord bots.
As a side note, my biggest takeaway from this project was perhaps not the technical part but rather knowing when to pivot and redefine a project. Although the user-facing component of this project turned out to be a Discord bot, my original vision was to build a React.js front-end for the chatbot.
However, as soon as I saw the slick chat UI on Hugging Face, I realized that it was something I could never dream of building with just a bunch of React.js crash courses.
So I decided to keep my backend but change my front-end/user-facing component to something more unique and interesting – a Discord bot that never fails to entertain me and my friends on our chat server. 🙂