Friday, April 25, 2014

Tutorial - Create a reddit bot with python and Heroku

Heroku is a popular Platform as a Service (PaaS) host for deploying applications in multiple languages. You can run a limited number of small applications for free, Heroku automatically restarts your application if it crashes, and deployment is fairly simple.

Prerequisites

There are a few tools that you'll need to have installed before you start:

  • Heroku toolbelt (Heroku has good installation instructions here. You only need to follow steps 1-3.)
  • git
  • python (any version is fine, I'll use 3.4 in my examples)
  • virtualenv (not exactly necessary, but useful)

Setup instructions

Start by making a project directory and a virtual environment.

mkdir redditbot
cd redditbot
virtualenv --python=python3.4 env
source env/bin/activate

This will create a virtual python environment in a directory named env. If you run python --version you should see the version that you requested and not your system default python.

Next, you'll need to install PRAW, which will be used for interacting with reddit.

pip install praw

This will install praw into your virtual environment.

redditbot

I'll give my reddit bot a really original name: redditbot. I'm not going to explain how to use PRAW, but I will give a really simple example bot and point you to the documentation here. This bot is a slightly modified version of the "quick peek" example bot from the PRAW documentation.

import os
import time

import praw

# reddit really wants you to use a unique user agent string.
# see https://github.com/reddit/reddit/wiki/API#rules
r = praw.Reddit(user_agent='redditbot 0.1 by /u/')
# login isn't strictly needed here since we're not
# posting, commenting, etc.
# you'll need to set the REDDIT_USER and REDDIT_PASS
# environment variables before you run this bot
r.login(os.environ['REDDIT_USER'], os.environ['REDDIT_PASS'])

while True:
    for submission in r.get_subreddit('learnpython').get_hot(limit=5):
        print(submission)
    time.sleep(30)

Save this into redditbot.py, and you're ready to deploy the application to Heroku.

Deployment

First, you'll need to create a Heroku application.

heroku create

In the output, you should see the application name (I'll use "rapid-brook-5928" as a made-up example) and a git address. After this, you can create a git repository and connect it to your Heroku app by adding the Heroku git repo as a remote.

git init
heroku git:remote -a rapid-brook-5928

Now, you'll need to create a few files to tell Heroku (and git) what to do to create and run your app.

requirements.txt

This file can be created for you by pip. You should recreate it any time you install or update any libraries in your virtual environment.

pip freeze > requirements.txt

runtime.txt

This file tells Heroku which version of python to use to run your application. You should set it to the version you're using in your virtual environment.

python-3.4.0
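
If you'd rather not type the version by hand, here is a minimal sketch that builds a string in the same format from whichever interpreter runs it. The function name heroku_runtime_string is my own invention for illustration, not something Heroku or pip provides, and you would still need to check that Heroku supports the exact version it prints.

```python
import sys


def heroku_runtime_string(version_info=sys.version_info):
    """Format a version tuple in the runtime.txt style, e.g. python-3.4.0.

    heroku_runtime_string is a hypothetical helper, not a Heroku API.
    """
    major, minor, micro = version_info[:3]
    return "python-{}.{}.{}".format(major, minor, micro)


if __name__ == "__main__":
    # Run this inside the activated virtual environment.
    print(heroku_runtime_string())
```

Running it inside your virtual environment and redirecting the output into runtime.txt should give you a line that matches the interpreter you developed with.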

Procfile

This file tells Heroku about the processes needed for your app. The free tier only supports one process per application, and we're going to make it a worker.

worker: python redditbot.py

.gitignore

This last one will keep git from storing unneeded files (or sending them to Heroku).

__pycache__/
env/
*.pyc

Remember the REDDIT_USER and REDDIT_PASS environment variables that redditbot is using? You need to set those in your Heroku environment.

heroku config:set REDDIT_USER=AzureDiamond REDDIT_PASS=hunter2
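
Because the bot reads these values with os.environ[...], a missing variable only shows up as a bare KeyError in the log. Here is a small sketch of a more explicit check. The names load_credentials and REQUIRED_VARS are hypothetical and only for illustration; they are not part of PRAW or Heroku.

```python
import os

# Names of the variables the example bot expects (taken from redditbot.py).
REQUIRED_VARS = ("REDDIT_USER", "REDDIT_PASS")


def load_credentials(environ=os.environ):
    """Return (user, password), or raise a readable error if either is unset.

    load_credentials is a hypothetical helper, not a PRAW or Heroku API.
    """
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "missing environment variables: " + ", ".join(missing)
        )
    return environ["REDDIT_USER"], environ["REDDIT_PASS"]
```

In redditbot.py you could then call r.login(*load_credentials()), so a forgotten heroku config:set produces a message that names the missing variable.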

Now that you have all of the files created, it is time to push them to Heroku.

# add all files in current directory to git
git add .
git commit -m "creating redditbot"
# push your changes to Heroku
# this will deploy/redeploy your app on Heroku
# Heroku will install all of the libraries
# in requirements.txt and create a worker
git push heroku master

At this point, your bot is not yet running. You still need to log in to Heroku, go to your application dashboard, and increase the number of dynos allocated to your application to 1 (make sure to click on the "Apply Changes" button). For my example app, I would go to https://dashboard.heroku.com/apps/rapid-brook-5928/resources.

Viewing the output

At this point, the bot doesn't do anything besides print the five hottest posts in /r/learnpython every 30 seconds. Everything the bot prints (including stacktraces when it crashes) goes to the Heroku log, which you can view with this command:

heroku logs

Where to go from here

Now that you're done creating a basic reddit bot, you'll want to expand it and do more. Explore the PRAW documentation and the reddit API to learn how to create posts or respond to comments. If you do respond to comments, you'll want to remember which comments you've responded to. You can do this with a list you store in memory, or you can use a database. Good luck!
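
As a rough sketch of the "remember which comments you've responded to" idea, here is one way to do it with the sqlite3 module from the standard library. The class name SeenComments and the table layout are my own assumptions, not anything from PRAW, and the comment ids are treated as plain strings. With the default ":memory:" path, everything is forgotten when the bot restarts.

```python
import sqlite3


class SeenComments:
    """Hypothetical helper that records ids of comments already handled."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the data only for the life of the process;
        # pass a file path to keep it between runs.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS seen (comment_id TEXT PRIMARY KEY)"
        )
        self.conn.commit()

    def __contains__(self, comment_id):
        row = self.conn.execute(
            "SELECT 1 FROM seen WHERE comment_id = ?", (comment_id,)
        ).fetchone()
        return row is not None

    def mark(self, comment_id):
        """Record comment_id; return True if it had not been seen before."""
        if comment_id in self:
            return False
        self.conn.execute(
            "INSERT INTO seen (comment_id) VALUES (?)", (comment_id,)
        )
        self.conn.commit()
        return True
```

In the bot's loop you would call mark() with each comment's id and only reply when it returns True. Keep in mind that files written on a Heroku dyno do not survive a restart, so a SQLite file there is no more durable than the in-memory list; a hosted database would be the longer-term option.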

1 comment:

Unknown said...

Hi, git push heroku master is giving me an error:

Counting objects: 8, done.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (8/8), 1.58 KiB | 0 bytes/s, done.
Total 8 (delta 0), reused 0 (delta 0)
remote: Compressing source files... done.
remote: Building source:
remote:
remote: -----> Python app detected
remote: -----> Installing python-2.7.0
remote: $ pip install -r requirements.txt
remote: /app/tmp/buildpacks/python/bin/steps/pip-install: line 8: /app/.heroku/python/bin/pip: No such file or directory
remote:
remote: ! Push rejected, failed to compile Python app
remote:
remote: Verifying deploy...
remote:
remote: ! Push rejected to powerful-ridge-37398.
remote:
To https://git.heroku.com/powerful-ridge-37398.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://git.heroku.com/powerful-ridge-37398.git'