MattWeekend

Introduction

What are we doing?

We’re setting up a web server that will send people a GIF of their choosing

Sounds too good to be true

I know right

Is this a tutorial?

Kinda. I’m not going to go through the full installation steps, because that would take forever. The main focus here is to explore the concepts. Here’s the code if you want a line by line breakdown: Github Link

Will you help me?

Of course, of course. I enjoy helping with this sort of stuff. My email: lamers@outlook.com

What Is it going to look like?

Just send an email to gifBotz@gmail.com, and in about 5 mins you'll receive a GIF of whatever you put in the subject line. Try it!

Reading Emails With Python

The first thing we need to do is set up an email address to receive messages. I’d recommend Gmail, just because their API is easy to work with. Also, make sure to create a new email address instead of using your primary one. We don’t want to accidentally send an email to everyone in your address book :) Once you create the Gmail address, turn on “less secure apps”; otherwise Gmail will block your SMTP requests. Link

For connecting to Gmail, we’re mainly going to use the imapclient and pyzmail libraries. The code below is just configuration jargon that logs us into our Gmail account and returns a list of message IDs, one for each unread email. Make sure to change these values to your own address/password.


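#Settings for reading Email
import imaplib
import smtplib

import imapclient

import config  # local module that holds our address and password

our_gmail_address = config.our_gmail_address
gmail_password = config.gmail_password

# Raise imaplib's line-length cap so large emails don't trigger an error
imaplib._MAXLINE = 1000000
imap = imapclient.IMAPClient('imap.gmail.com')
imap.login(our_gmail_address, gmail_password)
imap.select_folder('INBOX')
# Open the SMTP connection now; we'll use it later to send replies
server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
server.ehlo()
server.login(our_gmail_address, gmail_password)
unread_emails = imap.search(['UNSEEN'])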

Let’s cycle through the objects and extract details such as the body text and email subject.


incoming_list = []

for single_email in unread_emails:
    raw_mail = imap.fetch([single_email],['BODY[]','FLAGS'])
    messageDict = messageValues(raw_mail, single_email)
    incoming_list.append(messageDict["incoming_email_no_space"])

Now we’re going to create a helper function to create a dictionary of all the message values, so that we can easily access them. Let’s also create some new values with the special characters removed. When people send us an email, they can include any junk they want. These regex substitutions remove symbols such as '*', '=', etc. that might mess up our program.


def messageValues(raw_mail, single_email):
    message = pyzmail.PyzMessage.factory(raw_mail[single_email][b'BODY[]'])
    email_subject = message.get_subject()
    email_subject_alphanumeric = re.sub(r"[^\w\s]", '', email_subject)
    email_subject_no_space = re.sub(r"\s+", '+', email_subject_alphanumeric)
    incoming_email = message.get_addresses('from')
    incoming_email = incoming_email[0][1]
    incoming_email_alphanumeric = re.sub(r"[^\w\s]", '', incoming_email)
    incoming_email_no_space = re.sub(r"\s+", '+', incoming_email_alphanumeric)
    reply_substring = "Per Your Request"
    if reply_substring in email_subject_alphanumeric:
        is_reply = "yes"
    else:
        is_reply = "no"

    subjectDict = {
        "email_subject": email_subject,
        "email_subject_alphanumeric": email_subject_alphanumeric,
        "email_subject_no_space": email_subject_no_space,
        "incoming_email": incoming_email,
        "incoming_email_alphanumeric": incoming_email_alphanumeric,
        "incoming_email_no_space": incoming_email_no_space,
        "reply_substring": reply_substring,
        "is_reply": is_reply
    }
    return subjectDict
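
To see what those two substitutions actually do, here's a minimal standalone sketch. The helper name clean_text and the sample inputs are made up for illustration; only the two regex patterns come from the function above.

```python
import re

def clean_text(text):
    """Return (alphanumeric, no_space) versions of text, using the same patterns as messageValues."""
    # Drop everything that is not a letter, digit, underscore, or whitespace
    alphanumeric = re.sub(r"[^\w\s]", "", text)
    # Replace each run of whitespace with a single '+' so the text can sit in a URL
    no_space = re.sub(r"\s+", "+", alphanumeric)
    return alphanumeric, no_space

print(clean_text("Dancing *cats*!! = fun"))   # ('Dancing cats  fun', 'Dancing+cats+fun')
print(clean_text("jane.doe@example.com"))     # ('janedoeexamplecom', 'janedoeexamplecom')
```

One thing worth knowing: because the '.' and '@' are stripped, two different addresses such as jane.doe@example.com and janedoe@example.com end up with the same cleaned value.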

Sweet, now we can easily fetch and read all our unread emails with python. Let’s get to the fun part -- GIFs.

Working with the Giphy API

After visiting the Giphy.com website and generating an API key, we can plug our email subject text into the giphy api and have it return a list of GIF URLs related to that text. I’m randomly selecting from the top ten results, just to add some variety.

As a side note, whenever you see a “GIF” on the modern internet, it’s almost always actually an HTML5 video pretending to be a GIF. Developers avoid GIFs, because they’re super slow and inefficient. However, HTML5 videos are only accepted on 10% of email clients, so we have to use actual GIFs for this project.


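# api_key is assumed to hold the "&api_key=..." suffix of the query string.
# We use the '+'-separated subject so that spaces don't break the URL.
giphy_json_response = json.loads(urllib.request.urlopen("https://api.giphy.com/v1/gifs/search?q=" + messageDict["email_subject_no_space"] + api_key).read())
json_items_returned = len(giphy_json_response["data"])

# Pick a random result as long as at least one came back
if json_items_returned > 0:
  randomSelect = random.randint(0, (json_items_returned - 1))
  giphy_url = giphy_json_response["data"][randomSelect]["images"]["downsized_large"]["url"]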
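
If you'd rather not glue the URL together by hand, here's a rough sketch of the same idea using urlencode, which takes care of escaping. The function names, the limit of 10, and the fake response are my own assumptions for illustration; the response shape (data, images, downsized_large, url) is the one used above.

```python
import random
from urllib.parse import urlencode

GIPHY_SEARCH = "https://api.giphy.com/v1/gifs/search"

def build_search_url(query, api_key, limit=10):
    """Build the search URL; urlencode escapes spaces and symbols for us."""
    params = {"api_key": api_key, "q": query, "limit": limit}
    return GIPHY_SEARCH + "?" + urlencode(params)

def pick_gif_url(response):
    """Return a random GIF URL from a parsed search response, or None if it is empty."""
    results = response.get("data", [])
    if not results:
        return None
    choice = random.choice(results)
    return choice["images"]["downsized_large"]["url"]

# A hand-written stand-in for a parsed API response (not real Giphy data)
fake_response = {"data": [{"images": {"downsized_large": {"url": "https://example.com/a.gif"}}}]}

print(build_search_url("dancing cats", "YOUR_KEY"))
print(pick_gif_url(fake_response))
```

Returning None for an empty result set gives the sending code one obvious thing to check before it tries to use the URL.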

In the Giphy api, I noticed that you could filter by “rating” (G, PG-13, R, etc.). I wondered why a rating system existed for GIFs -- after checking, it seems there are many GIFs of naked people.

Normally I wouldn’t care, but because my brain is drawn to worst case scenarios, I worried about people “spoofing” their email address and using my service to send explicit GIFs to other people. I checked a few terms and found the Giphy rating filter does not work … at all. So, we have to add the filter on our end.

One possible filter is the Python library “better_profanity”, which is just a dictionary of bad words. It is fast, but misses any modified words and just isn’t a very sophisticated approach. There is also “profanity_check”, which uses some cool machine learning techniques to identify hateful speech. Unfortunately, it’s bad at catching explicit words that aren’t necessarily hateful. Since neither is a complete solution, I just use them both and it seems to work decently.


email_text = messageDict["email_subject_alphanumeric"]
test1 = profanity.contains_profanity(email_text)  # word-list check from better_profanity
test2 = predict([email_text])                     # ML check from profanity_check, one 0/1 per input

# Flag the email if either check fires
is_profane = bool(test1 or test2[0] == 1)

Sending Email

Now we can actually start formatting our email, using handy-dandy Python f-strings. We have a different HTML email template for each case: no results found (our “404”), profane requests (which return a Judge Judy GIF), and successful requests (which return the requested GIF).


  message = MIMEMultipart("alternative")
  if is_profane == False:
      message["Subject"] = "Per Your Request, A" + " " + messageDict["email_subject_alphanumeric"] + " GIF"
  else:
      message["Subject"] = "Per Your GIF Request..."

  message["From"] = our_gmail_address
  message["To"] = messageDict["incoming_email"]

  if json_items_returned == 0 :
      body_text = """
      <h1>Sorry, we couldn't find a GIF for your request :( </h1>
      <br>
      <h3> Back to the basement for our programmers... </h3>
      <img src="https://media.giphy.com/media/1F1JGyGZhiSAA8Vuhn/giphy.gif" width="400" height="400" alt="GIF Loadin" border="0">
      """

  elif is_profane == True :
      body_text = """
      <h1> There are other sites for that kind of language </h1>
      <img src="https://media.giphy.com/media/OvrMMjROnZvHi/giphy.gif" width="400" height="400" alt="GIF Loadin" border="0">
      """
  else:
      body_text = f"""\
      <h1>Here's your GIF!</h1>
      <img src="{giphy_url}" width="400" height="400" alt="GIF Loadin" border="0">
      """

  html = f"""\
           <html>
            <body>
                   {body_text}
                   <br>
                   <a href="https://docs.google.com/forms/d/e/1FAIpQLSfO4fis5HXZZef8iGF3SN6oFferB3zc6J3ucV2UB-QBjty6EA/viewform?usp=sf_link">
                   Didn't request this GIF? You can stop these</a>
               </body>
           </html>
           """
  part = MIMEText(html, "html")
  message.attach(part)

  # Send the reply back to whoever emailed us
  server.sendmail(our_gmail_address, messageDict["incoming_email"], message.as_string())
  print('Email sent!')
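
If you want to poke at the message format without sending anything, here's a minimal sketch that only builds the success email. The function name build_reply and the example addresses are hypothetical; the MIME classes are from Python's standard library, and html.escape is an extra precaution I've added so that odd characters can't break the markup.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from html import escape

def build_reply(sender, recipient, subject_text, gif_url):
    """Build (but do not send) the success email for one request."""
    message = MIMEMultipart("alternative")
    message["Subject"] = f"Per Your Request, A {subject_text} GIF"
    message["From"] = sender
    message["To"] = recipient

    # Escape anything we drop into the HTML
    safe_url = escape(gif_url, quote=True)
    safe_alt = escape(subject_text, quote=True)
    html = f"""\
<html>
  <body>
    <h1>Here's your GIF!</h1>
    <img src="{safe_url}" width="400" height="400" alt="{safe_alt} GIF" border="0">
  </body>
</html>
"""
    message.attach(MIMEText(html, "html"))
    return message

reply = build_reply("bot@example.com", "friend@example.com",
                    "dancing cats", "https://example.com/a.gif")
print(reply["Subject"])   # Per Your Request, A dancing cats GIF
```

Printing reply.as_string() shows exactly what would be handed to sendmail.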

And that’s all you technically need to send the emails.

But what if you wanted to keep some statistics on who is sending us emails? Or track how many emails are failing to go through? Or limit the number of times people can use our service, so our competitors can’t email us 200,000 times and burn through our API tokens? For that, my friend, we need a database.

Database Section

PostgreSQL is a great, free, general-purpose SQL database, so let’s install that now. There are many tutorials that explain the installation process better than I could, so I won’t cover that here. Make sure you also install psycopg2, which is the Python library needed to interact with the database.

The first thing we want to do is input our data into the database. From the main loop, we’ll pass in a list of tuples, each containing all of the info that we need to log for one email.

def insert_data(record_list):
    # Start with None so the 'finally' block still works if connecting fails
    connection = None
    try:
        connection = psycopg2.connect(user = config.server_user,
                                    password = config.server_password,
                                    host = config.server_host,
                                    port = config.server_port,
                                    database = config.server_database)

        cursor = connection.cursor()

        postgres_insert_query = """ INSERT INTO log (to_email, serv_timestamp, subject, profane, incoming_email_no_space) VALUES (%s,%s,%s,%s,%s) """

        # Each record is a 5-tuple in the same order as the columns above
        for record_to_insert in record_list:
            cursor.execute(postgres_insert_query, record_to_insert)
            connection.commit()

    except (Exception, psycopg2.Error) as error:
        print("Error while inserting into PostgreSQL", error)
    finally:
        # Closing the connection also releases its cursors
        if connection:
            connection.close()
            print("PostgreSQL connection is closed")

You might be thinking, why do I have to pass in a record through the cursor? Can’t I just pass in a variable directly by using f-strings? Well, when working with SQL you have to be careful not to create vulnerabilities that could be used for SQL Injection attacks. Inserting via the cursor lets psycopg2 handle the hard work of sanitizing inputs.
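
Here's a tiny demonstration of the difference. To keep it runnable without a Postgres server, I've swapped in Python's built-in sqlite3 module, which uses ? placeholders where psycopg2 uses %s; the table contents and the hostile string are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (to_email TEXT, subject TEXT)")
conn.executemany("INSERT INTO log VALUES (?, ?)",
                 [("a@example.com", "cats"), ("b@example.com", "dogs")])

# Something a sender could put in a field we later use in a query
hostile = "nobody' OR '1'='1"

# Unsafe: the f-string pastes the text into the SQL, so the quote characters
# end the string early and the OR clause matches every row
unsafe_sql = f"SELECT to_email FROM log WHERE to_email = '{hostile}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the driver passes the value separately, so it is compared as plain text
safe_rows = conn.execute("SELECT to_email FROM log WHERE to_email = ?",
                         (hostile,)).fetchall()

print(len(unsafe_rows), len(safe_rows))   # 2 0
```

The f-string version returns every row in the table, while the parameterized version correctly finds nothing.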

I’ll let the psycopg2 documentation explain it more clearly. They may be a little dramatic, but…

And here's the obligatory XKCD:

Now that we’re recording the emails in the database, we can start writing queries to pull all kinds of interesting information.

Let’s put in a rule that we won’t send an email to any user that has submitted over 10 requests in the past 24 hours.

First, we create a new table called 'log' in the database and create whatever columns are needed. This can be done within the PgAdmin GUI.
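
If you'd rather script the table than click through the GUI, something like the sketch below could work. The column names come from the INSERT statement earlier, but the column types are my guesses, and I'm running the statement against an in-memory sqlite3 database purely so the example is runnable; the same CREATE TABLE text should also be valid in PostgreSQL.

```python
import sqlite3

# Column names match the INSERT query; the types are assumptions
CREATE_LOG_TABLE = """
CREATE TABLE IF NOT EXISTS log (
    to_email                TEXT,
    serv_timestamp          TIMESTAMP,
    subject                 TEXT,
    profane                 BOOLEAN,
    incoming_email_no_space TEXT
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(CREATE_LOG_TABLE)

# List the column names to confirm the table looks the way we expect
columns = [row[1] for row in conn.execute("PRAGMA table_info(log)")]
print(columns)
```

I've used TEXT rather than a fixed-width CHAR(n) type here, because PostgreSQL pads CHAR(n) values with trailing spaces.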

Then, we can pretty much just use standard SQL syntax:

s = """Select incoming_email_no_space, count(incoming_email_no_space)
                    FROM "public"."log"
                    where serv_timestamp >= NOW() - interval '24 hour'
                    and incoming_email_no_space in %s
                    group by incoming_email_no_space"""

We just need to convert the 'list' of emails over to a tuple.

incoming_record = (*incoming_list,)
cursor.execute(s, (incoming_record,) )

Perfect

To pull the query results into a list, we’re going to use the fetchall method.

list_users = cursor.fetchall()
#fix for weird issue where query adding whitespace
new_list_users = []
for x in list_users:
  z = x[0].strip()
  new_list_users.append((z, x[1]))

To get the total count of emails for that address, we just add the number of instances in the current batch to the number of emails logged in the past 24 hours (which we found with the SQL query). If it’s greater than the limit we defined, we use ‘continue’ to skip to the next email in the loop, which prevents us from sending a reply to this one.

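# email_count holds the (address, count) tuples returned by the query above
sender = messageDict["incoming_email_no_space"]
old_message_count = [count for address, count in email_count if address == sender]
new_message_count = incoming_list.count(sender)
# Senders with no rows in the past 24 hours are missing from email_count, so default to 0
total_message_count = (old_message_count[0] if old_message_count else 0) + new_message_count

if total_message_count > 10:
  print("Whoah, stop sending emails")
  continue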
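
The same arithmetic can be worked out for the whole batch at once. Here's a rough sketch using collections.Counter; the function name and the sample data are made up, and email_count is assumed to be the list of (address, count) tuples from the query.

```python
from collections import Counter

DAILY_LIMIT = 10

def senders_over_limit(email_count, incoming_list, limit=DAILY_LIMIT):
    """Return the set of senders whose past-24h count plus current batch exceeds limit."""
    # Start from the counts found in the database
    totals = Counter(dict(email_count))
    # Add one for every email in the current batch
    totals.update(incoming_list)
    return {sender for sender, total in totals.items() if total > limit}

email_count = [("aliceexamplecom", 9), ("bobexamplecom", 2)]
incoming_list = ["aliceexamplecom", "aliceexamplecom", "carolexamplecom"]

print(senders_over_limit(email_count, incoming_list))   # {'aliceexamplecom'}
```

Senders who never show up in the database, like carol here, simply start from zero, so there is no special case to handle.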

Sweet, now we're able to rate-limit our GIF-sending service. I'm sure this will be very important ;-)

Bonus

Wouldn’t it be cool if we could turn our program on/off with an email? On a professional project, this would be a bad thing to implement for security reasons, but we're going to forge ahead here.

We’re first going to define “anubisSwitch” and “hathorSwitch” as our kill/start keywords. We’ll check for those words every time we receive an email.

for single_email in unread_emails:
    raw_mail = imap.fetch([single_email],['BODY[]','FLAGS'])
    messageDict = messageValues(raw_mail, single_email)
    incoming_list.append(messageDict["incoming_email_no_space"])
    if messageDict["email_subject_alphanumeric"] in ("anubisSwitch", "hathorSwitch"):
        toggle_config(messageDict["email_subject_alphanumeric"])

If we see either of those keywords, we’re going to call a function to make the database reflect the new on/off status. If the subject is “anubisSwitch”, we change the “running” value to FALSE. If it’s “hathorSwitch”, we change that value to TRUE.

def toggle_config(x):
    # Start with None so the 'finally' block still works if connecting fails
    connection = None
    try:
        connection = psycopg2.connect(user = config.server_user,
                                    password = config.server_password,
                                    host = config.server_host,
                                    port = config.server_port,
                                    database = config.server_database)

        cursor = connection.cursor()

        # anubisSwitch turns the bot off; hathorSwitch turns it back on
        if x == "anubisSwitch":
            running = False
        elif x == "hathorSwitch":
            running = True
        else:
            return

        cursor.execute("UPDATE figs SET running = %s WHERE id = 1", (running,))
        connection.commit()

    except (Exception, psycopg2.Error) as error:
        print("Error while updating PostgreSQL", error)
    finally:
        if connection:
            connection.close()

Now, when we run our main get_data query we’ll check whether the ‘running’ value is set to false. If so, we’ll stop the script from running.

t = "Select running FROM public.figs where id = 1"
cursor.execute(t)
# fetchone returns a one-column row, so take its first item
running = cursor.fetchone()[0]

# Stop the script here if the bot has been switched off
if not running:
    sys.exit("Exit to script")

And that’s it, we shut down the program on our web server with an email. Pretty slick, right?

Deploying

We’re going to go the easy route with deployment, and just run the script every five minutes. We can’t run it much more frequently than that anyways, because Gmail places limits on how often we can ping for new messages.

If you’re on Windows, you can just use ‘task scheduler’ to run 'gif.py' every few minutes.

If you’re on Linux, open up your crontab and enter the following.

SHELL=/bin/bash
*/5 * * * * bash cronset.sh >> logoutput.txt 2>&1

We have to add the SHELL=/bin/bash part, otherwise we won’t be able to activate the python virtual environment. The default shell does not recognize ‘source’.

The asterisk part is CRON magic, which specifies that the script should run every five minutes. The rest is just enabling logging for this script, so that we can troubleshoot any issues.
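
To take a little of the magic out of it: the five fields are minute, hour, day of month, month, and day of week, and */5 in the minute field means "every minute that divides evenly by 5". Here's a quick sketch of which minutes that is (the function name is just for illustration):

```python
def minutes_matching_step(step):
    """List the minutes of the hour that a '*/step' minute field matches."""
    return [minute for minute in range(60) if minute % step == 0]

print(minutes_matching_step(5))
# [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
```

With asterisks in the other four fields, that works out to twelve runs an hour, every hour, every day.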

Now just create the below shell script and set it as an executable using the ‘chmod +x’ command, and we'll have a program that automatically runs every five minutes.

#!/bin/bash
source gifBot/gifEnvir/bin/activate
# Use the virtual environment's interpreter; /usr/bin/python3 would bypass it
python gifBot/gif.py

We’re done!

Phew, well thanks for scrolling all the way to the bottom.

