Introduction

AWS Lambda can be used to process event notifications from Amazon S3. S3 can send an event to a Lambda function when an object is created or deleted. This event-driven architecture can be used to build scalable and reliable serverless applications.

In this article, we will look at how to build a simple application to create thumbnails of images stored in S3 using AWS Lambda.

Prerequisites

  • Serverless Framework: We will be using the Serverless Framework to create our application.

  • virtualenv: Virtualenv is a tool to create isolated Python environments. We will be using it to create an isolated environment for our application. Instructions for installation are available [here][install-virtualenv].

Getting Started

The source code for the project is available at this repository.

Project Setup

Create a new directory for the application

mkdir serverless-image-processing
cd serverless-image-processing

Set up a virtualenv for the application

virtualenv env
source env/bin/activate

Create requirements.txt with all the dependencies

boto3
Flask
Pillow

Install the dependencies

Run the following command in the shell to install all the dependencies.

pip install -r requirements.txt

Server

We will use Flask to create a web server that exposes a POST endpoint to handle the image upload.

Let’s create a file named app.py in our root folder with the following contents.

from flask import Flask, request
from pathlib import Path
from werkzeug.utils import secure_filename

app = Flask(__name__)

ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "gif"}
UPLOAD_FOLDER = Path(__file__).resolve().parent


def allowed_file(filename):
    return "." in filename and \
           filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route("/upload", methods=["POST"])
def upload_image():
    """Receives a request to upload an image."""
    # Check if file was uploaded
    if "file" not in request.files:
        return "No file uploaded"

    file = request.files["file"]
    # Check if file is allowed
    if not allowed_file(file.filename):
        return "File extension not allowed"

    # save file locally
    filename = secure_filename(file.filename)
    file_path = UPLOAD_FOLDER / filename
    file.save(file_path)

    return "File uploaded successfully!"

We have a function named upload_image which checks that the uploaded file has an allowed image extension and then saves the file locally.

We can test if our file upload works by running the following commands:

In a shell, start the webserver:


export FLASK_APP=app
flask run --reload

Then, in a different shell, we will make a request to the server and upload an image. Say we have an image named hellocat.jpg in our home directory; we can upload it to the server with the following command:

curl -F "file=@//home/abhishek/hellocat.jpg" localhost:5000/upload

The response we received:

File uploaded successfully!

S3

Now that our server can handle file uploads, we will integrate S3 with our server. Whenever a file is uploaded to the server, we will put that file in S3.

Let’s first create a new S3 bucket to hold all of our images. We can create a bucket named your-bucket-name using the following command:

aws s3 mb s3://your-bucket-name

Once the bucket has been created, we can update the server code to upload any images to S3. The server code to handle S3 uploads would look like this:


import boto3
from pathlib import Path
from flask import Flask, request
from werkzeug.utils import secure_filename

app = Flask(__name__)

S3_BUCKET = "serverless-lambda-tutorial"
S3_UPLOAD_KEY_NAME = "uploads/{filename}"
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "gif"}
UPLOAD_FOLDER = Path(__file__).resolve().parent


def allowed_file(filename):
    return "." in filename and \
           filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route("/upload", methods=["POST"])
def upload_image():
    """Receives a request to upload an image."""
    # Check if file was uploaded
    if "file" not in request.files:
        return "No file uploaded"

    file = request.files["file"]
    # Check if file is allowed
    if not allowed_file(file.filename):
        return "File extension not allowed"

    # save file locally
    filename = secure_filename(file.filename)
    file_path = UPLOAD_FOLDER / filename
    file.save(file_path)

    file_key = S3_UPLOAD_KEY_NAME.format(filename=filename)
    upload_image_to_s3(file_path, file_key)

    return "File uploaded successfully!"


def upload_image_to_s3(file_loc, s3_key):
    s3 = boto3.client("s3")
    with open(file_loc, "rb") as f:
        s3.upload_fileobj(f, S3_BUCKET, s3_key)

    print(f"Image: {s3_key} uploaded to S3")

We added a new function, upload_image_to_s3, which uploads the locally saved image to S3.

Let’s try to upload another image and see if it gets uploaded to S3.

curl -F "file=@//home/abhishek/hellocat.jpg" localhost:5000/upload

Let’s check the S3 bucket to confirm that the file was uploaded successfully.

aws s3 ls s3://your-bucket-name/uploads/
2020-11-16 20:52:48     105297 hellocat.jpg

Looks like our file got uploaded to S3 as we wanted.

Integrating AWS Lambda

Now that we have successfully uploaded our image to S3, we will make use of the Serverless Framework to create a serverless pipeline using AWS Lambda.

Prerequisites

  1. Set up the Serverless Framework by following the instructions listed here.

  2. Run the following commands after installing the Serverless Framework.

    npm init -f
    npm install --save-dev serverless-python-requirements
    

The serverless-python-requirements plugin automatically bundles dependencies from requirements.txt and makes them available to your Lambda function.

Setting up the serverless project

Next, create the file serverless.yml in your root directory and copy the following content:


service: serverless-lambda-s3

plugins:
  - serverless-python-requirements
  - serverless-wsgi
package:
  individually: true
  excludeDevDependencies: true

custom:
  wsgi:
    app: app.app
    pythonBin: python3
  pythonRequirements:
    slim: true
    strip: false
    slimPatternsAppendDefaults: true
    slimPatterns:
      - "**/*.egg-info*"
      - "**/*.dist-info*"
    dockerizePip: true

provider:
  name: aws
  runtime: python3.7
  stage: dev
  region: us-west-2
  environment:
    S3_BUCKET: your-bucket-name
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:*"
      Resource: ["arn:aws:s3:::${self:provider.environment.S3_BUCKET}/*"]

functions:
  api:
    handler: wsgi_handler.handler
    events:
      - http: ANY /
      - http: ANY {proxy+}
  generate_thumbnail:
    handler: app.generate_thumbnail
    events:
      - s3:
          bucket: ${self:provider.environment.S3_BUCKET}
          event: s3:ObjectCreated:*
          rules:
            - prefix: uploads/
          existing: true

Key points

  • Remember to update the S3_BUCKET variable with the name of your bucket.
  • We are setting up a Lambda function named generate_thumbnail that will receive events whenever a new object is created under the uploads/ prefix of the S3 bucket; an abbreviated example of such an event is shown below.
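
For reference, the event object handed to generate_thumbnail contains one record per created object. Trimmed down to the fields we will actually read, it looks roughly like this:

{
    "Records": [
        {
            "s3": {
                "bucket": {"name": "your-bucket-name"},
                "object": {"key": "uploads/hellocat.jpg", "size": 105297}
            }
        }
    ]
}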

Lambda Handler


We have created three new functions in our application code:

  • download_image: Downloads an image from S3.
  • create_and_upload_thumbnail: Creates a thumbnail from the downloaded image using the Pillow library and then uploads it to a different location in S3.
  • generate_thumbnail: The actual Lambda function that receives the S3 object-creation event. We retrieve the name of the file that was uploaded and then download that file.
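
Here is a minimal sketch of what these functions could look like, assuming they live in the same app.py that already imports boto3 and defines upload_image_to_s3. The thumbnail size and the local thumbnail-{filename} name are placeholder choices for this sketch; the thumbnails/ key prefix matches where the thumbnails end up in S3 below.

from pathlib import Path
from urllib.parse import unquote_plus

from PIL import Image

S3_THUMBNAIL_KEY_NAME = "thumbnails/{filename}"
THUMBNAIL_SIZE = (128, 128)  # assumed size for this sketch
TMP_FOLDER = Path("/tmp")    # the only writable location inside Lambda


def download_image(bucket, key):
    """Downloads an image from S3 into Lambda's /tmp directory."""
    s3 = boto3.client("s3")
    local_path = TMP_FOLDER / Path(key).name
    s3.download_file(bucket, key, str(local_path))
    return local_path


def create_and_upload_thumbnail(file_path):
    """Creates a thumbnail with Pillow and uploads it under the thumbnails/ prefix."""
    thumbnail_path = TMP_FOLDER / f"thumbnail-{file_path.name}"
    with Image.open(file_path) as image:
        image.thumbnail(THUMBNAIL_SIZE)
        image.save(thumbnail_path)

    thumbnail_key = S3_THUMBNAIL_KEY_NAME.format(filename=file_path.name)
    # Reuse the upload_image_to_s3 helper defined earlier in app.py.
    upload_image_to_s3(thumbnail_path, thumbnail_key)


def generate_thumbnail(event, context):
    """Lambda handler invoked by S3 ObjectCreated events."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])

        local_path = download_image(bucket, key)
        create_and_upload_thumbnail(local_path)
        print(f"Successfully created thumbnail for file: {local_path.name}")

Lambda functions can only write under /tmp, which is why both the downloaded image and the generated thumbnail are placed there before the upload.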

Next, we will deploy our Lambda function so that we can test it:

sls deploy

Once the function has been deployed, you should see output like this:

functions:
  api: serverless-lambda-s3-dev-api
  generate_thumbnail: serverless-lambda-s3-dev-generate_thumbnail

We will make another request to upload an image using our server:

curl -F "file=@//home/abhishek/hellocat.jpg" localhost:5000/upload

We can confirm that our thumbnail was generated in two ways.

First, let’s look at the logs of the Lambda function:

sls logs -f generate_thumbnail
START RequestId: 29a6a857-06ea-4f30-bc2e-3ae458879d6a Version: $LATEST
Image: thumbnails/hellocat.jpg uploaded to S3
Successfully created thumbnail for file: hellocat.jpg

Next, let’s look at the S3 bucket to confirm our thumbnail was generated:

aws s3 ls s3://your-bucket-name/thumbnails/
2020-11-16 21:16:57       2186 hellocat.jpg

We have successfully set up a pipeline to create thumbnails using AWS Lambda and S3.

Conclusion

AWS Lambda integrates with many AWS services. Combined with services like S3, it can be used to build scalable and reliable pipelines that are also easy to manage.