Deploying Modern Django Apps to AWS Beanstalk

This article contains detailed instructions on how to deploy modern Django-based web apps to the AWS Beanstalk stack. It is opinionated in many places, so feel free to modify things to fit your needs. It covers everything from which packages to use to the configuration needed to wire those packages up with AWS Beanstalk.

The Django Packages

You’re using Django, and one of the big benefits of Django is that it has tons of useful packages. Here are some of the important packages I use when creating modern web applications (i.e. the backend is mostly an API).

  • django-storages-redux
    To store media files on S3.
  • djangorestframework
    The best and most fully featured API framework for Django. Period.
  • django-oauth-toolkit
    Since we’re going to have an API, we need authentication, and of course we want to use the industry standard: OAuth2.
  • django-guardian
    If you want to have advanced permissions on specific objects in the database, django-guardian can help you with that.
  • django-easy-pdf
    To generate PDFs using simple HTML templates.
  • django-ses
    To send emails through the AWS SES service. A big plus with AWS SES: emails ending up in the spam folder is no longer a problem.
  • django-pipeline
    My Django app has some frontend as well, so this is needed for compiling SASS assets and for minification.
  • sorl-thumbnail
    To generate thumbnails. Also easy to integrate with Django REST framework.
  • django-cors-headers
    Because we are building a modern web app and the frontend lives on another server, CORS support is of course needed.

App structure

.
├── .git
├── .gitignore
├── my_application_app
│   ├── celerybeat-schedule
│   ├── custom_storages.py
│   ├── .idea
│   ├── manage.py
│   └── my_application_app
├── README.md
├── requirements.txt
└── www
    ├── .gitignore
    ├── media
    └── static

The settings

Configuring the settings takes time, especially when the dev and prod environments are completely separated.

This is a good starting point for a settings file based on the packages I use. It needs to be customized, but it contains hints that are helpful for the AWS deployment, such as using environment variables for everything that has to do with passwords, tokens and usernames.

The settings are explained in parts below.

Here we just do some imports that we need further down in the settings file, and we include the apps we want. As you can see, I use the apps I described earlier in this article. I have found most of these packages to be “mandatory” for a modern API-based backend.

Next up, we tell Django that we want to use OAuth2Backend for OAuth2 authentication and ObjectPermissionBackend so that we can make use of django-guardian’s object-level permissions.

ANONYMOUS_USER_ID is for django-guardian. Since we don’t have any anonymous user object we set this to None.
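
As a minimal sketch of what these two settings could look like (the original settings file is not reproduced here, so the exact tuple is an assumption; in particular, keeping Django's default ModelBackend in the list is my own addition):

```python
# settings.py (fragment) -- a sketch, not the author's original file.

AUTHENTICATION_BACKENDS = (
    # Assumption: Django's default backend stays enabled so that
    # username/password logins (e.g. the admin) keep working.
    "django.contrib.auth.backends.ModelBackend",
    # OAuth2 authentication from django-oauth-toolkit.
    "oauth2_provider.backends.OAuth2Backend",
    # Object-level permissions from django-guardian.
    "guardian.backends.ObjectPermissionBackend",
)

# django-guardian: there is no anonymous user object, so this is None.
ANONYMOUS_USER_ID = None
```

Newer django-guardian releases may use a different setting name for the anonymous user, so check the documentation for the version you install.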

Here we tell django-rest-framework about our defaults, such as 15 items per page and a custom pagination class if needed. We also add the authentication classes we want to use. As you can see, I have configured the API so that it’s possible to use OAuth2, basic and session-based authentication.
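
A sketch of such a REST_FRAMEWORK block is shown below. The pagination class is the stock one from Django REST framework rather than a custom class, and the import path of the OAuth2 authentication class depends on the django-oauth-toolkit version, so treat both as assumptions:

```python
# settings.py (fragment) -- a sketch of the defaults described above.

REST_FRAMEWORK = {
    # 15 items per page.
    "PAGE_SIZE": 15,
    # Assumption: the built-in page-number paginator; swap in the dotted
    # path of your own pagination class if you need a custom one.
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
    "DEFAULT_AUTHENTICATION_CLASSES": (
        # Assumption: older django-oauth-toolkit releases expose this class
        # under oauth2_provider.ext.rest_framework instead.
        "oauth2_provider.contrib.rest_framework.OAuth2Authentication",
        "rest_framework.authentication.BasicAuthentication",
        "rest_framework.authentication.SessionAuthentication",
    ),
}
```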

Then we set our time zone and locale, and also where we want to store media and static files in our development environment.

Now the fun part starts: let’s configure Django so that it uses the AWS RDS database if the RDS_DB_NAME environment variable is set. Beanstalk automatically creates these RDS_* variables and fills them with the correct connection details.
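
The RDS switch can be sketched roughly as follows. The helper name build_databases and the SQLite fallback for development are assumptions made for illustration; the RDS_* variable names are the ones Beanstalk provides:

```python
# settings.py (fragment) -- a sketch of the RDS switch described above.
import os


def build_databases(env=os.environ):
    """Return the DATABASES setting for the current environment."""
    if "RDS_DB_NAME" in env:
        # Running on Beanstalk with an attached RDS instance.
        return {
            "default": {
                # Assumption: on Django versions before 1.9 the engine is
                # named "django.db.backends.postgresql_psycopg2".
                "ENGINE": "django.db.backends.postgresql",
                "NAME": env["RDS_DB_NAME"],
                "USER": env["RDS_USERNAME"],
                "PASSWORD": env["RDS_PASSWORD"],
                "HOST": env["RDS_HOSTNAME"],
                "PORT": env["RDS_PORT"],
            }
        }
    # Assumption: a local SQLite file for development.
    return {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": "db.sqlite3",
        }
    }


DATABASES = build_databases()
```

Wrapping the logic in a function is only a convenience: it lets you check both branches by passing in a plain dictionary instead of the real environment.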

Now we set up where static and media files should be found. This is only for the development environment.

It is also nice to be able to test emails in the development environment. Here you can configure your Gmail account as the SMTP server.

Some default security settings that we want enabled in all environments. Some of these variables will be overridden if we are in production / AWS.

A lot happens here. First of all, this is where we configure most of the AWS-based settings. As you can see, the whole section starts with if 'COMPANY_IS_ON_AWS' in os.environ. This environment variable is set automatically by our .conf script in the .ebextensions folder, described later in this article, so this section only runs when our app is on AWS.

First we configure CORS. Since this is production, we only allow our web app’s domain to talk to the API. This overrides the earlier “Allow all hosts” setting that we configured further up in the settings file.

Then we configure a lot of security. Since we are in production we use SSL, of course, so we set up redirects to HTTPS and configure other recommended settings for production.
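
A sketch of this production-only section follows. The function name production_overrides and the domain app.example.com are hypothetical, and the exact set of security settings is my assumption about what "recommended settings" covers; the setting names themselves come from Django and django-cors-headers:

```python
# settings.py (fragment) -- a sketch of the AWS-only overrides.
import os

# Development defaults, assumed to be set further up in the settings file.
CORS_ORIGIN_ALLOW_ALL = True
SECURE_SSL_REDIRECT = False


def production_overrides(env=os.environ):
    """Return the settings that differ when the app runs on AWS."""
    if "COMPANY_IS_ON_AWS" not in env:
        return {}
    return {
        # Only our own frontend may talk to the API.
        "CORS_ORIGIN_ALLOW_ALL": False,
        # Hypothetical domain. Older django-cors-headers releases expect
        # a bare host name here, without the scheme.
        "CORS_ORIGIN_WHITELIST": ("https://app.example.com",),
        # The load balancer terminates SSL and forwards this header.
        "SECURE_PROXY_SSL_HEADER": ("HTTP_X_FORWARDED_PROTO", "https"),
        # Redirect plain HTTP to HTTPS and only send cookies over HTTPS.
        "SECURE_SSL_REDIRECT": True,
        "SESSION_COOKIE_SECURE": True,
        "CSRF_COOKIE_SECURE": True,
    }


# Apply the overrides on top of the development defaults.
globals().update(production_overrides())
```

Writing the block as a plain if statement at module level, as the article describes, works just as well; the function form only makes the two cases easy to compare.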

Then we set AWS_HEADERS so that static assets are cached forever. This is fine because we use asset fingerprinting.

We then read the S3 details from environment variables. These environment variables can be changed via the Beanstalk web interface later on.
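
These two steps might look roughly like this. The environment variable names are the COMPANY_MEDIA_* ones used later in this article; the one-year max-age value and the “xxxx” fallbacks are assumptions for illustration:

```python
# settings.py (fragment) -- a sketch of the S3 settings for django-storages.
import os

# Long-lived caching is safe because fingerprinted assets get a new
# file name whenever their content changes.
# Assumption: one year, expressed in seconds, stands in for "forever".
AWS_HEADERS = {
    "Cache-Control": "public, max-age=31536000",
}

# S3 credentials and bucket name come from the Beanstalk environment.
# The "xxxx" defaults are placeholders so the module still imports locally.
AWS_ACCESS_KEY_ID = os.environ.get("COMPANY_MEDIA_ACCESS_KEY", "xxxx")
AWS_SECRET_ACCESS_KEY = os.environ.get("COMPANY_MEDIA_SECRET_KEY", "xxxx")
AWS_STORAGE_BUCKET_NAME = os.environ.get("COMPANY_MEDIA_BUCKET_NAME", "xxxx")
```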

 

The most important part here is adding the middleware that each of the packages needs.

Next we set up the assets we want for django-pipeline and the settings it needs to work with S3.

Since we are hosting the app in the cloud, we cannot use a filesystem-based cache. Therefore we use the database here. Another solution might be Redis.
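
A database-backed cache can be configured along these lines. The table name app_cache_table is a hypothetical choice; the backend path is Django's built-in database cache:

```python
# settings.py (fragment) -- a sketch of a database-backed cache.

CACHES = {
    "default": {
        # Django's built-in cache backend that stores entries in a table.
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        # Hypothetical table name; any valid, unused table name works.
        "LOCATION": "app_cache_table",
    }
}
```

The table itself has to exist before the cache is used; Django's createcachetable management command creates it.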

We then configure our scheduled tasks.

Configuration files in .ebextensions folder.

We need to create some configuration files to install the packages required for asset compilation and more. The .ebextensions folder should reside in the root of your Django folder structure / git repo.

This file installs the packages we need for:

  • Asset compiling (yuglify / uglify)
  • Required packages for Pillow (in practice, thumbnail generation).

This file is the heart of the deployment. Basically, the script runs commands. Some are configured with leader_only, which means that only ONE INSTANCE will run the command (useful for e.g. compiling assets to S3 and running database migrations).

  1. 01_wsgipass
    This command fixes OAuth2 authentication. By default, Beanstalk will not pass the Authorization header, so we configure mod_wsgi to allow it.
  2. 02_createcache
    Creates the cache table (if it does not exist).
  3. 02_makemigrations
    Makes migrations if they were not generated. They should already be generated, so this command might not be needed.
  4. 03_migrate
    Runs the database migrations.
  5. 04_createsu
    A custom command I created that creates the admin user with a basic password if it does not already exist. This is needed because the built-in Django createsuperuser command requires interaction.
  6. 05_collectstatic
    Collects static assets and puts them on S3.
  7. 06_uninstall_pil and 07_reinstall_pil
    Were needed to get Pillow to work on AWS Beanstalk.

The option_settings section is a collection of default configuration values. The environment variables should be configured later on in the AWS Beanstalk GUI, so keep those “xxxx” placeholders there.

This is what makes scheduled tasks work without the need for more EC2 instances. It just works.

The AWS Elastic Beanstalk command line tool (EB CLI) is required.

Set up the new app with eb init. This will ask a number of questions; answer all of them correctly:

  • Select a default region
  • Enter Application Name: {your-company-name}-{app-name}
  • It appears you are using Python: y
  • Select a platform version: 2 (Python 2.7 is best for compatibility with packages)
  • Do you want to set up SSH: y
  • Type a keypair name: {your-company-name}-{app-name}-aws (and fill in the password you want for your keypair after hitting ENTER).
  • A new application will appear in the Beanstalk section of the AWS admin interface.

Create environment

Now we need to create an environment; let’s create a dev environment. Each environment is COMPLETELY separated and uses its own database, EC2, S3 etc., so every additional environment adds its own costs.

Naming rules for environments:

  • staging environment: {your-company-name}-{app-name}-staging
  • dev environment: {your-company-name}-{app-name}-dev
  • production environment: {your-company-name}-{app-name}-prod
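
To keep names consistent across environments, the naming rules can be captured in a small helper. This is only an illustration; the function names and the example company and app names are hypothetical, and the -media suffix follows the bucket naming rule used later in this article:

```python
# naming.py -- a sketch of the naming rules as code.

STAGES = ("dev", "staging", "prod")


def environment_name(company, app, stage):
    """Build {your-company-name}-{app-name}-{stage}."""
    if stage not in STAGES:
        raise ValueError("stage must be one of: " + ", ".join(STAGES))
    return "{0}-{1}-{2}".format(company, app, stage)


def media_bucket_name(company, app, stage):
    """The media bucket (and its IAM user) reuse the environment name."""
    return environment_name(company, app, stage) + "-media"


print(environment_name("acme", "shop", "dev"))    # acme-shop-dev
print(media_bucket_name("acme", "shop", "prod"))  # acme-shop-prod-media
```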

Note: Each environment duplicates the cost, so a dev environment should be deleted when you are done with it. The production environment is what you create at the end, when the product is released.

Run the eb create command and use the naming rules above.

  • A new environment is created beneath the app itself.
  • It creates an S3 bucket, an EC2 instance, security groups etc.
  • It tries to start the app.
  • A database will be created (Postgres). The smallest micro server is the default. In production, THIS database needs to be upgraded in the console!
  • After it says “Command execution completed”, press CTRL+C.

Open the console with eb console, which opens the AWS web interface for the app itself.

Create an S3 bucket where media (JS/CSS and uploaded user files) will reside.

Go to the AWS web console and create an S3 media bucket: Services -> S3 -> Create Bucket

  • Create the bucket
    • Bucket name RULE: {your-company-name}-{app-name}-{dev|prod|staging}-media
    • Region: Ireland
  • Create a new user: Go to AWS IAM. Click “Create new users” and follow the prompts. Leave “Generate an access key for each User” selected.
    • User name RULE: {your-company-name}-{app-name}-{dev|prod|staging}-media (THE SAME AS THE S3 BUCKET NAME)
  • Get the credentials (ACCESS/SECRET KEY; COPY THEM TO NOTEPAD, YOU NEED THEM LATER ON IN THESE INSTRUCTIONS).
  • Get the new user’s ARN (Amazon Resource Name) by going to the user’s Summary tab. It’ll look like this: “arn:aws:iam::32132123121:user/someusername”
  • SQS: Go to IAM -> Users -> the -media user -> Attach policy: AmazonSQSFullAccess
  • Go to the bucket properties in the S3 management console.
  • Add a bucket policy in which “BUCKET-NAME” is replaced by the bucket name and “USER-ARN” by your new user’s ARN. The first statement makes the contents publicly readable (so you can serve the files on the web), and the second grants full access to the bucket and its contents to the specified user.
  • What you should have now (WRITE IT DOWN):
  • Add CORS: S3 -> Your bucket -> Permissions -> Add CORS and paste this:
  • Click on Tags -> Add a new tag
    • Reason for this: it puts the Beanstalk pricing details under the same tag as the custom S3 bucket for media storage.
    • The “key” should be “Name”; the “value” should be the same as in the environment’s tags on Beanstalk. To see these, open the Beanstalk console, go into the environment and look under Tags.
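
The two-statement bucket policy described in the list above can be sketched as follows. The original policy text is not reproduced here, so this is a reconstruction from the description; the helper name media_bucket_policy is hypothetical, and the script simply prints JSON that you can paste into the bucket policy editor:

```python
# bucket_policy.py -- a sketch that generates the bucket policy JSON.
import json


def media_bucket_policy(bucket_name, user_arn):
    """Build a policy with public read plus full access for one user."""
    bucket_arn = "arn:aws:s3:::{0}".format(bucket_name)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Statement 1: anyone may read objects in the bucket.
                "Sid": "PublicRead",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": bucket_arn + "/*",
            },
            {
                # Statement 2: the media user gets full access to the
                # bucket itself and to everything stored in it.
                "Sid": "MediaUserFullAccess",
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": "s3:*",
                "Resource": [bucket_arn, bucket_arn + "/*"],
            },
        ],
    }


# Replace both placeholders with your own bucket name and user ARN.
print(json.dumps(media_bucket_policy("BUCKET-NAME", "USER-ARN"), indent=2))
```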

Configure S3 with the Beanstalk instance.

Let’s use the S3 bucket we created above. The Django app needs to know that it should store the media inside that bucket.

Go to: Configuration -> Software Configuration (click on the settings symbol)

  1. COMPANY_MEDIA_ACCESS_KEY = The access key you got when creating the user for the media bucket.
  2. COMPANY_MEDIA_BUCKET_NAME = The name of the S3 bucket you created above.
  3. COMPANY_MEDIA_SECRET_KEY = The secret key you got together with the access key above.

Do a final re-deploy

FINAL STEP: re-deploy the app with eb deploy.

Last steps:

  1. You should now be able to access: http://{app-name}.elasticbeanstalk.com/admin
  2. Go to “Authentication and Authorization” -> Edit the admin user -> Change the password to something good.
  3. Done. Start creating awesome apps built on this base structure.

 

Comments

  1. Thank you for the detailed solution. I believe there is an issue with the scheduler in your sample code. As you are using the “-B” option for your celery workers, you should not run multiple instances, because you would end up with duplicate tasks.

    http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#starting-the-scheduler

    I don’t know how this could be fixed in your code, but I don’t think it’s possible to run celery beat on your scalable instances.