Immutable Infrastructure: Fast & Reliable Scaling

Moving to Immutable Infrastructure brings many advantages. In the previous article in this series we saw how it eliminates drift and the problems associated with deferred provisioning by capturing your application and all its dependencies at the lowest level possible: an image of the entire system. We also looked at how to enforce immutability by creating images with no SSH. With this technique any change requires rebuilding a new image through your regular automated build and deployment pipeline. In turn your system drastically gains in simplicity and reliability.

Another area where Immutable Infrastructure radically simplifies things is scaling. Let's find out how.

Why scaling?

Scaling means adding capacity to your system or removing capacity from it. The main driver for scaling has traditionally been load. When load increases beyond what the current infrastructure can reliably handle, more capacity must be provisioned.

Today, however, we have another important force at play: money. On the public cloud, cost is an important driver for the way we architect our systems. In stark contrast to on-premises setups, where hardware is almost universally a sunk cost, public clouds like AWS offer utility-like pricing: you essentially pay for what you use. It therefore becomes essential to be able not only to add capacity quickly, but also to remove it. Effectively, you tune your spending so that you pay only for the capacity needed to handle the current load.
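To make the load and cost trade-off concrete, here is a minimal sketch. The function names, the per-instance throughput, and the price are all hypothetical illustration values, not figures from AWS or any other provider.

```python
import math

def desired_instances(load_rps, rps_per_instance, min_instances=1, max_instances=20):
    """Smallest instance count that covers the load, clamped to [min, max].

    All parameters are illustrative assumptions: load_rps is the current
    request rate, rps_per_instance is what one instance can reliably handle.
    """
    needed = math.ceil(load_rps / rps_per_instance)
    return max(min_instances, min(max_instances, needed))

def hourly_cost_cents(instances, price_cents_per_instance_hour):
    """Pay-for-what-you-use pricing: cost grows linearly with instance count."""
    return instances * price_cents_per_instance_hour

# Hypothetical peak and off-peak load, with each instance handling 500 requests/s
peak = desired_instances(load_rps=4500, rps_per_instance=500)
quiet = desired_instances(load_rps=600, rps_per_instance=500)

print("peak: ", peak, "instances,", hourly_cost_cents(peak, 10), "cents/hour")
print("quiet:", quiet, "instances,", hourly_cost_cents(quiet, 10), "cents/hour")
```

Under these assumptions, the difference between the two hourly costs is what you save by removing capacity once the peak has passed.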

Horizontal vs Vertical Scaling

When it comes to scaling, it really boils down to two options: changing the number of instances or changing their size. You can either scale horizontally by adding (scaling out) or removing (scaling in) instances, or you can scale vertically by moving to bigger (scaling up) or smaller (scaling down) machines.

Horizontal and Vertical scaling
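The four scaling directions can be sketched as operations on a simple model of a fleet. The `Fleet` class and the size names below are hypothetical and exist only to illustrate the terminology.

```python
from dataclasses import dataclass, replace

# Hypothetical instance sizes, ordered from smallest to largest
SIZES = ["small", "medium", "large"]

@dataclass(frozen=True)
class Fleet:
    size: str   # size of every instance in the fleet
    count: int  # number of instances

def scale_out(fleet, n=1):
    """Horizontal: add instances."""
    return replace(fleet, count=fleet.count + n)

def scale_in(fleet, n=1):
    """Horizontal: remove instances, keeping at least one."""
    return replace(fleet, count=max(1, fleet.count - n))

def scale_up(fleet):
    """Vertical: move to the next bigger size, if there is one."""
    i = SIZES.index(fleet.size)
    return replace(fleet, size=SIZES[min(i + 1, len(SIZES) - 1)])

def scale_down(fleet):
    """Vertical: move to the next smaller size, if there is one."""
    i = SIZES.index(fleet.size)
    return replace(fleet, size=SIZES[max(i - 1, 0)])

fleet = Fleet(size="small", count=2)
print(scale_out(fleet, 3))  # more instances of the same size
print(scale_up(fleet))      # same number of instances, bigger size
```

Each function returns a new `Fleet` rather than modifying the existing one, which mirrors the immutable approach: you describe the target state and launch it instead of changing running machines in place.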

Regardless of the scaling direction and technique you choose, you need to be able to do this quickly and reliably. While this sounds easy, it has traditionally come with a number of challenges, the most common being state, provisioning times and robustness. However, as you'll see, these problems aren't intrinsic to scaling. They come from the way we have traditionally handled state and provisioned machines.

The challenges with traditional provisioning

As we already saw in the previous article, traditional provisioning with shell scripts or configuration management tools is very prone to drift, as it flies in the face of the proven mantra: build your artifact only once.

The reason is that instead of building an image once with all software fully configured, secured and ready to boot, the provisioning process runs on every instance you launch. This exposes you to all the issues related to deferred provisioning, as well as the reliability problems of having to fetch everything again and again from remote repositories which may have changed or temporarily gone offline.
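A rough way to see why repeated remote fetches hurt reliability is to multiply the success probabilities. The sketch below assumes that every fetch succeeds independently with the same probability, which is a simplification, and the numbers are hypothetical.

```python
def all_fetches_succeed(per_fetch_success, fetches_per_instance, instances=1):
    """Probability that every remote fetch succeeds on every launched instance.

    Assumes fetches are independent and equally reliable (a simplification).
    """
    return per_fetch_success ** (fetches_per_instance * instances)

# Hypothetical: each fetch works 99.9% of the time, 20 fetches per instance
one = all_fetches_succeed(0.999, fetches_per_instance=20)
fifty = all_fetches_succeed(0.999, fetches_per_instance=20, instances=50)

print(f"1 instance:   {one:.1%} chance of a clean provisioning run")
print(f"50 instances: {fifty:.1%} chance that all runs are clean")
```

With these assumed numbers a single launch almost always works, but a scale-out of 50 instances completes without any failed fetch well under half of the time. A fully provisioned image moves those fetches to build time, where they happen once.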

The other big issue with this is that it takes a very long time. Whereas with Immutable Infrastructure and fully-provisioned images all you need to do is make a copy of the volume and start up the instance, with traditional provisioning you first have to go through a tedious installation process before your application can even begin to start up.
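The difference in launch time can be sketched with simple arithmetic. All durations below are made-up placeholders, as are the step names; measure your own pipeline to get real values.

```python
def traditional_launch_seconds(boot, provisioning_steps, app_start):
    """Boot a base image, run every provisioning step, then start the app."""
    return boot + sum(provisioning_steps.values()) + app_start

def immutable_launch_seconds(volume_copy, boot, app_start):
    """Copy the fully provisioned volume, boot, then start the app."""
    return volume_copy + boot + app_start

# Hypothetical durations in seconds
steps = {
    "install packages": 120,
    "download application": 90,
    "apply configuration": 60,
}

traditional = traditional_launch_seconds(boot=45, provisioning_steps=steps, app_start=20)
immutable = immutable_launch_seconds(volume_copy=15, boot=45, app_start=20)

print("traditional:", traditional, "seconds per instance")
print("immutable:  ", immutable, "seconds per instance")
```

The provisioning steps appear only in the traditional path. With an image they are paid once at build time instead of on every launch.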

Dealing with state

When it comes to dealing with state, Immutable Infrastructure forces you to do the right thing: keep all persistent state off the instance. This doesn't mean you can't have a local copy as a cache to speed things up (beware of stale data though!), but the master copy should live in a central, highly available and replicated data store.

This applies first and foremost to application data, which should reside in data stores, in-memory caches or object storage. It also applies to other things like log files, which in 2015 should not be spread across multiple instances, but instead aggregated in a central log server where they can be archived, indexed, and made searchable. And last but not least, sessions should be moved to a cookie, a data store or a central in-memory cache. We'll talk about this in more detail in a future post.
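As an illustration of keeping session state off the instance, here is a minimal sketch. `CentralSessionStore` is a hypothetical in-process stand-in for a real central cache or data store, and `AppInstance` is an equally hypothetical application instance.

```python
import copy

class CentralSessionStore:
    """Stand-in for a central in-memory cache or data store (hypothetical)."""

    def __init__(self):
        self._sessions = {}

    def save(self, session_id, data):
        self._sessions[session_id] = copy.deepcopy(data)

    def load(self, session_id):
        return copy.deepcopy(self._sessions.get(session_id, {}))

class AppInstance:
    """An application instance that holds no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)

    def cart(self, session_id):
        return self.store.load(session_id).get("cart", [])

store = CentralSessionStore()

first = AppInstance("instance-1", store)
first.add_to_cart("session-abc", "book")
del first  # the instance is terminated when scaling in

second = AppInstance("instance-2", store)  # a fresh instance takes over
print(second.cart("session-abc"))
```

Because the cart lives in the store rather than on `instance-1`, terminating that instance loses nothing and any other instance can serve the next request for the same session.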


In a public cloud world you need to be able to quickly and reliably add capacity to a system in response to load, but also remove capacity to optimize your budget. You can do this by scaling horizontally (in & out) and adjusting the number of instances, or by scaling vertically (up & down) and changing the size of your instances.

Traditional provisioning poses a number of challenges related to drift, deferred provisioning, reliability and performance. Immutable Infrastructure solves these problems through fast and reliable launching of instances based on fully provisioned images. It also forces you to do the right thing and keep all persistent state off the instance.

Stay tuned for future blog posts where we'll explore more aspects of Immutable Infrastructure, including sessions and service discovery.

Try it for yourself

CloudCaptain lets you do all this and more. CloudCaptain intelligently analyses your application and generates minimal images in seconds. There is no general purpose operating system and no tedious provisioning. CloudCaptain images are lean, secure and efficient. You can run them on VirtualBox for development and deploy them unchanged and with zero downtime on AWS for test and production.

Have fun! And if you haven't already, sign up for your CloudCaptain account now. All you need is a GitHub user and you'll be up and running in no time. The CloudCaptain free plan aligns perfectly with the AWS free tier, so you can deploy your application to EC2 completely free. And when it comes to scaling both horizontally and vertically, all you need is a single command.
