There has been some discussion recently about whether Amazon’s EC2 infrastructure is ideal for WordPress installations. As someone who has spoken on the topic at multiple WordCamps, including WCNYC and WCLV, I feel this is an important subject for anyone who wants to run their own servers and is weighing all of their options.
An EC2 instance is not a server, it’s a node
Amazon held its annual re:Invent conference this year, and CTO Werner Vogels said something that most people don’t take into consideration when they think about AWS: “An EC2 instance is not a server, it’s a building block.”
For all intents and purposes an EC2 instance acts like a regular virtual machine. You can SSH or remote into the machines very easily, and this is how most developers use AWS from day to day. But for people like me who have been using AWS for a few years now, saying that the EC2 platform is merely another virtual machine is as absurd as saying that Google’s new Compute Engine is just another Linux box. The bulk of the people who complain about AWS being unreliable, expensive or difficult to use are trying to replace their existing data center servers and setups one-for-one with EC2.
Jeff Bezos introduced EC2 as an SOA, or service-oriented architecture. If you know anything about object-oriented programming, SOA works much the same way, but from an infrastructure perspective instead of a development one. This means that when you think about deploying WordPress or any application into something like EC2, you have to think in terms of SOA and not a single-server setup. Bezos mandated that all of his systems be built the same way, which is why we now have S3 (which no one seems to complain about), RDS, Route53 and so many more distributed systems of that same nature that are less IO-centric.
Single Server Speed (Dedicated) vs Distributed System Speed (EC2)
It’s important that I mention right off the bat that you can still have a distributed system with a Dedicated setup. My point here is not to downplay Dedicated systems but merely to talk about where EC2 really shines.
WordPress lends itself well to distributed systems, but not out of the box. It is already possible to keep your database (RDS) separate from your application server (EC2) and separate from your media content (S3). The whole idea behind a distributed installation is that at any point you can spin resources up or down to increase or decrease the capacity of your system.
So when competitors like Rackspace make the persistence argument and call it an advantage, they gloss over the point that EC2 is the complete opposite of persistent and was designed that way. EC2 was designed to work as a disposable node or building block, meant to be built and then thrown away or upgraded whenever the larger application or site as a whole requires a faster or more powerful installation.
When you put all of the AWS pieces together, you start to see the bigger picture: RDS, which is meant to handle database transactions and can be scaled up or down automatically or at the touch of a button; S3, which we all use to store our media files and content in some way, shape or form; EC2, which runs the application code and can have hundreds of copies if necessary; and Route53, which handles all the DNS with a 100% SLA.
If you have trouble with a single EC2 instance or start running into IO problems, the proper response in a distributed system is to kill the server, not to fix it or spend hours trying to figure out what’s wrong with it or why it’s running slow. If WordPress needs more power, you should just temporarily upgrade the instance, boot up more of them or put automatic warnings in place to do so, instead of calling up your server guy in the middle of the night to deal with it.
A distributed setup is a collection of loosely coupled components, and that is how you properly build an Amazon Web Services infrastructure, not the single-server model that most current users implement.
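To make the kill-and-replace idea above a little more concrete, here is a minimal sketch in Python. It does not call any AWS API. The Node class, the replace_unhealthy function and the launch callback are hypothetical names made up for illustration, and a real setup would rely on actual health checks and instance launches instead.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """A hypothetical stand-in for one EC2 instance in the fleet."""
    node_id: str
    healthy: bool


def replace_unhealthy(fleet: List[Node], launch: Callable[[], Node]) -> List[Node]:
    """Return a new fleet in which every unhealthy node is swapped for a fresh one."""
    new_fleet = []
    for node in fleet:
        if node.healthy:
            new_fleet.append(node)
        else:
            # No debugging and no repair: drop the node and launch a replacement.
            new_fleet.append(launch())
    return new_fleet
```

The point of the sketch is that the fleet stays the same size and nobody logs in to the broken machine. The replacement comes from the same image or build process as every other node.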
Load Balancing should be the focus and not just running a single machine
Load Balancing allows you to run multiple servers at once and guide users, as they arrive at your site, towards a server that isn’t getting hit as hard as the others. If you have a site that gets a lot of traffic, or you have a content-heavy site with sporadic ups and downs, load balancing becomes critical. Being able to intelligently balance traffic means you can make sure users don’t see a slowdown of your site. Everyone does load balancing, including Dedicated providers.
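As a rough illustration of sending users to the server that isn’t getting hit as hard, here is a small Python sketch of a least-connections strategy. This is only one common balancing strategy, and I am not claiming it is what any particular provider’s load balancer does. The function names and the server names in the usage note are assumptions for the example.

```python
from typing import Dict


def pick_server(active_connections: Dict[str, int]) -> str:
    """Return the name of the server with the fewest active connections."""
    if not active_connections:
        raise ValueError("no servers available")
    return min(active_connections, key=active_connections.get)


def route_request(active_connections: Dict[str, int]) -> str:
    """Pick the least busy server and record the new connection on it."""
    server = pick_server(active_connections)
    active_connections[server] += 1
    return server
```

With a connection table such as {"web-1": 12, "web-2": 3, "web-3": 7}, the next visitor would be routed to web-2, and that server’s count would go up by one.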
For bigger companies, the next generation of load balancing is Auto Server Generation, which is what EC2 allows for at its core. This kind of setup allows you to make sure your site never goes down if you get hit with too much traffic. Instead of buying extra physical hardware capacity at another data center, most modern cloud services like EC2 will spawn new servers the minute your site’s traffic crosses a certain threshold, to handle the extra capacity needed, and can even upgrade your database servers if necessary. This is where EC2 shines and the IO stuff becomes a moot point.
Done correctly, this can mean the difference between a hacker taking your site down and your users never realizing there was anything wrong. Managed hosting providers like WP Engine manage this for you, while others like Amazon or Rackspace support it but you have to program it in yourself or use a deployment tool such as Chef with Knife.
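The threshold-based scaling described above boils down to a small calculation, sketched here in Python. The function name, the requests-per-second metric and the per-instance capacity figure are all assumptions for illustration. A real Auto Scaling policy is configured in AWS rather than written like this, and it would typically watch metrics such as CPU or request counts.

```python
import math


def desired_instances(requests_per_sec: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 20) -> int:
    """Return how many instances are needed for the current load, within limits."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up so a partially loaded instance still counts as a whole one.
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # Never drop below the floor or grow past the ceiling.
    return max(min_instances, min(max_instances, needed))
```

For example, if each instance is assumed to handle 200 requests per second and traffic spikes to 950, the sketch asks for 5 instances. When traffic dies down again it falls back to the minimum.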
Conclusion: Maintaining Separation may affect IO but is still important for growth
When Load Balancing or Auto Scaling, it’s essential to keep the three major components of your site separate from each other. The database needs to be separate from the content, and they both need to be separate from the code. This allows all of your servers to share the same data. Using a system like S3 for your uploaded content means you never have to worry about losing something if you shut a server down, and keeping only executable code in your git repo means it doesn’t matter which server users get directed to.
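As a small sketch of what keeping uploaded content off the application server can look like, the Python function below turns a local upload path into a URL on S3. The bucket name and the path in the usage note are hypothetical. In an actual WordPress install this rewriting would normally be handled by a plugin or by configuration, not by hand-written code like this.

```python
from urllib.parse import quote


def media_url(bucket: str, upload_path: str) -> str:
    """Build a virtual-hosted-style S3 URL for an uploaded media file."""
    # The S3 object key is the upload path without its leading slash.
    key = upload_path.lstrip("/")
    # quote() escapes unsafe characters but leaves the "/" separators alone.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"
```

Calling media_url("example-media", "/wp-content/uploads/2013/01/photo.jpg") gives a link that every application server can hand out, no matter which one the user landed on.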
Maintaining these three in separate machines or services is the best way to keep your server from getting overwhelmed if it gets overrun with traffic, but it can affect your IO performance. If you have the right kind of setup, that won’t matter, because the system will upgrade itself and grow with your user base. I don’t want people getting scared off from EC2. Distributed systems are the future of DevOps in my opinion, and any developer or systems engineer worth their salt should take them into consideration.
I’d love to hear your feedback: your good or bad experiences with AWS and especially with EC2, what your setup looks like, and so on. Only by talking through these infrastructure questions can we really help everyone in the community fix their complaints and/or highlight their successes.