For most of our project, we have relied on a PaaS (primarily OpenShift, but also Heroku and Cloud Foundry) to do deployment for us. [We did once, very early on, write a script to deploy directly to Amazon EC2, but we did not maintain it, and I'm not even sure it ever really worked.] Recently, I've been thinking about doing our own deployments again, without a PaaS. While thinking about this, I have come to realize that there are other approaches that are quite different from what we are used to on OpenShift or on our development machines.
When you create a new application on OpenShift, I believe it actually creates a new [partition of a] virtual machine on Amazon EC2. OpenShift controls the software stack of the application VM (OS, middleware), although you have some control of the middleware stack through the 'cartridges' you select. The 'application creation' operation is one of the few times you interact with OpenShift servers directly. When you deploy your application code via a Git push, I believe you are pushing it directly to your application's own Amazon VM, where it triggers an OpenShift script written by Red Hat. This script stops the currently running application instance, updates the application code (including installing new prerequisite packages from the internet if necessary), and then restarts the application with the new code. I do not know how, when or whether the OS, middleware or Red Hat-written OpenShift scripts are updated on your application VM. I am 80% sure this is an accurate description of what OpenShift does, and 60% sure that Heroku and Cloud Foundry do essentially the same thing.
We have seen that this approach is not without problems. These scripts can (and sometimes do) fail, leaving your application 'down' until you either figure out how to fix your end of the script or wait for Red Hat to fix theirs. Even when the scripts run correctly, your application is 'down' while they run.
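
To make the stop, update, restart sequence concrete, here is a toy simulation of it in Python. It is only a sketch of my understanding as described above, not OpenShift's actual script: the `app` dictionary, the `install` callback and the name `patch_in_place` are all hypothetical.

```python
def patch_in_place(app, new_version, install):
    """Stop, update and restart an application on the same machine.

    `app` is a plain dict standing in for a running application (hypothetical);
    `install` is any callable that updates the code and may raise on failure.
    """
    app["running"] = False  # downtime starts as soon as the old instance stops
    try:
        install(new_version)  # e.g. copy code, fetch prerequisite packages
    except Exception as err:
        # The old instance is already stopped, so the app stays down
        # until someone repairs the machine by hand.
        app["error"] = str(err)
        return app
    app["version"] = new_version
    app["running"] = True  # downtime ends only after a successful restart
    return app


def broken_install(version):
    """Simulate an update step that fails part-way through."""
    raise RuntimeError("package download failed")


ok = patch_in_place({"version": "1.0", "running": True}, "1.1", lambda v: None)
bad = patch_in_place({"version": "1.0", "running": True}, "1.1", broken_install)
print(ok)
print(bad)
```

The second call shows the failure mode: the old version was stopped before the update failed, so nothing is left running.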
It occurred to me that there are other approaches to application deployment. For example, if I separate my load balancer, application and database into three different [virtual] machines, I can deploy new versions of the application without ever 'patching' a [virtual] machine during deployment. In this approach, the output of my application build would be a complete virtual machine image, not just application files or packages or a Git commit. To deploy a new application version, I would simply start a new virtual machine from the new image and adjust the load-balancer tables to point to the new machine. I would still need a small amount of configuration on the new VM, for example to teach it to find the database. I might leave the old version running for a while to complete any outstanding requests and to allow for roll-back. If there is a problem with the new version, I just need to change the load-balancer tables to point back to the old VM, which is still running, an action that should take almost no time at all. I could also split traffic between the old and new versions until I gain more confidence in the new one. There should be zero visible downtime for my application in all cases.
The deployment script is much simpler, and so should be significantly more robust. Some corresponding complexity moves into my build scripts, which now have to create VM images, but that is a good trade, because a failure at build time does not hurt me nearly as much as a failure during deployment. With this approach it is also fairly clear how to construct a system test environment that is guaranteed to be faithful to the production one: just run the same VM image. If the new version introduces a significant change to the data model, some of these scenarios become harder, but no harder than they are with the patch-in-place approach. Obviously this approach takes many more resources, since at least two application VMs run side by side during each deployment, so it might not be attractive for small 'casual' applications.
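
The load-balancer switch described above (often called a blue-green deployment) can be sketched as a small in-memory model. This is a minimal illustration under my own assumptions, not a real load balancer's API: the `LoadBalancer` class, the `deploy` function, the `healthy` callback and the VM names are all hypothetical, and starting the new VM from its image is assumed to have happened already.

```python
import random
from dataclasses import dataclass, field


@dataclass
class LoadBalancer:
    """Toy stand-in for a load balancer's routing table (hypothetical)."""
    weights: dict = field(default_factory=dict)  # VM name -> share of traffic

    def route_all_to(self, vm):
        # Keep every known VM in the table, but send all traffic to one.
        self.weights = {name: 0.0 for name in self.weights}
        self.weights[vm] = 1.0

    def split(self, old_vm, new_vm, new_share):
        # Send `new_share` of the traffic to the new VM, the rest to the old.
        self.weights[old_vm] = 1.0 - new_share
        self.weights[new_vm] = new_share

    def pick(self):
        # Choose a VM for one request, in proportion to the weights.
        names = list(self.weights)
        return random.choices(names, weights=[self.weights[n] for n in names])[0]


def deploy(lb, old_vm, new_vm, healthy, trial_share=0.1):
    """Shift traffic to new_vm; point back to old_vm if the check fails."""
    lb.split(old_vm, new_vm, trial_share)  # try the new version on a small share
    if not healthy(new_vm):
        lb.route_all_to(old_vm)  # roll-back: the old VM was never stopped
        return old_vm
    lb.route_all_to(new_vm)  # old VM keeps running for a while, with no traffic
    return new_vm


lb = LoadBalancer()
lb.route_all_to("app-v1")
print(deploy(lb, "app-v1", "app-v2", healthy=lambda vm: True), lb.weights)
print(deploy(lb, "app-v2", "app-v3", healthy=lambda vm: False), lb.weights)
```

Neither path ever stops a running VM during deployment, which is where the zero-downtime claim comes from: both the switch and the roll-back are only changes to the routing table.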
As a complete beginner in this field, I can't really comment yet on which of these approaches is better for us, but it's interesting that quite different choices are clearly possible, and that the approach our PaaS chose for us may not be the ideal one. For our project, rather than implement the OpenShift approach on EC2 with my own scripts, it seems much more interesting to explore the alternative approach that avoids VM patching altogether. As Mae West said, “when choosing between two evils, I always like to try the one I've never tried before.”