Moving our server to Bluemix
In the 1.4 release of Code Rally, in addition to new user interface changes, additional tracks and performance improvements, one of the big changes was behind the scenes. We have moved our cloud servers from running in VMs to running on IBM Bluemix. The change sounds subtle, but you should never judge an update by its patch notes.
The key difference between running on a VM and running on Bluemix is that Bluemix is a PaaS, not an IaaS or other VM-style deployment. When you look at what you can and cannot do on a PaaS, the main standout point for Code Rally was the loss of file IO access. Our server side is configured using an XML file for things like OAuth configuration, security feature activation and other configuration options, and we can no longer use that file when deployed to Bluemix. Another change is that we do a lot of logging and file creation for active and completed races, which was again something we needed to fix when moving. The final thing that changed was that we moved away from using our own Derby DB to using an SQL service (which needs to be configured and deployed when deploying or redeploying).
We solved the IO problems by storing everything in our SQL database. All of our config options now live there, and we have created an admin JSP that lets us configure the settings from a simple webpage (see the image below for the options in our admin page).
We have also taken a careful look at what we actually need to store in the database. When deployed to a VM we had more storage space than we needed and didn't prioritize being lean and efficient in what we stored. This led to servers with lots of race history slowing to a crawl because database queries took too long. As part of moving to Bluemix we decided to be more efficient about what we store, and because we can ask the database service how large our database is getting, we can set a maximum number of races to store data for, keeping the DB size consistent. The implication is that some race replays will become unavailable after a while: when the database gets too big we delete the oldest race replay data from the DB until we get down to a better size. This means that you lose the ability to watch a video replay of those races, but we do keep the times and positions of each of the AIs in the race (race replay data is our biggest consumer of space in the DB). As a result, server response times should remain consistent, giving a better game at the cost of people not being able to watch old races (which are rarely watched after a certain time period).
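The pruning rule described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the actual Code Rally implementation: the `Race` class, its field names and the byte-based size limit are all hypothetical, and an in-memory list stands in for the SQL database so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: drop replay data from the oldest races until the
// total replay size is back under a limit, while keeping the race results.
final class ReplayPruner {

    /** Hypothetical in-memory stand-in for one stored race row. */
    static final class Race {
        final long finishedAtMillis; // when the race ended
        final String results;        // times and positions, always kept
        byte[] replayData;           // large replay blob, null once pruned

        Race(long finishedAtMillis, String results, byte[] replayData) {
            this.finishedAtMillis = finishedAtMillis;
            this.results = results;
            this.replayData = replayData;
        }

        boolean hasReplay() {
            return replayData != null;
        }
    }

    /** Sums the size of all replay blobs that are still stored. */
    static long totalReplayBytes(List<Race> races) {
        long total = 0;
        for (Race race : races) {
            if (race.hasReplay()) {
                total += race.replayData.length;
            }
        }
        return total;
    }

    /**
     * Removes replay data from the oldest races first until the total is
     * at or below maxBytes. Returns how many replays were removed.
     */
    static int prune(List<Race> races, long maxBytes) {
        // Sort a copy so the caller's list order is left untouched.
        List<Race> oldestFirst = new ArrayList<>(races);
        oldestFirst.sort(Comparator.comparingLong(r -> r.finishedAtMillis));

        long total = totalReplayBytes(races);
        int pruned = 0;
        for (Race race : oldestFirst) {
            if (total <= maxBytes) {
                break; // back under the limit, stop deleting
            }
            if (!race.hasReplay()) {
                continue; // already pruned on an earlier pass
            }
            total -= race.replayData.length;
            race.replayData = null; // replay gone, results remain
            pruned++;
        }
        return pruned;
    }
}
```

In a real deployment the same loop would more likely be a query against the database service ordered by race finish time. The point of the sketch is only that the results survive while the replay blob is discarded.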
To support players who want to run their own servers, we have not removed the current non-PaaS deployment of the server side. When starting up, the web application detects its environment: it runs the same as before on a traditional system or VM, and uses the new code only when deployed on Bluemix. This keeps a common web application for all of our deployments (which is handy for testing, as we test the same web app in two places rather than two different apps) while giving us and our end users a choice in where to run.
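One plausible way to make that startup decision is to look for an environment variable that the platform sets. The sketch below assumes a check on `VCAP_APPLICATION`, which Cloud Foundry (the platform Bluemix is built on) provides to deployed applications. The post does not say how Code Rally actually detects its environment, so the class and method names here are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch: choose a deployment mode once at startup.
final class DeploymentEnvironment {

    enum Mode { BLUEMIX, STANDALONE }

    /**
     * Decides the mode from a map of environment variables.
     * Taking the map as a parameter keeps the check easy to test.
     */
    static Mode detect(Map<String, String> env) {
        // Assumption: VCAP_APPLICATION is present only on the PaaS.
        String vcap = env.get("VCAP_APPLICATION");
        boolean onPaas = vcap != null && !vcap.trim().isEmpty();
        return onPaas ? Mode.BLUEMIX : Mode.STANDALONE;
    }

    /** Convenience overload that reads the real process environment. */
    static Mode detect() {
        return detect(System.getenv());
    }
}
```

The rest of the application could then branch once on `DeploymentEnvironment.detect()`, for example to pick between the XML file and the database-backed settings, rather than scattering platform checks through the code.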
What does this move mean for you?
The changes, although behind the scenes, will affect the way the game plays for you. The server should be more responsive when logging in, querying past races and entering races (as well as viewing replays). It does mean your older races may not be viewable after a time, but it doesn't look like anyone was viewing race replays that old anyway. The move to Bluemix has also made it easier for us to administer our cloud servers: restarts are easier to initiate, server state is easier to monitor, and updating to newer versions for bug fixes is a breeze (click one button and we get a redeploy of all of our cloud servers to the latest release). This should mean a more stable experience for everyone playing and fewer headaches for us.
The final advantage to mention, although it didn't really affect us, is cost: our VMs were more expensive than our Bluemix deployment, even though we're getting the same capability. We're also able to respond more rapidly to changing load requirements, as we are primarily memory bound when it comes to scaling. With a VM we need to change the VM size, wait for that to be applied, then go in and change our JVM settings to use the extra memory. With Bluemix we can increase the memory allowance in a webpage and, after a very quick restart (under a minute), the extra memory is available for use.
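A simple way to confirm that a memory change has actually reached the JVM, on either kind of deployment, is to report the maximum heap at startup. The sketch below uses the standard `Runtime.maxMemory()` call. The `HeapReport` class is a hypothetical helper and not part of Code Rally.

```java
// Hypothetical sketch: report the heap ceiling the JVM is running with.
final class HeapReport {

    /** Maximum heap the JVM will attempt to use, in whole megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** One-line summary suitable for a startup log message. */
    static String describe() {
        return "Max heap available to this JVM: " + maxHeapMegabytes() + " MB";
    }
}
```

On a VM this value follows JVM settings such as `-Xmx`, which is why resizing the VM alone is not enough. Logging it after a restart shows whether the new allowance is in effect.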