One Week Later on Amazon Web Services
Saturday marks the one-week anniversary of moving the Helpstream application out of our data center and onto Amazon Web Services. I’m happy to report that it has gone even more smoothly than we had hoped. Several customers have told us that performance has improved as well. I attribute our success both to Amazon’s excellent service and to our own approach to the transition. We did the transition itself in three stages over about six months to ensure we tested thoroughly and built up experience with what to expect from Amazon.
One of my role models for how to do this was Don MacAskill’s SmugMug. I saw him speak and met him at one of the Amazon road shows and have been following SmugMug’s progress ever since. At this point, we’ve pretty well caught up to their Amazon “Capability Maturity” level by running essentially our entire application on Amazon.
Our first stage was to start using Amazon S3 for our backup system. This is a total no-brainer for almost any company, and I highly recommend it. Our CTO, Dan Hardy, got this to work with a little scripting in a very short time. Once finished, we had robust backups to multiple physical locations (a built-in feature of S3) that were completely automated, right down to checksumming the backup data to confirm it matched the original. The cost for S3 storage was less than a new set of tapes for our old backup system, and we no longer had to physically visit a datacenter cage to shuttle tapes offsite and put a new set in the reader. This stage fit what we’ve observed about Amazon every time we use it: the end result is cheaper, easier, and better than what we had been doing at the data center. Better for all concerned, in the sense that customers benefit too.
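To make the checksum step concrete, here is a minimal sketch of the idea, not our actual backup script. It assumes a local directory stands in for the S3 bucket and uses SHA-256 as the checksum; the function names and the choice of hash are illustrative assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy `source` into `backup_dir` and confirm the copy matches.

    `backup_dir` is a hypothetical stand-in for the remote bucket; a real
    script would upload the file and then compare against a checksum of
    the stored object instead of a local copy.
    """
    source = Path(source)
    target = Path(backup_dir) / source.name
    shutil.copyfile(source, target)
    return file_sha256(source) == file_sha256(target)
```

A scheduled job could call backup_and_verify for each dump file and raise an alert whenever it returns False.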
The next stage also involved Amazon S3, but was bolder. We moved all of our blob data (Binary Large Objects) out of MySQL and onto S3. For Helpstream, blob data is largely attachments and documents from our Knowledge Base. We had been storing it in MySQL, but that was for our development convenience more than anything else. The downside is that it meant a lot more data inside the database slowing it down. It was straightforward to set up the system so that each tenant in our SaaS system could be set to keep data either on S3 or in the database. We modified our app server code to make this transparent to the client, so as to minimize how many places code had to change.
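The per-tenant switch might look something like the sketch below. This is an illustration under assumptions rather than our app server code: the class names are hypothetical, and two in-memory stores stand in for the MySQL blob table and the S3 bucket.

```python
class InMemoryBlobStore:
    """Hypothetical stand-in for a real backend (MySQL table or S3 bucket)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]

    def delete(self, key):
        del self._blobs[key]

    def keys_with_prefix(self, prefix):
        # Return a list copy so callers can delete while iterating.
        return [key for key in self._blobs if key.startswith(prefix)]


class TenantBlobRouter:
    """Send each tenant's blobs to S3 or the database, hiding the choice."""

    def __init__(self, database_store, s3_store, tenants_on_s3=()):
        self._db = database_store
        self._s3 = s3_store
        self._tenants_on_s3 = set(tenants_on_s3)

    def _store_for(self, tenant_id):
        return self._s3 if tenant_id in self._tenants_on_s3 else self._db

    def put(self, tenant_id, blob_id, data):
        self._store_for(tenant_id).put(f"{tenant_id}/{blob_id}", data)

    def get(self, tenant_id, blob_id):
        return self._store_for(tenant_id).get(f"{tenant_id}/{blob_id}")

    def move_tenant_to_s3(self, tenant_id):
        """Copy a tenant's blobs to S3, drop them from the DB, flip the flag."""
        for key in self._db.keys_with_prefix(f"{tenant_id}/"):
            self._s3.put(key, self._db.get(key))
            self._db.delete(key)
        self._tenants_on_s3.add(tenant_id)
```

Callers only ever use put and get with a tenant and blob identifier, so moving a tenant from one backend to the other requires no changes anywhere else in the code.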
Moving the blobs to S3 was another huge win. Any data on S3 really doesn’t need to be backed up, as S3 essentially does that automatically. The cost for space there is extremely competitive, and we estimate it’s actually cheaper from a TCO standpoint than keeping the data in MySQL. We managed to move between 70% and 85% of the data out of each tenant on the DB server (depending on how heavily they were using attachments), and this had benefits as well. Not only could MySQL perform better with less data to manage, but our overall backup time went way down since there was less data to back up. This was an important stage setter for the next and final phase.
Our last phase was the most complex from a planning standpoint, but not really all that hard or scary at the end of the day. We wanted to move all of our servers up to run on Amazon EC2 instances. There were two essential gating factors to doing this. First, we needed a persistent storage answer for EC2 from Amazon. Until recently, if an EC2 instance went down, any data on it was lost. Amazon conveniently introduced Elastic Block Store (EBS) for EC2, and that problem was fixed. Block storage enables us to mount robust storage volumes on an EC2 machine.
The transfer of blobs into S3 was the second gating factor because it made the total time to transfer everything to Amazon and get it running manageable within one of our maintenance windows. The key was the radical reduction in the amount of data that had to be moved. In our new “slimmer” MySQL mode, our data could be moved in under 2 hours, which we deemed acceptably short. We could shut down the application during a weekly maintenance window (midnight on the weekend), move the data, and when we brought the system back up, it would be all Amazon.
We did some contingency planning along the way. We’d been running temporary EC2 instances of our application for months, and knew that worked well. We did a dry run of the whole move a few days before the real thing just to be sure we had our routine right and hadn’t missed something. We even changed one of our many domains (a seldom-used one) to point to the new instance so we could test our DNS migration.
Then came the real move. It took us about 5 hours to move the data, test the result to our satisfaction, and throw the switch to start pointing the DNS at the new servers. As a precaution, we kept the old servers running with forwarding capability. This way if portions of the Internet didn’t get the news right away, the old servers would still bounce requests through to Amazon.
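For readers wondering what forwarding from the old servers might look like, here is one minimal way to do it: an HTTP 307 redirect, which preserves the request method and body. This is a swapped-in illustration and not necessarily how our forwarding worked (a reverse proxy would be another option). The origin URL and all names are hypothetical, and the sketch assumes the new servers are reachable at an address that does not depend on the stale DNS record.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical address of the new servers; must not rely on the stale DNS entry.
NEW_ORIGIN = "https://new-app.example.com"


def forward_location(path, new_origin=NEW_ORIGIN):
    """Build the URL on the new servers for a request path on the old ones."""
    return new_origin.rstrip("/") + "/" + path.lstrip("/")


class ForwardingHandler(BaseHTTPRequestHandler):
    """Answer every request with a 307 redirect to the new origin."""

    def _forward(self):
        self.send_response(307)  # 307 keeps the original method and body
        self.send_header("Location", forward_location(self.path))
        self.end_headers()

    # Handle the common HTTP methods the same way.
    do_GET = do_POST = do_PUT = do_DELETE = do_HEAD = _forward


def serve_forwarder(port=8080):
    """Run the forwarder until interrupted (not called automatically)."""
    HTTPServer(("", port), ForwardingHandler).serve_forever()
```

Once traffic to the old servers has dropped to nothing, the forwarder can be shut down along with them.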
Presto! We are live and we are nationwide (worldwide, actually), to paraphrase ZZ Top. The view here from the Cloud is absolutely gorgeous. We have better performance, a huge amount of headroom (we can provision a whole new pod capable of running lots of tenants on our multitenant architecture in about 3 minutes), lower costs, and a raft of other benefits. Highly recommended!