Cloud as DevOps Vending Machine

Thunder Technologies
Sep 30, 2020

The quality assurance team that anchors your continuous delivery DevOps model probably does not lack tools and expertise to test your product. What they might lack, however, is equipment.

Reserving time to validate a new feature on congested test equipment can delay continuous integration and delivery goals. VMware ESX mitigates the problem somewhat by making more efficient use of your servers, but you are still limited by the overall horsepower of your on-prem physical equipment, and ESX licenses are costly.

It bears repeating the obvious: the cloud, among its many advantages, offers not only a theoretically infinite data center but also the ability to assign each individual developer his or her own “test lab,” eliminating the bottleneck created by the QA team’s own work.

Of course, there is no such thing as a free lunch. If used effectively, however, a QA-lab-per-developer model can truly realize the promise of DevOps, especially for lean organizations with low-footprint, cloud-native applications like ours.

Our products, Thunder for EC2 on AWS Marketplace and Thunder for GCP on GCP Marketplace, are deployed as SaaS solutions in the AWS and Google clouds to automate the replication, testing, and failover of mission-critical cloud applications across regions for disaster recovery protection. At a flat fee of $20 per month, our products cost less per year than competitors cost per month. The low price is a direct offshoot of the simplicity of the products. Because the code base is relatively small, a comprehensive testing regimen can exercise essentially the entire product relatively quickly.

This means we can develop automation tools that run rapidly, minimizing our costs in the cloud. As a concrete example, let me describe our basic testing tool, called StoF (Start-to-Finish), which performs the following operations much as an end user would:

  • Deploy a pair of instances in the “primary” region representing production workload using a custom image with MySQL installed, provisioned with test data and a basic load
  • Launch our product through CloudFormation (on AWS) or Deployment Manager (on GCP) in a separate region representing the disaster recovery failover region
  • Connect to the product using HTTPS to drive the provisioning of duplicate instances in the DR region, replication from primary to secondary, testing of the DR instances after each replication job, and failover of the applications from the primary to the DR region. This automates the operations a user would perform through the user interface.

By enabling detailed cost reporting in AWS and GCP, we can obtain the usage and cost of each major operation that our test program executes, in order to get a feel for the overall cost of running the test once. Below is a breakdown of real usage on AWS, choosing Mumbai (ap-south-1) as the primary region and one of AWS’s newest regions, Bahrain (me-south-1), as the DR region, not only to exercise the product but also to confirm there is no unexpected behavior in a new region.
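
For readers who prefer to pull such numbers programmatically on AWS, one possible approach is the Cost Explorer API via boto3. The sketch below is an assumption about how that could be done, not a description of our tooling: the function names `fetch_costs` and `summarize` are made up for illustration, the query needs boto3 plus credentials with Cost Explorer access, and pagination of results is ignored.

```python
from collections import defaultdict
from typing import Dict


def fetch_costs(start: str, end: str) -> dict:
    """Query Cost Explorer for daily cost grouped by usage type.

    Dates are YYYY-MM-DD strings; the end date is exclusive.
    """
    import boto3  # imported lazily so summarize() can be used without it

    ce = boto3.client("ce")
    return ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
    )


def summarize(response: dict) -> Dict[str, float]:
    """Add up the cost per usage type across all returned days."""
    totals: Dict[str, float] = defaultdict(float)
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            usage_type = group["Keys"][0]
            amount = group["Metrics"]["UnblendedCost"]["Amount"]  # a string
            totals[usage_type] += float(amount)
    return dict(totals)
```

Calling `summarize(fetch_costs(start, end))` for the day of a test run would give one total per usage type, which is roughly the shape of the line items discussed in the rest of this walkthrough.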

Our custom MySQL image resides in Paris (eu-west-3), so it is first copied to the primary region. AMI replication is itemized as an efficient snapshot copy (i.e. charged not for the entire 8 GB disk but only for the actual data in the instance, here about 0.6 GB) in AWS Cost Explorer:
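
To make the billing difference concrete, here is a small arithmetic sketch. The function names are illustrative, and the per-GB rate is deliberately left as a parameter because it depends on the regions involved; only the 0.6 GB and 8 GB figures come from the run described here.

```python
def copy_charge(data_gb: float, rate_per_gb: float) -> float:
    """Charge for a copy billed on the data actually stored, not on disk size."""
    if data_gb < 0 or rate_per_gb < 0:
        raise ValueError("sizes and rates must be non-negative")
    return data_gb * rate_per_gb


def billed_fraction(data_gb: float, disk_gb: float) -> float:
    """Share of the provisioned disk size that is actually billed."""
    if disk_gb <= 0:
        raise ValueError("disk size must be positive")
    return data_gb / disk_gb


# Figures from the text: about 0.6 GB of data on an 8 GB disk.
print(f"{billed_fraction(0.6, 8.0):.1%} of the disk size is billed")
# prints: 7.5% of the disk size is billed
```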

Our test program then launches two instances from this image representing two production workloads. Less than an hour elapses from the start of the test to its finish, so the running charges for these instances are low:

Next, our solution, Thunder for EC2, is launched as a micro instance in the DR region through CloudFormation; our test program must first copy the image that represents our continuous development snapshot there, similar to how it copied the MySQL image to the production region. The instance hosting our software runs for the entire length of the test, but again, as a micro instance, the cost is roughly one cent:

Thunder for EC2 automates the periodic replication of production data to the disaster recovery region by copying snapshots of the production instances using standard APIs. The size of the snapshot depends on the differential data copied since the last snapshot, and our test tool runs a few replication jobs in sequence. The entries below show the cost for each snapshot from primary (Mumbai) to secondary (Bahrain):

For a given instance pair, after each replication job, Thunder for EC2 tests the DR instance by powering it on briefly to confirm it can start, and optionally running a deep test to confirm application recovery. In the case of MySQL, our product can connect to the database and read sample data. These tests are extremely brief, less than a minute or so, so instance running time at the DR region is almost negligible (in fact for one of the instances it does not even appear as a line item). A failover merely powers on the DR instances and confirms the application is accessible and up-to-date.
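
The brief power-on test described above can be sketched generically as follows. This is not the product's implementation: `check_dr_instance` and its callback parameters are hypothetical, and the 60-second timeout and 2-second polling interval are assumed values chosen only for illustration.

```python
import time
from typing import Callable, Optional


def check_dr_instance(
    start: Callable[[], None],
    is_running: Callable[[], bool],
    stop: Callable[[], None],
    deep_test: Optional[Callable[[], bool]] = None,
    timeout_s: float = 60.0,  # assumed limit for the instance to come up
    poll_s: float = 2.0,      # assumed polling interval
) -> bool:
    """Power on a DR instance, confirm it starts, optionally run a deep test."""
    start()
    try:
        deadline = time.monotonic() + timeout_s
        while not is_running():
            if time.monotonic() >= deadline:
                return False  # the instance never came up in time
            time.sleep(poll_s)
        # The deep test might, for example, connect to MySQL and read sample data.
        return deep_test() if deep_test is not None else True
    finally:
        # Always power the instance back off so its running time stays minimal.
        stop()
```

Stopping the instance in a `finally` clause reflects the cost concern in the text: whatever the outcome of the test, the DR instance should not keep running.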

The failover completes the test, so our test program proceeds to tear down the environment: terminate the DR instances, delete the CloudFormation stack, and terminate the primary instances. The images remain in case the same pair of regions is used for a future test, in order to avoid the image copy.
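
A teardown like this is worth making robust, since a step that fails halfway could leave billable resources behind. The following is a minimal sketch under that assumption; the `tear_down` helper and the step names are hypothetical, and the real actions would be cloud API calls.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def tear_down(steps: List[Step]) -> List[str]:
    """Run every cleanup step in order and return the names of those that failed.

    A failure in one step does not prevent the remaining steps from running.
    """
    failed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception:
            failed.append(name)
    return failed


# Order taken from the text; the lambdas are placeholders for real API calls.
steps: List[Step] = [
    ("terminate DR instances", lambda: None),
    ("delete CloudFormation stack", lambda: None),
    ("terminate primary instances", lambda: None),
    # Images are intentionally not deleted so a later run can skip the copy.
]
print(tear_down(steps))  # prints: []
```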

Image copying, snapshot replication and instance running time are the only significant line items. Everything else such as API access and bucket listings is noise. The total cost is then:

Twenty-five cents to run our comprehensive coverage test once from start to finish.

Given the elastic nature of AWS and GCP, there is no restriction as to when a developer can run the test program on a new image of code — resources are always available. Given the absurdly low cost, there really is no restriction on how often the test program can be run either.

Plainly the test can be expanded with more primary instances, or more consecutive replication jobs, but no matter the configuration a rough cost estimate is quantifiable. At twenty-five cents a run, the QA lab is almost like a vending machine: put in a quarter, get a result. Run it after each code check-in, run it at lunch time, run it while in the bathroom.
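
As a rough budgeting aid, the per-run figure scales linearly. In the sketch below only the twenty-five cents per run comes from our measurement; the team size, the number of runs per day, and the 21 working days per month are hypothetical inputs, and a larger test configuration would need its own measured per-run cost.

```python
from decimal import Decimal

COST_PER_RUN = Decimal("0.25")  # measured cost of one start-to-finish run


def monthly_cost(developers: int, runs_per_day: int, workdays: int = 21) -> Decimal:
    """Estimated monthly spend if every developer runs the test daily."""
    if min(developers, runs_per_day, workdays) < 0:
        raise ValueError("inputs must be non-negative")
    return COST_PER_RUN * developers * runs_per_day * workdays


# Hypothetical team: 5 developers, each running the test 10 times a day.
print(monthly_cost(5, 10))  # prints: 262.50
```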

Our business model is predicated on automating a straightforward disaster recovery procedure with a focused, uncomplicated product in order to save cloud users time and money and give them peace of mind that their workload is protected. A dividend of such a focused product is that it can be tested comprehensively and rapidly. By quantifying the costs of testing we can be confident that DevOps continuous development can be accomplished without sacrificing quality and without breaking the bank.

Our business model also values transparency and we hope that sharing our experience assists you in achieving your DevOps goals to the fullest capacity.

Thunder Technologies provides robust, cost-effective disaster recovery automation for the public cloud