In 1999, David Bowie was interviewed on BBC Newsnight about the topic of the Internet (and BowieNet). While the concept of multi-cloud design did not exist at the time, here is one bit that stands out for me:
There are always two, three, four, five sides to every question. [The Internet] will crush our ideas of what mediums are all about. – David Bowie
Multi-cloud design has a similar vibe. While much of the analyst industry and vendor ecosystem has been focused on the acceptance or rejection of multi-cloud to vie for position in marketing materials, practitioners of the world have already rolled up their sleeves to tackle problems that require – you guessed it – multiple clouds.
I’ve been heavily steeped in the world of “making things that run on multiple clouds” for the past half decade, with a lot of hands-on engineering in the past 12 months, and boy do I have opinions! This blog post captures various musings on multi-cloud with a sprinkle of sass and a healthy dose of reality.
Clouds are Not Particle Throwers
There are two main schools of thought when it comes to multi-cloud design:
- Application stacks are spread across cloud vendors using common denominator services. Essentially, clouds are the big, bulky Proton Packs used by the Ghostbusters because you’re never supposed to cross the streams!
- Use cases are spread across cloud vendors using design criteria. These decisions stem from a variety of context, such as acquisitions of other companies or simply finding the best tool for the job.
The Flaw with Application Stacks
I’d agree that the first idea is bad in general – it is tough to do, rarely has any positive impact on the design, and, in most cases, is done to hedge bets against some sort of lock-in monster without any measurable success.
An example of such a deployment is below:
There’s a lot going on with this type of architecture. Numerous services are shuffling, ingesting, and writing data to multiple places. Logs must be captured for the applications, infrastructure components, user / service principals, API calls, and more. If any optional shim layers are added – such as deploying a hypervisor or third-party scheduler into the mix – even more “abstraction complexity” is introduced. Not a fan!
The Benefit of Use Case Driven Architecture
What about the second idea?
There are numerous advantages to having a plan for how, when, and where you will use various cloud providers, especially if you’re in the business of buying, merging with, or acquiring other businesses. This model is simpler than the previously described one. The shim layer remains optional and is, in many cases, a migration tool or “crutch” used to avoid impending toil.
Lydia Leong highlights the need for governance and discipline in the multicloud gelatinous cube blog post. I’m down with this name! A plethora of folks I chat with are approaching design using this model to determine where the sweet spots are in different cloud services. It takes steadfast discipline to avoid falling into the traps of well-architected anti-patterns.
Pipeline Thinking is the Key to Multi-Cloud Design
I’m a big fan of using multiple clouds. There are services, data constructs, APIs, network resources, and other “things” that I find useful based on the project. I leave it up to the cloud providers to ultimately win my business; I don’t automatically give them business based on technical religion.
Adoption of this model requires a shift towards pipeline thinking as opposed to imperative building. Each use case is treated as a granular application stack that is deployed into the cloud or service provider of choice. A common set of languages and tools are adopted by those tasked with building and operating the environments. Governance is introduced early in the process with every application receiving the goodness that stems from those efforts.
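As a hedged sketch of that idea – the stack names, providers, and template paths below are invented for illustration – each use case can declare its destination cloud, and a common pipeline entry point groups and dispatches the granular stacks:

```python
from dataclasses import dataclass

# Illustrative only: each use case is a small, self-contained stack
# that declares its target provider; shared pipeline code dispatches it.

@dataclass(frozen=True)
class Stack:
    name: str
    provider: str   # e.g. "aws", "azure", "gcp"
    template: str   # path to the IaC template for this stack

def plan_order(stacks: list[Stack]) -> dict[str, list[str]]:
    """Group stacks by provider so each pipeline run stays granular."""
    grouped: dict[str, list[str]] = {}
    for s in stacks:
        grouped.setdefault(s.provider, []).append(s.name)
    return grouped

stacks = [
    Stack("media-ingest", "aws", "stacks/media-ingest/main.tf"),
    Stack("ad-analytics", "gcp", "stacks/ad-analytics/main.tf"),
    Stack("corp-identity", "azure", "stacks/corp-identity/main.tf"),
]
print(plan_order(stacks))
```

The point is the shape, not the code: the use case owns the “where,” while the pipeline owns the “how.”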
I personally run resources in AWS, Azure, GCP, and other systems as both the architect and engineer. It takes very little of my time to do so and there is nothing special about me. This is all made possible by adopting pipeline thinking.
Each unique vendor builds their version of “the cloud” from a different perspective and with different underlying technology stacks. Each provider has good, bad, and ugly components, services, and pricing. Adopting a pipeline thought process means a return to true architecture thinking: how can I leverage these tools to fulfill requirements, avoid or mitigate constraints, and alleviate risk?
Managing Multi-Cloud Projects
Here is a subset of projects deployed across different cloud providers. These “site deploy” projects build VPCs, resource groups, projects, lions, tigers, and bears – oh my!
Each project pulls from a centralized repository of pipeline templates to ensure the infrastructure and application code meets defined standards when planned and deployed. A centralized dashboard maintains visibility on the health, deployment duration, drift, and other vital metrics as described in this blog post.
Pipeline thinking is about putting together a workflow that is used to deliver resources into cloud providers of choice. It is the first thing you should do! It makes building resources trivial and even fun.
Common Ingredients Needed for Success
In a nutshell, multi-cloud design can be achieved with the right ingredients:
- The conceptual, logical, and physical (specific) model still holds true. Blame Zachman, if you want. 😉
- The first investment after prototyping should be a pipeline. For infrastructure and most everything else, this means building a Continuous Integration (CI) pipeline to deliver all resources. The slowest and weakest link in resource delivery is your speed limit.
- Embrace “Dev/prod parity” from the Twelve-Factor App; avoid using unique cloud providers as tiers for development.
- Invest heavily in intelligence that lives within the pipeline. See Checkov and Cloudrail as examples. Write your own tools only when existing tooling falls short and contribute them to the open source community whenever possible.
- Avoid enforcement and governance outside of the pipeline. Beautiful style guides are utterly worthless if they cannot be programmatically enforced. My cup of tea is OPA.
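To make the last two points concrete, here is a minimal, illustrative check in plain Python – not actual Checkov, Cloudrail, or OPA – that fails a pipeline when planned resources are missing required tags. The plan structure mimics a trimmed-down `terraform show -json` output; the tag names are assumptions:

```python
import json

# Hypothetical pipeline gate: reject a plan whose resources lack
# required tags. Structure mirrors a simplified Terraform plan JSON.
REQUIRED_TAGS = {"owner", "cost-center"}

def violations(plan: dict) -> list[str]:
    problems = []
    resources = plan.get("planned_values", {}).get("root_module", {}).get("resources", [])
    for res in resources:
        tags = set(res.get("values", {}).get("tags") or {})
        missing = REQUIRED_TAGS - tags
        if missing:
            problems.append(f"{res['address']}: missing tags {sorted(missing)}")
    return problems

plan = json.loads("""
{"planned_values": {"root_module": {"resources": [
  {"address": "aws_s3_bucket.logs", "values": {"tags": {"owner": "ops"}}},
  {"address": "aws_s3_bucket.data", "values": {"tags": {"owner": "ops", "cost-center": "123"}}}
]}}}
""")
for v in violations(plan):
    print(v)
```

Because the check lives in the pipeline, every project inherits it for free – no style guide archaeology required.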
When done well, the “cloudy” parts just sort of evaporate away as implementation details.
Because so much has already been done to construct applications in each cloud environment, much of the code can be re-used elsewhere to support DRY (Don’t Repeat Yourself) efforts. For example, using private modules across projects.
Multi-Cloud Design Anti-Patterns to Avoid
There are ways to “do the things” incorrectly. The most prolific anti-pattern is trying to distill cloud providers down to their minimum nuts and bolts for application hosting. No, stop this!
Regardless of the poison you pick to logically “pool” different clouds together – be it a hypervisor shim or container scheduler – this is generally a bad idea. Best-case scenario, you now have a pile of spaghetti to manage in which each cloud’s services behave differently, have unique SLAs, and vary widely in performance. Worst-case scenario, the application simply does not work after piles of effort and money have been thrown at it.
This mistake is commonly made by the imperative builders at companies that just want their “stuff” to run in the cloud like it is today with minimal or zero effort. Sorry, this isn’t how it works, folks. Cloud has so much more to offer!
3 Common Problems
Digging into the anti-pattern example further: I’ve tried this approach in the past with a few different application designs (both monolithic and services-oriented).
The main problems come from:
- The mechanisms that support authentication are generally geared towards account / subscription / project interoperation within a single provider. A few services, such as signed URLs for Amazon S3, are good about not caring who is using the service. The remainder require hacks or role-assumption chains that seemingly never end.
- There is a lot of intelligence baked into first-party and third-party solutions that is lost. Terraform, for example, understands dependencies within a single cloud provider with no problem; it requires much more hand-holding and `depends_on` statements when spraying resources across clouds.
- Services across cloud providers vary widely in how their APIs respond to requests. Fulfilling those requests is another snake pit entirely. I just want to deploy my application, thank you very much, not babysit it across other vendors and write complex scheduling rules.
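To illustrate that last point, here is a hedged sketch of the normalization glue that API variance forces on you – both response payloads below are invented approximations, not real provider output:

```python
# Two providers report the same concept (an async operation's status)
# in incompatible shapes, so every caller needs an adapter layer.
# Payload shapes are illustrative, not actual API responses.

def aws_style_done(resp: dict) -> bool:
    # CloudFormation-ish: {"StackStatus": "CREATE_COMPLETE"}
    return resp.get("StackStatus", "").endswith("_COMPLETE")

def gcp_style_done(resp: dict) -> bool:
    # Long-running-operation-ish: {"done": true, "error": null}
    return bool(resp.get("done")) and resp.get("error") is None

NORMALIZERS = {"aws": aws_style_done, "gcp": gcp_style_done}

def is_finished(provider: str, resp: dict) -> bool:
    """Hide per-provider status quirks behind one question."""
    return NORMALIZERS[provider](resp)

print(is_finished("aws", {"StackStatus": "CREATE_COMPLETE"}))  # prints: True
print(is_finished("gcp", {"done": True, "error": None}))       # prints: True
```

Multiply this by every service, SLA, and failure mode involved and the “babysitting” cost becomes obvious.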
Not worth it, in my opinion.
Please accept a crisp high five for reaching this point in the post!