Wednesday, December 21, 2016

The Public Cloud: A Defense

“Much of history revolves around this question: how does one convince millions of people to believe particular stories about gods, or nations, or limited liability companies? Yet when it succeeds, it gives Sapiens immense power, because it enables millions of strangers to cooperate and work towards common goals.”
Yuval Noah Harari. Sapiens: A Brief History of Humankind.

“A conclusion is the place where you got tired of thinking.”
Steven Wright.

“This is the central fallacy of the writer: he or she must absolutely believe something that is not true in order for it to become true.”
Peter Welch. The Writer’s Fallacy.

I seem to have kicked a hornet’s nest with my recent blogs about Enterprise IT and Infrastructure trends. At the heart of this discussion is the architectural inflection point at which we stand today, the public cloud, and its readiness for taking on business critical workloads for the most demanding enterprises – financials, healthcare, government, etc.
I've been getting tons of feedback - both positive and negative, both in person and in writing. I guess I'm glad we are finally starting to have the conversation. It's about time. I guess, sitting in my bubble, I had sort of assumed these decisions had already been made and the rationale for them made obvious. So I was somewhat surprised recently when I met with a group of Enterprise IT practitioners and professionals who pushed back on me quite stridently, convinced that the cloud is not ready for enterprise workloads.
Several of the comments stuck with me, although I felt we didn't have enough time to delve into the details. Hence this blog. I hope to answer some of the points they raised. Most of them are valid concerns and objections that I've heard in the past. But I've never seen them documented in one place. Overall, there was general agreement in the room that the cloud was the future. The only question was when. They all seemed to think it was still many years (“decades”) away. I happen to think it’s right around the corner, if not already upon us.
One of the comments I heard was: “But they [the big cloud providers] won't indemnify us. What if they have a major outage? It's going to cost us millions of dollars in business.”
I wish I had had the quick wit to retort: “… as opposed to what the corporation is getting from its IT department today for management of on-prem infrastructure?” Last time I checked, the IT department was a cost center. If you have a major outage in your private data center, some small subset of the IT department may get fired or laid off. But, chances are, you will get an even larger budget next year to “fix” the problem. So, where and how exactly does the corporation get “indemnification” from its current IT organization for an outage? They pay the salaries of IT personnel and have no recourse if and when something goes wrong. As such, why would you expect it from a cloud provider? If they agree to that and have an outage, they have to pay thousands of companies using their service. The budgetary impact would be huge to any cloud provider. It doesn't scale. Forget about it. You're never going to get indemnification. You didn't have it until now, so why did it suddenly become a requirement?
Another comment I heard was: “we are not like Netflix. If their infrastructure crashes because of an Amazon outage, the worst that happens is you lose your place in the video you were watching. [General laughter around the room] If our infrastructure crashes, it costs the company millions of dollars per hour of outage.”
The implication here is that cloud is good enough for consumer brands like Netflix because there is no critical business data at risk due to an outage. But that obviously wouldn't work for us: “We are the financials, we are the health care providers, we can't afford an outage even for an instant.”
This argument seems to ignore the fact that “consumer” companies like Netflix and Amazon and Facebook are routinely serving millions of customers with higher availability and performance than almost any Enterprise company I can think of. Remember that those “Enterprise” companies (let’s say a financial institution or a hospital) typically only have to deliver their “services” to thousands or at most tens of thousands of customers at a time. Not millions. And the architectural solutions currently deployed by those same Enterprise organizations are already bursting at the seams trying to handle that load. In other words, “Don’t knock it till you’ve tried it.” Delivering cloud services to millions of people is, in fact, the best way to make a technology bullet-proof for the enterprise. If you don’t believe me, just look at the history of Gmail.
For every single vertical you care to name, I bet I can name a “cloud native” company that is delivering better quality of service to its “customers” than any traditional Enterprise company. Hands down. And is innovating more quickly and with more scalable backend databases than any on-prem solution already stretched to its architectural extremes through twenty years of contortions - ahem, I mean integrations. Yes, financial and health verticals, too. Not just Netflix. How about Amazon Web Services? How about Amazon as the world’s biggest super store? How about Apple as a financial company? How about Google as an advertising company?
Another comment was: “What if they have a total data center outage?” Any properly architected enterprise application should have a redundancy strategy and a plan for failover and failback in the case of full data center outages. The requirement is on the modern app to architect itself properly. Any properly architected modern enterprise app can withstand the outage of a data center - regardless of whether that data center is managed locally on-prem or in the cloud by a cloud provider. The onus is on the app. If you are still running business critical apps that can't withstand an entire data center outage, you have bigger problems.
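To make "the onus is on the app" concrete, here is a minimal sketch of application-level failover. It assumes the app can reach the same service in more than one data center; the names `call_with_failover`, `primary_site`, and `secondary_site` are hypothetical stand-ins, not any provider's API, and a real app would also need data replication and a failback plan.

```python
from typing import Callable, Sequence


def call_with_failover(sites: Sequence[Callable[[], str]]) -> str:
    """Try each site in priority order and return the first successful response."""
    failures = []
    for site in sites:
        try:
            return site()
        except ConnectionError as exc:
            # Treat a connection error as "this data center is down" and move on.
            failures.append(exc)
    raise RuntimeError(f"all {len(failures)} sites failed")


def primary_site() -> str:
    # Hypothetical: simulates a full outage of the primary data center.
    raise ConnectionError("primary data center unreachable")


def secondary_site() -> str:
    # Hypothetical: a healthy replica in a second data center or region.
    return "served from secondary"


print(call_with_failover([primary_site, secondary_site]))  # prints "served from secondary"
```

Nothing in the sketch cares whether a site is on-prem or in the cloud. The failover logic lives in the app either way.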
Another comment I heard was: “Every few years, Silicon Valley gets enamored with another new technology or framework. Last year, you industry pundits were telling us about how wonderful OpenStack would be. Look at how far that got us. Why should we believe you now that you are preaching cloud?”
OpenStack is an open source community effort. You, Mr. Enterprise IT Guy or Gal, are signing up for being part of the community as it evolves. With OpenStack, you - again - get to play System Integrator. Because it’s an open system with Swiss Army knife connectors for everything from block storage to image management to networking to security to patching to whatever. Why on Earth did you think that path would lead to success or even converge quickly? The cloud is the opposite of that path. It says, Mr. Enterprise IT Guy, please stop getting in the middle of that level of infrastructure integration. Let us hide that complexity behind an API and an SLA. Go up the stack, young man!
Another, valid, comment was this: “We don't control the budget anyway. The BUs hold the purse strings and they get to make these kinds of big strategic decisions anyway. And they don't have any stomach for big upheavals. They just want the current stuff to keep working.”
Yup. And those are the same BUs who have developers writing cloud native apps right now. Because they've given up on Central IT’s ability to help them in any timely manner. I never said it wouldn't hurt to rip off the bandaid. It will require alignment from the top levels of the organization and you will get pushback from all the BUs: They just want to get their jobs done. They don’t want to deal with infrastructure. Sooner or later, some startup will offer the same service you are offering in your data center (be it block storage or compute or higher level services like database and firewalling and intrusion detection and load balancing). Sooner or later, you will acquire another company with a more progressive cloud based approach to infrastructure delivery, and you will find that some part of your critical infrastructure is already dependent on the public cloud anyway. You can either be a passive and resistant party to this journey or you can take the lead. It's up to you.
“It can be quite expensive. The prices for public cloud based services are still too high.” Yes, of course. They will charge what the market bears. It’s an open economy. I suspect they will continue to drop their prices over the next few years as their platforms mature further. The rate of architectural enhancements made to next generation cloud architectures is an order of magnitude faster than that of on-prem infrastructure hardware and software. You’re welcome to continue to invest in the old generation but, I promise, it’ll be for diminishing returns.
When you compare the costs, be honest and include all the hidden charges that go along with on-prem infrastructure. That’s not just a SAN box you ordered last year. This year, you already have to order the clustered upgrade to improve availability. Next year, you will also have to invest in the management console and the snapshot provider and the backup adaptor as well. Not to mention the Enterprise License Agreement for support, your own operations team that has to learn how to manage it (compared to the other three SAN solutions they’ve inherited over the past decade). And that’s just the SAN box. Of course, you also just finished M&A of a rival which came with its own NAS based strategy and associated hardware and software. I can keep going but you get the idea. Let’s be honest and compare apples and apples.

Yes, it’s expensive. But the cost will go down as more and more people and companies adopt it. Ironically, that will also improve its quality. It’s a virtuous cycle that can’t be duplicated in the complexity of on-prem plug-n-play architectures of yesteryear. At the end of the day, my argument is an architectural one. We have learned a lot about distributed systems architecture in the past ten or twenty years. It’s practically impossible to retrofit those learnings into old monolithic architectures like the ones currently running in every enterprise data center in the world.

At the end of the day, I walked away even more convinced that we will see a massive shift to the public cloud for enterprise companies over the next few short years - at least the ones that want to survive in the long run.