"To improve is to change; to be perfect is to change often."
Winston Churchill. 1874-1965.
“I weighed myself today. I don’t know why. I don’t use the information to guide my behavior in any way. Why do I do it?”
Louis C. K. Chewed up.
“Hard work pays off in the future. Laziness pays off now.”
My latest blog post on public vs. private cloud resulted in yet another flurry of emails from readers and friends alike. A former colleague, a senior engineering leader at a major high-tech company, sent me this picture of a Gartner slide that he had found on Twitter:
His point should be obvious but I’ll spell it out: “Ben, even if you are correct about the public cloud being ready for mission critical workloads in terms of availability and security, you are still ignoring the cost. If you had done your homework, you would have known that private cloud is significantly cheaper than public cloud. Here’s some concrete data from Gartner.”
Neither of us had access to the Gartner report so we can only speculate about the specific details of the workloads or why 285 VMs are needed to run them. It almost doesn’t matter because I think the chart is too simplistic to be useful to any company seriously considering a re-architecture of their enterprise applications.
First and foremost, it assumes that I want to run exactly the same applications in the public cloud as I do on-prem. Gartner is trying to do an “apples to apples” comparison here but, by doing so, they are precluding any chance to optimize the workload running in the cloud. They are basically saying: “I don’t want to know what apps I’m running. I just want to keep running exactly the same workloads in the cloud.” I can see why this makes sense to an IT professional for budgeting purposes. Unfortunately, such a comparison ignores so many factors that it misses the whole point of “cloud migration” and achieves only local optimizations in the overall TCO. I will be the first to admit that it is quite possible that it costs twice as much to run (let’s say) Exchange or SharePoint or Oracle CRM in the public cloud as it does on my on-prem private cloud. But that’s not the point.
The stated question should not be “How can I keep running Exchange?” but rather “How can I deliver email services to my employees with X gigs of mailbox storage each and with retention policy Y?” By assuming the former question and comparing just the cost of running the VMs, we are pretty much ignoring all other capabilities of a public cloud and the potential benefits of a SaaS model. The only thing we are outsourcing here is the hardware required to run the workloads but we slavishly stick to exactly the same outdated stack of software we had on-prem, the same one that was designed twenty years ago before the internet even existed.
In the above example, the comparison doesn’t include the expense of buying Exchange Server, buying Windows Server, buying Active Directory and its associated Client Access Licenses, buying Outlook/Office for all employees, buying a hypervisor, paying for failover capabilities in Exchange or Windows, paying the salaries and bonuses of IT staff to manage said solution, etc. Well, guess what. $150k/year is peanuts compared to the cost of all that software and the cost of managing it - costs that we would completely bypass by truly utilizing the cloud as it was meant to be used.
What if, instead, the slide showed the cost of delivering enterprise-class email services through an on-prem installation of Exchange versus delivering the same functionality using Gmail (or another equivalent SaaS solution)? I bet the numbers would look very different when you include all those other “hidden” costs of any on-prem (read: legacy) solution. Gmail doesn’t need any of the software I mentioned in the previous paragraph, so why burden the cloud solution with the associated costs? Shouldn’t we be analyzing the problem at the application/service level instead of just the cost of outsourcing the hardware to an IaaS provider? My main argument here is that you can’t get the benefits of the cloud unless you fully embrace the cloud and migrate to cloud-native solutions.
If we are talking about running homegrown (as opposed to pre-packaged) apps, the slide also ignores the biggest advantage of the cloud: its elasticity. It assumes that I need 285 dedicated VMs for the entire year to run my workloads. What if I only need 50 VMs for most of the year and the rest for only a week at the end of the quarter? Wouldn’t we save a massive amount on CapEx by using only what we need as opposed to assuming a fixed number of servers for the entire year? Isn’t that the whole point of the cloud?
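To make the elasticity argument concrete, here is a rough back-of-the-envelope sketch in Python. It compares capacity in VM-weeks rather than dollars, since the slide's pricing details are not available. The 285 and 50 VM figures come from the text above; the 52-week year and the assumption of exactly one peak week per quarter (four per year) are illustrative, and the function names are hypothetical.

```python
# Capacity comparison in VM-weeks (not dollars): a sketch under stated assumptions.
WEEKS_PER_YEAR = 52   # assumption: a 52-week year
PEAK_VMS = 285        # fleet size shown on the slide
BASELINE_VMS = 50     # hypothetical steady-state need from the text
PEAK_WEEKS = 4        # assumption: one peak week at the end of each quarter


def fixed_vm_weeks(peak_vms: int, weeks: int = WEEKS_PER_YEAR) -> int:
    """Capacity consumed when the full fleet runs all year."""
    return peak_vms * weeks


def elastic_vm_weeks(baseline_vms: int, peak_vms: int, peak_weeks: int,
                     weeks: int = WEEKS_PER_YEAR) -> int:
    """Capacity consumed when only the baseline runs all year and the
    extra VMs are added during the peak weeks only."""
    burst_vms = peak_vms - baseline_vms
    return baseline_vms * weeks + burst_vms * peak_weeks


fixed = fixed_vm_weeks(PEAK_VMS)
elastic = elastic_vm_weeks(BASELINE_VMS, PEAK_VMS, PEAK_WEEKS)
savings = 1 - elastic / fixed

print(f"fixed:   {fixed} VM-weeks")
print(f"elastic: {elastic} VM-weeks")
print(f"capacity not paid for: {savings:.0%}")
```

Under these assumptions the elastic profile uses roughly a quarter of the VM-weeks of the fixed fleet. The real dollar difference depends on per-VM pricing on each side, which this sketch deliberately leaves out.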
Moreover, by perpetuating the status quo, by taking the current “legacy” software and running it in the cloud, we are ignoring opportunities for architectural and implementation enhancements. Part of my argument here is that it is actually beneficial for an enterprise to revisit and re-architect its applications once in a while. A good time for such a review may be when massively disruptive inflection points, such as the cloud, show up in the industry.
Just because transactional databases were state of the art twenty years ago doesn’t mean you should continue to use them for storing all data (a sin that many old apps are guilty of). I bet most application state being stored in SQL Server or Oracle today can be thrown into a lighter-weight, more scalable, and free key-value store - if only we were willing to change the code. I can’t count the number of times I’ve inherited legacy software that, according to its authors, couldn’t possibly be improved upon. In almost every single case, putting a few smart engineers (with fresh eyes) on the problem has resulted in improvements of multiple orders of magnitude in performance and scalability. The last thing you should assume is that the crap you’re running in your data center is “state of the art” and needs to be maintained in perpetuity in its current state.
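As a minimal illustration of what "application state in a key-value store" can look like, here is a Python sketch. The `KeyValueStore` class is a hypothetical in-memory stand-in, not the API of any particular product, and the session data is made up for the example. A real deployment would swap the dictionary for a networked store and would need to think about consistency and expiry.

```python
import json
from typing import Any, Dict, Optional


class KeyValueStore:
    """Hypothetical in-memory stand-in for a networked key-value store."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: Dict[str, Any]) -> None:
        # Serialize to JSON so the stored value is an opaque string,
        # as it would be in most real key-value stores.
        self._data[key] = json.dumps(value)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        # Return the decoded value, or None when the key is absent.
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)


# Example: per-session state keyed by a session ID instead of
# being spread across relational tables.
store = KeyValueStore()
store.put("session:42", {"user": "alice", "cart": ["sku-1", "sku-2"]})
session = store.get("session:42")
print(session)
```

The point of the sketch is the access pattern: one key, one lookup, no joins. State that genuinely needs transactions or ad hoc queries is a different matter and may still belong in a relational database.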
If I may quote the classic from the 1970s, I think every enterprise should follow Eric Idle’s advice from Monty Python and the Holy Grail and tell its developers to “Bring out your dead!” at least once every five years. I’m sure many ancient apps (and their developers) would reply “I’m not dead yet!” just like in the movie, but clearing out the cobwebs will benefit everyone involved.
I won’t bore you with more details but, suffice it to say, I have a lot of scars on my back. I’ve inherited many such code bases throughout my career and pretty much every single one was ready for at least a 10x improvement within a few weeks or months. Assuming that the current monolithic ancient code base is performing optimally and using a forklift to move it to the cloud is not only inefficient cost-wise, it’s penny-wise and pound-foolish. By not taking advantage of architectural as well as implementation improvements, you are just perpetuating the mistakes of the past. What Gartner is proposing, and what many IT organizations are implementing today, is the lazy approach to cloud migration and the equivalent of “See no evil, hear no evil, speak no evil.”
Finally, by depending on on-prem infrastructure and private clouds, you are missing the single biggest opportunity offered by public clouds: business agility. I would be glad to sing the praises of an on-prem solution as soon as someone (anyone) shows me a single implementation of an Enterprise private cloud that offers the same deployment responsiveness and agility as any public cloud. I’ll take out my credit card and head over to AWS while you start contacting your IT department, filling out forms, and requesting a VM to get your job done. Even if you have a private cloud, I bet it will take you orders of magnitude longer to get productive than it would in the public cloud. Every hour spent waiting for resources is an hour wasted.
The next two paragraphs are a quote from another former colleague, Tom Gillis, in Forbes magazine, which makes my point more eloquently:
Where does the truth lie? In my opinion, it doesn’t matter. The driver for using the public cloud is not a 10 percent or even a 90 percent cost improvement. It’s about something more important…
… the process of launching these services was painful, because our rock star IT had their hands tied by infrastructure limits and had to make new infrastructure appear out of thin air. I needed a forecast on customer count and type—large versus small. I needed to approve a large amount for CAPEX up front based on this forecast. It took rounds and rounds of executive review. Once we finally got the green light, we needed to get in line and wait for the “new data center buildout” somewhere in the heart of Texas. The problem: My business stood still while I and my team were waiting for bulldozers in Texas to turn over a cow field and build a data center. If I could have simply deployed our software on the public cloud, knowing that it was as secure or more secure than when running on-prem, and never needed a forecast, I would have asked, “Where do I sign?”
There are far too many factors at play here to be distilled into a single numerical table: the impact of open source and free software as well as the currently popular freemium model of the cloud, the impact of massive deployments of a uniform software stack finding corner cases and bugs when used by millions of users simultaneously, the impact of lighter-weight virtualization technologies reducing footprint and maintenance overhead, the impact of architectural improvements in distributed applications over the past decade, the impact of having software managed by people who have access to the source code (as opposed to IT personnel whose best approach is to reboot the server or call the vendor for help), the inherent agility in a truly on-demand cloud model and, more importantly, the resulting business agility.
Sure, if you just want to keep running and managing your decades-old monolithic Enterprise app in exactly the same way, you can probably save some money on the infrastructure portion of your budget by running it on-prem. But I doubt you will convince me it’s the better approach for anyone involved.