As organizations embark on their cloud adoption journey, they quickly discover that there are a lot of things that could go wrong. These may include factors such as legacy operating models, outdated processes, and organizational resistance to change.
Our guest for today, Deloitte Consulting Managing Director Mike Kavis, focuses on the human side of digital transformation rather than technology. He talks about the organizational mindset that is needed to ensure a smooth transition to the cloud.
Today, joining us from Australia is Toro Cloud’s CEO and Founder, and Cocktails’ co-host, David Brown.
And our guest for today is the author of "Architecting the Cloud: Design Decisions for Cloud Computing Service Models" and the upcoming book "Accelerating Cloud Adoption: Optimizing the Enterprise for Speed and Agility" which we’ll be talking about today. A pioneer in cloud computing, our guest has also been ranked as one of the top 100 cloud experts and influencers in the world.
He is currently the chief cloud architect in Deloitte Consulting LLP’s Cloud practice, responsible for helping clients implement cloud strategy and architecture to drive digital transformation. Beyond his technology experience, he brings an insightful understanding of how to address the organizational change, process improvement, and talent management challenges associated with digital transformation.
Joining us for a round of cocktails is Mike Kavis. Mike, welcome to the show.
Hey, thanks for having me!
In your book, you said that to succeed with cloud adoption you need to consider people, process, and technology. We often focus on the technology side of the equation when considering cloud-related projects. Can you elaborate on the considerations associated with people and processes?
Yeah, and that's why I wrote the book. My first book was about the technology because, when I started on cloud, it was 2007, 2008, there were no books, and I learned a lot of stuff the hard way.
So, I said the next person shouldn't have to make all the mistakes I made. Here's what I learned. And then, as I've been this guy in consulting, I started seeing the struggles of all these companies. They have all the smartest people in the world but they just couldn't get enough stuff into the cloud. They could never turn off the old stuff.
And the patterns I kept seeing were the legacy processes, legacy org structures, and legacy tools that were preventing them from moving to the cloud as fast as they wanted. A common example is that a lot of companies focus on CI/CD. So, we have clients that have a great automated pipeline, but there's three months of process before they can hit the button to get it to go to stage. And there's another three months of process to put it in production because we still have these silos that have to do all these approvals, even though we automated all of it in the CI/CD pipeline. So, that's one example. Then when it comes to running the system, we try to run it the way we always have, with a separate group over here that knows nothing about our cloud implementation.
And then all of a sudden our reliability goes down when there's an incident because we're trying to do things with tools that don't work in the cloud and with processes that were built for biannual releases on a mainframe, not daily releases in the cloud. So, you know, too often the focus is purely on technology, but we need to change the supporting organization structure, the operating model, and the supporting business processes at the same speed that we're changing the technology.
You mentioned these CI/CD pipelines, and I think that’s where a lot of people start in becoming an agile organization: they focus on the build process. But it goes way beyond that. A lot of people have solved that problem. They have some sort of build process, but when we’re talking about cloud adoption and digital transformation, we’re talking about a lot more than that, right?
A whole lot more, and a lot of it is just pure mindset. Traditionally, we were organized in technology silos. When I wanted to get work done, I needed to go talk to the database team, server team, storage team, or the network team. As you move to the cloud, a lot of that stuff’s abstracted and it’s code, so we put these pipelines in place. The problem is we still need to get approval from the server team, storage team, network team. There’s also this fear that the cloud is just a big, insecure place. The reality is, with the right rigor, it can actually be more secure than what you have on-premises. The problem is there’s a lot of old thinking. Another example I use is a lot of people think of the cloud as someone else’s data center. Anytime you hear someone say “cloud is someone else’s data center,” they’re on the path to failure. That means you’re gonna treat the cloud like a data center: “I got a VM here, I got a VM there.” The cloud is much more than that.
The value in the cloud is when you start moving up the stack. You use databases as a service. Now all of a sudden you have a fully managed database that auto-scales across zones and regions. Instead of implementing my own Kafka queue, I can use a queue as a service. I don’t have to worry about scaling and managing these third-party solutions. The value is going up that food chain. The cloud vendors now are even creating healthcare APIs, financial APIs, and they’re even moving up the stack to business services as a service. When you’re allowed to go up the stack like that, the speed at which you can deliver software is incredible, and what you focus on is your specific requirements instead of all the plumbing and commoditized business processes.
As I understand it, your basic premise is that whilst cloud technologies offer this agility and speed, and these services that you can leverage, many organizations are bringing to that new paradigm their old bureaucratic processes from within a large organization.
Absolutely. It’s kind of worse than that. No matter how good or bad your processes are in your data center, they’re known. When there’s an issue, no matter how good or bad companies are at resolving it, there’s tribal knowledge and they figure it out. Then you move to this blank canvas in the cloud, and what people do is they focus on building the software, but then something breaks. First of all, those old processes don’t apply because there’s no physical infrastructure anymore, so what I see often is reliability actually goes down because they never rethought how you respond to incidents in a world where half the stuff is abstracted from you. You see it so often, you see these companies going back to the data center. It’s not because cloud isn’t any good, it’s because their approach to the cloud created a lot of failure.
I guess the obvious question then is how should they be approaching the cloud adoption? How can they overcome this? When you’re talking about a large organization, it is easier said than done to break down processes which have been put in place for a reason over time. It’s how a large organization can become a sausage machine because they have those processes in place. How do you break down those processes and become a more agile organization when looking at adopting cloud technologies?
It’s easy to write stuff down on paper but it’s hard to actually do it. There’s a lot of methodologies and processes you can use to analyze processes and figure out where the waste is and remove it. But if you can’t get everybody in a room to talk about it because they’re holding onto their turf or even if you say well, there’s 60% waste in this process, why don’t you move this team to that team and create this cloud operations team but there’s politics involved. It’s easy to recommend methodologies to identify waste and redesign it. But it’s hard to break down the political barriers and change the mindset of the wise. In the book, I give a lot of examples of both companies that applied something that didn’t work and companies that applied something that did work. I don’t really have the answer, the magic bullet in there but what I do is create awareness. These things in my experience work better than these approaches.
A big part of it is the operating model. Adrian Cockcroft, if you know Adrian, of Netflix fame, always said DevOps is an org change. He said that years ago, and cloud is an even bigger org change. Why do we say that? Because we’re building software entirely differently than we had before. It doesn’t make sense to have teams organized by domain expertise anymore (server team, storage team, network team) because all this stuff is an API away now. Now what we need is the brains of the storage people and network people to sit in a room with us and help architect virtual private clouds and figure out what’s the right storage unit. We don’t need them turning screwdrivers and patching devices anymore. We still need those people, probably more than ever before, but we need their engineering minds, not their administrative minds. When they’re in silos and I’m trying to deliver daily, I can’t be sending tickets and having meetings and CAB reviews. You can’t do that in a day when software is being deployed.
You really have to change things, it has to be more collaborative, more engaged. We really recommend teams are focused, more product-oriented. You can still have your center of excellence but you pull from there and you have security engineers on your product. That way, you don’t need 58 meetings, you have someone with that expertise and they can go back to the security wall when they need to to get the answers but you have a full stack team there with shared goals. Those shared goals are goals of the product. That’s the formula that works. It’s really easy to say that, it’s incredibly hard to change the heart and souls of the corporations to get them to move that way.
We talked about this recently in terms of the differences between monolithic application design and microservices: how, with monolithic application design, you tend to have frontend developers, service layer, database, security engineers, and the like. Like you said, they’re all different disciplines. In a monolithic design, that can work. You’re talking about long periods of time where the communication can be facilitated between those engineering teams. But in the microservices world, what tends to work better is bringing those teams into a single team around a domain of knowledge or a product.
If you’re building a service for a shopping cart then you have the shopping cart team. Everyone is working collaboratively on that domain of knowledge. The issue becomes how does that team collaborate with other teams? Whilst they can have agility inside that team, they still need to collaborate with other stakeholders. Other stakeholders are dependent on the services they create. How do you see that sort of communication?
If you look at what Amazon does, Amazon provides all these services. Each one of those teams is a product team with their own culture, so the only requirement is that the interface is the same: the API interface. They can go solve that problem any way they want, and they have a culture of collaboration to do that. Within companies, it’s a lot harder because they’re not used to operating that way.
If you take that approach and get a standard API interface, you may not be able to have the level of collaboration some of the unicorn companies do, but you can be kind of self-sufficient in the service you deliver. If it’s loosely coupled, you can deliver that service. It may not satisfy the customer at the end of the day because they need these other things, but at least you can deliver what’s expected of your group. This goes back to operating models: if everyone’s in silos, it’s harder to get them to communicate. If I’m building a product, I want to either colocate the team or, now that everyone’s virtual, set up ways that people can meet virtually or in real time so that their day is focused on marching towards the same product goals.
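To illustrate the point about a standard API interface, here is a minimal Python sketch, not anything described in the conversation itself. The names `CartService`, `InMemoryCart`, `EventLogCart`, and `checkout_summary` are all hypothetical. It assumes two teams that build a shopping cart service in different ways while honoring the same contract, so that a consumer depends only on the interface.

```python
from typing import Protocol


class CartService(Protocol):
    """Hypothetical contract that every cart implementation must satisfy."""

    def add_item(self, sku: str, quantity: int) -> None: ...
    def total_items(self) -> int: ...


class InMemoryCart:
    """One team's implementation: keeps quantities in a dict."""

    def __init__(self) -> None:
        self._items: dict[str, int] = {}

    def add_item(self, sku: str, quantity: int) -> None:
        self._items[sku] = self._items.get(sku, 0) + quantity

    def total_items(self) -> int:
        return sum(self._items.values())


class EventLogCart:
    """Another team's implementation: appends events and sums them on read."""

    def __init__(self) -> None:
        self._events: list[tuple[str, int]] = []

    def add_item(self, sku: str, quantity: int) -> None:
        self._events.append((sku, quantity))

    def total_items(self) -> int:
        return sum(quantity for _, quantity in self._events)


def checkout_summary(cart: CartService) -> str:
    """A consumer that depends only on the contract, not on the team behind it."""
    return f"{cart.total_items()} item(s) in cart"
```

Either implementation can be passed to `checkout_summary`, which is the loose coupling being described: each team is free to change its internals as long as the interface stays the same.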
It’s a culture thing, but I want to add one thing to the monolith story you’re painting. I've been both a victim and a criminal of this. A lot of times in the monolith, because we’re in those silos, the database team is supporting many teams. A lot of times they can’t service my needs. Sometimes, I solve a problem they should have been solving in the database in the UI layer or middle tier instead. I work around the bottlenecks. Monoliths, hard as they are, become like a trailer park architecture. You don’t do the right things all the time, you do the things that help you get out the door. You code around bottlenecks and you just get this spaghetti architecture, and every release is just more technical debt.
On the microservices side, you can go down the rat hole pretty quick too, right? Microservices are great, but how do you manage 100 microservices, especially if 3 teams are writing at the wrong level of granularity? It can become a mess. I’ve seen spider graphs of microservices where it’s like someone fell asleep drawing with a crayon. How do you manage all that? Microservices can create challenges too, and they require new tools, processes, and new ways to monitor. You start getting things like AIOps. Humans can’t process 250 microservices running at the same time. We need to look at new ways of operations. There’s no easy ticket.
I guess there are also some efficiencies in the monolithic approach. You have that database team, which is in demand by lots of different project teams, and their resources are being allocated by their IT manager according to priority or who has the loudest voice. At least those resources are being efficiently used; you know that they’re being run at maximum capacity all the time.
We’re talking about breaking them down into independent teams working on a single product or domain of knowledge. It sounds like a lot more people. You now have database engineers across every product. So, is there a danger that we’re just increasing cost in what used to be an efficient model, one driven by an almost capitalistic approach to which project has the highest return on investment, as opposed to replicating our teams across multiple product domains?
I think there’s a difference between being very busy and very efficient. Yes, the database teams are very busy, but normally not efficient. You don’t need more people to do microservices; you’re delivering small batches. You’re actually delivering more frequently, and hopefully, if you have an architectural vision, you’re inheriting the work you did in the previous releases so you’re not creating from scratch all the time.
If you have good architecture, with every release you can start consuming all your other services. Just going back to the monolithic model again, with every release your architecture usually gets worse because, like I said, people are skating around the bottlenecks. The other important point: when I was working for a large company, regardless of what my business problem was, Oracle was the answer for OLTP and Netezza was the answer for analytical data.
What if I had a requirement for a document store database? I had to stick with one of those two. I didn’t have those options because you would have to do a procurement process, which was usually longer than my project, do an RFP and an evaluation, and then we’d have to hire DBAs that could do that. Now it’s an API call. I don’t need a DBA to get a Mongo database because it’s an API call. Now I can build architectures that choose the right tools for the job instead of having to shove everything in to make it work like I always used to. Those Oracle people were busy, but it wasn’t the most productive work because we were making bad technology choices.
I guess that leads to agility that organizational units can now adopt software or cloud platforms on their own, independently. As a result, the centralized IT department which used to manage the procurement, deployment process has lost control with the shadow IT concept.
There’s good and bad to that. A lot of the book talks about those types of operating models. It talks about federated vs centralized vs decentralized. I go in and say what’s the pros and cons of it. The pros of the decentralized model where my business unit controls things is I can move as fast as I want. Nothing’s prescriptive. I can choose a technology I want. The cons is: if I’m not really good at security I can really damage the company’s reputation.
If a company is one that has bought a lot of startups or a lot of web properties, usually there are so many cloud skills in those groups, because they’re born in the cloud, that it’s a decentralized model, and what they try to do is build a federated model and say, “okay, what kind of services do we need to centralize?” They usually focus around, “okay, we need to get control of the operating system at least, right, and create the most patched operating system and let them pull from that.”
Usually when it’s top down, it’s entirely centralized. We control everything to the point where you can’t get anything done because it’s just another data center in the sky, right? It’s okay to start there because you don’t wanna start with everyone doing their own thing, but at some point, as your practice matures, you start allowing business units to take on some responsibility to move faster. I think the ultimate, and it depends on each company, the promised land, is kind of a federated model where there are some things we need to control centrally and some things where you have autonomy out in the business unit. In a financial institution, there’s gonna be a lot less control internally.
For a media company, maybe the only thing they want to control is, “here are the tools you can actually use for CI/CD, you could use one of these three but not 12.” We’re gonna give you the blueprints for Red Hat OS or Apache Tomcat, we’re gonna make sure that’s patched, and that’s it. It really varies on what you’re trying to do. The other part of that, where I talk about different engagement models for that central team, is that I may have a real web-savvy team that doesn’t need my help, they just need self-service capabilities. But I have this newbie group over here that doesn’t know anything and they need a white glove service. I think that’s where a lot of companies go wrong. They treat everyone the same, and it’s usually to the lowest common denominator of skills, so the teams with skills can’t get the work done, and shadow IT is just born and flourishes.
Interesting. Let’s talk about where we’re at and the future. We’re talking about DevOps and CloudOps, you’ve even mentioned AIOps. It seems like it’s constantly evolving, even for those who are adopting DevOps and think they’re on the leading edge. Where do you think we’re realistically at, and where are we going?
I’d like to define what DevOps is. A lot of people think DevOps is CI/CD. That’s a piece of it, but in my mind, if you talk to some of the godfathers of DevOps, it’s really about outcome, it’s really about how can we as an organization deliver software better, faster, more reliable, more productful, those types of characteristics. Part of working better and faster is automating CI/CD pipelines but before you build and automate CI/CD pipelines, you may want to re-evaluate business processes otherwise you’re just automating waste.
So, you’re failing faster but you’re still failing. With that out of the way, what I see is there’s a big challenge: there’s so much change coming at the same time. That’s why you have all these XOps. There’s DevOps, then we said DevSecOps, AIOps. Yesterday I was on a podcast and it was on MLOps.
It’s all ops, right? In my mind, the technology’s different so the approaches to it are different. Those things still apply. You may not need to have the meetings that we’re used to, but you still need service request management, you still need those things. It’s just that you can solve a lot of those things through automation. Then the other thing that is happening is, cloud is a distributed environment. I was fortunate enough to grow up in distributed environments. I was always in retail: stores everywhere, everything was distributed. My learning curve was a lot less. That’s a different beast than a 3-tier architecture. We have more to worry about. It’s a lot harder to maintain.
What you’re seeing is this boom in operations right now of new ways of thinking about ops, and you start hearing terms like observability, chaos engineering, and testing in production. People wouldn’t dream of purposely breaking things in production in the old world, but if I have 500 microservices running on hundreds of servers, and sometimes they’re there and sometimes they’re not, how do humans manage that by looking at dashboards? We need to be proactively finding flaws in our system and fixing them before our users encounter these issues. I mentioned earlier, a lot of people moved into the cloud and suffer worse reliability because the system ain’t moved, because it’s kind of a different game here. That’s not good for the users. I think there’s a tremendous amount of good thinking happening in the world of operations, and the companies that start embracing that, even if it’s slowly, and stop thinking only about how they operate things today, I think they’ll have a lot more success in the cloud.
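As a rough illustration of the chaos engineering idea mentioned here, the following Python sketch deliberately injects failures into a call to a downstream service and counts how many of them a simple retry policy lets through to the user. It is an assumption-laden toy, not a description of any real tool: `FlakyService`, `call_with_retry`, `lookup_price`, and `run_experiment` are hypothetical names, and the failure rate and retry count are arbitrary.

```python
import random


class FlakyService:
    """Wraps a function and makes a fraction of calls fail on purpose."""

    def __init__(self, func, failure_rate, seed=0):
        self._func = func
        self._failure_rate = failure_rate
        self._rng = random.Random(seed)  # seeded so experiments are repeatable
        self.injected_failures = 0

    def __call__(self, *args, **kwargs):
        if self._rng.random() < self._failure_rate:
            self.injected_failures += 1
            raise ConnectionError("injected fault")
        return self._func(*args, **kwargs)


def call_with_retry(func, attempts, *args):
    """Retry up to `attempts` times (must be >= 1); re-raise the last error."""
    last_error = None
    for _ in range(attempts):
        try:
            return func(*args)
        except ConnectionError as error:
            last_error = error
    raise last_error


def lookup_price(sku):
    """Stand-in for a real downstream microservice (hypothetical)."""
    return {"sku-1": 9.99}.get(sku, 0.0)


def run_experiment(calls=1000, failure_rate=0.3, attempts=5):
    """Return (faults injected, errors that still reached the caller)."""
    flaky = FlakyService(lookup_price, failure_rate)
    user_visible_errors = 0
    for _ in range(calls):
        try:
            call_with_retry(flaky, attempts, "sku-1")
        except ConnectionError:
            user_visible_errors += 1
    return flaky.injected_failures, user_visible_errors
```

Comparing the two numbers returned by `run_experiment` gives a feel for the idea: faults are introduced on purpose, and what gets measured is how many of them the system absorbs before a user would notice.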
AI and ML is a piece of that. AI is good at automating things that humans do. When you’re trying to monitor thousands and thousands of applications or containers, humans can’t do that so you let AI do that. ML is more suited for discovering the unknowns. Finding things that make you fail later that you don’t know about and you can fix that today. A lot of it is bias on my side, even though I grew up in the app development, most of my focus now is SRE and operations side of it because that’s where I see a lot of challenges to it.
In all of this stuff, we’re talking about automation, such as microservices and container management with hundreds or thousands of microservices. Cloud adoption is not an option, it’s a necessity, right? It’s not something you can realistically do in-house. Increasingly, cloud adoption in terms of hosting this stuff on a public cloud provider is a necessity, which of course leads to the value-added services of AIOps to manage those services. It's going hand in hand, this orchestration of microservices and DevOps as well as the services to manage them. Would you agree?
I would agree in one area, talking about containers. Kubernetes is now the orchestration engine of choice, but people wanna roll their own Kubernetes, and these cloud service providers have Kubernetes as a service. The amount of energy people spend to be cloud agnostic is quite incredible. You have to ask the question at some point: what does portability mean? People work so hard to be cloud agnostic that they don’t get any value out of the cloud, because then the cloud is just compute, network, and storage.
Anytime we start talking about capabilities of cloud, I think of the ability to offload plumbing. It’s not just infrastructure, it’s not just that middleware layer, it’s even at the business layer. That’s where the value is, AIOps and ML. I worked in a wealthy marketing company years ago, and we used to extract huge amounts of data, throw it into a SAS data set, and these data scientists, SAS geniuses, would do all this analysis and come up with hypotheses of things that they know, and six months later we would have a new way to target shoppers or new ways to do analysis of the post-shopping trip.
Today, it’s an API, right? There’s a model you can go get. I had a team that provided all the analysis: if you’re gonna shop, this is the next thing you’re gonna buy, therefore we’re gonna make this coupon come up when you shop. Go to Amazon or Google, there’s an API for that. I had a team, eight, nine people, tons of servers, and now it’s an API call. The value of the cloud is there, you just go get it.
You’re largely forced to consume services for ML and AI, given the investments being made by public cloud providers in those spaces, and you’re talking about saving six months of engineering effort by using an API. But it also works in an inward fashion as well, right? As I’m building applications and services, building out my databases, and exposing them to teams throughout my organization, I also wanna take the same approach in terms of exposing everything as APIs so that people can consume my services and extract the value out of them, just the same way as you’d be consuming that ML service at a public cloud provider.
Absolutely, I’m starting to see industries think that way. I was just talking to a CTO of a financial institution. They’re trying to automate the process of data science. Data scientists have all this work to do which is non-data scientist work which is get to the models they’re creating, they’re trying to automate that but at the same time they’re trying to create an environment where other companies can share models. It's the same thing with your code.
If you’re delivering a service that has a valuable data set, not only may it have a value to provide that as an API within your company but it may be your opportunity to monetize that and offer it outside your company or not monetize it but offer it to a community who’s also offering APIs back to you. We heard about the API economy for a long time, I think it’s starting to get here. I think the companies that have been in the cloud for four, five, six years who have gotten pretty good at this, they’re starting to move up that stack and they’re not thinking about infrastructure anymore. They’re thinking about data, they’re thinking about, how can we not write code, how can I get to market faster with writing the least amount of code possible? Traditionally, we just love to write millions of lines of code but every line of code you write you think hey.
I’m very tempted to start talking about low-code application development but I think that’s a whole different podcast. Mike, it’s been a great talk. Thank you very much for your time, how can people learn more about you and your upcoming book?
I’m on Twitter, madgreek65 and my daughter, I’ll tell you a quick story. My daughter’s a Marketing major. She built my website in a day and she doesn’t know what a server is. Talk about abstraction. She’s using platforms and she’s just talented at content. If you go to mikekavis.com, there’s a website about my books, a bunch of blogs, and my podcast. She created it in a day. One time we were talking servers, what’s a server? She doesn’t have to know what’s a server and we should think about that as we’re building these things. I can do IoT without knowing too much about IoT, just consume APIs, anyways that’s a long goodbye.
Thank you very much Mike for being here with us. To our listeners, what did you think about this podcast? Have you ever experienced migrating to the cloud? Do you have any cloud success or cloud horror stories that you would like to share? Let us know in the comments from whatever podcast platform you’re listening to. Also, please visit our website at www.torocloud.com for our blogs and our products. We’re also on our social media, Facebook, LinkedIn, YouTube, Twitter, and Instagram. Talk to us there, we’ll listen, just look for Toro Cloud.
Again, thank you very much for listening to us today. This has been Mike Kavis, David Brown, and Kevin Montalbo at your service for Coding Over Cocktails.