What is technical debt?
In this episode of Cocktails, we explore the definition and talk about how technical debt is anchored on our own thought processes, and how software architecture has evolved along with our development philosophies. We also dig into some great stories, analogies, and advice on how we can deal with and clean up our own technical debt.
- George Fairbanks talks about the evolution of “technical debt” as a term.
- “We could just call them a ‘mess’, right? We didn't need a special term for that.”
- We find out if an agile development process is incompatible with well-architected applications.
- “It's very difficult to characterize what agile is these days, right? There's a lot of different dimensions.”
- Fairbanks discusses software architecture today and the philosophies that have changed with it as well.
- “A developer from 1995, if we plop them into 2021 is just going to have their head blown as far as how software is being built today.”
- Why is it so easy for technical debt to pile up?
- “You can find people that are narrowly focused, they get in, and they can get the feature coded, but at the expense of some technical debt piling up, that they maybe perhaps didn't even perceive as technical debt.”
- Fairbanks tells us if there is a way to escape technical debt.
- “All these problems seem to be inherent in the nature of design.”
- Are there systematic processes in trying to clean up all that technical debt?
- “Most processes I see for handling technical debt to me are the generic game plan for, ‘The company has a problem.’”
Welcome to Episode 39 of the Coding Over Cocktails podcast. My name is Kevin Montalbo. Joining us from Sydney, Australia is Toro Cloud CEO and founder, David Brown. Hey David, good morning!
All right. And our guest for today has a PhD in Software Engineering from Carnegie Mellon University. He is the author of the book “Just Enough Software Architecture” and the “Pragmatic Designer” column for the IEEE Software Magazine. He's also a software engineer at Google. Joining us for a round of cocktails is George Fairbanks. Hey George, welcome to the show!
Thank you very much!
All right! So, you've been working at Google as a software developer for the past eight years. Can you tell us about your experience working there and some of the projects that you've worked on?
Sure thing. You know, this is a great time to mention that I'm here speaking on behalf of myself. My opinions are my own, you know? The standard sort of spiel here. I've been there for eight years. I've worked on a bunch of different products. I was on ads at the beginning. I worked on supply chain projects, and I got to work on the virtual reality things too. And that was a lot of fun. In the last couple of years, I've switched over to an education role. So, I currently lead the Software Design Education team.
What prompted you to join Google as a software engineer? Because as I understand it, before this, you had your own research organization and consultancy. So, what made you go into going back into developing software itself?
Well, there's two reasons, you know? In the nineties, it was a perfectly viable thing that if you got pretty good at something in the software world, you could hang out a shingle and become a consultant. And there was plenty of training that companies were hiring people to do and so forth. But that got harder and harder. Most of the people that I know that were in the business saw the recession in 2008 as being very tough.
And so, they hung on, much like I did, for a bit. And then I switched to being the tech lead on a project for one of my projects that I was consulting for. And at that point, I remembered how much I liked building things. And when you're a consultant for a while, it's kinda neat to see another client and get them up to speed on some ideas and help them out. And there's a lot of that that feels really good to me. I really enjoy that, but boy, do I like building stuff. And so I figured this was a great opportunity. And a buddy said, “You should go talk to this recruiter at Google.” And the rest is history.
I suppose it both works hand-in-hand quite well. Keeping your hand in the software development stuff and building stuff whilst writing and talking about software architecture and the issues associated today around software development. I guess keeping your hand in that helps, right?
It's exactly right. I mean, imagine you have the clearest mind possible, which, you know, none of us do, right? But let's imagine you had the clearest mind possible, just getting your fingers on the code and being part of a project. It forces you not to gloss over or like, trivialize things that are actually really hard. And it also makes you understand when things are really changing. So, I mean, we're going to chat, I think, a bit today about the changes that have happened in the last 20, 25 years. And if you're in there in the trenches, it feels very different. And so you can reflect about, well, what was it like? How did I do that in 1995 versus how would I do that same thing today?
Yeah. Well, let's start getting into that. So, you talk a lot about software architecture and technical debt amongst other issues. So, we're going to focus on a couple of those topics today. Technical debt is a phrase which has been thrown around a lot these days. It seems to be one of the buzzwords of the software development industry at the moment. How has the definition of technical debt evolved since the phrase was first coined by Ward Cunningham in 1992?
You know, that's a really interesting topic. I think most people now use the phrase very fluently, right? Like, we just talk about debt in our programs. And what's more, I think it's very healthy that we are talking about not only what features we're getting out the door, but what kind of trouble that we're causing to our programs over time. So, I think all of that is very good, but I’ll tell you what, we already had a perfectly good name for mundane software problems. We could just call them a “mess”, right? We didn't need a special term for that. So, when Ward wrote that original experience report, he wasn't talking about just mundane, run-of-the-mill, “I cut some corners and guess what I'm paying for it today.” He didn't need a special term for that.
And you wouldn't have gotten published for saying, “I cut some corners and now I'm paying for it”, right? What he was talking about was something that I find extremely interesting and it gets at what is so exciting about being a software developer, [which] is you're confronted with a problem and you get to create a little world inside your software, where there's different entities, whether you represent them as objects or as data structures or functions. But you're manipulating these things. And the way that they combine is really related to how you understand the problem that's being posed.
So, when Ward came up with this idea of “technical debt”, he was in a situation where he was working on a financial product and he needed to explain to his boss. He said, “Well, you know, I wrote this code several months ago. And when I go back and I try to add new features to that code, it's not so easy. And it's still working. I mean, that code still makes us money and all that stuff. But when I go back to that code, what I'm seeing...” This is how I interpret what Ward was saying. He's saying, “The code is speaking back to me.”
My ideas from several months ago have moved on. When he talks about financial products, you could imagine that if you didn't understand finance before, you start out with a very simplified, rudimentary model of finance. But then there's these waves of realization where you start to go, “Oh, you know, debts are actually like these other kinds of financial assets in some deep, interesting way.” And yet your code doesn't say that back to you. Your code is still very simplistic, okay?
So, what Ward was trying to tell his boss was what we need to do in order to keep up our feature velocity in order to be able to say, “Here's a new requirement. I just want to code it up and ship it out.” If I want to keep up that cadence, well, the software has to meet my current thoughts, okay? It doesn't just have to work, which is what the boss is concerned about. Like, it's profitably making money for the company. [But] it also has to be in a shape that the developers are ready to add that incremental feature.
And so that's why he had to coin the term “technical debt”, right? He said, “I don't want to stop the world, do a big waterfall, analyse all the requirements and then give you a beautiful system a year from now. What I really want to do,” says Ward, “is I want to ship this code, say, every couple of weeks or every month. But when, as soon as I ship that code, it freezes my thoughts at that moment that I shipped it,” okay?
And so this term that we now use all the time, “refactoring”, doesn't even appear in this paper because it wasn't a term that was current, right? This is 1992. So, he's just trying to explain the basic idea that we now understand quite well. You ship some code, but in order for you to keep up your feature velocity, you have to keep refactoring the code so it represents your current set of ideas. Otherwise, you're going to find it harder and harder and harder as you try to add each new feature that comes in the door. Right?
So, that's what the technical debt is. The technical debt is not, “I have bad judgment. I took shortcuts,” or, you know, “I deliberately put in something ugly in order to beat a deadline.” That's straightforward. What he's talking about is a much more subtle thing, which is, “My ideas evolve. The ideas get represented as source code. And if I want to keep incrementally chipping away at the problem, shipping a new revision every week or every month or something like that, well, the code has got to be changed over time. It's got to catch up with my thoughts,” and that's the debt. The debt is the difference between how you're currently thinking about the problem and what you're seeing when you look at the code.
That’s interesting. So, that's to suggest that technical debt is a naturally occurring process, and it's not necessarily something that you should be punished for, because your knowledge is always going to evolve. Your knowledge of the project, the problem you're solving is always going to evolve. You can't change that. So, your thought process of where you were several months ago to where you are now is always going to be different. So, that suggests that technical debt is a natural part of the development process?
Yeah, that's absolutely right. And I think that's exactly why it's interesting because if you say we have bad developers, or we have developers with bad judgment, or we have developers who are forced to code to a deadline and therefore took shortcuts, it’s not surprising when bad things happen. But if you say, “Here's a team of great developers who are following best practices, who understand their craft and are doing everything right,” and you still don't have the software that you want, well, now that becomes very interesting, right? You really want to get it. “What is going on here?”
And when Ward Cunningham was originally talking about this, he was comparing what most people were doing, which was waterfall development. And he was saying, “Actually, I think you're going to have better results if you chip off some requirements, ship that code, chip off some more requirements, ship that code…” but explaining that the code that you wrote in the first month is going to have to change. It's going to have to evolve as time goes on. And that I find managers are still surprised by. So, what I mean by that is they look at their sharp developers on their team and they know they wrote that code six months ago and they kind of don't want that code to have to change, right? Because they're like, “Were you a knucklehead six months ago?” “No, I'm not a knucklehead. It's just that we now understand the problem better,” or “We've added new requirements that would say, well, if I'd known about all these requirements, I wouldn't have written it that way.”
Well, I mean, software architecture has been evolving over several decades. And as you say, we’ve gone from a waterfall process to microservices and iterative updates with an agile process. Recently though, you've been arguing that it's getting harder to design well-architected applications when developers are focused on this iterative process of developing incremental changes. Does that mean agile software development is incompatible with well-architected applications?
No, I really don't think that's the case. I think that, well, first of all, it's very difficult to characterize what agile is these days, right? There's a lot of different dimensions. Even back in the nineties, before agile was a thing, people were talking about iterative development. That's what Ward Cunningham was talking about in 1992. If you go farther before that, you find Fred Brooks talking about it in the 1970s, okay? You can go farther back into the sixties and people are saying, “You know, we probably shouldn't build software like we build bridges. It's just not the same,” right? “We have opportunities with software that we don't have with bridges”. So, if we just take the iterative aspect and we say, “Is iterative development compatible with software architecture ideas?” I would say, “Absolutely. Okay. This is perfectly fine.”
But what you find a lot of times is people, especially early on, say in the early two thousands, we were thinking about how to get agile and architecture to go together. They would not be genuine in how they would try to stick them together. They would say, “You should do Iteration Zero.” And Iteration Zero is like, “We're going to do waterfall and we're going to design the architecture and then we're going to do iterative development.” And I'm like, “Yeah, come on. That's kind of cheating. You're not really figuring out how to make these two things go together,” right?
So, believe it or not, what I've been inching towards in these IEEE Software columns is the idea that's very compatible and you see it in extreme programming back in the late nineties. Okay? And that idea is that the team has an idea, a shared idea about what the software design is. And every day, as they write software, they're subtly redesigning that software. Okay? Sometimes those redesigned changes are completely mundane.
Let me just use a quick example. Let's imagine you had an audio editing program and you have the facility for putting plugins. So, like, audio plugins, right? So, this one does a compressor and this one does noise reduction or so forth. Right? If you come in and put another audio processing element into this framework, no big deal. You're not really changing the architecture. You already made that decision. You said, “I'm going to need these plugins. I need to sequence the plugins. I just put another one in,” okay? That's not really an architectural change.
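To make the plugin point concrete, here is a minimal sketch that is not from the episode. It assumes audio is just a list of float samples and that a plugin is any function from samples to samples; the names `gain`, `hard_limiter`, and `run_chain` are hypothetical, invented for illustration. The thing to notice is that adding a processing element touches only the list of plugins, not the framework that sequences them.

```python
from typing import Callable, List, Sequence

# Hypothetical types for illustration: audio is a list of float samples,
# and a plugin is any function that maps samples to samples.
Samples = List[float]
Plugin = Callable[[Samples], Samples]


def gain(factor: float) -> Plugin:
    """Build a plugin that scales every sample by a constant factor."""
    def apply(samples: Samples) -> Samples:
        return [s * factor for s in samples]
    return apply


def hard_limiter(ceiling: float) -> Plugin:
    """Build a plugin that clamps every sample into [-ceiling, ceiling]."""
    def apply(samples: Samples) -> Samples:
        return [max(-ceiling, min(ceiling, s)) for s in samples]
    return apply


def run_chain(plugins: Sequence[Plugin], samples: Samples) -> Samples:
    """Run the plugins in order; the framework never inspects what they do."""
    for plugin in plugins:
        samples = plugin(samples)
    return samples


chain = [gain(2.0)]
# Adding another processing element is one more entry in the list.
# The sequencing decision was already made, so run_chain is untouched.
chain.append(hard_limiter(1.0))
processed = run_chain(chain, [0.25, 0.75, -1.0])
```

Moving the same editor to a phone client with files on a server would not fit into this sketch at all, which is the contrast drawn in the next part of the conversation.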
But if there was something else where someone said, “You know, we've currently been doing all this stuff on a desktop and suddenly we're now on the web. And I want someone to be able to do the audio editing on their phone, except that the audio files live on the server.” And you're suddenly like, “Oh, I am not ready for that,” okay? There's many things that I know about audio and there's many bits of the source code that I might be able to reuse, but really that's a different kind of architecture that you're asking me to work on, right? And to begin with, it's a distributed system and the original one was on a single node. There's no remote procedure calls anywhere.
So, the team has to come up with this idea about what the software is, what is the design of that code. And so the reason I've been chatting about technical debt is to get people thinking about how the ideas evolve and how the source code does, or doesn't, catch up.
And I want to talk about the processes because when we do iterative processes and so forth, it can make us focus on the minutiae. Like, I am making this pull request, right? I'm going to make these three changes to the source code and I'm going to send it to you for review. And we're all done with it. It's going to go into the repo. And the continuous integration is going to push it off into production. Okay? Those things can encourage me to focus on small things and that's not incompatible with architecture unless I don't have the ability to focus in and then pop back out and think about the big picture, the big design that the whole team is trying to keep in their head. So for me, I think that what XP was trying to do with having people pair up and move around was so that the understanding of the system stayed in all the developers' heads.
I think we need to be doing the exact same thing today. The things I've been writing about, though, are about how I see software today having a hard time popping out and keeping the big picture in mind, which would include that architecture. And the two big things that I find that are troublesome: first is that focus on the patch, right, on the next little chunk of increment. Like, a new requirement comes in on Monday and I have to get it out the door by Friday. Okay? And you and I, and the rest of the team, we focus on how that work flows through the system.
And this is incredibly productive. It's a great practice. I'm not dissing that practice in any way, but what I am saying is if that practice becomes the only practice and there's nothing in the way that your team organizes itself, such that you pop up and the team has a big discussion about the big patterns and you suddenly go, “Oh, wait a second, we're still thinking one system. And we should be thinking distributed system,” or something like that. Right? If we don't have that baked into how our team cooperates, well, then we're making it harder to address architectural concerns. Okay?
The second thing that's been going on and people sometimes put it together with the iterative part, but I think it's different, which is that we are increasingly looking to factory metaphors to inspire how our software teams sequence their work. And it's like, “Well, okay, that seems very much like iterative,” but it's not in this particular way. If you think about a factory, a factory is doing the same kind of thing over and over again, like you produce cars and you're going to try to move this car through. And you're looking for opportunities to say, “I produced this wheel in five minutes and I really like to produce it in four and a half minutes,” or “I would like not to have a whole pile of wheels and unfinished cars,” right? I want to say, “Build the wheels just in time.” Right?
There's all these kinds of metaphors that we're pulling in that say, “I would like to improve efficiency by decreasing the cycle time. And I want to improve efficiency by reducing work in progress”. Right? These are definitely factory and production inspired metaphors. Okay? And as we chatted about before, there's a part of software which acts like a machine. When you're that boss, and Ward Cunningham is trying to explain to his boss that he needs to do refactoring, his boss is thinking about the code as a machine that's doing work for the company. “Okay, great. Why do we need to change the machine?”
So, in that sense, the factory metaphors help us build a more efficient machine, but when it comes to the thoughts, “Okay, like how do I think about the problem,” the factory metaphors, actually, in our relentless search for efficiency in our actions as software developers, squeeze out the time that we would have normally spent popping up and thinking about the big picture and explaining to coworkers how everything fits together.
And so, I think those are things that we need to seriously consider. When we look at our software processes, if we relentlessly treat them like factory processes, we build it up for a while, the code continues to work, but then at some point it's just so complicated because I haven't refactored. I haven't done deep, hard refactoring, and I ended up with a big mess. And it's hard for me to keep adding incremental features to it. And what I really get worried about is sometimes when I talk to senior managers, they feel like that's inevitable. That is the inherent nature of software.
And that is what I'm really fired up to change people's minds about. Right? What I would love to have people think is we have put ourselves into this difficult situation for good reasons. Okay? Like, the agile processes, the iterative processes, the factory-inspired metaphors have all done wonderful things, but we have to add something else. That other thing allows us to think about the architecture clearly and to make sure that we are not letting it erode as we incrementally chip away with pull requests. Okay? That we solve the features. We win the battles, but lose the war. I don't want people to be in that situation.
This “popping up” [is] a concept of looking at the bigger picture. For practical purposes, where in the agile process do you see that occurring? For example, during the sprint review, does it occur during a quarterly plan? Is it an ad hoc process? Practically, when do you inject that “popping up” concept?
Yeah. So I think if you ask different people, you'll get different answers. And I think that's healthy. There is no single agile process. Right? You know, you can find several, very reasonable agile processes. Okay? But let me go back to extreme programming. Now, extreme programming was tailored to small teams, right? And so a lot of people are trying to apply agile in bigger teams and they might believe XP is not the one for them. But if you go back to XP, you'll notice that Kent Beck was specifically worried about making sure the team understood things. He was breaking down the old ideas of code ownership. He said, “Look, we should all really own the whole code base.” He was trying to remove the, what we call, bus factor, right? That only I understand this module. Nope.
I'm going to pair with somebody else. And now that person is going to feel like they own this module a little bit and they can come in and change this thing. And as we move around and we understand different parts of the system as we go in and refactor different parts of the system, the system should represent our thoughts. And our thoughts, the team's thoughts should start to converge on a design. So, in a way, although there's not a specific practice, you can point to maybe the system metaphor in XP, but there's not a specific practice you can point to that says like, “That's where design is happening.” You would find Kent saying things like, “I design every day,” and I agree with that. I think the teams that are functioning well are designing every day. And again, the warning flag that I'm trying to put out is with our processes today.
If people are not thinking about, “How do I make room for thinking about the big design? How do I think about this code? Does this code have the right architecture right at this moment?”, if I don't have time to think about that, well, maybe my architecture isn't going to be healthy. Right? It's going to start out fine, but then it's going to decay slowly. And I would love to see if we can change that. And I understand, I dodged your question a little bit, because you're asking a good one. Like, when should we be doing architecture? I do have some specific suggestions, but I'll let you ask your next question.
Okay. Well, in your most recent essay, “Why Is It Getting Harder to Apply Software Architecture?”, you just described how we changed that philosophy towards development in two specific ways. Can you run us through these changes and how they impact our software architecture?
Yeah. So I started developing software in production or, you know, in industry in the early 1990s. And so in the early 1990s, what I found was many of these same people that are very famous today in the agile community were very interested in software design, right? I mean Kent Beck and Ward Cunningham, these guys were very interested in patterns and design patterns. Ralph Johnson, you know, put out a book along with three other folks.
And these people found that their ideas about how to do object oriented programming were not making their way. They were having a hard time getting those ideas into businesses. And so I believe that they basically terraformed the environment. Okay? You know, like that term from science fiction, where you go to a planet and you put things, machines in there to make a good atmosphere so humans can live there?
They did that same thing to our software development environment. They went into the world and they changed how big corporations build software. At first, it was like tentative little steps of dipping their toe into agile. And let's try three-month delivery windows. Okay? And, you know, all sorts of crazy things like that, but now it is completely mainstream. And we have automated deliveries and so forth. The two things that I see that the agile community set out to change in companies: first was the shift from waterfall processes to iterative processes. Right? And so we just sort of take that for granted today. There are relatively few companies that are still doing waterfall and those companies generally have very good reasons for doing so. Okay? Enough said.
The second thing is these factory metaphors, okay? I alluded to earlier the factory metaphors and the idea of, if it can be automated, you should automate it. If you can do it faster, you should do it faster. [These], as I said, have been incredibly productive. Okay? And if you say, “What are best practices today?” You should have a bug tracker. You should file bugs that represent feature requests. You should track those through like I've filed the feature request. I've coded the feature request. It has passed all the tests and it has been automatedly pushed into production. Like, these are all very healthy things that allow teams to be agile and reactive. Okay?
So, those are the two big things, the iterative development and the factory metaphors that I believe have transformed. And so a developer from 1995, if we plop them into 2021 is just going to have their head blown as far as how software is being built today.
Well, you went on in that article to further explain how these two changes have made it easy, though, to pile up technical debt. So, whilst those changes sound really positive, and yeah, as you said, common practice today, how have they led to us piling up technical debt?
You know, the way that I realized that this was really happening was a buddy of mine. He was in a startup company and they were doing great, great work. They had identified a good market that they were going to make some money in, and they had a good team of software developers that were making fast progress. But after 18 months or so, their software development velocity started to really slow down.
And so my buddy went over and he, you know, sat with the software developers. And when he saw what was going on inside the code, it was a massive pileup of technical debt. Right? One way of saying this was that there had not been aggressive refactoring. There had been mild refactoring throughout, but the team had not done a reasonable job of balancing their obligation to the company, which is like, “We've got to get features out the door,” with their obligation to tell the company that, like, “I'm piling up so much technical debt, that the next feature is going to take twice as long for me to implement. And then the feature after that is going to take even longer for me to implement,” right? Because things have gotten so swirled and so forth.
And it was at that point that I realized that it was possible to be in a state at a project where the project was like a zombie, that it's walking along and it's making forward progress. And it's not obviously dead yet. But backing out of that is nearly impossible. I mean, I think that team basically had to rewrite the code. They had to reanalyse the problem and rebuild everything. If they were able to save things, my friend was not optimistic that they were going to be able to do that.
So, the lesson I took from that was that the practices that make us efficient in the short term, like these factory metaphors and iterative software development, being able to get software out there and find out how it works, are all very good things. But it's like the yin and the yang. You need that second force, which is the very large scale refactoring and keeping the design very much in everyone's head. And it's possible not to do that. And I think that's really common, right? You can find people that are narrowly focused, they get in, and they can get the feature coded, but at the expense of some technical debt piling up, that they maybe perhaps didn't even perceive as technical debt.
It's interesting. We went through a major refactoring of our own platform a couple of years ago, and it probably lasted 12 months or more. That project during that time, there's obviously very few new features coming out because we're focused on refactoring. In your experience, does a project like that, for the value of the future, so that this becomes a maintainable platform that will take you where you want to go in the future, you almost pause and see no new features, which from a management perspective, does it cause some sort of friction? It's like, “Hey guys, when are we going to see the end? When am I gonna start seeing new stuff? When will we start seeing new features and adding value to our organization?”
Yeah. And I realize that when I explain it this way, I'm sort of presenting it like, “Oh, the happy team is refactoring all the time. And they're staying in this nice state. And the unhappy team becomes a zombie and eventually throws it all away.” I'm being a little too glib. I'm doing it to show the contrast. Let me give you another story. Because the stories, I think, are very helpful.
I showed up on a team and the team had just been asked to shift its focus. To give you an idea, imagine you were serving customer number one, but you are in a certain problem domain, and then you are going to serve customer number two. And you would think if you hadn't looked at the code, “Wow, well, I should be able to reuse almost all of this code,” but when you dig into the code, you find out that unfortunately, an enormous number of decisions were made such that almost none of the code was reusable. Okay? Even though all you're doing is customer one to customer two in the same problem domain.
And when I looked at that, I thought about this in a slightly different way. That agility means you recognize that the music's gonna stop. The requirements are going to change, and you want to be able to react to that change. So, the metaphor I came up with was the musical chairs, right? It’s that when you're going along on a project, you don't know when the music's gonna stop, but you need to have a chair when that music's done. So, okay, you need to be ready for when that happens. And readiness means, if this unforeseeable or let's say low-probability thing happens, like we switched from customer A to B, do I have any reasonable assets or do I have to start from scratch? Okay?
And in that particular case, the answer was no. But I would love to be in a situation where, as the team is reorganizing the code continuously all the time, right? A little bit, all the time, they're looking at things and going, you know, the dependency between this thing and this thing, let's say between customer A and the source code is a little bit too tight. But, you know, I could rework this such that, you know, going back to this audio example, right? That the audio filters are completely independent from something else. Right? Like all it really needs to say is there's audio coming in and there's audio going out and that's how filters work. Okay, great. I've decoupled it from customer A, which means that when the requirements change, you know, I've got a chair, I can reuse 80% of the components I've got for customer B.
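The decoupling described here can be sketched in a few lines. This is an illustration under stated assumptions rather than anything shown in the episode: `CustomerAProject`, `noise_gate_coupled`, `noise_gate`, and `process_for_customer_a` are hypothetical names, and audio is again modelled as a plain list of float samples. The coupled version can only be called with Customer A's record, while the decoupled version only says "audio in, audio out" and leaves customer knowledge in one thin adapter.

```python
from dataclasses import dataclass
from typing import List

Samples = List[float]


@dataclass
class CustomerAProject:
    """Hypothetical customer-specific record, invented for this sketch."""
    track: Samples
    noise_floor: float


def noise_gate_coupled(project: CustomerAProject) -> Samples:
    """Tightly coupled: usable only with Customer A's project type."""
    return [s if abs(s) >= project.noise_floor else 0.0 for s in project.track]


def noise_gate(samples: Samples, threshold: float) -> Samples:
    """Decoupled: audio comes in, audio goes out, no customer in sight."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def process_for_customer_a(project: CustomerAProject) -> Samples:
    """Thin adapter: the only code that knows about Customer A."""
    return noise_gate(project.track, project.noise_floor)
```

If a Customer B arrives with a different record shape, only a new adapter of this kind would be needed; `noise_gate` itself is one of the reusable components, which is the "chair" in the musical chairs metaphor.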
But you know, it's very easy for me to say that. And it's also very easy to imagine being in the situation you're in that you were going along, you thought you were doing the right thing, but oh, you know, like the day came and you had to do 12 months of refactoring basically in order to play the next requirement. Okay? To just solve the next requirement.
And in some cases, it's not necessarily to play the next requirement. It's just that, you know, we're going to have a problem at some point in time. And it's just like, we either take the pain now and pause, and build something which is going to be more scalable for the future or we create a problem in the future where we are gonna have to deal with a bigger problem. Is there a way that we can completely avoid technical debt, or is it an inevitable part of the process? I mean, I'm guessing from what we discussed earlier, you already alluded to the fact that it's a change in thought process. So, it seems to me that it's almost inevitable.
[Technical debt] is inevitable, regardless of what process you follow.
I agree. I think it is inevitable. Sometimes it's interesting to take things that we think we understand and sort of play with them. Okay so imagine that my team is doing waterfall and I'm going to do a waterfall process for six months, deliver something. I'm gonna do a waterfall process for six months and deliver it. I'm going to do a waterfall process for six more months and deliver it. Okay? So, here's my question in terms of shifting our perspective. How is that different from three very slow iterations? And what I mean by this is, the hope at the beginning of these waterfalls is that I take in all the requirements and I think about them hard, and I design a system that will solve those requirements. So, alluding to your scale question from earlier you know, at the beginning of the iteration, you would say, “I'm going to build a system that will scale up to X users or X messages or X…”, whatever it is, okay?
So, you already knew that requirement. But when I do one waterfall after another, after another, suddenly I'm sitting here with like a year's worth of code when I'm going into that third iteration. I mean, I have years' worth of code, and now you're telling me that I need to scale up by 10X or something like this? Right? You know, I need to be really ready. And I'm like, wow. You know, the technologies that were totally reasonable in iteration one, which is the first waterfall, or iteration two, which was the second waterfall, don't make any sense. It also doesn't make any sense to completely throw away all the work and build a new system at the beginning of every iteration. So, my point is that if you start playing with the sequence of waterfalls versus iterations, that nature of technical debt building up is actually inevitable, right?
It's inevitable, regardless of what process you follow. It's just the question of like, when it starts to hit you. In fact, my dad worked for Procter and Gamble and he made soap and he would have a variety of different kinds of jobs. But oftentimes the jobs were, there is a factory that makes Tide detergent. Okay? And now we're going to make the next variant of Tide detergent. And so, he was part of a group of engineers. And the guys in the lab had created the new variant, but his job was to make it work in the factory. And of course the factory was built 20 years ago. And so this is a technical debt problem, right? You're saying like, given the factory that I've got, how can I make the new Tide?
And so, all these problems seem to be inherent in the nature of design. And it's the kind of problem that you want to have because the alternative is, we went out of the soap business, or you didn't have a scale problem because you don't have any customers. Right? So, it's a great problem to have, but I think it is inherent. You're going to see it no matter what happens over time.
Can you run us through a systematic process in order to deal with technical debt? How can teams implement a process to systematically clean up technical debt?
I would say, yes, it is possible to be systematic and keep tech debt low.
I have an idea and it is not quite what you're asking for. It's not as grand as a systematic process for technical debt. Most processes I see for handling technical debt to me are the generic game plan for, “The company has a problem.” So, the company's problem could be anything, but what you do is you identify it, you quantify it, you prioritize it, and then you schedule work on it. Okay? And I think that is exactly what I see people doing on technical debt, right? That, you know, if it's small enough, I fix it this week. It never even makes it onto the backlog. Okay? But if it's bigger than that, I put it on a backlog. And then I start to quantify those things. And I say, “Well, that one looks like a one-week fix, and that looks like a two-week fix,” and so forth.
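The identify, quantify, prioritize, schedule loop described here can be sketched in a few lines of Python. This is only one possible reading of it. The `DebtItem` fields, the 1-to-5 `pain` score, the 0.2-week cutoff for "fix it this week", and the pain-per-week ordering are all assumptions made for illustration; none of them come from the conversation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DebtItem:
    name: str
    cost_weeks: float  # quantified estimate: how long the fix would take
    pain: int          # hypothetical 1-5 score for how much the debt hurts


def triage(
    items: List[DebtItem], small_fix_weeks: float = 0.2
) -> Tuple[List[DebtItem], List[DebtItem]]:
    """Splits debt into 'fix now' and a prioritized backlog.

    Items at or below the (assumed) cutoff never reach the backlog.
    The rest are ordered by pain relieved per week of effort.
    """
    fix_now = [i for i in items if i.cost_weeks <= small_fix_weeks]
    backlog = [i for i in items if i.cost_weeks > small_fix_weeks]
    backlog.sort(key=lambda i: i.pain / i.cost_weeks, reverse=True)
    return fix_now, backlog


items = [
    DebtItem("rename confusing module", 0.1, 1),
    DebtItem("replace message queue", 2.0, 5),
    DebtItem("split customer code from filters", 1.0, 4),
]
fix_now, backlog = triage(items)
```

With these made-up numbers, the rename is fixed immediately, and the one-week split is scheduled ahead of the two-week queue replacement because it relieves more pain per week of work.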
And then I prioritize them and I say, “Okay, which one am I going to do?” Right? So to me, this is common sense. I'm going to suggest something else, which is not exactly common sense. It's actually the topic of the next IEEE Software article, the one that hasn't been published yet. So, in this one, I make a comparison between technical debt and garbage collection. I know that sounds a little bit strange but let's try it on for a second. Okay?
So, imagine you have a program and I'm going to exclude the programs that say, statically allocate their memory. But like, if you have a regular program that allocates memory and then has to clean it up, usually via garbage collection, okay? What you're doing over time is you're generating garbage, but you have to clean it up.
And when you do go to clean it up, you would like the pauses to be as small as possible and as deterministic as possible. So, if you have to have a pause, you'd rather not say the first one is for one second and the next one is for 10 seconds. Right? Like, if they're going to be big, you'd like them to be consistent. Okay? So let me compare that to technical debt. You've got a team and as a team builds software, it unfortunately creates debt at the same time. You're going to have to pause and clean up some of that technical debt. You would really prefer if the pauses were short and you'd really prefer it if the pauses were deterministic, not like it took me one second to clean this one up. And it took me 10 days to clean up that one, or in your case, like it takes you 12 months to refactor for scalability, right?
That's really not what you want. So, the question is, how do I do it? How do I get good garbage collection? Well, the answer is we've been researching garbage collection algorithms for a long time. Okay? And over time, it's a very hard problem, but we've got better and better at making the pauses smaller and more deterministic. Right? And there's even this thing called the “No-Op Garbage Collector” where you don't actually do any garbage collection. And you say, “Well, that sounds crazy.” Well, what if your process fires up, runs for one second and then shuts down, okay? Let's say you're a little microservice or even smaller. Why would you even run a garbage collector on that? Right? Like the whole process is going to be gone in one second. So, I don’t even need to do anything. Well, it turns out there's enormous parallels again, when you go over to teams, right?
And you can say, “What are the things that my team is doing to cause technical debt?” Well, the first thing I could do is I could try to reduce the amount of technical debt that I'm generating. Okay? And that could be like a stroke-your-chin-for-10-seconds more before you start typing, right? Like maybe, actually you evaluated a few more alternatives and you just didn't allocate as much memory and you didn't allocate as much tech debt. Okay?
That's one thing. And the other thing is you could be smarter about how you return to the technical debt and clean it up. Okay? And this is close to what you were asking about before, is there a place in my agile process when I'm supposed to return to technical debt? And again, I'm sort of hedging and I'm not saying there is a right answer, but I'm saying the different answers people come up with are analogous to the different garbage collection algorithms.
Okay, the garbage is going to exist. You have choices about when you return to it and what you do about it. Like, so there's mark-and-sweep collectors and generational collectors and all kinds of different algorithms that heuristically seem to be good. And they probably depend upon how your program generates garbage, right? So, when we go and we look at technical debt and a team, well, what we're really talking about is what is the team's process that causes tech debt to be created and what is its process for returning to it and paying it down?
And I would think that any reasonable long term team process needs to have both of those to some extent. It needs to have some amount of saying, “Well, let's not unnecessarily generate garbage. Let's not code up. Let's think of two alternatives and choose the less bad of the two.” Okay?
And then the second thing is if the team, no matter how good the team is at number one, if the team is not good at returning to the technical debt, whether it's weekly or it's, you know every Friday we do tech debt pay down or whatever it is, or maybe there's two teams, one team is generating debt and the other team is paying it down, like you can imagine a bunch of different algorithms, the solution to tech debt lies in how teams run that algorithm.
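The garbage collection comparison can be played with in a toy model. The sketch below is a deliberately simplified assumption, not a measurement of any real team: each week adds a fixed amount of cleanup work, untouched debt optionally grows by a `growth` factor each week, and a "pause" clears everything that has accumulated. The function name `pause_sizes` and all of its parameters are hypothetical.

```python
from typing import List


def pause_sizes(
    weeks: int,
    debt_per_week: float,
    paydown_every: int,
    growth: float = 1.0,
) -> List[float]:
    """Returns the size of each cleanup pause under a fixed paydown schedule.

    growth > 1.0 models (as an assumption) debt getting more expensive
    the longer it is left alone.
    """
    debt = 0.0
    pauses = []
    for week in range(1, weeks + 1):
        # Existing debt grows, then this week's new debt is added.
        debt = debt * growth + debt_per_week
        if week % paydown_every == 0:
            pauses.append(debt)  # the pause is as long as the debt cleared
            debt = 0.0
    return pauses


weekly = pause_sizes(weeks=12, debt_per_week=1.0, paydown_every=1)
big_bang = pause_sizes(weeks=12, debt_per_week=1.0, paydown_every=12)
compounding = pause_sizes(weeks=12, debt_per_week=1.0, paydown_every=12, growth=1.1)
```

In this model the weekly schedule produces twelve short, identical pauses, while the once-a-year schedule produces a single pause twelve times as long, and longer still if neglected debt compounds. That mirrors the preference stated earlier for pauses that are short and predictable.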
And I think that that's an empowering way of looking at it because the disempowering way of looking at it is to say, “Tech debt is inevitable. Let's just crank up all the tech debt. And eventually we declare bankruptcy and we start again.” Okay? And I find a lot of people think that that's just what you have to do. That you might keep tech debt down a little bit, but eventually the complexity builds up and the heat death of the universe hits your project and you're done. Okay? I don't believe that. I do actually believe you take it.
Look at very large long-term projects, say like the Linux Kernel. And it's not easy, but over time, human beings together on a team can evolve code and keep it healthy and safe. And I'm not trying to say the Linux Kernel is perfect. I don't think anyone's going to say that, but I will say it's a heck of a lot of code and a heck of a lot of distributed collaborators working on something, but they have processes to keep the tech debt low enough. Right?
We can do it too. I don't know what that is, our projects, [but] we can do it too. We just have to figure out the right process for it. So I would say, yes, it is possible to be systematic and keep tech debt low. And if your team is not doing it, look in those two places. Like, are you minimizing the creation of it? Maybe you could do some more stuff there. And second, if you're not returning to it, if you feel like it's still up, you're probably not paying it down fast enough. So that's my process.
George Fairbanks, you're really good at analogies and metaphors and making this stuff seem really simple and easy to understand and with practical processes to achieve it. Where can people, our audience, learn more about what you're writing about and follow you on social media?
Well, I am on Twitter, although I don't say very much over there. “@GHFairbanks” on Twitter, because I wasn't fast enough to get “George Fairbanks” or “The Real George Fairbanks” or something. I write this column for IEEE Software. And that one you can find on my website, GeorgeFairbanks.com. Although if you do go to my website, please click on the “IEEE” one because they look at how many people look at it on the website. And there's a book which is starting to get a little bit old now, my book on software architecture. But thank you so much. You've asked great questions here today, and it's great to talk about something I'm so passionate about. Thank you. It's a pleasure.
Thank you very much, George. And I would highly recommend people visit GeorgeFairbanks.com and visit that IEEE Software link because it is very worthwhile reading. Thank you for joining us on the program.
- IEEE Software Magazine
- “Just Enough Software Architecture: A Risk-Driven Approach” by George Fairbanks
- “The Pragmatic Designer” by George Fairbanks
- George Fairbanks on Twitter