Serverless computing is growing in popularity and is heavily promoted by public cloud providers. The much-touted benefit of serverless computing is that it allows developers to focus on their code whilst the public cloud provider manages the environment and infrastructure that will be running it.
But how is serverless different from container-based services? What are the best use cases for serverless? How about the challenges? And can this architecture move forward in the future? We answer these questions and more in this episode of Coding Over Cocktails.
Kevin Montalbo: Joining us all the way from Australia is TORO Cloud’s CEO and founder David Brown. Hi, David!
David Brown: How have you been Kevin? I'm very well. You?
KM: I'm great. And our guest for today is a Developer Evangelist at Sematext.com, a SaaS based out of Brooklyn, New York. He is also a passionate teacher, helping people embrace software development and healthy DevOps practices in various venues since 2017. He is also the author of "Node JS Monitoring: The Complete Guide", and has several published articles, programming tutorials, and courses under his belt, found on websites such as freeCodeCamp, HackerNoon, Medium, and Dev.to.
He’s now here with us today to share his expertise on Serverless computing. Joining us for a round of cocktails is Adnan Rahic. Hey Adnan! Great to have you on the podcast.
Adnan Rahic: Hey, good to be here!
KM: All right, so let's dive right in. In our previous podcasts, we have often discussed Kubernetes and container-based approaches to microservices. Can you briefly explain to us how serverless is different from container-based services?
AR: Yeah, for sure. When you think about it, with containers, you get a package where your code runs, which basically, you package your code into an executable. And then you run this on an infrastructure, right? And they're quite logically called containers because of this. But with serverless, you don't really get that. With serverless, you just deploy your code directly to the cloud provider, and then the cloud provider handles everything from there. You don't really care about the dependencies. You don't really care about the runtime or anything like that. You just let the cloud provider handle all of that for you. Whilst with containers, you have to package all of those things within that container. So you have to figure out, "Okay, so I need to package around the dependencies. I need to manage all of that. I need to make sure that's all running correctly."
But having this serverless approach, it kind of makes it easy in one sense. But it can also be very complex in another sense, because if you overdo it, it gets really hard to manage all of that complexity. And then when you think about it, you can also reduce complexity. Because if you have a huge Kubernetes cluster, for example, or a huge monolith, and then you have things like cron jobs or email services or things that aren't really related to the core functionality of your actual cluster or of your product, you can then cut those pieces out into serverless functions that would basically be isolated.
So if you know how to use it correctly, or if you have a very good sense of how to get the best out of it, then it makes sense. But it's not a silver bullet. As with anything, you have to figure out the best use-case and then, based on that, use it for what it's intended to be used for, if that makes any sense.
DB: Yeah good stuff. We'd like to get to the use-cases and some of the challenges and complexities you mentioned in a minute. Before we get onto that, serverless is often mentioned in reference to Functions-as-a-Service. But serverless is broader than that, right? So it's encompassing more than just Functions-as-a-Service.
AR: Oh yeah, definitely. Basically, anything that doesn't require a server can be considered as serverless, right? But only Functions-as-a-Service? That's a subset, if you can call it that. If you think about services like Lambda or Azure Functions or things like that, those are all f-a-a-s or FaaS. We call them Functions-as-a-Service where you have this service where you can deploy your code, hook it up to an event trigger, something triggers that code and, you know, it runs. Something happens, and you get a return value, which is basically what you want. And that's just one subset of having serverless or using serverless. If you think about it, like if you're running a website - a super simple static website - on S3? That's serverless as well. Are you managing a server? No. You have S3, slap your files in there and you hook it up to a domain and it’s serverless, right?
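The Functions-as-a-Service model described above (code that is deployed, hooked up to an event trigger, and returns a value) can be sketched in a few lines. This is a minimal illustration rather than anything shown in the episode: it follows the `handler(event, context)` shape that AWS Lambda uses for Python functions, while the event layout and the greeting logic are assumptions made up for the example.

```python
import json


def handler(event, context):
    """Entry point the platform calls each time the trigger fires."""
    # `event` carries the trigger payload. An HTTP-style event with an
    # optional JSON body is assumed here purely for illustration.
    body = json.loads(event.get("body") or "{}")
    name = body.get("name", "world")
    # The return value is what the caller of the function receives.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Simulate a single trigger locally with a hand-built event.
    fake_event = {"body": json.dumps({"name": "Adnan"})}
    print(handler(fake_event, None))
```

Nothing in the function knows about servers, ports, or processes; the platform decides where and when it runs.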
So it's very vague in what it could be defined as. But it's also very loose in a way where, if you're running a website on Netlify and you're hooking up an API to some Lambda functions or using services like [inaudible] or you're just running it by yourself on AWS Lambda, on S3. All of those things could be considered serverless, because, I mean, have you ever touched an EC2 instance? Not really. No, right? So, I mean, it could still be considered that way. I know a lot of people that are, like, hardcore, like purists. They're gonna say, "Oh, this is so weird." Maybe yeah. Maybe no.
It's just that in the end, whatever floats your boat. I mean, the point of serverless is to make it simple, to make it easy for people that don't need to manage infrastructure. Hypothetically, if I'm a startup founder, I don't really wanna care about managing containers and instances and running the infra and then hooking all of these things up, getting like a really large bill for something. I mean, I don't really need that if I'm making a ton of money and then I need to employ tons of people to run that so I don't have downtime, then sure. Yeah, I mean that's the next logical step.
DB: Well, there are managed services for your containers as well. So, managed Kubernetes and, you know, as you say, managed virtual servers through EC2 or container-based services as well. So there's plenty of opportunity for managed infrastructure and containers, and I guess that sort of starts leading us down the path. And I guess, one thing we want to clarify: sometimes when we're talking about best use-cases or complexities or challenges, we're actually talking about Functions-as-a-Service. We're talking about that subset, so I think we just need to clarify that.
Let's maybe talk about some of the best use cases for serverless then. So you said, you know, it depends on the use-cases to when you use serverless, when you use microservices, container-based technologies. So let's run through some of that: some of the differentiations between serverless and microservices based on containers.
AR: Yeah, for sure. I mean, to keep it simple, anything that requires a persistent database connection or requires many database connections, especially to relational databases like Postgres or SQL. Whatever. Just don't. Just skip the FaaS altogether. Unless you have, if I go really technical in it, unless you have, like, a proxy API that hooks into your database, then it's fine. But that requires another layer of complexity that often you don't really want. Except if that's a use-case that you're okay with, because the problem with functions is that if you run one function, that's basically one API. If you think about it, that one API needs a connection to the database and if you're scaling out, then you have thousands of functions. They have thousands of connections to the database, and that's just like an accident waiting to happen. That's just running with scissors, right? You don't want to do that. It's an unnecessary load on the database. It's unnecessary connections, multiple points of failure, multiple points of breach. So I mean, you just don't really want to do that, right? Unless you're using a database that's a service as well that hooks into that FaaS ecosystem, like AWS has DynamoDB, which works fine. Azure has DocumentDB or I don't really know what it is called. So, any service that can hook into it, it’s fine. But you get vendor lock-in there, so if you want to move away from that, you're gonna have a pain on basically anything that goes with that.
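The connection fan-out described above can be illustrated with a small simulation. This is only a sketch: `FakeDatabase` and its limit of 100 connections are hypothetical stand-ins, not the behaviour of any particular database, and each loop iteration plays the role of one function instance opening its own connection.

```python
class FakeDatabase:
    """Stand-in for a relational database with a hard connection cap."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.open_connections = 0

    def connect(self):
        if self.open_connections >= self.max_connections:
            raise ConnectionError("too many connections")
        self.open_connections += 1


def scale_out(db, concurrent_instances):
    """Each function instance opens its own connection; count the refusals."""
    refused = 0
    for _ in range(concurrent_instances):
        try:
            db.connect()
        except ConnectionError:
            refused += 1
    return refused


if __name__ == "__main__":
    db = FakeDatabase(max_connections=100)  # hypothetical limit
    print(scale_out(db, 1000), "instances were refused a connection")
```

A proxy API in front of the database, as mentioned above, would hold a small, fixed pool of connections instead, at the cost of one more layer to run.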
So, I reckon if you have database connections, figure something else out. With anything else, basically, you can think of it as sidecars. So, if you have cron jobs that are running, you don't really need to run that in your core infra, like if you have a core community server that handles your main APIs or your main database-handling. You don't really need to run those cron jobs there. You can just, like, fire a Lambda, right? Or if you have email services or any type of service, an API that you can extract from your core product? Great. Because you have that one less thing to think about. And that's going to be less of a load on your entire system.
So regarding those things, amazing. That's absolutely great. One example is, I built an email service to get triggered through a Lambda function and another few services through AWS that when somebody types in a form, I got emailed that response for that question. And then I could just email back that person through any email client, so on. But that's not running anywhere. It's not running on a server. That's not taking up any space or any mental capacity for myself to have to, like, focus on actually getting that running and keeping it running. It's just there in a function in my account in AWS. So, things like that are absolutely amazing because it takes away all the stress of having to manage it. Unless it's databases. You don't wanna go into that one at all.
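A function like the form-to-email service described above might look roughly like the following. This is a sketch under stated assumptions, not Adnan's actual implementation: the addresses, the field names in the event, and the decision to stop short of sending are all hypothetical. Only the standard-library `email.message.EmailMessage` class is used to build the message.

```python
from email.message import EmailMessage

OWNER_ADDRESS = "owner@example.com"  # hypothetical inbox of the site owner


def build_notification(form):
    """Turn one form submission into an email addressed to the site owner."""
    msg = EmailMessage()
    msg["From"] = "forms@example.com"  # hypothetical sender address
    msg["To"] = OWNER_ADDRESS
    # Reply-To lets the owner answer the visitor from any email client.
    msg["Reply-To"] = form["email"]
    msg["Subject"] = f"New question from {form['name']}"
    msg.set_content(form["question"])
    return msg


def handler(event, context):
    """Triggered once per form submission; `event` holds the form fields."""
    msg = build_notification(event)
    # A real function would hand `msg` to a mail service at this point.
    # Sending is deliberately left out of this sketch.
    return {
        "to": str(msg["To"]),
        "reply_to": str(msg["Reply-To"]),
        "subject": str(msg["Subject"]),
    }
```

Between submissions nothing is running, which is the point being made: there is no process to keep alive or patch.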
DB: What about managing it at scale, though? So, I get the cron job thing or infrequently run services or functions, then you don't necessarily want a service sitting there idle most of the time, if it’s only gonna be running that function every five minutes or every hour. Serverless makes a perfect use-case for that. But what about when you're doing it at scale? Does serverless still make sense when you're running hundreds of thousands of transactions per second?
AR: Yeah. It can. Just because it can scale so effortlessly. So if you think about the use-case, if you have a function on AWS, if you get 1000 concurrent connections in the same millisecond to that one API, it's gonna scale horizontally to 1000 functions right away. So, you’re not gonna get this typical type of latency you would get on a standard API like on a server or whatever. That's a good use-case, but that would also mean that it's going to cost a ton. Like it's gonna cost a lot of money. So, if you're like a big corporation and having that type of flexibility is something that you want, and you don't really care about the price, then fine. But for the majority of people, that can often be a problem.
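The cost concern above can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes a pricing model with a per-request charge plus a charge per gigabyte-second of compute; the function name and every price passed to it are placeholders for illustration, not actual provider rates.

```python
def monthly_function_cost(requests_per_second, avg_duration_s, memory_gb,
                          price_per_million_requests, price_per_gb_second):
    """Estimate a monthly bill that grows linearly with traffic."""
    seconds_per_month = 30 * 24 * 60 * 60
    requests = requests_per_second * seconds_per_month
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


if __name__ == "__main__":
    # Placeholder prices; substitute your provider's published rates.
    for rps in (1, 100, 10_000):
        cost = monthly_function_cost(rps, 0.2, 0.5, 0.25, 0.00002)
        print(f"{rps} requests/second: {cost:.2f} per month")
```

Because every term is multiplied by the request count, the bill scales with traffic rather than flattening out the way a fixed-size server's does.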
But going to the latency, I think that would also be a really, really interesting topic to cover. Because once those 1000 functions get instantiated and run concurrently, every single one of them is gonna have a start-up latency, because that initial request needs to grease the engine of it. You know, it needs to warm up.
DB: This is one of the most frequently mentioned challenges associated with serverless computing and Functions-as-a-Service. Just explain the concept of this, you know, the warm-up process and firing up a new function on that first use.
AR: Yeah, in the serverless community, it's called "cold starts", which kind of makes sense because it is cold. It's like the instance of the function isn't there when you're calling it at the initial time. Let's say you have an event that's an API and that event will trigger your code that's in the function. The instance of this function doesn't exist anywhere, so you have to call it the initial time to actually tell AWS, "Ayo, can you just, like, make sure this package exists somewhere?" Then they package it up, put it in a virtual server, whatever they do. Like, I have no idea what happens, which is kind of the point. And then that runs and that's gonna take an initial set of, I don't know, 200 milliseconds to 5, 6. It kind of depends on what you're running, but you're always going to have that initial latency which is called the cold start.
Now the problem is there's no tactical way to go around that. There's no way to bypass that per se. You can do some things that are maybe not always considered best practices. But there are hacks that people do use and one would be, you could just periodically trigger the function to keep it “warm”. Quote unquote warm, which is okay-ish. But again, if you have 500 concurrent connections right away and you're keeping one function warm, I mean, it's not doing much, right? You're still gonna get 499 cold starts, right? So, you also have to figure out peak times for when you're gonna expect traffic, when you're not going to expect traffic, which is hypothetically, okay but practically pretty much impossible to always be on point regarding that. But otherwise, I mean, there's not much you can do. You can keep a set of functions warm. But, you know, in the end...
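The keep-warm hack and its limits, as described above, can be sketched as follows. The `"warmup"` marker in the event is a hypothetical convention chosen for this example, and `cold_starts` is a deliberately simplified model in which every request beyond the warm pool pays a cold start.

```python
def cold_starts(concurrent_requests, warm_instances):
    """Requests beyond the warm pool each pay a cold start."""
    return max(0, concurrent_requests - warm_instances)


def handler(event, context):
    """Return early on a scheduled ping so warming stays cheap."""
    # The "warmup" key is an assumed marker set by the periodic trigger.
    if event.get("warmup"):
        return {"warmed": True}
    return {"warmed": False, "result": "real work done"}


if __name__ == "__main__":
    print(handler({"warmup": True}, None))
    # One warm instance barely helps against a burst of 500 requests.
    print(cold_starts(concurrent_requests=500, warm_instances=1))
```

Keeping a larger pool warm reduces the number, but only if the pool size happens to match the traffic peak, which is the part that is hard to predict.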
DB: I'm guessing the cold start problem is compounded by, in some cases, the language of choice as well. I'm guessing Node.js server is gonna execute a function a lot faster than Java servers, of which typically you need to warm up the JVM itself. Once the function has been started from the cold start, then the JVM needs to be typically warmed up before starting to serve a request quickly as well. So does language come into this as well in your serverless choice?
AR: It does, but it's not that big of a difference. So, the way that at least I know for Lambda, the way it works is that AWS packages this code into, like, a docker image. Well it's not a docker image, but it's a container per se. It's a container image. And so, the runtime gets packaged into this image as well. But it's a major difference whether you have no runtime at all and are running Golang as a language with just an executable, which doesn't need anything. You just run it, versus something like Node or Python or Java. So definitely having a language that doesn't need such a big start or doesn't have the big warm-up process, it’s better. But in the end, it's not that big of a difference. It's not like a seconds difference. It's maybe in the hundreds of milliseconds difference, which for most people, is acceptable. But again, if you have those margins that need to be hit, it's not really acceptable.
DB: Okay, so you've mentioned a couple of things. You mentioned there's potentially a cost penalty associated with serverless. When you're looking at scale then there's the cold start issue, which - as you say - is only an issue if it's very infrequently run. And there’s possibly ways around that, although they have disadvantages as well. Any other challenges that people should be aware of associated with serverless before we go into their great use-cases as well as their advantages?
AR: If you want to talk about general developer experience and how easy something is just to build and to integrate or whatever, then, yeah. The barrier of entry for using serverless as a developer can be pretty huge, especially if you haven't done something like that before.
DB: Why is that?
AR: Because it's a whole new concept of development. Right? You have to think outside of the typical box of development because the typical way is like, I run this server on my local machine. I do some changes, I hit reload or whatever, and I see the changes and I can, you know, figure out how to do it, whether I'm running Node, or it doesn't matter. I'm personally a Node.js developer so I can compare that. But if you're running it in serverless like, you have this typical dev environment that you can run, but there is no way of simulating a Lambda function. You can't really do that right? The main issue is that you have to run multiple environments for testing, for development in AWS, in the actual cloud to get a proper sense of what's gonna happen in production as well.
And then that means that if you're not doing test-driven development, if you're not running unit tests for the code, it's gonna be a pain. It’s gonna be absolutely horrible. And things like that. But luckily, AWS figured out a way. They recently released something like a container runtime. I can't remember. They always have freaking hard names. It's weird names for stuff. I don't know why.
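The point about unit tests can be illustrated with a handler that is exercised locally by passing it hand-built events, with no cloud environment involved. The handler below and its event shape are hypothetical examples written for this sketch; the tests use Python's standard `unittest` module.

```python
import unittest


def handler(event, context):
    """Hypothetical function under test: sums the numbers in the event."""
    numbers = event.get("numbers", [])
    if not all(isinstance(n, (int, float)) for n in numbers):
        return {"statusCode": 400, "error": "numbers must be numeric"}
    return {"statusCode": 200, "total": sum(numbers)}


class HandlerTest(unittest.TestCase):
    def test_sums_numbers(self):
        result = handler({"numbers": [1, 2, 3]}, None)
        self.assertEqual(result["total"], 6)

    def test_rejects_bad_input(self):
        result = handler({"numbers": ["x"]}, None)
        self.assertEqual(result["statusCode"], 400)
```

Saved as a module, these tests run with `python -m unittest`. They cover the function's own logic only; how it behaves once deployed behind a real trigger still has to be checked in a cloud environment, which is the gap being described.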
DB: Anyway, it's okay. People find out.
AR: So what they did was they added this feature where you can, basically, you can build the container image yourself, and then you can push that container image - the actual Lambda container image - and then you can push that, and then you could hook that into Lambda. And then if you want to, you can run that image on your local machine, like through Docker. Like any other containers. So that gives you the opportunity to actually test the live version, the production version of the function before you push it, which is like - for me when that happens - thank you, God. So that was a breakthrough. I mean, I think that's gonna be the goal. One example is that we have CNCF (Cloud Native Computing Foundation) for Kubernetes and all of the tools that go with Kubernetes.
If we could, as a community, have a similar thing for serverless and get this one norm of how we do things and this one path of how we could have scalability, ease of use, and developer productivity. If we could have monitoring and log management as well, because monitoring and log management is a pain in containers, let alone in serverless, right? So if we could have one standardized flow for all of that, that would be so amazing. If we could go that path that would be so amazing. I forgot the initial questions.
DB: I think you just answered some of the questions I’m gonna ask you later in terms of where you see serverless evolving and what you would like to see in the future. So, I think you just iterated through a number of things you'd like to see in the future for serverless. One of the issues associated with serverless functions is they typically have a runtime limit on them, right? So, you know, if it doesn't execute within X number of seconds, then the function is terminated prematurely. Am I right?
AR: Yeah that’s right.
DB: How do you monitor that? Is that one of the challenges as a developer as well? And how do you control this? Like, how do you know whether it's your code or if it's the server or when things are failing unexpectedly?
AR: Right, that's a really good topic here. So, back up until I think it was last year, the runtime limit for Lambda functions in AWS was five minutes. And they pushed that up to 15 now, which is, I mean, if you're running something for 15 minutes and it's not executing after 15 minutes, that's kind of bad. Let's just say that's kind of not good. But on the other side, like I myself as a developer, I don't really want a function to run for 15 minutes. I mean, if I have some data-intensive calculation thing going on, fine. But like, I don't wanna keep it open for 15 minutes. Why would I even want to do that? So the ideal way of doing this - that is also the best practice in the community - is that if you have something that is going to run for that long, chain the functions. Because if you think about it from a logical standpoint, as an engineer, as like when you're writing a product or software, you don't want one function to do a ton of things. You want functions to be modular, right? Especially if you're writing languages that are functional, like, I don't know, freaking Erlang or something.
So ideally, every function should be a few seconds, right? And then you get the value at the end. Yes, one function can be a bit longer, like five minutes or whatever. But if you set up your architecture correctly that way then it's fine, you're not gonna have many issues. But yeah, I do understand. I do understand the initial problem with the execution runtime, but in the end, if you really need to run something more than 15 minutes, probably using a server is gonna be cheaper and more efficient.
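Chaining short functions instead of running one long one, as suggested above, can be sketched like this. Everything here is an assumption for illustration: `CHUNK_SIZE`, the event fields, and `run_chain`, which is a local stand-in for whatever mechanism would re-invoke the function with the next event in a real deployment.

```python
CHUNK_SIZE = 3  # hypothetical amount of work one short invocation handles


def process_chunk(event):
    """Handle one slice of the job and describe the next step, if any."""
    items, start = event["items"], event["start"]
    chunk = items[start:start + CHUNK_SIZE]
    total = event["total"] + sum(chunk)
    next_start = start + CHUNK_SIZE
    if next_start >= len(items):
        return {"done": True, "total": total}
    # Not finished: hand the remaining work to the next invocation.
    return {"done": False, "items": items, "start": next_start, "total": total}


def run_chain(items):
    """Local stand-in for the platform re-invoking the function per step."""
    event = {"items": items, "start": 0, "total": 0}
    invocations = 0
    while True:
        result = process_chunk(event)
        invocations += 1
        if result["done"]:
            return result["total"], invocations
        event = result


if __name__ == "__main__":
    total, invocations = run_chain(list(range(10)))
    print(f"total={total} after {invocations} short invocations")
```

Each invocation stays well under any time limit because it only ever touches one chunk, and the state it needs travels in the event rather than in memory.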
DB: How do you manage the complexity when you have thousands of functions? In microservices, we have service discovery. A microservice will say to the gateway or Kubernetes, "Hey, I'm here. These are my endpoints. This is the service that’s available," and so you're aware of it. Something is aware of it and what it could do and can route requests to it if it gets a request for that particular microservice. If you have thousands of serverless functions which are sitting there idle and can be executed if an event triggers them, how do you manage that complexity so it doesn't become sort of an unmanaged web of functions?
AR: I mean, you manage it very poorly. Not really, I mean, jokes aside, there's no good way of doing it. That's where we don't have the problem with Kubernetes. It's mature enough that you get the service discovery and you can see what's happening. You have tooling that is open source for both monitoring and log management, which is awesome. And you see what's happening.
But the problem with serverless now, if you have thousands of functions, yeah, you can see them in your AWS console but there's no real way of getting this overview, right? You can check the logs as well. That's fine, but that doesn't give you this service overview and you know, if you want to get the service overview, you need to use, like, a third-party tool, a SaaS product, whatever, for monitoring. I mean, yeah, there are a few out there that you could use that are really well funded. They've been around for a few years so they have really good leadership, like the C-level executives are really competent people. I know a few of them as well and I can vouch that they are super, super, super talented people. But then again, they're all like separate tools from separate SaaS, where we don't have that one unified way that we can all agree on like, "Yeah let's use this and make this the best possible way of getting this overview, of getting the logs and getting the metrics."
DB: What about a serverless industry alliance? Is there such a thing where there's governing bodies trying to drive standards and adoption?
AR: Yeah, as far as I know there is no such thing. I might be a bit outdated, but that would be something we should definitely try pushing towards. We do have some, like serverless-esque tooling inside of CNCF that run on Kubernetes. OpenFaaS is one of them. We have Kubeless or something. I think it's called Kubeless, which basically that's a set of tools that you can basically set up functions inside of your Kubernetes cluster. But they're very, very rarely used if you compare to Docker or just containers in general, inside of Kubernetes, which does make sense, because if you're already running the complex container environment, then why would you also run the complex FaaS environment inside of that Kubernetes environment?
So it gets complicated really quickly. But yeah, having what you said which you mentioned, let's say like a collective or an open collective or something that will get people to, you know, push the same things, the same needs. I think that will be all, like, really freaking awesome. That is so cool. I'm already hyped if we actually get that going.
DB: It sounds like an opportunity for you, Adnan.
AR: Oh I should maybe start another startup. But the startup is just like hyping other startups to do this one thing.
DB: With serverless, you’re obviously very much reliant on a public cloud provider. No one sets up their own serverless infrastructure. I'd say we're talking about, you know, typically the big three public cloud providers providing some sort of infrastructure to support serverless. You know, in some respects, I’m gonna answer my own question because we're very much reliant on public cloud providers for a whole bunch of things now, including Kubernetes and microservices as well.
But when you have a serverless infrastructure, you simply cannot see even an underlying VM or containers being spun up and spun down. That infrastructure is completely hidden from you. How do you manage downtime? How do you manage maintenance periods? Do you just hope that the vendor gets this right and is able to transition your code? Is this even a problem associated with maintaining serverless infrastructures?
AR: I mean, step one is like, really praying a lot. No, I'm kidding. Not really kidding. But seriously though, when you think about it, I worked with startups earlier in my career, and right now I work at a monitoring SaaS. And we run stuff on AWS. Do you wanna take a wild guess how many times we've had downtime because of AWS and how many times we've had downtime due to human error?
Human error 100% of the time. Never has something weird happened to AWS that we had downtime because of them. If we, as a major company that has a number of employees, that runs so many services and spends so much money on AWS, have no problem, then whether you're somebody that spends, like, no money at all or a Coca-Cola type of huge corporation, I don't think you're gonna have any problems with running anything on AWS.
DB: You wouldn’t have issues setting up functions in different regions and all the rest of it as well.
AR: Exactly. Exactly. And one thing is super, super nice with serverless and setting up the serverless functions is that you have - I think AWS even calls it like Edge Provide or something like that - where basically, if you deploy that one function, it gets copied into all of these regions. All of these different availability zones, meaning that not only will this be good for the end user, because if they hit the API it’s going to trigger the function closest to them. So if they're in Singapore, they're going to get the instance in Singapore. If they're in, you know, the East Coast in the US they're gonna get the one that's running in the East Coast in Virginia. So, I mean, those are all things but also, if one fails, all 12 or 13 of them are not gonna fail at the same time. Because if all of those things happen it’s probably like a world-ending event. Yeah, so there's probably aliens landing or something.
DB: All right, look. What if you could have a wish list? So, you know, you've written books on serverless and you've been working with it for some time now, if you could have a wish list of the things you would like to see - whether be tool kits or infrastructure or governing bodies and standards - give us a quick rundown. What would that wish list look like?
AR: Yeah, I mean, better tooling for sure. I think better tooling. I mean, I work at a monitoring SaaS and we're currently working on making better tooling, but it's so hard when you don't have this one unified body governing what the community wants, what we need, what we like, what we're striving for. So definitely having better monitoring tooling is number one. Also, if we could get the cloud providers to be unified about the APIs they use and the way they run the serverless functions, that would be absolutely amazing. Then we can have a unified way of gathering stuff like logs because right now, like, you have to bake in your own log collection, enrichment, and shipping type of tool that's gonna run inside of this serverless environment, extract this data, extract the logs, extract the metrics from all of the execution, whatever is happening, and then kind of package that and send it somewhere.
So I know this because I've built this before, like this freaking log-collection tool. We have that in our product, and it's a real pain to build, right? It's not a straightforward thing to do. And if we could get all of us to work on the same problem and work on the same solution to that problem, then it's gonna be much easier, because if we all put our brains in the same place, I think it's gonna be great. But right now we have, like, tons of people trying to solve the same problem in, like, 15 different ways, right? We're all hopefully very intelligent people, many more intelligent than me. I'm not that smart. But then, let's actually put all of our brains in the same place. And it's gonna be, you know, there's gonna be more of an impact. Because if you think about it, in Kubernetes we have Prometheus, we have Fluent Bit. All those tools are part of the CNCF, and you know that those folks, they support that. They push that and everything that goes into Kubernetes as a service or as a tool, it just works. And if we could get that as well, get a foundation or whatever for serverless, I think that's gonna be the thing. You know, the bomb, as kids call it nowadays.
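The collect, enrich, package, and ship steps described above can be sketched in miniature. This is not the Sematext tool, only a hypothetical illustration: the field names, the example function name and region, and the choice of newline-delimited JSON as the shipping format are all assumptions, and the actual sending step is left out.

```python
import json


def enrich(raw_line, function_name, region):
    """Attach context that a raw log line lacks on its own."""
    return {
        "message": raw_line.strip(),
        "function": function_name,
        "region": region,
    }


def package(raw_lines, function_name, region):
    """Build one newline-delimited JSON payload ready to be shipped."""
    records = [
        enrich(line, function_name, region)
        for line in raw_lines
        if line.strip()  # skip blank lines
    ]
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


if __name__ == "__main__":
    # Example values only; a real collector would read these from the
    # execution environment rather than hard-coding them.
    lines = ["START request\n", "", "sent email to owner\n"]
    print(package(lines, "send-email", "us-east-1"))
```

Every vendor currently has to write some version of this per cloud provider, which is why a shared standard for log and metric extraction is on the wish list.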
DB: Well, so Adnan you've mentioned the SaaS company you work for a few times without mentioning the name. Can you tell us the company you're working for?
AR: Yeah, it’s Sematext.com. I've been there for, I think, a few years now. It was my - I'm not gonna call it like, first proper job because I only did startups and consulting freelancing. And then I just got into, like, this real job. And I have to say, like having a normal job is super nice. It's not that stressful. It's not that stressful. I mean, you have, like, normal hours. I mean, I even started going to the gym and, like, working out. I have a social life now which is like, “What?” So yeah, it's like I’m a new person. Like I don't have back pain in the morning when I wake up, you know, because I actually exercise except for walking. All of those things are, you know, freaking awesome. But if you wanna like the shoutout for where I work or if you like, if you wanna work with me, we have a few job openings as well. I promise I will send funny memes in Slack all the time. If you do not want that, I'm very sorry you're gonna get it anyway. So yeah, Sematext.com. I could drop a link in this description or anything.
DB: And your social channels? Where can people follow you, please?
AR: Yeah, Twitter is where I might go. My DMs are open, so if you have any questions there, it’s just @AdnanRahic. It’s just my first and last name and you’re gonna find me there. You can check me out on LinkedIn or all of those things. Also, Instagram. I have a fire Instagram profile. I do like influencing and I don't really do influencing. I have like, seven followers. All of them, like my family and friends. I just post workout and stuff on Instagram like everybody just unfollows me because I'm so boring with the workouts. And so because I go to the gym for, like, half a year. Yeah, but check me out like I do these things, the community stuff and events and conferences. I used to do a ton of those before COVID, but I'm still up for doing online stuff like videos and podcasts. Anybody that wants to collaborate, just have a conversation, feel free.
DB: Adnan thank you very much for joining us to talk about serverless computing and the listeners can follow you, as you say on Twitter, LinkedIn or maybe even Instagram if they're brave.