The SolarWinds hack in December of 2020 is considered one of the largest and most sophisticated attacks known to date. The attack, which exposed the data of over 30,000 public and private organizations, was used as a springboard to compromise a raft of U.S. government agencies. According to experts, this hack could be the catalyst for broad changes in the cybersecurity industry, prompting companies and governments to devise new methods on how to protect themselves and react better to breaches and attacks.
In this episode of Cocktails, we talk to a Distinguished University Professor and expert on cybersecurity, and we touch on some taxonomies and frameworks that organisations can apply to build their security. We also discuss how we can take a more proactive stance with regards to cybersecurity, and take on some great practical advice to make our software products more robust and secure.
- Laurie Williams puts emphasis on being proactive instead of reactive when it comes to cybersecurity.
- “I don’t see a lot of people making that transition voluntarily.” 02:41
- We learn about the “prevent, detect, response” patterns and how we can develop them for our organisations.
- “Companies don't have to start from scratch, you know? They can go and look at the types of things that are in these maturity models…” 10:22
- Williams discusses if there is a need for new design principles for developing our security mechanisms.
- “Developing mechanisms and frameworks to support people using those types of principles is important.” 15:19
- We find out if you really need to log everything in order to improve security.
- “So disclosing data through logs is another attack vector.” 21:21
- Williams discusses the important role of AI and machine learning in cybersecurity.
- “AI in security, software security is an emerging field.” 29:03
Welcome to Episode 38 of the Coding Over Cocktails podcast. My name is Kevin Montalbo. And joining me from Sydney, Australia is Toro Cloud CEO and founder David Brown. Hi David!
Good morning, Kevin!
And our guest for today is a distinguished university professor in the Computer Science Department of the College of Engineering at North Carolina State University. She is a co-director of the NCSU Secure Computing Institute and the NCSU Science of Security Lablet. She's also the Chief Cybersecurity Technologist of the Secure America Institute. Our guest for today is Laurie Williams. Hi, Laurie. Welcome to Coding Over Cocktails.
Hey, thanks for inviting me!
It's our pleasure. So, we're going to obviously be talking about cybersecurity, one of your core areas of expertise. Can we start off by asking the question, “How can we take the practice of cyber security from being a primarily reactive process to a proactive discipline?”
Yeah, I mean, what I could say with my experience over the last twenty years of working with security is, the way to make it more proactive is for people to notice all the really bad things that are happening. I don't see a lot of people making that transition voluntarily. So, when there's a really bad thing that happens, then the awareness goes up. I think it was 2017 [the breach occurred in late 2013], during Christmas time, there was a big attack on Target - the Target shopping center, which is huge in the US.
So, I worked with the Building Security In Maturity Model, BSIMM, and the BSIMM folks do a survey. Or it's not even a survey. They go out to companies once a year and really assess what practices the companies use through interviews, by showing artifacts. It's not just self-reported.
They actually have to demonstrate what they do, and we analyze that data. And you see a jump up. So, Target changed the industry. And so as things happen, then organizations increasingly will think, “I don't want to be in the news.” You know, we just had it on the East Coast of the United States, a ransomware attack that took out gasoline so people couldn't drive their cars. So, that will raise awareness. So, unfortunately, I think that the transition from reactive to proactive is going to come with more and more of these big attacks.
Being proactive comes from being reactive to attacks on other organizations.
Right. Exactly. And then the desire to like, “Oh, we don't want that to happen to us.” And then they start to adopt some more practices. As we analyze this BSIMM data, which is really the largest data set of cybersecurity practices done by organizations, it's the largest data set there is. It's done out of Synopsys, started out being out of Cigital, but then the personnel moved to a company, Synopsys and there's 125 security practices that they assess. And then, you know, we actually have access to the raw data and we're analyzing it. And we see these jumps happen when these big incidents happen.
So, people will be more proactive. Another thing that may turn people to be more proactive, [is] like, there was just a big executive order in the US, an executive order on cybersecurity that came out in May and they are prescribing what I would classify as proactive practices. So, if the US government is saying, “We won't buy from you unless you do these proactive practices,” then companies will do it. At least the ones that supply the US government, and lots of companies supply to the US government in some shape or form. And I know I'm speaking very US-centric. The BSIMM data is worldwide data. But I think that to some degree, the standards, the NIST standards, the things that happen in the US relative to cybersecurity are spread throughout the world.
Absolutely. I think that the new standards you referred to, perhaps we can go through some of these. I know you take a very scientific approach and data-driven approach to security. So, how does an organization go about systematically, developing prevention, detection and response patterns for their security requirements?
Yeah, so for the security requirements, I mean, one angle that we took a number of years ago was to use a natural language processing algorithm to read the functional requirements of a product and to match them up with the security controls in the NIST 800-53 standard. And so it would look for keywords in the functional requirements and then match them to the patterns and then suggest security requirements. For example, a requirement might be, “A doctor edits the patient record.” And from that natural language requirement, which is functional, it's functionality that the product has, this would say, “Okay, if the doctors can edit a patient record, the doctor must be authenticated. The transaction must be logged.” So, we would tell the development team the associated security requirements based upon that functional requirement.
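The keyword-matching idea described above can be sketched roughly as follows. This is a minimal illustration, not the research tool Williams refers to: the `Rule` class, the keyword patterns, and the suggested requirement texts are all hypothetical, and a real system would map matches to actual NIST 800-53 controls and use far richer language processing than regular expressions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A keyword pattern and the security requirements it suggests (hypothetical)."""
    pattern: re.Pattern
    suggestions: tuple


# Hypothetical rules for illustration only; not taken from NIST 800-53.
RULES = (
    Rule(
        re.compile(r"\b(edits?|updates?|modif(y|ies))\b", re.IGNORECASE),
        ("The actor must be authenticated.", "The transaction must be logged."),
    ),
    Rule(
        re.compile(r"\b(views?|reads?)\b", re.IGNORECASE),
        ("The actor must be authorized to read the record.",
         "The access must be logged."),
    ),
)


def suggest_security_requirements(functional_requirement: str) -> list:
    """Return suggested security requirements for one functional requirement."""
    suggestions = []
    for rule in RULES:
        if rule.pattern.search(functional_requirement):
            for suggestion in rule.suggestions:
                if suggestion not in suggestions:  # keep order, drop repeats
                    suggestions.append(suggestion)
    return suggestions


for line in suggest_security_requirements("A doctor edits the patient record."):
    print(line)
```

The point of the sketch is only the shape of the pipeline: a functional requirement goes in, and candidate security requirements come out for the development team to review.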
So, you know, that's one way, but I really do think that, increasingly, organizations are providing pretty extensive taxonomies of security controls, or “security requirements,” as they're often called. Companies can use these security controls to be fairly exhaustive as they try to be proactive across prevent-detect-response. So response is, “Oh my gosh, we got attacked! What should we do?” Like, that's not that great. Detect is, there are vulnerabilities in there; how do we get them out before someone else finds them? And the best case is when they're preventative.
Are there some blueprints to get people, get companies started?
Are there blueprints? Well the NIST cybersecurity framework is a good way to get companies started. I guess some of the ways I would say, a blueprint, it sets a risk-based model. And so not all companies [are the same]. You know, they’re starting from different places. And they also have different risks. So like, depending upon the product itself and the product that the company is producing, they would consider themselves higher or lower risk. The NIST cybersecurity framework does a little bit of that risk assessment. It's not that specific though, but it is good for getting a company started.
I'd say another couple of ways to get started: I mentioned the BSIMM, the Building Security In Maturity Model. So it's a taxonomy of 125 practices. And the maturity model says there's level one practices, level two practices, level three practices. So, I'll tell you a competing way to look at it in a moment, but level one practices, these are the practices most organizations do. And then level two is more advanced companies, level three is the most advanced. So, if you're just starting out, looking at the level one practices of the BSIMM is a good start.
They had similar origins, but there's another taxonomy called SAMM or OpenSAMM which comes out of the OWASP organization. And they have level one, two and three as well. And the MM is “maturity model” as well, but they have a different tack. And what they're saying is, [it’s] more like it's prescriptive. Like, you should do level one and then you should do level two and then you should do level three and you should develop a procedure to get yourself to level two and then to level three. So, you should set your goals to increase the maturity. So, similar [to] something like 125 practices and the way to advance through it.
So, that means companies don't have to start from scratch, you know? They can go and look at the types of things that are in these maturity models and start to say, “Okay, we should adopt these things.” And they're also pretty wide, varying from developer practices, management practices, training practices, compliance practices, governance practices, they're all in there. So, those are good ways.
So, I think some of our listeners would be familiar with NIST, for example. Because NIST, for example, in software development, publishes vulnerabilities in software frameworks and libraries, and developers often use a build process to identify those and create alerts for them as a proactive step. BSIMM, I think a lot of people may not have heard of. So, it's an organization, a collaborative organization, which means people can join using a membership. Can you tell us a bit more about the organization?
Yeah. So, the two, the BSIMM and the OpenSAMM, had similar origins and then they split. Now, the BSIMM, it's not membership, it's not what you just described. It's a framework where consultants from the organization can come to your business and help assess you and develop a plan. So, that's one. OpenSAMM comes from OWASP, the Open Web Application Security Project, which is nonprofit. And so that's more of what you're talking about. You know, there are some other NIST and ISO standards. But I think that the NIST 800-53 security controls are also a good starting point. A nice comprehensive list. And, you know, if the organization wants to pay to be assessed by the BSIMM, that's great.
If they just want to look at what has been published about it and what the 125 practices are, that's one thing. OWASP OpenSAMM has more like spreadsheets that you can download that help you develop a plan. And then the NIST 800-53 security controls. All of those provide a good framework for people to get started. Another thing that I really like is, again, OWASP and its ASVS, which is the Application Security Verification Standard, and that's more technical. So, that's kind of enumerating 136, if I remember right, different things you should test for in your product.
So, it's much more developer centric, not governance, not anything else. And so, you know, when everyone's trying to figure out, like, “What are all the things I should do?” Starting from scratch and developing your own standard is not a recommended practice. Go to some of these NIST and OWASP resources that are available.
I was looking at your paper that you co-authored, Establishing a Baseline for Measuring Advancement in the Science of Security. I was interested in this concept of establishing a baseline, and you argue in that paper that we need to establish scientifically founded design principles for building in security mechanisms from the beginning. What do these principles look like?
Yeah. So, there are principles that have been around. There was a famous paper written by Saltzer and Schroeder back in 1976. And it's full of design principles. And I think all of them are still valid. And so, I won't go through all of them, but some like “Least privilege” which says every person should have the least amount of privilege possible. So design that in. “Minimizing trust”, so don't trust anyone. Only give them things that they absolutely need. “Defense in depth”, so assume that the attackers are going to get through your first line of defense and make multiple lines of defense. So, “Complete mediation” is one where you need to check access, like you continuously check that the person is who they say they are. Don't just assume if they log in, they are the person that they say they are. Keep checking.
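Two of the principles mentioned above, least privilege and complete mediation, can be illustrated with a short sketch. Everything here is an assumption for illustration: the `Session` and `RecordStore` classes and the permission strings are hypothetical names, not part of any framework discussed in the interview.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when a session may not perform the requested operation."""


@dataclass
class Session:
    user_id: str
    permissions: frozenset  # least privilege: grant only what the role needs
    revoked: bool = False   # can flip at any time after login


@dataclass
class RecordStore:
    records: dict = field(default_factory=dict)

    def _check(self, session: Session, permission: str) -> None:
        # Complete mediation: this runs on every operation, not once at login.
        if session.revoked or permission not in session.permissions:
            raise AccessDenied(f"{session.user_id} lacks {permission}")

    def read(self, session: Session, record_id: str) -> str:
        self._check(session, "record:read")
        return self.records[record_id]

    def update(self, session: Session, record_id: str, value: str) -> None:
        self._check(session, "record:update")
        self.records[record_id] = value


store = RecordStore({"r1": "initial note"})
nurse = Session("nurse_a", frozenset({"record:read"}))  # read-only by design
print(store.read(nurse, "r1"))
```

Because the check sits inside every method, revoking a session or removing a permission takes effect on the very next call, which is the "keep checking" behavior described above.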
So, there are, you know, a good 12 or 13 design principles that have been around for quite some time. And so, actually developing mechanisms and frameworks to support people using those types of principles is important. So, there's not really a need to develop more new principles. It's really to adhere to the ones that we know about. But that paper that you referenced, Establishing a Baseline, I've worked with the National Security Agency, the NSA, for more than 10 years on a “Science of Security” project. And the basis of that project is that the NSA would like for researchers to be more principle-based.
A lot of research these days is very “reaction”, you know? So, prevent-detect-response. A lot of the research can fall into the response. You know, “The attackers just did this. Now we need to have a supply chain.” That's the thing now. We had some big SolarWinds and some other big supply chain attacks. So now, that's the thing. But that's the reactionary, that's response-based research and NSA would really like us to be more prevention-based and, you know, sees the research community as not being as principle-based. And so that paper is about, “How do we, as a scientific field of security, report results so that other people can build upon our science?” And so that's why we're establishing a baseline.
Speaking of which, you talked about supply chain being one of the hot topics of the moment. In the news recently, President Biden met with Putin and mentioned cybersecurity and cyber attacks, and mentioned some lines in the sand. The lines in the sand, as I understand, weren't actually published, as to what the lines in the sand were. These are the areas or entities which we'd consider, you know, a line in the sand if you undertook a cyber attack. But if you were to have a guess, what sort of areas would you be imagining? Infrastructure and military, and these obvious candidates, but would there be some non-obvious candidates, perhaps?
Yeah, I mean, so certainly, military and government, you know? There are definitely accusations that I'm not sure, maybe even proof that the, you know, the Russians tampered with the election and, you know, and got in and, you know, exploited Hillary Clinton's email, for example. And so that's a government and interrupting the political process.
So then, you know, you mentioned non-government, but the SolarWinds, which is somewhat government. So, SolarWinds established, you know, a Trojan or a hack that was able to get in through the supply chain and then opened up the doors for government, you know, the Pentagon, some other government organizations, as well as some big companies like Microsoft. And so, that is a case where likely Russians opened up the doors to cause damage both at the government and the industry level.
You know, I'm not sure how many of these things get across the whole world, but the Colonial Pipeline, that was believed to be people from Russia. And, you know, it's interesting. And I saw the president of Microsoft, Brad Smith, mentioning in a blog that cyber espionage, just making money off of this, was a $19-billion business. And the people who launched that Colonial Pipeline attack that, you know, caused people like me to have trouble getting gas said, “We really didn't mean to do that. We just wanted to make some money.”
So like, that's kind of the next wave, like this economy of just causing these cyber disruptions to make money. And so, you know, I'm not sure, of course, and no one knows what Biden exactly said. It really spans all of that. Like, what are nation state attacks at a defense level, at a government level that disrupt government processes, as well as harming citizens and companies, which they've shown to do all of those?
I like to ask a more practical question, on-the-ground type of question, if you like. The security breaches are often discovered through log files. So, the question then becomes, what should we be logging in our application if we really want to be able to find anything? Should we just log everything or are there implications for that as well?
Yeah. So, we did some work with logging and you know, one of the things we showed with our work was that a lot is not logged. And so, you know, we had papers like “Modifying Without a Trace.” So, a lot of things aren't logged and logging, in the general case in computer science, has originated from debugging. Logging to debug and not logging for forensics. So, forensics, by design, is a new field. We coined the term “forensic ability”, the ability for a product to enable forensics.
So, you know, these are all things that people do need to be considering much more. And so should we log everything? For sure, not everything. So disclosing data through logs is another attack vector. And so you have to be careful about what you log, and not log any sensitive identifiers. Like, you can't use the social security number in the US as the unique identifier, because now that's sensitive data in itself.
So, like in the general case, CRUD - create, read, update, and delete when someone does those and view, which is another thing that we determined. So, as long as you can say, “Who saw something”, then you need to log that as well. So, some identifier of who did a create, a read, and update, delete or view is important. Watching for not logging so much that you create a new attack vector. And it's hard. It's actually hard to decide what to log and what are the data fields that you need to log. So, these are still open research topics. When my students were working with logging, it was funny to me. One of my funnier memories as an advisor was they went through a medical application and the requirements for the medical application.
And it's similar to the other product I described. So, what we were trying to do is to be able to read the requirements for a system, and then based upon some heuristics, recommend what should be logged. And so what my students were doing was, you know, looking at the transactions and coming up with heuristics and then applying them. And between students, they had some disagreements and I was like, “Come on, bring them to me. I’ll resolve it.” Like, you know, “I don't know why you guys can't figure this out.”
And then they brought it to me and I'm like, “Ah, yeah, I don't know either.” You know? And so then the three of us were like, making our best guess. So, it's really not that straightforward. You can read a requirement and be like, “Hmm.” So, you know, there's still work that needs to be done. I do think that there's definitely the potential for natural language processing to aid in the logging, what-should-be-logged process, there's definitely potential there. And then to watch for not disclosing information in the logs, not allowing your logs to be altered, you know, so they're write-only, they're backed up, things like that.
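The logging advice in this answer (record who did a create, read, update, delete or view; keep sensitive identifiers out of the log; make the log hard to alter) can be sketched as below. This is a minimal sketch under stated assumptions: the `AuditLog` class and `pseudonymize` helper are hypothetical, and hash chaining is an illustrative way to make tampering detectable, not a technique named in the interview. A production system would also need real write-once storage and backups.

```python
import hashlib
import hmac
import json

ALLOWED_ACTIONS = {"create", "read", "update", "delete", "view"}
GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def pseudonymize(identifier: str, key: bytes) -> str:
    """Keyed hash so a raw identifier (e.g. an SSN) never lands in the log."""
    return hmac.new(key, identifier.encode(), hashlib.sha256).hexdigest()[:16]


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    """In-memory, append-only audit trail with a hash chain (illustrative)."""

    def __init__(self):
        self._entries = []

    def append(self, actor: str, action: str, record: str, ts: str) -> dict:
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        body = {"actor": actor, "action": action, "record": record,
                "ts": ts, "prev": prev}
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev = GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True


key = b"demo-key-not-for-production"  # assumption: a secret managed elsewhere
log = AuditLog()
log.append(pseudonymize("doctor-123", key), "update", "record-42",
           "2021-06-01T10:00:00Z")
log.append(pseudonymize("nurse-456", key), "view", "record-42",
           "2021-06-01T10:05:00Z")
print(log.verify())
```

Each entry carries the hash of the one before it, so silently editing an old entry invalidates every later hash, while the keyed pseudonym lets investigators correlate an actor's actions without storing the raw identifier.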
We are a software company ourselves, and software companies and organizations which have a team of software engineers will often create a culture of focusing on adding new features. Can that culture lead to danger? Can focusing on features as opposed to fixing non-critical security issues result in what you call or refer to in the past as security technical debt?
Yes, absolutely. Absolutely. And then I actually did a keynote at a conference called TechDebt. It was one year ago. So, things could have changed, but at the time, you know, I looked for all the papers far and wide. Looked at all the papers that address security technical debt. And there really weren't.
So, it's not an issue that there has been much study on, but certainly your company and most companies really do focus and reward the production of functionality. So, the cognitive overload of a typical software engineer, when they're having to build security into a product and then the running static analysis tools and bugging tools and getting notifications that components that they use have vulnerabilities, like getting all of these signs from all over the place really does cause cognitive overload and then, can cause technical debt because more likely they're like, “Ah, this is too much, I'm not doing anything. I'm just gonna produce my functionality.” And so that human aspect of security is, again, an area that needs a lot more focus so that they're getting the strongest signals. The engineers are getting the strongest signals of what they need to deal with so that they don't create the security technical debt and reduce the false positives. A lot of tools cause alerts with false positives.
Yeah. And speaking of those false positives, I mean, like I mentioned before, we use the NIST database to identify vulnerabilities and risks during the build process. But you’ve also written about artificial intelligence being able to potentially assist organizations, deploy more secure software products as well. How do you see that in practice? People using more and more artificial intelligence to build security in?
Yeah. And so a number of ways. So, like if you put natural language processing in the AI category, I mentioned a couple of the projects that we've worked on, like what to log being NLP, or what should be your security requirements? So, there's other people doing things like that, but there's also a lot of learning algorithms. Like, something that we've done and a lot of other people have done is vulnerability prediction models.
So, based upon features, where should you look for your vulnerabilities? What are the signals that say, “There are vulnerabilities here. You should look here.” So, there are a lot of people doing that, mining logs. So, logs are terabytes and terabytes and terabytes, and you can log all you want. If you never look at the logs then you know, you might as well not have them.
And so, that's definitely an AI application. Identifying the anomalous behavior in the logs is another AI application. So, looking at the components that have vulnerabilities in the National Vulnerability Database, I think that's what you're saying that your company does. And that's like the most rudimentary. I mean, that whole field. So, it's called SCA, Software Composition Analysis. It’s very complicated. So, the National Vulnerability Database is, in most cases, the beginning part. And so tool vendors in that space are using natural language processing to read bug databases and security advisories to identify vulnerabilities before they get reported into the National Vulnerability Database. So, that's an AI kind of thing.
And then another aspect, which is probably not so AI-related is people don't want to be notified if a component has a vulnerability, if that vulnerability has nothing to do with the functionality that they use from that component. And so, trying to identify that which may require dynamic analysis is another aspect.
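The filtering idea in the last two paragraphs, alerting only when a vulnerable component is present and the project actually uses the affected functionality, might look like the sketch below. All names and data are hypothetical: `Advisory`, `relevant_alerts`, the `imagelib` and `netlib` components, and the `EXAMPLE-` identifiers are invented for illustration, and in practice the set of functions in use would come from static or dynamic analysis rather than a hand-written dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """One vulnerability record for a component (hypothetical schema)."""
    advisory_id: str
    component: str
    affected_versions: frozenset
    vulnerable_functions: frozenset


def relevant_alerts(dependencies: dict, used_functions: dict, advisories) -> list:
    """Return advisory IDs that affect an installed version AND code we call."""
    alerts = []
    for adv in advisories:
        version = dependencies.get(adv.component)
        if version is None or version not in adv.affected_versions:
            continue  # component absent, or our version is not affected
        used = used_functions.get(adv.component, set())
        if adv.vulnerable_functions & used:  # overlap means we are exposed
            alerts.append(adv.advisory_id)
    return alerts


# Invented example data.
dependencies = {"imagelib": "1.2.0", "netlib": "3.1.4"}
used_functions = {"imagelib": {"resize"}, "netlib": {"fetch"}}
advisories = [
    Advisory("EXAMPLE-0001", "imagelib", frozenset({"1.2.0"}),
             frozenset({"parse_exif"})),
    Advisory("EXAMPLE-0002", "netlib", frozenset({"3.1.4"}),
             frozenset({"fetch"})),
]
print(relevant_alerts(dependencies, used_functions, advisories))
```

Here the first advisory matches the installed version but is suppressed because the vulnerable function is never called, which is the kind of noise reduction that cuts down on alerts developers would otherwise ignore.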
But, you know, AI in security, software security is an emerging field. I actually have done one keynote in that space in, I think February and I have two more this year talking about just that topic, the union of cybersecurity, software engineering and AI, and what are people in the world doing about that. And my group does work in that, in that space, that three, the union of those three areas and some researchers in Singapore do as well. And then in Luxembourg, in Europe. Like, there's a lot more that can be done.
So, whichever you are dealing with, whether it be log files or transactions or whatever it may be, you simply do not have any alternative other than to use some sort of machine learning techniques. So it's basically, as you say, a necessity. Otherwise, those log files are only used to try and identify the cause of an event after the event, but not necessarily to prevent it?
Right. Yeah. So, the classic case would be, you know, there's a movie star in the hospital and it gets into the news. Who found that out? Look in the log files. So, in one case you can find out the nurse or whoever it was that found it, and that's in the log files, but you only looked because it was in the newspaper. And then the worst case is when you try to look and you can't even find out who looked, you know? There's no trace of who looked. If there's, like I said, terabytes and terabytes of log files and not a learning algorithm to identify, then it's a waste. As an analogy, credit card companies. If anything, we get more calls like to just say, “Is this transaction really you?”
That's really what we have to get to. It’s that, you know, we're being proactive like that. Or something that we've proposed in the past too is that if someone looks at something, someone gets notified. There's kind of a chain. Like the chain of command where, you know, if someone looked at a patient record who has never been on the floor of that patient, never performed any service for that patient, someone should be notified right away. I have a system that we've developed in a classroom, a medical record system. It's a fictitious system, and it's quite large. In that system, you could click on a button and find out anyone who's ever looked at your patient record. So, then if you're working in a hospital and you know, if I look at some of these records I'm not supposed to, they can find my name out by just pushing a button.
Maybe you won't do it. So, it's kind of a deterrent. If we have those types of actions in software, just knowing that there's transparency, you know? Similar to a security camera. Like, I'll give a fictitious example: you have a supply room in your work and you go in on a Saturday, you could collect your child's school supplies from the supply room. But if there's a security camera, you might not do it. But if there's not a security camera, no one's around, sure, I might do it. It's similar with medical records or other applications. If you think you can do it and no one will ever know, or even if it's logged, no one looks at the logs anyway, then it's not a deterrent. But if I could be easily found out, then I might not do it in the first place.
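The transparency mechanism described in this answer, a button that shows everyone who has looked at a record plus an immediate notification when the viewer has no care relationship with the patient, can be sketched as follows. The `AccessTracker` class and its data are hypothetical and are not the classroom system Williams mentions; the sketch only assumes that a mapping from each patient to their care team is available.

```python
from collections import defaultdict


class AccessTracker:
    """Tracks record views and flags viewers outside the patient's care team."""

    def __init__(self, care_teams: dict):
        self._care_teams = care_teams     # patient_id -> set of staff IDs
        self._views = defaultdict(list)   # patient_id -> staff IDs, in order
        self.notifications = []           # (patient_id, staff_id) pairs to review

    def record_view(self, staff_id: str, patient_id: str) -> None:
        self._views[patient_id].append(staff_id)
        # Proactive step: flag the view right away instead of waiting for
        # someone to search the logs after an incident.
        if staff_id not in self._care_teams.get(patient_id, set()):
            self.notifications.append((patient_id, staff_id))

    def who_viewed(self, patient_id: str) -> list:
        """The 'push a button' query: distinct viewers in first-seen order."""
        return list(dict.fromkeys(self._views.get(patient_id, [])))


tracker = AccessTracker({"patient-1": {"nurse_a", "doctor_b"}})
tracker.record_view("nurse_a", "patient-1")
tracker.record_view("clerk_z", "patient-1")   # not on the care team
tracker.record_view("nurse_a", "patient-1")
print(tracker.who_viewed("patient-1"))
print(tracker.notifications)
```

The deterrent effect comes from `who_viewed` being cheap and visible: staff know that any lookup can be surfaced with a single query rather than buried in terabytes of logs.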
Good point. Laurie Williams, you've published hundreds of papers and the like, but do you publish on social media? And if so, where and how can our listeners follow what you're reading and writing?
Yeah. Mostly Twitter from a professional standpoint.
Twitter. Yeah. And your handle is “lauriewilliams”?
That's right. Yep.
Thank you so much for your time today, Laurie. It has been a pleasure talking to you about cybersecurity. It's a big topic and you've written an overwhelming amount of material. It was very interesting researching today's topic. It's a fascinating area. And I think something that we don't talk about enough. So, thank you for coming and joining us on the program today.
All right. My pleasure.
- OpenSAMM by OWASP
- NIST (National Institute of Standards and Technology)
- The Security Principles of Saltzer and Schroeder
- Establishing a baseline for measuring advancement in the science of security: an analysis of the 2015 IEEE security & privacy proceedings co-authored by Laurie Williams
- Laurie Williams on Twitter