All security specialists are currently fully occupied with the Log4j vulnerability, which can potentially have a huge business impact. We will use this as an opportunity for our blog post to show you the difference between the traditional security approach and shifting security left. As most of you already know, the latest series of vulnerabilities with worldwide impact was found in the Log4j Java logging component. The Log4Shell vulnerability has a base CVSS score of 10, while the temporal score changes every day as the exploit matures, so the state of remediation also swings from bad to good and vice versa (CVE-2021-44228, CVE-2021-45046, CVE-2021-45105). We are sure that no one wants a vulnerability like this on their systems. Engineers all around the world are working hard to discover which of the systems they own are vulnerable to Log4Shell. We can name just some of the tools currently in use to discover this kind of vulnerability: network scanners (OpenVAS, Nmap, Nexpose, Nessus, etc.), image scanners (Trivy, Anchore, Xray, Clair, etc.), dependency scanners, and all the other scanners that are currently available. This vulnerability clearly shows how something that is not considered a critical component can become the vector for “bad guys” to ruin our Christmas, our rest time, etc. Therefore, we should improve our vulnerability management program. Let’s do it; we will show you how.
To illustrate what we mentioned above, let’s take one example. Imagine we are building a cloud-native service that runs thousands of microservices based on 50 different base images, with components written in multiple languages. The service uses a queuing system and key/value cloud-native databases, and exposes itself over a web GUI and an API. To assess vulnerabilities, the first step would definitely be to understand the bill of software used.
In this example, vulnerabilities can be part of each component, starting with badly hardened or unhardened platforms like the master/worker host OS, or “etcd” and API components with badly configured authentication and authorization or simply with old packages. Further down there are Kubernetes resources: ConfigMaps that store sensitive information, secrets that are actually static (e.g. hardcoded) in some GitLab repository, pods without resource limits or able to communicate with resources outside their scope (no Network or Pod Policies), service accounts with too many permissions (typically we call them privileged accounts, which means, among other things, that they can act outside their namespace), or simply vulnerable images with old packages or embedded secrets, or even images altered by a malicious actor. All of the above should give you a better picture of how difficult it is to notice and discover all the vulnerabilities that may exist even in this relatively small example.
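As a minimal sketch of two of the safeguards mentioned above (resource limits and network restriction), the following Kubernetes manifests show a pod with explicit limits and a default-deny NetworkPolicy; all names and the image reference are hypothetical placeholders:

```yaml
# Hypothetical pod spec: resource requests/limits and a pinned image tag
apiVersion: v1
kind: Pod
metadata:
  name: example-service          # placeholder name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # pinned tag, not "latest"
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        limits:
          cpu: "500m"
          memory: "256Mi"
---
# Default-deny NetworkPolicy: blocks all ingress and egress for pods in the
# namespace; explicit allow-rules are then added per service as needed
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```

With a default-deny policy in place, a compromised pod cannot freely reach resources outside its intended scope.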
Applying traditional security approach
We can try to apply the traditional way of security testing within the SDLC (Software Development Lifecycle) to this Simple Service Example: wait for the service team to deliver their service and only then engage the security team to execute vulnerability scans and do penetration testing of the service, or just ask the compliance department to do a simple compliance check.
This way of working brings one advantage (later you will see and can judge whether it really is an “advantage”): security does not block developers from delivering their product very fast and without obstacles coming from the security side.
But then, before the system goes online (into production), the responsible business/product owner will ask the security team to do their job (in order to secure the service, and therefore the business itself: reputation risk, etc.), which means asking them to help secure the service. Most probably the security team will use their tools or do a simple code review in order to find vulnerabilities, starting with the [OWASP Top Ten](https://owasp.org/www-project-top-ten/). The most obvious category, A04:2021-Insecure Design, which maps 40 CWEs (Common Weakness Enumerations), could be the first pain point. The problem with “Insecure Design” is that these vulnerabilities are the most expensive ones: there are no compensating controls that can easily reduce their risks. Usually the only approach would be to go back to the architects and developer teams to change what they have delivered. The result would be high costs from service delays, additional effort from architects and developers, and so on; in the extreme, it could mean that the whole service/product has to be redesigned (without ever reaching production, and therefore without making money). So, as we asked at the top of this part: do you still think this approach is an “advantage”? We will leave this question open for your personal consideration.
Looking at the Simple Service Example above: if, in the design phase, we had not planned for load and had not gone with a message queue system in the first place (which in our example we hopefully did), we could face availability and scaling problems. Or our developers made a mistake and hardcoded sensitive information like credentials into the code (CWE-798), for instance into the CI/CD pipeline in clear text. Or it turned out that our developers made the mistake of using vulnerable libraries or methods in application dependencies. We could go deeper and deeper and list, even for this “Simple Service Example”, tons of possible misconfigurations. We think these “security violations” are enough for your imagination, and that you understand what we want to tell you.
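To illustrate the hardcoded-credentials case (CWE-798) in a pipeline: instead of writing a password into the CI definition in clear text, it can be referenced as a masked, protected CI/CD variable defined in the project settings. The job and variable names below are hypothetical:

```yaml
# BAD (do not do this): secret committed to the repository in clear text
#   test_job:
#     script:
#       - ./run-tests --db-password "s3cr3t"
#
# BETTER: DB_PASSWORD is a masked, protected CI/CD variable configured in
# the GitLab project settings, so it never lives in the repository
test_job:
  stage: test
  script:
    - ./run-tests --db-password "$DB_PASSWORD"
```

The secret then never appears in the repository history and is masked in job logs.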
In the traditional security testing approach, many challenges, as previously described, come from the technology side, but there is another huge one. In the “Waterfall” way of working, where the release cycle is long, the traditional approach might work. But the “Agile” way of working introduces short times between releases (Agile Manifesto: “Deliver working software frequently (weeks rather than months)”). It would then not be possible to deliver the service/function on time while applying traditional testing procedures with decent test coverage (including security).
Shift Left approach
In this part, we describe an approach that can be summed up as “secure software/service by design”. As a necessary basis for everything to be secure, potential sources of security threats have to be discovered as soon as possible. This can be achieved through continuous, appropriate testing, which means testing should be introduced in earlier stages of the project; in other words, “shift testing left”.
Shift testing Left has two goals:
- Early discovery of system/program malfunctions or vulnerabilities
- Better, more detailed test coverage, and faster testing
The final result of the “Shift Left” approach is cost reduction and higher quality. Combined with the Agile way of working, it leads to a more competitive product on the market, by placing most of the tests in the earliest stages of the SDLC (Plan, Design, Develop & Build).
For our previously described “Simple Service Example”, this means every single building block needs to be tested as much as possible and reasonable, in the earliest possible stages of the SDLC.
Transition from traditional to Shift Left testing
Like any other transition, the move from traditional to “Shift Left” testing requires familiarity with a few concepts:
- Automation becomes essential
- Test-Driven Development is embraced
- Developers need to be aware of security practices
- The security team takes on a more consulting role
- The security team acts as an enabler
- Security must be part of the project from the earliest stages (even from the design phase)
Automation is an essential concept in modern service development and deployment. It starts with automated test executions in CI pipelines; successful tests are the enablers for automated deployment. As said earlier, high “velocity” in modern service delivery is no longer possible without automation. To achieve it while keeping a high level of security, the Infrastructure as Code (IaC) concept is introduced. We have to admit that, by nature, every piece of code can be vulnerable, so IaC brings one more layer that needs to be tested against vulnerabilities. But the final result is a securely and reliably deployed service.
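As a minimal sketch of this idea (stage, job names, and make targets are hypothetical), a GitLab CI pipeline where deployment only runs after the automated tests have passed could look like:

```yaml
stages:
  - test
  - deploy

unit_tests:
  stage: test
  script:
    - make test          # hypothetical target running the test suite

deploy_production:
  stage: deploy
  script:
    - make deploy        # hypothetical target applying the IaC manifests
  # the deploy stage only starts if every job in the "test" stage succeeded
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
```

The successful test stage is literally the gate that enables the automated deployment.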
Test-Driven Development (TDD) is a concept that usually comes together with Agile, where decisions to change things can be introduced during the development phase. To track potential defects introduced with new changes, everything needs to be properly tested (e.g. unit tests, integration tests, security tests, etc.). One note on this topic: when talking about security, negative tests are as important as positive ones. We want to be sure that the services behave in an expected way in case of unwanted events (e.g. how the service/system reacts to invalid inputs).
Security principles now need to be deeply integrated into each step of the SDLC; developers need to be security-aware and apply security best practices from the beginning of their service journey. To help them with this task, the security team needs to act more as consultants, creating security awareness among architects and developers, and teaching or advising them on what needs to be done, and how to do it securely.
The security team still exists, but mostly acts as an enabler. They do not only provide best practices and recommendations to developers; the security team is also responsible for building tools that help developers deliver their code securely with minimal additional effort on their side. For example, they may integrate security tools that can discover vulnerabilities in code repositories globally (e.g. crawl all repositories for crypto material), or manage global scanners for any other software used within the company (e.g. Docker images, packages, etc.). A further field of responsibility for the security team is driving specific tasks like penetration testing, running security incident processes, and so on. As you can see, security has a really large and complex set of tasks on its plate.
Now let’s try to apply this approach to our “Simple Service Example”. Each of the developers is building their service in the language they agreed on; some use GoLang and others Python. For each merge, they need to be sure their code is safe against code smells, bugs, and vulnerabilities using a white-box approach. For that purpose, they will use one of the SAST (Static Application Security Testing) tools, which does not necessarily have to be the same for both programming languages. Then they want to detect whether any secrets are hardcoded into the services; for this they can use “GitLeaks”. The code is then integrated as a microservice into Docker containers, so logically the Docker containers themselves should now be checked for vulnerabilities, using tools such as “Anchore” or “Trivy”. The infrastructure is delivered as code, using Kubernetes manifests, and it should be ensured that everything is created by the book: no “latest” Docker tags, limits in place, and no secrets exposed in ConfigMaps. For this use case, “Popeye” or “kube-bench” may work. In the end, the system as a whole needs to be checked after deployment against well-known scenarios using a DAST (Dynamic Application Security Testing) tool, e.g. “ZAP”. So there are many tools that need to be in use (we really like this DevSecOps list), and without automation this is not feasible from a time perspective. The security team is here to help, by providing proper security tools in line with this Agile approach and with automation. In addition, security can prepare code snippets for these tools, for better and easier integration into developers’ CI pipelines. For example, a code snippet can look like this:
```yaml
---
scan_appscan:
  stage: test
  image: registry.safescarf.in.pan-net.eu/zap2docker-weekly:latest
  variables:
    TARGET: "https://vulnerable.host.foo.bar/"
    SAFESCARF_HOST: "customer.safescarf.pan-net.cloud"
    SAFESCARF_ENG_ID: "xx"
  before_script:
    - printf "$CI_PROJECT_DIR\n"
    - mkdir -p /zap/wrk
  script:
    - /zap/zap-baseline.py -t $TARGET -m 10 -a -x report.xml || true
    - ci-connector upload-scan --scanner "ZAP Scan" -e $SAFESCARF_ENG_ID -f /zap/wrk/report.xml
```
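A similar job could be prepared for image scanning, for example with “Trivy”. The sketch below is an assumption, not an officially supported snippet: the image reference is a placeholder, and it reuses the same `ci-connector upload-scan` call shown above to ship the report:

```yaml
scan_image:
  stage: test
  image: aquasec/trivy:latest
  variables:
    IMAGE: "registry.example.com/app:1.0.0"   # placeholder image reference
  script:
    # scan the image and write a JSON report; don't fail the job on findings
    - trivy image --format json -o trivy-report.json "$IMAGE" || true
    # upload the report to the vulnerability management platform
    - ci-connector upload-scan --scanner "Trivy Scan" -e $SAFESCARF_ENG_ID -f trivy-report.json
```

The same pattern (scan, produce a report artifact, upload it) applies to any of the scanners mentioned above.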
And at the end, we will let the security “Red Team” try to break into the service using one of the penetration testing approaches (white, gray, or black box).
So, we can see there is no “silver bullet” for detecting vulnerabilities or mere non-compliance with requirements. Each of the mentioned tools produces a report in its own specific format, represented as a CI artifact or as an item injected into a tool dashboard. To conclude, it is hard to track vulnerabilities when the results live in different locations and different formats. It would therefore be useful to have one platform that integrates results from different sources into a single pane of glass and supports the vulnerability management process in an automated way. This is the reason why we developed SafeSCARF.
Why SafeSCARF was developed
We in TDI (Technology Delivery International, part of the DT group) were facing challenges with the lack of visibility into vulnerabilities inside the products we deliver. We were missing assurance that our products are well tested against vulnerabilities that could expose our systems to malicious actors. Our security department offers our developers many security tools that can be integrated into CI tools and pipelines to support automated processes. But the results still end up in multiple places, making it hard for product owners and developers to track findings and apply a proper vulnerability management process. So we had a few requirements for the product we wanted to build:
- All components need to be FOSS (Free and Open-Source Software).
- Support for user segregation with multiple levels of access to its resources.
- REST API interfaces dedicated to full automation (user management, product structure management, etc.) and injection of results.
- Support for most of the scanners we use, extendable with additional scanners.
- Integration with SSO (Single Sign-on) provider.
- Notification engine.
- Vulnerability Management capabilities.
- Reporting Capabilities.
- Scanner result injection needs to be possible using CI scripts.
Those were our requirements for our SaaS (Software as a Service) product named SafeSCARF (as the name of the product implies: SCARF = Scan, Collect, Analyse, Report, Fix). SafeSCARF is a system that integrates different CI scanners, by providing their Docker images together with GitLab CI code snippets, with the SafeSCARF Vulnerability Management Platform (SVMP). All building blocks of this system are FOSS, including SVMP, which is built on top of OWASP DefectDojo. We decided on DefectDojo because it is a very mature platform, and together with other community members we keep trying to make it even better.
To simplify SafeSCARF product consumption we integrated two more components:
- The Cloud Portal, where customers can order the SafeSCARF service, manage users and products, and assign their rights (based on roles).
- The CI-Connector, which acts as an API client to simplify the process of uploading data.
SafeSCARF currently officially supports only a few scanners, which we consider the very basics from a security perspective. Every SafeSCARF customer can use any other scanner from the list of tools supported by the DefectDojo community (but without support from us).