Testing jsPolicies with Godog
Writing acceptance tests as scenarios and running them on a cluster
Recently, I learned how to enforce some security best practices on a Kubernetes cluster by deploying jsPolicies. Meanwhile, I’ve also been looking into ways to write BDD tests more intuitively, and I figured, why not combine them 🤷
That’s how I arrived at…
jsPolicy
jsPolicy is a policy engine for Kubernetes that allows you to write policies in JavaScript or TypeScript.
jsPolicy is a great choice for developers who want an easy-to-use and powerful policy engine for their Kubernetes clusters, with key features including:
- Ease of Use: JavaScript, a widely used programming language, serves as the foundation for jsPolicy, making it accessible to developers familiar with the language.
- Power and Flexibility: jsPolicy offers a versatile framework capable of handling a wide range of policies, ranging from basic access control to complex anomaly detection.
- Integration and Customization: jsPolicy seamlessly integrates with your existing Kubernetes ecosystem, while enabling customization to suit unique organizational requirements.
jsPolicies are essentially snippets of JavaScript code that specify the conditions under which an admission request should be allowed or denied. These policies are compiled into a highly optimized bundle of JavaScript code, which is loaded at the start of the policy’s execution flow; the JavaScript within the bundle is executed, and a decision is made based on the defined conditions. The result of this decision (allow or deny) is then communicated back to the Kubernetes API server.
The part that I find the most impressive about jsPolicies is the speed of their execution. I don’t understand the optimizations made well enough to explain them here, but feel free to check out their architecture page for more details.
Godog
Godog is the official Cucumber BDD framework for Golang. It merges specification and test documentation into one cohesive whole, using Gherkin-formatted scenarios in the Given, When, Then format.
Going over BDD briefly, Behavior-Driven Development is a software development methodology that emphasizes collaboration between developers, testers, and non-technical stakeholders by using natural language descriptions of software features and scenarios. Godog allows developers to write and execute BDD-style tests in Go, providing a structure for writing and automating tests based on Gherkin syntax.
In this story, we’ll be writing acceptance tests for a jsPolicy using Godog, so let’s get right into it (don’t worry, I’ll go over exactly what we will be testing in the next sections).
Creating your own jsPolicy
jsPolicy has an SDK project that can be used as a template to write custom policies. However, there is an issue with one of the helper functions that makes the policies block your Kubernetes apply requests even when your resources are compliant with them.
And since I was going to create a custom policy and clean up the template anyway, I copied the code, cleaned it up, fixed the issue, and created my own repo to store it. Feel free to clone the repo, we’re going to go over it next.
Alright, let’s break down the src directory structure first.
- lib: contains policy logic grouped by Kubernetes resource type (or a field on its definition), which can be shared by different policies that target the same resource. Our policy validates pod containers and metadata, hence the folders containers and metadata.
- policies: each folder here represents a policy (one policy.yaml plus one index.ts as the entrypoint). In our case, we have one policy, validate-pods, that validates pods.
- util: shared utility functions (small helpers that are not policy specific).
- index.ts: specifies all the functions that are exported and intended to be reused by consumers of this package.
Our policy is supposed to validate a pod before it’s created on the cluster and block it if any of the following is true:
- it will be deployed to the kube-system or default namespace
- it has a container without a memory limit set
- it has a container that is set to run as the root user
We’ll now go over the implementation of these three rules.
Starting with validateNamespace.ts: this is where we put the logic that checks which namespace the pod is being created in and blocks it if it’s kube-system or default.
const disallowedNamespaces = ["default", "kube-system"]

export function validateNamespace(request: V1AdmissionRequest): string[] {
  const object = request.object as {metadata: V1ObjectMeta};
  const errors: string[] = [];
  if (disallowedNamespaces.includes(object?.metadata?.namespace!)) {
    errors.push("Field metadata.namespace is not allowed to be: " + disallowedNamespaces.join(" | "))
  }
  return errors
}
One thing to note is that we are using errors.push here because all failed validations are returned as elements of an array, and the policy only lets the pod creation go ahead if the array is empty.
Coming to validateResources.ts, the logic here checks whether each container and init-container in the pod has a limit set for the memory resource.
const errors: string[] = [];
podSpec?.containers?.forEach((container: V1Container, index: number) => {
  if (!(container.resources?.limits) || !('memory' in container.resources.limits)) {
    errors.push("Memory limit not defined for spec.containers[" + index + "].")
  }
})
podSpec?.initContainers?.forEach((initContainer: V1Container, index: number) => {
  if (!(initContainer.resources?.limits) || !('memory' in initContainer.resources.limits)) {
    errors.push("Memory limit not defined for spec.initContainers[" + index + "].")
  }
})
return errors;
Here again, each container and init-container that fails validation adds its own entry to the errors array.
As for blocking pods with containers that run as a privileged user, this logic is implemented in validateCapabilities.ts. Every container’s and init-container’s securityContext field is checked, and an error is pushed for each one that fails the check.
const errors: string[] = [];
podSpec?.containers?.forEach((container: V1Container, index: number) => {
  if (container.securityContext?.capabilities?.add?.length ||
      container.securityContext?.runAsUser == 0 ||
      container.securityContext?.privileged) {
    errors.push("Field spec.containers[" + index + "].securityContext is not allowed.")
  }
})
podSpec?.initContainers?.forEach((initContainer: V1Container, index: number) => {
  if (initContainer.securityContext?.capabilities?.add?.length ||
      initContainer.securityContext?.runAsUser == 0 ||
      initContainer.securityContext?.privileged) {
    errors.push("Field spec.initContainers[" + index + "].securityContext is not allowed.")
  }
})
return errors;
Putting the jsPolicy to test
Alright, now that our policy code is ready, let’s compile it to generate the corresponding JsPolicy and JsPolicyBundle resources.
npm install
npm run compile
Note: you might run into the following error while running npm run compile:
library: 'DSO support routines',
reason: 'could not load the shared library',
code: 'ERR_OSSL_DSO_COULD_NOT_LOAD_THE_SHARED_LIBRARY'
To resolve this issue, you need to instruct Node.js to use the legacy OpenSSL provider instead of the newer default provider.
export NODE_OPTIONS=--openssl-legacy-provider
Successful compilation of the policy generates two files, validate-pods.bundle.yaml and validate-pods.yaml, in the policies folder.
A JsPolicy resource defines the configuration for a policy; in our case, it applies to pods, and the operations it triggers on are Create and Update. A JsPolicyBundle resource, on the other hand, stores the pre-compiled JavaScript code for the policy logic. It’s used to improve performance and avoid recompiling the policy for every request.
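For reference, the JsPolicy resource for a case like ours might look roughly like the sketch below (based on the jsPolicy documentation; the metadata name and exact fields here are illustrative — the real values come from the generated validate-pods.yaml):

```yaml
apiVersion: policy.jspolicy.com/v1beta1
kind: JsPolicy
metadata:
  name: validate-pods.example.com  # illustrative name
spec:
  operations: ["CREATE", "UPDATE"]
  resources: ["pods"]
```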
The reason the JsPolicy and JsPolicyBundle resources have the same name is that as soon as the JsPolicy exists in the cluster, it becomes active immediately and starts looking for the corresponding JsPolicyBundle. If that JsPolicyBundle does not exist yet, all requests covered by this policy will fail. This is why it is recommended to apply the JsPolicyBundle first.
Alright now, before we can deploy these policies, we need a cluster with jsPolicy installed on it. Follow these steps:
- Install kubectl following the instructions provided here.
- Install minikube and create a cluster locally.
minikube start --memory 6144 --cpus 3 --kubernetes-version=v1.24.13
Note: you can adjust the memory and CPU allocated to the minikube cluster. I ran this tutorial with Kubernetes v1.24.13, but you can try testing with newer versions too.
- Install helm as mentioned here.
- Now we can install jsPolicy on the local cluster using helm.
helm install jspolicy jspolicy -n jspolicy --create-namespace --repo https://charts.loft.sh
If all the steps above were successful, you should have a pod running in the jspolicy namespace of your minikube cluster.
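To verify, you can list the pods in that namespace (assuming kubectl is pointed at your minikube cluster):

```shell
kubectl get pods -n jspolicy
```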
Time to deploy the policy we created. Policy compilation should have put our JsPolicy and JsPolicyBundle YAML files in the policies folder. We can deploy them using kubectl.
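Following the recommendation above to apply the bundle before the policy, the deployment could look like this (assuming the generated files sit in the policies folder as described):

```shell
kubectl apply -f policies/validate-pods.bundle.yaml
kubectl apply -f policies/validate-pods.yaml
```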
Now, let’s try to create a pod with a container that runs as the root user, has no memory limit set, and is to be created in the kube-system namespace.
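A manifest for such a pod might look roughly like this sketch (the pod name and image are illustrative; it violates all three rules at once):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-compliant-pod   # illustrative name
  namespace: kube-system    # disallowed namespace
spec:
  containers:
    - name: app
      image: nginx          # illustrative image; no memory limit set
      securityContext:
        runAsUser: 0        # runs as root
```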
And as expected, our policy blocks its creation and logs all the errors/violations.
Setting up Godog tests
Now that we have our policy deployed on the cluster, we can create acceptance tests to make sure it works as expected given different pod definitions. To do so, we will write the tests in Go using the Godog framework.
I’ll be using the following repo for reference, so feel free to clone it.
The package that stores all the code interacting with our minikube cluster is k8s; it has all the functions needed to perform operations like pod creation, deletion, configuration, etc. You can try out the main.go file as well, but that’s only for playing around with these functions and is not relevant to our tests.
Creating acceptance tests for your jsPolicy
As mentioned earlier, with Godog you write acceptance-test scenarios in a feature file; in our case, these describe how the policy is supposed to behave when pods with different configurations are deployed on the cluster. Let’s start with a simple scenario where we try to deploy a pod that is compliant with our policy.
Feature: jsPolicies
  In order to deploy a pod
  As a developer
  I need to configure the pod to be compliant with all the jsPolicies

  Rules:
  - Pods without memory limit set are not allowed
  - Pods with containers that run as the root user are not allowed
  - Pods cannot be deployed in kube-system and default namespace

  Scenario: Allow deployment of a compliant pod
    Given I create a pod manifest with name compliant-pod in namespace acceptance-tests that is compliant with all policies enforced
    When I apply the pod manifest
    Then the pod should be created in the namespace
Decoding the feature and scenario written above in Gherkin syntax:
- The name of our feature is jsPolicies, and following that we list the rules of the policy.
- Our first scenario deploys a pod definition compliant with all the rules we just mentioned, named compliant-pod, in the namespace acceptance-tests.
The repo has a pod yaml definition, compliant.yaml, that we can use to test this scenario.
Now we get to the part where we convert the Given, When, and Then steps of our scenario into actual functions, which are implemented in the main_test.go file.
Note: For context, you can go over the Godog docs that cover step definitions in depth.
Godog’s ScenarioContext uses regex to match steps, extract parameters, and call a function with those parameters. Take the following as an example:
sc.Given(`^I create a pod manifest with name ([a-z0-9][-a-z0-9]*[a-z0-9]?) in namespace ([a-z0-9][-a-z0-9]*[a-z0-9]?) that is compliant with all policies enforced$`, createPodCompliantWithAllPolicies)
This will match the Given step in our scenario above, extract the name compliant-pod and the namespace acceptance-tests, and call the function createPodCompliantWithAllPolicies with these two as arguments.
Coming to the createPodCompliantWithAllPolicies function:
k8sPod, err := k8s.LoadPodFromYaml("./k8s/pods/compliant.yaml", k8sPodName, k8sPodNamespace)
if err != nil {
    return ctx, err
}
return context.WithValue(ctx, pod{}, k8sPod), nil
Here we create a Pod object from the YAML file where we have stored the definition of a pod compliant with our jsPolicy. If you check the implementation of the LoadPodFromYaml function, you’ll see we set the name and namespace of the pod to k8sPodName and k8sPodNamespace respectively.
One more thing to note is how we will be storing states for a particular scenario:
type podApplyError struct{}
type podName struct{}
type podNamespace struct{}
type pod struct{}
We use custom types to store the state of the pod for a given scenario, and these states are passed along using context. The return statement of the createPodCompliantWithAllPolicies function does exactly that, storing the pod definition in the context with the pod struct as the key.
Coming to the When step:
sc.When(`^I apply the pod manifest$`, applyPodManifest)
This will match the applyPodManifest function, which in turn creates the pod in the cluster with the name and namespace specified in the Given step.
func applyPodManifest(ctx context.Context) (context.Context, error) {
    k8sPod, ok := ctx.Value(pod{}).(*coreV1.Pod)
    if !ok {
        return ctx, errors.New("there is no pod set to apply")
    }
    err := k8s.ApplyPodManifest(k8sPod)
    ctx = context.WithValue(ctx, podName{}, k8sPod.GetName())
    ctx = context.WithValue(ctx, podNamespace{}, k8sPod.GetNamespace())
    ctx = context.WithValue(ctx, podApplyError{}, err)
    return ctx, nil
}
In the createPodCompliantWithAllPolicies function, we stored the pod definition in the context under the pod{} key. In applyPodManifest, we retrieve that same pod definition from the context and then call the ApplyPodManifest function in the k8s package to create it. The function also stores the name and namespace of the pod, and the error returned when creating it, in the context.
Now, the final step in our scenario, Then, would match the following function call:
sc.Then(`^the pod should be created in the namespace$`, podShouldBeInNamespace)
The podShouldBeInNamespace function extracts the name and namespace of the pod from the context and checks whether a pod by that name exists in the namespace.
func podShouldBeInNamespace(ctx context.Context) (context.Context, error) {
    k8sPodName, ok := ctx.Value(podName{}).(string)
    if !ok {
        return ctx, errors.New("pod name is not set")
    }
    k8sPodNamespace, ok := ctx.Value(podNamespace{}).(string)
    if !ok {
        return ctx, errors.New("pod namespace is not set")
    }
    namespacePodNames, err := k8s.GetPodsInNamespace(k8sPodNamespace)
    if err != nil {
        return ctx, err
    }
    for _, namespacePodName := range namespacePodNames {
        if strings.Compare(k8sPodName, namespacePodName) == 0 {
            return ctx, nil
        }
    }
    return ctx, errors.New("pod not found in the namespace")
}
If you checked out the godogs-k8s-acceptance repo, you’ll see the full implementation of all the steps in the InitializeScenario function; however, the scenario we are going to test only involves the Given, When, and Then steps we went over above.
Testing jsPolicy behavior with a compliant pod
Make sure the features/jspolicies.feature file has only one scenario: Allow deployment of a compliant pod.
Create a namespace acceptance-tests in the cluster.
kubectl create ns acceptance-tests
Time to run: go test -v
And there it is. The scenario ran as expected and our policy did not block creation of the compliant pod.
Note: The After step cleans up any pod created while executing a scenario, once the scenario has been evaluated. That’s why you won’t be able to see the pod on the cluster. However, if you remove the After step, you can see the compliant pod in the acceptance-tests namespace.
Testing jsPolicy behavior with non-compliant pods
Earlier, we went over the rules based on which our policy will block the creation of a pod. Let’s now write scenarios where we try to deploy pods that break these rules, to test whether our policy behaves as expected.
Scenario: Block deployment of a pod with a container running as root
  Given I create a pod manifest with name bad-pod-1 in namespace acceptance-tests that is compliant with all policies enforced
  And I set the user of container indexed 0 as 0 i.e., root
  When I apply the pod manifest
  Then the pod should be blocked with error:
    """
    - Field spec.containers[0].securityContext is not allowed.
    """

Scenario: Block deployment of a pod with a container without memory limit set
  Given I create a pod manifest with name bad-pod-2 in namespace acceptance-tests that is compliant with all policies enforced
  And I remove the memory limit of container indexed 0
  When I apply the pod manifest
  Then the pod should be blocked with error:
    """
    - Memory limit not defined for spec.containers[0]
    """

Scenario: Block deployment of a pod in the namespace kube-system
  Given I create a pod manifest with name bad-pod-2 in namespace acceptance-tests that is compliant with all policies enforced
  And I set the pod namespace as kube-system
  When I apply the pod manifest
  Then the pod should be blocked with error:
    """
    - Field metadata.namespace is not allowed to be: default | kube-system
    """

Scenario: Block deployment of a pod with a container with user set to root and with memory limit removed in the namespace kube-system
  Given I create a pod manifest with name bad-pod-2 in namespace acceptance-tests that is compliant with all policies enforced
  And I set the user of container indexed 0 as 0 i.e., root
  And I remove the memory limit of container indexed 0
  And I set the pod namespace as kube-system
  When I apply the pod manifest
  Then the pod should be blocked with error:
    """
    - Field metadata.namespace is not allowed to be: default | kube-system
    - Field spec.containers[0].securityContext is not allowed.
    - Memory limit not defined for spec.containers[0]
    """
If you go over the main_test.go file, you’ll see we already have steps defined with regexes that match the new steps added to test these negative scenarios. One more thing to note here is that we expect the error messages pushed by our policies to be in the error response returned when we try to create these non-compliant pods. Moreover, if a pod breaks multiple rules, we expect an error message corresponding to each broken rule. This goes back to how we pushed each error message into the errors array and then returned an error string combining them, separated by newlines.
Alright, time to run the tests for all of these scenarios.
As you can see, our policy worked perfectly, blocking every non-compliant pod deployment for the right reasons.
This was a very basic example of testing a jsPolicy, and of using Godog to do so. I haven’t gone over the full codebase in the godogs-k8s-acceptance repo, as this story is already way longer than I anticipated, but feel free to post comments regarding any questions you might have.
The takeaway here for me is how these scenarios make testing so intuitive, for anyone, regardless of whether you’re familiar with the abstracted code or not. And as for jsPolicies, as far as I know this is the most “conventional code”-oriented way of enforcing policies on a cluster, which gives you the freedom to perform very complex operations in a policy before deciding whether to allow a resource on the cluster.
And with that, if you have any questions or suggestions, drop them in the comments section as well.
See you on another post 🖖