Leveraging software development practices for policy authoring
In this blog post I want to walk through a challenge I had to work through when attempting to use Open Policy Agent (OPA) for API authorization with my RESTful API application. I like working with new technologies on personal projects, and using OPA in my development environment was easy and rather enjoyable! OPA is an incredible piece of open source technology, with great docs and many resources in a growing community centered around it.
The challenge for me became how to use OPA responsibly for API authorization when deploying my application to production. How could I use OPA’s policies in tandem with the application’s APIs?
OPA Authorization Use Cases
Before diving in, let’s start with a short review: OPA is very generic and multipurpose, and thus might be useful for a wide range of use cases. To name a few:
OPA can be used to authorize requests made to an application. This use case is one that is close to the actual business logic of the application itself. Because of that, policies are apt to change more often than in other OPA use cases. The most common way to integrate OPA for API authorization is through a middleware or a proxy server (Envoy, Traefik). OPA’s vast ecosystem provides integrations with many application languages and services.
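As a minimal sketch of the middleware approach, an application could ask OPA for a decision through its REST Data API. This assumes OPA is running locally on its default port 8181 and that a policy exposes an `allow` rule under a package path such as `carstore/authz`. The package path and the input field names are illustrative assumptions, not taken from the example repository.

```python
import json
import urllib.request

# Assumed values for illustration: a local OPA on its default port and a
# hypothetical policy path "carstore/authz/allow".
OPA_URL = "http://localhost:8181/v1/data/carstore/authz/allow"


def build_input(user, method, path):
    """Shape the request attributes into an OPA input document."""
    # Field names are hypothetical; they must match what the policy reads.
    return {"input": {"user": user, "method": method,
                      "path": path.strip("/").split("/")}}


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return json.load(resp)


def is_allowed(user, method, path, transport=post_json):
    """Return True only if OPA explicitly answers {"result": true}."""
    response = transport(OPA_URL, build_input(user, method, path))
    # An undefined decision comes back without a "result" key: deny by default.
    return response.get("result") is True
```

A web framework middleware would call `is_allowed(...)` before dispatching to the handler and return a 403 when it is false. The `transport` parameter exists only so the HTTP call can be swapped out in tests.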
Kubernetes admission control
Admission control for Kubernetes is another form of access enforcement where, leveraging OPA, you can validate and mutate objects during create, update, and delete operations. In a managed interface, developers can validate deployments on a K8s cluster. OPA also has an extension called Gatekeeper for this specific use case.
Validating infrastructure pre-deployment helps companies shift left on cloud security. You can use OPA to run policies over Terraform configuration files or plans. Some widely used tools for infrastructure-as-code analysis (such as Terrascan) use OPA internally.
There are many more examples ranging from data authorization to Kafka topic authorization that are covered through OPA docs and their online ecosystem. In this article I chose to focus on the first use case mentioned — API authorization, the specific needs for this use case, and how this complicated challenge could be tackled using OPA.
API authorization using OPA
When we’re managing a deployment into a production environment, extra precautionary steps need to be taken and additional measures need to be put in place to ensure the process is managed securely. To name a few:
Audit: In a proper production environment, policy updates must be audited just like every other critical change in the system. Imagine trying to investigate a problem in the authorization system with no way of determining which policies (or, more specifically, which version of your policies) were in effect during the incident.
Review and approval: When dealing with production environments, changing policies without a formal review process is never a good idea. There is no better example of policy as code than OPA and its purpose-built policy language, Rego. Just as no changes to the application’s code are deployed to production without being reviewed (and tested), the same should apply to policies.
Blue-green deployments and rollback: Another important part of mature, well-managed CI/CD pipelines is building robust measures to prevent errors. Usually, when deploying a new version of the application, the new version is first brought up fully functional and tested before the containers running the old version are drained. A proper production pipeline should also be able to roll back changes that were deployed but later found to be faulty. Imagine deploying your app with a brand new endpoint you’ve just added. That also requires an access policy change to address who is allowed to use this new endpoint, and under what conditions. Maybe you’re also deprecating an older version of this endpoint which is no longer supported. For these cases, a backwards-compatible policy might be a solution, but even then, what happens if you roll back your application version due to an error? Having the newer policy in effect might leave non-existent or unready endpoints open. In each of these cases you’ll need your policy version to match the current version of the application.
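To make the audit and version-matching requirements concrete, here is a small sketch that reads the policy bundle revision out of an OPA decision log event and compares it with the version of the running app. It assumes the bundle revision was set to the app's git SHA at build time. The bundle name `authz` and the simplified event shape are assumptions for illustration.

```python
def bundle_revisions(decision_event):
    """Map bundle name -> revision for a single decision log event."""
    # Decision log events carry a "bundles" object when bundles are active.
    bundles = decision_event.get("bundles", {})
    return {name: info.get("revision") for name, info in bundles.items()}


def policy_matches_app(decision_event, app_git_sha, bundle_name="authz"):
    """True when the decision was made by the bundle built from app_git_sha."""
    # "authz" is a hypothetical bundle name; use the one from your OPA config.
    return bundle_revisions(decision_event).get(bundle_name) == app_git_sha


# A trimmed-down, made-up event for illustration only.
event = {
    "decision_id": "example-id",
    "bundles": {"authz": {"revision": "3f2a9c1"}},
    "path": "carstore/authz/allow",
    "result": True,
}
```

With events like this in your log store, an incident review can answer "which policy version made this decision?" directly, and a mismatch between the two values flags a rollback that left the wrong policy in place.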
This brings us to one of the main points to take into consideration while designing an authorization solution using OPA: How do the policies get deployed to OPA?
Policy deployment methods
There are a few methods of pushing policies into OPA to use:
- Policy API — A simple REST API call to upload/update/delete a policy. In this case OPA is the server receiving a request from a client to update its policies.
- Bundle service API — A polling mechanism for policy and data bundles. In this case OPA is the client polling a remote server for policy and data updates. Bundles have revisions, and OPA includes the revision of the active bundle on its status and decision log reports.
- Some advanced users might also build custom Docker images of OPA that already contain the required policies (using a script to load the policies on server startup through the policy API, or maybe taking advantage of the bundle persistence feature). This could be useful for use cases where connectivity to a bundle server isn’t possible, so policy deployment can’t be done dynamically, and OPA has to be shipped loaded with the relevant policies.
Each one of these methods has its advantages and disadvantages.
The policy API method for updating policies can be used for simple scenarios of policy development and testing, or production environments that very rarely require policy updates. However, out-of-the-box this method doesn’t meet all of the requirements we’ve mentioned above for properly managing deployments to production environments.
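For reference, a policy API update is a single HTTP `PUT` carrying the Rego source. The sketch below only builds the request, so it can be inspected without a running OPA. The base URL, the policy id `carstore_authz`, and the tiny policy text are assumptions for illustration.

```python
import urllib.request


def build_policy_upload(opa_base_url, policy_id, rego_source):
    """Build (but do not send) a PUT request for OPA's policy API."""
    return urllib.request.Request(
        f"{opa_base_url}/v1/policies/{policy_id}",
        data=rego_source.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="PUT",
    )


# Hypothetical policy text, used here only as a payload.
REGO = "package carstore.authz\n\ndefault allow := false\n"

request = build_policy_upload("http://localhost:8181", "carstore_authz", REGO)
# Sending it would be: urllib.request.urlopen(request)
```

The call itself records nothing about who changed the policy or which application version it belongs to, which is why the audit, review, and rollback requirements have to be built around it.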
The bundle service method is appealing since it provides us with a centralized mechanism for deploying our policies. For production environment use cases, we want to leverage the bundle service functionality and integrate it with our CI/CD in order to effortlessly and efficiently manage application code development and deployment.
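As a rough sketch of what bundle polling looks like from OPA's side, the function below generates a configuration that points OPA at a bundle stored in S3. The bucket URL, the bundle name `authz`, the resource path, and the polling delays are all hypothetical values. OPA reads its configuration from a YAML file, and since JSON is a subset of YAML the generated JSON can be written to that file as-is.

```python
import json


def opa_bundle_config(bucket_url, bundle_resource, min_delay=60, max_delay=120):
    """Return an OPA configuration dict that polls one bundle from S3."""
    return {
        "services": {
            "s3": {
                "url": bucket_url,
                # Sign requests with AWS credentials taken from the environment.
                "credentials": {"s3_signing": {"environment_credentials": {}}},
            }
        },
        "bundles": {
            # "authz" is a hypothetical bundle name.
            "authz": {
                "service": "s3",
                "resource": bundle_resource,
                "polling": {
                    "min_delay_seconds": min_delay,
                    "max_delay_seconds": max_delay,
                },
            }
        },
    }


# Hypothetical bucket and object key, for illustration only.
config = opa_bundle_config(
    "https://example-bucket.s3.amazonaws.com", "bundles/3f2a9c1.tar.gz"
)
config_text = json.dumps(config, indent=2)
```

Because the resource path is a parameter, a CI/CD job can render a different configuration for each application version, which is what ties a deployment to one specific bundle.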
An Authorization Solution
The solution I developed is composed of two important parts:
- Policy code should be maintained together with the application code, in the same repository.
- Application deployment and policy deployment should be coupled together.
Let’s go over the development process step by step with our example repository.
We have a git repository containing our application (in the example repo, a car store). Side by side with the application code, it contains the Rego policies used for authorizing its endpoints.
opa-bundle-version-example
├── app
│   └── server.py
└── policies
    ├── carstore_authz.rego
    ├── carstore_authz_test.rego
    └── data.json
Let’s say David is a developer who wants to add a new endpoint to the application. He creates a feature branch, and on this branch he:
1. Makes changes to the app & app tests to include the new endpoint
see the example repo closed PR
2. Makes changes to his policy & policy tests to comply with the new endpoint
see the example repo closed PR
After finishing, David will open a PR, and the changes he made to the app will be reviewed and tested together with the matching changes in the policy.
Later on, David fixes the review comments he got, and his PR is approved. Once all app tests and policy tests run and pass, he can merge his PR to deploy the new feature he was working on. Upon merging his PR, CI/CD runs:
1. The policy and the data are bundled using OPA’s build command, with the -r flag to set the bundle revision to the git SHA of the current commit:
see the example repo makefile
2. The policy bundle is tested using OPA’s
see the example repo makefile
3. The policy bundle is uploaded to an S3 bucket under the name
4. The new version of the app is deployed, together with OPA. OPA is configured to download the uploaded matching bundle:
see the example repo docker compose
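The build and test steps above can be sketched as a small helper that assembles the `opa` command lines. The `policies` directory comes from the repository layout shown earlier, while the output file name is an assumption. The test command here runs against the policy directory rather than the built bundle, which may differ from what the example repo's makefile does.

```python
import subprocess


def current_git_sha():
    """Return the SHA of the checked-out commit (requires a git checkout)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def opa_build_command(revision, policy_dir="policies", output="bundle.tar.gz"):
    """Command that bundles policy and data, stamping the given revision."""
    # -b treats policy_dir as a bundle root, -r sets the bundle revision,
    # -o names the output file (the name here is an assumption).
    return ["opa", "build", "-b", policy_dir, "-r", revision, "-o", output]


def opa_test_command(policy_dir="policies"):
    """Command that runs the Rego tests found under policy_dir."""
    return ["opa", "test", policy_dir, "-v"]


def run_pipeline_steps():
    """Build and test in order; raises CalledProcessError on failure."""
    sha = current_git_sha()
    subprocess.run(opa_test_command(), check=True)
    subprocess.run(opa_build_command(sha), check=True)
    return sha
```

Calling `run_pipeline_steps()` from CI stops the pipeline as soon as a step fails, so a bundle that does not pass its tests never reaches the upload step.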
Throughout the CI/CD pipeline, this process is reviewed, audited, and could be rolled back if needed. Coupling the deployment of the app with the matching policy version helps us avoid the headaches and problems mentioned earlier. For example, if the application deployment fails since the new version contains a bug, the already-deployed old version of the application is still running with OPA using the old version of the policy bundle.
What if I have to update my policies without redeploying?
Sometimes, redeploying your app is costly, both in terms of time and resources. The methods above fit a scenario where a single app uses a single OPA agent, which can be deployed as a sidecar for each instance of the app, or as a service that serves many instances of the same application. Obviously, if you use OPA for admission control in your Kubernetes cluster, and you want to enforce the latest new CIS rules, you don’t need to re-deploy your entire cluster. For use cases where multiple applications use the same OPA agent, or some frequent changes need to be made to the policies regardless of the app’s version, the previous methods could be combined with OPA’s Discovery service API (and possibly utilizing multiple policy bundles) and smarter CI/CD to achieve a more fluid policy deployment pipeline.
The missing piece
In many cases, API authorization decisions are made from external data sources that need to be pushed or pulled into OPA. In the example repository the employees and their roles are kept in a JSON file together with the policies themselves. This of course is not a production-ready solution: no company will open a PR and redeploy its services when a new employee joins, or keep private data in git repositories.
You can read the docs to learn more about how to use external data with OPA.
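One of the options described there is pushing data into OPA through its REST Data API. As a hedged sketch, the helper below builds such a request. The base URL, the `employees` path, and the shape of the document are assumptions for illustration and do not mirror the example repo's data.json.

```python
import json
import urllib.request


def build_data_push(opa_base_url, data_path, document):
    """Build (but do not send) a PUT that creates or overwrites a data document."""
    return urllib.request.Request(
        f"{opa_base_url}/v1/data/{data_path}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical employee-to-role mapping, e.g. synced from an HR system.
employees = {"alice": {"role": "manager"}, "bob": {"role": "salesperson"}}

push = build_data_push("http://localhost:8181", "employees", employees)
# Sending it would be: urllib.request.urlopen(push)
```

If OPA also loads a bundle, keep in mind that a bundle owns the paths listed in its manifest roots, so pushed data should live under a path outside those roots.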
To conclude, this is what you should keep in mind when using OPA for API authorization:
- Your policies should be versioned together with your app. Git is the perfect tool.
- Policy as code means designing, testing and reviewing your policies, just like you do with your code.
- Coupling app deployment and policy deployment protects you from a mismatch between the app version you run, and the policies used to authorize its APIs.
Hopefully the concepts and methods I described here can help you when planning, designing and using OPA for API authorization. If you’re interested in further experimenting and reading, make sure to run the local example included in the repository, check out the links throughout the article, or feel free to reach out to me.