OPA on Sunbird

Introduction

  • OPA stands for Open Policy Agent.

  • OPA is a CNCF graduated project, a general-purpose engine used for evaluating policies.

  • We use OPA on Sunbird to perform API checks such as role checks, organisation checks, user checks, etc.

  • For more details on OPA, see https://www.openpolicyagent.org/docs/latest/

  • Envoy is a high-performance proxy that receives the traffic in the pod and then makes an API call to OPA to check whether the request should be allowed.

  • The OPA team has written an OPA-Envoy plugin that makes this authorization check possible in Envoy.

  • For more details, take a look at this GitHub repo - https://github.com/open-policy-agent/opa-envoy-plugin

  • To understand Envoy proxy and its configuration, refer to the Envoy proxy docs - https://www.envoyproxy.io/
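
The allow-or-deny decision described above can be pictured with a small Python sketch. This is not Rego and not Sunbird's actual policy code. The `Request` fields, the API paths, and the role names are hypothetical and chosen only for illustration. The sketch assumes a deny-by-default decision that passes only when the role check, the organisation check, and the user check all succeed.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical, simplified view of the attributes sent for a decision."""
    path: str
    user_id: str
    roles: set = field(default_factory=set)
    org_id: str = ""


# Hypothetical rule table: API path -> roles allowed to call it.
REQUIRED_ROLES = {
    "/content/v1/publish": {"CONTENT_REVIEWER"},
    "/user/v1/block": {"ORG_ADMIN"},
}


def is_allowed(request: Request, target_org_id: str) -> bool:
    """Return True only if every check passes; anything else is denied."""
    required = REQUIRED_ROLES.get(request.path)
    if required is None:
        return False  # unknown API: deny by default
    role_ok = bool(request.roles & required)      # role check
    org_ok = request.org_id == target_org_id      # organisation check
    user_ok = bool(request.user_id)               # user check
    return role_ok and org_ok and user_ok


reviewer = Request(
    path="/content/v1/publish",
    user_id="user-1",
    roles={"CONTENT_REVIEWER"},
    org_id="org-1",
)
print(is_allowed(reviewer, "org-1"))
print(is_allowed(reviewer, "org-2"))
```

In the real setup the proxy only forwards the request attributes and enforces the answer. The decision itself lives in the Rego policies.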

OPA Implementation Details

  • For the details and design doc on how OPA is implemented, see this doc (go through the slide deck mentioned on that page first, and then the contents of the page) - RBAC on Sunbird

  • For details on how to run OPA locally, perform policy checks, etc., see this video - https://youtu.be/RwuRH-KRxic

OPA Policy Files

  • OPA policies are located here - https://github.com/project-sunbird/sunbird-devops/tree/master/kubernetes/opa

  • The policies are segregated by service name (same as the Helm chart names).

  • The common folder holds all the common files and checks that can be used by any other policies.

  • main.rego is the entry point for the code.

  • policies.rego is the service-specific policy file.

  • common.rego is the common set of policy functions and checks.

  • policies_test.rego is the test case file, where every policy has a corresponding test case.

OPA Test Case and Coverage

  • Every OPA policy should have a test case written.

  • A wrong policy can cause issues: it can block requests that should have been allowed, or allow requests that should have been blocked.

  • Every test case should pass (whether it covers an allow condition or a deny condition).

  • Code coverage should be 100% for both the service policies and the common policies.

  • Without 100% code coverage and a test case for every API, the deployment will fail.

  • Test cases and code coverage should be implemented for both types of policies:

    • Common policies

    • Service policies

  • Care should be taken when making changes to the common policies, as they affect all other services.

  • Most issues will be identified during the test case or code coverage phase, but this is not guaranteed. A user must also validate the policy themselves to be fully sure of the policy code.

  • The Ansible role for OPA test cases and code coverage is located here - https://github.com/project-sunbird/sunbird-devops/tree/master/kubernetes/ansible/roles/opa-test-coverage

  • We can enable or disable the mandatory test case and code coverage checks using these variables (not recommended, as disabling them can cause issues in case of an incorrect policy) - https://github.com/project-sunbird/sunbird-devops/blob/master/ansible/roles/stack-sunbird/defaults/main.yml#L1037
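
As an illustration of the 100% coverage rule, the following Python sketch checks a coverage report and lists the files that fall short. It is not the logic of the Ansible role above. It assumes a JSON report of the shape produced by `opa test --coverage --format=json`, with a top-level `coverage` percentage and a `files` map whose entries each carry their own `coverage` value; verify these field names against your OPA version. The file names in the sample are hypothetical.

```python
import json


def coverage_gaps(report: dict, required: float = 100.0) -> list:
    """Return the files whose coverage is below `required`, sorted by name."""
    files = report.get("files", {})
    return sorted(
        name for name, stats in files.items()
        if stats.get("coverage", 0.0) < required
    )


def gate(report_json: str, required: float = 100.0) -> bool:
    """Pass only if the overall figure and every file meet `required`."""
    report = json.loads(report_json)
    overall = report.get("coverage", 0.0)
    return overall >= required and not coverage_gaps(report, required)


# Hypothetical report: the common policies are fully covered,
# one service policy file is not.
sample = json.dumps({
    "files": {
        "common/common.rego": {"coverage": 100},
        "hypothetical-service/policies.rego": {"coverage": 92.5},
    },
    "coverage": 96.0,
})

print(gate(sample))
print(coverage_gaps(json.loads(sample)))
```

Checking each file as well as the overall figure makes the failing policy file visible directly, which is useful when a change to the common policies affects several services.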

OPA Enabled Services

Deployment Jobs, Ansible Roles, Helm Charts

OPA and Envoy Metrics

OPA Custom Logs Plugin

OPA Logs Searching

  • In Graylog, all OPA-specific log fields start with the opa_ keyword.

  • For example, to search for all requests that were rejected by OPA, use the Graylog query opa_result_allowed: false and then drill down further using other fields.

  • The log also contains information about the API endpoint, the roles of the user, the user ID, other headers and payloads, token validity, etc.

  • By using this information and cross-checking it against the policy definition, we can easily identify why a request was rejected.
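
The same filtering can be sketched in Python for logs that have been exported as a list of records. Only the `opa_result_allowed` field comes from the query above. The other field names in the sample (`opa_path`, `opa_roles`, `source`) are hypothetical placeholders, so substitute the fields that actually appear in your Graylog messages.

```python
def rejected_requests(records):
    """Keep only records that OPA rejected, reduced to their opa_ fields."""
    rejected = []
    for record in records:
        # Mirrors the Graylog query `opa_result_allowed: false`.
        if record.get("opa_result_allowed") is False:
            rejected.append(
                {key: value for key, value in record.items()
                 if key.startswith("opa_")}
            )
    return rejected


# Hypothetical exported log records.
logs = [
    {"opa_result_allowed": True, "opa_path": "/a", "source": "pod-1"},
    {"opa_result_allowed": False, "opa_path": "/b",
     "opa_roles": ["PUBLIC"], "source": "pod-2"},
    {"message": "unrelated log line", "source": "pod-3"},
]

for entry in rejected_requests(logs):
    print(entry)
```

Each remaining record can then be compared with the policy definition for that API to see which check failed.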
