Access Control - from scary to simple with one open-source tool

TL;DR: Open source code promotes innovation, collaboration, and transparency in software development. Open Policy Agent (OPA) is a powerful policy engine that helps manage access control in microservice architectures, and has been adopted by major companies such as Netflix. However, OPA alone may not be sufficient for some policy enforcement needs. OPAL, an open-source project, complements OPA by addressing these limitations and is already being used by companies like Tesla, Cisco, and the NBA.

Why Open Source?

Open source code plays a crucial role in the world of software development. It allows developers from around the globe to access, modify, and contribute to the source code of various projects, resulting in a powerful knowledge-sharing ecosystem. By making software freely available, open source promotes the rapid evolution of technology and ensures that solutions are accessible to a broader audience. This collaborative approach helps accelerate problem-solving, encourages the development of better and more reliable software, and drives the growth of vibrant developer communities. Furthermore, open source enables individuals and organizations to build upon existing solutions, reducing the duplication of effort, lowering entry barriers, and empowering even small teams to create powerful, feature-rich applications.

Access Control is Hard

Developing access control is not an easy task, especially considering today’s microservice architectures: many authorization points are required by design, and changing requirements and regulations from various departments constantly challenge the solution. Moreover, even a minor bug in this sensitive layer can have a devastating effect on your organization.

To build an access control layer successfully, it helps to follow some best practices, one of which is to decouple policy from code. This can be done with a policy engine such as Open Policy Agent (OPA).

OPA: Separating policy from code

OPA, an open-source, general-purpose policy engine, was introduced several years ago to rethink how permissions are enforced throughout the stack. Since then, OPA has become a CNCF graduated project (alongside Kubernetes) and has been battle-tested by major players like Goldman Sachs and Netflix.

OPA is built for performance: it keeps the policy and the data it needs for evaluating rules in memory, and it can run as a sidecar next to every microservice, so authorization queries avoid a round trip to a remote service.

OPA’s policy rules are written in Rego, a high-level declarative language inspired by Datalog. Here is an example:

# Deny by default; access is granted only when every condition below holds.
default allow = false

allow = true {
    input.type == "writer"
    input.action == "create"
    input.resource == "article"
}

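At a high level, OPA works as follows: a service sends OPA a query with a JSON input, OPA evaluates that input against its loaded policy and data, and returns a decision that the service then enforces.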
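To make the query flow concrete, here is a minimal Python sketch of a service asking OPA for a decision through OPA's REST Data API (a POST to /v1/data/<path> with the input wrapped in an "input" field). The address http://localhost:8181 and the policy path app/authz/allow are assumptions made for illustration; the Rego example above does not declare a package, so the real path depends on how your policy is organized.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: OPA listening on its default port


def build_query(policy_path: str, input_doc: dict) -> urllib.request.Request:
    """Build a POST request for OPA's Data API: POST /v1/data/<policy_path>."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(response_body: bytes) -> bool:
    """Read the decision; a missing "result" means undefined, treated here as deny."""
    return json.loads(response_body).get("result", False) is True


def is_allowed(input_doc: dict, policy_path: str = "app/authz/allow") -> bool:
    """Send the query to a running OPA instance and return its decision.

    The default policy_path is a hypothetical name, not taken from the article.
    """
    request = build_query(policy_path, input_doc)
    with urllib.request.urlopen(request, timeout=2) as response:
        return parse_decision(response.read())
```

With an OPA instance running and the policy loaded under that path, is_allowed({"type": "writer", "action": "create", "resource": "article"}) would return the value of the allow rule. Treating an undefined result as a denial keeps the caller fail-closed.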

OPA is powerful and solves many policy-related problems. For some policies, however, OPA on its own isn’t enough, and a complementary solution is needed.

Taking OPA to the next level

While OPA is a strong candidate for all permissions-related needs, there are cases where OPA might not be enough.

For example, let’s say our policy requires a user to be a subscriber in order to see parts of our site. To make a policy decision in this case, we need to get the user’s subscription status from our billing service (e.g. PayPal or Stripe) and to know whenever it changes: a new subscriber expects immediate access to our site, and a churned user should lose access on the spot.

To make this data available to OPA, we would need to develop a push mechanism that updates OPA in real time, which requires additional planning and development.
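As a rough illustration of what such a push mechanism involves, the Python sketch below turns a billing event into a write against OPA's Data API (PUT /v1/data/<path> creates or overwrites the document at that path). The event shape ("user_id", "status"), the subscriptions/<user_id> data path, and the OPA address are all assumptions made for this example, not part of any billing provider's API.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable

OPA_URL = "http://localhost:8181"  # assumption: a single local OPA instance

Sender = Callable[[urllib.request.Request], None]


def subscription_update(user_id: str, active: bool) -> urllib.request.Request:
    """Build PUT /v1/data/subscriptions/<user_id> with the new status."""
    safe_id = urllib.parse.quote(user_id, safe="")
    body = json.dumps({"active": active}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/subscriptions/{safe_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request: urllib.request.Request) -> None:
    """Default sender: perform the HTTP call against a running OPA."""
    with urllib.request.urlopen(request, timeout=2):
        pass


def on_billing_event(event: dict, sender: Sender = send) -> None:
    """Translate a (hypothetical) billing webhook event into an OPA data update."""
    active = event["status"] == "active"
    sender(subscription_update(event["user_id"], active))
```

A policy could then read the pushed value, for instance as data.subscriptions[input.user].active. Even this small sketch leaves open questions such as retries, ordering of events, and what happens when OPA restarts and loses its in-memory data, which is why the mechanism takes real planning.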

There are additional cases for which OPA isn’t enough:

When your policy relies on multiple data sources for decisions, you need to use the bundle API, which is not straightforward to implement.
When you run multiple OPA instances, you need to build your own tooling to keep them in sync.

OPA was adopted by giants like Netflix, and they faced the same challenges. Let’s take a look at how they overcame their issues.

Learning from giants — Netflix & OPA

Netflix implemented three things in its OPA-based architecture that made the solution effective:

Running Multiple OPA Instances Simultaneously: Netflix’s architecture incorporates numerous OPA instances, each operating locally on a service’s pod and equipped with its own authorization agent. By doing so, they significantly reduce network latency when querying OPA, ensuring a smoother and more efficient authorization process.
Real-time Policy and Data Synchronization: Netflix’s authorization system relies on certain inputs not included in OPA’s data, such as employee-department associations. To address this, they developed a microservice called the “Aggregator” that gathers data from various sources and keeps it up to date in real time. Additionally, they implemented another microservice, the “Distributor,” which ensures that all OPA instances remain synchronized and current.
Empowering Developers with Self-service Policy Creation: As a large organization, Netflix needed a scalable solution that granted developers the autonomy to define policies for their respective services. To achieve this, they designed the “Policy Portal” — a user-friendly website that automatically generates Rego code based on user-defined rules. Once a policy is created or updated, it’s versioned and stored in a dedicated policy database.
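The Aggregator and Distributor roles described above can be sketched in a few lines of Python. This is not Netflix's implementation, which was never published; it is a simplified illustration in which every name (aggregate, distribute, the source and push callables) is hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Source = Callable[[], dict]          # returns one fragment of the data document
Push = Callable[[str, dict], None]   # (opa_url, document) -> None


def aggregate(sources: Dict[str, Source]) -> dict:
    """Aggregator: collect every source under its own top-level key."""
    return {name: fetch() for name, fetch in sources.items()}


def distribute(document: dict, opa_urls: Iterable[str], push: Push) -> List[str]:
    """Distributor: push the same document to every OPA instance.

    Returns the URLs that could not be reached so the caller can retry them.
    """
    failed = []
    for url in opa_urls:
        try:
            push(url, document)
        except OSError:  # covers connection errors and timeouts
            failed.append(url)
    return failed
```

In a real deployment the push callable would write to each OPA instance over HTTP, and the failed list would feed a retry loop. Keeping the two roles separate means data sources can be added without touching the fan-out logic.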

There is one issue with this solution — Netflix never made it public.

That’s what OPAL is here for!

Open Policy Administration Layer (OPAL) is an open-source project created to help manage OPA, essentially offering an open-source take on the kind of solution Netflix built. OPAL is an administration layer for OPA agents that keeps them updated in real time with data and policy changes. Since its creation, OPAL has been adopted by major players such as Tesla, Cisco, Palo Alto Networks, multiple banks, and even the NBA as the administration layer for their OPA agents.

Two OPAL features in particular have made it popular with these organizations:

The ability to monitor a specific policy repository (hosted on GitHub, GitLab, or Bitbucket, for example) for updates, either through a webhook or by checking for changes every few seconds. OPAL thereby acts as a Policy Administration Point (PAP) that sends policy changes to OPA, the Policy Decision Point (PDP) in this case, so OPA is always up to date.
The ability to track any relevant data source (API, database, or external service) for updates via a REST API, and to fetch the up-to-date data into OPA.
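The first feature, detecting policy changes by polling, boils down to comparing the repository's latest revision with the last one seen. The Python sketch below illustrates only that idea. It is not OPAL's code or API, and PolicyWatcher, get_head, and publish are hypothetical names.

```python
from typing import Callable, Optional


class PolicyWatcher:
    """Publish a policy update whenever the repository's head revision changes."""

    def __init__(self, get_head: Callable[[], str], publish: Callable[[str], None]):
        self._get_head = get_head      # returns the current revision, e.g. a commit hash
        self._publish = publish        # notifies the policy agents about a new revision
        self._last_seen: Optional[str] = None

    def check(self) -> bool:
        """Compare the current head with the last one seen; publish if it differs."""
        head = self._get_head()
        if head == self._last_seen:
            return False
        self._last_seen = head
        self._publish(head)
        return True
```

A timer could call check() every few seconds for polling, while a webhook handler could call it as soon as the Git host reports a push. The first call always publishes, which doubles as the initial policy load.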

Designing access control and managing permissions for cloud-native or microservice-based products can be a complex endeavor. The inherent nature of distributed applications and microservices demands numerous authorization points, while evolving requirements from different departments continuously put pressure on every authorization solution. Furthermore, even the slightest oversight in the authorization layer may result in severe consequences for your application, leading to security vulnerabilities and potential privacy or compliance issues. Therefore, it is vital to diligently approach and handle these challenges to ensure robust and secure applications.

You can join OPAL’s Slack community to chat with other developers who use OPAL in their projects, contribute to the open-source project, or follow OPAL on Twitter for the latest news and updates.