Exposing APIs outside of an Apiman Organisation: A SQL-based implicit permissions system
Apiman's explicit permissions system provides fine-grained permissions for users who are a member of an Apiman organisation. It uses a combination of RBAC and ABAC to determine whether a given user can perform an action.
For example, if you wanted to gain access to APIs in Organisation MyOrganisation, your account would need to be added to the organisation and granted (at least) read permissions.
A Black Parrot Labs client needed an enhancement to Apiman allowing an API published in Apiman to be discoverable outside of its home organisation, without consumers being granted explicit permissions (i.e. consumer is not a member of the owning organisation, and has no org-specific permissions). This is analogous to a GitHub public repository, where non-members of an organisation can read a repo, but they can't modify it - i.e. they implicitly have read permissions.
This change would allow Apiman to be more versatile, working in complex and multi-tenant deployments where organisations need to offer APIs without knowing who the consumer is, just that the consumer is a particular type/class of user. The feature also enables 'anonymous browsing' of APIs that is required by the Apiman developer portal.
The implicit permissions was dubbed discoverability.
Example use cases:
- Multi-tenancy: many different organisations may cohabit on the same Apiman Manager instance, yet want to allow non-members to subscribe to a curated subset of APIs with an organisation.
- Developer portal: allow APIs to be exposed to dev portal users without exposing everything.
- Expose a subset of APIs in an organisation to external consumers without needing to know them a-priori.
- Distinguish between different categories of Apiman user and offer different APIs.
- Allow API managers to expose an API to non-members (read-only).
- API consumers can discover and subscribe to these externally visible APIs.
- Allow distinction between non-members who are logged-in vs anonymous.
- Way of segregating "full users" from "devportal-only" users.
- Minimise incompatible changes to the database model.
- Avoid large numbers of invasive changes to an existing stable codebase.
- Must work seamlessly with the existing explicit permissions model.
We undertook an extensive requirements gathering and problem definition phase, with a proposal created in the upstream community. The client created a series of detailed use-cases that acted as acceptance criteria.
Project design & implementation
As you can imagine, the explicit permissions system is used rather extensively throughout the codebase, so a careful design was required in order to avoid polluting the codebase with huge numbers of disruptive changes.
BPL designed and implemented a SQL-based solution using materialised views (or triggers), in combination with optimised local caching, in order to provide the necessary guards.
The Simple Part
A new discoverability attribute was added to the APIVersion and ApiPlan tables and corresponding JPA models. This represents whether the entity how visible that entity is to API Consumers outside of the organisation the API resides in.
The discoverability levels as of implementation are:
- ORG_MEMBERS: The existing default behaviour: only users with explicit org view permissions can see the entity.
- FULL_PLATFORM_MEMBERS: Only users registered & logged in with the IDM solution that are full members of the platform.
- PORTAL: All logged-in users with a valid account, anonymous users, and devportal users can see the API. It is explicitly indicating that the API should be exposed in the developer portal.
The following entities in the Apiman data model needed to be changed to reflect this new attribute:
- ApiVersion: In Apiman all APIs are versioned, and you can have a public API, which allows users to sign up without an API key. We needed the discoverability here to cover the public API case.
- APIPlan: This is slightly misleadingly named in the Apiman data model; it's actually a combination of Api Version and Plan Version. This is what you are creating/deleting when you set an API Version as being available via a particular Plan Version in the 'plans' tab in the Apiman Manager. So, this is your standard API with an API key.
Architectural Complexities and Permission Inheritance
There were some significant architectural factors to consider:
Firstly, let's say we grant the FULL_PLATFORM_MEMBERS permission to a particular APIPlan (i.e. so it can be seen by non-members), that implies that we must grant read-only access to a specific ApiVersion, PlanVersion, and various other policy configuration and organisation metadata elsewhere in the data model.
Apiman has a wide variety of API endpoints that allows users to efficiently resolve resources without pulling in the whole universe; when a user invokes an endpoint we need to know whether the user should be allowed to access the endpoint, and which resources should be returned in any given query.
As an example, if we have an API MappingApi/1.0 and we make it available under the Gold/1.0 plan with FULL_PLATFORM_MEMBERS visibility and Gold/2.0 with ORG_MEMBERS (default). If we later invoke the Apiman API anonymously and query for all plan versions in this organisation, it must return the Gold/1.0 plan, because it has implicitly been elevated to FULL_PLATFORM_MEMBERS by virtue of our first action; conversely, it must not return Gold/2.0 as that remains as members only.
As we can see from the figure above, the effect of this is that we have a variety of related entities that are inheriting the discoverability permission set on an ApiPlan (or a public ApiVersion, not shown). These same principles apply for various other similar interactions.
There's also precedence to consider, as we can have multiple different APIs using the same plans, and a related entity must always inherit the 'most visible' permission from all permissions set (i.e. PORTAL wins over ORG_MEMBERS).
As you can see, even in this simplified explanation there's quite a bit going on, and we really don't want to be doing multiple complex queries for every single request to figure out which implicit permission might apply. The number of requests and performance impact would be completely unsustainable.
So, how did we solve this thorny issue? It is a tale of two parts:
Database efficiency with materialised views and triggers
Rather than frequently issuing complex SQL queries (with the additional undesirable effect of complicating Apiman's search-related queries with several additional joins), BPL opted to take an approach that takes best advantage of some helpful features offered by RDBMS: materialised views (sort of, as we shall explain).
A materialised view can be thought of as the result of a query in the database, except rather than the result being ephemeral and disappearing once it is complete, it is stored in disk read-only and made available to subsequent queries. The results are periodically refreshed based upon the configuration that the developer sets when creating the materialised view -- with a major issue being that this varies considerably by vendor in terms of the types of refresh functionality and guarantees that are available.
Materialised views work particularly well for read-heavy situations where writes are infrequent but expensive. A classic application includes Top-K, where you list the top K elements of a dataset in a given period of time. For example, most popular post in the last 24 hours.
This is ideal for our implicit permissions system, because it's rare to change the discoverability attribute, but it is read extremely frequently. The materialised view we created for Apiman contains a variety of information to immediately determine whether a given entity is viewable by the user making a query (org id, api id, api version, plan id, plan version, discoverability, etc) -- no need to do any secondary queries!
In databases such as Oracle, the materialised view can be configured to automatically update itself when a commit is performed in the tables that feed into the materialised view; timers and manual refresh are also available. Unfortunately, in most other open source databases, including PostgreSQL at time of writing, only manual refresh logic is available; this won't work for our use-case as we need commit-based updates (eventual consistency is potentially OK, but we still want it to be fairly tight).
Thus, for Oracle, we have a baked-in mat view solution, but in the other databases we need something that can provide the same functionality. Our saviour comes in the form of database triggers.
A new table, discoverability was added, with the same fields as the materialized views. Then for each database we support, a hand-crafted series of triggers were created; these update the discoverability table every time an ApiVersion or ApiPlan row is created, updated or deleted, ensuring it remains in sync and performing the related queries to pull in any related data into this single view.
Application efficiency with PATRICA-trie cache
We have the database side resolved, but there's still the issue of what to do when a request is made to Apiman and how to resolve the correct permissions, plus ensure we are correctly filtering various search and listing endpoints.
Multiple such checks can be made in a single request, so we need a discoverability cache in the Apiman Manager to avoid hammering the database with lots of small requests. The cache's lifetime is tied to the request, and hence it expires once the response is returned (and each request has its own unique cache). Typically, any additional requests spawned within the same request-response cycle will be with to the same organisation.
To avoid pulling in the entire discoverability table on each request we only query discoverability data for the organisation we're interacting with.
Secondly, we want to very quickly resolve whether a given entity is visible to the user, which must respect the precedence rules outlined earlier. We took advantage of Apiman's well-designed hierarchal structure to create keys that can be queried exactly or by substring to provide any entity's discoverability.
We used a PATRICIA trie (also known as a PATRICIA tree) to store this information. A PATRICIA trie is a variant of a radix tree, which very compactly stores string keys with common prefixes, and allows for fast resolution of substrings.
When a new entry is inserted into the cache, it compares the discoverability and always stores the 'most visible' version.
We'll cover the more in-depth technical details of this in a future blog post, for those inclined.
Tying It All Together
There were now two major areas to tie together:
Firstly, whenever user issues a request to the Apiman Manager API, we needed to check whether the user had the correct explicit or implicit permissions:
Secondly, whenever the user did a search, we needed to loop these new read permissions into the search query to avoid returning entries the user should not see.
The original Apiman codebase used the antiquated JPA Criteria API to generate those dynamic queries. For those who have not experienced this API, it is very difficult to implement with and even worse to maintain. It turned out to be an excellent opportunity to refactor the dynamic code to the modern (and excellent) Blaze-Persistence - which functionally equivalent but completely overhauls the API into something comprehensible.
The 'discoverability' system has proven successful in the customer's downstream platform, covering all of the use-cases presented, and receiving positive feedback from their clients.
We were delighted that this relatively complex project was achieved with minimal invasiveness; it required a depth of knowledge of Apiman's internals that others do not have. It was a pleasure working with the client to refine and test the concept, as they solidified their requirements and produced concrete use-cases; the few bugs that cropped up during testing were trivial to fix.
This was contributed to the Apiman upstream and is released in Apiman 3.0.0.Final.