...

Success metrics

Goal

Metric

Assumptions

Milestones

...

Requirements

Requirement

User Story

Importance

Jira Issue

Notes

1

HIGH

2

 

 

 

 

 

...

An Authentication Server contains an Identity Realm and provides a UI for interactive logins, an API for service logins, and an API for managing security tokens. An Authentication Server is deployed for each cloud shard and ensures that the Personally Identifiable Information (PII) for managed users remains within the shard’s home AWS region. This is important for shards, such as GDPR, that have regulatory requirements for data location and lifecycle. A second type of Authentication Server is designed to integrate with a customer’s Active Directory (AD) or LDAP domain, where the PII is stored and managed externally while the tokens are still managed locally.

Cloud Manager

The Cloud Manager provides myriad services related to the operation and management of the OpenMethods cloud including playing key roles in user authentication and permission management. The Cloud Manager contains an Identity Realm that is responsible for handling logins and security tokens for Cloud Team Admin users and shared infrastructure systems. It provides similar APIs for service logins and security tokens as the Authentication Server, but the interactive login is handled by . It also provides the UI for all interactive logins for access to the Cloud Manager Admin Console instead of being an independent UI.

In addition to the identity information for some users, the Cloud Manager is responsible for storing and providing access to extended information for all users regardless of home identity realm. Cloud Manager includes a component that manages the extended user information and exposes a secure REST API for retrieving this information. The REST API requires a valid security token for a user that has the appropriate permissions. Changes to a user’s extended information, including user permissions, can only be made through the Public Console by an authenticated user with appropriate permissions.

...

There are two types of cloud users: Humans and Systems. There are two styles of login: UI and API. Human logins are restricted to the user interface and cannot be used with the Login API. System logins use the Login API and will not work in the user interface. Humans access the login screen using a browser and provide their username (email address) and password to authenticate. Systems use the Login API and provide the username (special value) and password (special value). Once the user’s credentials are validated, a JWT is generated for the user. A Human user is only allowed to maintain a single active token. Issuing a new token for a Human user automatically revokes any existing tokens for that user. Human users access the main login screen by visiting the Cloud Management Console. This will be available at console[.shard].openmethodscloud.com and is hosted by the Cloud Manager. After the login is successful, a cookie will automatically be set containing the JWT and associated with the shard’s domain.

System logins use the Login API and will not work in the user interface. Systems provide a username (special value) and password (special value). A System user can have any number of active tokens, each with its own lifecycle. There may be some identifiable information related to the specific instance of a system user that can be used to ensure only a single token is active for that instance.
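The token rules for the two user types can be illustrated with a minimal sketch. It assumes an in-memory store in place of the real token database; `TokenStore`, `UserType`, and the example usernames are hypothetical names introduced only for illustration and are not part of the Login API.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class UserType(Enum):
    HUMAN = "human"
    SYSTEM = "system"


@dataclass
class TokenStore:
    """In-memory stand-in for the token database (hypothetical)."""

    # token id -> (username, user type)
    active: dict = field(default_factory=dict)

    def issue(self, username: str, user_type: UserType) -> str:
        if user_type is UserType.HUMAN:
            # A Human user keeps a single active token, so issuing a new
            # one revokes every existing token for that user.
            for token_id, (owner, _) in list(self.active.items()):
                if owner == username:
                    del self.active[token_id]
        # A System user may hold any number of active tokens.
        token_id = uuid.uuid4().hex
        self.active[token_id] = (username, user_type)
        return token_id

    def active_tokens(self, username: str) -> list:
        return [t for t, (owner, _) in self.active.items() if owner == username]


store = TokenStore()
first = store.issue("alice@example.com", UserType.HUMAN)
second = store.issue("alice@example.com", UserType.HUMAN)  # revokes `first`
store.issue("svc-ingest", UserType.SYSTEM)
store.issue("svc-ingest", UserType.SYSTEM)  # both system tokens stay active
```

A real implementation would mark tokens as revoked in the database instead of deleting them, so that revocation remains visible to other authentication servers.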

Human users access the main login screen by visiting their home shard’s authentication server. This will be available at login[.shard].openmethodscloud.com. If the user attempts to visit the public console directly, the public console will redirect the user to the authentication server login page. After the login is successful, a cookie will automatically be set containing the JWT and associated with the shard’s domain.

Roles

...

Roles

A role is a predefined set of permissions that can be applied to a user. A role typically groups permissions based on the type of activities being enabled. There are 4 standard role types: VIEW, MANAGE, ADMIN, and SUPER. The role types generally inherit the capabilities of lower role types. For example, the MANAGE role type includes all the permissions associated with the VIEW role type. It would be hard to MANAGE something if you can’t VIEW it to begin with. A role also has a scope that determines which objects are applicable to a role’s permissions. There are 3 role scopes: DEPLOYMENT, CUSTOMER, SYSTEM. DEPLOYMENT scoped roles hold permissions tied to a specific customer deployment. CUSTOMER scoped roles hold permissions related to a customer and any of their deployments. SYSTEM scoped roles hold permissions related to overall system operation, customer activities, and customer deployments.
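As a rough illustration of how role types and scopes combine, the sketch below models the role types as an ordered enumeration so that a higher type covers the lower ones. The `Role` class, its `allows` method, and the customer and deployment identifiers are assumptions made for this example. The text says only that role types "generally" inherit, so the strict ordering here is a simplification.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Optional


class RoleType(IntEnum):
    # Ordered so that a higher role type includes the lower ones.
    VIEW = 1
    MANAGE = 2
    ADMIN = 3
    SUPER = 4


class Scope(Enum):
    DEPLOYMENT = "deployment"
    CUSTOMER = "customer"
    SYSTEM = "system"


@dataclass(frozen=True)
class Role:
    role_type: RoleType
    scope: Scope
    customer_id: Optional[str] = None
    deployment_id: Optional[str] = None

    def allows(self, needed: RoleType, customer_id: str,
               deployment_id: Optional[str] = None) -> bool:
        """Return True if this role covers `needed` on the target object."""
        if self.role_type < needed:
            return False
        if self.scope is Scope.SYSTEM:
            return True  # system scope spans all customers and deployments
        if self.scope is Scope.CUSTOMER:
            # Customer scope covers the customer and all of its deployments.
            return self.customer_id == customer_id
        # Deployment scope covers exactly one deployment of one customer.
        return (self.customer_id == customer_id
                and self.deployment_id == deployment_id)


customer_manage = Role(RoleType.MANAGE, Scope.CUSTOMER, customer_id="acme")
deployment_view = Role(RoleType.VIEW, Scope.DEPLOYMENT, "acme", "prod")
```

With these two roles, `customer_manage` can VIEW or MANAGE any deployment owned by `acme`, while `deployment_view` can only VIEW the `prod` deployment.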

A user can have any number of roles assigned. A user must be associated with a customer before a customer or deployment scoped role can be assigned. Similarly, a user must be associated with the system level object before a system level role can be assigned.

Each customer will have 3 roles automatically created. These default roles reflect the VIEW, MANAGE, and ADMIN role types and contain all permissions for those types. They are scoped to that customer and all deployments owned by that customer. For example, assigning the customer VIEW role allows a user to view all customer information and all information for every deployment. Each deployment will also have 3 roles automatically created that reflect the VIEW, MANAGE, and ADMIN role types. These roles are specific to the deployment.

Permissions

A permission represents the ability to perform some action within the system. Similar to roles, permissions follow the standard types of VIEW, MANAGE, ADMIN, and SUPER. Permissions are also targeted at a specific scope of DEPLOYMENT, CUSTOMER, or SYSTEM. A permission can be assigned to either a role or to a user directly.

Extended User Information

...

There is a separation between authenticating a user and retrieving the properties associated with a user, including a user’s permissions. The user’s token DOES NOT include the user’s extended information. This information must be retrieved in a separate request to the Cloud Manager. This request uses the token of the system or human user making the request and provides the token id of the user whose information is being requested. The Cloud Manager first determines if the requester token is authentic and valid. If the requester token is authentic and valid, the Cloud Manager then determines if the requester has the RETRIEVE_EXTENDED_INFORMATION permission. If the requester is authentic, valid, and has the correct permissions, the Cloud Manager determines if the query token is authentic and valid. If the query token is authentic and valid, the query user’s extended information is returned to the requester.
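The order of checks described above can be sketched as follows. This is an illustrative model only: `CloudManagerSketch`, its dictionaries, and the exception types are hypothetical stand-ins, and the actual REST API shape is not specified in this document. Only the RETRIEVE_EXTENDED_INFORMATION permission name comes from the text.

```python
from dataclasses import dataclass, field

RETRIEVE_EXTENDED_INFORMATION = "RETRIEVE_EXTENDED_INFORMATION"


@dataclass
class CloudManagerSketch:
    """Hypothetical in-memory model of the extended information lookup."""

    valid_tokens: dict = field(default_factory=dict)   # token id -> username
    permissions: dict = field(default_factory=dict)    # username -> set of names
    extended_info: dict = field(default_factory=dict)  # username -> info dict

    def get_extended_info(self, requester_token: str, query_token: str) -> dict:
        # 1. The requester token must be authentic and valid.
        requester = self.valid_tokens.get(requester_token)
        if requester is None:
            raise PermissionError("requester token is not authentic and valid")
        # 2. The requester must hold the retrieval permission.
        granted = self.permissions.get(requester, set())
        if RETRIEVE_EXTENDED_INFORMATION not in granted:
            raise PermissionError("requester lacks RETRIEVE_EXTENDED_INFORMATION")
        # 3. The query token must also be authentic and valid.
        query_user = self.valid_tokens.get(query_token)
        if query_user is None:
            raise LookupError("query token is not authentic and valid")
        # 4. Only then is the query user's extended information returned.
        return self.extended_info.get(query_user, {})


manager = CloudManagerSketch(
    valid_tokens={"tok-svc": "svc-api", "tok-alice": "alice@example.com"},
    permissions={"svc-api": {RETRIEVE_EXTENDED_INFORMATION}},
    extended_info={"alice@example.com": {"roles": ["VIEW"]}},
)
info = manager.get_extended_info("tok-svc", "tok-alice")
```

The permission check runs before the query token is examined, so a requester without the permission learns nothing about whether the query token exists.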

Token Renewal

When a token is issued for a user, the time it remains valid is limited. After time expires, the token is no longer valid and the user must log in again. API requests using an expired token will fail. The token expiration can be extended through renewal. This involves making a request to the Login API and providing the original token as well as the security stamp returned during the initial login. The renewal can occur any time prior to expiry. An updated token will be issued to the requester. A token owner should leave plenty of time prior to expiry to attempt to renew the token in case of network delays or temporary outages. Typically the token owner should begin the renewal process with at least a quarter of the time remaining.
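The renewal guideline can be expressed as a small calculation. The sketch below assumes the token owner knows when the token was issued and when it expires. The function names and the one-hour lifetime are illustrative assumptions, and the call to the Login API itself is not shown.

```python
from datetime import datetime, timedelta, timezone


def renewal_deadline(issued_at: datetime, expires_at: datetime,
                     remaining_fraction: float = 0.25) -> datetime:
    """Latest recommended moment to start renewing a token.

    With the default fraction, renewal starts while at least a quarter
    of the token lifetime is still remaining.
    """
    lifetime = expires_at - issued_at
    return expires_at - lifetime * remaining_fraction


def should_renew(now: datetime, issued_at: datetime, expires_at: datetime) -> bool:
    # Renewal is only possible prior to expiry.
    return renewal_deadline(issued_at, expires_at) <= now < expires_at


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expires = issued + timedelta(hours=1)  # hypothetical one-hour lifetime
deadline = renewal_deadline(issued, expires)
```

For the hypothetical one-hour token above, the owner should begin renewing no later than 45 minutes after issue, which leaves 15 minutes to retry after network delays or temporary outages.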

Token Leasing

For APIs that receive multiple requests from the same client in quick succession, validating the client’s token on every request can get very expensive and would direct significant traffic at the authentication service. To avoid this situation, a service can choose to lease a token’s validation result. The user’s extended information includes values related to token leasing. These values represent the length of time a previous validation can be used to service a request without having to revalidate the token.

The type of request determines which value is used. The lease time for read operations is typically longer, as these types of requests are generally safe. Some write operations are only moderately dangerous, so they can be performed within a shorter window of time. Security critical or destructive operations like deletes should typically only be performed in conjunction with a token validation, so they usually don’t allow a lease window. Whenever a request of any type requires a new validation, the clock for all leases is refreshed.

A successful lease hit does not restart the clock for the ongoing lease window.

Scenario 1 - Valid Lease Hit

...

An initial request comes in and the token is validated. A read request comes in 6 seconds after the initial request. The token is found in the leasing cache. The last validated property of the leased token is checked to see if the new request falls into the token’s read window by adding the read lease time to the last validated time and comparing it to the time of the request: the request is inside the window when currentTime < lastValidated + readLease. Since the new request falls into the read window, the request is processed as if the token is valid without explicitly validating the token with the issuer.

Scenario 2 - Invalid Lease Hit

...

An initial request comes in and the token is validated. A write request comes in 6 seconds after the initial request. The token is found in the leasing cache. The last validated property of the leased token is checked to see if the new request falls into the token’s write window by adding the write lease time to the last validated time and comparing it to the time of the request: the request is inside the window when currentTime < lastValidated + writeLease. Since the new request falls outside the write window (lastValidated + writeLease < currentTime), the token is validated with the issuer before the request can be processed. Once validated, the last validated property is updated and the lease windows begin again.
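Both lease scenarios can be reproduced with a minimal sketch of a leasing cache. The lease lengths used here (10 seconds for reads, 5 seconds for writes, none for deletes) are assumptions chosen so that a request arriving 6 seconds after validation hits the read window and misses the write window. The real values come from the user’s extended information. `LeaseCache` and `validate_with_issuer` are hypothetical names.

```python
from dataclasses import dataclass

# Hypothetical lease lengths in seconds. A zero-length window means the
# token is always revalidated for that request type.
LEASE_SECONDS = {"read": 10.0, "write": 5.0, "delete": 0.0}


@dataclass
class LeasedToken:
    last_validated: float  # time of the last validation with the issuer


class LeaseCache:
    def __init__(self, validate_with_issuer):
        self._validate = validate_with_issuer  # callable(token) -> bool
        self._leases = {}                      # token -> LeasedToken

    def authorize(self, token: str, request_type: str, now: float) -> bool:
        lease = self._leases.get(token)
        window = LEASE_SECONDS[request_type]
        if lease is not None and now < lease.last_validated + window:
            # Valid lease hit: serve the request without contacting the
            # issuer. The lease clock is deliberately not restarted.
            return True
        # Lease miss: validate with the issuer before processing.
        if not self._validate(token):
            self._leases.pop(token, None)
            return False
        # A new validation refreshes the clock for all lease windows.
        self._leases[token] = LeasedToken(last_validated=now)
        return True


issuer_calls = []


def validate_with_issuer(token: str) -> bool:
    issuer_calls.append(token)  # record each round trip to the issuer
    return True


cache = LeaseCache(validate_with_issuer)
cache.authorize("tok", "read", now=0.0)   # initial request, validated
cache.authorize("tok", "read", now=6.0)   # read lease hit, no issuer call
cache.authorize("tok", "write", now=6.0)  # outside write window, revalidated
```

After these three calls the issuer has been contacted twice, and the lease windows restart from the validation performed at 6 seconds.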

Token Caching

All tokens are stored in a database that is shared by all authentication servers servicing a user realm. For large user realms, the number of token-related requests can negatively impact the overall performance of the authentication server. To avoid accessing the database for every request, the authentication server employs a token caching strategy. When a token is issued after a successful login, it is stored in the database immediately. The token is also placed in the cache. When a validation request arrives for a token, the authentication server first looks for the token in the cache. If the token is found in the cache and is valid, the token is successfully validated. If the token is found in the cache but is invalid, the token is removed from the cache. If the cached token was invalid or the token is not in the cache, a request is made to the database to retrieve the token. If the token is found and is valid, it is placed in the cache and is returned. If the token is not found or is not valid, the validation request is denied.
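The validation path through the cache and database can be sketched as below. A plain dictionary stands in for the shared token database, and `TokenRecord`, `is_valid`, and `TokenValidator` are hypothetical names. Treating a token as valid when it is ACTIVE and not yet expired is also an assumption, since the text does not define validity precisely.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TokenRecord:
    state: str         # "ACTIVE" or some ended state
    expires_at: float  # expiry time in seconds


def is_valid(record: TokenRecord, now: float) -> bool:
    # Assumed definition of validity for this sketch.
    return record.state == "ACTIVE" and now < record.expires_at


class TokenValidator:
    def __init__(self, database: dict):
        self.database = database  # stand-in for the shared token database
        self.cache = {}           # token id -> TokenRecord

    def validate(self, token_id: str, now: float) -> bool:
        cached = self.cache.get(token_id)
        if cached is not None:
            if is_valid(cached, now):
                return True           # cache hit, no database access
            del self.cache[token_id]  # invalid cached token is evicted
        # Cache miss, or the cached copy was invalid: ask the database.
        record = self.database.get(token_id)
        if record is not None and is_valid(record, now):
            self.cache[token_id] = record
            return True
        return False                  # not found or not valid: deny


database = {"t1": TokenRecord(state="ACTIVE", expires_at=100.0)}
validator = TokenValidator(database)
first_check = validator.validate("t1", now=10.0)   # loads t1 into the cache
database["t1"] = replace(database["t1"], state="REVOKED")
stale_check = validator.validate("t1", now=20.0)   # still served from cache
```

The second check succeeds even though the database copy has been revoked. Cached entries can outlive a revocation in this way until a cleanup step evicts them.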

The cache must also be periodically cleaned up to avoid validating revoked tokens and leaking memory over time. The cache clearing process runs in the background and is not related to incoming requests. When the authentication server starts up, a timestamp is taken and stored as the previous cleanup cycle time. A timer is created to launch the cleanup cycle and is scheduled to fire after the cleanup cycle cool-down time. When the cleanup cycle is started, a new timestamp is taken and stored in a temporary variable. A database request is made to update all tokens in the ACTIVE state whose expire time has passed. A second database request is made to retrieve all tokens not in the ACTIVE state whose end date is after the previous cleanup cycle time. All tokens returned in the list of newly ended tokens are removed from the cache. Finally, the temporary timestamp is stored in the previous cleanup cycle time variable and the cleanup process is scheduled again.
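The cleanup cycle can be sketched as follows, with an in-memory class standing in for the shared database. `InMemoryTokenDb`, `CacheCleaner`, and the record fields are hypothetical. The state given to overdue tokens is named EXPIRED here as an assumption, since the text does not say which state they move to.

```python
import threading
import time


class InMemoryTokenDb:
    """Hypothetical stand-in for the shared token database."""

    def __init__(self, records: dict):
        # token id -> {"state": str, "expires_at": float, "ended_at": float | None}
        self.records = records

    def expire_overdue(self, now: float) -> None:
        # First request: update ACTIVE tokens whose expire time has passed.
        for record in self.records.values():
            if record["state"] == "ACTIVE" and record["expires_at"] <= now:
                record["state"] = "EXPIRED"  # state name is an assumption
                record["ended_at"] = now

    def ended_since(self, since: float) -> list:
        # Second request: non-ACTIVE tokens that ended after `since`.
        return [
            token_id
            for token_id, record in self.records.items()
            if record["state"] != "ACTIVE"
            and record["ended_at"] is not None
            and record["ended_at"] > since
        ]


class CacheCleaner:
    def __init__(self, db, cache: dict, cooldown_seconds: float, clock=time.time):
        self.db = db
        self.cache = cache
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        # Timestamp taken at startup as the previous cleanup cycle time.
        self.previous_cycle_time = clock()

    def start(self) -> None:
        # Fire the cleanup cycle after the cool-down time, in the background.
        timer = threading.Timer(self.cooldown_seconds, self._run_and_reschedule)
        timer.daemon = True
        timer.start()

    def _run_and_reschedule(self) -> None:
        self.run_cycle()
        self.start()

    def run_cycle(self) -> None:
        cycle_started = self.clock()  # temporary timestamp for this cycle
        self.db.expire_overdue(cycle_started)
        for token_id in self.db.ended_since(self.previous_cycle_time):
            self.cache.pop(token_id, None)
        # Stored only after the cycle's queries have run.
        self.previous_cycle_time = cycle_started
```

The timestamp is taken before the database queries and stored afterwards, so a token that ends while a cycle is running is still picked up by the next cycle.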

Token Revocation

A token can be revoked by another user that has the appropriate permissions. As a result of features like high availability of authentication servers and token leasing, revoking a token won’t be immediate. There may be some time lag between when a token is revoked and API requests being aware of the revocation. The maximum amount of time a token may remain in effect is the authentication server cache cycle plus the longest token lease time frame. In most cases the revocation will be in effect within 30 seconds and typically within a much shorter period of time.
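The worst-case revocation delay stated above is a simple sum, sketched below. The cache cycle and lease values are hypothetical numbers picked to match the 30 second figure. They are not configuration values taken from the system.

```python
def max_revocation_lag(cache_cycle_seconds: float, lease_seconds: dict) -> float:
    """Upper bound on how long a revoked token may remain in effect.

    The bound is the authentication server cache cycle plus the longest
    token lease time frame.
    """
    longest_lease = max(lease_seconds.values(), default=0.0)
    return cache_cycle_seconds + longest_lease


# Hypothetical configuration: 20 second cache cycle, 10 second read lease.
lag = max_revocation_lag(20.0, {"read": 10.0, "write": 5.0, "delete": 0.0})
```

Under these assumed values the bound is 30 seconds. Shortening either the cache cycle or the longest lease window tightens it.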

Authentication Server API

...