Introduction
Cedar is an open source policy language and evaluation engine. This book collects all the accepted RFCs (Request for Comments) for Cedar, providing a central repository for easier access and reference.
RFC Status
RFC Template (Fill in your title here)
Related issues and PRs
- Reference Issues: (fill in existing related issues, if any)
- Implementation PR(s): (leave this empty)
Timeline
- Started: (fill in with today's date, YYYY-MM-DD)
- Accepted: TBD
- Stabilized: TBD
Summary
Brief explanation of the feature.
Basic example
If the proposal involves a new or changed API, include a basic code example. Omit this section if it's not applicable.
Motivation
Why are we doing this? What use cases does it support? What is the expected outcome?
Please focus on explaining the motivation so that if this RFC is not accepted, the motivation could be used to develop alternative solutions. In other words, enumerate the constraints you are trying to solve without coupling them too closely to the solution you have in mind.
Detailed design
This is the bulk of the RFC. Explain the design in enough detail for somebody familiar with Cedar to understand, and for somebody familiar with the implementation to implement. This should get into specifics and corner-cases, and include examples of how the feature is used. Any new terminology should be defined here.
Drawbacks
Why should we not do this? Please consider:
- implementation cost, both in terms of code size and complexity
- integration of this feature with other existing and planned features
- cost of migrating existing Cedar applications (is it a breaking change?)
There are tradeoffs to choosing any path. Attempt to identify them here.
Alternatives
What other designs have been considered? What is the impact of not doing this?
Unresolved questions
Optional, but suggested for first drafts. What parts of the design are still TBD?
is Operator
Related issues and PRs
Timeline
- Started: 2023-06-16
- Accepted: 2023-07-28
- Landed: 2023-11-08 on main
- Released: 2023-12-15 in cedar-policy v3.0.0
Note: These statuses are based on the first version of the RFC process.
Summary
This RFC proposes to add an is operator to the Cedar language that allows users to check the type of entities.
Basic example
Say that a user wants to allow any User to view files in the folder Folder::"public".
With an is operator, the following policy achieves this goal.
permit(
principal is User,
action == Action::"viewFile",
resource is File in Folder::"public"
);
At evaluation time, principal is User will evaluate to true if the type of principal is User, and resource is File in Folder::"public" will evaluate to true if the type of resource is File and it is a member of Folder::"public".
Motivation
The is operator allows users to restrict the scope of their policies to particular entity types.
In many cases, this can also be done by adding restrictions to the schema.
For example, the appliesTo field for each action in the schema allows users to specify which entity types can be used as principals and resources for that action.
In the example above, the schema could specify that viewFile should only apply to principals of type User and resources of type File.
However, this doesn't help if the user is not using a schema, the action is unconstrained, or the policy refers to an action that may act on multiple principal/resource types.
For example, here is one case where the action is unconstrained, taken from the Cedar document cloud example.
forbid (principal, action, resource)
when
{
resource has owner &&
principal != resource.owner &&
resource has isPrivate &&
resource.isPrivate
};
This policy applies to any principal, action, resource pair, but requires that the resource has owner and isPrivate attributes.
If the resource does not have these attributes, then the policy is a no-op.
Say that only one entity type (e.g., File) has the fields owner and isPrivate.
Then this policy can be written more succinctly as follows.
forbid (principal, action, resource is File)
when
{
principal != resource.owner &&
resource.isPrivate
};
Detailed design
Changes required to parser/AST
- Update the grammar and AST to support expressions of the form e is et, where et is a Name (e.g., Namespace::EntityType).
- Update the grammar and AST to support expressions of the form e1 is et in e2 (but not e1 in e2 is et, because this is more difficult to mentally parse).
- Update the grammar to allow expressions like e is et and e1 is et in e2 in the policy scope for principals and resources, as illustrated below. The action part of the scope will only allow expressions like action == Action::"foo" and action in [Action::"foo", Action::"bar"], as before.
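For illustration, here is a policy that uses both scope forms (a sketch; the entity types, group, and action are hypothetical):
permit(
principal is User in Group::"employees",
action == Action::"viewFile",
resource is File in Folder::"public"
);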
Changes required to evaluator
e is et will evaluate to true iff et is a Name (e.g., Namespace::EntityType) and e is an expression that is an entity of type et.
If e is an entity-typed expression, but does not have type et, then e is et will evaluate to false.
If e is a non-entity-typed expression, the evaluator will produce a type error.
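For example, assuming User and Group are entity types:
User::"alice" is User     // evaluates to true
User::"alice" is Group    // evaluates to false
"alice" is User           // type error: the left-hand side is a string, not an entity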
Note that this design maintains the property that "conditions in the scope never error": because principal and resource are guaranteed to be entities, the expression (principal|resource) is et will never error.
Namespaces are considered "part of" the entity type. So:
- User::"alice" is User is true
- Namespace::User::"alice" is User is false
- Namespace::User::"alice" is Namespace::User is true
- User::"alice" is Namespace::User is false
Changes required to validator
e is et will be typed as True if the validator can guarantee that expression e is an entity of type et, or False if the validator can guarantee that e is an entity not of type et.
If the validator can guarantee that e is an entity, but cannot determine the exact type, then e is et is given type Bool.
Otherwise, the validator produces a type error.
To improve typing precision, we can extend the current notion of "effects" to track information about entity types.
Currently, the validator uses effects to know whether an expression is guaranteed to have an attribute.
In particular, the validator knows that if e has a evaluates to true, then expression e is guaranteed to have attribute a.
We can do something similar for is: when the expression e is et evaluates to true, expression e is guaranteed to be an entity of type et.
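For example, here is the document-cloud policy from the motivation rewritten with the is check in the condition rather than the scope (a sketch; assumes only File entities have owner and isPrivate attributes):
forbid (principal, action, resource)
when
{
resource is File &&
principal != resource.owner &&
resource.isPrivate
};
With the effect extension, the validator knows that resource has type File when it checks the attribute accesses to the right of the first &&.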
Other changes
The Dafny formalization of the Cedar language, corresponding proofs, and differential testing framework in cedar-spec will need to be updated to include the new is operator.
Drawbacks
- This is a substantial new feature that will require significant development effort.
- This change will break existing applications that depend on the structure of policies. For example, if we decide to allow is in the policy scope (as shown in the examples above), tools that perform basic "policy slicing" based on the scope will need to decide how to handle the new syntax.
- The is operator may encourage coding practices that go against our recommended "best practices" (more details below).
Best practice: Using "typed" actions
From the discussion on https://github.com/cedar-policy/cedar/issues/94:
Guidance we have given other customers is that actions should be "typed," in the sense that they include in their name the type of the resource they apply to. So instead of writing
permit( principal == User::"admin", action == Action::"view", resource is Photo );
you would write
permit( principal == User::"admin", action == Action::"viewPhoto", resource );
In your schema, you indicate that the "viewPhoto" action should always be accompanied by a Photo for the resource, so the policy doesn't need to say that explicitly (and doesn't need to perform a runtime check).
If we support an is operator, then customers may be encouraged to use a single "view" action for multiple resource types, rather than a new "viewX" action for each resource type X.
Alternatives
- Another way to simulate is is to add an entity_type attribute to entities, and check the type of an entity using resource.entity_type == "File" (for example). However, this requires users to manually add an entity's type as an attribute, and leads to the possibility that an entity might be created with an attribute that doesn't actually match its type.
- The behavior of is can also be simulated using groups. For each entity type (e.g., File), a user could create a group (e.g., Type::"File") and add all entities of that type to the group. Then to check whether an entity has a particular type, users could use in (e.g., resource in Type::"File"), as sketched below. Like the previous alternative, this requires some extra work on the part of the user (setting up the memberOf relation), and does not prevent entities from being added as members of the wrong group.
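For illustration, the group-based encoding of the basic example might look like the following (a sketch; the Type::"File" group is hypothetical):
permit(
principal,
action == Action::"viewFile",
resource in Folder::"public"
)
when { resource in Type::"File" };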
Resolved questions
- Should we allow is in the policy scope? Currently we only allow == and in expressions in the scope, with the intention that these expressions can be used to represent RBAC constraints, whereas more general expressions in the policy condition can be used to express ABAC constraints.
  - Decision: We will allow is and is ... in in the scope.
  - Summary of discussion:
    - We feel that is is natural to write in the policy scope since it limits the "scope" of the policy.
    - Most requests we have received for this feature use an example with is in the scope, confirming our belief that this is the most natural location for its use.
    - Although is is not currently used for policy indexing, it could be in the future, and current indexing strategies can safely ignore is constraints in the meantime.
- Should users be able to indicate that an entity may be of multiple types, e.g., principal is [User, Group]? Can this be combined with in (e.g., principal is [User, Group] in Group::"orgA", or even principal is [User, Group] in [Group::"orgA", Group::"orgB"])?
  - Decision: We will not support this. The syntax is more difficult to read and the same behavior can be achieved by writing multiple policies instead. We may reconsider this position based on user feedback.
Disallow embedded whitespace in string representations of the EntityUID
Related issues and PRs
- Implementation PR(s): cedar#134
Timeline
- Started: 2023-06-19
- Accepted: 2023-06-26
- Landed: 2023-06-28 on main
- Released: 2023-06-29 in cedar-policy v2.3.0
Summary
Disallow embedded whitespace, comments, and control characters in string representations of the EntityUID
Basic example
Currently, the following syntax is legal:
EntityTypeName::from_str("ExampleCo :: Photoflash :: User //comment");
This change would modify the function so that the string input must be normalized, i.e., have all extraneous whitespace, control characters, and comments removed.
EntityTypeName::from_str("ExampleCo::Photoflash::User");
Motivation
Similar to other programming languages such as Rust, Cedar is currently whitespace agnostic. This leads to the ability to embed whitespace, control characters like newlines, and comments around the :: delimiter in an Entity UID.
For example, the following syntax is valid:
permit(
principal == ExampleCo :: Photoflash :: //This is a comment
:: User::"alice",
action,
resource
);
permit(
principal == ExampleCo::Photoflash::User:://comment
"alice",
action,
resource
);
This capability was little known, even amongst many Cedar team members, and was only identified as a result of a typo in a Cedar demo. While this behavior may be OK inside policy statements, there is a risk when policy contents such as the EntityTypeName are used as strings outside the context of policy statements, such as in the JSON schema representation. For laypersons who are not Cedar experts, these values appear to be normalized strings rather than fragments of Cedar policy syntax, leading to bugs in application code when whitespace and comments can appear in the value.
Examples:
- The Cedar Schema format models the schema configuration under a JSON key for the namespace. Policy stores which index schemas by namespace are unlikely to recognize the need to normalize the value, leading to the possibility of storing duplicate schema definitions for "ExampleCo::Photoflash" and "ExampleCo :: Photoflash" and indeterminate behavior regarding which schema takes effect at runtime.
- Policy Stores can implement logic that relies on string comparisons against the EntityTypeName. In a real issue, an application using Cedar sought to preclude callers from passing Actions in the inline slice of entity definitions. It did so by checking if an EntityTypeName matched .*::Action. It presumed that :: Action was invalid syntax and would be rejected by the Cedar evaluator, the same as any other syntactically invalid input. This resulted in a bug, as it allowed callers to bypass the extra validation that the application sought to enforce.
- Customers are anticipated to build meta-permissions layers that restrict callers to manipulating policy store contents for only certain namespaces. This may lead to policies such as forbid(...) when {context.namespace == "ExampleCo::Photoflash"};. There is a risk that an unauthorized actor could bypass this restriction by using a namespace with embedded spaces.
While it is technically possible for applications to mitigate this risk by diligently using Cedar tooling to normalize the values, the little-known nature of this Cedar behavior implies that few will know they should normalize the value. As a point of reference, application developers who have worked with Cedar extensively for over a year were bitten by this bug in production. Hence, this is likely to result in bugs in many other Cedar-based implementations with similar logic, risking a perception that Cedar is fragile or unsafe.
Detailed design
In EntityTypeName::from_str(), whitespace would be disallowed throughout -- either before, after, or in the middle of the typename. The analogous change will also be made in functions EntityNamespace::from_str() and EntityUid::from_str().
Drawbacks
This is a breaking change. We do not generally want breaking changes in Cedar. It is being proposed only because it is believed the risk of breakage is low, and any potential short-term impacts to stability will pay long-term dividends in Cedar's perceived reliability.
The parties most at risk are those who accept ad hoc values of EntityTypeName from callers in a string format where the values may contain embedded whitespace. At the current time, the only known party who does this in a production setting is Amazon Verified Permissions. Due to the relative newness of Amazon Verified Permissions, this party is accepting of the change as it is not expected to break any production scenarios, and appears to be a valid risk to accept in order to benefit the Cedar community in the long-term.
Other parties at risk are those who use static, programmatically defined values of EntityTypeName where the code contains a typo in the value that accidentally includes additional whitespace. The belief is that this is likely to get caught in a test environment before going to production, and represents a typo that the impacted party may wish to fix, regardless.
The last parties at risk are those who allow customers to upload custom schema files where the namespace value may contain whitespace. The only known party who accepts ad hoc schema uploads from customers with namespaces is Amazon Verified Permissions, which is accepting of this change.
To test these assumptions, this change will be preceded by reach-outs to Cedar early adopters to highlight the risks and gain community buy-in on the approach.
Alternatives
An alternative is to modify the policy syntax to disallow embedded whitespace in Entity UIDs across all parts of Cedar, including within policy statements. This could be done by treating the UID as a single identifier with special structure (rather than tokenizing it into a ::-separated sequence of identifiers).
One additional benefit of this approach is that it can mitigate attempts by pentesters and malicious actors who may seek ways to craft Cedar policy statements that look like they do one thing, but actually do another. The existence of little-known parsing behaviors provides an avenue for exploration by this audience, as demonstrated by the following example:
permit(
principal == ExampleCo::Photoflash::User:://"alice"
"bob", //TODO: Try various tricks to obfuscate this line from readers
action,
resource
);
Despite the potential benefit, this alternative was rejected due to the high risk of breaking existing Cedar users. In the worst case scenario, a customer may have an existing forbid statement that relies on this behavior, as illustrated below. By changing the behavior, the forbid statement will error at runtime and be skipped, and hence the system could fail open.
forbid(
principal in ExampleCo:: Photoflash::UserGroup::"someGroup", //Accidental typo, but valid syntax
action == ExampleCo:: Photoflash::Action::"write",
resource
);
This risk is too great. Therefore, the suggested approach is a compromise that mitigates the known production bugs with fewer risks. Any concerns about pentesters and malicious actors crafting obfuscated policies will need to be addressed by other non-breaking means, such as linter warnings and syntax highlighting.
Standalone strict validation
Related issues and PRs
Timeline
- Started: 2023-07-11
- Accepted: 2023-09-05
- Landed: 2023-09-08 on main
- Released: 2023-12-15 in cedar-policy v3.0.0
Note: These statuses are based on the first version of the RFC process.
Summary
Internally, strict validation of a policy is implemented as 1/ checking and transforming the policy using a permissive validation mode; 2/ annotating the transformed policy with types, and 3/ checking the types against more restrictive rules. We would like to be able to at least explain strict mode independently of permissive mode, but there is no easy way to do that. This RFC proposes to separate strict mode from permissive mode, making the implementation simpler and more understandable/explainable, though somewhat more strict.
Motivation
Consider this policy, given in file policy.cedar, where principal is always of type User:
permit(
principal,
action == Action::"read",
resource)
when {
(if context.sudo then Admin::"root" else principal) == resource.owner ||
resource.isPublic
}
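The schema file itself is not reproduced in this RFC. A plausible fragment consistent with the discussion that follows, written here in the custom schema syntax proposed later in this book, might be (hypothetical, for illustration only):
entity Org;
entity User in [Org];
entity Admin;
entity Object { owner: User, isPublic: Boolean };
action read appliesTo {
principal: [User],
resource: [Object],
context: { sudo: Boolean }
};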
If we validate this policy against the schema schema.cedarschema.json, then resource.owner is always of type User. As a result, the policy is rejected by strict-mode validation, with two errors:
- the if/then/else is trying to return entity type Admin in the true branch, and entity type User on the false branch, and these two are not equal.
- since resource.owner has type User, it will have a type different than that returned by the if/then/else, which can sometimes return Admin.
But if we change "User" in schema.cedarschema.json on line 13 to "Org", then resource.owner is always of type Org. As a result, the policy is accepted! The two error conditions given above seem not to have changed -- the conditional returns different types, and those types may not equal the type of resource.owner -- so it's puzzling what changed to make the policy acceptable.
The reason is that strict-mode validation is dependent on permissive-mode validation. In particular, strict-mode validation is implemented in three steps: 1/ validate and transform the policy using permissive mode; 2/ annotate the transformed policy AST with types; and 3/ check the types against more restrictive rules.
In the first case, when resource.owner is always of type User, the expression (if ...) == resource.owner has type Boolean in permissive mode and as a result no transformation takes place in step 1. As a result, it is rejected in step 3 due to the errors given above.
In the second case (after we change the schema), resource.owner is always of type Org. Under permissive mode, the (if ...) == resource.owner expression has singleton type False. That's because the (if ...) expression has type User|Admin ("User or Admin") and User|Admin entities can never be equal to Org entities. Both the singleton type False (which only validates an expression that always evaluates to false) and union types like User|Admin are features of permissive mode and are not present in strict mode. Because of these features, step 1 will replace the False-typed (if ...) == resource.owner with value false, transforming the policy to be permit(principal,action,resource) when { false || resource.isPublic }. Since this policy does not require union types and otherwise meets strict mode restrictions, it is accepted in step 3.
In sum: To see why the second case is acceptable, we have to understand the interaction between permissive mode and strict mode. This is subtle and difficult to explain. Instead, we want to implement strict mode as a standalone feature, which does not depend on features only present in permissive mode.
Detailed design
We propose to change strict-mode validation so that it has the following features taken from permissive mode:
- Singleton boolean True/False types (which have 1/ special consideration of them with &&, ||, and conditionals, and 2/ subtyping between them and Boolean, as is done now for permissive mode).
- Depth subtyping, but not width subtyping, to support examples like { foo: True } <: { foo: Boolean }.
Strict-mode validation will continue to apply the restrictions it currently applies:
- Arguments to == must have compatible types (i.e., an upper bound exists according to the strict, depth-only, definition of subtyping) unless we know that the comparison will always be false, in which case we can give the == type False. The same goes for contains, containsAll, and containsAny.
- Arguments to an in expression must be compatible with the entity type hierarchy, or we must know that the comparison is always false.
- Branches of a conditional must have compatible types unless we can simplify the conditional due to the guard having a True/False type.
- The elements of a set must have compatible types (see the examples after this list).
- As a consequence of the two prior restrictions, union types cannot be constructed.
- Empty set literals may not exist in a policy because we do not currently infer the type of the elements of an empty set.
- Extension type constructors may only be applied to literals.
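For example, both of the following are rejected under these restrictions (a sketch; principal.blocked is a hypothetical attribute of type Set<User>):
[User::"alice", Admin::"root"]        // rejected: the elements would need the union type User|Admin
principal.blocked.containsAny([])     // rejected: the empty set literal has no inferred element type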
Validating Action entities and template slots ?principal and ?resource must be done differently so as not to rely on the "top" entity type AnyEntity, which is essentially an infinite-width union type, used internally.
Next, we describe how we propose to handle these situations, in both permissive and strict validation.
Importantly, these changes all retain or even improve precision compared to the status quo -- they will not be the reason that policies that typechecked under the old strict mode no longer do.
Actions
Permissive mode validation considers different Action entities to have distinct types, e.g., Action::"view" and Action::"update" don't have the same type.
This is in anticipation of supporting distinct actions having distinct attribute sets.
The least upper bound of any two actions is the type AnyEntity.
Such a type supports expressions that are found in the policy scope such as action in [Action::"view", Action::"edit"].
This expression typechecks because the least upper bound of Action::"view" and Action::"edit" is AnyEntity, so the expression has type Set<AnyEntity>.
Since AnyEntity cannot be used in strict validation, the current code special-cases the above expression, treating it as equivalent to action in Action::"view" || action in Action::"edit".
However, this approach is unsatisfying as it only supports literal sets of Action entities.
It turns out we can resolve this issue by making all Action entities have one type -- Action -- in both strict and permissive validation.
This change strictly adds precision to validation, compared to the status quo.
Expressions like [Action::"view", Action::"edit"] have type Set<Action> and thus do not need AnyEntity to be well-typed.
Expressions like if principal.isAdmin then Action::"edit" else Action::"editLimited" will typecheck in strict validation, since both branches have the same type (Action).
A possible drawback of this change is that it will not allow distinct actions to have distinct attribute sets (if indeed we add attributes to actions in the future).
Rather, all actions must be typeable with the same record type.
As a mitigation, for action entities that have an attribute that others do not, the attribute can be made optional.
However, two entities would not be able to have the same attribute at different types, e.g., { foo: String } vs. { foo: Long }.
Arguably, this restriction is clearer and more intuitive than what would have been allowed before.
Template slots
There are no restrictions on what entity type can be used to instantiate a slot in a template, so the permissive typechecker also uses AnyEntity here. The only restriction on the type of a slot is that it must be instantiated with an entity type that exists in the schema. The strict validator today does not properly support validating templates, which is something we want to allow.
Because AnyEntity cannot be used in strict validation, we will modify both permissive and strict validation to precisely typecheck a template by extending the query type environment to include a type environment for template slots. Doing so will strictly improve precision for permissive validation; all templates that validated before will still do so after this change.
Here is how the change will work.
In the same way that we typecheck a policy once for each possible binding for the types of principal and resource for a given action, we will also typecheck a template once for every possible binding for the type of any slots in that template.
Typechecking for every entity type may seem excessive, but we know that a template slot can only occur in an == or in expression in the policy scope where the principal or resource is compared to the slot.
We will infer the type False for these expressions if the slot is an entity type that cannot be == or in the principal or resource entity type, short-circuiting the rest of typechecking for that type environment.
Typechecking could be optimized by recognizing and ignoring type environments that will be False based only on the policy scope.
For example, using the schema from the motivating example, the following template will typecheck.
permit(principal == ?principal, action == Action::"read", resource);
Using the usual request type environment construction, there is one environment (User, Action::"read", Object).
Because this is a template containing ?principal, we extend the environment with each possible type for the slot: User, Org, Admin, and Object.
When typechecking, the validator sees principal == ?principal && ... first in every environment, which is always an equality between two entity types.
The principal is always a User while ?principal is one of the possible slot types depending on the type environment.
In every environment where ?principal is not User, the equality has type False, causing the && to short-circuit to type False.
The rest of the policy is typechecked as usual when ?principal is User.
This change will match the precision of the validator today, and will enable more precise typechecking if we expand where in a policy template slots may be used.
In particular, if we allowed expressions like ?principal.foo == 2 in the condition of a policy, the above approach would allow us to know precisely that ?principal will always have a foo attribute, whereas the current approach using AnyEntity would not be able to.
Implementation details
The current way we implement strict mode will change to accommodate these additions. We propose the following:
- Change the Rust code to implement strict mode by reusing the logic of permissive mode and using a flag to perform more restrictive checks. For example, when given User and Admin entity types, the code performing the least-upper-bound computation will return union type User|Admin in permissive mode, and will fail in strict mode. And, the code for conditionals would force branches' types to match in strict mode, but not in permissive mode. Etc.
- When checking in strict mode, we construct a new AST that performs the needed transformations, as in step 2 today.
- Either way, we can add type annotations. For strict mode, they could be added during the transformation; for permissive mode, no transformation is needed. No checking is done after annotations are added, since the checking is always done, regardless of mode, prior to the transformation.
Drawbacks
The main drawback of this approach is that strict-mode validation will be more restrictive. This is by design: Policies like the one in the motivating example will be rejected where before they were accepted. Accepting this RFC means we are prioritizing understandability over precision under the assumption that the lost precision will not be missed.
Policies that validated before but will not validate now are quite odd -- they would rely on the use of width subtyping, unions, etc. to determine that an expression will always evaluate to true (or false) and thus can be transformed away. But policies like these will be like those in the motivating example at the start of this RFC, and are probably not what the user intended in the first place. Ultimately, flagging them as erroneous is safer.
A secondary drawback is that this change will result in a non-trivial, and backwards-incompatible code change. Mitigating this issue is that the backward incompatibility will be minimal in practice (per the above), and we will use verification-guided development to prove that the change is sound (i.e., will not introduce design or implementation mistakes).
Alternatives
The proposal is, we believe, the minimal change to make strict mode self-contained, but not too different from what was there before. It also should not result in pervasive changes to the existing code. Alternative designs we considered (e.g., allowing more permissive mode features in strict mode) would be more invasive.
Updates
- 2023-11-07: The original text said that the handling of unspecified entities needed to be adjusted. However, although released implementations of Cedar type unspecified entities as AnyEntity, == and in expressions involving unspecified entities are typed as False (rather than Bool). This provides the same behavior as using a special Unspecified type, as originally proposed in this RFC.
Disallow duplicate keys in Cedar records
Related issues and PRs
Timeline
- Started: 2023-07-14
- Accepted: 2023-08-04
- Landed: 2023-10-24 on main
- Released: 2023-12-15 in cedar-policy v3.0.0
Note: These statuses are based on the first version of the RFC process.
Summary
Today, Cedar allows duplicate keys in record literals (and other record values, including context), with last-value-wins semantics.
This RFC proposes to disallow duplicate keys in record values, making that an error.
Basic example
Today,
permit(principal, action, resource)
when {
resource.data == { foo: 2, foo: 3 }
};
is a valid Cedar policy. The foo: 2 is dropped and foo will have the value 3 in the record literal.
In other words, in Cedar,
{ foo: 2, foo: 3 } == { foo: 3 }
evaluates to true.
More precisely, Cedar's semantics does not just drop values prior to the last, but fully evaluates them before throwing them away. This has consequences in the presence of errors. For instance,
{ foo: 1 + "hello", foo: 3 }
does not evaluate to { foo: 3 } as you might expect based on the previous example, but rather, it results in an evaluation error.
This RFC proposes that all of the above expressions and policies would result in an evaluation error due to duplicate keys in a record.
Motivation
There are two main motivations for this change.
First, in the status quo, Cedar policy ASTs actually cannot losslessly roundtrip to EST and back. This is because the AST representation correctly allows duplicate keys in record literals, but the EST representation drops them (due to an oversight during EST design). In this way, the EST is unfaithful to Cedar semantics, and not all valid Cedar policies are expressible in the EST, which violates the EST's design goal of being lossless modulo whitespace/comments/etc.
More precisely, if a JSON EST contains duplicate record keys in a record literal, our JSON parser will take last-value-wins, and the internal EST representation will contain only the last value.
This results in the correct behavior on examples like { foo: 2, foo: 3 }, but the incorrect behavior on examples containing errors, like { foo: 1 + "hello", foo: 3 }.
It also means that roundtripping the AST representation to the EST representation and back necessarily loses information -- specifically, the values associated with all but the last instance of each key.
The second motivation is that the status quo makes things difficult for our Cedar formal specification and accompanying proofs. The proofs are required to reason about "well-formed" records as a pre- and post-condition, which could be avoided if records were structurally prohibited from having duplicate keys (e.g., by using a map type instead of an association list). The status quo also requires reasoning about the order keys appear in a record literal, meaning that the keys can't be naively sorted without preserving information about their original order. If duplicate keys were prohibited, the order of keys wouldn't matter (but see "Unresolved questions" below).
Two secondary, more minor motivations for this proposal are:
- It would bring the Cedar language in line with our desired behavior on duplicate keys in other Cedar API surfaces. See cedar#159.
- Cedar records resemble other languages' struct types more than they resemble map types. The current behavior would be unexpected for struct types, and the proposed behavior would be intuitive.
Detailed design
We'd do something very like cedar#169, except that instead of implicitly dropping all but the last value for each key (in .collect()), we'd be careful to emit an error when constructing an Expr if the record has duplicate keys.
In the case of record literals, the error could be caught in the CST->AST step, resulting in the user seeing a parse error rather than an evaluation error.
In the case of context, the error would need to be caught when constructing the Context object.
In the case of records in entity attributes, the error could be caught when constructing an Entity object.
There are no other cases, because currently, Cedar has no other way to construct a record value: there are only record literals, and records preexisting in context or entity attribute values.
in Core, so we would not need a new category of evaluation error. No new evaluation errors would happen due to this proposal.
Another consequence of the above choices is that duplicate-key errors "take precedence" over (other) evaluation errors. The expression { one: 1 - "three", one: 1 } will throw a duplicate key error and not a type error.
I don't believe this change would have a significant effect on validation or width/depth subtyping of records, but please correct me if I'm wrong.
Drawbacks
This is a breaking change, not just to the Cedar API surface, but to the Cedar language (policy syntax) itself. That's a significant drawback not to be taken lightly. In this case, it's mitigated somewhat by the fact that we're turning previously-valid policies into errors, not changing them into a silently different behavior. It's also mitigated by the fact that most Cedar users are probably (citation needed) not writing policies in the class that will have a behavior change -- i.e., policies with duplicate keys in record literals. And finally, it's mitigated by the fact that Cedar's current behavior on duplicate keys was not documented in the public docs (e.g., in this section), so if any policies are relying on this behavior, it's probably by accident.
Alternatives
The main alternative is doing nothing, which of course has the reverse pros/cons of the proposal as explained in "Motivation" and "Drawbacks".
Another alternative, which would address the first motivation and not the second, would be to change the EST format so that it supports record literals with duplicate keys. I believe this would require a breaking change to the EST format, and record literals would not be expressible as JSON maps, which is unfortunate because I believe that's the most natural/ergonomic. (Devil's advocate: we could maybe find a way to support JSON maps in the common case with some escape syntax for expressing record literals that have duplicate keys. If we want to take this alternative seriously, I can investigate this.)
Unresolved questions
Order-independence for record keys
[Resolved: we won't make any further changes as part of this RFC, but may address some of this in a different RFC]
At first glance, one would hope that this change would be sufficient to ensure that the order of keys in a record literal (or more generally, any record value) does not matter. However, that's not quite true -- if keys are mapped to expressions with errors, the order does matter, because only the first error encountered will be returned. While we're making this change, should we also make a change so that we have true order-independence for Cedar record keys? I could think of at least three ways to do that:
- Always evaluate all values in a record, and return potentially multiple evaluation errors rather than just the first
- Provide no guarantee about which error the user will receive when multiple record keys have errors (nondeterministic error behavior) -- I think the formal spec currently guarantees the first error?
- Guarantee that record values will be evaluated in lexicographic order of their keys, regardless of the order the keys appear in the policy text
This issue is particularly hairy for records contained in attribute values in the entity store -- I believe these records go through a hashmap representation in our current datapath, so we lose ordering information and also duplicate keys (silently). Since the values in these records are restricted expressions, there are very few ways they can produce evaluation errors, but they still can, in the case of calls to extension functions. This is probably a potential source of difference between the errors produced in the Rust implementation and in the formal spec; I wonder why DRT hasn't caught this, particularly in the case where we drop a duplicate key that was mapped to an erroring expression. Maybe the Dafny JSON parser silently takes last-value-wins? or maybe we don't test extension values in attribute values?
Cedar Schema Syntax
Related issues and PRs
Timeline
- Started: 2023-07-24
- Accepted: 2023-10-02
- Landed: 2024-02-19 on main
- Released: 2024-03-08 in cedar-policy v3.1.0
Note: These statuses are based on the first version of the RFC process.
Summary
This document proposes a custom syntax for Cedar schemas. The syntax was developed with the following goals in mind:
- Use the same concepts in both the JSON-based and custom syntax, allowing them to be naturally inter-converted
- Reuse Cedar policy syntax as much as possible, to help with intuition
- When no Cedar policy syntax analogue exists, leverage ideas from other programming languages that favor readability
- Use as few syntactic concepts as possible; similar ideas in the schema should look similar syntactically
- Support converting back and forth between the JSON-based schema without losing information
This is not a proposal to replace the JSON-based syntax; the aim is to provide an additional syntax that better satisfies the goals above. The JSON syntax is appropriate in many circumstances, such as when constructing schemas systematically. Adopting this proposal would put schemas on the same footing as Cedar policies, which have a custom and a JSON syntax.
Basic example
Here is the proposed syntax for the TinyTodo schema, whose auto-formatted JSON version is 160 lines long.
entity Application;
entity User in [Team,Application] { name: String };
entity Team in [Team,Application];
entity List in [Application] {
owner: User,
name: String,
readers: Team,
editors: Team,
tasks: Set<{name: String, id: Long, state: String}>
};
action CreateList, GetLists
appliesTo { principal: [User], resource: [Application] };
action GetList, UpdateList, DeleteList, CreateTask, UpdateTask, DeleteTask, EditShares
appliesTo { principal: [User], resource:[List] };
Motivation
Cedar schemas for non-toy applications are hard to read and write. The root cause of the difficulty is the use of JSON:
- JSON has low information density. For three applications we’ve modeled — TinyTodo, Document Cloud, and GitHub — schemas span 4-5 pages.
- JSON does not support comments. This means that any design intent for a schema cannot be expressed inline.
We believe that a custom syntax for Cedar schemas can help. It can be more information-dense, support comments, and leverage familiar syntactic constructs. We think customers will be happier with a custom syntax: we regularly hear favorable reviews of Cedar's custom policy syntax, as compared to a JSON-based syntax.
Detailed Design
In what follows we first present the syntax by example. A full grammar is given at the end.
Comments
Cedar schemas can use line-ending comments in the same style as Cedar, e.g., // this is a comment.
Namespace format
namespace My::Namespace {
...
}
A Schema can have zero or more namespace declarations. The contents of each (... above) comprise entity, action, and common-type declarations, all of which are unordered; i.e., they are all allowed to mutually refer to one another (but see discussion on common types below). The recommended style is to put entity declarations first, followed by action declarations, with common type declarations interspersed as needed. Entity, action, and common types referenced in the same namespace in which they are defined need not be prefixed by that namespace.
Entity, action, and common-type declarations can appear on their own, outside of any namespace; in that case they are implicitly within the empty namespace.
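For example, filling in the skeleton above with hypothetical declarations:
namespace My::Namespace {
entity User in [UserGroup];
entity UserGroup;
entity Photo { owner: User };
action viewPhoto appliesTo { principal: [User], resource: [Photo] };
}
// Declared outside any namespace: implicitly in the empty namespace.
entity Application;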
Entity declarations
Here is an example illustrating the entity declaration syntax.
entity User in [Group] {
personalGroup: Group,
delegate?: User,
blocked: Set<User>,
};
entity Document {
owner: User,
isPrivate: Boolean,
publicAccess: String,
viewACL: DocumentShare,
modifyACL: DocumentShare,
manageACL: DocumentShare
};
entity DocumentShare;
entity Public in [DocumentShare];
The first declaration indicates that User is an entity type, can be a member of the Group type, and has three attributes: personalGroup with entity type Group, an optional delegate with entity type User, and blocked with type Set<User>.
To allow syntactic parity with common type declarations, discussed below, we can optionally add = just before the { ... } defining an entity's attributes, e.g., entity Dog = { name: String } rather than entity Dog { name: String }.
Allowing = also supports using a common type name as the entity's shape, and also allows a future extension in which an entity's shape is a type other than a record.
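For example (a sketch; Profile is a hypothetical common type):
type Profile = { name: String };
entity Dog = Profile;              // shape given by a common-type name
entity Cat = { name: String };     // equivalent shape written inline, using the optional =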
Entity declarations may or may not have the in ... component, and they may or may not have attributes. A singleton list can be expressed without brackets, e.g., entity User in Group = ...
Optional attributes are specified in the style of Typescript, with a ? after their name. The last element of the attribute list can be followed by an optional trailing comma.
Entity types that share a definition can be declared together. For example, the following declares two entity types User and Team, both of which can be members of Team and both of which have a name attribute.
entity User, Team in [Team] { name: String };
Since the JSON-based format does not support groups, this creates a challenge for back-and-forth translation. Our suggested solution is to add an optional groupid attribute to an entity declaration in the JSON. Those entities that are in the same group will be put together when translating to the custom syntax (after confirming that they indeed have the same type), and when translating to JSON such a group ID will be automatically generated.
Action declarations
Here is an example illustrating the action declaration syntax.
action ReadActions;
action WriteActions;
action CreateDocument
appliesTo { principal: [User], resource: [Drive] };
action DeleteDocument in [WriteActions]
appliesTo { principal: [User], resource: [Document] };
action ViewDocument in [ReadActions] appliesTo {
principal: [User,Public],
resource: Document,
context: {
network: ipaddr,
browser: String
}
};
A declared action indicates either none, one, or both of the following:
- What action groups it is a member of, using in
- What principal, resource, and context types it may be coupled with in a request (if any), using appliesTo
Specifying [] for the in component is equivalent to omitting in entirely. An action that has no appliesTo component essentially names an action group: having no possible principal and/or resource types means there's no way to submit a legal request with the action.
The appliesTo specification uses record syntax, where the "attributes" of the record may be any combination of principal, resource, and context, with the following constraints.
- If principal and/or resource is given, the accompanying type must be either
  - a single entity type, or
  - a non-empty list of entity types
- If a principal or resource element is not given, that means that this request component is unspecified, i.e., corresponding to the None option in the principal or resource component of a Request.
- If context is given, the accompanying type must be a record type. If it is not given, the type {} is assumed.
- At least one of principal, resource, or context must be included if appliesTo is present; i.e., writing appliesTo { } is not allowed.
Here are two additional examples that follow these rules:
action ViewDocument in ReadActions appliesTo {
resource: Document,
context: {}
};
action Ping appliesTo {
context: {
source: ipaddr,
dest: ipaddr
}
};
Since actions are entities, action names are entity IDs, which can be arbitrary strings. Thus we admit the following more general syntax as well.
action "Delete Document $$" in ["Write Actions"] appliesTo {
principal: [User],
resource: [Document]
};
We anticipate future support for attributes as part of action entities. If/when they are supported, we can specify action attributes after the appliesTo part as an inline record using Cedar syntax following an attributes keyword. For example, here's an extension of ViewDocument with action attributes.
action ViewDocument in [ReadActions]
appliesTo {
principal: [User,Public],
resource: Document,
context: {
network: ipaddr,
browser: String
}
}
attributes {
version: 1,
group: "meta"
};
As with entity declarations, since the JSON-based format does not support groups, we can add an optional groupid attribute to an action declaration in the JSON. Those actions that are in the same group will be put together when translating to the custom syntax (after confirming that they indeed have the same type), and when translating to JSON such a group ID will be automatically generated.
Common types
You can define names for types (as in Typescript, OCaml, Haskell, and many other languages) and then use them in action and entity declarations. Here is an example:
type authcontext = {
ip: ipaddr,
is_authenticated: Boolean,
timestamp: Long
};
entity Ticket {
who: String,
operation: Long,
request: authcontext
};
action view appliesTo { context: authcontext };
action upload appliesTo { context: authcontext };
Note the use of = for type declarations, but the lack of (required) = for entity declarations; the = is needed because the defining type may not be a record, e.g., type time = Long.
As with entity and action declarations, the name implicitly includes the surrounding namespace as a prefix.
While common types can be declared anywhere within a namespace and safely referenced by entity and action declarations, they cannot refer to one another, to avoid introducing definitional cycles. (To allow mutual reference, we could make declaration order matter: entity, action, and other type declarations cannot refer to a common type before it is defined.)
Much later note, May 2024: In the fullness of time we relaxed the above restriction and allowed common types to refer to one another, as long as there are no definitional cycles. See cedar#766.
Disambiguating Types in Custom Syntax
Cedar has four kinds of type name:
- Primitive types, e.g., Long, String, etc.
- Extension types, currently just ipaddr and decimal
- Entity types
- Common types
The first two are built in, and could potentially change in the future. The last two are schema-declarable; they have an associated namespace prefix, but references within the defining namespace can drop the prefix.
In the current JSON syntax, common types may not overlap with primitive types; e.g., declaring a common type Long is disallowed. However, common types are permitted to overlap with extension types; e.g., declaring common type ipaddr is allowed. This works in the JSON because specifying an extension type requires an explicit "type" designator, as in the following:
"addr": {
"ip": { "type": "Extension", "name": "ipaddr" },
}
If we also had defined a common type ipaddr we would reference it by writing
"addr": {
"myip": { "type": "ipaddr" },
}
Common types may also overlap with entity types, for similar reasons: To reference an entity type you must write { "type": "Entity", "name": "Foo" } while to reference a common type you'd just write { "type": "Foo" }.
The custom syntax does not require such designators, for readability, so we need a way to disambiguate type name references when necessary.
When two different kinds of type definitions are in separate namespaces, we rely on the namespace resolution rules to disambiguate. For example, suppose we defined the following schema:
type email_address = String; // (1)
namespace Foo {
entity email_address { addr: String }; // (2)
entity User { email: email_address }; // (3)
}
Here, in the definition of User, the unqualified type email_address at (3) resolves to the entity definition at (2), because it is in the closest enclosing scope.
When two different kinds of type definitions are in the same namespace, we propose the following five rules:
- Issue a warning (for both syntaxes) when a schema defines the same typename twice (within a namespace)
- Disallow overlap of extension and primitive type names
- Resolve name references in a priority order
- Reserve __cedar as a namespace to disambiguate extension/primitive types from others
- Handle entity/common typename overlaps in translation
These rules aim to address several (additional) tenets
- Conflicting type names are an exception to be discouraged, rather than a norm, for readability and clarity
- Schemas should not break when new extension types are added to Cedar
- Backward-incompatible changes should be minimized
Rule 1: Issue a warning (for both syntaxes) when a schema defines the same typename twice
Defining an entity type with the same name as a primitive or extension type, or an entity and common type with the same name (within the same namespace), is unlikely to be a good idea. Though they can be technically disambiguated in the JSON syntax, the potential for confusion is nontrivial. None of the rules that follow need to be understood, and conversion to/from JSON and custom syntax is assured, if defined types are all kept distinct.
Rule 2: Disallow overlap of extension and primitive type names
Right now, all primitive types have different names than extension types. We propose that it should always be that way. Though the flexibility is there in today’s JSON schema to allow overlap, we see no good reason to allow it. Because we are currently the sole purveyor of extension types, this rule is easy to implement.
Rule 3: Resolve name references in a priority order
When the schema references a type name (without a namespace prefix), it should resolve in the following order:
- common type
- entity type
- primitive
- extension type
For example, suppose we declared a common type: type ipaddr = Long; If the schema later referred to ipaddr, then that reference would be to this common type definition, not the extension type, since the common type is higher in the order. This ordering ensures that future-added extension types will not break existing schemas. For example, if we added a future extension type date, then schemas that define an entity or common type of the same name would have the same semantics as before: they would refer to the schema-defined type, not the new extension type.
In tabular form, this priority order is depicted below. We write n/a for the case that the two kinds of type can never share the same name (due to the status quo and rule 2).
|           | common | entity | primitive | extension |
| ---       | ---    | ---    | ---       | ---       |
| common    |        |        |           |           |
| entity    | common |        |           |           |
| primitive | n/a    | entity |           |           |
| extension | common | entity | n/a       |           |
The next two rules consider how to disambiguate conflicting types.
Rule 4: Reserve __cedar as a namespace to disambiguate extension/primitive types from others
If a schema defines a common or entity type that overlaps with an extension type, there is no way to refer to the extension type — the common or entity type always takes precedence. This is particularly problematic if a new extension type overlaps with an existing definition, and an update to the schema might wish to use both.
To rectify this problem, we propose reserving the namespace __cedar and allowing it to be used as a prefix for both primitive and extension types. This way, an entity type ipaddr can be referred to directly, and the extension type ipaddr can be referred to as __cedar::ipaddr. Likewise, this allows one to (say) define an entity type String that can be referenced distinctly from __cedar::String (though this practice should be discouraged!).
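For example (a sketch; assumes the schema declares its own ipaddr entity type):
entity ipaddr;
entity Connection {
source: ipaddr,              // the entity type declared above, per the priority order
sourceIp: __cedar::ipaddr    // the ipaddr extension type, via the reserved namespace
};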
Rule 5: Handle entity/common typename overlaps in translation
If a schema defines an entity type and a common type with the same name, the references to that name will resolve to the common type, and there will be no way to reference the entity type in the custom syntax. Per rule 1, users will be warned of this.
Custom syntax users can rectify the situation by changing either the common type or entity type to a different name (or placing them in different namespaces). JSON syntax users can refer to both types distinctly, but will be warned that translation to the custom syntax will not be possible without changing the name. Translation tools can easily make this name change automatically.
Note that translation from JSON to custom syntax is only problematic when a schema defines both a common and entity type of the same name. Translation tools can easily address this issue by renaming the common type (and offering to change the type name in the original JSON).
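For instance (a hypothetical schema fragment):
type Ticket = { id: Long };
entity Ticket { id: Long };
entity Audit { subject: Ticket };   // resolves to the common type Ticket (rule 3)
Here the reference in Audit resolves to the common type, so the entity type Ticket cannot be named in the custom syntax; renaming one of the two, as rule 5 suggests, removes the ambiguity.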
Evaluating these rules against our added tenets above, we can see:
- Conflicting type names are discouraged by issuing a warning when there is overlap
- Due to the choice in priority order, schemas will not break when new extension types are added
- Only rule 4 (introducing the __cedar reserved namespace) is backward incompatible, and it is very unlikely to affect existing schemas.
Rejected alternative designs for disambiguation
One alternative design we considered was to use prefixes on typename references to indicate the kind of type, e.g., extension types could be optionally prefixed with +, e.g., +ipaddr and +decimal, and entity types could be optionally prefixed with &, e.g., &User and &My::Namespace::List. However, we rejected this design because it would be unfamiliar to users (especially the use of +), and doesn't discourage the use of overlapping type names.
Another alternative we considered was to forbid type-name overlaps entirely. However, we rejected this design because it is strongly backward-incompatible, and also because it could lead to surprise if a future definition of an extension type happened to match a common or entity type in use in an existing schema.
Mockup: Document Cloud
Here is a version of the Document Cloud schema.
namespace DocCloud {
entity User in [Group] {
personalGroup: Group,
blocked: Set<User>
};
entity Group in [DocumentShare] {
owner: User
};
entity Document {
owner: User,
isPrivate: Boolean,
publicAccess: String,
viewACL: DocumentShare,
modifyACL: DocumentShare,
manageACL: DocumentShare
};
entity DocumentShare;
entity Public in [DocumentShare];
entity Drive;
action CreateDocument, CreateGroup
appliesTo { principal: [User], resource: [Drive] };
action ViewDocument
appliesTo { principal: [User,Public], resource: [Document] };
action DeleteDocument, ModifyDocument, EditIsPrivate, AddToShareACL, EditPublicAccess
appliesTo { principal: [User], resource: [Document] };
action ModifyGroup, DeleteGroup
appliesTo { principal: [User], resource: [Group] };
}
Mockup: GitHub Model
Here is a version of the GitHub schema.
namespace GitHub {
entity User in [UserGroup,Team];
entity UserGroup in [UserGroup];
entity Repository {
readers: UserGroup,
triagers: UserGroup,
writers: UserGroup,
maintainers: UserGroup,
admins: UserGroup
};
entity Issue {
repo: Repository,
reporter: User
};
entity Org {
members: UserGroup,
owners: UserGroup,
memberOfTypes: UserGroup
};
action pull, fork, push, add_reader, add_triager, add_writer, add_maintainer, add_admin
appliesTo { principal: [User], resource: [Repository] };
action delete_issue, edit_issue, assign_issue
appliesTo { principal: [User], resource: [Issue] };
}
Grammar
Here is a grammar for the proposed schema syntax.
The grammar applies the following conventions. Capitalized words stand for grammar productions, and lexical tokens are given in all-caps. When productions or tokens match those in the Cedar policy grammar, we use the same names (e.g., IDENT
and Path
).
For grammar productions it uses |
for alternatives, []
for optional productions, ()
for grouping, and {}
for repetition of a form zero or more times.
Tokens are defined using regular expressions, where []
stands for character ranges; |
stands for alternation; *
, +
, and ?
stand for zero or more, one or more, and zero or one occurrences, respectively; ~
stands for complement; and -
stands for difference. The grammar ignores whitespace and comments.
Schema := {Namespace}
Namespace := ('namespace' Path '{' {Decl} '}') | {Decl}
Decl := Entity | Action | TypeDecl
Entity := 'entity' Idents ['in' EntOrTyps] [['='] RecType] ';'
Action := 'action' Names ['in' (QualName | '[' [QualNames] ']')] [AppliesTo] [ActAttrs]';'
TypeDecl := 'type' IDENT '=' Type ';'
Type := PRIMTYPE | Path | SetType | RecType
EntType := Path
SetType := 'Set' '<' Type '>'
RecType := '{' [AttrDecls] '}'
AttrDecls := Name ['?'] ':' Type [',' | ',' AttrDecls]
AppliesTo := 'appliesTo' '{' AppDecls '}'
ActAttrs := 'attributes' '{' AttrDecls '}'
AppDecls := ('principal' | 'resource') ':' EntOrTyps [',' | ',' AppDecls]
| 'context' ':' RecType [',' | ',' AppDecls]
QualName := Name | Path '::' STR
QualNames := QualName {',' QualName }
Path := IDENT {'::' IDENT}
EntTypes := Path {',' Path}
EntOrTyps := EntType | '[' [EntTypes] ']'
Name := IDENT | STR
Names := Name {',' Name}
Idents := IDENT {',' IDENT}
IDENT := ['_''a'-'z''A'-'Z']['_''a'-'z''A'-'Z''0'-'9']* - PRIMTYPE
STR := Fully-escaped Unicode surrounded by '"'s
PRIMTYPE := 'Long' | 'String' | 'Bool'
WHITESPC := Unicode whitespace
COMMENT := '//' ~NEWLINE* NEWLINE
Drawbacks
There are several reasons not to develop a custom syntax:
Multiple formats can raise cognitive burden
Adding another schema format raises the bar for what customers need to know. They may ignore one format at first, but eventually they may need to learn both. For example, if one team uses the JSON format and another uses the custom format, but then the teams merge, they will each end up having to read/update the other's format. They may also fight about which format to move to.
Mitigating this problem is that it's easy and predictable to convert between the two formats, since the syntactic concepts line up very closely. The features you lose when converting from the new format to the JSON one would be 1/ any comments, and 2/ any use of intermingling of action
, entity
, and type
declarations, since they must all be in their own sections in the JSON. Otherwise the features line up very closely and the only difference is syntax. We would expect to write conversion tools as part of implementing the new syntax (which are easily achieved by parsing in the new format and pretty-printing in JSON).
Another mitigating point is that it's early days for Cedar, and we can promote the custom syntax as the preferred choice. JSON should be used when writing tooling to auto-author or manipulate schemas, e.g., as part of a GUI. We have a JSON syntax for Cedar policies for similar reasons, but it's the custom syntax that is front and center.
Requirement of good tooling
As a practical matter, having a custom schema syntax will require that we develop high-quality tools to help authors write correct schemas.
Parse error messages need to be of good quality, and report common issues such as missing curly braces, missing semi-colons, incorrect keywords, etc. A formatting tool can help ensure standardized presentation. An IDE plugin can put these things together to provide interactive feedback. For the JSON syntax, IDE plugins already exist to flag parse errors and to format documents uniformly.
However, JSON plugins do not understand the semantics of Cedar schemas, so they cannot provide hints about semantic errors (e.g., that setting principalTypes
to []
for an action
effectively means the action
cannot be used in a request). We could develop a linting tool that flags these issues, and it could be applied to both syntaxes.
Note that the problem of matching up curly braces in JSON-formatted schemas is more acute than in the custom syntax, since it's common to have deep nesting levels where matching is hard to eyeball. For example, in the TinyTodo
JSON schema, we have up to seven levels of direct nesting of curly braces, whereas the custom syntax has only three, and these levels are more visually isolated because of the other syntax in between them.
Greater implementation cost
Supporting a new syntax is an extra implementation cost, including the new tools mentioned above, now and going forward. More code/tools means more potential for bugs.
Mitigating this problem: The Rust internals already parse the JSON schema to a data structure, so adding a new format only involves adding a parser to this data structure and not, for example, making changes to the validator. We can use property-based testing to ensure that both formats are interconvertible, and correctly round-trip, i.e., that parse(pretty-print(AST)) == AST.
Alternatives
One alternative would be to replace the current JSON-based syntax with the one in this proposal. Doing so would avoid the "cognitive burden" drawback mentioned above, but would be a disruptive, backward-incompatible change, and would lose the JSON format's benefits of existing tooling and easier programmatic schema construction.
Another alternative would be to adopt a Yaml-based syntax. This approach would meet our goals of greater information density and support for comments, and it would come with some existing tooling (such as IDE extensions). A downside of Yaml is that it provides more than we need, and its lack of conciseness can lead to confusion. We could make our own parser for the subset of Yaml we wish to support for schemas, but that could itself confuse users. Yaml's indentation-sensitive parsing also means that an indentation mistake will be silently accepted, leading to a confusing user experience. Our custom syntax is whitespace-insensitive, and having total control over the grammar means better context for error messages.
Appendix A - AppliesTo Conversion Table
The relationship of appliesTo
clauses between JSON schemas and this new syntax is non-obvious when unspecified entities or actions with no targets are involved.
This appendix details a conversion table for the purposes of specification.
Both Unspecified
The following JSON schemas all set principal
and resource
to be unspecified, and leave the context
as the empty record. They all map to the same schema in the custom syntax:
"actions" : {
"read" : {
}
}
"actions" : {
"read" : {
"appliesTo" : null
}
}
"actions" : {
"read" : {
"appliesTo" : {
}
}
}
This means that both principal
and resource
are unspecified and that context
is the empty record. This becomes:
action read appliesTo {
  context : {}
};
We must include the context : {}
because action read appliesTo { };
is not valid in the grammar (AppDecls requires at least one declaration), and a bare action read;
denotes a non-applicable action (see the end of this appendix).
Missing Both/Yes Context
"actions" : {
"read" : {
"appliesTo" : {
"context" : A
}
}
}
This means that both principal
and resource
are unspecified and that context
has type A. This becomes:
action read appliesTo {
context : A
};
Missing principalTypes
/No Context
"actions" : {
"read" : {
"appliesTo" : {
"resourceTypes" : [A]
}
}
}
This means that principal
is unspecified, but resource
is specified. context
is the empty record. Becomes: (Including the empty context would also be valid)
action read appliesTo {
  resource : [A]
};
Missing principalTypes
/Yes Context
"actions" : {
"read" : {
"appliesTo" : {
"resourceTypes" : [A]
"context" : B
}
}
}
This means that principal
is unspecified, but resource
is specified. context
has type B
. Becomes:
action read appliesTo {
  resource : [A],
  context : B
};
Missing resourceTypes
/No Context
"actions" : {
"read" : {
"appliesTo" : {
"principalTypes" : [A]
}
}
}
This means that resource
is unspecified, but principal
is specified. context
is the empty record. Becomes: (Including the empty context would also be valid).
action read appliesTo {
  principal: [A]
};
Missing resourceTypes
/Yes Context
"actions" : {
"read" : {
"applieTo" : {
"principalTypes" : [A],
"context" : B
}
}
}
This means that resource
is unspecified, but principal
is specified. context
has type B. Becomes:
action read appliesTo {
principal: [A],
context: B
};
Non-Applicable Actions
The following all collapse to the same schema object. They all involve action types which are not applicable. Specifying that you are applicable to some principal
P and not applicable to any resources is equivalent to not being applicable to any principals or resources.
"actions" : {
"read" : {
"appliesTo" : {
"principalTypes" : [],
"resourceTypes" : []
}
}
}
"actions" : {
"read" : {
"appliesTo" : {
"principalTypes" : [],
"resourceTypes" : [A]
}
}
}
"actions" : {
"read" : {
"appliesTo" : {
"principalTypes" : [A],
"resourceTypes" : []
}
}
}
"actions" : {
"read" : {
"appliesTo" : {
"principalTypes" : [],
"resourceTypes" : [],
"context" : A
}
}
}
"actions" : {
"read" : {
"appliesTo" : {
"principalTypes" : [A],
"resourceTypes" : [],
"context" : B
}
}
}
"actions" : {
"read" : {
"appliesTo" : {
"principalTypes" : [],
"resourceTypes" : [A],
"context" : B
}
}
}
All of the above map to the following:
action read;
Port Cedar formalization to Lean
Related issues and PRs
- Implementation PR(s): cedar-spec#138
Timeline
- Started: 2023-10-12
- Accepted: 2023-10-24
- Landed: 2023-10-26 on
main
- Released: The Dafny formalization was deprecated in
cedar-spec
v3.1.0 (released 2024-03-08)
Note: These statuses are based on the first version of the RFC process.
Summary
This RFC proposes to port our current Cedar models and proofs (written in Dafny) to an alternative verification tool (Lean) that is better suited for meta-theory proofs, like Cedar’s validation soundness proof.
Note: The changes proposed in this RFC do not impact the cedar-policy/cedar
repository or any of its associated Rust crates. The proposed changes are restricted to the formalization available in cedar-policy/cedar-spec
.
Motivation
Following a verification-guided development process, we use proofs to gate the release of Cedar’s core components. To release a new version, the Rust implementation of each component is differentially tested against its Dafny model, which is formally verified against key security and correctness properties.
We chose Dafny for its balance of usability and ability to automatically prove basic security properties of Cedar's authorizer. But proving Cedar's meta-theoretic properties (such as validator soundness) is less suited for Dafny’s automation (e.g., see this issue and associated PRs). For these proofs to be robust in terms of performance and maintenance, they need to be highly detailed, favoring the use of an interactive theorem prover. We propose porting Cedar to Lean, an interactive theorem prover that is well suited for large-scale proofs about meta-theory.
Detailed design
Cedar consists of three core components: (1) the evaluator, (2) authorizer, and (3) validator. The evaluator interprets Cedar policies, and the authorizer combines the results of evaluating multiple policies into an authorization decision. Cedar’s security theorems prove basic properties of the authorizer (e.g., deny overrides allow). The validator is a tool for working with Cedar policies. It type checks a policy against a schema. To assure correctness, we prove a validator soundness theorem which guarantees that if the validator accepts a policy, evaluating that policy on a valid input can never cause a type error.
All core components of Cedar are modeled in Dafny and have proofs. There are ~6,000 lines of modeling code. The proofs for the authorizer and evaluator are small: the largest is ~200 LOC. The validation soundness proof is ~2,000 LOC.
For each component, we plan to port the model first and start differentially testing it against the Rust implementation, and then port the proofs. The Dafny and Lean code will co-exist while we work on the port. Once the Lean validator proofs are complete, we will archive the Cedar Dafny folder.
We believe it is best to do the port now while our development is relatively small, and before we begin working on proving other meta-theoretic properties, like soundness of the partial evaluator (which is not yet modeled in Dafny).
Drawbacks
We will need to support both Dafny and Lean proofs for a period of time, to ensure that Cedar remains sound and there are no gaps.
Precomputed Entity Attributes
Related issues and PRs
- Related RFC(s): This RFC is mostly directly taken from RFC 28;
it splits out the changes from RFC 28 that relate to precomputed entity
attributes, without the larger changes in RFC 28 regarding
EntityDataSource
. This RFC is also the successor to RFC 23. - Implementation PR(s): cedar#430
Timeline
- Started: 2023-10-24 (predecessor #28 started on 2023-10-03, and predecessor #23 started on 2023-07-18)
- Accepted: 2023-11-14
- Landed: 2023-11-16 on
main
- Released: 2023-12-15 in
cedar-policy
v3.0.0
Note: These statuses are based on the first version of the RFC process.
Summary
Internally, Cedar entity objects will store their attribute values as
precomputed Value
s, rather than in RestrictedExpression
form.
This entails some minor breaking changes to the Cedar public API, but provides
important efficiency gains.
Motivation
- Today, every is_authorized() request re-evaluates all of the attributes in the Entities; they are stored as restricted expressions and evaluated to Value on every call to is_authorized(). (See cedar#227.) This is inefficient.
- Today, Entities::evaluate() is implemented on main and provides a way to mitigate this issue, but it is opt-in and we'd rather have it be the default behavior. Notably, as of this writing, Entities::evaluate() is not released yet, so we can make changes in this area without breaking our users.
Detailed design
This RFC has two components (a third component was severed and moved to Alternative A after discussion):
Component 1: Store attributes as precomputed Value
s
We redefine Entities
and Entity
to hold attributes as Value
s instead of as
restricted expressions.
This addresses the first point in Motivation.
This change is breaking for the public API in two (small) ways:
- Entity::new(), Entities::from_json_*(), and friends can take the same inputs, but will need to sometimes return an evaluation error due to errors evaluating the entity attributes (particularly, errors from evaluating extension functions for attribute values of type ipaddr or decimal). Today, Entity::new() never returns an error, and the Entities functions can return EntitiesError but not an evaluation error.
- Conversely, Entity::attr() need not return an error anymore, as all of the attributes will already be evaluated. (If we are very concerned about backward compatibility, we could keep the current signature and just never return Err. Currently, this RFC is proposing we change the signature to not contain Result, since we're already making the related breaking change to Entity::new().)
Accepting these breaking changes allows us to give users the best-performing behavior by default. For alternatives that are less breaking, see Alternatives.
Component 2: Remove no-longer-necessary interface to precompute Value
s
We remove Entities::evaluate()
, as all Entities
are now stored in their
evaluated form by default.
This addresses the second point in Motivation.
This is not a breaking change because Entities::evaluate()
has not yet been
released in any Cedar release.
Drawbacks
- This RFC represents some breaking changes for users upgrading to a new major version of Cedar. All breaking changes come with some cost in terms of user experience for existing users. This RFC contends that the benefit outweighs the cost in this case.
Alternatives
Alternative A
In addition to what is currently proposed in this RFC, we could add the
following change, allowing users to construct entities using precomputed
Value
s.
Motivation
Today, Entity::new()
requires callers to pass in attribute values as
RestrictedExpression
.
For some callers, evaluating these RestrictedExpression
s is an unnecessary
source of runtime evaluation errors (and performance overhead).
Detailed design
We add new constructors for Entities
and Entity
which take Value
s
instead of RestrictedExpression
s.
These constructors would work without throwing evaluation errors, in contrast to
the existing constructors after the changes described in the main part of this
RFC.
This change requires that we expose Value
in our public API in some manner
(probably via a wrapper around the Core implementation, as we do for many other
Core types).
Note that today, EvalResult
plays the role of Value
in our public API, but
conversion between EvalResult
and Value
is relatively expensive (O(n)
).
We could either:
- expose Value but leave EvalResult alone. This provides maximum backwards compatibility, but might cause confusion, as Value and EvalResult would be conceptually similar, but slightly different and not interchangeable.
- consolidate EvalResult and Value into a single representation, called Value and based on the Core Value. This provides maximum API cleanliness but requires the most breaking changes.
- expose Value, but keep EvalResult as a type synonym for Value. This is almost the same as the above option, but keeps the EvalResult type name around, even though it makes breaking changes to the structure of EvalResult. This might reduce confusion for users who are migrating, as they only have to adapt to the changes to EvalResult and not rename EvalResult to Value in their own code; but, it might increase confusion for future users, who will see two names for the same thing.
- not expose Value and just use EvalResult, even in the new constructors. This provides maximum backwards compatibility, like the first option, and also avoids the possible confusion generated by the first option, but comes at a notable performance cost (due to the inefficient EvalResult-to-Value conversion required under the hood) and might generate a different kind of confusion as EvalResult is an awkward name in the context of these constructors.
Commentary
The sub-proposal in this alternative is severable and could be its own RFC, or perhaps a change that doesn't rise to the level of requiring an RFC. We have the freedom to do it any time in the future (after the rest of the RFC is accepted and implemented) if we don't want to do it now.
@mwhicks1's opinion, which I tend to agree with, is that this alternative is
probably more trouble than it's worth -- that it provides only marginal benefit
to users at the cost of cluttering our API, both with additional constructors,
and with the public Value
type.
If we don't take this alternative, users will still have to construct Entity
and Entities
via restricted expressions, using constructors which can throw
errors.
But, we would be able to avoid having to expose Value
or make changes to
EvalResult
, at least for now.
Alternative B
In addition to what is currently proposed in the current RFC, we could provide
the status-quo definitions of Entities
and Entity
under new names: e.g.,
UnevaluatedEntities
and UnevaluatedEntity
(naming of course TBD).
This is very unlikely to be of any use to new users, but existing users really
wanting the old error behavior could use these types and functions instead of
Entities
and Entity
.
To me, this provides little benefit, at the expense of more confusion and more APIs to maintain. It also doesn't make any of the changes more back-compatible; for users, migrating their code to use these renamed types and functions may be just as much work as adopting the actual breaking changes proposed in this RFC.
Alternative C
Instead of changing Entities
and Entity
as suggested in the current RFC, we
could provide new versions of these types -- say, EvaluatedEntities
and
EvaluatedEntity
-- which have the behaviors proposed in the current RFC. This
avoids making any breaking changes to existing types and functions.
The downsides of this alternative include:
- Existing users don't get the improved performance by default (see Motivation). They have to opt-in by migrating their code to these new types and functions.
- Whether or not we officially deprecate the existing Entities and Entity, new users may incorrectly assume that they should default to using those types, as they have the most-intuitive / least-qualified names, and types like EvaluatedEntities might appear to be for special cases.
- If we do not officially deprecate the existing Entities and Entity, we are stuck maintaining more APIs indefinitely. These APIs are also somewhat redundant and provide multiple ways to do approximately the same thing, some more optimally than others.
- If we do officially deprecate the existing Entities and Entity, we end up in a final state where we only have EvaluatedEntities and no other kind of Entities, which is a kinda ugly and nonsensical API from a naming standpoint -- not aesthetically pleasing.
Annotations for Cedar Schemas
Related issues and PRs
- Implementation PR(s): cedar#1316
Timeline
- Started: 2024-02-05
- Accepted: 2024-11-07
- Landed: 2024-12-16 on
main
- Released: 2025-01-21 in
cedar-policy
v4.3.0
Note: These statuses are based on the first version of the RFC process.
Summary
Just as with Cedar policies, users may want to associate arbitrary, machine-readable metadata with schema objects. We solved this problem in Cedar policies by allowing for annotations: arbitrary key/value pairs that are attachable to policies. This could be extended to Cedar schemas, allowing users to attach annotations to entity type, common type, action, and attribute declarations.
Basic example
Here is a basic example for doc comments.
@doc("this is the namespace")
namespace TinyTodo {
@doc("a common type representing a task")
type Task = {
@doc("task id")
"id": Long,
"name": String,
"state": String,
};
@doc("a common type representing a set of tasks")
type Tasks = Set<Task>;
@doc("an entity type representing a list")
@docComment("any entity type is a child of type `Application`")
entity List in [Application] = {
@doc("editors of a list")
"editors": Team,
"name": String,
"owner": User,
@doc("readers of a list")
"readers": Team,
"tasks": Tasks,
};
@doc("actions that a user can operate on a list")
action DeleteList, GetList, UpdateList appliesTo {
principal: [User],
resource: [List]
};
}
The @id("...")
notation is similar to the notation used for policy annotations.
Motivation
Users should be allowed to associate machine readable metadata with objects in a Schema. While we could create special syntax for associating particular kinds of metadata, we cannot predict all of the metadata uses that users will have. Thus providing a flexible system that users can adapt to their needs is preferable. This proposal re-uses the same syntax from Cedar Policies, creating a unified syntax.
Detailed design
Semantics
Annotations have no impact on validation decisions. Annotations are arbitrary key/value pairs where:
- 'key' is a valid Cedar identifier
- 'value' is a Cedar string
The Cedar spec takes no opinion or stance on the interpretation of annotations. The interpretation is entirely up to users of Cedar.
Cedar Schema Syntax
Annotations in Cedar schemas will mirror the syntax used for annotations in a policy: informally that's @<key>("value")
.
Formally, the following rules are added to the Cedar schema grammar:
Annotation := '@' IDENT '(' STR ')'
Annotations := {Annotation}
An arbitrary number of annotations may be prepended to a namespace declaration, entity type declaration, common type declaration, action declaration, or attribute declaration.
Thus the full schema syntax becomes:
Schema := {Namespace}
Namespace := (Annotations 'namespace' Path '{' {Decl} '}') | {Decl}
Decl := Entity | Action | TypeDecl
Entity := Annotations 'entity' Idents ['in' EntOrTyps] [['='] RecType] ';'
Action := Annotations 'action' Names ['in' (Name | '[' [Names] ']')] [AppliesTo] [ActAttrs]';'
TypeDecl := Annotations 'type' IDENT '=' Type ';'
Type := PRIMTYPE | IDENT | SetType | RecType
EntType := Path
SetType := 'Set' '<' Type '>'
RecType := '{' [AttrDecls] '}'
AttrDecls := Annotations Name ['?'] ':' Type [',' | ',' AttrDecls]
AppliesTo := 'appliesTo' '{' AppDecls '}'
ActAttrs := 'attributes' '{' AttrDecls '}'
AppDecls := ('principal' | 'resource') ':' EntOrTyps [',' | ',' AppDecls]
| 'context' ':' RecType [',' | ',' AppDecls]
Path := IDENT {'::' IDENT}
EntTypes := Path {',' Path}
EntOrTyps := EntType | '[' [EntTypes] ']'
Name := IDENT | STR
Names := Name {',' Name}
Idents := IDENT {',' IDENT}
Annotation := '@' IDENT '(' STR ')'
Annotations := {Annotation}
IDENT := ['_''a'-'z''A'-'Z']['_''a'-'z''A'-'Z''0'-'9']* - PRIMTYPE
STR := Fully-escaped Unicode surrounded by '"'s
PRIMTYPE := 'Long' | 'String' | 'Bool'
WHITESPC := Unicode whitespace
COMMENT := '//' ~NEWLINE* NEWLINE
JSON Syntax
None of the three top-level constructs (EntityTypes, Actions, CommonTypes) in JSON schemas allow for arbitrary key/value pairs.
This means a new key can be safely added while preserving backwards compatibility.
The same fact also applies to entity attribute declarations.
This proposal reserves the annotations
key at the top level of each of those constructs, which contains an Object, containing each annotation key as an Object key, associated with the annotation string.
The only oddness here is Common Types, whose top level is a regular type. While this should still be backwards compatible, it will look a little odd to have annotations in some types and not in others.
A corresponding JSON schema for the above example is as follows.
{
"": {
"annotations": {
"doc": "this is the namespace"
},
"commonTypes": {
"Task": {
"annotations": {
"doc": "a common type representing a task"
},
"type": "Record",
"attributes": {
"id": {
"type": "Long",
"annotations": {
"doc": "task id"
}
},
"name": {
"type": "String"
},
"state": {
"type": "String"
}
}
},
"Tasks": {
"type": "Set",
"element": {
"type": "Task"
}
}
},
"entityTypes": {
"Application": {},
"List": {
"annotations": {
"doc": "an entity type representing a list",
"docComment": "any entity type is a child of type `Application`"
},
"memberOfTypes": [
"Application"
],
"shape": {
"type": "Record",
"attributes": {
"editors": {
"type": "Team"
},
"name": {
"type": "String"
},
"owner": {
"type": "User"
},
"readers": {
"type": "Team"
},
"tasks": {
"type": "Tasks"
}
}
}
}
},
"actions": {
"CreateList": {
"annotations": {
"doc": "actions that a user can operate on a list"
},
"appliesTo": {
"resourceTypes": [
"Application"
],
"principalTypes": [
"User"
]
}
}
}
}
}
Drawbacks
- Complexity: adds more complexity to schema
- Oddness around syntax for Common Types in JSON form
- By not taking a stance on annotation meanings, it makes it harder for a standard to form around them (ex: for doc strings)
- Multi-line docstrings are technically valid but awkward.
Alternatives
Take a stance
Reverse our decision around annotations and start taking stances on what annotations mean.
This lets us standardize certain annotations, like doc
.
This probably can't happen unless we also do this for policies, which we've said we don't want to do.
Doc Strings as comments
Instead of annotations, we could add "doc-strings" as a first class feature. Could look like this:
/# Stop users from accessing a high security document unless:
/# A) The principal and user are at the same location
/# B) The principal has a job level greater than 4
forbid(principal, action, resource)
when { resource.security_level == "HIGH" }
unless {
  (resource.location == principal.location) || (principal.job_level > 4)
};
This has nice and easy multi-line syntax, but is special cased and not as general.
Reserve Identifiers for Internal Use
Related issues and PRs
Timeline
- Started: 2024-02-14
- Accepted: 2024-06-10
- Landed: 2024-07-15 on
main
- Released: 2024-09-16 in
cedar-policy
v4.0.0
Note: These statuses are based on the first version of the RFC process.
Summary
This RFC extends the reservation of the __cedar
namespace in schema in RFC24 and
proposes reserving all namespaces and identifiers where the first element is __cedar
,
forbidding their use in Cedar policies, schema, and entity data so that they can be used by future Cedar language features.
Basic example
Under this RFC the names __cedar
, __cedar::Long
, and __cedar::namespace::User
will all become invalid in all contexts.
These would be rejected in a policy
permit(principal == __cedar::User::"alice", action, resource);
in a schema
namespace __cedar {
entity User;
}
and in entity data
[
{
"uid": { "__entity": {
"type": "__cedar::User",
"id": "alice"
}},
"attrs": {},
"parents": []
}
]
In this entity data example, the __entity
JSON key does not demonstrate the new reserved namespace mechanism.
It is a place where we already assign special significance to some identifiers starting with __
in entity data.
Motivation
In designing RFC24 we found that we needed to disambiguate user defined identifiers from built-in primitive and extension types.
We accomplished this by reserving only the __cedar
namespace inside of schema,
but we anticipate that we may need more reserved namespaces in more contexts in the future.
Detailed design
We will update the namespace grammar to the following to exclude namespaces and identifiers starting with __cedar
.
Path ::= IDENT {'::' TRAILING_NAMESPACE_IDENT}
IDENT ::= TRAILING_NAMESPACE_IDENT - '__cedar'
TRAILING_NAMESPACE_IDENT ::= ['_''a'-'z''A'-'Z']['_''a'-'z''A'-'Z''0'-'9']* - RESERVED
RESERVED ::= 'true' | 'false' | 'if' | 'then' | 'else' | 'in' | 'like' | 'has'
This primarily applies to entity type names.
In the case where an entity type isn't qualified with a namespace, the leading identifier is just the entity type, so we would forbid an unqualified type __cedar
.
It also applies anywhere else namespaces may appear, including the top-level namespace
element in a schema file, qualified references to common types in a schema file, and extension function names in policies.
It also applies where a standalone identifier is expected.
For example, an entity attribute could not be __cedar
.
Reserving this single attribute is not on its own particularly useful, but we may later extend the attribute grammar to accept namespace paths.
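For instance, a schema declaration like this hypothetical one would be rejected:
entity User {
  // rejected: the attribute name __cedar is reserved
  __cedar: String
};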
Drawbacks
This is a significant breaking change.
Anyone using the __cedar
namespace will need to update their policies, schema, and entity data to use an unreserved namespace.
The nature of this breaking change fortunately allows for migration away from reserved namespaces before the change is released,
and the break will manifest immediately on attempting to parse an affected policy, schema, or entity.
Alternatives
A larger breaking change
We could make a larger breaking change by reserving identifiers starting with __
instead of just __cedar
.
For example, we would additionally forbid referencing an entity __tiny_todo::User
.
We don't currently see an immediate need for this more aggressive change, and it increases the likelihood a current user will be affected.
We may also find that consumers of Cedar who expose the policy language to their users want to reserve namespaces for their own use.
They could of course choose a namespace with any leading characters, but they may prefer the uniformity between __cedar::Long
and, for example, __tiny_todo
.
No breaking change
We can avoid a breaking change to Cedar by updating the Grammar to allow for a new identifier that is currently invalid, and then reserve that identifier.
For example, we could reserve $cedar
or cedar$
(other options for characters include %
or !
).
It would be an error to use this identifier in any way that isn't explicitly permitted by the Cedar spec.
Taking this option would require an update to the schema grammar proposed in RFC24.
Enumerated Entity Types
Related issues and PRs
- Implementation PR(s): cedar#1377
Timeline
- Started: 2024-02-20
- Accepted: 2024-03-20
- Landed: 2025-02-05 on
main
- Released: TBD
Note: These statuses are based on the first version of the RFC process.
Summary
Extend schemas to support declared enumerations of entity-typed values, analogous to how schemas can currently be used to enumerate a finite list of Action
-typed values.
Basic example
An enumerated entity type is declared as a normal entity type, but includes the keyword enum
followed by the list of legal entity UIDs. Here is a simple example:
entity User;
entity Color enum ["Red", "Blue", "Green"];
entity Task {
owner: User,
name: String,
status: Color
};
action UpdateTask
appliesTo { principal: [User], resource: [Task] };
These data definitions could be used in policies such as
permit(
principal,
action == Action::"UpdateTask",
resource)
when {
principal == resource.owner &&
resource.status != Color::"Red"
};
Motivation
Enumerated types are useful when you have a fixed set of possible values, and the only thing you want to do with the values is compare them for equality. While you could effectively treat an entity type as an enumeration now, without declaring it in a schema, you gain some benefits by declaring it:
- The validator can error on uses of illegal enumerated values, e.g., flagging the typo resource.status != Color::"red" in the when clause in the basic example.
- When using a policy analyzer, it can always generate request and entity store instances where the enumerated entity type has declared-valid values, rather than random UIDs.
- When using an IDE or web-based policy builder, the enumeration can inform auto-completion suggestions. For the basic example above, in an auto-completing IDE writing resource.status != ... would pop up the three options.
Detailed design
An enumerated entity Foo
is declared by writing
entity Foo enum [ … ];
where [
… ]
is a non-empty list of allowed values, expressed as strings.
In the JSON format for schemas, you would write
"entityTypes": {
...
"Foo": {
"enum": [ … ]
},
...
}
You can use an enumerated entity type anywhere you can use a normal entity type. Since an enumerated entity cannot have attributes, nor can it have ancestors in the entity hierarchy, all you can do with it is test it for equality (e.g., with ==
or contains
).
The policy validator confirms that any enumerated entity literal in a policy is valid. The request validator does likewise. The entity store validator confirms that enumerated entities do not appear in the store, or if they do, they have no attributes or ancestors. It also confirms that the declared enumerated entity type has no invalid values, and references to entities of the enumerated type are valid. The schema-based entity parser likewise confirms that parsed-in enumerated entity values are valid.
As another example, consider the following.
entity Application enum [ "TinyTodo" ];
entity User in [ Application ];
action CreateList
appliesTo { principal: [User], resource: [Application] };
This is a generalization of our TinyTodo example from RFC 24, where we can refine the definition of Application
to indicate that it has a single value, Application::"TinyTodo"
. This allows the validator to catch typos in policies, such as the following.
permit(
principal,
action in [Action::"CreateList"],
resource == Application::"TinyTODO"
);
Likewise the request validator would flag a request with Application::"TinyTODO"
as its resource, and it would flag the passed-in entity store if it contained such an illegal value.
Notes
As a convention, our example enumerated entity names, like Color::"Red"
or Application::"TinyTodo"
, all begin with an uppercase letter. We choose to consider this approach good style, but not to mandate it.
We require entity enumerations to be given as a non-empty list of strings, like ["Red", "Blue"]
, but we could also allow them to be specified as identifiers, like [ Red, Blue ]
. Doing so would be similar to the handling of attributes, which can be specified as identifiers, principal.owner
, or as strings, principal["owner"]
. However, entity enumerations can only be referenced as strings, e.g., as Color::"Red"
not Color::Red
. Specifying them as strings, only, makes this connection a little stronger.
We do not permit declaring empty enumerations. Allowing them would add complication to policy analysis (to consider the exceptional case), but would be essentially useless: You could never create an entity of type Foo
, where Foo
is uninhabited, and while you could write expr is Foo
, this expression is always false
.
That an entity is an enumeration is specified as a refinement when declaring the entity type, e.g., writing entity Application;
declares the Application
entity type, while writing entity Application enum ["TinyTodo"];
declares the Application
entity type and then refines it to say that only the "TinyTodo"
entity ID is well defined. An alternative syntax that is more intuitive to some readers is entity enum Application ["TinyTodo"]
. This syntax is similar to Java-style syntax, enum Application { TinyTodo }
. However, this approach could create some confusion: enum
is currently a valid entity type, so it's legal to write entity enum;
in schemas today. Moreover, if we eventually take Alternative C, below, we may allow enum
to be accompanied by other type refinements, such as in
and attribute declarations. For example, we could one day be able to write entity Application in [Application] enum ["TinyTodo", "Office"]
(or swapping their order, entity Application enum ["TinyTodo", "Office"] in [Application]
), and might prefer the uniformity of that to entity enum Application ["TinyTodo", "Office"] in [Application]
.
Drawbacks
One reason not to do this is that it's not particularly full featured---you cannot do anything useful with an enumerated entity value in a policy other than compare it for equality. We consider more full-featured extensions in the alternatives below, but these have drawbacks of their own. The functionality could be easily extended later, depending on how things play out.
Alternatives
Alternative A: Enumerated primitive values
We previously proposed, in RFC 13, declaring a finite set of primitive values (strings, numbers, etc.) as an enumerated type. The killer objection to the RFC is that it introduces subtyping (you want to use the enumerated type of string as both the enumeration and the string, e.g., with like
), which is a significant complication for policy analysis. The present proposal is not problematic for analysis as enumerated entities can be encoded like any entity, but with added ground constraints limiting what form their UIDs can take on.
Alternative B: Enumeration as a distinct concept
Rather than specify a particular entity type as an enumeration, we could define a new concept of enumeration as another kind of primitive type. Here is a notional schema:
enum type Color = Red | Blue | Green;
entity Task {
name: String,
status: Color
};
Here is a notional policy:
permit(
principal,
action == Action::"UpdateTask",
resource)
when {
resource.status != Color.Red
};
This syntax is similar to what's provided for Java enum
s.
The benefit of this approach is that it may feel a little more natural than representing a concept, like a color, as a set of legal entity values. It would also be easy to encode this approach in a policy analysis.
The main drawback of this approach is that it introduces a new primitive type to the language that does not make much sense without schemas. Recall that for Cedar, validation with a schema is optional. We'd allow users to include any random enumerated identifiers (like Scheme-style symbols) in policies which, without validation, would fail equality checks at run-time.
The proposed approach that lists particular entity values as an enumeration makes no change to the language, leveraging the existing entity type concept. So policies without schemas still make sense. With the added schema, there is additional benefit when validating and constructing reasonable policies.
Alternative C: Enumerated entities with hierarchy
Enumerated entities as proposed are limited in their functionality and specificity. We could extend them. For example:
entity Application enum [ "TinyTodo" ];
entity RequestEntity enum [ "Principal", "Resource" ] in [ Application ];
entity User in [ Application ];
action CreateList
appliesTo { principal: [User], resource: [Application] };
action GetLists
appliesTo { principal: [User, RequestEntity::"Principal"],
resource: [Application]};
This differs from some earlier examples in the following ways:
- Enumerated entities can have parents that are enumerated entity values, and can be parents of other enumerated entity values; both cases are shown in the definition of
RequestEntity
- Enumerated entities can appear as singleton types, e.g., as
RequestEntity::"Principal"
in the definition of action GetLists
.
Both of these extensions are similar to what's available for Action
s right now, but generalized to arbitrary entity types. You could also imagine enumerated entity types having attributes, as has been anticipated for Action
s.
One drawback of this Alternative is that it creates added complication for both the validator (e.g., to handle singleton types) and the analyzer (e.g., to deal with the more general hierarchy constraints).
Another drawback is that it adds complication to the entity store: Once you add hierarchy constraints and attributes, you need to create actual entities to include with policy requests, by extracting them from the schema at request-time. The RFC as proposed does not require that.
The good news is that this Alternative is a strict generalization of the RFC as proposed, which means if we agree to the current proposal we can later upgrade to some or all of this Alternative without incurring a breaking change.
Explicit Unspecified Entities
Related issues and PRs
Timeline
- Started: 2024-02-27
- Accepted: 2024-05-28
- Landed: 2024-06-19 on
main
- Released: 2024-09-16 in
cedar-policy
v4.0.0
Note: These statuses are based on the first version of the RFC process.
Summary
Cedar currently supports unspecified entities, which are entities of a special, unique type that have no attributes, and are not ancestors or descendants of any other entity in the store. Unspecified entities are intended to act as placeholders for entities that don’t influence authorization (see examples below).
The current implementation of unspecified entities results in several corner cases in our code (especially the validator), which have been difficult to maintain (see discussion on this recent PR). The concept of unspecified entities is (in our opinion) confusing for users, and will only become more confusing once we stabilize the partial evaluation experimental feature, which allows “unknown” entities that are subtly different from “unspecified” entities.
This RFC proposes to drop Cedar support for unspecified entities, instead recommending that users create their own application-specific entity types that act like unspecified entities.
Basic example
Consider a "createFile" action that can be used by any User
, as encoded in the following policy:
permit(principal is User, action == Action::"createFile", resource);
When making an authorization request for this action, the principal should be a User
, and the context can store information about the file to create, but it is unclear what the resource should be.
In fact, given the policy above, the resource doesn't matter.
In this case, a Cedar user may want to make the authorization request with a “dummy” resource; as of Cedar version 3.x, this is what unspecified entities are for.
Now consider a "createAccount" action. It's unclear what principal or resource should be used with this action. The approach above can be extended to this case too: make the action apply to an unspecified principal and resource. But now "unspecified" entities are being used to represent multiple distinct concepts: a file store, (unidentified) users creating an account, and a system that manages accounts.
This RFC proposes that users should instead explicitly create these "dummy" entities in an application-specific manner. For this example, the user should create entities of three types: FileSystem
, UnauthenticatedUser
, and AccountManager
. The first will act as the resource for a "createFile" request, and the second and third will act as the principal and resource for a "createAccount" request.
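In Cedar schema syntax, this might look like the following sketch (entity and action names reuse those above; context shapes are omitted):
entity User;
entity FileSystem;
entity UnauthenticatedUser;
entity AccountManager;
action createFile
  appliesTo { principal: [User], resource: [FileSystem] };
action createAccount
  appliesTo { principal: [UnauthenticatedUser], resource: [AccountManager] };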
This is (arguably) the approach we used in the TinyTodo application: we made a special Application::“TinyTodo”
entity to act as the resource for the “CreateList” and “GetLists” actions.
This approach is also in line with our previous suggestion to use an application-specific Unauthenticated
type to represent "unauthenticated" users, rather than using unspecified entities.
Motivation
Unspecified entities are a confusing concept, as evidenced by messy special casing in our own code and multiple RFCs to refine their behavior (RFC35, RFC47). We expect that this concept will only become more confusing once we stabilize the partial evaluation experimental feature, which allows “unknown” entities that are subtly different from "unspecified" entities, despite sharing similar names.
Unspecified entities also lead to two sharp corners in our public API:
- Omitting an appliesTo field in the schema (or setting it to null) means that an action applies only to unspecified entities, while using an empty list means that it applies to no entities. There is no way to say that an action applies to both unspecified entities and other types of entities. For more discussion, see RFC35. (Note: RFC35 has been closed in favor of the current RFC.)
- It is unclear how we should create Requests that use both unspecified and unknown entities. This has not been an issue so far since partial evaluation and its APIs are experimental, but we are looking to stabilize them soon. Currently the best (although not ideal) proposal is to use a builder pattern for Requests (e.g., request.principal(Some(User::"alice")) sets the principal field of a request). To set a field to be "unspecified" users could pass in None instead of Some(_); to set a field to be unknown users would not call the field constructor at all.
Aside: Unspecified vs. unknown
Unspecified and unknown entities are both pre-defined "dummy" entities that act like any other entity, as long as they are unused or only used in a trivial way (e.g., principal == principal
).
If a policy uses an unknown entity in a non-trivial way, then evaluation will produce a residual.
If a policy uses an unspecified entity in a non-trivial way, then evaluation will produce an error.
Detailed design
Status quo
As of Cedar v3.x, the only way to construct an unspecified entity is by passing None
as a component when building a Request
.
So a request for the "createFile" action above might look like:
In the schema, the only way to say that an action applies to an unspecified entity is by omitting the relevant appliesTo
field.
For example, the following schemas specify that the "createFile" action applies (only) to an unspecified resource.
Cedar schema syntax (RFC 24):
action createFile
appliesTo { principal: [User] };
Cedar schema JSON syntax:
"createFile": {
"appliesTo": {
"principal": ["User"]
}
}
Proposal
The proposal is simple: stop allowing the syntax above. The Request
constructor will no longer accept optional arguments and schemas will no longer support missing appliesTo
fields. The Request
constructor API will be changed to require an EntityUid
(instead of Option<EntityUid>
) for the principal/action/resource, requiring code changes in programs using Cedar's Rust APIs. The schema APIs will not change, but omitting an appliesTo
field will result in a parse error.
Relation to RFC53
This RFC combines nicely with the (accepted) RFC53 (Enumerated Entity Types), which allows users to fix the instances for a particular entity type. This means that custom "unspecified" entities can be easily restricted to a small set of possible entities. For example, RFC53 allows specifying that only Application::"TinyTodo"
is valid, and no other Application::"eid"
.
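For example (a hypothetical declaration), a custom "unspecified" type from the basic example could be pinned to a single value:
entity AccountManager enum ["default"];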
Upgrade strategy
We appreciate that this RFC proposes a breaking change for Cedar's users, and the upgrade path is non-trivial because it requires users to look carefully at their applications, and decide what they should use instead of unspecified entities. To help, while implementing this RFC, we plan to develop a blog post with examples of applications where you may have used an "unspecified" entity, and what we think you should use instead (suggestions for examples are welcome!).
We also plan to provide APIs (hidden behind a flag, like our experimental features) for the following operations, with the same API and behavior that was available in 3.x.
- Parsing a schema in the Cedar schema format
- Parsing a schema in the Cedar JSON schema format
- Constructing a request
This will allow services that build on top of Cedar (e.g., Amazon Verified Permissions) to handle this upgrade for their customers invisibly. The implementation will follow the approach outlined in Alternative B. However, we will not maintain these APIs indefinitely so you will need to support the new changes eventually. Please reach out to us if there's a way that we can help.
Alternatives
Alternative A - Predefine a special "unspecified" entity type
Continue to support unspecified entities, but re-brand them as "default" entities, with the intuition that you should use them when you want some default value to pass into an authorization request.
This alternative proposes to give an explicit name to the unspecified entity type (__cedar::Default
) and to require this name to be used in requests and schemas that use this type.
Under this proposal, the schema definition for the "createFile" action from above would look like:
action createFile
appliesTo { principal: [User], resource: [__cedar::Default] };
and a (pseudo-syntax) request for this action might be:
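// pseudo-syntax sketch, using the RequestEntity type described below
Request::new(
  RequestEntity::EntityUid(User::"alice"),
  RequestEntity::EntityUid(Action::"createFile"),
  RequestEntity::Default,
  context
)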
where the type of the Request
constructor might look like:
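// sketch only: names follow this alternative's proposal; error handling and
// other parameters are elided
enum RequestEntity {
  EntityUid(EntityUid),
  Default,
  Unknown
}

fn new(
  principal: RequestEntity,
  action: RequestEntity,
  resource: RequestEntity,
  context: Context
) -> Request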
(Note: the Unknown
variant may be hidden behind the "partial-eval" experimental flag, depending on when this RFC is implemented.)
Users will not be allowed to include entities of type __cedar::Default
in the store, which ensures that these entities will never have attributes or ancestors/descendants, which are properties we will rely on during validation.
Users will not be allowed to reference the type __cedar::Default
in their policies (so neither principal == __cedar::Default::"principal"
nor principal is __cedar::Default
are allowed).
Users do not need to add __cedar::Default
to their schemas (and it will be an error if they do) because __cedar::Default
is defined automatically.
Finally, users cannot create entity uids with type __cedar::Default
, which prevents using default entities with the RequestEntity::EntityUid
constructor.
Naming specifics
The names RequestEntity
and __cedar::Default
are both subject to change.
For the latter, we have agreed that we want to use the __cedar
prefix (reserved as part of cedar#557 in v3.1), but the exact type name is up for debate.
Options we've considered include:
- None
- Null
- Ghost
- Dummy
- Empty
- Irrelevant
- Generic
Here are some options we’ve ruled out:
- Any: possibility to confuse the scope condition principal with principal is __cedar::Any. The first applies to any principal, while the latter applies to only the default entity.
- Arbitrary: same issue as "any"
- Unspecified: potential for confusion with partial evaluation "unknown"
- Mock: implies that this entity type should be used for debugging/testing
Discussion notes
We considered this alternative seriously, and it even became the main proposal for a period of time. But ultimately we decided that this alternative feels like a temporary solution, and we would eventually want to get rid of unspecified/default entities anyways.
Here's a nice (lightly edited) summary from @max2me:
Both the main proposal and Alternative A are breaking changes, so I'd rather go a step forward and require customers to update schema, code, and policies once for the benefit of a brighter future (Alternative A would require customers to update just schema and code, yet all the same testing will have to occur).
My concerns with Alternative A:
- Lack of symmetry -- you omit something in the policy and yet explicitly mention it in the authorization code and schema. This leads to a fragmented experience of working with Cedar as a technology.
- Surprise factor -- I suspect that most customers who leave principal/resource unscoped in policies will not realize that in addition to all the entities they reference in their code, there is a hidden default type.
In my opinion, occasional duplication of policies and the need to define your own default type is a small cost to pay for the overall simplicity and cohesiveness of the system.
Alternative B - Maintain the status quo, but clean up the implementation
Pre-define the __cedar::Default
entity type as in Alternative A, but do not allow users to reference it directly.
Instead, update the underlying implementation to use the new type while leaving the current API as-is.
We took this approach in the Lean model to avoid special handling for unspecified entities, and it seems to have worked out well.
Differential testing has given us confidence that it leads to the same behavior as our current implementation.
Note: This alternative should result in no changes from the user’s perspective. If it does result in changes (e.g., to validation results) then this was unintentional behavior of our current implementation, and should be treated as a bug fix.
General multiplication in Cedar
Related issues and PRs
Timeline
- Started: 2024-03-07
- Accepted: 2024-03-15
- Landed: 2024-03-20 on
main
- Released: 2024-03-29 in
cedar-policy
v3.1.2
Note: These statuses are based on the first version of the RFC process.
Summary
Allow multiplication of arbitrary expressions (that evaluate to Long), not just multiplication of an expression by a constant.
Basic example
permit(principal, action, resource) when {
context.foo * principal.bar >= 100
};
Motivation
Today, Cedar only supports multiplication of two operands if (at least) one operand is a constant. In the more general n-ary (chained) multiplication case, Cedar requires that at most one operand is a non-constant, but allows the constants and non-constant to appear in any order.
- This creates some confusion for Cedar users. No other popular programming language has this restriction on multiplication.
- This leads to bugs and confusing behavior. To enforce this restriction, Cedar's internal AST node for multiplication only supports multiplying an expression by a constant. But this requires the Cedar parser to re-associate (and re-order) operands in a chained multiplication, which leads to confusion and problems as described in RFC 50.
- This limits the power of Cedar by preventing users from writing policies involving multiplication where neither operand is known at policy-authoring time.
This RFC proposes to relax all of these restrictions and allow multiplication
of arbitrary expressions (that evaluate to Long).
In other words, multiplication will be treated just like addition: *
will be
valid in all positions and situations where +
is valid today.
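For illustration, here is a sketch of both cases as policy conditions (the attribute names are just examples):
// accepted today: at most one non-constant operand, appearing in any position
2 * context.weight * 5 >= 100
// rejected by today's parser, accepted under this RFC: two non-constant operands
context.weight * principal.multiplier >= 100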
Detailed design
This will actually simplify the evaluator code, validator code, and internal AST.
Multiplication will become an ordinary BinaryOp
, and no longer need a special
AST node.
The evaluator and validator can handle multiplication in the same way as
addition and in the same code paths.
With this RFC, the policy parser becomes strictly more permissive: All policies that currently parse will still parse, plus additional policies that used to produce parse errors will now parse as well. There are edge cases where evaluations that do not overflow today could result in overflows after this RFC -- see Drawbacks below.
Drawbacks
One motivation for the original restriction on multiplication was to make SMT analysis of Cedar policies easier. However, presuming that the SMT analysis uses bitvector theory to analyze operations on Cedar Longs, multiplication by a general expression is not much more expensive than multiplication by an arbitrary constant (ignoring special cases such as 0 or constant powers of 2). Some multiplication expressions will be expensive to analyze (computationally), but that is already true for some other kinds of Cedar expressions that don't involve multiplication. In general, we'd prefer to use a linter to warn users about (these and other) expensive expressions, rather than use restrictions on the Cedar language.
This RFC firmly closes the door on changing Cedar's default numeric type to bignum. However, that door is already basically closed. Also, we could still introduce bignums as an extension type (with their own separate multiplication operation) in the future.
This RFC has essentially no interaction with any other currently pending or active RFC, except for RFC 50 which it replaces and makes obsolete.
There are edge cases where evaluations that do not overflow today could result
in overflows after this RFC.
One example is an expression CONST * CONST * context.value
for large CONST
when context.value == 0
.
Today's parser will produce the internal AST (mul (mul context.value CONST) CONST)
which will not overflow when context.value == 0
, while this RFC would produce
the internal AST (mul (mul CONST CONST) context.value)
which would overflow
even when context.value == 0
.
This is because, like RFC 50, this RFC proposes to left-associate
multiplication and not to reorder operands, in contrast to the current Cedar
parser.
This edge case seems very unlikely for users to hit (as it requires a multiplication by two large constants), and this RFC's proposed behavior is undoubtedly more intuitive, as multiplication is left-associative in most programming languages.
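As a concrete illustration (using a hypothetical context.value attribute of type Long), the following condition evaluates without overflow today when context.value == 0, but would overflow under this RFC's left-associated grouping:
permit(principal, action, resource) when {
// Today: parsed as (context.value * 9223372036854775807) * 2, so no overflow when context.value == 0.
// After this RFC: parsed as (9223372036854775807 * 2) * context.value, which overflows.
9223372036854775807 * 2 * context.value == 0
};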
Alternatives
RFC 50 is one alternative which fixes some issues with chained multiplication without (significantly) relaxing Cedar's restrictions on multiplication operations. This RFC's proposal leads to simpler evaluation and validation code than RFC 50, and also makes the Cedar language simpler and easier to understand for users.
Extended has
Operator
Related issues and PRs
- Implementation PR(s): cedar#1327
Timeline
- Started: 2024-04-05
- Accepted: 2024-05-21
- Landed: 2024-11-27 on
main
- Released: 2025-01-21 in
cedar-policy
v4.3.0
Note: These statuses are based on the first version of the RFC process.
Summary
This RFC proposes to extend the syntax of the has
operator to check for the presence of all attributes in an access path.
Basic example
Suppose that entities of type User
have an optional contactInfo
attribute of type {email: String, address?: {street?: String, zip?: String, country: String}}
. To safely access the zip
field of a user in a policy, we write a chain of has
checks as follows:
permit(
principal is User,
action == Action::"preview",
resource == Movie::"Blockbuster"
) when {
principal has contactInfo &&
principal.contactInfo has address &&
principal.contactInfo.address has zip &&
principal.contactInfo.address.zip == "90210"
};
This RFC proposes to extend the syntax of the has
operator so that this chain of checks can be written more succinctly as follows:
permit(
principal is User,
action == Action::"preview",
resource == Movie::"Blockbuster"
) when {
principal has contactInfo.address.zip &&
principal.contactInfo.address.zip == "90210"
};
The expression principal has contactInfo.address.zip
evaluates to true when all the attributes in the given access path are present.
Motivation
Long chains of has
checks are tedious to write and hard to read. The extended has
operator replaces these chains with a single constraint that is easier to read and write.
Detailed design
This RFC proposes to desugar the new has
syntax into the corresponding chain of basic has
checks in the CST -> AST and EST -> AST converter. This requires extending the parser, CST, and EST to support the new syntax; and changing the CST -> AST and EST -> AST converters to implement the desugaring. It should be possible to share the core desugaring code between the two converters.
No other components need to change. In particular, validator, evaluator, and Lean models remain the same.
Extending the syntax
We extend the grammar as follows:
Relation ::= ... | Add 'has' (IDENT['.' IDENT]* | STR) | ...
Note that this extension works only for attribute names that are valid Cedar identifiers. For example, we cannot write principal has "contact info".address.zip
; this check has to be expressed as principal has "contact info" && principal["contact info"] has address.zip
. Arbitrary attribute names can be supported in the future if there is demand for it.
Desugaring the extended syntax
The following Lean code shows how to perform the desugaring given an expression, an attribute, and a list of attributes:
def desugarHasChainStep
(acc : Expr × Expr)
(attr : Attr) : Expr × Expr :=
(Expr.getAttr acc.fst attr,
Expr.and acc.snd (Expr.hasAttr acc.fst attr))
def desugarHasChain
(x : Expr)
(attr : Attr)
(attrs : List Attr) : Expr :=
(attrs.foldl
desugarHasChainStep
(Expr.getAttr x attr,
Expr.hasAttr x attr)).snd
Here are some examples of desugared ASTs:
#eval desugarHasChain (Expr.var .principal) "contactInfo" []
Expr.hasAttr (Expr.var .principal) "contactInfo"
#eval desugarHasChain (Expr.var .principal) "contactInfo" ["address"]
Expr.and
(Expr.hasAttr (Expr.var .principal) "contactInfo")
(Expr.hasAttr
(Expr.getAttr (Expr.var .principal) "contactInfo")
"address")
#eval desugarHasChain (Expr.var .principal) "contactInfo" ["address", "zip"]
Expr.and
(Expr.and
(Expr.hasAttr (Expr.var .principal) "contactInfo")
(Expr.hasAttr
(Expr.getAttr (Expr.var .principal) "contactInfo")
"address"))
(Expr.hasAttr
(Expr.getAttr
(Expr.getAttr (Expr.var .principal) "contactInfo")
"address")
"zip")
Drawbacks
- This feature requires extending the grammar, CST, EST, CST -> AST, and CST -> EST converters. This extension will be a breaking change to the EST. We have to either modify the EST hasAttr node to include an additional List argument, or introduce a separate node to represent the extended syntax. Either change may break code that operates on the EST.
- If we choose not to modify the EST, and implement a CST -> EST conversion using the above algorithm, then the EST loses information. It will no longer be possible to recover the original policy syntax from the EST.
Alternatives
We considered several variants of the extended syntax that would support arbitrary attribute names. For example, principal has contactInfo."primary address".zip
or principal has ["contactInfo"]["primary address"]["zip"]
.
We decided that the alternatives were less intuitive and readable. If needed, we can support arbitrary attribute names in the future without causing any further breaking changes to the EST.
Disallow shadowing definitions in the empty namespace
Related issues and PRs
Timeline
- Started: 2024-06-26
- Accepted: 2024-08-01
- Landed: 2024-08-26 on
main
- Released: 2024-09-16 in
cedar-policy
v4.0.0
Note: These statuses are based on the first version of the RFC process.
Summary
In schemas, disallow definitions of entity types, common types, and actions that would shadow definitions of other entity types, common types, or actions in the empty namespace.
Basic example
Borrowing and slightly tweaking the example from RFC 64:
entity User {
email: String,
};
namespace Demo {
entity User {
id: String,
};
entity Account {
owner: User,
};
}
Today, this schema is accepted, and Account.owner
is assumed to be
Demo::User
and not the empty-namespace User
.
As noted in RFC 64, there is no way for the user to indicate that they intend
Account.owner
to be the empty-namespace User
.
Rather than add syntax to enable this (as proposed in RFC 64), this RFC
proposes to make this situation an error -- disallowing declaring a User
type
in the Demo
namespace when one already exists in the empty namespace.
Motivation
Name shadowing is complicated and confusing; schemas using name shadowing may look like they are doing one thing when they are actually doing another. For Cedar policy and schema authors, this can cause misunderstandings and make it more difficult to write correct schemas and policies. By disallowing shadowing in the most egregiously ambiguous cases -- when the name being shadowed was declared in the empty namespace -- we nudge users to make the schema clearer, for instance by renaming a type, or moving a conflicting empty-namespace declaration into a new namespace.
Detailed design
We disallow shadowing a type or action in the empty namespace with a new declaration in a nonempty namespace.
- For entity and common types, we disallow this shadowing regardless of whether the shadowed declaration is an entity or common type, and regardless of whether the shadowing declaration is an entity or common type.
- For actions, we only disallow shadowing other actions. We continue to allow an action to have the same name as an entity or common type (in the same namespace or in the empty namespace). That is not "shadowing", as there is no place in a schema where it's ambiguous whether you're referring to an action or an entity/common type.
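For illustration, here is a sketch (using hypothetical names) of how the action rule plays out in the human schema syntax:
action view;
namespace Demo {
// Disallowed by this RFC: this declaration shadows the
// empty-namespace action "view".
action view;
// Still allowed: an action may share a name with an entity type,
// since action and type references are never ambiguous.
entity Album;
entity User;
action Album appliesTo { principal: [User], resource: [Album] };
}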
This RFC does not disallow shadowing altogether; in particular, it does not disallow shadowing an entity typename with a common typename in the same namespace, as originally proposed in RFC 24 (see "Disambiguating Types in Custom Syntax", and in particular Rule 5 in that section):
// Allowed today, and still allowed after this RFC
namespace NS {
entity User {
id: String,
}
type User = User;
// or, more confusingly, type User = String;
}
This RFC also does not disallow shadowing the names of primitive or extension
types, where you can still refer to the shadowed type using __cedar
, again as
explained in RFC 24.
Conceptually, primitive and extension types are not declared in the empty
namespace; they are declared in the __cedar
namespace, and implicitly imported
as part of a prelude.
That is how we could explain (e.g. in the Cedar docs) why the rule is different
for shadowing entity/common types and shadowing primitive/extension types.
(Internally, another reason for continuing to allow shadowing
primitive/extension types, is that we want to be able to introduce new extension
types without any possibility of breaking existing schemas.)
This RFC does not change anything about cedar#579. We still plan to implement cedar#579, which allows users to refer to types in the empty namespace, in cases where they are not shadowed.
This RFC does apply the new restriction uniformly in both the human and JSON schema syntaxes. The motivation and situation are the same in both syntaxes for the case of an entity typename shadowing an entity typename, and for the case of a common typename shadowing a common typename. For the case of an entity typename shadowing a common typename or vice versa, this RFC still proposes to disallow it in both syntaxes, even though these situations are not ambiguous in the JSON syntax, which distinguishes between entity type and common type references. See this RFC discussion thread.
Drawbacks
Drawback 1
The motivation section for RFC 64 claimed that
Users justifiably expect that they can refer to any declared typename (in any namespace) from any position in the schema, by writing a fully qualified name. The fact that RFC 24 did not provide a way to refer to typenames in the empty namespace by using some kind of fully qualified name seems like an oversight, and should be corrected for consistency.
This RFC does not address that motivation, and effectively counterclaims that users don't actually expect this.
Drawback 2
If RFC 69 or something similar are eventually accepted, the error in this RFC may not be so easily avoided, if the empty-namespace definition is part of a third-party library being imported by the user. In that case, the user couldn't easily rename the empty-namespace type or move it into a nonempty namespace.
To address this drawback, we should guide library authors to not declare library types or actions in the empty namespace (and instead, e.g., declare them in a namespace named after the library). We could even codify that rule, requiring that all libraries have a name and all declarations in the library live in the corresponding namespace, or alternately, implicitly prefixing all library declarations with the library name as a toplevel namespace. Many existing package systems have similar rules or mechanisms.
Alternatives
RFC 64 proposed instead introducing new schema syntax to allow users to refer to entity and common types in the empty namespace whose names are shadowed by declarations in the current namespace. (RFC 64 could be naturally extended to actions as well.)
Introducing new schema syntax is a major change, and in the discussion on
RFC 64 we have been, to date as of this writing, unable to agree on a syntax for
this which is clear and acceptable to everyone.
On the RFC thread and in offline discussions, the leading candidates for such a
syntax are currently (as of this writing) ::User
, root User
, or
__cedar::Root::User
(reusing the __cedar
namespace reserved in RFC 52).
But it seems apparent that none of these options are immediately intuitive to
all users; if we were to introduce such a syntax, we run the risk of
complicating the schema syntax only to not actually clarify the situation
because the new syntax is still unclear.
Finally, this shadowing situation is viewed as an unlikely edge case in general, and certainly one that can easily be avoided by users (by renaming types/actions or moving empty-namespace declarations into a namespace). It seems expedient to resolve this in the easiest, clearest way possible. This RFC proposes that disallowing the premise is easier and clearer than designing and implementing new schema syntax that will only be used in an unlikely edge case.
Trailing Commas
Related issues and PRs
- Reference Issues: cedar#1024
- Implementation PR(s):
Timeline
- Started: 2024-06-26
- Accepted: 2024-09-10
- Landed: TBD
- Released: TBD
Note: These statuses are based on the first version of the RFC process.
Summary
The Cedar grammar (both policies and schemas) should accept trailing commas whenever possible.
Basic example
Currently, this is a parse error:
permit(principal, action, resource) when {
{ foo : 3, bar : 2, } == context
};
This RFC would make it not a parse error. It would also make this change in several places beyond records.
Motivation
In general, allowing trailing commas makes editing easier. Re-arranging, deleting, and copy-pasting code into/from a comma separated grammar element will no longer require adjusting commas.
Example:
{
"managers": ["Dave", "Shelly", "Chris"],
"projects" : {
"Alpha" : {
"zones" : ["A", "B"]
},
"Beta" : {
"zones" : ["D", "F"]
}
}
}
The Beta
project cannot simply be deleted from this record, and
likewise, new fields cannot be simply pasted into any level of the
objects, without adjusting commas.
Many languages allow for trailing commas, including:
- Rust
- Python
- Java
- Go
- JavaScript
- OCaml
(Frustratingly: not JSON!)
In addition, it will make our schema grammar and policy grammar more consistent, as the schema grammar allows trailing commas in many places.
Detailed design
Cedar Policy Grammar
We want to allow trailing commas in the following expressions:
- scopes: permit(principal, action, resource, )
- sets: [1,2,3,]
- records: { foo : 1, bar : 3, }
- all method/function calls: .contains(3,), ip("10.10.10.10",)
In the grammar:
The following grammar rules change:
Scope ::= Principal ',' Action ',' Resource
becomes:
Scope ::= Principal ',' Action ',' Resource ','?
EntList ::= Entity {',' Entity}
becomes:
EntList ::= Entity {',' | ',' Entity} ','?
ExprList ::= Expr {',' Expr}
becomes:
ExprList ::= Expr {',' | ',' Expr} ','?
RecInits ::= (IDENT | STR) ':' Expr {',' (IDENT | STR) ':' Expr}
becomes:
RecInits ::= (IDENT | STR) ':' Expr {',' | ',' (IDENT | STR) ':' Expr}
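Putting these together, a policy like the following (with hypothetical context attributes count, record, and srcIp) would parse once trailing commas are accepted:
permit(principal, action, resource, )
when {
// trailing commas in a set literal, a method call, a record literal, and a function call
[1, 2, 3, ].contains(context.count, ) &&
{ foo : 1, bar : 3, } == context.record &&
context.srcIp == ip("10.10.10.10", )
};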
Cedar Schema Grammar
The schema grammar already allows trailing commas in several places (such as record types and entity lists), but not everywhere:
We want to allow trailing commas in the following expressions:
- membership lists: entity Photo in [Account, Album,];
- Multi-entity declarations: entity Photo,Video;
- Record types: { foo : Long, bar : String, } (currently allowed)
- Multi-action declarations: action foo,bar,;
- Apply specs: appliesTo { principal : ["A"], resource: ["B"], } (currently allowed)
EntTypes := Path {',' Path}
becomes:
EntTypes := Path {',' | ',' Path}
Names := Name {',' Name}
becomes:
Names := Name {',' | ',' Name} ','?
Idents := IDENT {',' IDENT}
becomes:
Idents := IDENT {',' | ',' IDENT}
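Putting the schema changes together, a fragment like the following (hypothetical entity and action names) would parse once trailing commas are accepted in these positions:
entity Account, Album, ;
entity Photo in [Account, Album, ] = {
owner: String,
tags: Set<String>,
};
action view, edit, ;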
Backwards Compatibility
This change is fully backwards compatible.
Drawbacks
- Trailing commas are potentially confusing.
Alternatives
- Keep status quo: this is a non critical change
Unresolved questions
Entity Slice Validation
Related issues and PRs
- Implementation PR(s): cedar#1446
Timeline
- Started: 2024-08-01
- Accepted: 2024-10-03
- Landed: 2024-10-03 on
main
- Released: 2024-10-07 in
cedar-policy
4.2.0 as an experimental feature
Note: These statuses are based on the first version of the RFC process.
Summary
This RFC introduces "Entity Slicing Validation" (ESV), which consists of
- An additional validation check, and
- A generic entity slicing algorithm
The two are connected by the concept of a "level," which defines the depth from root variables (like principal
and resource
) at which entity dereferencing operations may safely take place in policies. The validation check ensures that policies never dereference beyond the given level. The slicing algorithm determines what entities are needed to safely decide an authorization request when evaluating policies valid at a given level.
Basic example
Consider the following policies, taken from the TinyTodo example application:
// Policy 1: A User can perform any action on a List they own
permit (
principal,
action,
resource is List
)
when { resource.owner == principal };
// Policy 2: A User can see a List if they are either a reader or editor
permit (
principal,
action == Action::"GetList",
resource
)
when { principal in resource.readers || principal in resource.editors };
// Policy 4: Admins can perform any action on any resource
permit (
principal in Team::"Admin",
action,
resource
);
We suppose (per the schema, not shown) that principal
always has entity type User
, and resource
always has entity type List
. These policies are valid at level 1, meaning that variables principal
, action
, resource
and context
are only directly dereferenced, i.e., any entities they reference are not themselves dereferenced.
For example, in policy 1, the expression resource.owner == principal
directly dereferences resource
by accessing the contents of its owner
attribute. In policy 2, expression principal in resource.readers
directly dereferences resource
to retrieve the contents of its readers
attribute, and also directly dereferences principal
to retrieve its ancestors, to see whether one might be equal to the contents of resource.readers
. Policy 4 also directly dereferences principal
to access its ancestors, to see if one is Team::"Admin"
.
Because these policies are valid at level 1, applications submitting authorization requests only need to provide entity data for entities provided as variables, and nothing else. For example, say we want to evaluate the request (User::"Aaron"
, Action::"GetList"
, List::"Objectives"
, {}
). Then we would only need to provide entity data (direct attributes and ancestors) for User::"Aaron"
, Action::"GetList"
, and List::"Objectives"
. There is no need to provide data for entities referenced from these entities. For example, suppose the readers
attribute of List::"Objectives"
is Team::"interns"
; there is no need to provide data for Team::"interns"
because we know it will never, itself, be dereferenced when evaluating these policies. There is also no need to provide entity data for other entities directly mentioned in policies, e.g., Team::"Admin"
.
Now suppose we wanted to extend our TinyTodo policies with the following:
// Policy 6: No access if not high rank and at location DEF,
// or at resource's owner's location
forbid(
principal,
action,
resource is List
) unless {
principal.joblevel > 6 && principal.location like "DEF*" ||
principal.location == resource.owner.location
};
This policy does not validate at level 1. That is because the expression resource.owner.location
is only valid at level 2: It requires dereferencing resource
to retrieve the entity in the owner
attribute, and then requires dereferencing that entity to obtain the contents of its location
attribute. As a result, if we only provided the principal
and resource
entity data, as described above, evaluating policy 6 could produce an evaluation error since data may not be provided for resource.owner
.
All of the Cedar example policy sets validate with level-based validation, either at level 1 or 2; see the Appendix for details.
Motivation
Currently, Cedar provides no support for slicing. Users are required to either be inefficient and produce their entire entity store as a slice, or derive their own ad-hoc slicing algorithm. We'd like to provide an out-of-the-box solution that addresses many users' needs. Entity slicing in general is a problem closely tied to the specifics of application/deployment. It's likely impossible to have a fully general solution. This RFC is attempting to provide a solution for entity slicing that works for the 80% case, while acknowledging that it won't work for everyone.
Additionally, there are some application scenarios where the actors writing the policies and the people performing slicing are not the same people, and thus the slicers need to impose some kind of restrictions on policy authors.
This RFC has a close connection to RFC 74: Entity Slicing using Data Tries, which also seeks to provide a sound foundation for entity slicing. We discuss the relationship between them in the Alternatives below; ultimately, we believe they are complementary and should coexist.
Detailed design
Entity dereferences
An "Entity Dereference" is any operation that requires directly accessing an entity’s data. The following is an exhaustive list of entity dereferencing operations
e1 in e2
dereferencese1
(but note2
)e1.foo
dereferencese1
iffe1
is an entitye1 has foo
dereferencese1
iffe1
is an entity
Note that testing equality is not a dereferencing operation. Therefore expressions like e1 == e2
, e1.contains(e2)
and e3 in e2
do not dereference e1
or e2
.
Also note that by “dereference” above, we are only referring to the given operation, not about any operations that might be executed to evaluate subexpressions. For example, while e1 in e2
only dereferences e1
(per rule 1 above), if e2
is principal.foo
, then of course evaluating e2
dereferences principal
(per rule 2 above).
Entity roots
The variables principal
, action
, and resource
are entity roots. (Entity literals are not entity roots, and operations on such literals are limited, as described below. Slots and the Unknown
entity are treated as entity literals because slots will be replaced by literals and Unknown
cannot be dereferenced.) They directly name entities from a request that could be dereferenced in a policy. A context
is not an entity root directly; rather, any attributes it (transitively) contains that have entity type are considered roots. For example, consider the following schema:
entity User {
manager: User,
age: Long,
};
action getDetails appliesTo {
principal: User,
resource: User,
context: {
admin: User,
building: { location: Long, ITDeptHead: User },
foo: { inner: User },
bar: { inner: User }
}
};
Here, principal
, action
, and resource
are entity roots (as usual), as are:
- context.admin
- context.building.ITDeptHead
- context.foo.inner, and
- context.bar.inner
Note that context
, context.building
, and context.building.location
are not entity roots because they do not have entity type. Moreover, context.admin.manager
is not an entity root because manager
is not an attribute of context
itself.
Dereference levels
During validation, we extend entity types with a level, which is either ∞
or a natural number. We call these leveled entity types. The level expresses how many entity dereferences are permitted starting from an expression having that type. Levels on types are only used internally; policy/schema writers never see them. When the level is ∞
, validation is unchanged from the status quo.
Specifying the level
The maximum level that policies may validate at is specified as a parameter to the validator. Schemas are unchanged. Just like Cedar currently supports strict (the default) and permissive validation, it would now support those things with the additional parameter of the level at which to validate. As discussed in the alternatives below, a drawback of this approach is that policy writers cannot look in one place (the schema) to know the limits on the policies they can write. But in a sense, due to the dichotomy of strict vs. permissive validation, that's already the case.
Validation algorithm
Validation works essentially as today (see our OOPSLA’24 paper, page 10), but using leveled types. To validate a policy c at level N, we use a request environment which maps principal
, resource
, and context
to leveled types with level N. For example, for the schema given above, to validate at level 2 our request environment would map principal
to User
(2), resource
to User
(2), and context
to the type given in the schema annotated with 2, i.e., { admin: User(2), building: { location: Long, ITDeptHead: User(2) },... }
.
The rules for expression e1 in e2
(line (3) in Fig. 7(a) in the paper) are extended to require that the entity type of e1
is E1(n)
where n > 0
. This requirement indicates that e1
must be an entity from which you can dereference at least one level (to access its ancestors).
The rules for expression e has f
and e.f
(line (6) in Fig. 7(a) in the paper) are extended so that if e
has entity type, then its type must be E(n)
where n > 0
. The rule for e.f
is additionally extended so that if e.f
has entity type F
(per the schema), then the final type is leveled as one less than n
; e.g., if e
has level n
, then the final expression has type F(n-1)
. (Rules for e has f
and e.f
are unchanged when e
has record type.)
The rule for entity literals (not shown in the paper) always ascribes them level 0 unless we are validating at level ∞
. For example, the type of expression User::"Alice"
is User(0)
. This is because we do not permit dereferencing literals in policies when using leveled validation. (See the Alternatives below for a relaxation of this restriction.)
We extend the subtyping judgment (Fig 7(b) in the paper) with the following rule: E(n) <: E(n-1)
. That is, it is always safe to treat a leveled entity type at level n
as if it had fewer levels. This rule is important when multiple expressions are required to have the same level. For example:
if (principal.age < 25) then
principal.manager
else
principal
Suppose we are validating at level 2, then this expression will have type User(1)
, where we use subtyping to give expression principal
(in the else branch) type User(1)
rather than User(2)
. Similarly, if we were defining a set of entities, all of them must be given the same level. For example, we could give [ principal.manager, principal ]
the type Set<User(1)>
. Note that for records, individual attributes may have entities with different levels. E.g., { a: principal, b: principal.manager }
would have type { a: User(2), b: User(1) }
.
Slicing
Once a schema has been validated against a non-infinite level, we can use the following procedure at runtime to compute the entity slice:
/// Takes:
/// request : The request object
/// level : The level the schema was validated at
/// get-entity : A function for looking up entity data given an EUID
function slice(request, level, get-entity) {
// This will be our entity slice
let mut es : Map<EUID, EntityData> = {};
let mut worklist : Set<EUID> = {principal, action, resource, all root euids in context};
for euid in worklist {
es.insert(euid, get-entity(euid));
}
let current-level = 0;
while current-level <= level && !worklist.empty? {
// `t` will become our new worklist
let t : Set<EUID> = {};
// Load any entities in the worklist that we haven't loaded yet
for euid in worklist {
if !es.contains(euid) {
es.insert(euid, get-entity(euid));
}
// Set of all euids mentioned in this entity's data that are not yet in `es`
let new-euids = filter(not-in-es?, all-euids(es.get(euid)));
t = t.union(new-euids);
}
// set `t` to be the new worklist
worklist = t;
// increment the current level
current-level = current-level + 1;
}
return es;
}
Soundness
Leveled types permit us to generalize the soundness statement we use today. Currently, soundness assumes that expressions like principal.manager
will never error, i.e., that the proper entity information is available. Now we assume dereferences are only available up to a certain level. In particular, we define a well-formedness judgment μ ⊢ v : τ where μ is an entity store, v is a value, and τ is a possibly leveled type. This judgment is standard for values having non-entity type, e.g.,
mu |- i : Long
mu |- s : String
mu |- v1 : T1 ... mu |- vn : Tn
-------------------------------------------------------
mu |- { f1: v1, ..., fn: vn } : { f1: T1, ..., fn: Tn }
There are two rules for entity literals of type E::s
. The first says level 0
is always allowed. The second says a literal can have a level n
only if n
dereferences are possible starting from E::s
:
mu |- E::s : E(0)
mu(E::s) = (v,h) where v = { f1: v1, ..., fn: vn } and h is an ancestor set
mu |- v1 : T1 ... mu |- vn : Tn
where level(T1) >= n-1 ... level(Tn) >= n-1
------------------------------------------------------------------------------------
mu |- E::s : E(n) where n > 0
The auxiliary function level(T)
identifies the lower-bound of the level of type T
. It’s defined as follows:
- level(E(n)) = n
- level({ f1: T1, ..., fn: Tn }) = m where m = min(level(T1),...,level(Tn))
- level(Long) = level(String) = level(Boolean) = level(Set(T)) = (etc.) = ∞ for all other types
Thus the soundness statement says that
- If policy set p is validated against a schema s with level l, where l < ∞.
- If entity store μ is validated against schema s.
- If request σ is validated against schema s.
- If μ ⊢ σ(principal): E(l) where E is the expected type of the principal for this particular action, and similarly the proper leveling holds for the action, resource, and context.
- Then: isAuthorized(sigma, p, mu) is either a successful result value or an integer overflow error.
  - All errors except integer overflow + missing entity are prevented by the current validator.
  - Missing entity errors are prevented by the new validator + slicer.
We extend the proof of soundness to leverage premise #4 above: This justifies giving principal
a type with level l in the initial request context, and it justifies the extensions to the rules that perform dereferencing, as they decrease the leveled result by 1, just as the μ⊢v:τ rule does, which we are given.
We also want to prove the soundness of the slice(request, level, get-entity)
procedure defined above: If slice
(σ, l
, get-entity
) = μ and request σ is validated against schema s
and level l
, then μ validates against schema s
at level l
.
Drawbacks
Level-based slicing is coarse-grained: all attributes up to the required level, and all ancestors, for each required entity must be included in the slice. Much of this information may be unnecessary, though, depending on the particulars of the request. For example, for the TinyTodo policies in our detailed example, if the action in the request was Action::"CreateList"
, then only policies 1 and 4 can be satisfied. For these policies, no ancestor data for principal
and resource
is needed, and only the contents of resource.owner
, and not (say) resource.readers
and resource.editors
, is required. For some applications, the cost to retrieve all of an entity’s data and then pass it to the Cedar authorizer could be nontrivial, so a finer-grained approach may be preferred. Some ideas are sketched under alternatives.
Implementing levels is a non-trivial effort: We must update the Rust code for the validator of policies and entities, and the Lean code and proofs. We also must provide code to perform level-based slicing.
Alternatives
Alternative: Entity Slicing using Data Tries
RFC 74, Entity Slicing using Data Tries proposes the idea of an entity manifest to ease the task of entity slicing. This manifest is a trie-based data structure that expresses which entity attributes and ancestors are needed for each possible action. This data structure is strictly more precise than the concept of level, meaning that it may require less entity data to be provided. For example, for the first part of our Basic Example in this RFC (policies 1, 2, and 4 of TinyTodo), the entity manifest would indicate that for the GetList
action, you would need to provide resource.owner
, resource.readers
, resource.editors
and the ancestors of principal
. This is less data than would be retrieved by the slicing algorithm proposed here: These are level-1 valid policies, so the generic algorithm would acquire the same data given in the manifest but also more, e.g., the ancestors of resource
and the attributes of principal
.
Entity manifests are less effective as a prescriptive mechanism that aims to limit the shape of allowable policies. For example, a Cedar policy and entity storage service might like to disallow uploading policies whose entity data is expensive to retrieve in the worst case. With levels, such a service can upper-bound this work by blocking schemas with a level greater than a threshold, say 3. Validation will then ensure that uploaded policies respect this upper bound. Manifests are unwieldy as a specification mechanism, since they are specialized to particular entity definitions and name particular types and attributes, of which there could be many. They'd also have to be updated as types evolved over time, while a level-based specification is less likely to change.
We believe that ultimately level-based validation and entity manifests should coexist. Level-based validation is used to bound entity retrieval work and prevent pathological policies, while entity manifests are used to more efficiently define the needed entity data. We could imagine a deployment strategy in which we accept this RFC and perform generic entity slicing as described here, and then later implement entity manifests, and start performing slicing using those instead.
Alternative: Level in the schema, not as a validation parameter
Instead of specifying the validation level as a parameter to the validation algorithm, we could specify it directly in schemas. The benefit of this is that policy writers can see the expected form of policies in one place. For example, we could extend our schema from the initial example as follows:
level = 1;
entity User {
manager: User,
age: Long,
};
action getDetails appliesTo { ... }
Policy writers know, by seeing level = 1
, that they can write principal.manager
in policies, but not principal.manager.manager
.
On the other hand, you could imagine that different applications might like to use the same schema file but have different restrictions on entity retrieval, owing to the efficiency of a chosen storage strategy. For example, in a graph database, retrieving data may be efficient no matter what the level is, whereas in a relational database, each level might necessitate another join query, which could be very inefficient.
Specifying levels in schemas also adds a complication when multiple schema files are used during validation. When a user specifies multiple files, they are effectively concatenated together by the validator. What should happen if multiple files specify level
and the level is different? One approach is that the validator would allow at most one to have a level
specification, else it's an error. Doing so serves the expected use-case that one of those files will be for the application, specifying the action
types and the level
, while any others will be only general entity
and type
definitions, which are not application specific and thus need not concern themselves with entity slicing.
Alternative: Per-entity levels, rather than a global level
We might refine in-schema level
to not apply to all entities, but rather to particular entity types. For example, for User
entities (bound to principal
) we might specify level 2 but for any other entity type we specify level
as 0, as per the following schema:
@level(2)
entity Issue = {
repo: Repository
};
@level(0)
entity Repository = {
owner: User,
readers: Set<User>,
};
@level(0)
entity User = {
manager: User
};
action ReadIssue appliesTo { principal: User, resource: Issue };
Per-entity levels allow you to reduce the entities you provide in the slice — given a request, you get the types of the principal and resource entities, look up what their levels are from the schema, and then provide data up to that level for each. Validation for per-entity levels is also a straightforward generalization of the algorithm sketched above.
Note that per-entity levels adds some complication to understandability, in particular for nested attributes. The question is whether the performance benefits are worth it. Suppose I have a policy
permit(principal, action, resource) when {
resource.repo.owner.manager == principal
};
This is allowed because Issue
is labeled with level 2. That means that we are allowed to write resource.repo.owner.manager
(two levels of dereference beyond the first, allowed one by level 0). But this is a little confusing because we wrote @level(0)
on Repository
, and yet we are accessing one of its attributes. The reason that’s Ok is that Repository
never actually is bound to principal
or resource
, so the level is irrelevant.
Alternative: Full support for entity literals
Entity literals cannot be dereferenced. That means that, at all levels, they can only appear in ==
expressions or on the RHS of in
. This captures use-cases that we know about. However, if you wanted to use them on the LHS of in
or access their attributes, one approach to doing this would be to extend enumerated entity types to normal entity type definitions. For example,
entity User in [Team] enum { "default" @level } = {
manager: User,
organization: Org
};
This would say that there is a User
entity whose ID is default
which should always be present in the entity slice, e.g., along with other entities for the principal, resource, etc. That presence is at the level specified globally, e.g., if level was 1, then just the User::"default"
and its direct attributes would be included, but not the entities those attributes point to.
Alternatives: Extensions to reduce ancestor requirements
One drawback of level is that it gets all of an entity’s ancestors when it is included in a slice. This could be very expensive in the case that the entity has many such ancestors. For example, in TinyTodo, each User
has Team
-typed ancestors that correspond to the viewers
or editors
of the List
entities which they can access. This could be 1000s of lists, which are expensive to acquire and to include in the request (especially if calling AVP). Worse, this work is apparently wasted because the policies can only ever look at some of these ancestors, notably those reachable from the editors
or viewers
of the resource also provided in the request.
Instead of passing in a principal in a request with all of its ancestors, we could just as well pass in only those that are possibly referenced by policy operations. For TinyTodo, it would be those equal to resource.viewers
or resource.editors
(if they match). Since this invariant is not something the application necessarily knows for certain (it might be true of policies today, but those policies could change in the future), we want a way to reduce the provided ancestor data while being robust to how policy changes in the future. Here are three possible approaches to addressing this issue.
Alternative 1: Leverage partial evaluation
We can use partial evaluation to solve this problem: Specify a concrete request with all of the needed attribute data, after entity selection, but set the ancestor data as unknown. Partially evaluating such a request will produce results that have incomplete entity checks. These checks can then be handled by the calling application. This might be a nice feature we can package up and give to customers as a client-side library.
A drawback of this approach is that it could slow down evaluation times. This is because the entity hierarchy is used to perform policy slicing. In the worst case, you’d have to consider all possible policies. It might be possible to provide at least some of the hierarchy with the request to reduce the set of policies that are partially evaluated.
Alternative 2: Ancestor allow-lists
When validating static policies, it’s possible to see which ancestors are potentially needed. Could we do something similar to level for policy validation, but for ancestors instead?
For TinyTodo policies above, we see the following expressions in policies:
principal in resource.viewers
principal in resource.editors
principal in Team::"admin"
For the other examples, the situation is similar. We also see multi-level access, e.g., with GitHub policies principal in resource.repo.readers
, and we see recursive access through sets, e.g., for Hotel Chains we see resource in principal.viewPermissions.hotelReservations
where hotelReservations
is a set containing entities, rather than being a single entity.
Approach: Add an annotation @ancestors
to the in
declarations around entities. When I write entity User in [Team]
, add an annotation that enumerates the Team
-typed expressions that can be on the RHS of in
in policies. Here’s an example of the annotated TinyTodo schema:
@ancestors(resource.viewers,resource.editors,Team::"admin",Team::"interns")
entity User in [Team, Application] = {
"joblevel": Long,
"location": String,
};
@ancestors()
entity Team in [Team, Application];
@ancestors()
entity List in [Application] = {
"editors": Team,
"name": String,
"owner": User,
"readers": Team,
"tasks": Tasks,
};
@ancestors()
entity Application;
...
Notice that we annotate List
, Team
, or Application
with ()
since entities of these types never appear on the LHS of in
. An entity type with no annotation at all is assumed to require every potential ancestor.
Validation: The validator will check for syntactic equality against the expressions in the annotation. In particular, when it sees an expression E in F
in a policy, if E
has an entity type that is annotated with the set of expressions S
in the schema, then the validator will check that F
∈ S
, using syntactic equality. With the above schema, all current TinyTodo policies would validate, while the following would not
// Fails because a Team-typed entity is on the LHS of `in`,
// but entity Team has no legal RHS expressions per the schema
permit(principal,action == Action::"GetList",resource)
when { resource.editors in Team::"admin" };
// Fails because Team::"temp" is not a legal RHS expression for User
permit(principal in Team::"temp",action,resource);
Entity selection: When collecting ancestors during entity selection, the application only gets ancestors per the expressions in the annotation. To do that, it would evaluate these expressions for the current request, and provide the ancestors present in the result, if present. For example, suppose our TinyTodo request was
principal = User::"Alice"
action = Action::"GetList"
resource = List::"123"
context = {}
Since TinyTodo policies are level 2, we would provide entity data for both User::"Alice"
and List::"123"
. Suppose that the latter’s data is
{ id = "123",
owner = User::"Bob",
viewers = Team::"123viewers",
editors = Team::"123editors"
}
The principal
has type User
, which is annotated in the schema with the following expressions, shown with what they evaluate to:
resource.viewers
evaluates toList::"123".viewers
which evaluates toTeam::"123viewers"
resource.editors
evaluates toList::"123".editors
which evaluates toTeam::"123editors"
Team::"admin"
(itself)Team::"interns"
(itself)
Thus the application can determine in advance whether User::"Alice"
(the principal
) has any of these four ancestors, and include them as such with User::"Alice"
’s entity data, if so. No other ancestor data needs to be provided.
Note: The above algorithm is not very helpful for templates containing expressions in ?resource
or in ?principal
(templates with only == ?principal
and == ?resource
in them are fine, since they don’t check the ancestors). For these, you are basically forced to not annotate the entity type because ?resource
and ?principal
are not legal expressions. This makes sense inasmuch as a template with constraint principal in ?principal
tells us nothing about possible values to which ?principal
could be linked — the values could include potentially any entity, so in the worst case we’d need to include all possible ancestors. It may be that you could use further contextual data (like the action) to narrow the ones you provide.
Note: Interestingly, no policy-set example ever has an expression other than principal
or resource
on the LHS of in
. The scheme above is not leveraging this fact.
Note: This approach composes nicely with the per-entity @level
approach sketched above.
Alternative: Reduce attribute requirements
The per-entity level approach is one way to reduce attribute requirements, but only across levels — for a particular entity you still must provide all of its attributes if you are providing its data. We might like to provide only some attributes, similar to the alternative that requires only some, not all, ancestors.
As discussed earlier, partial evaluation can be used to help here. There could also be some sort of allow-list for attributes. Probably it would make sense to associate allow-lists with actions.
Appendix: Accepted policy sets
All existing policies in published example Cedar applications can be validated using levels. In particular, the following policy sets validate at level 1:
- Tags and roles: https://github.com/cedar-policy/cedar-examples/tree/main/cedar-example-use-cases/tags_n_roles
- Sales orgs: https://github.com/cedar-policy/cedar-examples/tree/main/cedar-example-use-cases/sales_orgs
- Hotel chains: https://github.com/cedar-policy/cedar-examples/tree/main/cedar-example-use-cases/hotel_chains
And the following validate at level 2:
- TinyTodo: https://github.com/cedar-policy/cedar-examples/blob/main/tinytodo/policies.cedar and also https://github.com/cedar-policy/cedar-examples/blob/main/tinytodo/policies-templates.cedar
  - Policy 6 includes the expression resource.owner.location
- Tax Preparer: https://github.com/cedar-policy/cedar-examples/blob/main/cedar-example-use-cases/tax_preprarer/policies.cedar
  - The access rule involves checking something to the effect of principal.assigned_orgs.contains(resource.owner.organization).
- Document cloud: https://github.com/cedar-policy/cedar-examples/blob/main/cedar-example-use-cases/document_cloud/policies.cedar
  - This has the policy expression (resource.owner.blocked.contains(principal) || principal.blocked.contains(resource.owner)). (It would not make sense to duplicate the blocked
- GitHub: https://github.com/cedar-policy/cedar-examples/blob/main/cedar-example-use-cases/github_example/policies.cedar
  - Expressions like principal in resource.repo.readers are common: A resource like an issue or pull request inherits the permissions of the repository it is a part of, so the rules chase a pointer to that repository.
datetime
extension
Related issues and PRs
- Implementation PR(s): cedar#1276
Timeline
- Started: 2024-08-08
- Accepted: 2024-09-11
- Landed: 2024-11-13 on
main
- Released: 2025-01-21 in
cedar-policy
v4.3.0 as an experimental feature
Note: These statuses are based on the first version of the RFC process.
Summary
Cedar currently supports extension functions for IP addresses and decimal values. A popular request has been support for dates and times as well, but there are several complexities here (e.g., timezones and leap years) that have delayed this. Recall that anything we add to Cedar, we want to be able to formally model in Lean using our verification-guided development approach (see this blog post). We also want to support a decidable SMT-based analysis (see this paper). The goal of this RFC is to narrow in on a set of date/time related features that are useful in practice, but still feasible to implement in Cedar given these constraints.
Basic example
This RFC would support a policy like the following, which allows a user to access materials in the "device_prototypes" folder only if they have a sufficiently high job level and a tenure of more than one year.
permit(
principal is User,
action == Action::"view",
resource in Folder::"device_prototypes"
)
when {
principal.department == "HardwareEngineering" &&
principal.jobLevel >= 10 &&
context.now.timestamp.durationSince(principal.hireDate) > duration("365d")
};
Motivation
Previously for applications that determine authorization based on date/time, our suggestion has been to use Unix time (see Workaround A) or to pass the result of time computations in the context (see Workaround B).
But these are not ideal solutions because Cedar policies are intended to be expressive, readable, and auditable.
Unix timestamps fail the readability criteria: an expression like context.now < 1723000000
is difficult to read and understand without additional tooling to decode the timestamp.
Unix timestamps are also indistinguishable from any other integral value, providing no additional help in expressing the abstraction of time.
Passing in pre-computed values in the context (e.g., a field like context.isWorkingHours
) makes authorization logic difficult to audit because it moves this logic outside of Cedar policies and into the calling application.
It is also difficult for the calling application to predict the necessary pre-computed values that a policy writer requires for their intended purpose. These may change over time, and may also differ depending on the principal, action, and/or resource.
Additional examples
The examples below illustrate the types of policies that this proposal would enable. It assumes that the application adds a Cedar Record named now
to context
with the following fields:
- timestamp: A value of type datetime (defined below) representing the current time at policy evaluation
- dayOfWeek: A long value representing the current day of week (Sunday = 1, ..., Saturday = 7)
- day: A long representing the current day of the month
- month: A long representing the current month (January = 1)
- year: A long representing the current year
Only allow user "alice" to view JPEG photos for one week after creation.
permit(
principal == User::"alice",
action == PhotoOp::"view",
resource is Photo
) when {
resource.fileType == "JPEG" &&
context.now.timestamp.durationSince(resource.creationTime) <= duration("7d")
};
Allow access from a certain IP address only between 9am and 6pm UTC.
permit(
principal,
action == Action::"access",
resource
) when {
context.srcIp.isInRange(ip("192.168.1.0/24")) &&
context.workdayStart <= context.now.timestamp &&
context.now.timestamp < context.workdayEnd
};
Prevent employees from accessing work documents on the weekend.
forbid(
principal,
action == Action::"access",
resource is Document
) when {
[1,7].contains(context.now.dayOfWeek)
};
Permit access to a special opportunity for persons born on Leap Day.
permit(
principal,
action == Action::"redeem",
resource is Prize
) when {
principal.birthDate.day == 29 &&
principal.birthDate.month == 2 &&
context.now.day == 29 &&
context.now.month == 2
};
Forbid access to EU resources after Brexit
forbid(
principal,
action,
resource
) when {
context.now.timestamp > datetime("2020-01-31T23:00:00Z") &&
context.location.countryOfOrigin == "GB" &&
resource.owner == "EU"
};
Detailed design
This RFC proposes supporting two new extension types: datetime
, which represents a particular instant of time, up to millisecond accuracy, and duration
which represents a duration of time.
To construct and manipulate these types we will provide the functions listed below.
All of this functionality will be hidden behind a datetime
feature flag (analogous to the current decimal and IP extensions), allowing users to opt-out.
Instants of Time (datetime
)
The datetime(string)
function constructs a datetime value. Like with other extension function constructors, strict validation requires string
to be a string literal, although evaluation/authorization support any string-typed expression. The string must be in one of the following forms; regardless of the timezone offset, the value is always normalized to UTC:
- "YYYY-MM-DD" (date only)
- "YYYY-MM-DDThh:mm:ssZ" (UTC)
- "YYYY-MM-DDThh:mm:ss.SSSZ" (UTC with millisecond precision)
- "YYYY-MM-DDThh:mm:ss(+/-)hhmm" (with timezone offset in hours and minutes)
- "YYYY-MM-DDThh:mm:ss.SSS(+/-)hhmm" (with timezone offset in hours and minutes and millisecond precision)
All formats must adhere to these additional restrictions:
- The month field (MM) must be in the range 01-12
- The day field (DD) must be in the range 01-31
- The hour field (hh) must be in the range 00-23
- The minute field (mm) must be in the range 00-59
- The seconds field (ss) must be in the range 00-59 (leap seconds are not supported)
Note that the restrictions on the hour and minute fields also apply to timezone offsets.
This implies that timezone offsets must have an absolute value less than 24 hours (i.e., between -2359
and +2359
).
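For illustration, here are some literals in the accepted formats (each is normalized to UTC internally):
datetime("2024-08-21")                 // date only; midnight UTC
datetime("2024-08-21T14:30:00Z")       // UTC, second precision
datetime("2024-08-21T14:30:00.250Z")   // UTC, millisecond precision
datetime("2024-08-21T14:30:00+0530")   // timezone offset of +5 hours 30 minutes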
The datetime
type does not provide a way for a policy author to create a datetime
from a numeric timestamp. One of the readable formats listed above must be used, instead.
Values of type datetime
have the following methods:
- .offset(duration) returns a new datetime, offset by duration.
- .durationSince(DT2) returns the difference between DT and DT2 as a duration. (Note that the inverse of durationSince is DT2.offset(duration).) An invariant for DT1.durationSince(DT2) is that when DT1 is before DT2 the resulting duration is negative.
- .toDate() returns a new datetime, truncating to the day, such that printing the datetime would have 00:00:00 as the time.
- .toTime() returns a new duration, removing the days, such that only milliseconds since .toDate() are left. This is equivalent to DT.durationSince(DT.toDate()).
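As a sketch of how these methods compose (using context.now.timestamp as in the earlier examples, and a hypothetical resource.creationTime attribute of type datetime):
permit(principal, action == Action::"access", resource)
when {
// within 30 days of the resource's creation ...
context.now.timestamp <= resource.creationTime.offset(duration("30d")) &&
// ... and during the first 8 hours of the current UTC day
context.now.timestamp.toTime() <= duration("8h")
};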
Values of type datetime
can be used with comparison operators:
- DT1 < DT2 returns true when DT1 is before DT2
- DT1 <= DT2 returns true when DT1 is before or equal to DT2
- DT1 > DT2 returns true when DT1 is after DT2
- DT1 >= DT2 returns true when DT1 is after or equal to DT2
- DT1 == DT2 returns true when DT1 is equal to DT2
- DT1 != DT2 returns true when DT1 is not equal to DT2
Equality is based on the underlying representation (see below) so, for example, datetime("2024-08-21") == datetime("2024-08-21T00:00:00.000Z")
is true. This behavior is consistent with the decimal extension function, where decimal("1.0") == decimal("1.0000")
is also true.
Representation
The datetime
type is internally represented as a long
and contains a Unix Time in milliseconds. This is the number of non-leap seconds that have passed since 1970-01-01T00:00:00Z
in milliseconds. A negative Unix Time represents the number of milliseconds before 1970-01-01T00:00:00Z
. Unix Time days are always 86,400 seconds and handle leap seconds by absorbing them at the start of the day. Due to using Unix Time, and not providing a "current time" function, Cedar avoids the complexities of leap second handling, pushing them to the system and application.
Durations of Time (duration
)
The duration(string)
function constructs a duration value from a duration string. Strict validation requires the argument to be a literal, although evaluation/authorization support any appropriately-typed expressions. The string
is a concatenated sequence of quantity-unit pairs. For example, "1d2h3m4s5ms"
is a valid duration string.
The quantity part is a natural number. The unit is one of the following:
- d: days
- h: hours
- m: minutes
- s: seconds
- ms: milliseconds
Duration strings are required to be ordered from largest unit to smallest unit, and contain one quantity per unit. Units with zero quantity may be omitted. "1h", "-10h", "5d3ms", and "3h5m" are all valid duration strings.
A duration may be negative. Negative duration strings must begin with -.
Values of type duration have the following methods (a usage sketch follows the list):
- .toMilliseconds() returns a long describing the number of milliseconds in this duration (the value as a long, itself).
- .toSeconds() returns a long describing the number of seconds in this duration (.toMilliseconds() / 1000).
- .toMinutes() returns a long describing the number of minutes in this duration (.toSeconds() / 60).
- .toHours() returns a long describing the number of hours in this duration (.toMinutes() / 60).
- .toDays() returns a long describing the number of days in this duration (.toHours() / 24).
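As a worked example (the value is ours, and we assume the divisions above truncate toward zero):
// "1d2h3m4s5ms" = 86,400,000 + 7,200,000 + 180,000 + 4,000 + 5 milliseconds
duration("1d2h3m4s5ms").toMilliseconds() == 93784005
duration("1d2h3m4s5ms").toSeconds() == 93784
duration("1d2h3m4s5ms").toDays() == 1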
Values with type duration can also be used with comparison operators:
- DUR1 < DUR2 returns true when DUR1 is shorter than DUR2
- DUR1 <= DUR2 returns true when DUR1 is shorter than or equal to DUR2
- DUR1 > DUR2 returns true when DUR1 is longer than DUR2
- DUR1 >= DUR2 returns true when DUR1 is longer than or equal to DUR2
- DUR1 == DUR2 returns true when DUR1 is equal to DUR2
- DUR1 != DUR2 returns true when DUR1 is not equal to DUR2
Comparisons are done with respect to the sign of a duration; for example, duration("-1d") < duration("1s").
Equality is based on the underlying representation (see below) so, for example, duration("1d") == duration("24h") is true.
Representation
The duration type is internally represented as a quantity of milliseconds stored in a long, which can be positive, negative, or zero. A negative duration may be useful when a user wants to use .offset() to shift a date backwards. For example, context.now.offset(duration("-3d")) expresses "three days before the current date".
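A policy might use such a backwards shift like this (context.now and resource.lastModified are hypothetical attributes supplied by the application):
permit(principal, action == Action::"view", resource)
when {
  // only resources modified within the last three days are viewable
  resource.lastModified >= context.now.offset(duration("-3d"))
};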
Errors
All the extension functions proposed in this RFC will throw a type error at authorization time if called with the wrong type of arguments.
Additionally, the datetime and duration constructors will return an error if the input string does not match the expected format, or if the internal representation of the value (a 64-bit signed int) would overflow. The datetime constructor will also return an error if the input string refers to a non-existent calendar date (e.g., "2025-02-31"). Unlike some datetime libraries that perform overflow normalization, this extension explicitly rejects such invalid dates. The .offset(duration) method will return an error if the resulting datetime would overflow.
As noted above, strict validation will require passing literals to the duration and datetime constructors, and it will raise an error if those strings are malformed. Otherwise, validation is straightforward.
JSON Encoding
Cedar supports a JSON format for policies, schemas, entities, and request contexts. The JSON representation of the datetime and duration functions will match the precedents set by the existing IP address and decimal extensions.
For example, here is the JSON encoding of the expression datetime("2020-01-31T23:00:00Z"), which might occur in a policy condition:
"datetime": [
{
"Value": "2020-01-31T23:00:00Z"
}
]
And here is the JSON encoding for this value when it occurs in entity or context data:
{ "__extn": { "fn": "datetime", "arg": "2020-01-31T23:00:00Z" } }
Finally, here is the JSON encoding of the datetime type in a schema:
{
"type": "Extension",
"name": "datetime"
}
Out of scope
- Conversion between UTC and epochs: This will be particularly difficult to model and verify in Lean (although it's technically possible, see this paper which does something similar in the Coq proof assistant). Since it will likely require input-dependent loops, it is unlikely that this can be reasoned about efficiently with SMT.
- Conversion between UTC and other time zones: Time zones are a notoriously complex system that evolves rapidly. We avoid this complexity by offering datetime.offset(duration). Policy authors that require "local time" can either provide an additional datetime in context or provide a duration to the context and call .offset() to shift the time.
- Function to get current time: A natural extension of this proposal would be a function currentTime() that provides the current time instead of it being passed in through context. However, currentTime() is stateful, i.e. not pure, and cannot be modeled in SMT. It's also not useful: currentTime() == currentTime() may return false, and Cedar does not provide local bindings (e.g., let t = currentTime()). For testing purposes, Cedar would also need to provide some way to override currentTime. These problems all go away if currentTime() is not supported.
- Leap seconds and leap years: Cedar does not have a clock, and this proposal does not add one. Instead, applications pass the current time through context or an entity and let the system / application handle complexities like leap seconds and leap years. This means that Cedar cannot provide utilities like datetime.dayOfWeek() or datetime.dayOfMonth(). Cedar applications that wish to define policies based on these ideas should pass pre-computed properties through entities or through the context.
Support for date/time in other authorization systems
AWS IAM supports date condition operators, which can check relationships between date/time values in the ISO 8601 date format or Unix time. You can find an example of an IAM policy using date/time here.
Open Policy Agent provides a Time API with nanosecond precision and extensive time zone support. During policy evaluation, the current timestamp can be returned, and date/time arithmetic can be performed. The diff function (equivalent to our proposed durationSince operator on datetime) returns an array of positional time unit components, instead of a value typed similarly to our proposed duration.
Alternatives
Both alternatives we propose here are "do nothing" options. They show how to encode date/time using existing Cedar types and functionality.
Workaround A: Represent Unix time with a Long
Cedar has long suggested workarounds for date/time functionality by using the comparison and arithmetic operators with context-provided Unix Timestamps.
Here are the previous examples rewritten to use Unix Time.
Only allow experienced, tenured persons from the Hardware Engineering department to see prototypes.
permit(
principal is User,
action == Action::"view",
resource in Folder::"device_prototypes"
)
when {
principal.department == "HardwareEngineering" &&
principal.jobLevel >= 10 &&
(context.currentTime - (365 * 24 * 60 * 60)) >= principal.hireDate
};
Only allow user "alice" to view JPEG photos for one week after creation.
permit(
principal == User::"alice",
action == PhotoOp::"view",
resource is Photo
) when {
resource.fileType == "JPEG" &&
resource.creationDate <= (context.currentTime - (7 * 24 * 60 * 60))
};
Allow access from a certain IP address only between 9am and 6pm UTC.
Cedar does not currently support arithmetic division (/) or remainder (%), and therefore this example is not expressible today. If it were, it would look something like this:
permit(
principal,
action == Action::"access",
resource
) when {
context.srcIp.isInRange(ip("192.168.1.0/24")) &&
9 <= ((context.currentTime / (60 * 60)) % 24) &&
((context.currentTime / (60 * 60)) % 24) < 18
};
Note that the localized version of this example, with timezone offset, could be supported using the + or - operators on context.currentTime.
Prevent employees from accessing work documents on the weekend.
With Unix Time, this requires the / and % operators to compute the dayOfWeek, which isn't currently expressible in Cedar.
Permit access to a special opportunity for persons born on Leap Day.
With Unix Time, this requires the / and % operators to compute whether or not it is a leap year, which isn't currently expressible in Cedar.
Forbid access to EU resources after Brexit
forbid(
principal,
action,
resource
) when {
context.currentTime > 1580511600 &&
context.location.countryOfOrigin == "GB" &&
resource.owner == "EU"
};
Workaround B: Pass results of time checks in the context
Another workaround we have suggested is to simply handle date/time logic outside of Cedar, and pass the results of checks in the context. For example, you could pass in fields like context.isWorkingHours or context.dayOfTheWeek.
Here are the previous examples rewritten to use additional context.
Only allow experienced, tenured persons from the Hardware Engineering department to see prototypes.
permit(
principal is User,
action == Action::"view",
resource in Folder::"device_prototypes"
)
when {
principal.department == "HardwareEngineering" &&
principal.jobLevel >= 10 &&
principal.hireDate <= context.minimumHiringDateForAccess
};
Note: assumes principal.hireDate and context.minimumHiringDateForAccess are long values (e.g., Unix time).
Only allow user "alice" to view JPEG photos for one week after creation.
permit(
principal == User::"alice",
action == PhotoOp::"view",
resource is Photo
) when {
resource.fileType == "JPEG" &&
resource.creationDate >= context.oldestViewableDate
};
Note: assumes resource.creationDate and context.oldestViewableDate are long values (e.g., Unix time).
Allow access from a certain IP address only between 9am and 6pm UTC.
permit(
principal,
action == Action::"access",
resource
) when {
context.srcIp.isInRange(ip("192.168.1.0/24")) &&
context.isWorkingHours
};
Prevent employees from accessing work documents on the weekend.
forbid(
principal,
action == Action::"access",
resource is Document
) when {
context.isTheWeekend
};
Permit access to a special opportunity for persons born on Leap Day.
permit(
principal,
action == Action::"redeem",
resource is Prize
) when {
context.isMyBirthday && context.isLeapDay
};
Forbid access to EU resources after Brexit
forbid(
principal,
action,
resource
) when {
context.afterBrexit &&
context.location.countryOfOrigin == "GB" &&
resource.owner == "EU"
};
Discussion notes
Local time
The current proposal has no direct support for local time, or time zones outside of UTC. Providing robust time zone support would add significant complications to the formal models and ruin the project's SMT-based analysis goals. Policy authors wishing to use local time can simulate it by:
- providing offsets, of the form +/-hhmm, in the time strings used by datetime(). (Note: no time zone information is retained; the time will be converted to UTC.)
- utilizing the datetime.offset() method with duration values, or values passed through entities and/or context.
Consider the policy below that checks if a principal's local time is between 09:00 and 17:00.
permit(
principal,
action == Action::"access",
resource
) when {
context.now.timestamp.offset(principal.timeZoneOffset).toTime() >= duration("9h") &&
context.now.timestamp.offset(principal.timeZoneOffset).toTime() <= duration("17h")
};
Operator overloading
This RFC proposes to use the operators <, <=, >, and >= for comparing datetime and duration objects.
Currently in Cedar, these operations are only supported for long-typed values.
For other extension types with similar operations, Cedar instead uses extension functions (e.g., .lessThan() for decimal values).
This RFC proposes to reverse this decision, and instead allow using builtin operators for extension functions, as appropriate. This will add some implementation complexity (at least in the primary Rust implementation), but it will make policies that use these operations easier to read and easier to write.
Millisecond precision
The current proposal supports milliseconds. The ISO 8601 format does not specify a maximum precision, so we could technically allow any number of S's after the . in YYYY-MM-DDThh:mm:ss.SSSZ. Based on this blog post, it appears that JavaScript supports milliseconds (3 digits), Python supports microseconds (6 digits), and Rust and Go support nanoseconds (9 digits). Assuming nanosecond accuracy, the maximum (signed) 64-bit number (2^63 - 1) represents April 11, 2262. This date seems far enough out that any of these choices (milliseconds, microseconds, or nanoseconds) seems reasonable.
During discussion, we decided that sub-second accuracy was potentially useful, but we did not have a use case in mind for sub-millisecond accuracy. So in the end we landed on milliseconds. Note that this is the backwards compatible option (at least with respect to allowable date/time strings) because we can add precision later, but not remove it without a breaking change.
Additional restrictions on datetime strings
The current proposal sets additional restrictions on most fields of datetime string formats. These restrictions are common in datetime frameworks for popular programming languages like Python and TypeScript (more details here), and they help us avoid discrepancies between different Cedar implementations.
Overflow handling
The current proposal rejects non-existent dates like 2025-02-31.
However, some programming languages and datetime libraries accept such dates and handle excess days through overflow normalization, resulting in a different valid calendar date. For instance, 2025-02-31 is interpreted as being 3 days later than 2025-02-28, thus normalizing to 2025-03-03. The standard Date type in TypeScript reflects this behavior, as shown in this comparison.
We argue that this normalization behavior, while convenient for general programming, may be too implicit and potentially surprising for an authorization system.
Entity tags with dedicated syntax and semantics
Related issues and PRs
- Reference Issues: cedar#305
- Related RFC(s): Supersedes RFC 68
- Implementation PR(s): cedar#1207
Timeline
- Started: 2024-09-11
- Accepted: 2024-09-19
- Landed: 2024-09-18 on main
- Released: 2024-09-30 in cedar-policy v4.1.0 as an experimental feature
- Stabilized: 2024-10-07 in cedar-policy v4.2.0
Note: These statuses are based on the first version of the RFC process.
Summary
This RFC proposes to extend the Cedar language, type system, and symbolic analysis to include full-featured entity tags for entity types. Tags are a mechanism used by cloud services to attach key-value pairs to resources. Cedar will allow them to be attached to any entities (not just resources).
For evaluation purposes, entity tags are accessed using the methods expr1.hasTag(expr2) and expr1.getTag(expr2), where expr1 evaluates to an entity and expr2 to a string. These are the only ways to access the tags on an entity. The policy syntax, semantics, type system, and symbolic analysis are extended to support these two methods. The schema syntax and the entity JSON format are extended to support defining entity tags and their types.
Entity tags strictly increase the expressiveness of Cedar because they allow computed keys, unlike records and entities. Specifically, tag keys need not be literals, so it is possible to write resource.hasTag(context.tag) and resource.getTag(context.tag).
This proposal is backward-compatible in the sense that existing policies, schemas, entities, etc., do not need to change to accommodate the addition of tags to the language.
This RFC has gone through several revisions, with different revisions proposing alternative designs. The alternatives are discussed at the end of this document.
Basic example
Here is a schema defining two entities, each of which contains entity tags.
entity User = {
jobLevel: Long,
} tags Set<String>;
entity Document = {
owner: User,
} tags Set<String>;
The User and Document entities each have entity tags, denoted by tags Set<String>, implementing tags whose values are sets of strings. The declaration tags Set<String> indicates an unspecified number of optional tags that all have values of type Set<String>. (Of course, any type could be written, for instance Long, not just Set<String>.) Note that the set of tag keys does not have to be defined in the schema, but the type of all tag values needs to be the same (here, Set<String>). (You cannot have tag foo have value 2, and tag bar have value "spam".) Note also that there is no way to name the key-value map representing tags; you cannot treat it as a Cedar record or even compare it for equality to the tag-map on a different entity; you can only compare the values of individual keys.
Here's a policy that conforms to this schema:
permit (
principal is User,
action == Action::"writeDoc",
resource is Document)
when {
resource.owner == principal ||
(principal.jobLevel > 6 &&
resource.hasTag("write") &&
principal.hasTag("write") &&
resource.getTag("write").containsAny(principal.getTag("write")))
};
This policy states that for a User to carry out the writeDoc action on a Document, the user must own the document, or else the user's job level must be greater than 6 and the document and the user must each have a write tag, where at least one of the user's write-tag's values must be present in the document's write-tag's values.
Detailed design
To support entity tags, we need to extend the JSON format for entities, the JSON and natural syntax for schemas, the policy syntax, as well as the evaluator, validator, and symbolic compiler.
Policy syntax and semantics
Tags support two binary operators, where e and k are Cedar expressions:
- e.hasTag(k)
- e.getTag(k)
The hasTag operator tests if the entity e has a tag with the key k. This operator errors if e is not an entity, or k is not a string. It returns false if e doesn't have any tags at all, matching the behavior of the has operator on entities and records. In other words, having no tags is equivalent to having an empty tags-map. The getTag operator retrieves the value for the key k from the tags for the entity e. This operator errors under the same conditions as hasTag and, additionally, if e has no tag with the key k.
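For instance, because tag keys need not be literals, a policy can look up a key supplied in the request context (context.tagKey is a hypothetical attribute; the schema is the one from the basic example above):
permit(principal is User, action == Action::"writeDoc", resource is Document)
when {
  resource.hasTag(context.tagKey) &&
  principal.hasTag(context.tagKey) &&
  resource.getTag(context.tagKey).containsAny(principal.getTag(context.tagKey))
};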
At the code level, we will extend the Cedar CST, AST, and EST to include the new operators (either as separate nodes in the relevant trees or as two extra binary operators), and we'll extend the evaluator to implement the above semantics.
JSON entity format
The JSON entity format is extended to support optionally specifying tags separately from entity attributes:
[
{
"uid": {},
"parents": [],
"attrs": {},
"tags": {}
},
{
...
}
]
Schema
We extend schemas as follows to support the declaration of tags. Here's the natural syntax:
Entity := 'entity' Idents ['in' EntOrTyps] [['='] EntityRecType] ['tags' Type] ';'
The new syntax simply extends the Entity production to enable optionally specifying the type of attached tags.
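For illustration, each of the following declarations is accepted under this grammar (the entity names and tag types are ours):
entity Team;
entity User in Team = { jobLevel: Long } tags Set<String>;
entity Document tags String;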
The JSON syntax for schemas specifies tags as a separate tags field giving the type of the tag values. Here's our introductory example schema in this format:
"entityTypes": {
"User" : {
"shape" : {
"type" : "Record",
"attributes" : {
"jobLevel" : {
"type" : "Long"
}
}
},
"tags" : {
"type" : "Set",
"element": { "type": "String" }
}
},
"Document" : {
"shape" : {
"type" : "Record",
"attributes" : {
"owner" : {
"type" : "Entity",
"name" : "User"
}
}
},
"tags" : {
"type" : "Set",
"element": { "type": "String" }
}
}
}
Validating policies
We extend the policy validator to handle the hasTag and getTag operators analogously to how it handles the has and . operators on records and entities.
Capabilities
Background: While typechecking an expression involving records or entities with optional attributes, the validator tracks capabilities. These represent the attribute-accessing expressions that are sure to succeed. If principal has type User and User has an optional Boolean attribute sudo, then the expression principal.sudo only validates if principal.sudo is present in the current capability set. Capabilities are added to that set by a preceding expression of the form principal has sudo, e.g., as in principal has sudo && principal.sudo.
Capability tracking must be generalized to support tags. In particular, tags T should be treated as a record with optional attributes, and hasTag checks on its keys should update the capability set. For our introductory example, consider the following expression:
resource.hasTag("write") && // (1)
principal.hasTag("write") && // (2)
resource.getTag("write").containsAny(principal.getTag("write")) // (3)
After subexpression (1), resource.hasTag("write") should be added to the current capability set. After subexpression (2), principal.hasTag("write") should be added to it. Finally, when validating subexpression (3), the expression resource.getTag("write") will be considered valid since resource.hasTag("write") is in the current capability set, and it will be given type Set<String>, as the tags for resource have type Set<String>. The expression principal.getTag("write") is handled similarly. If either of the hasTag subexpressions (1) or (2) were omitted, subexpression (3) would not validate due to the missing capability set entries.
For entity types with no tags declaration in the schema, the validator gives hasTag the type False (to support short-circuiting), and rejects getTag.
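As a small illustration, suppose a hypothetical entity type Photo is declared with no tags; then:
// Given: entity Photo = { owner: User };   (no tags declaration)
resource.hasTag("confidential")   // typed as False: the check can never succeed
resource.getTag("confidential")   // rejected by the validator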
Validating and parsing entities
The Cedar authorizer's is_authorized function can be asked to validate that entities in a request are consistent with a provided schema. Extending this validation to work with tags is straightforward: we check that every value in the tags map for an entity e has type T, given that the type of e is declared to have tags T. If the type of e doesn't have tags in the schema, then specifying a tags field is an error.
Similarly, schema-based parsing considers schemas when parsing entities, so it can confirm that the contents of the tags attribute in the JSON have the appropriate shape. If the type T has different behavior under schema-based parsing (for instance, if it is an extension type), then schema-based parsing can interpret the tags appropriately.
Symbolic compilation
Tags as defined here have an efficient logical representation as uninterpreted binary functions. In particular, for each declaration entity E ... tags T, we introduce an uninterpreted function f_E of type E -> String -> Option T. With this representation, we can straightforwardly translate the hasTag and getTag operations as applications of the function f_E, where the absence of a key is denoted by binding it to the value (none T).
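To make the shape of this encoding concrete, here is a minimal Lean sketch (ours, not the actual cedar-spec model), in which the tags-map for an entity type E declared with tags T is simply a function into Option T:
-- A sketch only: tags for `entity E ... tags T` are modeled as an
-- (uninterpreted) function from entities and string keys to Option T.
def hasTag {E T : Type} (f : E → String → Option T) (e : E) (k : String) : Bool :=
  (f e k).isSome          -- a missing key is bound to `none`

def getTag {E T : Type} (f : E → String → Option T) (e : E) (k : String) : Option T :=
  f e k                   -- evaluation errors when the result is `none`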
Drawbacks
Entity types are permitted to have a single tags declaration tags T, which eliminates the possibility of attaching multiple tags-maps to a single entity (unlike RFC 68). This RFC also does not support tags containing other tags-maps, comparing tags-maps with ==, and the like (as discussed above).
The use-cases that we are aware of do not suffer due to these limitations. If you wanted tags-maps to contain other tags-maps, or you wanted to store tags-maps in the context record, or you wanted to attach multiple tags-maps to a single entity, you can create specific entity types with those tags. For example:
entity User {
roles: Set<String>,
accessLevels: IntTags
} tags StringTags;
entity StringTags {
} tags String;
entity IntTags {
} tags Long;
In effect, the User's tags is equivalent to tags (tags String), but we have added a level of indirection by expressing the inner (tags String) value as an entity with tags String. Similarly, the User's accessLevels attribute is equivalent to (tags Long). So, we can express policies such as:
permit(principal is User, action, resource is Document)
when {
principal.roles.contains(context.role) &&
principal.accessLevels.hasTag(context.role) &&
principal.accessLevels.getTag(context.role) > 3
};
permit(principal is User, action, resource is Document)
when {
principal.hasTag("clearance") &&
principal.getTag("clearance").hasTag("level") &&
principal.getTag("clearance").getTag("level") == "top secret"
};
Implementation effort
The implementation effort for adding tags is non-trivial, as it will require changing the Rust code, the Lean models and proofs, and the differential test generators. It will affect most components of Cedar, including the policy parser, CST→AST conversion, CST→EST conversion, (schema-based) entity parser, (partial) evaluator, validator, and the symbolic compiler.
While they affect most components of Cedar, these changes are relatively localized and straightforward, as outlined above.
Alternatives and workarounds
This feature has gone through several revisions and iterations. We compare this proposal to its predecessor, RFC 68, which discusses other alternatives and workarounds in detail.
RFC 68 proposes to add embedded attribute maps (EA-maps) to Cedar as a second-class, validation-time construct that behaves like a record with limited capabilities. Specifically, an entity attribute can have an EA-map type, but EA-maps cannot occur anywhere else (e.g., it is not possible to have a set of EA-maps). EA-maps are treated as special second-class values during validation. But they are indistinguishable from records during evaluation.
In RFC 68, the validator enforces the second-class status of EA-map records by rejecting (some) expressions that attempt to treat EA-maps as first-class values. For example, the validator would reject principal.authTags == resource.authTags, where the authTags attribute of principal and resource is an EA-map of type { ? : Set<String> }. It would also reject (if principal.isAdmin then principal.authTags else principal.authTags).write. However, it would accept (if true then principal.authTags else resource.authTags).write due to how capabilities work in Cedar's type system. This behavior is confusing in that EA-maps don't appear to consistently act as second-class values, and understanding when they do requires understanding the subtleties of how the Cedar type system works.
In RFC 68, the dynamic semantics is unaware of EA-maps. That is, the concept of EA-maps doesn't exist at evaluation time; they are just records. This leads to lower implementation burden (e.g., evaluator doesn't need to change), but it widens the gap between the dynamic and static (type-aware) semantics of Cedar. Specifically, EA-maps are first-class values during evaluation but not during validation.
EA-maps, as proposed in RFC 68, do not add any new expressiveness to Cedar. Since EA-maps are just records at evaluation time, computed keys (such as resource.getKey(context.tag)) are disallowed. Adding computed keys (Alternative A in RFC 68) requires roughly the same implementation effort as implementing this proposal. But a downside of adding computed keys to RFC 68 is that doing so would further widen the gap between the dynamic and static semantics, since computed keys would be rejected by the validator but not the evaluator on records that aren't EA-maps.
Finally, after modeling RFC 68 in Lean, we found ourselves in an engineering corner, where the type-based enforcement of second-class status of EA-maps makes it difficult to prove a key property of the validator (every policy that is accepted by the validator is also accepted by the symbolic compiler), and also to model and reason about EA-maps in every component that depends on Cedar types.
The core issue is that RFC 68 does not distinguish, at the syntax and semantics level, between records and EA-maps. This distinction must be enforced at the type level. In particular, we need to introduce a first-class EA-map type, which must then be treated as second-class in specific expressions during type checking and in proofs. The special-casing must be done for ==, if ... then ... else, hasAttr, getAttr, contains, containsAll, containsAny, set, and record expressions.
This special-casing will have to be repeated in every component that gives a static (type-aware) semantics to Cedar---such as the symbolic compiler and, potentially, a type-aware partial evaluator. It will also need to be propagated into specifications and proofs of type-aware components. This is expected to increase the overall implementation and proof burden by thousands of lines of Lean code compared to this proposal. (In the symbolic compiler, the burden is further magnified because the special-casing of EA-map types has to be propagated into the underlying Term language too.)
This proposal avoids the issues of RFC 68 by introducing dedicated syntactic and semantic constructs for working with tags. It is impossible, at the syntax level, to write a Cedar expression that evaluates to a (first-class) tags-map value. Rather, the contents of a tags-map are accessed by applying the hasTag and getTag operators to the entity that has the tags. These two operators are explicit in the AST, and they can be handled locally in the Lean model and proofs, without changing the handling of any other operator. This proposal also reflects tags in both the static and dynamic semantics, keeping their treatment consistent between the two.
Compared to RFC 68, this proposal represents a more substantial addition to the language, and it requires a significantly larger up-front implementation effort. The tradeoff is that investing in the implementation now means that we won't have to consistently pay an implementation and proof penalty for every new type-aware component we add to Cedar. And adopting the bigger language change now lets us keep the dynamic and static semantics consistent, freeing users from having to reason about the subtleties of Cedar's type system.