Formal Specification for Authorization: Clarity Before Implementation

November 25, 2025

In large organizations, authorization can be a very complex topic. Almost always it is tightly constraint with strict compliance requirements, while at the same time it is intricately tied to the user experience. Who hasn’t had the experience of trying to access a resource they need for work only to be stumped by insufficient access, access that can sometimes take days or even weeks to be granted.

For engineering platform teams, that is an especially acute problem. As it is part of their mission, to provide a seamless experience and at the same time they need care for authorization of multiple offerings that multiples the pain of any friction.

In this blog post, I’ll demonstrate how to describe authorization requirements using a formal specification language (TLA+). I’ll demonstrate the benefits in clarity and specificity that allow for richer discussions in the organization as well as the development new options that strengthen DevEx without compromising security.

I intentionally leave out specific tooling, to implement the authorization at this stage. Starting the discussion about specific tools very early can derail the discussion away from the need of your organization and towards the “features” and the “marketing” of the solution. Start looking into tool only after your requirements are clear. You don’t want to adapt your organizational process to the tool, you want the tool to make your organizational processes simple to use.

Why TLA+ for Authorization?

TLA+ is a high-level language for modeling programs and systems–especially concurrent and distributed ones.

Leslie Lamport

While authorization might not seem like a concurrent or distributed problem, it shares key characteristics:

Multiple actors: Users, teams, systems all interact with authorization
State changes: Permissions evolve as people join teams, projects change, or access is granted
Safety properties: “User X should never access resource Y”
Liveness properties: “User X should eventually get access to resource Y when they join team Z”

TLA+ excels at:

Making implicit assumptions explicit
Discovering edge cases you haven’t considered
Validating that your design satisfies requirements
Communicating precisely with both technical and non-technical stakeholders

The Example: Multi-Team, Multi-Environment Access Control

Organizational Context

In our fictional organization, the IT department has three teams:

dev_team_a and dev_team_b: Developing the flagship application
operations_team: Responsible for running the application in production

Technical Context

To support safe deployment, we maintain two environments:

integration: For testing and validation
production: The live environment serving customers

Each environment runs in a separate AWS account. Users access an environment by being members of the AccessGroup IAM group in that account.

The Problem

Given this setup, we need to ensure:

Only members of the operations team can access the production environment.

This seems straightforward, but as we’ll see, formally specifying this requirement will reveal hidden assumptions and raise important questions about our authorization model.

With our problem defined, let’s model it formally. We’ll start with the organizational context.

Modeling the Organization (Constants)

Let’s start by specifying the organizational context in a formal manner with TLA+.

I’m going to defined the information about the organization as CONSTANT. As, in the scope of this specification they are not going to be changing.

Below you see how to map complex values to CONSTANTS in TLA+. You can find more information about this syntax here

OrgPolicyEvaluation.cfg

1CONSTANTS
2    ApplicationEnvironments <- MyApplicationEnvironments
3    OrganizationalUnits  <- MyOrganizationalUnits
4    OrganizationalUsers <- MyOrganizationalUsers
5    OrganizationalUnitMemberships <- MyOrganizationalUnitMemberships

OrgPolicyEvaluation.tla

 1
 2MyApplicationEnvironments == { "integration", "production" }
 3
 4MyOrganizationalUnits == {
 5    "dev_team_a", "dev_team_b", "operations_team"
 6}
 7
 8MyOrganizationalUsers == {  "user1", "user2", "user3", "user4"  }
 9
10MyOrganizationalUnitMemberships == [ ou \in MyOrganizationalUnits |->
11    CASE ou = "dev_team_a"      -> { "user1", "user2" }
12      [] ou = "dev_team_b"      -> { "user3" }
13      [] ou = "operations_team" -> { "user4" }
14]

What I want to focus in this blog post in how TLA+ provides the tools to describe the organizational units, the users and also the memberships, which user belongs to which team organizational units.

The membership is modeled using a function, that maps the name of the organizational unit to a set of its members.

Now that we’ve captured the organizational structure, let’s model the technical resources we’ll configure.

Modeling the Technical System (Variables)

The technical resource of the system, I prefer to model as variables as those are some of the resources that is one area with no hard constraints, those resources can be modified to achive the best solution.

1VARIABLES
2    awsGroups,
3    awsGroupMemberships
4
5vars == <<awsGroups, awsGroupMemberships>>
6
7AwsAccounts == ApplicationEnvironments
8
9AccessIamGroupName == "AccessGroup"

The only technical resources we know of are the awsGroups and the awsGroupMemberships. The awsGroupMemberships maps users to awsGroups.

With our model in place, we can now formally express our requirements. As we do this, watch how the act of formalization reveals hidden assumptions.

Defining Requirements

Let’s start defining our goals and constraints,

Requirement 1: Only members of the operations team can access production

This was the only requirement explicitly discussed as part of the scope.

Here’s how you describe it in TLA+:

1OnlyMembersOfTheOperationsTeamCanAccessProduction ==
2    \/ AccessIamGroupName \notin awsGroups["production"]
3    \/ \A groupMember \in awsGroupMemberships["production"][AccessIamGroupName]:
4         groupMember \in OrganizationalUnitMemberships["operations_team"]
5]

After writing this requirement formally, it became obvious that there was also another requirement implied.

Requirement 2: All operations team members can access production

That requirement wasn’t never explicitly stated, but in the context of the organization it makes sense that this is how it should be. By making the implicit explicit you can have more meaningful conversations with the rest of the stakeholders and avoid miscommunications and nasty surpises later.

It also raise new questions to clarify:

what are the steps to give new joiners of the operations_team access to the production environment?
Is the onboardning process seamless, is it fast?
etc

1AllMembersOfOperationsTeamCanAccessProduction ==
2    /\ awsGroups["production"] # {}
3    /\ \A team_member \in OrganizationalUnitMemberships["operations_team"]:
4            team_member \in awsGroupMemberships["production"][AccessIamGroupName]

That requirement does not endanger the security of the system but its highly important for the operations_team to get the job done!

And as you start to think of requirements more carefully you see a third one that wasn’t discussed before.

Requirement 3: All team members of the three teams can access integration

1AllMembersOfAllTeamsCanAccessIntegration ==
2    /\ awsGroups["integration"] # {}
3    /\ \A ou \in OrganizationalUnits:
4        \A team_member \in OrganizationalUnitMemberships[ou]:
5                team_member \in awsGroupMemberships["integration"][AccessIamGroupName]
6]

Again stating the requirement explicitly you make sure to avoid misunderstandings and surprises later in the implementation phase.

This requirement also does not risk the security of the system, but rather the experience of the users and their ability to do their job. Raising similar discussion with Requirement 2.

Safety and Liveness

TLA+ allows to define Safety and Liveness properties.

Safety properties ensure that nothing bad ever happens.

Liveness properties that something good will eventually happen.

In our case, the requirement that only operation team members is a safety property. In TLA+ we write

OrgPolicyEvaluation.cfg

1INVARIANT OnlyMembersOfTheOperationsTeamCanAccessProduction

The requirements to ensure that all members of the operation team have access to production and that all members of all teams have access to integration. Are the kind of requirements that they can be violated for a period of time, for example it may take a while from when you onboard a new operations team member until their access is granted. It is ok, for those two properties to be temporarily violated so long as in the future they eventually hold again.

TLA+ can actually express and validate that kind of eventual properties: OrgPolicyEvaluation.cfg

1PROPERTIES
2    EventuallyAllMembersOfAllTeamsCanAccessIntegration
3    EventuallyAllMembersOfOperationsTeamCanAccessProduction

OrgPolicyEvaluation.tla

1EventuallyAllMembersOfAllTeamsCanAccessIntegration == []<>AllMembersOfAllTeamsCanAccessIntegration
2EventuallyAllMembersOfOperationsTeamCanAccessProduction == []<>AllMembersOfOperationsTeamCanAccessProduction

Requirements alone don’t make a useful specification. We need to define the initial state and the possible actions that can change the system.

The Specification (Actions)

The specification, which consists of the initial conditions and the possible action, would depend strongly on how your organization looks like, as what constitutes as action can be very broad and organization depended. An action can be an HR rep, assigning a new members to a team in the HR software, it could be a manager accepting the assignment of a new member, or it could be a new joiner requesting access via a developer portal.

Below, I’ll draft some steps (actions) to demonstrate that formally describing the steps can also be beneficial.

Initial state

Initially, there are no AWS IAM Groups in environment and no memberships. It is up to the actions to create groups and add group memberships.

1Init ==
2    /\ awsGroups = [ s \in AwsAccounts |-> {} ]
3    /\ awsGroupMemberships = [ s \in AwsAccounts |-> [g \in awsGroups[s] |-> {}] ]

Actions

The following actions are possible:

Create the AWS IAM access group in one of the environments.
Assign an organizational user to the AWS IAM group in the integration environment, so that they can have access to that environment.
Assign an organizational user to the AWS IAM group in the production environment, so that they can have access to that environment.

 1CreateAccessIamGroup(account) ==
 2    /\ awsGroups' = [awsGroups EXCEPT ![account] = @ \union {AccessIamGroupName}]
 3    /\ awsGroupMemberships' = [awsGroupMemberships EXCEPT ![account] = @ @@ [ g \in {AccessIamGroupName} |-> {} ] ]
 4
 5AssignUserToIamGroupInIntegration(group, user) ==
 6    /\ awsGroupMemberships' = [awsGroupMemberships EXCEPT !["integration"][group] = @ \union {user} ]
 7    /\ UNCHANGED <<awsGroups>>
 8
 9AssignUserToIamGroupInProduction(group, user) ==
10    /\ awsGroupMemberships' = [awsGroupMemberships EXCEPT !["production"][group] = @ \union {user} ]
11    /\ UNCHANGED <<awsGroups>>
12
13Next ==
14    \/ \E account \in AwsAccounts:
15        CreateAccessIamGroup(account)
16    \/ \E group \in awsGroups["integration"]:
17         \E user \in OrganizationalUsers:
18           AssignUserToIamGroupInIntegration(group, user)
19    \/ \E group \in awsGroups["production"]:
20         \E user \in OrganizationalUsers:
21           AssignUserToIamGroupInProduction(group, user)

If you run the model, TLA+ will inform you that it violates the safety property OnlyMembersOfTheOperationsTeamCanAccessProduction. It would also provide an example of the sequence of steps that caused the violation.

The violation is caused by assiging a memmber of one of the development teams in the production AccessGroup. The action AssignUserToIamGroupInProduction, does not check that a user is a member of the operations_team before assignment.

Let’s do an update:

1AssignUserToIamGroupInProduction(group, user) ==
2    \* Organizational measure required to ensure that only members of the operations team are added in production
3    /\ user \in OrganizationalUnitMemberships["operations_team"]
4    /\ awsGroupMemberships' = [awsGroupMemberships EXCEPT !["production"][group] = @ \union {user} ]
5    /\ UNCHANGED <<awsGroups>>

Adding a guard statement in the action AssignUserToIamGroupInProduction, makes the model validation pass!

And most importantly, as those steps represents actions your organization is taking. This is also the start of a discussion in your organization about how robust are your procedural and technical measures that gatekeep who can be assigned to the production AWS IAM Group.

💡 Key Insight: The guard condition reveals the need for either procedural or technical controls—exactly the kind of conversation formal specification enables.

What We’ve Gained

Clarity Through Explicitness

Implicit → Explicit: Requirement 2 (all ops members can access prod) was never stated but is clearly necessary, as some requirement 3.
Unambiguous timeliness: We distinguished safety (always true) from liveness (eventually true) properties
Minimal complete set: Three requirements fully capture the authorization policy

Design Insights

Implementation independence: The specification works regardless of whether you use AWS SSO, Terraform, custom tooling, or manual processes
Gap identification: The guard condition in AssignUserToIamGroupInProduction reveals the need for either procedural or technical controls
Discussion enabler: You can have precise discussions with non-techninal stakeholder without getting bogged down on technical details

Actionable Outcomes

You can now:

Evaluate existing tools against these precise requirements
Discuss trade-offs with PM, compliance, and security teams using shared vocabulary
Design implementation strategies that provably satisfy the specification
Leave behind unambiguous documentation that won’t be misinterpreted

Conclusion

In this post, we’ve seen how formal specification transforms vague authorization requirements into precise, verifiable constraints.

Key takeaways:

Formal methods aren’t just for aerospace—they’re valuable wherever precision matters
The process of specification is as valuable as the artifact (it forces clarity)
Starting with requirements, not tools, keeps you focused on organizational needs
TLA+ bridges technical and non-technical conversations with unambiguous language

What’s next? In my next post, I’ll demonstrate how to bridge the gap from TLA+ specification to AWS implementation, showing how to maintain traceability from requirements through deployed code.

For platform teams caught between security and DevEx, formal specification isn’t overhead—it’s how you escape the false choice and achieve both.

Resources

Resources:

TLA+ Examples