Azure Sovereign Controls

I’ve talked about sovereignty in Azure (and in cloud in general) before, and this time we are going to dive a little deeper. What is it, how do you configure it, and what guardrails can you put in place?

TLDR: I’ve created a few policies that mandate Managed HSM encryption for the Azure services that support it. Note that these policies are in Audit mode, because many services do not let you deploy with a customer-managed key (CMK); the key has to be configured after the initial deployment.

RZomermanMS/SovControls: Sovereign Control Policies

Sovereign controls, or compliance controls, are implemented in Azure through Azure Policy. Policies effectively monitor or control the settings of deployed services. For example, the foundational sovereign control of data residency is applied (or monitored) in the form of a policy that blocks all Azure regions except the allowed ones. Other policies can mandate encryption of data at rest, require confidential compute only, or set specific security settings. In short, I identify these policies as Level-1 (Data Residency), Level-2 (Encryption at rest with Managed HSM keys) and Level-3 (Confidential Computing).
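To make the Level-1 idea concrete, here is a minimal sketch of what such a data residency policy could look like, built as a Python dictionary and printed as JSON. It is a simplified illustration, not the built-in "Allowed locations" policy and not one of the policies from my repository; the function name, display name and the example regions are assumptions.

```python
import json


def allowed_locations_policy(allowed_regions):
    """Build an illustrative custom policy body that denies resources
    deployed outside the given regions (simplified Level-1 control)."""
    return {
        "properties": {
            # Hypothetical display name, chosen for this example only
            "displayName": "Level-1: allowed regions only (illustrative)",
            "mode": "Indexed",
            "parameters": {
                "allowedLocations": {
                    "type": "Array",
                    "defaultValue": list(allowed_regions),
                }
            },
            "policyRule": {
                # Match every resource whose location is NOT in the allowed list
                "if": {
                    "not": {
                        "field": "location",
                        "in": "[parameters('allowedLocations')]",
                    }
                },
                # Use "audit" instead of "deny" to only monitor
                "then": {"effect": "deny"},
            },
        }
    }


if __name__ == "__main__":
    # Example regions are an assumption; use whatever your regulator allows
    print(json.dumps(allowed_locations_policy(["westeurope", "northeurope"]), indent=2))
```

Switching the effect between "deny" and "audit" is the difference between applying and monitoring the control.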

By combining multiple policies, a specific set of compliance requirements can be created for each data classification and then applied to the management group for that data/workload classification.

Policies can be implemented to ensure compliance with specific regulations, but can also be used to enforce specific security settings that may (or may not) be part of those regulations. Azure provides several baseline security policy packs that can complement the sovereign controls. The precise content of these policy packs varies per country, industry regulator and customer, and they can be crafted to fit the exact requirements, but in general the sovereign policies follow these guidelines:

| | Public | Internal | Confidential | Secret |
|---|---|---|---|---|
| Type | All Access | FTE/Partners | FTE/Teams | Individuals/systems |
| Level-1 Data Residency | optional | required | required | required |
| Level-2 Encryption with M-HSM | optional | optional | required | required |
| Level-3 Confidential Computing | optional | optional | optional | required |
| Examples | public websites, manuals | email, documents, meeting invites | IP, Financials, HR, research | DNA, Credit Card |
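
The classification overview above can also be expressed as a small lookup, which is handy when you script which policy pack belongs on which management group. This is only a sketch of the table; the names `CLASSIFICATIONS`, `REQUIRED_FROM` and `required_controls` are made up for the example.

```python
# Data classifications from the overview, ordered from least to most sensitive
CLASSIFICATIONS = ("Public", "Internal", "Confidential", "Secret")

# The lowest classification at which each control becomes required
REQUIRED_FROM = {
    "Level-1 Data Residency": "Internal",
    "Level-2 Encryption with M-HSM": "Confidential",
    "Level-3 Confidential Computing": "Secret",
}


def required_controls(classification):
    """Return the controls that are required (not optional) for a classification."""
    rank = CLASSIFICATIONS.index(classification)
    return [
        control
        for control, starts_at in REQUIRED_FROM.items()
        if rank >= CLASSIFICATIONS.index(starts_at)
    ]


if __name__ == "__main__":
    for name in CLASSIFICATIONS:
        print(name, "->", required_controls(name))
```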

Azure Policies can be applied to individual resource groups, subscriptions or management groups. To establish a “sovereign landing zone”, the policy packs above are best applied at higher-level management groups.

Managed HSM as the encryption foundation

In the Sovereign Controls overview above we already talked about Level-1, Level-2 and Level-3 compliance. In this post we will focus on Level-2 (encryption by the service), based on Managed HSM. But why Managed HSM and not Key Vault? Well...

In all encryption architectures it is important to ensure that the keys that unlock the data are safely guarded and not available to unauthorized users. The industry has several methods and techniques for this, with a Hardware Security Module (HSM) being one of the most common. An HSM is a physical device specifically designed to generate and store (encryption) keys and to block any unauthorized use of those keys.

Azure Key Vault Managed HSM is a fully managed, cloud-based service that provides a dedicated and isolated HSM partition for each instance. Azure Managed HSM is based on FIPS 140-2 Level 3 validated hardware and supports a subset of the Azure Key Vault APIs and features. With Azure Managed HSM, you can store and manage your keys in a highly secure and scalable cloud service, without having to worry about the maintenance and management of the underlying hardware. The architecture of Managed HSM depends heavily on confidential computing.

All the components are in the Azure datacenter, but the philosophy is to create such a secure environment for the Managed HSM (Key Management Service) that it matches or exceeds an external HSM architecture. This is handled in a few ways:

  1. A Trusted Execution Environment (TEE) is created for each instance used by a customer
  2. The TEE is based on external trust/keys outside of Microsoft control (Intel SGX)
  3. All secrets used by the service are generated inside the TEE secured instance
  4. There is no external access to the application execution environments
  5. No clear-text secrets are to be in active memory on the physical hosts
  6. No human or system outside of the trusted environment has the HSM credentials
  7. Access to the service is programmatically limited to the customer's Entra ID (Azure Active Directory) object ID
  8. The security domain (including the masking key) can only be requested, downloaded and decrypted by the customer and is not stored in the cloud.
  9. Private Key material in the HSM is set to non-exportable – unless specifically requested for Secure Key Release (see this link). Regular keys are non-exportable and the HSM will therefore not release the private key material in an unmasked state.
  10. The HSM has its own audit log that can be pulled by a customer to view any interaction on the HSM partition.

The key storage component sits within the datacenter boundary. As indicated before, we therefore need to evaluate the two possible ways of getting at key material: extraction of the private key material, and unauthorized access to or usage of the keys.

  1. Private Key Material Extraction –
    1. A 3rd party attested HSM running regular firmware is used to provide the private key material protection.
    2. The credentials to the HSM itself are not humanly readable and are solely stored inside the trusted execution environments of the system.
    3. The masking key protecting the private key material is native to the HSM and is stored solely with the customer as part of the security domain, protected by customer-generated and customer-managed key pairs. It is therefore the customer's responsibility to safeguard the masking key and the key pairs that protect it, which effectively separates the keys (backups) and the masking key into two different places.
    4. While Microsoft Azure can indeed take a full backup of the physical HSM (including all partitions), this backup is protected by each individual partition's masking key (that only the customer has access to).
    5. Intended or unintended exposure of the security domain does not provide access to the keys. An attacker would need both a (key) backup and the security domain to gain access to private key material.
    6. Intended or unintended exposure of the backup would not give an attacker access to private key material either, as they would also need the security domain, which is not stored in the cloud.
  • Unauthorized access / usage
    • Access to the service is limited to authentication tokens signed by the customer's Entra ID. The service uses two access control lists: Azure RBAC for creating / deleting instances and the HSM RBAC for HSM instructions. While a single Entra ID is used, an Azure subscription owner cannot take forced control of the Managed HSM.
    • An emergency HSM Administrator role exists for the “Entra ID Global Admins” group. Regardless of permissions on Azure subscriptions, an Entra ID Global Admin can take control of the Managed HSM instance. This does not give them access to private key material, but it does allow them to change the HSM access control list, so be careful with Global Admin memberships.
    • The credentials of (each individual) HSM partition are never exposed and can therefore not be misused by unauthorized systems or persons.
    • The TLS (https) certificates are generated by and inside the Trusted Execution Environment, making the private key of the service TLS connections solely available to the instance.
    • The front-end service runs fully in confidential computing, so external access by other systems or humans is impossible. The instance cannot be “moved” to a compromised host: the TEE parameters would change, and access to the “secrets store” that holds the credentials and service private keys is only possible on the original physical CPU on which the instance was built.
  • Other items:
    • The service has built-in redundancy. Each Managed HSM service that is created gets 3 (individual) backend instances. The key exchange between the instances is secured by a database encryption key (shared between all instances) and the HSM (partition) masking key. This ensures that private key material can only be exchanged between instances of the same service instance, and that it is only accessible when imported into the HSM partitions belonging to that same service instance.

Management and operational procedures on the HSMs are limited to the Azure Fabric controllers. Out-of-band operations that are not built into the service therefore cannot be called. Direct access to the partitions (and therefore key operations or partition-wide operations) is limited to each of the instances inside their respective TEEs. No human or out-of-band access to the partition is possible.

Implementing Managed HSM Services

Not all services support Managed HSM, so check the latest table on Services that support customer managed keys (CMKs) in Azure Key Vault and Azure Managed HSM | Microsoft Learn

To review the compliance of your deployed services with Managed HSM, you can always go to my Sovereign Policies basepack posted on my GitHub: RZomermanMS/SovControls: Sovereign Control Policies
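
If you want a quick check of your own next to the policies, one simple signal is the host name of the key URI that a service is configured with. The sketch below assumes the Azure public cloud, where Managed HSM endpoints end in `.managedhsm.azure.net` and regular Key Vault endpoints end in `.vault.azure.net`; sovereign clouds use other suffixes. The function name and the example URIs are hypothetical.

```python
from urllib.parse import urlparse

# Assumption: Azure public cloud DNS suffix for Managed HSM endpoints
MANAGED_HSM_SUFFIX = ".managedhsm.azure.net"


def is_managed_hsm_key_uri(key_uri):
    """Return True when a customer-managed key URI points at a Managed HSM
    rather than a regular Key Vault (based on the host name only)."""
    host = (urlparse(key_uri).hostname or "").lower()
    return host.endswith(MANAGED_HSM_SUFFIX)


if __name__ == "__main__":
    # Hypothetical key URIs, for illustration only
    print(is_managed_hsm_key_uri("https://contoso-hsm.managedhsm.azure.net/keys/cmk"))
    print(is_managed_hsm_key_uri("https://contoso-kv.vault.azure.net/keys/cmk"))
```

This only inspects the URI; it says nothing about whether the key is actually in use, which is what the Audit policies report on.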

Take these policies, create them in your Azure environment, create an initiative with them (adding any others you would like) and assign it to management groups, subscriptions, resource groups or individual resources.
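
As a rough sketch of the "create an initiative" step, the snippet below assembles the body of a policy set definition (an initiative) from a list of policy definition IDs. The function name, the reference ID scheme and the example IDs with their `<management-group>` placeholder are assumptions for illustration; substitute the IDs of the definitions you actually created.

```python
import json


def build_initiative(display_name, definition_ids):
    """Build an illustrative policy set definition (initiative) body that
    groups the given policy definitions."""
    return {
        "properties": {
            "displayName": display_name,
            "policyType": "Custom",
            "policyDefinitions": [
                {
                    # Reference IDs must be unique within the initiative
                    "policyDefinitionReferenceId": f"control-{index}",
                    "policyDefinitionId": definition_id,
                }
                for index, definition_id in enumerate(definition_ids, start=1)
            ],
        }
    }


if __name__ == "__main__":
    # Hypothetical definition IDs; the names are placeholders, not real policies
    prefix = (
        "/providers/Microsoft.Management/managementGroups/<management-group>"
        "/providers/Microsoft.Authorization/policyDefinitions/"
    )
    ids = [prefix + "level1-data-residency", prefix + "level2-mhsm-storage"]
    print(json.dumps(build_initiative("Sovereign controls: Confidential", ids), indent=2))
```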
