Manage Certificates for HPC Pack 2019 Cluster

There are several certificates used in HPC Pack Cluster for different purposes. Here is a full list:

Microsoft HPC Azure Client
  • Purpose and description: Used by head nodes to communicate with Azure PaaS proxy nodes. It's a self-signed certificate auto-generated by the HPC cluster.
  • Install locations: Head nodes: LocalComputer\Personal.

Microsoft HPC Azure Service
  • Purpose and description: Used by Azure PaaS proxy nodes to communicate with head nodes. It's a self-signed certificate auto-generated by the HPC cluster.
  • Install locations: Azure proxy nodes: LocalComputer\Personal.

Microsoft HPC Azure Management(1)
  • Purpose and description: Used by head node(s) to communicate with Azure Management Service to manage Azure resources in classic mode.
  • Install locations: Head node(s): LocalComputer\Personal and LocalComputer\Trusted Root CA (self-signed only). Azure Portal: Subscriptions\Management certificates.

Azure Service Principal Certificate
  • Purpose and description: Used by head node(s) to communicate with Azure Resource Manager to manage Azure resources in resource manager mode. You can use the same certificate as Microsoft HPC Azure Management.
  • Install locations: Head node(s): LocalComputer\Personal. Azure Service Principal.

HPC Pack Communication Certificate
  • Purpose and description: Used for communication by all nodes except Azure PaaS nodes and Azure Batch Pool. If it is a self-signed certificate and you plan to use the Burst to Azure IaaS VM feature to deploy Azure IaaS compute nodes, you must also import this certificate into Azure Key Vault so that the Azure IaaS compute nodes can use it to communicate with the head node(s).
  • Install locations: Windows nodes(2)(3): LocalComputer\Personal and LocalComputer\Trusted Root CA (self-signed only). Linux nodes: /opt/hpcnodemanager/cert.

Service Fabric Certificate
  • Purpose and description: Used by the head nodes to secure the Service Fabric cluster communication. By default it uses the same certificate as the HPC Pack Communication Certificate. You can add additional certificates to the Service Fabric cluster.
  • Install locations: Head nodes: LocalComputer\Personal, LocalComputer\Trusted Root CA (self-signed only), and CurrentUser\Personal(4).

(1) Cloud Services (classic) is now deprecated for new customers, and it will be retired on August 31st, 2024 for all customers. New deployments should use the new Azure Resource Manager-based deployment model, Azure Cloud Services (extended support).

(2) For a domain-joined Windows HPC client machine, you can opt not to install the head node's HPC Pack Communication certificate in the LocalComputer\Trusted Root CA store, in either of the following two ways:

  • During HPC Client installation, choose "Skip CA and CN validation"

  • Add registry value named CertificateValidationType with DWORD value 0 under registry key HKLM\SOFTWARE\Microsoft\HPC

(3) For a non-domain-joined HPC client machine, you must install the head node's HPC Pack Communication certificate in LocalComputer\Personal with the private key and in CurrentUser\Trusted Root CA without the private key, and then add a registry value named SSLThumbprint under the registry key HKLM\SOFTWARE\Microsoft\HPC that specifies the certificate thumbprint.

(4) If you want to access the Service Fabric cluster portal (https://<service-fabric-cluster-hostname>:10400) on the head node or other nodes, you must also install the certificate, with its private key, under CurrentUser\Personal.

Rotate HPC Pack Node communication certificates

Microsoft HPC Pack 2016 (and later) uses certificates to secure the communication between the HPC nodes. Rotate the certificate(s) before they expire to avoid breaking the HPC Pack cluster.
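As a rough way to see how much time is left before a rotation becomes urgent, the following sketch computes the whole days remaining until a certificate's NotAfter date. The function name Get-CertificateDaysRemaining is hypothetical and not part of HPC Pack. The commented usage reads the thumbprint from the same SSLThumbprint registry value that this article uses elsewhere.

```powershell
# Hypothetical helper (not part of HPC Pack): whole days left before a certificate expires.
function Get-CertificateDaysRemaining {
    param(
        [Parameter(Mandatory)] [datetime] $NotAfter,  # expiry date of the certificate
        [datetime] $Now = (Get-Date)                  # can be overridden for testing
    )
    # A negative result means the certificate has already expired.
    [int][math]::Floor(($NotAfter - $Now).TotalDays)
}

# Example usage on an HPC node (assumes the SSLThumbprint registry value exists):
# $thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbprint
# $cert = Get-Item Cert:\LocalMachine\My\$thumbprint
# Get-CertificateDaysRemaining -NotAfter $cert.NotAfter
```

A small or negative value suggests that the rotation should be planned soon, or that the procedure for an already expired certificate applies.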

The certificates must meet the following requirements:

  • Have a private key capable of key exchange.
  • Key usage includes Digital Signature, Key Encipherment, Key Agreement and Certificate Signing.
  • Enhanced key usage includes Client Authentication and Server Authentication.
  • If two different certificates are used, they must have the same subject name.

If the certificate is used to secure Service Fabric Cluster as well, it must meet the following additional requirements:

  • The certificate's provider must be Microsoft Enhanced RSA and AES Cryptographic Provider;
  • The RSA key length must be 2048 bits.
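
Some of the requirements above can be checked with plain .NET certificate APIs. The sketch below is only a partial check under stated assumptions: the function name Test-HpcCertificateRequirement and the -ForServiceFabric switch are hypothetical, and the function covers only the private key, the two enhanced key usage OIDs (1.3.6.1.5.5.7.3.1 for Server Authentication and 1.3.6.1.5.5.7.3.2 for Client Authentication), the expiry date, and optionally the 2048-bit RSA key length. It does not verify key usage flags, the cryptographic provider, or the subject name.

```powershell
# Hypothetical helper: returns a list of problems; no output means none were found.
function Test-HpcCertificateRequirement {
    param(
        [Parameter(Mandatory)]
        [System.Security.Cryptography.X509Certificates.X509Certificate2] $Certificate,
        [switch] $ForServiceFabric   # also check the 2048-bit RSA key length
    )
    $problems = @()
    if (-not $Certificate.HasPrivateKey) { $problems += 'No private key' }

    # Collect the OIDs from the enhanced key usage extension, if one is present.
    $ekuOids = @()
    foreach ($ext in $Certificate.Extensions) {
        if ($ext -is [System.Security.Cryptography.X509Certificates.X509EnhancedKeyUsageExtension]) {
            $ekuOids += $ext.EnhancedKeyUsages | ForEach-Object { $_.Value }
        }
    }
    foreach ($oid in '1.3.6.1.5.5.7.3.1', '1.3.6.1.5.5.7.3.2') {
        if ($ekuOids -notcontains $oid) { $problems += "Missing enhanced key usage $oid" }
    }

    if ($ForServiceFabric) {
        $rsa = [System.Security.Cryptography.X509Certificates.RSACertificateExtensions]::GetRSAPublicKey($Certificate)
        if ($null -eq $rsa -or $rsa.KeySize -ne 2048) { $problems += 'RSA key length is not 2048 bits' }
    }

    if ($Certificate.NotAfter -le (Get-Date)) { $problems += 'Certificate has expired' }
    $problems
}

# Example usage (assumes the certificate is already in LocalMachine\My):
# @(Test-HpcCertificateRequirement -Certificate (Get-Item Cert:\LocalMachine\My\<thumbprint>) -ForServiceFabric)
```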

Prepare new certificate

When you prepare a new certificate, make sure that you use the same subject name as that of the old certificate. Run the following PowerShell commands on the HPC node to get the subject name of your certificate.

$thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbPrint
$subjectName = (Get-Item Cert:\LocalMachine\My\$thumbprint).Subject
$subjectName

If you're using a self-signed certificate, run the following PowerShell commands on a computer running Windows 10 or Windows Server 2016 to generate a new certificate that meets all the above requirements. The commands create two files in a folder named after the thumbprint of the new certificate: PrivateCert.pfx (with the private key) and PublicCert.cer (without the private key). Replace <subject-name> with the correct subject name.

$subjectName = "<subject-name>"
$pfxcert = New-SelfSignedCertificate -Subject $subjectName -KeySpec KeyExchange -KeyLength 2048 -HashAlgorithm SHA256 -TextExtension @("2.5.29.37={text}1.3.6.1.5.5.7.3.1,1.3.6.1.5.5.7.3.2") -Provider "Microsoft Enhanced RSA and AES Cryptographic Provider" -CertStoreLocation Cert:\CurrentUser\My -KeyExportPolicy Exportable -NotAfter (Get-Date).AddYears(10) -NotBefore (Get-Date).AddDays(-1)
$certThumbprint = $pfxcert.Thumbprint
$null = New-Item $env:Temp\$certThumbprint -ItemType Directory
$pfxPassword = Get-Credential -UserName 'Protection password' -Message 'Enter protection password below'
Export-PfxCertificate -Cert Cert:\CurrentUser\My\$certThumbprint -FilePath "$env:Temp\$certThumbprint\PrivateCert.pfx" -Password $pfxPassword.Password
Export-Certificate -Cert Cert:\CurrentUser\My\$certThumbprint -FilePath "$env:Temp\$certThumbprint\PublicCert.cer" -Type CERT -Force
start "$env:Temp\$certThumbprint"

If you're using a certificate authority (CA) signed certificate or an existing self-signed certificate, run the following command and check the values of KeySpec, Subject, Key Usage, Enhanced Key Usage, Public Key Length and Provider.

CertUtil.exe -p "<password>" -v -dump <path-of-pfxFile>
  • If the value of Subject, Key Usage, Enhanced Key Usage, or Public Key Length doesn't match, you must regenerate the certificate.

  • If the value of KeySpec (should be "1 -- AT_KEYEXCHANGE") or Provider doesn't match, you don't need to regenerate the certificate. Run the following command to import the certificate with modified KeySpec and Provider values, and then run certlm.msc to export the certificate, including private key, to a new PFX file which meets the requirements.

CertUtil.exe -f -p "<password>" -csp "Microsoft Enhanced RSA and AES Cryptographic Provider" -importpfx "<path-of-pfxFile>" AT_KEYEXCHANGE

Rotate certificate on broker, compute, and workstation nodes

Important

Bulk rotate the certificate on your compute nodes before you rotate the certificate on the head node(s). If you rotate the head node(s) first, the compute nodes still expect the head node(s) to connect with the old certificate, and you are forced to rotate the certificate on each compute node manually. Right after the bulk rotation, the compute nodes appear offline in HPC Cluster Manager. This is expected, because they now expect connections from the head node(s) that use the new certificate. Follow the steps in the later sections to rotate the certificate on the head node(s), which completes the rotation and brings the compute nodes back online.

  1. Copy the new CN certificate, for instance PrivateCert.pfx, to the Certificates folder under the HPC install share, for instance, \\headnode\REMINST\Certificates, with the new name HpcCnCommunication.pfx.

  2. Download the PowerShell script Update-HpcNodeCertificate.ps1 and put it in the HPC install share (\\<headnode>\REMINST). Open HPC Cluster Manager, select Resource Management > Nodes. Select all the Windows compute, broker, and workstation nodes. Make sure head nodes are NOT included. Select Run Command, and run the following command with the correct values for head node and password:

    PowerShell.exe -ExecutionPolicy ByPass -Command "\\<headnode>\REMINST\Update-HpcNodeCertificate.ps1 -PfxFilePath \\<headnode>\REMINST\Certificates\HpcCnCommunication.pfx -Password <password> -RunAsScheduledTask"
    
  3. If you have Linux compute nodes, open HPC Cluster Manager on the head node. Select Resource Management > Nodes. Select all the Linux nodes. Select Run Command, and run the following commands in sequence:

    First, create a temp directory on all Linux nodes.

    mkdir /tmp/hpcreminst
    

    Second, mount HPC install share on all Linux nodes. Fill in the correct values for head node, domain name, and domain user credentials.

    mount -t cifs //<headnode>/REMINST /tmp/hpcreminst -o vers=2.1,domain=<domainname>,username=<username>,password='<userpassword>',dir_mode=0755,file_mode=0755
    

    Third, schedule a job to rotate certificate on all Linux nodes. Fill in the correct values for head node and certificate protection password.

    cd /tmp/hpcreminst; echo "python /opt/hpcnodemanager/setup.py -certfile:/tmp/hpcreminst/Certificates/HpcCnCommunication.pfx -certpassword:<password>" | at now + 1 minute
    
  4. Optionally, if you plan to deploy new bare metal machines, open HPC Cluster Manager and go to the Deployment To-do List. Select Import a certificate for deployment to import the new CN certificate, named HpcCnCommunication.pfx, from \\headnode\REMINST\Certificates.
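
Step 1 of the procedure above (copying the certificate to the install share under a new name) can be scripted. The following is a minimal sketch, assuming the Certificates folder already exists under the install share and that the account running it has write access. The function name Copy-HpcCertificateToShare and its parameters are hypothetical and not part of HPC Pack.

```powershell
# Hypothetical helper: copy a certificate file into the Certificates folder of the
# HPC install share under the file name that the later steps expect.
function Copy-HpcCertificateToShare {
    param(
        [Parameter(Mandatory)] [string] $SourcePath,    # for example, the path of PrivateCert.pfx
        [Parameter(Mandatory)] [string] $InstallShare,  # for example, \\<headnode>\REMINST
        [string] $TargetName = 'HpcCnCommunication.pfx'
    )
    if (-not (Test-Path -LiteralPath $SourcePath -PathType Leaf)) {
        throw "Certificate file not found: $SourcePath"
    }
    $certFolder = Join-Path $InstallShare 'Certificates'
    if (-not (Test-Path -LiteralPath $certFolder -PathType Container)) {
        throw "Certificates folder not found: $certFolder"
    }
    $destination = Join-Path $certFolder $TargetName
    Copy-Item -LiteralPath $SourcePath -Destination $destination -Force
    $destination   # return the full path of the copied file
}

# Example usage:
# Copy-HpcCertificateToShare -SourcePath .\PrivateCert.pfx -InstallShare '\\<headnode>\REMINST'
```

The same helper could copy the public certificate used in the head node procedures by passing -TargetName 'HpcHnPublicCert.cer'.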

Rotate certificate for single head node

  1. If the new head node certificate is self-signed, make all the Windows cluster nodes trust this new self-signed certificate before rotating.

    • Copy the new public certificate PublicCert.cer file to the Certificates folder under the HPC install share (\\headnode\REMINST\Certificates) with new name HpcHnPublicCert.cer.
    • Open HPC Cluster Manager > Resource Management > Nodes. Select all the Windows compute, broker, and workstation nodes. Select Run Command. Run the following command with the correct head node to make them trust the new head node certificate:
    PowerShell.exe -ExecutionPolicy ByPass -Command "Import-certificate -FilePath \\<headnode>\REMINST\Certificates\HpcHnPublicCert.cer -CertStoreLocation cert:\LocalMachine\Root"
    
  2. Download the PowerShell script Update-HpcNodeCertificate.ps1 and run the following PowerShell command to apply the new certificate PrivateCert.pfx:

    .\Update-HpcNodeCertificate.ps1 -PfxFilePath <path-of-PrivateCert.pfx> -Password <password>
    
  3. If you're using the Burst to Azure IaaS VM feature, on HPC Cluster Manager, select Configuration > Set Azure Deployment Configuration to import the new certificate PrivateCert.pfx on the Azure Key Vault Certificate page. Or you can refer to Create Azure Key Vault Certificate on Azure Portal to manually import the PrivateCert.pfx to Azure Key Vault, and then specify the values on the Azure Key Vault Certificate page in the Set Azure Deployment Configuration wizard.

Rotate certificate for high availability head nodes

This procedure applies to Service Fabric cluster or HPC Pack 2019 built-in high availability architecture.

  1. If the new head node certificate is self-signed, make all the Windows cluster nodes trust this new self-signed certificate before rotating.

    • Copy the new public certificate PublicCert.cer file to the Certificates folder under the HPC install share (\\<InstallShare>\Certificates) with new name HpcHnPublicCert.cer. You can use the following PowerShell command to get the HPC install share.

      Add-PSSnapin Microsoft.HPC
      Get-HpcClusterRegistry -PropertyName InstallShare
      
    • Open HPC Cluster Manager > Resource Management > Nodes, select all the Windows cluster nodes including all head nodes, and select Run Command. Run the following command line with the correct install share to make them trust the new head node certificate:

    PowerShell.exe -ExecutionPolicy ByPass -Command "Import-certificate -FilePath \\<InstallShare>\Certificates\HpcHnPublicCert.cer -CertStoreLocation cert:\LocalMachine\Root"
    
  2. On every head node, download the PowerShell script Update-HpcNodeCertificate.ps1 and run the following PowerShell command to import and apply the new certificate PrivateCert.pfx:

    .\Update-HpcNodeCertificate.ps1 -PfxFilePath <path-of-PrivateCert.pfx> -Password <password>
    
  3. On any one head node, run the following PowerShell command to apply the new certificate which is already installed in all the head nodes.

    Add-PSSnapin Microsoft.HPC
    $thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbPrint
    Set-HpcClusterRegistry -PropertyName SSLThumbprint -PropertyValue $thumbprint
    
  4. If you're using the Burst to Azure IaaS VM feature, on HPC Cluster Manager, select Configuration > Set Azure Deployment Configuration to import the new certificate PrivateCert.pfx on the Azure Key Vault Certificate page. Or you can refer to Create Azure Key Vault Certificate on Azure Portal to manually import the PrivateCert.pfx to Azure Key Vault, and then specify the values on Azure Key Vault Certificate page in the Set Azure Deployment Configuration wizard.

  5. [Service Fabric cluster only] If you're using the same certificate to secure Service Fabric cluster, check whether a Service Fabric cluster configuration upgrade is required. On any head node, run the following PowerShell command to check the current security configuration of the Service Fabric cluster.

    Connect-ServiceFabricCluster
    Get-ServiceFabricClusterConfiguration | Out-File d:\sfclusterconfig.json
    

    If the security configuration is as below, a Service Fabric cluster configuration upgrade is not required if the new certificate is issued by the same issuer.

        "Security": {
          "CertificateInformation": {
            "ClusterCertificateCommonNames": {
              "CommonNames": [
                {
                  "CertificateCommonName": "[CertificateCommonName]",
                  "CertificateIssuerThumbprint": "[IssuerThumbprint]"
                }
              ],
              "X509StoreName": "My"
            },
            "ServerCertificateCommonNames": {
              "CommonNames": [
                {
                  "CertificateCommonName": "[CertificateCommonName]",
                  "CertificateIssuerThumbprint": "[IssuerThumbprint]"
                }
              ],
              "X509StoreName": "My"
            }
          },
          "ClusterCredentialType": "X509",
          "ServerCredentialType": "X509"
        },
    

    If the security configuration is as below, you need to upgrade the Service Fabric cluster configuration.

        "Security": {
          "CertificateInformation": {
            "ClusterCertificate": {
              "Thumbprint": "[Thumbprint]",
              "X509StoreName": "My"
            },
            "ServerCertificate": {
              "Thumbprint": "[Thumbprint]",
              "X509StoreName": "My"
            }
          },
          "ClusterCredentialType": "X509",
          "ServerCredentialType": "X509"
        },
    

For more information about the certificate rollover for Service Fabric cluster, see Upgrade Service Fabric cluster certificate configuration and Secure a standalone Service Fabric cluster.
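
To tell the two security configurations apart without reading the JSON by eye, a text match on the property names is usually enough. The sketch below assumes the exported file contains the property names exactly as shown in the two samples above. The function name Get-SfCertificateConfigMode is hypothetical, and the check deliberately avoids assuming where the Security section sits in the file.

```powershell
# Hypothetical helper: classify an exported Service Fabric cluster configuration
# by how the cluster certificate is declared.
function Get-SfCertificateConfigMode {
    param([Parameter(Mandatory)] [string] $ConfigText)  # raw JSON text of sfclusterconfig.json

    # The closing quote is part of each pattern, so "ClusterCertificate"
    # does not also match "ClusterCertificateCommonNames".
    if ($ConfigText -match '"ClusterCertificate"\s*:') { return 'Thumbprint' }
    if ($ConfigText -match '"ClusterCertificateCommonNames"\s*:') { return 'CommonName' }
    'Unknown'
}

# Example usage with the file exported in step 5:
# Get-SfCertificateConfigMode -ConfigText (Get-Content d:\sfclusterconfig.json -Raw)
```

A result of Thumbprint corresponds to the case where a configuration upgrade is needed. CommonName corresponds to the case where no upgrade is needed, provided the new certificate comes from the same issuer.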

Upgrade certificate configuration for Service Fabric cluster

  1. Modify the file sfclusterconfig.json as below:

    • Replace the value of Thumbprint under ClusterCertificate and ServerCertificate
    • Remove the properties with name $id if any under Security and CertificateInformation
    • Change clusterConfigurationVersion to a higher version, for example from 1.0.0 to 1.0.1
  2. Run the following PowerShell commands to start the Service Fabric cluster configuration upgrade.

    Connect-ServiceFabricCluster
    Start-ServiceFabricClusterConfigurationUpgrade -ClusterConfigPath d:\sfclusterconfig.json
    
  3. Use the following command to query the upgrade status:

    Get-ServiceFabricClusterConfigurationUpgradeStatus
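
Two of the edits in step 1 (replacing the thumbprint and raising the configuration version) can be applied to the file text mechanically. The sketch below is an illustration under assumptions: the function name Update-SfConfigText is hypothetical, it assumes the version has three numeric parts such as 1.0.0, and it replaces every occurrence of the old thumbprint in the file, which fits the case where the same certificate is used for ClusterCertificate and ServerCertificate. It does not remove $id properties, so that edit remains manual.

```powershell
# Hypothetical helper: swap the certificate thumbprint and bump the last part of
# clusterConfigurationVersion in the text of sfclusterconfig.json.
function Update-SfConfigText {
    param(
        [Parameter(Mandatory)] [string] $ConfigText,
        [Parameter(Mandatory)] [string] $OldThumbprint,
        [Parameter(Mandatory)] [string] $NewThumbprint
    )
    $ignoreCase = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase

    # Replace every occurrence of the old thumbprint, regardless of letter case.
    $updated = [regex]::Replace($ConfigText, [regex]::Escape($OldThumbprint), $NewThumbprint, $ignoreCase)

    # Increment the third version component, for example 1.0.0 becomes 1.0.1.
    $pattern = '("clusterConfigurationVersion"\s*:\s*")(\d+)\.(\d+)\.(\d+)(")'
    $bump = [System.Text.RegularExpressions.MatchEvaluator] {
        param($m)
        $next = [int]$m.Groups[4].Value + 1
        '{0}{1}.{2}.{3}{4}' -f $m.Groups[1].Value, $m.Groups[2].Value, $m.Groups[3].Value, $next, $m.Groups[5].Value
    }
    [regex]::Replace($updated, $pattern, $bump, $ignoreCase)
}

# Example usage (keep a backup of the original file first):
# $text = Get-Content d:\sfclusterconfig.json -Raw
# Update-SfConfigText -ConfigText $text -OldThumbprint '<old>' -NewThumbprint '<new>' | Set-Content d:\sfclusterconfig.json
```

Review the resulting file before starting the upgrade, since a text replacement cannot tell which sections the thumbprint appears in.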

Rotate an already expired certificate in clusters with single head node or built-in HA

  1. On every head node, run the following PowerShell commands:

    Set-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\HPC" -Name SSLThumbprint -Value <NewThumbprint>
    Set-ItemProperty -Path "HKLM:\SOFTWARE\Wow6432Node\Microsoft\HPC" -Name SSLThumbprint -Value <NewThumbprint>

  2. Update the thumbprint in the database HPCHAStorage with the following SQL query:

    Update dbo.DataTable set dvalue='<NewThumbprint>' where dpath = 'HKEY_LOCAL_MACHINE\Software\Microsoft\HPC' and dkey = 'SSLThumbprint'
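
Because the thumbprint is written by hand into the registry and the database in these steps, a stray space or an invisible character copied from a certificate dialog can leave the cluster pointing at a certificate that does not exist. The following sketch normalizes a thumbprint before use. The function name ConvertTo-NormalizedThumbprint is hypothetical, and the length check assumes a SHA-1 thumbprint of 40 hexadecimal characters.

```powershell
# Hypothetical helper: strip non-hexadecimal characters from a thumbprint and validate its length.
function ConvertTo-NormalizedThumbprint {
    param([Parameter(Mandatory)] [string] $Thumbprint)

    # Remove spaces, invisible Unicode marks, and anything else that is not a hex digit.
    $clean = ($Thumbprint -replace '[^0-9A-Fa-f]', '').ToUpperInvariant()
    if ($clean.Length -ne 40) {
        throw "Expected 40 hexadecimal characters, got $($clean.Length)."
    }
    $clean
}

# Example usage:
# $newThumbprint = ConvertTo-NormalizedThumbprint -Thumbprint '<pasted thumbprint>'
# Set-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\HPC" -Name SSLThumbprint -Value $newThumbprint
```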

Rotate an already expired certificate in clusters with Service Fabric HA

  1. Recover Service Fabric cluster
  2. Recover HPC Pack cluster

    • On every head node, run the following PowerShell commands:

    Set-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\HPC" -Name SSLThumbprint -Value <NewThumbprint>
    Set-ItemProperty -Path "HKLM:\SOFTWARE\Wow6432Node\Microsoft\HPC" -Name SSLThumbprint -Value <NewThumbprint>
    Set-HpcReliableProperty.ps1 -PropertyName SSLThumbprint -PropertyValue <NewThumbprint>

Next steps

Consider these tutorials to learn more about HPC Pack 2019.