Manage Certificates for HPC Pack 2019 Cluster

There are several certificates used in HPC Pack Cluster for different purposes. Here is a full list:

| Certificate | Purpose and description | Install locations |
| --- | --- | --- |
| Microsoft HPC Azure Client | Used by head nodes to communicate with Azure PaaS proxy nodes. It's a self-signed certificate auto-generated by the HPC cluster. | Head nodes: LocalComputer\Personal |
| Microsoft HPC Azure Service | Used by Azure PaaS proxy nodes to communicate with head nodes. It's a self-signed certificate auto-generated by the HPC cluster. | Azure proxy nodes: LocalComputer\Personal |
| Microsoft HPC Azure Management | Used by head nodes to communicate with Azure Management Service to manage Azure resources in classic mode. | Head nodes: LocalComputer\Personal, LocalComputer\Trusted Root CA (self-signed only); Azure Portal: Subscriptions\Management certificates |
| Azure Service Principal Certificate | Used by head nodes to communicate with Azure Resource Manager to manage Azure resources in resource manager mode. You can use the same certificate as Microsoft HPC Azure Management. | Head nodes: LocalComputer\Personal, Azure Service Principal |
| HPC Pack Communication for Head node | Used by head nodes to communicate with other head nodes and with Compute/Broker/Workstation/Linux nodes, that is, all other nodes except for Azure PaaS nodes and Azure Batch Pool. If you plan to use the Burst to Azure IaaS VM feature to deploy Azure IaaS compute nodes, also import this certificate into Azure Key Vault so that Azure IaaS compute nodes can use it to communicate with head nodes. | Head nodes and IaaS compute nodes (1): LocalComputer\Personal, LocalComputer\Trusted Root CA (self-signed only); On-premises Windows nodes and HPC Client (2)(3): LocalComputer\Trusted Root CA (self-signed only) |
| HPC Pack Communication for other node | Used by Compute/Broker/Workstation/Linux nodes to communicate with head nodes. You can use the same certificate as HPC Pack Communication for Head node. For an HPC Pack cluster entirely in Azure, or a hybrid cluster with Azure IaaS compute nodes, we recommend using the same certificate as HPC Pack Communication for Head node. | On-premises Windows nodes: LocalComputer\Personal, LocalComputer\Trusted Root CA (self-signed only); Linux nodes: /opt/hpcnodemanager/cert; Head nodes: LocalComputer\Trusted Root CA (self-signed only) |
| Service Fabric Certificate | Used by the head nodes to secure the Service Fabric cluster communication. By default, it is the same certificate as HPC Pack Communication for Head node. You can add an additional certificate for the Service Fabric cluster. | Head nodes: LocalComputer\Personal, LocalComputer\Trusted Root CA (self-signed only), CurrentUser\Personal (4) |

(1) Here the term IaaS compute nodes means the compute nodes deployed with the Burst to Azure IaaS VM feature or the HPC Pack cluster deployment template. If you manually run the HPC setup wizard (setup.exe) to install an HPC compute node on an Azure IaaS VM, you can treat it as an on-premises compute node.

(2) For a domain-joined HPC Client machine, you can opt not to install the certificate HPC Pack Communication for Head node in the Local Computer\Trusted Root CA store in either of the following ways:

  • During HPC Client installation, choose Skip CA and CN validation.

  • Add registry value named CertificateValidationType with DWORD value 0 under registry key HKLM\SOFTWARE\Microsoft\HPC.

(3) For a non-domain-joined HPC Client machine, install the certificate HPC Pack Communication for Head node in Local Computer\Personal with the private key and in CurrentUser\Trusted Root CA without the private key. Then add a registry value named SSLThumbprint under the registry key HKLM\SOFTWARE\Microsoft\HPC and set it to the certificate thumbprint.

(4) If you want to access the Service Fabric cluster portal (https://localhost:10400) on the head node, install the certificate under CurrentUser\Personal as well with private key.

Rotate HPC Pack Node communication certificates

Microsoft HPC Pack 2016 and later uses a certificate to secure the communication between the HPC nodes. You can use the same certificate on all HPC nodes, or use two different certificates: HPC Pack Communication for Head node and HPC Pack Communication for other node. You need to rotate the certificates before they expire. If you fail to do so, the HPC Pack cluster stops working.
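To watch for an upcoming expiry, something like the following minimal sketch may help. The function name Get-DaysUntilExpiry is hypothetical and not part of HPC Pack. It only computes the whole days left before a certificate's NotAfter date. The commented usage reads the thumbprint from the SSLThumbprint registry value, the same location used elsewhere in this article.

```powershell
# Hypothetical helper (not an HPC Pack cmdlet): whole days remaining until a certificate expires.
function Get-DaysUntilExpiry {
    param(
        [Parameter(Mandatory)] [datetime] $NotAfter,
        [datetime] $Now = (Get-Date)
    )
    # A negative result means the certificate has already expired.
    [math]::Floor(($NotAfter - $Now).TotalDays)
}

# Example on an HPC node (Windows only):
# $thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbprint
# $cert = Get-Item Cert:\LocalMachine\My\$thumbprint
# Get-DaysUntilExpiry -NotAfter $cert.NotAfter
```

How far in advance to rotate is a local policy decision; this article only requires that rotation happens before expiry.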

The certificates must meet the following requirements:

  • Have a private key capable of key exchange.
  • Key usage includes Digital Signature and Key Encipherment.
  • Enhanced key usage includes Client Authentication and Server Authentication.
  • If two different certificates are used, they must have the same subject name.

If the certificate is used to secure Service Fabric Cluster as well, it must meet the following additional requirements:

  • The certificate's provider must be Microsoft Enhanced RSA and AES Cryptographic Provider.
  • The RSA key length must be 2048 bits.
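Some of the requirements above can be checked from PowerShell. The following is a rough sketch, not an official validation tool: the function name Test-HpcCertificateRequirement is hypothetical, and it inspects only what the .NET X509Certificate2 object exposes (private key presence, key usage, enhanced key usage, and RSA key length). It does not check the key exchange capability or the cryptographic provider; for those, use the CertUtil.exe dump described later in this article.

```powershell
# Hypothetical helper (not an HPC Pack cmdlet): reports which requirements a certificate meets.
function Test-HpcCertificateRequirement {
    param(
        [Parameter(Mandatory)]
        [System.Security.Cryptography.X509Certificates.X509Certificate2] $Certificate
    )

    $kuType  = [System.Security.Cryptography.X509Certificates.X509KeyUsageExtension]
    $ekuType = [System.Security.Cryptography.X509Certificates.X509EnhancedKeyUsageExtension]
    $flags   = [System.Security.Cryptography.X509Certificates.X509KeyUsageFlags]

    # Locate the Key Usage and Enhanced Key Usage extensions, if present.
    $keyUsage = $Certificate.Extensions | Where-Object { $_ -is $kuType } | Select-Object -First 1
    $eku      = $Certificate.Extensions | Where-Object { $_ -is $ekuType } | Select-Object -First 1

    $usages  = if ($keyUsage) { $keyUsage.KeyUsages } else { $flags::None }
    $ekuOids = @()
    if ($eku) { $ekuOids = @($eku.EnhancedKeyUsages | ForEach-Object { $_.Value }) }

    # Returns $null when the public key is not RSA.
    $rsa = [System.Security.Cryptography.X509Certificates.RSACertificateExtensions]::GetRSAPublicKey($Certificate)

    [pscustomobject]@{
        HasPrivateKey        = $Certificate.HasPrivateKey
        DigitalSignature     = $usages.HasFlag($flags::DigitalSignature)
        KeyEncipherment      = $usages.HasFlag($flags::KeyEncipherment)
        ServerAuthentication = $ekuOids -contains '1.3.6.1.5.5.7.3.1'
        ClientAuthentication = $ekuOids -contains '1.3.6.1.5.5.7.3.2'
        RsaKeyLength2048     = ($null -ne $rsa) -and ($rsa.KeySize -eq 2048)
    }
}

# Example on an HPC node (Windows only):
# $thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbprint
# Test-HpcCertificateRequirement -Certificate (Get-Item Cert:\LocalMachine\My\$thumbprint)
```

The RsaKeyLength2048 field matters only when the certificate also secures the Service Fabric cluster.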

Prepare new certificate

When you prepare a new certificate, make sure that you use the same subject name as that of the old certificate. Run the following PowerShell commands on the HPC node to get the subject name of your certificate.

$thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbPrint
$subjectName = (Get-Item Cert:\LocalMachine\My\$thumbprint).Subject
$subjectName

If you're using a self-signed certificate, run the following PowerShell commands on a computer running Windows 10 or Windows Server 2016 to generate a new certificate that meets all the above requirements. Replace <subject-name> with the subject name of the old certificate. The commands create a folder named after the thumbprint of the new certificate, containing two files: PrivateCert.pfx with the private key and PublicCert.cer without the private key.

$subjectName = "<subject-name>"
$pfxcert = New-SelfSignedCertificate -Subject $subjectName -KeySpec KeyExchange -KeyLength 2048 -HashAlgorithm SHA256 -TextExtension @("2.5.29.37={text}1.3.6.1.5.5.7.3.1,1.3.6.1.5.5.7.3.2") -Provider "Microsoft Enhanced RSA and AES Cryptographic Provider" -CertStoreLocation Cert:\CurrentUser\My -KeyExportPolicy Exportable -NotAfter (Get-Date).AddYears(10) -NotBefore (Get-Date).AddDays(-1)
$certThumbprint = $pfxcert.Thumbprint
$null = New-Item $env:Temp\$certThumbprint -ItemType Directory
$pfxPassword = Get-Credential -UserName 'Protection password' -Message 'Enter protection password below'
Export-PfxCertificate -Cert Cert:\CurrentUser\My\$certThumbprint -FilePath "$env:Temp\$certThumbprint\PrivateCert.pfx" -Password $pfxPassword.Password
Export-Certificate -Cert Cert:\CurrentUser\My\$certThumbprint -FilePath "$env:Temp\$certThumbprint\PublicCert.cer" -Type CERT -Force
start "$env:Temp\$certThumbprint"

If you're using a certificate authority (CA) signed certificate or an existing self-signed certificate, run the following command and check the values of KeySpec, Subject, Key Usage, Enhanced Key Usage, Public Key Length, and Provider against the requirements above.

CertUtil.exe -p "<password>" -v -dump <path-of-pfxFile>

  • If the value of Subject, Key Usage, Enhanced Key Usage, or Public Key Length doesn't match the requirements, you must regenerate the certificate.

  • If the value of KeySpec (should be "1 -- AT_KEYEXCHANGE") or Provider doesn't match, you don't need to regenerate the certificate. Run the following command to import the certificate with modified KeySpec and Provider values, and then run certlm.msc to export the certificate, including private key, to a new PFX file which meets the requirements.

CertUtil.exe -f -p "<password>" -csp "Microsoft Enhanced RSA and AES Cryptographic Provider" -importpfx "<path-of-pfxFile>" AT_KEYEXCHANGE

Rotate certificate on broker, compute, and workstation nodes

  1. Copy the new CN certificate, for instance PrivateCert.pfx, to the Certificates folder under the HPC install share, for instance, \\headnode\REMINST\Certificates, with the new name HpcCnCommunication.pfx.

  2. Download the PowerShell script Update-HpcNodeCertificate.ps1 and put it in the HPC install share (\\<headnode>\REMINST). Open HPC Cluster Manager, select Resource Management > Nodes. Select all the Windows compute, broker, and workstation nodes. Make sure head nodes are NOT included. Select Run Command, and run the following command with the correct values for head node and password:

    PowerShell.exe -ExecutionPolicy ByPass -Command "\\<headnode>\REMINST\Update-HpcNodeCertificate.ps1 -PfxFilePath \\<headnode>\REMINST\Certificates\HpcCnCommunication.pfx -Password <password> -RunAsScheduledTask"
    
  3. If you have Linux compute nodes, open HPC Cluster Manager on the head node. Select Resource Management > Nodes. Select all the Linux nodes. Select Run Command, and run the following commands in sequence:

    First, create a temp directory on all Linux nodes.

    mkdir /tmp/hpcreminst
    

    Second, mount the HPC install share on all Linux nodes. Fill in the correct values for head node, domain name, and domain user credentials.

    mount -t cifs //<headnode>/REMINST /tmp/hpcreminst -o vers=2.1,domain=<domainname>,username=<username>,password='<userpassword>',dir_mode=0755,file_mode=0755
    

    Third, schedule a job to rotate the certificate on all Linux nodes. Fill in the correct value for the certificate protection password.

    cd /tmp/hpcreminst; echo "python /opt/hpcnodemanager/setup.py -certfile:/tmp/hpcreminst/Certificates/HpcCnCommunication.pfx -certpassword:<password>" | at now + 1 minute
    
  4. Optionally if new bare metal machines will be deployed, open the HPC Cluster Manager, go to Deployment To-do List. Select Import a certificate for deployment to import the new CN certificate from \\headnode\REMINST\Certificates, with the name HpcCnCommunication.pfx.

Rotate certificate for single head node

  1. If the new head node certificate is self-signed, make all the Windows cluster nodes trust this new self-signed certificate before rotating.

    • Copy the new public certificate PublicCert.cer file to the Certificates folder under the HPC install share (\\headnode\REMINST\Certificates) with new name HpcHnPublicCert.cer.
    • Open HPC Cluster Manager > Resource Management > Nodes. Select all the Windows compute, broker, and workstation nodes. Select Run Command. Run the following command with the correct head node to make them trust the new head node certificate:
    PowerShell.exe -ExecutionPolicy ByPass -Command "Import-certificate -FilePath \\<headnode>\REMINST\Certificates\HpcHnPublicCert.cer -CertStoreLocation cert:\LocalMachine\Root"
    
  2. Download the PowerShell script Update-HpcNodeCertificate.ps1 and run the following PowerShell command to apply the new certificate PrivateCert.pfx:

    .\Update-HpcNodeCertificate.ps1 -PfxFilePath <path-of-PrivateCert.pfx> -Password <password>
    
  3. If you're using the Burst to Azure IaaS VM feature, on HPC Cluster Manager, select Configuration > Set Azure Deployment Configuration to import the new certificate PrivateCert.pfx on the Azure Key Vault Certificate page. Or you can refer to Create Azure Key Vault Certificate on Azure Portal to manually import the PrivateCert.pfx to Azure Key Vault, and then specify the values on the Azure Key Vault Certificate page in the Set Azure Deployment Configuration wizard.

Rotate certificate for high availability head nodes

This procedure applies to Service Fabric cluster or HPC Pack 2019 built-in high availability architecture.

  1. If the new head node certificate is self-signed, make all the Windows cluster nodes trust this new self-signed certificate before rotating.

    • Copy the new public certificate PublicCert.cer file to the Certificates folder under the HPC install share (\\<InstallShare>\Certificates) with new name HpcHnPublicCert.cer. You can use the following PowerShell command to get the HPC install share.

      Add-PSSnapin Microsoft.HPC
      Get-HpcClusterRegistry -PropertyName InstallShare
      
    • Open HPC Cluster Manager > Resource Management > Nodes, select all the Windows cluster nodes including all head nodes, and select Run Command. Run the following command line with the correct install share to make them trust the new head node certificate:

    PowerShell.exe -ExecutionPolicy ByPass -Command "Import-certificate -FilePath \\<InstallShare>\Certificates\HpcHnPublicCert.cer -CertStoreLocation cert:\LocalMachine\Root"
    
  2. On every head node, download the PowerShell script Update-HpcNodeCertificate.ps1 and run the following PowerShell command to import and apply the new certificate PrivateCert.pfx:

    .\Update-HpcNodeCertificate.ps1 -PfxFilePath <path-of-PrivateCert.pfx> -Password <password>
    
  3. On any one head node, run the following PowerShell commands to apply the new certificate, which is already installed on all the head nodes.

    Add-PSSnapin Microsoft.HPC
    $thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbPrint
    Set-HpcClusterRegistry -PropertyName SSLThumbprint -PropertyValue $thumbprint
    
  4. If you're using the Burst to Azure IaaS VM feature, on HPC Cluster Manager, select Configuration > Set Azure Deployment Configuration to import the new certificate PrivateCert.pfx on the Azure Key Vault Certificate page. Or you can refer to Create Azure Key Vault Certificate on Azure Portal to manually import the PrivateCert.pfx to Azure Key Vault, and then specify the values on Azure Key Vault Certificate page in the Set Azure Deployment Configuration wizard.

  5. [Service Fabric cluster only] If you're using the same certificate to secure Service Fabric cluster, check whether a Service Fabric cluster configuration upgrade is required. On any head node, run the following PowerShell command to check the current security configuration of the Service Fabric cluster.

    Connect-ServiceFabricCluster
    Get-ServiceFabricClusterConfiguration | Out-File d:\sfclusterconfig.json
    

    If the security configuration is as below, a Service Fabric cluster configuration upgrade is not required if the new certificate is issued by the same issuer.

        "Security": {
          "CertificateInformation": {
            "ClusterCertificateCommonNames": {
              "CommonNames": [
                {
                  "CertificateCommonName": "[CertificateCommonName]",
                  "CertificateIssuerThumbprint": "[IssuerThumbprint]"
                }
              ],
              "X509StoreName": "My"
            },
            "ServerCertificateCommonNames": {
              "CommonNames": [
                {
                  "CertificateCommonName": "[CertificateCommonName]",
                  "CertificateIssuerThumbprint": "[IssuerThumbprint]"
                }
              ],
              "X509StoreName": "My"
            }
          },
          "ClusterCredentialType": "X509",
          "ServerCredentialType": "X509"
        },
    

    If the security configuration is as below, you need to upgrade the Service Fabric cluster configuration.

        "Security": {
          "CertificateInformation": {
            "ClusterCertificate": {
              "Thumbprint": "[Thumbprint]",
              "X509StoreName": "My"
            },
            "ServerCertificate": {
              "Thumbprint": "[Thumbprint]",
              "X509StoreName": "My"
            }
          },
          "ClusterCredentialType": "X509",
          "ServerCredentialType": "X509"
        },
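The two cases above can be told apart by the section names in the exported file. The following is a minimal sketch, assuming the exported JSON uses the exact property names shown above; the function name Test-SfConfigUpgradeRequired is hypothetical. It returns $true for the thumbprint-based layout. A $false result does not by itself mean that nothing needs to be done: for the common-name layout, the new certificate must still come from the same issuer.

```powershell
# Hypothetical helper: $true when the configuration pins certificates by thumbprint.
function Test-SfConfigUpgradeRequired {
    param([Parameter(Mandatory)] [string] $ConfigJson)

    # Thumbprint-based sections are named exactly "ClusterCertificate" or "ServerCertificate".
    # The closing quote in the pattern keeps "ClusterCertificateCommonNames" from matching.
    $ConfigJson -match '"(Cluster|Server)Certificate"\s*:'
}

# Example, using the file exported in the previous step:
# Test-SfConfigUpgradeRequired -ConfigJson (Get-Content d:\sfclusterconfig.json -Raw)
```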
    

For more information about the certificate rollover for Service Fabric cluster, see Upgrade Service Fabric cluster certificate configuration and Secure a standalone Service Fabric cluster.

Upgrade certificate configuration for Service Fabric cluster

  1. Modify the file sfclusterconfig.json as below:

    • Replace the value of Thumbprint under ClusterCertificate and ServerCertificate with the thumbprint of the new certificate
    • Remove the properties with name $id if any under Security and CertificateInformation
    • Change clusterConfigurationVersion to a higher version, for example from 1.0.0 to 1.0.1
  2. Run the following PowerShell commands to start the Service Fabric cluster configuration upgrade.

    Connect-ServiceFabricCluster
    Start-ServiceFabricClusterConfigurationUpgrade -ClusterConfigPath d:\sfclusterconfig.json
    
  3. Use the following command to query the upgrade status:

Get-ServiceFabricClusterConfigurationUpgradeStatus
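The manual edits in step 1 can also be scripted. The sketch below is an illustration under stated assumptions, not a supported tool: the function name Update-SfCertificateConfig is hypothetical, and because this article does not show where the Security section sits inside the full file, the function finds properties by name wherever they appear. It increments only the last component of clusterConfigurationVersion and returns the updated JSON as a string without writing anything to disk.

```powershell
# Hypothetical helper: applies the three edits from step 1 to the exported configuration text.
function Update-SfCertificateConfig {
    param(
        [Parameter(Mandatory)] [string] $Json,
        [Parameter(Mandatory)] [string] $NewThumbprint
    )

    # Walk the parsed JSON tree; $nodeName is the name of the property that holds $node.
    function Visit($node, $nodeName) {
        if ($node -is [System.Array]) {
            foreach ($item in $node) { Visit $item $nodeName }
            return
        }
        if ($node -isnot [System.Management.Automation.PSCustomObject]) { return }

        # Remove "$id" directly under Security and CertificateInformation.
        if (($nodeName -in 'Security', 'CertificateInformation') -and $node.PSObject.Properties['$id']) {
            $node.PSObject.Properties.Remove('$id')
        }
        # Replace the thumbprint under ClusterCertificate and ServerCertificate.
        if (($nodeName -in 'ClusterCertificate', 'ServerCertificate') -and $node.PSObject.Properties['Thumbprint']) {
            $node.Thumbprint = $NewThumbprint
        }

        # Copy the names first so that the loop is not affected by the edits above.
        foreach ($name in @($node.PSObject.Properties.Name)) {
            $value = $node.$name
            if ($name -eq 'clusterConfigurationVersion' -and $value -is [string]) {
                # Bump the last component, for example 1.0.0 -> 1.0.1.
                $parts = $value -split '\.'
                $parts[$parts.Count - 1] = [string]([int]$parts[$parts.Count - 1] + 1)
                $node.$name = $parts -join '.'
            }
            else {
                Visit $value $name
            }
        }
    }

    $root = $Json | ConvertFrom-Json
    Visit $root $null
    # ConvertTo-Json truncates nested objects unless a sufficient depth is given.
    $root | ConvertTo-Json -Depth 100
}

# Example (review the result before starting the upgrade):
# $updated = Update-SfCertificateConfig -Json (Get-Content d:\sfclusterconfig.json -Raw) -NewThumbprint '<new-thumbprint>'
# $updated | Set-Content d:\sfclusterconfig.json
```

Keeping a copy of the original file is advisable, since the upgrade command consumes whatever the file contains at that point.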

Rotate an already expired certificate in clusters with Service Fabric HA

  1. Recover the Service Fabric cluster.
  2. Recover the HPC Pack cluster.

    • On every head node, run the following PowerShell commands, replacing <NewThumbprint> with the thumbprint of the new certificate:

    Set-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\HPC" -Name SSLThumbprint -Value <NewThumbprint>
    Set-ItemProperty -Path "HKLM:\SOFTWARE\Wow6432Node\Microsoft\HPC" -Name SSLThumbprint -Value <NewThumbprint>
    Set-HpcReliableProperty.ps1 -PropertyName SSLThumbprint -PropertyValue <NewThumbprint>
