Managing Your HDInsight Cluster using PowerShell – Update
Since writing my last post Managing Your HDInsight Cluster and .Net Job Submissions using PowerShell, there have been some useful modifications to the Azure PowerShell Tools.
The HDInsight cmdlets no longer exist as a separate set of tools, as they have now been integrated into the latest release of the Windows Azure PowerShell Tools. This integration means:
- You don’t need to specify the Subscription parameter
- If needed, you can use AAD authentication to Azure instead of certificates
Also, the cmdlets are fully backwards compatible, meaning you don’t need to change your current scripts. The Subscription parameter is now optional, but if it is specified then it is honoured.
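To illustrate the two points above, here is a minimal sketch of an AAD-based session. It assumes the integrated Azure PowerShell module is installed; the subscription name is a placeholder, and `Add-AzureAccount` opens an interactive sign-in prompt.

```powershell
# Sign in with Azure Active Directory instead of importing a certificate.
# This opens an interactive sign-in prompt.
Add-AzureAccount

# Pick the subscription to work against ("My Subscription" is a placeholder).
Select-AzureSubscription -SubscriptionName "My Subscription"

# The HDInsight cmdlets use the selected subscription,
# so no -Subscription or -Certificate argument is passed here.
Get-AzureHDInsightCluster
```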
As such the cluster creation script will now be:
    Param($Hosts = 4, [string] $Cluster = $(throw "Cluster Name Required."), [string] $StorageContainer = "hadooproot")
    # Get the subscription information and set variables
    $subscriptionInfo = Get-AzureSubscription -Default
    $subName = $subscriptionInfo | %{ $_.SubscriptionName }
    $subId = $subscriptionInfo | %{ $_.SubscriptionId }
    $cert = $subscriptionInfo | %{ $_.Certificate }
    $storeAccount = $subscriptionInfo | %{ $_.CurrentStorageAccountName }
    Select-AzureSubscription -SubscriptionName $subName
    $key = Get-AzureStorageKey $storeAccount | %{ $_.Primary }
    $storageAccountInfo = Get-AzureStorageAccount $storeAccount
    $location = $storageAccountInfo | %{ $_.Location }
    $hadoopUsername = "Hadoop"
    $clusterUsername = "Admin"
    $clusterPassword = "myclusterpassword"
    $secpasswd = ConvertTo-SecureString $clusterPassword -AsPlainText -Force
    $clusterCreds = New-Object System.Management.Automation.PSCredential($clusterUsername, $secpasswd)
    $clusterName = $Cluster
    $numberNodes = $Hosts
    $containerDefault = $StorageContainer
    $blobStorage = "$storeAccount.blob.core.windows.net"
    # Tidy up the root container to ensure it is empty
    # Remove-AzureStorageContainer -Name $containerDefault -Force
    Write-Host "Deleting old storage container contents: $containerDefault" -f yellow
    $blobs = Get-AzureStorageBlob -Container $containerDefault
    foreach($blob in $blobs)
    {
        Remove-AzureStorageBlob -Container $containerDefault -Blob ($blob.Name)
    }
    # Create the cluster
    Write-Host "Creating '$numberNodes' Node Cluster named: $clusterName" -f yellow
    Write-Host "Storage Account '$storeAccount' and Container '$containerDefault'" -f yellow
    Write-Host "User '$clusterUsername' Password '$clusterPassword'" -f green
    New-AzureHDInsightCluster -Certificate $cert -Name $clusterName -Location $location -DefaultStorageAccountName $blobStorage -DefaultStorageAccountKey $key -DefaultStorageContainerName $containerDefault -Credential $clusterCreds -ClusterSizeInNodes $numberNodes
    Write-Host "Created '$numberNodes' Node Cluster: $clusterName" -f yellow
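As an aside on the long New-AzureHDInsightCluster call at the end of the script, the same arguments can be gathered into a hashtable and splatted, which may be easier to read and maintain. This is only a sketch: `Get-ClusterParameters` is a hypothetical helper, not part of the original script or of the Azure module, and the certificate is included only when one is supplied.

```powershell
# Hypothetical helper: collects the cluster settings into one hashtable
# so they can be splatted onto New-AzureHDInsightCluster.
function Get-ClusterParameters {
    param(
        [string] $Name,
        [string] $Location,
        [string] $StorageAccount,
        [string] $StorageKey,
        [string] $Container,
        [System.Management.Automation.PSCredential] $Credential,
        [int] $Nodes,
        # Optional: only needed when authenticating with a certificate.
        [System.Security.Cryptography.X509Certificates.X509Certificate2] $Certificate
    )

    $params = @{
        Name                        = $Name
        Location                    = $Location
        DefaultStorageAccountName   = "$StorageAccount.blob.core.windows.net"
        DefaultStorageAccountKey    = $StorageKey
        DefaultStorageContainerName = $Container
        Credential                  = $Credential
        ClusterSizeInNodes          = $Nodes
    }
    if ($Certificate) { $params.Certificate = $Certificate }
    $params
}

# Usage inside the creation script (needs the Azure module and a selected subscription):
# $params = Get-ClusterParameters -Name $clusterName -Location $location `
#     -StorageAccount $storeAccount -StorageKey $key -Container $containerDefault `
#     -Credential $clusterCreds -Nodes $numberNodes -Certificate $cert
# New-AzureHDInsightCluster @params
```

The hashtable keys match the parameter names used in the script, so the splatted call should behave the same as the single-line version.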
Compared with the script in my last post, the only changes are the selection of the subscription and the removal of the Subscription option when creating the cluster. Of course, all the other scripts can easily be modified along the same lines, such as the cluster deletion script:
    Param($Cluster = $(throw "Cluster Name Required."))
    # Get the subscription information and set variables
    $subscriptionInfo = Get-AzureSubscription -Default
    $subName = $subscriptionInfo | %{ $_.SubscriptionName }
    $subId = $subscriptionInfo | %{ $_.SubscriptionId }
    $cert = $subscriptionInfo | %{ $_.Certificate }
    $storeAccount = $subscriptionInfo | %{ $_.CurrentStorageAccountName }
    Select-AzureSubscription -SubscriptionName $subName
    $clusterName = $Cluster
    # Delete the cluster
    Write-Host "Deleting Cluster named: $clusterName" -f yellow
    Remove-AzureHDInsightCluster $clusterName -Subscription $subId -Certificate $cert
    Write-Host "Deleted Cluster $clusterName" -f yellow
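The delete call above still passes the Subscription and Certificate values, which remain honoured. Since the Subscription parameter is now optional, a shorter form should also work once a subscription has been selected. The following is a sketch under that assumption; `Remove-ClusterByName` is a hypothetical wrapper, not part of the original scripts.

```powershell
# Hypothetical wrapper around the delete call. It relies on the currently
# selected subscription rather than explicit -Subscription/-Certificate values.
function Remove-ClusterByName {
    param([string] $Cluster = $(throw "Cluster Name Required."))

    Write-Host "Deleting Cluster named: $Cluster" -ForegroundColor Yellow
    Remove-AzureHDInsightCluster -Name $Cluster
    Write-Host "Deleted Cluster $Cluster" -ForegroundColor Yellow
}

# Usage (after Add-AzureAccount or Select-AzureSubscription):
# Remove-ClusterByName -Cluster "mycluster"
```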
The cluster creation script also contains an additional section to ensure that the default storage container does not contain any leftover files from previous cluster creations:
    $blobs = Get-AzureStorageBlob -Container $containerDefault
    foreach($blob in $blobs)
    {
        Remove-AzureStorageBlob -Container $containerDefault -Blob ($blob.Name)
    }
The rationale behind this is to ensure that any files left over from previous cluster creations/deletions are removed.
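The same cleanup can also be written as a pipeline, since Remove-AzureStorageBlob accepts blobs piped from Get-AzureStorageBlob. A minimal sketch follows; `Clear-StorageContainer` is a hypothetical helper name, and like the loop above it assumes the selected subscription has a current storage account set.

```powershell
# Hypothetical helper: empties a container by piping each blob
# straight into Remove-AzureStorageBlob.
function Clear-StorageContainer {
    param([Parameter(Mandatory = $true)][string] $Container)

    Get-AzureStorageBlob -Container $Container | Remove-AzureStorageBlob
}

# Usage inside the creation script:
# Clear-StorageContainer -Container $containerDefault
```

The explicit foreach loop and the pipeline form should remove the same blobs; the pipeline simply avoids the intermediate variable.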