Availability Group Listener in Windows Azure Now Supported! (And Scripts for Cloud-Only Configuration)

If you haven't noticed, AlwaysOn Availability Groups in Windows Azure now supports availability group (AG) listeners. Configuring the AG listener in Windows Azure is a multi-step process that includes:

  1. Creating a load-balanced endpoint for the Windows Azure VMs that host AG replicas.
  2. Ensuring that hotfix KB2854082 is installed on all cluster nodes (including any node that does not host an AG replica).
  3. Opening the probe port in the cluster nodes that host AG replicas.
  4. Manually creating a client access point in the cluster and configuring the IP address parameters to use the cloud service IP address, the probe port, etc.
  5. Setting the listener port in SQL Server.

If you are looking for an easier way to configure the listener in Windows Azure, I've published a script at the Script Center. This script currently has limited applications, but hopefully I can expand the scenarios as time goes on – and if you shout in my ear. If your scenario fits all the requirements, then I hope this script can help simplify the process of listener creation. If you don't fit all the requirements, you may still be able to "scriptify" most of the steps. So just read on.

Now back to the requirements for this script, the biggest limitations are as follows:

  • All AG nodes are running in Windows Azure and in the same subnet – Simply put, if the same listener requires multiple IPs, you can't use this script. This means the script excludes to all multi-subnet scenarios such as hybrid IT.

  • All cluster VMs were created with PowerShell Remoting enabled – This part gets a little hairy, so get ready. If after GA (4/16) you created your cluster VMs using PowerShell, then they are all PowerShell Remoting enabled. If however, you created your VMs using the portal, you had a choice until very recently to enable PowerShell Remoting by means of a small check box. If you didn't check that box, I won't say you lose, but you definitely can't use this script, at least not without manually enabling PowerShell Remoting on your VMs and tweaking the script. My personal opinion is: not worth the trouble.

    Now, the Azure team make a small tweak on 7/16 that enables PowerShell Remoting for all portal-created VMs without giving you the option. So if you created your VM after 7/16, then you win!

So enough for the $winners, now for the -not($winners) – those who can't use the script because of the above limitations. I'd like to provide some PowerShell snippets that you can run and that hopefully can make things simpler as well. Mainly, there are three scripts: one to run on your local client from which you normally administer your Windows Azure deployment, one to run on all your cluster VMs, and one to run on the primary replica VM. Understand that these scripts are much more "quick and dirty" than foolproof, so don't expect the validation and error checking that you find in the downloadable script. Also, you should have created a working AG in Windows Azure before using these steps. So now, without further ado, here are steps:

  1. Windows Azure PowerShell June 2013 or later installed on the local client. Download at https://go.microsoft.com/?linkid=9811175&clcid=0x409.

  2. On your local client, copy and paste the following script into a Windows Azure PowerShell session to configure LB endpoints and DSR each AG node (not necessarily each WSFC node)

    # Define variables

    $AGNodes = "VM1","VM2","VM3" # all AG nodes should be included

    $ServiceName = "MyCloudService" # the name of the cloud service that contains the AG nodes

    $EndpointName = "MyEndpoint" # name of the endpoint

    $EndpointPort = "10000" # public and private port to use for the endpoint

     

    # Configure a load balanced endpoint for each node in $AGNodes, with direct server return enabled

    ForEach ($node in $AGNodes)

    {

       Get-AzureVM -ServiceName $ServiceName -Name $node | Add-AzureEndpoint -Name $EndpointName -Protocol "TCP" -PublicPort $EndpointPort -LocalPort $EndpointPort -LBSetName "$EndpointName-LB" -ProbePort 59999 -ProbeProtocol "TCP" -DirectServerReturn $true | Update-AzureVM

    }

  3. Connect to RDP session for each WSFC node and download hotfix KB2854082 to a local directory.

  4. In the RDP session for each WSFC node, copy and paste the following script into an elevated PowerShell session to install hotfix 2854082 and open the probe port in the firewall if the node is an availability group node. Be careful to run the script to completion on each node before moving onto the next node.

    # Define variables

    $ag = "AG1" #Availability group name

    $hotfixpath = "<localpath>" #Hotfix's .msu file path

    $listenerPort = "10000" # Listener port. Same as the endpoint port.

     

    Import-Module FailoverClusters

    # Stop the cluster service

    Stop-ClusterNode -Name $env:COMPUTERNAME

    # Run update

    & $hotfixpath /quiet

    # Wait for update to finish

    while ((Get-Process | where {$_.ProcessName -eq "wusa"}) -ne $null)

    {

       Write-Host "Waiting for update to complete..."

       Start-Sleep -Seconds 10

    }

    # Start the cluster service

    Start-ClusterNode -Name $env:COMPUTERNAME

    # Check if VM is an AG node and open the port port if true

    If (Get-ClusterOwnerNode -Group $ag | where {$_.OwnerNodes -contains $env:COMPUTERNAME})

    {

       netsh advfirewall firewall add rule name='Load Balance Probe (TCP-In)' localport=59999 dir=in action=allow protocol=TCP

       netsh advfirewall firewall add rule name='Availability Group Listener (TCP-In)' localport=$listenerPort dir=in action=allow protocol=TCP

    }  

  5. In the RDP session for the primary replica VM, copy and paste the following script into an elevated PowerShell session

    # Define variables

    $ag = "AG1" # The availability group name

    $serviceName = "MyCloudService" # The cloud service name

    $networkName = "Cluster Network 1" # The cluster network name, usually "Cluster Network 1" if the nodes are in the same subnet

    $listenerPort = "10000" # Listener port. Same as the endpoint port.

     

    $aglistener = $ag + "Listener"

    $agendpoint = (Resolve-DnsName -Name "$serviceName.cloudapp.net").IPAddress

    Import-Module FailoverClusters

    # Add IP address resource for the listener to AG resource group. The probe port is set so the AG owner node will respond to probes from Windows Azure.

    Add-ClusterResource "IP Address $agendpoint" -ResourceType "IP Address" -Group $ag | Set-ClusterParameter -Multiple @{"Address"="$agendpoint";"ProbePort"="59999";SubnetMask="255.255.255.255";"Network"="$networkName";"OverrideAddressMatch"=1;"EnableDhcp"=0}

    # Add Network Name resource for the listener to AG resource group

    Add-ClusterResource -Name $aglistener -ResourceType "Network Name" -Group $ag | Set-ClusterParameter -Multiple @{"Name"=$aglistener;"DnsName"=$aglistener}

    # Set dependency for the Network Name resource on the IP address resource

    Get-ClusterResource -Name $aglistener | Set-ClusterResourceDependency "[IP Address $agendpoint]"

    # Start the listener resource

    Start-ClusterResource -Name $aglistener

    # Set dependency for the AG resource group on the listener's network name

    Get-ClusterResource -Name $ag | Set-ClusterResourceDependency "[$aglistener]"

    # Change port number on the listener to 1433

    Set-SqlAvailabilityGroupListener -Path SQLSERVER:\SQL\$env:COMPUTERNAME\DEFAULT\AvailabilityGroups\$ag\AvailabilityGroupListeners\$aglistener -Port $listenerPort

  6. Test connection to listener from a domain-joined VM that is not in the same cloud service (DSR not supported from within the same cloud service). Use a longer login timeout since network messages are traversing the VM's public endpoint. You can use sqlcmd or SSMS.

    sqlcmd –S "<ListenerName>,<EndpointPort>" -d "<DatabaseName>" -Q "select @@servername, db_name()" -l 15

  7. Fail over the AG and test the listener connection again. The query above should succeed and return a different server name.

The main HADR topic for SQL Server in Windows Azure: High Availability and Disaster Recovery for SQL Server in Windows Azure Virtual Machines, will be updated in the near future and will include a GUI tutorial on how to configure this.

Comments

  • Anonymous
    August 28, 2013
    Cephas, I have a quick question about this. If I understand correctly, this is creating a public endpoint for the AG, which I think presents a couple of issues. For one, I certainly wouldn't want my SQL setup to be exposed publicly to the Internet for security reasons. Second, I would rather have my Azure WebRole front-end to connect directly to the AG (they're in separate cloud services, on the same network, different subnets) as opposed to traversing the VM's public endpoint (as you mention when you suggest increasing the login timeout). So, the question is: can I setup a listener that's not public on the internet and that I can connect to from my WebRole cloud service? Thanks!

  • Anonymous
    December 06, 2013
    Sari, sorry for the late reply. Currently, you can use Windows Azure's Network Access Control Lists (ACL) to restrict access to your endpoint. For more information, see msdn.microsoft.com/.../dn376541.aspx. In that sense, you can make it not public. And yes, the client will traverse the public endpoint through the Windows Azure load balancer in order to find the availability group listener. However, once it finds the listener, it does make a direct connection to the VM.

  • Anonymous
    March 10, 2014
    Can you also create a listener for other cluster workloads?

  • Anonymous
    April 28, 2014
    HI, Can AG Listener be configured in WS 2008 R2 server as well??

  • Anonymous
    September 10, 2014
    I've successfully set up a failover cluster SQL Server with 2  nodes (sqlserver1 and sqlserver2. However I can only connect from the node where SQL Server is running. I cannot even connect from the second node, except when I move the role to it. I cannot connect from any client machine. Searching for solution on the web, I learnt that is a known issue with Microsoft SQL SERVER 2012 R2 The SQL cluster does not resolve to the virtual network name IP instead it resolves to the IP address of the node where SQL Server is running IP I have installed SQL Server Service pack 2 on both nodes hoping this will solve the problem but still nothing change. Is there any one who found solution for this issue and can share it with me?