S2D cluster demotion

Mohamed jihad bayali 1,131 Reputation points
2024-02-08T16:46:53.8866667+00:00

Hello Team,

I have an S2D cluster that was corrupted.

I now want to rebuild this cluster, but first I want to do a clean demotion of it (disabling S2D, cleaning up the cluster disks, cleaning the nodes, removing the cluster objects from the domain, etc.).

Is there a procedure for this? How can I clean up the S2D ReFS volume before demoting the cluster? Do you have any best practices for this?

Thanks

Hyper-V
Windows Server Storage

2 answers

  1. Net Runner 600 Reputation points
    2024-02-11T18:33:00.9333333+00:00

    The best method to remove an S2D cluster entirely is to reinstall Windows Server (or Azure Stack HCI) from scratch.

    Storage Spaces Direct is a deeply integrated essential part of the Windows Server operating system that touches nearly every aspect of it. That is why the best option to create a clean cluster is to install the operating system fresh and wipe the disks during the process.

    Cleaning up your Active Directory, even after a proper cluster demotion, is also crucial. There may still be lots of remnants left that may need manual removal.
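
    If you do want to attempt a clean demotion before reinstalling, the usual order is storage first, then the cluster, then the nodes. The following is only a rough sketch, not a verified procedure. It assumes the cluster service still responds on at least one node, that all VMs, roles, and cluster shared volumes have already been removed, and that every non-primordial storage pool on the cluster belongs to S2D. On a corrupted cluster some of these steps may fail. It permanently destroys all data in the pool.

    ```powershell
    # DESTRUCTIVE sketch: run in an elevated session on one cluster node.
    # Assumption: every non-primordial pool is the S2D pool you want to delete.
    $pool = Get-StoragePool | Where-Object IsPrimordial -eq $false

    # 1. Make the pool writable, delete its virtual disks (the ReFS volumes
    #    live on these), then delete the pool itself.
    $pool | Set-StoragePool -IsReadOnly:$false
    $pool | Get-VirtualDisk | Remove-VirtualDisk -Confirm:$false
    $pool | Remove-StoragePool -Confirm:$false

    # 2. Turn off Storage Spaces Direct on the cluster.
    Disable-ClusterStorageSpacesDirect -Confirm:$false

    # 3. Remember the node names, then destroy the cluster and ask it to
    #    remove its own objects from Active Directory.
    $nodes = (Get-ClusterNode).Name
    Remove-Cluster -CleanupAD -Force

    # 4. Clear any leftover cluster configuration from each former node.
    foreach ($node in $nodes) {
        Clear-ClusterNode -Name $node -Force
    }
    ```

    Even after this, check Active Directory and DNS by hand for leftover cluster name objects and records.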

    1. Start a new Windows Server installation.
    2. Run diskpart during the installation wizard and run clean on each disk in the server.
    3. Finish the installation.
    4. Install all the updates.
    5. Create a new S2D cluster (using a different name and IP is recommended).
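
    If you would rather wipe the data disks from the running operating system before reinstalling, instead of using diskpart in the installer, something like the sketch below may work. It is modelled on the drive-cleaning approach in Microsoft's S2D deployment guidance, and it assumes that every disk other than the boot and system disks can be erased. Run it on each node.

    ```powershell
    # DESTRUCTIVE sketch: erases every disk on this server that is not the
    # boot or system disk and that already carries a partition table.
    Get-Disk |
        Where-Object { -not $_.IsBoot -and -not $_.IsSystem -and $_.PartitionStyle -ne 'RAW' } |
        ForEach-Object {
            # Bring the disk online and make it writable before clearing it.
            $_ | Set-Disk -IsOffline:$false
            $_ | Set-Disk -IsReadOnly:$false
            # Remove all partitions and data, including OEM partitions.
            $_ | Clear-Disk -RemoveData -RemoveOEM -Confirm:$false
        }
    ```

    Running Get-Disk afterwards and confirming that the data disks report a RAW partition style is a reasonable sanity check.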

    If you are planning to build a hyper-converged solution, you may also be interested in replacing Storage Spaces Direct with Virtual SAN https://www.starwindsoftware.com/vsan. While it has a similar feature set, it does not require ReFS and offers much better performance and storage efficiency for smaller clusters.

    1 person found this answer helpful.

  2. Ian Xue (Shanghai Wicresoft Co., Ltd.) 33,376 Reputation points Microsoft Vendor
    2024-02-16T04:54:12.9066667+00:00

    Hi Mohamed,

    Thanks for your post.

    When you take an S2D server offline, you take away not only that server's compute and memory but also a portion of the storage pool. Care must be taken to keep your data safe and to return the cluster to production-level readiness quickly. See Microsoft's documentation for the full description and latest information: https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/maintain-servers

    Key steps to reboot servers:

    1. Open PowerShell as Admin.
    2. Check to make sure the virtual disks are healthy by running Get-VirtualDisk.
    3. Run Suspend-ClusterNode -Drain to move the VMs to another node.
    4. Run the following to cleanly put the storage into maintenance mode. At this point writes to this node's storage are still active until step 5 has been completed.
       Get-StorageFaultDomain -Type StorageScaleUnit | Where-Object {$_.FriendlyName -eq "<Node Name>"} | Enable-StorageMaintenanceMode
    5. Run the following to verify that the disks for the node are in maintenance mode. You should see "In Maintenance Mode, OK" under Operational Status.
       Foreach($Node in (Get-ClusterNode).Name){$Node;Get-StorageNode -Name $Node*|Get-PhysicalDisk -PhysicallyConnected}
    6. Reboot server.
    7. Once you’re ready to put the server back into production, open PowerShell as Admin.
    8. Run the following to put the storage back into production.
       Get-StorageFaultDomain -Type StorageScaleUnit | Where-Object {$_.FriendlyName -eq "<Node Name>"} | Disable-StorageMaintenanceMode
    9. A storage job will start in the background to repair and resync the data. To check on its status, run Get-StorageJob (as Admin). If it returns straight to the prompt, no jobs are running. Do not reboot the next node until all of the jobs have completed.
    10. Run Get-VirtualDisk to verify that the virtual disks are healthy after the storage jobs complete. Wait until steps 9 and 10 have been completed before live migrating VMs back to this node, as storage jobs consume system resources and may affect the response time of your applications.
    11. Run Resume-ClusterNode -Failback Immediate to put the cluster node back into production to handle VM workloads.

    Best Regards,

    Ian Xue
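
    The reboot steps in the answer above could be scripted roughly as follows. This is a sketch rather than a tested runbook: $NodeName is a placeholder, the 30-second polling interval is arbitrary, and the script assumes it is run from an elevated session on a cluster node other than the one being rebooted.

    ```powershell
    # Placeholder: replace with the name of the node you are about to reboot.
    $NodeName = "<Node Name>"

    # --- Before the reboot ---

    # Step 2: stop if any virtual disk is not healthy.
    $unhealthy = Get-VirtualDisk | Where-Object HealthStatus -ne 'Healthy'
    if ($unhealthy) { throw "Virtual disks are not healthy; fix this before maintenance." }

    # Step 3: pause the node and drain its VMs to the other nodes.
    Suspend-ClusterNode -Name $NodeName -Drain -Wait

    # Step 4: put the node's storage into maintenance mode.
    Get-StorageFaultDomain -Type StorageScaleUnit |
        Where-Object FriendlyName -eq $NodeName |
        Enable-StorageMaintenanceMode

    # Step 5: list the node's disks; expect "In Maintenance Mode, OK".
    Get-StorageNode -Name "$NodeName*" |
        Get-PhysicalDisk -PhysicallyConnected |
        Format-Table FriendlyName, OperationalStatus

    # Step 6: reboot the node now, and continue once it is back online.

    # --- After the reboot ---

    # Step 8: take the node's storage out of maintenance mode.
    Get-StorageFaultDomain -Type StorageScaleUnit |
        Where-Object FriendlyName -eq $NodeName |
        Disable-StorageMaintenanceMode

    # Step 9: wait until no repair or resync jobs remain unfinished.
    while (Get-StorageJob | Where-Object JobState -ne 'Completed') {
        Start-Sleep -Seconds 30
    }

    # Step 10: confirm the virtual disks are healthy again.
    Get-VirtualDisk | Format-Table FriendlyName, HealthStatus, OperationalStatus

    # Step 11: resume the node and fail its VMs back immediately.
    Resume-ClusterNode -Name $NodeName -Failback Immediate
    ```

    Splitting the script at the reboot keeps each half safe to run on its own; only move on to the next node after the second half finishes.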

