How to check for DPM performance issues while backing up to disk
When DPM performs poorly while backing up to disk, work through the following checks:
- Open Resource Monitor on both the DPM server and the protected computer during a consistency check, select the DPMRA and System processes, and check the disk I/O for bottlenecks.
- Double-check the replica size and make sure you have enough space allocated.
- Involve the storage/disk vendor/manufacturer to further check for any hardware issues.
- Check for any errors on the VSS providers and writers on the DPM server by running VSSADMIN LIST PROVIDERS and VSSADMIN LIST WRITERS.
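As a rough illustration of the writer check, the sketch below scans text captured from VSSADMIN LIST WRITERS and reports any writer whose state is not Stable or whose last error is not "No error". The function name and the sample text are made up for illustration, and the parser assumes the usual "Writer name / State / Last error" layout of the command output.

```python
import re

def find_unhealthy_writers(vssadmin_output):
    """Return (name, state, last_error) for each writer that looks unhealthy."""
    problems = []
    name = state = None
    for raw in vssadmin_output.splitlines():
        line = raw.strip()
        m = re.match(r"Writer name:\s*'(.*)'", line)
        if m:
            name, state = m.group(1), None
            continue
        m = re.match(r"State:\s*\[\d+\]\s*(.*)", line)
        if m:
            state = m.group(1).strip()
            continue
        m = re.match(r"Last error:\s*(.*)", line)
        if m and name is not None:
            last_error = m.group(1).strip()
            # A healthy writer reports "Stable" and "No error".
            if state != "Stable" or last_error != "No error":
                problems.append((name, state, last_error))
            name = state = None
    return problems

# Illustrative sample only; the second writer name is hypothetical.
SAMPLE = """\
Writer name: 'System Writer'
   Writer Id: {00000000-0000-0000-0000-000000000000}
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer'
   Writer Id: {00000000-0000-0000-0000-000000000000}
   State: [8] Failed
   Last error: Inconsistent shadow copy
"""

if __name__ == "__main__":
    for writer in find_unhealthy_writers(SAMPLE):
        print(writer)
```

On a real server, the text could be captured from an elevated prompt (for example by redirecting the command output to a file) and passed to the function in place of SAMPLE.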
- Check for any errors in the Event Viewer logs on the DPM server and the target computer.
- Check for any issues with the transfer rates on your disks. Test with file copy and paste operations between the disks, using many large files of different types.
A data source with many small files will usually take longer than a data source with fewer, larger files.
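The copy test above can also be timed with a small script. This is a minimal sketch, not a benchmark tool: measure_copy_rate is a hypothetical helper name, and operating system caching can inflate the figure, so use files large enough to exceed the cache and repeat the run a few times.

```python
import os
import shutil
import time

def measure_copy_rate(src_path, dst_dir):
    """Copy src_path into dst_dir and return the observed rate in MB/s.

    dst_dir must be a different folder (ideally on the other disk).
    The copy is removed again after it has been timed.
    """
    size_bytes = os.path.getsize(src_path)
    dst_path = os.path.join(dst_dir, os.path.basename(src_path))
    start = time.perf_counter()
    shutil.copyfile(src_path, dst_path)
    elapsed = time.perf_counter() - start
    os.remove(dst_path)
    # Guard against a zero interval on very small files.
    return (size_bytes / (1024 * 1024)) / max(elapsed, 1e-9)
```

Calling it once for a large file and once for each file in a folder of many small files gives a rough feel for the difference described above.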
- Run the chkntfs command on your drives. If no switches are specified, chkntfs displays the status of the dirty bit for each drive.
- Adjust the pagefile size to the recommended one for DPM: https://technet.microsoft.com/en-gb/library/hh757757.aspx
Pagefile:
Minimum requirement: 1.5 times the amount of RAM on the computer.
Recommended requirement: 0.2 percent of the combined size of all recovery point volumes, in addition to the minimum requirement size (1.5 times the amount of RAM on the computer).
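To make the pagefile rule concrete, here is a small worked calculation. The function name and the sample sizes (16 GB of RAM, two recovery point volumes of 2000 GB and 3000 GB) are assumptions chosen only for illustration.

```python
def recommended_pagefile_gb(ram_gb, recovery_point_volumes_gb):
    """Return the pagefile size in GB: 1.5 x RAM plus 0.2 percent of
    the combined size of all recovery point volumes."""
    minimum = 1.5 * ram_gb
    extra = 0.002 * sum(recovery_point_volumes_gb)
    return minimum + extra

if __name__ == "__main__":
    # Hypothetical server: 24 GB minimum plus 10 GB for 5000 GB of volumes.
    size = recommended_pagefile_gb(16, [2000, 3000])
    print("Pagefile size: about {:.0f} GB".format(size))
```

With these sample numbers, the result is about 34 GB, compared with the 24 GB minimum.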
Note: A consistency check only shows the number of bytes transferred since it started. It may take an hour of CRC checking to discover 1 GB of data that has actually changed, so in that hour you will see DPM transfer only 1 GB. That does NOT mean that the transfer rate is 1 GB/hr. The check may also run for several hours without transferring any data.
- You can also try disabling the Last Access Time update on the DPM server and the protected servers, and then check again.
Registry Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
Value Name: NtfsDisableLastAccessUpdate
Data Type : REG_DWORD
Data : 1
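If the same value has to be set on several servers, one option is to generate a .reg file and import it on each machine. The sketch below only builds the file text; reg_dword_file is a hypothetical helper name, and you should back up the registry before importing anything.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem"
VALUE_NAME = "NtfsDisableLastAccessUpdate"

def reg_dword_file(key, value_name, data):
    """Return .reg file text that sets a single REG_DWORD value."""
    if not 0 <= data <= 0xFFFFFFFF:
        raise ValueError("REG_DWORD data must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[{}]".format(key),
        # DWORD data is written as eight hexadecimal digits.
        '"{}"=dword:{:08x}'.format(value_name, data),
        "",
    ]
    return "\r\n".join(lines)

if __name__ == "__main__":
    print(reg_dword_file(KEY, VALUE_NAME, 1))
```

The printed text can be saved to a file with a .reg extension and reviewed before it is imported.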
- Check for any errors in the DPM logs:
To collect verbose DPM (Data Protection Manager) logs, which contain much more detail for troubleshooting, follow the steps below. Make sure that no jobs are active or scheduled to run while you perform them. To enable full verbose logging, add the following on both the primary and secondary DPM server:
Go to the registry: HKLM\Software\Microsoft\Microsoft Data Protection Manager
Add a DWORD value named TraceLogLevel and set it to 0x43e
>Stop the DPM services that you want to enable verbose logging for (DPM AccessManager Service, MSDPM service, etc.).
>Delete all the old logs.
The logs for a DPM 2010 server (including one that was upgraded to DPM 2012) are in the C:\Program Files\Microsoft DPM\DPM\temp folder.
The logs for a DPM 2012 server are in the C:\Program Files\System Center 2012\DPM\DPM\temp folder.
The logs for a protected server are in the C:\Program Files\Microsoft Data Protection Manager\DPM\temp folder.
>Reproduce the issue.
Note: After you reproduce the issue, be sure to delete or rename the TraceLogLevel registry value and restart the DPM services, so that non-verbose logging takes place again.
- Check for any DPM throttling on the agents (in the DPM console: Management > select the agent > Action > Throttling).
- Modify the protection group or recreate it, and then run a consistency check.
- Exclude the DPM installation folder from the real-time scan of the antivirus software on the DPM server and the protected servers.
- Evaluate how much data needs to be copied and which data is critical. You can separate the data into two categories: critical and non-critical.
For the critical data, you can configure a DPM incremental backup every day. For the non-critical data, you can configure an incremental backup every 2 days.
- You can also capture a Performance Monitor log while the problem is reproduced, to clarify whether there are any performance issues on the DPM server and the protected computer.
Add "All counters" and "All instances" for each of the following:
• Memory
• Network Interface
• Objects
• Paging File
• Physical Disk
• Process
• Processor
• Redirector
• Server
• Server Work Queues
• System
Reproduce the problem, then click the Stop button to stop Performance Monitor, and collect and analyze the data.
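If you prefer to script the counter selection instead of clicking through the list, the sketch below builds wildcard counter paths for the objects above. It makes two assumptions: that PhysicalDisk is the performance object behind "Physical Disk", and that the objects in MULTI_INSTANCE are the ones that expose instances on your system. Verify both on the server before relying on the output.

```python
# Objects from the list above, using performance object names.
OBJECTS = [
    "Memory", "Network Interface", "Objects", "Paging File",
    "PhysicalDisk", "Process", "Processor", "Redirector",
    "Server", "Server Work Queues", "System",
]

# Assumed to expose multiple instances, so they take the "(*)" wildcard.
MULTI_INSTANCE = {
    "Network Interface", "Paging File", "PhysicalDisk",
    "Process", "Processor", "Server Work Queues",
}

def counter_paths(objects=OBJECTS):
    """Return one 'all counters, all instances' path per object."""
    paths = []
    for obj in objects:
        instance = "(*)" if obj in MULTI_INSTANCE else ""
        paths.append("\\{}{}\\*".format(obj, instance))
    return paths

if __name__ == "__main__":
    # One path per line, suitable for saving to a counter list file.
    print("\n".join(counter_paths()))
```

The resulting lines can be saved to a text file and passed to a command-line collector such as typeperf or logman through its -cf option.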
- Capture a Network Monitor trace when the problem is reproduced:
• Download Network Monitor https://www.microsoft.com/en-us/download/details.aspx?id=4865
• Install and run the tool on the DPM server and the protected server.
• Trigger a backup job.
• After the job completes, stop, save and analyze the captures.