DPM and deduplication on slower servers

If you are using Data Protection Manager with deduplication, you might have faced the infamous “disk missing” problem!
(Mark Reynolds describes the problem and its solution on his blog: https://blogs.technet.microsoft.com/dpm/2011/12/13/troubleshooting-dpm-volumes-are-flagged-as-missing-and-volsnap-logs-event-id-86/)
This error often occurs because the dedup process is very hungry for IO, and if the underlying storage can’t keep up, the virtual DPM server will often time out on disk access. This is especially true if dedup jobs run while backups are running, and even more so if your virtual DPM server and your storage run on the same physical host (which is now supported by Microsoft).

In this situation, most experts will recommend increasing the disk resources, which generally goes against what most companies want: cheaper storage for their backups.
In this case, there is another solution: run the deduplication job manually on a volume when it is running low on disk space. This allows more granular control over the dedup process and ensures that there are always enough resources on the host for the DPM server.

“But, Stephane, how on earth do I do that?” I hear some of you ask already… glad you asked:
You can use the “Start-DedupJob” PowerShell command:

Start-DedupJob -Type Optimization -Volume x: -Memory 20 -Priority Normal -InputOutputThrottleLevel high

Although this command comes with many more interesting features, I’ll focus on the above options for this specific post (you can find a full description on TechNet: https://technet.microsoft.com/en-us/library/hh848442.aspx ):

  • Type Optimization – Instructs the system to start an optimization job, which is the job that actually deduplicates the data on the volume
  • Volume x: – is the volume that is running out of space and you want to dedup
  • Memory 20 (the default value is 50%) – is the percentage of memory you are allowing the system to use for the job. Be careful not to allocate too much, especially if you are running your virtual DPM server on the same host. That being said, the more memory allocated to the job, the faster it will run.
  • Priority Normal – The available values here are “low, normal, high”. This determines the priority of the process (CPU and IO) if the system starts to struggle for resources. Set this to low if you need to run a dedup job while a backup is running.
  • InputOutputThrottleLevel high – The available values here are “none, low, medium, high, maximum”. With this option you can throttle the amount of disk IO used by the process. This is probably the most interesting setting, because with it you can make sure your virtual DPM server does not get starved of IO and therefore does not mark a disk as “missing”.

You can now run the above while your backup is not running, or run it with appropriate values to ensure DPM does not lack the resources it needs when the backup kicks in.
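
If you would rather not watch the free space yourself, the “run it when the volume is running low” idea can be scripted. The following is only a minimal sketch, not a tested production script: the drive letter X, the 15% threshold and the choice of low priority are my own illustrative assumptions, and you would still need to schedule it (for example with Task Scheduler) outside your backup window.

```powershell
# Sketch only: the drive letter and the 15% threshold are illustrative assumptions.
$driveLetter  = 'X'    # hypothetical volume holding the DPM backup data
$minFreeRatio = 0.15   # start a dedup job when less than 15% of the volume is free

# Get-Volume reports Size and SizeRemaining in bytes.
$volume    = Get-Volume -DriveLetter $driveLetter
$freeRatio = $volume.SizeRemaining / $volume.Size

if ($freeRatio -lt $minFreeRatio) {
    # Do nothing if a dedup job is already queued or running on this volume.
    $existing = Get-DedupJob -Volume "${driveLetter}:" -ErrorAction SilentlyContinue
    if (-not $existing) {
        # Same options as described above, with low priority as a cautious default.
        Start-DedupJob -Type Optimization -Volume "${driveLetter}:" `
            -Memory 20 -Priority Low -InputOutputThrottleLevel High
    }
}
```

Adjust the threshold, memory and throttle values to what your host can afford. The numbers here are placeholders, not recommendations.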

To see the job progress:

Get-DedupJob
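
If you want to keep an eye on the job without retyping the command, a small polling loop can help. This is a sketch under a few assumptions: the 60 second interval is arbitrary, x: is the same example volume as above, and it relies on Get-DedupJob returning nothing once no job is left on the volume.

```powershell
# Sketch only: poll every 60 seconds (arbitrary interval) until no job remains on x:.
while ($jobs = Get-DedupJob -Volume 'x:' -ErrorAction SilentlyContinue) {
    # Show the type, state and progress (in percent) of each job on the volume.
    $jobs | Format-Table Type, State, Progress, StartTime -AutoSize | Out-Host
    Start-Sleep -Seconds 60
}

# Once the job is gone, Get-DedupStatus shows how much space was saved.
Get-DedupStatus -Volume 'x:' |
    Format-List Volume, FreeSpace, SavedSpace, OptimizedFilesCount | Out-Host
```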

If, at any time, you want to stop the job:

Stop-DedupJob -Volume x: 
