Setting up the lab environment – Deduplication

The next step for the lab or so-called home data center: Installing and Configuring Deduplication

I was going to use a USB stick for the Windows Server 2016 OS.
The main reason for this: DEDUPLICATION.

I did start out with a USB stick, but performance issues made me change this; read the follow-up post (https://blog.thomasmarcussen.com/follow-up-on-the-home-datacenter-hardware/)

The reason for having the OS on a separate volume: Deduplication is not supported on system or boot volumes. Read more about Deduplication here: About Data Deduplication

Let’s get started

Installing and Configuring Deduplication

  1. Open an elevated PowerShell prompt
  2. Execute: Import-Module ServerManager
  3. Execute: Add-WindowsFeature -Name FS-Data-Deduplication
  4. Execute: Import-Module Deduplication

Installing Deduplication
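
The four steps above can also be run as one short script. This is a minimal sketch, assuming Windows Server 2016 and an elevated session. It uses Install-WindowsFeature, the cmdlet that Add-WindowsFeature is an alias for on current Server versions, and adds a check that the feature really is installed (the check is my addition, not part of the original steps).

```powershell
# Install the Data Deduplication role service (requires an elevated prompt).
Install-WindowsFeature -Name FS-Data-Deduplication

# Confirm the feature is installed before continuing.
$feature = Get-WindowsFeature -Name FS-Data-Deduplication
if (-not $feature.Installed) {
    throw "FS-Data-Deduplication is not installed."
}

# Load the deduplication cmdlets into the current session.
Import-Module Deduplication
```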

Data Deduplication is now installed and ready for configuration.

My RAID 0 volume is D:
The volume will primarily hold Virtual Machines (Hyper-V)
I’m going to execute the following command: Enable-DedupVolume D: -UsageType HyperV

Enable Deduplication for volume

You can read more about the different usage types here: Understanding Data Deduplication

Some quick info for the usage type Hyper-V:

  • Background optimization
  • Default optimization policy:
    • Minimum file age = 3 days
    • Optimize in-use files = Yes
    • Optimize partial files = Yes
  • “Under-the-hood” tweaks for Hyper-V interop
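
To check that these defaults were actually applied, the volume settings can be listed with Get-DedupVolume. A minimal sketch, assuming the volume is D: as in this post. The commented-out Set-DedupVolume line is an optional lab-only tweak of my own (not something the usage type requires) that makes files eligible for optimization regardless of their age.

```powershell
# Show the policy applied to D: after enabling the HyperV usage type.
Get-DedupVolume -Volume "D:" |
    Format-List Volume, Enabled, UsageType, MinimumFileAgeDays, OptimizeInUseFiles, OptimizePartialFiles

# Optional, lab-only assumption: optimize files regardless of age.
# Set-DedupVolume -Volume "D:" -MinimumFileAgeDays 0
```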

You can start the optimization job and, if needed, limit the amount of memory the process consumes (the -Memory value is a percentage of physical memory): Start-DedupJob -Volume "D:" -Type Optimization -Memory 50
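
Once the job is queued, its progress can be followed with Get-DedupJob. A minimal sketch, again assuming volume D: and a 50 percent memory cap:

```powershell
# Start an optimization job on D:, capped at 50% of physical memory.
Start-DedupJob -Volume "D:" -Type Optimization -Memory 50

# List queued and running deduplication jobs for D:, including progress.
Get-DedupJob -Volume "D:"
```

Adding -Wait to Start-DedupJob should make the command block until the job finishes, which can be handy in scripts.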

You can get the deduplication status with the command: Get-DedupStatus
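
The raw status values are reported in bytes. A minimal sketch, assuming volume D:, that shows the saved and free space in GB together with the file counts (the rounded column names are my own, chosen for illustration):

```powershell
# Report saved and free space for D: in GB, plus file counts and last run time.
Get-DedupStatus -Volume "D:" |
    Select-Object Volume,
        @{ Name = 'SavedSpaceGB'; Expression = { [math]::Round($_.SavedSpace / 1GB, 2) } },
        @{ Name = 'FreeSpaceGB';  Expression = { [math]::Round($_.FreeSpace / 1GB, 2) } },
        OptimizedFilesCount, InPolicyFilesCount, LastOptimizationTime
```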

The currently saved space on my volume is 46.17 GB.
That is for two ISO files, a reference machine for Windows Server 2016, and the reference disks copied to a separate folder.

More useful PowerShell cmdlets here: Deduplication Cmdlets in Windows PowerShell

I do love deduplication, especially for virtual machines, since most of the base data is the same across them.
The disks are also rather expensive, so getting the most out of them is preferred.

Feel free to comment