September 16, 2024

#StorageMinute – Intelligent Storage Operations in vSAN 6.7 U3

Welcome to the #StorageMinute series where we deliver key topics via bite-size content.

In this edition, we look at how vSAN 6.7 Update 3 can automatically manage storage consumption when temporary operations would otherwise cause less than optimal performance or capacity distribution. Operations such as policy changes, creating or deleting large numbers of objects, or reprotecting data after a node or device failure can all result in a less than optimal configuration.

Storage Policy Changes

Changing the storage policy assigned to a vSAN object can often result in a full copy of the object being created with the new policy. Once the new copy is created, the old one is discarded. Changing the policy assigned to a single vSAN object, or a limited few, typically does not require much transient capacity. Changing the policy assigned to tens, hundreds, or thousands of vSAN objects, however, can require a significant amount of transient capacity. Consider changing 100 vSAN objects from Mirroring to Erasure Coding with the goal of reducing the total capacity consumed. During the policy change process, all 100 vSAN objects will have an erasure coded copy created alongside the existing mirrored copy. Rather than changing a given policy in place, general guidance has been to create a new policy and assign it to a few vSAN objects, such as a VM, at a time to reduce the storage consumption during the policy changeover.
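To put rough numbers on the 100-object example, here is a back-of-the-envelope sketch. It assumes each object holds 100 GB of data (an illustrative figure, not from the post), that the mirrored layout is RAID-1 with one failure to tolerate (2x raw capacity), and that the erasure coded layout is RAID-5 (1.33x raw capacity). The function name is hypothetical.

```python
# Raw-capacity multipliers for layouts that tolerate one failure.
RAID1_MULTIPLIER = 2.0        # two full mirror copies
RAID5_MULTIPLIER = 4.0 / 3.0  # 3 data + 1 parity components

def transient_capacity_gb(object_count, data_gb_per_object, new_multiplier):
    """Raw capacity needed for the new copies while the old copies still exist."""
    return object_count * data_gb_per_object * new_multiplier

objects = 100
data_gb = 100  # assumed size per object, for illustration only

before_gb = objects * data_gb * RAID1_MULTIPLIER
transient_gb = transient_capacity_gb(objects, data_gb, RAID5_MULTIPLIER)
after_gb = transient_gb  # what remains once the mirrored copies are discarded

print(f"Consumed before change: {before_gb:,.0f} GB")
print(f"Extra capacity needed during change: {transient_gb:,.0f} GB")
print(f"Peak consumption: {before_gb + transient_gb:,.0f} GB")
print(f"Consumed after change: {after_gb:,.0f} GB")
```

Under these assumptions, changing every object at once needs roughly 13,333 GB of free capacity up front in order to end up consuming about 6,667 GB less.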

vSAN 6.7 Update 3 changes the default behavior of storage policy changes to operate in batches. Depending on the amount of free capacity, vSAN will only apply policy changes to a group of vSAN objects at a time, reducing the amount of temporary storage required for policy changes.
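The batching idea can be sketched as a simple greedy grouping. This is not vSAN's actual algorithm, which the post does not describe. The function name, the object names, the per-object transient sizes, and the free-capacity budget are all hypothetical, and the sketch assumes the capacity used by one batch is available again before the next batch starts.

```python
def plan_batches(transient_gb_per_object, free_capacity_gb):
    """Group objects so each batch's transient copies fit in the free capacity.

    Assumes free capacity returns to at least its starting level once a
    batch finishes and the old copies are discarded.
    """
    batches, current, used = [], [], 0.0
    for name, need in transient_gb_per_object.items():
        if need > free_capacity_gb:
            raise ValueError(f"{name} needs {need} GB, more than is free")
        if used + need > free_capacity_gb:
            # Current batch is full; start a new one.
            batches.append(current)
            current, used = [], 0.0
        current.append(name)
        used += need
    if current:
        batches.append(current)
    return batches

# Hypothetical objects and transient capacity needs in GB.
needs = {"vm-01": 400, "vm-02": 300, "vm-03": 500, "vm-04": 200, "vm-05": 600}
for i, batch in enumerate(plan_batches(needs, free_capacity_gb=1000), start=1):
    print(f"Batch {i}: {batch}")
```

With 1,000 GB free, the five objects above are processed in three groups instead of requiring 2,000 GB of transient capacity all at once.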

Unbalanced Cluster

Several different scenarios could result in a cluster that doesn’t have a uniform, or balanced, distribution of data. Some of these could include:

  • Nodes are added to an existing vSAN Cluster
  • Capacity devices are added to a disk group
  • Disk groups are added to an existing cluster
  • A large number of vSAN objects are created or deleted

The vSAN Health Check warns when a cluster isn’t balanced and provides the ability to invoke a process to balance data across the cluster. Proactive rebalancing was introduced to give administrators a way to address an unbalanced vSAN cluster before it becomes a problem. Both approaches required input from an administrator.

vSAN 6.7 Update 3 introduced an option to automatically balance data. The default device free percentage (30%) can easily be adjusted to give administrators even more flexibility to address unbalanced cluster scenarios.
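As a rough illustration of a threshold-based balance check, the sketch below flags a cluster when the gap in used capacity between its fullest and emptiest devices exceeds a configurable percentage. The exact metric vSAN evaluates is not spelled out in this post, so the comparison, the function name, and the device figures are assumptions. Only the 30% default comes from the text above.

```python
def needs_rebalance(used_pct_by_device, threshold_pct=30.0):
    """Return True when the fullest and emptiest devices differ by more than the threshold."""
    spread = max(used_pct_by_device.values()) - min(used_pct_by_device.values())
    return spread > threshold_pct

# Hypothetical utilization after an empty node joins a cluster that is about 70% full.
cluster = {
    "host1-disk1": 72.0,
    "host2-disk1": 68.0,
    "host3-disk1": 70.0,
    "host4-disk1": 5.0,
}
print(needs_rebalance(cluster))                      # default 30% threshold
print(needs_rebalance(cluster, threshold_pct=70.0))  # a more tolerant setting
```

Raising the threshold makes the check more tolerant of uneven devices; lowering it triggers rebalancing sooner at the cost of more data movement.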

Transient Capacity Use During Resyncs

When bringing vSAN objects back to a policy-compliant state, resyncs write new data to disk groups and their devices. What happens, though, when a resync operation consumes so much capacity on a device or disk group that it prevents already compliant workloads from growing beyond their consumed capacity toward their provisioned capacity? Administrators could use a storage policy that has a space reservation to ensure available capacity, but this approach may not be the most efficient use of capacity overall.

In vSAN 6.7 Update 3, resync jobs are now paused when available capacity falls below a given threshold, which depends on different factors at the individual device, disk group, and overall cluster levels. vSAN is more aggressive about pausing resyncs than about allowing them to resume, and it uses different thresholds for each. This results in resync operations that oscillate less, giving a better overall experience. This behavior, coupled with the previously released Adaptive Resync and the Parallel Resync feature in vSAN 6.7 U3, helps ensure resync operations are completed more quickly and efficiently.
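Pausing at one threshold and resuming at a different, higher one is a form of hysteresis, which is what keeps the behavior from flapping. The sketch below shows the general pattern only. The class name and the 20% and 25% thresholds are hypothetical, and vSAN's real thresholds depend on device, disk group, and cluster factors that are not modeled here.

```python
class ResyncGate:
    """Pause below one free-capacity threshold, resume only above a higher one."""

    def __init__(self, pause_below_pct, resume_above_pct):
        if resume_above_pct <= pause_below_pct:
            raise ValueError("resume threshold must be higher than pause threshold")
        self.pause_below_pct = pause_below_pct
        self.resume_above_pct = resume_above_pct
        self.paused = False

    def update(self, free_pct):
        """Feed the latest free-capacity reading; return True if resync may run."""
        if not self.paused and free_pct < self.pause_below_pct:
            self.paused = True
        elif self.paused and free_pct > self.resume_above_pct:
            self.paused = False
        return not self.paused

# Hypothetical thresholds and free-capacity readings (percent).
gate = ResyncGate(pause_below_pct=20.0, resume_above_pct=25.0)
readings = [30.0, 19.0, 21.0, 24.0, 26.0, 22.0]
states = [gate.update(r) for r in readings]
print(states)
```

Note that the readings of 21% and 24% do not restart the resync even though they are above the pause threshold. With a single threshold, small fluctuations around 20% would toggle the resync on and off repeatedly.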

More information

These are only a few topics that pertain to some of the Intelligent Operations in vSAN 6.7 Update 3. More information can be found in the What’s New in vSAN 6.7 Update 3 blog post and the vSAN 6.7 Update 3 Technical Overview.

#StorageMinute

@jasemccarty

This was originally posted on the VMware Virtual Blocks site: https://blogs.vmware.com/virtualblocks/2019/09/17/storageminute-storage-operations/