SAN Performance Optimizations on ESX

Disable Storage I/O control (SIOC)

VMware SIOC provides I/O prioritization for virtual machines that have access to a shared datastore cluster. On iSCSI, it is recommended that this feature be disabled within ESXi servers.

To check or disable SIOC

  1. In the ESXi node configuration, click Storage
  2. For each EQL datastore, open Properties
  3. If checked, uncheck the box under Storage I/O Control that says “Enabled”
  4. Click Close

How to change the default iSCSI timeout values

When using Round Robin or MEM (Multipath Extension Module) for multipathing, Dell recommends changing the iSCSI login timeout from its default value of 5 seconds to 60 seconds.

To do this:

  1. In the ESXi node configuration, click Storage Adapters
  2. Open the properties of your iSCSI Software Adapter
  3. Click Advanced
  4. Scroll down to the LoginTimeout section
  5. Change this value to 60 and click OK.
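
The same change can also be scripted from the ESXi shell. The following is a minimal sketch, assuming the esxcli iscsi adapter param namespace is available on your ESXi build. The function name set_login_timeout is a hypothetical helper and vmhba64 is a placeholder adapter name; list the real adapter names with "esxcli iscsi adapter list" first.

```shell
# Hypothetical helper: set the iSCSI LoginTimeout on one software adapter.
# $1 = adapter name (placeholder, e.g. vmhba64)
# $2 = timeout in seconds (defaults to 60)
# With DRY_RUN set to a non-empty value, the command is printed, not executed.
set_login_timeout() {
    adapter="$1"
    seconds="${2:-60}"
    ${DRY_RUN:+echo} esxcli iscsi adapter param set \
        --adapter="$adapter" --key=LoginTimeout --value="$seconds"
}

# Example (prints the command only):
#   DRY_RUN=1 set_login_timeout vmhba64
```

Running it with DRY_RUN=1 first lets you review the exact command before applying it to a production host.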

How to Disable DelayedAck

DelayedAck is a TCP/IP method intended to reduce I/O overhead, which is typically enabled by default with ESXi. However, leaving it enabled can increase latency between the ESXi host and the EqualLogic SAN. Dell recommends disabling this.

To disable DelayedAck globally in ESXi

  1. Place the ESXi node in Maintenance mode
  2. In the ESXi node configuration, click Storage Adapters
  3. Open the properties of your iSCSI Software Adapter
  4. Click Advanced
  5. Uncheck the “DelayedAck” box and click OK
  6. Restart the server

Next, check to ensure DelayedAck is disabled for existing connections:

  1. Log in to the ESXi CLI using SSH or another method
  2. Run this command:  vmkiscsid --dump-db | grep Delayed
  3. A value of “1” indicates DelayedAck is enabled, “0” is disabled
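
On a host with many connections, counting the enabled entries is quicker than reading them one by one. The following is a minimal sketch; count_delayed_ack_enabled is a hypothetical helper, and it assumes each DelayedAck line in the "vmkiscsid --dump-db" output ends in ='1' or ='0'. Check the raw output on your own host before relying on it.

```shell
# Hypothetical helper: count DelayedAck entries that are still enabled.
# Reads the output of "vmkiscsid --dump-db" on stdin.
# Assumption: each entry ends in ='1' (enabled) or ='0' (disabled).
count_delayed_ack_enabled() {
    grep 'DelayedAck' | grep -c "='1'"
}

# Example on an ESXi host:
#   vmkiscsid --dump-db | count_delayed_ack_enabled
```

A result of 0 means no connection still has DelayedAck enabled.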

If necessary, now disable DelayedAck on each existing connection as follows:

  1. In the ESXi node configuration, click Storage Adapters
  2. Open the properties of your iSCSI Software Adapter
  3. Click the Static Discovery tab and remove all entries
  4. Click the Dynamic Discovery tab and remove all EQL storage entries
  5. Reboot the node
  6. Return to Configuration -> Storage Adapters -> iSCSI Adapter -> Properties
  7. Select Dynamic Discovery, modify the EQL storage entry and click Settings
  8. Click Advanced and scroll down to Delayed ACK
  9. Uncheck Inherit from parent
  10. Uncheck DelayedAck and click OK
  11. Do this for each discovery address that needs to be modified
  12. Rescan your Storage adapters
  13. Verify that DelayedAck is now disabled using the procedure above
  14. Remove the node from Maintenance mode and conduct testing

Perform the above procedures for each ESXi node in your cluster.

How to disable LRO (Large Receive Offload)

LRO is also intended to reduce traffic overhead and is also typically enabled by default in ESXi. However, this comes at the expense of increased latency and reduced performance of your PS Series EqualLogic SAN. Dell recommends disabling LRO, especially when using Linux guests.

To disable LRO using ESXi shell

  1. Place the ESXi host in Maintenance mode
  2. Log in to the ESXi CLI using SSH or another method
  3. Set the LRO value to zero using this command:  esxcfg-advcfg -s 0 /Net/TcpipDefLROEnabled
  4. Restart the server
  5. Remove the ESXi host from Maintenance mode and test operations
  6. Repeat this process for all ESXi hosts in the cluster
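
To confirm the setting before and after the change, the current value can be read back with the -g option of esxcfg-advcfg. The following is a minimal sketch; lro_value is a hypothetical helper, and the output format it parses ("Value of TcpipDefLROEnabled is N") is an assumption you should confirm on your own host.

```shell
# Hypothetical helper: extract the current LRO setting.
# Reads "esxcfg-advcfg -g /Net/TcpipDefLROEnabled" output on stdin.
# Assumption: the output looks like "Value of TcpipDefLROEnabled is 1".
lro_value() {
    awk '/TcpipDefLROEnabled/ { print $NF }'
}

# Example on an ESXi host (0 means LRO is disabled):
#   esxcfg-advcfg -g /Net/TcpipDefLROEnabled | lro_value
```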

Transfer latency causes storage performance issues. 

This typically shows up when your VMs are spread across multiple arrays that communicate over multiple switches. If you observe a great deal of write latency on a disk in the VMware ESX performance monitor, verify the flow control settings.

Flow control is a feature on the switch side. 

Flow control regulates the rate of data transfer between two devices, preventing a sender from overwhelming a target device by sending more packets than the destination can handle. It is a feature that may be turned off by default. Without it, packets are dropped during times of contention and need to be retransmitted, causing severe performance issues across the wider estate. It is usually turned off on switches to guard against DDoS attacks and similar problems originating from a particular VM.

To enable flow control 

 

By C A Thomas

Chinchu A. Thomas is an Infrastructure Analyst specializing in Microsoft Azure, the Microsoft 365 suite, AWS, and Windows infrastructure management products.