Diagnosing Network Issues on ESXi with esxtop and Log Files

In this post, we walk through how to diagnose and fix network issues on ESXi hosts, with a focus on CRC errors and packet drops. We’ll use esxtop, esxcli, and the host log files to get to the root of the problem, with practical steps you can follow today.

Using esxtop to Identify Network Issues

Let’s start with what most seasoned admins reach for first: esxtop.

  1. SSH into the ESXi host
  2. Launch esxtop: esxtop
  3. Press n to jump into network view
Column | What It Shows | Expected Values | How to Interpret
PKTTX/s | Packets transmitted per second | 50k–300k normal, >500k high | High PKTTX/s with small PSZTX points to chatty workloads, possible overhead, or microburst behaviour.
MbTX/s | Megabits transmitted per second | 0–8 000 Mb/s normal, >9 000 Mb/s saturated | Shows how close you are to the 10G ceiling.
PSZTX | Average TX packet size (bytes) | Below 500 bytes counts as small | Very small packets often point to fragmentation, chatty apps, or overhead-heavy traffic.
PKTRX/s | Packets received per second | 50k–300k normal, >500k high | Same logic as TX: high PPS stresses queues.
MbRX/s | Megabits received per second | 0–8 000 Mb/s normal, >9 000 Mb/s saturated | Critical for spotting inbound congestion.
PSZRX | Average RX packet size (bytes) | 1500 or 9000 bytes expected | Low values hint at MTU mismatch or small‑packet workloads.
%DRPTX | Percentage of outbound packets dropped | Should be zero | Even a few drops on 10G indicate queue pressure or NIC issues.
%DRPRX | Percentage of inbound packets dropped | Should be zero | Drops usually point to physical NIC saturation, teaming misconfig, or upstream switch issues.

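Press f in esxtop to open the field selector and enable the fields below. Field letters can differ between ESXi releases, so go by the field name shown on screen: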

  • f: TEAM-PNIC
  • j: USED-BY (who’s using the interface?)
  • k: PORT-ID
  • n: VSWITCH
  • o: VLAN-ID

You’ll now be able to correlate exactly which VM or kernel port is responsible for traffic spikes or anomalies.
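If you want a record of what you saw on screen, esxtop also has a batch mode (-b) that writes samples as CSV. The wrapper below is a minimal sketch: the function name capture_esxtop and its default values are illustrative assumptions, not something from the esxtop tooling itself. Keep in mind that batch mode typically exports more than just the network view, so the file can grow quickly.

```shell
# Hypothetical helper: capture esxtop samples in batch mode for later analysis.
#   $1 = output CSV path
#   $2 = seconds between samples (default 5, an arbitrary choice)
#   $3 = number of samples       (default 12, an arbitrary choice)
capture_esxtop() {
    esxtop -b -d "${2:-5}" -n "${3:-12}" > "$1"
}

# Example (run on the ESXi host): 30 samples, 2 seconds apart
# capture_esxtop /tmp/esxtop-net.csv 2 30
```

Capturing while the problem is happening gives you something to compare against a quiet period later.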


Detecting Saturation or Imbalance

On a 10 Gbps NIC, the practical ceiling is roughly 9.4 Gbps once Ethernet overhead is accounted for.

Saturation:
In esxtop, watch MbTX/s and MbRX/s. On 10G, anything above 8 000 Mb/s is high, and above 9 000 Mb/s the link is effectively saturated.

Why this matters

10G NICs have deeper buffers, but once you push into the 80–95% range, even short bursts can overwhelm queues, causing:

  • Latency spikes
  • Packet drops (%DRPTX/%DRPRX)
  • Retransmissions
  • VM‑level performance degradation
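The thresholds above can be turned into a quick helper for classifying a reading from MbTX/s or MbRX/s. This is only a sketch: the function name classify_mbps is made up, and applying the 80%/90% cut-offs to link speeds other than 10G is an assumption extrapolated from the 8 000 and 9 000 Mb/s figures.

```shell
# Hypothetical helper: classify a throughput reading against the link speed.
#   $1 = observed Mb/s (from MbTX/s or MbRX/s)
#   $2 = link speed in Mb/s (default 10000 for a 10G NIC)
# Assumption: above 80% of line rate is "high", above 90% is "saturated".
classify_mbps() {
    awk -v mbps="$1" -v link="${2:-10000}" 'BEGIN {
        pct = mbps * 100 / link
        if (pct > 90)      state = "saturated"
        else if (pct > 80) state = "high"
        else               state = "normal"
        printf "%s %.1f%%\n", state, pct
    }'
}

# Example: classify_mbps 8500      (10G link assumed)
# Example: classify_mbps 700 1000  (1G link)
```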

Imbalance: Is One Uplink Doing All the Work?

If you’re using NIC teaming (vSwitch or VDS), imbalance is a classic silent killer.

In esxtop, check the TEAM‑PNIC column:

  • Each VMkernel or portgroup flow is mapped to a physical NIC
  • If one uplink shows heavy MbTX/MbRX while others sit near idle, you’ve got a distribution issue
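To put a number on the imbalance, you can note the Mb/s per uplink and work out each uplink's share of the total. A rough sketch, assuming you feed it "name Mb/s" pairs you collected yourself; both the input format and the function name uplink_share are illustrative, not esxtop output.

```shell
# Hypothetical helper: print each uplink's share of total traffic.
# stdin: one "<uplink> <Mb/s>" pair per line (hand-collected, assumed format)
uplink_share() {
    awk '
        NF >= 2 { name[++n] = $1; mbps[n] = $2; total += $2 }
        END {
            if (total == 0) { print "no traffic"; exit }
            for (i = 1; i <= n; i++)
                printf "%s %.0f%%\n", name[i], mbps[i] * 100 / total
        }'
}

# Example:
# printf 'vmnic0 9000\nvmnic1 1000\n' | uplink_share
```

An uplink carrying the vast majority of traffic in a two-uplink team is a good prompt to review the teaming policy.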

Common causes

  • Route based on originating port (default) → Static pinning
  • Uneven VM placement across portgroups
  • LACP hashing mismatch between VDS and physical switch
  • Single‑flow workloads (e.g., vMotion, NFS, replication)

How to fix it

  • On VDS, switch to Load‑Based Teaming (LBT, shown as “Route based on physical NIC load”), which moves flows to another uplink when one stays heavily utilised

Quick Reality Check for 10G Environments

  • If throughput is high but packet size (PSZTX/PSZRX) is small → microbursts likely
  • If throughput is moderate but drops appear → MTU mismatch or queue exhaustion
  • If one uplink is slammed while others nap → teaming policy mismatch

Deep Dive: NIC-Level Stats with esxcli

Sometimes the issue lies deeper than what esxtop reveals. Try:

esxcli network nic stats get -n vmnic7
watch esxcli network nic stats get -n vmnic7

Check for:

  • Transmit errors
  • Receive drops (RX drops > 0) → ESXi host is overloaded or NIC buffers are insufficient.
  • CRC or frame errors → check cables, SFPs, or switch ports.
  • High broadcast RX → noisy VLAN, misconfigured network, or ARP storms.
  • FIFO/missed errors → NIC saturation or hardware bottleneck.
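The stats output is long, so it helps to show only the error and drop counters that are above zero. A small sketch, assuming the usual "Counter name: value" layout of the stats output; the function name nic_problem_counters is hypothetical.

```shell
# Hypothetical filter: keep only non-zero error/drop counters.
# stdin: output of "esxcli network nic stats get -n <nic>"
# Assumption: each counter is printed as "<name>: <value>".
nic_problem_counters() {
    awk -F': *' '
        /errors|dropped/ && $2 + 0 > 0 {
            name = $1
            sub(/^[ \t]+/, "", name)   # strip leading indentation
            printf "%s: %s\n", name, $2
        }'
}

# Example (run on the ESXi host):
# esxcli network nic stats get -n vmnic7 | nic_problem_counters
```

No output means none of the matching counters has incremented, which is what you want to see.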

The host log files add context to these counters. Here’s where to look:

Log File | Path | What to Look For
vmkernel.log | /var/log/vmkernel.log | Packet drops, link state changes
vobd.log | /var/log/vobd.log | NIC up/down events
hostd.log | /var/log/hostd.log | Switch config changes, VM attach events
syslog.log | /var/log/syslog.log | General networking/system events

Use grep to pull highlights:

grep -i "duplicate" /var/log/vmkernel.log
grep -i "arp" /var/log/vmkernel.log

grep -i 'drop\|error\|vmnic\|link' /var/log/vmkernel.log
grep -i 'link.*up\|link.*down' /var/log/vobd.log
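A flapping link is easier to spot when the matches are counted per NIC. The sketch below assumes only that each relevant log line contains a vmnicN name and the words "link ... down"; exact log wording varies between releases and drivers, and the function name count_link_down is made up.

```shell
# Hypothetical helper: count link-down log lines per vmnic.
# stdin: log text, e.g. the contents of /var/log/vobd.log
# Assumption: a relevant line mentions "vmnicN" and matches "link.*down".
count_link_down() {
    awk '
        tolower($0) ~ /link.*down/ && match($0, /vmnic[0-9]+/) {
            count[substr($0, RSTART, RLENGTH)]++
        }
        END { for (nic in count) print nic, count[nic] }' | sort
}

# Example (run on the ESXi host):
# count_link_down < /var/log/vobd.log
```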

CRC errors and bad packets are usually physical or configuration related. Search for them with:
grep -i 'crc\|error\|link' /var/log/vmkernel.log

Typical next steps include:
  • Confirming errors on both the NIC and the switch port
  • Checking for duplex/speed mismatches
  • Replacing cables
  • Ruling out interference
  • Updating firmware
  • Checking logs for additional clues

Tail logs in real time while recreating the issue:

tail -f /var/log/vmkernel.log

Find the VM ID on ESXi to locate the VM causing issues:

esxcli vm process list
or 
vim-cmd vmsvc/getallvms | grep <vmid>
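The process list prints several lines per VM, so a compact "World ID plus name" view is handy when you are matching entries by hand. A sketch, assuming the usual indented "World ID:" and "Display Name:" lines in the output; the function name vm_world_ids is hypothetical.

```shell
# Hypothetical helper: print "<World ID> <Display Name>" for each running VM.
# stdin: output of "esxcli vm process list"
# Assumption: each VM block has "World ID: <n>" before "Display Name: <name>".
vm_world_ids() {
    awk '
        /^ *World ID: /     { wid = $NF }
        /^ *Display Name: / { sub(/^ *Display Name: /, ""); print wid, $0 }'
}

# Example (run on the ESXi host):
# esxcli vm process list | vm_world_ids
```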

Ring Buffers

A ring buffer is a section of memory that acts as a temporary holding area for packets when they arrive faster than the code that processes them can keep up.

Increase the ring buffers if you’re troubleshooting RX/TX queue overruns.

Check the preset maximums

Each type of network adapter has a preset maximum, which is determined by the device driver. The preset command below reveals it.

[root@vvf-esx03:~] esxcli network nic ring preset get -n vmnic0
Max RX: 4096
Max RX Mini: 2048
Max RX Jumbo: 0
Max TX: 4096

Check current ring buffer setting
The preset maximum is usually higher than the default setting allocated when ESXi is installed. The current command below reveals the active setting.

[root@vvf-esx03:~] esxcli network nic ring current get -n vmnic0
RX: 1024
RX Mini: 128
RX Jumbo: 0
TX: 2048

Increase both the RX and TX ring buffers (values cannot exceed the preset maximums)

esxcli network nic ring current set -n vmnic4 -r 4096 -t 4096

Increase only the RX ring buffer

esxcli network nic ring current set -n vmnic4 -r 4096
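To avoid typing a value above what the driver allows, you can build the set command from the preset output. This is a sketch that only prints the command for review instead of running it; the function name ring_max_cmd is hypothetical, and going straight to the maximum is an assumption that may not suit every host, since larger rings use more memory.

```shell
# Hypothetical helper: build a "ring current set" command from the preset maximums.
#   $1 = NIC name
#   stdin: output of "esxcli network nic ring preset get -n <nic>"
# Prints the command only; review it before running it yourself.
ring_max_cmd() {
    awk -v nic="$1" '
        /^ *Max RX:/ { rx = $NF }
        /^ *Max TX:/ { tx = $NF }
        END {
            if (rx > 0 && tx > 0)
                printf "esxcli network nic ring current set -n %s -r %s -t %s\n", nic, rx, tx
        }'
}

# Example (run on the ESXi host):
# esxcli network nic ring preset get -n vmnic0 | ring_max_cmd vmnic0
```

After applying a change, rerun the current get command to confirm the new values took effect.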

https://knowledge.broadcom.com/external/article/341594/troubleshooting-nic-errors-and-other-net.html


By Ash Thomas

Ash Thomas is a seasoned IT professional with extensive experience as a technical expert, complemented by a keen interest in blockchain technology.