Diagnosing Network Issues on ESXi with esxtop and Log Files

In this post, we walk through how to diagnose and fix network issues on ESXi hosts, focusing on CRC errors and packet drops. We’ll use esxtop, esxcli, and the host log files to get to the root of the problem, with practical steps you can follow today.

Using `esxtop` to Identify Network Issues

Let’s start with what most seasoned admins reach for first: esxtop.

SSH into the ESXi host.
Launch esxtop: esxtop
Press n to switch to the network view.

Column	What It Shows	What to Expect	How to Interpret
PKTTX/s	Packets transmitted per second	50k–300k normal, >500k high	High PKTTX/s with a small PSZTX points to chatty workloads, per-packet overhead, or microburst behaviour.
MbTX/s	Megabits transmitted per second	0–8 000 Mb/s normal, >9 000 Mb/s saturated	Shows how close you are to the 10G ceiling.
PSZTX	Average TX packet size (bytes)	Very small packets often point to chatty apps or overhead-heavy traffic	Small values (<500 bytes) often indicate fragmentation or chatty traffic.
PKTRX/s	Packets received per second	50k–300k normal, >500k high	Same logic as TX: a high packet rate stresses the receive queues.
MbRX/s	Megabits received per second	0–8 000 Mb/s normal, >9 000 Mb/s saturated	Critical for spotting inbound congestion.
PSZRX	Average RX packet size (bytes)	1500 or 9000 bytes expected	Low values hint at an MTU mismatch or small‑packet workloads.
%DRPTX	Percentage of outbound packets dropped	Should be zero.	Even a few drops on 10G indicate queue pressure or NIC issues.
%DRPRX	Percentage of inbound packets dropped	Should also be zero.	Drops usually point to physical NIC saturation, a teaming misconfiguration, or upstream switch issues.
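
The packet-size columns are tied to the other two: average size is simply throughput divided by packet rate. As a rough sketch (the function name avg_pkt_size is made up for illustration, and it assumes esxtop's Mb means 10^6 bits), you can sanity-check a pair of readings like this:

```shell
# avg_pkt_size MBPS PPS
# Prints the average packet size in bytes implied by a throughput (Mb/s)
# and a packet rate (packets/s). Hypothetical helper, not part of esxtop.
avg_pkt_size() {
    awk -v mbps="$1" -v pps="$2" 'BEGIN {
        if (pps <= 0) { print 0; exit }
        # Mb/s -> bits/s -> bytes/s, then divide by packets/s
        printf "%d\n", (mbps * 1000000 / 8) / pps
    }'
}
```

For example, avg_pkt_size 800 500000 works out to 200 bytes, which is the "high packet rate, small packets" pattern described in the table.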

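Press f in esxtop and enable the fields below. The letter assigned to each field can differ between ESXi releases, so match on the field name shown on the field-selection screen: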

f: TEAM-PNIC
j: USED-BY (who’s using the interface?)
k: PORT-ID
n: VSWITCH
o: VLAN-ID

You’ll now be able to correlate exactly which VM or kernel port is responsible for traffic spikes or anomalies.

Detecting Saturation or Imbalance

On a 10 Gbps NIC, the practical ceiling is roughly 9.4 Gbps once Ethernet overhead is accounted for.
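
Where does a figure like 9.4 Gbps come from? A back-of-the-envelope sketch, assuming untagged Ethernet frames, IPv4, and TCP with 12 bytes of options (timestamps); the function name tcp_goodput_mbps is invented for this example:

```shell
# tcp_goodput_mbps LINK_MBPS MTU [TCP_OPTION_BYTES]
# Estimates TCP payload throughput on a link, given per-frame overhead.
# Assumptions: untagged Ethernet, IPv4, 12 bytes of TCP options by default.
tcp_goodput_mbps() {
    awk -v link="$1" -v mtu="$2" -v opts="${3:-12}" 'BEGIN {
        wire = mtu + 38            # preamble/SFD 8 + Ethernet header 14 + FCS 4 + inter-frame gap 12
        payload = mtu - 40 - opts  # minus IPv4 header 20, TCP header 20, TCP options
        printf "%d\n", link * payload / wire
    }'
}
```

With these assumptions, tcp_goodput_mbps 10000 1500 gives 9414 Mb/s, and a 9000-byte MTU raises it to about 9900 Mb/s. Real numbers will vary with VLAN tags, encapsulation, and protocol mix.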

Saturation:
In esxtop, watch MbTX/s and MbRX/s. On 10G, anything above 8 000 Mb/s is high, and anything above 9 000 Mb/s means the link is effectively saturated.
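
If you are eyeballing many readings, the 10G thresholds quoted in this post can be wrapped in a tiny helper. This is only a sketch; classify_10g is a hypothetical name and the cut-offs are the rule-of-thumb values from this article, not anything esxtop reports itself:

```shell
# classify_10g MBPS
# Labels a MbTX/s or MbRX/s reading using this post's 10G rule of thumb.
classify_10g() {
    awk -v mbps="$1" 'BEGIN {
        if (mbps > 9000)      print "saturated"
        else if (mbps > 8000) print "high"
        else                  print "normal"
    }'
}
```

For example, classify_10g 8500 prints "high".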

Why this matters

10G NICs have deeper buffers, but once you push into the 80–95% range, even short bursts can overwhelm queues, causing:

Latency spikes
Packet drops (DRPTX/DRPRX)
Retransmissions
VM‑level performance degradation

Imbalance: Is One Uplink Doing All the Work?

If you’re using NIC teaming (vSwitch or VDS), imbalance is a classic silent killer.

In esxtop, check the TEAM‑PNIC column:

Each VM or VMkernel port is shown against the physical NIC it is currently using
If one uplink shows heavy MbTX/MbRX while others sit near idle, you’ve got a distribution issue
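
To put a number on the imbalance, you can note down each uplink's throughput and compute its share of the total. The sketch below assumes a hand-made input format of "NAME MBPS" per line (this is not esxtop's own output format), and uplink_share is an illustrative name:

```shell
# uplink_share: reads "NAME MBPS" lines on stdin (a hand-made format, not
# esxtop output) and prints each uplink's share of the total as a percentage.
uplink_share() {
    awk 'NF >= 2 { n++; name[n] = $1; rate[n] = $2; total += $2 }
         END {
             for (i = 1; i <= n; i++)
                 printf "%s %d%%\n", name[i], (total > 0 ? rate[i] * 100 / total : 0)
         }'
}
```

Usage: printf 'vmnic0 7200\nvmnic1 800\n' | uplink_share shows vmnic0 carrying 90% of the traffic, a clear sign that the teaming policy is not spreading load.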

Common causes

Route based on originating port (default) → Static pinning
Uneven VM placement across portgroups
LACP hashing mismatch between VDS and physical switch
Single‑flow workloads (e.g., vMotion, NFS, replication)

How to fix it

On a VDS, Load‑Based Teaming (LBT, the "Route based on physical NIC load" policy) dynamically moves ports to a less busy uplink based on real utilisation

Quick Reality Check for 10G Environments

If throughput is high but packet size (PSZTX/PSZRX) is small → microbursts likely
If throughput is moderate but drops appear → MTU mismatch or queue exhaustion
If one uplink is slammed while others nap → teaming policy mismatch

Deep Dive: NIC-Level Stats with `esxcli`

Sometimes the issue lies deeper than what esxtop reveals. Try:

esxcli network nic stats get -n vmnic7
watch esxcli network nic stats get -n vmnic7

Check for:

Transmit errors
Receive drops (RX drops > 0) → the ESXi host is overloaded or the NIC buffers are insufficient.
CRC or framing errors → cabling or physical port issues; check cables, SFPs, and switch ports.
High broadcast RX → noisy VLAN, misconfigured network, or ARP storms.
FIFO/missed errors → NIC saturation or a hardware bottleneck.
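
The stats output is long, so it can help to show only the counters that are not zero. A minimal sketch, assuming the output is a list of indented "Label: value" lines; nonzero_errors is a made-up helper name:

```shell
# nonzero_errors: reads `esxcli network nic stats get` output on stdin and
# prints only the error/drop counters whose value is greater than zero.
# Assumes each counter is printed as an indented "Label: value" line.
nonzero_errors() {
    awk -F': *' 'tolower($1) ~ /error|dropped/ && $2 + 0 > 0 {
        sub(/^ +/, "", $1)          # strip the leading indentation
        print $1 ": " $2
    }'
}
```

Usage on the host would look like: esxcli network nic stats get -n vmnic7 | nonzero_errors. No output means every error and drop counter is still at zero.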

The host logs often confirm what the counters suggest. Here’s where to look:

Log File	Path	What to Look For
`vmkernel.log`	`/var/log/vmkernel.log`	Packet drops, link state changes
`vobd.log`	`/var/log/vobd.log`	NIC up/down events
`hostd.log`	`/var/log/hostd.log`	Switch config changes, VM attach events
`syslog.log`	`/var/log/syslog.log`	General networking/system events

Use grep to pull highlights:

grep -i "duplicate" /var/log/vmkernel.log
grep -i "arp" /var/log/vmkernel.log

grep -i 'drop\|error\|vmnic\|link' /var/log/vmkernel.log
grep -i 'link.*up\|link.*down' /var/log/vobd.log

For CRC errors and bad packets, which are usually physical or configuration related:
grep -i 'crc\|error\|link' /var/log/vmkernel.log
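
Before reading through pages of matches, a quick count per keyword tells you which pattern is worth chasing. This is a small sketch built only on grep -i -c; count_log_hits is a hypothetical helper, and the patterns you pass in are up to you:

```shell
# count_log_hits FILE PATTERN...
# Prints "PATTERN<TAB>COUNT" for each pattern, where COUNT is the number of
# lines in FILE that match it (case-insensitive).
count_log_hits() {
    file=$1
    shift
    for pattern in "$@"; do
        # grep -c prints the number of matching lines; -i ignores case
        printf '%s\t%s\n' "$pattern" "$(grep -i -c -e "$pattern" "$file")"
    done
}
```

Usage: count_log_hits /var/log/vmkernel.log 'crc' 'drop' 'link.*down'. Note that it counts matching lines, so a line containing the same keyword twice is counted once.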

Resolving CRC errors typically involves:

Confirming the errors on both the NIC and the switch port
Fixing duplex/speed mismatches
Replacing cables
Avoiding interference
Updating firmware
Checking logs for additional clues

Tail logs in real time while recreating the issue:

tail -f /var/log/vmkernel.log

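Find VM-ID on ESXi to locate the VM causing issues

List the running VMs with their World IDs:
esxcli vm process list
or list all registered VMs and filter by Vmid or name (replace <vmid> with the value you are looking for):
vim-cmd vmsvc/getallvms | grep <vmid>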
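
The esxcli output spreads each VM over several lines, which is awkward to scan on a busy host. As a sketch, assuming each VM block contains a "World ID:" line followed later by a "Display Name:" line, the following condenses it to one line per VM (vm_world_ids is an invented name):

```shell
# vm_world_ids: reads `esxcli vm process list` output on stdin and prints
# "WORLD_ID<TAB>DISPLAY_NAME" for each VM.
# Assumes "World ID:" appears before "Display Name:" within each VM block.
vm_world_ids() {
    awk '
        /^ *World ID:/     { sub(/^ *World ID: */, "");     world = $0 }
        /^ *Display Name:/ { sub(/^ *Display Name: */, ""); print world "\t" $0 }
    '
}
```

Usage: esxcli vm process list | vm_world_ids.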

Ring Buffers

A ring buffer is a section of memory that acts as a temporary holding area for packets when they arrive faster than the code that processes them can keep up.

Increase the ring buffers if you are troubleshooting RX/TX queue overruns.

Check the preset maximums

Each type of network adapter has a preset maximum, which is determined by the device driver. The preset command below reveals it.

[root@vvf-esx03:~] esxcli network nic ring preset get -n vmnic0
Max RX: 4096
Max RX Mini: 2048
Max RX Jumbo: 0
Max TX: 4096

Check the current ring buffer setting
The preset maximum is usually higher than the default setting allocated when ESXi is installed. The current command below reveals the active setting.

[root@vvf-esx03:~] esxcli network nic ring current get -n vmnic0
RX: 1024
RX Mini: 128
RX Jumbo: 0
TX: 2048

Increase both the RX and TX ring buffers (stay within the preset maximums reported for your NIC)

esxcli network nic ring current set -n vmnic4 -r 4096 -t 4096

Increase only the RX ring buffer

esxcli network nic ring current set -n vmnic4 -r 4096
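
Before changing anything, it is worth seeing how much headroom each ring actually has. The sketch below compares saved copies of the preset and current outputs; ring_headroom and the file names are illustrative, and it assumes the "Max RX: 4096" / "RX: 1024" line format shown earlier in this post:

```shell
# ring_headroom PRESET_FILE CURRENT_FILE
# Compares saved output of `ring preset get` and `ring current get` and
# prints one line per ring whose current size is below the driver maximum.
ring_headroom() {
    awk -F': *' '
        NR == FNR { sub(/^ *Max /, "", $1); max[$1] = $2; next }  # first file: preset maximums
        { sub(/^ */, "", $1) }                                    # second file: current sizes
        ($1 in max) && ($2 + 0 < max[$1] + 0) {
            print $1 ": " $2 " -> " max[$1]
        }
    ' "$1" "$2"
}
```

On the host you might run esxcli network nic ring preset get -n vmnic0 > /tmp/preset.txt and esxcli network nic ring current get -n vmnic0 > /tmp/current.txt, then ring_headroom /tmp/preset.txt /tmp/current.txt. With the values shown in this post it would list RX, RX Mini, and TX as having room to grow.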

https://knowledge.broadcom.com/external/article/341594/troubleshooting-nic-errors-and-other-net.html

By Ash Thomas

Chinchu A Thomas
