This post will show how to test Azure VM networking performance.
Azure offers a variety of VM sizes and types, each with a different mix of performance capabilities.
The network bandwidth allocated to each virtual machine is metered on egress (outbound) traffic from the virtual machine.
All network traffic leaving the virtual machine is counted toward the allocated limit, regardless of destination.
Ingress is not metered or limited directly.
For example, in this post we use a Standard D8s v3 virtual machine as the test VM.
This VM size has a target network egress throughput of 4 Gbps and supports up to 4 network interfaces.
Please note that multiple interfaces are meant to support different subnet designs, not for bonding or increasing network throughput.
To learn how many network interfaces different Azure VM sizes support, please check here.
We set up two D8s v3 VMs in the Azure East Asia region.
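For reference, a VM like this can be created with the Azure CLI; the sketch below uses placeholder resource group, VNET, image, and VM names rather than the actual values from this test.
az vm create --resource-group perf-test-rg --name perf-vm1 --location eastasia --size Standard_D8s_v3 --image UbuntuLTS --vnet-name perf-vnet --subnet default --admin-username azureuser --generate-ssh-keys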
We use iperf3 to test network throughput and qperf to test latency.
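If the tools are not already on the VMs, they can usually be installed from the distribution packages; on Ubuntu this looks roughly like the following (package names may vary by distribution).
sudo apt-get update
sudo apt-get install -y iperf3 qperf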
For the network throughput test:
The server side uses the default setup: iperf3 -s
The client side uses a single TCP thread and runs for 30 seconds: iperf3 -c 10.0.2.4 -t 30
For the network latency test:
The server side uses the default setup: qperf
The client side tests TCP latency: qperf -v 10.0.2.4 tcp_lat
From the result, we can see the D8s v3 VM egress throughput is 3.65 Gbps with a single TCP thread. The CWND value is 3.27 MB. Network latency is 146 us.
For all the details about accelerated networking, please check here.
We set up two D8s v3 VMs with accelerated networking in the Azure East Asia region.
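Accelerated networking can be selected at VM creation time, or enabled on an existing NIC while the VM is deallocated; a rough sketch with placeholder names:
az vm deallocate --resource-group perf-test-rg --name perf-vm1
az network nic update --resource-group perf-test-rg --name perf-vm1-nic --accelerated-networking true
az vm start --resource-group perf-test-rg --name perf-vm1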
For the network throughput test:
For the network latency test:
From the result, we can see the D8s v3 VM egress throughput is 3.82 Gbps with a single TCP thread. The CWND value is 1.36 MB. Network latency is 40 us.
Azure provides a free basic load balancer by default.
This load balancer requires that the backend VMs be in the same availability set.
In this setup, we create a basic load balancer and put 2 VMs with accelerated networking, in the same availability set, into the backend pool.
We set up another VM with accelerated networking and send traffic to the LB frontend IP address.
We also need to define a load balancing rule so that the test traffic can pass through the load balancer; we use TCP port 12000 for testing.
We will test the bandwidth and latency impact of adding the basic load balancer.
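A rough sketch of the rule and the adjusted test commands is below; the load balancer, frontend, and backend pool names and the frontend IP 10.0.2.100 are placeholders rather than the actual values from this test, and a health probe would normally be attached to the rule as well.
az network lb rule create --resource-group perf-test-rg --lb-name perf-basic-lb --name iperf-12000 --protocol Tcp --frontend-port 12000 --backend-port 12000 --frontend-ip-name perf-frontend --backend-pool-name perf-backend-pool
On each backend VM: iperf3 -s -p 12000
On the client VM, targeting the LB frontend IP: iperf3 -c 10.0.2.100 -p 12000 -t 30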
For the network throughput test:
For the network latency test:
From the result, we can see the D8s v3 VM egress throughput is 3.28 Gbps with a single TCP thread. The CWND value is 1.32 MB. Network latency is 81 us.
Azure has released the Standard Load Balancer (SLB). The SLB supports low-latency load sharing and HA ports.
In this setup, we create an SLB and put 2 VMs with accelerated networking in the backend pool.
We set up another VM with accelerated networking and send traffic to the SLB frontend IP address.
Because of HA ports support, we don't need to define any specific port in the load balancing rule.
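For reference, an HA ports rule is simply a load balancing rule with protocol All and port 0 on both the frontend and backend; a sketch with placeholder names:
az network lb rule create --resource-group perf-test-rg --lb-name perf-std-lb --name ha-ports --protocol All --frontend-port 0 --backend-port 0 --frontend-ip-name perf-frontend --backend-pool-name perf-backend-pool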
We will test the bandwidth and latency impact of adding the new SLB.
For the network throughput test:
For the network latency test:
From the result, we can see the D8s v3 VM egress throughput is 3.28 Gbps with a single TCP thread. The CWND value is 1.61 MB. Network latency is 53 us.
In Azure, there are 4 methods to link VMs in two different regions together.
The first is using VM public IPs (PIP); the VMs can talk directly via their PIPs.
The second is using a VNET-to-VNET IPSec VPN. Each region's VNET sets up an IPSec VPN gateway, the two gateways establish an IPSec VPN tunnel, and the VMs talk to each other through the tunnel.
The third is using a new feature called global VNET peering. This feature allows VMs in different regions to talk directly without any gateway support (a sketch of creating the peering follows after this list).
The fourth is hosting an ExpressRoute gateway in each VNET and linking the two gateways to a single ExpressRoute circuit.
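As a sketch of the third option, global VNET peering is created from each VNET towards the other; the names below are placeholders.
az network vnet peering create --resource-group perf-rg-1 --vnet-name perf-vnet-1 --name to-perf-vnet-2 --remote-vnet <remote-vnet-resource-id> --allow-vnet-access
The same command is then run from the second VNET pointing back at the first.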
We will first test the network latency between the two VMs, since latency has a significant impact on network throughput.
From the result, the one-way network latency is 83 ms and the round-trip latency is 166 ms.
With the default setup, single TCP thread throughput is only 131 Mbps because of the high network latency.
We also see that the CWND is 6.02 MB.
If we want to increase the single TCP thread throughput, we must increase the TCP send and receive buffers.
Basically, if there is no packet drop, TCP throughput = buffer size / latency. If we want to get 4 Gbps throughput over a network with 166 ms latency, the buffer size should be around 85 MB.
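As a quick sanity check, that buffer size is just the bandwidth-delay product, which can be computed directly:
echo $(( 4000000000 * 166 / 1000 / 8 ))   # 4 Gbit/s x 0.166 s RTT / 8 bits per byte
This prints roughly 83000000 bytes, i.e. about 83 MB.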
We modify the system TCP parameters to increase the maximum buffer size to 128 MB.
echo 'net.core.wmem_max=131072000' >> /etc/sysctl.conf
echo 'net.core.rmem_max=131072000' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem= 10240 87380 131072000' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem= 10240 87380 131072000' >> /etc/sysctl.conf
sysctl -p
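A quick way to confirm the new limits are in effect is to read them back:
sysctl net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem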
After this change, we retest with a single iperf3 TCP thread and get the result below.
We see that the throughput is around 2 Gbps and there are packet retransmissions.
Second, we set up two IPSec VPN gateways and test the latency and throughput.
We choose the VpnGw1 SKU, which is rated at 650 Mbps.
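Roughly, each VNET needs a route-based VPN gateway deployed into its GatewaySubnet plus a VNet-to-VNet connection towards the other gateway; a sketch with placeholder names (gateway deployment can take a long time):
az network vnet-gateway create --resource-group perf-rg-1 --name perf-vpngw-1 --vnet perf-vnet-1 --public-ip-address perf-vpngw-1-pip --gateway-type Vpn --vpn-type RouteBased --sku VpnGw1
az network vpn-connection create --resource-group perf-rg-1 --name vnet1-to-vnet2 --vnet-gateway1 perf-vpngw-1 --vnet-gateway2 <resource-id-of-perf-vpngw-2> --shared-key <shared-key>
The same gateway and connection are created on the other side with the roles reversed.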
From the result, the two VMs connected by VPN are 2 hops apart, but the latency is much lower than with PIP.
The VPN traffic may be routed over a different WAN connection with a shorter distance.
Third, we set up global VNET peering and test both latency and throughput.
We can see that with global VNET peering there is only one hop between the two VMs. The latency is almost the same.
Fourth, we set up an ExpressRoute gateway in each VNET and link those two gateways to a single ExpressRoute circuit.
In this case, the traffic flow is: VM - ExpressRoute Gateway - Microsoft ExpressRoute Edge - ExpressRoute Gateway - VM.
The two VMs connected by ExpressRoute are 3 hops apart. The latency, 165 ms, is almost the same as with the IPSec VPN.
For single TCP thread throughput, this method can only reach 240 Mbps.
From the table below, accelerated networking improves the end-to-end network latency.
Lower latency reduces the CWND needed to reach the same level of TCP throughput.
The new standard load balancer only adds a small amount of end-to-end network latency.
Parameters | VM-VM without Acc | VM-VM with Acc | VM-LB-VM with Acc | VM-SLB-VM with Acc |
---|---|---|---|---|
Throughput | 3.65Gbps | 3.82Gbps | 3.28Gbps | 3.28Gbps |
CWND | 3.27MB | 1.36MB | 1.31MB | 1.61MB |
Latency | 146us | 40us | 81us | 53us |
Below is a summary of the VM-to-VM cross-region test results. The VPN gateway is the bottleneck for performance. Global VNET peering has minimal performance impact compared with a direct PIP connection.
Parameters | VM-VM with PIP | VM-VM with VPN | VM-VM with Peer | VM-VM with ER |
---|---|---|---|---|
Throughput | 2.12Gbps | 549Mbps | 1.76Gbps | 240Mbps |
CWND | 24.5MB | 12.2MB | 33.7MB | 5.85MB |
Latency | 186ms | 168ms | 186ms | 168ms |