In this Blog Post I will describe how I have configured VXLANs over a multicast-enabled Layer 3 network. I will show the router configs, the associated multicast routes created and the host VXLAN mappings.
This lab is a physical lab rather than a virtual one on VMware Workstation. I hope to cover how to do this on a virtual lab in a later Blog Post. The lab is based on two hosts, each with two NICs, and a small PC for vCenter, vSphere Client and shared storage. The vShield Manager "VSM" is deployed onto one of the two ESXi hosts. The lab is based on vSphere 5.1 and vCNS 5.1.
The network is based on two Cisco routers and a switch for the vSphere PC, as shown in the diagram below:
On the network I have deployed PIM Sparse-Mode "PIM-SM" with R1 as the rendezvous point for both routers R1 and R2. I have used PIM-SM rather than sparse-dense mode, even though sparse-dense mode seems to be the recommendation. My reason is that I have more experience with PIM-SM, so it seems a good starting point for me to learn VMware's implementation of VXLAN.
As in the previous blog, I will not be deploying a VXLAN gateway just yet and will concentrate on VXLAN itself. The VSM and VXLAN preparation is identical to my previous two-part Blog Post "Simple VXLAN lab on Workstation viewing traffic with Wireshark".
The only difference here is that each ESXi host's vmk1 interface is now in a different layer 3 and layer 2 segment.
Host ESXi1 VXLAN interface vmk1 is in subnet 192.168.150.0/24
Host ESXi2 VXLAN interface vmk1 is in subnet 192.168.136.0/24
Each physical router acts as a DHCP server for each ESXi host.
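As a quick sanity check of the addressing above, here is a minimal Python sketch (my own illustration, not part of the lab) that confirms the two vmk1 subnets do not overlap, so VXLAN traffic between the hosts has to be routed, and that each router's host-facing address from the configs further down sits in the matching subnet. The function name `same_segment` is made up for this example.

```python
import ipaddress

# Subnets of the two VXLAN vmk1 interfaces, as given in the post.
esxi1_net = ipaddress.ip_network("192.168.150.0/24")
esxi2_net = ipaddress.ip_network("192.168.136.0/24")

# Host-facing router addresses, taken from the R1 and R2 configs.
r1_fa00 = ipaddress.ip_address("192.168.150.254")
r2_fa00 = ipaddress.ip_address("192.168.136.254")


def same_segment(a, b):
    """Return True if the two networks overlap (hypothetical helper)."""
    return a.overlaps(b)


print(same_segment(esxi1_net, esxi2_net))          # False: traffic must be routed
print(r1_fa00 in esxi1_net, r2_fa00 in esxi2_net)  # True True
```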
Preparation for VXLAN is now as per below:
Below are the configs I used for routers R1 and R2. It is a simple network based on two routers in a single PIM-SM domain, both running OSPF in a single area 0.
Router "R1" config:
!
hostname R1
!
ip multicast-routing
!
interface Loopback0
 description PIM RP
 ip address 172.16.1.1 255.255.255.255
 ip pim sparse-mode
!
interface FastEthernet0/0
 description Link to Host esxi1
 ip address 192.168.150.254 255.255.255.0
 no ip proxy-arp
 ip pim sparse-mode
 duplex auto
 speed auto
!
interface FastEthernet0/1
 description Link to R2-Fa0-1
 ip address 10.1.0.1 255.255.0.0
 ip pim sparse-mode
 duplex auto
 speed auto
!
router ospf 1
 log-adjacency-changes
 passive-interface FastEthernet0/0
 network 10.0.0.0 0.255.255.255 area 0
 network 172.16.1.1 0.0.0.0 area 0
 network 192.168.0.0 0.0.255.255 area 0
!
ip pim rp-address 172.16.1.1
!
ip access-list standard VXLAN-1-BOUNDARY
 deny 224.1.1.50
 permit 224.0.0.0 15.255.255.255
!
Router "R2" config:
!
hostname R2
!
ip multicast-routing
!
interface FastEthernet0/0
 description Link to Host esxi2
 ip address 192.168.136.254 255.255.255.0
 no ip proxy-arp
 ip pim sparse-mode
 duplex auto
 speed auto
!
interface FastEthernet0/1
 description Link to R1-Fa0-1
 ip address 10.1.0.2 255.255.0.0
 ip pim sparse-mode
 duplex auto
 speed auto
!
interface FastEthernet1/0
 ip address 10.3.0.1 255.255.0.0
 ip pim sparse-mode
 duplex auto
 speed auto
!
router ospf 1
 log-adjacency-changes
 passive-interface FastEthernet0/0
 network 10.0.0.0 0.255.255.255 area 0
 network 172.16.1.1 0.0.0.0 area 0
 network 192.168.0.0 0.0.255.255 area 0
!
ip pim rp-address 172.16.1.1
!
ip access-list standard VXLAN-1-BOUNDARY
 deny 224.1.1.50
 permit 224.0.0.0 15.255.255.255
!
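Both configs define the standard access list VXLAN-1-BOUNDARY, although the configs shown here do not apply it to any interface. To make its wildcard-mask matching easier to follow, here is a small Python sketch of how a standard IOS access list is evaluated: first match wins, with an implicit deny at the end. The helper names `wildcard_match` and `evaluate` are my own, and this is a simplified model rather than how IOS implements it.

```python
import ipaddress


def wildcard_match(addr, base, wildcard="0.0.0.0"):
    """Cisco-style match: bits set in the wildcard mask are ignored."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (a & care) == (b & care)


# Entries of VXLAN-1-BOUNDARY in the order they appear in the configs.
VXLAN_1_BOUNDARY = [
    ("deny", "224.1.1.50", "0.0.0.0"),
    ("permit", "224.0.0.0", "15.255.255.255"),
]


def evaluate(acl, addr):
    """First matching entry wins; fall through to the implicit deny."""
    for action, base, wildcard in acl:
        if wildcard_match(addr, base, wildcard):
            return action
    return "deny"


for group in ("224.1.1.50", "224.1.1.51", "239.255.255.250", "10.1.0.1"):
    print(group, evaluate(VXLAN_1_BOUNDARY, group))
```

In this model the VXLAN group 224.1.1.50 is denied, every other address in 224.0.0.0/4 is permitted, and anything outside the multicast range falls through to the implicit deny.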
As in the previous lab, I have two OpenBSD VMs deployed: VM01 on host ESXi1 and VM02 on host ESXi2. I have then created a single VXLAN named vxlan-01 with a VNI of 5000, using the multicast group 224.1.1.50.
VMs VM01 and VM02 are in subnet 172.16.0.0/24: VM01 has the IP 172.16.0.2 and VM02 has the IP 172.16.0.3.
With the VMs deployed and their vNICs members of vxlan-01, but before they are powered on, there are as expected no hosts joined to the multicast group 224.1.1.50 and no active multicast sources for that group.
Now we will power on both VMs, VM01 and VM02. As soon as the VMs are powered on, even before the guest OS has booted, each host joins the multicast group 224.1.1.50 using IGMP version 2.
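To give an idea of what such a join looks like on the wire, here is a minimal Python sketch that builds the 8-byte IGMPv2 Membership Report for group 224.1.1.50 (type 0x16, as defined in RFC 2236). It only constructs the IGMP message itself; the IP header and the actual sending done by the ESXi host are left out, and the function names are my own.

```python
import socket
import struct


def inet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, as used by IGMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def igmpv2_report(group: str) -> bytes:
    """Build an IGMPv2 Membership Report for the given group address."""
    msg_type = 0x16  # Version 2 Membership Report
    max_resp = 0     # not used in reports
    grp = socket.inet_aton(group)
    # Checksum is computed with the checksum field set to zero.
    unsummed = struct.pack("!BBH4s", msg_type, max_resp, 0, grp)
    csum = inet_checksum(unsummed)
    return struct.pack("!BBH4s", msg_type, max_resp, csum, grp)


report = igmpv2_report("224.1.1.50")
print(report.hex())
# A receiver validates the message by checksumming all 8 bytes.
print(inet_checksum(report) == 0)
```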
We now have multicast routes in place for the two hosts that have joined the multicast group 224.1.1.50.
We can also see the IGMP membership report for group 224.1.1.50 on each router.
At this point no packets such as broadcast, unknown unicast or ARP have been sent from VMs VM01 and VM02. Nothing has therefore had to be encapsulated in the multicast group 224.1.1.50 by either host ESXi1 or ESXi2, so no multicast sources for group 224.1.1.50 are registered with the rendezvous point.
Now a ping session is started from VM01 on host ESXi1 to VM02 on host ESXi2. At this point host ESXi1 encapsulates the ARP request packet into a multicast packet and transmits it to the group 224.1.1.50 with a VXLAN header for VNI 5000. Router R1 will register the source 192.168.150.128 for the group 224.1.1.50 with the rendezvous point. Host ESXi2 will receive the multicast packet for group 224.1.1.50 and VNI 5000, decapsulate it and send it on to the recipient VM, while adding the source VM MAC address, host and VXLAN mapping to its VXLAN mapping table. Router R2 will then send a source-specific join towards host ESXi1 for the group 224.1.1.50 (192.168.150.137,224.1.1.50). Once router R2 is receiving duplicate packets, one from the shared tree and one from the newly formed shortest-path tree, it will switch over to the shortest-path tree.
We now have the below multicast routes in place showing host ESXi1 as a source for group 224.1.1.50.
Only the original ARP request is passed over multicast; the rest of the ICMP session is passed over unicast between the hosts, encapsulated in VXLAN packets for VNI 5000.
In the previous Blog Post I put up on VXLAN, the host's VXLAN mapping table had an outer MAC that matched the recipient host's vmk1 MAC address.
In this use case the outer MAC is now that of the first-hop router, i.e. R1 fa0/0 or R2 fa0/0, as shown below. I presume this MAC address is learned when a host receives a VXLAN frame from another host, as the source MAC address will be that of the egress router. This alleviates the need for proxy-arp on the routers or a separate kernel default route for the VXLAN network.
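Whether the outer packet is multicast or unicast, the VXLAN header carrying VNI 5000 is the same. Here is a minimal Python sketch of that 8-byte header as laid out in the VXLAN specification (RFC 7348): an 8-bit flags field with the "I" bit set, 24 reserved bits, the 24-bit VNI and 8 more reserved bits. The outer Ethernet, IP and UDP headers are not modelled, and the function names are my own.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits.
    # Word 2: VNI in the top 24 bits, 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the I flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


hdr = vxlan_header(5000)
print(hdr.hex())       # 0800000000138800
print(parse_vni(hdr))  # 5000
```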
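To illustrate that presumption, here is a toy Python model of a host's VXLAN mapping table that learns from received frames. This is only my guess at the behaviour, not VMware's implementation. All MAC addresses are made-up placeholders from the documentation range, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    vtep_ip: str    # source IP of the outer header (remote host's vmk1)
    outer_mac: str  # source MAC of the outer Ethernet frame as received


class VxlanMacTable:
    """Toy table: (VNI, inner source MAC) -> how to reach that VM."""

    def __init__(self):
        self.entries = {}

    def learn(self, vni, inner_src_mac, outer_src_ip, outer_src_mac):
        self.entries[(vni, inner_src_mac)] = Mapping(outer_src_ip, outer_src_mac)

    def lookup(self, vni, inner_dst_mac):
        return self.entries.get((vni, inner_dst_mac))


# Placeholder MACs, not values from the lab.
R2_FA00_MAC = "00:00:5e:00:53:02"
VM01_MAC = "00:00:5e:00:53:11"
# Source address the post says R1 registers for host ESXi1.
ESXI1_VMK1_IP = "192.168.150.128"

table = VxlanMacTable()  # the table on host ESXi2
# The frame from ESXi1 was routed, so its outer source MAC is R2 fa0/0's.
table.learn(5000, VM01_MAC, ESXI1_VMK1_IP, R2_FA00_MAC)
print(table.lookup(5000, VM01_MAC))
```

In this model, a reply from ESXi2 to VM01 would be sent with R2's MAC as the outer destination MAC, which is why no proxy-arp or extra route is needed.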
Anyway, this is a short Blog Post that hopefully describes the basics of how VXLAN can be used over a Layer 3 multicast-enabled network.
In a future Blog Post I will look at the VXLAN gateway "vCNS Edge" and how it can be used to connect from one VXLAN to another or from the "real world" into a VXLAN. I will also cover NAT and firewall services on the vCNS Edge and how you can use the Edge CLI to aid fault finding.
The above is based on my understanding of both PIM-SM and VXLAN, so it may well be wrong; then again, it may hopefully be right.
Thanks for reading.
Kevin Barrass