When you created the service, the integration between ACI and Kubernetes built the ACI constructs and policies needed to expose the external IP. To accomplish this, ACI uses PBR (Policy Based Redirect). PBR makes it possible for the ACI fabric to redirect traffic to L4/L7 devices without requiring those devices to be the default gateway or a routed hop. This is a popular construct in networks for firewalls, intrusion prevention systems, load balancers, and more.
Since PBR is layered on top of ACI contracts, you can also enforce policy on the services being configured. For example, you can create network filters so that traffic destined to a web service is only allowed on ports 80 and 443, ensuring the load balancer receives only the traffic that policy allows and making the network more secure.
The diagram below represents what the integration between ACI and Kubernetes built to expose the service you defined for the simple MyLabAPP. This gives you an easy way to load balance services developed for the platform across your enterprise private network. Since ACI performs the load balancing, as you scale the containers in Kubernetes to handle the service load, ACI automatically redirects traffic to the right containers.
Since there is no bridge domain defined that contains the subnet used for the dynamic service, you have to build a static route from the adjacent router into the ACI fabric. For this lab the static routes have been pre-built for you to reduce complexity.
The configuration that is deployed in ACI starts in the External Layer 3 policy object. Looking in the ACI fabric under the common tenant you can find this policy.
The IP address assigned to the service should match the EXTERNAL-IP value you see when running the command
kubectl get services -o wide
Another easy way to get this information is to view it via the ACI fabric APIC.
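If you want just the external address from the Kubernetes side, a jsonpath query is a quick sketch of how to pull it; the service name mylabapp and the default namespace are assumptions based on the EPG naming (k8s_pod09_svc_default_mylabapp) used later in this lab:
kubectl get service mylabapp -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
For a type LoadBalancer service this should print the same external IP that ACI exposes through the service graph.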
The ACI APIC provides an NX-OS-style CLI that lets you view the same fabric objects you can see in the web interface. The Credentials link at the top of this web page includes a link to connect to the APIC via SSH.
show tenant common vrf k8s_vrf external-l3 epg k8s_pod09_svc_default_mylabapp detail
Name           Flags                     Match              Node       Entry              Oper State
-------------- ------------------------- ------------------ ---------- ------------------ ----------
common:        vxlan: 2555909            10.0.146.67/32     node-209   10.0.146.67/32     enabled
k8s_pod09_svc_ vrf: k8s_vrf                                 node-207   10.0.146.67/32     enabled
default_mylaba Target dscp: unspecified                     node-210   10.0.146.67/32     enabled
pp             qosclass: unspecified                        node-208   10.0.146.67/32     enabled

Contracts
---------
Provided:  k8s_pod09_svc_default_mylabapp
Consumed:
Using the same CLI we can also look at the service graph that was created in the fabric.
show l4l7-graph tenant common graph k8s_pod09_svc_global
Graph           : common-k8s_pod09_svc_global
Graph Instances : 1

Consumer EPg    : common-k8s-epg
Provider EPg    : common-k8s_pod09_svc_default_mylabapp
Contract Name   : common-k8s_pod09_svc_default_mylabapp
Config status   : applied

Function Node Name : loadbalancer
Service Redirect   : enabled

Connector  Encap      Bridge-Domain                           Device Interface  Service Redirect Policy
---------- ---------- --------------------------------------- ----------------- ------------------------------
provider   vlan-3209  common-k8s_pod09_bd_kubernetes-service  interface         k8s_pod09_svc_default_mylabapp
consumer   vlan-3209  common-k8s_pod09_bd_kubernetes-service  interface         k8s_pod09_svc_default_mylabapp
As you can see, the ACI/Kubernetes integration has created a bridge domain that is used for the service graph instances in the fabric.
Looking at the diagram we showed earlier, the Service Graph is layered on top of the contract created between the EPG for the service and the external network.
Looking in the APIC CLI we can see:
show tenant common contract k8s_pod09_svc_default_mylabapp
Tenant     Contract                        Type     Qos Class    Scope   Subject              Access-group                    Dir    Description
---------- ------------------------------- -------- ------------ ------- -------------------- ------------------------------- ------ -----------
common     k8s_pod09_svc_default_mylabapp  permit   unspecified  vrf     loadbalancedservice  k8s_pod09_svc_default_mylabapp  both
One of the important value statements of the ACI integration with Kubernetes is how service definitions in Kubernetes translate directly into ACI policies that are implemented dynamically in the fabric. If you look at the following example, every port specified in the Kubernetes service YAML file will be created as a filter in the ACI fabric.
apiVersion: v1
kind: Service
metadata:
  name: mycrazyapp
  labels:
    app: mycrazyapp
spec:
  type: LoadBalancer
  loadBalancerIP: 10.0.146.67
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443
  - name: mgmt
    port: 8080
    targetPort: 8080
  - name: data-ingress
    port: 9183
    targetPort: 9183
  selector:
    app: mycrazyapp
This shows us that the access-group/filter is called k8s_pod09_svc_default_mylabapp, and we can verify the filter in use for this particular application, which in this case is TCP port 80:
show tenant common access-list k8s_pod09_svc_default_mylabapp
apic1# show tenant common access-list k8s_pod09_svc_default_mylabapp
Tenant      : common
Access-List : k8s_pod09_svc_default_mylabapp
    match tcp dest 80
You can also look at the PBR services created in ACI based on the Kubernetes deployment. You will be able to see the hashing algorithm, which in this case uses the default setting sip-dip-prototype. This means the policy will hash traffic based on Source IP + Destination IP + Protocol Type.
This window also shows the destination nodes, which are the service nodes. In your case these are the IP addresses assigned to pod09-node1 and pod09-node2 by the OpFlex protocol.
Back in the ACI fabric:
As part of the configuration for the integration to work, you have to define a static route on the adjacent router that points towards the ACI fabric leaf hosting the defined L3 out. Traffic arriving from the network towards the exposed service IP is routed into the fabric through this mechanism; the ACI fabric will never advertise the route to the network by itself.
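For reference only (the routes are already pre-built in this lab), a static route on an NX-OS style adjacent router might look like the sketch below; the service subnet 10.0.146.0/24 and the next-hop 10.0.145.1 pointing at the border leaf are purely illustrative values, not the lab's actual addressing:
ip route 10.0.146.0/24 10.0.145.1
Any equivalent static route on your adjacent router that sends the exposed service subnet towards the L3 out border leaf accomplishes the same thing.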
When the packet hits the ACI fabric leaf, the L3 out policy object will typically have a default route defined that leads back out of the fabric towards the adjacent router. Even if it does not, the behavior we are about to explain does not change.
As the packet enters the L3 out object context, the fabric looks in the VRF for a destination to send the packet to, and this triggers a MISS since the service IP is not defined in any bridge domain. For this reason, before sending the packet back out, ACI performs a lookup on the source IP and classifies the packet into the default external EPG that has 0.0.0.0/0 defined as its subnet. At this point the packet should be ready to be sent back out of the fabric, but when ACI does the final lookup to forward it, Longest Prefix Match (LPM) places the packet into the service EPG because the destination IP matches the /32 defined in the service EPG.
Since LPM placed the packet in the external service EPG that the ACI/Kubernetes integration built, policy forces the traffic into the Service Graph, applies the contract that was created (in our case TCP 80), and sends it to the Policy Based Redirect (PBR) construct. At this point the service graph knows which destination nodes the packet is meant for, as defined in the policy.
Once the packet reaches the destination node, it is Network Address Translated (NAT) by Open vSwitch (OVS) running on that node. That operation converts the destination IP address from the SERVICE IP to the POD destination IP, and the packet is routed within the host. If there are multiple PODs for the service on the same compute host, OVS can also perform a final local load balancing across the various POD IPs.
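To see the pod IP and target port pairs that OVS translates the service IP into, you can list the service endpoints from Kubernetes; the service name mylabapp is an assumption based on the EPG naming:
kubectl get endpoints mylabapp
The ENDPOINTS column lists the pod IP:port combinations that back the service on each node.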
In a large fabric you can see the value of ACI managing this process for you. Any increase in scale creates more pods to serve that service; each is accounted for in the service graph, and PBR load balances the traffic properly across the fabric.
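For example, scaling the deployment behind the service creates more pods for the integration to account for; the deployment name mylabapp below is an assumption, so substitute the name used in your lab:
kubectl scale deployment mylabapp --replicas=4
Once the new pods are running, the integration updates the service graph and PBR continues to spread the traffic across the fabric.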
One last part of the life of the packet: the container is listening on a port that is different from the one the traffic was sent to. If you recall from the previous example, the container was listening on port 8090. For this solution to work, NAT also has to perform port translation. When you defined the service you specified the target port, and that is all the information the ACI/Kubernetes integration needs to program Open vSwitch to perform this task.
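As a minimal sketch of the piece of the service definition that drives this behavior (the service and label names here are illustrative), the targetPort field is what tells the integration to program OVS to translate port 80 on the service IP to port 8090 on the pod:
apiVersion: v1
kind: Service
metadata:
  name: mylabapp
spec:
  type: LoadBalancer
  loadBalancerIP: 10.0.146.67
  ports:
  - name: http
    port: 80          # port exposed on the external service IP
    targetPort: 8090  # port the container actually listens on
  selector:
    app: mylabapp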