Skip to content

RFC Template

Summary

Provide a policy for configuring how DNS should be handed for a given gateway. Provide a mechanism for enabling DNS based load balancing.

Motivation

Gateway admins, need a way to define the DNS policy for a multi-cluster gateway in order to control how much and which traffic reaches these gateways. Ideally we would allow them to express a strategy that they want to use without needing to get into the details of each provider and needing to create and maintain dns record structure and individual records for all the different gateways that may be within their infrastructure.

Guide-level explanation

Allow definition of a DNSPolicy that configures load balancing to decide how traffic should be distributed across multiple gateway instances from the central control plane.

Terms

  • managed listener: This is a listener with a host backed by a DNS zone managed by the multi-cluster gateway controller
  • hub cluster: control plane cluster that managed 1 or more spokes
  • spoke cluster: a cluster managed by the hub control plane cluster. This is where gateway are instantiated

Provide a control plane DNSPolicy API that uses the idea of direct policy attachment from gateway API that allows a load balancing strategy to be applied to the DNS records structure for any managed listeners being served by the data plane instances of this gateway. The DNSPolicy also covers health checks that inform the DNS response but that is not covered in this document.

Below is a draft API for what we anticipate the DNSPolicy to look like

apiVersion: kuadrant.io/v1alpha1
kind: DNSPolicy
spec:
  targetRef: # defaults to gateway gvk and current namespace
    name: gateway-name
  health:
   ...
  loadBalancing:
    weighted:
     defaultWeight: 10
     custom: #optional
     - value: AWS  #optional with both GEO and weighted. With GEO the custom weight is applied to gateways within a Geographic region
       weight: 10
     - value: GCP
       weight: 20
    GEO: #optional
      defaultGeo: IE # required with GEO. Chooses a default DNS response when no particular response is defined for a request from an unknown GEO.

Available Load Balancing Strategies

GEO and Weighted load balancing are well understood strategies and this API effectively allow a complex requirement to be expressed relatively simply and executed by the gateway controller in the chosen DNS provider. Our default policy will execute a "Round Robin" weighted strategy which reflects the current default behaviour.

With the above API we can provide weighted and GEO and weighted within a GEO. A weighted strategy with a minimum of a default weight is always required and the simplest type of policy. The multi-cluster gateway controller will set up a default policy when a gateway is discovered (shown below). This policy can be replaced or modified by the user. A weighted strategy can be complimented with a GEO strategy IE they can be used together in order to provide a GEO and weighted (within a GEO) load balancing. By defining a GEO section, you are indicating that you want to use a GEO based strategy (how this works is covered below).

apiVersion: kuadrant.io/v1alpha1
kind: DNSPolicy
name: default-policy
spec:
  targetRef: # defaults to gateway gvk and current namespace
    name: gateway-name
  loadBalancing:
    weighted: # required
     defaultWeight: 10  #required, all records created get this weight
  health:
   ...   

In order to provide GEO based DNS and allow customisation of the weighting, we need some additional information to be provided by the gateway / cluster admin about where this gateway has been placed. For example if they want to use GEO based DNS as a strategy, we need to know what GEO identifier(s) to use for each record we create and a default GEO to use as a catch-all. Also, if the desired load balancing approach is to provide custom weighting and no longer simply use Round Robin, we will need a way to identify which records to apply that custom weighting to based on the clusters the gateway is placed on.

To solve this we will allow two new attributes to be added to the ManagedCluster resource as labels:

   kuadrant.io/lb-attribute-geo-code: "IE"
   kuadrant.io/lb-attribute-custom-weight: "GCP"

These two labels allow setting values in the DNSPolicy that will be reflected into DNS records for gateways placed on that cluster depending on the strategies used. (see the first DNSPolicy definition above to see how these values are used) or take a look at the examples at the bottom.

example :

apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
 labels:
   kuadrant.io/lb-attribute-geo-code: "IE"
   kuadrant.io/lb-attribute-custom-weight: "GCP"
spec:    

The attributes provide the key and value we need in order to understand how to define records for a given LB address based on the DNSPolicy targeting the gateway.

The kuadrant.io/lb-attribute-geo-code attribute value is provider specific, using an invalid code will result in an error status condition in the DNSrecord resource.

DNS Record Structure

This is an advanced topic and so is broken out into its own proposal doc DNS Record Structure

Custom Weighting

Custom weighting will use the associated custom-weight attribute set on the ManagedCluster to decide which records should get a specific weight. The value of this attribute is up to the end user.

example:

apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
 labels:
   kuadrant.io/lb-attribute-custom-weight: "GCP"

The above is then used in the DNSPolicy to set custom weights for the records associated with the target gateway.

    - value: GCP
      weight: 20

So any gateway targeted by a DNSPolicy with the above definition that is placed on a ManagedCluster with the kuadrant.io/lb-attribute-custom-weight set with a value of GCP will get an A record with a weight of 20

Status

DNSPolicy should have a ready condition that reflect that the DNSRecords have been created and configured as expected. In the case that there is an invalid policy, the status message should reflect this and indicate to the user that the old DNS has been preserved.

We will also want to add a status condition to the gateway status indicating it is effected by this policy. Gateway API recommends the following status condition

- type: gateway.networking.k8s.io/PolicyAffected
  status: True 
  message: "DNSPolicy has been applied"
  reason: PolicyApplied
  ...

https://github.com/kubernetes-sigs/gateway-api/pull/2128/files#diff-afe84021d0647e83f420f99f5d18b392abe5ec82d68f03156c7534de9f19a30aR888

Example Policies

Round Robin (the default policy)

apiVersion: kuadrant.io/v1alpha1
kind: DNSPolicy
name: RoundRobinPolicy
spec:
  targetRef: # defaults to gateway gvk and current namespace
    name: gateway-name
  loadBalancing:
    weighted:
     defaultWeight: 10

GEO (Round Robin)

apiVersion: kuadrant.io/v1alpha1
kind: DNSPolicy
name: GEODNS
spec:
  targetRef: # defaults to gateway gvk and current namespace
    name: gateway-name
  loadBalancing:
    weighted:
     defaultWeight: 10
    GEO:
     defaultGeo: IE

Custom

apiVersion: kuadrant.io/v1alpha1
kind: DNSPolicy
name: SendMoreToAzure
spec:
  targetRef: # defaults to gateway gvk and current namespace
    name: gateway-name
  loadBalancing:
    weighted:
     defaultWeight: 10
     custom:
     - attribute: cloud
       value: Azure #any record associated with a gateway on a cluster without this value gets the default
       weight: 30

GEO with Custom Weights

apiVersion: kuadrant.io/v1alpha1
kind: DNSPolicy
name: GEODNSAndSendMoreToAzure
spec:
  targetRef: # defaults to gateway gvk and current namespace
    name: gateway-name
  loadBalancing:
    weighted:
     defaultWeight: 10
     custom:
     - attribute: cloud
       value: Azure
       weight: 30
    GEO:
      defaultGeo: IE

Reference-level explanation

  • Add a DNSPolicy CRD that conforms to policy attachment spec
  • Add a new DNSPolicy controller to MCG
  • DNS logic and record management should all migrate out of the gateway controller into this new DNSPolicy controller as it is the responsibility and domain of the DNSPolicy controller to manage DNS
  • remove the Hosts interface as we want do not want other controllers using this to bring DNS Logic into other areas of the code.

Drawbacks

You cannot have a different load balancing strategy for each listener within a gateway. So in the following gateway definition

spec:
    gatewayClassName: kuadrant-multi-cluster-gateway-instance-per-cluster
    listeners:
    - allowedRoutes:
        namespaces:
          from: All
      hostname: myapp.hcpapps.net
      name: api
      port: 443
      protocol: HTTPS
    - allowedRoutes:
        namespaces:
          from: All
      hostname: other.hcpapps.net
      name: api
      port: 443
      protocol: HTTPS      

The DNS policy targeting this gateway will apply to both myapp.hcpapps.net and other.hcpapps.net

However, there is still significant value even with this limitation. This limitation is something we will likely revisit in the future

Background Docs

DNS Provider Support

AWS DNS

Google DNS

Azure DNS

Direct Policy Attachment

Rationale and alternatives

An alternative is to configure all of this yourself manually in a dns provider. This is can be a highly complex dns configuration that it would be easy to get wrong.