Kuadrant Rate Limiting

A Kuadrant RateLimitPolicy custom resource:

Targets Gateway API networking resources such as HTTPRoutes and Gateways, using these resources to obtain additional context, i.e., which traffic workload (HTTP attributes, hostnames, user attributes, etc.) to rate limit.
Supports targeting subsets (sections) of a network resource to apply the limits to.
Abstracts the details of the underlying Rate Limit protocol and configuration resources, which have a much broader remit and surface area.
Enables cluster operators to set defaults that govern behavior at the lower levels of the network, until a more specific policy is applied.
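As a hedged illustration of section targeting, the sketch below assumes that the policy's `targetRef` accepts a `sectionName` field, as in Gateway API policy attachment; check the API reference before relying on it. The Gateway name, listener name, limit name, and rate values are all hypothetical.

```yaml
apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: api-listener-limits      # hypothetical policy name
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: toystore-gw            # hypothetical Gateway
    sectionName: api             # assumption: name of one listener in that Gateway
  limits:
    "per-listener":              # hypothetical limit name
      rates:
        - limit: 100
          window: 1s
```

Under that assumption, only traffic entering through the named listener would be counted against the limit; other listeners of the same Gateway would be unaffected.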

How it works

Envoy's Rate Limit Service Protocol

Kuadrant's Rate Limit implementation relies on Envoy's Rate Limit Service (RLS) protocol. The per-request workflow is:

On an incoming request, the gateway checks the matching rules for enforcing rate limits, as stated in the RateLimitPolicy custom resources and the targeted Gateway API networking objects.
If the request matches, the gateway sends one RateLimitRequest to the external rate limiting service ("Limitador").
The external rate limiting service responds to the gateway with a RateLimitResponse carrying either an OK or an OVER_LIMIT response code.
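To make the exchange more concrete, the sketch below renders the two protocol messages as YAML. This is for readability only: the actual wire format is gRPC/protobuf. The field names follow Envoy's `envoy.service.ratelimit.v3` message definitions, while the domain and descriptor values are illustrative assumptions rather than output captured from a real gateway.

```yaml
# RateLimitRequest sent by the gateway (illustrative values)
domain: gateway-system/app-rlp        # assumption: one domain per policy
descriptors:
  - entries:
      - key: limit.toystore_all       # hypothetical descriptor key
        value: "1"
hits_addend: 1                        # number of hits this request adds
---
# RateLimitResponse returned by the rate limiting service
overall_code: OVER_LIMIT              # or OK when the request is within its limits
```

On OVER_LIMIT the gateway rejects the request; on OK it forwards the request upstream.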

A RateLimitPolicy and its targeted Gateway API networking resource contain all the statements to configure both the ingress gateway and the external rate limiting service.

The RateLimitPolicy custom resource

Overview

The RateLimitPolicy spec has two main parts:

A reference to an existing Gateway API resource (spec.targetRef)
Limit definitions (spec.limits)

Each limit definition includes:

A set of rate limits (spec.limits.<limit-name>.rates[])
(Optional) A set of dynamic counter qualifiers (spec.limits.<limit-name>.counters[])
(Optional) A set of additional dynamic conditions to activate the limit (spec.limits.<limit-name>.when[])

The limit definitions (limits) can be declared at the top level of the spec (with the semantics of defaults) or alternatively within explicit defaults or overrides blocks.
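The three placements can be compared side by side. The fragments below are a sketch: they show only the relevant part of `spec` (the `targetRef` is omitted), and the limit name and rate values are hypothetical.

```yaml
# Option 1: limits at the top level of the spec (defaults semantics)
spec:
  limits:
    "global":
      rates:
        - limit: 5
          window: 10s
---
# Option 2: the same limit inside an explicit defaults block
spec:
  defaults:
    limits:
      "global":
        rates:
          - limit: 5
            window: 10s
---
# Option 3: the same limit inside an overrides block
spec:
  overrides:
    limits:
      "global":
        rates:
          - limit: 5
            window: 10s
```

Options 1 and 2 behave the same way; option 3 changes how the policy interacts with more specific policies, as described in the sections on targeting a Gateway.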

Check out Kuadrant RFC 0002 to learn more about the Well-known Attributes that can be used to define counter qualifiers (counters) and conditions (when).

Check out the API reference for a full specification of the RateLimitPolicy CRD.

Using the RateLimitPolicy

Targeting an HTTPRoute networking resource

When a RateLimitPolicy targets an HTTPRoute, the policy is enforced on all traffic routed according to the rules and hostnames specified in the HTTPRoute, across all Gateways referenced in the spec.parentRefs field of the HTTPRoute.

Target an HTTPRoute by setting the spec.targetRef field of the RateLimitPolicy as follows:

apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: <RateLimitPolicy name>
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: <HTTPRoute Name>
  limits: { … }

Rate limit policy targeting an HTTPRoute resource
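For a fuller picture, here is a minimal sketch of an HTTPRoute attached to two Gateways together with a policy that targets it. All resource names, the path, and the rate values are hypothetical; both resources are assumed to live in the same namespace.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: toystore                 # hypothetical route name
spec:
  parentRefs:                    # the policy takes effect on both Gateways
    - name: external-gw          # hypothetical Gateway
    - name: internal-gw          # hypothetical Gateway
  hostnames:
    - api.toystore.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /toys
      backendRefs:
        - name: toystore-svc     # hypothetical Service
          port: 80
---
apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: toystore-limits          # hypothetical policy name
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: toystore               # must match the HTTPRoute's metadata.name
  limits:
    "toystore-all":
      rates:
        - limit: 5000
          window: 1s
```

With this pair, requests for api.toystore.com under /toys are rate limited whichever of the two Gateways they enter through.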

Hostnames and wildcards

If a RateLimitPolicy targets a route defined for *.com and another RateLimitPolicy targets another route for api.com, the Kuadrant control plane will not merge these two RateLimitPolicies. Unless one of the policies declares an overrides set of limits, the control plane configures the gateway to mimic the gateway implementation's "most specific hostname wins" behavior, so that only the policy and limit definitions applicable to the matched route are enforced.

For example, by default, a request for api.com will be rate limited according to the rules from the RateLimitPolicy that targets the route for api.com, while a request for other.com will be rate limited with the rules from the RateLimitPolicy targeting the route for *.com.
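The scenario can be sketched with two routes and two policies. Resource names, backends, and rate values below are hypothetical; only the hostnames `*.com` and `api.com` come from the description above.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: wildcard-route           # hypothetical
spec:
  parentRefs:
    - name: external-gw          # hypothetical Gateway
  hostnames: ["*.com"]
  rules:
    - backendRefs:
        - name: fallback-svc     # hypothetical Service
          port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route                # hypothetical
spec:
  parentRefs:
    - name: external-gw
  hostnames: ["api.com"]
  rules:
    - backendRefs:
        - name: api-svc          # hypothetical Service
          port: 80
---
apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: wildcard-limits          # applies to other.com, but not to api.com
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: wildcard-route
  limits:
    "wildcard":
      rates:
        - limit: 10
          window: 1s
---
apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: api-limits               # applies to api.com only
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: api-route
  limits:
    "api":
      rates:
        - limit: 100
          window: 1s
```

Because neither policy declares overrides, a request for api.com is counted only against the `api` limit, even though the hostname also matches `*.com`.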

See more examples in Overlapping Gateway and HTTPRoute RateLimitPolicies.

Targeting a Gateway networking resource

A RateLimitPolicy that targets a Gateway can declare a block of defaults (spec.defaults) or a block of overrides (spec.overrides). By convention, gateway policies that specify neither defaults nor overrides act as defaults.

When declaring defaults, a RateLimitPolicy that targets a Gateway will be enforced on all HTTP traffic hitting the gateway, unless a more specific RateLimitPolicy targeting a matching HTTPRoute exists. Any new HTTPRoute referencing the gateway as parent will be automatically covered by the default RateLimitPolicy, as will changes to existing HTTPRoutes.

Defaults give cluster operators the ability to protect the infrastructure against unplanned and malicious network traffic, for example by setting safe default limits on hostnames and hostname wildcards.

Conversely, a gateway policy that specifies overrides declares a set of rules to be enforced on all routes attached to the gateway, atomically replacing any more specific policy that may be attached to any of those routes.

Target a Gateway by setting the spec.targetRef field of the RateLimitPolicy as follows:

apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: <RateLimitPolicy name>
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: <Gateway Name>
  defaults: # alternatively: `overrides`
    limits: { … }

Rate limit policy targeting a Gateway resource

Overlapping Gateway and HTTPRoute RateLimitPolicies

Two possible semantics are to be considered here – gateway policy defaults vs gateway policy overrides.

Gateway RateLimitPolicies that declare defaults (or alternatively neither defaults nor overrides) protect all traffic routed through the gateway except where a more specific HTTPRoute RateLimitPolicy exists, in which case the HTTPRoute RateLimitPolicy prevails.

Example with 4 RateLimitPolicies, 3 HTTPRoutes and 1 Gateway default (plus 2 HTTPRoutes and 2 Gateways without RateLimitPolicies attached):

RateLimitPolicy A → HTTPRoute A (a.toystore.com) → Gateway G (*.com)
RateLimitPolicy B → HTTPRoute B (b.toystore.com) → Gateway G (*.com)
RateLimitPolicy W → HTTPRoute W (*.toystore.com) → Gateway G (*.com)
RateLimitPolicy G (defaults) → Gateway G (*.com)

Expected behavior:

Request to a.toystore.com → RateLimitPolicy A will be enforced
Request to b.toystore.com → RateLimitPolicy B will be enforced
Request to other.toystore.com → RateLimitPolicy W will be enforced
Request to other.com (suppose a route exists) → RateLimitPolicy G will be enforced
Request to yet-another.net (suppose a route and gateway exist) → No RateLimitPolicy will be enforced

Gateway RateLimitPolicies that declare overrides protect all traffic routed through the gateway, regardless of the existence of any more specific HTTPRoute RateLimitPolicy.

Example with 4 RateLimitPolicies, 3 HTTPRoutes and 1 Gateway override (plus 2 HTTPRoutes and 2 Gateways without RateLimitPolicies attached):

RateLimitPolicy A → HTTPRoute A (a.toystore.com) → Gateway G (*.com)
RateLimitPolicy B → HTTPRoute B (b.toystore.com) → Gateway G (*.com)
RateLimitPolicy W → HTTPRoute W (*.toystore.com) → Gateway G (*.com)
RateLimitPolicy G (overrides) → Gateway G (*.com)

Expected behavior:

Request to a.toystore.com → RateLimitPolicy G will be enforced
Request to b.toystore.com → RateLimitPolicy G will be enforced
Request to other.toystore.com → RateLimitPolicy G will be enforced
Request to other.com (suppose a route exists) → RateLimitPolicy G will be enforced
Request to yet-another.net (suppose a route and gateway exist) → No RateLimitPolicy will be enforced
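The two scenarios differ only in the block that the Gateway policy declares. The sketch below shows RateLimitPolicy A and RateLimitPolicy G from the examples; resource names, limit names, and rate values are hypothetical.

```yaml
# RateLimitPolicy A: attached to HTTPRoute A (a.toystore.com)
apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: policy-a                 # hypothetical
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: route-a                # hypothetical name for HTTPRoute A
  limits:
    "a-limit":
      rates:
        - limit: 50
          window: 1s
---
# RateLimitPolicy G: attached to Gateway G (*.com)
apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: policy-g                 # hypothetical
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway-g              # hypothetical name for Gateway G
  overrides:                     # replace with `defaults` for the first scenario
    limits:
      "g-limit":
        rates:
          - limit: 1000
            window: 1s
```

As written, requests to a.toystore.com are limited by `g-limit`. With `defaults` in place of `overrides`, the same requests would be limited by `a-limit`, and `g-limit` would apply only to routes without a policy of their own.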

Limit definition

A limit is activated whenever an incoming request matches all of the when conditions specified in the limit. A limit without when conditions is always activated.

A limit can define:

counters that are qualified based on dynamic values fetched from the request, or
global counters (implicitly, when no qualified counter is specified)

A limit is composed of one or more rate limits.

E.g.

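spec:
  limits:
    # No `when` conditions: applies to every request covered by the policy
    "toystore-all":
      rates:
        - limit: 5000
          window: 1s

    # One counter per username, only for requests to api.toystore.com
    "toystore-api-per-username":
      rates:
        - limit: 100
          window: 1s
        - limit: 1000
          window: 1m
      counters:
        - expression: auth.identity.username
      when:
        - predicate: request.host == 'api.toystore.com'

    # Both predicates must hold for the limit to be activated
    "toystore-admin-unverified-users":
      rates:
        - limit: 250
          window: 1s
      when:
        - predicate: request.host == 'admin.toystore.com'
        # Quoted: an unquoted leading `!` is parsed as a YAML tag
        - predicate: "!auth.identity.email_verified"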

| Request to | Rate limits enforced |
|---|---|
| `api.toystore.com` | 100rps/username or 1000rpm/username (whichever is reached first) |
| `admin.toystore.com` | 250rps (users without a verified email) |
| `other.toystore.com` | 5000rps |

`when` conditions

when conditions can be used to scope a limit (i.e. to filter the traffic to which a limit definition applies) without any coupling to the underlying network topology, i.e. without making direct references to HTTPRouteRules.

Use when conditions to conditionally activate limits based on attributes that cannot be expressed in the HTTPRoutes' spec.hostnames and spec.rules.matches fields, or in general in RateLimitPolicies that target a Gateway.
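As a sketch of the Gateway case, the policy below scopes a default limit using request attributes only, with no reference to any HTTPRoute. The resource names and rate values are hypothetical, and `request.method` is assumed to be among the supported Well-known Attributes; confirm this against the reference.

```yaml
apiVersion: kuadrant.io/v1
kind: RateLimitPolicy
metadata:
  name: gateway-write-limits     # hypothetical
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: external-gw            # hypothetical Gateway
  defaults:
    limits:
      "writes-per-user":         # hypothetical limit name
        rates:
          - limit: 20
            window: 1m
        counters:
          - expression: auth.identity.username
        when:
          # assumption: request.method is a supported attribute
          - predicate: request.method == 'POST'
          - predicate: request.url_path.startsWith('/v1')
```

Both predicates must evaluate to true for the limit to be activated, so only POST requests under /v1 are counted.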

The selectors within the when conditions of a RateLimitPolicy are a subset of Kuadrant's Well-known Attributes (RFC 0002). Check out the reference for the full list of supported selectors.

Examples

See the Kuadrant user guides for examples of rate limiting services with Kuadrant.

Known limitations

RateLimitPolicies can only target HTTPRoutes/Gateways defined in the same namespace as the RateLimitPolicy.

Implementation details

Driven by limitations related to how Istio injects configuration in the filter chains of the ingress gateways, Kuadrant relies on Envoy's Wasm Network filter in the data plane, instead of the Rate Limit filter, to manage the integration with the rate limiting service ("Limitador").

Motivation: Multiple rate limit domains

The first limitation comes from having only one filter chain per listener. This often leads to one single global rate limiting filter configuration per gateway, and therefore to a shared rate limit domain across applications and policies. In a rate limit filter, the triggering of rate limit calls, via actions that build so-called "descriptors", can be defined at the level of the virtual host and/or a specific route rule. Even so, there is only one overall rate limit configuration, i.e., always the same rate limit domain for all calls to Limitador.

On the other hand, being able to configure and invoke the rate limit service for multiple domains depending on the context makes it possible to isolate groups of policy rules, as well as to optimize performance in the rate limit service, which can rely on the domain for indexing.

Motivation: Fine-grained matching rules

A second limitation of configuring the rate limit filter via Istio, particularly from Gateway API resources, is that rate limit descriptors at the level of a specific HTTP route rule require "named routes", which are defined only in an Istio VirtualService resource and referenced in an EnvoyFilter one. Gateway API HTTPRoute rules lack a "name" property¹. In addition, in Istio's implementation of gateway configuration for Gateway API, the VirtualService resources are only ephemeral data structures handled in memory, where the names of individual route rules are auto-generated and cannot be referenced by users in a policy²³. As a result, rate limiting by attributes of the HTTP request (e.g., path, method, headers, etc.) would be very limited if it depended only on Envoy's Rate Limit filter.

Motivated by the desire to support multiple rate limit domains per ingress gateway, as well as fine-grained HTTP route matching rules for rate limiting, Kuadrant implements a wasm-shim that handles the rules to invoke the rate limiting service, complying with Envoy's Rate Limit Service (RLS) protocol.

The wasm module integrates with the gateway in the data plane via the Wasm Network filter, and parses a configuration that the Kuadrant control plane composes from user-defined RateLimitPolicy resources. The rate limiting service ("Limitador"), in turn, remains an implementation of Envoy's RLS protocol, capable of being integrated directly via the Rate Limit extension or by Kuadrant, via the wasm module, for the Istio Gateway API implementation.

As a consequence of this design:

Users can define fine-grained rate limit rules that match their Gateway and HTTPRoute definitions, including subsections of these.
Rate limit definitions are insulated, not leaking across unrelated policies or applications.
Conditions to activate limits are evaluated in the context of the gateway process, reducing the gRPC calls to the external rate limiting service to only those cases where rate limit counters are known in advance to need checking or incrementing.
The rate limiting service can rely on indexing to look up groups of limit definitions and counters.
Components remain compliant with industry protocols and flexible for different integration options.

A Kuadrant wasm-shim configuration for one RateLimitPolicy custom resource targeting an HTTPRoute looks like the following. It is generated automatically by the Kuadrant control plane:

apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  creationTimestamp: "2024-10-01T16:59:40Z"
  generation: 1
  name: kuadrant-kuadrant-ingressgateway
  namespace: gateway-system
  ownerReferences:
    - apiVersion: gateway.networking.k8s.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: Gateway
      name: kuadrant-ingressgateway
      uid: 0298355b-fb30-4442-af2b-88d0c05bd2bd
  resourceVersion: "11253"
  uid: 36ef1fb7-9eca-46c7-af63-fe783f40148c
spec:
  phase: STATS
  pluginConfig:
    services:
      ratelimit-service:
        type: ratelimit
        endpoint: ratelimit-cluster
        failureMode: allow
    actionSets:
      - name: some_name_0
        routeRuleConditions:
          hostnames:
            - "*.toystore.website"
            - "*.toystore.io"
          predicates:
            - request.url_path.startsWith("/assets")
        actions:
          - service: ratelimit-service
            scope: gateway-system/app-rlp
            predicates:
              - request.host.endsWith('.toystore.website')
            data:
              - expression:
                  key: limit.toystore_assets_all_domains__b61ee8e6
                  value: "1"
      - name: some_name_1
        routeRuleConditions:
          hostnames:
            - "*.toystore.website"
            - "*.toystore.io"
          predicates:
            - request.url_path.startsWith("/v1")
        actions:
          - service: ratelimit-service
            scope: gateway-system/app-rlp
            predicates:
              - request.host.endsWith('.toystore.website')
              - auth.identity.username == ""
            data:
              - expression:
                  key: limit.toystore_v1_website_unauthenticated__377837ee
                  value: "1"
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: kuadrant-ingressgateway
  url: oci://quay.io/kuadrant/wasm-shim:latest

¹ https://github.com/kubernetes-sigs/gateway-api/pull/996
² https://github.com/istio/istio/issues/36790
³ https://github.com/istio/istio/issues/37346