Advanced Rate-Limiting on Enroute Universal Gateway built on Envoy Proxy

Why rate-limit?

APIs drive the digital economy. They are fundamental to today’s commerce. Rate limiting of APIs is used to ensure availability and prevent abuse.

While ensuring high availability, APIs need to meet SLAs defined for users. To meet these SLAs, checks are necessary so that one user does not overwhelm the system.

Every system or API has different uses and throughput. An API may also set separate limits for authenticated and un-authenticated users.

Rate limiting is fundamental to APIs. Every API needs some form of rate-limiting.

What is Enroute Universal Gateway?

Enroute Universal API Gateway is a polymorphic gateway that allows flexible policy enforcement for APIs. It can work as a Standalone Gateway for traditional brownfield use-cases, as a Kubernetes ingress, or alongside a service for mesh-like deployments. Such flexibility is critical for organizations that are adopting the cloud or rewriting services, or parts of them, to run under Kubernetes.

What this article covers

This article shows how Enroute can be deployed in two different topologies to add rate-limiting to APIs.

One topology runs Enroute at Kubernetes Ingress. Deployed as a Kubernetes ingress controller, Enroute Universal Gateway can be configured for advanced rate-limiting.

Another topology runs it as a Standalone Gateway, which can be deployed in any environment or cloud and does not depend on Kubernetes. We program the Enroute Universal Gateway in Standalone mode for advanced rate-limiting.

Enroute also supports several other topologies including a Gateway Mesh topology. One Enroute control plane can program multiple stateless Enroute data planes. Every Enroute data plane instance runs an instance of Envoy along with it. This topology is not covered in this article.

Enroute Universal Gateway uses filters to attach additional functionality either at the global level or per-route level. We show how the same filter configuration can be applied across all topologies in which Enroute is run.

Enroute Quick Start - Get running in less than a minute

This section demonstrates how to quickly set up and run advanced real-life rate-limiting on Enroute. The rest of the article describes in more detail how this is achieved.

Quick Start: Standalone

The following steps cover how you can run Enroute in standalone mode for advanced rate-limiting. The quick-start example is also listed in more detail below.

Start Enroute Standalone API Gateway

sudo docker run --net=host saarasio/enroute-gw:0.4.0

Set up listener, rate-limit filter, upstream

The quick start is a Go script that calls the Standalone Gateway APIs to program it, so Go must be installed. The script was tested with Go version 1.13.9.

Download quick start script:

curl -O https://raw.githubusercontent.com/saarasio/api-ratelimit/master/api-rate-limit.go

Run quick start script to create configuration:

go run api-rate-limit.go --op=create

To check configuration, use the following command:

go run api-rate-limit.go --op=show

To clear configuration, use the following command:

go run api-rate-limit.go --op=delete

Run traffic

The Lua script in the quick start extracts the API key from either the header x-app-key or the query parameter api-key. If neither is present, the request gets rate-limited.

To run traffic, use any of the following curl commands:

API Key passed as a query param:

curl -vvv "http://localhost:8080/?api-key=query-param-saaras"

API key passed as a header:

curl -vvv -H "x-app-key: hdr-app-saaras" http://localhost:8080/

The following curl command, which sends no API key, results in a 429:

curl -vvv http://localhost:8080/

When the API key is not found, the Lua script sets it to x-app-notfound. This matches the descriptor below, where requests_per_unit is set to zero, so every such request is rejected:

...
 {
     "key": "x-app-key",
     "value" : "x-app-notfound",

     "descriptors": [
         {
             "key" : "remote_address",
             "rate_limit": {
                 "unit": "second",
                 "requests_per_unit": 0
             }
         }
     ]
 }
...
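The three curl requests above can also be scripted. The following is a minimal Python sketch using only the standard library. It assumes the quick-start gateway is listening on localhost:8080; build_probes and status_of are illustrative names, not part of Enroute.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: quick-start listener address


def build_probes(base_url=BASE_URL):
    """Return (label, Request) pairs mirroring the three curl commands."""
    return [
        ("query param",
         urllib.request.Request(base_url + "/?api-key=query-param-saaras")),
        ("header",
         urllib.request.Request(base_url + "/",
                                headers={"x-app-key": "hdr-app-saaras"})),
        ("no key",
         urllib.request.Request(base_url + "/")),
    ]


def status_of(request):
    """Send the request and return its HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is what we want here.
        return err.code
```

Calling status_of on each request returned by build_probes() shows the status per case. With the quick-start configuration, the request without a key is the one expected to return 429.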

Quick Start: Kubernetes Ingress

Follow the steps below to run Enroute Universal Gateway at Kubernetes Ingress

Start Enroute Kubernetes Ingress

This example assumes that you have a working Kubernetes cluster that supports services of type LoadBalancer.

The setup can be run using the following command:

bash -c 'kubectl apply -f https://getenroute.io/ic/0.4.0/qs/ic-all-rl.yaml'

This sets up a service of type LoadBalancer, which on AWS provides a public DNS name for reaching it. To make the example work, we set a CNAME record that points demo.saaras.io to the AWS domain name, so that requests match the programmed FQDN.

Refer to Enroute Universal API Gateway for kubernetes ingress to get a step-by-step introduction on running Enroute Kubernetes Ingress API gateway

Run traffic

Get the LoadBalancer IP

ubuntu@ip-10-0-10-0:~$ kubectl get -n enroute-gw-k8s service

NAME      TYPE           CLUSTER-IP      EXTERNAL-IP            PORT(S)                      AGE
enroute   LoadBalancer   10.104.27.203   a6..lb.amazonaws.com   80:32138/TCP,443:32695/TCP   36d
httpbin   ClusterIP      10.109.43.186   <none>                 80/TCP                       36d

Note that we have a CNAME mapping from demo.saaras.io to a6..lb.amazonaws.com

curl -vvv -k https://demo.saaras.io/get

Common rate-limiting configuration

A lot of APIs use some form of rate-limiting to prevent abuse and provide fairness. This requires identifying the user who is calling the API. One way to identify a user is by IP address. A better way is a token that the user receives after authenticating.

This user information is either received in a header or as a url parameter. Rate-limiting then uses this information to count the number of API requests made and limits the user according to the policy configured.

We look at an example of how rate-limiting is used today and how Enroute Universal Gateway can be used to recreate such a configuration. This how-to covers only one rate-limiting configuration. Also check why every API needs rate-limiting to go through other real-world examples.

Mapping rate-limiting requirements to Enroute directives

We take the real-world examples described in why every API needs rate-limiting as our starting point.

The following steps are involved in recreating the real-world example:

  • Extract user-token from request, send it as a header value for rate-limit engine to read
  • Add route action to send request state to rate-limit engine when request matches that route
  • Add rate-limit engine descriptors to specify limits on user

The real-world examples use the token received in the request to rate-limit and use different rate limits - one for authenticated users and another set of limits for un-authenticated users.

The lua global level filter gets invoked first. We use it to fish out the token for an authenticated user. We then use this token for our rate-limiting calculations.

A route level filter lets us specify what state needs to be sent to the rate-limit engine. When the request matches this route, the corresponding state is sent to the rate-limit engine.

The configuration on the rate-limit engine, specified in terms of descriptors, describes the rate limits for authenticated and unauthenticated users. We provide configuration to the rate-limit engine using Enroute’s GlobalConfig.

The next two sections walk through the configuration of each filter.

Lua filter configuration - different rate limits for authenticated/un-authenticated user

The global lua filter tries to extract the token or api-key sent by the user. This is either sent as a query parameter or as a header. The lua filter checks both these locations.

If the Lua filter finds the token in either location, it sets it on a header that is used by the rate-limit engine. If it cannot find the token, it sets the header to a pre-determined value that the rate-limit engine can interpret accordingly.

Note that the same Lua code is attached as lua filter configuration when running Enroute as Kubernetes Ingress or running it as a Standalone Gateway

 ---

 apiVersion: enroute.saaras.io/v1beta1
 kind: HttpFilter
 metadata:
   labels:
     app: httpbin
   name: luatestfilter
   namespace: enroute-gw-k8s
 spec:
   name: luatestfilter
   type: http_filter_lua
   httpFilterConfig:
     config: |
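         -- Return the value of query parameter q_param_name in path, or nil
         function get_api_key(path, q_param_name)
             -- path = "/?api-key=valid-key"

             -- plain find (4th argument true) matches "?" literally
             local s = string.find(path, "?", 1, true)
             if s == nil then
               return nil
             end

             -- walk every key=value pair after the "?", so the key is
             -- found even when other query parameters are present
             local query = string.sub(path, s + 1)
             for k, v in string.gmatch(query, "([^&=]+)=([^&]+)") do
               if k == q_param_name then
                 return v
               end
             end

             return nil
         end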

         function envoy_on_request(request_handle)
            request_handle:logInfo("Begin: envoy_on_request()");

            hdr_x_app_key = "x-app-key"
            hdr_x_app_not_found = "x-app-notfound"
            q_param_name = "api-key"

            -- extract API key from header "x-app-key"

            headers = request_handle:headers()
            header_value = headers:get(hdr_x_app_key)

            if header_value ~= nil then
              request_handle:logInfo("envoy_on_request() API Key from header "..header_value);
            else
              request_handle:logInfo("envoy_on_request() API Key in header is nil");
            end

            -- extract API key from query param "api-key"

            path_in = headers:get(":path")
            api_key = get_api_key(path_in, q_param_name)

            if api_key ~= nil then
              request_handle:logInfo("envoy_on_request() API Key from query param "..api_key);
            else
              request_handle:logInfo("envoy_on_request() API Key from query param is nil");
            end

            -- If API key found, do nothing

            -- else set header x-app-key:x-app-notfound

            if header_value == nil then
                if api_key == nil then
                  headers:add(hdr_x_app_key, hdr_x_app_not_found)
                else
                  headers:add(hdr_x_app_key, api_key)
                end
            end

            request_handle:logInfo("End: envoy_on_request()");

         end

         function envoy_on_response(response_handle)
            response_handle:logInfo("Begin: envoy_on_response()");
            response_handle:logInfo("End: envoy_on_response()");
         end

The same lua script can also be programmed on the standalone gateway as demonstrated in the quick-start script here

https://raw.githubusercontent.com/saarasio/api-ratelimit/master/api-rate-limit.go

For every request, the function envoy_on_request(request_handle) gets invoked. This is the entry point. We look for the user token or API key first in a header and then in the query parameter. A key that is already present in the header takes precedence. If we cannot find it in either place, we set the header to x-app-notfound to indicate it wasn’t found. This information helps the rate-limit engine decide how it should treat the request.
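The decision the Lua filter makes can be restated outside Envoy to make the precedence explicit. The following is a minimal Python sketch, not the filter itself. resolve_app_key and its constants are illustrative names, and it uses the standard library's query-string parser, which, unlike the Lua code, also percent-decodes values.

```python
from urllib.parse import parse_qs, urlsplit

HDR_APP_KEY = "x-app-key"
NOT_FOUND = "x-app-notfound"
Q_PARAM = "api-key"


def resolve_app_key(headers, path):
    """Return the value the filter would leave in the x-app-key header."""
    # 1. A key already present in the header wins.
    if HDR_APP_KEY in headers:
        return headers[HDR_APP_KEY]
    # 2. Otherwise look for the api-key query parameter.
    values = parse_qs(urlsplit(path).query).get(Q_PARAM)
    if values:
        return values[0]
    # 3. Nothing found: use the sentinel that the rate-limit engine
    #    is configured to limit to zero requests.
    return NOT_FOUND
```

For example, resolve_app_key({}, "/") falls through to the sentinel, which is why a request without any key ends up rate-limited.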

Rate-limit filter configuration

To rate limit requests, the rate-limit engine needs state from the request to identify the request/user. This information is specified at the route level.

Per-route filter config


---
apiVersion: enroute.saaras.io/v1beta1
kind: RouteFilter
metadata:
  labels:
    app: httpbin
  name: rl2
  namespace: enroute-gw-k8s
spec:
  name: rl2
  type: route_filter_ratelimit
  routeFilterConfig:
    config: |
        {
            "descriptors": [
              {
                "request_headers": {
                  "header_name": "x-app-key",
                  "descriptor_key": "x-app-key"
                }
              },
              {
                "remote_address": "{}"
              }
            ]
        }

The configuration above is for Kubernetes ingress and is attached to a route. It sends the value of the x-app-key header, together with the remote address of the client, to the rate-limit engine.
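To make concrete what is sent to the rate-limit engine, the sketch below turns the route filter configuration above into a list of key/value entries for one request. It is a simplified Python model under stated assumptions, not Enroute or Envoy code. build_descriptor is a hypothetical helper, and it assumes that a missing header means nothing is sent for the request, which matches Envoy's default for request_headers actions.

```python
import json

# Same JSON as the RouteFilter config above.
ROUTE_FILTER_CONFIG = json.loads("""
{
    "descriptors": [
      {"request_headers": {"header_name": "x-app-key",
                           "descriptor_key": "x-app-key"}},
      {"remote_address": "{}"}
    ]
}
""")


def build_descriptor(config, headers, remote_address):
    """Return a list of (key, value) entries, or None if nothing is sent."""
    entries = []
    for action in config["descriptors"]:
        if "request_headers" in action:
            spec = action["request_headers"]
            value = headers.get(spec["header_name"])
            if value is None:
                # Assumption: an absent header suppresses the whole descriptor.
                return None
            entries.append((spec["descriptor_key"], value))
        elif "remote_address" in action:
            entries.append(("remote_address", remote_address))
        elif "generic_key" in action:
            entries.append(("generic_key",
                            action["generic_key"]["descriptor_value"]))
    return entries
```

Because the Lua filter always sets x-app-key, either to the real key or to x-app-notfound, the header lookup in this sketch is expected to succeed for every request that reaches the route.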

A rate-limit filter can be configured the same way on the standalone gateway, as shown in the quick-start script. Alternatively, a filter can be created using a curl command. The example below creates one with a generic_key descriptor:

# Create rate-limit filter for route
# Filter_config is kept on one line: JSON strings cannot contain raw newlines
curl -0 -v http://localhost:1323/filter \
-H "Host: localhost:1323" \
-H "Accept-Encoding: gzip" \
-H "Expect:" \
-H 'Content-Type: application/json; charset=utf-8' \
-d @- <<'EOF'
{
    "Filter_name" : "test_filter_rl",
    "Filter_type" : "route_filter_ratelimit",
    "Filter_config" : "{ \"descriptors\" : [ { \"generic_key\": { \"descriptor_value\": \"default\" } } ] }"
}
EOF

Once the rate-limit engine receives this information, it does the counting to allow or disallow the request. It uses Redis in the backend to do the counting. The configuration for the rate-limit engine can be specified using GlobalConfig.
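The counting step can be pictured as a fixed-window counter. The sketch below is an assumption made for illustration: this article does not describe the engine's algorithm, and a Python dictionary stands in for Redis. FixedWindowCounter is a hypothetical name.

```python
import time


class FixedWindowCounter:
    """Counts requests per key in one-second windows (in-memory stand-in for Redis)."""

    def __init__(self, clock=time.time):
        self.clock = clock   # injectable so the window can be controlled in tests
        self.counts = {}     # (key, window) -> number of requests seen

    def allow(self, key, requests_per_unit):
        """Record one request and report whether it is within the limit."""
        window = int(self.clock())  # index of the current one-second window
        bucket = (key, window)
        self.counts[bucket] = self.counts.get(bucket, 0) + 1
        return self.counts[bucket] <= requests_per_unit
```

With requests_per_unit set to 0, the first request in any window already exceeds the limit, which is how a zero limit rejects every request. A real backend would also expire old windows, which this sketch leaves out.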

Rate-limit engine config

---
apiVersion: enroute.saaras.io/v1beta1
kind: GlobalConfig
metadata:
  labels:
    app: httpbin
  name: rl-global-config
  namespace: enroute-gw-k8s
spec:
  name: rl-global-config
  type: globalconfig_ratelimit
  config: |
        {
            "domain": "enroute",
            "descriptors": [
                {
                    "key": "x-app-key",
                    "value" : "x-app-notfound",

                    "descriptors": [
                        {
                            "key" : "remote_address",
                            "rate_limit": {
                                "unit": "second",
                                "requests_per_unit": 0
                            }
                        }
                    ]
                },
                {
                    "key": "x-app-key",
                    "descriptors": [
                        {
                            "key" : "remote_address",
                            "rate_limit": {
                                "unit": "second",
                                "requests_per_unit": 100000
                            }
                        }
                    ]
                }
            ]
        }

The rate-limit engine uses the descriptors to build a key under which requests are counted. In the above case, it’ll use a rate limit of "requests_per_unit": 0 for requests where a token isn’t found, so they are always rejected.

When a token is found, it uses "requests_per_unit": 100000 for every unique combination of token and remote address.
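How a request's entries select one of the two limits can be sketched as a walk down the nested descriptors. The Python below is a simplified model, not the engine's code. It assumes that an entry with both key and value is preferred over a key-only entry, and that the limit applied is the one on the descriptor matched by the last entry. find_limit and counter_key are hypothetical helpers, and the counter key format is invented for illustration.

```python
import json

# Same descriptors as the GlobalConfig above.
ENGINE_CONFIG = json.loads("""
{
    "domain": "enroute",
    "descriptors": [
        {
            "key": "x-app-key",
            "value": "x-app-notfound",
            "descriptors": [
                {"key": "remote_address",
                 "rate_limit": {"unit": "second", "requests_per_unit": 0}}
            ]
        },
        {
            "key": "x-app-key",
            "descriptors": [
                {"key": "remote_address",
                 "rate_limit": {"unit": "second", "requests_per_unit": 100000}}
            ]
        }
    ]
}
""")


def find_limit(config, entries):
    """Return the rate_limit dict that applies to the request entries, or None."""
    level = config["descriptors"]
    limit = None
    for key, value in entries:
        # Prefer an exact key/value match, then fall back to a key-only entry.
        exact = [d for d in level
                 if d["key"] == key and d.get("value") == value]
        key_only = [d for d in level
                    if d["key"] == key and "value" not in d]
        candidates = exact or key_only
        if not candidates:
            return None
        match = candidates[0]
        limit = match.get("rate_limit")
        level = match.get("descriptors", [])
    return limit


def counter_key(domain, entries):
    """Hypothetical counter key: one counter per domain and set of entry values."""
    return "_".join([domain] + [f"{k}_{v}" for k, v in entries])
```

Under these assumptions, entries carrying x-app-notfound resolve to the zero limit, while any other x-app-key value resolves to the 100000 limit and gets its own counter per remote address.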

The same configuration can also be found in the quick start script.

Conclusion

We show why rate-limiting is critical for APIs today and how it can be programmed on the Enroute Universal Gateway. Depending on where the API is running, the standalone gateway or the Kubernetes Ingress API gateway can be used.

Advanced rate-limiting can be run without restrictions or licenses on Enroute Universal API Gateway. Rate-limiting is fundamental to running APIs and is provided completely free in the community edition (while other vendors charge for it).

Enroute provides a complete rate-limiting solution for APIs with centralized control for all aspects of rate-limiting. With Enroute’s flexibility, the need for API rate-limiting can be met in any environment: private cloud, public cloud, Kubernetes ingress or gateway mesh.

You can use the contact form to reach us if you have any feedback.