TSDB, Prometheus, Grafana In Kubernetes: Tracing A Variable Across The OSS Monitoring Stack

Using open source software: Prometheus and Grafana

Open Source Metrics Stack

Metrics are critical to understanding the health and operational state of any system. The design of any system requires collecting, storing and reporting metrics to provide a pulse of the system. We explore how a TSDB, prometheus and grafana form that stack for open source metrics reporting in Kubernetes.

Data is recorded over a series of time intervals, and an efficient database is needed to store and retrieve it.

OpenTSDB is one such time series database that can serve this need.

While data is stored in a time series database, a standard system to scrape such metrics and store them in the database has emerged in the form of Prometheus. When a data source exports metrics in the prometheus exposition format, it can be scraped by prometheus. While the time series database stores the metrics, Prometheus collects them and writes them into the database.

Any database also needs an efficient and programmer-friendly way to query information, e.g. SQL for popular transactional databases like Postgres and MySQL. Prometheus defines a rich query language in the form of PromQL to query data from this time series database.

Any reporting solution isn't complete without a graphical component to plot data as graphs, bar charts, pie charts, time series and other visualizations. Grafana serves this need: it takes a data source (like Prometheus) and provides the programmability and flexibility to display data in a form that is useful to the user. Grafana also supports several other data sources.

We broadly cover how the different systems work in tandem, what makes them stick together and some detail of each sub-system, then trace a variable end-to-end to gain a clear understanding.

EnRoute is built using Envoy proxy, and both sub-systems provide a prometheus endpoint that exports operational metrics for scraping.

Discovery, Storage and Querying of Metrics Data

This section describes how the prometheus monitoring system uses service discovery to scrape data (using a scrape configuration) and store it in the TSDB (time series database). We then describe how Grafana uses PromQL to query this data.

Finding Instances to Scrape using Service Discovery

Prometheus collects metrics using a pull model, so it needs a port and path to scrape data from. How does it find them? This problem is addressed by Prometheus service discovery.

You can set the instance port and path to scrape in a prometheus config file. But in dynamic environments like Kubernetes, where endpoints, IP addresses, services, pods and containers are transient, a static configuration won't work. A more dynamic approach is to use service discovery and provide instructions on how to work with discovered services. This is where prometheus service discovery can be configured to discover instances (or a job, which is a collection of instances).
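
For reference, a minimal static scrape configuration might look like the sketch below. The job name and target address are illustrative; /stats/prometheus is the Envoy stats path used later in this article:

scrape_configs:
  - job_name: 'envoy-stats'
    metrics_path: '/stats/prometheus'
    static_configs:
      - targets: ['localhost:9001']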

Service discovery in Prometheus works in several different environments, including Kubernetes, Azure, DigitalOcean and several others. A complete set of service discovery options can be found in the prometheus configuration reference. The <*_sd_config> sections are the different environments for which prometheus has built-in support and can be configured for service discovery.

Service discovery returns a list of instances to scrape metrics from. The discovery process may also surface additional metadata about the discovered instances. This metadata can be used to ignore or filter instances, or to customize and add attributes to the collected data. Prometheus uses a relabelling mechanism to achieve this.
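
A sketch of such a configuration, adapted from the example Kubernetes configuration in the prometheus github repository (the prometheus.io annotations are a common convention for marking pods to scrape):

scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # ignore pods that are not annotated for scraping
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # replace the scrape path with the pod's annotated path
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # attach kubernetes metadata as labels on every scraped sample
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name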

In the example above, from the prometheus github repository, the prometheus configuration file specifies Kubernetes service discovery using the kubernetes_sd_config directive. Note how relabelling can be used to ignore certain instances and replace label names.

Prometheus also shows the list of targets it has discovered on the /targets endpoint.
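
The same list is available programmatically from the HTTP API; for example, assuming prometheus listens on its default port 9090:

curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'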

Reporting of variables

The first requirement for the system to work is reporting variable state, or metrics. When a system has a metric to report, the metric is broadly classified by type: either a counter, which monotonically increases, or a gauge, which can increase or decrease over time.

This distinction, made at the source where the metric is generated, is required so that the metrics stack can correctly query, store and retrieve the data.

If an application does not export metrics in the Prometheus Exposition Format, there are libraries to convert metrics to this format. Prometheus also provides client libraries in several languages that help define the variable type, operate on it and export it.
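
As an illustrative sketch, here is how an application might define and export a counter using the Go client library (github.com/prometheus/client_golang); the metric name, label and port are hypothetical:

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// upstreamRequests is a counter partitioned by response code,
// loosely mirroring envoy_cluster_upstream_rq.
var upstreamRequests = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "upstream_rq_total",
		Help: "Number of upstream requests.",
	},
	[]string{"response_code"},
)

func main() {
	prometheus.MustRegister(upstreamRequests)

	// Count a couple of requests.
	upstreamRequests.WithLabelValues("200").Inc()
	upstreamRequests.WithLabelValues("429").Inc()

	// Expose all registered metrics in Prometheus exposition format on /metrics.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9102", nil))
}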

Scraping and storage of variables

OpenTSDB supports a multi-dimensional data model where a data point is identified by a name and key/value pairs. Let us walk through this data model with an example.

When a request traverses Envoy, it is proxied to an upstream that serves the request. One such variable for upstream stats is envoy_cluster_upstream_rq.

The output below shows the Prometheus exposition format in which the metric flows over the wire:

We can GET information about this variable from the /stats/prometheus URL of the Envoy stats endpoint:



curl -s localhost:9001/stats/prometheus | grep envoy_cluster_upstream_rq




# TYPE envoy_cluster_upstream_rq counter
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="enroutedemo_externalauth_443"} 1
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="enroutedemo_hello-enroute_9091"} 11
envoy_cluster_upstream_rq{envoy_response_code="429",envoy_cluster_name="enroutedemo_hello-enroute_9091"} 1
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="enroutedemo_lambdacluster_443"} 12
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="saaras-enroute_enroute_8001"} 13
envoy_cluster_upstream_rq{envoy_response_code="503",envoy_cluster_name="saaras-enroute_enroute_8001"} 3
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="saaras-enroute_enroute_8003"} 12
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="saaras-enroute_enroute_8004"} 212
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="saaras-enroute_service-stats_9001"} 6237


Note the TYPE line, which tells prometheus (and the prometheus time series database) the type of the variable. counter is one of the variable types supported by prometheus.

This variable envoy_cluster_upstream_rq is stored in OpenTSDB, which in addition to the value of the variable stores a timestamp and the key/value pairs. These key/value pairs facilitate querying of the data. The following are all valid queries (sketched in PromQL after this list):

  • get a total count of upstream requests where the response code is 200
  • get the value of all upstream requests for cluster hello-enroute_9091 where the response code is 429 (rate-limited)
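
Sketched in PromQL, using label values from the scrape output above:

# total upstream requests where the response code is 200, summed across clusters
sum(envoy_cluster_upstream_rq{envoy_response_code="200"})

# rate-limited (429) upstream requests for cluster hello-enroute_9091
envoy_cluster_upstream_rq{envoy_response_code="429",envoy_cluster_name="enroutedemo_hello-enroute_9091"}
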
Querying of variables

To better understand how a tool like Grafana pulls data from Prometheus/TSDB, we need to understand how data, once stored in the TSDB, is queried.

Note that data variables hold not only a time-series value but also associated key/value pairs. These attributes can be used for querying (e.g. the envoy_response_code key), and this querying is what makes the model powerful: there are several ways to look at the same data. While the Envoy instance from which the data was scraped may report a couple of dimensions (additional key/value tags), more tags may be added by Prometheus. For instance, Prometheus adds Kubernetes metadata such as the service (service key), namespace (kubernetes_namespace key), pod name and endpoint info when it scrapes and stores information from an instance. All these additional tags can be used to query the data.

A query language that supports a rich set of use-cases is critical, and PromQL provides that flexibility. We quickly look at how to use PromQL to form queries. More details about PromQL can be found in the prometheus PromQL documentation.

Going back to the original variable envoy_cluster_upstream_rq, which counts the number of upstream requests per cluster, here are several ways to query this counter:

Query for service enroute



envoy_cluster_upstream_rq{service="enroute"}


Query for service enroute AND response code 200


envoy_cluster_upstream_rq{service="enroute", envoy_response_code="200"}


Query for service enroute AND response code 200 AND namespace enroutedemo


envoy_cluster_upstream_rq{service="enroute", envoy_response_code="200", kubernetes_namespace="enroutedemo"}


Note that the value above is time series data, and this is a fairly simple example. When querying a time series, offsets, durations, subqueries, aggregations, functions and other qualifiers can be used to extract more fine-grained data. For example, you can compute the per-second rate of this time series averaged over a 30-minute window:


rate(envoy_cluster_upstream_rq{service="enroute", envoy_response_code="200", kubernetes_namespace="enroutedemo"}[30m])
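
A few more illustrative sketches of these qualifiers against the same variable:

# aggregation: per-second request rate, summed per service
sum by (service) (rate(envoy_cluster_upstream_rq[5m]))

# offset: the same rate as it was one hour ago
rate(envoy_cluster_upstream_rq[5m] offset 1h)

# subquery: the highest 5-minute rate seen over the last day
max_over_time(rate(envoy_cluster_upstream_rq[5m])[1d:5m])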


More detailed examples can be found in the prometheus documentation reference.

Types of Metrics

The Prometheus system offers a few core metric types; an exposition-format sketch follows the list below.

  • Counter: monotonically increasing value, e.g. request count, connection count
  • Gauge: can increase or decrease in value, e.g. upstream count, certificate count, cluster count
  • Histogram: bucketed counters
  • Summary: like a histogram, but it also reports pre-calculated quantiles
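
For illustration, here is how a gauge and a histogram appear in the exposition format. The stat names below are real Envoy stats; the values and labels are made up:

# TYPE envoy_cluster_membership_healthy gauge
envoy_cluster_membership_healthy{envoy_cluster_name="hello-enroute_9091"} 3

# TYPE envoy_cluster_upstream_rq_time histogram
envoy_cluster_upstream_rq_time_bucket{le="0.5"} 129
envoy_cluster_upstream_rq_time_bucket{le="1"} 133
envoy_cluster_upstream_rq_time_bucket{le="+Inf"} 144
envoy_cluster_upstream_rq_time_sum 9535
envoy_cluster_upstream_rq_time_count 144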

A detailed explanation of these variable types can be found in the prometheus section on variable types.

Statistics in Envoy and EnRoute

EnRoute OneStep is a lightweight shim on Envoy. It can work as a lightweight control plane to program Envoy Proxy as an Ingress Controller, and it also functions without Kubernetes. It maintains a cache of Envoy configuration that is used to configure the proxy, and it exports metrics about the internals of this cache. Additionally, it has metrics about other subsystems that work in tandem to provide API security.

Envoy provides a lot of interesting state and metrics data. A proxy like Envoy has several sub-systems, each of which exports data. We won't look at all the metrics, only the high-level components and the data they export. The Envoy documentation reference provides a complete list of exported stats.

Downstream Envoy Connections

envoy_http_downstream_cx_active
envoy_http_downstream_cx_http1_active
envoy_http_downstream_cx_http2_active
envoy_http_downstream_cx_http3_active
envoy_http_downstream_cx_protocol_error
envoy_http_downstream_cx_ssl_active


Downstream Requests

envoy_http_downstream_rq_active
envoy_http_downstream_rq_http1_total
envoy_http_downstream_rq_http2_total
envoy_http_downstream_rq_http3_total
envoy_http_downstream_rq_rx_reset
envoy_http_downstream_rq_total
envoy_http_downstream_rq_tx_reset
envoy_http_downstream_rq_xx


SSL

envoy_listener_ssl_connection_error
envoy_listener_ssl_handshake
envoy_listener_ssl_session_reused

envoy_cluster_ssl_ciphers
envoy_cluster_ssl_connection_error
envoy_cluster_ssl_handshake
envoy_cluster_ssl_session_reused

Circuit Breakers

envoy_cluster_circuit_breakers_default_cx_open
envoy_cluster_circuit_breakers_default_rq_open
envoy_cluster_circuit_breakers_high_cx_open
envoy_cluster_circuit_breakers_high_rq_open


Upstream Envoy Connections

envoy_cluster_upstream_cx_active
envoy_cluster_upstream_cx_connect_fail
envoy_cluster_upstream_cx_http1_total
envoy_cluster_upstream_cx_http2_total
envoy_cluster_upstream_cx_total

Upstream Envoy Requests

envoy_cluster_upstream_rq_active
envoy_cluster_upstream_rq_retry
envoy_cluster_upstream_rq_total
envoy_cluster_upstream_rq_timeout
envoy_cluster_upstream_rq_xx


Envoy access log

envoy_access_logs_grpc_access_log_logs_written
envoy_access_logs_grpc_access_log_logs_dropped


Envoy OAuth, JWT and other interesting filters



envoy_http_jwt_authn_allowed
envoy_http_jwt_authn_denied
envoy_http_oauth_unauthorized_rq
envoy_http_oauth_failure
envoy_http_oauth_success
envoy_http_aws_lambda_server_error
envoy_http_aws_lambda_upstream_rq
envoy_http_ext_authz_ok
envoy_http_ext_authz_denied
envoy_http_ext_authz_error
envoy_http_ext_authz_timeout
mysql login_attempts
mysql login_failures
mysql queries_parsed
kafka response.TYPE
...

Grafana Dashboards and Panels

Grafana is an open source tool to build observability dashboards. The primary abstraction used to build a UI in grafana is a Dashboard, which is made up of several Panels. A panel embeds a query that pulls the data to be displayed, along with various styling and formatting options.

An intermediate step between querying and visualizing the results of a query is transformation. As the name suggests, a transformation provides a mechanism to work on the data before it is passed to visualization.

The monitoring dashboard and panels are periodically updated with the latest data (the refresh frequency can be controlled in Grafana).

Grafana supports several data sources, one of which is Prometheus.

In Grafana, you build graphs using Visualizations. Visualizations roughly map to displaying the data types defined earlier in Prometheus, e.g. counter, gauge, histogram and summary. There is also flexibility in how they are displayed: as graphs, bar charts, tables or heatmaps.

A more detailed explanation of Dashboards, Panels, Querying, Transformations and Visualizations can be accessed from the Grafana website.

Here is an example Grafana Dashboard that is made up of panels -

The dashboard is made up of 12 panels across 3 rows and 4 columns. Each panel has a query to pull data from a data source. Each panel also has a JSON representation in grafana. Here is one such representation for the Downstream 2xx Responses panel:



{
...
  "datasource": "prometheus",
...
  "gridPos": {
    "h": 8,
    "w": 6,
    "x": 0,
    "y": 50
  },
...
  "targets": [
    {
      "expr": "sum(rate(envoy_http_downstream_rq_xx{namespace=~\"$Namespace\",service=~\"$Service\",envoy_response_code_class=~\"2\"}[1m])) by (namespace,service)",
      "format": "time_series",
      "intervalFactor": 2,
      "legendFormat": "{{namespace}}/{{service}}",
      "refId": "A"
    }
  ],
...
  "title": "Downstream 2xx Responses",
  "tooltip": {
    "shared": true,
    "sort": 0,
    "value_type": "individual"
  },
  "type": "graph",
  "xaxis": {
...
  },
  "yaxes": [
...
  ]
}

The representation above has many fields removed to highlight the interesting ones and build an understanding of how a variable reported in enroute/envoy is displayed. The interesting parts are datasource and targets.expr, which captures the query. Note the PromQL syntax and how the 1m rate, summed by namespace and service, is plotted over time.

The query:


      sum(
        rate(
          envoy_http_downstream_rq_xx{
            namespace=~"$Namespace",
            service=~"$Service",
            envoy_response_code_class=~"2"}[1m]))
      by (namespace,service)


fetches the rate of envoy_http_downstream_rq_xx for response code class 2xx, summed by namespace and service.

Here is another example that plots a histogram of latency to the upstream server, after importing a swagger spec using enroutectl.

Open Source Metrics with/without Kubernetes

The EnRoute Universal API Gateway, with its API for both standalone and Kubernetes deployments, works with Prometheus and Grafana based open source telemetry. Both EnRoute and Envoy can export operational data in the Prometheus Exposition Format. Telemetry is critical for operations, and using an open source stack like Prometheus and Grafana ensures that programmable insights can be derived from the data and that the API gateway adheres to the overall logging architecture and choices.

When working with a complex system, any downtime is unacceptable, and understanding the root cause requires metrics to analyze and troubleshoot. Logging and monitoring are critical to ensure the DevOps team has the necessary tools to keep the system running. The EnRoute Universal API Gateway automatically creates detailed dashboards for individual Envoy components to provide deep insight into the working of several API gateway sub-systems. These Envoy-specific dashboards can also be displayed on a per-service basis using the relevant Envoy metrics.