Statistics collector

The stats module gathers various counters from query resolution and server internals, and offers them as a key-value storage. These metrics can be exported to Graphite/InfluxDB/Metronome, exposed as a Prometheus metrics endpoint, or processed using a user-provided script as described in chapter Asynchronous events.

Note

Please remember that each Knot Resolver instance keeps its own statistics, and instances can be started and stopped dynamically. This might affect your data postprocessing procedures if you are using Multiple instances.
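As an illustration of such postprocessing, the following minimal sketch sums counters collected from several instances. The function name merge_stats and the sample data are hypothetical; each snapshot is assumed to be a plain Lua table of metric name to number, such as one obtained from stats.list() on each instance. Summing is only meaningful for monotonic counters, not for gauge-like values.

```lua
-- Sum counters from several instances into one table.
-- Each element of `snapshots` is assumed to be a plain table of
-- metric name => number (hypothetical input format).
local function merge_stats(snapshots)
    local total = {}
    for _, snapshot in ipairs(snapshots) do
        for name, value in pairs(snapshot) do
            total[name] = (total[name] or 0) + value
        end
    end
    return total
end

-- Made-up data standing in for two running instances
local merged = merge_stats({
    { ['answer.total'] = 100, ['answer.cached'] = 40 },
    { ['answer.total'] = 50 },
})
print(merged['answer.total'], merged['answer.cached'])
```

A metric missing from one instance (for example, because that instance was just started) is simply treated as zero.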

Built-in statistics

Built-in counters keep track of the number of queries and answers matching specific criteria.

Global request counters

request.total

total number of DNS requests (including internal client requests)

request.internal

internal requests generated by Knot Resolver (e.g. DNSSEC trust anchor updates)

request.udp

external requests received over plain UDP (RFC 1035)

request.tcp

external requests received over plain TCP (RFC 1035)

request.dot

external requests received over DNS-over-TLS (RFC 7858)

request.doh

external requests received over DNS-over-HTTPS (RFC 8484)

request.xdp

external requests received over plain UDP via an AF_XDP socket

Global answer counters

answer.total

total number of answered queries

answer.cached

queries answered from cache

answer.stale

queries that utilized stale data

Answers categorized by RCODE

answer.noerror

NOERROR answers

answer.nodata

NOERROR, but empty answers

answer.nxdomain

NXDOMAIN answers

answer.servfail

SERVFAIL answers

Answer latency

answer.1ms

completed in 1ms

answer.10ms

completed in 10ms

answer.50ms

completed in 50ms

answer.100ms

completed in 100ms

answer.250ms

completed in 250ms

answer.500ms

completed in 500ms

answer.1000ms

completed in 1000ms

answer.1500ms

completed in 1500ms

answer.slow

completed in more than 1500ms

answer.sum_ms

sum of all latencies in ms
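A mean answer latency can be derived from these counters. The sketch below assumes that answer.sum_ms accumulates the latency of the same answers that answer.total counts; the helper name mean_latency_ms and the input table are hypothetical.

```lua
-- Mean latency in milliseconds, or nil when nothing was answered yet.
-- `metrics` is assumed to be a table of metric name => number.
local function mean_latency_ms(metrics)
    local total = metrics['answer.total'] or 0
    if total == 0 then
        return nil -- avoid division by zero on a fresh instance
    end
    return (metrics['answer.sum_ms'] or 0) / total
end

-- Made-up sample: two answers taking 11 ms in total
local mean = mean_latency_ms({ ['answer.total'] = 2, ['answer.sum_ms'] = 11 })
print(mean)
```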

Answer flags

answer.aa

authoritative answer

answer.tc

truncated answer

answer.ra

recursion available

answer.rd

recursion desired (in answer!)

answer.ad

authentic data (DNSSEC)

answer.cd

checking disabled (DNSSEC)

answer.do

DNSSEC answer OK

answer.edns0

EDNS0 present

Query flags

query.edns

queries with EDNS present

query.dnssec

queries with DNSSEC DO=1

Example:

modules.load('stats')

-- Enumerate metrics
> stats.list()
[answer.cached] => 486178
[iterator.tcp] => 490
[answer.noerror] => 507367
[answer.total] => 618631
[iterator.udp] => 102408
[query.concurrent] => 149

-- Query metrics by prefix
> stats.list('iter')
[iterator.udp] => 105104
[iterator.tcp] => 490

-- Fetch most common queries
> stats.frequent()
[1] => {
        [type] => 2
        [count] => 4
        [name] => cz.
}

-- Fetch most common queries (sorted by frequency)
-- table.sort() sorts in place and returns nothing, so keep the list in a variable
> f = stats.frequent()
> table.sort(f, function (a, b) return a.count > b.count end)

-- Show recently contacted authoritative servers
> stats.upstreams()
[2a01:618:404::1] => {
    [1] => 26 -- RTT
}
[128.241.220.33] => {
    [1] => 31 -- RTT
}

-- Set custom metrics from modules
> stats['filter.match'] = 5
> stats['filter.match']
5

Module reference

stats.get(key)
Parameters:

key (string) – e.g. "answer.total"

Returns:

number

Returns the nominal value of the given metric.

stats.set('key val')

Sets the nominal value of the given metric.

Example:

stats.set('answer.total 5')
-- or syntactic sugar
stats['answer.total'] = 5

stats.list([prefix])
Parameters:

prefix (string) – optional metric prefix, e.g. "answer" shows only metrics beginning with “answer”

Outputs collected metrics as a JSON dictionary.
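Because the counters are cumulative, a rate is usually computed from two consecutive listings. The following is a minimal sketch under the assumption that both listings are available as Lua tables of metric name to number; the function name rates and the sample values are hypothetical. A counter that went down is skipped, since that typically indicates an instance restart.

```lua
-- Per-second rates between two snapshots taken `elapsed_sec` apart.
local function rates(prev, curr, elapsed_sec)
    local out = {}
    for name, value in pairs(curr) do
        local before = prev[name]
        -- Skip metrics that are new or that were reset in between
        if before ~= nil and value >= before then
            out[name] = (value - before) / elapsed_sec
        end
    end
    return out
end

-- Made-up snapshots taken 30 seconds apart
local r = rates(
    { ['answer.total'] = 100, ['answer.cached'] = 90 },
    { ['answer.total'] = 160, ['answer.cached'] = 10, ['request.udp'] = 5 },
    30
)
print(r['answer.total'])
```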

stats.upstreams()

Outputs a list of recent upstreams and their RTT. The list is sorted by time and stored in a ring buffer of fixed size. This means the data is not aggregated and can be read by multiple consumers, but also that you may lose entries if you don’t read them quickly enough. The default ring size is 512 entries and may be overridden at compile time with -DUPSTREAMS_COUNT=X.
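Since the entries are not aggregated, a consumer has to summarize them itself. The sketch below averages the RTT samples per upstream address, assuming the result has the shape shown in the earlier example (address mapped to an array of RTT values); the function name average_rtt and the sample data are hypothetical.

```lua
-- Average RTT per upstream address.
-- `upstreams` is assumed to map address => array of RTT samples.
local function average_rtt(upstreams)
    local result = {}
    for address, samples in pairs(upstreams) do
        local sum = 0
        for _, rtt in ipairs(samples) do
            sum = sum + rtt
        end
        if #samples > 0 then
            result[address] = sum / #samples
        end
    end
    return result
end

-- Made-up samples in the assumed shape
local avg = average_rtt({
    ['2a01:618:404::1'] = { 26, 30 },
    ['128.241.220.33'] = { 31 },
})
print(avg['2a01:618:404::1'], avg['128.241.220.33'])
```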

stats.frequent()

Outputs a list of the most frequent iterative queries as a JSON array. The queries are sampled probabilistically and include subrequests. The list holds at most 5000 entries; make diffs if you want to track it over time.
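One possible way to make such a diff is sketched below. It assumes each entry carries the name, type and count fields seen in the earlier example output; the helper names key_of and diff_frequent and the sample data are hypothetical.

```lua
-- Identify an entry by query name and type
local function key_of(entry)
    return entry.name .. '/' .. tostring(entry.type)
end

-- Return entries of `curr` whose count changed since `prev`,
-- with `count` replaced by the difference.
local function diff_frequent(prev, curr)
    local seen = {}
    for _, entry in ipairs(prev) do
        seen[key_of(entry)] = entry.count
    end
    local delta = {}
    for _, entry in ipairs(curr) do
        local change = entry.count - (seen[key_of(entry)] or 0)
        if change ~= 0 then
            table.insert(delta, { name = entry.name, type = entry.type, count = change })
        end
    end
    return delta
end

-- Made-up snapshots in the assumed shape
local delta = diff_frequent(
    { { name = 'cz.', type = 2, count = 4 } },
    { { name = 'cz.', type = 2, count = 7 }, { name = 'example.com.', type = 1, count = 2 } }
)
for _, entry in ipairs(delta) do
    print(entry.name, entry.type, entry.count)
end
```

Entries that are absent from the previous snapshot are reported with their full count.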

stats.clear_frequent()

Clear the list of most frequent iterative queries.

Graphite/InfluxDB/Metronome

The graphite module sends statistics over the Graphite protocol to Graphite, Metronome, InfluxDB, or any compatible storage. This allows powerful visualization of metrics collected by Knot Resolver.

Tip

The Graphite server is challenging to get up and running. InfluxDB combined with Grafana is much easier and provides a richer set of options and available front-ends. Alternatively, Metronome by PowerDNS provides a mini-graphite server for much simpler setups.

Only the host parameter is mandatory.

By default the module uses UDP, so it doesn’t guarantee delivery; set tcp = true to enable Graphite over TCP. If the TCP consumer goes down or the connection with Graphite is lost, the resolver will periodically attempt to reconnect.

Example configuration:

modules = {
        graphite = {
                prefix = hostname() .. worker.id, -- optional metric prefix
                host = '127.0.0.1',  -- graphite server address
                port = 2003,         -- graphite server port
                interval = 5 * sec,  -- publish interval
                tcp = false          -- set to true if you want TCP mode
        }
}
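For orientation, the Graphite plaintext protocol carries one metric per line in the form "path value timestamp". The sketch below only formats such a line; it is not the module's own code, and the way the prefix is joined to the metric name with a dot is an assumption. The function name graphite_line and the sample values are hypothetical.

```lua
-- Format one line of the Graphite plaintext protocol.
-- Joining prefix and metric name with '.' is an assumption.
local function graphite_line(prefix, name, value, timestamp)
    local path = name
    if prefix and prefix ~= '' then
        path = prefix .. '.' .. name
    end
    return string.format('%s %s %d\n', path, tostring(value), timestamp)
end

-- Made-up prefix, value and UNIX timestamp
local line = graphite_line('resolver1', 'answer.total', 618631, 1700000000)
io.write(line)
```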

The module supports sending data to multiple servers at once.

modules = {
        graphite = {
                host = { '127.0.0.1', '1.2.3.4', '::1' },
        }
}

Dependencies

Prometheus metrics endpoint

The HTTP module exposes a /metrics endpoint that serves metrics from the Statistics collector in Prometheus text format. You can use it as soon as the HTTP module is configured:

$ curl -k https://localhost:8453/metrics | tail
# TYPE latency histogram
latency_bucket{le=10} 2.000000
latency_bucket{le=50} 2.000000
latency_bucket{le=100} 2.000000
latency_bucket{le=250} 2.000000
latency_bucket{le=500} 2.000000
latency_bucket{le=1000} 2.000000
latency_bucket{le=1500} 2.000000
latency_bucket{le=+Inf} 2.000000
latency_count 2.000000
latency_sum 11.000000

You can namespace the metrics in configuration, using the http.prometheus.namespace attribute:

modules.load('http')
-- Set Prometheus namespace
http.prometheus.namespace = 'resolver_'

You can also add custom metrics or rewrite existing metrics before they are returned to the Prometheus client.

modules.load('http')
-- Add an arbitrary metric to Prometheus
http.prometheus.finalize = function (metrics)
        table.insert(metrics, 'build_info{version="1.2.3"} 1')
end
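A finalize function can also append a value derived from existing counters. The following sketch builds such a function around a getter so that it can be tried outside the resolver; inside the configuration one would presumably pass stats.get as the getter. The names make_finalizer and cache_hit_ratio are hypothetical, and the code assumes, as in the example above, that metrics is an array of text lines.

```lua
-- Build a finalize function that appends a cache hit ratio.
-- `get` is any function returning a counter value by name.
local function make_finalizer(get)
    return function (metrics)
        local total = get('answer.total') or 0
        if total > 0 then
            local ratio = (get('answer.cached') or 0) / total
            table.insert(metrics, string.format('cache_hit_ratio %f', ratio))
        end
    end
end

-- Stand-alone trial with made-up counters instead of stats.get
local fake = { ['answer.total'] = 4, ['answer.cached'] = 1 }
local finalize = make_finalizer(function (key) return fake[key] end)
local metrics = {}
finalize(metrics)
print(metrics[1])
```

In a configuration this might be wired up as http.prometheus.finalize = make_finalizer(stats.get), assuming both modules are loaded.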