Aggregates

Reference documentation

What is a Sensu named aggregate?

Sensu named aggregates are collections of check results, accessible via the Aggregates API. Check aggregates make it possible to treat the results of multiple disparate checks, executed across multiple disparate systems, as a single result.

When should named aggregates be used?

Check aggregates are most useful in dynamic environments and/or environments that have a reasonable tolerance for failure. Check aggregates should be used when a service can be considered healthy as long as a minimum threshold is satisfied (e.g. are there at least 5 healthy web servers? are at least 70% of N processes healthy?).
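A threshold decision of this kind can be sketched in a few lines of Python. This is a minimal illustration and not part of Sensu: the function name aggregate_is_healthy and its min_ok and min_ok_percent parameters are hypothetical, and it assumes the caller already holds a mapping of per-status counters with "ok" and "total" keys.

```python
def aggregate_is_healthy(results, min_ok=None, min_ok_percent=None):
    """Return True when the counters satisfy every threshold that was supplied.

    `results` is assumed to be a mapping of per-status counters containing
    at least "ok" and "total". Both thresholds are optional.
    """
    ok = results.get("ok", 0)
    total = results.get("total", 0)

    # Assumption: an aggregate with no results at all is treated as unhealthy.
    if total == 0:
        return False

    # Absolute threshold, e.g. "at least 5 healthy web servers".
    if min_ok is not None and ok < min_ok:
        return False

    # Relative threshold, e.g. "at least 70% of N processes healthy".
    if min_ok_percent is not None and (100.0 * ok / total) < min_ok_percent:
        return False

    return True
```

For example, aggregate_is_healthy({"ok": 6, "total": 10}, min_ok_percent=70) returns False, because only 60% of the results are ok.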

How do named aggregates work?

Check results are included in an aggregate when the check definition includes the aggregate attribute. Results from a check defined with "aggregate": "example_aggregate" are aggregated under the corresponding name (e.g. example_aggregate), effectively capturing multiple check results as a single aggregate.

Example aggregated check result

Aggregated check results are available from the Aggregates API, via the /aggregates/:name API endpoint. An aggregate check result provides a set of counters indicating the total number of client members, checks, and check results collected, with a breakdown of how many results were recorded per status (i.e. ok, warning, critical, and unknown).

{
  "clients": 15,
  "checks": 2,
  "results": {
    "ok": 18,
    "warning": 0,
    "critical": 1,
    "unknown": 0,
    "total": 19,
    "stale": 0
  }
}
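As a rough sketch of consuming this payload from a script, the Python below fetches a named aggregate and reports the share of results that are not ok. The helper names fetch_aggregate and non_ok_percent are hypothetical, and the base URL http://localhost:4567 is an assumption taken from the curl examples on this page; adjust it for the deployment in question.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the Sensu API listens here, as in this page's curl examples.
API_URL = "http://localhost:4567"


def fetch_aggregate(name, api_url=API_URL, timeout=5):
    """GET /aggregates/:name and return the decoded JSON body."""
    url = f"{api_url}/aggregates/{quote(name, safe='')}"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def non_ok_percent(aggregate):
    """Percentage of collected results that are warning, critical, or unknown."""
    results = aggregate["results"]
    total = results["total"]
    if total == 0:
        return 0.0  # nothing collected yet, so nothing is failing
    non_ok = results["warning"] + results["critical"] + results["unknown"]
    return 100.0 * non_ok / total
```

A caller might run non_ok_percent(fetch_aggregate("example_aggregate")) and alert when the value crosses a chosen limit.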

Additional aggregate data is available from the Aggregates API, including Sensu client members of a named aggregate, and the corresponding checks which are included in the aggregate:

$ curl -s http://localhost:4567/aggregates/elasticsearch/clients | jq .
[
  {
    "name": "i-424242",
    "checks": [
      "elasticsearch_service",
      "elasticsearch_cluster_health"
    ]
  },
  {
    "name": "i-424243",
    "checks": [
      "elasticsearch_service"
    ]
  }
]

Aggregate data may also be fetched per check that is a member of the named aggregate, along with the corresponding clients that are producing results for said check:

$ curl -s http://localhost:4567/aggregates/elasticsearch/checks | jq .
[
  {
    "name": "elasticsearch_service",
    "clients": [
      "i-424242",
      "i-424243"
    ]
  },
  {
    "name": "elasticsearch_cluster_health",
    "clients": [
      "i-424242"
    ]
  }
]
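The two listings describe the same membership from opposite directions. As an illustration only, the Python sketch below derives the per-check view from a per-client listing of the shape shown above; checks_from_clients is a hypothetical helper, not a Sensu API, and the sample data is copied from the examples on this page.

```python
from collections import defaultdict


def checks_from_clients(clients_view):
    """Invert a list of {"name": client, "checks": [...]} into per-check entries."""
    members = defaultdict(list)
    for client in clients_view:
        for check in client["checks"]:
            members[check].append(client["name"])
    # Checks appear in the order they were first seen in the input.
    return [{"name": check, "clients": names} for check, names in members.items()]


clients_view = [
    {"name": "i-424242", "checks": ["elasticsearch_service", "elasticsearch_cluster_health"]},
    {"name": "i-424243", "checks": ["elasticsearch_service"]},
]
checks_view = checks_from_clients(clients_view)
```

Here checks_view lists elasticsearch_service with both clients and elasticsearch_cluster_health with i-424242 only.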

Aggregate configuration

Example aggregate definition

The following is an example check definition, a JSON configuration file located at /etc/sensu/conf.d/check_aggregate_example.json.

{
  "checks": {
    "example_check_aggregate": {
      "command": "do_something.rb -o option",
      "aggregate": "example_aggregate",
      "interval": 60,
      "subscribers": [
        "my_aggregate"
      ],
      "handle": false
    }
  }
}

Aggregate definition specification

NOTE: aggregates are created via the aggregate Sensu check definition attribute. The configuration example(s) provided above, and the “specification” provided here are for clarification and convenience only (i.e. this “specification” is just a subset of the check definition specification, and not a definition of a distinct Sensu primitive).

Aggregate check attributes

aggregate
description: Create a named aggregate for the check. Check result data will be aggregated and exposed via the Sensu Aggregates API.
required: false
type: String
example:
"aggregate": "elasticsearch"

aggregates
description: An array of strings defining one or more named aggregates (described above).
required: false
type: Array
example:
"aggregates": [ "webservers", "production" ]

handle
description: Whether events created by the check should be handled.
required: false
type: Boolean
default: true
example:
"handle": false
NOTE: although there are cases when it may be helpful to aggregate check results and handle individual check results, it is typically recommended to set "handle": false when aggregating check results, as the purpose of the aggregation should be to act on the state of the aggregated result(s) rather than the individual check result(s).
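The attribute types and the handle recommendation above can be checked mechanically before a definition is deployed. The following Python is a minimal sketch under that assumption: lint_aggregate_attributes is a hypothetical helper, it is not part of Sensu, and it inspects only the three attributes documented in this section.

```python
def lint_aggregate_attributes(check):
    """Return a list of human-readable problems found in one check definition."""
    problems = []

    # "aggregate" is documented as a String.
    if "aggregate" in check and not isinstance(check["aggregate"], str):
        problems.append("aggregate must be a String")

    # "aggregates" is documented as an Array of strings.
    if "aggregates" in check:
        value = check["aggregates"]
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append("aggregates must be an Array of strings")

    # "handle" is documented as a Boolean that defaults to true.
    handle = check.get("handle", True)
    if not isinstance(handle, bool):
        problems.append("handle must be a Boolean")
    elif handle and ("aggregate" in check or "aggregates" in check):
        problems.append('consider "handle": false for aggregated checks')

    return problems


example_check = {
    "command": "do_something.rb -o option",
    "aggregate": "example_aggregate",
    "interval": 60,
    "subscribers": ["my_aggregate"],
    "handle": False,
}
problems = lint_aggregate_attributes(example_check)
```

For the example check definition from this page, problems is an empty list.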