Monitor DSL

As with node loading, Arborist has pluggable hooks for sourcing monitor data from elsewhere (databases, LDAP, etc.). Only the file source ships by default. This reference page outlines the available options for the file-backed source.

Overview

Monitor definition files are loaded by the Monitor daemon at startup. You can organize them however you please: files are discovered regardless of directory depth, but must end in a .rb extension to be considered.

Each monitor is given a human-readable description for use in user interfaces, and one or more attributes that describe which nodes should be monitored, how they should be monitored, and how often the monitor should be run.

Keywords

keyword description
description Set a human-readable description for the monitor, for use in interfaces or logs. This can be declared as an argument to the constructor.
every How often the monitor should run, in seconds. Default is 300 seconds.
exclude A hash of search criteria that removes matching nodes from the result set.
exclude_down If set, the monitor is run only for nodes considered to be in a 'reachable' state (not in a status of down, disabled, or quieted).
exec Specify what should be run to do the actual monitoring. This argument has multiple forms, described in detail below.
exec_arguments Alter the default arguments supplied to the spawned monitor, described in detail below.
exec_callbacks A module that wraps exec_arguments, exec_input, and handle_results, described in detail below.
exec_input Write to the spawned monitor's STDIN stream. Described in detail below.
handle_results Parse the STDOUT and STDERR of the spawned monitor. Described in detail below.
key Declare a namespace for the monitor. The error status for a node is keyed by this value, so that monitors with different keys don't clear each other's errors. This is mandatory, though it can be declared as an argument to the constructor.
match A hash of search criteria. The monitor is run for nodes that match this result set.
splay A random offset from the interval, in seconds, to prevent monitors from all running at the same time. This can be set per monitor, or globally in the Arborist configuration file.
use Specify the list of properties to provide to the monitor for each node. If this is unspecified, the input to the monitor will be just the list of identifiers. Monitor modules can request specific attributes automatically.
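
To make the interplay of match, exclude, and exclude_down more concrete, here is a rough model in plain Ruby. It is only a sketch: select_nodes is a hypothetical helper rather than part of Arborist, the environment attribute and the 'up' status are invented for the example, and criteria are compared with simple equality, which is almost certainly simpler than what the manager actually does.

```ruby
# Statuses treated as unreachable, taken from the exclude_down description.
UNREACHABLE = %w[ down disabled quieted ].freeze

# Hypothetical helper: pick the nodes a monitor would be run against.
# Assumption: a node must satisfy every criterion in a hash to count as matching.
def select_nodes( nodes, match:, exclude: {}, exclude_down: false )
    nodes.select do |node|
        matched  = match.all? {|key, value| node[key] == value }
        excluded = !exclude.empty? && exclude.all? {|key, value| node[key] == value }
        down     = exclude_down && UNREACHABLE.include?( node[:status] )

        matched && !excluded && !down
    end
end

nodes = [
    { identifier: 'duir',    type: 'host', status: 'up',   environment: 'production' },
    { identifier: 'sidonie', type: 'host', status: 'down', environment: 'production' },
    { identifier: 'yevaud',  type: 'host', status: 'up',   environment: 'testing' }
]

selected = select_nodes( nodes,
    match: { type: 'host' },
    exclude: { environment: 'testing' },
    exclude_down: true )
puts selected.map {|node| node[:identifier] }.inspect
```

In this model, sidonie is dropped by exclude_down and yevaud by exclude, leaving only duir.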

Exec

exec is where the monitor's core action takes place. There are differing modes for accomplishing this task. You can write a monitor action inline in Ruby, as a reusable Ruby module, or spawn an external process, allowing integration with other tools.

exec( command )

This first form spawns the specified command with its STDIN opened to a stream of serialized node data. If the command takes fixed arguments or flags, pass the command and each argument as separate strings (or as a single array of strings).

exec 'fping'
exec 'fping', '-e', '-t', '150'
exec %w[ fping -e -t 150 ]

By default, the format of the serialized nodes is one node per line, and each line looks like this:

«identifier» «attribute1»=«attribute1 value» «attribute2»=«attribute2 value»

Each line uses shell-escaping semantics: an attribute value that contains whitespace is quoted, control characters are escaped, and so on.

For example, a monitor that declares use :addresses might receive input like:

duir addresses=192.168.16.3
sidonie addresses=192.168.16.3
yevaud addresses=192.168.16.10

This works well for attributes that are simple key/value pairs. Array attributes are serialized to a comma-separated string.
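
If you are writing the receiving end yourself in Ruby, the default format can be read back with the standard library's Shellwords. The following is a minimal sketch, assuming the lines follow ordinary shell quoting as described above; parse_node_line and the description attribute are hypothetical, made up for the example.

```ruby
require 'shellwords'

# Hypothetical helper: split one serialized line into an identifier
# and a Hash of attribute names to (string) values.
def parse_node_line( line )
    identifier, *pairs = Shellwords.split( line.chomp )

    attributes = pairs.each_with_object( {} ) do |pair, hash|
        key, value = pair.split( '=', 2 )
        hash[ key ] = value
    end

    [ identifier, attributes ]
end

input = <<~NODES
    duir addresses=192.168.16.3
    yevaud addresses=192.168.16.10 description="Core router"
NODES

input.each_line do |line|
    identifier, attributes = parse_node_line( line )
    puts "#{identifier}: #{attributes.inspect}"
end
```

Array attributes arrive as a single comma-separated string, so a value such as addresses would still need a split( ',' ) if you want the individual entries.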

If the command you are running doesn't support this format, or your monitor needs to use attributes that don't serialize to this format, you can override this in one of two ways.

If your command expects the node data as command-line arguments, you can provide a custom exec_arguments block. It will receive an Array of Arborist::Node objects, and should generate an Array of arguments to append to the command before spawning it.

exec_arguments do |nodes|
    # Build an address -> node mapping for pairing the updates back up by address
    @node_map = nodes.each_with_object( {} ) do |node, hash|
        address = node.addresses.first
        hash[ address ] = node
    end

    @node_map.keys
end

If your command expects the node data via STDIN, but in a different format, you may declare an exec_input block. It will be called with the same node array, and additionally an IO open to the STDIN of the running command. This can be combined with the exec_arguments block, if you're dealing with something really weird, and need both.

exec_input do |nodes, writer|
    # Build an address -> node mapping for pairing the updates back up by address
    @node_map = nodes.each_with_object( {} ) do |node, hash|
        address = node.addresses.first
        hash[ address ] = node
    end

    # Write one address per line to the command's STDIN
    writer.puts( @node_map.keys )
end

Note: If your command doesn't require input to STDIN at all, you currently need to override the default exec_input block with a no-op:

exec_input{|*|}

By default, the monitor must write its results to STDOUT in the same format, with one line for each of the listed identifiers that requires an update. Using the ping check example above, the results might look like:

sidonie rtt=10ms
duir rtt=103ms
yevaud rtt= error="Host unreachable."
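
For a monitor command written in Ruby, result lines like these can be produced with Shellwords as well. This is a sketch under the same shell-escaping assumption; format_result_line is a hypothetical helper, and note that Shellwords.escape protects whitespace with backslashes rather than double quotes, which is equivalent under shell rules.

```ruby
require 'shellwords'

# Hypothetical helper: build one result line for a node identifier.
# A nil or empty value is written as a bare `key=`.
def format_result_line( identifier, attributes )
    pairs = attributes.map do |key, value|
        text = value.to_s
        text.empty? ? "#{key}=" : "#{key}=#{Shellwords.escape( text )}"
    end

    [ identifier, *pairs ].join( ' ' )
end

puts format_result_line( 'sidonie', { rtt: '10ms' } )
puts format_result_line( 'yevaud', { rtt: nil, error: 'Host unreachable.' } )
```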

If the program writes its output in some other format (which is likely), you can provide a handle_results block. It will be called with the program's pid and open handles to its STDOUT and STDERR, and should return a Hash of update Hashes, keyed by the identifier of the node each update should be sent to.

handle_results do |pid, out, err|
    updates = {}

    out.each_line do |line|
        # Example output:
        #   127.0.0.1 is alive (0.12 ms)
        #   8.8.8.8 is alive (61.6 ms)
        #   192.168.16.16 is unreachable
        address, status = line.chomp.split( /\s+/, 2 )

        # Use the @node_map we created in the exec_arguments block to map the
        # output back into identifiers. Error-checking omitted for brevity.
        identifier = @node_map[ address ].identifier

        # Capture only the number, so that Float() can convert it.
        if status =~ /is alive \((\d+\.\d+) ms\)/i
            updates[ identifier ] = { ping: { rtt: Float($1) } }
        else
            updates[ identifier ] = { error: status }
        end
    end

    updates
end

Unlisted attributes are unchanged. A listed attribute with an empty value is explicitly cleared. An identifier that isn't listed in the results means no update is necessary for that node.
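
The three rules above can be modelled in a few lines of plain Ruby. This is purely an illustration of the semantics, not how Arborist applies updates internally: apply_updates is a hypothetical helper, and "cleared" is modelled here as removing the key, which is an assumption.

```ruby
# Hypothetical helper: apply a Hash of update Hashes to a Hash of node
# attributes, both keyed by identifier.
def apply_updates( nodes, updates )
    updates.each do |identifier, attributes|
        node = nodes[ identifier ] or next

        attributes.each do |key, value|
            if value.nil? || value.to_s.empty?
                node.delete( key )      # listed with an empty value: cleared
            else
                node[ key ] = value     # listed with a value: replaced
            end
        end
    end

    nodes
end

nodes = {
    'duir'   => { rtt: '103ms' },
    'yevaud' => { rtt: '12ms', error: 'Host unreachable.' }
}

# Only yevaud is listed, and only its error attribute is touched.
apply_updates( nodes, { 'yevaud' => { error: '' } } )
puts nodes.inspect
```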

If you find yourself repeating one or more of the exec callbacks across monitors, you can wrap them in a module and call exec_callbacks with that module as the argument. This is usually the fastest way to make a reusable module that deals with external commands. For a complete example using exec_callbacks, see the arborist-fping gem.
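
As a rough idea of what such a module might look like, the sketch below gathers the callbacks from the earlier examples into one place. It assumes the module's methods mirror the names and arguments of the corresponding blocks; PingCallbacks is hypothetical, and Node is a stand-in Struct so the example runs without Arborist loaded.

```ruby
require 'stringio'

# Hypothetical stand-in for Arborist::Node, just for this sketch.
Node = Struct.new( :identifier, :addresses )

module PingCallbacks
    # Pass each node's first address as a command-line argument.
    def exec_arguments( nodes )
        @node_map = nodes.each_with_object( {} ) do |node, hash|
            hash[ node.addresses.first ] = node
        end
        @node_map.keys
    end

    # The command reads nothing from STDIN.
    def exec_input( nodes, writer )
    end

    # Map "ADDRESS is alive (N ms)" style output back to identifiers.
    def handle_results( pid, out, err )
        out.each_line.with_object( {} ) do |line, updates|
            address, status = line.chomp.split( /\s+/, 2 )
            node = @node_map[ address ] or next

            updates[ node.identifier ] =
                if status =~ /is alive \((\d+\.\d+) ms\)/i
                    { ping: { rtt: Float($1) } }
                else
                    { error: status }
                end
        end
    end
end

# Drive the callbacks by hand, with canned output instead of a real process.
runner    = Object.new.extend( PingCallbacks )
nodes     = [ Node.new('duir', ['192.168.16.3']), Node.new('yevaud', ['192.168.16.10']) ]
arguments = runner.exec_arguments( nodes )
output    = StringIO.new( "192.168.16.3 is alive (0.12 ms)\n192.168.16.10 is unreachable\n" )
updates   = runner.handle_results( 0, output, StringIO.new )
puts arguments.inspect, updates.inspect
```

In a monitor definition, a module like this would be handed to exec_callbacks alongside an exec that names the command, as the ping check in the Example section does with Arborist::Monitor::FPing.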

exec {|node_attributes| ... }

In this form, no external command is launched. The monitor implementation is contained entirely within the Ruby block. It is passed the matching nodes, keyed by identifier. Like handle_results, it should return a Hash of Hashes, also keyed by identifier.

exec do |nodes|
    nodes.each_with_object({}) do |(identifier, attributes), results|
        results[ identifier ] = rand(2).zero? ? {} : { warning: 'Chaos monkey!' }
    end
end

exec( module )

Passing a Ruby module to exec bypasses all of the default block behaviors. Any object that responds to #run is accepted, and #run is invoked the same way as the block form above.

module Example
    module_function

    def run( nodes )
        return nodes.each_with_object({}) do |(identifier, attributes), results|
            results[ identifier ] = rand(2).zero? ? {} : { warning: 'Chaos monkey!' }
        end
    end
end

exec( Example )

For a complete example using exec with a module argument, see the arborist-snmp gem.
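
Since anything that responds to #run will do, a configured instance of a class works as well as a module. The sketch below is one way that might look; AddressCountCheck is hypothetical, and it assumes each node's attributes arrive as a Hash with symbol keys, which may not match what your monitor actually receives.

```ruby
# Hypothetical check: flag nodes that have fewer addresses than expected.
class AddressCountCheck
    def initialize( minimum: 1 )
        @minimum = minimum
    end

    # Same contract as the block form: takes nodes keyed by identifier,
    # returns a Hash of update Hashes keyed by identifier.
    def run( nodes )
        nodes.each_with_object( {} ) do |(identifier, attributes), results|
            count = Array( attributes[:addresses] ).length

            results[ identifier ] =
                if count >= @minimum
                    {}
                else
                    { error: "Expected at least #{@minimum} address(es), found #{count}." }
                end
        end
    end
end

sample = {
    'duir'   => { addresses: ['192.168.16.3'] },
    'yevaud' => { addresses: [] }
}

results = AddressCountCheck.new( minimum: 1 ).run( sample )
puts results.inspect
```

In a monitor definition this would be passed along the lines of exec( AddressCountCheck.new(minimum: 1) ), with use :addresses declared so that the attribute is supplied.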

Example

An arbitrary and incomplete syntax demonstration.

require 'arborist/monitor/socket'
require 'arborist/monitor/fping'
require 'arborist/snmp'

using Arborist::TimeRefinements

Arborist::Monitor 'ping check', :ping do
    every 15.seconds
    match type: 'host'
    exec 'fping', '-e', '-t', '150'
    exec_callbacks( Arborist::Monitor::FPing )
end

Arborist::Monitor 'udp service checks', :udp do
    every 45.seconds
    splay 10
    match type: 'service', protocol: 'udp'
    exec( Arborist::Monitor::Socket::UDP )
end

Arborist::Monitor 'disk space check', :disk do
    every 1.minute
    match type: 'resource', category: 'disk'
    exec( Arborist::Monitor::SNMP::Disk )
end

Arborist::Monitor 'example' do
    description 'Hand wavey inline ruby monitor'
    key :example
    every 15.seconds
    match type: 'host'
    use :addresses, :config

    exec do |nodes|
        results = {}

        nodes.each_pair do |node, attributes|
            res = rand(2).zero?
            if res
                results[ node ] = {}
            else
                results[ node ] = { error: 'Test monitor failed.' }
            end
        end

        results
    end
end