Monitor DSL
As with node loading, Arborist has pluggable hooks for sourcing monitor data from various backends (databases, LDAP, etc.). Only the file source ships by default. This reference page covers every option for the file-backed source.
Overview
Monitor definition files are loaded by the Monitor daemon at startup. You can organize them however you please: files are discovered regardless of directory depth, but each file must end in a .rb extension to be considered.
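The discovery rule above can be illustrated with plain Ruby. This is only a sketch of the rule, not Arborist's loader: `monitor_files` is a hypothetical helper, and the directory layout is invented for the demonstration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of Arborist): collect every .rb file under
# +dir+, at any depth, in a stable order.
def monitor_files( dir )
  Dir.glob( File.join(dir, '**', '*.rb') ).sort
end

# Build a throwaway tree to show which files would be considered.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p( File.join(root, 'network', 'ping') )
  FileUtils.touch( File.join(root, 'disk.rb') )
  FileUtils.touch( File.join(root, 'network', 'ping', 'fping.rb') )
  FileUtils.touch( File.join(root, 'network', 'README.md') ) # skipped: not .rb

  puts monitor_files( root ).map {|path| path.delete_prefix("#{root}/") }
end
```

Nesting depth makes no difference here; only the extension decides whether a file is picked up.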
Each monitor is given a human-readable description for use in user interfaces, and one or more attributes that describe which nodes should be monitored, how they should be monitored, and how often the monitor should be run.
Keywords
keyword | description |
---|---|
description | Set a human-readable description for the monitor, for use in interfaces or logs. This can be declared as an argument to the constructor. |
every | How often the monitor should run, in seconds. Default is 300 seconds. |
exclude | A hash of search criteria that removes matching nodes from the result set. |
exclude_down | If set, this option returns only nodes considered to be in a 'reachable' state (not in a status of down, disabled, or quieted). |
exec | Specify what should be run to do the actual monitoring. This argument has multiple forms, described in detail below. |
exec_arguments | Alter the default arguments supplied to the spawned monitor, described in detail below. |
exec_callbacks | A module that wraps exec_arguments, exec_input, and exec, described in detail below. |
exec_input | Write to the spawned monitor's STDIN stream. Described in detail below. |
handle_results | Parse the STDOUT and STDERR of the spawned monitor. Described in detail below. |
key | Declare a namespace for the monitor. The error status for a node is keyed by this value, so that monitors with different keys don't clear each other's errors. This is mandatory, though it can be declared as an argument to the constructor. |
match | A hash of search criteria. The monitor is run for nodes that match this result set. |
splay | Random offset from the interval in seconds, to prevent monitors from all running at the same time. This can be set per monitor, or globally in the Arborist configuration file. |
use | Specify the list of properties to provide to the monitor for each node. If this is unspecified, the input to the monitor will be just the list of identifiers. Monitor modules can request specific attributes automatically. |
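To make the interaction between every and splay concrete, here is one plausible reading of it in plain Ruby. It assumes the splay is a random number of seconds, between zero and the splay value, added to the interval. Arborist's actual scheduling may differ, and `next_run_delay` is a hypothetical helper, not an Arborist method.

```ruby
# Hypothetical helper (an assumption, not Arborist's scheduler): the delay
# before a monitor's next run is its interval plus a random offset of up
# to +splay+ seconds.
def next_run_delay( every, splay = 0, rng: Random.new )
  offset = splay.zero? ? 0.0 : rng.rand( 0.0..splay.to_f )
  every + offset
end

# Three monitors sharing "every 45" and "splay 10" end up staggered.
puts Array.new( 3 ) { next_run_delay(45, 10).round(2) }.inspect
```

With a splay of zero the delay is exactly the interval, which matches the table's description of every on its own.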
Exec
exec is where the monitor's core action takes place, and it has several modes. You can write a monitor action inline in Ruby, write it as a reusable Ruby module, or spawn an external process, which allows integration with other tools.
exec( command )
This first form simply spawns the specified command with its STDIN opened to a stream of serialized node data. If the command expects hardcoded arguments or flags, give it as an array of strings.
exec 'fping'
exec 'fping', '-e', '-t', '150'
exec %w[ fping -e -t 150 ]
By default, the format of the serialized nodes is one node per line, and each line looks like this:
«identifier» «attribute1»=«attribute1 value» «attribute2»=«attribute2 value»
Each line uses shell-escaping semantics: an attribute value that contains whitespace should be quoted, control characters need to be escaped, and so on.
For example, a monitor that declares use :addresses might receive input like:
duir addresses=192.168.16.3
sidonie addresses=192.168.16.3
yevaud addresses=192.168.16.10
This works well for attributes that are simple key/value pairs. Array attributes are serialized to a comma-separated string.
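The line format described above can be approximated in a few lines of Ruby. This is a sketch of the format only: `serialize_node` is a hypothetical stand-in for whatever Arborist does internally, and using `Shellwords.escape` for the shell-escaping step is an assumption.

```ruby
require 'shellwords'

# Hypothetical stand-in for Arborist's serializer: one line per node, the
# identifier first, then name=value pairs with shell-escaped values.
def serialize_node( identifier, attributes )
  pairs = attributes.map do |name, value|
    value = value.join( ',' ) if value.is_a?( Array ) # arrays become a,b,c
    "#{name}=#{Shellwords.escape( value.to_s )}"
  end
  [ identifier, *pairs ].join( ' ' )
end

puts serialize_node( 'duir', { addresses: ['192.168.16.3'] } )
# → duir addresses=192.168.16.3

# Values with whitespace are escaped so the line still splits cleanly.
line = serialize_node( 'sidonie', { addresses: ['192.168.16.3', '10.0.0.4'], description: 'mail relay' } )
p Shellwords.split( line )
```

A command on the receiving end can recover the fields with any shell-style tokenizer, then split each pair on the first `=`.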
If the command you are running doesn't support this format, or your monitor needs to use attributes that don't serialize to this format, you can override this in one of two ways.
If your command expects the node data as command-line arguments, you can provide a custom exec_arguments block. It will receive an Array of Arborist::Node objects, and should generate an Array of arguments to append to the command before spawning it.
exec_arguments do |nodes|
  # Build an address -> node mapping for pairing the updates back up by address
  @node_map = nodes.each_with_object( {} ) do |node, hash|
    address = node.addresses.first
    hash[ address ] = node
  end

  @node_map.keys
end
If your command expects the node data via STDIN, but in a different format, you may declare an exec_input block. It will be called with the same node array, plus an IO open to the STDIN of the running command. You can combine it with an exec_arguments block if you're dealing with something unusual that needs both.
exec_input do |nodes, writer|
  # Build an address -> node mapping for pairing the updates back up by address
  @node_map = nodes.each_with_object( {} ) do |node, hash|
    address = node.addresses.first
    hash[ address ] = node
  end

  # Write one address per line to the command's STDIN
  writer.puts( @node_map.keys )
end
Note: If your command doesn't require input to STDIN at all, you currently need to override the default exec_input block with a no-op:
exec_input{|*|}
By default, the monitor must write its results to STDOUT in the same format, one line for each listed identifier that requires an update. Using the example above for a ping check, the results might look like:
sidonie rtt=10ms
duir rtt=103ms
yevaud rtt= error="Host unreachable."
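As an illustration of how result lines in this format break down into per-node updates, here is a small parser in plain Ruby. It is a sketch under the assumption that `Shellwords.split` matches the shell-escaping rules in use; `parse_result_line` and `parse_results` are hypothetical names, not Arborist methods.

```ruby
require 'shellwords'

# Hypothetical parser: split one result line into its identifier and a
# Hash of attribute updates.
def parse_result_line( line )
  identifier, *pairs = Shellwords.split( line )
  update = pairs.each_with_object( {} ) do |pair, attrs|
    name, value = pair.split( '=', 2 ) # "rtt=" yields an empty value
    attrs[ name.to_sym ] = value
  end
  [ identifier, update ]
end

# Parse a whole STDOUT capture into a Hash keyed by node identifier.
def parse_results( output )
  lines = output.each_line.reject {|line| line.strip.empty? }
  lines.map {|line| parse_result_line(line) }.to_h
end

output = <<~RESULTS
  sidonie rtt=10ms
  duir rtt=103ms
  yevaud rtt= error="Host unreachable."
RESULTS

parse_results( output ).each {|identifier, update| puts "#{identifier}: #{update.inspect}" }
```

Note that the quoted error message survives as a single value, and that the empty `rtt=` is preserved as an empty string rather than dropped.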
If, as is likely, the program writes its output in some other format, you can provide a handle_results block. It will be called with the program's pid, and open handles to its STDOUT and STDERR. It should return a Hash of update Hashes, keyed by the identifier of the node each update should be sent to.
handle_results do |pid, out, err|
  updates = {}

  # Expected output, one address per line:
  #   127.0.0.1 is alive (0.12 ms)
  #   8.8.8.8 is alive (61.6 ms)
  #   192.168.16.16 is unreachable
  out.each_line do |line|
    address, status = line.chomp.split( /\s+/, 2 )

    # Use the @node_map we created in exec_arguments to map the output
    # back into identifiers. Error-checking omitted for brevity.
    identifier = @node_map[ address ].identifier

    # Capture only the number, so that Float() can parse it
    if status =~ /is alive \((\d+\.\d+) ms\)/i
      updates[ identifier ] = { ping: { rtt: Float($1) } }
    else
      updates[ identifier ] = { error: status }
    end
  end

  updates
end
Unlisted attributes are unchanged. A listed attribute with an empty value is explicitly cleared. An identifier that isn't listed in the results means no update is necessary for that node.
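These three rules can be modelled as a small merge. The sketch below is an interpretation, not Arborist's code: it assumes that "cleared" means the attribute is removed from the node, and `apply_update` and `apply_results` are hypothetical helpers.

```ruby
# Hypothetical helper: merge one update into a node's attributes.
# Unlisted attributes are kept; a listed attribute with an empty value is
# removed (assumed meaning of "cleared").
def apply_update( attributes, update )
  update.each_with_object( attributes.dup ) do |(name, value), merged|
    if value.nil? || value.to_s.empty?
      merged.delete( name )
    else
      merged[ name ] = value
    end
  end
end

# Hypothetical helper: nodes missing from the results are left untouched.
def apply_results( nodes, results )
  nodes.to_h do |identifier, attributes|
    [ identifier, apply_update(attributes, results.fetch(identifier, {})) ]
  end
end

nodes   = { 'duir' => { rtt: '103ms', error: 'timeout' }, 'yevaud' => { rtt: '12ms' } }
results = { 'duir' => { rtt: '98ms', error: '' } }

p apply_results( nodes, results )
```

Here duir gets a new rtt and loses its error, while yevaud, which is absent from the results, keeps its attributes as they were.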
If you find yourself wanting to repeat one or more of the exec callbacks, you can also wrap them in a module, and call exec_callbacks with the module name as the argument. This is usually the fastest way to make a reusable module that deals with external commands. For a complete example using exec_callbacks, see the arborist-fping gem.
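A callbacks module might be shaped roughly like the sketch below. The method names and arguments mirror the block signatures shown earlier on this page, but how Arborist looks them up on the module (module functions, as assumed here, or instance methods) is not specified on this page, so treat the arborist-fping gem as the authority. `PingCallbacks` and `StubNode` are hypothetical names, and the last lines drive the callbacks by hand in place of a real spawned command.

```ruby
require 'stringio'

# Stand-in for Arborist::Node, just enough for this sketch.
StubNode = Struct.new( :identifier, :addresses )

# Hypothetical callbacks module; module_function is an assumption.
module PingCallbacks
  module_function

  # Pass one address per node on the command line, remembering which
  # node each address belongs to.
  def exec_arguments( nodes )
    @node_map = nodes.each_with_object( {} ) {|node, hash| hash[ node.addresses.first ] = node }
    @node_map.keys
  end

  # Nothing goes to STDIN, since the addresses are passed as arguments.
  def exec_input( nodes, writer )
  end

  # Turn "ADDRESS is alive (N ms)" / "ADDRESS is unreachable" lines into updates.
  def handle_results( pid, out, err )
    out.each_line.each_with_object( {} ) do |line, updates|
      address, status = line.chomp.split( /\s+/, 2 )
      node = @node_map[ address ]
      next unless node # ignore output for unknown addresses

      updates[ node.identifier ] =
        if status =~ /is alive \((\d+\.\d+) ms\)/
          { ping: { rtt: Float($1) } }
        else
          { error: status }
        end
    end
  end
end

# Drive the callbacks by hand with canned output.
nodes = [ StubNode.new('duir', ['192.168.16.3']), StubNode.new('yevaud', ['192.168.16.10']) ]
p PingCallbacks.exec_arguments( nodes )
canned = StringIO.new( "192.168.16.3 is alive (0.12 ms)\n192.168.16.10 is unreachable\n" )
p PingCallbacks.handle_results( 0, canned, StringIO.new )
```

Keeping all three callbacks in one module lets them share the address-to-node map without repeating it in every monitor definition.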
exec {|node_attributes| ... }
In this form, no external command is launched. The monitor implementation is contained entirely within the Ruby block. It is passed the matching nodes, keyed by identifier. Like handle_results, it should return a Hash of Hashes, also keyed by identifier.
exec do |nodes|
  nodes.each_with_object({}) do |(identifier, attributes), results|
    results[ identifier ] = rand(2).zero? ? {} : { warning: 'Chaos monkey!' }
  end
end
exec( module )
Passing a Ruby module to exec bypasses all the default block behaviors. It expects any object that responds to #run, which is invoked the same way as the block.
module Example
  module_function

  def run( nodes )
    return nodes.each_with_object({}) do |(identifier, attributes), results|
      results[ identifier ] = rand(2).zero? ? {} : { warning: 'Chaos monkey!' }
    end
  end
end

exec( Example )
For a complete example using exec with a module argument, see the arborist-snmp gem.
Example
An arbitrary and incomplete syntax demonstration.
require 'arborist/monitor/socket'
require 'arborist/monitor/fping'
require 'arborist/snmp'

using Arborist::TimeRefinements

Arborist::Monitor 'ping check', :ping do
  every 15.seconds
  match type: 'host'
  exec 'fping', '-e', '-t', '150'
  exec_callbacks( Arborist::Monitor::FPing )
end

Arborist::Monitor 'udp service checks', :udp do
  every 45.seconds
  splay 10
  match type: 'service', protocol: 'udp'
  exec( Arborist::Monitor::Socket::UDP )
end

Arborist::Monitor 'disk space check', :disk do
  every 1.minute
  match type: 'resource', category: 'disk'
  exec( Arborist::Monitor::SNMP::Disk )
end

Arborist::Monitor 'example' do
  description 'Hand wavey inline ruby monitor'
  key :example
  every 15.seconds
  match type: 'host'
  use :addresses, :config

  exec do |nodes|
    results = {}
    nodes.each_pair do |node, attributes|
      res = rand(2).zero?
      if res
        results[ node ] = {}
      else
        results[ node ] = { error: 'Test monitor failed.' }
      end
    end
    results
  end
end