A pythonic example of recording metrics about ephemeral scripts with prometheus

    In my previous post we talked about how to record information from short-lived scripts (I call them ephemeral scripts, by the way) with prometheus. The example there was a script that checked the SMART status of each of the disks in a machine and reported that via pushgateway. I now want to work through a slightly more complicated example.

    I think you hit the limits of reporting simple values in shell scripts via curl requests fairly quickly. Take the SMART monitoring script as an example: SMART is capable of returning a whole heap of metrics about the performance of a disk, but we boiled that down to a single "health" value. This is largely because writing a parser for all the other values that smartctl returns would be inefficient and fragile in shell. So for this post, we're going to work through an example of how to report a variety of values from a python script. Those values could be the parsed output of smartctl, but to mix things up a bit, I'm going to use a different script I wrote recently.

    This new script uses the Weather Underground API to look up weather stations near my house, and then generates graphics of the weather forecast. These graphics are displayed on the various Cisco SIP phones I already had around the house. The forecasts look like this:



    The script to generate these weather forecasts is relatively simple python, and you can see the source code on github.

    My cunning plan here is to use prometheus' time series database and alerting capabilities to drive home automation around my house. The first step is to start gathering some simple facts about the home environment so that we can do trending and decision making on them. The code to do this isn't all that complicated. First off, we need to add the python prometheus client to our python environment, which is hopefully a venv:

    pip install prometheus_client
    pip install six
    


    That second dependency isn't a strict requirement for prometheus, but the script I'm working on needs it (because it needs to work out what's a text value, and python 3 is bonkers).

    Next we import the prometheus client in our code and set up the collector registry. At the same time I record when the script was run:

    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
    
    registry = CollectorRegistry()
    Gauge('job_last_success_unixtime', 'Last time the weather job ran',
          registry=registry).set_to_current_time()
    


    And then we just add gauges for any values we want to send to the pushgateway:

    Gauge('_'.join(field), '', registry=registry).set(value)
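
    That one-liner assumes field is a sequence of name parts and value is a number. As a rough sketch of where such pairs might come from (this is not lifted from the real script; flatten, metric_name, add_gauges and the sample data are all hypothetical names made up for illustration), you could walk a nested dict of parsed API output, skip anything that isn't numeric, and tidy the joined name so the client will accept it. As far as I know the client raises a ValueError for metric names containing characters like "-", hence the clean-up step:

```python
import re
from numbers import Real


def flatten(data, prefix=()):
    """Yield (field, value) pairs for every numeric leaf in a nested dict.

    field is a tuple of keys, e.g. ('current', 'temp_c').
    """
    for key, value in data.items():
        field = prefix + (str(key),)
        if isinstance(value, dict):
            yield from flatten(value, field)
        elif isinstance(value, Real) and not isinstance(value, bool):
            yield field, float(value)
        # Text, booleans and everything else are skipped: a gauge holds a number.


def metric_name(field):
    """Join the key path with underscores and replace any character that
    is not legal in a prometheus metric name."""
    name = re.sub(r'[^a-zA-Z0-9_:]', '_', '_'.join(field))
    # Metric names may not start with a digit.
    return '_' + name if name[:1].isdigit() else name


def add_gauges(observation, registry):
    """Register one gauge per numeric value found in observation."""
    # Imported here so the helpers above can be tried without the client installed.
    from prometheus_client import Gauge

    for field, value in flatten(observation):
        Gauge(metric_name(field), '', registry=registry).set(value)


# Hypothetical data, invented for this example.
sample = {
    'station': 'EXAMPLE1',
    'current': {'temp_c': 21.5, 'wind-kph': 12, 'raining': False},
}
flat = {metric_name(field): value for field, value in flatten(sample)}
print(flat)
```

    With the registry from earlier, add_gauges(sample, registry) would then create the current_temp_c and current_wind_kph gauges. The station name and the boolean are dropped, because a gauge can only carry a numeric value.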
    


    Finally, the values don't exist in the pushgateway until we actually push them there, which we do like this:

    push_to_gateway('localhost:9091', job='weather', registry=registry)
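
    One thing to consider for a script whose real job is something else (drawing forecast graphics, in my case) is whether a pushgateway outage should be allowed to kill it. Here is a minimal sketch of a wrapper that logs the failure and carries on. It assumes that push failures surface as OSError or one of its subclasses, which is my understanding of how the client's default urllib-based handler behaves; push_quietly and unreachable_gateway are hypothetical names, and the latter is just a stand-in so the example runs without a gateway:

```python
import sys


def push_quietly(push, *args, **kwargs):
    """Call push(*args, **kwargs) and return True on success, False on failure.

    Assumption: a failed push raises OSError or a subclass of it
    (urllib's URLError and HTTPError both are).
    """
    try:
        push(*args, **kwargs)
    except OSError as exc:
        print('pushgateway push failed: %s' % exc, file=sys.stderr)
        return False
    return True


def unreachable_gateway(gateway, job, registry):
    """Stand-in for push_to_gateway that behaves like a gateway that is down."""
    raise ConnectionRefusedError('connection refused: ' + gateway)


ok = push_quietly(unreachable_gateway, 'localhost:9091',
                  job='weather', registry=None)
print(ok)
```

    In the real script the call would be push_quietly(push_to_gateway, 'localhost:9091', job='weather', registry=registry). Whether swallowing the error is the right choice depends on how much you care about gaps in the data; the stale job_last_success_unixtime value is still there to alert on.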
    


    You can see the entire patch I wrote to add prometheus support on github if you're interested in an example with more context.

    Now we can have pretty graphs of temperature and stuff!

    Tags for this post: prometheus monitoring python pushgateway

posted at: 01:08