Skip to main content

Monitoring Node Health

A Flow node generates logs and publishes metrics as it runs. These logs and metrics can be used to gain insights into the health of the node.

Logs​

Logs are emitted to stdout as JSON formed strings. Where these logs are available on your system depends on how you launch your containers. On systemd systems for example, the logs will be sent to the system journal daemon journald. Other systems may log to /var/log.

Metrics​

Flow nodes produce health metrics in the form of Prometheus metrics, exposed from the node software on /metrics.

If you wish to make use of these metrics, you'll need to set up a Prometheus server to scrape your Nodes. Alternatively, you can deploy the Prometheus Server on top of your current Flow node to see the metrics without creating an additional server.

The flow-go application doesn't expose any metrics from the underlying host such as CPU, network, or disk usages. It is recommended you collect these metrics in addition to the ones provided by flow using a tool like node exporter (https://github.com/prometheus/node_exporter)

  1. Copy the following Prometheus configuration into your current flow node


    _12
    global:
    _12
    scrape_interval: 15s # By default, scrape targets every 15 seconds.
    _12
    _12
    scrape_configs:
    _12
    # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
    _12
    - job_name: 'prometheus'
    _12
    _12
    # Override the global default and scrape targets from this job every 5 seconds.
    _12
    scrape_interval: 5s
    _12
    _12
    static_configs:
    _12
    - targets: ['localhost:8080']

  2. Start Prometheus server


    _10
    docker run \
    _10
    --network=host \
    _10
    -p 9090:9090 \
    _10
    -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
    _10
    prom/prometheus"

  3. (optional) Port forward to the node if you are not able to access port 9090 directly via the browser ssh -L 9090:127.0.0.1:9090 YOUR_NODE

  4. Open your browser and go to the URL http://localhost:9090/graph to load the Prometheus Dashboard

Key Metric Overview​

The following are some important metrics produced by the node.

Metric NameDescription
go_*Go runtime metrics
consensus_compliance_finalized_heightLatest height finalized by this node; should increase at a constant rate.
consensus_compliance_sealed_heightLatest height sealed by this node; should increase at a constant rate.
consensus_hotstuff_cur_viewCurrent view of the HotStuff consensus algorith; Consensus/Collection only; should increase at a constant rate.
consensus_hotstuff_timeout_secondsHow long it takes to timeout failed rounds; Consensus/Collection only; values consistently larger than 5s are abnormal.

Monitoring a Flow node using Metrika Monitoring​

Metrika has developed the Flow node monitoring service and is the recommended way of monitoring a Flow node. It is a free tool that uses the logs and metrics published by the node and provides access to private node-specific dashboards. Follow this link to setup the Metrika monitoring for your node.