Netdata Monitoring Solution

Good monitoring solutions are a must in any environment, and the IT world is fit to bursting with them. Each has its pros and cons, and in this post I’m going to look at a fairly recent solution called netdata. You install and run it on each system you want to monitor and then access that data through a web interface.

First off let me provide links to the GitHub source page and to some demo environments.

GitHub netdata source page

Demo sites

I’ll also include the creator’s description:

‘netdata is a system for distributed real-time performance and health monitoring. It provides unparalleled insights, in real-time, of everything happening on the system it runs (including applications such as web and database servers), using modern interactive web dashboards.’

‘netdata is fast and efficient, designed to permanently run on all systems (physical & virtual servers, containers, IoT devices), without disrupting their core function.’

  • Stunning interactive bootstrap dashboards
    • Mouse and touch friendly, in 2 themes: dark, light
  • Amazingly fast
    • Responds to all queries in less than 0.5 ms per metric, even on low-end hardware
  • Highly efficient
    • Collects thousands of metrics per server per second, with just 1% CPU utilization of a single core, a few MB of RAM and no disk I/O at all
  • Sophisticated alarming
    • Supports dynamic thresholds, hysteresis, alarm templates, multiple role-based notification methods (such as email, slack.com, pushover.net, pushbullet.com, telegram.org, twilio.com, messagebird.com)
  • Extensible
    • You can monitor anything you can get a metric for, using its Plugin API (anything can be a netdata plugin, BASH, python, perl, node.js, java, Go, ruby, etc)
  • Embeddable
    • It can run anywhere a Linux kernel runs (even IoT) and its charts can be embedded on your web pages too
  • Customizable
    • Custom dashboards can be built using simple HTML (no javascript necessary)
  • Zero configuration
    • Auto-detects everything, it can collect up to 5000 metrics per server out of the box
  • Zero dependencies
    • It is even its own web server, for its static web files and its web API
  • Zero maintenance
    • You just run it, it does the rest
  • Scales to infinity
    • Requiring minimal central resources
  • Back-ends supported
    • Can archive its metrics on graphite or opentsdb, in the same or lower detail (lower: to prevent it from congesting these servers due to the amount of data collected)

 

Right, now that we’ve got that out of the way, let’s take a quick look at the installation process and then at some examples of the charts available. I really do recommend you check out some of the demo links to get a feel for this great product.

 

Installation

The creator has provided installation guidance on the GitHub source page – https://github.com/firehol/netdata/wiki/Installation

I chose to set up a virtual server (VM) instance of netdata on the latest version of Ubuntu Server (16.10). Once my VM had been deployed and networking configured, I made sure to update the OS with the usual apt-get update/upgrade combination.

Once the OS was updated I moved on to installing netdata itself, which is wonderfully simple to do. Following the installation guidance, you will see we have two options: a basic install, and a full install which monitors everything netdata supports. I’m not going to cover every detail, as the creator has done a good job of that.

Basic Install

 

Full Install

 

Once you have gone through the setup process you can access netdata through a browser, by default using the following URL –

  • http://serverName:19999/

To access the netdata configuration you can browse to the same URL with /netdata.conf appended to the end.

  • http://serverName:19999/netdata.conf
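If you end up with more than a handful of monitored systems, it can be handy to generate these two URLs rather than type them. The following is a minimal sketch that only uses the port and paths mentioned above; the function names (`netdata_urls`, `fetch_config`) are my own invention for illustration and are not part of netdata.

```python
from urllib.request import urlopen

NETDATA_PORT = 19999  # default port, as used in the URLs above


def netdata_urls(server_name, port=NETDATA_PORT):
    """Return the dashboard and configuration URLs for one netdata host."""
    base = f"http://{server_name}:{port}"
    return {
        "dashboard": f"{base}/",
        "config": f"{base}/netdata.conf",
    }


def fetch_config(server_name, port=NETDATA_PORT, timeout=5):
    """Download the configuration page as text.

    Hypothetical helper: it only works if a netdata instance is actually
    reachable at the given host and port.
    """
    url = netdata_urls(server_name, port)["config"]
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # "serverName" is the placeholder host name used in this post.
    for name, url in netdata_urls("serverName").items():
        print(f"{name}: {url}")
```

Running the script just prints the two URLs; `fetch_config` is there if you want to pull the configuration text from a live host.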

I went with the full install to monitor everything on my VM, and it certainly covered a lot! Without changing any configuration options (well, except memory deduplication), netdata reported the following about the monitoring of this system:

netdata on BSA-NETDATA01:

  • Collects every second 1,212 metrics
  • Presented as 180 charts and monitored by 69 alarms
  • Using 23 MB of memory for 1 hour of real-time history.
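To put those numbers in perspective, here is a rough back-of-the-envelope calculation of how much memory each stored sample costs. It is only a sketch under two assumptions of mine: that "MB" means 1024 × 1024 bytes, and that the whole 23 MB is spent on the metric history (in reality some of it will be program overhead, so the true per-sample figure should be lower).

```python
METRICS_PER_SECOND = 1212   # metrics collected every second (from the summary above)
HISTORY_SECONDS = 60 * 60   # 1 hour of real-time history
MEMORY_MB = 23              # memory reported by netdata


def samples_held(metrics_per_second, history_seconds):
    """Total number of samples kept in memory for the history window."""
    return metrics_per_second * history_seconds


def bytes_per_sample(metrics_per_second, history_seconds, memory_mb,
                     bytes_per_mb=1024 * 1024):
    """Upper-bound estimate: all reported memory divided by all samples."""
    total_bytes = memory_mb * bytes_per_mb
    return total_bytes / samples_held(metrics_per_second, history_seconds)


if __name__ == "__main__":
    total = samples_held(METRICS_PER_SECOND, HISTORY_SECONDS)
    per_sample = bytes_per_sample(METRICS_PER_SECOND, HISTORY_SECONDS, MEMORY_MB)
    print(f"{total:,} samples, roughly {per_sample:.1f} bytes each")
```

That works out to a little over four million samples at only a handful of bytes each, which goes some way to explaining the small memory footprint.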

 

I’m always interested in the load any monitoring solution places on a system. After all, it’s great having granular data, but what if collecting it adds noticeable load to the target? I think it is safe to say netdata hardly touches a system, especially when you consider the number of metrics and the frequency at which it collects them. I could probably run this on one of my Raspberry Pi units without much worry! It is also fantastic to see that each metric graph has a description of where the data came from and what it actually means. Too many times I’ve seen a monitoring system display data under a heading that means nothing to me, and I’ve had to dig around to find out what the metric is and why I should care.

This is a product I will be testing and evaluating both at home and at work. I’m really interested to see what I can do with it, including exporting the data to a back end and then perhaps presenting it through something like Grafana.

To end the post let’s have a few screenshots! Notice in the final screenshot that each resource in the netdata side bar can be expanded to show the various categories monitored.

Netdata System Overview

Netdata Context Switches

Netdata Memory Deduper

Netdata Side Bar

 
