Server monitoring by Monit and Munin

In this article I am going to show you how you can keep an eye on your server, desktop, or laptop visually through a web browser. I am going to use two tools to do the job: monit and munin.

I am on Debian Lenny for this. First things first: get the software onto the system. Here we go:

Monit:

Before installing it, I queried the package database to see whether it was already installed (I might have installed it some time back!).

bhaskar@bhaskar-laptop_18:12:03_Tue Nov 16:/etc/monit> sudo dpkg -s monit
Package: monit
Status: install ok installed
Priority: optional
Section: admin
Installed-Size: 696
Maintainer: Stefan Alfredsson
Architecture: i386
Version: 1:4.10.1-4
Depends: libc6 (>= 2.7-1), libssl0.9.8 (>= 0.9.8f-5)
Conffiles:
/etc/default/monit cf582dd57fac58748aba3d7cf174f011
/etc/monit/monitrc d0127e44088e2c13e6eaef8f3cb95c9f
/etc/init.d/monit 3c19420528fdb85fd2669f6f7257a552
Description: A utility for monitoring and managing daemons or similar programs
monit is a utility for monitoring and managing daemons or similar
programs running on a Unix system. It will start specified programs
if they are not running and restart programs not responding.
.
monit supports:
* Daemon mode - poll programs at a specified interval
* Monitoring modes - active, passive or manual
* Start, stop and restart of programs
* Group and manage groups of programs
* Process dependency definition
* Logging to syslog or own logfile
* Configuration - comprehensive controlfile
* Runtime and TCP/IP port checking (tcp and udp)
* SSL support for port checking
* Unix domain socket checking
* Process status and process timeout
* Process cpu usage
* Process memory usage
* Process zombie check
* Check the systems load average
* Check a file or directory timestamp
* Alert, stop or restart a process based on its characteristics
* MD5 checksum for programs started and stopped by monit
* Alert notification for program timeout, restart, checksum, stop
resource and timestamp error
* Flexible and customizable email alert messages
* Protocol verification. HTTP, FTP, SMTP, POP, IMAP, NNTP, SSH, DWP,
LDAPv2 and LDAPv3
* An http interface with optional SSL support to make monit
accessible from a webbrowser

It seems it's there. The package has unpacked a lot of files onto the system; I am not going to go through them in detail, but I should show you where they are kept:

bhaskar@bhaskar-laptop_18:17:15_Tue Nov 16:~> whereis monit
monit: /usr/sbin/monit /etc/monit /usr/share/man/man1/monit.1.gz
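
The installed-or-not check done earlier with dpkg -s can also be scripted. The following is a minimal Python sketch, assuming you have already captured the text that dpkg -s prints; the function name is_installed is made up for illustration and is not part of any Debian tool.

```python
def is_installed(dpkg_status_output):
    """Return True if `dpkg -s` output reports the package as installed.

    Looks only at the "Status:" field, which reads
    "install ok installed" for a properly installed package.
    """
    for line in dpkg_status_output.splitlines():
        if line.startswith("Status:"):
            return line.split(":", 1)[1].strip() == "install ok installed"
    # No Status field at all: treat the package as not installed.
    return False
```

Feeding it the monit output shown earlier would return True, because of the line "Status: install ok installed".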

The file we care about is the configuration file, because everything monit should watch has to be defined there. I changed into the /etc/monit directory, where I found the config file, named monitrc. Let's have a look at it:


bhaskar@bhaskar-laptop_18:20:12_Tue Nov 16:~> cd /etc/monit
bhaskar@bhaskar-laptop_18:20:18_Tue Nov 16:/etc/monit> ls

monitrc
bhaskar@bhaskar-laptop_18:20:20_Tue Nov 16:/etc/monit> sudo vim monitrc
###############################################################################
## Monit control file
###############################################################################
##
## Comments begin with a '#' and extend through the end of the line. Keywords
## are case insensitive. All path's MUST BE FULLY QUALIFIED, starting with '/'.
##
## Below you will find examples of some frequently used statements. For
## information about the control file, a complete list of statements and
## options please have a look in the monit manual.
##
##
###############################################################################
## Global section
###############################################################################
##
## Start monit in the background (run as a daemon) and check services at
## 60-second intervals.
#
set daemon 60
#
#
## Set syslog logging with the 'daemon' facility. If the FACILITY option is
## omitted, monit will use 'user' facility by default. If you want to log to
## a stand alone log file instead, specify the path to a log file
#
set logfile syslog facility log_daemon
#
#
## Set the list of mail servers for alert delivery. Multiple servers may be
## specified using comma separator. By default monit uses port 25 - this
## is possible to override with the PORT option.
#
set mailserver bhaskar-laptop # primary mailserver
# backup.bar.baz port 10025, # backup mailserver on port 10025
# localhost # fallback relay
#
#
## By default monit will drop alert events if no mail servers are available.
## If you want to keep the alerts for a later delivery retry, you can use the
## EVENTQUEUE statement. The base directory where undelivered alerts will be
## stored is specified by the BASEDIR option. You can limit the maximal queue
## size using the SLOTS option (if omitted, the queue is limited by space
## available in the back end filesystem).
#
# set eventqueue
#     basedir /var/monit # set the base directory where events will be stored
#     slots 100 # optionally limit the queue size
#
#
## Monit by default uses the following alert mail format:
##
## --8<--
## From: monit@$HOST # sender
## Subject: monit alert -- $EVENT $SERVICE # subject
##
## $EVENT Service $SERVICE #
## #
## Date: $DATE #
## Action: $ACTION #
## Host: $HOST # body
## Description: $DESCRIPTION #
## #
## Your faithful employee, #
## monit #
## --8<--
##
## You can override this message format or parts of it, such as subject
## or sender using the MAIL-FORMAT statement. Macros such as $DATE, etc.
## are expanded at runtime. For example, to override the sender:
#
set mail-format { from: monit@bhaskar-laptop.localdomain }
#
#
## You can set alert recipients here whom will receive alerts if/when a
## service defined in this file has errors. Alerts may be restricted on
## events by using a filter as in the second example below.
#
set alert root@bhaskar-laptop.localdomain # receive all alerts
# set alert manager@foo.bar only on { timeout } # receive just service-
# # timeout alert
#
#
## Monit has an embedded web server which can be used to view status of
## services monitored, the current configuration, actual services parameters
## and manage services from a web interface.
#
set httpd port 2812 and
    use address bhaskar-laptop # only accept connection from localhost
    allow bhaskar-laptop # allow localhost to connect to the server and
    allow admin:admin # require user 'admin' with password 'admin'
#
#
###############################################################################
## Services
###############################################################################
##
## Check general system resources such as load average, cpu and memory
## usage. Each test specifies a resource, conditions and the action to be
## performed should a test fail.
#
check system bhaskar-laptop.localdomain
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert
    if cpu usage (system) > 30% then alert
    if cpu usage (wait) > 20% then alert
#
#
## Check a file for existence, checksum, permissions, uid and gid. In addition
## to alert recipients in the global section, customized alert will be sent to
## additional recipients by specifying a local alert handler. The service may
## be grouped using the GROUP option.
#
check file apache_bin with path /usr/sbin/apache2
#   if failed checksum and
#      expect the sum 8f7f419955cefa0b33a2ba316cba3659 then unmonitor
    if failed permission 755 then unmonitor
    if failed uid root then unmonitor
    if failed gid root then unmonitor
    alert security@bhaskar-laptop.localdomain on {
        permission, uid, gid, unmonitor
    } with the mail-format { subject: Alarm! }
#   group server
#
#
## Check that a process is running, in this case Apache, and that it respond
## to HTTP and HTTPS requests. Check its resource usage such as cpu and memory,
## and number of children. If the process is not running, monit will restart
## it by default. In case the service was restarted very often and the
## problem remains, it is possible to disable monitoring using the TIMEOUT
## statement. This service depends on another service (apache_bin) which
## is defined above.
#
check process apache2 with pidfile /var/run/Apache2/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    if cpu > 60% for 2 cycles then alert
    if cpu > 80% for 5 cycles then restart
    if totalmem > 200.0 MB for 5 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 8 cycles then stop
    if failed host bhaskar-laptop.localdomain port 80 protocol http
        and request "/monit/doc/next.php"
        then restart
#   if failed port 443 type tcpssl protocol http
#       with timeout 15 seconds
#       then restart
    if 3 restarts within 5 cycles then timeout
    depends on apache_bin
    group server
#
#
## Check device permissions, uid, gid, space and inode usage. Other services,
## such as databases, may depend on this resource and an automatically graceful
## stop may be cascaded to them before the filesystem will become full and data
## lost.
#
# check device datafs with path /dev/sdb1
#     start program = "/bin/mount /data"
#     stop program = "/bin/umount /data"
#     if failed permission 660 then unmonitor
#     if failed uid root then unmonitor
#     if failed gid disk then unmonitor
#     if space usage > 80% for 5 times within 15 cycles then alert
#     if space usage > 99% then stop
#     if inode usage > 30000 then alert
#     if inode usage > 99% then stop
#     group server
#
#LVM

check device Bhaskar-laptop-data with path /lvm
    if space usage > 80% then alert

#Tmp
check device tmp with path /tmp
    if space usage > 90% then alert

## Check a file's timestamp. In this example, we test if a file is older
## than 15 minutes and assume something is wrong if its not updated. Also,
## if the file size exceed a given limit, execute a script
#
# check file database with path /data/mydatabase.db
#     if failed permission 700 then alert
#     if failed uid data then alert
#     if failed gid data then alert
#     if timestamp > 15 minutes then alert
#     if size > 100 MB then exec "/my/cleanup/script"
#
#
## Check directory permission, uid and gid. An event is triggered if the
## directory does not belong to the user with uid 0 and gid 0. In addition,
## the permissions have to match the octal description of 755 (see chmod(1)).
#Bin
check directory bin with path /bin
    if failed permission 755 then unmonitor
    if failed uid 0 then unmonitor
    if failed gid 0 then unmonitor
#
#LVM
check directory lvm with path /lvm
    if failed uid 0 then unmonitor
    if failed gid 0 then unmonitor

#Home
check directory home with path /home
    if failed uid 0 then unmonitor
    if failed gid 0 then unmonitor

# Var
check directory var with path /var
    if failed uid 0 then unmonitor
    if failed gid 0 then unmonitor


## Check a remote host network services availability using a ping test and
## check response content from a web server. Up to three pings are sent and
## connection to a port and a application level network check is performed.
#
# check host myserver with address 192.168.1.1
#     if failed icmp type echo count 3 with timeout 3 seconds then alert
#     if failed port 3306 protocol mysql with timeout 15 seconds then alert
#     if failed url
#         http://user:password@www.foo.bar:8080/?querystring
#         and content == 'action="j_security_check"'
#         then alert
#
#Mysql

check process mysql with pidfile /var/run/mysqld/mysqld.pid
    group database
    start program = "/etc/init.d/mysql start"
    stop program = "/etc/init.d/mysql stop"
    if failed host 127.0.0.1 port 3306 then restart
    if 5 restarts within 5 cycles then timeout

###############################################################################
## Includes
###############################################################################
##
## It is possible to include additional configuration parts from other files or
## directories.
#
# include /etc/monit.d/*
#
#

The configuration file makes it plain what we are monitoring. Monit's big advantage is that it can act on a service: if a service is down and needs to be up, monit can start it again. It is not merely status-reporting software.
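
To make the "restart, but give up eventually" behaviour concrete, here is a small Python sketch of the rule "if 3 restarts within 5 cycles then timeout" from the apache2 check above. It only illustrates the idea and is not how monit is implemented; the function name should_timeout and the list-of-booleans history are assumptions made for this example.

```python
def should_timeout(restart_history, max_restarts=3, window=5):
    """Decide whether monitoring should be given up (monit's "timeout").

    restart_history holds one boolean per poll cycle, oldest first;
    True means the service had to be restarted in that cycle.
    (Hypothetical model, not monit's internal data structure.)
    """
    recent = list(restart_history)[-window:]  # only the last `window` cycles count
    return sum(recent) >= max_restarts


# Three restarts in the last five cycles: stop trying.
print(should_timeout([True, False, True, True, False]))
# Only two restarts in the last five cycles: keep monitoring.
print(should_timeout([True, False, False, False, True]))
```

With "set daemon 60", one cycle is 60 seconds, so the rule above means three restarts within roughly five minutes.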

Now we can configure it to start when the system boots, so we will define runlevels for it. We will use a tool called sysv-rc-conf (aptitude install sysv-rc-conf). Here is what it looks like when invoked:

[Screenshot: sysv-rc-conf]

Now you can see the highlighted section for the monit service. As set in the configuration file, monit's web interface can be accessed through port 2812. Here it is in the browser:

[Screenshot: Monit web interface]

I hope enlarging the two pictures above gives you enough insight into what you can do with it. If you click on any of the services on the left side of the panel, you get a detailed view like the one below:

[Screenshot: service details]

The screen above has a "Disable Monitoring" button at the bottom, with which you can deactivate monitoring of that particular device or service.
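
The same web interface can be reached from a script instead of a browser. Below is a minimal Python sketch, assuming the values from the monitrc above (host bhaskar-laptop, port 2812, user admin, password admin) and that the interface is protected with HTTP basic authentication. The helper names basic_auth_header, build_monit_request and fetch_front_page are invented for this example.

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{user}:{password}".encode("ascii")).decode("ascii")
    return "Basic " + token


def build_monit_request(host="bhaskar-laptop", port=2812,
                        user="admin", password="admin"):
    """Prepare (but do not send) a request for monit's front page."""
    req = urllib.request.Request(f"http://{host}:{port}/")
    req.add_header("Authorization", basic_auth_header(user, password))
    return req


def fetch_front_page(timeout=5):
    """Send the request; needs a running monit that allows this client."""
    with urllib.request.urlopen(build_monit_request(), timeout=timeout) as resp:
        return resp.status, resp.read()
```

Calling fetch_front_page() on the machine running monit should return the HTTP status and the HTML of the overview page; from any host that is not listed in an "allow" line, expect the connection to be refused. Since admin:admin is only an example password, pick something stronger on a real server.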

Munin:
It is basically a graphing system that plots things in the browser, giving a visual representation of the activity on the network or on a particular device. Let's check whether I have it on my system:


bhaskar@bhaskar-laptop_18:38:14_Tue Nov 16:~> sudo dpkg -s munin
[sudo] password for bhaskar:
Package: munin
Status: install ok installed
Priority: optional
Section: net
Installed-Size: 996
Maintainer: Munin Debian Maintainers
Architecture: all
Version: 1.2.6-10~lenny2
Depends: perl (>= 5.6.0-16), perl-modules | libparse-recdescent-perl, librrds-perl, libhtml-template-perl, libdigest-md5-perl, libtime-hires-perl, libstorable-perl, rrdtool, adduser
Recommends: munin-node, libdate-manip-perl
Suggests: www-browser, httpd
Conffiles:
/etc/cron.d/munin 98f4112ea36053af9e1dc9111ab4d973
/etc/munin/munin.conf 057d322c5776710b8b71fbf02b12edbc
/etc/munin/templates/munin-comparison-month.tmpl 31f92013656bc96f496ad9fe9bd87b8b
/etc/munin/templates/munin-comparison-year.tmpl f8fc458757219e152bc0c316208214c4
/etc/munin/templates/definitions.html 6f2cda49ff5f0a5641549ae0dd063334
/etc/munin/templates/munin-nodeview.tmpl 60791f957f0879b859274ac423850e59
/etc/munin/templates/munin-serviceview.tmpl 9d061d0a097fdedc7cec09da56b45170
/etc/munin/templates/munin-comparison-week.tmpl 0ed0ac1772a96108e621f7ec9e651e65
/etc/munin/templates/logo.png 385010f8f050d25723206b1c77f0df5e
/etc/munin/templates/munin-comparison-day.tmpl 487b8c7f6f1eaf19687d601621da6f06
/etc/munin/templates/munin-overview.tmpl 07b6ba2c872f737fd3f2bf3df82bee06
/etc/munin/templates/munin-domainview.tmpl dfa7d0b5372086423c2aa7476bd04b90
/etc/munin/templates/style.css e6f61ecb33988635e5f6961de96c71c3
/etc/logrotate.d/munin caf8f6b63086ec5e11a9a2e2d883c7a1
Description: network-wide graphing framework (grapher/gatherer)
Munin is a highly flexible and powerful solution used to create graphs of
virtually everything imaginable throughout your network, while still
maintaining a rattling ease of installation and configuration.
.
This package contains the grapher/gatherer. You will only need one instance of
it in your network. It will periodically poll all the nodes in your network
it's aware of for data, which it in turn will use to create graphs and HTML
pages, suitable for viewing with your graphical web browser of choice.
.
It is also able to alert you if any value is outside of a preset boundary,
useful if you want to be alerted if a filesystem is about to grow full, for
instance. You can do this by making Munin run an arbitrary command when you
need to be alert it, or make use of the intrinsic Nagios support.
.
Munin is written in Perl, and relies heavily on Tobi Oetiker's excellent
RRDtool. To see a real example of Munin in action, you can follow a link
from to a live installation.
Homepage: http://munin.projects.linpro.no

It seems that I have it. The next thing is to find where it resides on the system:

bhaskar@bhaskar-laptop_18:57:10_Tue Nov 16:~> whereis munin
munin: /etc/munin /usr/share/munin

Oh! I forgot to tell you that I need one more piece of software, called "munin-node". Let's check:


bhaskar@bhaskar-laptop_18:58:53_Tue Nov 16:~> sudo dpkg -s munin-node
Package: munin-node
Status: install ok installed
Priority: optional
Section: net
Installed-Size: 1396
Maintainer: Munin Debian Maintainers
Architecture: all
Source: munin
Version: 1.2.6-10~lenny2
Depends: perl (>= 5.6.0-16), libnet-server-perl, procps, adduser, lsb-base (>= 3.2-4), gawk
Recommends: libnet-snmp-perl
Suggests: munin, munin-plugins-extra, libwww-perl, liblwp-useragent-determined-perl, libnet-irc-perl, mysql-client, smartmontools (>= 5.37-6~bpo40+1), acpi | lm-sensors, python, ethtool, libdbd-pg-perl
Conffiles:
/etc/cron.d/munin-node 64b993c241bef6ad98b0f50f0de9d18b
/etc/init.d/munin-node 0a2e199d22c98af892cc407c63dddb5a
/etc/munin/munin-node.conf c317597f98622746dc2120d4aa1ace17
/etc/munin/plugin-conf.d/munin-node 686c0aa6a0a3eb4e973f162dc77ffe52
/etc/logrotate.d/munin-node 8afe5ab15b1f1731016d0bffadadff46
Description: network-wide graphing framework (node)
Munin is a highly flexible and powerful solution used to create graphs of
virtually everything imaginable throughout your network, while still
maintaining a rattling ease of installation and configuration.
.
This package contains the daemon for the nodes being monitored. You should
install it on all the nodes in your network. It will know how to extract all
sorts of data from the node it runs on, and will wait for the gatherer to
request this data for further processing.
.
It includes a range of plugins capable of extracting common values such as cpu
usage, network usage, load average, and so on. Creating your own plugins which
are capable of extracting other system-specific values is very easy, and is
often done in a matter of minutes. You can also create plugins which relay
information from other devices in your network that can't run Munin, such as a
switch or a server running another operating system, by using SNMP or similar
technology.
.
Munin is written in Perl, and relies heavily on Tobi Oetiker's excellent
RRDtool. To see a real example of Munin in action, you can follow a link
from to a live installation.
Homepage: http://munin.projects.linpro.no

Now I changed into the /etc/munin directory, because I need to edit the configuration files there. Its contents look like this:

bhaskar@bhaskar-laptop_19:03:28_Tue Nov 16:/etc/munin> ls
munin.conf munin-node.conf plugin-conf.d plugins templates

Now have a look at the munin.conf file:

# Example configuration file for Munin, generated by 'make build'

# The next three variables specifies where the location of the RRD
# databases, the HTML output, and the logs, severally. They all
# must be writable by the user running munin-cron.
dbdir /var/lib/munin
htmldir /var/www/munin
logdir /var/log/munin
rundir /var/run/munin

# Where to look for the HTML templates
tmpldir /etc/munin/templates

# Make graphs show values per minute instead of per second
#graph_period minute

# Graphics files are normaly generated by munin-graph, no matter if
# the graphs are used or not. You can change this to
# on-demand-graphing by following the instructions in
# http://munin.projects.linpro.no/wiki/CgiHowto
#
#graph_strategy cgi

# Drop somejuser@fnord.comm and anotheruser@blibb.comm an email everytime
# something changes (OK -> WARNING, CRITICAL -> OK, etc)
#contact.someuser.command mail -s "Munin notification" somejuser@fnord.comm
#contact.anotheruser.command mail -s "Munin notification" anotheruser@blibb.comm
#
# For those with Nagios, the following might come in handy. In addition,
# the services must be defined in the Nagios server as well.
#contact.nagios.command /usr/sbin/send_nsca -H nagios.host.com -c /etc/send_nsca.cfg

# a simple host tree
[bhaskar-laptop.localdomain]
    address 127.0.0.1
    use_node_name yes

#
# A more complex example of a host tree
#
## First our "normal" host.
# [fii.foo.com]
#     address foo
#
## Then our other host...
# [fay.foo.com]
#     address fay
#
## Then we want totals...
# [foo.com;Totals] #Force it into the "foo.com"-domain...
#     update no # Turn off data-fetching for this "host".
#
# # The graph "load1". We want to see the loads of both machines...
# # "fii=fii.foo.com:load.load" means "label=machine:graph.field"
#     load1.graph_title Loads side by side
#     load1.graph_order fii=fii.foo.com:load.load fay=fay.foo.com:load.load
#
# # The graph "load2". Now we want them stacked on top of each other.
#     load2.graph_title Loads on top of each other
#     load2.dummy_field.stack fii=fii.foo.com:load.load fay=fay.foo.com:load.load
#     load2.dummy_field.draw AREA # We want area instead the default LINE2.
#     load2.dummy_field.label dummy # This is needed. Silly, really.
#
# # The graph "load3". Now we want them summarised into one field
#     load3.graph_title Loads summarised
#     load3.combined_loads.sum fii.foo.com:load.load fay.foo.com:load.load
#     load3.combined_loads.label Combined loads # Must be set, as this is
#                                               # not a dummy field!
#
## ...and on a side note, I want them listen in another order (default is
## alphabetically)
#
# # Since [foo.com] would be interpreted as a host in the domain "com", we
# # specify that this is a domain by adding a semicolon.
# [foo.com;]
#     node_order Totals fii.foo.com fay.foo.com
#

I highlighted the sections of this file that are an absolute must to get going. If any of the directories named there does not exist, create it and point the setting at the right path.
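
Whether those directories really exist can be checked with a few lines of Python. This is only a sketch under my own assumptions: it treats every setting whose name ends in "dir" (dbdir, htmldir, logdir, rundir, tmpldir) as a directory path, and the function names parse_munin_dirs and missing_dirs are made up for the example.

```python
import os


def parse_munin_dirs(conf_text):
    """Collect the *dir settings from the text of a munin.conf file."""
    dirs = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        parts = line.split(None, 1)           # "name value"
        if len(parts) == 2 and parts[0].endswith("dir"):
            dirs[parts[0]] = parts[1].strip()
    return dirs


def missing_dirs(dirs):
    """Return the names of settings whose directory does not exist."""
    return sorted(name for name, path in dirs.items() if not os.path.isdir(path))
```

For example, missing_dirs(parse_munin_dirs(open("/etc/munin/munin.conf").read())) gives an empty list when everything is in place. The script only checks existence; the directories must also be writable by the user that runs munin-cron, as the comment in the file says.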

Now take a look at the munin-node.conf file:

#
# Example config-file for munin-node
#

log_level 4
log_file /var/log/munin/munin-node.log
pid_file /var/run/munin/munin-node.pid

background 1
setseid 1

user munin
group munin
setsid yes

# Regexps for files to ignore

ignore_file ~$
ignore_file \.bak$
ignore_file %$
ignore_file \.dpkg-(tmp|new|old|dist)$
ignore_file \.rpm(save|new)$
ignore_file \.pod$

# Set this if the client doesn't report the correct hostname when
# telnetting to localhost, port 4949
#
#host_name localhost.localdomain

# A list of addresses that are allowed to connect. This must be a
# regular expression, due to brain damage in Net::Server, which
# doesn't understand CIDR-style network notation. You may repeat
# the allow line as many times as you'd like

allow ^127\.0\.0\.1$

# Which address to bind to;
host *
# host 127.0.0.1

# And which port
port 4949

Once more I have highlighted a few things in this file that are needed to get going. Most of the settings are pretty easy to understand.

Let's access it through the browser to see the graphs. Here we go; this is the first screen I got on my system:

Once I clicked on the hyperlinked option, I was presented with graphs like the ones below:

Now if you click on one of those graphs, you get a little explanation of the graph too!
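
One setting that deserves a second look is the "allow" line, which is a regular expression rather than an address. The Python sketch below shows how such a pattern decides which clients get in. It is an illustration only: munin-node itself is written in Perl, and the names ALLOW_PATTERNS and is_allowed are invented here. Patterns this simple behave the same way in Python's re module.

```python
import re

# The pattern from the munin-node.conf above: exactly 127.0.0.1, nothing else.
ALLOW_PATTERNS = [r"^127\.0\.0\.1$"]


def is_allowed(client_ip, patterns=ALLOW_PATTERNS):
    """Return True if the client address matches any allow pattern."""
    return any(re.search(p, client_ip) for p in patterns)


print(is_allowed("127.0.0.1"))    # the local munin master may connect
print(is_allowed("192.168.1.5"))  # another machine on the LAN may not
```

The anchors ^ and $ and the escaped dots matter: without them, a pattern such as 127.0.0.1 would also match addresses like 127.0.0.10. To let a munin master on another machine poll this node, add a further allow line with that machine's address written the same way.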

Hope this will help.

Cheers!
Bhaskar

About unixbhaskar
Sr Consultant GNU/Linux and Networking
