Cluster Information Service installation instructions.

CIS is available for Linux, kernel 2.2, on the x86 architecture. It was tested
on the RedHat 6.2 and Mandrake 7.0 distributions.

Requirements.

- To install CIS you need a root account on the target system and experience
with Linux system administration.

- To use all the features of CIS you need to patch your kernel, so you need
the kernel sources. If you just want to try CIS and do not care about
overhead, all monitors except the socket monitor can be used without patching
the kernel.

- To enable monitoring of LM sensors, install the i2c and lm_sensors packages.
They can be downloaded from http://www2.lm-sensors.nu/~lm78/download.html

1. Compiling CIS package

- download it from http://ups.savba.sk/parcom/cluster/cis.html
- uncompress and untar it
- go to CIS directory
- adjust the Makefile to your needs
- make

2. Kernel modification.

If you just want to evaluate CIS, you can proceed to step 3.

CIS kernel probes are implemented as kernel modules. They are in the 'kernel'
directory. To compile them you need to patch your kernel. The patch is in
that directory too (monitor-2.2.patch). It adds exports of some symbols
and monitoring fields to kernel data structures. The patch is _really_ trivial.
However, the Mandrake kernel has problems with exporting symbols in SMP mode,
which may be caused by the special optimization options used. In this case you
should get the original kernel from www.kernel.org.

Installation of kernel modules:

- login as root
- go to your kernel source directory
- patch -p 1 < <CIS directory>/kernel/monitor-2.2.patch
- enable 'Kernel/User netlink socket' under 'Networking options' in the
  kernel configuration
- recompile and reinstall your kernel (reboot your machine)
- go back to the 'kernel' directory in the CIS distribution
- set the path to the patched kernel in the Makefile
- make

If the modules built correctly, you can insert them into your kernel with
'insmod <kmodule>.o'. Four new (binary) files should appear in the /proc
directory: sysinfo, proclist, socklist and netdevlist. In the /proc/net
directory the netlink file should contain lines with 4, 5 and 6 in the
'Eth' column.
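
The check described above can be scripted. The following is only a sketch:
the function names are made up for this example, and the directory and file
arguments exist so that the check can be tried on a copy. The 'Eth' column is
located by its header name, on the assumption that the first line of the
netlink file is a header row.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of CIS.

# Print the name of every expected CIS file missing from the given
# directory (default: /proc). Prints nothing when all four are present.
cis_missing_proc_files() {
    cis_proc_dir=${1:-/proc}
    for cis_name in sysinfo proclist socklist netdevlist; do
        [ -e "$cis_proc_dir/$cis_name" ] || echo "$cis_name"
    done
}

# Succeed only if the netlink table has entries 4, 5 and 6 in its 'Eth'
# column. Assumes the first line of the file is a header naming the columns.
cis_netlink_ok() {
    cis_netlink_file=${1:-/proc/net/netlink}
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Eth") col = i; next }
        col     { seen[$col] = 1 }
        END     { exit !(col && (4 in seen) && (5 in seen) && (6 in seen)) }
    ' "$cis_netlink_file"
}

# Usage after inserting the modules:
#   cis_missing_proc_files            # lists missing files, if any
#   cis_netlink_ok && echo "netlink entries present"
```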

Copy the modules to /lib/modules/<kernel_version>/misc and run 'depmod -a'.

3. Installing CIS

- login as root.
- go to CIS directory
- make install
	This will copy cisd and the monitors into the /usr/sbin directory.

If you want rpcinfo to show the CIS RPC program by name:
- go to 'scripts' directory
- cat rpc >> /etc/rpc
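
Running 'cat rpc >> /etc/rpc' more than once leaves duplicate entries in
/etc/rpc. A minimal sketch of an append that can be repeated safely is shown
here; the function name is invented for this example and is not part of the
CIS scripts.

```shell
#!/bin/sh
# Sketch only: append_missing_lines is a hypothetical helper.

# Append each non-empty line of SRC to DST unless DST already contains
# exactly that line.
append_missing_lines() {
    aml_src=$1
    aml_dst=$2
    while IFS= read -r aml_line || [ -n "$aml_line" ]; do
        [ -n "$aml_line" ] || continue
        grep -qxF -- "$aml_line" "$aml_dst" 2>/dev/null ||
            printf '%s\n' "$aml_line" >> "$aml_dst"
    done < "$aml_src"
}

# Usage (as root, from the 'scripts' directory):
#   append_missing_lines rpc /etc/rpc
```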

The CIS server communicates with clients via RPC calls, so the portmapper must
be running on your server host. It is recommended to start the CIS daemons
at startup. Since startup scripts depend on your Linux distribution, you
have to add them to your scripts by hand. If you are using the SysVinit package
(RedHat distribution), you can use the scripts provided in the scripts/
directory. Copy them into the /etc/rc.d/init.d/ directory and create links in
the appropriate /etc/rc.d/rc?.d directories (either with tksysv or by hand).
Add the variable CISSERVER=<hostname> to /etc/sysconfig/network on all machines
where you will run monitors.
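
Setting the CISSERVER variable can be scripted when there are many monitor
hosts. This is a sketch under assumptions: the function name is made up, and
the file is assumed to hold one VARIABLE=value assignment per line. Any
earlier CISSERVER line is replaced, so the function can be run again with a
new hostname.

```shell
#!/bin/sh
# Sketch only: set_cisserver is a hypothetical helper.

# Set CISSERVER=<host> in a sysconfig-style file (default:
# /etc/sysconfig/network), replacing any earlier CISSERVER assignment.
set_cisserver() {
    sc_host=$1
    sc_file=${2:-/etc/sysconfig/network}
    sc_tmp=$(mktemp) || return 1
    # Keep every existing line except an old CISSERVER assignment.
    if [ -f "$sc_file" ]; then
        grep -v '^CISSERVER=' "$sc_file" > "$sc_tmp" || :
    fi
    printf 'CISSERVER=%s\n' "$sc_host" >> "$sc_tmp"
    # Write through 'cat' so the target file keeps its owner and mode.
    cat "$sc_tmp" > "$sc_file" && rm -f "$sc_tmp"
}

# Usage (as root, on each machine that runs monitors):
#   set_cisserver <hostname>
```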

- edit the sample configuration file cis.conf in the scripts directory and
  copy it to the /etc/ directory on the host where you will run cisd

4. Starting CIS system.

You must be root to start CIS. It is recommended to start it at boot time; see
the scripts for how to do it. Monitors require the server hostname as a
parameter.

Warning: Do not remove the modules from the kernel while monitors are running!
If you need to install new versions, stop the CIS monitors first and then
remove the modules. Since the modules add wrappers to some system calls, you
also need to stop all processes with active sockets and child processes. The
kprocmon.o module is particularly sensitive. To remove it, lsmod must report
a usage count of 0 and rmmod must not be run as a child process! Examples:
`/sbin/rmmod kprocmon&`
or on remote host:
rsh <host> exec /sbin/rmmod kprocmon
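
The usage count can be checked before rmmod is attempted. The sketch below
assumes the usual lsmod layout (a header line, then module name, size and
use count in the first three columns). The function names are made up for
this example. As in the rsh example above, 'exec' is used so that rmmod
replaces the calling shell.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of CIS.

# Read 'lsmod' output on stdin and print the use count of the named module.
# Prints nothing if the module is not listed.
module_use_count() {
    awk -v mod="$1" 'NR > 1 && $1 == mod { print $3 }'
}

# Remove kprocmon only when lsmod reports a use count of 0.
# Note: on success 'exec' replaces the current shell with rmmod.
remove_kprocmon() {
    rk_count=$(lsmod | module_use_count kprocmon)
    if [ "$rk_count" = "0" ]; then
        exec /sbin/rmmod kprocmon
    fi
    echo "kprocmon is busy or not loaded (use count: ${rk_count:-none})" >&2
    return 1
}
```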

5. Troubleshooting

- Wrong values in monitoring data can be caused by bugs in the Linux kernel.
  Known problems (kernel patches are in the kernel directory):
  - only time slices longer than one tick (jiffy) are accounted. Even worse,
    shorter slices are accounted to the next running process.
  - wrong I/O statistics in /proc/stat
  - the 3c90x.c driver from the card vendor (!) stores the value of a 20-bit
    counter in an unsigned short.
  If you find another one, search the archive of the Linux kernel mailing list
  for a patch.
  
- If you have trouble compiling and installing the kernel modules, try running
  CIS without them.

- If your cisd won't start
	- look into your system log
	- check your /etc/cis.conf
	- check whether portmap is running and no other cisd is registered
	  (if cisd crashed and is still registered, compile it with the
	  -DDEBUG option, then start and stop it; it should unregister
	  itself correctly)

- If your CIS monitor won't start
	- look into your system log
	- the socket monitor works only with the kernel module
	- check whether you can access the portmapper on the server and
	  whether cisd is running: 'rpcinfo -p <hostname>'
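
The rpcinfo check can be automated. This sketch assumes the usual
'rpcinfo -p' layout (a header line, then program number, version, protocol,
port and service name). The function name is made up, and <program> stands
for the CIS program number or name from the 'rpc' file in the scripts
directory. The service name column is filled in only if those entries were
added to /etc/rpc.

```shell
#!/bin/sh
# Sketch only: rpc_program_registered is a hypothetical helper.

# Read 'rpcinfo -p' output on stdin. Succeed if any row matches the given
# program, either by number (column 1) or by service name (column 5).
rpc_program_registered() {
    awk -v want="$1" '
        NR > 1 && ($1 == want || $5 == want) { found = 1 }
        END { exit !found }
    '
}

# Usage:
#   rpcinfo -p <hostname> | rpc_program_registered <program> ||
#       echo "cisd is not registered on <hostname>"
```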

- If you cannot find a solution, send me an e-mail.

Jan Astalos (astalos.ui@savba.sk)
