Version 1.0 - 1.2 FAQ

This FAQ attempts to answer questions regarding the configuration and use of the clustering software product. It is intended to be an evolving document – if you have any queries that this FAQ does not address, please let me know and I will add to it as deemed appropriate.

Simon Edwards <at> , 25th April, 2006.

1 Installation and Compatibility

1.1 How does the software work?

The software is a clustering product. It is designed to allow one or more applications to be run in a “highly available” configuration. That means that if a machine suffers from a failure – such as a software problem, the loss of a hard drive or network card, or indeed the loss of the whole machine – the software is able to reconfigure the environment so the application remains running.

In some cases this will require the application to be restarted – for example if it was running on a machine that has died – whilst at other times the application continues to run without interruption. The software works by running on two machines at the same time – together known as the “cluster”. Each node in the cluster can be running one or more applications, though each application is only run on one machine at a time.

Applications are moved between the two machines when a failure occurs to ensure they remain running. Whilst the cluster is reconfigured in this manner the applications may be unavailable for a short period of time – usually less than 20 seconds before they are restarted on the other machine in the cluster.

To support this functionality the software maintains two copies of all application data – one on each machine. Thus there is no requirement for using expensive shared storage infrastructures such as SCSI or Fibre Channel attached storage.

Each application is distinguished by having one or more unique IP addresses that appear on the host where the application is currently running. All users of the application use these IP addresses rather than the IP address associated with a physical machine. This ensures that they can connect to the application no matter which machine it is running on.
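Conceptually this is the standard Linux “floating IP” technique. The software manages it automatically, but the underlying operation is equivalent to the following sketch (the address and device names are illustrative only):

```shell
# On the node currently running the application, raise its service
# address as a secondary address on a physical interface:
ip addr add 192.168.1.50/24 dev eth0

# On fail-over the address is dropped here and raised on the other node:
ip addr del 192.168.1.50/24 dev eth0
```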

1.2 What kernels are supported?

The underlying technology that the software uses is called DRBD (see the DRBD project documentation for details). This very clever software provides the block device over which the data for each application is replicated. The requirements of DRBD drive the kernel requirements of the software.

The current versions of the software use DRBD version XXX – which is compatible with pretty much all recent 2.4 and 2.6 based kernels. The biggest issue that administrators tend to run into is not the compatibility of DRBD but making sure the required tools are present to build the kernel module for DRBD.

1.3 What distributions are supported?

The software is not aligned with any particular distribution and should work with most. The author has personal experience of using it with a number of distributions in the past.

It should be noted that the software is developed and tested on Slackware and CentOS, but when a big release takes place (such as 1.0 or 1.2) it is tested across all of the distributions referred to above.

1.4 Must both hosts run the same kernel and/or distribution?

No – although for the sake of the system administrator's sanity it is strongly recommended. DRBD (which, as mentioned above, drives the kernel requirements) uses a particular version of its protocol, and so as long as the two nodes run the same version of the software, the distribution and kernel do not need to match.

That said, this is only partially true – the software would probably not work with one node running a Linux 2.4 kernel and the other running 2.6. However, running differing minor versions should not present a problem. Indeed this is probably a good thing, since it allows the OS to be upgraded one node at a time without the whole cluster being unavailable.

2 Hardware Related Questions

2.1 What are the minimum requirements for running the software?

The software strives to be a low-overhead system – thus its hardware requirements are minimal and are defined instead by the distribution that is being run and, more importantly, the requirements of the applications that are to be clustered.

As an extreme case, the author has run the software on a pair of servers running Slackware with just 64Mb of RAM. The biggest overheads are memory (the software is mostly Perl, and so a Perl interpreter must be running) and CPU. However, even on a 1.0 GHz CPU the overhead of running the software is typically 1-2%.

As stated, the requirements of the software itself can typically be ignored – instead concentrate on the requirements of the distribution in question and then add the requirements of the applications to be run.

2.2 What disk configuration is recommended?

Since the DRBD device used by the software works with any underlying block device (i.e. hard disk), no particular type of hardware is required. It will work with anything from standard PC PATA or SATA drives, through locally attached SCSI, to Storage Area Network based storage. The only type of storage it will not work with is network-based storage, such as NFS or SMB partitions.

Given that the software is designed to provide high availability, typically the disk configuration of each server attempts to minimize failure – and hence making use of RAID-1 or RAID-5 based storage is also recommended. The DRBD device treats such a meta-device just like any other block device.

2.3 Does the software require LVM (Logical Volume Manager)?

Yes – but this is because it makes the life of the administrator, as well as the use of the software, far more straightforward once configured. Pretty much all distributions support LVM (version 1 or 2), and use of LVM means that the software can support on-line addition of file systems and even on-line growth of file systems. The software supports both LVM-1 and LVM-2 as appropriate for the distribution on which it is installed. It is even possible for one node to be running LVM-1 and the other LVM-2.

Currently other volume managers are not supported.

2.4 What network configuration is recommended?

The recommended configuration is to use at least two network cards – more if possible. If only two network connections are available then the most popular approach is to give just a single card an IP address and have both cards configured as part of the same “network” in the cluster configuration.

In such configurations the single network card will be used for client connectivity as well as for the data needed to synchronize the disk partitions. Assuming both cards support physical link-level checking (most do), if the live card fails the software will automatically fail over all IP addresses to the alternative connection.

If more than two network cards are available then the possibilities for more flexible and resilient configurations become possible – see the next section for some examples.

3 Network Topology

3.1 What is the advantage of using multiple networks?

The software supports the ability to group several network cards into a “network” – and it is possible to define multiple networks as part of the cluster configuration (or “topology”). Each network must have only a single card with an IP address – the other cards are “standby” cards – that is, they become candidates for hosting the IP addresses associated with the live card if that card, or its network connection, fails.

Cards can not be shared between networks, and so a typical network consists of just two cards. A typical “high end” configuration would have 4 network cards in each machine: two cards in a “public” network and two in a “private” network. The private network would be used for DRBD (disk synchronization) traffic, whilst the public network would be used for client connections. This topology ensures that client traffic is separated from the software's internal traffic and hence is considered a more scalable solution. Such a solution might be represented by a diagram such as the following:


3.2 How can I validate that link-level checking is supported on my Ethernet cards?

Almost all modern distributions supply a utility called “ethtool”. With a particular network card physically connected to a switch, simply issue the command “ethtool eth0” (replacing “eth0” with the network card name as appropriate). If the card is supported, many lines of information will be shown. If it is not, an error will be given.
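As a quick survey of every interface at once, the link (“carrier”) state can also be read straight from sysfs on any modern kernel – a small sketch (interface names will vary per machine):

```shell
#!/bin/sh
# Print each network interface and its link ("carrier") state from sysfs.
# 1 = link detected, 0 = no link; "unknown" if the attribute is unreadable
# (for example when the interface is administratively down).
for dev in /sys/class/net/*; do
    name=$(basename "$dev")
    carrier=$(cat "$dev/carrier" 2>/dev/null || echo "unknown")
    echo "$name: carrier=$carrier"
done
```

“ethtool ethN” remains the more detailed check, reporting speed, duplex and “Link detected” where the driver supports it.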

Using this tool might be necessary anyway, particularly if a machine has multiple network cards – you can always use it to associate the logical “ethN” number with a physical card instance. This becomes particularly important if the topology is to consist of multiple networks – you need to ensure the correct cards are connected to the correct switches!

3.3 Can cross-over cables be used for DRBD traffic?

Yes – cross-over cables can be used, but in the configuration care must be taken to ensure that the network has the “attribute=crossover” setting. This is because with cross-over network connections, one machine losing power can be seen as a physical network failure, leading to an incorrectly diagnosed problem. When the “attribute=crossover” option is given for a network, physical link-level checking is disabled for that network alone, ensuring this mis-diagnosis cannot occur.

3.4 Can more than one card be part of the same network?

Yes – a “network” can consist of multiple network cards, though only a single card can have an IP address associated with it – the others are used as “stand-by” cards. They become candidates for hosting the IP addresses if a physical link failure is detected on the card currently hosting the IP addresses for that network.

3.5 Can different card types/speeds be part of the same network?

The cards defined in a network do not need to be the same type – though it is recommended that they all support physical link-level checking. They do all need to be Ethernet cards; however, the speed of each does not matter.

When the speeds of the cards in a network differ, it is typical to ensure that the fastest card is the one configured with the “static” IP address when the machine boots, so that it is the one used by default.

4 Initial Cluster Configuration

4.1 What is the process of defining the cluster?

Once the software has been installed and the post-installation process completed successfully (see the log files in the root user's home directory if uncertain), then the process of cluster configuration can begin. Before any applications are defined the cluster topology must be configured and built.

The first step the administrator must take is to define the “network topology” of the cluster. That is, to define how many logical networks will be configured, how they are to be linked to the physical network, and also to ensure that each logical network defined has an IP address configured on one of its cards. Each of the IP addresses configured must be on a separate subnet – for example, one address from each of two different private subnets could be assigned to the two different networks.

The administrator must then ensure that promptless SSH connections as root are possible between all of the configured IP addresses on the two machines – and also to the IP addresses associated with the local machine. Several documents describe this process, as well as countless pages on the web.
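As a sketch of the usual OpenSSH key-based setup (the host addresses are illustrative; consult your distribution's SSH documentation for the details):

```shell
# Generate a passphrase-less key for root (once per node):
ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa

# Authorise it for every cluster IP address, including the local ones:
ssh-copy-id root@10.0.1.2
ssh-copy-id root@10.0.1.1

# Verify that no password or prompt appears:
ssh root@10.0.1.2 true
```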

Once that has been completed an XML file must be created on one of the two machines – called “/etc/cluster/clconf.xml”. A sample can be found in “/etc/cluster” after the installation to act as a baseline. This file contains the details of the nodes that make up the cluster, various cluster characteristics (such as the cluster name, range of TCP/IP ports to use etc.), and the network topology of the cluster.
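In the absence of the sample file, a purely hypothetical sketch of the sort of information this file carries is shown below – the element and attribute names are illustrative, not the product's exact schema, so always start from the sample shipped in “/etc/cluster”:

```xml
<!-- Hypothetical clconf.xml sketch: names are illustrative only -->
<cluster name="demo" portrange="7788-7799">
  <node name="node1"/>
  <node name="node2"/>
  <topology>
    <network name="public">
      <interface node="node1" device="eth0"/>
      <interface node="node2" device="eth0"/>
    </network>
    <!-- attribute="crossover" only when this link uses a cross-over cable -->
    <network name="private" attribute="crossover">
      <interface node="node1" device="eth1"/>
      <interface node="node2" device="eth1"/>
    </network>
  </topology>
</cluster>
```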

Once the file has been edited, the “clbuild” routine is run. It is recommended that it is run with the “--verbose” option, since it can take some time to complete. It scans both nodes in question, ensuring that the relevant software is available, and builds an internal list of resources that can be allocated as part of the cluster.

Once the process has completed successfully the cluster itself can be started – simply using the “clform” command.
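Putting the last two steps together, the commands (both documented above) would be run as root on the node holding the edited file:

```shell
clbuild --verbose   # validate both nodes and build the resource list
clform              # form (start) the cluster
```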

4.2 Is it possible to change the cluster definition later?

Of course! Since the software is designed to be fully dynamic, it is even possible to do this whilst the cluster is up and running. To change the configuration, simply alter the “/etc/cluster/clconf.xml” file on a particular machine. As soon as it is altered, many utilities on the command line will fail, since the software will realise that the configuration file no longer matches the one that was successfully used to build the cluster.

However, all that needs to be done is to use the “clbuild” command again. If the cluster is up and running the “--force” option will be required. This will check for changes and inform the running cluster daemons to reconfigure themselves as appropriate (for example, if the network topology has changed the network daemon needs to reconfigure to take account of this).
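So, combining the options mentioned above, a typical reconfiguration of a running cluster is simply:

```shell
# After editing /etc/cluster/clconf.xml on one node:
clbuild --force --verbose   # re-validate and tell the running daemons to reconfigure
```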

This process can even be performed whilst clustered applications are running, and is one of the strong points of the software.

4.3 Are all cluster parameters dynamic?

Almost. It is not possible to redefine the network used for DRBD traffic whilst the cluster is running, nor is it possible to change the number of DRBD devices. This is because those changes can only take effect when the DRBD module is loaded, and so are not possible when the cluster is making use of this kernel module.

Care must also be taken when reducing the number of DRBD devices configured on version 1.0 based installations; version 1.2 caters for this change in a safer manner and will report an error if the number of devices is too small for the current cluster configuration.

4.4 How can the cluster definition be backed up?

The software uses the directory “/etc/cluster” for storing the configuration of the cluster, any applications, and the resources allocated to each. Hence simply running the following command on both machines on a regular basis is recommended:

# tar cvjf /backup-dir/cluster-date.tbz2 /etc/cluster

It should also be noted that use of the “clbuild” routine once a cluster is already defined will automatically archive the current configuration in a tar file of a similar name to the above in the “/cmbackup” directory (which it will create if necessary).

If the existing configuration needs to be restored, please ensure the cluster is stopped first! Only restore a backup of the “/etc/cluster” directory while the software is running if the cluster cannot be shut down using “clhalt” – this can happen, for example, if “/etc/cluster/clconf.xml” has been deleted whilst the cluster is running.

5 Initial Application Configuration

5.1 What applications can be clustered?

Almost any “server-based” application that allows users to connect via an IP address can be clustered. Successful examples include Apache, Samba, MySQL, and PostgreSQL. Other uses might include mail servers or NFS servers.

For best results the applications that are clustered should support client connectivity via TCP connections. This protocol is more reliable, and TCP-based clients are more likely to handle network fail-overs – or even fail-overs of the application to the other host – in a robust manner.

It is important to realise that the process used to start the application is very important. This process must take into account that the application might be restarting after a node failure, and thus it should attempt to ensure it “cleans” any transaction logs if appropriate (for example when restarting a database).

5.2 Can applications be added whilst the cluster is already running?

Of course! New applications can be added when the cluster is up or down. New applications can be added whilst existing applications are up and running without problems.

5.3 How many applications can be configured as part of the cluster?

There is no defined limit to the number of applications that can be configured. To make best use of the hardware, a cluster will typically consist of a minimum of two applications – each defining a different node as its preferred node. In that manner the default configuration will have each application running on a different machine. Only when a machine fails, or an application is migrated to the other node for maintenance, will both applications run on the same machine.

The number of applications configured is not limited – the limitation is more likely to be the machine resources available to run the applications. Hence you might define 20 different databases, each as a separate cluster application – you just need to ensure that each machine in the cluster has enough memory/CPU to actually run them all at once should a problem with one of the nodes occur!

5.4 What is the recommended process for clustering an application?

Clustering an application tends to be fairly straightforward and typically consists of understanding how the application is started and stopped, where its binaries and data are located, and which IP addresses clients will use to connect to it.

Once these have been defined, the location of any application and dynamic data must exist on file systems that are part of the clustered data. Typically the application binaries are not, since they have no need to be.

Thus, as the first step, the application code is typically installed locally on both machines (in the same location). Following this, the file systems that will host the other data are mounted from file systems in the volume group that will contain the application.

Once these file systems are mounted, and routines have been tested to start and stop the application, the application's “appconf.xml” can be written (or modified from the sample provided).
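As with the cluster configuration file, the precise schema is best taken from the sample provided; a purely hypothetical sketch of the sort of information “appconf.xml” carries might be:

```xml
<!-- Hypothetical appconf.xml sketch: names are illustrative only -->
<application name="db1" preferred-node="node1">
  <volumegroup name="db1vg"/>
  <ip address="10.0.1.50" network="public"/>
  <start>/etc/cluster/db1/start.sh</start>
  <stop>/etc/cluster/db1/stop.sh</stop>
</application>
```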

5.5 Are there recommended file systems to use?

The software does not recommend a particular file system – as long as it is journaled! Thus it is probably best to make use of the file system that is most closely aligned with your distribution, unless you need a particular feature not offered by a given file system. The software has been designed to work with any of the commonly available journaled file systems.

The user can use one particular file system, or mix and match within the same cluster. Of course, the file systems available depend on the kernel, kernel modules, and supporting binary tools being available.

5.6 What is the maximum storage associated with an application?

The maximum amount of storage that any particular application can make use of is not limited. DRBD supports single file systems of up to 4TB in size, and the number of file systems defined for an application is not limited.

The only point to keep in mind is that each application must make use of a dedicated volume group, and so when dealing with large amounts of storage, keeping spare disks (or partitions) available for a particular volume group to allow future growth is strongly recommended.

It should also be remembered that each file system (or raw volume) that is made available as part of an application requires a separate 128Mb volume to store the DRBD meta-data. Definition of these volumes occurs automatically – but can only take place if the administrator ensures adequate spare space is available in the volume group when the application is configured as part of the cluster.
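As a worked example of the spare space required: an application exporting three file systems needs three extra meta-data volumes.

```shell
#!/bin/sh
# Spare volume-group space needed for DRBD meta-data: one 128 MB
# logical volume per replicated file system (or raw volume).
FILESYSTEMS=3
META_MB=128
NEEDED=$((FILESYSTEMS * META_MB))
echo "Spare space needed: ${NEEDED} MB"
```

This prints “Spare space needed: 384 MB” – the amount to leave unallocated in that application's volume group.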

5.7 How flexible are IP address allocations for applications?

Very! An application can be configured with no IP addresses (highly unusual), or with one or more. The IP addresses can be spread across one or more of the networks configured in the cluster topology. The number of IP addresses that can be defined for an application is not limited; some administrators have applications with 20 or more IP addresses without issue.

To improve support for large numbers of IP addresses, the latest versions of the software support ranges – a single range entry indicates that all of the IP addresses from the start of the range through to the end should be assigned to the specified application.

5.8 Can the software monitor for and respond to application software failure?

Each application is currently expected to have a “Lems” daemon running. This daemon is responsible for monitoring the cluster state for the application in question. The configuration always includes a “file system monitor” but can optionally include other monitors. The software comes with a series of modules that can be added to a Lems daemon to monitor different things, one of which is a process monitor. Hence it is possible for the Lems daemon to check that certain processes are running and, if not, trigger the cluster to attempt to restart the application.

Such monitors are stateful – if a particular process or application fails a certain number of times in a given period, it will trigger the application to restart on the other node in the cluster. This makes sense since the application failure might be related to a software mis-configuration on the current node.

6 Application Modifications

6.1 Can additional file systems be added to an existing application?

Yes! Additional file systems can simply be mounted locally on the machine that is running the application; running “clbuildapp” against the application in question will then allocate the required resources, un-mount the file system, and re-mount it running over a DRBD device.
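A sketch of the sequence (the mount point and volume names are illustrative, and the exact arguments that “clbuildapp” takes are not documented here):

```shell
# On the node currently running the application:
mount /dev/appvg/newlv /app/data2   # mount the new file system natively
clbuildapp                          # run against the application; resources are
                                    # allocated and the file system is re-mounted
                                    # over a DRBD device
```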

When file systems are added in this way, their contents are automatically background-synchronized to the equivalent logical volume on the other node in the cluster. Such volumes are created automatically if they do not yet exist.

This on-line addition of file systems is considered a key strength of the software.

6.2 Can existing file systems be removed from an existing application?

Yes! Existing file systems can be removed from an existing application. The easiest way of performing this change is simply to un-mount the file system in question whilst the application is running as a clustered application. Then using the “clbuildapp” routine again will remove the file system from those managed as part of the application. The file system is not destroyed – that process is then down to the administrator if required, including the removal of the logical volume used for the DRBD meta-data.

This on-line removal of file systems is considered a key strength of the software.

6.3 Can additional volume groups be added to an existing application?

Yes! The application configuration file (known as “appconf.xml”) first needs to be changed to include a reference to the new volume group. Then file systems mounted natively on the volume group, on the node which is currently running the application, can be added to the application in the usual way – simply by running the “clbuildapp” function to allocate the required resources, after which the file systems will be automatically remounted as DRBD devices.

Again, these changes are performed whilst the application is running – a strength of the software.

6.4 Can existing volume groups be removed from an existing application?

Of course! Removing a complete volume group is a matter of performing two steps. Firstly, all file systems associated with the volume group should be un-mounted and “clbuildapp” should be run. Once this has completed, the “appconf.xml” file for the application should be modified to remove the reference to the volume group, and the “clbuildapp” utility should be run again.

Again, these changes can be performed whilst the application is running – a strength of the software.

6.5 Can existing file systems be extended?

Yes! All of the supported file system types can be extended whilst the application is running and the file system is mounted and in use! This of course assumes that the required binary tools to support on-line file system expansion are in place. The steps to extend a file system are performed on the node where the application is currently running.
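The original list of steps is not reproduced here; on a stock LVM-2 installation the underlying operations would typically resemble the following (volume and device names are illustrative, and the product may wrap these steps in its own tooling):

```shell
# On the node currently running the application:
lvextend -L +2G /dev/appvg/datalv   # grow the backing logical volume
drbdadm resize r0                   # have DRBD take up the new size
resize2fs /dev/drbd0                # grow the mounted file system on-line
```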

6.6 Is it possible to change the monitored processes for an application?

As previously stated, the Lems daemon that is run for an application can include process monitors. There is a utility called “lemsctl” that allows this running daemon to be managed. It is possible to halt, remove, and add monitors on-line without problems.

6.7 Is it possible to change the IP addresses for an application?

Version 1.2 of the software supports changing the associated IP addresses of an application whilst the application runs. For version 1.0 the changes only take effect when the application is restarted. In either case the process is a matter of changing the “appconf.xml” file and running “clbuildapp” against the application as usual.

7 Operating System Related Changes

7.1 Do I need to do anything if I upgrade my kernel version?

The software uses a kernel module called “drbd”. If the version of the kernel changes, it is likely that the software will not be able to load the available “drbd” module. In such circumstances a suitable error will be given, along with instructions on how the drbd module can be recompiled and re-installed.

For version 1.2 this process is automated when the cluster daemons start. For older versions based on 1.0, only the warning stated above will be given – intervention by the administrator is required.

7.2 Do I need to do anything if I upgrade my distribution version?

It is likely that changing the distribution will change the version of the kernel that is in use. In this case follow the steps mentioned in the previous question.

7.3 What if the version of Perl changes?

The software has been developed against various increments of Perl 5.8 and, apart from a work-around needed for 5.8.0, all iterations have worked without issue. The author cannot foresee any likely problems if minor increments to 5.8 are installed.

The more likely time for problems is when 5.10-based Perl installations become available. Of course, when that does happen the author will test the software and make any alterations required (if any are needed) as soon as possible.

8 Coping with hardware failure

9 Support and Road maps

9.1 How long will 1.0 based releases be supported for?

Version 1.0-based releases will enter “bug-fix” mode once version 1.2 final is released. That status is likely to be maintained for as long as necessary – certainly for at least 2 years. Of course, during that time any limitations in features will not be addressed; only critical bugs will be fixed.

9.2 How long will 1.2 based releases be supported for?

Again, there is no set time limit on how long support will be in place. A minimum of two years in “bug-fix” mode once version 2.0 is released is likely – though in reality it will be kept as a working code tree for as long as users continue to use it.

9.3 So version 2.0 is next? What are the main aims of such a release?

Yes! Version 2.0 is receiving intermittent development at present – once version 1.2 final is released, the majority of coding effort will move to version 2.0. The aim is to release a very rough (probably “alpha” quality) version of the code before the end of 2006. Even so, development of version 2.0 already covers 10,000 lines of Perl code.

The version 2.0 release is a completely fresh code base with a number of new design features.

9.4 What support is available for the software?

The first port of call should be the mailing list. A support forum has also recently been launched. The author is keen to hear from people experiencing any problems, since this is the best way of ensuring the continued development and progression of the software.

9.5 What about commercial support?

Although the author has received tentative approaches regarding commercial support, nothing much has happened on that front. However, the author is training others so that, should any companies wish for such support, 24x7-based support will be possible.

10 Personal Questions

10.1 Who develops the software?

The software is developed purely by myself; luckily, several keen users of the product are quick to point out issues, so I can continue to improve the product fairly rapidly. Others have sponsored certain documents and/or features, for which I am very grateful.

10.2 Why develop the software?

Personal satisfaction, mainly. The author also spends a lot of time performing consultancy, much of which is based around various commercial high-availability products. Hence the software draws on that knowledge and tries to avoid the limits and annoyances that affect those products.

10.3 Do you do this full time?

No – the software is purely a part-time effort at present. If commercial support or sponsors are found this may change, of course. However, I've committed much time and money to the project so far and have no intention of stopping now that things are really getting exciting!

10.4 Have you released any other software?

I'm not a software developer really; the majority of my time is spent performing high-end UNIX and SAN consultancy; particularly in the areas of EMC-based Symmetrix storage environments and Sun/HP and IBM proprietary UNIX platforms.

That said, I've written customized revision control systems, package management infrastructures, secure intranet sites, and many UNIX utilities – the current one under development being Skulker V2, which should soon have its own web presence. This is a cross-platform log and temporary file management tool, similar to, but much more powerful than, the “logrotate” utility.