[Linuxha-users] Re: LinuxHA failover problems

Simon Edwards simon.edwards at linuxha.net
Tue Jan 25 01:20:17 GMT 2005


Hello James,
	You might want to manually bunzip2 the attached files into /tmp on
each node and then run:

# chmod +x clstat cldaemon clutils.pm
# cp clstat cldaemon /sbin/cluster
# cp clutils.pm /usr/local/cluster/lib/perl
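
If it is easier, the steps above can be rolled into one small script to run
as root on each node. This is only a sketch: it assumes the three files were
saved to /tmp under the attachment names (cldaemon.bz2, clstat.bz2 and
clutils.pm.bz2). The function name install_cluster_fix and its optional
directory arguments are made up for illustration and are not part of the
linuxha tools.

```shell
#!/bin/sh
# Sketch only: unpack and install the patched files on one node.
# install_cluster_fix is a hypothetical helper name. The directories default
# to the paths used in the manual commands above.
install_cluster_fix() {
    src=${1:-/tmp}
    bin_dir=${2:-/sbin/cluster}
    lib_dir=${3:-/usr/local/cluster/lib/perl}

    for f in clstat cldaemon clutils.pm; do
        # Unpack only if the compressed copy is still there.
        if [ -f "$src/$f.bz2" ]; then
            bunzip2 -f "$src/$f.bz2" || return 1
        fi
        # Stop early if a file is missing rather than install a partial set.
        [ -f "$src/$f" ] || { echo "missing $src/$f" >&2; return 1; }
        chmod +x "$src/$f"
    done

    cp "$src/clstat" "$src/cldaemon" "$bin_dir/" || return 1
    cp "$src/clutils.pm" "$lib_dir/" || return 1
}
```

Calling install_cluster_fix with no arguments uses the default paths. It may
be worth keeping a copy of the existing clstat, cldaemon and clutils.pm
before overwriting them.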

This should get rid of the "8192)M" errors. More importantly, the
/var/log/cluster/cldaemon-cluster1.log file on the node that is brought
back into the cluster will now log the calls to drbdsetup. If commands
like the following appear only once, then the host is just synchronizing
active regions:

INFO  25/01/2005 00:55:29 Local: 192.168.0.31, Remote: 192.168.0.41
INFO  25/01/2005 00:55:29 app01vg/lv01: Not defined.
INFO  25/01/2005 00:55:29 Executing: /sbin/drbdsetup /dev/drbd0 disk /dev/app01vg/lv01 /dev/app01vg/lv01_meta 0
INFO  25/01/2005 00:55:29 Executing: /sbin/drbdsetup /dev/drbd0 net 192.168.0.31:9901 192.168.0.41:9901 C
INFO  25/01/2005 00:55:30 Executing: /sbin/drbdsetup /dev/drbd0 secondary

If you get lots of these and then finally one with an "invalidate" command,
then a complete re-sync is being forced.
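
The log check described above can be done quickly with grep. This is only a
sketch: check_resync is a made-up helper name, and it assumes that each
drbdsetup call sits on a single "Executing:" line in the log and that the
forced re-sync shows up as a drbdsetup line containing the word
"invalidate".

```shell
#!/bin/sh
# Sketch only: summarise the drbdsetup calls recorded in a cldaemon log.
# check_resync is a hypothetical helper name, not part of linuxha.
check_resync() {
    log=${1:-/var/log/cluster/cldaemon-cluster1.log}
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }

    # Count the logged drbdsetup invocations (grep -c prints 0 if none).
    calls=$(grep -c 'Executing: /sbin/drbdsetup' "$log" || true)
    echo "drbdsetup calls logged: $calls"

    # Assumption: a forced full re-sync appears as a drbdsetup invalidate call.
    if grep -q 'drbdsetup.*invalidate' "$log"; then
        echo "invalidate found: a complete re-sync is being forced"
        return 1
    fi
    echo "no invalidate found: only active regions should be synchronizing"
    return 0
}
```

Note that this counts every drbdsetup line in the file, including those from
earlier joins, so it is best run against a fresh log or only the lines
written since the node last rejoined.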

Please let me know how you get on!

Regards,
Simon.


On Mon, 2005-01-24 at 15:49 -0400, James MacLean wrote:
> Simon Edwards wrote:
> 
> >James,
> >	I made one or two small changes to 0.7.8 so let me run through the same
> >scenario as you...
> >  
> >
> Hi Simon,
> 
> Latest is much better on this end :). After the reboot, I get :
> 
> # clform -join
> INFO  24/01/2005 19:36:35 Validated checksum for cluster configuration
> INFO  24/01/2005 19:36:35 Checking cluster status...
> INFO  24/01/2005 19:36:35 p-6.ednet.ns.ca is running - p-5.ednet.ns.ca 
> will attempt tojoin cluster.
> INFO  24/01/2005 19:36:35 Starting cldaemon on p-5.ednet.ns.ca...
> INFO  24/01/2005 19:36:35 Waiting for p-5.ednet.ns.ca to join the cluster...
> INFO  24/01/2005 19:36:40 No response returned!
> INFO  24/01/2005 19:36:45 Connection made to p-5.ednet.ns.ca.
> INFO  24/01/2005 19:36:46 Node p-5.ednet.ns.ca successfully joined cluster.
> 
> After which the cluster shows as active.
> 
> But... then it appears that I must load the drbd module myself or the 
> syncing doesn't start up. Maybe I do not wait long enough?
> 
> Then the sync begins, but during this time I get a small perl error when 
> I issue clstat :
> 
> # clstat -application mysql
> Cluster: cluster1 - UP
> 
> 
>  Application       Node      State  Runnnig  Monitor  Stale  Fail-over?
>        mysql  p-6    STARTED  0:00:29  Running      1         Yes
> 
>  File Systems
> 
>  Mount Point              Valid   Type      State   % Complete   Completion
> Argument "8192)M" isn't numeric in division (/) at /sbin/cluster/clstat 
> line 499.
>  /var/lib/mysql            local   drbd    Syncing          0 %   32:00
> 
>  Network Configuration for mysql on p-5.ednet.ns.ca
> 
>  Intfce  Status   Times used  Time since use
>  eth0    Active            1      0:00:29:22
> 
>  General Monitors
> 
>             Type          Name    Status
>       Flag Check    flag_check   Running
>       FS Monitor     fsmonitor   Running
>       IP Monitor            ip   Running
>     Link Monitor          link   Running
>    IP Assignment       move_ip   Stopped
> 
> Which goes away once the sync is complete.
> 
> Also... :(, the sync takes as long as if I was syncing from nothing. I 
> thought during my tests of drbd alone that if one node in the drbd pair
> was offline for just a while then the sync happened quite fast? It
> appears to be syncing as if a new Secondary is being brought into the
> cluster? Of course maybe I'm just expecting it to be too fast ;).
> 
> Thanks for your quick responses. Now to do more tests :).
> 
> take care,
> JES
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cldaemon.bz2
Type: application/x-bzip
Size: 24394 bytes
Desc: not available
Url : /pipermail/linuxha-users_linuxha.net/attachments/20050125/f592f7b2/cldaemon-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clstat.bz2
Type: application/x-bzip
Size: 6051 bytes
Desc: not available
Url : /pipermail/linuxha-users_linuxha.net/attachments/20050125/f592f7b2/clstat-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clutils.pm.bz2
Type: application/x-bzip
Size: 6551 bytes
Desc: not available
Url : /pipermail/linuxha-users_linuxha.net/attachments/20050125/f592f7b2/clutils.pm-0001.bin

