clustering is SO AWESOME

Darryl Moore darryl-90a536wCiRb3fQ9qLvQP4Q at public.gmane.org
Mon Oct 12 19:55:07 UTC 2009


Hey Madi, just a quick note.

Are you aware that LINBIT do not yet recommend running DRBD
primary-primary in production environments?

http://www.drbd.org/home/mirroring/
DRBD's primary-primary mode with a shared disk file system (GFS, OCFS2).
These systems are very sensitive to failures of the replication network.
Currently we cannot generally recommend this for production use.


You say it is no longer true that the slave will be idle most of the
time. I'm unsure how that would work. One of the ways DRBD works is
that the master node assumes the cluster IP address. When it goes down,
the slave takes over that address. Accessing the cluster via the slave
would be problematic without the cluster IP address. You could do it via
the node's regular IP, but then if that node goes down you have the more
complicated problem of discovering this. As well, even from the
documentation I've seen on their website, accessing data through the
slave device via snapshots still means read-only. Lastly, in the
configurations I've seen (and modeled my build scripts after), the
actual service daemons are not even running on the slave until the
master goes down.
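
To make the active/passive arrangement above concrete, here is a toy
Python model of it. It is only a sketch: the Node and Cluster classes,
the node names and the address 192.0.2.10 are all hypothetical, and
nothing in it talks to DRBD or to any cluster manager. It only shows
the bookkeeping: one node holds the cluster IP and runs the daemons,
and the survivor picks both up when that node fails.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    alive: bool = True
    holds_cluster_ip: bool = False   # only the master answers on the cluster IP
    services_running: bool = False   # daemons are started only on the master


@dataclass
class Cluster:
    cluster_ip: str
    nodes: list

    def master(self):
        """Return the live node that currently holds the cluster IP, if any."""
        for node in self.nodes:
            if node.alive and node.holds_cluster_ip:
                return node
        return None

    def promote(self, target):
        """Give the cluster IP and the services to exactly one node."""
        for node in self.nodes:
            node.holds_cluster_ip = False
            node.services_running = False
        target.holds_cluster_ip = True
        target.services_running = True

    def fail(self, failed):
        """Mark a node as dead and fail over to the first survivor."""
        failed.alive = False
        failed.holds_cluster_ip = False
        failed.services_running = False
        if self.master() is None:
            for node in self.nodes:
                if node.alive:
                    self.promote(node)
                    break


# Hypothetical two-node setup: alpha is master, beta is the idle slave.
alpha = Node("alpha")
beta = Node("beta")
cluster = Cluster("192.0.2.10", [alpha, beta])
cluster.promote(alpha)
cluster.fail(alpha)
print(cluster.master().name)
```

Until fail() is called, beta holds neither the address nor any running
daemons, which is the "idle slave" situation I'm describing.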

Perhaps I am missing something and do not yet fully appreciate the power
of DRBD. If I am, I'd be grateful if you could enlighten me as to how
these issues are resolved. I think using a different file system such as
GFS might resolve some of them, but not all. I certainly haven't
contemplated how to set up a system like this yet.


I'm still in the process of testing and documenting my build scripts. I
will be posting all of them shortly. My target date is around the middle
of next month. I can post what I have so far if you want to offer
criticism, but keep in mind they are unfinished. My target systems are
Ubuntu, so they would need a bit of adapting for other distros,
particularly non-Debian ones.

cheers,
darryl



Madison Kelly wrote:
> Darryl Moore wrote:
>> I wrote a script to automate the building of basic block device clusters
>> using DRBD. I've started writing other scripts to build services on top
>> of that. So far just NFS, but I plan to do MySQL, and others too.
> 
> Awesome, you have those up anywhere?
> 
>> I'm really impressed with DRBD so far, though I haven't put it into a
>> production environment yet. Soon I hope.
> 
> I've been using DRBD for ~3y now. Only recently, though, have I switched
> to the new version and begun playing with primary/primary mode.
> 
>> The only down side of DRBD is that only one machine is the master at any
>> given time which means that the other one is idle and a waste of
>> resources.
> 
> Not true any more! :)
> 
>> It is a good idea to give the slave a few other duties so that it
>> doesn't ever get too bored. The other thing you can do is make your DRBD
>> cluster double headed, i.e. have each machine be the master of separate
>> resources and also be the backup for each other. I've recently updated
>> my build scripts to do this, though I haven't tested it yet. As soon as
>> I get my high availability SQL build scripts going I'm going to build a
>> double headed NFS / MySQL cluster and take it for a spin.
> 
> Check out the new version. With a cluster-aware FS (I personally use LVM
> with locking), you can have both servers using the DRBD partition at
> the same time. Also useful are OCFS2, GFS and others.
> 
>> One thing to watch out for, regardless of how you build it, is that you
>> don't load down the slaves during normal operations to such an extent
>> that they will not be able to cope with the additional load in the event
>> that the master goes down.
>>
>> cheers and happy thanksgiving weekend,
>> darryl
> 
> In my case, I've got dual CPU, quad-core Opterons (total of 8 cores)
> with 32GB/CPU and bring up virtual machines on either server set to use
> a minimum of X resources and let them balloon out to Y (to use up the
> unused resources on each node). This way, when one node fails or is
> taken off line for maintenance, I know I have enough resources to run
> all VMs on the one node without wasting the resources available when
> both nodes are alive.
> 
> If you want any help/hints/whatever getting dual-primary running, let me
> know. I've bashed my head off this stuff enough... It'd be nice to help
> save someone else some of the hassle.
> 
> Madi
> 
> -- 
> The Toronto Linux Users Group.      Meetings: http://gtalug.org/
> TLUG requests: Linux topics, No HTML, wrap text below 80 columns
> How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists
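
On the capacity point in the quoted thread (not loading a node so
heavily that it cannot absorb the other node's work), the arithmetic
can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: the 64 GB figure comes from the "32GB/CPU"
on a dual-CPU box mentioned above, while the per-VM minimums, the host
reserve and both function names are made up for the example.

```python
def fits_on_one_node(node_ram_gb, vm_min_ram_gb, host_reserve_gb=0):
    """True if the guaranteed minimum of every VM fits on a single node."""
    return sum(vm_min_ram_gb) + host_reserve_gb <= node_ram_gb


def balloon_headroom_gb(node_ram_gb, vm_min_ram_gb_on_node, host_reserve_gb=0):
    """RAM the VMs placed on one node can balloon into during normal operation."""
    return node_ram_gb - host_reserve_gb - sum(vm_min_ram_gb_on_node)


# Hypothetical minimums (the "X" per VM), split across the two nodes.
node_a_vms = [8, 8, 12]
node_b_vms = [16, 8, 4]
node_ram_gb = 64       # 2 CPUs x 32 GB, per the quoted message
host_reserve_gb = 4    # assumed allowance for the host itself

# Failure case: every VM lands on the surviving node at its minimum size.
print(fits_on_one_node(node_ram_gb, node_a_vms + node_b_vms, host_reserve_gb))

# Normal case: spare RAM on node A that its VMs may balloon into.
print(balloon_headroom_gb(node_ram_gb, node_a_vms, host_reserve_gb))
```

The first check is the one that matters for failover: as long as the
sum of the minimums fits on one node, ballooning during normal
operation does not put the cluster at risk.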




