clustering is SO AWESOME

Madison Kelly linux-5ZoueyuiTZhBDgjK7y7TUQ at public.gmane.org
Mon Oct 12 22:33:41 UTC 2009


Darryl Moore wrote:
> Hey Madi, just a quick note.
> 
> Are you aware that LINBIT do not yet recommend DRBD running
> primary-primary for production environments yet?

Yup.

> http://www.drbd.org/home/mirroring/
> DRBD's primary-primary mode with a shared disk file system (GFS, OCFS2).
> These systems are very sensitive to failures of the replication network.
> Currently we cannot generally recommend this for production use.

In my case, I have an LVM PV on top of the DRBD device. From there, I
slice it up into LVs, one for each virtual machine. This way, each logical
volume, and thus each DRBD slice (apologies to the BSD'ers), is only
written to from one side of the cluster at a time. That, I feel, is pretty
safe.
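
To make the single-writer idea concrete, here is a minimal Python sketch.
It is an illustration only, not the tooling used on this cluster: the VM
names, node names, LV paths and the find_multi_writer_lvs() helper are all
made up, and nothing in it talks to DRBD, LVM or Xen. It simply flags any
logical volume that would be in use on both nodes at once.

```python
from collections import defaultdict


def find_multi_writer_lvs(running):
    """Return the LVs that are in use on more than one node.

    `running` is a list of (vm_name, node, logical_volume) tuples that
    describe where each VM runs and which LV backs it.
    """
    writers = defaultdict(set)
    for _vm, node, lv in running:
        writers[lv].add(node)
    # Keep only the LVs with two or more distinct writer nodes.
    return {lv: sorted(nodes) for lv, nodes in writers.items() if len(nodes) > 1}


# Hypothetical layout: one LV per VM, VMs spread across both nodes.
layout = [
    ("vm_firewall", "node1", "/dev/drbd_vg/vm_firewall"),
    ("vm_web", "node2", "/dev/drbd_vg/vm_web"),
    ("vm_mail", "node1", "/dev/drbd_vg/vm_mail"),
]

conflicts = find_multi_writer_lvs(layout)
print(conflicts or "every LV has a single writer")
```

An empty result means every LV is written from one side only, which is the
property that makes the primary-primary setup tolerable here.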

However, I've been testing various failure and recovery scenarios to be 
safe. Specifically, things like starting a disk I/O intensive app on a 
VM, killing its underlying server and trying to recover the VM on the 
surviving node. So far, so good.

> You say that it is not true anymore that the slave will be idle most of
> the time. I'm unsure how this would be. One of the ways DRBD works is
> that the master node assumes the cluster IP address. When it goes down
> the slave takes over that address. Accessing the cluster via the slave
> would be problematic without the cluster IP address. You could do it via
> the node's regular IP, but then if it goes down you have a more
> complicated issue in trying to discover this. As well, even from the
> documentation I've seen on their website, accessing data through the
> slave device via snapshots still means read only. Lastly, from the
> configurations I've seen (and modeled my build scripts after), the
> actual service daemons are not even running on the slave until the
> master goes down.

I use Xen to create VMs; I don't use the cluster for individual services
any more. I used to, a la HA Heartbeat manager, but having dedicated,
self-contained virtual machines seems more robust so far.

In my setup, dom0 is nothing special. I have three NICs: a dedicated
DRBD link, a "back channel" used by all VMs which also gives me access
to the node's IPMI interface, and a third that is only used by the
firewall VM for Internet access. Neither dom0 nor any of the other VMs
uses this NIC.

Then, I set up each VM with a minimum number of CPUs and amount of RAM.
Specifically, the minimums for all VMs together add up to most of the
resources available on a single node. Then, I let the VMs "balloon" to use
more resources, up to the point where the subset of VMs on a given node
consumes most of that node's CPU cores and RAM. This way, under normal
operation, the VMs have extra headroom to do whatever each does, but they
can still all come up on one node should the other fail. This minimizes
wasted resources.
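
The sizing rule above comes down to simple arithmetic, so here is a rough
Python sketch of it. Everything in it is an assumption made for the
example: the node size, the VM names and minimums, the 1 GiB set aside for
dom0, and the choice to split spare RAM evenly. It does not call Xen; it
only checks that the minimums fit on one node and works out a balloon
ceiling per VM for normal operation.

```python
def fits_on_one_node(vms, node_ram_mb, node_cpus, dom0_ram_mb=1024):
    """True if every VM's minimum allocation fits on a single node."""
    total_ram = sum(vm["min_ram_mb"] for vm in vms.values()) + dom0_ram_mb
    total_cpus = sum(vm["min_cpus"] for vm in vms.values())
    return total_ram <= node_ram_mb and total_cpus <= node_cpus


def balloon_ceilings(vms, hosted, node_ram_mb, dom0_ram_mb=1024):
    """Split a node's spare RAM evenly among the VMs it normally hosts."""
    used = sum(vms[name]["min_ram_mb"] for name in hosted)
    spare = node_ram_mb - dom0_ram_mb - used
    share = spare // len(hosted)
    return {name: vms[name]["min_ram_mb"] + share for name in hosted}


# Hypothetical two-node cluster: each node has 16 GiB of RAM and 8 cores.
NODE_RAM_MB = 16384
NODE_CPUS = 8

vms = {
    "vm_firewall": {"min_ram_mb": 1024, "min_cpus": 1},
    "vm_web": {"min_ram_mb": 4096, "min_cpus": 2},
    "vm_mail": {"min_ram_mb": 4096, "min_cpus": 2},
    "vm_db": {"min_ram_mb": 4096, "min_cpus": 2},
}

# Failure case: can all four VMs start at their minimums on the survivor?
survivable = fits_on_one_node(vms, NODE_RAM_MB, NODE_CPUS)

# Normal case: node1 hosts two of the VMs and lets them balloon upward.
ceilings = balloon_ceilings(vms, ["vm_firewall", "vm_web"], NODE_RAM_MB)

print(survivable, ceilings)
```

The minimums are what must survive a node failure; the ceilings are only
there to put otherwise idle RAM to use while both nodes are up.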

> Perhaps I am missing something and do not yet fully appreciate the power
> of DRBD. If I am, I'd be grateful if you could enlighten me as to how
> these issues are resolved. I think using a different file system such as
> GFS might resolve some of them, but not all. I certainly haven't yet
> contemplated how to set up a system like this yet.

I've already documented all the steps needed to do the above, but I did 
a lot of it for work. I need to talk to my boss to see how or what I can 
use to create a publicly available document. Once I know, I will use 
what I can and re-write the rest as a how-to on my website. I'm seeing
what I can do to create the docs for a simple 2-node cluster on DRBD and
a 3+ node cluster using centralized storage on a software-iSCSI/SAN
server. If/when I get those done, I'll probably put together a clustering
talk, if TLUG
is interested. That won't be for some time though, so in the meantime, 
if you are interested, I'd be happy to share what I know.

> I'm still in the process of testing and documenting my build scripts. I
> will be posting all of them shortly. My target date is around the middle
> of next month. I can post what I have so far if you want to offer
> criticism, but keep in mind they are unfinished. My target systems are
> Ubuntu, so they would need a bit of adapting for other distros,
> particularly non-debian ones.
> 
> cheers,
> darryl

I'm mainly a Debian/Ubuntu user myself, but work is forcing my hand 
towards CentOS. I am interested in checking them out, and will be happy 
to test some scripts for you, too. However, I've got my TPM talk at the
end of the month and a fairly scary test at the beginning of December, so
I doubt I will have much time for extra stuff until after that. If you're
interested, though, I'd be curious to take a gander at them in the
meantime, even if I don't have the spare cycles just now to run one up.

Madi
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




