(Simple?) High availability question

Madison Kelly linux-5ZoueyuiTZhBDgjK7y7TUQ at public.gmane.org
Fri Jun 1 21:15:16 UTC 2007


Lennart Sorensen wrote:
> Well running primary/secondary bind is trivial.

   Aye, this was the least of my concerns. :P

> Running identical web servers is not too hard, although you have to
> update both whenever you make page changes.  Doing round robin
> connection distribution with a load balancer at the firewall isn't too
> hard, and there are probably better load balancers that take system load
> of each web server into account as well as checking that the web server
> is working and such.

   Since posting I found this: 
http://www.howtoforge.com/load_balancing_apache_mod_proxy_balancer

   I'll be running 2.2, so this should help deal with the apache side of 
things pretty easily.
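   For reference, a minimal mod_proxy_balancer setup on Apache 2.2 looks
something like the sketch below; the backend hostnames (web1/web2) are
placeholders, not anything from the howto.

```apache
# Hypothetical Apache 2.2 load-balancing sketch; requires mod_proxy and
# mod_proxy_balancer to be loaded.  web1/web2 are made-up backend names.
<Proxy balancer://webcluster>
    BalancerMember http://web1.example.com
    BalancerMember http://web2.example.com
</Proxy>
ProxyPass / balancer://webcluster/
```

By default requests are distributed by request count; a member that stops
responding is taken out of rotation until it recovers.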

> Running two mail servers is harder.  When a user reads or deletes a
> message, how do you ensure the update occurs on both?  Redundant mail
> reception isn't too hard since you just have one be the main mail server
> and the other a backup MX which simply holds and forwards mail to the
> primary when it comes back up (most of the time the primary receives the
> mail directly).

   The mail reception I wasn't worried about, precisely because of the 
simplicity of using multiple MX records. As you pointed out, it's 
directing users to their mailbox and keeping both copies in sync where 
the trouble starts. This might be a better candidate for a shared FS?
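   For what it's worth, the backup-MX part really is just two MX records
with different preference values; a zone-file sketch (hostnames are
illustrative):

```
; Hypothetical BIND zone fragment -- mail1/mail2 are placeholder names.
; The lower preference value (10) is tried first; mail2 only queues and
; forwards mail while mail1 is down.
example.com.   IN  MX  10  mail1.example.com.
example.com.   IN  MX  20  mail2.example.com.
```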

> Redundant pgsql is VERY hard.  If all you want is static database data
> then it wouldn't be a big deal and you could treat it like the web
> server.  Of course this is almost never what anyone wants.  Last I
> checked postgresql did not have live replication support, which is
> basically what is needed.  This is one of those places where oracle and
> db2 have a reason for existing.  I believe mysql has a replicating
> server backend, although apparently that backend is much slower and has
> fewer features than the regular one, so it is a major tradeoff there.
> People are working on replication support for postgresql, but they have
> been working on it for years and I don't think it is working yet.  It is
> a very complicated thing to implement.  Keeping in sync when two servers
> are both up and already in sync is no big deal.  Getting back in sync if
> one has been down is very hard, especially while data is still changing
> on the live server.

Foo. I was under the impression that is exactly what clustering was 
about. Is there a way to use a distributed file system (like Coda) and 
have two servers talking to the same directory structure? I am going to 
go out on a limb and guess no.

I may have to give up the idea of having load balancing at this time and 
stick with having the second server keep a mirror of the main server 
with a heartbeat between the two to have the backup take over on a 
failure of the main. Seems like a sad waste though with the second 
server just sitting there. :(
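A minimal Heartbeat (v1) active/passive setup along those lines would be
something like the following sketch; the node names, interface, service
IP, and timings are all assumptions for illustration.

```
# /etc/ha.d/ha.cf -- hypothetical two-node Heartbeat sketch
keepalive 2                 # send a heartbeat every 2 seconds
deadtime 30                 # declare the peer dead after 30s of silence
bcast eth1                  # dedicated crossover link for heartbeats
auto_failback on            # primary reclaims resources when it returns
node primary.example.com
node backup.example.com

# /etc/ha.d/haresources -- the primary owns the service IP and apache;
# on failure the backup takes both over
primary.example.com 192.168.0.10 apache
```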

The main websites I care most about uptime on use PgSQL and have 
frequent writes. Have you (or anyone) played with how to handle 
mirroring PostgreSQL's WAL? I can run a simple 'rsync' on the backup 
server, say, every 5m, but that won't help if the master fails after an 
rsync (very likely), and without an up-to-date copy of the WAL, 
rebuilding the missing bits would be, if I understand it all right, 
impossible. I could have a script on the master dump the databases very 
frequently and load the most recent dump on failure, but I'd still lose 
any changes between the last dump and the failure.
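One alternative to periodic dumps: PostgreSQL 8.x can ship WAL segments
continuously via archive_command (the PITR machinery added in 8.0), so
the backup is at most one 16 MB segment behind rather than one dump
behind. A postgresql.conf sketch, with the destination host and path as
placeholders:

```
# postgresql.conf fragment -- 'standby' and the archive path below are
# hypothetical.  Each completed WAL segment is copied to the backup
# server as it fills; %p is the segment's path, %f its file name.
archive_command = 'rsync -a %p standby:/var/lib/pgsql/wal_archive/%f'
```

On failover the backup replays the archived segments using a
restore_command in recovery.conf.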

I hope I can come up with something more robust. Perhaps I'll have to 
look into Slony more.

Thanks!!

Madi
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists
