[GTALUG] Cache DNS issues.

William Muriithi william.muriithi at gmail.com
Wed Nov 26 04:16:17 UTC 2014


Hugh,


> Sorry if my message was confusing.  I don't think that setting a
> longer TTL is the solution to your problem.  In fact, I don't know the
> right solution.  Not really my area of expertise.
>
> Setting a longer TTL will paper over the problem.  If the outage is
> short enough, and the remaining TTLs are long enough, you will not
> have a problem.  But all TTLs count down, so it is probably only luck
> if you never crash with a short remaining TTL.  Adjusting TTL way high
> will adjust the probabilities, but still leaves a vulnerability.
>

You have a point: unless I make it a full day, at some point it will
happen again.  This is in Markham, and power outages are pretty much a
weekly affair.  Pretty disappointed by the power supply up there.

> You didn't really tell us everything relevant about your problem so
> I'm guessing at a few things.
>
> Is your DNS master server on the same machine or a different one?  If
> it is on the same machine, perhaps you can delay the startup of
> postfix until after the master server is up.
>
Nope.  The master and the slave live on different hardware.
Actually, the master is an Active Directory server while the slave is
a BIND server.  The purpose of the BIND server is to ensure DNS
resolves fast enough, as I do a lot of DNS-related checks to keep spam
from coming through.

> If the server is on another machine, I don't know of an off-the-shelf
> solution for all cases.  That doesn't mean that there isn't one.
>

Me neither.  I am wondering whether removing the MX server from the
set of VMs that start automatically would solve the problem.

> If you control the master server, you could run a local slave DNS
> server on the postfix machine.  That is probably the best and cleanest
> solution.  Zone transfers don't have to happen in real time.  This
> assumes that you only really care about queries for names in that zone.
>

Actually, the bind running on the MX server is a slave.  Sorry for
mixing up cache and slave.  Shouldn't both stop serving the zone in
question if the TTL has expired?  If that's not the case, I guess I am
lost as to why it's happening, since you seem to imply it shouldn't
happen with slaves.

> If you don't control the master server:
>
> A normal (not DNSSEC) way of deciding that there is no domain with
> the given name is to give up after a query receives no answer after a
> timeout.  Just telling postfix to have more patience might work but
> has other problems.
>
> [UNTESTED] Perhaps before starting postfix, you could do a query of a
> better-be-a-sure-thing domain name with a really long timeout.
>         dig @master.server known-name.ca +time=300
> might do the trick (a 5 minute patience).  This might fit in the init
> script.

Hmm, good idea.  I could chkconfig postfix off and set up a small
script to check DNS and then bring up postfix when it resolves
successfully.  Possibly call it every 15 minutes.  Will investigate
that idea tomorrow.
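
A minimal, untested sketch of such a script, assuming dig is installed
and that "service postfix start" is how postfix gets started on this
box.  master.server and known-name.ca are just the placeholder names
from your example; the function names, retry count and delay are made
up for illustration.

```shell
#!/bin/sh
# Sketch only: wait until DNS answers, then start postfix.
# DNS_SERVER and DNS_NAME are placeholders, not real hosts.
DNS_SERVER="${DNS_SERVER:-master.server}"
DNS_NAME="${DNS_NAME:-known-name.ca}"

# Succeed only if dig exits cleanly AND prints at least one record.
# On a timeout dig exits non-zero and prints a ";;" comment line;
# for a name with no records it exits 0 with empty +short output.
dns_ready() {
    answer=$(dig "@$DNS_SERVER" "$DNS_NAME" +short +time=5 +tries=1 2>/dev/null) || return 1
    case "$answer" in
        ''|';'*) return 1 ;;
    esac
    return 0
}

# wait_for_dns ATTEMPTS DELAY
# Retry dns_ready up to ATTEMPTS times, sleeping DELAY seconds after
# each failed try.
wait_for_dns() {
    attempts=$1
    delay=$2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if dns_ready; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Only act when called with "start", so the file can be sourced or
# tested without touching postfix.
if [ "${1:-}" = "start" ]; then
    if wait_for_dns 60 15; then
        service postfix start    # assumption: SysV-style init
    else
        echo "DNS still not answering, leaving postfix stopped" >&2
        exit 1
    fi
fi
```

Run as "script start" at boot or from cron, this would keep retrying
for roughly 15 minutes (longer if the queries themselves time out)
before giving up.  As Hugh notes, it only covers startup: it does
nothing if the master goes away while postfix is already running.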


>
> Of course this is all a little improper.  If the server goes down
> without the machine running postfix going down, you have the same old
> problem.

William


