A keystroke away from Doom.
Jamon Camisso
jamon.camisso-H217xnMUJC0sA/PxXw9srA at public.gmane.org
Mon Nov 4 13:31:47 UTC 2013
On 13-10-19 01:41 AM, Robert Brockway wrote:
> (2) One night I was up late working on a problem. I stayed up late
> working on this because I was stuck trying to solve it. The problem was
> that the database backups were not restoring properly, and as we all
> know a backup needs to be tested to be a good backup. The developers
> were loading real data in preparation for launch so I had to get this
> working soon.
>
> I was dumping the database to run another restore test and I put the
> redirect around the wrong way. In my tired state I thought I had
> over-written the database. My stress levels went up rather suddenly. I
> assessed the situation and confirmed that I had not in fact damaged the
> database. This reminded me of another important moral I ostensibly
> already knew but wasn't following:
>
> Moral: Don't do sysadmin when extremely tired. It will only end in tears.
Here's one for the late night OMGWTF files:
With the LVM tools it is really easy to remove an entire volume group when
you only meant to reduce it, just by tab completing a command. On a
production SAN. Running 20+ VM guests.
So imagine, if you will, an entire 4TB SAN with no volume groups defined
and VMs still running on top, because I PEBKAC'ed a tab completion. It is
that easy - tab completion is usually a great aid in my .zshrc, but for
root I've turned it off completely, and I make sure to use bash so I have
to type everything out explicitly.
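
To illustrate how close the two commands sit, here is a rough sketch. It
assumes the pair involved was vgreduce and vgremove, which the story above
does not actually name. The command list is a partial list of real lvm2
tools; the candidates function is made up for the example and only mimics
what a shell offers when you press Tab.

```shell
#!/bin/sh
# Partial list of lvm2 volume group commands (assumed to be on PATH for root).
vg_commands="vgcfgbackup vgcfgrestore vgchange vgck vgcreate vgdisplay
vgextend vgreduce vgremove vgrename vgs vgscan"

# Hypothetical helper: print every listed command that starts with the
# given prefix, one per line, roughly like a completion menu would.
candidates() {
    prefix=$1
    for cmd in $vg_commands; do
        case $cmd in
            "$prefix"*) printf '%s\n' "$cmd" ;;
        esac
    done
}

# Typing "vgre" and pressing Tab leaves three choices, one of them fatal.
candidates vgre
```

Only one or two keystrokes after the shared prefix separate shrinking a
volume group from deleting it.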
To save things I paused all VMs and walked away for 5 minutes. This is a
corollary to Robert's Moral 1) - WALK AWAY when things mess up like
this. As in, step away from the computer and walk/move/breathe. Then get
out the disaster recovery plan and read the overview.
After doing the above, with the VMs paused, my bearings regained and a late
night data centre visit planned, I restored the volume group using a backup
from /etc/lvm/backup. I then unpaused and rebooted all the VMs.
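
For anyone who has not had to do this, a restore of that kind might look
roughly like the sketch below. It is not the exact sequence I typed. The
volume group name vg_san, the run and restore_vg helpers, and the
assumption that the backup file is named after the volume group are all
placeholders. By default the script only prints the commands; set RUN=1 as
root to actually execute them.

```shell
#!/bin/sh
# Hypothetical volume group name; substitute your own.
VG=${VG:-vg_san}

# Print the command unless RUN=1 is set, so the sketch is safe to try.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

restore_vg() {
    vg=$1
    # List the metadata backup and archive files LVM knows for this VG.
    run vgcfgrestore --list "$vg"
    # Test mode first: checks the restore without writing metadata.
    run vgcfgrestore --test -f "/etc/lvm/backup/$vg" "$vg"
    # The real restore of the volume group metadata.
    run vgcfgrestore -f "/etc/lvm/backup/$vg" "$vg"
    # Activate the logical volumes in the restored volume group.
    run vgchange -ay "$vg"
}

restore_vg "$VG"
```

This only puts the LVM metadata back. It works because the data on the
physical volumes was never touched, which is also why pausing the VMs
before doing anything else mattered.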
No one knew how close we'd come to complete failure.
Jamon
--
The Toronto Linux Users Group. Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists