posted Oct 14, 2011, 8:05 PM
There is a reason I am losing my hair....upgrades and migrations...
Many people think these are as easy as putting in the CD, typing "patch add cd", playing with your iPhone for an hour, rebooting, and then going to the bar to celebrate.
(Great job by Pat Waters, Thanks Pat)
(notice he does a snapshot/revert!!! Wise Wise man)
Did you ever watch those house remodeling shows that make it all look soooo easy? Same here, get your Home Depot credit card ready...
New installs are easy. People don't have Internet, then after you install they have Internet and you are the hero. Not so with upgrades. People HAVE Internet access, and if the upgrade goes south people DON'T HAVE Internet access and you are the a$$ that destroyed their business (they obviously were in bed at 3am while you were working).
Upgrades are easy until they go south, then upgrades are really nasty. People assume the upgrade scripts are magical and will catch every weird configuration issue. NOT!!!! And of course you are upgrading in a live environment with a 2.5 hour window. Of course the perfect storm hits when you can't go forward and you can't go backward. 
I've learned my lesson. I never approach upgrades with ease. The new Sidewinder firewall has this cool feature where it builds a virtual disk with the old environment. If anything goes wrong, just point to the virtual disk and wham bam you are back online.
So too I've learned my lessons with 18-hour, fingernail-pulling upgrades. Here are a couple of options to keep hair on your head:
1) Have a duplicate hardware platform that you build on, then swap it in and run it until you are convinced it works. If it doesn't, quickly swap the old platform back in.
2) Snapshot/restore: Anyone not doing this should be fired. They should also be castigated if they are not testing the restore to ensure the integrity of the snapshot. WHY?? Because something is broken or different in R75 and you can't restore to a previous version! How would you like to run into that during an aborted upgrade?
3) My favorite is RAID 10: if possible, pull out the redundant drives and LABEL them "Version XXX DO NOT DELETE". During the upgrade, if anything goes wrong, just slam in the backup drives, MAYBE do a SIC, and wham bam thank you ma'am. Yeah, the RAID rebuild takes a bit, but you are probably doing this at 3am anyway.
4) No better reason for HA than now. Sync to the secondary, make the secondary the primary, then upgrade the old primary. If something goes wrong, who cares?
5) Verification Test: Agree with the main application owners on what the verification test is. If you can't run this test halfway through the maintenance window: EJECT, PUT DOWN THE SHOVEL, STEP BACK FROM THE HOLE, PULL OUT OF THE DIVE, RETREAT, SUCK UP YOUR PRIDE AND AVOID THE WATERLOO. If this test does run successfully and something else goes wrong later, tell them to accompany you at 3am to fix it.
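To make the halfway rule concrete, here is a minimal Python sketch of that go/no-go call. The function name, its parameters, and the example times are my own illustration, not part of any firewall tool. It assumes you wrote down when the window started and that somebody can tell you whether the agreed verification test has passed.

```python
from datetime import datetime, timedelta


def should_roll_back(window_start, window_length, now, verification_passed):
    """Return True when it is time to abandon the upgrade and restore.

    Hypothetical helper: rolls back once the halfway point of the
    maintenance window is reached and the agreed verification test
    has still not passed.
    """
    halfway = window_start + window_length / 2
    return now >= halfway and not verification_passed


# Example: a 2.5 hour window starting at 3am, so the halfway mark is 4:15am.
start = datetime(2011, 10, 14, 3, 0)
window = timedelta(hours=2, minutes=30)

# 4:30am and the test still fails: time to eject.
eject = should_roll_back(start, window, datetime(2011, 10, 14, 4, 30), False)

# 4:00am and the test still fails: keep working, you have 15 minutes left.
keep_going = not should_roll_back(start, window, datetime(2011, 10, 14, 4, 0), False)
```

The point of writing it down this bluntly is that the decision gets made before 3am, not during. Pride is not one of the inputs.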
You are welcome....I just saved your career. I know these have saved mine...
In addition, I urge customers to consider a migration instead of an upgrade. It's like upgrading XP to a newer Windows: do you really want to pull all the viruses and 10 years of crap you downloaded into the future? NO! I urge people to build the newer version from scratch as a fresh install and then re-import the ruleset.
My gunslinger days are over. Besides having a Plan B, I document the process so others can replicate it. As I learn gotchas OR new techniques, they all get written down in the process document.
But in the end it's too late. Most of my hair is gone already. Why didn't I learn earlier????