Networking Technology

Testing Network Changes When No Test Labs Exist? 164

Posted by timothy
from the michael-gurski-special dept.
vvaduva writes "The ugly truth is that many network guys secretly work on production equipment all the time, or test things on production networks when they face impossible deadlines. Management often expects us to get a job done but refuses to provide funds for expensive lab equipment and test circuits, or reasonable time to get testing done before moving equipment or configs into production. How do most of you handle such situations, and what recommendations do you have for creating a network test lab on the cheap, especially when core network devices are vendor-centric, like Cisco?"
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Thursday December 24, 2009 @07:17PM (#30547830)

    Whenever you're working in/on a production environment, only one rule matters:

    Don't fuck it up.

  • by Lord Byron II (671689) on Thursday December 24, 2009 @07:20PM (#30547848)

    There are zero replies and the story is already tagged with "youreboned". That's the truth. If your higher-ups won't front the money for proper test equipment and expect you to roll out production-ready equipment on the first go, then you really are boned. Of course, you can mitigate this with simple pen-and-paper analysis. What should each piece of equipment do? Are the products we've selected appropriate for the roles we're going to put them in? These sorts of questions can find a lot of bugs without any sort of testing. If you think, "what would I do if it was the 1980s?" then you'll be fine.

  • Could be worse (Score:4, Insightful)

    by 7213 (122294) on Thursday December 24, 2009 @07:22PM (#30547858) Homepage

    The best bet is to be ready to blame the vendor when things go south ;-)

    Seriously, I'm right there with you. If management does not want to provide for a test lab and reasonable time to test, then it's clear they've made a 'business decision' that the network is not of sufficient value, or the risk is not great enough, for such investments.

    This may change quickly once something goes south (assuming they understand why it did) but you're gonna be talking to a brick wall until then.

    It could be worse: you could have management that is afraid of its own shadow and freaks out at the idea of replacing redundant components after a HW failure. (Ever had to get VP approval to replace a failed GBIC? Oh, I have & yes, I hate my life).

  • by DigiShaman (671371) on Thursday December 24, 2009 @07:36PM (#30547938) Homepage

    Not all changes are a one-way trip. Having a rollback plan is also important. Should something very unexpected happen, be prepared to roll back any and all changes to undo what has just been done.

  • by Renraku (518261) on Thursday December 24, 2009 @07:41PM (#30547962) Homepage

    If you get fired for failing to do a job for which you were not equipped (and they know you aren't equipped for it), you might be able to sue because they created a hostile work environment. Hostile work environment lawsuits aren't just for sexual harassment, folks.

  • by BiggerIsBetter (682164) on Thursday December 24, 2009 @07:44PM (#30547980)

    Not all changes are a one-way trip. Having a rollback plan is also important. Should something very unexpected happen, be prepared to roll back any and all changes to undo what has just been done.

    Couldn't agree more, except to say, don't assume you'll be rolling back from a known state. I've seen roll-back plans that assume they're undoing the changes just put in, not reverting to the state before the changes. Yes, there's a difference between the two! E.g., if your install fails, maybe you can't un-install. Yes, this might mean additional resources and the overhead of FS and DB snapshots, and complete copies of config files, but better that than the alternative.
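
To make the distinction concrete, here is a minimal Python sketch of "revert to the state before the changes" using a complete copy of a config file. It is an illustration under assumptions, not anyone's actual tooling: the names `snapshot`, `restore`, `demo` and the file `router.conf` are hypothetical, and a real device config would normally be pulled off the box first.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(config: Path, backup_dir: Path) -> Path:
    """Keep a complete copy of the config as it is BEFORE the change."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    digest = sha256_of(config)
    backup = backup_dir / f"{config.name}.{digest[:12]}.bak"
    shutil.copy2(config, backup)
    return backup


def restore(backup: Path, config: Path) -> None:
    """Put the pre-change copy back, whatever state the config is in now."""
    shutil.copy2(backup, config)
    if sha256_of(config) != sha256_of(backup):
        raise RuntimeError("restore verification failed")


def demo() -> bool:
    """Simulate a half-applied change and a rollback to the snapshot."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        config = root / "router.conf"  # hypothetical file name
        config.write_text("hostname edge1\n")
        before = sha256_of(config)

        backup = snapshot(config, root / "backups")
        # The change fails partway, leaving a state nobody planned for.
        config.write_text("hostname edge1\nhalf-applied change\n")

        restore(backup, config)
        return sha256_of(config) == before
```

The rollback never tries to reverse the individual steps of the change; it only needs the snapshot taken beforehand, so it still works when the change died halfway.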

  • by Anonymous Coward on Thursday December 24, 2009 @08:04PM (#30548098)

    Been there, done that (A LOT!!)
    But it has failed quite a few times too..

    If no money is available for test labs, make good plans. Tell the people who wanted the changes (or, if you are the one who wants the changes, inform the right people that you will be doing the work). Agree on a service window. Have backup plans. Have all configurations saved. Let all users know that after 10pm on that Saturday the network will be down for 10 minutes, and so on.

    Have plenty of contingency plans, and let the 'responsible' people know what you are about to do. Plan everything 'wide': even for a 5-minute cable swap, reserve a 2-hour service window outside office hours.

  • by afidel (530433) on Thursday December 24, 2009 @08:36PM (#30548244)

    This is networking equipment. Other than transitory information like peer maps and MAC tables, which can be re-learned, you should always be able to revert to the previous state of the software and configuration.

    My advice: out-of-band management is the network guy's best friend, and POTS is the best OOB available. Also learn how to change the running config without affecting the saved config; that way the worst case is that you have to power cycle (which can be done with the correct OOB config, or you can pre-schedule a reboot that you cancel if everything goes well). Oh, and downtime windows might seem like a luxury, but unless you are Google or Amazon, the business needs to be made aware that they are necessary and critical to the smooth functioning of its IT infrastructure, so you should be making these changes during the downtime window, when everyone is aware that things might break.
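
The "pre-schedule a reboot that you cancel" trick can be modeled in a few lines of Python. This is a simulation only, with a hypothetical `Device` class and `guarded_change` helper that talk to no real hardware; on Cisco IOS the usual commands for the same idea are `reload in <minutes>` and `reload cancel`, but check your own platform's documentation before relying on that.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Device:
    """Toy model of a device with a saved and a running config."""
    saved: Dict[str, str]
    running: Dict[str, str] = field(init=False)
    reboot_scheduled: bool = False

    def __post_init__(self) -> None:
        # At boot the running config is a copy of the saved one.
        self.running = dict(self.saved)

    def reboot(self) -> None:
        """A reboot throws away unsaved changes to the running config."""
        self.running = dict(self.saved)
        self.reboot_scheduled = False

    def save(self) -> None:
        """Make the running config the one that survives a reboot."""
        self.saved = dict(self.running)


def guarded_change(
    device: Device,
    change: Dict[str, str],
    check: Callable[[Dict[str, str]], bool],
) -> bool:
    """Apply a change to the running config with a reboot as safety net."""
    device.reboot_scheduled = True   # schedule the reboot first
    device.running.update(change)    # touch the running config only
    if check(device.running):
        device.reboot_scheduled = False  # all good: cancel the reboot
        device.save()                    # and only now save the change
        return True
    device.reboot()  # the safety net fires and the saved config comes back
    return False
```

The ordering is the point: schedule the reboot before touching anything, and save only after the check passes, so a change that locks you out undoes itself.
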
  • Re:Could be worse (Score:3, Insightful)

    by hazem (472289) on Thursday December 24, 2009 @09:53PM (#30548506) Journal

    That reminds me of an article by Nelson Repenning, "Nobody ever gets credit for fixing problems that never happened". It's quite an interesting read... The guy who "saves the day" during an emergency always seems to get credit and reward, but what about the guy who keeps the emergency from ever happening?

  • by Kr1ll1n (579971) on Friday December 25, 2009 @02:19AM (#30549520)

    First, pick the closest site within driving distance to test with. Make sure you have a method for accessing the site in case of failure: a dial-in modem or an additional access path. Even if it weakens security, you can remove it after you have verified that the changes work. Trust me on this. I have overseen remote Cisco networks ranging from 30+ sites to 300+ sites, all at distances that were never going to be approved for a drive.
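
As a rough sketch of that rollout rule in Python: start with the site you can drive to most easily, and refuse to plan at all if any site lacks a fallback way in. The `Site` fields and the `rollout_order` function are hypothetical names made up for illustration, and the distances are whatever you consider drivable.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Site:
    name: str
    drive_km: float            # driving distance from where you sit
    has_fallback_access: bool  # dial-in modem or another access path


def rollout_order(sites: List[Site]) -> List[Site]:
    """Return sites nearest first; reject any without fallback access."""
    missing = [s.name for s in sites if not s.has_fallback_access]
    if missing:
        raise ValueError(f"no fallback access at: {', '.join(missing)}")
    return sorted(sites, key=lambda s: s.drive_km)
```

For example, `rollout_order([Site("far", 120.0, True), Site("near", 15.0, True)])` puts `near` first, so the first change goes where a drive can still fix it.
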
  • by lukas84 (912874) on Friday December 25, 2009 @04:32AM (#30549846) Homepage

    Everyone has a test environment. But not everyone has a production environment.
