Trying to Help a Troubled Network with Linux?
vmehta asks: "I was recently put in a situation where I am trying to help a troubled network with many students accessing it. There are issues with broadcast packets and random outages which seem to be plaguing the network. What tools and methods are the best practice when trying to use Linux and Open Source to analyze and fix a network?"
Assess the problem (Score:5, Insightful)
Basically, OSS/Linux are great, but don't rush in without establishing the issues first.
Re:Assess the problem (Score:2, Informative)
Re:Assess the problem (Score:3, Informative)
but if your 100mbit network is being overhauled, it's quite difficult
to isolate single responsible instances.
I guess that probably you will end up doing that
1) get rid of cheap hubs (made in paiwan) and get some real network switches in place, like those from SMC. Having an old buggy hub talking to several cheap NICs in several machines ends up in massive packet collisions, resulting in a network that doesn't carry much but is totally jam
Re:Assess the problem (Score:5, Informative)
Here's an idea: Before you blunder in with an answer, the first step is to work out what the question is. :)
Re:Assess the problem (Score:1)
Chances are, he's not going to install linux under vmware to solve the problem ;)
Re:Assess the problem (Score:2)
Chances are, he's not going to install linux under vmware to solve the problem
Maybe he just happens to have a Linux laptop that he's willing to plug into the network to do the scan/diagnostic stuff. Maybe he wants OSS because he doesn't want to use / can't afford expensive software solutions.
I use Linux at work, while most of my coworkers have either WinXP or OSX. Although sometimes a task is better accomplished from one of my linux boxes, I'm not rushing
Re:Assess the problem (Score:1)
Re:Assess the problem (Score:2)
Someone who tries to justify the first person's cluelessness by trying to posit some alternate universe where it would make sense.
Re:Assess the problem (Score:1)
Map the physical/logical network. (Score:2)
Remember, when a NIC is connected to a switch, they only auto-negotiate if both are set to auto-negotiate. If someone forces one end to a fixed speed/duplex setting but doesn't set the other end to match, you will have a lot more collisions and such.
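To make the mismatch case concrete, here is a rough Python sketch of the usual outcome. It is a simplified model, not a reading of any real switch: `PortConfig`, `effective_duplex` and `has_duplex_mismatch` are made-up names, and it assumes both ends support full duplex and that a port left on auto falls back to half duplex when its partner does not negotiate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PortConfig:
    """One end of a link (hypothetical model, for illustration only)."""
    autoneg: bool
    duplex: Optional[str] = None  # "full" or "half"; only used when forced


def effective_duplex(local: PortConfig, remote: PortConfig) -> str:
    """Duplex the local port ends up using under the simplified model."""
    if not local.autoneg:
        if local.duplex not in ("full", "half"):
            raise ValueError("a forced port needs duplex 'full' or 'half'")
        return local.duplex
    if remote.autoneg:
        # Both ends negotiate; assumption: both support full duplex.
        return "full"
    # The partner does not negotiate, so the auto side falls back to half.
    return "half"


def has_duplex_mismatch(a: PortConfig, b: PortConfig) -> bool:
    """True when the two ends of the link settle on different duplex modes."""
    return effective_duplex(a, b) != effective_duplex(b, a)


auto = PortConfig(autoneg=True)
forced_full = PortConfig(autoneg=False, duplex="full")
print(has_duplex_mismatch(auto, auto))
print(has_duplex_mismatch(auto, forced_full))
```

In this model an auto port facing a port forced to full duplex comes out as a mismatch, which is the classic source of late collisions on the half-duplex side.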
Make sure that your collision domain is set up correctly. Pay attention to the length of the cables. This is where the physical map comes in. You can che
Re:Assess the problem (Score:3, Interesting)
I agree with madaxe42: first things first. Diagram the network. Figure out where hubs and switches are. Figure out where the firewalls are. Figure out how packets traverse the network(s). If it's a single network with a single point of access to the internet, this should be (relatively) easy. If you are looking to save the day with Linux, what you could do is set the switches to use "port mirrors" to capture every packet on the network
You are infected with viruses most likely (Score:5, Insightful)
Re:You are infected with viruses most likely (Score:1, Funny)
Now people are shouting at me, something about an Oracle.
Who the fuck is this Oracle dude and why has he hacked into our network? Is he like that Mitchick character? I hope they never let him out of prison!
Loopback (Score:2)
Re:Loopback (Score:2)
Re:Loopback (Score:2)
Shoot the students (Score:5, Funny)
OSS? Linux? WHY? (Score:4, Interesting)
Christ, this is like the late 90's, when everything suddenly had "e" in front of it. Dude, get Ethereal, slap it on any Windows box, and be done. No need to get nerdy with Linux. If you know enough that it's broadcast traffic, you're halfway there.
Re:OSS? Linux? WHY? (Score:1)
you mean eThereal don't you
Re:OSS? Linux? WHY? (Score:3, Informative)
If you want to use a PC running Ethereal to monitor 802.11 traffic to or from other machines, rather than using Ethereal only to look at traffic to and from the machine on which you're running Ethereal, you should seriously consider running it on a recent version of Linux or of one of the free-software BSDs, rather than on Windows.
Re:OSS? Linux? WHY? (Score:3, Insightful)
Insightful eh? (Score:1)
Know your network. Document it! (Score:5, Insightful)
Do your network segments have multiple subnets attached to them?
Is everything subnetted properly?
The first set of questions are ones YOU should be able to answer. After all, it's YOUR network, and YOU should know how it's set up. The last two are harder to deal with, because these settings may be on computers not in your control.
Answer the first questions first, then when you are looking at packet traces, TCP/IP dumps, logs, etc. and you see a problem, you'll have a better idea where the problem is physically located, saving much time and energy.
And then there's the "dumb questions" I shouldn't have to ask: Do you have a loop? Are your cables wired to T568A or T568B standards? Are all your cables in good repair?
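For the "is everything subnetted properly?" question, one way to check your documentation against reality is a few lines of Python with the standard `ipaddress` module. This is only a sketch: the subnets, host names and addresses below are invented for illustration, and `overlapping_subnets` and `misplaced_hosts` are hypothetical helpers, not part of any tool mentioned in this thread.

```python
import ipaddress
from itertools import combinations


def overlapping_subnets(cidrs):
    """Return pairs of documented subnets whose address ranges overlap."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)]


def misplaced_hosts(hosts, cidrs):
    """Return hosts whose configured address/mask matches no documented subnet.

    hosts maps a host name to its 'address/prefix' as set on the machine.
    """
    nets = {ipaddress.ip_network(c) for c in cidrs}
    bad = {}
    for name, iface in hosts.items():
        configured = ipaddress.ip_interface(iface).network
        if configured not in nets:
            bad[name] = str(configured)
    return bad


# Example data (made up): two lab subnets, one PC with the wrong netmask.
documented = ["10.0.0.0/24", "10.0.1.0/24"]
hosts = {"lab-pc1": "10.0.1.20/24", "lab-pc2": "10.0.1.77/16"}
print(overlapping_subnets(documented))
print(misplaced_hosts(hosts, documented))
```

A host with the wrong mask is exactly the kind of setting that lives on a machine you don't control, so it is worth checking the list against what the boxes actually report.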
Re:Know your network. Document it! (Score:1)
Are your cables wired to T568A or T568B standards?
It makes no functional difference which standard you use for a straight-thru cable. You can start a crossover cable with either standard as long as the other end is the other standard. It makes no functional difference which end is which. Despite what you may have read elsewhere, a 568A patch cable will work in a network with 568B wiring and 568B patch cable will work in a 568A network. The electrons couldn't care less.
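If you want to convince yourself, the two pinouts can be compared mechanically. The colour orders below are the standard T568A and T568B pin assignments; `pin_map` and `classify` are names made up for this sketch, and "crossover" here means the 10/100 kind (pins 1/2 swapped with 3/6).

```python
# Wire colours by pin number (1-8) for each standard.
T568A = ["white/green", "green", "white/orange", "blue",
         "white/blue", "orange", "white/brown", "brown"]
T568B = ["white/orange", "orange", "white/green", "blue",
         "white/blue", "green", "white/brown", "brown"]


def pin_map(end1, end2):
    """Map each pin on end1 to the pin on end2 that carries the same wire."""
    return {i + 1: end2.index(colour) + 1 for i, colour in enumerate(end1)}


def classify(end1, end2):
    """Classify a cable from the colour order seen at each connector."""
    mapping = pin_map(end1, end2)
    if all(src == dst for src, dst in mapping.items()):
        return "straight-through"
    crossover = {1: 3, 2: 6, 3: 1, 4: 4, 5: 5, 6: 2, 7: 7, 8: 8}
    if mapping == crossover:
        return "crossover"
    return "miswired"


print(classify(T568A, T568A))
print(classify(T568B, T568B))
print(classify(T568A, T568B))
```

Both same-standard cables map every pin to itself, so electrically they are the same straight-through cable; only the mixed A/B cable comes out as a crossover.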
Re:Know your network. Document it! (Score:1)
Toolsets (Score:1)
The best thing you can do is use a tool such as Ethereal to find the IP of the system or systems causing it, and subject them to a good cleanup.
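If you would rather tally the offenders from a text capture than eyeball it, something like this rough Python sketch works. It assumes lines shaped like the output of `tcpdump -e -n` (source MAC, then ` > `, then destination MAC); the sample lines are invented, `broadcast_talkers` is a made-up helper, and it counts by source MAC, which you would then map back to an IP and a switch port yourself.

```python
import re
from collections import Counter

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
# Assumed line shape: "<timestamp> <src mac> > <dst mac>, ethertype ..."
LINE = re.compile(rf"(?P<src>{MAC}) > (?P<dst>{MAC})", re.IGNORECASE)


def broadcast_talkers(lines):
    """Count frames sent to the Ethernet broadcast address, per source MAC."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if match and match.group("dst").lower() == "ff:ff:ff:ff:ff:ff":
            counts[match.group("src").lower()] += 1
    return counts.most_common()


# Invented sample lines, for illustration only.
sample = [
    "12:00:01.000001 00:11:22:33:44:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 10.0.0.1 tell 10.0.0.5, length 46",
    "12:00:01.000200 00:11:22:33:44:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 10.0.0.2 tell 10.0.0.5, length 46",
    "12:00:01.000300 66:77:88:99:aa:bb > 00:11:22:33:44:55, ethertype IPv4 (0x0800), length 74: 10.0.0.9.5000 > 10.0.0.5.80: tcp 0",
]
for mac, frames in broadcast_talkers(sample):
    print(mac, frames)
```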
For a good toolset, check out the Auditor Security Tools LiveCD for a collection of tools you can take with you wherever you go...
Auditor tools [remote-exploit.org]
It's a NIC (Score:4, Insightful)
Re:It's a NIC - YES! (Score:2)
While it's *possible* this is a virus (as others have said), I'd look at hardware first. A bad transceiver will generate more bad traffic than a virus could ever hope to.
Re:It's a NIC (Score:2)
Or it could be arp flooding, or it could be a virus, or it could be a greedy student downloading music, or it could be too much bittorrent traffic, or it could be a million other things.
Troubleshooting these things for a living, trust me, nothing is certain until you've figured out what it is.
To the poster:
Use ethereal and watch where the traffic is coming from. Use management built into your switches to watch for ports going down when there are outages. Use traceroutes to find a dead hop (if
It's still a NIC (Score:2)
It certainly could be any of the things you mention. With the vagueness of the original post, it could even be a layer 7 problem (i.e. a crappy Windows server.) But with
Re:It's still a NIC (Score:2)
Go back to your hole, troll.
Of the top of my head (Score:2)
Use ping ip-address to see if you can get to the router and beyond. Make sure "allow ICMP" is enabled in the router.
Use traceroute -n ip-address to see where the traffic is failing.
Is it a DNS problem? Try host some.host.name to make sure you can resolve names.
Is it a DHCP problem? Try dhclient to see if you can get an IP address. (maybe pump on some systems.)
Connect a hub (not a switch) to some strategic place on the network. Give yours
Re:Of the top of my head (Score:2)
Re:Of the top of my head (Score:2)
I'm only a student, not a systems administrator so I wouldn't pretend to suggest I know what's acceptable and what's not, but this would piss me off if I knew someone was doing this to me. I imagine this kind of behaviour should be kept under one's hat
Further, random unplugging of cables
Re:Of the top of my head (Score:1)
This would be because you're a student. Students tend to think they have some right to a network and every network resource they can imagine. They don't. On the other hand, the administrator has the responsibility of making sure the network and its resourc
Re:Of the top of my head (Score:2)
If my activities are suspect, an administrator can and should investigate and this should be mandated in the policy.
Re:Of the top of my head (Score:1)
Just because you wish it, doesn't make it so.
map, isolate, trend (Score:5, Informative)
Step 2) Isolate the problem protocols and hosts. Be on the lookout for AppleTalk, IPX, or old NetBIOS, all very chatty protocols. Look for old hubs and replace them with switches. Look for compromised boxen. Try to VLAN things logically (by department or usage, whichever is best for the environment). Tools are snort, ethereal, ntop, and syslog (any managed switches should be sending to a syslog server; I've used syslog-ng).
Step 3) Trend as much as you can. Even before the network is cleaned up, start to collect statistics from the switches and/or hosts on your network. Any gateways should be monitored as well. This will let you see if there are problems correlated to a particular time of day, if you're going over your bandwidth, etc. Tools are MRTG, or for more depth try Cacti http://www.cacti.net/ [cacti.net]
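The arithmetic behind those trend graphs is simple enough to sketch in Python. Two readings of an interface octet counter (for example ifInOctets) taken a known number of seconds apart give you link utilization, as long as you allow for the counter wrapping. The function names and sample numbers are made up, and the 300-second interval and 100 Mbit/s link speed are assumptions for the example.

```python
def counter_delta(prev, curr, bits=32):
    """Difference between two readings of a wrapping counter.

    Only correct if the counter wrapped at most once between readings.
    """
    return (curr - prev) % (1 << bits)


def utilization(prev_octets, curr_octets, seconds, link_bps, bits=32):
    """Fraction of link capacity used between two octet-counter samples."""
    if seconds <= 0 or link_bps <= 0:
        raise ValueError("seconds and link_bps must be positive")
    bits_sent = counter_delta(prev_octets, curr_octets, bits) * 8
    return bits_sent / (seconds * link_bps)


# Made-up samples: 375,000,000 octets in 300 s on a 100 Mbit/s port.
share = utilization(1_000_000, 376_000_000, 300, 100_000_000)
print(f"{share:.0%} of the link")
```

On a busy fast link a 32-bit counter can wrap more than once between polls, in which case the result is meaningless; poll more often or use the 64-bit counters (ifHCInOctets) where the switch supports them.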
There is much more after you get to this point, but people will be much happier the faster you get here.
Good luck
Take a step back (Score:3, Informative)
Grab a consultant from a local small Linux shop for a few days. Someone with good knowledge about system/network architecture.
Get them to poke around on your network. Provide all documentation you have available.
After the first day, you should have all the information necessary to write up a document regarding your existing issues. Make notes while he's using tools to investigate. From there you work with the consultant to come up with a separate document for resolutions with a criticality rating.
From there, you want systems in place to monitor the health of your network. Have a chat to him about it, but I'd be inclined to build a solution which was centered around using Nagios.
While consultants can (and frequently do) suck when you come to specifics, they are a valuable resource for pointing you in the right direction. And experience counts! They've done this stuff before, they know the pitfalls and proven solutions.
Consultants (Score:2)
You should read between the lines. He said: I was recently put in a situation...
Which means he is the consultant. Of course, thanks to a fake CV made up by the consultancy firm's sales representative, they sent him even though he has no clue about network administration.
The 10 step Universal Troubleshooting Process (Score:2, Informative)
Low-tech (Score:4, Funny)
Network Traces (Score:2)
Start tracking those broadcasts down and find o
Fuck (Score:4, Insightful)
Go on, mod me 'insightfull' or mod me 'flamebait', it's one or the other.
Re:Fuck (Score:2)
Re:Fuck (Score:2)
Of course, if they were competent, there would be no market at all for conslutants in the first place.
Re:Fuck (Score:2)
Best. Freudian slip. ever.
Re:Fuck (Score:1)
You are right though, he should seek assistance.
Baselining (Score:2)
For baselining, I'd enable SNMP for all the managed devices. Then use something like MRTG with RRDtool and chart every port on every switch for a week or so.
While that's happening in the background, start mapping your LAN. Use something like Visio on a laptop and start visiting switches and routers. Confirm the connections between all the routers and switches. Then use good labels (no, not scotch tape and paper) to document those connections with F
top 75 list (Score:2, Informative)
More tools than you could learn in a reasonable timeframe can be found here: http://www.insecure.org/tools.html [insecure.org]
I would have posted sooner, but T-Mobile's data coverage has been spotty since Wilma hit. Still no power or fuel, but at least I can get my geek-fix now.
Open source network analysis tools (Score:2)
These are some of the tools to consider, in no particular order:
You'll have to read the descriptions to decide which ones to try.
Re:Open source network analysis tools (Score:2)
As others have said, good documentation of the network is a must. I was thrown into a similar situation a year or 2 back at my high school (I graduated in 94, so it wasn't as a student). After doing a walk-through of the network and finding every single hub (there were 2 switches) and what was attached to it, we could then easily locate some of the problems. In some cases they have hubs chained
read your network (Score:3, Informative)
* Put your box on the monitor/mirror/analysing port of the switch and read the traffic with tcpdump/tethereal/ethereal (if you just want to check the broadcasts, it does not have to be a monitoring port). Edit the packet filter expression until you do not see the legal/uninteresting traffic anymore but only the suspects. (They are students? Have fun filtering all the p2p traffic
* Watch out for ICMP errors, especially ICMP-redirects. Watch out for TCP-resets. Watch out for fragments. Watch out for malicious Spanning-Tree packets. Watch for SMTP to many IPs (spamming trojans), IRC (zombies), weird packets eg. fragmented UDP (zombies attacking a target)
* Check the MAC addresses in the Ethernet frame header ('tcpdump -e'): are they constant? If there are packets IP_A<-->IP_B, are the corresponding MACs really MAC_A<-->MAC_B, or MAC_A-->MAC_B and MAC_B-->MAC_C instead?
* Install an arpwatcher. Stealing the default-gateway's MAC is an effective DoS attack on a network.
* Put 2 NICs into a fast linux box, bridge ('brctl') them together, put this linuxbridge in front of the default-gateway. Dump again. Install a snort on it and let it see the traffic - what does the snort log say?
* Do the switches have the feature to log to a remote syslog daemon? Do so and read those logs! Check all the snmp-variables on the switches, especially the "errors". Read the logs of the default-gateway.
* Watch the amount of traffic (snmpget the port-counters of the switches and make mrtg-graphs of the results). Maybe the problem only strikes if some switch ports are under high load?
* Scan the network with nessus. Maybe you'll find some bindshells.
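The MAC check and the arpwatch idea in the list above boil down to remembering which MAC each IP was last seen with and flagging any change. Here is a minimal Python sketch of that logic, assuming you have already extracted (IP, MAC) pairs from a capture or from ARP replies; `mac_changes` and the sample addresses are invented for illustration, and this is not how the real arpwatch is implemented.

```python
def mac_changes(observations):
    """Return (ip, old_mac, new_mac) for every time an IP changes MAC.

    observations is an iterable of (ip, mac) pairs in the order seen.
    """
    last_seen = {}
    changes = []
    for ip, mac in observations:
        mac = mac.lower()
        old = last_seen.get(ip)
        if old is not None and old != mac:
            changes.append((ip, old, mac))
        last_seen[ip] = mac
    return changes


# Invented sample: the gateway's IP suddenly answers from another MAC.
seen = [
    ("10.0.0.1", "00:11:22:33:44:55"),
    ("10.0.0.7", "66:77:88:99:aa:bb"),
    ("10.0.0.1", "de:ad:be:ef:00:01"),
]
for ip, old, new in mac_changes(seen):
    print(f"{ip} moved from {old} to {new}")
```

A change on the default gateway's IP is the one to look at first; a change on a DHCP client's IP may just be a new lease.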
Hope this helps.
g.
Step 1 (Score:2)
Step 2
Follow his/her recommendations (which will probably be splitting the network into more L3 domains). Get a 6500, or a few 3750s, or if you really can't afford much, a few 3550 switches (which will leave you out of luck when IPv6 starts getting used, but otherwise are a fine choice).
This is about having L3 switches closer to the end user than
Re:Step 1 (Score:1)
So, let me quickly summarize your solution:
1. Get a consultant.
2. Blow $50K in Crisco hardware (yah, you heard me, Crisco, not Cisco)
3. Put a bunch of snot-nosed barely literate retards, err, sorry, students on a L3 network where they can run fscking kazaa all day.
I haven't laughed this hard in a while
check the obvious (Score:1)
First and always first in troubleshooting networks (Score:1)
Just this summer I tracked down an error that was caused by a cisco wireless access point trying to pull electricity from the cat5. It was UNPLUGGED from the power! It took down a whole segment of the network.
The way we found it was from the solid light on the switch.
Documentation (Score:1)
If they run Cisco equipment, a show cdp neighbor will help you a lot. Keeping up-to-date documentation on a network (especially a large one) is a difficult task, but it will make solving future problems much easier.
All sorts of good tools (Score:2)
Here are some tools I use for just about the same thing you're about to do, with a brief reason why I use each. Start with one, then once mastered move to another.