Networking Software Technology

Network Monitoring Options? 42

Nom du Keyboard asks: "We have a LAN network of 7 servers and about 400 PCs. Every so often I'll notice immense slowdowns, from minutes to occasional delays of a couple hours, while getting data from various servers, and it happens from more than just my PC. So far we haven't had any way of determining if a server has suddenly gotten tied up, or if there is some failure in the communications backbone. Without a lot of money to spend on this (I think it's more important than others right now), what cheap or free monitoring options are there available that can map and isolate problems in a network of this size?"

  • some options (Score:4, Informative)

    by Yonder Way ( 603108 ) on Friday December 16, 2005 @07:46PM (#14276439)
    Some of the ones I have more recent experience with. All of these require some reading and planning before you set them up.

    OpenNMS - Probably the most trouble-free NMS I've found so far. No, not "trouble-free". But the closest to it.

    Nagios - The most flexible, but also the biggest royal pain in the ass to set up & maintain. Almost infinitely scalable, though, if you are willing to take the time to write some perl scripts to automate most administrative tasks and divide the monitoring work up (several "slave" hosts can harvest monitoring data for a subset of your network and push it to your central Nagios server which greatly lessens the load on your main monitoring server). Some really great monitoring possibilities are out there if you look into NRPE with Nagios.

    OpManager - We bought this commercial solution at my last job. Great for monitoring Windows servers. A real pain in the ass to monitor anything else with any level of sophistication. It also has some fatal bugs that cause it to quietly orphan nodes if it misses a scheduled poll!
    • Re:some options (Score:3, Informative)

      by Blkdeath ( 530393 )

      Some of the ones I have more recent experience with. All of these require some reading and planning before you set them up.

      Before you get into network monitoring software, start at layer 1. Look at the physical topology of the network. Do you have network/switch maps? If not, get some. If there are none, make some. How is your network configured? Is it a high speed backbone (1G? 10G?) with low or high speed desktop connections (10Mbit? 100Mbit?) Is WIFI in play? Are you using VLANs? Are you connected to

      • This guy knows what he is talking about. Print services can be a bitch. MS, gah, chose a messy protocol and hasn't improved it much. I am assuming you have a small IT staff? Ad-hoc printing set-up? You could have extreme amounts of print server traffic. You should also try VLANing different departments out. It helps keep problems isolated, increases security, and lowers the broadcast traffic across the entire domain so it can increase speed.
    • Re:some options (Score:2, Informative)

      by mjhuot ( 525749 ) *
      Let me start by saying I work on the OpenNMS project. You could use OpenNMS very easily to accomplish your goals. OpenNMS does many things; the features that would be most useful to you for this problem are service polling, service response time graphs, SNMP performance graphs, and thresholding. Here is a quick rundown on each of these -

      Service Polling
      OpenNMS can be configured to poll services on your servers. It will do checks for many protocols such as HTTP, SMTP, FTP, HTTPS, DNS, NTP, RADI
    • Anyone new to monitoring a network is going to be up against a learning curve somewhere. Start with something you are comfortable with until you start to really understand what is at hand. I have driven many of them and they all have pluses and minuses. If you want something that can monitor most services, collect SNMP data, process SNMP traps, send threshold alarms, and send out alerts with minimal fuss, OpenNMS is the way to go IMHO. I, like another poster, got involved with the project after understandin
  • tried (Score:3, Insightful)

    by nocomment ( 239368 ) on Friday December 16, 2005 @07:48PM (#14276450) Homepage Journal
    Cacti? ettercap/ethereal/whatever? Ran snort to see what kind of traffic is on your network? You left out an awful lot of information. I'm assuming you are running switches, but who knows? You never said the speed of your network either. Whether this is all in one building or spread across many, with routers in the middle etc... Without knowing any details I will suggest Cacti, and leave it at that.
  • etherape (Score:3, Interesting)

    by Yonder Way ( 603108 ) on Friday December 16, 2005 @07:48PM (#14276451)
    Also, set up a mirror port on your switch and run "etherape" on a machine connected to that port. You'll get a real-time graphical representation of where the traffic is going on your network, and some indications of what kind of traffic you're looking at.
  • Just network? (Score:5, Informative)

    by HavokDevNull ( 99801 ) <ericNO@SPAMlinuxsystems.net> on Friday December 16, 2005 @07:50PM (#14276466) Homepage Journal
    Then NTOP http://www.ntop.org/ [ntop.org] is your best bet. It breaks down all traffic on your network and should allow you to see who's being naughty and who's being nice.
  • by jgaynor ( 205453 ) <jon@@@gaynor...org> on Friday December 16, 2005 @07:51PM (#14276475) Homepage
    what cheap or free monitoring options are there available . . .

    If the network is the issue, the cheapest and simplest is a good laptop running Ethereal [ethereal.com] or Snort [snort.org]. Also pick up (or scrounge up) a dumb hub and if possible a fiber tap, since you're probably running in a mixed-media switched infrastructure (or maybe you're not - hence the problems :) ). If you want to get fancy you can buy SPAN or RSPAN [sans.org] capable switches, which will let you mirror traffic from individual ports or VLANs to a single management station port (in which case you can just use a desktop).

    This should go without saying, but those packet captures will be useless unless you know WHERE each MAC address is on the network. That said:

    1) maintain reliable L1/L2/L3 mappings
    2) Tag both ends of long cables and make sure all wallports are numbered, and
    3) beat the shit out of anyone who brings personal equipment in and plugs it in. It screws up your records and is probably less secure.
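As a rough illustration of the mapping point above, here is a minimal Python sketch, assuming you keep your L1/L2 records in a simple table keyed by MAC address. The inventory contents, the field layout and the `locate` helper are all hypothetical and made up for this example; none of it is a feature of Ethereal or Snort.

```python
# Hypothetical inventory: MAC address -> (switch, port, wall port).
# In practice this would be loaded from your own records, not hard-coded.
INVENTORY = {
    "00:11:22:33:44:55": ("sw-floor1", "Fa0/12", "W-117"),
    "00:11:22:33:44:66": ("sw-floor2", "Fa0/03", "W-204"),
}


def normalize_mac(mac):
    """Lower-case a MAC and use ':' separators so lookups are consistent."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError("not a 48-bit MAC address: %r" % mac)
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def locate(mac, inventory=INVENTORY):
    """Return (switch, port, wall port), or None for an unknown device."""
    return inventory.get(normalize_mac(mac))


if __name__ == "__main__":
    # The same address can show up in several notations in different tools.
    for seen in ("00-11-22-33-44-55", "0011.2233.4466", "de:ad:be:ef:00:01"):
        where = locate(seen)
        print(seen, "->", where if where else "UNKNOWN, go find it")
```

Anything that comes back as unknown is exactly the kind of unrecorded equipment that point 3 is complaining about.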

  • Besides the regular ethereal suggestions if you're trying to do something on the cheap consider installing a lightweight Snort on each of the clients. If something is up it's bound to at least trigger some sort of Snort log. And it'll cluster around your incidents. Although, hands down, Ethereal on a span port or network tap is a better option. -Pk
  • Try these tools (Score:3, Informative)

    by Matt Perry ( 793115 ) <perry.matt54@ya[ ].com ['hoo' in gap]> on Friday December 16, 2005 @08:02PM (#14276549)
    ntop [ntop.org]
    Nagios [nagios.org]
    MRTG [ee.ethz.ch]
    Cacti [cacti.net]
  • by macshit ( 157376 ) *
    A bit off-topic, but I'm curious if anyone can answer this for me:

    At work our network setup recently changed from static-IP based to DHCP based. I run a debian machine, and not all that much seems different for me, just that the machine gets its info from a server at bootup.

    However, running various network sniffing tools shows that all the windows machines on the network have become insanely chatty -- every windows machine seems to be constantly sending out packets, regardless of whether they're actually d
    • Anyone know WTF those machines are doing? Is this some "feature" gone berserk?

      My immediate suspicion would be a virus/worm/spywarebot calling home.

      For all its many other faults, Windows usually seems to handle DHCP reasonably well.

      • My immediate suspicion would be a virus/worm/spywarebot calling home.

        I really don't think it's that. Despite their use of windows, most of the users are quite technically savvy (most are doing software development and/or chip design), and they seem to be quite good about doing what's necessary to avoid nastiness. There's also an organized structure for making sure people keep their machines up to date, and the admins actually follow through to make sure people do it. When there's a virus outbreak in the
    • You should check if the Windows machines try (and most likely fail) to update DNS info regarding themselves. It's the default setting as far as I can determine, and it's a royal pain in the ass for anyone not running a to-spec, Windows-from-DNS-up, Microsoft-stamped shop.
    • I wouldn't be surprised if it was directly related to newer versions of Windows. I get the impression from your post that your company uses some sort of semi-standard image. There are a lot more network aware/active bits of software nowadays. Multiple programs checking updates online. Windows probing for new printers and neighbours. Antivirus chatter. Etc.

      Or it could be a misconfigured DHCP setup that doesn't provide the correct or enough information causing the machines to send broadcasts. Looking at
    • by Anonymous Coward
      Anyone know WTF those machines are doing? Is this some "feature" gone berserk?

      Jesus Fucking Christ. How about you look at the traffic? You appear to know about sniffing tools. LOOK AT THE RESULTS. What kind of traffic are they sending? Source? Destination? Port? Protocol?

      So why can't I get a job when numbnuts like you have one?

      Probably because I'm bitter.
    • However, running various network sniffing tools shows that all the windows machines on the network have become insanely chatty -- every windows machine seems to be constantly sending out packets, regardless of whether they're actually doing anything or not. Given that there are hundreds of windows machines on the (ethernet) network, this means A Lot of Packets.

      Your steps should be something like the following:

      • Check the source/port/type/content of the packets to isolate what protocol is causing them.
      • Check yo
    • My first guess would be that all machines are set to take their NetBIOS setting from the DHCP server, which is on by default. NetBIOS is very chatty and useless unless you have some 16-bit network apps that need it. I would look there first.
    • Can you identify which protocols/ports the "chatter" is using? TCP 139,445? UDP 137,138,139?

      I suspect your workstations are running Windows XP, which creates quite a bit more traffic than its predecessors as it attempts to discover network resources such as file shares and printers.

      On your workstations, from Windows Explorer:

      1. Click Tools->Folder Options
      2. Select the "View" tab
      3. Uncheck the "Automatically search for network folders and printers" option.
      4. Profit!

      Whiz-bang-boom! Instantly quieter XP :)
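One way to answer the protocol/port question is to tally a capture by source, protocol and destination port. The Python sketch below assumes the capture has already been exported to CSV with `src`, `dst`, `proto` and `dport` columns. That layout and the sample rows are invented for illustration and are not the export format of any particular sniffer.

```python
import csv
import io
from collections import Counter


def top_talkers(rows, n=3):
    """Count packets per (source, protocol, destination port)."""
    counts = Counter((r["src"], r["proto"], int(r["dport"])) for r in rows)
    return counts.most_common(n)


# Made-up sample data standing in for an exported capture.
SAMPLE = """src,dst,proto,dport
10.0.0.21,10.0.0.255,UDP,137
10.0.0.21,10.0.0.255,UDP,137
10.0.0.22,10.0.0.255,UDP,138
10.0.0.21,10.0.0.5,TCP,445
10.0.0.21,10.0.0.255,UDP,137
"""

if __name__ == "__main__":
    rows = csv.DictReader(io.StringIO(SAMPLE))
    for (src, proto, dport), packets in top_talkers(rows):
        print("%-12s %s/%d  %d packets" % (src, proto, dport, packets))
```

If one port dominates the tally across hundreds of workstations, that tells you which service to look at before changing any settings.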

    • Anyone know WTF those machines are doing?

      Yes, they are planning world domination for Microsoft! Oh wait....
  • network or hosts? (Score:3, Informative)

    by jaredmauch ( 633928 ) <jared@puck.nether.net> on Friday December 16, 2005 @10:56PM (#14277472) Homepage
    You didn't make it perfectly clear which you were attempting to isolate, the host related issues or the network related ones. There are a lot of monitoring systems out there from NAGIOS [nagios.org] to Sysmon [sysmon.org] (author disclosure) as well as the previously mentioned OpenNMS.

    If your intent is to detect network troubles, I recommend using some system like Cricket or MRTG to graph the interfaces as well as the Errors on the interfaces within the network. This may require some finesse in setting up for the first time.

    Aside from that, Sysmon was written primarily to monitor hosts and the host based services, but was morphed also to monitoring networks. It may fit your needs as you can set up SNMP thresholds of network errors and other things.
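For what it's worth, the arithmetic behind graphing interface errors is small. SNMP interface error counters are cumulative, so a grapher works from the difference between two polls. The Python sketch below assumes the two readings were already fetched by some poller (not shown), and the alarm limit is an arbitrary illustrative value, not a Sysmon, Cricket or MRTG default.

```python
COUNTER32_MAX = 2 ** 32  # SNMP Counter32 values wrap back to 0 at this point


def counter_delta(previous, current, modulus=COUNTER32_MAX):
    """Difference between two counter readings, allowing for one wrap."""
    return (current - previous) % modulus


def errors_per_second(previous, current, interval_seconds):
    """Average error rate over one polling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return counter_delta(previous, current) / interval_seconds


def over_threshold(rate, limit=1.0):
    """True if the rate exceeds the (arbitrary, illustrative) limit."""
    return rate > limit


if __name__ == "__main__":
    # Two made-up error-counter readings taken 300 seconds apart,
    # with the counter wrapping in between.
    rate = errors_per_second(4294967000, 904, 300)
    print("%.2f errors/s, alarm=%s" % (rate, over_threshold(rate)))
```

One caveat: a device reboot also resets the counters to zero, and this simple modulo treatment cannot tell that apart from a wrap.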

    If you want to be super-lazy, I would download the trial of Intermapper [intermapper.com]. It has auto-discovery and may be able to find these troubles for you if you can SNMP poll the devices. I've not used it in a while, so hopefully it has support for the platforms that you are using.

  • That's a pretty vague question, and you didn't provide enough information to really answer it right, but here are some recommendations.

    Assuming you have managed switches, collecting per-port data with SNMP is a great first start. I think Cricket (http://cricket.sourceforge.net/ [sourceforge.net]) is a great system for collecting this data, but I prefer Drraw (http://web.taranis.org/drraw [taranis.org]) for graphing the data. For an example of the power available by combining these two tools, see http://stats.net.cmu.edu/ [cmu.edu]

    Once you've got that
  • I take it you don't play well with others, but you play well with money.
  • ssh and top.
  • You haven't described your network topology much. 400 PCs and 7 servers. I can make a lot of assumptions here, such as that you're on an entirely flat network, or that your network is composed of a variety of equipment from multiple vendors and most of it is not manageable. Would either or both of these statements be correct?

    In the short-term you need to break out a sniffer. A few people suggested this. What most of the people are suggesting are service/service monitor tools. These really won't help your problem

  • by Clover_Kicker ( 20761 ) <clover_kicker@yahoo.com> on Saturday December 17, 2005 @11:36AM (#14279646)
    Are all servers affected? Have you bothered measuring the load on your servers? The problem might not have anything to do with the network.

  • 7 servers and 400 PCs sounds like a small shop, one prone to growth-by-accretion. Are you daisychaining hubs? Breaking the 5-4-3 rule anywhere? Using crappy cabling that's at (or over) the distance limit? Are you all on a switch or switches? Do they suck? Try some network partitioning: if you can swing it, drop a PC-based router in (Linux, Win2K, whatever) and DHCP all the PCs off onto a separate subnet.
    Are all the servers Windows-based? Set up 1 master Perfmon screen with NIC and CPU usage stats for
  • If you're looking for quick and dirty up/down host/service monitoring, check out Limph [sf.net]. Disclaimer: I am the main dev on this project.

    If you need more complex system/router data, Cacti is a really good way to centralize the collection of SNMP data.
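To give a feel for what quick and dirty up/down service monitoring amounts to, here is a minimal Python sketch of a single TCP check that also records how long the connection took. It is not how Limph or Cacti work internally; the `check_tcp` helper and the target list are hypothetical.

```python
import socket
import time


def check_tcp(host, port, timeout=3.0):
    """Try one TCP connection; return (is_up, seconds_taken)."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            is_up = True
    except OSError:  # refused, unreachable, timed out, name lookup failed
        is_up = False
    return is_up, time.monotonic() - started


if __name__ == "__main__":
    # Hypothetical targets; replace with your own servers and ports.
    targets = [("127.0.0.1", 80), ("127.0.0.1", 25)]
    for host, port in targets:
        up, seconds = check_tcp(host, port, timeout=1.0)
        state = "up" if up else "DOWN"
        print("%s:%d %s (%.3fs)" % (host, port, state, seconds))
```

Run from cron every few minutes with the output appended to a file, this gives a crude record of when a server stopped answering and how slow it was beforehand. A successful connect only shows the port is accepting connections, not that the service behind it is healthy.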

  • Free option (Score:3, Insightful)

    by obeythefist ( 719316 ) on Monday December 19, 2005 @02:16AM (#14289317) Journal
    Start -> Programs -> Administrative Tools -> Performance

    If you don't know how this tool works, please resign and hire a high school MCSE who does. But just in case you do want to use /. as a means to make yourself appear more competent at support than you actually are, here's what you do with it. Place counter logs on servers experiencing poor performance. Observe any thresholds that are exceeded that shouldn't (poor disk, cpu, memory, network performance). Upgrade/fix deficient performers. If you don't see any problems, it is likely an issue with network infrastructure (But don't run straight to blaming the network if you haven't fully investigated server performance).
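Once a counter log has been saved in CSV form, scanning it for exceeded thresholds is easy to script. The Python sketch below is only an illustration under stated assumptions: the column names are simplified stand-ins (real counter logs label their columns differently), the sample data is made up, and the limits are arbitrary rather than recommended values.

```python
import csv
import io

# Illustrative limits only; pick values that suit your own servers.
LIMITS = {"cpu_percent": 85.0, "disk_queue": 2.0}

# Made-up sample with simplified column names.
SAMPLE_LOG = """time,cpu_percent,disk_queue
10:00,35.0,0.4
10:05,92.5,0.6
10:10,97.0,3.1
10:15,40.2,0.5
"""


def breaches(log_text, limits=LIMITS):
    """Return (time, counter, value) for every sample above its limit."""
    found = []
    for row in csv.DictReader(io.StringIO(log_text)):
        for counter, limit in limits.items():
            value = float(row[counter])
            if value > limit:
                found.append((row["time"], counter, value))
    return found


if __name__ == "__main__":
    for when, counter, value in breaches(SAMPLE_LOG):
        print("%s %s = %.1f" % (when, counter, value))
```

Lining the breach times up against the times users reported slowdowns is what separates a server problem from a network one.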

    I don't mean to flame but monitoring performance is not complicated and certainly not something that should qualify for an Ask Slashdot.

    What will we see next on Ask Slashdot?

    "I am an Administrator for a medium sized business with 100 workstations and 8 servers. We have a new employee starting next week, and I have been told this employee does not wish to use an existing user account, instead management wants the new starter to have an account with her own name on it. I have read through all the manuals but I want to know, is it possible to have a new user account on the network? Management don't want to spend any more money on licenses so this should be a cheap solution."

    "I am running a local area network with about 10 desktops and 2 servers. Suddenly last week all the computers stopped communicating. I looked at the core network switch but it appears normal, although all the lights have turned off. Management would like this fixed as soon as possible but they are on a tight budget. Are there any open source solutions, or any readers who have seen similar problems?"
