Software The Internet

Managing Huge Networks with Open Source Tools? 45

An anonymous reader asks: "I work for a large multinational firm with a network that spans the globe and am responsible for evaluating the software we use to monitor our network. Our department has a lot of money, and we're usually willing and able to spend it on good commercial software. Recently though, I find myself evaluating and approving more and more open source software. We are actually in the process of replacing some of our commercial tools with software like Nagios, LooperNG and syslog-ng. We are also evaluating MRTG, RRDTool, ntop and a host of other tools. The problem is that there are just too many of them, most of which are not maintained anymore. Here's my question: What other open-source tools do you use to monitor your networks? I'm not just looking for names, but also how long you've been using them, how easy or hard they are to administer and, I guess, how well they scale as the network grows. More importantly, are their respective projects still alive and kicking?"
This discussion has been archived. No new comments can be posted.

  • by shfted! ( 600189 ) on Friday August 06, 2004 @10:49PM (#9906136) Journal
    For my large network at home, I have a little device with lights that blink when the network is working. I think the guy at the store called it a switch or something.
  • ...unless you have the resources to maintain it yourself.

    You will find gaps in the utility of any set of tools you use, but a maintained project will be likely to have a community of people who can verify any patches you submit and provide tips on using the software in the most effective manner.
  • JFFNMS (Score:5, Informative)

    by szysz ( 214137 ) on Friday August 06, 2004 @11:21PM (#9906333) Homepage

    I created and use JFFNMS [jffnms.org] (Just for Fun Network Management System) to monitor my network at work.
    It's also used by a lot of people to monitor big and medium networks.
    It's fully maintained and customizable.
  • by DDumitru ( 692803 ) <doug@easycoOOO.com minus threevowels> on Friday August 06, 2004 @11:51PM (#9906505) Homepage
    We used to use Nagios here to monitor a large number of services on a large number of servers. We eventually abandoned it and replaced our "is the server up" monitoring with simple scripts that call fping. The problem with Nagios is that the process model starts to fall apart at several hundred monitored servers/services, and we really did not want to dedicate a farm to monitoring.
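
A minimal sketch of what such an fping-based "is the server up" script might look like. The poster's actual scripts are not shown, so the function names and host names here are hypothetical. It relies on `fping -a`, which prints only the hosts that answered, one per line.

```python
import subprocess


def parse_fping_alive(output):
    """Return the set of hosts listed in `fping -a` output (one alive host per line)."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def down_hosts(hosts, alive):
    """Return the hosts that did not answer, preserving the original order."""
    return [h for h in hosts if h not in alive]


def check(hosts):
    """Ping all hosts with a single fping process and report the ones that are down."""
    # fping exits non-zero when any host is unreachable, so check=False is deliberate.
    result = subprocess.run(
        ["fping", "-a", *hosts],
        capture_output=True,
        text=True,
        check=False,
    )
    return down_hosts(hosts, parse_fping_alive(result.stdout))


if __name__ == "__main__":
    # Hypothetical host list; requires fping to be installed.
    for host in check(["web1.example.com", "db1.example.com"]):
        print(f"DOWN: {host}")
```

The point of the design is that one fping process probes every host, rather than one process being started per host and check.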
    • by msuzio ( 3104 )
      Hmm... we've just started using Nagios at my site, and it seems to be doing quite well at that scale.

      We *are* using 3 separate machines as data collectors, but that was also done so we could continue to grow in the future. We have requirements to monitor many different OS and platform combinations, as well as several services on those platforms. In all our searching (about a 6 month process), only Nagios seemed to fit the bill for that. So, if you can dedicate some modest hardware to it, Nagios seems to
    • I know that one of the largest datacenters and hosting providers, The Planet (and their subsidiary Server Matrix) uses Nagios for monitoring dedicated servers... and they have tens of thousands of them.

      So you should ask them how they manage it :)
    • by Anonymous Coward
      I have 230 active checks in Nagios, and it really works great. You just need to adjust max_concurrent_checks and service_reaper_frequency in nagios.cfg after a while.

      My settings are:
      max_concurrent_checks=20
      service_reaper_frequency=2

      This is on a 2GHz P3 w/ 512MB.
  • Gkrellm (Score:5, Informative)

    by ptaff ( 165113 ) on Saturday August 07, 2004 @12:15AM (#9906625) Homepage
    A slick tool is Gkrellm [gkrellm.net], which has real-time graphical status for memory/temperatures/net/disk. Can be run in "server mode" (so no need for X on the monitored server). Lots of plugins [wt.net] are also available, from SNMP to ping tools. The project is well alive. Don't know if it floats your boat, though, as you're mentioning huge networks.

    Feel ready to own one or many Tux stickers [ptaff.ca]?
    • Another similar tool is ksysguard. Predictably, the client is written for KDE, but the ksysguardd daemon has no such dependencies.

      It seems ksysguard does display a bit more information, and is better for displaying information from multiple hosts in one window.

      But personally I prefer gkrellm. Since I do not have an overly large number of machines to monitor, I can have a number of gkrellm windows open and have a quick glance at what is going on. I find gkrellm much better for this sort of thing.
      • By the way, I don't think I'd recommend either for a very large international network of machines. Tools that try to give a graphical representation of everything might not be what you're looking for at that level.

        Searching around for SNMP tools that are still maintained is probably a good idea. If you can't find any good open source tools, try looking at proprietary SNMP tools like IBM Director. Most IT vendors have such packages, and in the case of IBM, it works on non-IBM hardware (of course it can d
  • there are three security based live cd's

    based on quick search of 'secur' on distrowatch's live cd page

    http://www.distrowatch.com/dwres.php?resource=cd

    here are the results

    f.i.r.e.
    http://biatchux.dmzs.com/

    hackin9
    http://www.haking.pl/en/index.php?page=hakin9_live

    plan-b
    http://www.projectplanb.org/

    i think live cd's are of great use when figuring out networks etc, b/c it's very easy to reboot with the cd and get working on the problem
  • RRFW (Score:3, Informative)

    by Blaze74 ( 523522 ) on Saturday August 07, 2004 @01:06AM (#9906859)
    rrfw.sf.net is a nice gem of an app. It can automatically discover a lot of high end snmp equipment, and set itself up for monitoring that equipment. It's a mix of XML and Perl, and is really easy to add support for more hardware.
  • NetFlow. (Score:4, Informative)

    by Mordant ( 138460 ) on Saturday August 07, 2004 @01:08AM (#9906874)
  • Forget MRTG (Score:5, Informative)

    by Judg3 ( 88435 ) <jeremyNO@SPAMpavleck.com> on Saturday August 07, 2004 @01:14AM (#9906909) Homepage Journal
    Well, don't forget it as much as get something better, Cacti [raxnet.net]. Cacti is a frontend to MRTG & RRDTOOL and offers a lot of awesome improvements, such as a web frontend to add devices, device "profiles" to enable a common monitoring set for things such as Cisco routers, servers, etc and a whole lot more. We used MRTG here at our (Windows only) network, and I'm slowly moving it all over to Cacti for all of the above plus a lot more.
    • Re:Forget MRTG (Score:3, Informative)

      by Judg3 ( 88435 )
      Oi, hate to reply to myself, but another good thing about Cacti is its speed. It has a compiled version of the daemon which claims it can do over 50 checks a second - in a big network it's worth it.
    • Cacti is a very good tool, but it isn't a front-end to MRTG, but specifically just rrdtool.

      The nice thing about Cacti, compared to MRTG, is the Web interface; however, I admit that it is sometimes buggy and there is quite a learning curve to using it, if you ask me.
  • Internode NodeMap (Score:5, Informative)

    by sr180 ( 700526 ) on Saturday August 07, 2004 @01:58AM (#9907066) Journal
    As developed and used by Australian National ISP Internode. They developed it and gave it to the community... Kudos to them: NodeMap [on.net]

  • by mcco7614 ( 266304 ) on Saturday August 07, 2004 @02:27AM (#9907163)
    I bought Open Source Network Administration [barnesandnoble.com] by Kretchmar to answer this question. I was looking for open source tools to be used in a service provider environment and was unpleasantly surprised at what was revealed in this book. However, since it seems you're looking for enterprise-ish stuff, I highly recommend this.

    You'll find many of the tools within to be quite useful during both day-to-day operations and troubleshooting as well as long term planning on your network. The author does a fairly decent job of walking you through a basic installation of each tool.

    Slashdot reviewed it here [slashdot.org].

    Here are most of the tools [ktools.org] discussed in this book.
  • Support (Score:3, Insightful)

    by ikeleib ( 125180 ) on Saturday August 07, 2004 @03:08AM (#9907275) Homepage
    An advantage you have is the financial resources to make support arrangements for open source software. These arrangements can be standard, such as purchasing support from a company that expressly supports the OSS project. Or they could be less traditional, such as finding the authors and making arrangements with them in advance. If you have the financial resources to do so, I would strongly suggest making support a criterion of selection.
  • Nagios and RRDTool (Score:5, Informative)

    by Karora ( 214807 ) on Saturday August 07, 2004 @03:22AM (#9907324) Homepage
    We're using Nagios (multiple redundant geographically diverse installations) and RRDTool fairly successfully, but that's for maybe 200 machines, tops.

    From looking at what we've achieved with these I would say that you will need to be careful trying to scale them to large networks. They can start huge numbers of processes each minute, when monitoring many servers.

    It depends what you're monitoring, of course - in our case we are monitoring maybe 20-30 operational parameters on each server. If we were only monitoring a single parameter then we could probably look at around 1,000-2,000 machines from a single P4-based monitoring box, without any real problems. Using a 2.6 kernel on the monitoring box would also dramatically increase the scalability of it all.

    Scalability issues bite similarly with rrdtool: numbers of parameters monitored per server can ramp the load on the monitoring machine(s) quite quickly. Again that is process load, not CPU load though, and a 2.6 kernel will be significantly better in this area. It can also be resolved by scripting the collection process better - not just running some collect-the-statistics routine from cron every minute.

    If you're looking at monitoring 1000's of systems though, maybe you have enough of a budget to be able to plan around these issues.

    I'm sure that ultimately all monitoring apps run into issues with how many (parameters * servers) each monitoring system can monitor too.
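
The (parameters * servers) arithmetic above can be sketched as follows. This is a back-of-the-envelope model, not a measurement: it assumes every parameter is polled once a minute and that each check starts its own process unless checks are batched. The function name and the `checks_per_process` knob are hypothetical.

```python
def processes_per_minute(servers, params_per_server, checks_per_process=1):
    """Processes started per minute if every parameter is polled once a minute."""
    total_checks = servers * params_per_server
    # Ceiling division: a partially filled batch still needs its own process.
    return -(-total_checks // checks_per_process)


if __name__ == "__main__":
    # Figures taken from the comment above; the batching case is an assumption.
    print(processes_per_minute(200, 30))        # 200 machines, 30 parameters each
    print(processes_per_minute(2000, 1))        # 2000 machines, 1 parameter each
    print(processes_per_minute(200, 30, 30))    # one collector process per server
```

Under this model, 200 servers with 30 parameters each cost more process starts than 2,000 servers with one parameter each, and batching all of a server's parameters into one collector run cuts the count to one process per server.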

  • Another angle (Score:3, Insightful)

    by Inda ( 580031 ) <slash.20.inda@spamgourmet.com> on Saturday August 07, 2004 @07:15AM (#9907808) Journal
    You say that your department has lots of money and you are happy to waste it on commercial software. You also say you like open source but are worried about it being maintained and supported.

    Sounds to me like you could create some jobs for a couple of talented programmers. Everyone wins.
  • by Anonymous Coward
    We use looperng to do most of our SNMP collection and have it integrated with Netcool, which is our commercial platform. We have been using it since the earlier days of looper (not ng).

    It has scaled very well since now we have almost four looper collection stations collecting traps from over 6000 elements.

    Netcool is our main platform but we also use NNM and ehealth.

    We tried nagios when it was known as netsaint and found it to have poor performance at the time.
  • http://www.opensims.org
  • by Anonymous Coward
    Though alternatives do exist.

    If you've got a cluster or otherwise "clumpy" network, ganglia [sourceforge.net] is the ideal end-user-visible monitoring tool - lots of pretty and informative graphs, multicast based so no heavy network load, no particularly sensitive information unless you choose to reveal it.

    For filesystem security, samhain is mature, secure and imho very nice, though the good web-frontend is non-free.

    For network security, nessus scan on a daily basis.
  • Uhm (Score:2, Interesting)

    by neuroscr ( 132147 )
    If you have the budget why not fund a project to make sure it doesn't stop being developed. This would make good use of that money and help everyone.
  • F/OSS Tools (Score:3, Informative)

    by bastardadmin ( 660086 ) on Saturday August 07, 2004 @07:24PM (#9910741) Journal
    Not sure how helpful this will be in huge environments, I live in the small to midsize market, but here are some tools that I have found useful in the past:
    Not exactly a monitoring tool, but definitely the most versatile all around auditor I have ever found: Nessus [nessus.org].
    Ettercap [sourceforge.net] is a good sniffer.
    The MRTG [ee.ethz.ch] tool has been a godsend when I have had managed devices to deal with, and I have heard very good things about the RRD tool [ee.ethz.ch] and Cacti [raxnet.net].
    Tripwire [tripwire.org] is freely available for Linux and the BSDs, though the Win32 version has not been open-sourced.
    One tool I have not been able to find in F/OSS is a Windows event log monitor (though believe me I'm still looking).
  • by dTb ( 304368 )
    I have had good results from OpenNMS [opennms.org].
  • You would be amazed at what you can get out of a few things like Perl, MySQL, Snort, gnuplot, UCD-SNMP and an Apache server. Be creative, and create your own network management system. I have used HPOV, SNMPc, Cabletron Spectrum, CiscoWorks and others to manage 1000+ node, global networks. The latter being the worst software I ever laid my fingertips on. The BIG boys are good but you can do the same with some simple programming and creativity. Plus maybe you'll have the chance to share your work with
  • Server Monitoring (Score:3, Interesting)

    by palmadj ( 159880 ) on Sunday August 08, 2004 @12:40PM (#9913854)
    I've recently taken a position at a large multinational company monitoring our devices as well. We are only about a year into it. We also use a mixture of OpenSource and commercial tools. The thing I've noticed with most software today is that it is becoming more and more of a mix. For instance, our main management software is currently Netcool. Netcool is commercial, but it utilizes Apache, Tomcat and a lot of Perl/CGI. On the commercial side it uses Java and their own proprietary DB called Omnibus, as well as a Sybase communications protocol. The end result is extremely flexible and works across all platforms I've tried with little trouble. I really don't think the future of software will be as black and white as OpenSource vs. commercial. It's really going to be a mix of both. The benefit is lower costs to the customer because of less development effort for the vendor. I like Netcool a lot because its very nature is to be as flexible as possible. It's possible to use Netcool for other purposes that the developers never thought of. Our HP NNM system is good but is nowhere near as flexible, simply because it can only go as far as HP wants it to go. Hence it is only used for network discovery and SNMP trap collection. Bottom line is that the best choice in software today is the software that includes some OpenSource code.
  • Our stuff includes (Score:5, Interesting)

    by harikiri ( 211017 ) on Sunday August 08, 2004 @11:44PM (#9917509)
    We use the following tools:

    • Nagios [nagios.org]: For availability monitoring. When a service or host goes down, we know about it. Was put in place when we discovered one of our pairs of firewalls (hot standby) had silently failed over because of a faulty hdd, and we hadn't noticed it for 2 days.
    • Cacti [raxnet.net]: For throughput and performance monitoring. Makes pretty little graphs. The best thing about it is that it helps bypass the complex configuration of rrdtool by using templates. Documentation on creating new, non-standard graphs could use some work.

    Both tools give us a much better view of our network, and what our various devices are doing.

  • A great tool that I've got some experience with is big brother [bb4.org]
    The monitoring won't blow your mind, but the notifications are pretty cool. It looks like they've decided to go with a less than free license though. I don't know if you're a crusader for those sorts of things. They do offer source code, and many platforms for their client software.

  • Address management (Score:3, Interesting)

    by Paul Carver ( 4555 ) on Monday August 09, 2004 @04:41PM (#9923507)
    What tools are popular for IP address management? Most of what I've seen is pretty basic. Are there any tools that are good at combining device inventory management with address management and assigning addresses to new devices according to configurable rules?

    For example, adding 50 new web servers where each web server needs a redundant pair of NICs sharing one IP, plus allowing multiple IPs per device or multiple devices per IP (e.g. VRRP, HSRP).

    Spreadsheets seem faster than web based systems if it takes multiple queries to get all the info you need. A flexible query system and integration of address management with server monitoring seems like it would be very useful, but I haven't found anything yet.
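
To make the "50 web servers, two NICs sharing one IP" case concrete, here is a rough sketch of rule-based allocation using Python's standard `ipaddress` module. It does not represent any existing address management tool; the subnet, the reserved gateway address, the naming scheme and the `allocate` function are all assumptions made for illustration.

```python
import ipaddress


def allocate(subnet, device_names, reserved=()):
    """Assign one free address per device; each device gets two NICs sharing that IP."""
    net = ipaddress.ip_network(subnet)
    taken = {ipaddress.ip_address(a) for a in reserved}
    # hosts() skips the network and broadcast addresses.
    free = (addr for addr in net.hosts() if addr not in taken)
    plan = {}
    for name in device_names:
        try:
            addr = next(free)
        except StopIteration:
            raise ValueError(f"{subnet} has no free address left for {name}") from None
        plan[name] = {"ip": str(addr), "nics": [f"{name}-nic0", f"{name}-nic1"]}
    return plan


if __name__ == "__main__":
    # Hypothetical rule: web servers live in 10.20.0.0/26, with .1 kept for the gateway.
    servers = [f"web{n:02d}" for n in range(1, 51)]
    plan = allocate("10.20.0.0/26", servers, reserved=["10.20.0.1"])
    print(plan["web01"], plan["web50"])
```

Multiple devices per IP (the VRRP/HSRP case) would need the mapping inverted, from address to a list of devices, which this sketch does not attempt.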
  • by gclef ( 96311 ) on Tuesday August 10, 2004 @01:02PM (#9931014)
    Seriously, it's hugely useful. It's very nice to be able to show management that you not only have a config backup system for your network devices, but your backup system is also doubling as a change control system. It's at http://www.shrubbery.net/rancid . I tend to use something like webCVS with it, to let folks browse through the CVS configs (you will, of course, want to use authentication to restrict access to webcvs).
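
The "backup doubling as change control" idea boils down to keeping every fetched config and diffing it against the previous copy. The sketch below illustrates only that diffing step with Python's standard `difflib`. It is not how RANCID itself is implemented (the comment above says RANCID keeps configs in CVS), and the device name and config text are made up.

```python
import difflib


def config_diff(old, new, device):
    """Return unified-diff lines between two saved copies of a device config."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"{device} (previous)",
            tofile=f"{device} (current)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical router config before and after a change.
    before = "hostname r1\ninterface Gi0/1\n shutdown\n"
    after = "hostname r1\ninterface Gi0/1\n no shutdown\n"
    for line in config_diff(before, after, "r1"):
        print(line)
```

An empty result means the device config has not changed since the last backup, which is itself useful evidence for change control.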
  • I did not see anyone mention Cricket at http://cricket.sourceforge.net/ [sourceforge.net]. Our ISP uses Cricket w/ RRDTool to monitor bandwidth on all of their infrastructure. We track our Internet usage this way. It appears to be a pretty good tool for mid to large networks. Just another good OSS tool.
  • For handling traps I use SNMPTT with Net-SNMP. It allows for traps to be converted to more meaningful messages using variable substitution.

    I've seen it used in small and large environments.

    http://www.snmptt.org/ [snmptt.org]
    http://www.net-snmp.org/ [net-snmp.org]
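
As a rough illustration of what "variable substitution" means here: a message template contains numbered placeholders that are filled in from the values carried by the trap. The sketch below is loosely modelled on `$1`, `$2`-style placeholders. Treat the exact placeholder syntax as an assumption and check the SNMPTT documentation for the real format. The function name and the sample trap values are hypothetical.

```python
import re


def format_trap(template, varbinds):
    """Replace $1, $2, ... in the template with the corresponding trap values."""

    def substitute(match):
        index = int(match.group(1))
        if 1 <= index <= len(varbinds):
            return str(varbinds[index - 1])
        # Leave unknown placeholders untouched so missing data stays visible.
        return match.group(0)

    return re.sub(r"\$(\d+)", substitute, template)


if __name__ == "__main__":
    # Hypothetical link-down trap carrying an interface name and an admin status.
    print(format_trap("Link down on interface $1 (admin status $2)", ["Gi0/1", "up"]))
```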


  • http://www.stemsystems.com

    This appears to be perl-based and Damian Conway, Randal Schwartz and Aeleen Frisch are/were advisors.
    But it doesn't seem that much has been done in over 2 years. They did have a writeup in SysAdmin mag, back when they first started.
  • Developed at Dartmouth.edu, now spun off for profit. Cheap and it works real good. I have spoken with many Tivoli/OpenView users who say Intermapper works far better on their networks. -- nobody and I speak for one another

