Managing Huge Networks with Open Source Tools?
An anonymous reader asks: "I work for a large multinational firm with a network that spans the globe and am responsible for evaluating the software we use to monitor our network. Our department has a lot of money, and we're usually willing and able to spend it on good commercial software. Recently though, I find myself evaluating and approving more and more open source software. We are actually in the process of replacing some of our commercial tools with software like Nagios, LooperNG and syslog-ng. We are also evaluating MRTG, RRDTool, ntop and a host of other tools. The problem is that there are just too many of them, most of which are not maintained anymore. Here's my question: What other open-source tools do you use to monitor your networks? I'm not just looking for names, but how long you've been using them, how easy or hard they are to administer and, I guess, how well they scale as the network grows. More importantly, are their respective projects still alive and kicking?"
On my large network (Score:4, Funny)
Avoid unmaintained software... (Score:2, Insightful)
You will find gaps in the utility of any set of tools you use, but a maintained project will be likely to have a community of people who can verify any patches you submit and provide tips on using the software in the most effective manner.
JFFNMS (Score:5, Informative)
I created and use JFFNMS [jffnms.org] (Just for Fun Network Management System) to monitor my network at work.
It's also used by a lot of people to monitor big and medium networks.
It's fully maintained and customizable.
Experience with Nagios (Score:3, Informative)
Re:Experience with Nagios (Score:3, Informative)
We *are* using 3 separate machines as data collectors, but that was also done so we could continue to grow in the future. We have requirements to monitor many different OS and platform combinations, as well as several services on those platforms. In all our searching (about a 6 month process), only Nagios seemed to fit the bill for that. So, if you can dedicate some modest hardware to it, Nagios seems to
Re:Experience with Nagios (Score:2, Interesting)
Re:Experience with Nagios (Score:2)
So you should ask them how they manage it.
Re:Experience with Nagios (Score:1, Informative)
My settings are:
max_concurrent_checks=20
service_reaper_fr
This is on a 2GHz P3 w/ 512MB.
Gkrellm (Score:5, Informative)
Re:Gkrellm (Score:2)
It seems ksysguard does display a bit more information, and is better for displaying information from multiple hosts in one window.
But personally I prefer gkrellm. Since I do not have an overly large number of machines to monitor, I can have a number of gkrellm windows open and have a quick glance at what is going on. I find gkrellm much better for this sort of thing.
Re:Gkrellm (Score:2)
Searching around for SNMP tools that are still maintained is probably a good idea. If you can't find any good open source tools, try looking at proprietary SNMP tools like IBM Director. Most IT vendors have such packages, and in the case of IBM, it works on non-IBM hardware (of course it can d
try a live cd!!! (Score:1)
based on quick search of 'secur' on distrowatch's live cd page
http://www.distrowatch.com/dwres.php?resource=cd
here are the results
f.i.r.e.
http://biatchux.dmzs.com/
hakin9
http://www.haking.pl/en/index.php?page=hakin9_live
plan-b
http://www.projectplanb.org/
I think live CDs are of great use when figuring out networks etc., b/c it's very easy to reboot with the CD and get working on the problem.
RRFW (Score:3, Informative)
NetFlow. (Score:4, Informative)
Forget MRTG (Score:5, Informative)
Re:Forget MRTG (Score:3, Informative)
Re:Forget MRTG (Score:2)
The nice thing about Cacti, compared to MRTG, is the Web interface; however, I admit that it is sometimes buggy and there is quite a learning curve to using it, if you ask me.
Internode NodeMap (Score:5, Informative)
Open Source Network Administration (Score:5, Interesting)
You'll find many of the tools within to be quite useful during both day-to-day operations and troubleshooting as well as long term planning on your network. The author does a fairly decent job of walking you through a basic installation of each tool.
Slashdot reviewed it here [slashdot.org].
Here are most of the tools [ktools.org] discussed in this book.
Support (Score:3, Insightful)
Nagios and RRDTool (Score:5, Informative)
From looking at what we've achieved with these I would say that you will need to be careful trying to scale them to large networks. They can start huge numbers of processes each minute, when monitoring many servers.
It depends what you're monitoring, of course - in our case we are monitoring maybe 20-30 operational parameters on each server. If we were only monitoring a single parameter then we could probably look at around 1,000-2,000 machines from a single P4-based monitoring box, without any real problems. Using a 2.6 kernel on the monitoring box would also dramatically increase the scalability of it all.
Scalability issues bite similarly with rrdtool: numbers of parameters monitored per server can ramp the load on the monitoring machine(s) quite quickly. Again that is process load, not CPU load though, and a 2.6 kernel will be significantly better in this area. It can also be resolved by scripting the collection process better - not just running some collect-the-statistics routine from cron every minute.
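To make "scripting the collection process better" a bit more concrete, here is a minimal sketch of one long-running collector that polls every target from a single process, instead of having cron start a fresh process per statistic each minute. It is only an illustration under assumptions: `probe` and `store` are hypothetical callbacks (for example an SNMP fetch and an RRD update) and are not part of Nagios or rrdtool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def collect_once(targets, probe, max_workers=8):
    """Run probe(target) for every target inside this one process.

    Returns {target: value}; a probe that raises is recorded as None.
    `probe` is a hypothetical callback supplied by the caller.
    """
    def safe_probe(target):
        try:
            return target, probe(target)
        except Exception:
            return target, None

    # Threads are reused across targets, so no process is forked per check.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(safe_probe, targets))


def run_collector(targets, probe, store, interval=60.0, cycles=None):
    """Poll all targets every `interval` seconds from one long-lived process.

    `store` is a hypothetical callback that receives each result dict.
    `cycles=None` means run forever; a number stops after that many polls.
    """
    done = 0
    while cycles is None or done < cycles:
        started = time.monotonic()
        store(collect_once(targets, probe))
        done += 1
        if cycles is not None and done >= cycles:
            break
        # Sleep only for the rest of the interval so polls do not drift.
        time.sleep(max(0.0, interval - (time.monotonic() - started)))
```

The point of the design is that the per-minute cost becomes a handful of threads in one process rather than (parameters * servers) short-lived processes.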
If you're looking at monitoring thousands of systems though, maybe you have enough of a budget to be able to plan around these issues.
I'm sure that ultimately all monitoring apps run into issues with how many (parameters * servers) each monitoring system can monitor too.
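As a rough back-of-the-envelope version of that (parameters * servers) limit, the sketch below assumes load grows linearly with the number of checks, which is a simplification and not something measured here. The function names and the per-box budget are hypothetical; the budget in the usage note is just the upper single-parameter estimate quoted above.

```python
def checks_per_minute(servers, params_per_server, interval_minutes=1.0):
    """Checks a monitoring box must run per minute (linear assumption)."""
    return servers * params_per_server / interval_minutes


def max_servers(budget_checks_per_minute, params_per_server, interval_minutes=1.0):
    """Servers one box can cover for a given per-minute check budget."""
    return int(budget_checks_per_minute * interval_minutes // params_per_server)
```

For example, if one box can manage about 2,000 single-parameter checks a minute, then `max_servers(2000, 25)` suggests about 80 servers at 25 parameters each, under the same linear assumption.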
Another angle (Score:3, Insightful)
Sounds to me like you could create some jobs for a couple of talented programmers. Everyone wins.
LooperNG with netcool (Score:1, Informative)
It has scaled very well since now we have almost four looper collection stations collecting traps from over 6000 elements.
Netcool is our main platform but we also use NNM and eHealth.
We tried Nagios when it was known as NetSaint and found it to have poor performance at the time.
http://www.opensims.org (Score:2, Informative)
MRTG is pretty standard (Score:1, Informative)
If you've got a cluster or otherwise "clumpy" network, ganglia [sourceforge.net] is the ideal end-user-visible monitoring tool - lots of pretty and informative graphs, multicast based so no heavy network load, no particularly sensitive information unless you choose to reveal it.
For filesystem security, samhain is mature, secure and imho very nice, though the good web-frontend is non-free.
For network security, run a Nessus scan on a daily basis.
Uhm (Score:2, Interesting)
F/OSS Tools (Score:3, Informative)
Not exactly a monitoring tool, but definitely the most versatile all around auditor I have ever found: Nessus [nessus.org].
Ettercap [sourceforge.net] is a good sniffer.
The MRTG [ee.ethz.ch] tool has been a godsend when I have had managed devices to deal with, and I have heard very good things about the RRD tool [ee.ethz.ch] and Cacti [raxnet.net].
Tripwire [tripwire.org] is freely available for Linux and the BSDs, though the Win32 version has not been open-sourced.
One tool I have not been able to find in F/OSS is a Windows event log monitor (though believe me I'm still looking).
OpenNMS (Score:1)
simple tools needed (Score:1)
Server Monitoring (Score:3, Interesting)
Our stuff includes (Score:5, Interesting)
Both tools give us a much better view of our network, and what our various devices are doing.
Another fun tool (Score:1)
The monitoring won't blow your mind, but the notifications are pretty cool. It looks like they've decided to go with a less than free license though. I don't know if you're a crusader for those sorts of things. They do offer source code, and many platforms for their client software.
Address management (Score:3, Interesting)
For example, if I need to add 50 new web servers and each web server needs a redundant pair of NICs sharing one IP. Plus it should allow multiple IPs per device or multiple devices per IP (e.g. VRRP, HSRP).
Spreadsheets seem faster than web based systems if it takes multiple queries to get all the info you need. A flexible query system and integration of address management with server monitoring seems like it would be very useful, but I haven't found anything yet.
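For what it's worth, the "multiple IPs per device, multiple devices per IP" requirement is a plain many-to-many mapping, and a minimal sketch of it fits in a few lines of Python. This is only an illustration of the data model, not an existing tool; the `AddressBook` class and the device names are made up, and the addresses come from the documentation range.

```python
import ipaddress
from collections import defaultdict


class AddressBook:
    """Hypothetical many-to-many inventory of devices and IP addresses."""

    def __init__(self):
        self._ips_by_device = defaultdict(set)
        self._devices_by_ip = defaultdict(set)

    def assign(self, device, ip):
        # ip_address() validates and normalizes the address string.
        ip = str(ipaddress.ip_address(ip))
        self._ips_by_device[device].add(ip)
        self._devices_by_ip[ip].add(device)

    def ips_of(self, device):
        """All addresses assigned to one device."""
        return sorted(self._ips_by_device.get(device, ()))

    def devices_on(self, ip):
        """All devices that answer for one address."""
        ip = str(ipaddress.ip_address(ip))
        return sorted(self._devices_by_ip.get(ip, ()))

    def shared_ips(self):
        """Addresses held by more than one device (e.g. a VRRP/HSRP virtual IP)."""
        return {ip: sorted(devs)
                for ip, devs in self._devices_by_ip.items()
                if len(devs) > 1}


book = AddressBook()
book.assign("web01", "192.0.2.10")   # one device, two addresses
book.assign("web01", "192.0.2.11")
book.assign("rtr-a", "192.0.2.1")    # two routers sharing a virtual IP
book.assign("rtr-b", "192.0.2.1")
```

Because both directions are indexed, one lookup answers either "what does this box own?" or "who answers on this address?", which is the kind of single-query access that spreadsheets tend to win on.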
Rancid is your friend (Score:3, Informative)
Cricket Anyone (Score:1)
SNMPTT and Net-SNMP (Score:1)
I've seen it used in small and large environments.
http://www.snmptt.org/ [snmptt.org]
http://www.net-snmp.org/ [net-snmp.org]
Has anyone ever tried Stem Systems? (Score:2)
http://www.stemsystems.com
This appears to be Perl-based, and Damian Conway, Randal Schwartz and Aeleen Frisch are/were advisors.
But it doesn't seem that much has been done in over 2 years. They did have a writeup in SysAdmin mag, back when they first started.
InterMapper: Network Monitoring and Alerting Softw (Score:1)