(More) Intelligent Network Monitors?
Genady asks: "Maybe I'm getting old. I've been looking around lately at all the little scripts I've created to watch log files, drive space, web pages, general SysAdmin stuff really. It's really a mishmash of stuff I've written and acquired over the years. I've used the higher end enterprise management frameworks, as well as lower end apps like NetSaint. My problem with these has always been lack of intelligence. Does anyone know of a project to do monitoring/alerting coupled with some artificial intelligence that can learn that I don't care about particular servers after a certain time of day?"
Silver Bullet (Score:3, Informative)
There are open source programs that do a bit of this, but personally one of the best programs I found when I was in the business was ACE-SNMP. It's been sold back to the original developer and can now be found at http://www.snmx.com/Download/
I'm not sure of the pricing and other restrictions, enterprise license and all, but I believe he was trying to market it to general customers as well.
Re:Silver Bullet (Score:1)
Re:Silver Bullet (Score:1)
Net Monitor (Score:1)
Re: (Score:2)
obvious: (Score:1, Offtopic)
Offtopic? (Score:1)
that sounds dangerous (Score:1)
Nagios [nagios.org] has been working perfectly for me. Tell it that you don't care if the porn site you host on your employer's equipment goes down between the hours of 1am and 7am and it'll leave you alone till then. I've also heard good things about Big Brother [bb4.com], but haven't tried it.
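To make the "leave me alone between 1am and 7am" idea concrete, here is a minimal Python sketch of a time-of-day filter. It is not how Nagios implements time periods; the host names, the `QUIET` table and both function names are made up for illustration.

```python
from datetime import time


def in_window(now, start, end):
    """Return True if `now` is in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Window such as 22:00-06:00 crosses midnight.
    return now >= start or now < end


# Hypothetical quiet windows per host: alerts raised inside them are dropped.
QUIET = {
    "web1": (time(1, 0), time(7, 0)),
    "batch1": (time(22, 0), time(6, 0)),
}


def should_notify(host, now, quiet=QUIET):
    """Decide whether an alert for `host` at wall-clock time `now` should page anyone."""
    window = quiet.get(host)
    if window is None:
        return True  # no quiet window defined, always notify
    return not in_window(now, *window)
```

A wrapper around an existing check script could call `should_notify(host, datetime.now().time())` before sending mail or a page.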
Re:that sounds dangerous (Score:2)
What about crond? It comes installed on most systems and if integrated with your scripts, it will work wonders.
OpenNMS (Score:4, Informative)
OpenNMS [opennms.org] has some pretty good built-in functionality, and tries to make it easy to plug in more intelligence.
Larry
nagios (Score:5, Informative)
No artificial intelligence or learning is involved in the system, but just specifying it does get the job done (and probably in a more straightforward and predictable way than a neural network or somesuch).
You're required to specify hours for contacts, as well. E.g., the on-call pager only gets messages outside of office hours, individual sysadmin pagers only get messages during office hours, etc. The contact settings are broken down by host and service, too, so, for instance, you can have it so the Oracle DBA won't get a page when a host goes down, but the Unix admin will.
I've only been using nagios for a few weeks, but I've been really impressed with it. All the shortcomings I saw with other monitoring systems are fixed. The dependencies keep me from getting 20 pages when a router goes down. check_by_ssh allows me to have an individual key for each thing I want to check on a host (such as load), without running any additional daemons - and without giving the monitoring system a shell on the system. Events allow me to get information from the time of the alert - such as by running top on a host with high load, or traceroute for an abnormally high ping response time. Scheduled maintenance windows allow me to simply visit a web page, and set a maintenance time for something, and all the alarms don't go off during maintenance.
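The dependency idea (one page for the router, not 20 for everything behind it) can be sketched in a few lines of Python. This is only an illustration of the concept, not Nagios code; the `parents` mapping, the host names and the function name are assumptions.

```python
def alerts_to_send(down, parents):
    """Return the down hosts worth paging about, in sorted order.

    `down` is the set of hosts currently failing checks. `parents` maps each
    host to the host it depends on (None or missing for top-level hosts).
    A host is suppressed when any of its ancestors is also down.
    """
    def has_down_ancestor(host):
        seen = set()  # guards against accidental cycles in `parents`
        parent = parents.get(host)
        while parent is not None and parent not in seen:
            if parent in down:
                return True
            seen.add(parent)
            parent = parents.get(parent)
        return False

    return sorted(h for h in down if not has_down_ancestor(h))


# Hypothetical topology: two servers behind a switch behind a router.
PARENTS = {"router1": None, "switch1": "router1", "web1": "switch1", "db1": "switch1"}
```

With this topology, `alerts_to_send({"router1", "switch1", "web1", "db1"}, PARENTS)` keeps only the router, since everything else sits behind it.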
Inheritance in the template-based configuration files allows you to specify all the basics for a host or service in a single place, too, so you only need a few lines to specify the actual host or service to be checked. Since the host names can be separated by commas in the definition, it doesn't take lots of repetition for a number of similar machines.
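As a rough picture of what template inheritance plus comma-separated host names buys you, here is a small Python model. It is a simplified sketch, not the Nagios parser: the dictionaries stand in for object definitions, the key names `use` and `host_name` mirror the config directives, and the other keys and values (`check_interval`, `contact`, the host names) are illustrative only.

```python
def resolve(name, templates):
    """Merge a template with everything it inherits through its 'use' key."""
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError("template cycle at %r" % name)
        seen.add(name)
        tpl = templates[name]
        chain.append(tpl)
        name = tpl.get("use")
    merged = {}
    for tpl in reversed(chain):  # apply the base first so children override it
        merged.update(tpl)
    merged.pop("use", None)
    return merged


def expand_hosts(definition, templates):
    """Turn one definition with comma-separated host names into one dict per host."""
    base = resolve(definition["use"], templates) if "use" in definition else {}
    local = {k: v for k, v in definition.items() if k not in ("use", "host_name")}
    settings = {**base, **local}
    return [{"host_name": h.strip(), **settings}
            for h in definition["host_name"].split(",")]


# Hypothetical templates and a two-host definition that overrides one setting.
TEMPLATES = {
    "generic": {"check_interval": 5, "contact": "unix-admins"},
    "web": {"use": "generic", "check_interval": 1},
}
DEFINITION = {"use": "web", "host_name": "web1, web2", "contact": "web-team"}
```

Calling `expand_hosts(DEFINITION, TEMPLATES)` yields one fully populated entry each for `web1` and `web2`, which is the "few lines per host" effect described above.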
In other words, I wouldn't call it low-end any more. :)
AI training pain (Score:1)
Defining such rules is probably a lot easier than having to simulate or live through thousands of failures of each subsystem just so you can train the AIs.
Re:AI training pain (Score:1)
Start sharing _YOUR_ scripts! (Score:3, Interesting)
Maybe you already wrote that 2 years ago!
Why don't we start making available the stuff we've already done?
Anyway, a bit of karma whoring for meeee tooo
Have a look at mon [kernel.org], a nice package directly off kernel.org, which is sooo nice that I'm actually scrapping my script in favour of it!
Shameless Plug JFFNMS (Score:2, Informative)
You could try JFFNMS [sourceforge.net] Just for Fun Network Management System
If the feature you want isn't there yet, you can create it easily.
Someone bored today? Give it a try : )
Compaq Insight Manager (Score:2)
CIM can be integrated into non-CPQ environments as well, though it takes a bit of work. It's all free though (as in beer, not speech).
Spong! (Score:2, Informative)
Spong [sf.net] (demo) [monsters.org] works for me. Runs on pretty well any Perl 5 installation, some support for NT, and it's reasonably easy to extend.
Oh, and the degree of customization possible on "who gets notified about which services on which machines at what time, and at what severity" is truly mind-boggling. Or perhaps I boggle easily.
Freshwater (Score:2)
They are out there, for a price. (Score:2)
Unicenter TNG is an Enterprise Management System, which is different from a network management system. Unicenter TNG allows you to monitor, control and automatically respond to events in your enterprise, from a failed router to a single process that is about to have difficulty. It is infinitely configurable to manage and respond to events in very intelligent and/or complex ways. It has agents called Neugents that actually learn from events in your environment and become increasingly intelligent, ultimately able to predict failures and when they will occur, well in advance of the actual failure. These events can then be responded to automatically, which prevents the failure from actually occurring.
Unicenter TNG can manage almost anything, literally. It can monitor logs or other files, manage hardware, manage protocols, backups, authentication, virus control, security and firewalls, manage databases or individual processes, or even manage complex business processes and jobs across the enterprise. It operates on a very wide range of platforms and can schedule and control individual jobs across all of those platforms.
Having said all that, CA also offers, for free, the Unicenter TNG Framework [ca.com]. This is the core processing engine of Unicenter but without the agents or options. It runs on most any platform and a Linux version is available. In fact, it used to come with the SuSE distro, though I am not certain that it still does. With a fair bit of work, and if you write a few of your own agents (the agent SDK is also free), you could give your scripts a level of intelligence that is just amazing.
Re:They are out there, for a price. (Score:1, Interesting)
Netcool - not OSS (Score:1, Informative)
MON? (Score:2)
Also, since the config file is pretty easy and can use M4 to define time periods and addresses to send alerts to, I guess it wouldn't be so hard to write some kind of thingy to update M4 definitions according to its own observations of what you give a damn (or not) about.
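For what that "thingy" might look like, here is a hedged Python sketch that learns quiet hours from a log of past alerts. It is not part of mon; the history format, the thresholds and the function name are all assumptions, and turning the result into M4 definitions is left out.

```python
from collections import Counter


def suggest_quiet_hours(history, min_alerts=5, max_ack_rate=0.1):
    """Suggest (host, hour) slots where alerts are almost never acted on.

    `history` is an iterable of (host, hour, acknowledged) tuples, one per
    past alert. A slot is suggested once it has at least `min_alerts` alerts
    and at most `max_ack_rate` of them were acknowledged.
    """
    total = Counter()
    acked = Counter()
    for host, hour, acknowledged in history:
        total[(host, hour)] += 1
        if acknowledged:
            acked[(host, hour)] += 1

    quiet = {}
    for (host, hour), n in total.items():
        if n >= min_alerts and acked[(host, hour)] / n <= max_ack_rate:
            quiet.setdefault(host, []).append(hour)
    return {host: sorted(hours) for host, hours in quiet.items()}
```

The output is only a suggestion: a human should probably still review the list before any alert gets silenced.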
Demarc PureSecure (Score:2, Informative)
Re:Demarc PureSecure (Score:2)
BMC Patrol (Score:1)
Try here: