Worldwide Performance/Usage Monitoring Software?
Wee asks: "I've got a need to monitor a bunch of Sun Solaris boxes worldwide for various load statistics. Things I need to see are disk I/O, CPU usage, RAM usage, IP traffic, etc. I need reports by the hour, day, week, month and year (as near real-time as possible). The reports have to be customizable. I don't have a GUI running, so these stats need to be compiled from the console, and remotely. I can't figure out how to log system/network/load/usage statistics without loading the server in doing so. All the packages I've found so far are either limited or not appropriate in some way. I'd really like to hear how to monitor something without affecting the monitored thing. Since I need pretty precise numbers this is kinda important. I'm sure there's something somewhere that does what I need, but I can't seem to find it. If I have to use multiple tools and collate data then I will, but this wouldn't be ideal."
Son of MRTG (Score:1)
http://www.munitions.com/~jra/cricket/
which is built around
http://ee-staff.ethz.ch/~oetiker/webtools/rrdto
plus whatever other tools and bits and bobs you find/create/get pointed to by everyone else.
OpenView (Score:1)
If you want free, I've heard good things about MRTG, but I don't know how customizable it is.
Monitoring Unix machines (Score:1)
product. I'll avoid product plugs as I'm biased. There are a number
of commercial products. HP, Tivoli/IBM, Platinum/CA, Compuware, and
BMC all have products. There are also some open source packages,
though I'm less familiar with them. All address much of your problem,
but none of them will be an out of the box solution. Like as not the
long term summarizing will remain your problem.
However, I want to address some issues that I see in your question,
so you avoid some of the mistakes I've seen people fall into.
First, wanting to be "real-time" raises a red flag with me. Be
careful of wanting to collect data on a very fine granularity. In
many cases (cpu utilization, run queue length) the numbers are really
averages over time. Collecting them too often degrades their meaning.
There's also a trade off between how often you collect data and the
overhead of collecting it. Give serious thought to how much you
"care" about short lived perturbations. Would you really do something
about them? Also think about what the numbers you are collecting
really mean over the time frames you collect them.
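To make the granularity point concrete, here is a minimal sketch. It assumes the kernel exposes cumulative tick counters per CPU state. The dictionary layout and the numbers are made up for illustration and are not taken from any particular tool.

```python
def busy_fraction(prev, curr):
    """Fraction of time the CPU was not idle between two snapshots.

    prev and curr map a CPU state name to a cumulative tick count
    (hypothetical layout, e.g. {"user": ..., "sys": ..., "idle": ...}).
    """
    deltas = {state: curr[state] - prev[state] for state in curr}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0  # no ticks elapsed, nothing meaningful to report
    return (total - deltas["idle"]) / total


# Made-up cumulative counters sampled at t=0s, t=1s and t=60s.
t0 = {"user": 0, "sys": 0, "idle": 0}
t1 = {"user": 90, "sys": 10, "idle": 0}        # a one-second burst
t60 = {"user": 500, "sys": 100, "idle": 5400}  # the full minute

print(busy_fraction(t0, t1))   # 1.0
print(busy_fraction(t0, t60))  # 0.1
```

The same machine looks 100% busy over one second and 10% busy over the minute. Neither number is wrong; they answer different questions.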
Second, there is absolutely no way to collect data without impacting
the system. You can minimize the impact a number of ways. Don't
collect extraneous data. Use efficient means of collection. Offload
data analysis and summation to a different machine. But, you can't
eliminate the overhead altogether. The data is on the machine it's
on, and that's where you need to get it.
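One way to offload the summation, sketched under assumptions: the monitored box only ships raw (timestamp, value) pairs, and a separate reporting machine rolls them up. The function name and the sample data below are hypothetical.

```python
from collections import defaultdict


def hourly_averages(samples):
    """Roll raw (unix_timestamp, value) samples up into hourly means.

    Meant to run on the reporting machine, not on the monitored host.
    Returns a dict keyed by the timestamp at the start of each hour.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % 3600].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# Made-up samples: two in the first hour, one in the second.
raw = [(0, 10.0), (1800, 20.0), (3600, 40.0)]
print(hourly_averages(raw))
```

The monitored host pays only for reading the counters and sending them; the averaging cost lands elsewhere.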
Third, don't worry too much about precision until you are sure what it
is you are being precise about. By and large all any product can do
is collect what the kernel has to offer and maybe add some value in
terms of summarization and correlation. Give serious thought to what
you really need to track. The more you understand what the OS and
machine are up to the better off you are. There are a number of good
books on tuning and internals.
Most of all, remember that the point of the OS is to *use* the
machine. Sure, it's to use it efficiently and fairly. You want to
detect inefficiency and unfairness as well as any major anomalies, but
to be fair about the stats, you have to take time to understand what
the OS is up to and why the folks who wrote it collected the stat in
the first place. I can't emphasize that point enough.
Orca (Score:1)
I'd recommend Orca (http://www.geocities.com/ResearchTriangle/Thinkt
martin
ucd-snmp will do what you want (Score:1)
MRTG ! (Score:1)
It groks SNMP, but it also has a really simple way of running a program to return the critical values for anything that doesn't happen to run an SNMP agent.
It simply handles year/month/week/day graphs without keeping boatloads of data. The data is combined as it gets older, so log files do NOT grow.
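The "combined as it gets older" idea can be sketched like this. It is an illustration of bounded-size consolidation, not MRTG's actual log format; the class name and slot counts are made up.

```python
from collections import deque


class ConsolidatingLog:
    """Keep recent samples at full resolution and older ones as averages."""

    def __init__(self, fine_slots=4, coarse_slots=4, factor=2):
        if factor > fine_slots + 1:
            raise ValueError("factor must not exceed fine_slots + 1")
        self.fine = deque()                       # newest raw samples
        self.coarse = deque(maxlen=coarse_slots)  # averaged older samples
        self.fine_slots = fine_slots
        self.factor = factor

    def add(self, value):
        self.fine.append(value)
        # On overflow, fold the oldest `factor` raw samples into one average.
        # The coarse deque drops its own oldest entry once it is full.
        if len(self.fine) > self.fine_slots:
            oldest = [self.fine.popleft() for _ in range(self.factor)]
            self.coarse.append(sum(oldest) / len(oldest))


log = ConsolidatingLog(fine_slots=4, coarse_slots=2, factor=2)
for v in range(1, 11):
    log.add(float(v))

print(list(log.fine))    # [7.0, 8.0, 9.0, 10.0]
print(list(log.coarse))  # [3.5, 5.5]
```

However many samples arrive, storage never exceeds fine_slots + coarse_slots entries, which is why the log does not grow.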
It takes very little cpu. A halfway decent box can monitor a hundred routers and create html graph pages for them.
A lot of people have written patches & addons etc for it.
It only costs a CD from cdnow.com sent to the author, and that's optional.
Check mine out. I put this up in 30 minutes. It monitors my eth0 interface, and I don't even run snmpd. My SDSL line [dslreports.com]
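For anyone wanting to try the no-snmpd route, here is a rough sketch of the kind of helper program MRTG can run. MRTG expects such a program to print four lines: the first value, the second value, an uptime string, and a target name. Everything else here is an assumption: it parses Linux-style /proc/net/dev text (a Solaris box would need a different source, such as netstat or kstat output), and the function name, sample text and defaults are made up.

```python
def mrtg_lines(proc_net_dev_text, iface, uptime="unknown", name="localhost"):
    """Build the four output lines MRTG reads from an external program.

    proc_net_dev_text is the content of a Linux-style /proc/net/dev.
    After the "iface:" label, field 0 is received bytes and field 8 is
    transmitted bytes.
    """
    for line in proc_net_dev_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != iface:
            continue  # header line or a different interface
        fields = rest.split()
        return [str(int(fields[0])), str(int(fields[8])), uptime, name]
    raise ValueError("interface not found: " + iface)


# Made-up sample in the /proc/net/dev layout.
SAMPLE = (
    "Inter-|   Receive                |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
    "    lo: 100 1 0 0 0 0 0 0 100 1 0 0 0 0 0 0\n"
    "  eth0:5000 10 0 0 0 0 0 0 7000 12 0 0 0 0 0 0\n"
)

print("\n".join(mrtg_lines(SAMPLE, "eth0")))
```

A real wrapper would read /proc/net/dev instead of SAMPLE and be named in the Target line of the MRTG config inside backticks; the script path there is whatever you choose.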