Software IT Technology

Ask Slashdot: Remote Server Support and Monitoring Solution? 137

New submitter Crizzam writes I have about 500 clients which have my servers installed in their data centers as a hosted solution for time & attendance (employee attendance / vacation / etc). I want to actively monitor all the client servers from my desktop, so I know when a server failure has occurred. I am thinking I need to trap SNMP data and collect it in a dashboard. I'd also like to have each client connect to my server via HTTP tunnel using something like OpenVPN. In this way I maintain a site-to-site tunnel open so if I need to access my server remotely, I can. Any suggestions as to the technology stack I should put together to pull off this task? I was looking at Zabbix / Nagios for SNMP monitoring and OpenVPN for the other part. What else should I include? How does one put together a good remote monitoring / access solution that clients can live with and will still allow me to offer great proactive service to my servers located on-site?
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward

    Set up a script to initiate a reverse-SSH tunnel from the remote device back to a monitoring server, set up no-login on the tunnel but distribute keys for the monitoring user on the remote devices.

    You should be able to passwordless login from the monitoring box over a completely secure link that doesn't require port-forwarding at the remote site.

    • by BitZtream ( 692029 ) on Saturday September 06, 2014 @03:28PM (#47842295)

      Or, do the right thing and hire a network admin so someone with a clue is involved.

      If you have to ask this question on slashdot, you need to change the question to something appropriate. Based on exactly what was posted, he doesn't have any idea what his requirements are. He knows the conceptual goals, but not the actual goals or requirements. Unless he is trying to change careers from whatever he is to a full time network infrastructure person he is going to be wasting a lot of time getting a clue. That means time he won't be spending doing whatever his actual job is.

      He needs someone who can look at his actual setup, figure out what actually needs to be monitored, and knows the appropriate ways to do it.

      Short of multiple Bennett Haselton-length posts, and many discussions in depth, no answer coming from slashdot or all of them combined is going to be useful.

      Everyone here posting solutions has their own, certainly incorrect idea of what he wants but no one actually knows. No one so far has even started by asking the right questions. It's the blind leading the blind at best.

      • How does he not have it in the first place? Explorer with 500 client servers.. Soon as I had 5 servers I setup central mobile monitoring lol He needs to hire someone that knows what they are doing for sure. Google it and the top open source monitor comes up as a start...
      • by Anonymous Coward

        I agree. A meeting needs to be held with the technical team to determine what exactly needs to be monitored.

        With that being said, ask yourself a few questions:

        Are you looking for a heartbeat?

        Are you actually more concerned for the applications running on the servers?

        Are you looking to monitor individual pieces of hardware, e.g. CPU, RAM, etc.?

        Are you trying to determine if there is a network hardware failure as well, e.g. router, switch, etc. (did a switchport die and did I lose a particular subnet or clu

  • Scratch my back (Score:2, Interesting)

    by Anonymous Coward
    Will you do my job if I tell you the answer? You've already gotten your start. What more do you need?
  • by Anonymous Coward

    Should ask them!

  • by WayneDV ( 35999 ) on Saturday September 06, 2014 @12:58PM (#47841567) Homepage

    Check out www.newrelic.com - even their free service tier offers great features and it's easy to deploy on all servers

    • by astro ( 20275 ) on Saturday September 06, 2014 @01:01PM (#47841589) Homepage

      NewRelic is pretty sweet, as the parent says, even at the free tier. They will definitely bombard your email and phone with hard-sales pitches, though, and there's a giant cost leap from free to the next tier.

    • by Anonymous Coward

      We have NewRelic deployed and pay for it. It is worth the money for us because we not only get the "Is it up" but get to see the software stack interact with the hardware. We had one client who had feature creep and we watched their VM start to die because of memory creep and it justified putting them on another box. When we showed them the reports, they were quite happy to write a check for the upgrade.

  • by Dan Askme ( 2895283 ) on Saturday September 06, 2014 @01:04PM (#47841607) Homepage

    For Server active status (eg: am i dead?)
    Inside a while loop or sleep() if you can't be bothered.
    for(int i=0;i<MAX_SERVERS;i++)
    {
              IcmpSendEcho(..........);
    }

    For everything else monitoring related. Employ someone to make a custom monitoring application ,or, Google "server monitoring software".

    • Re:Ping? (Score:4, Informative)

      by Enry ( 630 ) <enry.wayga@net> on Saturday September 06, 2014 @01:54PM (#47841873) Journal

      For some reason, disabling ping is considered a security feature, so a lot of places block it at the firewall. Cloud services (I'm looking at you, Azure) also either don't allow it or can't do it.

      • Ping is almost the worst way to check to see if your server is up. In fact, certain machines will return an ICMP response even after you've broken into their bios-equivalent (hello, Solaris).

        Do a service-level check. It's not that hard to do a curl instead of a ping. A curl's results can show you if the service is present and functioning. A ping just shows you whether the network interface is responding.

        People disable ping because if you don't know a server is there you can't attack it. It's like enabling MAC addr

        • by Enry ( 630 )

          People disable ping because if you don't know a server is there you can't attack it. It's like enabling MAC address filtering - it doesn't really help that much, but in a specific set of circumstances it helps a bit.

          If there's no other services presented to the world, yes. But a simple port scan will tell you it's up and that doesn't take long to do.
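The service-level check suggested in this subthread could be sketched as follows. This is a minimal illustration using only the Python standard library rather than curl; the function name `check_http` and the example URL are assumptions, not anything the commenters posted.

```python
import urllib.error
import urllib.request


def check_http(url, timeout=5.0):
    """Return (ok, detail) for a single HTTP GET against url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # A 2xx answer means the service itself responded, not just the NIC.
            status = resp.status
            return (200 <= status < 300, f"HTTP {status}")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return (False, f"HTTP {exc.code}")
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, TLS error, and so on.
        return (False, f"unreachable: {exc}")


# Example call (hypothetical health endpoint on one client server):
#   check_http("https://client42.example.com/health", timeout=3.0)
```

Unlike a ping, a failed result here distinguishes "the host did not answer at all" from "the host answered but the application returned an error".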

  • Have the clients connect with ssh to your server and open a reverse port. They'll each have to pick a different port on your server.

    Use something like autossh ( http://www.harding.motd.ca/aut... [harding.motd.ca] ) to make sure the ssh connection is always open.

    Having said all that, sounds like a great security hole if your server is ever breached. Plus lots of potential privacy violations.

    Marqis
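Having each client pick a different port can be made mechanical. The sketch below derives a port from a numeric client id and builds the autossh command line a client might run; `BASE_PORT`, the id range, and the `tunnel@monitor.example.com` account are all assumptions made for illustration.

```python
BASE_PORT = 20000  # assumption: ports 20001-20500 on the hub are reserved for tunnels


def tunnel_port(client_id):
    """Map a numeric client id (1..500) to a unique listening port on the hub."""
    if not 1 <= client_id <= 500:
        raise ValueError("client_id out of range")
    return BASE_PORT + client_id


def autossh_command(client_id, hub="tunnel@monitor.example.com"):
    """Build the argv a client would run to keep its reverse tunnel open."""
    port = tunnel_port(client_id)
    return [
        "autossh",
        "-M", "0",                         # disable autossh's own monitor port
        "-N",                              # no remote command, tunnel only
        "-o", "ServerAliveInterval=30",    # let ssh itself detect a dead link
        "-o", "ServerAliveCountMax=3",
        "-o", "ExitOnForwardFailure=yes",  # exit if the hub port is already taken
        "-R", f"{port}:localhost:22",      # hub port -> this client's sshd
        hub,
    ]
```

With this layout, client 42 would be reachable from the hub with something like `ssh -p 20042 admin@localhost`, where `admin` is again a placeholder account.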

    • by aheath ( 628369 ) *
      I agree that this creates the potential for a huge security hole that could compromise the privacy of all of the employees at 500 companies. The consequences of such a breach might be worse if there is a connection between his servers and a payroll system or any point-of-sale system. I also wonder whether his clients are willing to open up the ports required to support remote access to their data centers.
  • 500 OpenVPN connections is going to be a bit of a headache to keep straight. Obviously you won't have 500 tun devices so it'll be a multi-client to server config. You'll need a means of knowing that 10.20.20.x is client x and 10.20.20.y is client y. Of course OpenVPN allows you to do this but maintaining that table by hand could be a bit of a pain.

    HTTPS solutions like NewRelic aren't an option because you want to be able to ssh back into the host..

    Assuming all clients will allow it I can only think t
    • You'll need a means of knowing that 10.20.20.x is client x and 10.20.20.y is client y. Of course OpenVPN allows you to do this but maintaining that table by hand could be a bit of a pain.

      You mean like the common name of the ssl certificate used to connect in the first place? Combine this with a client-connect script to update dns and/or the ifconfig-pool-persist option and you've got a great solution.

      • Right but how do you know which connection belongs to which client without setting it all up by hand? Presumably he'll have to initiate the connection via script or manually on the first go-round so I suppose that's the proper time to build out the mapping.
    • by dskoll ( 99328 )

      Managing the OpenVPN connections is not that bad. You give each client its own key and certificate and you use OpenVPN's ccd/ directory to assign VPN IP addresses.
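One way the ccd/ approach could be scripted for 500 clients is sketched below. It assumes the server uses `topology subnet` (so `ifconfig-push` takes an address and a netmask), that each ccd file is named after the certificate common name, and that the first host address belongs to the VPN server. The network `10.20.20.0/23` and the function names are illustrative only.

```python
import ipaddress
from pathlib import Path


def assign_addresses(common_names, network="10.20.20.0/23"):
    """Return {common_name: ip}, deterministic for a given list of names."""
    net = ipaddress.ip_network(network)
    hosts = iter(net.hosts())
    next(hosts)  # skip the first host address, assumed to be the VPN server
    mapping = {}
    for name in sorted(common_names):
        try:
            mapping[name] = next(hosts)
        except StopIteration:
            raise ValueError("network too small for client list") from None
    return mapping


def render_ccd(ip, netmask):
    """Contents of one ccd file pinning a client to a fixed VPN address."""
    return f"ifconfig-push {ip} {netmask}\n"


def write_ccd(mapping, ccd_dir, netmask):
    """Write one file per certificate common name into the ccd directory."""
    ccd = Path(ccd_dir)
    ccd.mkdir(parents=True, exist_ok=True)
    for name, ip in mapping.items():
        (ccd / name).write_text(render_ccd(ip, netmask))
```

Because addresses are handed out in sorted order, adding a client later can shift existing assignments; a real deployment would more likely persist the mapping and only append to it.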

      We use the following tools to monitor our servers, but we're only monitoring about 30, not 500:

      • OpenVPN for accessing the remote servers. SSH if we need to log on to the server to do something. Some of our more important servers include built-in KVM-over-IP ability which can be very handy if the OS locks up.
      • Xymon (formerly known as Hobbit)
      • by mlts ( 1038732 )

        I personally have used Xymon with more than that many systems. It takes time to classify them, but it is doable.

        The price is right on Xymon, however, if I were to recommend a monitoring solution for both real time, "oh shit" monitoring such as a drive array about to fail as well as a historical log (for security and finding a baseline), I'd go with Splunk if possible due to the tools available, and the fact that you can send management-friendly reports about the health of the enterprise up the chain.

        Again,

  • Just download JFFNMS - it's a Net Monitoring system more than capable enough to watch 500+ servers. It can also be configured to do email and text alerting. It monitors CPU, Memory, Disk etc. It's pretty much the open source version of Nagios.
  • by Wycliffe ( 116160 ) on Saturday September 06, 2014 @01:24PM (#47841713) Homepage

    I do something similar. I use openvpn and x11vnc. I have a cron on each client that runs a
    small perl script that grabs the output of several programs like top, uptime, and sensors
    and then saves the results in an easy to parse file that my server periodically grabs so that
    I have stuff like cpu temperature, cpu usage, memory usage, etc...
    I also grab a screenshot of x11vnc using vnccapture.
    I also have a way to remotely activate reverse ssh if for some reason openvpn fails.
    My only problem with openvpn is key management. Creating and distributing unique keys
    to each client is kind of a pain.
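The cron-driven collector described above is a Perl script that parses the output of top, uptime, and sensors. The sketch below is a different, simpler take on the same idea in Python: it reads load average and disk usage through standard-library calls (Unix only) and writes them as key=value lines. The metric names and the file format are assumptions.

```python
import os
import shutil
import time


def collect_metrics(path="/"):
    """Gather a few cheap health numbers from the local machine (Unix only)."""
    load1, load5, load15 = os.getloadavg()
    disk = shutil.disk_usage(path)
    return {
        "timestamp": int(time.time()),
        "load1": round(load1, 2),
        "load5": round(load5, 2),
        "load15": round(load15, 2),
        "disk_used_pct": round(100 * disk.used / disk.total, 1),
    }


def format_metrics(metrics):
    """Serialize as one key=value pair per line, sorted by key."""
    return "".join(f"{key}={metrics[key]}\n" for key in sorted(metrics))


def parse_metrics(text):
    """Inverse of format_metrics; values come back as strings."""
    result = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            result[key.strip()] = value.strip()
    return result
```

On the client, cron would write `format_metrics(collect_metrics())` to a file; the central server fetches that file and runs `parse_metrics` on it.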

  • by 93 Escort Wagon ( 326346 ) on Saturday September 06, 2014 @01:35PM (#47841769)

    Make damn sure your clients are aware of exactly what you're doing. They probably don't care about the specifics (e.g. openvpn, reverse ssh); but they need to know you can remotely access the boxes.

    It's probably a good idea to have some sort of document to give them that does spell out all the specifics - something they need to acknowledge/sign, with both of you keeping copies.

    • The media says Target was breached due to a compromise at their HVAC vendor. Do you want to be the vendor that gets hit with a liability suit because someone broke in through your network?

      It's obvious from your question that you're not really sure what you're doing. SNMP? That's for network crap, not for server and application level stuff. Why would you even talk about SNMP? Why would you even want a VPN into the customer network?

      If you need access to your server, write it into your support contract, and as

  • is the solution I use and is working well. Routers are 1U mini-ATX boards with pfSense. Nagios mostly with NRPE, and SNMP for devices on which I can't install packages. Has worked well for the last ... 8 years or so.
    • Icinga rather than nagios... always... the simple basic changes to Icinga make it so much nicer to work with, even the v1 branch which is just a fork with some updates

  • by TheGratefulNet ( 143330 ) on Saturday September 06, 2014 @01:56PM (#47841885)

    not really. snmp is an afterthought for them and its clumsy as hell to add snmp to it. I tried and gave up. instead, I picked hobbit (uhm, the new name is 'xymon').

    xymon has its quirks but it was not hard to modify to add more snmp features to and its coding was not too bad to get thru. its not written in a lot of 'strange' languages, and that's a plus, to me, too.

    personally, I usually just write snmp code fresh, from scratch, using net-snmp mgr tools. its not hard and you get just what you want and you are not muddled down in lots of 'infrastructure' that someone else thought was good but useless to you (like zabbix).

  • Excellent monitoring solution can generate KPI based reports, email/sms/snmp notifications etc, comes with a bunch of out of box server monitoring modules and you can build your own with scripts or SNMP GETs. I swear by it.
  • by Anonymous Coward

    Zenoss is awesome and as your business scales so can it. Our organization monitors 5000+ servers worldwide in all sorts of places. Zenoss lets you do everything you'd want. Setup notifications for one or more servers, types of errors, and filters within filters. It's a rocking platform and if you're big enough, they'll set it all up for you for a fee.

  • Would need more information on the locations. Running Linux, Windows, Solaris? I personally use Zenoss for all of my monitoring. It is handling around 1800 devices right now and monitors all aspects of the network and servers. Zabbix uses agents. So you could run the server at your location and of course the agents connect to it for monitoring. People talk about needing a VPN connection to be safe. But another solution that I would do is use stunnel for encrypting. I do run a large openvpn setup as well. Wi
  • NAV [uninett.no] is a great network and server monitoring suite...I have it monitoring much stuff connected over VPN.
  • The ELK Stack (ElasticSearch, Logstash, Kibana) are great tools for capturing logs from *anything*, indexing and massaging of the data captured, and then offering up visualization, searches, and dashboards (that refresh). Built with Angular.js so the speed happens.

    We could be talkin' web server logs of the NY Times servers, centralized and displaying dashboards in real-time, or maybe 24/7 sensor data streaming from the ocean floor. The ELK Stack can do it.

    First googled citation, and there's plenty more wher

    • ELK works but frankly its defaults do just about nothing. As a stack sure it's great but it needs to be added as an adjunct to a real monitoring system and it needs useful defaults and/or some sort of add-on repository. The opennms boys are working on showing rrd data into ES.

      Pretty much you set up ELK and go "great, my logs are all in one place" but it does nothing by default nor is it easy to do anything useful with it. Ad hoc searching of logs is great and all, but you're basically replacing ssh cat | grep. Tak

  • by account_deleted ( 4530225 ) on Saturday September 06, 2014 @03:52PM (#47842423)
    Comment removed based on user account deletion
  • Call me silly (Score:2, Insightful)

    But shouldn't this have been part of the design BEFORE you rolled out 500 servers?

    • I'll bite, and I'll call you silly.

      Many projects evolve over their lifetimes. This isn't just an IT thing. In many cases during the construction / commissioning stage you'll come out of the end with a wishlist of things and features to add in the future. Many such things would be impossibly expensive (both in money and lost time) to add during the project stage, and many projects which demand everything from the very beginning end up turning into an unmanageable behemoth.

      If the primary goal was to get 500 se

  • I would write a wrapper though to make the whole thing a bit more robust. Groundwork does this with their GDMA agent and it allows you to centrally configure and have the client pick up its configuration.
  • Let me know if you want to do something like this and we can work something out. Reply to this and we can connect.

  • A place I worked for did exactly that. There are a few details that you should attend to - give out ip addresses based on the ssl certificate used by the openvpn client (and make sure you don't deploy the same ssl cert to two servers!), and have a method of restarting openvpn every time it crashes/disconnects (and exits). You'd be surprised how flaky enterprise internet connections can be. From there my work kept a database of all the openvpn servers and used it to generate a nagios config. Honestly, I've n
  • Have you considered SolarWinds Server & Application Monitor? The latest version, currently in beta adds an optional agent that negates the need for VPN tunnels. It supports overlapping IP address space, NAT traversal, passing through authenticated proxy servers, and communications are fully encrypted. These agents report back to a single, centralized server at your location, or in the cloud, such as Amazon EC2, Azure, RackSpace, etc.. More information can be found at the following links. https://thwac [solarwinds.com]
  • I want to drastically increase my clients exposure to attack by opening remote holes in their network firewall through my equipment. How can I best go about doing so?
  • ... is your friend. A simple shell script run from cron every so many minutes to test to each server, and then text / email / raise an alarm if no answer. I'd do this from at least 2 locations to allow for transient network issues or the monitoring systems have hardware issue and tank. And don't use windows for critical stuff. A couple of low end linux systems on amazon or similar would work. Low cost, efficient and very manageable.
    • Duct-tapey but it works. Let's add some baling wire and include a phone with a limited data plan that you can pair with via bluetooth or usb. That way, when both their internet connection *and* the box you are monitoring go tits up at the same time you can be notified as well.
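The cron-driven check above could look roughly like this in Python. Note one swap: instead of probing from two locations, this sketch tolerates transient network blips by alerting only after several consecutive failures from a single prober. The class name, the threshold, and the choice of a plain TCP connect are assumptions.

```python
import socket


def tcp_check(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailureCounter:
    """Alert only after `threshold` consecutive failed checks per server."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = {}

    def record(self, server, ok):
        """Record one result; return True exactly when an alert should fire."""
        if ok:
            self.failures[server] = 0
            return False
        self.failures[server] = self.failures.get(server, 0) + 1
        # Fire once when the threshold is reached, not on every later failure.
        return self.failures[server] == self.threshold
```

A cron job would call `counter.record(name, tcp_check(host, port))` for each server and send the text or email whenever it returns True. Since cron starts a fresh process each run, the counter's state would need to be saved to disk between runs.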

  • Take a look at The Assimilation Project [linux-ha.org]: What we do: Continually discover and monitor systems, services, switches and dependencies with very low human and network overhead.
  • I manage a hub server and a backup server. Every 60 seconds the backup server crontab (wget) fetches a 'web page' from the hub server which as a side effect records the caller's IP address into a file. Even though the backup server has a dynamic IP address I can always find it by going to the hub server and looking into that file.

    I have a page I can go to on the hub server which checks the timestamp on the file BackupServer.ip. If it is suspiciously old then that web page turns red and tells me that thin
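The timestamp check on the heartbeat file can be reduced to a few lines. The sketch below assumes the 60-second check-in interval mentioned above and treats the file as stale once more than three intervals have passed; that tolerance and the function names are assumptions.

```python
import os
import time


def heartbeat_age(path, now=None):
    """Seconds since the heartbeat file (e.g. BackupServer.ip) was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def status(age_seconds, interval=60, missed_allowed=3):
    """Return 'ok' until more than `missed_allowed` intervals have elapsed."""
    return "ok" if age_seconds <= interval * missed_allowed else "stale"
```

The status page would call `status(heartbeat_age("BackupServer.ip"))` and turn red on "stale".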

  • Our company uses 'Whats Up' by Ipswitch. Currently monitoring over 2500 devices such as servers, routers, temperature sensors. You can ping devices, monitor for SNMP events, logged events in Windows, AIX, Linux, WMI monitoring, services, tasks.... You can script custom monitors either via VBscript, Powershell, or JavaScript. You can script custom actions for Whats Up to take upon detecting a condition. Can restart services on either *nix or Windows boxes if they go down. Can launch applications if nee
  • Comment removed based on user account deletion
  • This is an amazing product. I've used this in the past and LOVE it! Need to run a remote powershell command from your android? It does that. Dashboards for all the things? Has that covered.

    Check it out:

                    http://www.pulseway.com/

  • If you're using linux or BSD, another option to reverse ssh tunnels or openvpn would be EPS Conduits: http://eps-conduits.sourceforg... [sourceforge.net]

    It was written with the goal of having a large number of remote devices form a virtual network for ease of management/maintenance.

  • Cacti [cacti.net] is a FOSS monitoring service, that can give you a big dashboard showing up/down status, and you can drill down to view graphs of pretty much anything you can monitor over SNMP. Oh, and you can have emails on up/down and reaching thresholds (eg "$host has reached threshold of 75% full on /var/" or whatever).

    We have VPNs to each data centre and client site and administer them over SSH generally. Some systems (eg ones dealing with customer details like credit cards) we have a single external facing hos

  • So about 7 years ago I tested out Nagios, What's Up Gold, Cacti, Zabbix, SolarWinds Orion, and a variety of other software monitoring solutions and the problem that we had for almost all of them is that they required heavy customization or that they were incredibly expensive when they included more initial customization regarding device discovery, included templates, etc. (a la SolarWinds). We finally settled on PRTG (www.paessler.com) because it had some of the industry standard devices templated already
  • Check out http://www.continuum.net/ [continuum.net]. I've been using their services for over 5 years, and they've been steadily improving it since they split from Zenith Infotech. No, it's not free, but it's quite cheap per unit and you get a lot of bang for your buck. Remote monitoring and alerts on any service, remote access, at-a-glance dashboard, etc. With 500 clients, I'm guessing you'd rather spend your time monitoring the situation than putting together a custom solution.
