Preventing Shutdown on Active NFS Servers?

Ed Almos asks: "Like many Slashdot readers, I run a small network at home with a server and a number of desktops. The server holds all our files as NFS shares and doubles as a desktop machine should the need arise. Problems occur, however, if the server is shut down whilst NFS shares are in use: the minimum disruption is a crashed desktop, and a couple of times I have had to deal with corrupted files. Does anyone know of a way to prevent shutdown of a machine if someone else has its NFS shares mounted? I have already explored the /etc/shutdown.allow file, but all this does is determine who can kill the machine. The minimal solution would be something similar to a Microsoft Windows system, where a request to shut down brings up a warning that users are connected to the system, but I am not sure how to achieve this on Linux. Ideally I would like to prevent shutdown of a system with active NFS shares altogether, or at least until every user has unmounted and logged off the network."
This discussion has been archived. No new comments can be posted.

  • Can't do it (Score:5, Insightful)

    by djmitche ( 536135 ) on Wednesday December 03, 2003 @09:10PM (#7624244) Homepage

    NFS is stateless from the server's perspective. This is done so that the server doesn't have to track the state of a whole fleet of clients (and so that the server can pick up where it left off when it crashes and restarts).

    So the server, by design, has no notion of the number / names of users connected to it.

    The best you could do would probably be to monitor NFS traffic, and present a dialog on shutdown if there has been any traffic in the last 5 minutes or so.
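That traffic check could be sketched roughly as below. This is a hedged sketch, not a known tool: the `nfsstat -s -r` output layout assumed here (banner line, header line, then counters with the total call count in the first column) is from Linux nfs-utils and may differ elsewhere.

```shell
#!/bin/sh
# Sketch: refuse shutdown if the server has seen NFS traffic recently.
# Assumes Linux nfs-utils `nfsstat -s -r` output format.

# Extract the server RPC "calls" counter from `nfsstat -s -r` output on stdin.
parse_calls() {
    awk 'NR==3 { print $1; exit }'
}

# Take two snapshots a few seconds apart; any difference means a client
# is actively talking to the server.
if command -v nfsstat >/dev/null 2>&1; then
    before=$(nfsstat -s -r | parse_calls)
    sleep 5
    after=$(nfsstat -s -r | parse_calls)
    if [ "$after" != "$before" ]; then
        echo "NFS traffic seen in the last 5 seconds - aborting shutdown." >&2
        exit 1
    fi
fi
```

A longer window (the five minutes suggested above) would just mean keeping the first snapshot around between runs.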

    • Implementation (Score:2, Interesting)

      As a hack, you could replace /sbin/shutdown with a shell script that pops up a dialog [google.com] (If $DISPLAY is set) or asks on the console.
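That hack might look like the following sketch. The name /sbin/shutdown.real is made up here (the relocated real binary), and xmessage is assumed to be installed:

```shell
#!/bin/sh
# Hypothetical wrapper installed as /sbin/shutdown, after moving the real
# binary aside to /sbin/shutdown.real.

confirm_shutdown() {
    if [ -n "$DISPLAY" ] && command -v xmessage >/dev/null 2>&1; then
        # X session: pop up a dialog; the first button exits 0, the second 1.
        xmessage -buttons "Shut down:0,Cancel:1" \
            "NFS clients may still be connected. Really shut down?"
    else
        # No X display: ask on the console instead.
        printf 'NFS clients may still be connected. Really shut down? [y/N] '
        read -r answer
        [ "$answer" = y ] || [ "$answer" = Y ]
    fi
}

# Hand off to the real shutdown only if the user confirmed:
#   confirm_shutdown && exec /sbin/shutdown.real "$@"
```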
      • Re:Implementation (Score:5, Insightful)

        by Glonoinha ( 587375 ) on Thursday December 04, 2003 @10:19AM (#7628027) Journal
Even better, though: he could decide that there actually is a distinction between server duty and workstation duty, and decide which this particular machine is going to pull. If he needs the machine to run as a workstation, quit trying to use an unstable environment as a server. If the files and the stability of the system are of any importance whatsoever, then it is a server; treat it as such, and buy another computer to use as a workstation (they are dirt cheap now). Pretty simple.

Want to see your uptime and stability rise incredibly on the server? Put it in the closet on a UPS, and once it is running turn off the monitor, unplug the keyboard, and tape a piece of cardboard over the power switch so it doesn't get turned off by accident. Where the machine used to sit, put a cheap replacement computer to use as a workstation - even new entry-level boxes start at under $500 fully loaded (a little wimpy, but including all the necessary parts, including a monitor), and used hardware has gotten insanely cheap (e.g. $200 for a full machine a generation or two old, in the PIII 0.5-1GHz range, with a CRT).

        That said, I am going to read every post in this thread to get a better understanding of how to do this - now you have my interest up.
Define all of your environment variables at the start of your script and you won't have that problem. (BTW, not everyone uses csh... it just reeks of being a Sunny.)
        • What makes you think I use csh? I use bash on Linux, tcsh on OS X

          Besides, checking to see if $DISPLAY is set tells you if the command was run from within an X display.
    • by bill_mcgonigle ( 4333 ) * on Wednesday December 03, 2003 @10:59PM (#7624951) Homepage Journal
put a program on each client machine, call it nfsmounts, and it would go a little something like this:

mount | grep -c "^$1:"

then write a wrapper on the server that does

ok=true
for client in $CLIENTS; do
    mounted=$(ssh "$client" nfsmounts "$(hostname)")
    [ "$mounted" -gt 0 ] && ok=false
done

you can hook that into your shutdown script, and abort if there are any clients who think they have a mounted drive.

of course, read the other suggestions about mount options. No one's mentioned sync yet, but don't mount your shares async, even though the performance is so much better, or you'll lose data.
    • Re:Can't do it (Score:1, Informative)

      by Anonymous Coward
      NFS is not stateless. There are files in

      /var/lib/nfs

      /var/state/nfs
which specifically record the IP addresses and mountpoints of clients. If those files are deleted or modified while NFS is off, then when NFS returns it won't know anything about the clients already connected, and their accesses will fail.
    • >NFS is stateless from the server's perspective. This is done so that the server doesn't have to track the state of a whole fleet of clients (and so that the server can pick up where it left off when it crashes and restarts).

Wrong - the nfsd itself is stateless; mounts, locks, and stuff like that persist...
    • NFS is stateless from the server's perspective. .... So the server, by design, has no notion of the number / names of users connected to it.

      NFS is indeed stateless, however the server does know of users "connected" to it via the mount RPC protocol which is stateful. Try 'showmount'.
  • But... (Score:3, Informative)

    by Trbmxfz ( 728040 ) on Wednesday December 03, 2003 @09:10PM (#7624245)
    Not quite an answer to the article's question, but...

Theoretically, once the NFS server has crashed, shouldn't all clients simply freeze until the server is back? On all systems I have used, this was the observed behaviour, and it is quite useful actually: it seems to avoid data loss (under the right conditions). When the NFS server becomes reachable again, all running programs go on executing as if nothing had happened.

A solution to the original problem, though, would be: tell all users that the NFS machine is to be powered on constantly.
    • Re:But... (Score:2, Informative)

      by danbeck ( 5706 )
      This is only useful if the clients are never doing anything important. Try this on a huge cluster of webserver nodes with nfs mounts and you have one hell of a dead website. Remember kids, if you are going to go to bed with your nfs, bring along a few intrs for a safe, enjoyable time.
  • low tech solution (Score:3, Interesting)

    by ArmorFiend ( 151674 ) on Wednesday December 03, 2003 @09:14PM (#7624270) Homepage Journal
    1) Put a big piece of tape/wood/whatever protecting the power switch on the NFS server, and disable software shutdown by non-root.
    2) Put your NFS server on the *best* machine, not the worst one, so that users want to use it first. If worst comes to worst, put signs on the other machines advertising the other's superiority. (and without the NFS overhead, it will really outperform its clients!)
    3) Put your NFS server on a g3 laptop or other ultra low power system, and hide the system in your closet so others can't find it to turn it off. (hardware suggestions, anyone?)
    4) switch to samba (heh heh heh)
  • by Whip ( 4737 ) on Wednesday December 03, 2003 @09:15PM (#7624272)
If your NFS server rebooting, shutting down, or crashing causes any problem other than temporarily 'hung' clients, you have something wrong.

    NFS is explicitly designed to be stateless, precisely to allow it to function across server reboots, crashes, and other fun. If your clients are crashing, or getting back corrupted data, something is screwed up somewhere.

    And, by the way, if you're getting corrupted data on a server crash, and the server is linux, you just had an object lesson on why it's bad that linux NFS defaults to async writes. :)
    • That would be a good point, if it wasn't horribly out of date:

      "In releases of nfs-utils upto and including 1.0.0, this option was the default. In this and future releases, sync is the default, and async must be explicit requested if needed."

      From the rpm changelog:

"* Mon Jul 22 2002 Bob Matthews <bmatthews@redhat.com>

      - Move to nfs-utils-1.0.1"

      'nuff said.
I know that when I shut down my computer from a command prompt, KDE pops up saying the computer is being shut down. Obviously I know this because it's my computer, but don't you get a similar message when you shut down your server? I am sure you can delay the shutdown for 30 seconds or so to allow users to unmount. I remember at our university, the server went down on our Solaris machines and everything froze, but then re-started again in a couple of minutes, without problems. People booted into
    • Maybe there's a way to send one of those "net send" messages to the network saying the server is going down for a reboot? Doesn't it say something like that when you do it on the console to all those logged in?
  • Try rwall or similar (Score:4, Informative)

    by ctr2sprt ( 574731 ) on Wednesday December 03, 2003 @09:23PM (#7624334)
    There's a network-able version of wall that uses RPC (I think). It's not a foolproof solution, since it won't work if your users are logged in without an open terminal window, but it's a help. I'm sure it's terrifically insecure, but since you're running NFS you're already insecure (and so hopefully have a firewall).

    If that isn't good enough for you, there are a couple other possibilities. You could probably cobble together an utterly trivial Python (or Perl or whatever) script on your client machines, then have the server invoke it via ssh when a shutdown starts. If you aren't a programmer at all, you could try firing off an email to the client machines. As long as you have a periodic mail-checker going, it would alert you to the arrival of a new message. (Since you'd be able to use the local spool, you could have it check every 15 seconds.)

  • If I understand you correctly, it sounds like you just want to replace shutdown with a wrapper script.

    a) Move /sbin/shutdown to something like /sbin/shutdown.real.

    b) Write a shell script called /sbin/shutdown that checks for your NFS mounts before invoking shutdown.real.

If you want to make it really fancy, you can do something like calling and parsing the output of 'df' and comparing that to the contents of /etc/fstab... or just compare /etc/mtab to /etc/fstab. Whatever... just look to see that you have NF
  • by stevef ( 5539 ) on Wednesday December 03, 2003 @09:28PM (#7624364)
    If you use the correct mount options you should not have to worry about corruption when the nfs server goes away.

    The options you want (for filesystems mounted rw) are:

    rw,hard,nointr...

    A lot of people don't like these options because it means that the clients will hang until the server returns, but it is THE RIGHT THING TO DO if you are mounting important data rw. If you can't stand for your clients to hang, maybe replace 'nointr' with 'intr', but you've been warned.

    Steve
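On the client side, an /etc/fstab entry using those options might look like the following (the server name and paths here are made up for illustration):

```
# client /etc/fstab -- hypothetical server and export path
fileserver:/export/home   /home   nfs   rw,hard,nointr   0 0
```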
    • Mod parent up (Score:2, Informative)

      by Emnar ( 116467 )
      I work for a major NAS storage company. Using the mount option "hard" is the right advice. It sounds like the submission author is using soft NFS mounts, which is a big no-no with rw mounts where you want any kind of data integrity.
      • Re:Mod parent up (Score:4, Interesting)

        by HalfFlat ( 121672 ) on Wednesday December 03, 2003 @10:41PM (#7624842)
        Back when I was administering a mixed Unix network, we used to say the two NFS mount options were 'hard' and 'corrupt'.

I believe it is theoretically possible to write software that can survive a soft-mounted filesystem disappearing from under it, but no one ever does. How often do people check the return value of write()? And in memory-mapped I/O land, it would be nasty.
    • The parent has the correct answer.

      rw,hard,nointr

      Those are the correct options for a read-write NFS drive that will freeze the clients until the server has been restarted. After restart the clients continue as if nothing has happened.
    • That's what I have, however sometimes when the server's been down and back up again, the clients get a "Stale NFS handle" error trying to access the mounted volumes. umount/mount fixes it, but what causes it in the first place? Kernel 2.4.22, Gentoo.

      Hey, this IS an Ask Slashdot, right? :-)

NFS doesn't use full path names but a compressed unique ID, called a cookie, for each file, and it usually generates these cookies as the files are accessed. You'll get a "stale NFS handle" when the machine reboots: the old cookie (kept by the client) is no longer known to the server. Remounting fixes it because the client asks for new cookies starting at the filesystem root and forgets all the old ones.
Ah. And as the file system (Reiser3, in this case) assigns an inode to directories (i.e. making it a file in some technical sense), that's why I get the Stale NFS Handle message even when I just try to ls /nfs-mounted-dir/ on the client? Makes some kind of sense...

          But doesn't this cookie system sorta break the statelessness of the NFS server, at least in spirit? What's the reason for doing it this way instead of having the client ask for full path names? Can this behaviour be turned off somehow? Having th

No, what's breaking statelessness is that Reiser doesn't really use inodes, and it is not a very good choice for an NFS server. Sadly, as it's a beautiful file system. IIRC, ReiserFS generates inode numbers on the fly - directories in a traditional Un*x filesystem have inode numbers because they are just files. Ext3, XFS, JFS - all better choices for an NFS server.

            Sorry, dude.

  • maybe... (Score:4, Interesting)

    by Froze ( 398171 ) on Wednesday December 03, 2003 @09:31PM (#7624390)
use lsof to monitor tcp/udp/rpc sockets that are open on the host and pointing at the file space that NFS is serving.

    Then write a wrapper around each of halt, shutdown, and reboot to check the open ports and fail if they are active.

    Seems fairly hackish, but... whaddya expect from /.?
  • by stefanlasiewski ( 63134 ) * <slashdotNO@SPAMstefanco.com> on Wednesday December 03, 2003 @09:40PM (#7624451) Homepage Journal
    I can't remember the details on this, but would the NCF Locking Services [google.com] work for you?

    NFS input/output is stateless, but I believe the locking mechanism is stateful.

    When clients are accessing a file, a lock is established. When the client is done with the file, the lock is removed. You can see who has what resource locked with a utility (I forget which, but fcntl() and lockf() come to mind).

In a shutdown script, look for locks, and refuse to proceed until the locks are cleared.
    • I can't remember the details on this, but would the NCF Locking Services work for you?

      Sorry, that's "NFS Locking Services"
    • I use nfs locking for everything. It's just as good as regular file locking.

      Try this:
Create /nfsmount/.lockfile on the NFS drive, readable by everyone.
After the clients mount the NFS drive, have them take a read (non-exclusive) lock on /nfsmount/.lockfile and leave it held in the background:
(clientnfslock /nfsmount/.lockfile &)
In the server shutdown script, put in a routine that fails if it cannot take a write (exclusive) lock on that same file:
(... servernfslock /nfsmount/.lockfile || exit; ...)

      Note
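A similar guard can be sketched with flock(1) instead of the hypothetical clientnfslock/servernfslock commands. The path is made up, and whether lock requests actually propagate over your particular NFS setup (via lockd) is an assumption worth testing first:

```shell
#!/bin/sh
# Sketch of the lock-based shutdown guard using flock(1) from util-linux.
# /nfsmount/.lockfile is a made-up path; NFS lock forwarding is assumed.

# Succeeds only if an exclusive (write) lock on $1 can be taken right now,
# i.e. no client is holding a shared (read) lock on it.
no_client_locks() {
    flock -x -n "$1" true
}

# Each client, after mounting, would hold a shared lock in the background:
#   flock -s /nfsmount/.lockfile sleep 1000000 &
#
# The server's shutdown script then refuses to proceed while clients hold it:
#   no_client_locks /nfsmount/.lockfile || { echo "clients active" >&2; exit 1; }
```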
  • by gl4ss ( 559668 ) on Wednesday December 03, 2003 @10:04PM (#7624609) Homepage Journal
but wouldn't one of the key things to consider when building such a system be that a) it's down as little as possible, and b) when it goes down, it's well known beforehand (and the users can be told in advance that it will go down at time X, and they're fsced if they don't get out before it)?

look, what point would there be in initiating the shutdown if you didn't know when the users would get out anyway? it could take hours or days before it would actually boot, and if that doesn't matter (waiting _hours_), then why would you be booting it in the first place? just out of habit?

anyway... it sounds a lot like you should be fixing why you have to reboot it, rather than how the rebooting occurs, so that you wouldn't need to reboot it at all.
  • Alternate solution (Score:3, Interesting)

    by 0x0d0a ( 568518 ) on Wednesday December 03, 2003 @11:32PM (#7625118) Journal
I know that it's not quite what you wanted, but using Coda, which is designed to support disconnected operation (i.e. the server goes away for a while and then comes back), may be an appealing option for you.
  • by !3ren ( 686818 )
First off, I would recommend that you never run applications on your file server. That just seems like tempting fate to me. Get a cheap old system from a reseller/friend/the garbage and use that.

Second, use FTP instead of NFS. Allow it to support resume, and only let it talk locally. I believe there are utils that will let you mount an FTP server as a drive in Windows as well...

A couple of ideas, anyway.
  • NFS Automounter (Score:2, Interesting)

Where I work they had this problem, and someone implemented an automounter program that mounts the shares when they are needed and releases them as soon as they are not. Not a perfect solution, but it works really well.
    • Re:NFS Automounter (Score:2, Insightful)

      by Anonymous Coward
      This comment is definitely on the right track. I've managed a number of sites which use NFS heavily on hundreds of systems, and it all comes down to effective use of the automounter.
      At larger sites, you also get some other great benefits, like being able to move filesystems transparently from one server to another without touching the clients, all by managing the automounter map.

      Most Unix and Linux variants have a standard automounter. Use it! This will lead to two possible scenarios when the fileserver
  • rmtab (Score:3, Interesting)

    by Bazman ( 4849 ) on Thursday December 04, 2003 @06:08AM (#7626904) Journal
    The file /var/lib/nfs/rmtab on the server keeps a list of what systems have mounted NFS drives. What you then do is this:

    1. When shutting down, first go through rmtab and send an rwall message to those machines, saying 'get the heck off because the server is going down shortly'

    2. Two minutes later try again and send a more forceful message.

    3. Two minutes later tell them they are about to get it in the neck. And shutdown.

If you really don't want to shut the machine down with NFS-mounted stuff still there, then modify to taste - don't shut down unless rmtab has no shares from machines you don't want to annoy.

    This is untested of course.... Until you try it! Read the warnings about rmtab in man mountd. It may not be trustworthy.

    Baz
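The staged-warning idea above might be sketched like this. The rmtab line format assumed here is host:path:refcount, and, as the man page warns, the file may contain stale entries:

```shell
#!/bin/sh
# Untested sketch of the staged rwall warning. rmtab entries may be stale
# (see the warnings in man mountd).

RMTAB=/var/lib/nfs/rmtab

# List the unique client hostnames recorded in an rmtab-format file ($1).
clients() {
    cut -d: -f1 "$1" 2>/dev/null | sort -u
}

# Send a one-line rwall message ($1) to every listed client.
warn_clients() {
    for host in $(clients "$RMTAB"); do
        echo "$1" | rwall "$host"
    done
}

# Staged shutdown, roughly as described above:
#   warn_clients "Server going down in 4 minutes - please unmount."
#   sleep 120
#   warn_clients "Server going down in 2 minutes. Last warning."
#   sleep 120
#   [ -z "$(clients "$RMTAB")" ] && shutdown -h now
```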
  • rm -f /usr/sbin/shutdown
... a long time ago. Basically, every user gets a simple switch; all switches are connected in parallel, so you get a "wired OR". All users are told: if you need the server, turn on your switch; if you don't need the server any more, switch it off again. A little piece of electronics connected to an RS232 port and a tiny C program control the power supply for the server. Details (German only, sorry) are here [foken.de].

    The trick is: The server now knows best when to start and to shutdown, there is no more need for man

Couldn't you use lsof to get a list of open files and check if they are in mounted NFS shares? If there are any when shutdown is called, use a bit of scripting to pop up a dialog? Just my $0.02.
  • Home network server (Score:2, Informative)

    by 1eyedhive ( 664431 ) *
I use Samba to share my files, so I can't comment on NFS too much.

Similar setup: one Linux file server, 3 desktops (mixed *nix and Win2k). When the server shuts down, any active clients go nuts (Winamp freezes, Explorer complains about the sudden disconnect).

The best thing to do, since you're on a home network, is to make sure no one's using the damn thing before you shut it down. This is easy, even with 6 systems (assuming Samba): just make sure no programs are actively using the shares and reboot the box, the wor
  • showmount anyone? (Score:2, Informative)

    by loony ( 37622 )
showmount -a is your friend... that will show you all clients that currently have a filesystem mounted. Just move your shutdown, reboot, or whatever out of the way, replace them with a wrapper script that checks if showmount -a returns any clients, and only executes the real shutdown when no one has the filesystems mounted. If they do, you can always print out a list of the workstations with mounted filesystems...
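As a sketch, that wrapper could look like the following. /sbin/shutdown.real is a made-up name for the relocated binary, and --no-headers is assumed to be supported by your showmount (it is in Linux nfs-utils):

```shell
#!/bin/sh
# Hypothetical wrapper for /sbin/shutdown; the real binary has been moved
# aside to /sbin/shutdown.real. `showmount -a --no-headers` prints one
# client:mountpoint pair per line, with the banner suppressed.

# Fail (return 1) and report if any client still has a filesystem mounted.
# $1 is the captured showmount output.
check_clients() {
    if [ -n "$1" ]; then
        echo "These clients still have filesystems mounted:" >&2
        echo "$1" >&2
        return 1
    fi
}

# check_clients "$(showmount -a --no-headers 2>/dev/null)" \
#     && exec /sbin/shutdown.real "$@"
```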
  • Does anyone know of a way to prevent shutdown of a machine if someone else has drives mounted to its NFS shares ?

    Put a sign on the server that reads "Do not shutdown this server". If you want a technical solution - use a post-it.
