fsck-less Booting?

patrick42 asks: "I am working on a project where I'll be replacing a DOS/Windows-based system with Linux or FreeBSD. The company for whom I'm working uses cheap PCs running some proprietary software on DOS/Windows to perform a certain task. The machines are deployed in environments where there are no keyboards or displays, and minimum-wage clerks are the people watching these machines. The company has decided to go with a free Unix system because they no longer wish to pay the licensing fees for Windows. The machines get unplugged all the time when they are moved or whatnot. They do not get a proper shutdown procedure ever, and it's not possible to change this due to the environments in which they are deployed. I've been told that they've never had a problem running DOS in terms of filesystem corruption. So I guess I'm looking for the safest filesystem possible that I can use with either FreeBSD or Linux. My head would be served on a platter if I picked something that sometimes requires user intervention." Note that Ask Slashdot covered a similar question back in 1999; the situations differed, but the need remains the same: can Linux work in environments where proper shutdowns are rare to nonexistent?

"I have run many Linux machines, and I've experienced firsthand (only on occasion) a machine that did not get shut down properly and then, on the next boot, required user interaction to run fsck manually.

I really want to use either FreeBSD or Linux, but if there is any chance of this happening (hardware failures excluded) where someone needs to manually run fsck, I will not be able to use them.

I've been reading about the ext3 filesystem, and how corruption is quite rare, but it still seems possible. UFS claims to be quite stable as well, but fsck-less booting will not be available until FreeBSD 5.0 (from what I've read).

These machines aren't doing too much writing to the disk -- they are mostly just reading data, but that isn't to say that there will be no disk writes at all.

Can anyone offer some advice?"

  • by earthy ( 11491 ) on Saturday August 24, 2002 @07:42AM (#4132667)
    The simplest solution is to go read-only for all system data, such as binaries and static configuration. Even better: use something of a commit-system to commit configuration changes to disk and have the disk be read-write only when committing changes.

    Even though you'll still run with fsck this will not be a problem, as stuff can't have changed for reasons other than hardware failure... and you're not going to work around that in software anyway.
    • Amen to that... What about using a ROM based linux of some sort? Sounds like a POS or something where you'd be connecting to a central server anyways.

      If not, and you do need local storage then maybe a battery backed memory storage device, or a flash system could work, too.
  • by Anonymous Coward on Saturday August 24, 2002 @07:59AM (#4132687)
    Disable (or strictly limit) the write cache (-> Relevant documentation [tldp.org]). Use a journaling filesystem. The result will be at least as good as with using the FAT-filesystem and DOS. The journaling filesystem means that the filesystem *structure* will always be consistent. Disabling the diskcache will reduce the chance of inconsistent *data* to DOS levels. But in the end, the application has to have precautions against inconsistent data, much like a journaling filesystem protects against inconsistent filesystem structure.
    • I believe with ext3, you can journal the structure and the data.

      First, you'll want to disable the periodic filesystem checks with this command (ref. http://www.symonds.net/~rajesh/howto/ext3/ext3-5.html#ss5.4):
      tune2fs -i 0 /dev/hdxx

      Then, by using the right journaling mode, you can have it journal your data as well. I believe the mount option data=journal is what you want.
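
A minimal sketch of the two ext3 steps above, written as a dry run so that nothing is touched unless DRY_RUN=0 is set. The device /dev/hdXX and the mount point /data are placeholders, and the `run` and `ext3_setup` helpers are hypothetical names introduced here. `tune2fs -c 0 -i 0` turns off both the mount-count and the time-based forced checks, and `data=journal` is the ext3 mount option that journals file contents as well as metadata.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# ext3_setup DEVICE MOUNTPOINT (hypothetical helper)
ext3_setup() {
    dev=$1
    mnt=$2
    # Disable the periodic forced checks (by mount count and by interval).
    run tune2fs -c 0 -i 0 "$dev"
    # Mount with full data journaling instead of the default ordered mode.
    run mount -t ext3 -o data=journal "$dev" "$mnt"
}

# Placeholder arguments; substitute the real partition and mount point.
ext3_setup /dev/hdXX /data
```

Full data journaling writes file contents twice, so it probably suits a mostly-read workload like the one described better than a write-heavy one.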

  • Synchronous (Score:5, Informative)

    by Phaid ( 938 ) on Saturday August 24, 2002 @08:05AM (#4132694) Homepage
    I've had to deal with a situation like this before - a hard drive in a laser printer, where there was no shutdown procedure, only an on/off switch. I used the "sync" option - e.g.

    mount -t ext2 -o sync /dev/hda2 /usr

    This causes the filesystem to be mounted synchronous, so that there are no deferred writes and all disk writes are committed to the disk before the I/O call returns.

    This is not 100% fool proof either, as it is still possible to power down the machine in the middle of a write, but it makes it much more difficult to screw up.
    • Re:Synchronous (Score:4, Insightful)

      by orangesquid ( 79734 ) <orangesquid@nOspaM.yahoo.com> on Saturday August 24, 2002 @08:25AM (#4132730) Homepage Journal
      Be wary of modern hard drives---some of them may use a write cache internally (from what I have heard, anyway).

      As for manually running fsck, you don't have to; calling /sbin/fsck -A -a will automatically repair the filesystems, without prompting for questions. On slackware [slackware.com], the appropriate file to edit is /etc/rc.d/rc.S --- Of course, your distribution might be one of those with a million confusing files in /etc/init.d and /etc/rc*.d in which case I can't help you :)

      If you want to disable e2fsck:

      The root filesystem will be checked first unless the -P option is specified (see below). After that, filesystems will be checked in the order specified by the fs_passno (the sixth) field in the /etc/fstab file. Filesystems with a fs_passno value of 0 are skipped and are not checked at all.
      -- fsck(8)
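
To see which filesystems a boot-time `fsck -A` would actually visit, the sixth fstab field can be listed directly. This is only a sketch: `fsck_plan` is a hypothetical helper name, and it assumes a conventional whitespace-separated /etc/fstab.

```shell
#!/bin/sh
# fsck_plan FSTAB (hypothetical helper)
# Prints "passno mountpoint" for every entry that fsck -A would check,
# lowest pass number first. Entries with passno 0, or with no sixth
# field at all, are skipped, as are comment lines and blank lines.
fsck_plan() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, $2 }' "$1" | sort -n
}

# Show the plan for the local table if one exists.
if [ -r /etc/fstab ]; then
    fsck_plan /etc/fstab
fi
```

Setting the sixth field to 0 for a partition removes it from this list, which is the mechanism the man page excerpt describes.
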
      • Knowing only how to get mysql and apache to start up on boot I'd say to follow this gent's recommendations, using /etc/rc.d or /etc/init.d daemon initializations seems like the easiest and least obscure method though there may be better routines for dealing with the file system.

        I work on OS X so... it's all BSD to me and the file system does things like auto reboot on power failure with a simple checkbox.

        Good luck.


      • Be wary of modern hard drives---some of them may use a write cache internally (from what I have heard, anyway).

        Of course, in the event of power being turned off, many(most?) of these drives are also smart enough to commit cached writes as the drive spins down.

      • Re:Synchronous (Score:5, Informative)

        by reynaert ( 264437 ) on Saturday August 24, 2002 @10:18AM (#4132938)

        Be wary of modern hard drives---some of them may use a write cache internally (from what I have heard, anyway).

        You can disable this with hdparm -W 0 /dev/hd*. Other hdparm parameters may also be interesting.

    • Re:Synchronous (Score:2, Interesting)

      by displague ( 4438 )
      You may also want to mount non-write needing partitions (/usr, /) as read-only... Generally, depending on the software, you can create a system which only writes to /var, /tmp, and the user-home directories.

      Of course, you should also use a journaling filesystem like reiser or ext3. These filesystems tend to take the whack out of improper shutdowns.

      If you have a modem, it is possible to direct all of the linux boot, and often (on newer systems) the bios, out to the serial port. This way you can handle all user intervention. Short of boot-up/bios your best administrative interface will be your ssh client. (don't forget to remount rw before installing new software)

      I thought you said this application was DOS based? Why do you need windows licenses? You could easily go with FreeDOS, or just use an existing DOS 6.x license. You're not guaranteed an ssh interface that way, but you tend not to get instability as well.
  • Why not create a read-only filesystem where all the binaries and static data reside, and send all the mutable data over a socket (or to a file on an NFS server)?
    Another solution could be to have one read-only partition with the static data and another read-write partition with all the mutable data, and to modify the init scripts to check whether there is any error that an unattended fsck cannot correct (or a pseudo-attended one that always answers yes), then recreate the filesystem. In that case you lose all the old info :( but you guarantee that the machine always comes up.
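
The "check, and recreate if fsck gives up" idea hinges on the exit status of fsck, which fsck(8) documents as a bit mask: 1 means errors were corrected, 2 means the system should be rebooted, 4 means errors were left uncorrected, and 8 and above indicate operational or usage problems. A rough sketch of the decision follows; `fsck_action` and the action names it prints are hypothetical, and the actual mkfs step is deliberately left out.

```shell
#!/bin/sh
# fsck_action STATUS (hypothetical helper)
# Maps an fsck exit status to one of: continue, reboot, recreate, investigate.
fsck_action() {
    status=$1
    if [ $((status & 4)) -ne 0 ]; then
        # Errors left uncorrected: the data partition would be recreated.
        echo recreate
    elif [ $((status & 2)) -ne 0 ]; then
        # Errors fixed, but fsck asks for a reboot.
        echo reboot
    elif [ "$status" -le 1 ]; then
        # Clean, or errors corrected.
        echo continue
    else
        # Operational error, usage error, and so on.
        echo investigate
    fi
}

# Example wiring in an init script (commented out; /dev/hdXX is a placeholder):
#   fsck -a /dev/hdXX
#   action=$(fsck_action $?)
```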

  • Diskless? (Score:5, Insightful)

    by Col. Klink (retired) ( 11632 ) on Saturday August 24, 2002 @08:38AM (#4132745)
    Are the machines networked? You could use Etherboot [sourceforge.net] to boot over the network and have no local disks (or just a floppy if you don't want to make boot EPROMs).

    You didn't really say much about what the "certain task" these machines do. Do they need to save a lot of data? You could boot off a CD-ROM and use RAM disks for /var and other writeable partitions. Each time the machine is unplugged, it returns entirely to its initial state.

    If you want to save a small amount of data, you could put a VFAT formatted floppy and write persistent data there.
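
One way the RAM-disk idea might look at boot, sketched as a dry run. The 4m size and the skeleton archive /etc/var-skel.tar are assumptions made for illustration, and `run` and `setup_ram_var` are hypothetical helper names; tmpfs is used here as the RAM-backed filesystem.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_ram_var SIZE SKELETON_TAR (hypothetical helper)
setup_ram_var() {
    size=$1
    skel=$2
    # /var lives in RAM, so it starts out empty on every boot.
    run mount -t tmpfs -o "size=$size" tmpfs /var
    # Recreate the directory layout that daemons expect under /var.
    run tar -xpf "$skel" -C /var
}

# Assumed size and archive path; adjust to the real image.
setup_ram_var 4m /etc/var-skel.tar
```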

  • by Zocalo ( 252965 ) on Saturday August 24, 2002 @09:09AM (#4132793) Homepage
    In addition to automating fsck's you could circumvent a large part of the issue by using one of the various journalling file systems (EXT3, JFS, ReiserFS, XFS). Being able to roll-back to a known good state is an ideal way of avoiding having to run fsck altogether.

    Secondly, once the box is configured, edit your fstab file and change any partitions which don't need to be written to so that they are mounted read-only. If there are no writes to a volume, then there is no need to check the volume (this is how I used to speed up post hard-down boots before journalling filesystems). It's a good security practice as well - in combination with chattr it can be a very effective "escalation of privileges" block.

  • Ext3 (Score:2, Interesting)

    Ext3 in ordered mode [redhat.com] was my first thought. I'm comfortable with the stock kernel - after crashes (and on regular intervals), I don't think I've had to intervene with the fsck at bootup. If you'd prefer more QA, then you might examine the patches that, say, RedHat and Debian provide in their kernels, and stick with one.
  • These sound like point of sale terminals or similar. Are these machines networked?

    If so, then it shouldn't matter what filesystem you use, so long as you mount it read-only. Then, keep all writable data on a RAM disk (for /tmp and friends) and on the network (for real data).

    If the systems are new enough, I'd even consider booting from CD or network and doing away with the hard drives completely.

  • Darwin & OS X (Score:3, Informative)

    by benh57 ( 525452 ) <bhines@alumnREDH ... edu minus distro> on Saturday August 24, 2002 @09:33AM (#4132852) Homepage
    Darwin and Mac OS X run fsck themselves at boot and do not ask questions. They seem to handle it well. Darwin will run on x86. Mac OS X is proprietary, but does not have the steep licensing fees that Windows does.
    • How fsck runs at bootup is determined by the boot scripts, so it's a 'distribution' (i.e. RedHat / Debian) thing as opposed to a 'kernel' (Darwin / Linux) thing.

      Also while darwin does run on x86 it would make much more sense to use one of the more established 'FreeOSs' for the architecture, as you are more likely to find support.

      Finally, saying that Mac OS X does not have the steep licensing that Windows does is just plain wrong: it still costs $129 for a client and $499/$999 for a server. And you have to buy Apple hardware, which is often more expensive than x86 equivalents.

      Finally, if there is already software for DOS, why not use FreeDOS http://www.freedos.org/? It may be basic, but it would not require a bunch of retraining and rewriting code to do the same things in a different way.
    • Mac OS X is proprietary, but does not have the steep licensing fees that windows does.

      Yes it does. Mac OS X is a $100 OS with a $1000 hardware key. (I make no statement about the TCO, only the initial cost of hardware and OS per seat.)

    • Actually, as a long-time Mac advocate, I don't think MOSX does this very well. With HFS+ it can sometimes take two or three passes of fsck before it comes back with a filesystem OK message. If MOSX crashes, always hold down command-s when you restart and type fsck -y until you get the OK message.

      Furthermore, clerks don't need all the things that MOSX does well, and probably shouldn't have access to that.

      IMNSHO, I like the suggestions for booting from a read-only volume and writing all data to a network share. I suggest that there be no floppy drive available. And if you use a CD-ROM, find a drive where the disc's hole snaps onto the spindle (like laptops have, so it doesn't shift when the clerks drop the machine) and one that you can lock, or otherwise make difficult to eject.

      Netboot could make this very easy and cost-effective if you have enough machines at a location. Then the individual terminals don't need excess parts like CD-ROMs, floppy (data loss devices), or hard drives. Pretty much the definition of a "thin client", like IBM's Network Station 1000 series, which can be had for $500.

      Although LCD iMac point of sale terminals would be tres chic.

  • I do this for my CF-based firewalls.

    The CF has two partitions on it: A 3M boot/config partition (Minix), and the rest is reserved for the cramfs (right now the cramfs takes about 10M).

    The boot/config partition holds the kernel, GRUB, an initrd image and the system configuration. At boot time /conf is booted and the initrd is loaded and executed. It loads a few modules, mounts the cramfs and then pivot_roots so that cramfs is /. The cramfs boot process mounts /dev/pts, proc and /var (as a 4MB tmpfs), untars the basic /var system and runs init. The rest is a standard boot.

    The advantage to this is that I don't waste RAM by decompressing the entire root filesystem; cramfs decompresses the program to RAM at execution time. /var is the only point that is always rw but it's RAM anyway. Any time I need to do a config change I remount /conf as rw, make the change, and remount ro. If power is lost there is no fsck.

    At present I have a complete firewall (including ssh, ipsec, the iptables 1.2.8 connection helpers (h323, ah/esp, etc.) and a full Perl install (I use it for SNMP, XMLRPC, integrity checking, etc.)) to fit in a 16M CF card. Cramfs is *awesome* :-)

    In your situation I would probably do away with having the config in /conf, instead mounting it from the network (perhaps using BOOTP).
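
The "remount rw, change, remount ro" routine described above could be wrapped roughly as follows. This is a dry-run sketch, not the poster's actual script: `run` and `commit_config` are hypothetical names, and /tmp/firewall.conf is a placeholder for whatever file is being committed to /conf.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# commit_config FILE (hypothetical helper)
# Copies FILE into /conf, keeping the partition writable only briefly.
commit_config() {
    src=$1
    run mount -o remount,rw /conf
    run cp "$src" /conf/
    status=$?
    # Flush to disk and go back to read-only even if the copy failed.
    run sync
    run mount -o remount,ro /conf
    return "$status"
}

# Placeholder file name.
commit_config /tmp/firewall.conf
```

The window in which a power cut can leave /conf dirty is then limited to the few moments between the two remounts.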

  • by photon317 ( 208409 ) on Saturday August 24, 2002 @10:48AM (#4133017)

    Use linux, and go ext3 (the journalling version of ext2). Mount the ext3 filesystem with options "sync,data=journal", and you should never have any issues.
    • don't forget ext3 is two times slower than ext2.


      Create EXT3 journal in ordered data mode:

      % umount /mnt

      % tune2fs -j /dev/hda1
      tune2fs 1.26
      Creating journal inode: done
      This filesystem will be automatically checked every 10 mounts or
      10 days, whichever comes first. Use tune2fs -c or -i to override.

      % mount -t ext3 /dev/hda1 /mnt

      % cd /mnt

      % time dd if=/dev/zero of=test count=1000k
      3.900u 39.540s 1:10.53 61.5% 0+0k 0+0io 103pf+0w

      % umount /mnt

      Mount as EXT2 filesystem:

      % mount -t ext2 /dev/hda1 /mnt

      % cd /mnt

      % time dd if=/dev/zero of=test count=1000k
      2.540u 11.960s 0:38.58 37.5% 0+0k 0+0io 105pf+0w

      EXT3 is 1.8 (71/39) times slower than EXT2!

      There are other options like non-ordered for using EXT3 which make it run a bit faster, but it's still at least 1.5 times slower than EXT2. If you decide to run EXT2 and put up with regular fscks, just disable manual mode in fsck so it runs automatically on reboot without user intervention required. An EXT2 filesystem used like this will typically live for years before it finally collapses. Keep regular backups and you'll be fine.
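
The ratio quoted above comes from the elapsed (wall-clock) column of the two `time` lines. A small sketch of that arithmetic, assuming the m:ss.ss format shown; `to_seconds` and `ratio` are hypothetical helper names.

```shell
#!/bin/sh
# Convert an elapsed time of the form M:SS.ss to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Divide the first argument by the second, to two decimal places.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ext3=$(to_seconds 1:10.53)   # elapsed time of the ext3 run
ext2=$(to_seconds 0:38.58)   # elapsed time of the ext2 run
echo "ext3 ${ext3}s, ext2 ${ext2}s, ratio $(ratio "$ext3" "$ext2")"
```

This covers a single sequential write of zeros on one machine, so the figure says little about a workload that mostly reads.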

  • The latest versions of FreeBSD allow you to specify this option in the rc.conf file:

    # Set to YES to do fsck -y if the initial preen fails.
    fsck_y_enable="YES"

    That, in combination with a good choice for your filesystem type, should ensure that even if fsck does find something, it will decide what to do without bothering the user.
    • Definitely enable SoftUpdates if you're using one of the BSDs. It will ensure metadata stays consistent as much as possible. Much better behaviour in case of unexpected shutdown. Actually, with SoftUpdates a boot-time fsck isn't required, it can happen after boot in the background - but that won't be in a released version of FreeBSD until 5.0. Much faster startup on a system with large filesystems.
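
Put together, the FreeBSD settings mentioned in this thread might look like the fragment below. `fsck_y_enable` is taken from the comment above; `background_fsck` is the rc.conf knob for the background check and is assumed here to apply only to FreeBSD 5.0 or later. The device name in the tunefs comment is a placeholder.

```shell
# /etc/rc.conf fragment (FreeBSD); rc.conf is read as sh variable assignments.

# From the comment above: answer "yes" to everything if the initial preen fails.
fsck_y_enable="YES"

# Assumption: FreeBSD 5.0 or later. Lets the check of soft-updates
# filesystems run in the background after boot instead of blocking it.
background_fsck="YES"

# Soft updates are switched on per filesystem, on an unmounted or
# read-only filesystem (for example from single-user mode), with:
#   tunefs -n enable /dev/adXsYZ     # placeholder device name
```
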
  • If you are using IDE drives, regardless of what solution you try to use, disable write caching. Even when mounting an IDE drive with the sync option, it may lie about when writes complete if the cache is being used (IDE drives do this for better performance numbers, SCSI drives don't do this).
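
A dry-run sketch of turning off the on-drive write cache with `hdparm -W 0`, the flag given earlier in the thread. The device list is a placeholder, `run` and `disable_write_cache` are hypothetical helper names, and the script is meant for a boot script on the assumption that a drive may not remember the setting across power cycles.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# disable_write_cache DISK... (hypothetical helper)
disable_write_cache() {
    for disk in "$@"; do
        # -W 0 asks the drive to switch its internal write cache off.
        run hdparm -W 0 "$disk"
    done
}

# Placeholder device; list every IDE disk that receives writes.
disable_write_cache /dev/hda
```
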
  • One of the 'issues' in setting up Sun Blade100 workstations for 30 Architects was how to handle unclean shutdowns, power failures, etc. Integrity of the OS isn't an issue (if it breaks we simply JumpStart it, and /usr/local & /opt are nfs mounts) - and integrity of the user data isn't an issue either (~, etc are all NFS mounts). All I wanted was for it to NOT prompt for a root password for fsck on bootup EVER. The following solves this problem on Solaris 8:

    Open /etc/rcS in your favourite editor

    Find the line that contains the comment:
    # Determine fsck options by file system type

    Underneath that, you will see a construct which says:
    case $2 in
    ufs) foptions="-o p" ;;
    s5) foptions="-y -t /tmp/tmp$$ -D" ;;
    *) foptions="-y" ;;
    esac

    The 'foptions' are the arguments passed to fsck; see man fsck for further details. Change the ufs line to read:
    ufs) foptions="-y" ;;
    ...save and exit. Now test this by pulling the plug and rebooting. It will see your hosed-up filesystem, run fsck with the -y argument, fix a whole bunch of crap without any user intervention, then proceed with the bootup when it's done.

    You may be able to apply something similar to your needs.
    • Wow, I sat here stunned looking at your post trying to think of a response. I'm always amazed at people who take a sledgehammer approach to solving a problem when there's something a lot easier and better.

      Starting with Solaris 7 11/99 (I think) Solaris has had UFS logging support, ie a journaling filesystem. It is similar to ext2/ext3 in that you can on the fly switch between the two without need for newfs'ing your slices. Just add the logging flag to the options column in /etc/vfstab. And to start using it right away just do a `mount -o remount,logging $fs_mount`. Sometimes / and /var require a reboot to be switched over to logging.
      • I am well aware of the 'logging' flag, and use it on all our systems. However, the /var filesystem would still frequently come up unclean, and prompt for '...root password to enter single-user mode, or control-D to continue...'

        Hammer? Maybe, but after getting the umpteenth call from users saying 'my system says to hit control-D and I do but nothing happens...' it was a fast, simple, once-and-forget-it approach (the change to rcS is applied with the JumpStart script).
  • by kevquinn ( 563706 ) on Saturday August 24, 2002 @01:30PM (#4133508) Homepage
    That'd be my first shot - FreeBSD implements "Soft Updates" (as on OpenBSD), which practically eliminate the need for fsck'ing.

    Soft Updates ensure that the filesystem is always in a consistent state. Updates are effectively not marked as complete until they have actually all gotten to disk. This ensures that after a reboot, the system is consistent, perhaps with the disk state as it was some seconds earlier. The Soft Updates technique is also much faster than journalling, which is your other option (reiserfs, ext3fs etc. in Linux).

    I said above that fscking is practically eliminated - in fact a fsck task still needs to run to recover sectors that are 'dirty' but the system is stable without it - critically the system boots up without it, and in the background at some point when the system finds time to do so it recovers the sectors marked 'dirty'; the soft update people call this a "background fsck".

    Note that this won't stop loss of data - but then nothing will stop loss of data. fsck certainly won't even if it is run properly, because that's not what it does. What it does do is ensure the filesystem metadata is always consistent (i.e. whether a file has been created/deleted, contents of directories etc).

    More details on soft updates can be found in the OpenBSD FAQ [openbsd.org] and also in the FreeBSD handbook [freebsd.org].

    If you want to get the same kind of disk flushing that you get with DOS, then you can only really do that with a single-tasking operating system (if that's not a contradiction in terms!) which can therefore ensure a minimum of delay between the application generating data and it being flushed to disk. Note this is never perfect, but can be close enough that you'd only notice one in a million power-offs.

  • by aminorex ( 141494 ) on Saturday August 24, 2002 @04:00PM (#4133956) Homepage Journal
    eliminate the disk entirely, silly person.
    boot from flash or from a CD. if you really
    need to store more data than you can keep in
    flash between power-cycles, then use CDRs.
    when one fills up, eject it, and they can
    pop in a new one. *bam* instant permanent
    audit trail, in a compact format.

  • although i can't imagine why, if you *insist* on
    using a hard-drive, don't use a file system.
    make a tiny read-only root partition, and a big
    fat block of raw disk. do your reads and writes
    to the raw device.
  • How about replacing fsck with a wrapper like, say:

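    #!/bin/sh
    # Assumes the real binary was first moved aside to /sbin/fsck.real
    # (a hypothetical name); a wrapper named fsck that calls plain "fsck"
    # would just invoke itself.
    /sbin/fsck.real -y /dev/hda??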

    Regards,
    Cengiz Akinli
    Netmar, Inc. - Expert webhosting since 1994
    http://netmar.com/ [netmar.com]

  • I run an internet cafe/gaming center running linux, and have had great luck with xfs. Sometimes kids kick the power button, or the nvidia drivers lock up the box, and so far, my xfs filesystems have come up without fail. When I had ext3, it always did an fsck (even with journaling on) - this took 20 minutes with my 40GB drives.

    Another option would be to use knoppix linux or another cd based linux - then send any data over the network to a central, ups'ed server. Or, boot up via network (from the same central server). Either would let you hit the power button all day long.

    Hope this helps,

    Greg
  • I have been using EXT3 since Linux Kernel 2.4.12, a couple revisions before it got folded in...
    It has served me well since, without a hitch. My server goes down probably once or twice a month due to power outages (yes, I have an APC, but let's just say it's a little underpowered and I am strapped for cash). The server goes down hard and ugly when this happens. It always comes back without issue, however, and quickly! A normal boot after a proper shutdown is about 2 minutes (it starts lots of services)... when in ext3 recovery it only takes about 20 seconds extra.
  • could just run the bulk of the system as read-only and mount your data partition as dos or vfat.... nice and persistent.

    OR just run the whole darned thing as UMSDOS.
