Building A Stable NFS Box?
parker9 asks: "I do a lot of numerical computations - typical runs take one or two days (occasionally a week) on dual Alpha processors. These boxes (running either Tru64 or Linux) are pretty stripped down - just dual processors, 2 GB RAM, and tiny hard drives. All the results are dumped via NFS to a simple PC running Linux. We're talking GBs of data in 24 hours. Works fine, since backups are a snap. The problem, however, is that this user disk has been hosed three times since November. Not sure what's going on, since the latest combination of hardware looked pretty good (SCSI Seagate HDs). So, please advise me on what hardware to put together for a stable (24/7) NFS server: motherboard, processor, SCSI card, network card, power supply and HD. Is there a flavor of Linux which would be optimal for this type of application?"
RAID (Score:1)
A good linux distro... (Score:1)
FreeBSD (Score:2)
There are lots of tradeoffs that the BSDs and Linux make differently, but *BSD is unquestionably superior to Linux in terms of NFS serving.
A match in the distro flame war. (Score:2)
Well, for my money (luckily none is required), I would go with Debian [debian.org]. It isn't the most 'up-to-date' distribution of Linux, but I have found it to be among the most stable (probably _because_ it isn't as up-to-date).
It sounds like parker9 [mailto] is running a 'no frills' setup - probably the best thing in his/her position. NFS has been around for ages, and Debian [debian.org] will handle it well.
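For a 'no frills' setup, the server side really is tiny. A sketch of what the exports file might look like - the path and client hostnames here are made up, so substitute your own:

```shell
# /etc/exports -- example only; /export/results and the alpha*
# hostnames are placeholders for parker9's own machines
/export/results  alpha1(rw,no_root_squash)  alpha2(rw,no_root_squash)

# tell the NFS daemons to re-read the exports list
exportfs -ra
```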
The other solution mentioned is the one I would go with: RAID. The redundancy is the key here - speed really shouldn't be an issue - any hard drive is going to be faster than the data stream, unless parker9 [mailto] is using some networking hardware he/she didn't mention.
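If hardware RAID is out of budget, a software mirror under Linux is only a few commands. A sketch, assuming the md driver and two spare SCSI partitions - the device names and mount point are invented:

```shell
# Sketch: mirror two SCSI partitions with Linux software RAID (md).
# /dev/sda1, /dev/sdb1 and /export/results are assumptions.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

# put a filesystem on the mirror and mount it for the NFS export
mke2fs /dev/md0
mount /dev/md0 /export/results
```

Either half of the mirror can then die without hosing the user disk, which is exactly the failure mode parker9 keeps hitting.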
-Percival
Wait... (Score:3)
If it's a physical disk error, were the drives mounted properly, with enough space between them and enough ventilation so they wouldn't overheat? Was the machine moved a lot, or did the disks vibrate too much? Improperly-installed disks have very short lives.
If it was an ext2 thing, try a journaling filesystem like reiserfs (does it work fully and reliably yet?) or turn off async writes in ext2 for a little added security and a little less speed (use the "sync" option in mount(8).) Either way, do it on (_hardware_) RAID to minimize data loss further.
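Turning off async writes is a one-line change. A sketch, assuming the data lives on /dev/sda1 mounted at /export (both names invented):

```shell
# /etc/fstab -- the "sync" option trades write speed for safety
/dev/sda1  /export  ext2  defaults,sync  0  2

# or flip it on a live system without rebooting
mount -o remount,sync /export
```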
Thankfully, you already back things up. I've had people complain to me that "backups are too expensive", to which I reply, "then you don't value your time or your data" (or "not backing up is like fscking a hooker without a condom.")
Good luck.
- A.P.
--
"One World, one Web, one Program" - Microsoft promotional ad
I prefer IBM Harddrives (Score:1)
older might be better (Score:4)
If I was setting out to build a Linux NFS server for maximum reliability, I'd put down a Tyan Tomcat4 mobo, with a PC Power and Cooling power supply (just for it; put the drives on a different supply), Kingston RAM, etc. That's an HX chipset Pentium motherboard and I have 4 of 'em (the dual CPU model), one of which has been running 24/7 for nearly 3 years with 0 trouble. You don't need a great deal of CPU power for this anyway, and I have yet to see anything newer that's anything like as stable.
The 3c509 10Mbit ethernet cards are the most stable NICs I've used under Linux. The machine mentioned above is partly a router and has 5 of 'em; barring lightning they just work.
In 100Mbit, I like the 3c905B cards. Those are hard to find anymore; the 3c905C has replaced 'em, and while the hardware may be as good, the drivers are not quite there yet... I speak as one who has tried all 5 of the current drivers for that card, and ain't happy about that. Every Tulip-based 100Mbit NIC I own seems to have one reliability problem or another after a few days or a few GB of traffic, so I've long since given up on putting them in servers. Haven't heard good things about Intel NICs, and thus haven't tried any seriously.
The previous poster's comment about SCSI RAID is a good one; I've been told that reliability isn't even a design consideration for any but the high-end SCSI drives, and hasn't been for several years. I buy that from examination of several dead drives: the cheaper SCSIs were identical to the IDE version but for the circuit board, while the "server" drives were obviously better designed and engineered.
One gotcha that got me on hard drives was cooling; the warmer drives in my stable had the highest error rates, IDE or SCSI. If you've got big 10k RPM drives they'll want fans all their own. Once cooled, mine have had very few problems regardless of brand or interface, with one exception... (warning, rant follows)
I've got WDs, Seagates, Quantums, IBMs, even a few Fujitsus, but up 'til recently I was very loyal to Maxtor. I've got 10-year-old Maxtor drives running error-free today, but the last reliable Maxtor I have is a 17GB IDE from back when that was the biggest they made... 2 years ago? Since then, I've had a better than 25% DOA rate, over 50% dead-after-a-few-days rate, and not one single drive that doesn't drop a bit here and there. Maxtor used to be rock solid, but they've apparently decided that all that quality and customer satisfaction wasn't worth the trouble.
I've had the least trouble from my IBMs, but they're not really old enough for me to really respect 'em yet. I've burnt up a couple of Quantum drives (IDE and SCSI), but have others of the same model, age, and running the same loads still alive. Seagates seem to have a shorter lifespan on average than most of the other brands I've tried, I've been avoiding them for several years. One point in their favor; the drives seem to give some warning before they die: suddenly becoming much noisier, and/or vibrating badly, or running hot... I had a Seagate 545MB IDE that died suddenly, all the rest gave enough warning to be replaced before failing. I avoid Western Digital 'cuz their stuff looks and feels cheap, and doesn't play well with other brands.
The only thing left for your server is a SCSI controller. I've had good luck with Adaptec 2940UW boards, and some setup headaches with the UWPro and 29160, but no reliability complaints about any of them. Your application probably justifies hardware RAID, which I know little about.
Oh, yeah, video cards. I don't think you'd care a lot about video in this application, but I'd put in a Matrox card; I've had good luck with 'em. I have in my pile of rarely-used hardware a Diamond Stealth 500 (Riva 128 chipset, I think) VGA card that locks up any machine it's in after about a week running. Never have figured out why, but then I've never really tried that hard to diagnose it. I'm sure it's the video card though, as it happens on every motherboard from 486en to P2s I've tried it on, most of which are rock solid otherwise.
... that's a lot more than I meant to say when I hit the reply button, but perhaps it'll be of some use. I don't mean to imply special expertise; this is just the results of my own trip down the learning curve.
Re:A good linux distro... (Score:1)
Re:older might be better (Score:1)
Don't get a video card, get a PC Weasel 2000 [realweasel.com].
It works like a (text-only) video card, but outputs to a serial line, so you can use a regular PC like a real server.
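Even without the Weasel, Linux can run headless over a plain serial port. A sketch of the lilo and inittab bits (the port, baud rate, and runlevels are assumptions for a typical setup):

```shell
# /etc/lilo.conf -- point the boot loader and kernel console
# at the first serial port (COM1) instead of the VGA card
serial=0,9600n8
append="console=ttyS0,9600"

# /etc/inittab -- spawn a login on the same port so you can
# administer the box from a terminal or another machine
S0:2345:respawn:/sbin/getty -L ttyS0 9600 vt100
```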
Of course, my vote would probably go for FreeBSD if you insist on a PC for NFS-serving, but other things (Irix and/or Solaris) might be better.
NFS Boxes (Score:2)
However, why are you using a Linux box at all? If you want a box that just holds tons of data for an NFS share, you probably want to look into solutions that are designed to do that. I do consulting for a major web hosting provider that has a 150GB array hanging off a NetApp [netapp.com] 760. I'm not going to say use that particular box, but there are quite a few storage + ethernet interface solutions out there that are designed for high availability.
Just make sure you can keep some hot spares in the array, and that whichever method you do choose has backup solutions that work for you.
Stable NFS, go specialized (Score:1)