
RAMdisk RAID?

drew_92123 asks: "I've got a friend who does a LOT of video editing but is on a limited budget. He is currently using a RAID array and while quite large, it's not as fast as he had hoped. I had an idea and wanted to know if there is a way to make it a reality, so of course I thought of all the brilliant minds here. I have about a dozen Pentium II computers with 1GB of RAM. I would like to upgrade them to 2GB, throw in some gigabit NICs, and create a 1.9GB RAMdisk on each one. Then I want to use one of the computers to RAID the RAMdisks together to be shared, most likely via Samba. They are all 1U systems with no HDDs, just a 64MB IDE flash disk. Any ideas out there?" Has anyone successfully put together such a system? How well did it work for you, and are there any caveats that you would like to share with others who would do the same?
  • Eh?? (Score:5, Informative)

    by foooo ( 634898 ) on Wednesday February 26, 2003 @06:15PM (#5390301) Journal
    Wouldn't the bottleneck of the NICs be an issue?

    You might just try reconfiguring the RAID to be RAID 0+1 (striped and mirrored). That would give you the redundancy and speedy access. If the RAM is already maxed out on the video workstation, it might be more cost effective to get a better motherboard that supports more RAM.

    ~george
    • What if all of the 'RamDiskStations' had 100 megabit NICs connected to a switch and the actual work machine had a gigabit NIC connected to the one gigabit uplink on the switch?

      If it was a 12+ port 100Mbit switch with a single gigabit port, he could get a peak theoretical sustained throughput of 125 megabytes per second. I am guessing he already has fast NICs in the machines, so it is a matter of upgrading the master box to a gigabit NIC (about $90 for an Intel card) and the switch with a single gigabit port on it. There is an Accelar 1050 on eBay, 12 10/100 ports and a gigabit uplink, for around $50 with 14 hours to go ...

      Note: not all NICs are created equal. Use SMC, Intel, 3Com or other quality brand-name NICs for the highest throughput (NE2000 cards and clones run about half as fast as good cards.)

      Of course once you do this you have a dozen 2G virtual drives but they are all different drive letters and you can't work on one big 24G file at once ... and for this to work you would need to find a way to spread the load across all the servers to get your peak throughput.

      I suggest attacking the issue differently. Look at some of the recent versions of the American Megatrends (AMI) MegaRAID card and couple it with a BUNCH of RAM configured 50/50 read cache / writeback cache and some fast hard drives. Used drives are wicked cheap now, and with 5 or 6 9G drives you could make a nice RAID 5 setup with a spare for when one craters. I think Dell OEMs these as their PERC cards, and I think they are up to PERC/4 now, but a PERC/3 is the one you might find used for a semi-reasonable price.

    • I thought the same. If gigabit ethernet were faster than SCSI or IDE, then why wouldn't we have NAS drives inside the computer?

      Mind you, some work has been done on switched busses for motherboard use, and perhaps, maybe, you could emulate something like that, but once again, going to the NIC+NET would seem to be a big bottleneck.
      • If gigabit ethernet were faster than SCSI or IDE, then why wouldn't we have NAS drives inside the computer?

        Because the overhead of IP processing is enormous compared to the overhead of scsi command processing.

        Raw disks produce 30-60 Mbyte/sec, and climbing: say up to 500Mbit/sec. I haven't seen any commercial NICs which can handle ram-to-ram transfers at this speed on desktop hardware. SCSI-160, nominal throughput about 1280Mbit/sec, can handle this data rate without breaking into a sweat. Overheads for IP are going to be large fractions of a millisecond per packet (and a packet is 1.5 kbyte, or 9 kbyte if you have jumbo packets). SCSI overheads are going to be tens of microseconds per transfer (probably per frame for video). SCSI just wins hands down - and IDE still beats the pants off NAS.

        The only reason for NAS - and it is a good reason - is sharing.

    • Re:Eh?? (Score:2, Interesting)

      by mkldev ( 219128 )
      Wouldn't the bottleneck of the NICs be an issue?

      Not as big as the bottleneck of the CPU. I don't know what type of video editing this person is doing, but odds are about 10:1 it's DV. I actually worked through the math on a video editing mailing list once and showed the fallacy of assuming that faster disk performance made any real difference. I'm actually rather surprised nobody else has already mentioned the logical flaw, given that this is slashdot, after all. :-)

      Amdahl's law tells us that speeding up one portion of an operation yields an overall speedup proportional to the fraction of the total time that portion took, and that the maximum overall speedup can never exceed the percentage of the total time the sped-up portion used originally.

      Say you have an operation that takes ten seconds. It is divided into two parts, one of which takes 1 second, one of which takes 9. If you speed up the 1-second part by a factor of two, you have only knocked a meager 5% off the overall time of the operation.
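
      To sanity-check that arithmetic, here is a quick throwaway calculation (hypothetical numbers only: a 10-second job split into 1 s of disk I/O and 9 s of CPU work, with the I/O half doubled in speed):

          # Hypothetical split: 1 s I/O + 9 s CPU; halve only the I/O time.
          awk 'BEGIN { old = 1 + 9; new = 0.5 + 9;
                       printf "overall speedup: %.2fx, time saved: %.0f%%\n", old/new, 100*(old-new)/old }'
          # prints: overall speedup: 1.05x, time saved: 5%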

      In video editing, the amount of time taken for reading a block off a disk is totally insignificant compared to the amount of time needed to process the frame, to such an extent that the disk access time generally vanishes below the precision of most people's calculations.

      Basically, the time to seek to a given block was something like one sixth of a percent of the total time needed to decompress a single DV frame with a 1GHz CPU. Even if you cut your seek overhead by a factor of a thousand, you couldn't cut your time by even one percent.

      Worse, if the code is written correctly, it is possible --indeed trivial, given the well-defined access patterns involved in video editing -- to prefetch the blocks of data before the processor needs them, making the speedup absolutely zero, regardless of the speed of the disk.

      Long story short, while it's a cool idea, it won't have any noticeable benefit. Now if your friend were doing -audio- editing with multiple channels, where it is actually possible (even easy) to exceed the speed of a single 5400 RPM hard drive, that would be a different story. But at least for DV editing, there is no benefit of even moving from a single 5400 RPM drive to a 7200 RPM drive apart from having a greater safety margin to avoid the risk of dropouts when capturing. The benefits of moving to a RAID are even less, and to a huge ramdisk, thus, less still.
      • Perfect presentation; I wish I had karma to give. If anything, faster RAM or a CPU upgrade might make a difference.
      • That may be so; however, when I'm encoding some things my CPU regularly sits there not at 100%, so it is clearly waiting for something.

        If I'm, say, adding a small logo to the bottom right-hand corner of a video of some description and writing it out to disk, the operation completes in LESS time if I'm compressing the output too; but if I'm working with uncompressed footage (as I generally do until the last stage of processing), then writing it out to disk is by far the slowest part of most operations.
  • Do some tests first (Score:5, Informative)

    by HotNeedleOfInquiry ( 598897 ) on Wednesday February 26, 2003 @06:16PM (#5390309)
    Before you buy a bunch of hardware, set up one ramdisk with a network link and find out what your real-life transfer bandwidth is. I'll bet that the gain, if any, would not be worth the effort.
    • I completely second this. The costs of the hardware will not be worth it. I suspect that you would do a lot better adding disks to the array, creating another array, or upgrading the RAM on the main machine. Although it sounds like a cool idea in concept, this is not a good idea if you're doing anything but playing around with the tech. It does sound like a cool project, though.
  • what? (Score:2, Insightful)

    by Anonymous Coward
    you want to fragment all filesystem access into super-tiny 1500-byte ethernet packet accesses to each of several hosts, for a mere 20GB RAM "filesystem"?

    gig E does not perform that well on a single host. You might get lower latency than a cheap RAID array, but the throughput won't be any better.

    why waste time with "RAID" anyway? this RAM is all ECC RAM and is non-persistent, so there's no point in tossing that extra computation into the mix to make it worse.
    • Re:what? (Score:3, Funny)

      by otuz ( 85014 )
      He needs the RAID part for when one of these machines crashes.
    • I'm completely unsure if this is possible, but could you set up the PII boxes as iSCSI servers, and connect to them that way? I think that would offer better efficiency than Samba, and someone who knows could tell you if any sort of RAID is possible with iSCSI.
  • Power outage (Score:3, Informative)

    by CounterZer0 ( 199086 ) on Wednesday February 26, 2003 @06:19PM (#5390359) Homepage
    Just remember, he's SCREWED if the power goes out and he hasn't flushed that /huge/ RAMdisk to a real disk.
    • Re:Power outage (Score:4, Insightful)

      by coryboehne ( 244614 ) on Wednesday February 26, 2003 @07:01PM (#5390770)
      Ahh, come on... Anyone who is crazy enough to want to set up a lan-based-ram-raid-array (Whew...) is certainly going to be sane enough to set up some sort of serious power backup.

      Of course, just to repeat and be redundant, the bottleneck will simply come down to the NIC; a possible solution would be to install several NICs and spread the load out (see the bonding sketch at the end of this comment)... But this person would be much better off simply buying a system that will either (a) handle 24 GB of RAM, or (b) have fast enough disks to please him...

      Of course a better idea might be to buy a system that would support both (a) and (b) and then set the system up to use the ram as a primary holding area for data, then flush to the (now faster) disk array...

      Either way this person is still affected by "The Sickness" that we all seem to suffer from (More, bigger, better, faster....)
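
      For the multiple-NIC idea, here is a minimal Linux channel-bonding sketch. It assumes the bonding driver is available and that eth1/eth2 are the hypothetical spare interfaces; the address is made up:

          # Sketch only: aggregate two NICs into one round-robin logical link.
          modprobe bonding mode=0                          # mode 0 = round-robin
          ifconfig bond0 192.168.100.1 netmask 255.255.255.0 up
          ifenslave bond0 eth1 eth2                        # attach the physical NICs

      Even then, the switch has to cooperate, and a single SMB/TCP stream still can't exceed what the PCI bus and CPU can push, so measure before buying anything.
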
    • Maybe that's the point. If he runs a porn site he could wipe all the evidence by hitting a kill switch, but with RAID and a UPS he is protected from accidental data loss. He could also rig a software deadman switch which shuts down the systems unless it receives a signal every so often.
  • Also... (Score:5, Insightful)

    by foooo ( 634898 ) on Wednesday February 26, 2003 @06:22PM (#5390382) Journal
    Ever thought of just upgrading your SCSI controller? You can get RAID controllers that have insane amounts of RAM in them. That might patch up any access issues.

    If your concern is extended-duration throughput, then multiple rack computers with RAM *might* be an option, but most normal users wouldn't consider it due to the latency involved with going through the southbridge, then the NIC, then the NIC, then the southbridge (of the other computer), then the northbridge. And that's just a one-way trip.

    Just don't shell out a bunch of money before you do a proof of concept.

    ~george
    • If he wanted to do this with Macs instead, the newer PowerMacs use a single System Controller within which both the ethernet and memory subsystems lie. This gives you one-step access from the ethernet card to the memory, which could improve your communication latency. Also, the new PowerMacs handle GigE, support up to 2 GB of RAM, and use DDR (266 or 333, depending on model), making for a (hopefully) better situation.

      Of course, you are still suffering from practical network limitations, protocol limitations, and more. Making a distributed file system over a network of sufficient speed doesn't sound like a bad idea, it just seems to need a few things.
  • First reactions. (Score:5, Informative)

    by FreeLinux ( 555387 ) on Wednesday February 26, 2003 @06:24PM (#5390397)
    At first glance, this sounds like an incredible waste of time. RAID RAMDisk? Why? Are you crazy? What's the point?

    But, if you give it some thought, it is an interesting idea. Basically you are trying to build a clustered RAM disk.

    There is, however, a major drawback to this idea. The whole advantage of a RAM disk is speed/performance. Locally, the RAM disk is MUCH faster than a normal disk drive. But the problem arises when you connect your "RAID RAM disk". You must network the machines in order for them to communicate with each other, and suddenly your performance has dropped to nothing. In fact it is below the performance of a normal disk drive.

    In order for your RAID RAM disk to perform equally with a good disk drive you would require a switched gigabit network between your nodes. This will cost more than the "normal" disk. Additionally, even with a switched gigabit network the performance is highly unlikely to exceed the performance of highend disk drives.

    So, when you get right down to it, the RAID RAM disk is an interesting idea, just to see if you can do it. But, there isn't really any advantage to it.
    • Sorry, but I don't think so.
      RAM -> RAM across a network (assuming at LEAST 100mbit ethernet) will be FASTER than accessing a RAID of local disks. It's all memory-to-memory transfer at that point - no spin up, no seek time. The disks may get close for a very long sequential write/read, where the multiple drives can actually come close to using the bandwidth available via the RAID controller.
      But for random access...no way. RAM 'seek time' is measured in NANOSECONDS, while even the fastest drive is in milliseconds! RAM is over 1000 times faster!
      • You are simply thinking in ideal numbers. Just because Linux says your ping is 1ms doesn't mean your seek times will really be 10 times better than a hard disk's. Just because the NIC says it's 1000mb/s doesn't mean it's 5 times as fast in transfer rates. Obviously you'd have to test all of this to know for sure, but I'd be willing to bet that any RAID card worth anything will equal the speed of this idea. But this argument is trivial because of all the other negatives of running this (power costs, maintenance costs, problems when a node goes down, etc.).
      • 12.5 megabytes/second max throughput on 100mbit ethernet, and that's not counting lag time, packet size, etc...

        RAM 'seek time' may be nanoseconds, but that does not mean it will transfer across the network that quickly.

        but i'm not the first one to say this... so...
      • Re:First reactions. (Score:5, Informative)

        by Harik ( 4023 ) <Harik@chaos.ao.net> on Wednesday February 26, 2003 @09:26PM (#5391932)
        Sayeth CounterZer0:
        Sorry, but I don't think so. RAM -> RAM across a network (assuming at LEAST 100mbit ethernet) will be FASTER than accessing a RAID of local disks. It's all memory-to-memory transfer at that point - no spin up, no seek time. The disks may get close for a very long sequential write/read, where the multiple drives can actually come close to using the bandwidth available via the RAID controller.

        I, however, beg to differ.

        harik@taz:~$ ping -s 1492 192.168.100.99
        PING 192.168.100.99 (192.168.100.99) 1492(1520) bytes of data.
        1500 bytes from 192.168.100.99: icmp_seq=1 ttl=64 time=2.80 ms
        1500 bytes from 192.168.100.99: icmp_seq=2 ttl=64 time=2.77 ms
        1500 bytes from 192.168.100.99: icmp_seq=3 ttl=64 time=2.77 ms
        1500 bytes from 192.168.100.99: icmp_seq=4 ttl=64 time=2.77 ms
        1500 bytes from 192.168.100.99: icmp_seq=5 ttl=64 time=2.77 ms
        This is two machines sitting side by side on a separate, completely unloaded switch. Don't just go by the 500ns ping time, you actually have to transfer data. You're talking at LEAST 3ms PER BLOCK... and that's with some insanely optimized code.

        Now, for video editing 99% of the effort is linear (unless you are horribly fragmented) so you're talking ONE 6ms seek ONCE then thousands upon thousands of linear reads.

        Secondly, his "raid array" sucks if the performance is bad. I buy low-end LSI Express 500s (Ultra 160 LVD) and they have stellar performance. For doing AV, this is my recommendation:

        Buy a multi-channel Ultra160 or Ultra320 SCSI RAID controller (160s are pretty cheap now that 320s are on the market). Load it up with 5 large drives. Set the stripe size to the maximum. Buy a cheaper IDE RAID and set it in mode 15 (mirror two RAID5 arrays together; harder to lose data that way).

        Use the SCSI for your working set, and reformat it frequently (or at least delete all files) to defrag. Use RAID0, it's faster. Save your finished projects to the IDE raid, burn to DVD, DLT, whatever.

        It will _STILL_ be cheaper than putting 2 gig of RAM into a pile of boxes, AND faster. Single-channel Ultra-320 can hit you with up to 40 megaBYTES per second, all on a measly 5ms initial seek. (Remember, ALL the drives seek in parallel.) Putting drives on the second channel can wallop you with 80MB/second. You're talking around $1500 for the card, of course. But have you priced out a 1U server with 2 gig of RAM lately?

        • It will _STILL_ be cheaper than putting 2 gig of RAM into a pile of boxes, AND faster. Single-channel Ultra-320 can hit you with up to 40 megaBYTES per second, all on a measly 5ms initial seek. (Remember, ALL the drives seek in parallel.) Putting drives on the second channel can wallop you with 80MB/second. You're talking around $1500 for the card, of course. But have you priced out a 1U server with 2 gig of RAM lately?

          Surely you mean 320MB/sec on one channel, as SCSI is rated in MB/sec and not Mb/sec (like IEEE1394 and USB are). IDE is also rated in MB/sec.
        • Buy a cheaper IDE RAID and set it in mode 15 (Mirror two RAID5 arrays together, harder to lose data that way.)
          Raid 15? Why? Unless you're being overly paranoid, you're far better off with RAID 10 (mirror then stripe) or at least 0+1 (stripe then mirror) as you'll get redundancy with the mirroring and performance with the striping.

          RAID 15 will use up more disks (extra disk for parity) and will also have lower performance (due to the parity calculations).

          However, the rest of the post is sound; use a fast RAID-0 array for working sets (fast, but no redundancy; worst case, you lose your current run of data) before copying to redundant (e.g. RAID 5, 10, 0+1) storage or copying to tape.

        • Actually, if you read the initial article, he says he already has the stack of 1U servers; all he has to do is add the RAM, which is much cheaper than what you're suggesting.


          not to mention that most slashdot articles are from people on a budget

        • why do raid 1 + 5 when you can just put more spares in the raid 5?

          • why do raid 1 + 5 when you can just put more spares in the raid 5?

            Because adding spares doesn't buy you more redundancy. RAID-5 error correction only expands the data to N+1. If you put in extra drives, they are "hot spares", not additional redundancy. Disks are getting pretty cheesy lately (especially for someone "on a budget") and a multi-disk failure isn't unheard of.

            "raid 6", whatever that is and whenever it becomes a common standard expands the data to N+X, so you have to have X+1 drive failures before losing data.

            Also, raid 15 is a bad idea. Raid 51 has a LOT more redundancy (raid5 made up of individually raid1'd disks.) Your odds of losing the 4 drives needed to take it down are much less then if you use 15.

      • Spin up and seek time are irrelevant for video. Unless your filesystem is completely braindead it will almost always be a very long sequential read/write.

        Besides, it's not going RAM -> RAM, it's going RAM -> northbridge -> southbridge -> PCI bus -> NIC -> switch -> NIC -> PCI bus -> southbridge -> northbridge -> RAM, and being broken up into little packets by TCP/IP and reassembled and so on. Every one of those steps introduces latency, which potentially interrupts the video stream.

        Additionally, a 100Mbit network can only transfer 12MBps (more like 8-10MBps in the real world), and a single 120GXP Deskstar can sustain 4 times that. A 4 drive RAID-0 IDE array should easily be able to outperform the 1000Mbit network proposed here.

    • by mosch ( 204 )
      I had a slightly different first reaction. My reaction, after a little thought, was to conclude that the poster is either a troll or a fucking retard.

      Seriously, I have a hard time imagining how you could design a storage system with a worse unnecessary cost/MB or lower reliability.

  • by baka_boy ( 171146 ) <lennon.day-reynolds@com> on Wednesday February 26, 2003 @06:28PM (#5390433) Homepage
    Assuming that you'd have to buy at least some of the Gb networking hardware (switches, cables, etc.), you're really not going to be saving much. Assuming at least $100 per 'RAMdisk server', you'll be spending $1200+ for a ~20GB RAID array that will lose everything the minute the power blinks, not to mention drawing several kilowatts of AC.

    On the other hand, if you just throw four 100GB ATA-100 drives in a standard tower case with a decent IDE RAID controller, you get five times as much storage for probably about half the money.

    Also, remember that most low-to-mid-range PCs can't actually fully take advantage of a gigabit network link, since the PCI bus and CPU get saturated long before the network does.
    • [FlameSuit On] I'm sure you all will flame me for this but, I can take it. [/FlameSuit Off]

      It is very likely that he is already using IDE or ATA disks, and that is part of his problem. When large amounts of data need to be transferred quickly, SCSI is what you need. There is nothing faster than 15,000 RPM SCSI drives connected to good RAID controllers that have large amounts of cache RAM. Nothing.

      If you want high performance then you must use high performance gear. Yes, it does cost 5 to 10 times more than the IDE RAID solution but, there is a VERY good reason for that.

      Ok, now come the flames from the know-it-all masses whose experience is limited to home PCs and no-traffic webservers.
      • Notice the original restriction in the posting, though: on a limited budget. SCSI RAID really isn't the first solution I'd offer for that problem.

        One thing I didn't think of in my previous post, though, was the XServe RAID units that Apple released recently...$6k gets you a 720GB Fibre-Channel RAID array in a 3U enclosure. Not bad at all, really.
        • Yeah, like buying oodles of RAM and GigE network gear (good switches for what he wants cost a fuckton) is going to be any less expensive than some good SCSI drives.

          What might be a good compromise is the new WD SerialATA 10KRPM drives due to ship later this month. On a good RAID controller they should give you a bit of a speed boost for the $$, and the individual channels burst at 1.5Gb/s which is a bit faster than the GigE interconnect on a ramdisk would give you anyway.

          Apple's XServe solution, while it has a FibreChannel interconnect, is a 7200 RPM ATA RAID solution inside the box, so that's going to be a per-client speed hit, and for what he's doing could be accomplished at the same performance level for much cheaper with an internal local ATA RAID controller (which is what he's already doing) - probably individually cheaper than a decent FibreChannel host adaptor!

          [Source: Apple XServe RAID Technology Overview [apple.com] ]

          All in all, I agree with the grandparent post that 15K SCSI drives with a good 64-bit RAID controller with oodles of RAM on a nice motherboard will probably SMOKE anything you could get with some kind of bizarre frankenstein network-RAID-ramdisk, though the frankenstein network-RAID-ramdisk would be a fun hack to pull just for the hell of it.
      • Well, my experience is limited to customer service and repair of high end professional video servers, and what you're suggesting is overkill.

        I agree that there is nothing faster than what you describe, but when it comes down to it there's no reason a single user editing station would need more than about 70Mbps, which a 4 drive IDE RAID-0 should be able to sustain.

        I'd still recommend SCSI, but 10k or even 7200RPM drives should be able to handle the load easily, and be fairly affordable. In my experience a RAID-0 array of 7200RPM drives can handle about 24Mbps per drive (simultaneous record and playback of 12Mbps per drive is how I test them).

  • by pjcreath ( 513472 ) on Wednesday February 26, 2003 @06:29PM (#5390437)
    Um, aren't the network latency and bandwidth constraints going to obliterate any benefit you get from using RAM disks?
    • That's why he's thinking of using Gigabit Ethernet. That should be faster than the transfer rate of typical disks, and the latency of waiting for your sector to come around is way more than any network delay.

      It does seem like a waste of hardware, though. If you are going to drop some money into buying RAM for these systems, I would think it would be way better to figure out a way to attach that RAM locally. Somebody must make something like a PCI or Fast SCSI RAMdisk card that takes the cheapest memory modules. This would be likely to have some sort of battery built in to make it non-volatile as well.

  • by KurdtX ( 207196 ) on Wednesday February 26, 2003 @06:48PM (#5390633)
    How about a be- *bang!*

    *smack* *thump*

    *mass cheering*

    Btw, it does seem to be a (disturbing) recent trend at Slashdot to try to troll whole stories, instead of just trolling comments. C'mon, anyone who's taken even one networking or hardware class knows the speed hierarchy:

    cache > memory > disk > network

    And, with the amount of physical RAM drives out there (very few), you'd quickly realize that even a local RAM drive doesn't offer enough of a speed benefit to offset its cost. C'mon editors, I know it sounds cool, but do you really have to post it?
    • Change that to: cache > memory > network > disk

      Show me a disk that is faster than a gigabit ethernet. And if you tell me disks in a RAID are faster, I'll tell you it's possible to use several network connections in parallel.

      The networked ramdisks could be faster, but it doesn't fit the limited budget he wanted.
      • by Anonymous Coward
        RAID disks and multiple network connections in parallel are equally fast. The PCI bus maxes out around 4 disks or the equivalent in network cards. Who cares how fast it's getting blasted to the machine if you can't do any processing on it.
      • Show me a disk that is faster than a gigabit ethernet

        I'll be happy to show you a gigabit ethernet that is slower than a disk...

        Gigabit ethernet really doesn't live up to its promise much, mostly because it is still ethernet, which fragments everything into 1500-byte packets. Either you lose network speed, or you saturate one CPU on each side of the link doing nothing but feeding/reading the networking card. Seems to me to be a relatively stupid way of using your new spiffy multiprocessor machine.

        On the other hand, yes, there exist networking solutions that are clearly faster than any disk you can buy.

      • I've done a bit of testing on a server that has both IDE and SCSI drives in it AND is attached to a NAS.

        The NAS wins, hands down, for random access stuff, and is about as fast for linear stuff. The only thing it can't compete on is running postgresql, where having the disks local seems to be much faster. Plus I get nervous running a database across an NFS mount.

        FYI, the SCSI drives offer linear read speeds of about 25 Megs a second each, the IDEs offer about 10Megs a second, and the NAS is connected via switched 100BaseTx FD, so it's only got 10 Megs a second to work with. I'm betting it's faster because of its internal caching and what not.

        It has gig connections, we just can't justify the increased cost on an intranet server that already has sub second response time.
  • by doozer ( 7822 ) on Wednesday February 26, 2003 @07:00PM (#5390760) Homepage
    Though what you're suggesting would work, and I've done similar things before,
    think of the additional costs:

    12GB of RAM
    12 gigabit NICs
    1 >12-port gigabit ethernet switch
    Setup time

    For what you're looking at spending, it may be the same cost as buying some U320
    scsi disks and some sort of SCSI raid card.

  • Network Block Device (Score:2, Informative)

    by Halvard ( 102061 )

    Sounds like nbd [sourceforge.net] may be your ticket if you are using Linux. nbd is designed to take a block device, like a hard drive, and make it available over a network on a different host. It will also do RAID 0, 1, and 5. Perhaps it will work with a ramdisk. I can't swear that this will work, but it sure might, since after all, a ramdisk is implemented as a block device.

    RAM is cheap. If you are unconcerned about high electricity costs and need a large *F*A*S*T* device for storage, striping a number of ramdisks could be the thing to do. PC133 1GB DIMMs are currently about US$200 [tigerdirect.com] and are on their way down. Sure, it's expensive compared to RAID 5, but I'm sure it's a lot faster. Just make sure you write out anything you need prior to downing the whole array.
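
    A rough sketch of the export side of that, assuming Linux on each 1U box, the stock ramdisk driver (booted with something like ramdisk_size=1900000 so /dev/ram0 is ~1.9GB), and the classic nbd-server; the port and device are illustrative:

        # On each RAM-disk box (sketch only):
        nbd-server 2000 /dev/ram0      # serve the ramdisk as a block device on TCP port 2000

    The matching client side (nbd-client plus software RAID striping) is sketched in a comment further down in this discussion.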

  • I am not sure if you thought of it, but with that many systems running, and if you are using mode 0 (striping), a crash on any machine would cause the whole array to be unusable. A simple reboot would force 100% data loss.

  • Closer is better (Score:4, Informative)

    by Anonymous Coward on Wednesday February 26, 2003 @07:07PM (#5390831)
    Buy the RAM and use it with a few of these solid state disks [cenatek.com]. 4GB per PCI slot. But don't be disappointed if it still isn't as fast as you want it to be: The disks are probably not the bottleneck. I'd be surprised if a properly configured RAID array couldn't deliver adequate performance for video editing. Even single disks are fast enough to work with uncompressed video these days.
    • Holy crap - those are NICE! If only there was a way to merge a bunch of these cards into one big (massive) drive or to RAID 0 them into a big drive ...

      I hope all that power is used for good, not evil.
    • At $1800/card + $550 for the memory, I bet he could do a lot better with a semi-top of the line raid setup. DV editing doesn't require nanosecond seek time that memory provides.
      • They have a card with no memory (the Rocket DL) for $400, you provide the memory.

        So for a 4G unit it would be about $950 total.

        Expensive yes but if you could increase the overall performance of a top of the line system 20% for a grand ... not out of the question for someone that would spend $5k on a system to begin with. The difference in cost between a P4/2.6GHz and the new P4/3.2GHz (with HT) is about a grand, if a 2.6GHz configured with a Rocket Drive was faster than a 3.2GHz without, that would make sense.

        Granted it is overkill in the context of the current discussion, and granted this wouldn't be the best application for demonstrating performance gains, but I still would love one :)
        • If nothing else, one of those RAM drives would be an AWESOME place to put a swap file/partition! A server motherboard with 4Gigs, one of those for virtual memory would be quite the insane combo...
          • I would also configure a chunk of it for temporary files: the Internet Explorer cache, working space for SQL cursors and temp files, scratch space for WinZip files, etc ... Granted it would be a $1000 upgrade to your system, but you could easily(!) migrate it when you bought a new machine. I have played a little with RAMdrives under Win2k using Superspeed's RAM drive warez (30 day trial) for exactly the same kinds of things, and although it takes a little work to reconfigure your system to fully take advantage of it, once you get it working (esp. putting the browser's cache on a RAM drive) it makes a big difference.

            Along the original poster's idea though, I occasionally revisit the idea of putting two machines next to each other, connected by a crossover cable between their Gigabit networking cards. One machine would do all the processing and the other machine would simply be a virtual file server with a massive RAMDrive - maybe 4G - to store and serve files. This idea of course not necessary with the RocketDrive but before I knew about that ... My idea was that the files would come at ultrafast speeds, 125MB/s peak theoretical sustained throughput being faster than anything I could hope for from regular drives, much lower seek times also. Additionally the CPU would be freed up to do work instead of using a chunk of it to manage disk I/O (IDE drives used to take a big chunk of CPU during massive throughput, not so much now.) Of course when the power went out I was going to be screwed, but scientists like to look at the big picture :)

            I still may do that, the barrier to entry being having two machines with gigabit NICs in them, and four 1G sticks of RAM to put in one of them (or at least 1G just to play with the idea.) Oh yea, and buying a full version of SuperSpeed for ?? under a hundred bucks.

            www.superspeed.com IIRC
  • whilst the RAM and NICs will not be overly expensive, the real cost will be in the 12-port gig switch. Doing some large-scale networking recently I've had occasion to handle a few from a couple of suppliers, and the decision we had to make came down to whether we really wanted gig or could live with 100Mbit and 3 or 4 extra servers.
    All things being equal I'd put the value of the switch higher than the entire setup as it presently exists...
  • Dumb idea.

    The idea of a drive is persistent storage. Disk caching algorithms nowadays are excellent and normally surpass RAM drives, since in reality you are pretty much ALWAYS using a ramdrive. Not to mention the network issues brought up in other threads, and the simple fact that the main reason for RAID 5 is redundancy of storage to boost reliability, not just performance; by putting everything in RAM, there goes reliability. Sounds like what you want, if you are not as concerned with reliability, is a RAID-0 array, which implements striping but not parity.
  • ...about a dozen Pentium II computers...
    ...upgrade them to 2GB...

    Let's see, you want to buy 12 GB RAM, 12 gigabit NICs, and a gigabit switch to get a 24 GB logical volume?

    The cheapest gigabit NIC on pricewatch is $40. I'm not sure what type of ram a P-II takes, but everything on pricewatch is over or about $100/GB. So that's $140 per computer, times 12 computers is $1,680 for a 24 GB volume. (This is what you consider a 'limited budget'?)

    And you said your friend already has a raid array? I'm willing to bet that it's bigger than 24 GB, and since it's probably attached locally instead of through the network, a hell of a lot faster.

    For comparison, you can buy a 250 GB EIDE drive for $325. I'm sure you could put together a cheap computer with four of these drives for less than the $1600.
  • I have about a dozen Pentium II computers with 1GB of RAM. I would like to upgrade them to 2GB and throw in some gigabit NICs and create a 1.9GB RAMdisk on each one

    Are you on a limited budget, or are you going to spend a couple grand to upgrade all these ancient machines?

    Did you ever consider just taking the money and buying one really fast machine? :)
  • A poster above asked, "Why don't you just put all that RAM in his/her system, or, if the RAM is maxed out, buy a better motherboard."

    Problem is, there's only so much RAM a "normal" motherboard can refer to, before you have to start doing ugly (and expensive, slower) hacks. 2^32 = 4294967296 bytes, or precisely 4 gigabytes.

    After that, you're on your own.

    So, my question is: Why don't we have "RAM banks" that interface over SCSI, firewire, or even IDE?

    If it's firewire, it could even be external.

    A RAM chip isn't that large. Why isn't there a slick "drive" about the size of an iBook that holds 40-60 modules.

    The RAM itself, according to this week's memory prices, is $80 for 512 MB PC133 ECC, which means 5 gigabytes (more than enough probably for whatever work you want to do -- remember, this is volatile memory, so you can't store anything permanent on it!) is only $800 for the chips themselves. [sharkyextreme.com]

    I'm not sure I understand why PC800 is about the same price for 256 MB modules, but an order of magnitude more for 512 MB, while PC3200 (apparently the fastest of the lot?) is almost the same price as what I quoted for PC133.

    Anyone?
    • What a cool idea.

      Why can't i get an external firewire (or ide/scsi) box and just shove a bucketload of ram sticks in there and have an extremely fast external drive?

      Yum!
      • "extremely fast" -- understatement of the century! Try saturated frontside bus :).

        Just don't forget that it's volatile. Imagine this though:

        There's a battery involved, and whenever you disconnect from power it warns you, beep-beep, that it's going to lose power, indicating that for hours and hours and hours (hey! I need to be plugged in!). And get this: THEN, with the last hour of the battery, it powers up an IDE hard drive and dumps itself to the drive before finally dying!

        Whenever you next plug it in it says: Sorry, you'll have to wait until I load memory from the hard drive (AND get enough power back to dump to the hard drive again, in case you unplug me just as I finish doing so).

        Am I a genius or what, DiSKiLLeR?
    • It's called solid state drives. There's even one linked in the comments above [cenatek.com].

      Ohh yeah, and they're bloody expensive and don't really give a big boost. (Because you saturate the PCI bus.)
    • The machine I built last has PC4200 Dual Channel RAMBUS Memory.

      With the built-in Promise IDE RAID, I get >40MB/s throughput.
      That's over 100 frames/sec encoding DivX.

      Sounds like your bud's problem isn't in hard drive speed/throughput.
  • by bill_mcgonigle ( 4333 ) on Wednesday February 26, 2003 @08:19PM (#5391429) Homepage Journal
    This isn't a complete answer, for sure, but for linux RAID you need to present the RAM on computers A, B, and C to computer M (the mux) as block devices. You'll probably need to write a device driver for machine M that presents a block interface and speaks a UDP protocol to machines A, B, and C, where your server stores blocks on the local RAM disk according to whatever scheme works for you. Then 'just' edit the raidtab and build your md0 from the block devices. The reason 'just' is in quotes is because who knows if it'll work with a non-disk block device.

    Don't forget to deal with lost UDP packets, but you don't even want to go near TCP's latency on this. If you put them all on a switch your packet loss should be negligible anyhow.

    I don't think it's practical for your application but it would be a very cool hack. Good luck!
    • the simplest way to actually do this would be using network block devices (in the kernel as standard) and softraid on the client machine to make all the networked ramdisk block devices into one virtual block device.
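
      A minimal sketch of that client side, assuming each box exports its ramdisk with nbd-server on port 2000 (as in the nbd comment above), the hostnames box0..box11 are made up, and mdadm (or the older raidtools) is installed:

          # Attach the twelve remote ramdisks as local block devices (sketch only).
          for i in $(seq 0 11); do
              nbd-client box$i 2000 /dev/nbd$i
          done
          # Stripe them into one ~22GB RAID-0 device and put a filesystem on it
          # (the globs expand to /dev/nbd0 .. /dev/nbd11).
          mdadm --create /dev/md0 --level=0 --raid-devices=12 /dev/nbd[0-9] /dev/nbd1[01]
          mke2fs /dev/md0
          mount /dev/md0 /mnt/ramraid

      Keep in mind that with RAID-0 a single crashed or rebooted node takes the whole volume (and everything on it) with it, as other posters point out.
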
  • Waste of time (Score:5, Informative)

    by MrResistor ( 120588 ) <peterahoff@gmai[ ]om ['l.c' in gap]> on Wednesday February 26, 2003 @08:19PM (#5391430) Homepage
    I do customer service, repair, and testing for high-end video servers, particularly the RAIDs attached to them. Based on my experience, what you're proposing seems like a waste of time and money. I think your friend would be much better served with a more traditional RAID setup. For a single-user editing station a 4-drive IDE RAID-0 should be able to handle the load, and a similar SCSI array should be more than capable. I recommend SCSI.

    In typical storage situations you have 2 issues you need to consider: bandwidth and space. With video you add a 3rd issue which can easily eclipse the other 2 in importance: latency. Latency can cause hiccups in your data stream which would be unnoticeable in any other application but become painfully obvious with video, and any networked solution is going to add latency. The more network is involved, the more latency will be added, which is why I would absolutely advise against a distributed solution. For that reason, even if your network has the same theoretical bandwidth as your RAID, it will be slower, and that will kill the video stream.

    Anyway, your friend's needs should be taken care of with an older (cheaper) SCSI RAID controller and some older SCSI drives, say 9-18GB 7200RPM. In a RAID-0 configuration they should be able to handle simultaneous record and playback of 12Mbps per drive, with a 3 hour capacity at that maximum bandwidth. For example: my test fixture has space for 5 drives, so it can handle 3 hours worth of 60Mbps video, which is decent for HD and ludicrous for SD. You should be able to pull together something like that for less than what you're planning to spend on RAM and Gig-E network gear, and it will be more reliable (minimized data loss in case of power outage) and a hell of a lot cheaper to operate (unless your friend lives in the magical land of free electricity).

    Bear in mind that what I've described is the test I use to verify the fitness of our drives, and we use it because we've found that it is more strenuous than any commercially available SCSI test setup. Most new drives are able to handle it, but used drives can be a different story even though they might be perfectly good for any other application. With used drives you may have to drop your expectations to 10 or even 8Mbps per drive, so plan accordingly.

    That said, you also want to take a close look at your encoding/decoding hardware, as that can be a source of problems. Don't just look at the hardware specs, either, as all too often the driver capabilities fall far short of what the hardware can theoretically do.

  • by BDW ( 74391 )

    ...but only if you can deal with the OS latency. My very rough understanding says any networking based on the OSI model is going to pay a sufficiently large penalty in OS latencies that remote memory probably won't be any faster than a good local disk subsystem. However, if you can get rid of that latency, you can win BIG.

    Since the questioner is looking at using commodity hardware with a commodity OS using a commodity networking protocol, my gut feeling is that (s)he doesn't have a prayer. It is a cool idea, but latencies are likely to be too high.

    The /. dreamers don't need to give up all hope, however. :) There is relevant work in the academic literature, using specialized hardware and software of course. The work [washington.edu] I'm familiar with is from Hank Levy's group at UW. To sum up, based on what I remember from a class I took back in '98 from Mike Feeley [cs.ubc.ca] (first author on said paper; also did his PhD thesis on the topic):

    The motivating example came from Boeing [boeing.com]. They had a bunch of CAD workstations all with lots of RAM (by the standards of the day). However, looking at any nontrivial part of the design required more memory than any single workstation. Paging to disk was S-L-O-W. So why not use the frequently idle memory on the other workstations? The result of the UW work was a sort of global memory management, with paging to remote workstations in the cluster as well as to disk. Using memory on the remote workstations was significantly faster than using the local disk.

    So what about latency from the network stack? IIRC (and it has been five years since I talked to Mike about this...) they used Myrinet. In some sense Myrinet is basically DMA to remote workstations. One Myrinet node issues a write request in software, which includes the source address in memory for the data to be copied, a target node in the cluster, and the target memory address on the target node. The Myrinet hardware on the local workstation does DMA from the source memory location, fires it over fibre to the remote workstation, which dutifully does DMA from the Myrinet card to the memory locations specified by the sender. This is very fast, but not the stuff traditional general-purpose computing has been made of.

    Brian

    • You're mad

      A standard 100B-T network can theoretically sustain 12MBps, more like 8-10MBps in the real world. A single 120GXP Deskstar can sustain 4 times that.

      Granted, this guy's talking about using 1000B, but that's still only 80-100MBps, which should be easily matched by a local 4 drive ATA RAID-0 array.

    • Just out of curiosity, what are the security implications of Myrinet? I mean, ANYONE on the network can just dump a block of memory ANYWHERE in your RAM!

      Not to mention the sort of pointer bugs that will drive anyone out of his mind.

      Anyway, what does a Myrinet setup cost, and how fast are these things really (in bytes per second)?
      • From what I've seen it's typically used on clusters. Hence security isn't that big an issue. (If you have access to one machine you have access to all of them.)

        And regarding pointers I once met a man who had done such debugging. His last words were "The horror, The horror". ;-)

  • by Wakko Warner ( 324 ) on Wednesday February 26, 2003 @09:21PM (#5391894) Homepage Journal
    I hate to be frank, but your solution is equal parts ambitious, elaborate, expensive, unreliable, slow, kludgey, and stupid, with an extra helping of stupid.

    Buy a single SCSI RAID card with three channels, three 36GB U160 drives (10 or 15K, doesn't really matter), and set up a hardware RAID 0 stripe. You'll save money and be able to edit any amount of video you want. Hell, buy a SINGLE 72 gig 10K drive and a high-quality single-channel SCSI controller. You'll save even more money.

    This is the best way to do this. You've at least proven there's at least one other way to do it.

    - A.P.
  • enter FCAL (Score:1, Informative)

    by Anonymous Coward
    FCAL controller: $100
    36G FCAL disk (10k RPM): $125
    cabling: $50
    misc parts: $50
    enclosure: $75

    for say $500, you could have a 36G RAID-0 array setup, local to the machine, and very, very speedy.
    • I bought 10 Fibre drives, 18gig 4meg buffer.
      10 FCAL Adapters

      1 hub

      2 Optical Links

      2 Optical cables (want to put them in the basement where it's quieter)

      1 PSU / external case

      Total cost so far is around 500$ for close to 270 gig of storage with 1 gbps speed. I'll post some benchmarks if you are serious about wanting to do this.
  • by Qbertino ( 265505 ) <moiraNO@SPAMmodparlor.com> on Thursday February 27, 2003 @07:42AM (#5394704)
    ..one or two of these [cenatek.com]
    Briefly said: Kicks any RAID (SCSI or not) and your RAMdisk solution up and down the street.
    It could be a tad pricey though, as you might wanna suspect. :-)
  • Everyone here seems to think that you should stick with RAID. I would agree with them. But one thing that I didn't really see anyone point out was the number of drives that you use.

    Your best solution for this setup would be one of the better SCSI cards, or a RAID controller if you can afford it, and get AS MANY drives as you can stuff into the box that is going to house this. Go with 9-gig drives if you don't need the space. 18-gig 15K would be best, and set up a RAID 0 or 0+1 if you can. The more spindles in the setup, the better off you'll be.

    Duke

  • Sorry to be one more person to not answer your question directly (does anyone ever, on Ask Slashdot?), but with the volatility of RAM and the cost of commodity hardware spread across 12 machines, it does seem a little bit unlikely to be the most cost-efficient solution.

    If I am to understand you correctly, he needs about 24GB to store his files... but perhaps he doesn't need to access all 24GB at once?

    If you want to see if it will make an improvement, why not set up a GB ethernet Samba connection to one of those P2s set up with a ramdisk (a sketch follows at the end of this comment)? Then you can decide if it is worth it to him, pricewise, to do such a major upgrade before going all out and spending several thousand dollars on equipment. (Unfortunately, final latency will probably be about twice what you experience on the single setup, as you are adding a host controller.) Perhaps you could eke out some performance gains by copying the files to a local ramdisk, then editing (silly, though, as it should be copied to RAM anyway for editing), or using the remote RAM as a scratch disk, so as to maximize the available RAM on the host board.

    And, of course, there is the list of upgrades that might be less expensive, such as: dual CPUs? A more optimized video card? Turning off font smoothing?

    Hardware may just not be fast enough yet. There must be other people here who remember when running a Photoshop filter meant a coffee break.
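
    For that single-box Samba test, a minimal sketch, assuming Linux on the P2, tmpfs for the RAM disk, and a stock Samba install; the share name and paths are made up for illustration:

        # Back a Samba share with ~1.9GB of RAM on one of the 1U boxes (sketch only).
        mkdir -p /export/ramdisk
        mount -t tmpfs -o size=1900m tmpfs /export/ramdisk

        # /etc/samba/smb.conf fragment (hypothetical share name), then restart smbd:
        # [ramscratch]
        #     path = /export/ramdisk
        #     read only = no

    Benchmarking that one node against the existing local RAID shows whether gigabit-plus-SMB overhead can win at all before any money goes into eleven more upgrades and a switch.
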
  • The University of Kentucky [uky.edu]'s KLAT2 [aggregate.org] project used a FNN [aggregate.org] to get insane bandwidth without worrying about gigabit cards and switches. I'd suggest you take a look at it.
  • 1) Upgrade drives to WD's new 10,000RPM SATA drives; get a number of them and a TRUE hardware SATA RAID card with at least the ability to have 8MB cache/drive via at least PC100 SDR. Eight of these drives with a true hardware SATA RAID card could give you a nearly full PCI load (133MB/s, which is as good as you're going to get without moving up to a 64/66 PCI system at 528MB/s). The hardware SATA RAID controller keeps your CPU usage to a minimum (SCSI levels or BETTER) and also gives you the throughput needed, at 150MB/s per drive max.

    2) Look into one of the SCSI LAN systems, using a SCSI 160 or 320 controller to link computers together, and look at building some inexpensive machines like Duron 800/K7S5A combos for $100 + 1.5GB of RAM each. The SCSI 160 connection easily fills up the PCI bus on the RAM machines and gives pretty good throughput because SCSI is very low latency. On the HOST machine you can use PCI 64/66 so your bus isn't much of a bottleneck. Then you can use a Linux ramdisk on the RAM machines, export the drive over the SCSI interface, and mount it on the host. You can then create a RAID array on these if you like, but I think it's unnecessary; just piling the RAM drive mounts together in a lateral array would be faster, considering that RAID on the mounted disks will be in software, and lateral spanning is less processor intensive than striping.
  • What you need can be found here [uc3m.es]. The enhanced network block device allows you to share a block device, like a RAM drive or hard drive, over the network and make it appear on the main machine as a normal hard drive plugged into it. I have created a RAID-0 over this before. I can't really comment on speed, as it was a 10Mbps network, but it did work. With fast machines and a fast network, I would imagine you would easily saturate gigabit networking.
  • One of the big issues with RAID today is that drives are so large it is simple to hit your size requirement and still have a RAID that doesn't perform to the required level. This is because most people spec a RAID to have a required size and assume it will meet their performance requirements. Since the performance of a RAID is directly related to the number of drives used in the RAID, they should either use a larger number of smaller capacity drives or use the large capacity drives, but use more of them, even if it exceeds the space requirements for the RAID. Look at any review of RAID controllers and they will test with 2, 4, 8 etc drives and you can see the performance increase with each added set of drives. Even RAID 5, which requires more calculations for each drive added, will only slow the rate of growth not stop it.
  • To answer your question (redundantly, I guess) use nbd. (network block device). Linux-specific solution.

    But you don't want to do that (for reasons posted elsewhere in this discussion).

    You refer to this as "a raid array". You don't supply more details, which makes me suspect that your raid array isn't well understood. Which means you perhaps have RAID 5 set up, perhaps using IDE RAID and perhaps have multiple disks per IDE channel.

    What you need (IDE or SCSI, don't care) is multiple disks set up with a RAID 0 configuration for speed. If you need reliability (and I assume you don't, since you are considering RAM disks) it is best added via RAID 1+0 or 0+1.

    This introduces mirroring, either by mirroring your disks in pairs and then raid 0'ing those pairs together or by taking two big RAID0 devices and mirroring them. [Nice thought exercise. Why is 1+0 better than 0+1? Hint: what happens if two random disks fail.]

    This will halve your storage. But you said you had plenty of storage.

    Solution:

    Reconfigure to a more appropriate raid config (1+0, 0+1) and ensure you aren't using a really cheesy solution.

    (DON'T put multiple disks on the same channel with IDE RAID! Performance will suck as the disks contend for the channel. You may say that you have an IDE RAID controller with two channels, a max of 4 disks, and can only get there by putting two disks on each channel. That is why those controllers suck. They are OK for mirroring. If that is the case, you will be faster overall by using software RAID over 4 disks on your 4 IDE channels (2 from your mobo, 2 from your IDE RAID solution).)

    Thanks for the off-the-wall idea though. Caused us all to think a bit.
  • Try this out: http://www.superssd.com We put a big database on one of these babies and cut I/O wait down to 1/100th of what it was.
  • I do amateur video editing myself, and I recently bought a new system for ~$600 USD for that purpose. The system is IDE RAID w/ an AMD 1700 and 512MB of PC2700 RAM. I use 2 WD 8MB-buffer 80GB drives in RAID-0, and my throughput for that is enough to record at full HALF FRAME rates (~60fps) AND surf the web (without dropping frames). If your friend is having problems with RAID performance, perhaps they should dump their RAID card and buy the Epox 8K5A board with dual RAID... (it's a hell of a lot cheaper than a cluster, too)
  • What I personally would like to see is a resurgence of RAMdisk cards. Full-length PCI cards chock full of DIMM sockets, or even SIMM sockets to make use of all that old memory you have hanging around.

    Have an IDE or SCSI socket on the top edge (or a SCSI connector on the back plate) to plug it directly into your disk system. Looks like a disk, acts like a disk, runs like a demon. I'd like to have my swap file on it. :)
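
    If such a card showed up as an ordinary block device, putting swap on it would be the easy part; a sketch, with /dev/ramcard0 standing in for whatever device name such a hypothetical card exposed:

        mkswap /dev/ramcard0                  # the device name here is invented
        swapon -p 32767 /dev/ramcard0         # highest priority, used before any disk swap
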
  • Try FireWire drives; supposedly they get faster data transfers than IDE, even the latest...
  • This sounds more interesting as a distributed computing project, rather than for mere video editing. My personal experience is the typical 100 gig IDE drives they're selling now are plenty fast enough for DV editing; the bottleneck is in processor speed, for effects, MPEG encoding, etc.

    OTOH I've sometimes thought, in OO information systems, if you never had to persist data to disk, it would sure save a lot of trouble. Multiply-redundant storage of data in memory on lots of machines on a network, with separate UPSs, might alleviate the need to save it to disk. Only in certain applications, of course.
