RAMdisk RAID?
drew_92123 asks: "I've got a friend who does a LOT of video editing but is on a limited budget. He is currently using a raid array and while quite large, it's not as fast as he had hoped. I had an idea and wanted to know if there is a way to make it a reality, so of course I thought of all the brilliant minds here. I have about a dozen Pentium II computers with 1GB of RAM. I would like to upgrade them to 2GB and throw in some gigabit NICs and create a 1.9GB RAMdisk on each one. Then I want to use one of the computers to RAID the RAMdisks together, to be shared via Samba most likely. They are all 1U systems with no HDDs, just a 64MB IDE flash disk. Any ideas out there?" Has anyone successfully put together such a system? How well did it work for you, and are there any caveats that you would like to share with others who would do the same?
Eh?? (Score:5, Informative)
You might just try reconfiguring the RAID to be RAID 0+1 (striped and mirrored). That would give you the redundancy and speedy access. If the RAM is already maxed out on the video workstation, it might be more cost effective to get a better motherboard that supports more RAM.
~george
A way to overcome the NIC bottleneck (Score:2)
If it was a 12+ port 100Mbit switch with a single gigabit port, he could get a peak theoretical sustained throughput of 125 megabytes per second. I am guessing he already has fast NICs in the machines, so it is a matter of upgrading the master box to a gigabit NIC (about $90 for an Intel card) and the switch to one with a single gigabit port on it. There is an Accelar 1050 on eBay, 12 10/100 ports and a gigabit uplink, for around $50 with 14 hours to go.
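A quick back-of-the-envelope in Python on where that 125 MB/s figure comes from; the only inputs are the port speeds mentioned above:

    # 12 clients at 100 Mbit/s each, all feeding one gigabit uplink on the master box.
    clients, client_mbit, uplink_mbit = 12, 100, 1000
    aggregate_mbit = min(clients * client_mbit, uplink_mbit)  # the uplink is the cap
    print(aggregate_mbit / 8, "MB/s peak theoretical")        # -> 125.0 MB/s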
Note: not all NICs are created equal; use SMC, Intel, 3Com, or other quality brand-name NICs for the highest throughput. (NE2000 cards and clones run about half as fast as good cards.)
Of course, once you do this you have a dozen 2G virtual drives, but they are all different drive letters and you can't work on one big 24G file at once.
I suggest addressing the real issue instead. Look at some of the recent versions of the American Megatrends (AMI) MegaRAID card and couple it with a BUNCH of RAM configured 50/50 read cache / writeback cache and some fast hard drives. Used drives are wicked cheap now, and with 5 or 6 9G drives you could make a nice RAID 5 setup with a spare for when one craters. I think Dell OEMs these as their PERC cards, and I think they are up to PERC/4 now, but a PERC/3 is the one you might find used for a semi-reasonable price.
Re:Eh?? (Score:2)
Mind you, some work has been done on switched busses for motherboard use, and perhaps, maybe, you could emulate something like that, but once again, going to the NIC+NET would seem to be a big bottleneck.
Re:Eh?? (Score:2)
Because the overhead of IP processing is enormous compared to the overhead of scsi command processing.
Raw disks produce 30-60 Mbyte/sec, and climbing: say up to 500 Mbit/sec. I haven't seen any commercial NICs which can handle RAM-to-RAM transfers at this speed on desktop hardware. SCSI-160, nominal throughput about 1280 Mbit/sec, can handle this data rate without breaking into a sweat. Overheads for IP are going to be large fractions of a millisecond per packet (and a packet is 1.5 kbyte, or 9 kbyte if you have jumbo frames). SCSI overheads are going to be tens of microseconds per transfer (probably per frame for video). SCSI just wins hands down, and IDE still beats the pants off NAS.
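As a rough sanity check on those overheads, here is a back-of-the-envelope Python sketch; the per-packet and per-command figures are just the ballpark numbers quoted above, and the ~120 KB DV frame size is an assumption, not a measurement:

    # Software overhead to move one DV frame over IP (1500-byte packets) vs. one
    # SCSI command. All figures are rough assumptions taken from the text above.
    frame_bytes = 120000                 # approximate size of one DV frame
    packet_bytes = 1500                  # standard Ethernet MTU
    ip_overhead_per_packet = 0.25e-3     # "large fraction of a millisecond"
    scsi_overhead_per_cmd = 30e-6        # "tens of microseconds per transfer"

    packets = -(-frame_bytes // packet_bytes)      # ceiling division -> 80 packets
    print("IP:   %.1f ms per frame" % (packets * ip_overhead_per_packet * 1e3))
    print("SCSI: %.3f ms per frame" % (scsi_overhead_per_cmd * 1e3))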
The only reason for NAS - and it is a good reason - is sharing.
Re:Eh?? (Score:2, Interesting)
Not as big as the bottleneck of the CPU. I don't know what type of video editing this person is doing, but odds are about 10:1 it's DV. I actually worked through the math on a video editing mailing list once and showed the fallacy of assuming that faster disk performance made any real difference. I'm actually rather surprised nobody else has already mentioned the logical flaw, given that this is slashdot, after all.
Amdahl's law tells us that speeding up one part of an operation yields an overall speedup limited by the fraction of the total time that part originally consumed; no matter how much you accelerate it, you can never save more than that fraction.
Say you have an operation that takes ten seconds. It is divided into two parts, one of which takes 1 second and one of which takes 9. If you speed up the 1-second part by a factor of two, you have only knocked a meager 5% off the overall time of the operation.
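To put numbers on that, here is the same calculation as a small Python sketch (the 1-second/9-second split is just the toy example above, not a measurement):

    # Amdahl's law: overall speedup when only a fraction of the work is accelerated.
    def overall_speedup(fraction_improved, speedup_factor):
        return 1.0 / ((1.0 - fraction_improved) + fraction_improved / speedup_factor)

    io_fraction = 1.0 / 10.0                     # the 1-second part of a 10-second job
    print(overall_speedup(io_fraction, 2))       # ~1.053 -> about 5% faster overall
    print(overall_speedup(io_fraction, 1000))    # ~1.111 -> can never gain more than ~11%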
In video editing, the amount of time taken to read a block off a disk is so insignificant compared to the amount of time needed to process the frame that the disk access time generally disappears into rounding error in most people's calculations.
Basically, the time to seek to a given block was something like one sixth of a percent of the total time needed to decompress a single DV frame with a 1GHz CPU. Even if you cut your seek overhead by a factor of a thousand, you couldn't cut your time by even one percent.
Worse, if the code is written correctly, it is possible --indeed trivial, given the well-defined access patterns involved in video editing -- to prefetch the blocks of data before the processor needs them, making the speedup absolutely zero, regardless of the speed of the disk.
Long story short: while it's a cool idea, it won't have any noticeable benefit. Now, if your friend were doing -audio- editing with multiple channels, where it is actually possible (even easy) to exceed the speed of a single 5400 RPM hard drive, that would be a different story. But at least for DV editing, there is no benefit in even moving from a single 5400 RPM drive to a 7200 RPM drive, apart from having a greater safety margin against dropouts when capturing. The benefit of moving to a RAID is even smaller, and of a huge ramdisk, smaller still.
Re:Eh?? (Score:1)
Re:Eh?? (Score:1)
If I'm, say, adding a small logo to the bottom right-hand corner of a video of some description and writing it out to disk, the operation completes in LESS time if I'm compressing the output too. But if I'm working with uncompressed footage (as I generally do until the last stage of processing), then writing it out to disk is by far the slowest part of most operations.
Do some tests first (Score:5, Informative)
Re:Do some tests first (Score:3, Insightful)
what? (Score:2, Insightful)
Gigabit Ethernet does not perform that well on a single host. You might get lower latency than a cheap RAID array, but the throughput won't be any better.
Why waste time with "RAID" anyway? This RAM is all ECC and non-persistent, so there's no point in tossing that extra computation into the mix to make it worse.
Re:what? (Score:3, Funny)
Re:what? (Score:2)
Power outage (Score:3, Informative)
Re:Power outage (Score:4, Insightful)
Of course, just to repeat and be redundant, the bottleneck will simply come down to the NIC card; a possible solution would be to install several NICs and spread the load out... But this person would be much better off simply buying a system that will either (a) handle 24 GB of RAM, or (b) have fast enough disks to please him...
Of course a better idea might be to buy a system that would support both (a) and (b) and then set the system up to use the ram as a primary holding area for data, then flush to the (now faster) disk array...
Either way this person is still affected by "The Sickness" that we all seem to suffer from (More, bigger, better, faster....)
Re:Power outage (Score:1)
Re:Power outage (Score:1)
Re:Power outage (Score:1)
PIN number - Personal Identification Number Number - gonna say that?
NIC card - Network Interface Card Card.
Re:Power outage (Score:3, Funny)
Also... (Score:5, Insightful)
If your concern is extended-duration throughput, the multiple rack computers with RAM *might* be an option, but most normal users wouldn't consider it due to the latency involved with going through the southbridge, then the NIC, then the other computer's NIC, then its southbridge, then its northbridge. And that's just a one-way trip.
Just don't shell out a bunch of money before you do a proof of concept.
~george
Re:Also... (Score:2)
Of course, you are still suffering from practical network limitations, protocol limitations, and more. Making a distributed file system over a network of sufficient speed doesn't sound like a bad idea; it just seems to need a few things.
First reactions. (Score:5, Informative)
But, if you give it some thought, it is an interesting idea. Basically you are trying to build a clustered RAM disk.
There is, however, a major drawback to this idea. The whole advantage of a RAM disk is speed/performance. Locally, the RAM disk is MUCH faster than a normal disk drive. But the problem arises when you connect your "RAID RAM disk". You must network the machines in order for them to communicate with each other, and suddenly your performance has dropped to nothing. In fact it is below the performance of a normal disk drive.
In order for your RAID RAM disk to perform on par with a good disk drive you would require a switched gigabit network between your nodes. This will cost more than the "normal" disk. Additionally, even with a switched gigabit network, the performance is highly unlikely to exceed that of high-end disk drives.
So, when you get right down to it, the RAID RAM disk is an interesting idea, just to see if you can do it. But, there isn't really any advantage to it.
Re:First reactions. (Score:3, Insightful)
RAM -> RAM across a network (assuming at LEAST 100Mbit Ethernet) will be FASTER than accessing a RAID of local disks. It's all memory-to-memory transfer at that point: no spin-up, no seek time. The disks may get close for a very long sequential write/read, where the multiple drives can actually come close to using the bandwidth available via the RAID controller.
But for random access... no way. RAM 'seek time' is measured in NANOSECONDS, while even the fastest drive is in milliseconds! RAM is over 1000 times faster!
Re:First reactions. (Score:1)
Re:First reactions. (Score:2)
Re:First reactions. (Score:1)
RAM 'seek time' may be nanoseconds, but that does not mean it will transfer across the network that quickly.
but i'm not the first one to say this... so...
Re:First reactions. (Score:5, Informative)
I, however, beg to differ.
This is two machines sitting side by side on a separate, completely unloaded switch. Don't just go by the 500ns ping time; you actually have to transfer data. You're talking at LEAST 3ms PER BLOCK... and that's with some insanely optimized code. Now, for video editing 99% of the effort is linear (unless you are horribly fragmented), so you're talking ONE 6ms seek ONCE, then thousands upon thousands of linear reads.
Secondly, his "raid array" sucks if the performance is bad. I buy low-end LSI Express 500s (Ultra160 LVD) and they have stellar performance. For doing AV, this is my recommendation:
Buy a multi-channel Ultra160 or Ultra320 SCSI RAID controller (160s are pretty cheap now that 320s are on the market). Load it up with 5 large drives. Set the stripe size to the maximum. Buy a cheaper IDE RAID and set it in mode 15 (mirror two RAID 5 arrays together; it's harder to lose data that way).
Use the SCSI for your working set, and reformat it frequently (or at least delete all files) to defrag. Use RAID0, it's faster. Save your finished projects to the IDE raid, burn to DVD, DLT, whatever.
It will _STILL_ be cheaper than putting 2 gigs of RAM into a pile of boxes, AND faster. Single-channel Ultra320 can hit you with up to 40 megaBYTES per second, all on a measly 5ms initial seek. (Remember, ALL the drives seek in parallel.) Putting drives on the second channel can wallop you with 80MB/second. You're talking around $1500 for the card, of course. But have you priced out a 1U server with 2 gigs of RAM lately?
Re:First reactions. (Score:2)
Surely you mean 320MB/sec on one channel, as SCSI is rated in MB/sec and not Mb/sec (like IEEE1394 and USB are). IDE is also rated in MB/sec.
Re:First reactions. (Score:2)
RAID 15 will use up more disks (extra disk for parity) and will also have lower performance (due to the parity calculations).
However, the rest of the post is sound; use a fast RAID-0 array for working sets (fast, but no redundancy; worst case, you lose your current run of data) before copying to redundant (e.g. RAID 5, 10, 0+1) storage or copying to tape.
Re:First reactions. (Score:2)
not to mention that most slashdot articles are from people on a budget
Re:First reactions. (Score:2)
Re:First reactions. (Score:2)
Because you can't add spares. RAID 5 error correction only expands the data to N+1. If you put in extra drives, they are "hot spares", not redundant. Disks are getting pretty cheesy lately (especially for someone "on a budget"), and a multi-disk failure isn't unheard of.
"RAID 6", whatever that is and whenever it becomes a common standard, expands the data to N+X, so you have to have X+1 drive failures before losing data.
Also, RAID 15 is a bad idea. RAID 51 has a LOT more redundancy (a RAID 5 made up of individually RAID 1'd disks). Your odds of losing the 4 drives needed to take it down are much less than if you use 15.
Re:First reactions. (Score:2)
Besides, it's not going RAM -> RAM, it's going RAM -> northbridge -> southbridge -> PCI bus -> NIC -> switch -> NIC -> PCI bus -> southbridge -> northbridge -> RAM, and being broken up into little packets by TCP/IP and reassembled and so on. Every one of those steps introduces latency, which potentially interrupts the video stream.
Additionally, a 100Mbit network can only transfer 12MBps (more like 8-10MBps in the real world), and a single 120GXP Deskstar can sustain 4 times that. A 4 drive RAID-0 IDE array should easily be able to outperform the 1000Mbit network proposed here.
Re:First reactions. (Score:1, Flamebait)
Seriously, I have a hard time imagining how you could design a storage system with a higher unnecessary cost per MB or lower reliability.
Re:Huge waste of money (Score:2)
Not really cost-effective (Score:3, Informative)
On the other hand, if you just throw four 100GB ATA-100 drives in a standard tower case with a decent IDE RAID controller, you get five times as much storage for probably about half the money.
Also, remember that most low-to-mid-range PCs can't actually fully take advantage of a gigabit network link, since the PCI bus and CPU get saturated long before the network does.
This is probably the problem. (Score:3, Insightful)
It is very likely that he is already using IDE or ATA disks, and that is part of his problem. When large amounts of data need to be transferred quickly, SCSI is what you need. There is nothing faster than 15,000 RPM SCSI drives connected to good RAID controllers that have large amounts of cache RAM. Nothing.
If you want high performance then you must use high-performance gear. Yes, it does cost 5 to 10 times more than the IDE RAID solution, but there is a VERY good reason for that.
Ok, now come the flames from the know-it-all masses whose experience is limited to home PCs and no-traffic webservers.
Re:This is probably the problem. (Score:2)
One thing I didn't think of in my previous post, though, was the XServe RAID units that Apple released recently...$6k gets you a 720GB Fibre-Channel RAID array in a 3U enclosure. Not bad at all, really.
Re:This is probably the problem. (Score:2)
What might be a good compromise is the new WD SerialATA 10KRPM drives due to ship later this month. On a good RAID controller they should give you a bit of a speed boost for the $$, and the individual channels burst at 1.5Gb/s which is a bit faster than the GigE interconnect on a ramdisk would give you anyway.
Apple's XServe solution, while it has a FibreChannel interconnect, is a 7200 RPM ATA RAID solution inside the box, so that's going to be a per-client speed hit, and for what he's doing could be accomplished at the same performance level much more cheaply with an internal local ATA RAID controller (which is what he's already doing) - probably individually cheaper than a decent FibreChannel host adaptor!
[Source: Apple XServe RAID Technology Overview [apple.com] ]
All in all, I agree with the grandparent post that 15K SCSI drives with a good 64-bit RAID controller with oodles of RAM on a nice motherboard will probably SMOKE anything you could get with some kind of bizarre Frankenstein network-RAID-ramdisk, though the Frankenstein network-RAID-ramdisk would be a fun hack to pull just for the hell of it.
Re:This is probably the problem. (Score:2)
I agree that there is nothing faster than what you describe, but when it comes down to it there's no reason a single user editing station would need more than about 70Mbps, which a 4 drive IDE RAID-0 should be able to sustain.
I'd still recommend SCSI, but 10K or even 7200 RPM drives should be able to handle the load easily, and be fairly affordable. In my experience a RAID-0 array of 7200 RPM drives can handle about 24Mbps per drive (simultaneous record and playback of 12Mbps per drive is how I test them).
Trading disk latency for network latency (Score:3, Interesting)
Re:Trading disk latency for network latency (Score:2)
It does seem like a waste of hardware, though. If you are going to drop some money into buying RAM for these systems, I would think it would be far better to figure out a way to attach that RAM locally. Somebody must make something like a PCI or Fast SCSI RAMdisk card that takes the cheapest memory modules. This would likely have some sort of battery built in to make it non-volatile as well.
Re:Trading disk latency for network latency (Score:5, Informative)
And for the obligatory... (Score:4, Insightful)
*smack* *thump*
*mass cheering*
Btw, it does seem to be a (disturbing) recent trend at Slashdot to try to troll whole stories, instead of just trolling comments. C'mon, anyone who's taken even one networking or hardware class knows the speed hierarchy:
cache > memory > disk > network
And, with the number of physical RAM drives out there (very few), you'd quickly realize that even a local RAM drive doesn't offer enough of a speed benefit to offset its cost. C'mon, editors, I know it sounds cool, but do you really have to post it?
Re:And for the obligatory... (Score:1)
Show me a disk that is faster than gigabit Ethernet. And if you tell me disks in a RAID are faster, I'll tell you it's possible to use several network connections in parallel.
The networked ramdisks could be faster, but it doesn't fit the limited budget he wanted.
Re:And for the obligatory... (Score:1, Informative)
Re:And for the obligatory... (Score:2)
I'll be happy to show you a gigabit ethernet that is slower than a disk...
Gigabit Ethernet really doesn't live up to its promise, mostly because it is still Ethernet, which fragments everything into 1500-byte packets. Either you lose network speed, or you saturate one CPU on each side of the link doing nothing but feeding/reading the networking card. Seems to me a relatively stupid way of using your new spiffy multiprocessor machine.
On the other hand, yes, there exist networking solutions that are clearly faster than any disk you can buy.
Re:And for the obligatory... (Score:2)
The NAS wins, hands down, for random-access stuff, and is about as fast for linear stuff. The only thing it can't compete on is running PostgreSQL, where having the disks local seems to be much faster. Plus I get nervous running a database across an NFS mount.
FYI, the SCSI drives offer linear read speeds of about 25 Megs a second each, the IDEs offer about 10Megs a second, and the NAS is connected via switched 100BaseTx FD, so it's only got 10 Megs a second to work with. I'm betting it's faster because of its internal caching and what not.
It has gig connections, we just can't justify the increased cost on an intranet server that already has sub second response time.
Not the most in-expensive option (Score:3, Interesting)
Think of the additional costs:
12GB of RAM
12 gigabit NICs
1 gigabit Ethernet switch with 12+ ports
Setup time
For what you're looking at spending, it may be the same cost as buying some U320 SCSI disks and some sort of SCSI RAID card.
Network Block Device (Score:2, Informative)
Sounds like nbd [sourceforge.net] may be your ticket if you are using Linux. nbd is designed to take a block device, like a hard drive, and make it available over a network to a different host. It will also do RAID 0, 1, 5. Perhaps it will work with a ramdisk. I can't swear that this will work, but it sure might, since after all a ramdisk is implemented as a block device.
RAM is cheap. If you are unconcerned about high electricity costs and need a large *F*A*S*T* device for storage, striping a number of ramdisks could be the thing to do. PC133 1GB DIMMs are currently about US$200 [tigerdirect.com] and are on their way down. Sure, it's expensive compared to RAID 5, but I'm sure it's a lot faster. Just make sure you write out anything you need prior to downing the whole array.
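For a sense of what "striping a number of ramdisks" means at the block level, here is a rough Python sketch of the mapping a RAID-0 layer (for example, Linux md sitting on top of the nbd devices) would perform; the chunk size and node names are made-up values for illustration, not anything nbd itself requires:

    # Hypothetical RAID-0 (striping) layout across ramdisks exported by four nodes.
    CHUNK = 64 * 1024                                   # assumed 64 KB stripe chunk
    NODES = ["node0", "node1", "node2", "node3"]        # machines exporting ramdisks

    def locate(logical_offset):
        """Map a logical byte offset to (node, offset within that node's ramdisk)."""
        chunk_index = logical_offset // CHUNK
        node = NODES[chunk_index % len(NODES)]          # chunks rotate round-robin
        stripe_row = chunk_index // len(NODES)          # full stripes before this chunk
        return node, stripe_row * CHUNK + logical_offset % CHUNK

    for off in (0, 70 * 1024, 300 * 1024):
        print(off, "->", locate(off))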
Re:Network Block Device (Score:2)
You need to shop around. I can get 1 GB PC133 Samsung for $141, or Micron for $169, and that includes shipping.
Re:Network Block Device (Score:1)
Re:Network Block Device (Score:2)
They didn't, Tiger Direct just sucks.
Small Problem here... (Score:1)
Closer is better (Score:4, Informative)
Wow (Score:2)
I hope all that power is used for good, not evil.
Re:Closer is better (Score:2)
Re:Closer is better (Score:2)
So for a 4G unit it would be about $950 total.
Expensive, yes, but if you could increase the overall performance of a top-of-the-line system by 20% for a grand, it might be worth it.
Granted it is overkill in the context of the current discussion, and granted this wouldn't be the best application for demonstrating performance gains, but I still would love one
Re:Closer is better (Score:1)
Re:Closer is better (Score:2)
Along the original poster's idea, though, I occasionally revisit the idea of putting two machines next to each other, connected by a crossover cable between their gigabit networking cards. One machine would do all the processing and the other machine would simply be a virtual file server with a massive RAMdrive - maybe 4G - to store and serve files. This idea is of course not necessary with the RocketDrive, but I had it before I knew about that.
I still may do that, the barrier to entry being having two machines with gigabit NICs in them, and four 1G sticks of RAM to put in one of them (or at least 1G just to play with the idea). Oh yeah, and buying a full version of SuperSpeed for ?? under a hundred bucks.
www.superspeed.com IIRC
Whoah expensive (Score:2)
All things being equal I'd put the value of the switch higher than the entire setup as it presently exists...
Simple answer (Score:2)
The idea of a drive is persistent storage. Disk caching algorithms nowadays are excellent and normally surpass RAM drives, since in reality you are pretty much ALWAYS using a ramdrive anyway. Not to mention the network issues brought up in other threads, and the simple fact that the main reason for RAID 5 is redundancy of storage, to boost reliability and not just performance; by putting everything in RAM, there goes reliability. Sounds like what you want, if you are not as concerned with reliability, is a RAID-1 array, which implements striping but not parity.
Re:Simple answer (Score:2)
RAID 0 is striping with no redundancy and is faster.
Why bother? (Score:1)
Let's see, you want to buy 12 GB RAM, 12 gigabit NICs, and a gigabit switch to get a 24 GB logical volume?
The cheapest gigabit NIC on pricewatch is $40. I'm not sure what type of ram a P-II takes, but everything on pricewatch is over or about $100/GB. So that's $140 per computer, times 12 computers is $1,680 for a 24 GB volume. (This is what you consider a 'limited budget'?)
And you said your friend already has a raid array? I'm willing to bet that it's bigger than 24 GB, and since it's probably attached locally instead of through the network, a hell of a lot faster.
For comparison, you can buy a 250 GB EIDE drive for $325. I'm sure you could put together a cheap computer with four of these drives for less than the $1600.
Which is it? (Score:2)
Are you on a limited budget, or are you going to spend a couple grand to upgrade all these ancient machines?
Did you ever consider just taking the money and buying one really fast machine?
Issue as I see it... (Score:2)
Problem is, there's only so much RAM a "normal" motherboard can refer to, before you have to start doing ugly (and expensive, slower) hacks. 2^32 = 4294967296 bytes, or precisely 4 gigabytes.
After that, you're on your own.
So, my question is: Why don't we have "RAM banks" that interface over SCSI, firewire, or even IDE?
If it's firewire, it could even be external.
A RAM chip isn't that large. Why isn't there a slick "drive" about the size of an iBook that holds 40-60 modules?
The RAM itself, according to this week's memory prices, is $80 for 512 MB PC133 ECC, which means 5 gigabytes (probably more than enough for whatever work you want to do -- remember, this is volatile memory, so you can't store anything permanent on it!) is only $800 for the chips themselves. [sharkyextreme.com]
I'm not sure I understand why PC800 is about the same price for 256 MB modules, but an order of magnitude more for 512 MB, while PC3200 (apparently the fastest of the lot?) is almost the same price as what I quoted for PC133.
Anyone?
Re:Issue as I see it... (Score:1)
Why can't i get an external firewire (or ide/scsi) box and just shove a bucketload of ram sticks in there and have an extremely fast external drive?
Yum!
Re:Issue as I see it... (Score:2)
Just don't forget that it's volatile. Imagine this though:
There's a battery involved, and whenever you disconnect from power it warns you, beep-beep, that it's going to lose power, and keeps indicating that for hours and hours and hours (hey! I need to be plugged in!). And get this: THEN, with the last hour of the battery, it powers up an IDE hard drive and dumps itself to the drive before finally dying!
Whenever you next plug it in, it says: sorry, you'll have to wait until I load memory from the hard drive (AND get enough power back to dump to the hard drive again, in case you unplug me just as I finish doing so).
Am I a genius or what, DiSKiLLeR?
Re:Issue as I see it... (Score:1)
Ohh yeah, and they're bloody expensive and don't really give a big boost (because you saturate the PCI bus).
PC3200 isn't the fastest..... (Score:1)
With the built-in Promise IDE RAID, I get >40MB/s throughput.
That's over 100 frames/sec encoding DivX.
Sounds like your bud's problem isn't hard drive speed/throughput.
You need to make block devices (Score:3, Informative)
Don't forget to deal with lost UDP packets, but you don't even want to go near TCP's latency for this. If you put them all on a switch, your packet loss should be negligible anyhow.
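As a rough illustration of the lost-packet handling involved, here is a minimal Python sketch of reading one block over UDP with a sequence number and retransmit-on-timeout; the port, header layout, and block size are all assumptions for the example, not part of any existing protocol:

    # Minimal sketch: request one block from a remote ramdisk server over UDP.
    import socket, struct

    BLOCK_SIZE = 4096
    HDR = struct.Struct("!IQ")                    # (sequence number, block number)

    def read_block(host, block_no, seq, port=9000, retries=5, timeout=0.05):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(timeout)
        request = HDR.pack(seq, block_no)
        for _ in range(retries):                  # retransmit if the reply never arrives
            sock.sendto(request, (host, port))
            try:
                data, _addr = sock.recvfrom(HDR.size + BLOCK_SIZE)
            except socket.timeout:
                continue                          # packet lost; try again
            rseq, rblock = HDR.unpack_from(data)
            if rseq == seq and rblock == block_no:
                return data[HDR.size:]            # payload is the block contents
        raise IOError("block %d unreachable after %d tries" % (block_no, retries))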
I don't think it's practical for your application but it would be a very cool hack. Good luck!
Re:You need to make block devices (Score:1)
Waste of time (Score:5, Informative)
In typical storage situations you have two issues to consider: bandwidth and space. With video you add a third issue which can easily eclipse the other two in importance: latency. Latency can cause hiccups in your data stream which would be unnoticeable in any other application, but become painfully obvious with video, and any networked solution is going to add latency. The more network is involved, the more latency will be added, which is why I would absolutely advise against a distributed solution. For that reason, even if your network has the same theoretical bandwidth as your RAID, it will be slower, and that will kill the video stream.
Anyway, your friend's needs should be taken care of with an older (cheaper) SCSI RAID controller and some older SCSI drives, say 9-18GB 7200RPM. In a RAID-0 configuration they should be able to handle simultaneous record and playback of 12Mbps per drive, with a 3-hour capacity at that maximum bandwidth. For example: my test fixture has space for 5 drives, so it can handle 3 hours' worth of 60Mbps video, which is decent for HD and ludicrous for SD. You should be able to pull together something like that for less than what you're planning to spend on RAM and Gig-E network gear, and it will be more reliable (minimized data loss in case of power outage) and a hell of a lot cheaper to operate (unless your friend lives in the magical land of free electricity).
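The three-hour figure checks out; a quick back-of-the-envelope in Python, assuming five 18 GB drives as in the fixture described above:

    # Sanity check on the "3 hours of 60Mbps video" claim above.
    drives, drive_gb = 5, 18                 # assumed: five 18 GB drives, RAID-0
    stream_mbps = 60                         # 12 Mbps per drive times 5 drives
    total_bytes = drives * drive_gb * 1e9
    seconds = total_bytes / (stream_mbps * 1e6 / 8)
    print("%.1f hours at %d Mbps" % (seconds / 3600, stream_mbps))   # ~3.3 hours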
Bear in mind that what I've described is the test I use to verify the fitness of our drives, and we use it because we've found that it is more strenuous than any commercially available SCSI test setup. Most new drives are able to handle it, but used drives can be a different story even though they might be perfectly good for any other application. With used drives you may have to drop your expectations to 10 or even 8Mbps per drive, so plan accordingly.
That said, you also want to take a close look at your encoding/decoding hardware, as that can be a source of problems. Don't just look at the hardware specs, either, as all too often the driver capabilities fall far short of what the hardware can theoretically do.
Network wins over disk... (Score:2, Interesting)
Since the questioner is looking at using commodity hardware with a commodity OS using a commodity networking protocol, my gut feeling is that (s)he doesn't have a prayer. It is a cool idea, but latencies are likely to be too high.
The /. dreamers don't need to give up all hope, however. :) There is relevant work in the academic literature, using specialized hardware and software of course. The work [washington.edu] I'm familiar with is from Hank Levy's group at UW. To sum up, based on what I remember from a class I took back in '98 from Mike Feeley [cs.ubc.ca] (first author on said paper; also did his PhD thesis on the topic):
The motivating example came from Boeing [boeing.com]. They had a bunch of CAD workstations all with lots of RAM (by the standards of the day). However, looking at any nontrivial part of the design required more memory than any single workstation. Paging to disk was S-L-O-W. So why not use the frequently idle memory on the other workstations? The result of the UW work was a sort of global memory management, with paging to remote workstations in the cluster as well as to disk. Using memory on the remote workstations was significantly faster than using the local disk.
So what about latency from the network stack? IIRC (and it has been five years since I talked to Mike about this...) they used Myrinet. In some sense Myrinet is basically DMA to remote workstations. One Myrinet node issues a write request in software, which includes the source address in memory for the data to be copied, a target node in the cluster, and the target memory address on the target node. The Myrinet hardware on the local workstation does DMA from the source memory location, fires it over fibre to the remote workstation, which dutifully does DMA from the Myrinet card to the memory locations specified by the sender. This is very fast, but not the stuff traditional general-purpose computing has been made of.
Brian
Re:Network wins over disk... (Score:2)
A standard 100Base-T network can theoretically sustain 12MBps, more like 8-10MBps in the real world. A single 120GXP Deskstar can sustain 4 times that.
Granted, this guy's talking about using 1000Base-T, but that's still only 80-100MBps, which should be easily matched by a local 4-drive ATA RAID-0 array.
Re:Network wins over disk... (Score:2)
Not to mention the sort of pointer bugs that will drive anyone out of his mind.
Anyways, what does a Myrinet setup cost and how fast are these things really anyways? (in Bps)
Re:Network wins over disk... (Score:1)
And regarding pointers I once met a man who had done such debugging. His last words were "The horror, The horror".
Is this a troll? Or are you on crack? (Score:3, Funny)
Buy a single SCSI RAID card with three channels, three 36GB U160 drives (10 or 15K, doesn't really matter), and set up a hardware RAID 0 stripe. You'll save money and be able to edit any amount of video you want. Hell, buy a SINGLE 72 gig 10K drive and a high-quality single-channel SCSI controller. You'll save even more money.
This is the best way to do this. You've at least proven there's at least one other way to do it.
- A.P.
enter FCAL (Score:1, Informative)
36G FCAL disk (10k RPM): $125
cabling: $50
misc parts: $50
enclosure: $75
for say $500, you could have a 36G RAID-0 array setup, local to the machine, and very, very speedy.
I'm doing this right now- (Score:2)
10 FCAL Adapters
1 hub
2 Optical Links
2 Optical cables (want to put them in the basement where it's quieter)
1 PSU / external case
Total cost so far is around $500 for close to 270 gigs of storage at 1 Gbps. I'll post some benchmarks if you are serious about wanting to do this.
Re:I'm doing this right now- (Score:1)
Speaking of which, I'm only using single-port, single-connect right now, at 1 Gbit. My drives all claim to be capable of not only a standard looped connection, but seem to have two different data ports. Any idea how I can make use of this?
Can I just add another HBA and plug it in on the other side of the loop? How do I take advantage of the second data port set?
I really want a pinout.
Space, Power, Noise, Setup Time? How about... (Score:3, Informative)
Briefly said: Kicks any RAID (SCSI or not) and your RAMdisk solution up and down the street.
It could be a tad pricey though, as you might wanna suspect.
good suggestions... (Score:2)
Your best solution for this setup would be one of the better SCSI cards, or a RAID controller if you can afford it, and get AS MANY drives as you can stuff into the box that is going to house this. Go with 9Gig drives if you don't need the space; 18Gig 15K would be best. Set up RAID 0 or 0+1 if you can. The more spindles in the setup, the better off you'll be.
Duke
Smart-ass solution (Score:2)
If I am to understand you correctly, he needs about 24GB to store his files... but perhaps he doesn't need to access all 24GB at once?
If you want to see if it will make an improvement, why not set up a gigabit Ethernet Samba connection to one of those P2s set up with a ramdisk? Then you can decide whether it is worth it to him, price-wise, to do such a major upgrade before going all out and spending several thousand dollars on equipment. (Unfortunately, final latency will probably be about twice what you experience on the single setup, as you are adding a host controller.) Perhaps you could eke out some performance gains by copying the files to a local ramdisk, then editing (silly, though, as they should be copied to RAM anyway for editing), or by using the remote RAM as a scratch disk, so as to maximize the available RAM on the host board.
And, of course, there is the list of upgrades that might be less expensive, such as: dual CPUs? A more optimized video card? Turning off font smoothing?
Hardware may just not be fast enough yet. There must be other people here who remember when running a Photoshop filter meant a coffee break.
Network Infrastructure (Score:2)
options: (Score:2)
2) Look into one of the SCSI LAN systems, using a SCSI 160 or 320 controller to link computers together, and look at building some inexpensive machines like Duron 800/K7S5A combos for $100 + 1.5GB of RAM each. The SCSI 160 connection easily fills up the PCI bus on the RAM machines and gives pretty good throughput, because SCSI is very low latency. On the HOST machine you can use PCI 64/66, so your bus isn't much of a bottleneck. Then you can use a Linux ramdisk on the RAM machines, export the drive over the SCSI interface, and mount it on the host. You can then create a RAID array on these if you like, but I think it's unnecessary; just piling the RAM drive mounts together in a lateral array would be faster, considering that RAID on the mounted disks will be in software, and lateral spanning is less processor intensive than striping.
Enhanced network block device (Score:2)
In RAID, More spindles = More Performance (Score:1)
not the problem (Score:2)
But you don't want to do that (for reasons posted elsewhere in this discussion).
You refer to this as "a raid array". You don't supply more details, which makes me suspect that your raid array isn't well understood. Which means you perhaps have RAID 5 set up, perhaps using IDE RAID and perhaps have multiple disks per IDE channel.
What you need (IDE or SCSI, don't care) is multiple disks set up with a RAID 0 configuration for speed. If you need reliability (and I assume you don't, since you are considering RAM disks) it is best added via RAID 1+0 or 0+1.
This introduces mirroring, either by mirroring your disks in pairs and then raid 0'ing those pairs together or by taking two big RAID0 devices and mirroring them. [Nice thought exercise. Why is 1+0 better than 0+1? Hint: what happens if two random disks fail.]
This will halve your storage. But you said you had plenty of storage.
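A quick way to work through that thought exercise is to enumerate every two-disk failure in both layouts; a Python sketch, assuming a hypothetical 8-disk array:

    # Count which two-disk failures are fatal under RAID 1+0 vs RAID 0+1 (8 disks assumed).
    from itertools import combinations

    pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]        # 1+0: mirror pairs, striped together
    halves = [set(range(0, 4)), set(range(4, 8))]   # 0+1: two 4-disk stripes, mirrored

    def fatal_10(failed):
        return any(a in failed and b in failed for a, b in pairs)   # a whole pair lost

    def fatal_01(failed):
        return all(half & failed for half in halves)  # each stripe half has lost a disk

    combos = [set(c) for c in combinations(range(8), 2)]
    print("1+0 fatal:", sum(map(fatal_10, combos)), "of", len(combos))   # 4 of 28
    print("0+1 fatal:", sum(map(fatal_01, combos)), "of", len(combos))   # 16 of 28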
Solution:
Reconfigure to a more appropriate raid config (1+0, 0+1) and ensure you aren't using a really cheesy solution.
(DON'T put multiple disks on the same channel with IDE RAID! Performance will suck as the disks contend for the channel. You may say that you have an IDE RAID controller with two channels and a max of 4 disks, and can only get there by putting two disks on each channel. That is why those controllers suck. They are OK for mirroring. If that is the case, you will be faster overall by using software RAID over 4 disks on your 4 IDE channels: 2 from your mobo, 2 from your IDE RAID solution.)
Thanks for the off-the-wall idea though. Caused us all to think a bit.
RAM SAN (Score:1)
Poor Performance? Buy a new system. (Score:1)
Return of RAMDisk cards? (Score:1)
Have an IDE or SCSI socket on the top edge (or a SCSI connector on the back plate) to plug it directly into your disk system. Looks like a disk, acts like a disk, runs like a demon. I'd like to have my swap file on it.
Firewire (Score:1)
Interesting project (Score:2)
OTOH I've sometimes thought, in OO information systems, if you never had to persist data to disk, it would sure save a lot of trouble. Multiply-redundant storage of data in memory on lots of machines on a network, with separate UPSs, might alleviate the need to save it to disk. Only in certain applications, of course.