Where are the High-Capacity SCSI Drives? 138
An anonymous reader asks: "Storage technology has really exploded in recent years, giving us ATA drives up to and exceeding 200-250 GB per drive. Why is it that SCSI drive technology has remained stagnant? I can't find a SCSI drive exceeding about a 146 GB capacity. Instead, businesses (and some individuals) wanting greater storage capacities are required to buy more drives which takes up more space, generates more heat, provides more points of failure, uses more electricity, etc. Why is this so?"
Where are the High-Capacity SCSI Drives? (Score:2, Funny)
Time to do some reading (Score:2)
Are you unfamiliar with the concept of RAID [google.com]? That's where all those SCSI drives are going, and it most certainly does not add more points of failure at the system level. Businesses do not want high-capacity single SCSI drives, especially when they can pile together 146 GB drives.
Re:Time to do some reading (Score:2)
Being limited to 146 GB drives means you are limited in scaling, which, of course, is what RAID is all about, as you've pointed out.
Re:Time to do some reading (Score:2)
Re:Time to do some reading (Score:2)
Re:Time to do some reading (Score:2)
Re:Time to do some reading (Score:2)
However many disks' worth of space you've lost in your RAID to parity, that's more or less how many drives have to fail before you've lost data and the RAID goes down.
Depending on the scheme you use, the same is true in a RAID of RAIDs, except at the level of whole arrays: that many sub-RAIDs have to each lose that many drives before you've lost data.
I agree, however: simply because there are benefits to large numbers of small disks doesn't mean there aren't ben
Re:Time to do some reading (Score:2)
Most SCSI RAID controllers have at least two channels. At 13 SCSI devices each (14 minus the controller's ID), that's 26 devices. A three-channel card (also very common) or a four-channel card (much less common) would handle 39 and 52 devices respectively. Not to mention you can have more than one card (some even have cross-connects for a true failover configuration).
Worst case you have one el-cheapo single channel - you still
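The per-channel arithmetic above is easy to check with a quick sketch (the 13-devices-per-channel figure is the parent's assumption, not a spec I'm vouching for here):

```python
# Back-of-the-envelope device counts, using the parent comment's
# assumption of 13 usable SCSI IDs per channel (14 minus the
# controller's own ID).
DEVICES_PER_CHANNEL = 13

def max_devices(channels: int) -> int:
    """Maximum devices reachable by a controller with the given channel count."""
    return channels * DEVICES_PER_CHANNEL

for n in (2, 3, 4):
    print(n, "channels ->", max_devices(n), "devices")
```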
Re:Time to do some reading (Score:2)
With RAID 5 you get...
4x250GB = 750GB of storage
7x146GB = 876GB of storage
Now let's assume the backplane only supports a total of ten drives:
10x250GB = 2.25TB of storage
10x146GB = 1.31TB of storage
Gee, which solution scales better?
And let's say we wanted to be really crazy and implement RAID 6, a more practical solution than RAID 5+1, and more likely to be implemented in the real world:
10x250 = 2TB of storage
10x146 =
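The parent's figures can be reproduced with a quick sketch (sizes in GB; RAID 5 loses one drive's capacity to parity, RAID 6 loses two):

```python
# Usable capacity for RAID 5 (one parity drive's worth) and RAID 6
# (two parity drives' worth), matching the figures quoted above.
def raid5_usable(n_drives: int, size_gb: int) -> int:
    return (n_drives - 1) * size_gb

def raid6_usable(n_drives: int, size_gb: int) -> int:
    return (n_drives - 2) * size_gb

print(raid5_usable(4, 250))   # 750 GB
print(raid5_usable(7, 146))   # 876 GB
print(raid5_usable(10, 250))  # 2250 GB, ~2.25 TB
print(raid5_usable(10, 146))  # 1314 GB, ~1.31 TB
print(raid6_usable(10, 250))  # 2000 GB, 2 TB
```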
Re:Time to do some reading (Score:2)
Hetfield, being a big, dumb idiot, was depicted as a big, dumb idiot relegated to simple phrases, such as "NAPSTER, BAD!" He was rather cave-man-like. I believe one full phrase went, "MONEY, GOOD! NAPSTER, BAD!"
Re:Time to do some reading (Score:3, Insightful)
Why? They make for higher-capacity RAIDs.
The more devices the controller has to be able to handle, the more expensive it is; also, although more drives mean better overall performance, the overall efficiency goes down.
The big thing is that home users have pretty much been locked out of SCSI. Even a single SCSI drive yields better performance than an IDE drive.
If SCSI drives were offered widely in home PCs, there would obviously be a performance in
Re:Time to do some reading (Score:2)
I know, it's ironic that when computers are advertised, they usually have the most blazing-fast CPU and crap for the rest (though this is improving). For the money, more RAM would probably help users out more than a SCSI disk (or 2, since the advent of high-end video games, MP3s and movies has really caused the demand for storage to soar).
Re:Time to do some reading (Score:2)
I doubt it, since 90+ percent of them are running Windows, and Windows starts to swap before you hit the desktop, no matter how much memory you have. That means the hard drive is your bottleneck, not memory.
On Linux, BSD, or pretty much anything not Windows, I'd agree: you need to put in enough memory that it's rare
Re:Time to do some reading (Score:2)
That's faulty logic: it's saying that businesses don't want more space. Of course they want more space, higher capacity, more reliability, and faster speeds, and you don't lose reliability at all if the drives are equally reliable and equally numerous.
However, as to the original poster: just buy some damn SATA drives.
Re:Time to do some reading (Score:2)
Actually, some businesses *DO* want the capacity.
We bought two Promise 15100 arrays and put in 30 250GB drives. Sometimes an array is wanted to be large, sometimes redundant, and in this case, both.
My Guess (Score:5, Insightful)
The solution to this reliability problem is RAID. Two RAID levels are relevant here (there are more, but this is a simple explanation): RAID 1, which is just a mirror, and RAID 5, which is striping with parity.
With RAID 1, if you have 500 GB of data, you would need two 500 GB drives. You lose 50% of the capacity you buy. The other option is RAID 5, where you lose (1/number of disks). So you could store 500 GB of data on 6 100 GB disks. This way you've only lost 100 GB of storage to redundancy, as opposed to 500 GB.
So when businesses want to store large amounts of data, it's more economical to use many smaller drives than a few large drives. Even if you don't need the redundancy (for example, the disk is just being used for temporary storage while working on large digital picture or video files), it's still better to use many small disks. While a single 500 GB drive will only go so fast (let's just say 60 MB/s sustained), by using a RAID you can multiply that. So by using 5 100 GB drives, you might be able to sustain 300 MB/s (assuming the bus can keep up, etc). Even if you only scale at 50% (that would be 150 MB/s), that's still 2 to 3 times faster than a single drive. That performance can save you money.
So, if you can afford it, you can get much better performance or economics from using multiple smaller drives than from one large one.
That's my theory/understanding. Begin tearing it apart!
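The capacity and throughput reasoning above can be sketched numerically. The 60 MB/s rate and 50% scaling efficiency are the comment's illustrative assumptions, not measurements:

```python
# Fraction of raw capacity lost to redundancy, and a naive striped-
# throughput estimate, using the comment's illustrative numbers
# (60 MB/s per drive, 50% worst-case scaling). A sketch, not a benchmark.
def raid1_overhead():
    return 0.5                      # mirror: half the raw capacity

def raid5_overhead(n_drives):
    return 1.0 / n_drives           # one drive's worth of parity

def striped_throughput(n_drives, per_drive_mb_s=60, efficiency=1.0):
    return n_drives * per_drive_mb_s * efficiency

print(raid5_overhead(6))                       # 1/6: lose 100 GB of 600 GB
print(striped_throughput(5))                   # 300 MB/s with ideal scaling
print(striped_throughput(5, efficiency=0.5))   # 150 MB/s at 50% efficiency
```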
Re:My Guess (Score:4, Informative)
Most decent external RAID units today have dual hot-swappable power supplies and fans. However, there is still only a single backplane and RAID controller board (IBM PowerPC chips are very popular for this) involved. I've had both a backplane and a controller fail on me in the span of 2 years, in both cases taking all the data with them. These units were 6x200GB IDE drives, 1TB usable, 1 parity drive, and we had several cold spares available to hot-swap in on a failure.
Sure, I agree that statistically your drives, fans and power supplies are much more likely to fail than the backplane or controller, but it can still happen.
Never forget the importance of having backups, and make sure you can recover from them as part of implementing your backup solution. (1-month rotation of Ultrium tapes here.)
There is a solution to the above, but it's very costly, and that's RAID over distributed storage (iSCSI and the like).
Re:My Guess (Score:2)
Re:My Guess (Score:3, Informative)
Of course, by the time you spent the money on this type of setup, you could probably have purchased another complete machine, with another array in it, and used software to handle redundancy and updates to the array. We did this with our S
Re:My Guess (Score:2)
Of course, by the time you spent the money on this type of setup, you could probably have purchased another complete machine, with another array in it, and used software to handle redundancy and updates to the array. We did this with our SQ
Re:My Guess (Score:2)
However, every RAID unit I've dealt with has at least had a slot for a redundant controller. Of course, these are SCSI RAIDs. I guess now you know what the price difference is all about.
That said, unless there's something extremely screwed up about the design of your RAIDs, t
Re:My Guess (Score:2)
For example, if you send a block of 00000 to your RAID array, and the controller barfs and actually tells the drives to write 00100, then it doesn't matter if all your drives are okay, the data on them is actually wrong, meaning that your controller corrupted your data.
Re:My Guess (Score:2)
Also, while this is theoretically possible, I've never seen it happen. I troubleshoot RAIDs for a living, and I probably average about a controller a day between my various fixtures (several different chassis from several different manufacturers) over the last 2 years, so I don't think that's due to lack of exposure. In my experience, controllers either work
Re:My Guess (Score:2)
Re:My Guess (Score:2)
Interesting. That's a little beyond t
Re:My Guess (Score:2)
Re:My Guess (Score:3, Informative)
Re:My Guess (Score:3, Interesting)
The largest benefit is performance. Gamers invest so much in their system bus, CPU, and memory, but disk I/O is 5 orders of magnitude slower. If performance is key, a small investment in SCSI improves disk-intensive apps considerably.
1. IDE requires CPU cycles. SCSI buses have embedded ICs that handle queuing of data and such, freeing the CPU to perform other tasks.
2. IDE channels are shared. Most IDE
Re:My Guess (Score:4, Insightful)
Re:My Guess (Score:4, Informative)
This has, in large part, disappeared with the advent of UDMA. It was true that IDE was very cycle-expensive a decade ago, when IDE really meant Integrated Drive Electronics. The IDE "interface" was just a set of tri-state latches, and the CPU would be responsible for pushing and reading every single byte. If you ever look at the pinout for an IDE cable, it's no surprise that it very closely resembles the ISA bus. Another historical note: ATA means AT-Attachment, because the first set of IDE drives that were really popular were designed to attach to the IBM PC AT (the successor of sorts to the IBM PC XT) bus.
Now, processors queue dma requests in and out of the drive and the "interface" really has grown up to be more of a "controller." They're not as complex as the SCSI adapters, of course, but then again, SCSI is a much more complex signaling system.
2. No Longer True.
What you're trying to describe is called "bus disconnect." I'm not sure which side of the bus was responsible; however, the idea is that while a drive was processing a command, the bus was locked until the command finished.
Note, the first version of SCSI did not have disconnect either. However, with many more devices sharing the bus, bus contention was more severe, especially with slow devices like tape drives and CD-ROMs, so severe that disconnect became a necessity rather than just a feature.
SCSI supports disconnection as well as Tagged Command Queueing. TCQ allows the host to issue multiple outstanding commands to the device. The device is allowed to complete these commands out of order. Many drives will reorder the requests to take advantage of the head movement.
Recent revisions of IDE include support for TCQ.
I will add, however, that it is still worthwhile to have only one device per channel. Compare this to putting more than two 15K drives on a U160 channel.
3. Not even remotely true. SCSI is a parallel bus, much like IDE, ISA, or half a dozen others. It's only possible for one device to drive the bus at one time. This is clearly evident since a few of the lines in the SCSI cable are used to indicate the target of the bus transaction. There is only one set of these signals; therefore, there can only be one target.
Also, the electrical interface for Serial ATA is designed with hot-swap in mind.
While your first suggestion is accurate, disk I/O is very slow and SCSI equipment tends to be of better quality than IDE hardware. SCSI drives with higher spindle speeds have much lower latency, which can make a dramatic difference compared to a similar computer with IDE drives. However, that difference is no fault of IDE's. I would encourage you, in the future, to be more accurate with your information.
If you believe I have written inaccurately, I would recommend reading the draft documents from INCITS T13, the ATA technical committee.
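The TCQ point a few comments up (drives reordering queued requests to reduce head movement) can be illustrated with a toy elevator-style sort. The block addresses and the |distance| seek model below are made up for illustration; real firmware also accounts for rotational position:

```python
# Toy illustration of why out-of-order completion helps: servicing
# queued requests in arrival order vs. sorted (elevator) order.
# Seek cost is modeled simply as |distance| between block addresses.
def total_seek(order, start=0):
    cost, pos = 0, start
    for block in order:
        cost += abs(block - pos)
        pos = block
    return cost

queue = [900, 10, 850, 30, 800]      # arrival order (hypothetical blocks)
print(total_seek(queue))             # naive FIFO servicing
print(total_seek(sorted(queue)))     # elevator-style reorder: far less seeking
```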
Re:My Guess (Score:2)
Re:My Guess (Score:2)
I suggest you contribute this (and more) to Wikipedia [wikipedia.org].
Re:My Guess (Score:2)
Small Computer System Interface
Check the ANSI document X3.131:1994[1999]
You may be thinking of some serial adaptations of SCSI, like SBP (Serial Bus Protocol) or SAS (Serial Attached SCSI).
Re:My Guess (Score:2)
The second is that in terms of performance there is reduced efficiency for every drive added to a RAID.
The third is that controllers that handle an increased number of drives are orders of magnitude more expensive, and you can only have 15 devices on a SCSI chain. With 146GB drives that gives you a max of 2.1TB on a chain; with 250GB drives that becomes 3.7TB on a chain, yeah it's
Re:My Guess (Score:2)
That is all, the rest is absolutely correct, except to mention that if you're running a Windows OS, the maximum volume size is 2.4TB anyway.
Re:My Guess (Score:2)
What sort of idiot runs anything that needs multiple TB of storage and doesn't keep extra drives on hand? Also, in a RAID 5 configuration, you haven't lost data, and the RAID isn't down, simply because a drive has failed.
"That is all, the rest is absolutely correct, except to mention that if you're running a Windows OS, the maximum volume size is 2.4TB anyway."
True that, althoug
Re:My Guess (Score:2)
Re:My Guess (Score:2)
Maybe, fortunately with harddrives it generally doesn't work that way
"Besides, replacing drives costs money too, warranty or not, because it ta
Re:My Guess (Score:2)
Very true, except for the UPS issue. Seriously, I had 3 modems coming for some remote offices last week, one was lost, crushed, and delivered to the wrong address, and the other two were two days late.
small drives good -- big drives cheap. (Score:2)
If you're only worried about how much data you can put on a chain, SCSI has a two-level addressing scheme. Each 'target' (usually a drive) can have up to 8 or 15 logical units on it... It's n
SAS - development (Score:2)
Well, the storage people do not think it wise to spend the money...
iSCSI and SAS are good things!
(Pity there is not a MacOS X driver for iSCSI...)
regards
John Jones
Just a thought... (Score:2)
Maybe it is due to the fact that SCSI storage has typically doubled in size... 9.1, 18.2, 36.4, 72.8, 145.6... Could it be that they're currently testing 291.2GB disks?
My $0.02.
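For what it's worth, the doubling ladder above does extrapolate to exactly that figure:

```python
# The historical SCSI capacity ladder quoted above, each step roughly
# doubling; extrapolating one more step past 145.6 gives the guessed
# 291.2 GB. Purely arithmetic, not a roadmap claim.
sizes = [9.1]
while sizes[-1] < 200:
    sizes.append(round(sizes[-1] * 2, 1))
print(sizes)  # 9.1, 18.2, 36.4, 72.8, 145.6, 291.2
```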
THE ANSWER (Score:5, Informative)
Re:THE ANSWER (Score:3, Interesting)
Not long ago I had to set up a several-terabyte array (around 4 TB) using SCSI drives. We were constantly replacing the damn things. And this was supposedly quality hardware from Sun. Now, with as many drives as we had, there were bound to be failures. Eventually the failure rate stabilized at about 1 or 2 drives per month, a rate which continues to this day, some 3 years later.
Previous to that array I had helped set up a similar system usi
Re:THE ANSWER (Score:2)
Re:THE ANSWER (Score:3, Interesting)
I'm very curious which Sun array this is, and which drives you are using.
I've worked in the Sun market for well over a decade, and I haven't seen failure rates like you're describing since the old Seagate 2.9G 5-1/4" full-height drives they used to have in their "Mass Storage" cabinets (the ones that looked exactly like a SPARCcenter 2000)... and that was only after the drives were out of production for a few YEARS (all replacements were refurbs).
My guess is you have serious environmental issues... heat/h
Re:THE ANSWER (Score:2)
You really should check the Seagate 18.2 GB FC-AL disks. They're crap. The firmware is crap, the drive is crap, and the failure rate is WAY WAY WAY too high.
I can't tell you how many times I've seen an entire loop on an A5200 go offline because a single disk was failing.
Piece of
The active system can't access the disks, so it attempts a failover.
Re:THE ANSWER (Score:3, Interesting)
THE ANSWER just isn't right (Score:2)
They do exist! (Score:5, Informative)
The real reason is that when you move up to higher rotational speeds to reduce latency, you have to reduce density relative to the motion of the disk under the head, so a 10K drive can generally pack only 60%-ish as much data per inch as a 7200RPM drive.
The same can be seen in 15K disks, which are much lower density than their 10K counterparts. The 15K platters are smaller too, to keep them from flying apart.
Do you remember when the 5400RPM disks had higher capacity than the 7200 ones? I sure do, it was for the same reason.
Until the latency of the read-write head improves this will be the case.
Re:They do exist! (Score:4, Insightful)
I've heard some things about the new Hitachi 400GB drive being optimized for TV set-top boxes. Does that mean that it's optimized for linear reads/writes? If so, why did they not decrease RPM in order to gain more capacity?
Re:They do exist! (Score:2)
Re:"Every now-and-then", or "all the time"? (Score:2)
Re:They do exist! (Score:2)
You do mean '.25 TB,' yes?
1024 megabytes = 1 gigabyte; 1024 gigabytes = 1 terabyte.
Re:They do exist! (Score:2)
A byte is 8 bits. 250GB == 2Tb.
Pegasus probably meant TB, and AKC probably knew that.
Re:They do exist! (Score:2)
Aye. Unfortunately, as so many people don't know the difference, I tend to assume, when they're talking about hard drives, that they mean bytes, and when talking about network speeds, they mean bits.
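The bytes-vs-bits conversion in this subthread is just two factors, sketched here for clarity:

```python
# 250 gigabytes expressed in terabits: bytes-to-bits is x8,
# giga-to-tera is /1000 (decimal units, as in the parent comments).
gigabits = 250 * 8
terabits = gigabits / 1000
print(terabits)   # 2.0 -- so 250GB == 2Tb, as the parent says
```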
I dont know... (Score:2)
Thin client servers (Score:2)
...provides more points of failure... (Score:3, Funny)
Yeah, that's a problem. It's much better to reduce potential points of failure... preferably down to a single point of failure.
Or is that not what you meant?
Re:...provides more points of failure... (Score:2)
Parallel vs. series (Score:2)
Whether a single point of failure is better than multiple points of failure depends on whether the points of failure are in parallel (e.g. RAID 1) or in series (e.g. RAID 0).
Too slow to be useful? (Score:5, Informative)
Imagine you have a 1TB drive but are stuck at a 100MB/sec max sequential transfer rate. It takes you 2.7 hours to read/write the entire drive. And that's for _sequential_ access. It gets ugly for random seeks.
A similar speed 10TB drive will take you more than a day (27+ hours) to read sequentially.
Before the point where it takes too long to read an entire single drive you might as well start using multiple drives to add capacity rather than having bigger drives.
Taking too long is subjective, but I'd say this: how long can you make your boss/customer wait whilst you are restoring an entire disk image from backup? 27 hours or 2.7 hours? or 25 minutes?
So 70GB would be about the limit if you have impatient users and bosses.
Larger capacities are OK if they are to hold data that aren't important enough to be backed up, and don't require masses of data to be available quickly. Or you are doing mirroring and read speeds are important but write speeds aren't as important (but remember that restoring from backup = writing
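The read-time figures above are straightforward to reproduce (decimal units, sequential transfer only):

```python
# Hours to read a whole drive sequentially at a fixed transfer rate,
# reproducing the ~2.7-hour (1 TB) and ~27-hour (10 TB) figures above.
def hours_to_read(capacity_tb, rate_mb_s=100):
    seconds = capacity_tb * 1_000_000 / rate_mb_s   # 1 TB = 1,000,000 MB
    return seconds / 3600

print(round(hours_to_read(1), 1))    # ~2.8 hours for 1 TB at 100 MB/s
print(round(hours_to_read(10), 1))   # ~27.8 hours for 10 TB
```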
Re:Too slow to be useful? (Score:2)
If you have 20GB of data on your 100GB drive and then ghost that to a 1TB drive, you're copying 20GB of data. If you then ghost that 1TB drive to a 10TB drive, you're still only copying 20GB.
Re:Too slow to be useful? (Score:2)
Unless you consider the stupidity of spelling/grammar trolls, who live in such shame they have to convince their pathetic little minds of superiority. They do this of course by pointing out the mistakes of others in the most anal fashion, usually where those mistakes matter least.
Re:Too slow to be useful? (Score:3, Informative)
Re:Too slow to be useful? (Score:3, Interesting)
The _evidence_ of actual transfer rates is more important than your "important fact".
This might be helpful [storagereview.com]. Select WB99 transfer rate - Begin.
If you have evidence of significantly faster single drives do let me know.
Speed. (Score:2)
Re:Speed. (Score:2)
And there are always those who want lots of larger drives.
Re:Speed. (Score:2)
That's exactly the point. What do you think limits the bandwidth of, say, a database?
you can only transfer the data from those requests at 320MB/s max
I am pretty sure Fibre Channel is faster than that. That's what fast arrays are hooked up with, anyway. These are generally independent boxes, with their own highly sophisticated and intelligent controller and a really fat pipe.
And there are always those who wants lots o
Re:Speed. (Score:2)
That is a little different, but there you're still maxed at 1Gb/s tops, because that is the fastest network link you're looking at. Since the drives can each put out 320MB/s and in that case would be doing so in parallel, you still can't significantly improve throughput beyond 4 drives
Nobody wants it (Score:2)
Reliability (Score:4, Interesting)
Re:Reliability (Score:2)
I haven't seen any of the roadmaps recently, but it has been a while since the 146GB drives came out, so it's probably time for a bump in the next 3-6 months.
The reason is speed. (Score:3, Informative)
1) RPM. It is easier to spin a 2.5" platter at 15K than a 3.5" platter. (Someone else can figure out the additional energy, but I would guess more than double the juice, assuming uniform density.)
2) I/Os per second. In large arrays the driving factor is not necessarily throughput but I/Os per second, which leads to more transactions per second for your server farm. So more spindles = more I/Os per second.
3) Access time. The bigger the drive, the longer it takes the drive's processor to position the head, increasing access times and decreasing I/Os per second. I know it's a trivial amount of time, but it adds up over millions of I/Os.
4) Error correction. I cannot speak for IDE but each block on a SCSI drive has an Error Correction Code (ECC) which helps the drive recover from read errors. Again minimal.
5) Cynical answer. Smaller drives means your drive company sells more product to meet a given capacity.
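The spindles-vs-I/Os-per-second point (items 2 and 3 above) can be sketched with rough service-time arithmetic. The seek and rotational latency figures below are ballpark assumptions for illustration, not numbers from the comment:

```python
# Rough random-I/O estimate: per-disk IOPS from average seek time plus
# half a rotation, scaled linearly by spindle count. Latency figures
# are ballpark assumptions for 10K/15K drives of this era.
def iops_per_disk(seek_ms, rpm):
    half_rotation_ms = 0.5 * 60_000 / rpm   # avg rotational latency
    return 1000 / (seek_ms + half_rotation_ms)

def array_iops(n_disks, seek_ms, rpm):
    return n_disks * iops_per_disk(seek_ms, rpm)

print(round(iops_per_disk(4.7, 10_000)))    # one 10K spindle: ~130 IOPS
print(round(array_iops(8, 3.8, 15_000)))    # eight 15K spindles: ~1379 IOPS
```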
Educational point: SCSI is a protocol, like IP or TCP. It can be tunneled through or carried by anything.
SPI - SCSI Parallel Interface (old school).
FCP - Fibre Channel Protocol.
SAS - Serial Attached SCSI. SAS can also tunnel SATA.
iSCSI - SCSI in TCP (not raw Ethernet).
SBP - Serial Bus Protocol (FireWire).
ATAPI - yep, SCSI over IDE, so your CD-ROM works.
many others.
Cost and reliability (Score:2)
Capacity vs. Speed (Score:3, Interesting)
It's called "short-stroking" (Score:2)
Re:God that's just sad (Score:2)
man!
OT: Use Konqueror! (Score:2)
Maybe that's why the
Re:What speed are most SCSI drives? (Score:2)
Re:What speed are most SCSI drives? (Score:4, Interesting)
Everyone is going serial. USB, SAS, Serial ATA, etc. Time to invest in Kellogs.
Oops, wrong "cerial".
(sorry for the pun, couldn't help it).
Re:What speed are most SCSI drives? (Score:2)
Re:What speed are most SCSI drives? (Score:3, Interesting)
Re:What speed are most SCSI drives? (Score:2)
Re:What speed are most SCSI drives? (Score:2)
Re:What speed are most SCSI drives? (Score:3, Interesting)
Re:What speed are most SCSI drives? (Score:4, Informative)
As for real performance, my old 18G 7200 RPM IBM SCSI drives are faster than my brand-new SATA Raptors in real-world applications (compiling the Linux kernel, for example).
So here's what I do. I use my SCSI drives for my everyday stuff, and archive on the SATA drives (MP3s, old source / packages, etc.). That way I get my performance and reliability, and space. Since I have two of each, I just RAID mirror.
As for real-world server applications, we run some large RAID arrays. We don't need the space as much as we need the performance you get with dozens of spindles spread over multiple channels on 64-bit controllers.
Re:What speed are most SCSI drives? (Score:4, Informative)
It's all about those command queues; they let the computer spit commands at the disk without having to see their immediate completion.
I actually get better performance with my SCSI drive _mounted over NFS_ than I can with my previous local 40GB ATA-66 drive.
Re:What speed are most SCSI drives? (Score:2)
IDE sucks the life out of a PC; even newer 3GHz+ PCs still pause when you put in a floppy or eject a CD-ROM in Windows.
I can put a bunch of slow IDE drives in a Pentium 2 box, and over TCP there's no performance hit loading/saving files. (In Windows.)
BTW, this is also how diskless terminals or low-class CPUs can be so smooth. My laptop on the network has slow IDE HD issues, so I put apps on the server and load, qui
Re:What speed are most SCSI drives? (Score:3, Informative)
That's a drive and/or Windows issue. When you insert a CD, the CDROM has to spin it up to read it, and then Explorer.exe (not Internet Explorer, Windows Explorer, A.K.A. the Windows "shell") immediately wants to know what's in it, so you have a slight lag, depending on background services, the drive, the media condition, etc. You can see what's going on by opening up Explorer while there's
Re:What speed are most SCSI drives? (Score:2)
Re:What speed are most SCSI drives? (Score:2)
The normal IDE channels are for CD-ROMs only; most HDs go on the RAID controller (even if you don't do RAID). That separates it, like you said.
USB floppy, so I can unplug it when I don't need it. There are still OSes out there you need a floppy for.
Re:What speed are most SCSI drives? (Score:2)
hdparm -d1 /dev/hda (substituting your drive's device)
In Windows... uh... anyone care to enlighten him?
I actually benchmarked that one day... (Score:2)
Re:What speed are most SCSI drives? (Score:2)
Also, a 10K RPM SCSI drive is a cheapy; generally they are 15K RPM.
You also generally use SCSI in a RAID configuration. SATA RAID devices generally don't compare to intelligent SCSI RAID controllers.
There is also the quality and warranty on the drives, usually a 5-year or longer warranty with priority replacements. The replacements are usua
Re:What speed are most SCSI drives? (Score:3, Interesting)
Re:It's the RAID, silly... (Score:2)
Re:SCSI not meant for high capacity (Score:2)
Re:SCSI not meant for high capacity (Score:2)
Re:Always That Way (Score:2)
No, traditionally, SCSI provided higher capacity than IDE.. For example, when the largest IDE drive you could get was 130MB, 2GB SCSI drives were available.