
Ask Slashdot: Is the Gap Between Data Access Speeds Widening Or Narrowing? 92

New submitter DidgetMaster writes: Everyone knows that CPU registers are much faster than level1, level2, and level3 caches. Likewise, those caches are much faster than RAM; and RAM in turn is much faster than disk (even SSD). But the past 30 years have seen tremendous improvements in data access speeds at all these levels. RAM today is much, much faster than RAM 10, 20, or 30 years ago. Disk accesses are also tremendously faster than previously as steady improvements in hard drive technology and the even more impressive gains in flash memory have occurred. Is the 'gap' between the fastest RAM and the fastest disks bigger or smaller now than the gap was 10 or 20 years ago? Are the gaps between all the various levels getting bigger or smaller? Anyone know of a definitive source that tracks these gaps over time?

  • by NotInHere ( 3654617 ) on Friday October 02, 2015 @09:05PM (#50649183)

    The distance between the "fastest" and "slowest" gets larger and larger, but the gaps are getting smaller because things like SSDs fill them.

    • by Anonymous Coward

      The distance between the "fastest" and "slowest" gets larger and larger, but the gaps are getting smaller because things like SSDs fill them.

      Except the bus it (and everything else) is connected to hasn't kept up. Try building a computer to solve real problems, and you'll quickly learn this.

    • Each memory address is normally associated with 8 bits of data (not counting correction bits). But processors nowadays routinely consume 64 bits at a time. That means getting the data from 8 different addresses simultaneously. Things would be simpler if they put all those 64 bits at one address --if every single address had 64 bits of data associated with it. In the previous processor generation, gobbling 32 bits at a time meant accessing 4 different addresses simultaneously, and the total accessible ad
      • Good idea. So good, in fact, you're getting close to how it's actually done: data is moved in parallel in bulk. For example, when your program accesses an 8-bit byte, the 256-bit (or larger) chunk containing it (called a cache line) gets read from DRAM into cache. There is no address space sacrifice because once the cache line is read, additional logic selects the desired byte from the cache line using the low-order bits of the address.
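The byte selection described in the reply above can be sketched in a few lines. This is only an illustration, assuming the 256-bit (32-byte) line size mentioned in the comment and a power-of-two line size in general; the function name `split_address` is made up for the example.

```python
LINE_BYTES = 32  # assumption: 256-bit cache line as in the comment; many CPUs use 64 bytes

def split_address(addr, line_bytes=LINE_BYTES):
    """Split a byte address into (line_base, offset_within_line).

    line_bytes must be a power of two for the bit masks to work.
    """
    offset = addr & (line_bytes - 1)      # low-order bits pick the byte inside the line
    line_base = addr & ~(line_bytes - 1)  # remaining bits name the line fetched from DRAM
    return line_base, offset

base, offset = split_address(0x1234)
print(hex(base), offset)
```

Every address in the same 32-byte window maps to the same `line_base`, which is why fetching whole lines costs no address space.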
      • by Mr.CRC ( 2330444 )

        Memory busses already transfer more than one octet at a time, in fact, more than the 64-bit architecture size for typical implementations of x64. Having the effective address space for 64-bit words be 61 bits isn't really much of a problem. Who is going to have nearly 2.15GiGiB of memory (attached to one CPU) any time in the next century?

        I program the TI C2000 series 32-bit microcontrollers, where the 16-bit word size can be a significant headache when trying to deal with 8-bit IO streams.

        So I'd opt to

  • ...but not just because someone else is too lazy to do it themselves. Doing the maths would have taken less effort than writing this /. fill-piece.

  • by i.r.id10t ( 595143 ) on Friday October 02, 2015 @09:14PM (#50649209)

    Does it matter? Fast CPU, fast RAM, fast disks is like having no speed limits on every race track in the world - but in order to get from track to track you have to go on the interstates or perhaps back country roads (PCI bus, etc). Sure, each component is fast and getting faster, but the way those components connect to each other hasn't changed all that much...

    • Re: (Score:3, Interesting)

      by Anonymous Coward

      I disagree, the interconnecting paths may not be keeping up as fast but look back at the original ISA bus or before that S100. The PC/AT 16 bit slots more than doubled the speed of the original ISA. Specialized video busses were all the rage not that long ago. PCI has grown to become PCIX and multi lane. For us greybeards, that a serial bus operated faster than a parallel bus remains one of those great mysteries.

      • by Anonymous Coward

        PCIE. Not x. X died a while ago.

        Serial outperforming is simple. Multiple serial busses with switching on each end. They're independent but they send packetised data in parallel.

    • Yes. It matters. Price also matters. If the price/performance gap is eliminated then major architectural changes are both possible and likely. We're already seeing this with tape vs hard drives, where people are just archiving to external hard drives instead of tape. Think about what would be possible if RAM were cheaper than hard disks and also non-volatile. Then everything could be in RAM and even "saving to disk" becomes a thing of the past. Likewise, if SSDs become just as fast as RAM, then why dif

      • wait until they start sticking SSDs on the RAM bus

      • I had RAM on the 8-bit ISA bus back in the PC-1 days, but back then the CPU was so slow that wasn't a serious impediment. Today, however, we have specialized (and seriously fast) memory buses, and we have massive interconnects between the CPU and the chipset, and we have large numbers of PCI-E lanes. High-speed SSDs are designed to sit right on a useful number of PCI-E lanes. These days a PC actually does have massive bandwidth, in a way it never did before; it was crippled between basically the time PCs go

    • by Anonymous Coward

      Yeah, things like this matter. I was once told by a person who worked for Microsoft that this person was assigned the task of overseeing some research for an upcoming Xbox system. The research involved having prototype units run various pieces of code to see which piece of code ran the fastest. It was rather assumed that the different versions of the code were equally compatible, coming up with the same result. However, automated benchmarking was measuring speed.

      See, it doesn't even matter which piece o

    • by godrik ( 1287354 )

      That is so untrue. What are you talking about!?
      PCI-express is much faster than the late AGP, PCI and ISA buses; you get over 15GB/s from one GPU to another. SATA and SCSI have had a good run, but are being replaced by PCI-express for high-throughput devices. In clusters, Infiniband decreased latency massively compared to Ethernet, and gives you bandwidth of over 100Gb/s.

      So sure, the interconnect is a factor slower than the devices, but that has pretty much always been the case.

    • Yes, it does matter.

      For example, if you want to sort objects which do not fit at one level, the sorting may spill over to the next level of much slower memory. This imbalance of speed is relevant regardless of the interconnect speed limits, because the delta itself is the culprit.

      There are some ways to improve on that.

      From Wikipedia > Sorting_algorithm > Memory_usage_patterns_and_index_sorting

      That is one example of a solution to a problem that matters.
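One common way to handle the spill described above is an external merge sort: sort chunks that fit in the fast level, write each sorted run to the slow level, then merge the runs sequentially. The sketch below is an assumption about what such a solution could look like, not a quote from the Wikipedia section; `external_sort` and `chunk_size` are illustrative names, and temporary files stand in for the slower memory level.

```python
import heapq
import tempfile

def external_sort(values, chunk_size):
    """Yield ints from `values` in sorted order, sorting at most chunk_size items at a time."""
    runs = []   # one temporary file per sorted run
    chunk = []

    def flush():
        # Sort the in-memory chunk and spill it to a temporary file (the "slow level").
        chunk.sort()
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{v}\n" for v in chunk)
        f.seek(0)
        runs.append(f)
        chunk.clear()

    for v in values:
        chunk.append(v)
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()

    try:
        # Each run is read back sequentially; heapq.merge keeps one item per run in memory.
        readers = [(int(line) for line in f) for f in runs]
        yield from heapq.merge(*readers)
    finally:
        for f in runs:
            f.close()

print(list(external_sort([5, 3, 9, 1, 7, 2, 8], chunk_size=3)))
```

The point of the design is that the slow level only ever sees sequential reads and writes, which is the access pattern where the speed gap hurts least.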
    • fast disks

      I showed your mom my fast disks, but she only had an ISA port.

  • by Anonymous Coward

    This could literally be answered with three google searches. '2015 l2 cache speed' 'ddr4 speed' '2015 ssd speed'

    • by jeffb (2.718) ( 1189693 ) on Friday October 02, 2015 @09:38PM (#50649285)

      Ancient scrolls of dubious provenance hint darkly that DDR4 was not the first inhabitant of the RAM slots we consider so permanent. Debased cultists still sometimes mutter chants mentioning "PC100", or even uncouth syllables such as "korr"...

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      This could literally be answered with three google searches. '2015 l2 cache speed'

      This /. article, plus one called "Casino lock flooring [...] play casino online" which 404s in Norwegian when you click on it.

      'ddr4 speed'

      As it turns out, a whole bunch of really technical reviews on DDR4 memory, plus 'scope/test gear for testing DDR4 bus access. At least partially, potentially, useful, if you're prepared to wade through a bunch of dense stuff.

      '2015 ssd speed'

      A whole bunch of MacBook reviews/unpaid ads, follow

      • by Anonymous Coward

        (Whether /. counts as that is an entirely different matter, though there's usually at least one person who appears to know what they're talking about and will answer the question usefully and honestly without being a smug stuck up prick.)

        So, you're ruling yourself out for an answer then?

        • by HiThere ( 15173 )

          Yes, he was ruling himself out as unable to answer. So am I. And it would take a *LOT* more than a Google search to answer. I lean towards agreeing with the people who cite bus speed as the limiting factor, but I'm not sure, and there could certainly be special circumstances where something else was the limit.

          I *do* know that it's not an easy question to answer, and that any answer is going to depend for its correctness on a presumed workload. (Some things are CPU bound, and don't even use much RAM. Ot

  • by jeffb (2.718) ( 1189693 ) on Friday October 02, 2015 @09:29PM (#50649261)

    I'm not sure what a historic timeline of these ratios (not "differences", please) would gain you.

    These ratios can have a big impact on what algorithms and implementations you choose to maximize performance. I suppose if, say, the ratio of RAM to disk speed increased by a factor of 10 over the decade before last, then decreased back to its original ratio in the last decade, it might be worth trawling through some old papers (or old source trees) to revisit lessons learned in the earlier period -- but that seems like a bit of a stretch.

    If you're just curious, it shouldn't be too hard to generate timelines of CPU cycle speeds, cache and RAM latencies and bandwidths, disk performance, and so on. But really, each of those has enough factors that a simple "ratio" would probably conceal more than it illuminates.

    • I for one would like to see a historic timeline of absolute numbers for CPUs, memory, and mass storage. But that is not so easy to do. I have found little snippets here and there on Wikipedia, but not even a single master list of CPUs, let alone more hardware. There are master lists of CPU benchmarks but not spanning generations and radically different CPU sizes obviously. Here's DDR3 RAM: [] DDR4, not really there in Wikipedia, though there are some articles that talk aro
  • Writing a paper (Score:3, Insightful)

    by Anonymous Coward on Friday October 02, 2015 @09:30PM (#50649263)

    for a CS or IT class?

  • by fleabay ( 876971 ) on Friday October 02, 2015 @09:34PM (#50649269)
    Yes, yes it is.
    • Yes, yes it is.

      It depends on your diet and how much exercise.... oh, you mean tech, not your waist line....

  • by Anonymous Coward

    is still in effect. Every memory speed increase is perforce dwarfed by speed increases above in the hierarchy. The ongoing stall in increasing CPU clock speeds has obscured the nature of the problem, but we'll never go back to the 1980s when CPUs were not fast enough to cause memory hot spots.

    No one has attacked the problem successfully except for architects who have designed split-phase memory transactions for loads and stores along with the capability for many of them to be simultaneously in flight. Th

    • My first computer was a ZX81.
      This had a 16K RAM pack, with 250ns RAM.
      This could read a random byte of RAM in 470ns.
      My current system can access a byte of RAM in around 50ns.

      That is a factor of ten improvement, when RAM size has increased a million fold.
      Caches ameliorate this.
      But they do not eliminate it.
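The arithmetic in the comment above can be worked through directly. The sketch uses only the figures quoted there (470ns, 50ns, 16K, a million-fold size increase); the "full sweep" model is an added assumption: one uncached random byte access at a time, with no parallelism, which real systems avoid whenever they can.

```python
def latency_speedup(old_ns, new_ns):
    """How many times faster a single random access has become."""
    return old_ns / new_ns

def full_sweep_seconds(size_bytes, access_ns):
    """Assumed model: time to touch every byte once, one uncached random access per byte."""
    return size_bytes * access_ns * 1e-9

zx81_bytes = 16 * 1024                  # the 16K RAM pack
modern_bytes = zx81_bytes * 1_000_000   # "a million fold" larger, per the comment

print(latency_speedup(470, 50))               # roughly a factor of ten
print(full_sweep_seconds(zx81_bytes, 470))    # milliseconds on the ZX81
print(full_sweep_seconds(modern_bytes, 50))   # minutes on the modern system under this model
```

Under that pessimistic model, walking all of memory takes about a hundred thousand times longer today than it did on the ZX81, which is the gap caches try to hide.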

  • by kuzb ( 724081 ) on Friday October 02, 2015 @10:23PM (#50649423)

    "Everyone knows that CPU registers are much faster than level1, level2, and level3 caches."

    I'd argue that most people don't even know what a CPU register is, never mind what it's faster than.

    • And apparently, now everybody knows that 15 minutes can save you 15% or more on car insurance. Well, Screw that! We need to get the word out. CPU registers are faster than level 1 cache! Suck it, Geico.
  • by gman003 ( 1693318 ) on Friday October 02, 2015 @10:53PM (#50649509)

    Originally, there were CPU registers and memory. Then there were registers, memory, and disk. Then there were registers, SRAM cache, memory, and disk. Then there were registers, L1 cache (on CPU), L2 cache (on mobo), DRAM, and disk. Then the L2 moved onto the CPU. Then there was L3. Then SSDs were added between RAM and disk. Now some chips have an L4 cache on the CPU package (but not the CPU die).

    Oh, and there's a difference between latency and bandwidth. DRAM latency has not significantly improved over time, particularly compared to DRAM bandwidth.

    And with multiple cores, some levels are core-specific while others are not. You can even have a bizarre situation where L1 cache is per-core, L2 cache is shared between two cores, and L3 cache is per-CPU (in SMP setups, that means main RAM is the first level shared among all cores).

    • And IBM has made external memory controllers with 16MB L4 cache in each of them, called "Centaur".

  • For a number of years, we were in the technological progress era, we're now in the commercial progress era.
  • by Anonymous Coward

    I think that a lot of good IT folks are looking at where the bottlenecks are within the technologies available and making implementation decisions in a much more detailed way. Right now, as has often been the case, the bus becomes the bottleneck for a lot of applications. Sure, there are memory-bound and I/O-bound problems that still see the gaps between memory speed and disk read/write speeds as an issue, but there are far more problems that are handicapped due to the system bus speed being hobbled, compa

  • by hkultala ( 69204 ) on Saturday October 03, 2015 @12:13AM (#50649685)

    The latency of RAM is improving very slowly, only something like a 2x-4x improvement in the last 20 years.

    Only the bandwidth of the memory is growing faster, and that's just because they have been putting more DRAM cells in parallel, always doing bigger data transfers and having a faster memory bus.

    The same is true for hard disk drive speed: the rotation speed dictates the random access latency, and the rotation speed of the average hard disk has only gone up from 4200 or 5400 to 7200 rpm in the last 20 years, meaning only a 1.7 or 1.33 times improvement in random access latency.

      Though replacing hard disks with flash-based SSD storage has improved latency by a huge margin.
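The 1.7x and 1.33x figures above follow from the usual rule of thumb that, on average, the platter has to turn half a revolution before the wanted sector arrives under the head. A small sketch, assuming that rule and ignoring seek time entirely; `avg_rotational_latency_ms` is an illustrative name.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution

for rpm in (4200, 5400, 7200):
    print(rpm, round(avg_rotational_latency_ms(rpm), 2))

# Improvement when moving to 7200 rpm is just the ratio of rotation speeds.
print(round(7200 / 4200, 2), round(7200 / 5400, 2))
```

Because the latency is inversely proportional to rpm, no amount of electronics helps; only spinning faster (or not spinning at all, as with flash) changes it.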

    • And CPU clock-speed hasn't really increased at all in the last 10 years. We are still at just over 3GHz. Of course we have had massive performance improvements in other areas.

    • by Robear ( 68955 )

      Replacing hard drives with SSDs still leaves another bottleneck. The disks have to connect to the cpu somehow. If they are internal (as in a home pc), then you only get a few disks, but they connect at PCIe speeds. If you need more disks, you go to a SAN. But then you're putting your disks at the end of *network* latencies; there's definitely a wall there. You can't cache your way out of the transmission delays on the SAN... Other solutions are used which essentially move the software and/or data closer to

  • There are orders of magnitude difference in the access times. Disk access is measured in milliseconds; memory access is measured in nanoseconds; register speed in picoseconds. Improving disk access by even 1 millisecond closes the gap.
  • A modern cheap laptop from today is faster than a Convex supercomputer I was using 20 years ago. But in that same time span disk seek times dropped from 15 milliseconds down to about 7.5 milliseconds which is only a factor of two.

    • A modern cheap laptop from today is faster than a Convex supercomputer I was using 20 years ago. But in that same time span disk seek times dropped from 15 milliseconds down to about 7.5 milliseconds which is only a factor of two.

      Manufacturers were exploring expensive strategies to speed up seek times, like multiple armatures, independent arms, etc. Then hybrid drives and massive arrays became things and there was no more reason to bother trying to improve seeks on spinning rust. We're all just hoping that it will be cost-effective to stop using it sometime soon.

  • by Anonymous Coward on Saturday October 03, 2015 @01:11AM (#50649843)

    20 years ago main memory was 10-14 ns, instruction cycle time was 2-4ns (Cray)

    Guess what? it still is.

    Memory has grown, it has gotten cheaper.

    What HASN'T changed? Access to memory. That is how Cray got its speed - instead of a single port to memory, it used a crossbar switch - 4 ports for each processor. 1 instruction bus, 2 input data busses, and one output bus; even I/O got its own port to memory; all with overlapping address/data cycles.

    The effect was that all of main memory worked at the speed of cache, thus the CPU had no need to waste silicon on cache memory - and the entire system ran full speed.

    What slows down the current systems? Memory access. Most systems only have a single port to main memory. Some servers and "high performance" desktops have dual ported memory. Yet even dual ported memory access is slow when you have to share it among 4/8 cores... plus I/O (which isn't dual ported). Interrupt latency on PCs is really horrible. Still only 15 IRQs? and have to share them? No direct vectoring? Forced interrupt chain actions? Even the old PDP 11 with ONE interrupt request line allowed direct interrupt vectoring (64 basic vectors) to reduce overhead.

    There hasn't been much innovation in architecture in over 20 years.

    • I am a dissenter from Moore's law. There are physical limits and we are near them. There is a physical limit on fab size, charge time, propagation time, signal speed, current and heat. There are also pricing issues: Would you pay $300 extra for faster ram? It could probably be done by using cache. I doubt if users would pay.
      If we could fit the entire system on a chip, Then you could speed up. But choice goes out. No choice at all.
    • there has been MUCH improvement to computer architecture, but most slashdotters think only desktop PCs and desktop PC-esque servers exist.

      a mainframe has a thousand or more channels to memory

      • by Morpf ( 2683099 )

        And even if you had a trillion channels to main memory, you'd still only improve bandwidth and keep the same latency.
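The bandwidth-versus-latency point above can be made concrete with a toy model. Everything here is an assumption for illustration: each channel completes one 64-byte transfer per 50ns latency period with no pipelining, and the function names are invented.

```python
def aggregate_bandwidth_gb_s(channels, latency_ns=50.0, bytes_per_transfer=64):
    """Toy model: independent channels add up; bytes per nanosecond equals GB/s."""
    return channels * bytes_per_transfer / latency_ns

def dependent_chain_ns(n_loads, latency_ns=50.0):
    """A pointer chase: each load needs the previous result, so extra channels cannot help."""
    return n_loads * latency_ns

for channels in (1, 4, 1000):
    print(channels, aggregate_bandwidth_gb_s(channels), dependent_chain_ns(1_000_000))
```

In this model the bandwidth column grows with the channel count while the time for a million dependent loads stays fixed, which is the distinction the reply is drawing.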

  • This article can shed some light on it: [] Looks like RAM is the laggard.
  • Calculating is easy (Score:4, Informative)

    by guruevi ( 827432 ) on Saturday October 03, 2015 @10:02AM (#50651157)

    You can look up the specs easily.

    Back in the 80286 days, there was not even an L1 cache; however, the memory and ISA bus ran at CPU speed, 8-20MHz. Hard drive latency was ~65ms.

    In the 80486 days L1 cache was introduced and L2 was sometimes available in (very) expensive modules. I remember buying 256KB for the same price as 16MB RAM. The L1 caches ran (if I remember correctly) at CPU speed, 1 cycle. However, the bus speed started to slow down compared to the CPU. The VLB ran at CPU bus speed ("local" bus) and was often used for graphics, but PCI (an inferior bus) ran at 33MHz, so for anything over 33MHz we started needing dividers. The RAM ran at 80-120ns, so it started being slower than the CPU bus. Hard drive latency, however, was down to ~30ms.

    In the Pentium age memory slowed even further compared to the CPU bus. Now it took several cycles to access memory, and buses ran even slower (still PCI mostly, eventually PCI-X (133MHz?), until PCI-e with its serial lanes came along). Hard drive latency improved to ~15ms.

    In the modern age, L1 caches have slowed even further, requiring 4 cycles for L1 cache and up to 30 for L3 caches. RAM access is slower still, with bus speeds about a quarter of a single CPU's, and sometimes 16 CPUs need to share those lanes. Peripheral bus speeds, however, have gone up, and PCIe 3.0 is now directly integrated into the CPU, 80486 VLB-style. Hard drives still have latencies of 10ms (we have a mechanical issue there), but even cheap SSDs can go down to ~1-2ms.
