Shared Video Memory and Memory Bandwidth Issues?

klystron2 asks: "Does shared video memory consume a huge amount of memory bandwidth? We all seem to know that a notebook computer with shared video/main memory will have performance drawbacks.... But what exactly are they? It's easy to see that the amount of main memory decreases a little bit, but that shouldn't make a big difference if you have 1GB of RAM. Does the video card trace through memory every time the screen is refreshed? Therefore consuming a ton of memory bandwidth? If this is the case then the higher the resolution and the higher the refresh rate, the lower the performance of the system, right? I have searched the Internet for an explanation on shared memory and have come up empty. Can anyone explain this?"
This discussion has been archived. No new comments can be posted.

  • by G4from128k ( 686170 ) on Tuesday January 06, 2004 @09:04AM (#7890066)
    Does the video card trace through memory every time the screen is refreshed? Therefore consuming a ton of memory bandwidth? If this is the case then the higher the resolution and the higher the refresh rate, the lower the performance of the system, right?

    Yes. The pixels on the screen are read out every single frame (i.e., 60 to 75 times each second). The DAC (Digital-to-Analog Converter) must be fed the pixel data every time, and with shared video there is no other place to hold this image data: main memory is the frame buffer. The product of the frame rate, resolution, and color depth tells you how much bandwidth is consumed.

    The exact performance impact is not easy to predict though. Where it gets tricky is with CPUs that have large L1, L2, and L3 caches. It is possible for the CPU to be running at 100% while the video is being read if the CPU is finding all the data and instructions in the cache. But if the CPU must access main RAM, then there will be competition.
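
    As a rough back-of-the-envelope sketch of that product (the resolutions and refresh rates below are just illustrative, not taken from the comment):

        # Sketch: bandwidth consumed by scanning the frame buffer out of RAM.
        # The resolution/refresh combinations are illustrative, not measurements.
        def refresh_bandwidth_mb_s(width, height, bits_per_pixel, refresh_hz):
            """MB/s the display controller reads from RAM just to refresh the screen."""
            bytes_per_frame = width * height * bits_per_pixel // 8
            return bytes_per_frame * refresh_hz / (1024 * 1024)

        for (w, h), hz in [((1024, 768), 60), ((1280, 1024), 75), ((1600, 1200), 85)]:
            print(f"{w}x{h} @ {hz} Hz, 32-bit: {refresh_bandwidth_mb_s(w, h, 32, hz):.0f} MB/s")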
  • Band*I*Width? (Score:5, Informative)

    by shyster ( 245228 ) <brackett@uflPOLLOCK.edu minus painter> on Tuesday January 06, 2004 @09:14AM (#7890119) Homepage
    I know I always have band*i*width issues...

    But seriously, you may want to take a look at this [tomshardware.com] Tom's Hardware article detailing the weaknesses of an integrated chip.

    For those looking for the quick answer, I'll do my best to summarize. First off, since integrated graphics tend to be low-cost solutions, transistor counts are nowhere near those of current add-in boards. From the article, Nvidia's FX5200 has 47 million transistors (FX5600=80 million and FX5900=130 million), while their onboard solution (equivalent to a GeForce4 MX440) has only 27 million.

    Then, there's the question of memory bandwidth. Dual channel DDR 400 has a peak of 6.4GB/s, which is shared, while an equivalent GeForce4 MX440 would have a dedicated 8GB/s.

    Now, to your question. Does this consume a ton of bandwidth and affect performance? Well, that would all depend on what you're doing with it.

    If you're running 3D games and the like, then both performance and bandwidth will be an issue and limit your framerates. Comparing the previous review with this [tomshardware.com] review of add-in boards shows about a 25% reduction in framerate (at 1024x768) between an add-in GeForce4 MX440 and an NForce2 integrated chipset in UT2003, and an almost 40% reduction in 3DMark 2001. Since the machines were not identical, don't take the numbers as gospel, but they were similar enough to make a meaningful comparison IMHO.

    That being said, for normal 2D work, bandwidth utilization is negligible and shouldn't seriously impact performance, as shown by this [tomshardware.com] SysMark 2002 test. AFAIK, this doesn't take into account extremely intensive RAM->CPU loads, but I wouldn't expect results to vary significantly, since memory requirements for 2D work are relatively low.

    Be warned, though, that Tom's Hardware did note image quality issues with most of the integrated chips, which they theorized was the result of low-cost manufacturing, not a limit of the technology itself. This theory is bolstered by the fact that their low-cost add-in card (Radeon 9200) suffered the same problems.
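
    To put the quoted bandwidth figures side by side, here is a small illustrative calculation (the 1024x768 @ 85 Hz desktop setting is a hypothetical example, not from the article):

        # Illustration only, reusing the figures quoted above: 6.4 GB/s shared
        # dual-channel DDR400 vs. a dedicated 8 GB/s bus on a GeForce4 MX440.
        shared_peak = 6.4e9       # bytes/s, shared between CPU and integrated graphics
        dedicated_peak = 8.0e9    # bytes/s, reserved for the add-in card alone

        scanout = 1024 * 768 * 4 * 85   # bytes/s just to refresh a hypothetical desktop

        print(f"scan-out alone: {scanout / 1e9:.2f} GB/s")
        print(f"left on the shared bus (also feeds the CPU): {(shared_peak - scanout) / 1e9:.2f} GB/s")
        print(f"dedicated card keeps all {dedicated_peak / 1e9:.1f} GB/s, and the CPU keeps its own bus")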

  • ...which had the Z80 CPU generating the video directly, leaving only interframe gaps for computing.

    Since the greeblie had no interrupts and they were too lazy to quantise the BASIC interpreter so that they could run it in the interframe gap and still generate reasonably consistent sync pulses, the screen went away completely while programs ran. A modern monitor would go postal, faced with a constantly appearing/vanishing sync pulse train, but TVs are kind of used to dealing with cruddy signals.

    I think the Sinclair was branded a Timex in the UK.
  • by Steve Cox ( 207680 ) on Tuesday January 06, 2004 @09:35AM (#7890247)
    > I think the Sinclair was branded a Timex in the UK.

    No, Timex sold the Sinclair ZX81 in North America. Sinclair Research Ltd. sold the Sinclair ZX81 in the UK. The US variant was named the Timex Sinclair 1000.

    The Timex 1000 was practically identical to the ZX81, except for a few changes on the circuit board and a whopping 2K of RAM instead of the 1K that the ZX81 had.

    Steve
  • Let's do some sums (Score:4, Informative)

    by EnglishTim ( 9662 ) on Tuesday January 06, 2004 @12:20PM (#7891944)
    Let's say you're running a game at 1280 x 1024 * 32bit @ 75Hz.

    1280 x 1024 x 32 x 75 = 3145728000 bits/second just to display

    That's 375 MB/s.

    If you've got PC2700 (DDR333) memory, that's a peak rate of around 2540 MB/s.

    Therefore, the screen refresh alone is taking up 15% of your memory bandwidth.

    You've also got to be drawing the screen every frame; let's say it's doing this 25 times a second, and that the game you're playing has an average overdraw of 1.5 per pixel and hits the z-buffer on average twice per pixel.

    You've got roughly 125 MB/s used up by the colour writes and 125 MB/s used up by z-buffer accesses (assuming a 16-bit buffer); that uses up another 10% of your maximum data rate.

    Overall, then, a quarter of the maximum available bandwidth is being used by the video card.
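
    Spelling the same arithmetic out as a sketch (charging the 1.5x overdraw against the colour writes pushes that figure a little above the round 125 MB/s, but the conclusion is the same: roughly a quarter of peak):

        # The parent's sums, under the same assumptions: 1280x1024, 32-bit colour,
        # 75 Hz refresh, 25 rendered frames/s, 1.5x overdraw, two 16-bit z-buffer
        # accesses per pixel, and a PC2700 peak of roughly 2540 MB/s.
        MB = 1024 * 1024
        pixels = 1280 * 1024
        peak = 2540                             # MB/s

        scanout = pixels * 4 * 75 / MB          # ~375 MB/s just to refresh the display
        colour  = pixels * 4 * 25 * 1.5 / MB    # colour writes, including overdraw
        zbuffer = pixels * 2 * 2 * 25 / MB      # 16-bit z-buffer, hit twice per pixel

        total = scanout + colour + zbuffer
        print(f"scan-out {scanout:.0f} + colour {colour:.0f} + z {zbuffer:.0f} "
              f"= {total:.0f} MB/s ({100 * total / peak:.0f}% of peak)")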

  • by wowbagger ( 69688 ) on Tuesday January 06, 2004 @02:32PM (#7893292) Homepage Journal
    What we call "RAM" (Random Access Memory) really isn't all that good at random access.

    When you read a location of RAM, the RAM chips have to read the entire row that location lives in. For a memory that is 256 million locations (where a location could be a bit, or a byte, or even a dword, depending upon the memory's layout), to read a location means loading 16 thousand locations into the sense amps of the chip.

    Now, once you've fetched the data into the sense amps, reading the rest of the row out can happen much faster than that initial access.

    CPUs tend to access things more or less sequentially when it comes to code (modulo jumps, calls, interrupts, and context switches), but data isn't quite as nice.

    Video, on the other hand, is great from the DRAM controller's point of view - it can grab an entire row of data and shove it into the display controller's shift register. And wonder of wonders, the next request from the video refresh system is going to be the very next row!

    So while video refresh does take bandwidth, in many ways driving the video controller is "cheaper" than feeding the CPU.

    (the details in this post GREATLY simplified for brevity)
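
    A toy model of that row-buffer effect (the cycle counts are made up; only the ratio between the two cases matters):

        # Made-up cycle costs: a linear scan like video refresh mostly hits the
        # already-open row, while scattered accesses keep paying the row-open penalty.
        import random

        ROW_SIZE = 1024        # locations per DRAM row (illustrative)
        ROW_OPEN_COST = 10     # cycles to load a row into the sense amps (illustrative)
        HIT_COST = 1           # cycles to read from an already-open row

        def cycles(addresses):
            open_row, total = None, 0
            for addr in addresses:
                row = addr // ROW_SIZE
                if row != open_row:            # row miss: open a new row first
                    total += ROW_OPEN_COST
                    open_row = row
                total += HIT_COST
            return total

        n = 100_000
        sequential = range(n)                                  # like a frame-buffer scan-out
        scattered = [random.randrange(n) for _ in range(n)]    # like pointer-chasing data

        print("sequential:", cycles(sequential), "cycles")
        print("scattered: ", cycles(scattered), "cycles")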
  • by mercuryresearch ( 680293 ) * on Tuesday January 06, 2004 @02:37PM (#7893339) Journal
    In general, yes, shared memory sucks bandwidth. As others pointed out, the calculations are pretty straightforward (X * Y * #bytes/pixel * refresh rate = Bandwidth).

    However, in today's systems it's FAR more complicated than this.

    First, some older implementations, particularly the Intel 810, used a 4MB display cache. The net effect is that the display refresh was generally served from a secondary memory and didn't interfere with main memory bandwidth. As well, Intel used some technology Chips & Technologies developed that basically did run-length encoded compression on the display refresh data (look at your screen right now: there's a LOT of white space, and RLE will shrink that substantially).

    Today most chip sets incorporate a small buffer for the graphics data and compression techniques to minimize the impact of display refresh on bandwidth.

    But wait -- it gets even MORE complicated. With integrated graphics on the north bridge of the chip set, the memory controller in the chip set knows what both the CPU and the graphics core want to access. So the chip set actually does creative scheduling of the memory accesses so that the CPU doesn't get blocked unless absolutely necessary. Most of the time the CPU is either getting its memory needs served by its own cache, or it's getting (apparently) un-blocked access to memory. So the impact of graphics is much less than the simple equation above would suggest.

    Finally... we now have dual-channel memory systems. Even more tricks to keep the graphics and CPU memory accesses separate come into play here.

    So, the short answer is yes, there's an impact, but it used to be much worse. Innovative design techniques have greatly reduced the impact so that in non-degenerate cases it doesn't affect the system too much. In the degenerate case of an app that never hits the cache and does nothing but pound on the memory system, however, you'll see an impact in line with the bandwidth equation above.
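
    As a side note on the compression point, here is a generic run-length-encoding sketch (not the actual Chips & Technologies scheme, just an illustration of why mostly-white display data shrinks so well):

        # A mostly-white scanline collapses to a handful of (value, run length) pairs.
        from itertools import groupby

        def rle(pixels):
            return [(value, len(list(run))) for value, run in groupby(pixels)]

        # A 1024-pixel scanline: white background with a short run of black "text".
        scanline = [0xFFFFFF] * 1024
        scanline[100:120] = [0x000000] * 20

        runs = rle(scanline)
        print(f"{len(scanline)} pixels -> {len(runs)} runs: {runs}")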
  • by G4from128k ( 686170 ) on Tuesday January 06, 2004 @06:53PM (#7897016)
    You say that every frame must be read out from main memory. This is true if the shared memory system has no video memory of its own (caching issues aside), but don't shared memory systems have at least a single buffer to store at least the last frame? I mean, how much can 3 MB of RAM cost these days (i.e. 1024 x 768 @ 32 bit)?

    No, these systems have no separate frame buffer - main RAM is the buffer. Even when nothing is changing on the screen, the video subsystem is reading data at the full frame rate from RAM.

    Although 3 MB of RAM chips might seem cheap, every component adds cost (most system designers try to minimize the total number of components). More importantly, space on the motherboard (especially in a laptop or miniATX) is a precious commodity. The most cost-efficient and space-efficient way to have 3 MB of video memory in a PC is to borrow it from the 256 MB DIMM that you will be putting in there anyway.

    Borrowing from main RAM may incur a slight performance penalty, but the systems that use this approach are not sold for their performance. Low cost or extreme compactness drives the designer to avoid adding special video memory buffers. And with DDR RAM, the memory bandwidth is sufficiently high to avoid too much of a performance hit.
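
    For scale, the capacity being borrowed is tiny next to the DIMM it comes from (a quick sketch using the 256 MB figure above):

        # A 1024x768, 32-bit frame buffer is about 3 MB out of a 256 MB DIMM.
        frame_bytes = 1024 * 768 * 4
        dimm_bytes = 256 * 1024 * 1024
        print(f"frame buffer: {frame_bytes / 2**20:.0f} MB "
              f"({100 * frame_bytes / dimm_bytes:.1f}% of a 256 MB DIMM)")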
