Hardware

Why Design New Processor Cores?

Gus asks: "Why do vendors continue to develop new processor cores, as opposed to using newer technologies in older, proven designs? The best example of this I can find is IBM selling the 'Blue Lightning' chip, which was really a 386 but was sold as a 486. Why not make a 386 today using copper and a 0.18 micron process? Wouldn't this be fairly cheap? Are new instructions worth adding?"
  • by cperciva ( 102828 ) on Sunday October 01, 2000 @09:23PM (#740774) Homepage
    1. Memory latency doesn't scale. You could build a 386 running at 500MHz now, but it wouldn't have 500MHz memory, so it would perform far less than 20 times as fast as a 25MHz 386 (rough numbers below).

    2. Transistors. If you've got lots of them, why not use them? Going from the P5 to the P6 improved performance by about 50% on the same manufacturing process; the moves from 386 to 486 and from 486 to Pentium yielded even larger gains.
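    To put rough numbers on the memory-latency point, here is a minimal sketch; the compute/memory split is a made-up assumption for illustration, not a measurement:

    ```python
    # Amdahl-style estimate: only the compute portion of runtime scales
    # with the core clock; time spent waiting on DRAM stays roughly fixed.
    compute_frac, memory_frac = 0.5, 0.5   # hypothetical split at 25 MHz
    clock_speedup = 500 / 25               # a 386 core clocked 20x faster

    overall = 1 / (compute_frac / clock_speedup + memory_frac)
    print(f"{overall:.2f}x overall")       # ~1.90x -- nowhere near 20x
    ```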
  • by mafried ( 93164 ) on Sunday October 01, 2000 @09:27PM (#740775)
    The Intel 8088 featured in the original IBM PC ran at roughly 4.77 MHz. A 500 MHz Celeron is roughly 4,000 times as powerful, yet the clock speed has only increased by a factor of about 100. Some of this can be attributed to the wider bus, faster RAM, 32-bit architecture, and new instructions, but the majority of the new speed stems from the improved architecture of the modern Pentium Pro based CPUs (see the arithmetic below).

    The Intel 8088, believe it or not, was not even a pipelined processor! Many instructions took upwards of TWELVE whole cycles to complete before execution could even begin on the next. Compare this to the modern Pentium Pro based CPUs, where up to two floating point, two integer, and one memory access instruction may be executing simultaneously, and not necessarily in program order. On modern CPUs it is not uncommon to execute THREE separate instructions in the SAME clock cycle!

    It is the CPU's architecture that gives it its processing power, not its posted clock speed.
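    A back-of-envelope check of those numbers; the cycles-per-instruction figures are rough assumptions in the spirit of the comment, not datasheet values:

    ```python
    # Throughput ~ clock * instructions-per-cycle. A non-pipelined 8088
    # averaged many cycles per instruction; a P6-class core can retire
    # up to ~3 per cycle. Both rates below are rough assumptions.
    mhz_8088, cpi_8088 = 4.77, 12        # ~12 cycles per instruction
    mhz_p6,   ipc_p6   = 500.0, 3.0      # up to 3 instructions per cycle

    mips_8088 = mhz_8088 / cpi_8088      # ~0.4 MIPS
    mips_p6   = mhz_p6 * ipc_p6          # ~1500 MIPS

    print(f"{mips_p6 / mips_8088:.0f}x") # ~3774x from only ~100x the clock
    ```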

  • by Zaffle ( 13798 ) on Sunday October 01, 2000 @10:58PM (#740776) Homepage Journal
    Newer chips are coming out with a lot of things built in: video decoding, sound, serial ports, etc., all on one chip. The internal clock speed of such a chip can be very high, allowing it to do a great deal, while the external bus runs slower to communicate with standard devices. There are a number of new chips coming out like this.

    Other advantages already in use are things like optimized instructions. It's true, most chips in PCs today are RISC, but some (e.g. the x86 architecture) have a CISC wrapper: the actual core of the chip is RISC, and the chip has built-in microcode which is used to represent the CISC instructions.

    So a hypothetical instruction like "fetch, add 5, and store in PC" (FAFSP) could be represented by the microcode sequence: load into r1, add 5, move r1 to PC.

    This is of course highly simplified, but what was traditionally done with a series of assembler instructions by the programmer (at best) or the compiler (at worst) is now done by one assembler instruction, plus some microcode that's tailored to that exact chip for the highest possible speed (see the sketch at the end of this comment).

    What all this boils down to is: the more you do on-chip, the faster it's done. Yes, doing more on-chip increases complexity, cost, and power consumption (and HEAT!), but it means it goes faster.

    Even so, it's usually more efficient (except perhaps in price) to do it on-chip. Power consumption is greater with five chips doing all the tasks than with one, and the same applies to things like cost, heat, etc.

    The only thing you lose is modularity. When a new revision of the chip comes out, you get a new serial port, video controller, sound device, etc., all of which could have new bugs. (Compare the traditional approach, where the subsystems live in different chips and usually don't get upgraded at the same time.) However, bugs are mainly a worry for the bleeding-edge developers who are the first in the world to work with these chips.

    In the end, you get more bang for your buck with redesigned cores. And that's what we're all really interested in, isn't it?

    ---
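    A simplified sketch of the microcode idea above; the FAFSP instruction and the micro-op spellings are hypothetical, taken straight from the comment rather than from any real ISA:

    ```python
    # One CISC-style instruction expands into a short sequence of
    # RISC-like micro-ops that the core actually executes.
    MICROCODE = {
        # "fetch, add 5, and store in PC" (hypothetical FAFSP instruction)
        "FAFSP": [
            "load r1, [mem]",   # fetch the operand
            "add  r1, 5",       # add 5
            "mov  pc, r1",      # store the result in the program counter
        ],
    }

    def decode(instruction):
        """Return the micro-op sequence for a CISC-style instruction."""
        return MICROCODE[instruction]

    for uop in decode("FAFSP"):
        print(uop)
    ```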

  • First, there are area/power/performance tradeoffs that are determined by the technology. When you change the underlying technology, the tradeoffs that you made no longer make any sense. Would you buy a processor without caches today? Of course not! But 10 years ago, caches were small and of limited value because the tradeoffs were different (a quick illustration follows this comment).

    The other big reason that chips get constantly redesigned is that, because of circuit issues (speed paths and the like), you can't just shrink the design and get much faster; you hit diminishing returns. If you look at a graph of clock rate (or even performance) over time for a family of processors that extends across process technologies, you'll see that it isn't a nice linear scaling. It tapers off, because you need to redesign the chip to compensate for changes in the underlying technology.

    Finally, it's economic. People will fork out money for the latest architecture and highest frequencies. As long as people are willing to pay for it (and not pay for 1GHz 386es), there will be companies actively designing new chips.

    Besides, it's fun :-)

    - Mike
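    The cache tradeoff in rough numbers; the latencies and hit rate below are made-up round figures, chosen only to show how the balance shifts as the core/memory gap widens:

    ```python
    # Average memory access time (AMAT) = hit_time + miss_rate * miss_penalty.
    # All cycle counts here are illustrative assumptions.
    hit_time, miss_penalty, hit_rate = 2, 50, 0.95

    amat_cached   = hit_time + (1 - hit_rate) * miss_penalty   # 4.5 cycles
    amat_uncached = miss_penalty                               # 50 cycles
    print(amat_cached, amat_uncached)

    # With a slow core and fast memory (say miss_penalty = 3), the same
    # cache buys almost nothing -- hence the different tradeoff 10 years ago.
    print(hit_time + (1 - hit_rate) * 3)                       # ~2.15 cycles
    ```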

  • Well, it might be possible to use an older, simpler chip design to experiment with massively distributed problems; after all, you could fit a lot of 486s on today's silicon.

    Interestingly, newer chip designs revolve around a concept that isn't too different from this. Transmeta, for example, is using a proven design (a simple RISC-style core) and making it drive multiple execution units simultaneously. This is similar to breaking a problem down into simpler steps and having four CPUs working on it, but it works for more general problems (a loose sketch follows this comment).

    Since they've managed to save so many transistors in their design and move a lot of work into software, they should be able to ramp up the clock speed in the future, and once compilers catch up, maybe add some more execution units too and get some faster SIMD instructions.
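    A very loose sketch of the translate-once, run-native idea; the function names and one-to-one op mapping are hypothetical simplifications, not Transmeta's actual Code Morphing software:

    ```python
    # Guest (x86-style) ops are translated to native ops on first sight and
    # cached, so hot code pays the translation cost only once.
    translation_cache = {}

    def translate(guest_op):
        # Stand-in for a real binary translator.
        return ["native_" + guest_op]

    def execute(guest_ops):
        native = []
        for op in guest_ops:
            if op not in translation_cache:
                translation_cache[op] = translate(op)   # slow path, once
            native.extend(translation_cache[op])        # fast path after
        return native

    print(execute(["add", "mov", "add"]))   # second "add" reuses the cache
    ```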
  • by Max Hyre ( 1974 ) <mh-slash@hy r e . n et> on Monday October 02, 2000 @08:55AM (#740779)
    ...just not a lot, nor very well.

    I've spent too much time looking at logic-analyzer traces of 386s. Bless them, the analyzer manufacturer sold a trace program (read: hardware run-time disassembler) that could sort things out, and tell me which fetches were discarded prefetches (when the CPU took a jump to other than where the prefetched code was). Without it, it would have been an even bigger pain to debug.

    Otherwise, the post is spot on. For instance, when the prefetch is invalidated, a 386 sits on its thumbs until the memory can be bothered to dig up the new opcodes. Nowadays, the newer chips have plenty to keep them busy at such times (mostly). As mentioned, some of them even fetch and execute both code streams, keeping a finger in the book so when the time comes they can see which way the branch actually went, and throw away the results from the untaken path.

    Maybe that's why we embedded types are so fond of Z80s and such: you can actually see what it's doing, when it does it, with a logic analyzer. You gotta be able to, when the question is whether you're diddling the right I/O port at the right time. With the new chips, which can suck up a few hundred K into cache, and show no bus activity thereafter, you need to be a mind-reader.

    Nowadays the monitoring tools for these caching, prefetching, secretive bastards are enormously expensive, and are developed in parallel with the chip itself, to be released at the same time, 'cause nobody's going to buy an embedded processor for which `gdb' is the only debugging tool.

    The folks who design these monstrosities deal with stuff that makes my head hurt just to understand the questions, much less figure out the answers. Hats off to them!

  • Because architecture is everything. Or, rather, architectural changes can yield real gains, so you may as well use them.

    (And in a somewhat related post: X is bloated, not in code but in architecture and the stepping-stone hacks that are required to bring it up to standard.)


  • A modern CPU can keep my basement warm in the winter. Just imagine how many 386's it would take to accomplish the same. That's called progress.
  • Efficiency and "power" (not energy) are the reasons for new chips. As previous posts have noted, some processors are inefficient, and newer processors can do more per CPU cycle. Compare a 486 and a Pentium at the same clock speed and the Pentium kills the 486 (rough numbers below).

    Making the chip smaller and cranking up the clock speed also only takes you so far. Some chips can only scale to a certain clock speed because of how they were designed. (Damn laws of physics.)

    Just because a design works and is proven to work doesn't mean it can't be made better. There are tons of improvements that need to be made to processors (as well as their supporting infrastructure).
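    The "more per cycle" point with rough numbers; the sustained instructions-per-cycle figures are illustrative assumptions, not measured rates:

    ```python
    # At an identical clock, the chip that completes more instructions per
    # cycle finishes the same work proportionally sooner.
    instructions = 1_000_000
    clock_hz = 100e6                     # same 100 MHz clock for both parts
    ipc_486, ipc_pentium = 0.8, 1.6      # hypothetical sustained IPC

    time_486     = instructions / (ipc_486 * clock_hz)
    time_pentium = instructions / (ipc_pentium * clock_hz)
    print(f"{time_486 / time_pentium:.1f}x faster at the same clock")  # 2.0x
    ```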

  • Cheap 8-bit chips are available from Microchip and Atmel. These chips may not use x86 code, but they fill the market space left behind by Intel, AMD ...
  • Actually, the MediaGX chip from Cyrix used a 486 core, and some ran at over 233MHz. Cyrix added additional functionality, but the CPU core itself was their old 486 design. AMD used to make 3.3V 486s (the 66 MHz part could run without a heat sink) from 66 to 133MHz.

    Of course, as mentioned by lots of people, the performance of these chips just wasn't up to par compared with the Pentium.

    Hey, with the lower power consumption, maybe those stupid SX chips weren't so stupid after all!! :)

  • Incremental changes to existing processor cores yield only incremental benefits, whereas delivering radical improvements in price/performance demands a ground-up reevaluation of the entire architecture.

    By way of example, consider sort programs. You might write the tightest, fastest bubble sort on the planet, fine-tuned over many years of tweaking in assembler - then a newbie comes along with a Quicksort routine hacked in C and blows it away (see the sketch below).

    For more concrete examples, consider Digital's (now Compaq's) Alpha and StrongARM processors, which respectively delivered the highest performance and the highest performance/watt of all competing products. By reevaluating the fundamental algorithms used (i.e. truly rearchitecting the core), it becomes possible to achieve dramatic leaps. Tweaks to existing designs just can't yield the same result!

    Best Regards, MarkF
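    Putting numbers on the sort analogy; this is a plain Python sketch (the library sort stands in for the newbie's Quicksort), so the exact timings will vary by machine:

    ```python
    # A lovingly hand-rolled O(n^2) bubble sort still loses to a plain
    # O(n log n) sort once n grows -- the algorithm beats the tuning.
    import random, time

    def bubble_sort(a):
        a = a[:]
        for i in range(len(a)):
            for j in range(len(a) - 1 - i):
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
        return a

    data = [random.random() for _ in range(5000)]

    t0 = time.perf_counter(); bubble_sort(data); t1 = time.perf_counter()
    sorted(data);                                 t2 = time.perf_counter()
    print(f"bubble: {t1 - t0:.3f}s, builtin: {t2 - t1:.3f}s")
    ```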
  • by BitMan ( 15055 ) on Monday October 02, 2000 @06:34AM (#740786)

    The 386 chugs along and does one instruction after another. If one instruction stalls any part of the chip (fetch, decode, execution, etc.), the whole chip stalls. It has no pre-fetch, no branch prediction, no (real) pipelining, no out-of-order execution and, again, only one pipe to do things.

    The Pentium, K6 and newer processors have multiple pipes, even separate ones for integer, floating point, and branching. They pre-fetch instructions, with branch prediction, so they will usually pick the right path (although Intel's IA-64 just executes both, long story). They do out-of-order execution to avoid stalls on opcodes with long latencies (like memory loads). Etc. (A toy model below illustrates the difference.)

    These things are NOT cake to design.

    But your points are semi-valid. In fact, companies like IDT - the nemesis of AMD (itself the king of extending x86 designs in ways Intel could only dream of) - do simple designs (e.g., no out-of-order execution) in record design-cycle times (e.g., 12-18 months). But in today's microprocessor world, you can get a 3-4 fold increase from the more aggressive designs. It's not just about clock speed anymore.

    Places where you don't care about such features are in the embedded world. And in the embedded world, you don't care so much about x86 compatibility -- you're more concerned with power consumption (where x86 sux -- at least before Transmeta ;-). E.g., Intel's (formerly Digital's) StrongARM is available at up to 600MHz, but has only a single pipe. But it also eats only 450mW.

    -- Bryan "TheBS" Smith
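    A toy cycle-count model of the in-order vs. out-of-order point; the op latencies and dependencies are invented for illustration and don't model any real core:

    ```python
    # Each op: (name, latency in cycles, names of ops it depends on).
    ops = [
        ("ld",   10, []),       # long memory load
        ("add1",  1, ["ld"]),   # needs the load's result
        ("add2",  1, []),       # independent work
        ("mul",   3, []),       # independent work
    ]

    def total_cycles(in_order):
        done, prev_issue = {}, 0
        for name, lat, deps in ops:
            ready = max((done[d] for d in deps), default=0)
            # In-order: also can't issue before the previous instruction.
            start = max(ready, prev_issue) if in_order else ready
            if in_order:
                prev_issue = start + 1      # one issue per cycle
            done[name] = start + lat
        return max(done.values())

    print(total_cycles(in_order=True))    # 15: the load stalls everything
    print(total_cycles(in_order=False))   # 11: independent ops hide the load
    ```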

  • The StrongARM series was developed by Digital, building on the ARM architecture from Advanced RISC Machines in the UK; it was Intel (not Compaq) that later acquired the StrongARM line from Digital. Actually I was rather dismayed to see Digital's and ARM's work - cool, innovative companies making slick fast chips - end up in the hands of crapmeisters like Intel and Compaq. The triumph of mass-marketing over good design, or something.
