Why Design New Processor Cores?
Gus asks: "Why do vendors continue to develop new processor cores, as opposed to using newer technologies in older proven designs? The best example of this I can find is IBM selling the 'Blue Lightning' chip, which was really a 386, but was sold as a 486. Why not make a 386 today using copper and a 0.18 micron process? Wouldn't this be fairly cheap? Are new instructions worth adding?"
Two reasons (Score:3)
2. Transistors. If you've got lots of them, why not use them? Going from the P5 to the P6 improved performance by 50% on the same manufacturing process; going from 486 to Pentium, and from 386 to 486, yielded even larger gains.
Because New Processor Cores Make For Faster CPUs (Score:3)
The Intel 8088, believe it or not, was not even a pipelined processor! Many instructions took upwards of TWELVE whole cycles to complete before execution could even begin on the next. Compare this to modern Pentium Pro-based CPUs, where up to two floating-point, two integer, and one memory-access instruction may be executing simultaneously, and not necessarily in any particular order. On modern CPUs it is not uncommon to execute THREE separate instructions in the SAME clock cycle!
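The gap described above can be sketched with a little back-of-the-envelope arithmetic. This is a hypothetical, highly idealized model (the cycle counts, pipeline depth, and issue width are illustrative round numbers, and stalls and hazards are ignored):

```python
import math

def cycles_non_pipelined(n_instructions, cycles_per_instr=12):
    # 8088-style: the next instruction cannot start until this one finishes,
    # so total time is simply instructions * cycles-per-instruction.
    return n_instructions * cycles_per_instr

def cycles_pipelined(n_instructions, depth=5, width=3):
    # Idealized superscalar pipeline: after the pipeline fills (depth cycles),
    # 'width' instructions complete every cycle thereafter.
    return depth + math.ceil(n_instructions / width) - 1

print(cycles_non_pipelined(1000))  # 12000 cycles
print(cycles_pipelined(1000))      # 338 cycles
```

Even this crude model shows an order-of-magnitude throughput difference from architecture alone, before any change in clock speed.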
It is the CPU's architecture that gives it its processing power, not its posted clock speed.
More is better (Score:3)
Other advantages already in use are things like optimized instructions. It's true, most chips in PCs today are RISC, but some (e.g., the x86 architecture) have a CISC wrapper. The actual core of the chip is RISC, and the chip has built-in microcode that is used to represent the CISC instructions.
So a hypothetical instruction like "fetch, add 5, and store in pc" (FAFSP) could be represented by the microcode instructions: load into r1, add 5, move r1 to pc.
This is of course highly simplified, but what was traditionally done with a series of assembler instructions by the programmer (at best) or compiler (at worst) is now done by one assembler instruction, plus some microcode that's tailored to that exact chip for the highest possible speed.
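The decomposition above can be sketched as a toy interpreter. Note that FAFSP and all the micro-op names here are invented for illustration, following the post's hypothetical example; real microcode is a hardware-level format, not a Python table:

```python
# A microcode table mapping a hypothetical CISC op to RISC-like micro-ops.
MICROCODE = {
    "FAFSP": [("load", "r1", "pc_target"),  # load into r1
              ("addi", "r1", 5),            # add 5
              ("mov", "pc", "r1")],         # move r1 to pc
}

def execute(instr, regs):
    # Expand the CISC instruction and run each micro-op on the register file.
    for op in MICROCODE[instr]:
        if op[0] == "load":
            regs[op[1]] = regs[op[2]]
        elif op[0] == "addi":
            regs[op[1]] += op[2]
        elif op[0] == "mov":
            regs[op[1]] = regs[op[2]]
    return regs

regs = execute("FAFSP", {"pc": 0, "r1": 0, "pc_target": 100})
print(regs["pc"])  # 105
```

The programmer sees one instruction; the chip sees three simple steps it can execute as fast as its internal design allows.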
What all this boils down to is: the more you do on-chip, the faster it's done. Yes, doing more on-chip increases complexity, cost, and power consumption (and HEAT!), but it means the chip goes faster.
However, it's usually more efficient (other than perhaps in price) to do it on-chip. The power consumption of five chips doing all the tasks is going to be greater than that of one, and the same applies to things like cost, heat, etc.
The only thing you lose is the modularity of it all. When a new revision of the chip comes out, you get a new serial port, video driver, sound card, etc., all of which could have new bugs. (Compare the traditional approach, where the subsystems are in different chips and usually don't get upgraded at the same time.) However, bug issues mainly worry the bleeding-edge developers who are the first in the world to work with these chips.
In the end, you get more bang for your buck with redesigned cores. And that's all we're really interested in, isn't it?
---
Tradeoffs (and a 386 can't do 1GHz) (Score:2)
The other big reason that chips get constantly redesigned is that because of circuit issues (speedpaths and the like) you can't just shrink the design and get much faster. It becomes diminishing returns. If you take a look at a graph of clock (or even performance) rate over time for a family of processors that extends across process technologies, you'll see that it isn't a nice linear scaling. It tapers off, because you need to redesign the chip to compensate for changes in the underlying technology.
Finally, it's economics. People will fork out money for the latest architecture and the highest frequencies. As long as people are willing to pay for it (and not pay for 1GHz 386es), there will be companies actively designing new chips.
Besides, it's fun :-)
- Mike
Using the old cores... (Score:1)
Interestingly, newer chip designs revolve around a concept that isn't too different from this. Transmeta, for example, is using a proven design (a RISC core) and making it use multiple execution units simultaneously. This is similar to breaking a problem down into simpler steps and having four CPUs working on it, but it works for more general problems.
Since they've managed to save so many transistors in their design, and move a lot of work into software, they should be able to ramp up the clock speed in the future, and once compilers catch up, maybe add some more execution units too, and get some faster SIMD instructions.
Erm, the 386 does prefetch... (Score:3)
I've spent too much time looking at logic-analyzer traces of 386s. Bless them, the analyzer manufacturer sold a trace program (read: hardware run-time disassembler) that could sort things out, and tell me which fetches were discarded prefetches (when the CPU took a jump to other than where the prefetched code was). Without it, it would have been an even bigger pain to debug.
Otherwise, the post is spot on. For instance, when the prefetch is invalidated, a 386 sits on its thumbs until the memory can be bothered to dig up the new opcodes. Nowadays, the newer chips have plenty to keep them busy at such times (mostly). As mentioned, some of them even fetch and execute both code streams, keeping a finger in the book so when the time comes they can see which way the branch actually went, and throw away the results from the untaken path.
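The "fetch and execute both code streams" trick mentioned above can be sketched in miniature. This is purely illustrative of the eager-execution idea, not how the hardware is actually built; the function and argument names are invented:

```python
def eager_branch(cond_fn, taken_fn, not_taken_fn):
    # Speculatively run BOTH sides of the branch before the condition
    # resolves, then discard the result from the untaken path.
    taken_result = taken_fn()
    not_taken_result = not_taken_fn()
    # The branch condition resolves last; keep only the right answer.
    return taken_result if cond_fn() else not_taken_result

result = eager_branch(lambda: 3 > 2, lambda: "jump", lambda: "fall through")
print(result)  # jump
```

In hardware the two paths run in parallel on spare execution units, so the "wasted" work on the untaken path costs little; from the outside (say, on a logic analyzer) you just see fetches that never seem to matter.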
Maybe that's why we embedded types are so fond of Z80s and such: you can actually see what it's doing, when it does it, with a logic analyzer. You gotta be able to, when the question is whether you're diddling the right I/O port at the right time. With the new chips, which can suck up a few hundred K into cache, and show no bus activity thereafter, you need to be a mind-reader.
Nowadays the monitoring tools for these caching, prefetching, secretive bastards are enormously expensive, and are developed in parallel with the chip itself, to be released at the same time, 'cause nobody's going to buy an embedded processor for which `gdb' is the only debugging tool.
The folks who design these monstrosities deal with stuff that makes my head hurt just to understand the questions, much less figure out the answers. Hats off to them!
Why design new processor cores? (Score:1)
(And in a somewhat related post, X is bloated - not by code but by architecture and the stepping-stone hacks that are required to bring it up to standard.)
To make them better at what they do (Score:1)
A modern CPU can keep my basement warm in the winter. Just imagine how many 386's it would take to accomplish the same. That's called progress.
This is like saying why design new engines? (Score:2)
Making the chip smaller and cranking up the clock speed also only takes you so far. Some chips can only go up to a certain clock speed because that's a limit of the design itself. (Damn laws of physics.)
Just because a design works and is proven to work doesn't mean it can't be made better. There are tons of improvements that need to be made to processors (as well as their supporting infrastructure).
Cheap 8 bit chips (Score:1)
Old cores in new apps (Score:1)
Of course, as mentioned by lots of people, the performance of these chips just wasn't up to par compared to the Pentium.
Hey, with the lower power consumption, maybe those stupid SX chips weren't so stupid after all!! :)
Why? Revolution vs. Evolution! (Score:1)
Pipelining, multiple pipes, out-of-order execution (Score:3)
The 386 chugs along and does one instruction after another. If one instruction stalls in any part of the chip (fetch, decode, execution, etc.), the whole chip stalls. It has no pre-fetch, no branch prediction, no (real) pipelining, no out-of-order execution and, again, only one pipe to do things.
The Pentium, K6, and newer processors have multiple pipes, even separate ones for integer, floating point, and branching. They pre-fetch instructions, with branch prediction so they will usually pick the right path (although Intel's IA-64 just does both; long story). They do out-of-order execution to avoid stalls on opcodes with long latencies (like memory loads). Etc.
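The out-of-order idea above can be sketched as a toy scheduler: when an instruction stalls waiting on a long-latency load, later instructions with no dependency on it can issue past the stall. Everything here (instruction format, latencies, register names) is a made-up illustration, not a model of any real core:

```python
def issue_order(program):
    # Each instruction is (dest_register, source_registers, latency_cycles).
    ready_at = {}   # register -> cycle at which its value becomes available
    issued = []
    cycle = 0
    pending = list(program)
    while pending:
        # Issue the oldest instruction whose sources are all ready.
        for instr in pending:
            dest, srcs, lat = instr
            if all(ready_at.get(s, 0) <= cycle for s in srcs):
                issued.append(dest)
                ready_at[dest] = cycle + lat
                pending.remove(instr)
                break
        else:
            cycle += 1  # nothing ready this cycle: wait

    return issued

prog = [("r1", [], 10),     # slow memory load into r1
        ("r2", ["r1"], 1),  # depends on the load: must wait
        ("r3", [], 1)]      # independent: can issue past the stall
print(issue_order(prog))  # ['r1', 'r3', 'r2']
```

An in-order chip like the 386 would sit idle for the full load latency before touching r3; the out-of-order scheduler slips the independent work into the gap.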
These things are NOT cake to design.
But your points are semi-valid. In fact, companies like IDT (the nemesis of AMD, itself the king of extending x86 designs in ways Intel could only dream of) do simple designs (e.g., no out-of-order execution) on record design-cycle times (e.g., 12-18 months). But in today's microprocessor world, you can get a 3-4 fold increase from such designs. It's not just about clock speed anymore.
Places where you don't care about such features are in the embedded world. And in the embedded world, you don't care so much about x86 compatibility -- you're more concerned with power consumption (where x86 sux -- at least before Transmeta ;-). E.g., Intel's (formerly Digital's) StrongARM is available at up to 600MHz but has only a single pipe. It also eats only 450mW.
-- Bryan "TheBS" Smith
Re:Why? Revolution vs. Evolution! (Score:1)