Silicon Investor (SI) -- The First Internet Community


Politics : Formerly About Advanced Micro Devices

To: tejek who wrote (122832)	8/22/2000 2:25:04 PM
From: Daniel Schuh

Ted, I think that Transmeta's low power technology is somewhat orthogonal to the "code morphing" business. According to Scumbria, who has stated he's impressed, it sounds like they can throttle both voltage and frequency to reduce power when the CPU demand is less than 100%. I don't know much about the power stuff myself, and I remain moderately skeptical about the x86-to-VLIW in software business, but we'll see. Transmeta does seem to have scored some major OEMs. Cheers, Dan.
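For a rough feel of why throttling both voltage and frequency matters, here is a small sketch using the standard first-order rule that dynamic CMOS power scales with V^2 * f. The function name and the scale factors are assumptions chosen for illustration, not Transmeta figures, and the model ignores leakage current.

```python
def relative_dynamic_power(voltage_scale, frequency_scale):
    """First-order dynamic power relative to full speed.

    Assumes P is proportional to V^2 * f with switched capacitance held
    fixed. Leakage and other static power are ignored.
    """
    return voltage_scale ** 2 * frequency_scale


# Illustrative scale factors only (not measured Crusoe operating points).
freq_only = relative_dynamic_power(1.0, 0.5)   # halve the clock, same voltage
both = relative_dynamic_power(0.75, 0.5)       # halve the clock, 75% voltage

print(f"frequency only:        {freq_only:.3f} of full power")
print(f"voltage and frequency: {both:.3f} of full power")
```

Under these assumed numbers, halving the clock alone halves the power, while also dropping the voltage cuts it to well under a third. That is why being able to lower both is worth more than lowering frequency by itself.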

To: tejek who wrote (122832)	8/22/2000 3:09:06 PM
From: Petz

Ted, how the Transmeta CPU works:

<ref: So that I understand Transmeta and its product...it substitutes software for some of the functions normally performed by a chip? That's why power consumption is less. Is that correct?>

That's Transmeta's way of describing it, but it is a little misleading. EVERYTHING eventually gets done in hardware. But the hardware execution unit of the TM CPU is simple, small, low power and high speed. It is a RISC (reduced instruction set) CPU designed to be fast and efficient.

The essential concept to understand is that the TM CPU translates the x86 code found in your software on the fly into the simpler RISC instruction set of the core CPU (let's call it "TM code"), but in most cases each x86 instruction is translated into "TM code" only once.

The translation of x86 code into TM code is not done by a big hardware engine inside the TM CPU. It is done by a software program which once in a while interrupts the execution of already translated code so that it can translate new chunks of code. Thus, the TM CPU is jumping back and forth between translating code and executing the code it has already translated. Since the translation is done by sophisticated microcode inside the TM CPU, it can do optimizations that would be impossible via hardware translation.

An Athlon also translates x86 code into a simpler code referred to as RISC86. The difference is that the Athlon does this all in hardware and doesn't try to optimize the RISC86 code. But the hardware that executes the RISC86 instructions is very sophisticated and capable of executing instructions out of order, skipping instructions that aren't really necessary, looking ahead to see if there's something the CPU can do, etc.

If a program just consisted of a sequence of instructions without any loops, the TM CPU would be extremely inefficient. It would spend a good deal longer translating the program into its little bite-size TM instructions than it would take for even a 486 to execute the program.
But if a program has a lot of short loops, the TM design makes some sense. Once a loop's code is translated, it may run 1000 times, and during that time, nothing will interrupt the execution of the loop. It will be running on simple hardware which doesn't do anything fancy like speculative execution, out-of-order execution, etc., because the code morpher used some sophisticated algorithms to translate the x86 code of the loop into the minimum number of "TM code" instructions.

There are some disadvantages to this approach, though I've made it sound like a panacea. For one, fast floating point requires a total hardware implementation, and fast FPUs take a lot of power. The Crusoe isn't designed to do floating point super fast. Second, if a program jumps all over the place, then it is less likely that the translated code will be in the cache, which means that code may have to be re-morphed many times. This would happen, for example, on benchmarks that do a lot of multitasking. Finally, it's a poor choice for anything that requires a predictable response time, because when and how often the execution of a program has to be interrupted to do some more "morphing" is unpredictable.

PS, see also arstechnica.com

Petz
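The translate-once, run-many tradeoff described above can be sketched with a toy cost model. Everything here is an assumption made up for illustration: the cycle costs, the first-in-first-out eviction rule, and the names `run`, `TRANSLATE_COST` and `NATIVE_COST`. It is not how the real code morphing software is organized, just a way to see the amortization.

```python
# Toy model: each x86 instruction address must be "morphed" before it can
# run, and translations live in a cache of limited size.
# All costs are hypothetical, not Crusoe measurements.
TRANSLATE_COST = 50   # assumed cycles to translate one x86 instruction
NATIVE_COST = 1       # assumed cycles to execute one translated instruction


def run(trace, cache_capacity=None):
    """Return (total_cycles, translations) for a trace of instruction addresses.

    cache_capacity=None means the translation cache never fills up.
    """
    cache = {}          # dicts keep insertion order, so this acts as a FIFO
    cycles = 0
    translations = 0
    for addr in trace:
        if addr not in cache:
            if cache_capacity is not None and len(cache) >= cache_capacity:
                del cache[next(iter(cache))]   # evict the oldest translation
            cache[addr] = True
            cycles += TRANSLATE_COST
            translations += 1
        cycles += NATIVE_COST
    return cycles, translations


straight_line = list(range(1000))      # 1000 instructions, each run once
short_loop = list(range(10)) * 1000    # 10-instruction loop, 1000 iterations
jumpy = list(range(20)) * 50           # working set twice the cache size

for name, trace, cap in [("straight line", straight_line, None),
                         ("short loop", short_loop, None),
                         ("jumpy, small cache", jumpy, 10)]:
    cycles, translations = run(trace, cap)
    print(f"{name}: {cycles / len(trace):.2f} cycles per instruction, "
          f"{translations} translations")
```

With these made-up costs, the straight-line trace pays the translation price on every instruction, the short loop pays it ten times and then runs at close to native speed, and the jumpy trace keeps evicting and re-morphing code it will need again. Those are the same three cases the post walks through.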