JC, Re: "Nice try, but you're wrong here. The P4 can do a peak of half as many MMX ops per clock compared to Celeron. The workaround is SSE2, which equals per clock peak MMX op rate. But the cpu would have to be at double the frequency to be equal in peak MMX op rate of the Celeron."
I'm sorry, but you are incorrect. As I said, the Pentium 4 has a 2-cycle latency between an MMX instruction entering the execution engine and its result coming out, but that just signifies a 2-stage pipeline for executing MMX instructions. If you continually feed in one MMX instruction every cycle, you will get one result out every cycle. That's the way a pipeline works. The Celeron may have a single-cycle latency, but as long as there are instructions in the pipeline, and more often than not there are, the Pentium 4 will have an equal throughput per clock, and its overall throughput only increases with frequency.
I'm sure you can design an application that specifically targets the extra cycle of latency penalty, but realize that most if not all applications will have a number (or a vast majority) of instructions in the pipeline at any one time, which can be forwarded by the out-of-order engine to keep the MMX unit well fed for a throughput of ONE MMX INSTRUCTION PER CLOCK CYCLE.
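To make the latency-versus-throughput distinction concrete, here is a rough sketch in Python. It is a toy model, not a simulation of either chip: it assumes a fully pipelined unit fed with independent instructions, and the names `cycles_to_finish`, `throughput`, and `issue_interval` are made up for illustration. The `issue_interval` parameter is there only to show what changes if a unit could accept a new instruction just once every N cycles.

```python
def cycles_to_finish(num_ops, latency, issue_interval=1):
    """Cycles needed to complete num_ops independent ops on one pipelined unit.

    latency        -- cycles from issue to result for a single op (assumed value)
    issue_interval -- cycles between accepting consecutive ops (assumed value)
    """
    if num_ops == 0:
        return 0
    # The last op issues at (num_ops - 1) * issue_interval
    # and its result appears `latency` cycles after that.
    return (num_ops - 1) * issue_interval + latency


def throughput(num_ops, latency, issue_interval=1):
    """Average ops completed per cycle over the whole run."""
    return num_ops / cycles_to_finish(num_ops, latency, issue_interval)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        one_cycle = throughput(n, latency=1)
        two_cycle = throughput(n, latency=2)
        print(f"{n:5d} ops: latency 1 -> {one_cycle:.3f}/clk, "
              f"latency 2 -> {two_cycle:.3f}/clk")
```

With a single instruction the 2-cycle unit looks half as fast, but under these assumptions the gap shrinks toward zero as the stream of independent instructions gets longer. Setting `issue_interval=2` instead models a unit that really can only start one op every other clock, which is a different situation from extra latency.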
Re: "That doesn't matter. You made a blanket statement about P4 superiority, and I'm just gave you counterexamples. Suddenly you think that I'm trying to argue that the Celeron is faster than the P4, but that is not the case at all."
JC, you don't seem to realize that your counter-examples are based on incomplete information. The knowledge I have is not all on hand, so forgive me for not including the proof you need, but suffice it to say, it's pretty scattered. You are speculating on the possibility of the Pentium 4 being less ideal than the P6 core, and I am telling you with complete certainty that your opinion is incorrect. I am aware of many enhancements that went into the design of the Netburst architecture, and every time sacrifices were made to enhance the frequency, other enhancements were included to take up the slack. With properly written code, the Pentium 4 should be faster in every respect. The fact that this is not represented by many of today's benchmarks shows the relative lack of effort on Intel's part in bringing code up to date.
To put it another way, today's code has optimizations dating back to the 486 and Pentium days, which hurt the performance of the P6, Athlon, and Netburst architectures, although they hurt Netburst most of all due to the longer pipeline. By improving the code, all platforms benefit, but the Pentium 4 will benefit most of all, because it started with the most severe penalty. There are many guesses out there about why the Pentium 4 performs badly, and nearly all of them are wrong. If you won't take my word on it, then agree to disagree, and wait until it is proven one way or another.
Re: "Yes, could very well be tests in which the P4 has better latency than the Celeron. But there are tests (such as this one) in which the Celeron has better L2 latency than the P4 (and remember that the Celeron has the latency of the Coppermine, not the Mendocino). Intel's pdfs made claims that the P4 had one cycle better latency than the Coppermine, but that doesn't make it automatically true. It's just like the Thunderbird and it's "11 cycle" L2 latency (it wasn't, at least not all the time)."
That cache latency utility is measuring things incorrectly. Although Intel will quote the "best case" latency in Pentium 4 briefings, I also know that the Pentium 4's average latency is better than the Celeron's. It is not my assumption, but a proven fact. Again, if you don't believe me, then table the argument for a later time.
Re: "By the gods!!! You're talking like a person who's been reading press quotes for all his life! You have no idea what we're talking about, do you? You made a very ignorant statement which shows that you are making blind assumptions, and I gave you counterexamples to show you that you cannot make assumptions like that! I did not make any assumptions here. I did not say that the Celeron is a more advanced processor. But you said not only that the P4 is more advanced, but that it is "far better equipped". This is a blanket statement, and I showed you several areas in which it is not better equipped for certain memory oriented activities (which is what many of these "future benchmarks" stress). And you have the audacity to refer to my statements of truth as "lame attempts"?"
Yes, JC, your attempts were lame, but I don't mean that in a particularly bad way. I am aware that you do not have the information that I do, and unfortunately, my information is too scattered to provide you with right now. You ought to calm down a little, though, because you are obviously upset. If you don't want to listen to what I have to say, then ignore it. There is no disagreeing, though. I am only stating facts, as far as I am aware.
I fully expect the Netburst micro-architecture to be criticized until such time as it proves itself indisputably. It's unfortunate that there is so much code out there that performs badly with the Pentium 4, but these are the cards it was dealt. Condemning benchmarks that are more favorable to the Pentium 4 may sound reasonable at this point, since so many other applications tell a different story, but soon you'll realize that some applications perform so much better, simply because they are better written. SysMark2001 uses an application that many people here admit is used rarely. So be it; if they want to ignore it, then there are other benchmarks out there. If they are ignoring it because they are thinking about conspiracies, that's where they are wrong. Just realize that I am not supporting benchmarks because they make Pentium 4 look better than the competition. I just want what's fair for the Pentium 4, and so far the only benchmarks that are being fair are being condemned for the very opposite reason. I only expect justification when some of the more pathological performance bugs with legacy code are somewhat alleviated, and when more code becomes better written to show the true power of the micro-architecture. Until then, say what you want. You will eventually be proven wrong.
wanna_bmw