A Creative Cow Review



Athlon MPs Arrive

Tyler A. Hawes
Audio Intervisual Design, Hollywood, California USA

©2001 Tyler A. Hawes. All Rights Reserved. Reprinted at Creativecow.net with kind permission of the author.


Article Focus:
Tyler Hawes, Creative Cow's resident guru for all things silicon, explores the importance of AMD's entry into the multi-processor marketspace. In this article, Tyler examines the issues and the stats and gives his own opinions as to their meaning -- and with a background as a former Intel engineer who has a real fondness for AMD as well -- his opinions just might surprise you! In this in-depth exploration we know you'll find many answers to many of your questions -- not to mention that Tyler will give you answers to questions you didn't even know you had yet!

AMD Athlon MP processors and 760MP chipset


As details emerged in recent weeks, people started holding their breath and buyers postponed purchases in anticipation. Now the speculation, wondering, and waiting are over: AMD has released their first multiprocessor solution to the market, the AMD Athlon™ MP Processor and 760 MP chipset.

We are left with a release that holds little in the way of surprise. Indeed, the details AMD provided me with lined up almost exactly with what I expected. But that doesn’t necessarily diminish the impact or excitement of this product launch, as we shall see...



Background

Almost two years ago AMD unleashed the Athlon processor upon the world. To the amazement of most, they quickly began gobbling up share in the desktop computing market segment. For the first time, AMD had created a processor superior to the competition from Santa Clara in design, performance, and price all at once, and they held onto that position. At first, many pundits were skeptical that AMD would truly challenge Intel®’s status as the processor King. It didn’t take long, though, before the pundits not only changed their minds but began heralding this change as one whose time had come.

Today, AMD arguably still has the best overall performance in a desktop computer, and they certainly have the best pricing. Their tremendous progress of late has been almost entirely in desktop computing, and most of that in the consumer market. It took a while before their gains in the consumer space started to catch on in the corporate desktop market, where AMD is just now starting to make serious inroads. Recently they’ve made important launches that should help them attack the mobile market as well, such as the Duron™ Mobile Processor and the Athlon 4 Processor.

Despite these victories for AMD, Intel remains relatively unchallenged as the x86 King of the workstation and server market. With the launch of the Athlon MP and 760 MP chipset, AMD aims to change that. This pair of releases is aimed squarely at the high-end workstation and low- to mid-range server market space.

 

What’s New?

The Athlon MP Processor is based on the new “Palomino” core. The design is an evolutionary step up from the Thunderbird, offering some significant but not earth-shattering performance improvements. More important is the supporting 760 MP chipset, which essentially is the standard 760 chipset with dual-processor support and 64-bit PCI support. There’s still plenty to talk about, though. First, let’s summarize the major changes of the Athlon MP relative to the Athlon “Thunderbird” and the 760 MP relative to the 760 chipset:

AMD Athlon MP Processor (New Features Relative to Athlon “Thunderbird”)

AMD 760 MP Chipset (New Features Relative to AMD 760)


AMD Athlon MP Processor

At launch time, AMD says that the processor is expected in systems from more than 50 manufacturers worldwide, with no fewer than 20 offering systems for sale immediately. That list is growing fast, so those numbers could already be out of date. Pricing in 1,000-unit quantities is set at $215 for the 1GHz part and $265 for the 1.2GHz part.

Dual-Processor Compatibility

The Athlon MP is compatible with dual-processor configurations on the AMD 760 MP chipset, using AMD’s “Smart MP Technology”, which we’ll talk about later. Of course, this is the most relevant feature, but it was entirely expected. There are other features, however, that make the Athlon MP outperform the Athlon “Thunderbird”.

Improved Data Prefetch

When there is some extra bandwidth available on the bus, the Athlon MP processor will take advantage of it by prefetching data it expects to need, which avoids having to fetch that data later, when it is requested and the bus may be more saturated. This type of speculative data prefetching has already been shown to help performance in Intel’s Pentium® 4, and is one of three reasons why an Athlon MP will outperform an equivalently clocked Athlon “Thunderbird”.

3DNow! Professional Technology

AMD has added 52 new instructions to the AMD Athlon MP, which they call 3DNow! Professional. Since the Pentium III, Intel has touted what they call “SSE” instructions, which can potentially provide a nice boost to applications that are optimized for them. It’s long been a feather in Intel’s cap that AMD did not have. 3DNow! Professional is SSE-compatible, so the Athlon MP can now take advantage of those very same optimizations that have become common in workstation and server software.

3DNow! Professional instructions improve performance by simplifying, and therefore optimizing, the processing path for floating-point operations. With regular instructions, floating-point calculations are performed to a high degree of accuracy. Often, that level of accuracy is unnecessary. So 3DNow! / SSE instructions provide a way of performing that same calculation to the level of detail that is needed, thereby reducing the computing requirements and optimizing performance. This is a drastic over-simplification of just one aspect of these enhancements, but it should give you an idea of how it can help performance.

As you will see in the SpecFP benchmarks later on, these new instructions have opened up the performance on the Athlon MP considerably. Running the SpecFP CPU2000 Base and Peak benchmarks, which measure a computer’s floating-point performance, we see that the Athlon MP processor gains between 11% and 15% in performance versus an Athlon “Thunderbird” running at the same clock speed. Since floating-point operations are of prime importance in 3D applications, this is a good indicator of the kind of performance gains 3D modelers and animators might expect (we’ll see some real-world benchmark data to that end a little later on…).


Improved TLBs

TLBs (Translation Look-Aside Buffers) are a cache for translated memory addresses. When a request is made to system memory by the processor(s), the addresses requested are then translated into the actual physical addresses at which the data is stored in memory. By caching these addresses in the TLBs, performance is improved upon subsequent requests to memory if the requested address is already cached.

AMD has increased the size of the TLB from 32 to 40 entries. By being able to store more addresses, the TLB is more likely to cache a given address request, and therefore more likely to improve performance for that request.
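To make the effect of those extra entries concrete, here is a minimal Python sketch of a TLB modeled as a small cache of page translations. Everything in it is an assumption made for illustration: the `TinyTLB` class, the 4 KB page size, the least-recently-used replacement policy, the fake page-table lookup, and the made-up 36-page workload. Nothing above specifies how the real Athlon MP TLB replaces entries, so treat this as a toy model only.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KB pages, for illustration only


class TinyTLB:
    """Toy TLB: a fixed-size cache of virtual-page -> physical-frame entries."""

    def __init__(self, entries):
        self.entries = entries
        self.cache = OrderedDict()  # oldest (least recently used) entry first
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.cache:
            self.hits += 1
            self.cache.move_to_end(page)  # mark as most recently used
        else:
            self.misses += 1
            if len(self.cache) >= self.entries:
                self.cache.popitem(last=False)  # evict least recently used
            # Hypothetical stand-in for the slow page-table lookup.
            self.cache[page] = page + 1000
        return self.cache[page] * PAGE_SIZE + offset


def hit_rate(entries, working_set_pages, passes):
    """Loop over a fixed set of pages several times and report the hit rate."""
    tlb = TinyTLB(entries)
    for _ in range(passes):
        for page in range(working_set_pages):
            tlb.translate(page * PAGE_SIZE)
    return tlb.hits / (tlb.hits + tlb.misses)


if __name__ == "__main__":
    for size in (32, 40):
        print(size, "entries:", hit_rate(size, working_set_pages=36, passes=10))
```

With this contrived workload, which loops over 36 pages, the 32-entry TLB never hits because each page is evicted just before it is needed again, while the 40-entry TLB hits on every pass after the first (a 90% hit rate over ten passes). Real workloads will rarely show a difference this dramatic.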

The TLB design now supports speculative reload. If an instruction requests an address that is not stored in the TLB, this feature will go ahead and load that address into the TLB before the instruction completes. In this situation, the TLB won’t improve performance as much as if the address had been stored in the TLB in the first place, but it’s better than nothing at all.

Also, the TLB design is now exclusive. With a non-exclusive TLB design, the addresses that are cached in the L1 TLB are duplicated in the L2 TLB. By only storing this information in the L1 TLB, space is freed on the L2 TLB, which could increase performance. However, it can also increase latency since the addresses are now only cached in one place. In most situations this will likely be an improvement overall.

The improved TLB is a rather technical detail and is not likely to make a huge difference in ordinary desktop use. However, in high-end applications with very heavy data I/O, every little bit counts. The cumulative effect of these types of refinements can add up to an appreciable difference.

Why Only at 1GHz and 1.2GHz?

Perhaps I should put “only” in quotes above, as 1.2GHz is certainly still very fast. However, the fact remains that AMD has already released the Athlon “Thunderbird” clocked at 1.4GHz. To this question, AMD’s Bret Kirby told me:

“The validation time it takes on a 2 [processor system] takes longer than on 1 [processor]. There’s no issues at all with the process we’re using, we just want to be sure that everything is validated to the millionth-degree before we release it. Probably, as our validation team gets bigger toward the end of the year, you’ll start to see parity between the desktop and workstation processors. So far, we haven’t had any major release mistakes, and we want to keep it that way.”

Bret was gracious enough not to point out any “major release mistakes” of a certain competitor of theirs, but I probably wasn’t the only one in the conversation who immediately called to mind Intel’s problematic launch of the i820 chipset or their embarrassing botched launch of the Pentium III processor running at 1.13GHz. That AMD is placing a premium on validation and quality control over launching the fastest chip they can possibly create is a good thing; that’s exactly the attitude they need to be successful in this new market space.


AMD 760 MP Chipset

The AMD 760 MP chipset is very similar to the AMD 760 chipset, with the notable exception of dual-processor support. I think this is good because the AMD 760 chipset has already had about six months to mature in the desktop market space and forms a solid foundation for AMD to build the MP version on. Already Tyan, Asus, Abit, Gigabyte and MSI have committed to supporting 760 MP in motherboard products. Tyan’s Thunder K7 motherboard is available immediately (I found it on PriceWatch.com for less than $600) and has a kitchen-sink feature set. True to form, Tyan will later release a “Tiger” version of the motherboard without all the extra features, which are not needed for many workstation configurations. Other motherboard manufacturers will be shipping their products by the third quarter.

Features

Of course, the exact feature list will vary from one motherboard to another, but the following will be supported at a minimum:

Why No Dual-Processing Support with Thunderbird?

Doubtless many will be disappointed to learn that AMD does not support the Athlon “Thunderbird” in dual-processor systems. AMD told me that the 760 MP chipset was designed from the beginning for the Palomino core and that Thunderbird was never tested or validated on it. They also said that 760 MP motherboards will auto-detect if you have a Thunderbird installed and will disable dual-processor support. You will still be able to run a Thunderbird chip in single-processor mode.

However, I’m hearing numerous reports from other testers that they are having success running AMD Athlon “Thunderbird” CPUs and even Durons in dual-processor mode on the 760 MP. It is possible that the hardware auto-disable feature AMD described to me was not included on the pre-release version of the Tyan Thunder K7 test motherboard, and will be added in the shipping version (although that seems unlikely to me).

Even if such a hardware lock were implemented, a third-party workaround may present itself, although none has been announced. Intel never sanctioned the use of Celeron processors in dual-processor configurations, but that didn’t stop many from doing just that with great success (nor did it stop Abit from making the BP6, a board made for exactly that application). Then again, Intel didn’t auto-disable Celeron processors like the AMD 760 MP chipset purportedly will. However, we’ve seen third-party devices trick chipset auto-detection before, such as the “Gold fingers” devices that unlock the clock multiplier on Slot A Athlon processors, thereby allowing more flexibility for overclocking.

Whether dual Thunderbirds and Durons require a hardware workaround or work without any extra ingredients, it doesn’t seem likely that AMD will support them, and running them may void the warranty on your processors and/or motherboard. There’s not much reason for AMD to encourage those setups, since they are targeting this release at the professional workstation and server markets.

Besides, as you will see below, there may be good reason to use Athlon MP processors, as they don’t cost all that much more and they offer a performance edge over standard Athlons.

33MHz/64-bit PCI Support

The vast majority of PCI busses in the world are of the 33MHz/32-bit variety. However, since Intel released the i840 chipset over a year ago, which supports 33MHz and 66MHz 64-bit PCI busses, several manufacturers have released devices to take advantage of the new specification. Typically these are high-bandwidth devices, such as SCSI and RAID controllers, and video capture interfaces such as Pinnacle’s Targa 3000. Most of these devices run at 33MHz and support both 32-bit and 64-bit configurations. That way they are still compatible with the majority of systems which only support 32-bit PCI devices, but they can also take advantage of the increased bandwidth offered by newer platforms with 64-bit support.

I didn’t get a specific technical reason from AMD on why they didn’t incorporate 66MHz/64-bit PCI bus support in this chipset release. They did say that for the market they’re going after with this release, 33MHz/64-bit support is more prevalent. In the third quarter AMD plans to release the 760 MPX chipset, which will add 66MHz PCI bus support. This chipset will probably be hitting right around the time 760 MP is getting into its stride, so 760 MP may have a short life. However, other than the 66MHz/64-bit PCI support, there’s no difference between the two and therefore no reason to wait if you are ready to buy now and Tyan’s Thunder K7 motherboard suits your needs.

Only Registered DDR-SDRAM is Supported

By now most of you are familiar with DDR-SDRAM memory. This memory is able to deliver twice the bandwidth of regular SDRAM memory while not costing much more. That stands in contrast with RAMBUS RDRAM, which also offers increased bandwidth but at a very considerable expense. RDRAM is also very controversial, with several articles claiming that there is very little real-world benefit from using this ultra-expensive technology. Currently, Intel is the only x86 chipset manufacturer to support RDRAM, and even Intel has been straddling the fence a bit by promising an SDRAM platform for the Pentium 4 at some time in the near future.

The only motherboard currently available with the 760 MP chipset, the Tyan Thunder K7, requires the use of registered DDR-SDRAM memory. AMD indicated to me that this is a chipset requirement, although I’m sure they could enable support of unbuffered memory as well. Currently, registered memory is only available in the form of ECC DDR memory modules. This means that if you were planning to upgrade to this platform and already have DDR memory, you’ll likely need to buy new memory (unless you already purchased ECC DDR-SDRAM). AMD says the feedback they got from their customers was that registered DIMMs are what they wanted; so registered DIMMs are what they got.

Typically when new multiprocessor products come out, the first motherboards released are targeted at the high-end market where integrated-everything is desired. A few months from now, we will see workstation-specific boards that are more stripped-down, and perhaps then they’ll support unbuffered DDR memory.

460-watt Power Supplies, Oh My!

Did I say “power supply”? I’m sorry, I meant “nuclear reactor!” It seems that desktop users have just gotten used to the idea of 300-watt power supplies. Before we get too dismayed at the power requirements, though, let’s remember this is a workstation/server product. Not only will there be an extra processor, but typical configurations will include perhaps a RAID array consisting of several high-speed SCSI hard drives, power-hungry professional OpenGL graphics accelerators, and other high-speed I/O devices that suck a lot of energy.

AMD said they are developing a power supply specification and will release details about it shortly. For the time being, you can get a qualified unit made by NMB or Delta. I did a quick search on PriceWatch.com and found NMB’s SD025A460WSW, an approved power supply, for under $200. Relative to the overall costs of setting up a server/workstation, an extra $150 for a beefier power supply is not that much, especially when you consider the damage that under-powering your system could do to your expensive graphics card, SCSI drives, or data.


AMD Smart MP Technology

Smart MP Technology is really a function of the chipset, but I thought it deserved its own topic. Later on, when you get to the benchmark data, you’ll see that AMD’s MP configurations are scaling rather nicely compared to the competition. Smart MP is a key to this scalability, and it is a major difference between AMD’s multiprocessor implementation and the competition’s.

Dual Point-to-Point 266MHz System Busses

To understand the advantages of Smart MP, it’s useful to first discuss the “traditional” ways of doing things.

Shared CPU Bus Diagram

Intel’s multi-processor implementation for the Pentium III processor relies on a shared-bus architecture. Both processors share a single bus of communication with the Northbridge, which is their connection to all other system resources such as main memory. This means that the two processors must share the bandwidth afforded by the particular bus spec. In the case of a Pentium III system running at 133MHz FSB, that equals roughly 1GB/s. So if both processors are fully utilized, you can see that they will each get half as much bandwidth as they would in a single-processor system. This presents a performance bottleneck, as supplying both processors with a steady workload is of major importance to take full advantage of a multiprocessor configuration.

Now let’s see how AMD’s Smart MP Technology improves upon the situation:

Point-to-Point CPU Bus Diagram

AMD’s Smart MP design dedicates a full bus from the Northbridge to each processor. In this way, each AMD Athlon MP processor in a multiprocessor system gets its own dedicated bandwidth to the system logic and the bottleneck is removed. Since the AMD 760 MP chipset is operating at 266MHz DDR FSB, that yields a dedicated 2.1GB/s bandwidth to EACH processor. This is actually the same EV6 bus, based on Alpha technology, which is used for single-processor Athlon systems, except there are now two of them.
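The bandwidth figures quoted for the two designs follow from simple arithmetic: effective transfers per second multiplied by the width of the data path. The short Python sketch below works through it. It assumes a 64-bit (8-byte) data path on both buses, and the function names are purely illustrative.

```python
def peak_bandwidth_mb_s(effective_mhz, bus_width_bits=64):
    """Peak bus bandwidth in MB/s: million transfers/s times bytes per transfer."""
    return effective_mhz * bus_width_bits / 8


def per_cpu_bandwidth_mb_s(effective_mhz, cpus, shared):
    """Bandwidth available to each CPU when every CPU is busy.

    On a shared bus the CPUs split one bus between them; with
    point-to-point buses each CPU has a full bus to itself.
    """
    total = peak_bandwidth_mb_s(effective_mhz)
    return total / cpus if shared else total


if __name__ == "__main__":
    # Pentium III: one 133MHz bus shared by two processors.
    print("Shared bus, per CPU:", per_cpu_bandwidth_mb_s(133, cpus=2, shared=True), "MB/s")
    # Athlon MP: a dedicated 266MHz (DDR) bus for each processor.
    print("Point-to-point, per CPU:", per_cpu_bandwidth_mb_s(266, cpus=2, shared=False), "MB/s")
```

Under these assumptions the shared 133MHz bus peaks at 1,064MB/s in total, or 532MB/s per processor when both are busy, while each Athlon MP gets 2,128MB/s (about 2.1GB/s) to itself. These are theoretical peaks, not measured throughput.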


MOESI and Snoop Busses

Once again, let’s look at a diagram of the traditional shared bus that is employed in conventional systems:

Shared Front-Side-Bus Diagram

In a shared bus, any communication between the processors must go over the bus (which, remember, is more congested than AMD’s dual EV6 busses), pass through the Northbridge controller, and be written to main memory. Then, the other processor must make that same journey over the bus, through the Northbridge, to memory to read the data.

Now let’s see how AMD once again refines that process:

Point-to-Point Front-Side-Bus Diagram

As you can see, with the point-to-point bus, Athlon MP processors are able to communicate with each other through just the Northbridge controller. This has many advantages. For one, it reduces latency because the trip is shorter. It also will be faster because the processors are reading directly from each other’s cache, as opposed to the much slower system memory. Lastly, since system memory is not being used for the transaction, it frees up system memory (although this is a very small amount relative to the massive RAM likely to be installed in a workstation/server). The net effect is an increase in performance compared to the traditional shared-bus architecture.

You might wonder in what type of situation this type of transaction would be employed. You’ve probably at least heard reference to the cache on a processor. Basically, this is a small amount of very high-speed memory, hundreds of times faster than SDRAM, which is embedded on the processor. The processor will store data that it has already retrieved in its cache so that if it needs that data again it can retrieve it from the fast cache rather than from the relatively slow system memory.

This is similar to how your Internet browser uses a cache to store web pages you’ve downloaded before. The next time you visit a site you’ve been to before, if you’ve already looked at a page, the browser can pull it from a cache on your local hard disk rather than going over the slow Internet to get it. To continue the illustration, suppose a web page did not exist in your own computer’s browser cache, but it did exist in the browser cache of someone else’s computer on your local network. While not as fast as getting it off your local system, it would still be much faster to read that page over your local network than to retrieve it over the Internet.

Remembering that illustration, suppose CPU0 in the diagram above wants to retrieve data that is not in its own cache but exists in the cache of CPU1; it would be much faster to retrieve the data from the cache of CPU1 than it would from the slow system memory. This is possible with both the traditional shared-bus design as well as the dual-bus design of the 760 MP, but remember the shared-bus design requires that the transaction be routed through system memory, thereby nullifying the potential benefits.

To make this inter-processor communication possible, AMD uses something they call a “Modified Owner Exclusive Shared Invalid” Cache Coherency Protocol, or MOESI (pronounced mo-ess-ee). MOESI keeps track of what data is stored in which processor’s cache. When either CPU requests data from system memory, MOESI checks first to see if that data is stored in the other CPU’s cache. If so, then it retrieves it from the high-speed cache instead of memory and supplies it back to the requestor. This communication path where the chipset “listens” to data requests from the processors is called a “Snoop” bus.
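For readers who like to see the idea in code, here is a heavily simplified Python sketch of that lookup order: own cache first, then the other processor’s cache, then system memory. The `ToyDualCpuSystem` class and its methods are hypothetical names invented for this illustration, and the state changes shown (a Modified line becoming Owned, an Exclusive line becoming Shared when the other CPU reads it) follow the textbook description of MOESI rather than any AMD documentation. It ignores cache sizes, evictions, and write-backs entirely.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    OWNED = "O"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class ToyDualCpuSystem:
    """Two CPUs with private caches plus main memory (illustrative only)."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.caches = [{}, {}]  # per CPU: address -> (State, value)

    def read(self, cpu, address):
        own, peer = self.caches[cpu], self.caches[1 - cpu]
        state, value = own.get(address, (State.INVALID, None))
        if state is not State.INVALID:
            return value, "own cache"
        # Snoop: check whether the other CPU holds a valid copy.
        peer_state, peer_value = peer.get(address, (State.INVALID, None))
        if peer_state is not State.INVALID:
            # The supplier keeps its copy: M becomes O, E becomes S.
            downgrade = {State.MODIFIED: State.OWNED,
                         State.EXCLUSIVE: State.SHARED}
            peer[address] = (downgrade.get(peer_state, peer_state), peer_value)
            own[address] = (State.SHARED, peer_value)
            return peer_value, "peer cache"
        # Nobody has it cached, so fall back to slow system memory.
        value = self.memory[address]
        own[address] = (State.EXCLUSIVE, value)
        return value, "system memory"

    def write(self, cpu, address, value):
        # Invalidate the other CPU's copy; memory is not updated here.
        self.caches[1 - cpu].pop(address, None)
        self.caches[cpu][address] = (State.MODIFIED, value)


if __name__ == "__main__":
    system = ToyDualCpuSystem({0x100: 7})
    print(system.read(0, 0x100))  # first read comes from system memory
    print(system.read(1, 0x100))  # CPU1 is served from CPU0's cache
    system.write(0, 0x100, 9)
    print(system.read(1, 0x100))  # CPU1 gets the new value without a memory trip
```

The last read is the interesting one: CPU1 receives the value CPU0 just wrote even though main memory still holds the old value, which is the situation the Owned state exists to handle.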


AMD’s Multiprocessor Strategy and the Future

I think AMD’s strategy is smart. They began by attacking the market in which they had the most experience and where the customers were most willing to adopt an unproven product, the consumer desktop space. Having established their product there, they began moving into the corporate desktop space with their newly acquired respect. So, having put their best foot forward already, how are they going to proceed?

Well, in recent weeks they’ve launched the AMD Duron Mobile Processor and Athlon 4 processor, both aimed at the notebook market. The new “Palomino” core is important for their success there because of its lower power requirements. With the Athlon MP and 760 MP they are initiating “AMD’s attack into the 1- and 2-way server market” (quotation from AMD’s press materials). The idea is that they will get some penetration with Athlon MP solutions in the dual-processor market, which is the low end of the server market. This is also the server segment that is the largest and currently experiencing the most growth, and its customers are more willing to adopt new solutions than the upper echelons of the market.

Socket A Fever!

Has anyone else gotten tired of the constantly changing form-factors for processors? Well, it appears AMD has tired of this as well. AMD’s representative was able to confirm what the roadmap says, which is that Socket A is the planned interface for all AMD Athlon processors through 2002 and beyond. For those of you who are codename-happy, at the least this includes Morgan, Thoroughbred, Appaloosa, and Barton. Not only will the Socket A remain unchanged, but AMD told me that these future versions of the Athlon/Duron processor family should work on current motherboards (of course a BIOS update will be in order at that time).

Some of you may know first-hand of the cost-savings this could bring about as you upgrade. I know I wasn’t too happy about having to replace my Slot A Asus K7V motherboard to get an Athlon “Thunderbird” 1.33GHz processor, but I bit the bullet. If they’d been the same form factor, I could’ve saved over $100. Now, if you’re an enterprise environment, that savings could potentially be multiplied many times over.

However, most businesses use systems for their life and then replace them entirely, rather than upgrading incrementally. Having a common Socket A platform is still important, though, as it simplifies administration by allowing a single platform to be standardized across an entire enterprise.

3rd-Party MP Chipsets?

I asked AMD if there were any 3rd-party chipsets in the works for dual-processor operation. They of course couldn’t comment on unreleased products from other companies, but they did say that they would work to enable any such efforts by 3rd parties. “We’re in the business of selling processors,” Bret Kirby said. This is consistent with AMD’s pattern of cooperating with 3rd-party chipset vendors. In fact, in the past you could even say relying on them, as AMD has sometimes looked a bit too eager to hand off the chipset business to other companies such as VIA and ALI.

However, it seems clear that AMD recognizes that they can’t count on someone else to get the job done right when it comes to the very demanding workstation/server market space, and so they are in the chipset business for the duration (at least as far as multiprocessing goes).

I’ve heard VIA may be working on an MP chipset for Athlon, but I wouldn’t expect that until next year. What I really wish for is a solution from the likes of nVIDIA, who this week released their “nForce” chipset for the Athlon. While some may correctly point out that the nForce is targeted at consumer desktops, it also features some innovations that could find a home on the high end, such as the use of AMD’s HyperTransport technology as a link to the Southbridge.

To be clear, I haven’t even heard rumors about nVIDIA having any such plans, so this is pure wild speculation. However, AMD and nVIDIA are great friends, and a man can dream, can’t he?

Dual Durons Are Coming

On AMD’s roadmap we see a product code-named “Morgan” slated for release in the second half of this year. This will be a multi-processor version of the Duron processor based on the Palomino core and targeted at low-end servers and workstations, such as appliance servers and network-attached storage. Judging from the excellent value of the single-processor Duron, this may be a great solution for workstation users on a budget, though its small cache will severely hamper it in heavier I/O applications.

“Hammer” and x86-64

Next year AMD will launch their “Hammer” family of products with x86-64 technology and support for 4- and 8-processor configurations. I am told by AMD that this will be their first platform to support more than two processors, and will also be the first AMD product to incorporate their HyperTransport Technology (a.k.a. “Lightning Data Transport”), which is a revolutionary high-speed interface that has already been incorporated in third-party products (such as nVIDIA’s nForce, a.k.a. “Crush”, chipset).

Intel has already launched their 64-bit computing platform, “Itanium”. One major difference between Intel’s and AMD’s strategy for next-generation 64-bit processing is that AMD’s Hammer line will natively support 32-bit processing as well as 64-bit, while Intel’s will resort to emulation to run any 32-bit code. This means that AMD’s Hammer line will allow you to take advantage of 64-bit programs while not sacrificing 32-bit performance, while Intel requires you to choose one or the other (Pentium 4 for 32-bit, Itanium for 64-bit; you can run 32-bit on Itanium but it is very slow because it is in emulation mode). While AMD’s approach sounds very pragmatic at this time, only time will tell which approach is better or more successful.

Future Summary

Looking back since the launch of the original Athlon processor, and looking forward to next year’s launch of the Hammer line of products, we see that AMD’s strategy consists of a series of careful steps, each one taking them a little further into the pool. Perhaps because of this gradual and deliberate strategy, AMD has executed with confidence and poise.

We have every reason to believe that AMD will continue on their successful track and build new inroads into the workstation/server market. As an engineer, I admire their success and hope they continue with their innovation and competitiveness. As a consumer, I also hope that Intel is able to respond and keep things interesting, and that this battle for dominance between the chip kings is long and drawn-out.


Show Me the Money!

Okay, enough talk, let’s get to the important part: benchmarks, or, more specifically, real-world performance. Since this is not a hardware site but an interest site for Digital Content Creators, we’re going to focus on benchmarks that enlighten us on AMD’s MP performance in that regard. These benchmarks were provided to us by AMD. We intend to perform our own benchmarking in the near future, and will update you with the results at that time.

SPEC Floating-Point 2000

SPEC’s FP test is useful for comparing floating-point, compute-intensive operations. Floating-point operations are especially important in 3D applications for real-time display, rendering, and solving operations (such as particle systems, kinematics, etc.). The Base score uses the default compiling options for the benchmark, as defined by SPEC, while the Peak score reflects performance after the program has been tweaked to optimize performance.

SPEC Floating-Point 2000 Benchmarks

The new Athlon MP processor bests the Athlon “Thunderbird” at the same clock speed by between 11% and 15%. To see this kind of performance difference between two core revisions is pretty impressive, enough to justify the “MP” version of the Athlon as more than just a marketing label. Since this is a single-processor test, the performance gain is coming from the optimizations made in the Palomino core of the Athlon MP processor. The existing Athlon is already well known for having much better floating-point performance than Intel’s Pentium III or Pentium 4 processor, so AMD only has itself to compete against in this aspect.


BAPCO SYSmark 2000 Internet Content Creation

BAPCO’s Internet Content Creation benchmark consists of a suite of real-world benchmarks using Adobe Premiere 5.1, Adobe Photoshop 5.5, Avid Elastic Reality 3.1, MetaCreations Bryce 4, and Microsoft Windows Media Encoder 4.

BAPCO SYSmark 2000 Internet Content Creation Benchmarks

Athlon MP 1.2 GHz beats the Pentium 4 1.7 GHz by almost 11% in this benchmark. In the past, Pentium 4 fared a little bit better in this benchmark against the Athlon “Thunderbird”, but Athlon MP can now take advantage of the SSE optimizations present in some of these benchmarks that perform heavy image processing. The pre-fetch improvements help a little bit, too. We also see a 16% increase between the 1 GHz and 1.2 GHz versions of the Athlon MP, compared to a 20% increase in clock speed, so the performance is scaling fairly well as the clock speed increases in this test.
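One way to put a number on “scaling fairly well” is to divide the performance gain by the clock-speed gain. The small Python sketch below does that with the figures quoted above. The term “scaling efficiency” and the function names are my own shorthand for this illustration, not anything from BAPCO or AMD.

```python
def percent_gain(old, new):
    """Percentage increase going from old to new."""
    return 100 * (new - old) / old


def scaling_efficiency(performance_gain_pct, clock_gain_pct):
    """Fraction of the clock-speed increase that shows up as performance."""
    return performance_gain_pct / clock_gain_pct


if __name__ == "__main__":
    clock_gain = percent_gain(1000, 1200)            # 1GHz -> 1.2GHz, in MHz
    efficiency = scaling_efficiency(16, clock_gain)  # 16% gain quoted above
    print(f"Clock gain: {clock_gain:.0f}%, scaling efficiency: {efficiency:.0%}")
```

By this measure, about 80% of the extra clock speed turns into extra performance in this benchmark. A result of 100% would mean performance rises in lockstep with clock speed, which rarely happens once memory and I/O become the limiting factors.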


Adobe Photoshop 5.5

I doubt Photoshop needs introduction to our audience. It is probably one of the most widely used applications across all types of content creation.

Adobe Photoshop 5.5 Benchmarks

The Athlon MP 1.2 GHz trounces the Pentium 4 1.7 GHz, besting it by no less than 25%. Even the Athlon MP 1 GHz is beating it by over 15%. Performance increases only 8.33% from the 1 GHz Athlon MP to the 1.2 GHz.


3D Studio Max

We already know that the Athlon has much stronger floating-point performance than the Pentium, so we can expect it to win this test easily.

3D Studio MAX Benchmarks

Indeed, no surprises here. At the same clock speed, dual Athlon MP 1 GHz processors beat dual Pentium III 1 GHz processors by almost 12%, while the 1.2 GHz Athlon MPs beat the 1 GHz Pentium IIIs by 25%. Performance scaling is good again at almost 15% when stepping up to the 1.2 GHz part.


SoftImage|XSI

Athlon enthusiasts will be happy with the recent announcement by Avid that they have certified their high-end 3D modeling and animation software for use on AMD's Athlon processor. You might say it's about time, as we've known for a while how excellent the Athlon is in this type of application. Let's hope this sets a precedent for others to follow, as well as for other divisions of Avid...

SoftImage|XSI results are very similar to those for 3D Studio MAX, showing that the Athlon MP makes an excellent choice for 3D modeling and animation professionals.

SoftImage|XSI Benchmarks

Here the Athlon MP is besting the equivalent Pentium III by just over 9%. Performance scaling is again at approximately 16%.


Benchmark Summary

There are many more benchmarks that we’d like to see, such as more specific application benchmarks and a complete cross-platform comparison. However, I think that by culling out these early benchmarks that are especially relevant to digital content creation, an accurate story begins to emerge.

First, the Athlon MP has improved significantly upon the performance of the Athlon “Thunderbird”. We’re seeing potential double-digit percentage improvements with the new design. Second, the performance is scaling well as clock speeds increase: the content creation and 3D tests above gain roughly 8% to 16% from a 20% increase in clock speed.
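
One rough way to put a number on "scaling well" is to divide each performance gain by the 20% clock-speed increase. The Python sketch below does that using the approximate gains quoted in the sections above for the step from 1 GHz to 1.2 GHz. The function name and the dictionary are my own illustration, not part of any benchmark suite, and because the inputs are rounded, the results are only indicative.

```python
# Rough scaling-efficiency check: performance gain divided by clock gain.
# All inputs are the approximate percentages quoted in this article,
# so treat the results as ballpark figures only.

def scaling_efficiency(perf_gain_pct, clock_gain_pct):
    """Fraction of the clock-speed increase that shows up as performance."""
    return perf_gain_pct / clock_gain_pct

CLOCK_GAIN_PCT = 20.0  # 1 GHz -> 1.2 GHz

quoted_gains_pct = {
    "SYSmark 2000 Internet Content Creation": 16.0,
    "Adobe Photoshop 5.5": 8.33,
    "3D Studio MAX": 15.0,   # quoted as "almost 15%"
    "SoftImage|XSI": 16.0,   # quoted as "approximately 16%"
}

efficiencies = {
    name: scaling_efficiency(gain, CLOCK_GAIN_PCT)
    for name, gain in quoted_gains_pct.items()
}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.0%} of the clock increase")
```

On these figures, three of the four tests turn about three-quarters or more of the extra clock speed into performance, while Photoshop turns well under half of it into performance.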

There are a host of other benchmarks that AMD has provided, and you can view them at your leisure from their web site at http://www.amd.com/products/cpg/server/Athlon/benchmarks.html. I chose only to include content-creation, 3D application, and FPU benchmarks since those are most relevant to our readers. What the other workstation and server benchmarks show is consistent with what we’ve seen here: the AMD 760 MP chipset with the Athlon MP processor generally outperforms Pentium III and Pentium 4 processor-based solutions. Also, AMD MP solutions tend to benefit slightly more from a second processor, which makes sense after examining their Smart MP technology.


Final Conclusions

Perhaps the easiest way to understand the performance of AMD’s multiprocessor solutions is by remembering what we already knew about the Athlon “Thunderbird”. Athlon is already the fastest all-around x86 processor available, and costs significantly less than competing solutions. Now AMD has made the Athlon processor better and available in dual-CPU configurations. Add to that a supporting chipset with a better-optimized high-bandwidth multiprocessor bus, and you have a very scalable platform that emerges as the price/performance leader in the multiprocessor market. In other words, everything that Athlon is to price/performance in the single-processor space, it now is in the dual-processor space as well.

However, it’ll likely be a couple of months before AMD’s MP solution really hits its stride, for a few reasons. First, Tyan’s Thunder K7 motherboard is currently the only one available, at a price of about $600 with everything integrated. Second, you have to use a proprietary 460-watt power supply that is only available from Delta or NMB at almost $200. So immediately, those two things reduce the pricing advantage of the Athlon MP relative to the Pentium III. However, the total cost is still considerably less than a dual Pentium 4 Xeon system.

Another potential cost issue for upgraders is the requirement for registered DDR memory, which does cost about 10% more. However, this is less of an issue since most people will be buying new platforms with 760 MP as opposed to upgrading incrementally, and workstation/server systems typically come with ECC memory anyway.

Stability: The X-Factor

Price and performance are two very important issues, but none of that matters in the workstation/server market without excellent stability. That isn’t something that benchmarks can show, with the obvious exception of a system not being able to complete benchmarks in the first place. Only time in the market can adequately establish the stability of Athlon MP and 760 MP, and that is one reason it will take a while before most of the major OEMs begin adopting them in their product line.

However, AMD knows this too, and has subjected their multiprocessor platform to extra-vigorous validation and testing. The 760 MP chipset has been two years in the making because AMD didn’t want to release it until they knew they had it right. Early reports of stability are very encouraging. There’s every reason to believe that Athlon MP will succeed in the server/workstation market much as Athlon did in the desktop market. While the multiprocessor market is more conservative and less willing to adopt new solutions, AMD also has a newly gained respect based on their success with Athlon.

The Best is Yet to Come

In the third quarter, we’ll begin seeing more motherboard options from several other leading manufacturers. Already-announced products from Asus, MSI, Gigabyte, Abit, and Tyan will be sans all the integrated features such as SCSI, video, and LAN, which will reduce motherboard cost to under $200. Some of these will likely work with standard ATX power supplies and support unbuffered DDR memory. Once these additional choices have arrived, the Athlon MP and 760 MP will fully realize their potential for leadership in multiprocessor price/performance.
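
To put rough numbers on that, the Python sketch below adds up the platform-specific costs quoted in this article: about $600 for the Thunder K7, almost $200 for its proprietary power supply, and under $200 for the stripped-down boards. The function name is mine, CPUs and memory are left out, and the zero power-supply cost for the later boards is purely a hypothetical case in which a standard ATX supply already on hand can be reused.

```python
# Back-of-the-envelope platform cost comparison using the approximate
# prices quoted in this article. CPUs and memory are deliberately excluded.

def platform_cost_usd(motherboard_usd, power_supply_usd):
    """Motherboard plus power supply, in US dollars."""
    return motherboard_usd + power_supply_usd

# Launch platform: Tyan Thunder K7 plus the proprietary 460-watt supply.
launch_cost = platform_cost_usd(600, 200)

# Later boards: "under $200", so 200 is an upper bound. The power-supply
# cost of 0 is a hypothetical assumption (reusing an existing ATX supply).
later_cost = platform_cost_usd(200, 0)

print(f"Launch platform: about ${launch_cost}")
print(f"Later platform:  at most about ${later_cost}")
print(f"Difference:      about ${launch_cost - later_cost}")
```

If a new standard power supply has to be bought, the difference shrinks by whatever that supply costs.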

The third quarter will also see the follow-up release of the AMD 760 MPX chipset, which will add 66MHz/64-bit PCI bus support and a better connection between the Northbridge and Southbridge chips. Also, in the second half of this year we’ll see the debut of “Morgan”, the multiprocessor-capable version of the Duron that will be based on the Palomino core architecture.


Beware the Sleeping Giant

As for the competition, I’ve heard it said recently that laurels are uncomfortable to sit on. Intel’s Pentium 4 and Pentium 4 Xeon continue to be interesting products with a lot of potential, but that potential is as-yet unrealized. In order for them to become more competitive on a performance-basis, they will need to ramp up clock speeds well above 2 GHz to compensate for their slower per-clock performance.
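
As a back-of-the-envelope illustration, the Python sketch below estimates the clock speed at which a Pentium 4 would draw level with the 1.2 GHz Athlon MP in the two tests above that compare against the Pentium 4 1.7 GHz. It assumes, optimistically, that Pentium 4 performance rises in direct proportion to clock speed. The function name is mine, and the leads are the rounded figures quoted earlier.

```python
# Estimate the clock speed a processor needs to erase a rival's lead,
# ASSUMING performance scales perfectly linearly with clock speed.
# That assumption is optimistic, so these are lower bounds at best.

def break_even_clock_ghz(current_clock_ghz, rival_lead_pct):
    """Clock needed to match the rival under perfectly linear scaling."""
    return current_clock_ghz * (1 + rival_lead_pct / 100)

P4_CLOCK_GHZ = 1.7

# Athlon MP 1.2 GHz leads over the Pentium 4 1.7 GHz, as quoted above.
athlon_mp_leads_pct = {
    "SYSmark 2000 Internet Content Creation": 11.0,  # "almost 11%"
    "Adobe Photoshop 5.5": 25.0,                     # "no less than 25%"
}

break_even = {
    name: break_even_clock_ghz(P4_CLOCK_GHZ, lead)
    for name, lead in athlon_mp_leads_pct.items()
}

for name, ghz in break_even.items():
    print(f"{name}: roughly {ghz:.1f} GHz")
```

That works out to roughly 1.9 GHz and 2.1 GHz just to match the 1.2 GHz part. The Athlon MP results above show gains of well under 20% for a 20% clock increase; if the Pentium 4 behaves similarly, and as faster Athlon MPs arrive, the real targets would be higher.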

Intel has already improved pricing by drastic cuts in chip prices, but it’s going to take more than that. It’s no secret that the high cost of RDRAM is the single biggest obstacle to pricing parity. Despite the gradual fall in RDRAM prices, it’s not enough. Deliverance is going to have to come in the form of an SDRAM-based platform for Pentium 4. Early reports are not overly enthusiastic about the performance of these solutions, but they are still in development, so they could well improve before release.

Barring some unforeseen fiasco, such as a major foul-up by AMD in their manufacturing process or quality assurance, Intel has a small window of opportunity to strike back – perhaps six months or a year at most. After that, and probably much sooner, even conservative large OEMs will be forced to acknowledge the merits of AMD’s solution.

This is not to say that Intel doesn’t have a compelling product. The Pentium 4 and Pentium 4 Xeon products hold onto their lead in bandwidth-limited applications such as media encoding, although the lead is diminished with AMD’s 3DNow! Professional technology and other MP optimizations. The Pentium 4 core technology has also proven it has legs, with the ability to scale to very high clock speeds.

From a consumer standpoint, we should all hope for success by both Intel and AMD. During the past two years, we’ve seen great advancements in performance and value, largely attributable to the competition between these two market leaders.

Bottom Line

The AMD Athlon MP Processor on the 760 MP chipset performs as expected, which is to say very well. It is not perfect, but the imminent release of cheaper motherboards and faster-clocked Athlon MP processors will enhance it and extend its position as the overall price/performance leader. If you have ignored or disparaged AMD, it’s time to take another look now that they have the price/performance-leading solution in both single and dual-processor systems. If you already liked the Athlon for single-processor systems, you’ll also like it for multiprocessor systems.

As for myself, I have no doubt that my next dual-processor system will be equipped with the AMD Athlon MP processor.

Tyler A. Hawes

 

Agree? Disagree? Want to know more? Visit Creative Cow's AMD forum cowmunity


Tyler A. Hawes has a background in the technology industry, having worked for Microsoft, 3Com and Intel in areas from support to Software Engineer. A year ago he founded Audio Intervisual Design, providing Web Development, Computer Animation, Video Production and Post-Production, as well as integration services for Canopus-based real-time non-linear editing systems. He is an active Creative Cow member and a Cowmunity leader on the AMD, Canopus, and 3D Studio MAX forums.

