GPU Cheatsheet - A History of Modern Consumer Graphics Processors

Author: Jarred Walton

by Jarred Walton on September 6, 2004 12:00 AM EST

DirectX 8 Performance

Below you can see our plot of the DirectX 8 components.


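DirectX 8 and 8.1
Card	Core (MHz)	RAM (MHz)*	Pixel Pipelines	Textures/Pipeline**	Vertex Pipelines***	Bus Width (bits)	Multi-Texturing Fill Rate (MTexels/s)+	Vertex Rate (MVertices/s)++	Bandwidth (MB/s)+++	Relative Fill Rate	Relative Bandwidth	Relative Vertex Rate	Relative Performance++++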
GF4 Ti4800	300	650	4	2	2	128	2400	135	9918	120.0%	130.0%	120.0%	123.3%
GF4 Ti4600	300	600	4	2	2	128	2400	135	9155	120.0%	120.0%	120.0%	120.0%
GF4 Ti4400	275	550	4	2	2	128	2200	124	8392	110.0%	110.0%	110.0%	110.0%
GF4 Ti4800 SE	275	550	4	2	2	128	2200	124	8392	110.0%	110.0%	110.0%	110.0%
GF4 Ti4200 8X	250	514	4	2	2	128	2000	113	7843	100.0%	102.8%	100.0%	100.9%
GF4 Ti4200 64	250	500	4	2	2	128	2000	113	7629	100.0%	100.0%	100.0%	100.0%
GF4 Ti4200 128	250	444	4	2	2	128	2000	113	6775	100.0%	88.8%	100.0%	96.3%
8500	275	550	4	2	1	128	2200	69	8392	110.0%	110.0%	61.1%	93.7%
9100 Pro	275	550	4	2	1	128	2200	69	8392	110.0%	110.0%	61.1%	93.7%
9100	250	500	4	2	1	128	2000	63	7629	100.0%	100.0%	55.6%	85.2%
8500 LE	250	500	4	2	1	128	2000	63	7629	100.0%	100.0%	55.6%	85.2%
9200 Pro	300	600	4	1	1	128	1200	75	9155	60.0%	120.0%	66.7%	82.2%
GF3 Ti500	240	500	4	2	1	128	1920	54	7629	96.0%	100.0%	48.0%	81.3%
9000 Pro	275	550	4	1	1	128	1100	69	8392	55.0%	110.0%	61.1%	75.4%
GeForce 3	200	460	4	2	1	128	1600	45	7019	80.0%	92.0%	40.0%	70.7%
9000	250	400	4	1	1	128	1000	63	6104	50.0%	80.0%	55.6%	61.9%
9200	250	400	4	1	1	128	1000	63	6104	50.0%	80.0%	55.6%	61.9%
GF3 Ti200	175	400	4	2	1	128	1400	39	6104	70.0%	80.0%	35.0%	61.7%
9250	240	400	4	1	1	128	960	60	6104	48.0%	80.0%	53.3%	60.4%
9200 SE	200	333	4	1	1	64	800	50	2541	40.0%	33.3%	44.4%	39.2%
* RAM clock is the effective clock speed, so 250 MHz DDR is listed as 500 MHz.
** Textures/Pipeline is the maximum number of texture lookups per pipeline.
*** NVIDIA says their GFFX cards have a "vertex array", but in practice it generally functions as indicated.
**** Single-texturing fill rate = core speed * pixel pipelines
+ Multi-texturing fill rate = core speed * maximum textures per pipe * pixel pipelines
++ Vertex rates can vary by implementation. The listed values reflect the manufacturers' advertised rates.
+++ Bandwidth is expressed in actual MB/s, where 1 MB = 1024 KB = 1048576 Bytes.
++++ Relative performance is normalized to the GF4 Ti4200 64, but these values are at best a rough estimate.
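
The footnote formulas can be checked against the table with a few lines of Python. This is only a rough sketch: the function names are invented for illustration, and it assumes the overall relative performance is the plain average of the three relative columns, which is consistent with the listed values but is not spelled out in the footnotes.

```python
# Rough check of the table's derived columns. All function names are
# illustrative; none come from the article.

def multitexture_fill_rate(core_mhz, pixel_pipes, textures_per_pipe):
    """Multi-texturing fill rate in millions of texels per second."""
    return core_mhz * textures_per_pipe * pixel_pipes

def bandwidth_mb_s(effective_ram_mhz, bus_width_bits):
    """Memory bandwidth in MB/s, where 1 MB = 1048576 bytes."""
    bytes_per_second = effective_ram_mhz * 1_000_000 * bus_width_bits / 8
    return bytes_per_second / 1_048_576

def overall_relative(fill_ratio, bandwidth_ratio, vertex_ratio):
    """Unweighted mean of the three relative columns (an assumption)."""
    return (fill_ratio + bandwidth_ratio + vertex_ratio) / 3

# Baseline GF4 Ti4200 64: 250 MHz core, 500 MHz effective RAM, 128-bit bus.
base_fill = multitexture_fill_rate(250, 4, 2)
base_bw = bandwidth_mb_s(500, 128)

# GF4 Ti4800: 300 MHz core, 650 MHz effective RAM, 128-bit bus.
ti4800_fill = multitexture_fill_rate(300, 4, 2)
ti4800_bw = bandwidth_mb_s(650, 128)

# The relative vertex rate (120.0%) is taken straight from the table.
ti4800_overall = overall_relative(ti4800_fill / base_fill, ti4800_bw / base_bw, 1.2)

print(base_fill, round(base_bw))      # baseline fill rate and bandwidth
print(f"{ti4800_overall:.1%}")        # compare with the table's 123.3%
```

The same functions reproduce the narrow-bus case as well: a 333 MHz effective clock on a 64-bit bus works out to roughly the 2541 MB/s listed for the 9200 SE.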

No weighting has been applied to the DirectX 8 charts, and performance in games generally falls in line with what is represented in the above chart. Back in the DirectX 8 era, NVIDIA really had a huge lead in performance over ATI. The Radeon 8500 was able to offer better performance than the GeForce 3, but that lasted all of two months before the launch of the GeForce4 Ti line. Of course, many people today continue running GeForce4 Ti cards with few complaints about performance - only high quality rendering modes and DX9-only applications are really forcing people to upgrade. For casual gamers, finding a used GF4 Ti card for $50 or less may be preferable to buying a low-end DX9 card. It really isn't until the FX5700 Ultra and FX5600 Ultra that the GF4 Ti cards are outclassed, and those cards still cost well over $100 new.

ATI did have one advantage over NVIDIA in the DirectX 8 era, however. They worked with Microsoft to create an updated version of DirectX: version 8.1. This added support for some "advanced pixel shader" effects, which brought the Pixel Shader version up to 1.4. There wasn't anything that could be done in DX8.1 that couldn't be done with DX8.0, but several operations could be done in one pass instead of two. Game support for DirectX 8 was very late in coming, however, and support for ATI's extensions came later still. There are a few titles which now support the DX8.1 extensions, but even then the older DX8.1 ATI cards are generally incapable of running these games well.

It is worth noting that the vertex rates on the NVIDIA cards are calculated as 90% of the clock speed times the number of vertex pipelines, divided by four. Why is that important? It's not, really, but on the FX and GF6 series of cards, NVIDIA uses clock speed times vertex pipelines divided by four for the claimed vertex rate, without the 10% reduction. It could be that architectural improvements made the vertex rate faster. Such detail was lacking on the ATI side of things, although 68 million vertices/second for the 8500 was claimed in a few places, which matches the calculation used on NVIDIA's DX9 cards (275 MHz times one vertex pipeline, divided by four, is 68.75). You don't have to look any further than benchmarks such as 3DMark01 to find that these theoretical maximums are never reached, of course - even with one light source and no textures, the high polygon count scene doesn't come near the claimed rate.
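
The two vertex-rate conventions described above can be sketched in Python. This is a minimal illustration rather than anything from NVIDIA or ATI: the function name and the `derate` parameter are hypothetical, and the unrounded results are compared against the rounded figures in the table.

```python
# Sketch of the vertex-rate estimates described above. The function name and
# the `derate` parameter are illustrative, not from the article.

def vertex_rate(core_mhz, vertex_pipes, derate=1.0):
    """Claimed vertex rate in millions of vertices per second."""
    return core_mhz * vertex_pipes / 4 * derate

# NVIDIA DX8 cards: 90% of clock speed times vertex pipelines, divided by four.
ti4200 = vertex_rate(250, 2, derate=0.9)   # about 112.5, listed as 113
ti4600 = vertex_rate(300, 2, derate=0.9)   # about 135, as listed

# Radeon 8500: the undiscounted formula gives 68.75, in line with the
# roughly 68 million vertices/second claim.
r8500 = vertex_rate(275, 1)

print(ti4200, ti4600, r8500)
print(f"{r8500 / ti4200:.1%}")  # relative vertex rate versus the Ti4200
```

Dividing the unrounded 8500 figure by the unrounded Ti4200 figure gives the 61.1% shown in the table, which suggests the relative columns were computed before rounding.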

43 Comments

JarredWalton - Thursday, October 28, 2004 - link
43 - It should be an option somewhere in the ATI Catalyst Control Center. I don't have an X800 of my own to verify this on, not to mention a lack of applications which use this feature. My comment was more tailored towards people that don't read hardware sites. Typical users really don't know much about their hardware or how to adjust advanced settings, so the default options are what they use.
Thera - Tuesday, October 19, 2004 - link
You say SM2.0b is disabled and consumers don't know how to turn it on. Can you tell us how to enable SM2.0b?

Thank you.

(cross posted from video forum)
endrebjorsvik - Wednesday, September 15, 2004 - link
WOW!! Very nice article!!

Does anyone have all this data collected into an Excel file or something??
JarredWalton - Sunday, September 12, 2004 - link
Correction to my last post. KiB and MiB and such are meant to be used for size calculations, and then KB and MB can be used for bandwidth calculations. Now the first paragraph (and my gripe) should be a little more clear if you didn't understand it already. Basically, the *bandwidth* companies (hard drives, and to a lesser extent RAM companies advertising bandwidth) proposed that their incorrect calculations stand and that those who wanted to use the old computer calculations should change.

There are problems, however. HDD and RAM both continue to use both calculations. RAM uses the simplified KB and MB for bandwidth, but the accepted KB and MB (KiB and MiB now) for size. HDD uses the simplified KB and MB for size, but then they use the other KB and MB for sustained transfer rates. So, the proposed change not only failed to address the problem, but the proposers basically continue in the same way as before.
JarredWalton - Saturday, September 11, 2004 - link
#38 - there are quite a few cards/chips that were only available in very limited quantities.

#39 - Actually, that is only partially true. Kibibytes and mebibytes are a *proposed* change as far as I am aware, and they basically allow the HDD and RAM people to continue with their simplified calculations. I believe that KiB and MiB are meant for bandwidths, however, and not memory sizes. The problem is that MB and KB were in existence long before KiB and MiB were proposed. Early computers with 8 KB of RAM (over 40 years ago) had 8192 bytes of RAM, not 8000 bytes. When you buy a 512 MB DIMM, it is 512 * 1048576 bytes, not 512 * 1000000 bytes.

If a new standard is to be adopted for abbreviations, it is my personal opinion that the parties who did not conform to the old standard are the ones that should change. Since I often look at the low level details of processors and GPUs and such, I do not want to have two different meanings of the same thing, which is what we currently have. Heck, there was even a class action lawsuit against hard drive manufacturers a while back about this "lie". That was the solution: the HDD people basically said, "We're right and in the future 2^10 = KiB, 2^20 = MiB, 2^30 = GiB, etc." Talk about not taking responsibility for your actions....

It *IS* a minor point for most people, and relative performance is still the same. Basically, this is one of my pet peeves. It would be like saying, "You know what, 5280 feet per mile is inconvenient. Even though it has been this way for ages, let's just call it 5000 feet per mile." I have yet to see any hardware manufacturers actually use KiB or MiB as an abbreviation, and software that has been around for decades still thinks that a KB is 1024 bytes and a MB is 1048576.
Bonta - Saturday, September 11, 2004 - link
Jarred, you were wrong about the abbreviation MB.
1 MB is 1 mega Byte is (1000*1000) Bytes is 1000000 Bytes is 1 million Bytes.
1 MiB is (1024*1024) Bytes is 1048576 Bytes.

So the vid card makers (and the hard drive makers) actually have it right, and can keep smiling. It is the people that think 1MB is 1048576 Bytes that have it wrong. I can't pronounce or spell 1 MiB correctly, but it is something like 1 mibiBytes.
viggen - Friday, September 10, 2004 - link
Nice article, but what's up with the 9200 Pro running at 300 MHz for core & memory? I don't remember ATI having such a card.
JarredWalton - Wednesday, September 8, 2004 - link
Oops... I forgot the link from Quon. Here it is:

http://www.appliedmaterials.com/HTMAC/index.html

It's somewhat basic, but at the same time, it covers several things my article left out.
JarredWalton - Wednesday, September 8, 2004 - link
I received a link from Matthew Quon containing a recent presentation on the whole chip fabrication process. It includes details that I omitted, but in general it supports my abbreviated description of the process.

#34: Yes, there are errors that are bound to slip through. This is especially true on older parts. However, as you point out, several of the older chips were offered in various speed grades, which only makes it more difficult. Several of the as-yet unreleased parts may vary, but on the X700 and 6800LE, that's the best info we have right now. The vertex pipelines are *not* tied directly to the pixel quads, so disabling 1/4 or 1/2 of the pixel pipelines does not mean they *have* to disable 1/4 or 1/2 of the vertex pipelines. According to T8000, though, the 6800LE is a 4 vertex pipeline card.

Last, you might want to take note of the fact that I have written precisely 3 articles for AnandTech. I live in Washington, while many of the other AT people are back east. So, don't count on everything being reviewed by every single AT editor - we're only human. :)

(I'm working on some updates and corrections, which will hopefully be posted in the next 24 hours.)
T8000 - Wednesday, September 8, 2004 - link
I think it is very good to put the facts together in such a review.

I did notice three things, however:

1: I have a GF6800LE and it has 4 enabled vertex pipes instead of 5 and comes with a 300/700 GPU/mem clock.

2: Since GPU clock speeds did not increase much, they had to add more features (like pipelines) to increase performance.

3: GPU defects are less of an issue than CPU defects, since a lot of large GPUs offered the luxury of disabling parts, so that most defective GPUs can still be sold. As far as I know, this feature has never made it into the CPU market.
