GPU Cheatsheet - A History of Modern Consumer Graphics Processors
by Jarred Walton on September 6, 2004 12:00 AM EST - Posted in
- GPUs
NVIDIA Chipsets
Below you can see our breakdown of the GPU guide for NVIDIA video cards:
NVIDIA Graphics Chips Overview | ||||||||
Card | Chip | Core (MHz) | RAM (MHz)* | Pixel Pipelines | Textures/Pipeline** | Vertex Pipelines*** | RAM (MB) | Bus Width (bits) |
DirectX 9.0C with PS3.0 and VS3.0 Support | ||||||||
GF 6600 | NV43 | 300 | 550 | 8 | 1 | 3 | 128/256 | 128 |
GF 6600GT | NV43 | 500 | 1000 | 8 | 1 | 3 | 128/256 | 128 |
GF 6800LE | NV40 | 320 | 700 | 8 | 1 | 5 | 128 | 256 |
GF 6800LE | NV41 | 320 | 700 | 8 | 1 | 5 | 128 | 256 |
GF 6800 | NV40 | 325 | 700 | 12 | 1 | 5 | 128 | 256 |
GF 6800 | NV41 | 325 | 700 | 12 | 1 | 5 | 128 | 256 |
GF 6800GT | NV40 | 350 | 1000 | 16 | 1 | 6 | 256 | 256 |
GF 6800U | NV40 | 400 | 1100 | 16 | 1 | 6 | 256 | 256 |
GF 6800UE | NV40 | 450 | 1200 | 16 | 1 | 6 | 256 | 256 |
DirectX 9 with PS2.0+ and VS2.0+ Support | ||||||||
GFFX 5200LE | NV34 | 250 | 400 | 4 | 1 | 1 | 64/128 | 64 |
GFFX 5200 | NV34 | 250 | 400 | 4 | 1 | 1 | 64/128/256 | 128 |
GFFX 5200U | NV34 | 325 | 650 | 4 | 1 | 1 | 128 | 128 |
GFFX 5500 | NV34 | 270 | 400 | 4 | 1 | 1 | 128/256 | 128 |
GFFX 5600XT | NV31 | 235 | 400 | 4 | 1 | 1 | 128/256 | 128 |
GFFX 5600 | NV31 | 325 | 500 | 4 | 1 | 1 | 128/256 | 128 |
GFFX 5600U | NV31 | 350 | 700 | 4 | 1 | 1 | 128/256 | 128 |
GFFX 5600U FC | NV31 | 400 | 800 | 4 | 1 | 1 | 128 | 128 |
GFFX 5700LE | NV36 | 250 | 400 | 4 | 1 | 3 | 128/256 | 128 |
GFFX 5700 | NV36 | 425 | 500 | 4 | 1 | 3 | 128/256 | 128 |
GFFX 5700U | NV36 | 475 | 900 | 4 | 1 | 3 | 128/256 | 128 |
GFFX 5700U GDDR3 | NV36 | 475 | 950 | 4 | 1 | 3 | 128 | 128 |
GFFX 5800 | NV30 | 400 | 800 | 4 | 2 | 2 | 128 | 128 |
GFFX 5800U | NV30 | 500 | 1000 | 4 | 2 | 2 | 128 | 128 |
GFFX 5900XT/SE | NV35 | 400 | 700 | 4 | 2 | 3 | 128 | 256 |
GFFX 5900 | NV35 | 400 | 850 | 4 | 2 | 3 | 128/256 | 256 |
GFFX 5900U | NV35 | 450 | 850 | 4 | 2 | 3 | 256 | 256 |
GFFX 5950U | NV38 | 475 | 950 | 4 | 2 | 3 | 256 | 256 |
DirectX 8 with PS1.3 and VS1.1 Support | ||||||||
GF3 Ti200 | NV20 | 175 | 400 | 4 | 2 | 1 | 64/128 | 128 |
GeForce 3 | NV20 | 200 | 460 | 4 | 2 | 1 | 64 | 128 |
GF3 Ti500 | NV20 | 240 | 500 | 4 | 2 | 1 | 64 | 128 |
GF4 Ti4200 128 | NV25 | 250 | 444 | 4 | 2 | 2 | 128 | 128 |
GF4 Ti4200 64 | NV25 | 250 | 500 | 4 | 2 | 2 | 64 | 128 |
GF4 Ti4200 8X | NV28 | 250 | 514 | 4 | 2 | 2 | 128 | 128 |
GF4 Ti4400 | NV25 | 275 | 550 | 4 | 2 | 2 | 128 | 128 |
GF4 Ti4600 | NV25 | 300 | 600 | 4 | 2 | 2 | 128 | 128 |
GF4 Ti4800 SE | NV28 | 275 | 550 | 4 | 2 | 2 | 128 | 128 |
GF4 Ti4800 | NV28 | 300 | 650 | 4 | 2 | 2 | 128 | 128 |
DirectX 7 | ||||||||
GeForce 256 DDR | NV10 | 120 | 300 | 4 | 1 | 0.5 | 32/64 | 128 |
GeForce 256 SDR | NV10 | 120 | 166 | 4 | 1 | 0.5 | 32/64 | 128 |
GF2 MX200 | NV11 | 175 | 166 | 2 | 2 | 0.5 | 32/64 | 64 |
GF2 MX | NV11 | 175 | 333 | 2 | 2 | 0.5 | 32/64 | 64/128 |
GF2 MX400 | NV11 | 200 | 333 | 2 | 2 | 0.5 | 32/64 | 128 |
GF2 GTS | NV15 | 200 | 333 | 4 | 2 | 0.5 | 32/64 | 128 |
GF2 Pro | NV15 | 200 | 400 | 4 | 2 | 0.5 | 32/64 | 128 |
GF2 Ti | NV15 | 250 | 400 | 4 | 2 | 0.5 | 32/64 | 128 |
GF2 Ultra | NV15 | 250 | 460 | 4 | 2 | 0.5 | 64 | 128 |
GF4 MX4000 | NV19 | 275 | 400 | 2 | 2 | 0.5 | 64/128 | 64 |
GF4 MX420 | NV17 | 250 | 333 | 2 | 2 | 0.5 | 64 | 64 |
GF4 MX440 SE | NV17 | 250 | 333 | 2 | 2 | 0.5 | 64/128 | 128 |
GF4 MX440 | NV17 | 275 | 400 | 2 | 2 | 0.5 | 32/64 | 128 |
GF4 MX440 8X | NV18 | 275 | 500 | 2 | 2 | 0.5 | 64/128 | 128 |
GF4 MX460 | NV17 | 300 | 550 | 2 | 2 | 0.5 | 64 | 128 |
* RAM clock is the effective clock speed, so 250 MHz DDR is listed as 500 MHz. | ||||||||
** Textures/Pipeline is the number of unique texture lookups. ATI has implementations that can look up 3 textures, but two of the lookups must come from the same texture. | ||||||||
*** The number of vertex pipelines is estimated for certain architectures. NVIDIA says their GFFX cards have a "vertex array", but in practice it performs as shown. |
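The first footnote can be made concrete: since the RAM clock column is already the effective clock, peak memory bandwidth falls straight out of it and the bus width. A quick sketch (function name and the GB/s convention are my own, not from the article):

```python
def memory_bandwidth_gbs(effective_mhz: int, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    effective_mhz is the table's RAM clock, which already doubles DDR rates.
    """
    return effective_mhz * 1e6 * (bus_width_bits / 8) / 1e9

# GeForce 6800 Ultra: 1100 MHz effective DDR on a 256-bit bus
print(memory_bandwidth_gbs(1100, 256))  # 35.2 GB/s
# GeForce 2 MX200: 166 MHz SDR on a 64-bit bus
print(memory_bandwidth_gbs(166, 64))    # 1.328 GB/s
```

The huge spread between those two numbers is a big part of why cards with identical pipeline counts can perform so differently.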
The caveats are very similar on the NVIDIA side of things. In terms of DirectX support, NVIDIA has DX7, DX8.0, DX9, and DX9.0c parts. Unlike the X800 cards, which support an unofficial DX spec, DX9.0c is a Microsoft standard. On the flip side, the SM2.0a features of the FX line went almost entirely unused, and the 32-bit floating point precision (as opposed to the 24-bit values ATI uses) appears to be part of the reason for the inferior DX9 performance of the FX series. The benefit of DX8.1 over DX8.0 was that a few more operations were added to the hardware, so tasks that would have required two passes under DX8.0 can be done in one pass under DX8.1.
When DX8 cards were all the rage, DX8.1 support was something of a non-issue: DX8 games were hard to come by, and most developers opted for the more widespread 8.0 spec. Now, however, games like Far Cry and the upcoming Half-Life 2 have made DX8.1 support a little more useful. Every subsequent version of DirectX is a superset of the older versions, so every DX9 card must include both DX8 and DX8.1 functionality. GeForce FX cards in the Counter-Strike: Source beta default to the DX8.1 rendering path as the best compromise between quality and speed, while GeForce 3 and 4 Ti cards use the DX8.0 rendering path.
Going back to ATI for a minute, it becomes a little clearer why ATI's SM2.0b isn't an official Microsoft standard. SM3.0 already supersedes it as a standard, and yet certain features of SM2.0b as ATI defines it are not present in SM3.0, for example the new 3Dc normal map compression. Only time will tell if this feature gets used with current hardware, but it will likely be included in a future version of DirectX, so it could come in useful.
In contrast to ATI, where the card generations are pretty distinct entities, the NVIDIA cards show a lot more overlap. The GF3 cards only show a slight performance increase over the GF2 Ultra, and only in more recent games. Back in the day, there really wasn't much incentive to leave the GF2 Ultra and "upgrade" to the GF3, especially considering the cost, and many people simply skipped the GF3 generation. Similarly, those who purchased the GF4 Ti line were left with little reason to upgrade to the FX line, as the Ti4200 remains competitive in most games all the way up to the FX5600. The FX line is only really able to keep up with - and sometimes beat - the GF4 Ti cards when DX8.1 or DX9 features are used, or when antialiasing and/or anisotropic filtering is enabled.
Speaking of antialiasing.... The GF2 line lacked support for multi-sample antialiasing and relied on the simpler super-sampling method. We say "simpler" in the sense that it was easier to implement; it is actually much more demanding of memory bandwidth, and thus less useful. The GF3 line brought the first consumer cards with multi-sample antialiasing, and NVIDIA went one step further by creating a sort of rotated-grid method called Quincunx, which offered superior quality to 2xAA while incurring less of a performance hit than 4xAA. However, as the geometric complexity of games increased - something DX7 promised and yet failed to deliver for several years - none of these cards was able to perform well with antialiasing enabled. The GF4 line refined the antialiasing support slightly - even the GF4 MX line got hardware antialiasing support, although there it was more of a checklist feature than something most people would actually enable - but for the most part it remained the same as in the GF3. The GFFX line continued with the same basic antialiasing support, and it was only with the GeForce 6 series that NVIDIA finally improved the quality of its antialiasing by switching to a rotated grid. At present, the differences in implementation and quality of antialiasing on ATI and NVIDIA hardware are almost impossible to spot in practical use. ATI does support 6X multi-sample antialiasing, of course, but that generally brings too much of a performance hit to use except in older games.
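To make the bandwidth argument concrete, here is a back-of-the-envelope sketch (my own simplified cost model, not from the article) of why super-sampling is so much more demanding: at the same sample count, it shades and fetches textures for every sub-sample, while multi-sampling shades each pixel's color only once.

```python
def ssaa_shading_cost(pixels: int, samples: int) -> int:
    # Super-sampling renders the scene at `samples` times the resolution,
    # so every sub-sample pays the full shading and texturing cost.
    return pixels * samples

def msaa_shading_cost(pixels: int, samples: int) -> int:
    # Multi-sampling shades color once per pixel; only coverage and depth
    # are sampled multiple times, which is comparatively cheap.
    return pixels

res = 1024 * 768
# 4x super-sampling costs roughly 4x the shading work of 4x multi-sampling.
print(ssaa_shading_cost(res, 4) // msaa_shading_cost(res, 4))  # 4
```

This is only the shading side; framebuffer storage and bandwidth still grow with the sample count in both schemes, which is why even MSAA was painful on these early cards.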
Anisotropic filtering for NVIDIA was a different story. First introduced with the GF2 line, it was extremely limited and rather slow - the GF2 could only provide 2xAF, called 8-tap filtering by NVIDIA because it uses 8 samples. GeForce3 added support for up to 8xAF (32-tap), along with performance improvements compared to the GF2 when anisotropic filtering was enabled. Also, the GF2 line was really better optimized for 16-bit color performance, while the GF3 and later all manage 32-bit color with a much less noticeable performance hit. This is likely related to the same enhancements that allow for better anisotropic filtering.
As games became more complex, the cost of doing "real" anisotropic filtering became too great, and so there were optimizations and accusations of cheating by many parties. The reality is that NVIDIA used a more correct distance calculation than ATI: the true Euclidean distance, d = sqrt(x^2 + y^2 + z^2), compared to ATI's linear approximation, d = ax + by + cz. The latter equation is substantially faster to evaluate, but the results are less correct: it gives accurate results only at certain angles, while other angles receive a lower level of AF. Unfortunately for those who desire maximum image quality, NVIDIA solved the discrepancy in AF performance by switching to ATI's distance calculation on the GeForce 6 line. The GeForce 6 line also marks the introduction of 16xAF (64-tap) by NVIDIA, although it is nearly impossible to spot the difference in quality between 8xAF and 16xAF without some form of image manipulation. So, things have now been sorted out as far as "cheating" accusations go. It is probably safe to say that in modern games, the GF4 and earlier chips are not able to handle anisotropic filtering well enough to warrant enabling it.
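The two metrics can be sketched as follows. This is a minimal illustration, assuming uniform weights a = b = c = 1 and absolute values for symmetry in the approximation; the hardware's actual coefficients are not given here.

```python
import math

def distance_exact(x: float, y: float, z: float) -> float:
    # True Euclidean distance: correct at every angle, but needs a sqrt.
    return math.sqrt(x * x + y * y + z * z)

def distance_approx(x: float, y: float, z: float,
                    a: float = 1.0, b: float = 1.0, c: float = 1.0) -> float:
    # Cheap weighted sum: no sqrt, but only accurate at certain angles.
    return a * abs(x) + b * abs(y) + c * abs(z)

# Along a coordinate axis the two agree...
print(distance_exact(5, 0, 0), distance_approx(5, 0, 0))  # 5.0 5.0
# ...but off-axis the approximation overestimates the distance,
# so those angles end up with a lower effective AF level.
print(distance_exact(3, 4, 0), distance_approx(3, 4, 0))  # 5.0 7.0
```

An overestimated distance means the hardware picks a coarser filtering level for that pixel, which is exactly the angle-dependent quality loss described above.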
NVIDIA is also using various versions of the same chip in its high-end parts. The 6800 cards at present all use the same NV40 chip; certain chips have some of the pipelines deactivated and are then sold in lower-end cards. Rumors about the ability to "mod" vanilla 6800 chips into 16-pipeline versions exist, but success rates are not yet known and are likely low, due again to the size of the chips. NVIDIA plans to release a modified chip, the NV41, which will have only 12 pixel pipelines and 5 vertex pipelines, in order to reduce manufacturing costs and improve yields.
Comments
Neo_Geo - Tuesday, September 7, 2004 - link
Nice article.... BUT.... I was hoping the Quadro and FireGL lines would be included in the comparison.
As someone who uses BOTH professional (ProE and SolidWorks) AND consumer level (games) software, I am interested in purchasing a Quadro or FireGL, but I want to compare these to their consumer level equivalents (as each pro level card generally has an equivalent consumer level card with some minor, but important, optimizations).
Thanks
mikecel79 - Tuesday, September 7, 2004 - link
The AIW 9600 Pros have faster memory than the normal 9600 Pro: 650MHz vs. 600MHz on a normal 9600 Pro. Here's the AnandTech article for reference:
http://www.anandtech.com/video/showdoc.aspx?i=1905...
Questar - Tuesday, September 7, 2004 - link
#20, this list is not complete at all; it would be 3 times the size if it covered the last 5 or 6 years. It covers about the last 3, and is laden with errors.
Just another example of the half-assed job this site has been doing lately.
JarredWalton - Tuesday, September 7, 2004 - link
#14 - Sorry, I went with desktop cards only. Usually, you're stuck with whatever comes in your laptop anyway. Maybe in the future, I'll look at including something like that.
#15 - Good God, Jim - I'm a CS graduate, not a graphics artist! (/Star Trek) Heheh. Actually, you would be surprised at how difficult it can be to get everything to fit. Maximum width of the tables is 550 pixels. Slanting the graphics would cause issues making it all fit. I suppose putting in vertical borders might help keep things straight, but I don't like the look of charts with vertical separators.
#20 - Welcome to the club. Getting old sucks - after a certain point, at least.
Neekotin - Tuesday, September 7, 2004 - link
great read! wow! i didn't know there were so many GPUs in the past 5-6 years. its like more than all combined before them. guess i'm a bit old.. ;)
JarredWalton - Tuesday, September 7, 2004 - link
12/13: I updated the Radeon LE entry and resorted the DX7 page. I'm sure anyone who owns a Radeon LE already knows this, but you could use a registry hack to turn it into essentially a full Radeon DDR. (By default, Hierarchical Z compression and a few other features were disabled.) Old AnandTech article on the subject: http://www.anandtech.com/video/showdoc.aspx?i=1473
JarredWalton - Monday, September 6, 2004 - link
Virge... I could be wrong on this, but I'm pretty sure some of the older chips could actually be configured with either SDR or DDR RAM, and I think the GF2 MX series was one of those. The problem was that you could either have 64-bit DDR or 128-bit SDR, so it really didn't matter which you chose. But yeah, there were definitely 128-bit SDR versions of the cards available, and they were generally more common than the 64-bit DDR parts I listed. The MX200, of course, was 64-bit SDR, so it got the worst of both worlds. Heh.
I think the early Radeons had some similar options, and I'm positive that such options existed in the mobile arena. Overall, though, it's a minor gripe (I hope).
ViRGE - Monday, September 6, 2004 - link
Jarred, without getting too nit-picky, your data for the GeForce 2 MX is technically wrong; the MX used a 128bit/SDR configuration for the most part, not a 64bit/DDR configuration (http://www.anandtech.com/showdoc.aspx?i=1266&p... Note that this isn't true for any of the other MX's (both the 200 and 400 widely used 64bit/DDR), and the difference between the two configurations has no effect on the math for memory bandwidth, but it's still worth noting.
Cygni - Monday, September 6, 2004 - link
I've been working with Adrian's Rojak Pot on a very similar chart to this one for a while now. Check it out: http://www.rojakpot.com/showarticle.aspx?artno=88&...
Denial - Monday, September 6, 2004 - link
Nice article. In the future, if you could put the text at the top of the tables on an angle, it would make them much easier to read.