NVIDIA Unveils & Gives Away New Limited Edition 32GB Titan V "CEO Edition"
by Ryan Smith on June 21, 2018 7:30 AM EST
NVIDIA’s CEO Jensen Huang has over the years become increasingly known for his giveaway antics at AI conferences. In recent years the CEO has unveiled both the NVIDIA Titan X (Pascal) and the NVIDIA Titan V in this fashion. And now you can add one more reveal to this list, as last evening Huang gave out 20 units of a new Titan V SKU, the Titan V CEO Edition, at the Computer Vision and Pattern Recognition conference in Salt Lake City.
According to NVIDIA, the aptly named SKU is apparently a “limited edition” product, and unlike past Huang reveals, NVIDIA has not sent out any announcements of a new product. So for the moment, this is not a retail product and is not immediately expected to become one. Nonetheless, this is an unusual development, as the new Titan V SKU is not simply a Titan V with additional memory, but rather has some notable configuration differences that set it apart from the regular Titan V.
NVIDIA Compute Accelerator Specification Comparison
| | Titan V CEO Edition | Titan V | Tesla V100 (PCIe) | Titan Xp |
|---|---|---|---|---|
| CUDA Cores | 5120? | 5120 | 5120 | 3840 |
| Tensor Cores | 640? | 640 | 640 | N/A |
| ROPs | 128 | 96 | 128 | 96 |
| Core Clock | 1200MHz? | 1200MHz | ? | 1485MHz |
| Boost Clock | 1455MHz? | 1455MHz | 1370MHz | 1582MHz |
| Memory Clock | 1.7Gbps HBM2? | 1.7Gbps HBM2 | 1.75Gbps HBM2 | 11.4Gbps GDDR5X |
| Memory Bus Width | 4096-bit | 3072-bit | 4096-bit | 384-bit |
| Memory Bandwidth | 900GB/sec? | 653GB/sec | 900GB/sec | 547GB/sec |
| VRAM | 32GB | 12GB | 16GB | 12GB |
| L2 Cache | 6MB | 4.5MB | 6MB | 3MB |
| Single Precision | 13.8 TFLOPS | 13.8 TFLOPS | 14 TFLOPS | 12.1 TFLOPS |
| Double Precision | 6.9 TFLOPS (1/2 rate) | 6.9 TFLOPS (1/2 rate) | 7 TFLOPS (1/2 rate) | 0.38 TFLOPS (1/32 rate) |
| Tensor Performance (Deep Learning) | 125 TFLOPS | 110 TFLOPS | 112 TFLOPS | N/A |
| GPU | GV100 (815mm²) | GV100 (815mm²) | GV100 (815mm²) | GP102 (471mm²) |
| Transistor Count | 21.1B | 21.1B | 21.1B | 12B |
| TDP | 250W? | 250W | 250W | 250W |
| Form Factor | PCIe | PCIe | PCIe | PCIe |
| Cooling | Active | Active | Passive | Active |
| Manufacturing Process | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 16nm FinFET |
| Architecture | Volta | Volta | Volta | Pascal |
| Launch Date | 6/20/2018 | 12/07/2017 | Q3'17 | 04/07/2017 |
| Price | N/A | $2999 | ~$10000 | $1299 |
Because this isn’t a retail SKU – at least not yet – NVIDIA hasn’t published official specifications for the card, so most of our table above is pending confirmation. However, based solely on the 32GB VRAM capacity, we can accurately infer two very important points:
- NVIDIA is using new 8-Hi HBM2 memory stacks, as with their 32GB Tesla cards
- Titan V CEO Edition has all 4 of its ROP/Memory Controller partitions enabled, up from 3 on the retail Titan V
It’s the latter point in particular that has some potentially significant ramifications for NVIDIA’s limited edition Titan V SKU. The standard Titan V itself is a salvage part with only 3 ROP/MC partitions enabled; consequently it only has 3/4ths of the memory bandwidth, pixel throughput, and L2 cache of its fully-enabled sibling. This has helped to differentiate the relatively cheap Titan V from the more expensive Tesla V100, with NVIDIA being able to leverage the memory capacity and memory bandwidth differences to ensure their flagship card remains attractive.
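The arithmetic behind that inference can be sketched in a few lines of Python. This is a rough illustration rather than anything NVIDIA has published: the `GV100Config` class and its field names are hypothetical, and it assumes 1GB (8Gb) HBM2 dies, one 1024-bit HBM2 stack per ROP/memory controller partition, and per-partition ROP and L2 figures obtained by dividing the fully enabled numbers in the table above by four.

```python
from dataclasses import dataclass

# Assumptions for illustration (not NVIDIA-published per-partition figures):
GB_PER_HBM2_DIE = 1          # one 8Gb HBM2 die holds 1GB
BUS_BITS_PER_STACK = 1024    # each HBM2 stack has a 1024-bit interface
ROPS_PER_PARTITION = 32      # 128 ROPs / 4 partitions, from the table
L2_MB_PER_PARTITION = 1.5    # 6MB L2 / 4 partitions, from the table


@dataclass
class GV100Config:
    """Hypothetical description of a GV100 memory configuration."""
    partitions: int          # enabled ROP/MC partitions, one HBM2 stack each
    stack_height: int        # dies per stack (4 for 4-Hi, 8 for 8-Hi)
    data_rate_gbps: float    # per-pin memory data rate

    def vram_gb(self) -> int:
        return self.partitions * self.stack_height * GB_PER_HBM2_DIE

    def bus_width_bits(self) -> int:
        return self.partitions * BUS_BITS_PER_STACK

    def bandwidth_gb_per_sec(self) -> float:
        # Gbit/s across the whole bus, divided by 8 to get GB/s
        return self.bus_width_bits() * self.data_rate_gbps / 8

    def rops(self) -> int:
        return self.partitions * ROPS_PER_PARTITION

    def l2_mb(self) -> float:
        return self.partitions * L2_MB_PER_PARTITION


titan_v = GV100Config(partitions=3, stack_height=4, data_rate_gbps=1.7)
ceo_edition = GV100Config(partitions=4, stack_height=8, data_rate_gbps=1.7)

for name, card in (("Titan V", titan_v), ("Titan V CEO Edition", ceo_edition)):
    print(f"{name}: {card.vram_gb()}GB, {card.bus_width_bits()}-bit, "
          f"{card.bandwidth_gb_per_sec():.1f}GB/sec, "
          f"{card.rops()} ROPs, {card.l2_mb()}MB L2")
```

With the table's tentative 1.7Gbps data rate, the four-partition configuration works out to roughly 870GB/sec; the 900GB/sec figure would correspond to about 1.75Gbps, as on the Tesla V100. Either way, under these assumptions 32GB is only reachable with four 8-Hi stacks, since three 8-Hi stacks would top out at 24GB.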
The end result is that the Titan V CEO Edition is not just a Titan V with more memory. In fact, memory capacity aside, thanks to these changes there will almost certainly be meaningful (though not necessarily large) performance differences between it and the regular Titan V in any kind of memory bandwidth-bound scenario. And from what I’ve heard from Titan V users over the past year, bandwidth-bound scenarios are more common than one might think, as the regular Titan V can fully saturate its memory bandwidth on compute alone and still come up short. Equally important, this means that at least on paper, there’s not much separating the new SKU from the 32GB Tesla V100 in terms of performance.
As an added wrinkle, of the handful of specifications that NVIDIA’s blog post does cover, they list the new card as offering 125 TFLOPS of tensor core performance, whereas the retail Titan V is 110 TFLOPS. It’s not clear how NVIDIA gets this number, but importantly, it means that there may be further clockspeed or SM configuration changes that have yet to be revealed by NVIDIA.
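To put a rough number on that gap: each Volta tensor core performs 64 fused multiply-adds per clock, so a quoted tensor rating implies a clockspeed once the tensor core count is fixed. The sketch below is a back-of-the-envelope calculation, not NVIDIA's published method. The function names are made up for illustration, and it assumes the new card keeps the Titan V's 640 tensor cores unless stated otherwise.

```python
FMAS_PER_TENSOR_CORE_PER_CLOCK = 64   # one 4x4x4 matrix multiply-accumulate per clock
FLOPS_PER_FMA = 2                     # a fused multiply-add counts as two operations
FLOPS_PER_TENSOR_CORE_PER_CLOCK = FMAS_PER_TENSOR_CORE_PER_CLOCK * FLOPS_PER_FMA


def tensor_tflops(tensor_cores: int, clock_mhz: float) -> float:
    """Peak tensor throughput in TFLOPS for a given core count and clock."""
    return tensor_cores * FLOPS_PER_TENSOR_CORE_PER_CLOCK * clock_mhz * 1e6 / 1e12


def implied_clock_mhz(tflops: float, tensor_cores: int = 640) -> float:
    """Clock (MHz) that a quoted tensor rating implies for a given core count."""
    return tflops * 1e12 / (tensor_cores * FLOPS_PER_TENSOR_CORE_PER_CLOCK) / 1e6


titan_v_clock = implied_clock_mhz(110)   # retail Titan V rating
ceo_clock = implied_clock_mhz(125)       # rating quoted for the CEO Edition

# Assumption: a fully enabled GV100 (84 SMs, 672 tensor cores)
# running at the clock implied by the Titan V's rating.
full_die_tflops = tensor_tflops(672, titan_v_clock)

print(f"Clock implied by 110 TFLOPS: {titan_v_clock:.0f}MHz")
print(f"Clock implied by 125 TFLOPS: {ceo_clock:.0f}MHz")
print(f"672 tensor cores at the Titan V's implied clock: {full_die_tflops:.1f} TFLOPS")
```

Under these assumptions the 110 TFLOPS rating implies about 1343MHz and 125 TFLOPS about 1526MHz, while enabling every SM of a full GV100 at the Titan V's implied clock would only reach about 115.5 TFLOPS. An SM count change alone would therefore not account for the quoted figure, though this remains speculation until NVIDIA publishes specifications.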
In any case, for the time being the only way to get this unexpected Titan V SKU is to get one of the 20 winners from NVIDIA’s giveaway to part with one. So the immediate impact to NVIDIA’s business – or to potential Titan buyers – is negligible. However given the fact that this is not just a Titan V with more memory, it does strike me as unusual that NVIDIA would produce a small batch of cards and then just stop, as someone just created a fair bit of extra work for NVIDIA driver & validation teams. So I wouldn’t at all be surprised if we see a similar SKU hit retail down the line, especially as the Titan V is the only remaining commercial GV100 product that doesn’t have a second, higher memory capacity configuration.
Source: NVIDIA (via SH SOTN)
38 Comments
PeachNCream - Thursday, June 21, 2018
Can't you see that people here are talking about something more important (leather packaging) and no one else cares about the number of TFLOPS some random computer part spits out? Shoo with your specifications! Shoo, I say!
Alexvrb - Thursday, June 21, 2018
They haven't actually published most of the specs so I think jabbadap's explanation makes the most sense... if the tensor number is accurate, the other compute numbers probably aren't.
jabbadap - Thursday, June 21, 2018
SXM2 version of V100 has 125TFlops of tensor power. So maybe it has same clocks and same tdp as sxm2 V100. So real specs would be 7.8Tflops fp64, 15.7Tflops fp32 and 31.4Tflops fp16. And thus gpu clocks(boost maybe) would be 1.533GHz.
Bulat Ziganshin - Thursday, June 21, 2018
125/4=31.25 and so on, OTOH 31.4*4=125.6 so it may just a matter of rounding
I just realized that 8 such GPUs has a nice 1 PFlop total speed, which is probably what they will market for their DGC-8.5 workstations
mode_13h - Friday, June 22, 2018
Because they're rated at 300 W, whereas the PCIe version is only 250 W.
Spunjji - Friday, June 22, 2018
That makes sense - set a 300W rating on the card and boom, more clock headroom. What's one more 8 pin connector among friends?
mode_13h - Friday, June 22, 2018
Yeah, the standard Titan V has a 6-pin + 8-pin and a 250 W rating.
peevee - Thursday, June 21, 2018
"The only realistic ways to get 125 TFLOPS is to increase clock speeds or increase amount of SMs"
Not if they are memory-speed-limited in the previous design. Which might be, especially in the case of 8-bit encoded data where computation is essentially free (there is no need to actually add or multiply anything, just a direct read from a 64KB table with multiple read ports).
Bulat Ziganshin - Thursday, June 21, 2018
memory speed is completely separate topic. these TFLOPS are ALWAY computed from raw ALU power
Bulat Ziganshin - Thursday, June 21, 2018
and if you mean on-chip RAM (or rather register pool), it scales by frequency/amount with SMs, since it's part of SM itself