AMD Announces FirePro S10000
by Ryan Smith on November 12, 2012 12:01 AM EST
Kicking off this week is the annual International Conference for High Performance Computing, Networking, Storage, and Analysis, better known by its shortened name: SC. This year’s conference takes place in Salt Lake City, Utah, with SC12 being host to a number of product announcements. All of the major CPU and GPU vendors will be there, as will Anand, with the first major announcement of the day coming from AMD’s GPU division.
This being a supercomputing conference, AMD’s announcement of course is all about computing. AMD and NVIDIA have both made it clear that they desire to extend their GPU business into the high performance computing (HPC) space, with GPUs being a strong fit for certain workloads by offering a very high level of performance for their price and relatively low power consumption. With Graphics Core Next and Kepler, AMD and NVIDIA have further refined both their hardware and software stacks to make them suitable for HPC users, and with the right tools finally in hand the SC conference has become an increasingly important venue for the two companies. As the largest North American supercomputing conference, it is the ideal place to announce and show off these wares to prospective buyers, particularly those who may not yet be intimately familiar with the capabilities of the latest generation of GPUs.
So what is AMD showing off at SC12? Having already launched their GCN-powered FirePro S cards in August, AMD is serving up a new member of the FirePro S family to the SC12 audience: the FirePro S10000. With the FirePro S9000 already positioning itself as AMD’s top single-GPU card for servers, AMD has gone bigger for the S10000; much bigger. Intending to push total performance and power efficiency as far as they can take it, AMD is launching their first dual-GPU FirePro card.
|AMD FirePro S Series Specification Comparison|
| |AMD FirePro S10000|AMD FirePro S9000|AMD FirePro S7000|
|Memory Clock|5.0GHz GDDR5|5.5GHz GDDR5|4.8GHz GDDR5|
|Memory Bus Width|2x384-bit|384-bit|256-bit|
|Single Precision|5.91 TFLOPS|3.23 TFLOPS|2.4 TFLOPS|
|Double Precision|1.48 TFLOPS (1/4)|806 GFLOPS (1/4)|152 GFLOPS (1/16)|
|Manufacturing Process|TSMC 28nm|TSMC 28nm|TSMC 28nm|
In a nutshell, the S10000 is a dual-GPU Tahiti card, packing a pair of slightly underclocked (825MHz) Tahiti GPUs on a single board, with each GPU wired up to 3GB of GDDR5 operating at 5GHz. This puts the card’s theoretical performance at around 5.91 TFLOPS SP and 1.48 TFLOPS DP, with an aggregate memory bandwidth of 480GB/sec. AMD is no stranger to dual-GPU cards, having released a number of them in the past in their consumer Radeon lineup, but this is the first time they’ve launched one for the professional market. The launch makes the S10000 the second dual-GPU server card to be released this year, with NVIDIA’s K10 being the only other such card.
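The bandwidth and double-precision figures above follow from the other numbers in the spec table, and a short Python sketch makes the arithmetic explicit. It assumes the quoted "5.0GHz" memory clock is the effective per-pin data rate in Gbit/sec, which is how such GDDR5 figures are normally stated; the function and constant names are illustrative only.

```python
# Figures taken from the spec table and the paragraph above.
GPUS = 2                      # the S10000 carries two Tahiti GPUs
BUS_WIDTH_BITS = 384          # memory bus width per GPU
MEMORY_DATA_RATE_GBPS = 5.0   # assumption: "5.0GHz" is the effective per-pin rate
SP_TFLOPS = 5.91              # quoted single precision throughput
DP_RATE = 1 / 4               # double precision runs at 1/4 the SP rate


def memory_bandwidth_gb_s(gpus, bus_width_bits, data_rate_gbps):
    """Aggregate bandwidth in GB/sec: pins x rate per pin, bits converted to bytes."""
    return gpus * bus_width_bits * data_rate_gbps / 8


def dp_tflops(sp_tflops, dp_rate):
    """Double precision throughput derived from the SP figure and the DP ratio."""
    return sp_tflops * dp_rate


bandwidth = memory_bandwidth_gb_s(GPUS, BUS_WIDTH_BITS, MEMORY_DATA_RATE_GBPS)
dp = dp_tflops(SP_TFLOPS, DP_RATE)

print(f"Aggregate memory bandwidth: {bandwidth:.0f} GB/sec")
print(f"Double precision: {dp:.2f} TFLOPS")
```

Both results line up with the 480GB/sec and 1.48 TFLOPS figures quoted for the card.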
In terms of features and functionality we’re not looking at anything new from AMD – everything S10000 does the existing single-GPU FirePro S series cards can do as well – so this is primarily a power play for AMD. NVIDIA’s Tesla K20, their double-precision monster, is due soon (having already pseudo-launched with the Titan supercomputer) while K10 already exceeds the S9000’s single-precision compute performance on paper. S10000 is in turn AMD’s high-end Tesla counter, and on paper exceeds both the K10 and K20 (Titan configuration) in single-precision and double-precision performance. GCN is AMD’s first “modern” GPU compute architecture, so after effectively leaving the market to NVIDIA in the last generation, AMD has no intention of letting themselves be surpassed by NVIDIA this time if they can avoid it.
To do that AMD is banking on absolute performance and to a lesser extent power efficiency. The pitch for absolute performance is relatively straightforward – at 5.91 TFLOPS SP performance and 1.48 TFLOPS DP performance the S10000 is extremely capable on paper – meanwhile power efficiency is based on where GCN stands today, with some additional gains squeezed out by being a dual-GPU card. Dual-GPU cards are traditionally chart-toppers for power efficiency thanks to a combination of binning and lower overhead, and S10000 will squeak past S9000 in terms of GFLOPS/watt. Of course all of this is on paper, and with AMD’s primary compute competition being the unreleased K20 it’s anyone’s guess what real world performance will be like.
Moving on, along with compute S10000 will also be serving an additional role as AMD’s top-end VDI card. This is the same scenario as the other S series cards; AMD doesn’t produce cards dedicated to specific server markets, so one card fills multiple roles. NVIDIA recently announced their VGX K2 card, a VDI card based roughly on the Tesla K10, so along with being AMD’s Tesla counter the S10000 is also their VGX K2 counter. AMD is primarily focusing this card on the direct GPU passthrough high performance VDI market (where each user gets a dedicated GPU), though they’re also pitching it as a candidate for higher density VDI through Microsoft’s RemoteFX technology, which allows for GPU sharing through the use of virtual GPUs.
With all of that said, based on what we’ve seen of the S10000 we’re having a hard time figuring out where exactly it will fit into the server market. AMD has a very clear need for a higher performance part to hold off NVIDIA’s Tesla and VGX products, but the issue they face with the S10000 is one of design and power consumption. Traditional high-end server cards like the Tesla K10 and FirePro S9000 are in the 225W-250W TDP range; meanwhile the S10000 significantly overshoots that, coming in at 375W. Typical servers are not able to accept cards with power consumption this high, nor are they set up to cool such a card, so as a result the S10000 will require a custom chassis capable of meeting its particular power and cooling needs. With this in mind, AMD may promote it as a server card but the design and specs make far more sense for a workstation card, which thankfully is something it can certainly be used for regardless of the name. Otherwise it’s not clear to us if there’s anything other than a niche market for the S10000 in servers.
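With the TDP figures now on the table, the earlier GFLOPS/watt claim can be roughly checked from the numbers quoted in this article. The sketch below is a paper calculation only. The article gives the S9000’s TDP only as part of a 225W-250W range, so both ends of that range are evaluated as an assumption rather than a confirmed spec, and the names used are illustrative.

```python
# Single precision throughput (GFLOPS) from the spec table.
S10000_SP_GFLOPS = 5910
S9000_SP_GFLOPS = 3230

# TDP figures from the text. The S9000 is only described as falling
# within a 225W-250W range, so both ends are treated as assumptions.
S10000_TDP_W = 375
S9000_TDP_RANGE_W = (225, 250)


def gflops_per_watt(gflops, tdp_watts):
    """Paper efficiency: peak throughput divided by rated TDP."""
    return gflops / tdp_watts


s10000_eff = gflops_per_watt(S10000_SP_GFLOPS, S10000_TDP_W)
s9000_eff_best = gflops_per_watt(S9000_SP_GFLOPS, min(S9000_TDP_RANGE_W))
s9000_eff_worst = gflops_per_watt(S9000_SP_GFLOPS, max(S9000_TDP_RANGE_W))

print(f"S10000: {s10000_eff:.2f} GFLOPS/W")
print(f"S9000:  {s9000_eff_worst:.2f} to {s9000_eff_best:.2f} GFLOPS/W")
```

Under these assumptions the S10000 comes out ahead at either end of the S9000’s range, though by a modest margin at the 225W end, which is consistent with it only squeaking past. Peak FLOPS over rated TDP says nothing about sustained real-world efficiency.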
Wrapping things up, AMD will be launching the S10000 at $3599, $1100 over the S9000. AMD has already started shipping the card to key strategic partners – this announcement appearing to be intentionally delayed to line up with SC12 – so they should beat the yet to be released Tesla K20 to market.
frogger4 - Tuesday, November 13, 2012
I know very little about using GPUs for HPC (other than it is awesome and powerful), but looking at numbers, I was wondering:
The HD 7970 GHz claims 1.01 TFLOPS double precision, and the S10000 offers 1.48 TFLOPS double precision, but at about 7 times the cost. Is the difference of having ECC memory worth that much, and that critical to HPC?
Ktracho - Thursday, November 15, 2012
Ask yourself, what would happen if I tried to use the HD7970 intensively 24x7x364? How long would it last? Now, suppose you had to do a calculation that took a few weeks to complete. How confident can you be that your results are correct if you use the HD7970? What if something happened to your HD7970, a tiny hardware hiccup or small power transient or even a software glitch, such that it causes a hang or crash in your program, and you had to restart your calculation all over again when it was just a couple days from being finished? In this scenario, how much would you be willing to pay for the assurance that your graphics card is going to perform as required? Changes to the design of the card itself, as well as the extra testing each one has to go through, cost money. Sure, there is a larger profit margin, but some of that is also needed to support the customer in case there are problems.
texasti89 - Wednesday, November 14, 2012
I'm not sure I get the real advantage of this card over the K20 and K20X. K20X delivers a total of 5.26 TFLOPS peak performance using only 225w, this thing can achieve a total of 7.39 TFLOPS BUT needs a high TDP of 375w which makes K20 and K20X more power efficient (single card comparison). For a large server, this might be a good choice given the huge memory bandwidth and lower overhead saved due to lower number of cards needed to meet the target performance.
Howlingmoon - Sunday, June 16, 2013
Yes...but can I Crossfire 2 of these monsters and play Crysis 3 and Metro 2033 at Super, Ultra settings on my new 4k TV at 100+ FPS?
Pessimism - Monday, June 24, 2013
It's over 9000.