09:21PM EDT - The final talk today at Hot Chips is from Habana, which is discussing its approach to scaling AI compute.

09:21PM EDT - Goya and Gaudi

09:22PM EDT - Recapping Training vs Inference requirements

09:24PM EDT - Goya processor architecture

09:24PM EDT - 3 engines: TPC, GEMM, and DMA. They work concurrently with shared SRAM

09:24PM EDT - TPC is VLIW SIMD core, C-programmable

09:24PM EDT - PCIe Gen 4.0 x16

09:24PM EDT - Two DDR4-2666 channels, built on TSMC 16

09:25PM EDT - Supports UINT8 to FP32

09:25PM EDT - Dedicated HW and TPC ISA for special function acceleration

09:25PM EDT - Have to adjust quantization to balance accuracy vs power

09:26PM EDT - PCIe card - Software stack is more important.

09:26PM EDT - Habana is a software company that just happens to do hardware

09:27PM EDT - Graph compiler with built-in quantization engine

09:27PM EDT - Multiple recipes can be loaded for the hardware

09:28PM EDT - Goya supports models trained on any processor: CPU, GPU, TPU, Gaudi etc

09:28PM EDT - Users can create custom layers and kernels

09:29PM EDT - Still market leader since benchmarks made 11 months ago vs common CPU/GPU

09:29PM EDT - New for today, natural language benchmark results

09:30PM EDT - Support BERT architecture on Goya

09:30PM EDT - GEMMs and TPCs are fully utilized

09:30PM EDT - Chip was designed long before BERT was invented

09:30PM EDT - High degree of accuracy when quantized

09:30PM EDT - Software managed SRAM

09:31PM EDT - Now Gaudi, the training processor

09:31PM EDT - Performance at Scale, high throughput at low batch size, high power efficiency

09:32PM EDT - Enable native ethernet scale out - on chip RDMA over Converged Ethernet

09:32PM EDT - Open Compute Project Accelerator Module: OAM = (OCP)AM

09:32PM EDT - Framework and ML compiler support, rich TPC Kernel Library

09:32PM EDT - Architecture looks similar to Goya

09:33PM EDT - Networking has changed, memory has changed

09:33PM EDT - PCIe 4.0 x16, 4x8GB HBM

09:33PM EDT - 10x 100 GbE, or 20x 50 GbE

09:33PM EDT - Supports UINT8 to FP32 and BF16

09:34PM EDT - SW supports profiling tools

09:34PM EDT - Only AI Training chip with RoCE v2

09:35PM EDT - NVIDIA was first to showcase RoCE v2 for AI, but they haven't implemented it yet

09:36PM EDT - NVIDIA GPU is much more complex with RoCE v2 support via Mellanox

09:36PM EDT - Gaudi integrates both

09:36PM EDT - Supports Lossless and Lossy fabrics

09:36PM EDT - Advanced congestion controls

09:37PM EDT - Customers can buy OAM cards or an 8 card Server

09:38PM EDT - Server box has no CPU; it is up to the customer to configure as needed. Uses mini-SAS HD

09:38PM EDT - Ethernet connectivity for point-to-point links with non-blocking full mesh

09:38PM EDT - 3 ports per card for scale up
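The port counts quoted in this part of the talk can be cross-checked with a little arithmetic. The sketch below assumes each card has the 10 ports of 100 GbE mentioned earlier and that the full mesh uses exactly one port per peer inside the 8-card server; neither assumption was confirmed by the speaker, and the function name is purely illustrative.

```python
def full_mesh_port_budget(cards, ports_per_card):
    """Count the ports a full mesh consumes, assuming one port per peer link."""
    peers = cards - 1  # every card links directly to every other card
    if peers > ports_per_card:
        raise ValueError("not enough ports for a full mesh")
    return {
        "links_in_mesh": cards * peers // 2,  # each link is shared by two cards
        "mesh_ports_per_card": peers,
        "spare_ports_per_card": ports_per_card - peers,
    }


# Assumed inputs: 8 cards per server, 10 ports per card (10x 100 GbE).
budget = full_mesh_port_budget(cards=8, ports_per_card=10)
print(budget)
```

Under those assumptions, seven ports per card go to the in-box mesh and three are left over, which matches the three ports per card quoted above.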

09:39PM EDT - Can choose ratio of CPUs to Gaudi cards

09:39PM EDT - Gaudi vs DGX

09:40PM EDT - Unlike DGX, do not force user to separate PCIe between management and scaleout. Gaudi offers separate PCIe ports

09:41PM EDT - PCIe card dual slot also available

09:41PM EDT - HL-200

09:41PM EDT - Data parallel possible, model parallel possible

09:44PM EDT - Can leapfrog performance over DGX-2 due to better connectivity. Can connect 64 Gaudi chips with non-blocking throughput

09:45PM EDT - Q&A time

09:46PM EDT - Q: What type of quantization requires a processor? A: There is no quantization processor. There's a software engine that takes an FP32 model, quantizes it to more efficient data types, and gives feedback on the accuracy
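To make the answer above concrete, here is a minimal sketch of the general idea: take FP32 values, quantize them to a narrower integer type, and report how much precision was lost. It uses plain symmetric per-tensor INT8 quantization, which is an assumption for illustration and not Habana's actual engine; all function names are hypothetical, and reconstruction error stands in for the model-level accuracy feedback the speaker described.

```python
import random


def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the INT8 range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate float values."""
    return [q * scale for q in quantized]


def accuracy_feedback(values, quantized, scale):
    """Report reconstruction error as a simple proxy for accuracy loss."""
    restored = dequantize(quantized, scale)
    errors = [abs(v - r) for v, r in zip(values, restored)]
    return {
        "max_abs_error": max(errors),
        "mean_abs_error": sum(errors) / len(errors),
    }


# Hypothetical stand-in for one layer's FP32 weights.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1024)]

q, scale = quantize_int8(weights)
report = accuracy_feedback(weights, q, scale)
print(report)
```

With this scheme the worst-case error per value is half the quantization step, so a tool can compare the report against a tolerance and fall back to a wider data type for layers that miss it.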

09:47PM EDT - Q: Can you comment on interconnectivity of GEMM? A: It's one functional unit.

09:48PM EDT - Q: What is the minimum viable for an IoT gateway? A: You can use a single card. You can put a Gaudi in a single PCIe slot.

09:48PM EDT - That's a wrap for today. More talks tomorrow!
