Further to the announcement that AMD is refreshing their Phoenix-based 7040HS series for mobiles with the newer 'Hawk Point' 8040HS family for 2024, AMD is set to drive more development for AI within the PC market. Designed to provide a more holistic end-user experience for adopters of hardware with the Ryzen AI NPU, AMD has made its latest version of the Ryzen AI Software available to the broader ecosystem. This is designed to allow software developers to deploy machine learning models into their software to deliver more comprehensive features in tandem with their Ryzen AI NPU and Microsoft Windows 11.

AMD has also officially announced the successor to its first-generation Ryzen AI NPU (XDNA), which currently ships in AMD's Ryzen 7040HS mobile series and also powers the refreshed Hawk Point Ryzen 8040HS series. Promising more than 3x the generative AI performance of the first generation XDNA NPU, XDNA 2 is set to launch alongside AMD's next-generation APUs, codenamed Strix Point, sometime in 2024.

AMD Ryzen AI Software: Version 1.0 Now Widely Available to Developers

Along with the most recent release of its Ryzen AI software (Version 1.0), AMD is making it more widely available to developers. The aim is to give software engineers and developers the tools and capabilities to create new features and software optimizations that use the power of generative AI and large language models (LLMs). New to Version 1.0 of the Ryzen AI software is support for the open-source ONNX Runtime machine learning accelerator, which includes mixed precision quantization across the UINT16/32 and INT16/32 integer formats as well as the FLOAT16 floating point format.
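To make the integer formats above a little more concrete, here is a minimal sketch of generic affine quantization in plain Python. It is not AMD's quantizer and does not use the Ryzen AI software; the function names `quantize` and `dequantize` are hypothetical, and the scheme shown (a scale plus a zero point per tensor) is simply a common way of mapping floating point values onto a 16-bit integer range.

```python
def quantize(values, num_bits=16, signed=True):
    """Affine-quantize a list of floats; returns (ints, scale, zero_point)."""
    # Integer range for the target format, e.g. INT16 or UINT16.
    qmin = -(1 << (num_bits - 1)) if signed else 0
    qmax = (1 << (num_bits - 1)) - 1 if signed else (1 << num_bits) - 1
    # Widen the float range to include zero so 0.0 maps exactly to zero_point.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Size of one integer step in float units (fall back to 1.0 for all-zero input).
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Round to the nearest step, shift by the zero point, clamp to the integer range.
    ints = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return ints, scale, zero_point


def dequantize(ints, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in ints]
```

Calling `quantize(weights, num_bits=16, signed=False)` would target a UINT16 range instead. The round trip through `dequantize` loses at most a step or so of precision per value, which is the trade-off a quantized model makes in exchange for cheaper integer math.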

AMD Ryzen AI Version 1.0 also supports PyTorch and TensorFlow 2.11 and 2.12, which broadens the range of models and LLMs that software developers can run to create new and innovative features. AMD's collaboration with Hugging Face also offers a pre-optimized model zoo, a strategy designed to reduce the time and effort required by developers to get AI models up and running. This also makes the technology more accessible to a broader range of developers right from the outset.

AMD's focus isn't just on providing the hardware capabilities through the XDNA-based NPU but on allowing developers to exploit these features to their fullest. The Ryzen AI software is designed to facilitate the development of advanced AI applications, such as gesture recognition, biometric authentication, and other accessibility features, including camera backgrounds.

Offering early access support for models like Whisper and LLMs, including OPT and Llama-2, indicates AMD's growing commitment to giving developers as many tools as possible. These tools are pivotal for building natural language speech interfaces and unlocking other Natural Language Processing (NLP) capabilities, which are increasingly becoming integral to modern applications.

One of the key benefits of the Ryzen AI Software is that it allows software running these AI models to offload AI workloads onto the Neural Processing Unit (NPU) in Ryzen AI-powered laptops. The idea is that running these workloads on the Ryzen AI NPU is more power efficient than running them on the Zen 4 cores, which should help improve overall battery life.
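As a rough illustration of what that offload looks like from the developer's side, the sketch below picks an ONNX Runtime execution provider, preferring the NPU and falling back to the CPU. It assumes the NPU is exposed through ONNX Runtime's Vitis AI execution provider under the name `VitisAIExecutionProvider`; the helper names `pick_providers` and `create_session` are hypothetical, and a real Ryzen AI setup may need additional provider options that are omitted here.

```python
# Assumed provider names; the NPU name should be checked against the installed runtime.
NPU_PROVIDER = "VitisAIExecutionProvider"
CPU_PROVIDER = "CPUExecutionProvider"


def pick_providers(available):
    """Return providers in priority order: NPU first if present, CPU as fallback."""
    providers = []
    if NPU_PROVIDER in available:
        providers.append(NPU_PROVIDER)
    providers.append(CPU_PROVIDER)
    return providers


def create_session(model_path):
    """Create an inference session for an ONNX model (requires onnxruntime)."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Keeping the CPU provider at the end of the list means the same application still runs on machines without an NPU, just without the power savings.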

A complete list of the Ryzen AI Software Version 1.0 changes can be found here.

AMD XDNA 2: More Generative AI Performance, Coming With Strix Point in 2024

Alongside all the refinements and developments of the Ryzen AI NPU block used in the current Ryzen 7040 mobile and the upcoming Ryzen 8040 mobile chips comes the announcement of its successor. AMD has announced their XDNA 2 NPU, designed to succeed the current Ryzen AI (XDNA) NPU and boost on-chip AI inferencing performance in 2024 and beyond. It's worth highlighting that XDNA is a dedicated AI accelerator block integrated into the silicon. It came about through AMD's acquisition of Xilinx in 2022; Xilinx developed the technology behind Ryzen AI, which is driving AMD's commitment to AI in the mobile space.

While AMD hasn't provided any technical details yet about XDNA 2, AMD claims more than 3x the generative AI performance with XDNA 2 compared to XDNA, currently used in the Ryzen 7040 series. It must be noted that these gains to generative AI performance are currently estimated by AMD engineering staff and aren't a guarantee of the final performance.

Looking at AMD's current Ryzen AI roadmap from 2023 (Ryzen 7040 series) to 2025, we can see that the next generation XDNA 2 NPU is coming in the form of Strix Point-based APUs. Although details on AMD's upcoming Strix Point processors are slim, we now know that AMD's XDNA 2-based NPU and Strix Point will start shipping sometime in 2024, which could point to a general release towards the second half of 2024 or the beginning of 2025. We expect AMD to start detailing their XDNA 2 AI NPU sometime next year.

20 Comments

  • name99 - Thursday, December 7, 2023 - link

    And everyone doing it in the cloud would prefer to move as much inference as practical to the device so that they don't have to pay for it...

    Same reason streamers would prefer you to have h.265 or AV1 decode available even though h.264 works fine – it reduces their costs.

    Everything takes time! But you get to ubiquitous inference HW in five years by starting TODAY not five years from now.
  • mode_13h - Thursday, December 7, 2023 - link

    > everyone doing it in the cloud would prefer to move as much inference as practical to the device so that they don't have to pay for it...

    If you're OpenAI, these models are your crown jewels. Keeping them locked up in the cloud is a way they can keep them secure. You charge a service fee for people to use them, and the cloud costs are passed on to the customer.

    As a practical matter, these models are friggin' huge. They'd take forever to download, would chew up the storage on client-side devices, and you wouldn't be able to hold very many of them.
  • nandnandnand - Saturday, December 9, 2023 - link

    Smaller LLMs, Stable Diffusion, etc. can definitely fit on the SSDs in consumer devices and use relatively normal amounts of RAM. Only a minority of users will actively seek to do so, while others will probably end up using some of these basic LLMs by default on even their smartphones.

    Hopefully we see a doubling of storage and RAM in the near term, with 8 TB SSDs becoming more common and 32Gb memory chips enabling 64 GB UDIMMs, larger LPDDR packages, etc.
  • flyingpants265 - Sunday, December 10, 2023 - link

    Sure they would.
  • hyno111 - Friday, December 8, 2023 - link

    Actually there is a very active community on using local LLMs. Meta released the source of their LLaMa models in Feb, and a lot of follow-ups appear using their proven architecture.
    The most active personal use cases are storytelling/roleplaying and assistant. And businesses are interested in using their own database to augment LLM abilities.
  • mode_13h - Thursday, December 7, 2023 - link

    Clicked the Ryzen AI SDK link in the article. System Requirements: Win 11

    No thank you. Yes, I'm aware of the github ticket requesting Linux support, that they recently reopened. I'm not buying a Ryzen AI-equipped laptop anytime soon, so I can afford to wait.
  • lmcd - Thursday, December 7, 2023 - link

    Honestly hilarious given that AMD's compute platform is barely supported on Windows. When will AMD support its entire feature set on a single OS platform?
  • PeachNCream - Friday, December 8, 2023 - link

    You as well as anyone else that relies heavily on Linux knows that support for new hardware tends to lag on our preferred OS platform. And when we finally do get support, it's not optimized, buggy, and vendor interest is apathetic at best. Get ready for a few years wait for parity for Ryzen AI.
  • mode_13h - Saturday, December 9, 2023 - link

    ROCm started out on Linux. So, that didn't lag Windows, but still had a host of other issues too numerous to get into, here.

    Intel has done a great job of supporting compute on Linux, using a fully open-source stack. Again, I can't say anything definitive about Windows, but I get the impression their Linux GPU compute support was above and beyond what they were doing on Windows.

    As for Nvidia, CUDA always treated Linux as first-class, as far as I'm aware.

    Ryzen AI is a little bit special-case, for AMD. It's only something they're putting in their APUs, which sets it apart from their GPU-compute efforts. Most of their APU customers are actually running Windows, so I think it makes sense for them to prioritize Windows for Ryzen AI. However, if they want to use it for tapping into the embedded market, later on, they really shouldn't disregard supporting it on Linux.
  • nandnandnand - Saturday, December 9, 2023 - link

    Technically, XDNA should make it into Phoenix desktop APUs in January, but as for the actual desktop CPUs, it's anybody's guess. No plans have been leaked
