What kind of hardware do I need to get ready for AI?

This article guides potential buyers through the hardware considerations needed to equip a system for AI applications, balancing current needs with future-proofing the investment. Additional background information, including technical details on local AI acceleration, can be found in this blog post:

Which CPUs are already equipped with AI acceleration?

AMD and Intel have launched CPUs with integrated neural processing units (NPUs). These NPUs are designed to run AI tasks locally that would otherwise take much longer to execute on standard CPU cores. In our portfolio, the following CPUs include such an NPU:

  • AMD Ryzen 7 7840HS
  • AMD Ryzen 7 8845HS
  • Intel Core Ultra 5 125U
  • Intel Core Ultra 7 155U
  • Intel Core Ultra 5 125H
  • Intel Core Ultra 7 155H

(The list includes yet-to-be-released products from our 2024 roadmap.)

The extent to which third-party software will leverage these NPUs remains uncertain. It is also unclear which CPU vendor offers the better choice. While AMD was first to launch a CPU with an integrated NPU, Intel’s longstanding industry partnerships may lead to faster adoption of Intel’s APIs. One can only hope that future software will be developed in a vendor-agnostic manner, implementing the NPU drivers and APIs of both vendors so that the same application runs on either platform.
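To illustrate what such a vendor-agnostic approach could look like, here is a minimal Python sketch using ONNX Runtime, whose execution providers abstract over the vendors’ acceleration stacks. The provider names assume the corresponding vendor-specific builds or plugins are installed, and "model.onnx" stands in for any local model file:

    # Vendor-agnostic inference via ONNX Runtime execution providers.
    import onnxruntime as ort

    # Prefer an NPU/accelerator provider if present; fall back to the CPU.
    preferred = [
        "OpenVINOExecutionProvider",  # Intel stack (can target the NPU)
        "VitisAIExecutionProvider",   # AMD stack (Ryzen AI NPUs)
        "CPUExecutionProvider",       # universal fallback
    ]
    available = ort.get_available_providers()
    providers = [p for p in preferred if p in available]

    session = ort.InferenceSession("model.onnx", providers=providers)
    print("Running on:", session.get_providers()[0])

The same script would then run unchanged on an Intel or AMD machine, picking whichever accelerator the installed runtime exposes.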

Do I need a dedicated graphics card for AI?

It really depends on what kind of workloads you plan to run. For users whose primary focus is not gaming, 3D rendering, or heavy content creation, investing in a high-end dedicated GPU (dGPU) solely for AI tasks might not be the most cost-effective decision, especially if portability and battery life are significant concerns. Commercial AI-powered services such as ChatGPT, Midjourney, and Microsoft’s Copilot will most likely continue to be computed in the cloud rather than on local hardware.

Which GPU should I pick?

If you are intent on buying a system with a dGPU, please bear in mind that the amount of video RAM (VRAM) directly impacts the feasibility of running advanced AI models locally, depending on the size of the particular model and its input/output parameters (such as image resolution). Thus, if locally executed AI content creation, development, or research is the primary focus of the machine, a dGPU with at least 12 GB, preferably 16 GB, of VRAM is advisable.
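As a rough illustration of how model size translates into VRAM requirements, the following Python sketch multiplies the parameter count by the bytes per weight and an assumed 20% overhead factor. It is a back-of-the-envelope estimate, not a measurement; actual usage also depends on context length, the KV cache, and the inference framework:

    # Back-of-the-envelope VRAM estimate for running a model locally.
    def estimate_vram_gb(params_billion: float, bits_per_weight: int,
                         overhead_factor: float = 1.2) -> float:
        weight_bytes = params_billion * 1e9 * bits_per_weight / 8
        return weight_bytes * overhead_factor / 1024**3

    for params, bits in [(7, 16), (7, 4), (13, 4), (33, 4)]:
        print(f"{params}B model at {bits}-bit: "
              f"~{estimate_vram_gb(params, bits):.1f} GB VRAM")

By this estimate, a 7B model at 16-bit precision already needs around 16 GB, while 4-bit quantized models in the 7B–13B range fit comfortably into 8–12 GB of VRAM.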

Combination of NPU and GPU

To make a system absolutely future-proof for heavy AI use, it is reasonable to assume that a combination of NPU and dGPU would give you the best of both worlds. However, as of 2024, there will be no systems in our portfolio that pair the top-tier GPUs (RTX 4080/4090) with an NPU-equipped CPU.

The highest dGPU pairing available in NPU-equipped systems will be the RTX 4070 with up to 8 GB of VRAM. This year’s top-tier GPUs will be exclusive to systems with the Intel Core HX series, an extremely powerful CPU line in and of itself, but one that does not yet include an NPU.

What about NVIDIA’s own applications such as “RTX Voice” and “Chat with RTX”?

NVIDIA leads in hardware-based AI acceleration and offers exclusive applications developed for its accelerators. Whether it is worth purchasing an NVIDIA graphics card for these services alone depends on which services you want and what other uses you have for the card.

  • Some NVIDIA AI applications have similar counterparts from other providers, which operate on an NPU, within the CPU, or in the cloud. For such applications, opting for an NVIDIA GPU is not strictly necessary. This includes features like noise suppression (RTX Voice) or blurring the webcam background.
  • Other applications target gamers, designers, or industrial clients who already have a use for an NVIDIA graphics card. Examples include NVIDIA Omniverse, NVIDIA Broadcast, and NVIDIA Riva (formerly Jarvis).

“Chat with RTX” occupies a unique position in this consideration. Although it could be replaced by cloud-based services like ChatGPT, local execution on one’s own GPU offers privacy benefits and is not subject to hourly prompt limits. “Chat with RTX” is aimed at both casual users and, potentially, those in the creative and industrial sectors. The same applies to open-source LLMs (Large Language Models) accessible via “GPT4All”.

When it comes to running language models and chatbots locally, a dedicated GPU is currently the de-facto standard. Although local LLM chatbots could in principle run on the slimmer NPUs of current CPU generations, this remains a future prospect as far as we can tell; notably, Intel has not yet shown anything like a “Chat with Intel” optimized for the NPU in Meteor Lake. Meanwhile, running LLM chatbots on NVIDIA graphics cards is a well-established reality. While open-source LLMs require some background knowledge (which language model fits one’s needs and one’s graphics memory), a “Chat with RTX” optimized for NVIDIA graphics cards has the unique potential to make cloud-less chatbot usage accessible to a wider audience.
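For readers who want to try this today, here is a minimal sketch of cloud-less chatbot usage with the GPT4All Python bindings (pip install gpt4all). The model filename is just an example from GPT4All’s catalogue of downloadable open-source LLMs, and the device="gpu" argument requests GPU offload where supported; omit it to run on the CPU:

    # Minimal local chatbot using the GPT4All Python bindings.
    from gpt4all import GPT4All

    # Downloads the model on first use (a 4-bit 7B model is roughly 4 GB).
    model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf", device="gpu")

    with model.chat_session():
        reply = model.generate(
            "What are the advantages of running a language model locally?",
            max_tokens=200)
        print(reply)

Everything stays on the local machine: the prompt, the answer, and the model itself.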

Conclusion

We would like to summarize the recommendations as follows:

  • Revolutionary AI applications will continue to be accessible without specialized hardware, as long as one is willing and able to run them in the cloud.
  • For smaller-scale, local AI applications, having an NPU within the CPU will likely suffice. There is significant potential for development in this area.
  • For larger AI projects, a powerful graphics card with an ample amount of VRAM is advisable, if it is not already required for gaming or 3D processing.
  • The offline use of chatbots could be a compelling reason for some users to choose an NVIDIA graphics card, even without gaming or 3D ambitions.

If you are still uncertain about which configuration best fits your budget and needs, we are here to advise you. Please get in touch with us.