Last year, Nvidia found itself at the epicenter of the artificial intelligence boom, as its high-end server graphics processors, such as the H100, became critical for the deployment and training of generative AI such as OpenAI’s ChatGPT. Nvidia is now emphasizing its strengths in consumers’ GPUs for so-called “local” AI, which may run on a PC or laptop at home or in the office.
On Monday, Nvidia unveiled three new graphics cards, the RTX 4070 Super, RTX 4070 Ti Super, and RTX 4080 Super, with prices ranging from $599 to $999. These GPUs include additional “tensor cores” designed for running generative AI applications. The new graphics cards will also appear in laptops from Acer, Dell, and Lenovo.
Demand for Nvidia’s enterprise GPUs, which cost thousands of dollars each and are frequently sold in systems with eight GPUs working together, drove up the company’s overall sales and pushed its market capitalization above $1 trillion.
Nvidia’s bread and butter has historically been GPUs for PCs, targeted at operating video games, but the company says this year’s graphics cards have been upgraded with a focus on running AI models without transmitting data back to the cloud.
According to the company, the latest consumer-level graphics chips will be used largely for gaming, but they can also rip through AI applications. Nvidia, for example, claims that the RTX 4080 Super can generate AI video 150% faster than the previous-generation model. Other software improvements recently disclosed by Nvidia will make large language model processing up to five times faster.
“With 100 million RTX GPUs shipped, they provide a massive installed base for powerful PCs for AI applications,” Nvidia’s senior director of product management, Justin Walker, told reporters during a press briefing.
Nvidia anticipates that new AI applications will emerge over the next year to take advantage of the additional horsepower. Microsoft is expected to release Windows 12, a new version of the operating system that can make use of AI chips, later this year.
Walker says the new chips can be used to generate images in Adobe Photoshop’s Firefly generator or to remove backgrounds during video calls. Nvidia is also developing tools to help game developers incorporate generative AI into their products, such as generating dialogue for nonplayer characters.
Server vs. Edge
Nvidia’s chip releases this week show that, while the company is best known for its large server GPUs, it also intends to compete in local AI against Intel, AMD, and Qualcomm. All three have revealed new chips that will power so-called “AI PCs” with machine learning-specific components.
Nvidia’s approach comes at a time when the technology industry is working out the best way to deploy generative AI, which demands massive amounts of computing power and can be extremely expensive to run on cloud services.
One technical solution backed by Microsoft and Nvidia’s competitors is the “AI PC,” an approach also known as “edge computing.” Instead of relying on powerful supercomputers accessed over the internet, these computers will contain increasingly capable AI chips that can run large language models or image generators locally, albeit with some trade-offs.
Nvidia envisions apps that use a cloud model for demanding tasks and a local AI model for quick jobs. “Nvidia GPUs in the cloud can be running really big large language models and using all that processing power to power very large AI models, while at the same time, RTX tensor cores in your PC are going to be running more latency-sensitive AI applications,” said Walker.
The company said its new graphics cards comply with export rules and can be shipped to China. That offers an alternative for Chinese researchers and companies who are unable to obtain Nvidia’s most powerful server GPUs.