Nvidia has released a powerful open-source artificial intelligence model that could outpace the likes of OpenAI’s GPT-4.
The company’s new NVLM 1.0 is a family of open-source multimodal large language models (LLMs). Its flagship model, NVLM-D-72B, has around 72 billion parameters.
According to Nvidia’s research team, the new AI model excels in vision-language tasks while maintaining and even improving text-only performance compared to its LLM backbone. In their paper, the researchers state: “We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models.”
Whereas some models see a significant decline in text performance after multimodal training, NVLM-D-72B reportedly increased its accuracy by an average of 4.3 points across key text benchmarks.
The LLM was also able to interpret charts and tables, analyze images, understand memes, write code, and solve mathematical problems. The model weights are publicly available on Hugging Face, and Nvidia says it will eventually release the training code.
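For a rough sense of what running those weights involves, the parameter count can be turned into a memory estimate. This is a back-of-the-envelope sketch, not a figure from Nvidia: it assumes exactly 72 billion parameters and counts only the weights, ignoring activations and any other runtime overhead. The function name and the list of precisions are illustrative choices.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: "around 72 billion parameters" is taken as exactly 72e9.
NUM_PARAMS = 72e9

# Assumption: common storage precisions, listed for illustration only.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1, "int4": 0.5}

estimates = {
    name: weight_memory_gb(NUM_PARAMS, size)
    for name, size in BYTES_PER_PARAM.items()
}

for name, gigabytes in estimates.items():
    print(f"{name}: ~{gigabytes:.0f} GB for weights alone")
```

Under these assumptions, 16-bit weights alone come to roughly 144 GB, which is more than a single consumer graphics card typically holds.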
What the AI community thinks of Nvidia’s NVLM model
AI researchers on X have called the release “wild,” and praised its ability to understand visual data. One user wrote: “Wow! Nvidia just published a 72B model with is ~on par with llama 3.1 405B in math and coding evals and also has vision ?”
NVLM by NVIDIA is wild. And Open. Check it out. https://t.co/fYpagW4Kog pic.twitter.com/r9V8uamGVf
— Alex Zhavoronkov, PhD (aka Aleksandrs Zavoronkovs) (@biogerontology) October 2, 2024
Wow. New NVIDIA 72B model rivals Llama’s 405B! https://t.co/ACsvUUctml pic.twitter.com/TUZ378S4tz
— Jeremy Howard (@jeremyphoward) October 1, 2024
Wow nvidia just published a 72B model with is ~on par with llama 3.1 405B in math and coding evals and also has vision pic.twitter.com/c46DeXql7s
— Phil (@phill__1) October 1, 2024
That said, Nvidia itself has reportedly used open-source resources to develop NVLM 1.0, gaining insights from other AI models and various training data. However, the NVLM-D-72B model is restricted under its licensing terms. It cannot be used for commercial purposes or modified for resale. Essentially, Nvidia is providing the model exclusively for research purposes and for hobbyists eager to test the limits of their high-end graphics cards.
The researchers’ use of the term “open” is therefore quite intentional. Although Nvidia’s findings do provide value, the restrictions on commercial use mean the model cannot be considered truly open-source, which would require the freedom to use, modify, and distribute it without any limitations.
ReadWrite has reached out to Nvidia for comment.