Mistral launches its first multimodal AI model called Pixtral 12B

Posted on September 12, 2024

French AI startup Mistral has released its first multimodal model, Pixtral 12B, competing with the likes of OpenAI and Anthropic. The 12-billion-parameter model can process both images and text, and is built on the company’s existing text model, Nemo 12B.

Pixtral 12B is expected to be integrated into the company’s chatbot, Le Chat, and its API platform, La Plateforme, according to the company’s head of developer relations.

You can download the model via the torrent link. It’ll be available on le Chat and la Plateforme soon.

— Sophia Yang, Ph.D. (@sophiamyang) September 11, 2024

The model is said to be 24GB in size and, in theory, should be able to perform tasks like captioning images and counting the number of objects in a photo. Mistral’s official account on X released the model by posting its magnet link.
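As a rough sanity check on the reported size, the short Python sketch below estimates how much space 12 billion parameters would occupy at different numeric precisions. The bytes-per-parameter values are assumptions for illustration; the article does not state which precision the released weights use. The helper name `weights_size_gb` is hypothetical.

```python
def weights_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of raw model weights in gigabytes (1 GB = 10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# 12 billion parameters, as reported for Pixtral 12B.
PIXTRAL_PARAMS = 12e9

# Assumed precisions; the actual storage format is not confirmed by the article.
for label, nbytes in [("32-bit", 4), ("16-bit", 2), ("8-bit", 1)]:
    print(f"{label}: ~{weights_size_gb(PIXTRAL_PARAMS, nbytes):.0f} GB")
```

Under these assumptions, the reported 24GB figure matches what 12 billion parameters would take up at two bytes each.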

Pixtral 12B’s performance and accessibility

Pixtral 12B is available for download, fine-tuning, and use under an Apache 2.0 license without restrictions. It can be obtained through a torrent link on GitHub and on Hugging Face, the AI and machine learning development platform.

A Reddit user shared benchmark scores for Pixtral 12B, which appear to show that the model surpasses both Claude-3 Haiku and Phi-3 Vision in multimodal abilities on the ChartQA benchmark. It also reportedly outperforms competing AI models in multimodal knowledge and reasoning on the Massive Multi-discipline Multimodal Understanding (MMMU) benchmark.

Pixtral benchmarks results, posted by u/kristaller486 in r/LocalLLaMA

The Microsoft-backed company is already known for Codestral, a large language model that helps developers write code, as well as Mistral Large. ReadWrite reported on Mistral Large in February, when it was described as a “cutting-edge text generation model” with “top-tier reasoning capabilities.”

Most generative AI models, such as those from Mistral, are trained on extensive amounts of public data from the web, much of which is under copyright. While some providers of these models claim that “fair use” allows them to collect any public data, numerous copyright holders contest this practice. As a result, AI firms like OpenAI and Midjourney have faced lawsuits aimed at stopping it.

In December 2023, the open-source startup raised $414 million in funding at a valuation of $2 billion. By May 2024, the Paris-based company had closed a $645 million funding round led by General Catalyst that valued it at $6 billion.

Featured image: Canva

The post Mistral launches its first multimodal AI model called Pixtral 12B appeared first on ReadWrite.
