Just two months after releasing its large language model Llama 3.1, Meta has introduced an updated version, Llama 3.2. It is Meta's first open-source model capable of processing both images and text, giving developers the building blocks for new kinds of AI applications.
Key Features and Uses of Llama 3.2
The new model enables cutting-edge AI applications such as augmented reality platforms with real-time video recognition, visual search engines that categorize images by their content, and document analysis systems that can summarize lengthy documents. Meta says getting the updated model up and running is straightforward, and the added multimodal support lets Llama interpret an image and respond to questions about it.
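For developers, the practical upshot is that an image and a question can be sent to the model in a single request. Below is a minimal sketch of what that might look like with the Hugging Face transformers integration; the model identifier, the local file name, and the generation settings are illustrative assumptions rather than details from Meta's announcement.

```python
# Sketch: asking a Llama 3.2 vision model about a local image.
# Assumes the Hugging Face `transformers` Mllama integration and access to the
# gated "meta-llama/Llama-3.2-11B-Vision-Instruct" checkpoint (assumed identifier).
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed model identifier
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("receipt.png")  # any local image; file name is illustrative
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Summarize what this document shows."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```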
Although OpenAI and Google released their own multimodal AI models last year, Meta aims to close the gap with Llama 3.2, notes NIXsolutions. Image support is particularly important as Meta continues to extend its AI capabilities to hardware such as the Ray-Ban Meta smart glasses.
Llama 3.2’s Different Versions and Compatibility
Llama 3.2 comes in four sizes: two vision models with 11 billion and 90 billion parameters, and two lightweight text-only models with 1 billion and 3 billion parameters. The smaller models are designed to run efficiently on Arm-based processors from Qualcomm, MediaTek, and others, indicating that Meta expects them to be used on mobile devices. For pure text generation, however, Llama 3.1, released in July, remains the stronger option, with its largest version weighing in at 405 billion parameters.
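To illustrate the text-only side, the sketch below loads the smaller 3B instruct variant through a standard text-generation pipeline; the checkpoint name and prompt are assumptions for illustration, and an actual on-device deployment would typically use a quantized build rather than the full-precision weights.

```python
# Sketch: a quick summarization prompt with the lightweight 3B text model.
# Assumes the Hugging Face `transformers` text-generation pipeline and the
# "meta-llama/Llama-3.2-3B-Instruct" checkpoint (assumed identifier).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",  # assumed model identifier
    device_map="auto",
)

messages = [
    {"role": "user",
     "content": "In one sentence, summarize: Llama 3.2 adds image understanding "
                "and small text models designed to run on mobile devices."},
]
result = generator(messages, max_new_tokens=64)
# The pipeline returns the conversation with the assistant's reply appended last.
print(result[0]["generated_text"][-1]["content"])
```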
We’ll keep you updated on any further developments with Llama 3.2.