Meta Llama 3.1 405B Released as Company’s Largest Open Source AI Model to Date, Beats OpenAI’s GPT-4o
Meta on Tuesday released its latest and largest artificial intelligence (AI) model to the public. Called Meta Llama 3.1 405B, the company says the open-source model outperforms major closed AI models such as GPT-4, GPT-4o, and Claude 3.5 Sonnet across several benchmarks. The previously released Llama 3 8B and 70B AI models have also been upgraded. The newer versions were distilled from the 405B model and now offer a 128,000-token context window. Meta claims both of these models are now the leading open-source large language models (LLMs) for their sizes.
Announcing the new AI model in a blog post, the technology conglomerate said, “Llama 3.1 405B is the first openly available model that rivals the top AI models when it comes to state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation.”
Notably, 405B here refers to 405 billion parameters, which can be understood as the LLM’s number of knowledge nodes. The higher the parameter size, the more adept an AI model is in handling complex queries. The context window of the model is 128,000 tokens. It supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai languages.
The company claims the Llama 3.1 405B was evaluated on more than 150 benchmark tests across multiple areas of expertise. Based on the data shared in the post, Meta's AI model scored 96.8 on the Grade School Math 8K (GSM8K) benchmark, ahead of GPT-4's 94.2, GPT-4o's 96.1, and Claude 3.5 Sonnet's 96.4. It also outperformed these models on AI2's Reasoning Challenge (ARC) benchmark for science proficiency, Nexus for tool use, and the Multilingual Grade School Math (MGSM) benchmark.
Meta's largest AI model was trained on more than 15 trillion tokens using over 16,000 Nvidia H100 GPUs. One of the major introductions in Llama 3.1 405B is official support for tool calling, which lets developers use Brave Search for web searches, Wolfram Alpha for complex mathematical calculations, and Code Interpreter to generate Python code.
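As a rough illustration of how this tool calling surfaces to developers, the hedged Python sketch below builds a raw prompt in the style of Meta's published Llama 3.1 chat format. The special tokens, the built-in tool names (brave_search, wolfram_alpha), and the meta-llama/Meta-Llama-3.1-405B-Instruct model ID are assumptions taken from Meta's public documentation, not details stated in this article.

```python
# Hedged sketch: prompting a Llama 3.1 Instruct model so it can emit a
# built-in tool call (brave_search / wolfram_alpha). Token names, tool names,
# and the model ID follow Meta's published chat format and are assumptions.
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "meta-llama/Meta-Llama-3.1-405B-Instruct"  # assumed Hugging Face ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Declaring the built-in tools in the system message is what enables
# tool-calling mode; a response that needs a tool is expected to begin with
# <|python_tag|> followed by e.g. brave_search.call(query="...").
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "Environment: ipython\n"
    "Tools: brave_search, wolfram_alpha<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "What is the current weather in Menlo Park?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

In practice the tool call the model emits must be executed by the developer's own code, with the result passed back in a follow-up turn; the model itself does not run Brave Search or Wolfram Alpha.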
Since the Meta Llama 3.1 405B is open source, individuals can access it either from the company's website or from its Hugging Face listing. However, being a large model, it requires roughly 750GB of disk storage space to run. For inference, two nodes running Model Parallel 16 (MP16) are also required; MP16 is a model-parallelism configuration in which a large neural network is split across 16 devices or processors.
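For a sense of what fetching the model from the Hugging Face listing looks like in practice, here is a minimal sketch using the huggingface_hub library. The repository name is an assumption, access to the weights is gated behind an approval step, and the roughly 750GB footprint means this is only practical on machines with ample storage.

```python
# Hedged sketch: downloading the Llama 3.1 405B weights from Hugging Face.
# The repo ID below is assumed; the listing is gated, so `huggingface-cli login`
# (or an HF_TOKEN environment variable) with an approved request is needed first.
from huggingface_hub import snapshot_download

REPO_ID = "meta-llama/Meta-Llama-3.1-405B-Instruct"  # assumed listing name

local_path = snapshot_download(
    repo_id=REPO_ID,
    local_dir="./llama-3.1-405b",  # expect roughly 750GB of free disk space
)
print(f"Weights downloaded to: {local_path}")
```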
Apart from being publicly available, the model can also be used on major AI platforms from AWS, Nvidia, Databricks, Groq, Dell, Azure, Google Cloud, Snowflake, and more. The company says a total of 25 such platforms will be powered by Llama 3.1 405B. For safety and security, the company has introduced Llama Guard 3 and Prompt Guard, two new tools that safeguard the LLM against potential harm and abuse.