Meta has announced LLaMA-13B, a new large language model (LLM) that reportedly outperforms OpenAI’s GPT-3 despite being “10x smaller”. Meta’s LLaMA models could reshape the AI industry by paving the way for ChatGPT-style language assistants that run locally on devices such as PCs and smartphones.
LLaMA-13B is part of a new family of language models called “Large Language Model Meta AI,” or LLaMA, which range from 7 billion to 65 billion parameters in size. In comparison, OpenAI’s GPT-3 model, the foundational model behind ChatGPT, has 175 billion parameters. Meta trained its LLaMA models using publicly available datasets, such as Common Crawl, Wikipedia, and C4, which could allow the firm to release the model and the weights as open source.
Meta calls its LLaMA models “foundational models,” indicating that the firm intends them to form the basis of future, more refined AI models built on the technology. Meta expects LLaMA to be useful in natural language research and to potentially power applications such as question answering, natural language understanding, and reading comprehension, as well as to help researchers understand the capabilities and limitations of current language models.
While the top-of-the-line LLaMA model, LLaMA-65B with 65 billion parameters, goes toe-to-toe with similar offerings from competing AI labs DeepMind, Google, and OpenAI, the most significant development comes from the LLaMA-13B model. It can reportedly outperform GPT-3 while running on a single GPU, unlike GPT-3 derivatives, which require data center-scale hardware. LLaMA-13B could open the door for ChatGPT-like performance on consumer-level hardware in the near future.
Parameter count is a significant factor in AI. A parameter is a variable that a machine-learning model learns during training and then uses to make predictions or classifications based on input data. The number of parameters in a language model is a key factor in its performance, with larger models generally capable of handling more complex tasks and producing more coherent output. However, more parameters take up more space and require more computing resources to run. So if a model with fewer parameters can achieve the same results as a larger model, it represents a significant gain in efficiency.
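To put rough numbers on the space cost, the storage needed for a model's weights can be approximated as the parameter count multiplied by the bytes used per parameter. The sketch below applies that back-of-the-envelope formula to the parameter counts cited above. The precisions (32-bit, 16-bit, and 4-bit) and the helper name weight_gigabytes are illustrative assumptions, not figures or code from Meta, and the estimate covers the weights only, not the additional working memory a model needs while running.

```python
# Rough weight-storage estimate: parameters x bytes per parameter.
# Assumption: these precisions are illustrative and do not come from
# Meta's announcement.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int4": 0.5}

# Parameter counts as cited in the article.
MODELS = {
    "LLaMA-7B": 7e9,
    "LLaMA-13B": 13e9,
    "LLaMA-65B": 65e9,
    "GPT-3": 175e9,
}


def weight_gigabytes(num_params, precision="fp16"):
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, num_params in MODELS.items():
        sizes = ", ".join(
            f"{precision}: {weight_gigabytes(num_params, precision):.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{name}: {sizes}")
```

Under these assumptions, a 13-billion-parameter model stored at 16 bits per parameter needs about 26 GB for its weights, compared with about 350 GB for a 175-billion-parameter model at the same precision.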
According to independent AI researcher Simon Willison, Meta’s new AI models could enable language models with a sizable portion of ChatGPT’s capabilities to be run on mobile phones and laptops within a year or two.
Meta has made a stripped-down version of LLaMA available on GitHub, and researchers interested in receiving the full code and weights can request access via a form provided by Meta. The firm has not announced plans for a wider release of the model and weights at this time.