Large language models (LLMs) are becoming increasingly powerful, and the generative systems built around them have shown striking capabilities in producing coherent, complex text, images, and video. This power comes at a cost. Developing and training LLMs requires significant resources: companies and institutions must invest substantial sums to build and maintain these systems, and the energy they consume is considerable, which also raises concerns about CO2 emissions.
As reported by ilsole24ore.com, the performance of current models is closely tied to their number of parameters (the numerical weights a model learns during training), and models such as GPT-3 have well over a hundred billion of them.
Microsoft has launched Phi-3 Mini, a language model that, according to Microsoft, rivals OpenAI's GPT-3.5 at a significantly smaller size. Phi-3 Mini has 3.8 billion parameters, far fewer than the costly 175 billion of GPT-3. Despite this difference, the models are reported to perform comparably. In this way, Microsoft has presented a model whose responses are comparable to those of a model at least ten times larger.
Microsoft drew inspiration from the way children learn: the goal was to train the model on simple sentences so that its output rests on general knowledge while still showing strong problem-solving and reasoning abilities. Because there were not enough suitable children's books to serve as training data, Microsoft used other LLMs to generate children's-book-style texts, which were then used to train Phi-3 Mini.
The development of Phi-3 Mini brings more than one benefit. First, it points to a new paradigm: fine-tuning models at lower cost while getting better results. Second, a model of this kind can be adopted by organizations with limited budgets and can be embedded in low-powered devices such as smartphones and laptops. Because it can run directly on the device, data does not have to leave the machine, which reduces privacy and security concerns.
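To make the on-device argument concrete, here is a rough back-of-the-envelope sketch in Python that estimates how much memory the weights alone would occupy. The parameter counts come from the figures quoted above. The storage formats (32-, 16-, 8-, and 4-bit weights) and the function names are assumptions chosen for illustration, not details published for either model. The estimate ignores activations, caches, and runtime overhead.

```python
# Rough estimate of weight storage: parameters x bits per weight.
# Illustrative only: ignores activations, caches, and runtime overhead.

PHI3_MINI_PARAMS = 3.8e9  # parameter count quoted in the article
GPT3_PARAMS = 175e9       # parameter count quoted in the article

# Assumed storage formats (hypothetical choices for this sketch).
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def footprint_table(num_params: float) -> dict:
    """Map each assumed storage format to its estimated size in GB."""
    return {
        fmt: round(weight_memory_gb(num_params, bits), 2)
        for fmt, bits in BITS_PER_WEIGHT.items()
    }


def size_ratio(larger: float, smaller: float) -> float:
    """How many times more parameters the larger model has."""
    return larger / smaller


if __name__ == "__main__":
    for name, params in [("Phi-3 Mini", PHI3_MINI_PARAMS), ("GPT-3", GPT3_PARAMS)]:
        print(name, footprint_table(params))
    print("Parameter ratio:", round(size_ratio(GPT3_PARAMS, PHI3_MINI_PARAMS), 1))
```

Under these assumptions, Phi-3 Mini's weights come to roughly 7.6 GB at 16 bits and about 1.9 GB at 4 bits, which is within reach of a laptop or a recent smartphone. GPT-3's 175 billion parameters would need around 350 GB at 16 bits, about 46 times as much for the weights alone.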