Improve instructional behavior with Ignos LLM

Open source models

As part of its R&D activities in the field of Artificial Intelligence, Ignos has developed a new LLM based on the Mistral architecture and released it as open source. The model has achieved high scores on the Hugging Face Leaderboard and has shown that it can be an important aid in improving instructional behavior in AI systems. In less than a month it has been downloaded thousands of times for evaluation and implementation in AI systems of all types.

Mistral LLM – Ignos Hugging Face

LLM Details

Development and Features

The model, developed by Ignos, is a fine-tuned version of “Toten5/Marcoroni-neural-chat-7B-v2”. It is based on a “Mistral” model and is distributed under the Apache-2.0 license, so its use is not limited by commercial restrictions. The main intention behind its creation is to improve instructional behavior, which implies greater efficiency and accuracy in automated teaching and guidance tasks.

Biases, Risks and Limitations

This Ignos LLM inherits the biases, risks and limitations of its base models. It is crucial to recognize and address these issues to ensure a fair and ethical application of the model, taking into account the setting in which it will be used and the type of generative tasks that will be required of it.

Training details

Training data

The model was trained on the “tatsu-lab/alpaca” dataset, a collection of roughly 52,000 instruction-following examples that is widely used for instruction tuning in natural language processing.
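For illustration, each record in an Alpaca-style dataset has `instruction`, `input` and `output` fields, which are rendered into a single training text. The sketch below follows the prompt wording published with the original Alpaca project. Whether Ignos used exactly this template is an assumption, the helper name `format_alpaca` is hypothetical, and the example record is made up.

```python
# Sketch: turn one Alpaca-style record into a training text.
# Assumption: the template wording follows the original Alpaca project;
# the source does not state which template Ignos actually used.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def format_alpaca(record: dict) -> str:
    """Return prompt plus expected output for one record (hypothetical helper)."""
    if record.get("input"):
        prompt = PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    else:
        # Records without extra context use the shorter template.
        prompt = PROMPT_NO_INPUT.format(instruction=record["instruction"])
    return prompt + record["output"]


# Made-up record, only to show the shape of the data.
example = {
    "instruction": "Translate the sentence into Spanish.",
    "input": "Open source models are useful.",
    "output": "Los modelos de código abierto son útiles.",
}
print(format_alpaca(example))
```

During fine-tuning the model learns to produce the text after `### Response:` given everything before it, which is what shapes its instructional behavior.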

Training procedure

The training was carried out using the QLoRA (Quantization and Low-Rank Adapters) approach, which combines two techniques for making fine-tuning cheaper. Quantization reduces the precision of the numbers used in the model (e.g. from 32 bits to 8 bits; QLoRA itself stores the base weights in 4 bits), which reduces the size of the model and can speed up computation, something that is useful on devices with limited resources. Low-rank adapters, on the other hand, are small trainable matrices added to a pre-trained model to adjust it to specific tasks while the original weights stay frozen. Being low-rank, these adapters require far fewer parameters than full fine-tuning, which reduces complexity without sacrificing much performance. Subsequently, the trained adapter was merged with the base model, thus integrating its strengths and capabilities.
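The two ideas described above, and the final merge step, can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Ignos' training code: it uses simple absmax rounding to 4 bits (QLoRA itself uses a more refined 4-bit data type), tiny made-up matrices, and an illustrative adapter rank of 8 for the parameter count. All function names are hypothetical.

```python
# Toy sketch of quantization, a low-rank adapter, and merging.
# Assumptions: absmax 4-bit rounding, made-up weights and sizes.

def quantize_absmax(weights, bits=4):
    """Map floats to signed integer codes plus one shared scale."""
    levels = 2 ** (bits - 1) - 1                 # 7 for 4 bits
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) for w in weights], scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, b, a, alpha, r):
    """Fold the adapter into the base weights: W + (alpha / r) * B @ A."""
    delta = matmul(b, a)
    s = alpha / r
    return [[wij + s * dij for wij, dij in zip(wr, dr)] for wr, dr in zip(w, delta)]


def lora_params(d_out, d_in, r):
    """Trainable parameters in B (d_out x r) plus A (r x d_in)."""
    return r * (d_out + d_in)


# 1) Quantization: store weights as 4-bit codes, recover approximations.
weights = [0.70, -0.33, 0.10, -0.02]
codes, scale = quantize_absmax(weights)
restored = dequantize(codes, scale)

# 2) Low-rank adapter on a 4x4 layer with rank r = 1, then merge.
d, r, alpha = 4, 1, 2
w = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base
b = [[0.5], [0.0], [0.0], [0.0]]      # d x r, trainable
a = [[0.0, 1.0, 0.0, 0.0]]            # r x d, trainable
merged = merge_lora(w, b, a, alpha, r)

print(codes, restored)
print(merged[0])
# Hidden size 4096 matches Mistral 7B; rank 8 is an illustrative assumption.
print(4096 * 4096, lora_params(4096, 4096, 8))
```

The last line compares the number of weights in one full 4096 x 4096 projection with the number of trainable adapter weights for the same layer, which is where the parameter savings come from. After merging, the adapter no longer exists as a separate component, so the merged model can be used like any ordinary checkpoint.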

LLM Technical Specifications

Architecture and Objective of the Model

The LLM is based on the Mistral architecture, a transformer design for language models that uses techniques such as grouped-query attention and sliding-window attention to process text more efficiently.

Computing Infrastructure

The training was conducted on the cloud infrastructure provider RunPod, using an environment with 3 RTX 4090 graphics cards, 48 vCPUs and 377 GB of RAM. This hardware configuration allowed fast and efficient training.
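A rough back-of-the-envelope calculation shows why quantization matters on this kind of hardware. The sketch assumes 24 GB of memory per RTX 4090 and roughly 7.2 billion parameters for a 7B-class Mistral model; neither figure comes from the source. It counts weights only and ignores activations, optimizer state and adapter weights, so the numbers are an order-of-magnitude estimate.

```python
# Rough estimate of weight memory for a 7B-class model at several precisions.
# Assumptions (not from the source): ~7.2e9 parameters, 24 GB per RTX 4090.

PARAMS = 7.2e9
GPU_MEMORY_GB = 24
NUM_GPUS = 3  # from the training setup described above


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(PARAMS, bits):5.1f} GB")

print(f"Total GPU memory: {NUM_GPUS * GPU_MEMORY_GB} GB")
```

Under these assumptions, 32-bit weights alone would not fit on a single card, whereas 4-bit weights leave most of each card's memory free for activations and adapter training.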

Software and Framework Versions

The software used for development included Axolotl 0.3.0 and version 0.6.0 of the PEFT framework. Axolotl is a tool for configuring and running fine-tuning jobs, while PEFT is the Hugging Face library for parameter-efficient fine-tuning methods such as low-rank adapters.

Conclusion

Ignos’ new LLM, based on the Mistral architecture, is a very useful tool for AI systems in the field of natural language processing. Its technical specifications and training approach suggest great potential for improving instructional interaction in AI applications. However, it is vital to address the biases and limitations inherited from its base models to ensure effective and ethical application in diverse contexts.

If you are considering adopting AI-based solutions for data management in your organization, do not hesitate to contact us.
