Open source models
As part of its R&D activities in Artificial Intelligence, Ignos has developed a new open-source LLM based on the Mistral architecture. The model, which has achieved high scores on the Hugging Face Leaderboard, has shown that it can be a valuable aid in improving instruction-following behavior in AI systems. In less than a month it has been downloaded thousands of times for evaluation and integration into AI systems of all types.
LLM Details
Development and Characteristics
The model, developed by Ignos, is a fine-tuned version of “Toten5/Marcoroni-neural-chat-7B-v2”. It is based on the Mistral architecture and is distributed under the Apache-2.0 license, so its use is not limited by commercial restrictions. The main goal behind its creation is to improve instruction-following behavior, which translates into greater efficiency and accuracy in automated teaching and guidance tasks.
Biases, Risks and Limitations
Like its base models, this Ignos LLM inherits the same biases, risks and limitations. It is crucial to recognize and address these issues to ensure a fair and ethical application of the model, taking into account the setting in which it will be used and the type of generative tasks that will be required of it.
Training details
Training data
The model was trained using the “tatsu-lab/alpaca” dataset, known for its wide variety and quality in the field of natural language processing.
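To illustrate, each record in the Alpaca dataset pairs an instruction (and an optional input) with a target output. A minimal sketch of the standard Alpaca prompt template follows; the exact template used in this model's training was not published, so treat this as an assumption:

```python
def format_alpaca_example(example: dict) -> str:
    """Render one Alpaca record as a single training prompt.

    The tatsu-lab/alpaca dataset exposes "instruction", "input",
    and "output" fields; records with an empty "input" use a
    shorter template.
    """
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )


record = {
    "instruction": "Name three primary colors.",
    "input": "",
    "output": "Red, yellow, blue.",
}
print(format_alpaca_example(record))
```

Prompts rendered this way are what the model actually sees during fine-tuning, which is why a dataset's formatting consistency matters as much as its content.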
Training procedure
The training was carried out using the QLoRA (Quantized Low-Rank Adaptation) approach, which combines two techniques for optimizing machine learning models. Quantization reduces the precision of the numbers stored in the model (for example, from 32-bit floats to 8-bit or 4-bit integers), shrinking the model and speeding up computation, which is useful on devices with limited resources. Low-rank adapters, on the other hand, are small trainable matrices inserted into a pre-trained model to adapt it to specific tasks. Because they are low-rank, these adapters require far fewer parameters, which reduces training cost without sacrificing much performance. After training, the adapters were merged back into the base model, integrating their strengths and capabilities.
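The idea can be sketched numerically with NumPy. This is a toy demonstration with small matrices, not the actual 7B model or the bitsandbytes/PEFT implementation; all dimensions and the int8 scheme are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Quantization: store weights as int8 instead of float32 ---
W = rng.standard_normal((64, 64)).astype(np.float32)  # frozen base weight
scale = np.abs(W).max() / 127.0                       # symmetric per-tensor scale
W_q = np.round(W / scale).astype(np.int8)             # 4x smaller than float32
W_deq = W_q.astype(np.float32) * scale                # dequantized for the forward pass

# --- Low-rank adapter: trainable update delta_W = B @ A with small rank r ---
r = 4
A = rng.standard_normal((r, 64)).astype(np.float32) * 0.01
B = np.zeros((64, r), dtype=np.float32)  # zero-init, so training starts from the base model

x = rng.standard_normal((1, 64)).astype(np.float32)
y = x @ W_deq.T + x @ (B @ A).T  # frozen quantized path + trainable adapter path

# The adapter trains 2*r*64 parameters instead of the full 64*64
print(A.size + B.size, W.size)

# --- Merging: fold the trained adapter back into a full-precision weight ---
W_merged = W_deq + B @ A
```

The final merge step is what produced the published model: a single set of weights that integrates the base model with what the adapters learned.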
LLM Technical Specifications
Architecture and Objective of the Model
The LLM is based on the Mistral architecture, an advanced framework for language models that provides more efficient processing and more accurate results.
Computing Infrastructure
Training was conducted on the cloud infrastructure provider RunPod, using an environment with 3 RTX 4090 graphics cards, 48 vCPUs and 377 GB of RAM. This hardware configuration enabled fast and efficient training.
Software and Framework Versions
The software used for development included Axolotl 0.3.0 and version 0.6.0 of the PEFT framework. These tools are essential for efficiently handling large amounts of data and complex machine learning operations.
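As an illustrative sketch only: the actual training configuration was not published, so every value below is an assumption; the key names follow Axolotl's documented YAML config format. A QLoRA run over the Alpaca dataset might be configured roughly as:

```yaml
base_model: Toten5/Marcoroni-neural-chat-7B-v2
load_in_4bit: true        # QLoRA: quantized base weights
adapter: qlora
lora_r: 16                # adapter rank (assumed)
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:      # assumed; typical attention projections
  - q_proj
  - k_proj
  - v_proj
  - o_proj
datasets:
  - path: tatsu-lab/alpaca
    type: alpaca
sequence_len: 2048
micro_batch_size: 2
num_epochs: 1
```

Axolotl drives PEFT under the hood, which is why the two version numbers above matter together: the config's adapter settings map directly onto PEFT's LoRA parameters.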
Conclusion
Ignos’ new LLM, based on the Mistral architecture, represents a very useful tool for natural language processing in AI systems. Its technical specifications and advanced training approach suggest great potential for improving instructional interaction in AI applications. However, it is vital to address the inherited biases and limitations to ensure effective and ethical application in diverse contexts.
If you are considering adopting AI-based solutions for data management in your organization, do not hesitate to contact us.