Hi Mansi Vaishnav,
Thanks for reaching out to Microsoft Q&A.
In the image you shared, there is a notice stating that the Microsoft Purchase Policy applies to the use of the open-source model. It specifies that only dummy or artificial datasets should be used in non-production systems for testing or evaluating the model. Additionally, no live or production data is allowed unless you are enrolled and compliant as a supplier with all applicable Microsoft policies.
Based on this notice, here are the key points to consider:
Data Usage Restrictions: Under this notice, models such as Mistral, and possibly others like Code-Llama, may only be used for experimentation with non-production (dummy or artificial) data unless you are specifically compliant with Microsoft's supplier terms. This means that you cannot use production data with these models without proper enrollment and adherence to Microsoft's policies.
Open-Source Models and Production Data: On the question of whether these models can be used safely with production data, the notice is clear that the open-source models provided in Azure Marketplace are not permitted for production data unless you have the specific permissions and compliance in place. The restriction appears to stem from potential risks related to data processing: the models are offered for experimentation rather than for direct production usage without the necessary agreements.
Recommendation: If you're experimenting with models like Mistral or Code-Llama, you should stick to testing with dummy or synthetic data. If your goal is to use them in a production environment, you will need to ensure compliance with Microsoft's supplier policies. Alternatively, for production use cases, consider leveraging fully supported Azure services like Azure OpenAI, which have clearer policies on production data handling.
For further guidance on using any open-source models in production, you might need to explore legal and compliance aspects or look for models that specifically mention support for production environments in the Azure Marketplace or AI Studio.
General Best Practices for AI Model Usage with Production Data:
- Data Encryption: Ensure that all production data is encrypted both at rest and in transit, whether it’s being processed by an AI model or stored in cloud infrastructure.
- Access Control: Implement strong access controls to ensure that only authorized personnel and systems can interact with sensitive data.
- Data Auditing: Enable detailed logging and auditing to track how data is processed, by whom, and when.
- Anonymization: Whenever possible, anonymize production data before passing it to any experimental or open-source models.
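As an illustration of the anonymization point above, here is a minimal Python sketch that replaces sensitive fields with keyed hashes and redacts email addresses in free text before a record is passed to a model. The field names in SENSITIVE_FIELDS, the helper names, and the email pattern are assumptions made for this example, not part of any Azure API. Keyed hashing is strictly pseudonymization rather than full anonymization, so treat this as a starting point and confirm with your compliance team whether it is sufficient for your data.

```python
import hashlib
import hmac
import re

# Hypothetical field names for illustration; adjust to your own schema.
SENSITIVE_FIELDS = {"name", "email", "customer_id"}

# Simple email pattern; it will not catch every valid address format.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable keyed hash so the same input always maps to the same token."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def anonymize_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with sensitive fields hashed and emails redacted."""
    cleaned = {}
    for field, value in record.items():
        if field in SENSITIVE_FIELDS:
            # Replace the whole value with a pseudonymous token.
            cleaned[field] = pseudonymize(str(value), secret_key)
        elif isinstance(value, str):
            # Redact email addresses embedded in free-text fields.
            cleaned[field] = EMAIL_PATTERN.sub("[REDACTED_EMAIL]", value)
        else:
            cleaned[field] = value
    return cleaned
```

You would call anonymize_record(record, secret_key) on each record before building a prompt, keeping secret_key outside source control (for example, in a secrets store) so the tokens cannot easily be reversed by guessing inputs.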
Next Steps:
- Review Azure OpenAI Policies: Familiarize yourself with the policies governing Azure OpenAI to assess whether it suits your needs for production use.
- Ensure Compliance: If you must use open-source models, work closely with legal and compliance teams to ensure you meet Microsoft’s supplier and data handling policies.
- Model Evaluation: If you’re still experimenting, make sure to use artificial or dummy data and avoid using any live customer or production data.
Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.