Add and configure models to Azure AI services - Azure AI Foundry

Important

Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

You can decide and configure which models are available for inference in the inference endpoint. When a given model is configured, you can then generate predictions from it by indicating its model name or deployment name on your requests. No further changes are required in your code to use it.

In this article, you'll learn how to add a new model to Azure AI model inference in Azure AI Foundry.

Prerequisites

To complete this article, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. Read Upgrade from GitHub Models to Azure AI model inference if that's your case.
An Azure AI services resource.

An AI project connected to your Azure AI Services resource with the feature Deploy models to Azure AI model inference service on.
- You can follow the steps at Configure Azure AI model inference service in my project in Azure AI Foundry.

Add a model

You can add models to the Azure AI model inference endpoint using the following steps:

Go to Model catalog section in Azure AI Foundry portal.
Scroll to the model you're interested in and select it.
You can review the details of the model in the model card.
Select Deploy.
For model providers that require more terms of contract, you'll be asked to accept those terms. This is the case for Mistral models for instance. Accept the terms on those cases by selecting Subscribe and deploy.
You can configure the deployment settings at this time. By default, the deployment receives the name of the model you're deploying. The deployment name is used in the model parameter for request to route to this particular model deployment. This allows you to also configure specific names for your models when you attach specific configurations. For instance o1-preview-safe for a model with a strict content safety content filter.

Tip

Each model can support different deployments types, providing different data residency or throughput guarantees. See deployment types for more details.
We automatically select an Azure AI Services connection depending on your project. Use the Customize option to change the connection based on your needs. If you're deploying under the Standard deployment type, the models need to be available in the region of the Azure AI Services resource.

Tip

If the desired resource isn't listed, you might need to create a connection to it. See Configure Azure AI model inference service in my project in Azure AI Foundry portal.
Select Deploy.
Once the deployment completes, the new model is listed in the page and it's ready to be used.

Manage models

You can manage the existing model deployments in the resource using Azure AI Foundry portal.

Go to Models + Endpoints section in Azure AI Foundry portal.
Scroll to the connection to your Azure AI Services resource. Model deployments are grouped and displayed per connection.
You see a list of models available under each connection. Select the model deployment you're interested in.
Edit or Delete the deployment as needed.

Test the deployment in the playground

You can interact with the new model in Azure AI Foundry portal using the playground:

Note

Playground is only available when working with AI projects in Azure AI Foundry. Create an AI project to get full access to all the capabilities in Azure AI Foundry.

Go to Playgrounds section in Azure AI Foundry portal.
Depending on the type of model you deployed, select the playground needed. In this case we select Chat playground.
In the Deployment drop down, under Setup select the name of the model deployment you have created.
Type your prompt and see the outputs.
Additionally, you can use View code so see details about how to access the model deployment programmatically.

Important

Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

You can decide and configure which models are available for inference in the inference endpoint. When a given model is configured, you can then generate predictions from it by indicating its model name or deployment name on your requests. No further changes are required in your code to use it.

In this article, you'll learn how to add a new model to Azure AI model inference in Azure AI Foundry.

Prerequisites

To complete this article, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. Read Upgrade from GitHub Models to Azure AI model inference if that's your case.
An Azure AI services resource.

Install the Azure CLI and the cognitiveservices extension for Azure AI Services:
```
az extension add -n cognitiveservices
```
Some of the commands in this tutorial use the jq tool, which might not be installed in your system. For installation instructions, see Download jq.
Identify the following information:
- Your Azure subscription ID.
- Your Azure AI Services resource name.
- The resource group where the Azure AI Services resource is deployed.

Add models

To add a model, you first need to identify the model that you want to deploy. You can query the available models as follows:

Log in into your Azure subscription:
```
az login
```
If you have more than 1 subscription, select the subscription where your resource is located:
```
az account set --subscription $subscriptionId>
```
Set the following environment variables with the name of the Azure AI Services resource you plan to use and resource group.
```
accountName="<ai-services-resource-name>"
resourceGroupName="<resource-group>"
```
If you don't have an Azure AI Services account create yet, you can create one as follows:
```
az cognitiveservices account create -n $accountName -g $resourceGroupName
```

Let's see first which models are available to you and under which SKU. The following command list all the model definitions available:

az cognitiveservices account list-models \
    -n $accountName \
    -g $resourceGroupName \
| jq '.[] | { name: .name, format: .format, version: .version, sku: .skus[0].name, capacity: .skus[0].capacity.default }'

Outputs look as follows:

{
  "name": "Phi-3.5-vision-instruct",
  "format": "Microsoft",
  "version": "2",
  "sku": "GlobalStandard",
  "capacity": 1
}

Identify the model you want to deploy. You need the properties name, format, version, and sku. Capacity might also be needed depending on the type of deployment.

Tip

Notice that not all the models are available in all the SKUs.

Add the model deployment to the resource. The following example adds Phi-3.5-vision-instruct:

az cognitiveservices account deployment create \
    -n $accountName \
    -g $resourceGroupName \
    --deployment-name Phi-3.5-vision-instruct \
    --model-name Phi-3.5-vision-instruct \
    --model-version 2 \
    --model-format Microsoft \
    --sku-capacity 1 \
    --sku-name GlobalStandard

The model is ready to be consumed.

You can deploy the same model multiple times if needed as long as it's under a different deployment name. This capability might be useful in case you want to test different configurations for a given model, including content safety.

Manage deployments

You can see all the deployments available using the CLI:

Run the following command to see all the active deployments:

az cognitiveservices account deployment list -n $accountName -g $resourceGroupName

You can see the details of a given deployment:

az cognitiveservices account deployment show \
    --deployment-name "Phi-3.5-vision-instruct" \
    -n $accountName \
    -g $resourceGroupName

You can delete a given deployment as follows:

    az cognitiveservices account deployment delete \
    --deployment-name "Phi-3.5-vision-instruct" \
    -n $accountName \
    -g $resourceGroupName

Use the model

Deployed models in Azure AI model inference can be consumed using the Azure AI model's inference endpoint for the resource. When constructing your request, indicate the parameter model and insert the model deployment name you have created. You can programmatically get the URI for the inference endpoint using the following code:

Inference endpoint

az cognitiveservices account show  -n $accountName -g $resourceGroupName | jq '.properties.endpoints["Azure AI Model Inference API"]'

To make requests to the Azure AI model inference endpoint, append the route models, for example https://<resource>.services.ai.azure.com/models. You can see the API reference for the endpoint at Azure AI model inference API reference page.

Inference keys

az cognitiveservices account keys list  -n $accountName -g $resourceGroupName

Important

Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

You can decide and configure which models are available for inference in the inference endpoint. When a given model is configured, you can then generate predictions from it by indicating its model name or deployment name on your requests. No further changes are required in your code to use it.

In this article, you'll learn how to add a new model to Azure AI model inference in Azure AI Foundry.

Prerequisites

To complete this article, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. Read Upgrade from GitHub Models to Azure AI model inference if that's your case.
An Azure AI services resource.

Install the Azure CLI.
Identify the following information:
- Your Azure subscription ID.
- Your Azure AI Services resource name.
- The resource group where the Azure AI Services resource is deployed.
- The model name, provider, version, and SKU you would like to deploy. You can use the Azure AI Foundry portal or the Azure CLI to identify it. In this example we deploy the following model:
  - Model name:: Phi-3.5-vision-instruct
  - Provider: Microsoft
  - Version: 2
  - Deployment type: Global standard

About this tutorial

The example in this article is based on code samples contained in the Azure-Samples/azureai-model-inference-bicep repository. To run the commands locally without having to copy or paste file content, use the following commands to clone the repository and go to the folder for your coding language:

git clone https://github.com/Azure-Samples/azureai-model-inference-bicep

The files for this example are in:

cd azureai-model-inference-bicep/infra

Add the model

Use the template ai-services-deployment-template.bicep to describe model deployments:

ai-services-deployment-template.bicep

@description('Name of the Azure AI services account')
param accountName string

@description('Name of the model to deploy')
param modelName string

@description('Version of the model to deploy')
param modelVersion string

@allowed([
  'AI21 Labs'
  'Cohere'
  'Core42'
  'DeepSeek'
  'Meta'
  'Microsoft'
  'Mistral AI'
  'OpenAI'
])
@description('Model provider')
param modelPublisherFormat string

@allowed([
    'GlobalStandard'
    'Standard'
    'GlobalProvisioned'
    'Provisioned'
])
@description('Model deployment SKU name')
param skuName string = 'GlobalStandard'

@description('Content filter policy name')
param contentFilterPolicyName string = 'Microsoft.DefaultV2'

@description('Model deployment capacity')
param capacity int = 1

resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2024-04-01-preview' = {
  name: '${accountName}/${modelName}'
  sku: {
    name: skuName
    capacity: capacity
  }
  properties: {
    model: {
      format: modelPublisherFormat
      name: modelName
      version: modelVersion
    }
    raiPolicyName: contentFilterPolicyName == null ? 'Microsoft.Nill' : contentFilterPolicyName
  }
}

Run the deployment:

RESOURCE_GROUP="<resource-group-name>"
ACCOUNT_NAME="<azure-ai-model-inference-name>" 
MODEL_NAME="Phi-3.5-vision-instruct"
PROVIDER="Microsoft"
VERSION=2

az deployment group create \
    --resource-group $RESOURCE_GROUP \
    --template-file ai-services-deployment-template.bicep \
    --parameters accountName=$ACCOUNT_NAME modelName=$MODEL_NAME modelVersion=$VERSION modelPublisherFormat=$PROVIDER

Use the model

Deployed models in Azure AI model inference can be consumed using the Azure AI model's inference endpoint for the resource. When constructing your request, indicate the parameter model and insert the model deployment name you have created.

Share via

Add and configure models to Azure AI model inference

Prerequisites

Add a model

Manage models

Test the deployment in the playground

Prerequisites

Add models

Manage deployments

Use the model

Prerequisites

About this tutorial

Add the model

Use the model

Next steps

Feedback

Additional resources