Azure OpenAI Embedding skill
The Azure OpenAI Embedding skill connects to a deployed embedding model on your Azure OpenAI resource to generate embeddings during indexing. Your data is processed in the Geo where your model is deployed.
Prerequisites
Your Azure OpenAI Service must have an associated custom subdomain. If the service was created through the Azure portal, this subdomain is automatically generated as part of your service setup. Ensure that your service includes a custom subdomain before using it with the Azure AI Search integration.
Azure OpenAI Service resources (with access to embedding models) that were created in AI Studio aren't supported. Only the Azure OpenAI Service resources created in the Azure portal are compatible with the Azure OpenAI Embedding skill integration.
The Import and vectorize data wizard in the Azure portal uses the Azure OpenAI Embedding skill to vectorize content. You can run the wizard and review the generated skillset to see how the wizard builds the skill for embedding models.
Note
This skill is bound to Azure OpenAI and is charged at the existing Azure OpenAI pay-as-you-go price.
@odata.type
Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill
Data limits
The maximum size of a text input should be 8,000 tokens. If input exceeds the maximum allowed, the model throws an invalid request error. For more information, see the tokens key concept in the Azure OpenAI documentation. Consider using the Text Split skill if you need data chunking.
Skill parameters
Parameters are case-sensitive.
Parameter | Description |
---|---|
resourceUri | The URI of the model provider, in this case, an Azure OpenAI resource. This parameter only supports URLs with domain openai.azure.com , such as https://<resourcename>.openai.azure.com . If the Azure OpenAI endpoint has a URL with domain cognitiveservices.azure.com , like https://<resourcename>.cognitiveservices.azure.com , you must first create a custom subdomain with openai.azure.com for the Azure OpenAI resource, and then use https://<resourcename>.openai.azure.com instead. |
apiKey | The secret key used to access the model. If you provide a key, leave authIdentity empty. If you set both the apiKey and authIdentity , the apiKey is used on the connection. |
deploymentId | The name of the deployed Azure OpenAI embedding model. The model should be an embedding model, such as text-embedding-ada-002. See the List of Azure OpenAI models for supported models. |
authIdentity | A user-managed identity used by the search service for connecting to Azure OpenAI. You can use either a system-managed or a user-managed identity. To use a system-managed identity, leave apiKey and authIdentity blank. The system-managed identity is then used automatically. A managed identity must have Cognitive Services OpenAI User permissions to send text to Azure OpenAI. |
modelName | This property is required if your skillset is created using the 2024-05-01-preview or 2024-07-01 REST API. Set this property to the name of the Azure OpenAI embedding model that is deployed on the provider specified through resourceUri and identified through deploymentId . Currently, the supported values are text-embedding-ada-002 , text-embedding-3-large , and text-embedding-3-small . |
dimensions | (Optional, introduced in the 2024-05-01-preview REST API). The dimensions of embeddings that you would like to generate if the model supports reducing the embedding dimensions. Supported ranges are listed below. Defaults to the maximum dimensions for each model if not specified. For skillsets created using the 2023-10-01-preview REST API, dimensions are fixed at 1536. |
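The resourceUri and authentication rules in the table can be checked locally before a skillset is submitted. The following is a minimal sketch, not part of any SDK: the helper names `is_supported_resource_uri` and `resolve_auth` are hypothetical, and the logic only restates the rules described above.

```python
from urllib.parse import urlparse


def is_supported_resource_uri(resource_uri: str) -> bool:
    """Return True if the URI uses https and a host under openai.azure.com."""
    parsed = urlparse(resource_uri)
    host = parsed.hostname or ""
    return parsed.scheme == "https" and host.endswith(".openai.azure.com")


def resolve_auth(api_key=None, auth_identity=None) -> str:
    """Describe which credential the connection would use (hypothetical helper)."""
    if api_key:
        # apiKey takes precedence, even when authIdentity is also set.
        return "apiKey"
    if auth_identity:
        return "user-assigned managed identity"
    # Both left blank: the system-managed identity is used automatically.
    return "system-assigned managed identity"


print(is_supported_resource_uri("https://my-demo-openai-eastus.openai.azure.com/"))
print(resolve_auth(api_key="<key>", auth_identity={"placeholder": True}))
```

A check like this only covers the URL shape and the precedence rule. It doesn't confirm that the resource exists or that the identity has the Cognitive Services OpenAI User role.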
Supported dimensions by modelName
The supported dimensions for an Azure OpenAI Embedding skill depend on the modelName that is configured.
modelName | Minimum dimensions | Maximum dimensions |
---|---|---|
text-embedding-ada-002 | 1536 | 1536 |
text-embedding-3-large | 1 | 3072 |
text-embedding-3-small | 1 | 1536 |
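As an illustration of how these ranges and the default interact, the sketch below validates a requested dimensions value against the table. The function name `resolve_dimensions` is hypothetical, and the service performs its own validation regardless of any local check.

```python
# Ranges copied from the table above: modelName -> (minimum, maximum).
SUPPORTED_DIMENSIONS = {
    "text-embedding-ada-002": (1536, 1536),
    "text-embedding-3-large": (1, 3072),
    "text-embedding-3-small": (1, 1536),
}


def resolve_dimensions(model_name, dimensions=None):
    """Return the effective dimensions, or raise ValueError if out of range."""
    if model_name not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"Unsupported modelName: {model_name!r}")
    minimum, maximum = SUPPORTED_DIMENSIONS[model_name]
    if dimensions is None:
        # Not specified: default to the model's maximum.
        return maximum
    if not minimum <= dimensions <= maximum:
        raise ValueError(
            f"dimensions={dimensions} is outside {minimum}..{maximum} for {model_name}"
        )
    return dimensions


print(resolve_dimensions("text-embedding-3-large"))      # 3072
print(resolve_dimensions("text-embedding-3-small", 256))  # 256
```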
Skill inputs
Input | Description |
---|---|
text | The input text to be vectorized. If you're using data chunking, the source might be /document/pages/* . |
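To show where a source such as /document/pages/* comes from, here is a hedged sketch that pairs a Text Split skill with the embedding skill and prints the resulting skillset body as JSON. The skillset name, the page length, and the resource and deployment names are illustrative assumptions, not recommendations.

```python
import json

# Text Split skill: breaks /document/content into chunks under /document/pages.
split_skill = {
    "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
    "context": "/document",
    "textSplitMode": "pages",
    "maximumPageLength": 2000,  # measured in characters; illustrative value
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "textItems", "targetName": "pages"}],
}

# Embedding skill: runs once per chunk because its context is /document/pages/*.
embedding_skill = {
    "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
    "context": "/document/pages/*",
    "resourceUri": "https://my-demo-openai-eastus.openai.azure.com/",
    "deploymentId": "my-text-embedding-ada-002-model",
    "modelName": "text-embedding-ada-002",
    "inputs": [{"name": "text", "source": "/document/pages/*"}],
    "outputs": [{"name": "embedding"}],
}

skillset = {
    "name": "chunk-and-embed-skillset",  # hypothetical name
    "skills": [split_skill, embedding_skill],
}

print(json.dumps(skillset, indent=2))
```

With this context, each chunk gets its own embedding output, so choose a page length that keeps every chunk well under the token limit of the model.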
Skill outputs
Output | Description |
---|---|
embedding | Vectorized embedding for the input text. |
Sample definition
Consider a record that has the following fields:
{
"content": "Microsoft released Windows 10."
}
Then your skill definition might look like this:
{
"@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
"description": "Connects a deployed embedding model.",
"resourceUri": "https://my-demo-openai-eastus.openai.azure.com/",
"deploymentId": "my-text-embedding-ada-002-model",
"modelName": "text-embedding-ada-002",
"dimensions": 1536,
"inputs": [
{
"name": "text",
"source": "/document/content"
}
],
"outputs": [
{
"name": "embedding"
}
]
}
Sample output
For the given input text, a vectorized embedding output is produced.
{
"embedding": [
0.018990106880664825,
-0.0073809814639389515,
....
0.021276434883475304
]
}
The output resides in memory. To send this output to a field in the search index, you must define an outputFieldMapping that maps the vectorized embedding output (which is an array) to a vector field. Assuming the skill output resides in the document's embedding node, and content_vector is the field in the search index, the outputFieldMappings in the indexer should look like this:
"outputFieldMappings": [
{
"sourceFieldName": "/document/embedding/*",
"targetFieldName": "content_vector"
}
]
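The target of this mapping has to be a vector field whose dimensions match what the skill produces. The sketch below assumes an index field definition in the shape used by the 2024-07-01 REST API. The profile name and the helper `mapping_is_consistent` are hypothetical and serve only to illustrate the consistency check.

```python
# Assumed index field definition for the mapping target (illustrative values).
vector_field = {
    "name": "content_vector",
    "type": "Collection(Edm.Single)",
    "searchable": True,
    "dimensions": 1536,
    "vectorSearchProfile": "my-vector-profile",  # hypothetical profile name
}

# The output field mapping from the indexer definition above.
output_field_mapping = {
    "sourceFieldName": "/document/embedding/*",
    "targetFieldName": "content_vector",
}


def mapping_is_consistent(field, mapping, skill_dimensions):
    """Check that the mapping targets this field and that dimensions agree."""
    return (
        mapping["targetFieldName"] == field["name"]
        and field["type"] == "Collection(Edm.Single)"
        and field["dimensions"] == skill_dimensions
    )


print(mapping_is_consistent(vector_field, output_field_mapping, 1536))  # True
print(mapping_is_consistent(vector_field, output_field_mapping, 3072))  # False
```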
Best practices
Consider the following best practices when you use this skill:
If you're hitting your Azure OpenAI TPM (tokens per minute) limit, review the quota limits advisory so that you can address it accordingly. For more information about the performance of your Azure OpenAI instance, see the Azure OpenAI monitoring documentation.
Ideally, the Azure OpenAI embedding model deployment you use for this skill should be separate from the deployments used for other use cases, including the query vectorizer. Separate deployments can each be tailored to their specific use case, which optimizes performance and makes it easy to identify traffic from the indexer and the index embedding calls.
Your Azure OpenAI instance should be in the same region or at least geographically close to the region where your AI Search service is hosted. This reduces latency and improves the speed of data transfer between the services.
If your Azure OpenAI TPM (tokens per minute) limit is larger than the default published in the quotas and limits documentation, open a support case with the Azure AI Search team so that this can be adjusted accordingly. This keeps your indexing process from being unnecessarily slowed down by the documented default TPM limit when you have higher limits.
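To get a feel for how a TPM limit bounds indexing time, the following back-of-the-envelope sketch estimates a lower bound in minutes. The function name and the sample numbers are assumptions for illustration. The estimate ignores requests-per-minute limits, retries, and any other traffic on the same deployment.

```python
import math


def estimated_minutes(total_chunks, avg_tokens_per_chunk, tpm_limit):
    """Lower-bound estimate of minutes needed to embed all chunks at a TPM limit."""
    if tpm_limit <= 0:
        raise ValueError("tpm_limit must be positive")
    total_tokens = total_chunks * avg_tokens_per_chunk
    return math.ceil(total_tokens / tpm_limit)


# Illustrative numbers: 100,000 chunks of about 500 tokens at 240,000 TPM.
print(estimated_minutes(100_000, 500, 240_000))  # 209
```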
Errors and warnings
Condition | Result |
---|---|
Null or invalid URI | Error |
Null or invalid deploymentId | Error |
Text is empty | Warning |
Text is larger than 8,000 tokens | Error |
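The conditions in this table can be mirrored in a local pre-check before text is sent to the skill. This is a minimal sketch under stated assumptions: `precheck` is a hypothetical helper, it only detects missing values (not every kind of invalid URI or deployment), and the token count is assumed to come from a tokenizer that the caller supplies.

```python
MAX_TOKENS = 8000  # limit stated in the Data limits section


def precheck(resource_uri, deployment_id, text, token_count):
    """Return (result, reason) following the errors and warnings table."""
    if not resource_uri:
        return ("Error", "Null or invalid URI")
    if not deployment_id:
        return ("Error", "Null or invalid deploymentId")
    if not text:
        return ("Warning", "Text is empty")
    if token_count > MAX_TOKENS:
        return ("Error", "Text is larger than 8,000 tokens")
    return ("OK", "")


print(precheck("https://my-demo-openai-eastus.openai.azure.com/",
               "my-text-embedding-ada-002-model",
               "Microsoft released Windows 10.",
               token_count=6))
```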