Deploy a model for use with Azure AI Search

09/03/2024

This article describes how to use Azure Machine Learning to deploy a model for use with Azure AI Search.

Azure AI Search performs content processing over heterogeneous content to make it queryable by people or applications. This process can be enhanced by using a model deployed from Azure Machine Learning.

Azure Machine Learning can deploy a trained model as a web service. The web service is then embedded in an Azure AI Search "skill", which becomes part of the processing pipeline.

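Important

The information in this article is specific to the deployment of the model. It covers the supported deployment configurations that allow the model to be used by Azure AI Search.

For information on how to configure Azure AI Search to use the deployed model, see the tutorial "Build and deploy a custom skill with Azure Machine Learning".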

When you deploy a model for use with Azure AI Search, the deployment must meet the following requirements:

Use Azure Kubernetes Service to host the model for inference.
Enable Transport Layer Security (TLS) for the Azure Kubernetes Service. TLS secures the HTTPS communication between Azure AI Search and the deployed model.
The entry script must use the inference_schema package to generate an OpenAPI (Swagger) schema for the service.
The entry script must also accept JSON data as input and generate JSON as output.
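
As a rough illustration of the last requirement, the following sketch checks that a handler accepts a decoded JSON document and returns something that serializes back to JSON. The check_json_contract and echo_handler functions and the sample payload are hypothetical stand-ins and are not part of the Azure Machine Learning SDK.

```python
import json


def check_json_contract(handler, request_body):
    """Decode a JSON request, call the handler, and re-encode the result.

    Raises ValueError if the request is not valid JSON and TypeError if
    the handler's return value cannot be serialized to JSON.
    """
    payload = json.loads(request_body)   # JSON in
    result = handler(**payload)          # call the scoring function
    return json.dumps(result)            # JSON out


def echo_handler(raw_data):
    # Hypothetical stand-in for a real scoring function
    return {"length": len(raw_data["text"])}


body = '{"raw_data": {"text": "a sample input record"}}'
print(check_json_contract(echo_handler, body))
```

A handler that returns a non-serializable object, such as a custom class instance, fails this check with a TypeError before it ever reaches a cluster.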

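Prerequisites

An Azure Machine Learning workspace. For more information, see the article on creating workspace resources.
A Python development environment with the Azure Machine Learning SDK installed. For more information, see the article on Azure Machine Learning web services.
A registered model.
A general understanding of how and where to deploy models.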

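Connect to your workspace

An Azure Machine Learning workspace provides a centralized place to work with all the artifacts you create when you use Azure Machine Learning. The workspace keeps a history of all training jobs, including logs, metrics, output, and a snapshot of your scripts.

To connect to an existing workspace, use the following code:

Important

This code snippet expects the workspace configuration to be saved in the current directory or its parent. For more information, see the article on creating and managing Azure Machine Learning workspaces. For more information on saving the configuration to a file, see "Create a workspace configuration file".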

from azureml.core import Workspace

try:
    # Load the workspace configuration from the locally saved configuration file
    ws = Workspace.from_config()
    print(ws.name, ws.location, ws.resource_group, sep='\t')
    print('Library configuration succeeded')
except Exception as e:
    # Raised when no usable configuration file is found
    print('Workspace not found:', e)
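
To see which configuration file would be picked up before calling the SDK, a sketch like the following imitates the lookup with the standard library only. It assumes the file is named config.json, may sit in a .azureml subdirectory, and holds the keys subscription_id, resource_group, and workspace_name. The function name find_workspace_config is hypothetical, and the SDK's real search may differ in detail.

```python
import json
from pathlib import Path


def find_workspace_config(start="."):
    """Walk from `start` up through its parents looking for a config file."""
    start = Path(start).resolve()
    for folder in [start, *start.parents]:
        for candidate in (folder / ".azureml" / "config.json", folder / "config.json"):
            if not candidate.is_file():
                continue
            try:
                data = json.loads(candidate.read_text())
            except ValueError:
                continue  # not valid JSON, keep looking
            if isinstance(data, dict):
                return candidate, data
    return None, None


path, config = find_workspace_config()
if config is None:
    print("No workspace configuration file found")
else:
    missing = {"subscription_id", "resource_group", "workspace_name"} - config.keys()
    print(path, "missing keys:", sorted(missing))
```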

Create a Kubernetes cluster

Time estimate: approximately 20 minutes.

A Kubernetes cluster is a set of virtual machine instances (called "nodes") that are used to run containerized applications.

When you deploy a model from Azure Machine Learning to Azure Kubernetes Service, the model and all the assets needed to host it as a web service are packaged into a single Docker container. This container is then deployed to the cluster.

The following code shows how to create a new Azure Kubernetes Service (AKS) cluster for your workspace:

Tip

You can also attach an existing Azure Kubernetes Service cluster to your Azure Machine Learning workspace. For more information, see the article "How to deploy models to Azure Kubernetes Service".

Important

Notice that the code uses the enable_ssl() method to enable Transport Layer Security (TLS) for the cluster. This is required if you plan to use the deployed model from Azure AI Search.

from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

# Create the provisioning configuration with defaults
prov_config = AksCompute.provisioning_configuration()

# Enable TLS (sometimes called SSL) communications
# Leaf domain label generates a name using the formula
#  "<leaf-domain-label>######.<azure-region>.cloudapp.azure.com"
#  where "######" is a random series of characters
prov_config.enable_ssl(leaf_domain_label="contoso")

cluster_name = 'amlskills'
# Try to use an existing compute target by that name.
# If one doesn't exist, create one.
try:
    aks_target = ComputeTarget(ws, cluster_name)
    print("Attaching to existing cluster")
except ComputeTargetException:
    print("Creating new cluster")
    aks_target = ComputeTarget.create(workspace=ws,
                                      name=cluster_name,
                                      provisioning_configuration=prov_config)
    # Wait for the create process to complete
    aks_target.wait_for_completion(show_output=True)
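
The comments in the code above describe the host name that is generated from the leaf domain label. As a small sanity check after deployment, a sketch like the following tests whether a scoring URI uses HTTPS and fits that formula. The helper name, the sample URI, and the assumption that the random part consists of lowercase letters and digits are illustrative only.

```python
import re
from urllib.parse import urlparse


def matches_leaf_domain(uri, leaf_domain_label, region):
    """Return True if `uri` is HTTPS and its host fits
    <leaf-domain-label><random>.<region>.cloudapp.azure.com."""
    parsed = urlparse(uri)
    if parsed.scheme != "https" or parsed.hostname is None:
        return False
    pattern = r"{}[a-z0-9]+\.{}\.cloudapp\.azure\.com".format(
        re.escape(leaf_domain_label), re.escape(region))
    return re.fullmatch(pattern, parsed.hostname) is not None


# Hypothetical scoring URI, for illustration only
sample_uri = "https://contosoabc123.eastus.cloudapp.azure.com:443/api/v1/service/hotel-absa-v2/score"
print(matches_leaf_domain(sample_uri, "contoso", "eastus"))
```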

Important

Azure bills you for as long as the AKS cluster exists. Make sure to delete your AKS cluster when you're done with it.

For more information on using AKS with Azure Machine Learning, see the article on how to deploy to Azure Kubernetes Service.

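Write the entry script

The entry script receives data submitted to the web service, passes it to the model, and returns the scoring results. The following script loads the model on startup and then uses the model to score data. This file is sometimes called score.py.

Tip

The entry script is specific to your model. For example, the script must know the framework used by your model, the data formats, and so on.

Important

If you plan to use the deployed model from Azure AI Search, you must use the inference_schema package to enable schema generation for the deployment. This package provides decorators that let you define the input and output data formats of the web service that performs inference with the model.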

from azureml.core.model import Model
from nlp_architect.models.absa.inference.inference import SentimentInference
from spacy.cli.download import download as spacy_download
import traceback
import json
# Inference schema for schema discovery
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType

def init():
    """
    Set up the ABSA model for Inference  
    """
    global SentInference
    spacy_download('en')
    aspect_lex = Model.get_model_path('hotel_aspect_lex')
    opinion_lex = Model.get_model_path('hotel_opinion_lex') 
    SentInference = SentimentInference(aspect_lex, opinion_lex)

# Use inference schema decorators and sample input/output to
# build the OpenAPI (Swagger) schema for the deployment
standard_sample_input = {'text': 'a sample input record containing some text' }
standard_sample_output = {"sentiment": {"sentence": "This place makes false booking prices, when you get there, they say they do not have the reservation for that day.", 
                                        "terms": [{"text": "hotels", "type": "AS", "polarity": "POS", "score": 1.0, "start": 300, "len": 6}, 
                                                  {"text": "nice", "type": "OP", "polarity": "POS", "score": 1.0, "start": 295, "len": 4}]}}
@input_schema('raw_data', StandardPythonParameterType(standard_sample_input))
@output_schema(StandardPythonParameterType(standard_sample_output))    
def run(raw_data):
    try:
        # Get the value of the 'text' field from the JSON input and perform inference
        input_txt = raw_data["text"]
        doc = SentInference.run(doc=input_txt)
        if doc is None:
            return None
        sentences = doc._sentences
        result = {"sentence": doc._doc_text}
        terms = []
        for sentence in sentences:
            for event in sentence._events:
                for x in event:
                    term = {"text": x._text, "type":x._type.value, "polarity": x._polarity.value, "score": x._score,"start": x._start,"len": x._len }
                    terms.append(term)
        result["terms"] = terms
        print("Success!")
        # Return the results to the client as a JSON document
        return {"sentiment": result}
    except Exception as e:
        # Return the error message to the client as a JSON-serializable dict,
        # the same type that is returned on the success path
        print("Failure!")
        tb = traceback.format_exc()
        print(tb)
        return {"error": str(e), "tb": tb}

For more information on entry scripts, see the article on how and where to deploy models.
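
Before deploying, it can help to exercise the flattening logic of run() locally without loading the model. The sketch below rebuilds the nested loop over sentences, events, and terms using stand-in objects. The SimpleNamespace instances only mimic the private attributes that the script reads; they are not the real nlp_architect classes, and the helper names flatten_terms and fake_term are hypothetical.

```python
from types import SimpleNamespace as NS


def flatten_terms(doc):
    """Collect every term of every event of every sentence into one list."""
    terms = []
    for sentence in doc._sentences:
        for event in sentence._events:
            for x in event:
                terms.append({"text": x._text, "type": x._type.value,
                              "polarity": x._polarity.value, "score": x._score,
                              "start": x._start, "len": x._len})
    return {"sentiment": {"sentence": doc._doc_text, "terms": terms}}


def fake_term(text, kind, start):
    # Stand-in for a term object; the values are made up for this test
    return NS(_text=text, _type=NS(value=kind), _polarity=NS(value="POS"),
              _score=1.0, _start=start, _len=len(text))


text = "This is a nice place"
event = [fake_term("place", "AS", 15), fake_term("nice", "OP", 10)]
doc = NS(_doc_text=text, _sentences=[NS(_events=[event])])

result = flatten_terms(doc)
print([t["text"] for t in result["sentiment"]["terms"]])
```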

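Define the software environment

The Environment class is used to define the Python dependencies for the service. It includes the dependencies required by both the model and the entry script. In this example, packages are installed from the regular PyPI index and from a GitHub repository.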

from azureml.core.conda_dependencies import CondaDependencies 
from azureml.core import Environment

# Packages installed with pip, from the PyPI index and from a GitHub repository
pip = ["azureml-defaults", "azureml-monitoring",
       "git+https://github.com/NervanaSystems/nlp-architect.git@absa", 'nlp-architect', 'inference-schema',
       "spacy==2.0.18"]

# This example needs no conda packages
conda_deps = CondaDependencies.create(conda_packages=None, pip_packages=pip)

myenv = Environment(name='myenv')
myenv.python.conda_dependencies = conda_deps

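For more information on environments, see the article on creating and managing environments for training and deployment.

Define the deployment configuration

The deployment configuration defines the Azure Kubernetes Service hosting environment used to run the web service.

Tip

If you aren't sure about the memory, CPU, or GPU needs of your deployment, you can use profiling to find them. For more information, see the article on how and where to deploy models.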

from azureml.core.model import Model
from azureml.core.webservice import AksWebservice

# If deploying to a cluster configured for dev/test, ensure that it was created with enough
# cores and memory to handle this deployment configuration. Note that memory is also used by
# things such as dependencies and Azure Machine Learning components.

aks_config = AksWebservice.deploy_configuration(autoscale_enabled=True,
                                                autoscale_min_replicas=1,
                                                autoscale_max_replicas=3,
                                                autoscale_refresh_seconds=10,
                                                autoscale_target_utilization=70,
                                                auth_enabled=True,
                                                cpu_cores=1, memory_gb=2,
                                                scoring_timeout_ms=5000,
                                                replica_max_concurrent_requests=2,
                                                max_request_wait_time=5000)

For more information, see the reference documentation for AksWebservice.deploy_configuration.
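
To get a feel for how the autoscale settings interact, the following back-of-the-envelope sketch estimates how many replicas a given load would need and clamps the result to the configured minimum and maximum. The target request rate and the per-request time are made-up example inputs, the function name is hypothetical, and the formula is a rough sizing heuristic rather than the exact algorithm the autoscaler uses.

```python
from math import ceil


def estimate_replicas(target_rps, request_seconds, max_concurrent_per_replica,
                      target_utilization, min_replicas, max_replicas):
    """Estimate the replica count needed to keep utilization near the target."""
    # Requests in flight at once, inflated so replicas stay below 100% busy
    concurrent = target_rps * request_seconds / target_utilization
    needed = ceil(concurrent / max_concurrent_per_replica)
    # Keep the estimate inside the configured bounds
    return max(min_replicas, min(max_replicas, needed))


# Settings from the deployment configuration above, plus an assumed load
# of 4 requests per second that each take about 0.5 seconds to score
print(estimate_replicas(target_rps=4, request_seconds=0.5,
                        max_concurrent_per_replica=2,
                        target_utilization=0.70,
                        min_replicas=1, max_replicas=3))
```

If the estimate regularly hits the maximum, that is a hint to raise autoscale_max_replicas or to check that the cluster has enough cores and memory for more replicas.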

Define the inference configuration

The inference configuration points to the entry script and the environment object:

from azureml.core.model import InferenceConfig
inf_config = InferenceConfig(entry_script='score.py', environment=myenv)

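For more information, see the reference documentation for InferenceConfig.

Deploy the model

Deploy the model to your AKS cluster and wait for the service to be created. In this example, two registered models are loaded from the registry and deployed to AKS. After deployment, the score.py file included in the deployment loads these models and uses them to perform inference.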

from azureml.core.model import Model

# Look up the two registered models by name
c_aspect_lex = Model(ws, 'hotel_aspect_lex')
c_opinion_lex = Model(ws, 'hotel_opinion_lex')
service_name = "hotel-absa-v2"

aks_service = Model.deploy(workspace=ws,
                           name=service_name,
                           models=[c_aspect_lex, c_opinion_lex],
                           inference_config=inf_config,
                           deployment_config=aks_config,
                           deployment_target=aks_target,
                           overwrite=True)

aks_service.wait_for_deployment(show_output = True)
print(aks_service.state)

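For more information, see the reference documentation for Model.

Issue a sample query to your service

The following example uses the deployment information stored in the aks_service variable by the previous code section. It uses this variable to retrieve the scoring URL and the authentication key needed to communicate with the service: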

import requests
import json

# Retrieve the authentication keys of the deployed service
primary, secondary = aks_service.get_keys()

# Test data, serialized to a JSON request body
input_data = json.dumps({"raw_data": {"text": "This is a nice place for a relaxing evening out with friends. The owners seem pretty nice, too. I have been there a few times including last night. Recommend."}})

# Since authentication was enabled for the deployment, set the authorization header.
headers = {'Content-Type': 'application/json', 'Authorization': 'Bearer ' + primary}

# Send the request and display the results
resp = requests.post(aks_service.scoring_uri, data=input_data, headers=headers)
print(resp.text)

The result returned from the service is similar to the following JSON:

{"sentiment": {"sentence": "This is a nice place for a relaxing evening out with friends. The owners seem pretty nice, too. I have been there a few times including last night. Recommend.", "terms": [{"text": "place", "type": "AS", "polarity": "POS", "score": 1.0, "start": 15, "len": 5}, {"text": "nice", "type": "OP", "polarity": "POS", "score": 1.0, "start": 10, "len": 4}]}}
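
A client usually needs to pick the individual terms out of this response. The sketch below parses a response of the shape shown above, handles the error document that the entry script returns on failure, and uses each term's start and len fields to slice the matching text out of the sentence. Reading "AS" as an aspect term and "OP" as an opinion term is an assumption based on the sample data, and the function name extract_terms is hypothetical.

```python
import json


def extract_terms(response_text):
    """Return (aspects, opinions) found in a scoring response."""
    doc = json.loads(response_text)
    if isinstance(doc, str):
        # An error document may arrive as a JSON-encoded string
        doc = json.loads(doc)
    if "error" in doc:
        raise RuntimeError(doc["error"])
    sentence = doc["sentiment"]["sentence"]
    aspects, opinions = [], []
    for term in doc["sentiment"]["terms"]:
        # start/len locate the term inside the original sentence
        span = sentence[term["start"]:term["start"] + term["len"]]
        (aspects if term["type"] == "AS" else opinions).append(span)
    return aspects, opinions


sample = json.dumps({"sentiment": {
    "sentence": "This is a nice place for a relaxing evening out with friends.",
    "terms": [
        {"text": "place", "type": "AS", "polarity": "POS", "score": 1.0, "start": 15, "len": 5},
        {"text": "nice", "type": "OP", "polarity": "POS", "score": 1.0, "start": 10, "len": 4},
    ]}})
print(extract_terms(sample))
```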

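Connect to Azure AI Search

For information on using this model from Azure AI Search, see the tutorial "Build and deploy a custom skill with Azure Machine Learning".

Clean up the resources

If you created the AKS cluster specifically for this example, delete your resources after you finish testing with Azure AI Search.

Important

Azure bills you based on how long the AKS cluster is deployed. Make sure to clean it up when you're done with it.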

aks_service.delete()
aks_target.delete()

Next steps

Build and deploy a custom skill with Azure Machine Learning
