CLI (v2) feature set YAML schema

09/01/2024

Note

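The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. Schemas for older extension versions are available at https://azuremlschemasprod.azureedge.net/.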

YAML syntax

Key	Type	Description	Allowed values	Default value
$schema	string	The YAML schema. If you author the YAML file with the Azure Machine Learning VS Code extension, including $schema at the top of the file lets you invoke schema and resource completions.
name	string	Required. Feature set name.
version	string	Required. Feature set version.
description	string	Feature set description.
specification	object	Required. Feature set specification.
specification.path	string	Required. Path to the local feature set specification folder.
entities	list of strings	Required. The entities that this feature set is associated with.
stage	string	Feature set stage.	Development, Production, Archived	Development
tags	object	Dictionary of tags for the feature set.
materialization_settings	object	Materialization settings for the feature set.
materialization_settings.offline_enabled	boolean	Whether materialization of feature values to offline storage is enabled.	True, False
materialization_settings.schedule	object	The materialization schedule. See "CLI (v2) schedule YAML schema".
materialization_settings.schedule.frequency	string	Required if a schedule is configured. Enum that describes the frequency of the recurring schedule.	Day, Hour, Minute, Week, Month	Day
materialization_settings.schedule.interval	integer	Required if a schedule is configured. The interval between recurring jobs.
materialization_settings.schedule.time_zone	string	The time zone of the schedule trigger.		UTC
materialization_settings.schedule.start_time	string	The schedule trigger time.
materialization_settings.notification	object	Materialization notification settings.
materialization_settings.notification.email_on	list of strings	Required if notification is configured. An email notification is sent when the job status matches this setting.	JobFailed, JobCompleted, JobCancelled
materialization_settings.notification.emails	list of strings	Required if notification is configured. The email addresses that receive the notification.
materialization_settings.resource	object	The Azure Machine Learning Spark compute resource used for materialization jobs.
materialization_settings.resource.instance_type	string	The Azure Machine Learning Spark compute instance type.	Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E64s_v3. For an updated list of supported types, see "Interactive data wrangling with Apache Spark in Azure Machine Learning (preview)".
materialization_settings.spark_configuration	dictionary	Dictionary of Spark configuration.
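
The required keys and the allowed stage values in the table can be checked before a file is submitted. The following is a minimal Python sketch, assuming the YAML file has already been parsed into a dictionary. The helper name check_feature_set and the message wording are hypothetical, and the rules are transcribed from the table above rather than taken from the official JSON schema.

```python
# Sketch only: rules are transcribed from the table above, not from the official JSON schema.
REQUIRED_KEYS = ("name", "version", "specification", "entities")
ALLOWED_STAGES = {"Development", "Production", "Archived"}


def check_feature_set(definition: dict) -> list[str]:
    """Return a list of problems; an empty list means none were found (hypothetical helper)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in definition:
            problems.append(f"missing required key: {key}")

    # specification.path is marked as required in the table.
    spec = definition.get("specification")
    if isinstance(spec, dict) and "path" not in spec:
        problems.append("missing required key: specification.path")

    # entities is described as a list of strings.
    entities = definition.get("entities")
    if entities is not None and not (
        isinstance(entities, list) and all(isinstance(e, str) for e in entities)
    ):
        problems.append("entities must be a list of strings")

    # stage is optional; the table gives Development as the default.
    stage = definition.get("stage", "Development")
    if stage not in ALLOWED_STAGES:
        problems.append(f"unsupported stage: {stage}")
    return problems


basic = {
    "name": "transactions",
    "version": "1",
    "specification": {"path": "./spec"},
    "entities": ["azureml:account:1"],
    "stage": "Development",
}
print(check_feature_set(basic))
print(check_feature_set({"name": "transactions", "stage": "Beta"}))
```

A check like this only catches structural mistakes early. The service-side schema remains the authority on what is accepted.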

Remarks

The az ml feature-set command can be used to manage feature sets.

Examples

Examples are available in the examples GitHub repository. Several are shown below.

YAML: basic

$schema: http://azureml/sdk-2-0/Featureset.json

name: transactions
version: "1"
description: 7-day and 3-day rolling aggregation of transactions featureset
specification:
  path: ./spec # path to feature set specification folder. Can be local (absolute path or relative path to current location) or cloud uri. Contains FeatureSetSpec.yaml + transformation code
entities: # entities associated with this feature-set
  - azureml:account:1
stage: Development
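
The entity entry in the example above, azureml:account:1, appears to follow an azureml:<name>:<version> pattern. The Python sketch below splits such a reference into its parts. The pattern is inferred from this one example, and the EntityRef type and parse_entity_ref function are hypothetical names, not part of any Azure SDK.

```python
from typing import NamedTuple


class EntityRef(NamedTuple):
    """Name and version taken from an entity reference (hypothetical type)."""
    name: str
    version: str


def parse_entity_ref(ref: str) -> EntityRef:
    """Split a reference assumed to look like azureml:<name>:<version>."""
    prefix, first_sep, rest = ref.partition(":")
    # Split on the last colon so the version is always the final segment.
    name, last_sep, version = rest.rpartition(":")
    if prefix != "azureml" or not first_sep or not last_sep or not name or not version:
        raise ValueError(f"expected azureml:<name>:<version>, got {ref!r}")
    return EntityRef(name, version)


ref = parse_entity_ref("azureml:account:1")
print(ref.name, ref.version)
```

The version is kept as a string, which matches the quoted version: "1" in the example.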

YAML: with materialization configuration

name: transactions
version: "1"
description: 7-day and 3-day rolling aggregation of transactions featureset
specification:
  path: ./spec # path to feature set specification folder. Can be local (absolute path or relative path to current location) or cloud uri. Contains FeatureSetSpec.yaml + transformation code
entities: # entities associated with this feature-set
  - azureml:account:1
stage: Development
materialization_settings:
    offline_enabled: True
    schedule: # we use existing definition of schedule under job with some constraints. Recurrence pattern will not be supported.
        type: recurrence  # Only recurrence type would be supported
        frequency: Day # Only support Day and Hour
        interval: 1 #every day
        time_zone: "Pacific Standard Time"
    notification: 
        email_on:
        - JobFailed
        emails:
        - alice@microsoft.com

    resource:
        instance_type: Standard_E8S_V3
    spark_configuration:
        spark.driver.cores: 4
        spark.driver.memory: 36g
        spark.executor.cores: 4
        spark.executor.memory: 36g
        spark.executor.instances: 2
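
The conditional rules for the schedule and notification blocks can be sketched the same way. The Python snippet below assumes the materialization_settings block has already been parsed into a dictionary. It follows the allowed values listed in the syntax table, although the inline comment in the example above says only Day and Hour are supported for feature sets. The function name check_materialization is hypothetical, and the requirement that interval be at least 1 is an assumption, since the table only says it is an integer.

```python
# Sketch only: allowed values are copied from the syntax table in this document.
ALLOWED_FREQUENCIES = {"Day", "Hour", "Minute", "Week", "Month"}
ALLOWED_EMAIL_EVENTS = {"JobFailed", "JobCompleted", "JobCancelled"}


def check_materialization(settings: dict) -> list[str]:
    """Return a list of problems found in materialization_settings (hypothetical helper)."""
    problems = []

    schedule = settings.get("schedule")
    if schedule is not None:
        # frequency and interval are required once a schedule is configured.
        if schedule.get("frequency") not in ALLOWED_FREQUENCIES:
            problems.append("schedule.frequency must be one of the listed values")
        interval = schedule.get("interval")
        # bool is a subclass of int in Python, so reject it explicitly.
        # Requiring interval >= 1 is an assumption, not stated in the table.
        if isinstance(interval, bool) or not isinstance(interval, int) or interval < 1:
            problems.append("schedule.interval must be a positive integer")

    notification = settings.get("notification")
    if notification is not None:
        # email_on and emails are required once notification is configured.
        events = notification.get("email_on") or []
        if not events or any(e not in ALLOWED_EMAIL_EVENTS for e in events):
            problems.append("notification.email_on must list supported job states")
        if not notification.get("emails"):
            problems.append("notification.emails must list at least one address")

    return problems


example = {
    "offline_enabled": True,
    "schedule": {
        "type": "recurrence",
        "frequency": "Day",
        "interval": 1,
        "time_zone": "Pacific Standard Time",
    },
    "notification": {"email_on": ["JobFailed"], "emails": ["alice@microsoft.com"]},
}
print(check_materialization(example))
```

Settings without a schedule or notification block pass unchanged, because both blocks are optional in the table.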
