TextCatalog.ApplyWordEmbedding Method

Definition

Overloads

ApplyWordEmbedding(TransformsCatalog+TextTransforms, String, String, WordEmbeddingEstimator+PretrainedModelKind)

Creates a WordEmbeddingEstimator, a text featurizer that converts vectors of text into numerical vectors using a pretrained embedding model.

ApplyWordEmbedding(TransformsCatalog+TextTransforms, String, String, String)

Creates a WordEmbeddingEstimator, a text featurizer that converts vectors of text into numerical vectors using a pretrained embedding model loaded from a custom model file.

ApplyWordEmbedding(TransformsCatalog+TextTransforms, String, String, WordEmbeddingEstimator+PretrainedModelKind)

Creates a WordEmbeddingEstimator, a text featurizer that converts vectors of text into numerical vectors using a pretrained embedding model.

public static Microsoft.ML.Transforms.Text.WordEmbeddingEstimator ApplyWordEmbedding (this Microsoft.ML.TransformsCatalog.TextTransforms catalog, string outputColumnName, string inputColumnName = default, Microsoft.ML.Transforms.Text.WordEmbeddingEstimator.PretrainedModelKind modelKind = Microsoft.ML.Transforms.Text.WordEmbeddingEstimator.PretrainedModelKind.SentimentSpecificWordEmbedding);
static member ApplyWordEmbedding : Microsoft.ML.TransformsCatalog.TextTransforms * string * string * Microsoft.ML.Transforms.Text.WordEmbeddingEstimator.PretrainedModelKind -> Microsoft.ML.Transforms.Text.WordEmbeddingEstimator
<Extension()>
Public Function ApplyWordEmbedding (catalog As TransformsCatalog.TextTransforms, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional modelKind As WordEmbeddingEstimator.PretrainedModelKind = Microsoft.ML.Transforms.Text.WordEmbeddingEstimator.PretrainedModelKind.SentimentSpecificWordEmbedding) As WordEmbeddingEstimator

Parameters

catalog
TransformsCatalog.TextTransforms

The catalog of text-related transforms.

outputColumnName
String

Name of the column produced by transforming inputColumnName. The data type of this column is a vector of Single.

inputColumnName
String

Name of the column to transform. If set to null, the value of outputColumnName is used as the source. This estimator operates on a known-size vector of the text data type.

Returns

WordEmbeddingEstimator

Examples

using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Transforms.Text;

namespace Samples.Dynamic
{
    public static class ApplyWordEmbedding
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();

            // Create an empty list as the dataset. 'ApplyWordEmbedding' does not
            // require training data because the estimator it creates
            // ('WordEmbeddingEstimator') is not trainable. The empty list is
            // only needed to pass the input schema to the pipeline.
            var emptySamples = new List<TextData>();

            // Convert sample list to an empty IDataView.
            var emptyDataView = mlContext.Data.LoadFromEnumerable(emptySamples);

            // A pipeline for converting text into a 150-dimension embedding vector
            // using the pretrained 'SentimentSpecificWordEmbedding' model.
            // 'ApplyWordEmbedding' computes the minimum, average, and maximum of
            // the tokens' embedding vectors, dimension by dimension. Each token in
            // the 'SentimentSpecificWordEmbedding' model is represented as a
            // 50-dimension vector, so the output has 150 dimensions
            // [min, avg, max].
            //
            // The 'ApplyWordEmbedding' API requires a vector of text as input.
            // The pipeline first normalizes and tokenizes the text, then applies
            // the word embedding transformation.
            var textPipeline = mlContext.Transforms.Text.NormalizeText("Text")
                .Append(mlContext.Transforms.Text.TokenizeIntoWords("Tokens",
                    "Text"))
                .Append(mlContext.Transforms.Text.ApplyWordEmbedding("Features",
                    "Tokens", WordEmbeddingEstimator.PretrainedModelKind
                    .SentimentSpecificWordEmbedding));

            // Fit to data.
            var textTransformer = textPipeline.Fit(emptyDataView);

            // Create the prediction engine to get the embedding vector from the
            // input text/string.
            var predictionEngine = mlContext.Model.CreatePredictionEngine<TextData,
                TransformedTextData>(textTransformer);

            // Call the prediction API to convert the text into embedding vector.
            var data = new TextData()
            {
                Text = "This is a great product. I would " +
                "like to buy it again."
            };
            var prediction = predictionEngine.Predict(data);

            // Print the length of the embedding vector.
            Console.WriteLine($"Number of Features: {prediction.Features.Length}");

            // Print the embedding vector.
            Console.Write("Features: ");
            foreach (var f in prediction.Features)
                Console.Write($"{f:F4} ");

            //  Expected output:
            //   Number of Features: 150
            //   Features: -1.2489 0.2384 -1.3034 -0.9135 -3.4978 -0.1784 -1.3823 -0.3863 -2.5262 -0.8950 ...
        }

        private class TextData
        {
            public string Text { get; set; }
        }

        private class TransformedTextData : TextData
        {
            public float[] Features { get; set; }
        }
    }
}
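
The dimension count quoted in the sample's comments follows from pooling the token embeddings. As a sketch based only on those comments (the symbols n, d, and e_i are notation introduced here, not part of the API), suppose a row contains n tokens that are found in the model, each with an embedding e_i of dimension d. Assuming that only tokens present in the model contribute, the output concatenates the per-dimension minimum, average, and maximum:

```latex
\[
\text{Features} =
\Bigl[\, \min_{1 \le i \le n} e_i,\;\;
        \frac{1}{n} \sum_{i=1}^{n} e_i,\;\;
        \max_{1 \le i \le n} e_i \,\Bigr] \in \mathbb{R}^{3d},
\qquad d = 50 \;\Rightarrow\; 3d = 150 .
\]
```

Here min and max are taken componentwise, so each of the three parts has the same dimension d as a single token embedding.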

Applies to

ApplyWordEmbedding(TransformsCatalog+TextTransforms, String, String, String)

Creates a WordEmbeddingEstimator, a text featurizer that converts vectors of text into numerical vectors using a pretrained embedding model loaded from a custom model file.

public static Microsoft.ML.Transforms.Text.WordEmbeddingEstimator ApplyWordEmbedding (this Microsoft.ML.TransformsCatalog.TextTransforms catalog, string outputColumnName, string customModelFile, string inputColumnName = default);
static member ApplyWordEmbedding : Microsoft.ML.TransformsCatalog.TextTransforms * string * string * string -> Microsoft.ML.Transforms.Text.WordEmbeddingEstimator
<Extension()>
Public Function ApplyWordEmbedding (catalog As TransformsCatalog.TextTransforms, outputColumnName As String, customModelFile As String, Optional inputColumnName As String = Nothing) As WordEmbeddingEstimator

Parameters

catalog
TransformsCatalog.TextTransforms

The catalog of text-related transforms.

outputColumnName
String

Name of the column produced by transforming inputColumnName. The data type of this column is a vector of Single.

customModelFile
String

The path of the pretrained embedding model to use.

inputColumnName
String

Name of the column to transform. If set to null, the value of outputColumnName is used as the source. This estimator operates on a known-size vector of the text data type.

Returns

WordEmbeddingEstimator

Examples

using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML;

namespace Samples.Dynamic
{
    public static class ApplyCustomWordEmbedding
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();

            // Create an empty list as the dataset. 'ApplyWordEmbedding' does not
            // require training data because the estimator it creates
            // ('WordEmbeddingEstimator') is not trainable. The empty list is
            // only needed to pass the input schema to the pipeline.
            var emptySamples = new List<TextData>();

            // Convert sample list to an empty IDataView.
            var emptyDataView = mlContext.Data.LoadFromEnumerable(emptySamples);

            // Write a custom 3-dimensional word embedding model with 4 words.
            // Each line follows the '<word> <float> <float> <float>' pattern.
            // Lines that do not conform to the pattern are ignored.
            var pathToCustomModel = @".\custommodel.txt";
            using (StreamWriter file = new StreamWriter(pathToCustomModel, false))
            {
                file.WriteLine("great 1.0 2.0 3.0");
                file.WriteLine("product -1.0 -2.0 -3.0");
                file.WriteLine("like -1 100.0 -100");
                file.WriteLine("buy 0 0 20");
            }

            // A pipeline for converting text into a 9-dimension word embedding
            // vector using the custom word embedding model.
            // 'ApplyWordEmbedding' computes the minimum, average, and maximum of
            // the tokens' embedding vectors, dimension by dimension. Each token in
            // the 'custommodel.txt' model is represented as a 3-dimension vector,
            // so the output has 9 dimensions [min, avg, max].
            //
            // The 'ApplyWordEmbedding' API requires a vector of text as input.
            // The pipeline first normalizes and tokenizes the text, then applies
            // the word embedding transformation.
            var textPipeline = mlContext.Transforms.Text.NormalizeText("Text")
                .Append(mlContext.Transforms.Text.TokenizeIntoWords("Tokens",
                    "Text"))
                .Append(mlContext.Transforms.Text.ApplyWordEmbedding("Features",
                    pathToCustomModel, "Tokens"));

            // Fit to data.
            var textTransformer = textPipeline.Fit(emptyDataView);

            // Create the prediction engine to get the embedding vector from the
            // input text/string.
            var predictionEngine = mlContext.Model.CreatePredictionEngine<TextData,
                TransformedTextData>(textTransformer);

            // Call the prediction API to convert the text into embedding vector.
            var data = new TextData()
            {
                Text = "This is a great product. I would " +
                "like to buy it again."
            };
            var prediction = predictionEngine.Predict(data);

            // Print the length of the embedding vector.
            Console.WriteLine($"Number of Features: {prediction.Features.Length}");

            // Print the embedding vector.
            Console.Write("Features: ");
            foreach (var f in prediction.Features)
                Console.Write($"{f:F4} ");

            //  Expected output:
            //   Number of Features: 9
            //   Features: -1.0000 0.0000 -100.0000 0.0000 34.0000 -25.6667 1.0000 100.0000 20.0000
        }

        private class TextData
        {
            public string Text { get; set; }
        }

        private class TransformedTextData : TextData
        {
            public float[] Features { get; set; }
        }
    }
}
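
The expected output of this sample can be checked by hand. The printed values are consistent with three of the four model words being matched in the input text: great, like, and buy. A plausible reason, inferred from the numbers and not stated by the sample, is that tokenization leaves "product." with its trailing period, so it does not match the model entry product. Under that assumption, taking the minimum, average, and maximum per dimension gives:

```latex
\begin{align*}
e_{\text{great}} &= (1,\ 2,\ 3), \qquad
e_{\text{like}} = (-1,\ 100,\ -100), \qquad
e_{\text{buy}} = (0,\ 0,\ 20) \\[4pt]
\min &= (-1,\ 0,\ -100) \\
\text{avg} &= \left( \tfrac{1 - 1 + 0}{3},\ \tfrac{2 + 100 + 0}{3},\ \tfrac{3 - 100 + 20}{3} \right)
            \approx (0,\ 34,\ -25.6667) \\
\max &= (1,\ 100,\ 20)
\end{align*}
```

Concatenating the three rows in the order [min, avg, max] reproduces the nine values listed in the sample's expected output.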

Applies to