TextCatalog.RemoveDefaultStopWords Method
Definition
Important
Some information relates to a prerelease product that may be substantially modified before it is released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Creates a StopWordsRemovingEstimator, which copies the data from the column specified in inputColumnName to a new column, outputColumnName, and removes from it a predefined set of stop words specific to language.
public static Microsoft.ML.Transforms.Text.StopWordsRemovingEstimator RemoveDefaultStopWords (this Microsoft.ML.TransformsCatalog.TextTransforms catalog, string outputColumnName, string inputColumnName = default, Microsoft.ML.Transforms.Text.StopWordsRemovingEstimator.Language language = Microsoft.ML.Transforms.Text.StopWordsRemovingEstimator.Language.English);
static member RemoveDefaultStopWords : Microsoft.ML.TransformsCatalog.TextTransforms * string * string * Microsoft.ML.Transforms.Text.StopWordsRemovingEstimator.Language -> Microsoft.ML.Transforms.Text.StopWordsRemovingEstimator
<Extension()>
Public Function RemoveDefaultStopWords (catalog As TransformsCatalog.TextTransforms, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional language As StopWordsRemovingEstimator.Language = Microsoft.ML.Transforms.Text.StopWordsRemovingEstimator.Language.English) As StopWordsRemovingEstimator
Parameters
- catalog
- TransformsCatalog.TextTransforms
The transform's catalog.
- outputColumnName
- String
Name of the column resulting from the transformation of inputColumnName.
This column's data type will be a variable-size vector of text.
- inputColumnName
- String
Name of the column to copy the data from. This estimator operates over a vector of text.
- language
- StopWordsRemovingEstimator.Language
Language of the input text column inputColumnName.
Returns
StopWordsRemovingEstimator
Examples
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Transforms.Text;

namespace Samples.Dynamic
{
    public static class RemoveDefaultStopWords
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();

            // Create an empty list as the dataset. The 'RemoveDefaultStopWords'
            // does not require training data as the estimator
            // ('StopWordsRemovingEstimator') created by 'RemoveDefaultStopWords'
            // API is not a trainable estimator. The empty list is only needed to
            // pass input schema to the pipeline.
            var emptySamples = new List<TextData>();

            // Convert sample list to an empty IDataView.
            var emptyDataView = mlContext.Data.LoadFromEnumerable(emptySamples);

            // A pipeline for removing stop words from input text/string.
            // The pipeline first tokenizes text into words then removes stop words.
            // The 'RemoveDefaultStopWords' API ignores casing of the text/string
            // e.g. 'tHe' and 'the' are considered the same stop words.
            var textPipeline = mlContext.Transforms.Text.TokenizeIntoWords("Words",
                "Text")
                .Append(mlContext.Transforms.Text.RemoveDefaultStopWords(
                "WordsWithoutStopWords", "Words", language:
                StopWordsRemovingEstimator.Language.English));

            // Fit to data.
            var textTransformer = textPipeline.Fit(emptyDataView);

            // Create the prediction engine to remove the stop words from the input
            // text/string.
            var predictionEngine = mlContext.Model.CreatePredictionEngine<TextData,
                TransformedTextData>(textTransformer);

            // Call the prediction API to remove stop words.
            var data = new TextData()
            {
                Text = "ML.NET's RemoveDefaultStopWords " +
                "API removes stop words from tHe text/string. It requires the " +
                "text/string to be tokenized beforehand."
            };

            var prediction = predictionEngine.Predict(data);

            // Print the length of the word vector after the stop words are removed.
            Console.WriteLine("Number of words: " + prediction.WordsWithoutStopWords
                .Length);

            // Print the word vector without stop words.
            Console.WriteLine("\nWords without stop words: " + string.Join(",",
                prediction.WordsWithoutStopWords));

            // Expected output:
            // Number of words: 11
            // Words without stop words: ML.NET's,RemoveDefaultStopWords,API,removes,stop,words,text/string.,requires,text/string,tokenized,beforehand.
        }

        private class TextData
        {
            public string Text { get; set; }
        }

        private class TransformedTextData : TextData
        {
            public string[] WordsWithoutStopWords { get; set; }
        }
    }
}
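
The sample above tokenizes the text first and then drops stop words without regard to casing. As a rough, library-free illustration of that idea, the following sketch filters space-separated tokens against a case-insensitive set. It is not ML.NET's implementation: the names StopWordSketch and RemoveStopWords and the five-word stop list are hypothetical and exist only for this illustration, whereas ML.NET ships its own predefined list per language.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopWordSketch
{
    // Hypothetical, deliberately tiny stop-word list (assumption for this
    // sketch only). The comparer makes lookups case-insensitive, so "tHe"
    // and "the" match the same entry.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(
            new[] { "the", "from", "it", "to", "be" },
            StringComparer.OrdinalIgnoreCase);

    // Splits the text on spaces, then keeps only the tokens that are not
    // in the stop-word set. Punctuation stays attached to its token.
    public static string[] RemoveStopWords(string text)
    {
        return text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => !StopWords.Contains(token))
            .ToArray();
    }

    public static void Main()
    {
        var kept = RemoveStopWords("Remove stop words from tHe text");

        // Prints: Remove,stop,words,text
        Console.WriteLine(string.Join(",", kept));
    }
}
```

Because the sketch splits only on spaces, a token such as "the." would not match "the". This is consistent with the expected output of the sample, where "text/string." keeps its trailing period.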