TorchSharpCatalog.SentenceSimilarity Method

Definition

Overloads

SentenceSimilarity(RegressionCatalog+RegressionTrainers, SentenceSimilarityTrainer+SentenceSimilarityOptions)

Fine-tunes a NAS-BERT model for NLP sentence similarity. Each sentence is limited to 512 tokens. Each word typically maps to a single token, and two special tokens (a start token and a separator token) are added automatically, so in practice the limit is about 510 words per sentence.

SentenceSimilarity(RegressionCatalog+RegressionTrainers, String, String, String, String, Int32, Int32, BertArchitecture, IDataView)

Fine-tunes a NAS-BERT model for NLP sentence similarity. Each sentence is limited to 512 tokens. Each word typically maps to a single token, and two special tokens (a start token and a separator token) are added automatically, so in practice the limit is about 510 words per sentence.

SentenceSimilarity(RegressionCatalog+RegressionTrainers, SentenceSimilarityTrainer+SentenceSimilarityOptions)

Fine-tunes a NAS-BERT model for NLP sentence similarity. Each sentence is limited to 512 tokens. Each word typically maps to a single token, and two special tokens (a start token and a separator token) are added automatically, so in practice the limit is about 510 words per sentence.

public static Microsoft.ML.TorchSharp.NasBert.SentenceSimilarityTrainer SentenceSimilarity (this Microsoft.ML.RegressionCatalog.RegressionTrainers catalog, Microsoft.ML.TorchSharp.NasBert.SentenceSimilarityTrainer.SentenceSimilarityOptions options);
static member SentenceSimilarity : Microsoft.ML.RegressionCatalog.RegressionTrainers * Microsoft.ML.TorchSharp.NasBert.SentenceSimilarityTrainer.SentenceSimilarityOptions -> Microsoft.ML.TorchSharp.NasBert.SentenceSimilarityTrainer
<Extension()>
Public Function SentenceSimilarity (catalog As RegressionCatalog.RegressionTrainers, options As SentenceSimilarityTrainer.SentenceSimilarityOptions) As SentenceSimilarityTrainer

Parameters

catalog
RegressionCatalog.RegressionTrainers

The transform's catalog.

options
SentenceSimilarityTrainer.SentenceSimilarityOptions

The options used to configure the trainer.

Returns

SentenceSimilarityTrainer


SentenceSimilarity(RegressionCatalog+RegressionTrainers, String, String, String, String, Int32, Int32, BertArchitecture, IDataView)

Fine-tunes a NAS-BERT model for NLP sentence similarity. Each sentence is limited to 512 tokens. Each word typically maps to a single token, and two special tokens (a start token and a separator token) are added automatically, so in practice the limit is about 510 words per sentence.

public static Microsoft.ML.TorchSharp.NasBert.SentenceSimilarityTrainer SentenceSimilarity (this Microsoft.ML.RegressionCatalog.RegressionTrainers catalog, string labelColumnName = "Label", string scoreColumnName = "Score", string sentence1ColumnName = "Sentence1", string sentence2ColumnName = "Sentence2", int batchSize = 32, int maxEpochs = 10, Microsoft.ML.TorchSharp.NasBert.BertArchitecture architecture = Microsoft.ML.TorchSharp.NasBert.BertArchitecture.Roberta, Microsoft.ML.IDataView validationSet = default);
static member SentenceSimilarity : Microsoft.ML.RegressionCatalog.RegressionTrainers * string * string * string * string * int * int * Microsoft.ML.TorchSharp.NasBert.BertArchitecture * Microsoft.ML.IDataView -> Microsoft.ML.TorchSharp.NasBert.SentenceSimilarityTrainer
<Extension()>
Public Function SentenceSimilarity (catalog As RegressionCatalog.RegressionTrainers, Optional labelColumnName As String = "Label", Optional scoreColumnName As String = "Score", Optional sentence1ColumnName As String = "Sentence1", Optional sentence2ColumnName As String = "Sentence2", Optional batchSize As Integer = 32, Optional maxEpochs As Integer = 10, Optional architecture As BertArchitecture = Microsoft.ML.TorchSharp.NasBert.BertArchitecture.Roberta, Optional validationSet As IDataView = Nothing) As SentenceSimilarityTrainer

Parameters

catalog
RegressionCatalog.RegressionTrainers

The transform's catalog.

labelColumnName
String

Name of the label column. The column must be of type float.

scoreColumnName
String

Name of the score column.

sentence1ColumnName
String

Name of the column for the first sentence.

sentence2ColumnName
String

Name of the column for the second sentence, which is compared against the first.

batchSize
Int32

Number of rows in the batch.

maxEpochs
Int32

Maximum number of passes over the training set.

architecture
BertArchitecture

Architecture for the model. Defaults to Roberta.

validationSet
IDataView

The validation set used while training to improve model quality.
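
The following is a minimal sketch of how this overload might be called, not an official sample. The SentencePair class and the sample rows are hypothetical and exist only for illustration. The sketch assumes the Microsoft.ML and Microsoft.ML.TorchSharp packages are referenced, along with a native TorchSharp runtime package for your platform. The column names and the batchSize and maxEpochs values simply repeat the defaults from the signature above.

```csharp
using Microsoft.ML;
using Microsoft.ML.TorchSharp;

// Hypothetical row type; property names match the default column names.
public class SentencePair
{
    public string Sentence1 { get; set; }
    public string Sentence2 { get; set; }
    public float Label { get; set; }   // similarity score; must be a float column
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 1);

        // Made-up sample rows; a real training set would be far larger.
        var samples = new[]
        {
            new SentencePair { Sentence1 = "A man is playing a guitar.", Sentence2 = "A person plays a guitar.", Label = 4.5f },
            new SentencePair { Sentence1 = "A dog runs in the park.", Sentence2 = "The stock market fell today.", Label = 0.2f },
        };
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(samples);

        // Every argument after the catalog is optional; defaults are spelled out for clarity.
        var trainer = mlContext.Regression.Trainers.SentenceSimilarity(
            labelColumnName: "Label",
            scoreColumnName: "Score",
            sentence1ColumnName: "Sentence1",
            sentence2ColumnName: "Sentence2",
            batchSize: 32,
            maxEpochs: 10);

        // Fit fine-tunes the model; Transform appends the score column.
        var model = trainer.Fit(trainingData);
        IDataView scored = model.Transform(trainingData);
    }
}
```

Because each sentence is limited to 512 tokens, longer inputs may need to be shortened before training.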

Returns

SentenceSimilarityTrainer
