TransformExtensionsCatalog.Concatenate Method

Definition

Namespace:: Microsoft.ML

Assembly:: Microsoft.ML.Data.dll

Package:: Microsoft.ML v1.0.0, v1.1.0, v1.2.0, v1.3.1, v1.4.0, v1.5.5, v1.6.0, v1.7.0, v2.0.0, v3.0.1

Creates a ColumnConcatenatingEstimator, which concatenates one or more input columns into a new output column.

public static Microsoft.ML.Transforms.ColumnConcatenatingEstimator Concatenate (this Microsoft.ML.TransformsCatalog catalog, string outputColumnName, params string[] inputColumnNames);

static member Concatenate : Microsoft.ML.TransformsCatalog * string * string[] -> Microsoft.ML.Transforms.ColumnConcatenatingEstimator

<Extension()>
Public Function Concatenate (catalog As TransformsCatalog, outputColumnName As String, ParamArray inputColumnNames As String()) As ColumnConcatenatingEstimator

Parameters

catalog: TransformsCatalog

The transform's catalog.

outputColumnName: String

Name of the column resulting from the transformation of inputColumnNames. This column's data type will be a vector of the input columns' data type.

inputColumnNames: String[]

Names of the columns to concatenate. This estimator operates over any data type except key type. If more than one column is provided, they must all have the same data type.

Returns

ColumnConcatenatingEstimator

Examples

using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic
{
    public static class Concatenate
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();

            // Create a small dataset as an IEnumerable.
            var samples = new List<InputData>()
            {
                new InputData(){ Feature1 = 0.1f, Feature2 = new[]{ 1.1f, 2.1f,
                    3.1f }, Feature3 = 1 },

                new InputData(){ Feature1 = 0.2f, Feature2 = new[]{ 1.2f, 2.2f,
                    3.2f }, Feature3 = 2 },

                new InputData(){ Feature1 = 0.3f, Feature2 = new[]{ 1.3f, 2.3f,
                    3.3f }, Feature3 = 3 },

                new InputData(){ Feature1 = 0.4f, Feature2 = new[]{ 1.4f, 2.4f,
                    3.4f }, Feature3 = 4 },

                new InputData(){ Feature1 = 0.5f, Feature2 = new[]{ 1.5f, 2.5f,
                    3.5f }, Feature3 = 5 },

                new InputData(){ Feature1 = 0.6f, Feature2 = new[]{ 1.6f, 2.6f,
                    3.6f }, Feature3 = 6 },
            };

            // Convert training data to IDataView.
            var dataview = mlContext.Data.LoadFromEnumerable(samples);

            // A pipeline for concatenating the "Feature1", "Feature2" and
            // "Feature3" columns together into a vector that will be the Features
            // column. Concatenation is necessary because trainers take feature
            // vectors as inputs.
            //
            // Note that the "Feature3" column is converted from Int32 to
            // Single (float) using ConvertType, because Concatenate requires
            // all input columns to be of the same type.
            var pipeline = mlContext.Transforms.Conversion.ConvertType("Feature3",
                outputKind: DataKind.Single)
                .Append(mlContext.Transforms.Concatenate("Features", new[]
                    { "Feature1", "Feature2", "Feature3" }));

            // The transformed data.
            var transformedData = pipeline.Fit(dataview).Transform(dataview);

            // Now let's take a look at what this concatenation did.
            // We can extract the newly created column as an IEnumerable of
            // TransformedData.
            var featuresColumn = mlContext.Data.CreateEnumerable<TransformedData>(
                transformedData, reuseRowObject: false);

            // Write out every row of the concatenated column.
            Console.WriteLine("Features column obtained post-transformation.");
            foreach (var featureRow in featuresColumn)
                Console.WriteLine(string.Join(" ", featureRow.Features));

            // Expected output:
            //  Features column obtained post-transformation.
            //  0.1 1.1 2.1 3.1 1
            //  0.2 1.2 2.2 3.2 2
            //  0.3 1.3 2.3 3.3 3
            //  0.4 1.4 2.4 3.4 4
            //  0.5 1.5 2.5 3.5 5
            //  0.6 1.6 2.6 3.6 6
        }

        private class InputData
        {
            public float Feature1;
            [VectorType(3)]
            public float[] Feature2;
            public int Feature3;
        }

        private sealed class TransformedData
        {
            public float[] Features { get; set; }
        }
    }
}

Applies to