Tour of “Cloud Numerics” Linear Algebra Library
Introduction
The LinearAlgebra namespace in “Cloud Numerics” is divided into 3 main classes:
- Microsoft.Numerics.LinearAlgebra.Decompositions
- Microsoft.Numerics.LinearAlgebra.LinearSolvers
- Microsoft.Numerics.LinearAlgebra.Operations
In this post we’ll look at examples from each of the classes, and how they can be combined to implement interesting algorithms on matrices.
Operations
Let’s kick off the tour with an example using methods from the Operations class. We compute the largest eigenvector of a matrix by power iteration. We use Operations.MatrixMultiply to iterate the result, and Operations.Norm, first to scale the result to unit length, and then to check the convergence
using System;
using System.Text;
using Microsoft.Numerics;
using msnl = Microsoft.Numerics.Local;
using msnd = Microsoft.Numerics.Distributed;
using Microsoft.Numerics.Statistics;
using Microsoft.Numerics.LinearAlgebra;
namespace MSCloudNumericsApp
{
public class Program
{
static void LargestEigenVector()
{
long size = 20;
var D = ProbabilityDistributions.Uniform(0d, 1d, new long[] { size, size });
var x = ProbabilityDistributions.Uniform(0d, 1d, new long[] { size,1 });
Microsoft.Numerics.Local.NumericDenseArray<double> xnew;
int maxiter = 20;
double tol = 1E-6;
int iter = 0;
double norm = 1.0;
while (iter < maxiter && norm > tol)
{
iter++;
xnew = Operations.MatrixMultiply(D, x);
xnew = xnew / Operations.Norm(xnew,NormType.TwoNorm);
norm = Operations.Norm(x - xnew, NormType.InfinityNorm);
x = xnew;
Console.WriteLine("Iteration {0}:\t Norm of difference: {1}", iter, norm);
}
Console.WriteLine("Largest eigenvector: {0}", x);
}
}
}
For normalization we use the L2 norm, by passing the NormType.TwoNorm argument value.
For exact eigenvalues, the identity A*x=e*x should hold. We check this by comparing x and xnew: at convergence they should be close to one another. We perform the comparison by using NormType.InfinityNorm that gives the largest magnitude element-wise difference.
The algorithm should reach convergence after about 8 iterations.
Matrix decompositions
The decompositions, broadly speaking, fall into two categories
- Decompositions that facilitate solving systems of linear equations, for example LU and Cholesky
- Decompositions aimed at analyzing the spectrum of a matrix, for example EigenSolve and Svd
The workflow for category 1 involves computing the decomposition and passing the result object into the appropriate linear solver function. For example, one can use Cholesky decomposition as
var choleskyDecomposition = Decompositions.Cholesky(A, UpperLower.UpperTriangle);
x = LinearSolvers.Cholesky(choleskyDecomposition, b);
Because this category of methods is closely related to linear solvers, we’ll take a closer look in next section “Linear solvers”
The methods from category 2 can be used to represent a matrix in a form that can be analyzed more easily. Let’s try an example: singular value decomposition to compute a low-rank approximation of a matrix. In the following code, we decompose the matrix and then reconstruct it, first by keeping only the largest singular value, and then adding second largest value, and so forth.
static void TestSvdConvergence()
{
long rows = 20;
long columns = 10;
var D = ProbabilityDistributions.Uniform(0d, 1d, new long[] { rows, columns });
var svdResult = Decompositions.SvdThin(D);
var approxS = msnl.NumericDenseArrayFactory.Zeros<double>(new long[] { columns, columns });
for (int i = 0; i < columns; i++)
{
approxS[i, i] = svdResult.S[i];
var reconstructedD = Operations.MatrixMultiply(svdResult.U, Operations.MatrixMultiply(approxS, svdResult.V));
double norm = Operations.Norm(D - reconstructedD, NormType.FrobeniusNorm);
Console.WriteLine("Frobenius norm of difference with {0} singular values {1} ", i + 1, norm);
}
}
We compare the reconstructed matrix to the original one by computing the Frobenius norm of the difference. As we increase the number of singular values, the norm gets smaller and smaller as the reconstructed matrix gets closer and closer to the original one. For all 10 singular values the result should be exactly the same, apart from small numerical noise.
Note that we are using the Decompositions.SvdThin method. This is more memory-efficient than Svd for non-square matrices. To give an example, if your data was a tall-and-thin 1 million by 100 matrix of doubles, the resulting U matrix would be a huge million by million array that consumes 8 terabytes of memory, with 999900 rows filled with zeros. However, SvdThin keeps only the 100 non-zero rows, producing a much more manageable 0.8 gigabyte array.
Linear solvers
There are two kinds of operations: general-purpose ones and specialized ones that can be much faster if your data has special symmetries. Consider, for example, Cholesky decomposition. If your data happens to be a symmetric positive-definite matrix, it’s much faster to compute the Cholesky decomposition first, and then use the LinearSolvers.Cholesky method to solve the linear system of equations, compared to using the general-purpose LinearSolvers.General method.
There’s another benefit to using specialized methods. Consider what LinearSolvers.General does under the covers: it first computes a matrix factorization internally and then solves the system of linear equations against the factorized matrix. Often, this internal factorization step is the expensive part of the computation. The factorization only depends on matrix A. If you have to compute linear solve many times against the same matrix - for example, in iteration A*x2 = f(x1), A*x3 = f(x2) - it would be nice to be able to re-use the same factorization. Using the specialized method allows you to do exactly that: pre-compute the Cholesky decomposition once and then re-use it as input to LinearSolvers.Cholesky.
Let’s look at an example to compare the performance of the two approaches. We’ll first need to generate a matrix that is guaranteed to be symmetric and positive-definite. First, we matrix multiply an n-by-1 and a 1-by-n random number array to create an n-by-n distributed matrix: this is an efficient way to create a large distributed matrix. Second, we add together the matrix and its transpose to create a symmetric matrix. Finally, we increase the values on the diagonal to make the matrix positive-definite: this follows from the Gershgorin circle theorem.
static msnd.NumericDenseArray<double> CreatePositiveDefiniteSymmetricMatrix(long n)
{
msnd.NumericDenseArray<double> columnVector = ProbabilityDistributions.Uniform(0.1d, 0.9d, new long[] { n, 1 });
msnd.NumericDenseArray<double> rowVector = ProbabilityDistributions.Uniform(0.1d, 0.9d, new long[] { 1, n });
msnd.NumericDenseArray<double> diagVector = ProbabilityDistributions.Uniform(2.0d, 4.0d, new long[] { n });
var A = Operations.MatrixMultiply(columnVector, rowVector);
// Make matrix symmetric
A = A + A.Transpose();
// Increase the diagonal values to make sure the matrix is positive-definite
for (long i = 0; i < n; i++)
{
A[i, i] = n * diagVector[i];
}
return A;
}
We’ll time 5 repeated computations of LinearSolve.General.
static void TimeLinearSolversGeneral(int nIter, msnd.NumericDenseArray<double> A, msnd.NumericDenseArray<double> b, msnd.NumericDenseArray<double> delta)
{
msnd.NumericDenseArray<double> x;
DateTime t0 = DateTime.Now;
for (int i = 0; i < nIter; i++)
{
x = LinearSolvers.General(A, b);
x = x + delta;
}
Console.WriteLine("General linear solve time: {0} seconds ", (DateTime.Now - t0).TotalMilliseconds / 1000.0);
}
Next, we perform the same computation using Cholesky decomposition: we store the decomposition and re-use it as input to the LinearSolvers.Cholesky method. Note that because the matrix is assumed symmetric, the Cholesky method only uses values in its upper or lower triangle, as specified by the UpperLower argument.
static void TimeLinearSolversCholesky(int nIter, msnd.NumericDenseArray<double> A, msnd.NumericDenseArray<double> b, Microsoft.Numerics.Distributed.NumericDenseArray<double> delta)
{
msnd.NumericDenseArray<double> x;
DateTime t1 = DateTime.Now;
var choleskyDecomposition = Decompositions.Cholesky(A, UpperLower.UpperTriangle);
x = LinearSolvers.Cholesky(choleskyDecomposition, b);
for (int i = 0; i < nIter; i++)
{
x = LinearSolvers.Cholesky(choleskyDecomposition, b);
x = x + delta;
}
Console.WriteLine("Linear solve time using Cholesky: {0} seconds ", (DateTime.Now - t1).TotalMilliseconds / 1000.0);
}
The Cholesky code should be 6-7 times faster.
Examples as single code
This concludes the our tour of the LinearAlgebra namespace. To try the examples yourself, you can create a Cloud Numerics application and copy-paste the code below:
using System;
using System.Text;
using Microsoft.Numerics;
using msnl = Microsoft.Numerics.Local;
using msnd = Microsoft.Numerics.Distributed;
using Microsoft.Numerics.Statistics;
using Microsoft.Numerics.LinearAlgebra;
namespace MSCloudNumericsApp
{
public class Program
{
static void LargestEigenVector()
{
long size = 20;
var D = ProbabilityDistributions.Uniform(0d, 1d, new long[] { size, size });
var x = ProbabilityDistributions.Uniform(0d, 1d, new long[] { size,1 });
Microsoft.Numerics.Local.NumericDenseArray<double> xnew;
int maxiter = 20;
double tol = 1E-6;
int iter = 0;
double norm = 1.0;
while (iter < maxiter && norm > tol)
{
iter++;
xnew = Operations.MatrixMultiply(D, x);
xnew = xnew / Operations.Norm(xnew,NormType.TwoNorm);
norm = Operations.Norm(x - xnew, NormType.InfinityNorm);
x = xnew;
Console.WriteLine("Iteration {0}:\t Norm of difference: {1}", iter, norm);
}
Console.WriteLine("Largest eigenvector: {0}", x);
}
static void TestSvdConvergence()
{
long rows = 20;
long columns = 10;
var D = ProbabilityDistributions.Uniform(0d, 1d, new long[] { rows, columns });
var svdResult = Decompositions.SvdThin(D);
var approxS = msnl.NumericDenseArrayFactory.Zeros<double>(new long[] { columns, columns });
for (int i = 0; i < columns; i++)
{
approxS[i, i] = svdResult.S[i];
var reconstructedD = Operations.MatrixMultiply(svdResult.U, Operations.MatrixMultiply(approxS, svdResult.V));
double norm = Operations.Norm(D - reconstructedD, NormType.FrobeniusNorm);
Console.WriteLine("Frobenius norm of difference with {0} singular values {1} ", i + 1, norm);
}
}
static msnd.NumericDenseArray<double> CreatePositiveDefiniteSymmetricMatrix(long n)
{
msnd.NumericDenseArray<double> columnVector = ProbabilityDistributions.Uniform(0.1d, 0.9d, new long[] { n, 1 });
msnd.NumericDenseArray<double> rowVector = ProbabilityDistributions.Uniform(0.1d, 0.9d, new long[] { 1, n });
msnd.NumericDenseArray<double> diagVector = ProbabilityDistributions.Uniform(2.0d, 4.0d, new long[] { n });
var A = Operations.MatrixMultiply(columnVector, rowVector);
// Make matrix symmetric
A = A + A.Transpose();
// Increase the diagonal values to make sure the matrix is positive-definite
for (long i = 0; i < n; i++)
{
A[i, i] = n * diagVector[i];
}
return A;
}
static void TimeLinearSolversGeneral(int nIter, msnd.NumericDenseArray<double> A, msnd.NumericDenseArray<double> b, msnd.NumericDenseArray<double> delta)
{
msnd.NumericDenseArray<double> x;
DateTime t0 = DateTime.Now;
for (int i = 0; i < nIter; i++)
{
x = LinearSolvers.General(A, b);
x = x + delta;
}
Console.WriteLine("General linear solve time: {0} seconds ", (DateTime.Now - t0).TotalMilliseconds / 1000.0);
}
static void TimeLinearSolversCholesky(int nIter, msnd.NumericDenseArray<double> A, msnd.NumericDenseArray<double> b, Microsoft.Numerics.Distributed.NumericDenseArray<double> delta)
{
msnd.NumericDenseArray<double> x;
DateTime t1 = DateTime.Now;
var choleskyDecomposition = Decompositions.Cholesky(A, UpperLower.UpperTriangle);
x = LinearSolvers.Cholesky(choleskyDecomposition, b);
for (int i = 0; i < nIter; i++)
{
x = LinearSolvers.Cholesky(choleskyDecomposition, b);
x = x + delta;
}
Console.WriteLine("Linear solve time using Cholesky: {0} seconds ", (DateTime.Now - t1).TotalMilliseconds / 1000.0);
}
public static void Main(string[] args)
{
// Initialize Runtime
NumericsRuntime.Initialize();
LargestEigenVector();
// Test convergence of matrix approximation from singular value decompositions
TestSvdConvergence();
int nIter = 5;
long n = 2000;
msnd.NumericDenseArray<double> A = CreatePositiveDefiniteSymmetricMatrix(n);
msnd.NumericDenseArray<double> b = ProbabilityDistributions.Uniform(0.1d, 0.9d, new long[] { n });
msnd.NumericDenseArray<double> delta = ProbabilityDistributions.Uniform(0.1d, 0.9d, new long[] { n });
// Time general algorithm
TimeLinearSolversGeneral(nIter, A, b, delta);
// Time Cholesky
TimeLinearSolversCholesky(nIter, A, b, delta);
// Shut down the Microsoft Numerics runtime.
NumericsRuntime.Shutdown();
}
}
}
Comments
- Anonymous
April 23, 2012
Nice summary, thanks!