What is a Linear Operator?

A linear operator is a generalization of a matrix. It is a linear function that is defined by its application to a vector. The most common linear operators are (potentially structured) matrices, where the function applying them to a vector is a (potentially efficient) matrix-vector multiplication routine.

In code, a LinearOperator is a class that

  1. specifies the tensor(s) needed to define the LinearOperator,

  2. specifies a _matmul function (how the LinearOperator is applied to a vector),

  3. specifies a _size function (how big the LinearOperator is if it is represented as a matrix, or batch of matrices),

  4. specifies a _transpose_nonbatch function (the adjoint of the LinearOperator), and

  5. (optionally) defines other functions (e.g. logdet, eigh, etc.) to accelerate computations for which efficient structure-exploiting routines exist.

For example:

import torch
import linear_operator

class DiagLinearOperator(linear_operator.LinearOperator):
    """A LinearOperator representing a diagonal matrix."""
    def __init__(self, diag):
        # diag: the vector that defines the diagonal of the matrix
        self.diag = diag

    def _matmul(self, v):
        return self.diag.unsqueeze(-1) * v

    def _size(self):
        return torch.Size([*self.diag.shape, self.diag.size(-1)])

    def _transpose_nonbatch(self):
        return self  # Diagonal matrices are symmetric

    # this function is optional, but it will accelerate computation
    def logdet(self):
        return self.diag.log().sum(dim=-1)
# ...

D = DiagLinearOperator(torch.tensor([1., 2., 3.]))
# Represents the matrix
#   [[1., 0., 0.],
#    [0., 2., 0.],
#    [0., 0., 3.]]
torch.matmul(D, torch.tensor([4., 5., 6.]))
# Returns [4., 10., 18.]

While _matmul, _size, and _transpose_nonbatch might seem like a limited set of functions, it turns out that most functions in the torch and torch.linalg namespaces can be efficiently implemented using only these three primitive functions.
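
To illustrate how other operations can be built from these three primitives, here is a minimal, dependency-free sketch. It does not use the linear_operator library: ToyDiag, to_dense, rmatmul, and trace are hypothetical names invented for this example, and plain Python lists stand in for tensors. The derived functions are correct but deliberately naive; they are not meant to show how the library implements them.

```python
# Toy sketch only: these names are hypothetical and not part of linear_operator.
class ToyDiag:
    """Diagonal matrix stored as a list of its diagonal entries."""

    def __init__(self, diag):
        self.diag = list(diag)

    def matmul(self, v):
        # O(n) elementwise product instead of an O(n^2) dense product.
        return [d * x for d, x in zip(self.diag, v)]

    def size(self):
        n = len(self.diag)
        return (n, n)

    def transpose(self):
        return self  # diagonal matrices are symmetric


def to_dense(op):
    """Materialize an operator by applying it to each standard basis vector."""
    rows, cols = op.size()
    columns = []
    for j in range(cols):
        e_j = [1.0 if i == j else 0.0 for i in range(cols)]
        columns.append(op.matmul(e_j))  # j-th column of the matrix
    # Convert the list of columns into a list of rows.
    return [[columns[j][i] for j in range(cols)] for i in range(rows)]


def rmatmul(op, v):
    """Left-multiplication v^T A, computed as (A^T v)."""
    return op.transpose().matmul(v)


def trace(op):
    """Trace from matmul alone: the sum of e_i^T (A e_i)."""
    n = op.size()[0]
    total = 0.0
    for i in range(n):
        e_i = [1.0 if k == i else 0.0 for k in range(n)]
        total += op.matmul(e_i)[i]
    return total


D = ToyDiag([1.0, 2.0, 3.0])
dense = to_dense(D)
left = rmatmul(D, [4.0, 5.0, 6.0])
tr = trace(D)
```

None of the three derived functions inspects how the operator stores its data; they only call matmul, size, and transpose, so they would work unchanged for any other toy operator exposing the same three methods.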

Moreover, because _matmul is a linear function, it is very easy to compose linear operators in various ways. For example, adding two linear operators (SumLinearOperator) just requires adding the outputs of their _matmul functions. This makes it possible to define very complex compositional structures that still yield efficient linear algebraic routines.
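
The following sketch shows the idea of a sum operator in plain Python. It is an illustration under stated assumptions, not the library's SumLinearOperator: ToyDiag, ToyRankOne, and ToySum are hypothetical classes, and lists stand in for tensors.

```python
# Toy sketch only: these classes are hypothetical and not part of linear_operator.
class ToyDiag:
    """Diagonal matrix defined by its diagonal entries."""

    def __init__(self, diag):
        self.diag = list(diag)

    def matmul(self, v):
        return [d * x for d, x in zip(self.diag, v)]


class ToyRankOne:
    """Rank-one matrix u v^T, stored as the two vectors u and v."""

    def __init__(self, u, v):
        self.u = list(u)
        self.v = list(v)

    def matmul(self, x):
        # (u v^T) x = u * (v . x): O(n) work, no n-by-n matrix is formed.
        dot = sum(vi * xi for vi, xi in zip(self.v, x))
        return [ui * dot for ui in self.u]


class ToySum:
    """Sum of operators: (A + B) x = A x + B x."""

    def __init__(self, *ops):
        self.ops = ops

    def matmul(self, x):
        outputs = [op.matmul(x) for op in self.ops]
        # Add the component outputs entry by entry.
        return [sum(vals) for vals in zip(*outputs)]


A = ToySum(
    ToyDiag([1.0, 2.0, 3.0]),
    ToyRankOne([1.0, 1.0, 1.0], [1.0, 0.0, 0.0]),
)
result = A.matmul([4.0, 5.0, 6.0])
```

Here the diagonal part contributes [4., 10., 18.] and the rank-one part contributes [4., 4., 4.], so result is [8., 14., 22.]. The sum costs only as much as its two structured parts, and a dense matrix is never built.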

Finally, LinearOperator objects can be composed with one another, yielding new LinearOperator objects and automatically keeping track of algebraic structure after each computation. As a result, users never need to reason about which efficient linear algebra routines to use (so long as the operators the user constructs encode the known structure of the inputs).
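
One way to picture this structure tracking is the toy sketch below. It is not how the library is implemented: ToyDiag, ToyDense, and ToySum are hypothetical classes, and the dispatch rule in __add__ is an assumption made for illustration. The point is that adding two diagonal operators can return another diagonal operator, so a structure-exploiting logdet remains available on the result.

```python
import math


# Toy sketch only: these classes are hypothetical and not part of linear_operator.
class ToySum:
    """Generic fallback: a sum whose structure is not otherwise known."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def matmul(self, v):
        a = self.left.matmul(v)
        b = self.right.matmul(v)
        return [x + y for x, y in zip(a, b)]


class ToyDense:
    """Unstructured matrix stored as a list of rows."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def matmul(self, v):
        return [sum(a * x for a, x in zip(row, v)) for row in self.rows]

    def __add__(self, other):
        return ToySum(self, other)


class ToyDiag:
    """Diagonal matrix defined by its diagonal entries."""

    def __init__(self, diag):
        self.diag = list(diag)

    def matmul(self, v):
        return [d * x for d, x in zip(self.diag, v)]

    def logdet(self):
        # O(n): the determinant of a diagonal matrix is the product of its
        # entries (assumed positive here so that the logarithm is defined).
        return sum(math.log(d) for d in self.diag)

    def __add__(self, other):
        if isinstance(other, ToyDiag):
            # Diagonal + diagonal stays diagonal, so the structure is kept.
            return ToyDiag([a + b for a, b in zip(self.diag, other.diag)])
        return ToySum(self, other)  # structure unknown: generic sum


S = ToyDiag([1.0, 2.0, 3.0]) + ToyDiag([1.0, 2.0, 1.0])
logdet_value = S.logdet()  # log(2 * 4 * 4) = log(32)

M = ToyDiag([1.0, 2.0]) + ToyDense([[0.0, 1.0], [1.0, 0.0]])
mixed = M.matmul([1.0, 1.0])
```

In this sketch the caller writes the same + in both cases. The result type alone determines whether the fast diagonal logdet is available (S) or only the generic matmul path (M).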