Data-Sparse LinearOperators

BlockDiagLinearOperator

class linear_operator.operators.BlockDiagLinearOperator(base_linear_op: LinearOperator | Tensor, block_dim: int = -3)[source]

Represents a lazy tensor that is the block diagonal of square matrices. The block_dim attribute specifies which dimension of the base LinearOperator indexes the blocks. For example, with block_dim=-3, a k x n x n tensor represents k blocks of size n x n (a kn x kn matrix), and a b x k x n x n tensor represents a batch of b such matrices, each with k blocks of size n x n (a b x kn x kn batch matrix).

Args:

base_linear_op (LinearOperator or Tensor) – Must be at least 3 dimensional.
block_dim (int) – The dimension that specifies the blocks.

CholLinearOperator

class linear_operator.operators.CholLinearOperator(chol, upper=False)[source]

A LinearOperator (… x N x N) that represents a positive definite matrix given a lower triangular Cholesky factor \(\mathbf L\) (or upper triangular Cholesky factor \(\mathbf R\)).

Parameters:

chol (TriangularLinearOperator (... x N x N)) – The Cholesky factor \(\mathbf L\) (or \(\mathbf R\)).
upper (bool) – Whether the Cholesky factor is upper triangular, in which case the represented matrix is \(\mathbf R^\top \mathbf R\). If False, the factor is assumed to be lower triangular and the represented matrix is \(\mathbf L \mathbf L^\top\).

inverse()[source]

Returns the inverse of the CholLinearOperator.

Return type:: ~linear_operator.LinearOperator

ConstantDiagLinearOperator

class linear_operator.operators.ConstantDiagLinearOperator(diag_values, diag_shape)[source]

Diagonal lazy tensor with constant entries. Supports arbitrary batch sizes. Used e.g. for adding jitter to matrices.

Parameters:

diag_values (~torch.Tensor) – A (… x 1) Tensor representing a (batch of) diag_shape x diag_shape diagonal matrix.
diag_shape (int) – The (non-batch) dimension of the (square) matrix

abs()[source]

Returns a DiagLinearOperator with the absolute value of all diagonal entries.

Return type:: ~linear_operator.LinearOperator

exp()[source]

Returns a DiagLinearOperator with all diagonal entries exponentiated.

Return type:: ~linear_operator.LinearOperator

inverse()[source]

Returns the inverse of the DiagLinearOperator.

Return type:: ~linear_operator.LinearOperator

log()[source]

Returns a DiagLinearOperator with the log of all diagonal entries.

Return type:: ~linear_operator.LinearOperator

sqrt()[source]

Returns a DiagLinearOperator with the square root of all diagonal entries.

Return type:: ~linear_operator.LinearOperator

DiagLinearOperator

class linear_operator.operators.DiagLinearOperator(diag)[source]

Diagonal linear operator (… x N x N).

Parameters:: diag (~torch.Tensor) – Diagonal elements of LinearOperator.

abs()[source]

Returns a DiagLinearOperator with the absolute value of all diagonal entries.

Return type:: ~linear_operator.LinearOperator

exp()[source]

Returns a DiagLinearOperator with all diagonal entries exponentiated.

Return type:: ~linear_operator.LinearOperator

inverse()[source]

Returns the inverse of the DiagLinearOperator.

Return type:: ~linear_operator.LinearOperator

log()[source]

Returns a DiagLinearOperator with the log of all diagonal entries.

Return type:: ~linear_operator.LinearOperator

sqrt()[source]

Returns a DiagLinearOperator with the square root of all diagonal entries.

Return type:: ~linear_operator.LinearOperator

IdentityLinearOperator

class linear_operator.operators.IdentityLinearOperator(diag_shape, batch_shape=(), dtype=torch.float32, device=None)[source]

Identity linear operator. Supports arbitrary batch sizes.

Parameters:

diag_shape (int) – The size of the identity matrix (i.e. \(N\)).
batch_shape (torch.Size | None) – The size of the batch dimensions. It may be useful to set these dimensions for broadcasting.
dtype (torch.dtype | None) – Dtype that the LinearOperator will be operating on. (Default: torch.get_default_dtype()).
device (typing.Optional) – Device that the LinearOperator will be operating on. (Default: CPU).

KernelLinearOperator

class linear_operator.operators.KernelLinearOperator(x1, x2, covar_func, num_outputs_per_input=(1, 1), num_nonbatch_dimensions=None, **params)[source]

Represents the kernel matrix \(\boldsymbol K\) of data \(\boldsymbol X_1 \in \mathbb R^{M \times D}\) and \(\boldsymbol X_2 \in \mathbb R^{N \times D}\) under the covariance function \(k_{\boldsymbol \theta}(\cdot, \cdot)\) (parameterized by hyperparameters \(\boldsymbol \theta\)) so that \(\boldsymbol K_{ij} = k_{\boldsymbol \theta}([\boldsymbol X_1]_i, [\boldsymbol X_2]_j)\).

The output of \(k_{\boldsymbol \theta}(\cdot,\cdot)\) (covar_func) can either be a torch.Tensor or a LinearOperator.

Note

All hyperparameters have some number of batch dimensions (which broadcast with the batch dimensions of x1 and x2) and some number of non-batch dimensions (dimensions that would exist if we were computing a single covariance matrix).

By default, each hyperparameter is assumed to have 2 (potentially singleton) non-batch dimensions. However, the number of non-batch dimensions can be specified on a per-hyperparameter basis through the optional num_nonbatch_dimensions dictionary argument.

For example, to implement the RBF kernel

\[o^2 \exp\left( -\tfrac{1}{2} (\boldsymbol x_1 - \boldsymbol x_2)^\top \boldsymbol D_\ell^{-2} (\boldsymbol x_1 - \boldsymbol x_2) \right),\]

where \(o\) is an outputscale parameter and \(D_\ell\) is a diagonal lengthscale matrix, we would expect the following shapes:

x1: (*batch_shape x N x D)
x2: (*batch_shape x M x D)
lengthscale: (*batch_shape x 1 x D)
outputscale: (*batch_shape) # Note this parameter does not have non-batch dimensions

We would then supply the dictionary num_nonbatch_dimensions = {"outputscale": 0}. (We do not need to include lengthscale in the dictionary since it has 2 non-batch dimensions.)

import torch
from linear_operator.operators import KernelLinearOperator


# NOTE: _covar_func intentionally does not close over any parameters
def _covar_func(x1, x2, lengthscale, outputscale):
    # RBF kernel function
    # x1: ... x N x D
    # x2: ... x M x D
    # lengthscale: ... x 1 x D
    # outputscale: ...
    x1 = x1.div(lengthscale)
    x2 = x2.div(lengthscale)
    sq_dist = (x1.unsqueeze(-2) - x2.unsqueeze(-3)).square().sum(dim=-1)
    kern = sq_dist.div(-2.0).exp().mul(outputscale[..., None, None].square())
    return kern


# Batches of data
x1 = torch.randn(3, 5, 6)
x2 = torch.randn(3, 4, 6)
# Broadcasting lengthscale and output parameters
lengthscale = torch.randn(2, 1, 1, 6)  # Batch shape is 2 x 1, with 2 non-batch dimensions
outputscale = torch.randn(2, 1)  # Batch shape is 2 x 1, no non-batch dimensions
kern = KernelLinearOperator(
    x1, x2, lengthscale=lengthscale, outputscale=outputscale,
    covar_func=_covar_func, num_nonbatch_dimensions={"outputscale": 0}
)

# kern is of size 2 x 3 x 5 x 4

Warning

covar_func should not close over any parameters. Any parameters that are closed over will not have propagated gradients.

See the example above: the lengthscale and outputscale of _covar_func are passed in as arguments, rather than being externally defined variables.

Parameters:

x1 (~torch.Tensor) – The data \(\boldsymbol X_1.\)
x2 (~torch.Tensor) – The data \(\boldsymbol X_2.\)
covar_func (typing.Callable) – The covariance function \(k_{\boldsymbol \theta}(\cdot, \cdot)\). Its arguments should be x1, x2, **params, and it should output the covariance matrix between \(\boldsymbol X_1\) and \(\boldsymbol X_2\).
num_outputs_per_input (tuple) – The number of outputs per data point. This parameter should be 1 for most kernels, but will be >1 for multitask kernels, gradient kernels, and any other kernels that require cross-covariance terms for multiple domains. If a tuple is passed, there will be a different number of outputs per input dimension for the rows/cols of the kernel matrix.
params (typing.Union) – Additional hyperparameters (\(\boldsymbol \theta\)) or keyword arguments passed into covar_func.

RootLinearOperator

class linear_operator.operators.RootLinearOperator(root)[source]

ToeplitzLinearOperator

class linear_operator.operators.ToeplitzLinearOperator(column)[source]

ZeroLinearOperator

class linear_operator.operators.ZeroLinearOperator(*sizes, dtype=None, device=None)[source]

Special LinearOperator representing zero.

Parameters:

sizes (tuple) – The size of each dimension (including batch dimensions).
dtype (typing.Optional) – Dtype that the LinearOperator will be operating on. (Default: torch.get_default_dtype()).
device (typing.Optional) – Device that the LinearOperator will be operating on. (Default: CPU).