LinearOperator

class linear_operator.LinearOperator(*args, **kwargs)[source]

Base class for LinearOperators. Unless otherwise stated, we assume that the LinearOperator represents a matrix (or batch of matrices) of size (… x M x N).

Variables:
  • batch_dim (int) – The number of batch dimensions defined by the LinearOperator. (This should be equal to linear_operator.dim() - 2.)

  • batch_shape (torch.Size) – The shape over which the LinearOperator is batched.

  • device (torch.device) – The device that the LinearOperator is stored on. Any tensor that interacts with this LinearOperator should be on the same device.

  • dtype (torch.dtype) – The dtype that the LinearOperator interacts with.

  • is_square (bool) – Whether or not the LinearOperator is a square operator.

  • matrix_shape (torch.Size) – The 2-dimensional shape of the implicit matrix represented by the LinearOperator. In other words: a torch.Size that consists of the operator's output dimension and input dimension.

  • requires_grad (bool) – Whether or not any of the tensors that make up the LinearOperator require gradients.

  • shape (torch.Size) – The overall operator shape: batch_shape + matrix_shape.

property T: Float[LinearOperator, '*batch N M']

Alias of t()

add(other, alpha=None)[source]

Each element of the tensor other is multiplied by the scalar alpha and added to each element of the LinearOperator. The resulting LinearOperator is returned.

\[\text{out} = \text{self} + \text{alpha} ( \text{other} )\]
Parameters:
  • other (torch.Tensor (… x M x N) or ~linear_operator.operators._linear_operator.LinearOperator (… x M x N)) – object to add to self.

  • alpha (float, optional) – Optional scalar multiple to apply to other.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

Returns:

\(\mathbf A + \alpha \mathbf O\), where \(\mathbf A\) is the linear operator and \(\mathbf O\) is other.

add_diagonal(diag)[source]

Adds a component to the diagonal of the matrix this LinearOperator represents. The component may be a full diagonal vector, a single broadcast value, or a scalar.

Parameters:

diag (torch.Tensor (… x N) or torch.Tensor (… x 1) or torch.Tensor ()) – Diagonal to add

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x N x N)

Returns:

\(\mathbf A + \text{diag}(\mathbf d)\), where \(\mathbf A\) is the linear operator and \(\mathbf d\) is the diagonal component
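
For illustration, here is a minimal sketch of the three accepted diagonal shapes (the operator and values are hypothetical; DenseLinearOperator is used as in the other examples on this page):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> A = DenseLinearOperator(torch.eye(3))
>>> A.add_diagonal(torch.tensor(0.5))              # scalar (): added to every diagonal entry
>>> A.add_diagonal(torch.tensor([0.5]))            # (… x 1): broadcast across the diagonal
>>> A.add_diagonal(torch.tensor([0.1, 0.2, 0.3]))  # (… x N): one component per diagonal entry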

add_jitter(jitter_val=0.001)[source]

Adds jitter (i.e., a small diagonal component) to the matrix this LinearOperator represents. This is equivalent to calling add_diagonal() with a scalar tensor.

Parameters:

jitter_val (float) – The diagonal component to add

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x N x N)

Returns:

\(\mathbf A + \alpha (\mathbf I)\), where \(\mathbf A\) is the linear operator and \(\alpha\) is jitter_val.

add_low_rank(low_rank_mat, root_decomp_method=None, root_inv_decomp_method=None, generate_roots=True, **root_decomp_kwargs)[source]

Adds a low rank matrix to the matrix that this LinearOperator represents, e.g. computes \(\mathbf A + \mathbf{BB}^\top\). We then update both the tensor and its root decomposition.

We have access to \(\mathbf L\) and \(\mathbf M\) where \(\mathbf A \approx \mathbf{LL}^\top\) and \(\mathbf A^{-1} \approx \mathbf{MM}^\top\). We then compute

\[\widetilde{\mathbf A} = \mathbf A + \mathbf {BB}^\top = \mathbf L(\mathbf I + \mathbf {M B B}^\top \mathbf M^\top)\mathbf L^\top\]

and then decompose \((\mathbf I + \mathbf{M B B}^\top \mathbf M^\top) \approx \mathbf{RR}^\top\), using \(\mathbf{LR}\) as our new root decomposition.

This strategy is described in more detail in “Kernel Interpolation for Scalable Online Gaussian Processes,” Stanton et al, AISTATS, 2021.

Parameters:
  • low_rank_mat (torch.Tensor (… x N x …) or ~linear_operator.operators._linear_operator.LinearOperator (… x N x …)) – The matrix factor \(\mathbf B\) to add to \(\mathbf A\).

  • root_decomp_method (str, optional) – How to compute the root decomposition of \(\mathbf A\).

  • root_inv_decomp_method (str, optional) – How to compute the root inverse decomposition of \(\mathbf A\).

  • generate_roots (bool, optional) – Whether to generate the root decomposition of \(\mathbf A\) even if it has not been created yet.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x N x N)

Returns:

Addition of \(\mathbf A\) and \(\mathbf{BB}^\top\).
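
A minimal sketch of a rank-2 update (shapes and values are illustrative; the operator is assumed positive definite so that root decompositions exist):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(5, 5)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(5))  # a PSD operator
>>> B = torch.randn(5, 2)                             # the low rank factor
>>> A_updated = A.add_low_rank(B)                     # represents A + B B^T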

cat_rows(cross_mat, new_mat, generate_roots=True, generate_inv_roots=True, **root_decomp_kwargs)[source]

Concatenates new rows and columns to the matrix that this LinearOperator represents, e.g.

\[\begin{split}\mathbf C = \begin{bmatrix} \mathbf A & \mathbf B^\top \\ \mathbf B & \mathbf D \end{bmatrix}\end{split}\]

where \(\mathbf A\) is the existing LinearOperator, and \(\mathbf B\) (cross_mat) and \(\mathbf D\) (new_mat) are new components. This is most commonly used when fantasizing with kernel matrices.

We have access to \(\mathbf A \approx \mathbf{LL}^\top\) and \(\mathbf A^{-1} \approx \mathbf{RR}^\top\), where \(\mathbf L\) and \(\mathbf R\) are low rank matrices resulting from root and root inverse decompositions (see Pleiss et al., 2018).

To update \(\mathbf R\), we first update \(\mathbf L\):

\[\begin{split}\begin{bmatrix} \mathbf A & \mathbf B^\top \\ \mathbf B & \mathbf D \end{bmatrix} = \begin{bmatrix} \mathbf E & \mathbf 0 \\ \mathbf F & \mathbf G \end{bmatrix} \begin{bmatrix} \mathbf E^\top & \mathbf F^\top \\ \mathbf 0 & \mathbf G^\top \end{bmatrix}\end{split}\]

Solving this matrix equation, we get:

\[\begin{split}\mathbf A &= \mathbf{EE}^\top = \mathbf{LL}^\top \quad (\Rightarrow \mathbf E = \mathbf L) \\ \mathbf B &= \mathbf{EF}^\top \quad (\Rightarrow \mathbf F = \mathbf{BR}) \\ \mathbf D &= \mathbf{FF}^\top + \mathbf{GG}^\top \quad (\Rightarrow \mathbf G = (\mathbf D - \mathbf{FF}^\top)^{1/2})\end{split}\]

Once we’ve computed \(\mathbf Z = [\mathbf E, \mathbf 0; \mathbf F, \mathbf G]\), we have that the new matrix \([\mathbf A, \mathbf B^\top; \mathbf B, \mathbf D] \approx \mathbf{ZZ}^\top\). Therefore, we can form a pseudo-inverse of \(\mathbf Z\) directly to approximate \([\mathbf A, \mathbf B^\top; \mathbf B, \mathbf D]^{-1/2}\).

This strategy is also described in “Efficient Nonmyopic Bayesian Optimization via One-Shot Multistep Trees,” Jiang et al, NeurIPS, 2020.

Parameters:
  • cross_mat (torch.Tensor (… x O x N)) – the matrix \(\mathbf B\) we are appending to the matrix \(\mathbf A\). If \(\mathbf A\) is … x N x N, then this matrix should be … x O x N.

  • new_mat (torch.Tensor (… x O x O)) – the matrix \(\mathbf D\) we are appending to the matrix \(\mathbf A\). If \(\mathbf B\) is … x O x N, then this matrix should be … x O x O.

  • generate_roots (bool) – whether to generate the root decomposition of \(\mathbf A\) even if it has not been created yet.

  • generate_inv_roots (bool) – whether to generate the root inv decomposition of \(\mathbf A\) even if it has not been created yet.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x (M + O) x (N + O))

Returns:

The concatenated LinearOperator with the new rows and columns.
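
A minimal sketch of appending one new row/column (O = 1) to a 4 x 4 operator; the names and values are illustrative:

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(4, 4)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(4))  # existing PSD operator
>>> B = torch.randn(1, 4)                             # cross_mat: new row against the old columns
>>> D = torch.tensor([[5.0]])                         # new_mat: new diagonal block
>>> C = A.cat_rows(B, D)                              # represents the 5 x 5 block matrix [A B^T; B D]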

cholesky(upper=False)[source]

Cholesky-factorizes the LinearOperator.

Parameters:

upper (bool) – Upper triangular or lower triangular factor (default: False).

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x N x N)

Returns:

Cholesky factor (lower or upper triangular)
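
As a quick sanity check, the factor reproduces the operator (a sketch; assumes a positive definite operator and that the @ operator dispatches to matmul()):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(3, 3)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(3))  # positive definite
>>> L = A.cholesky()                                  # lower triangular (upper=False)
>>> torch.allclose((L @ L.mT).to_dense(), A.to_dense(), atol=1e-5)
>>> # Returns: True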

clone()[source]

Returns a clone of the LinearOperator (with clones of all underlying tensors).

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

cpu()[source]

Returns a new LinearOperator identical to self, but on the CPU.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

cuda(device_id=None)[source]

This method operates identically to torch.nn.Module.cuda().

Parameters:

device_id (str, optional) – Device ID of GPU to use.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

detach()[source]

Removes the LinearOperator from the current computation graph. (In practice, this function removes all Tensors that make up the LinearOperator from the computation graph.)

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

detach_()[source]

An in-place version of detach().

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

diagonal(offset=0, dim1=-2, dim2=-1)[source]

As torch.diagonal(), returns the diagonal of the matrix \(\mathbf A\) this LinearOperator represents as a vector.

Note

This method is only implemented when dim1 and dim2 are equal to -2 and -1, respectively, and offset = 0.

Parameters:
  • offset (int) – Unused. Use default value.

  • dim1 (int) – Unused. Use default value.

  • dim2 (int) – Unused. Use default value.

Return type:

torch.Tensor (… x N)

Returns:

The diagonal (or batch of diagonals) of \(\mathbf A\).

diagonalization(method=None)[source]

Returns a (usually partial) diagonalization of a symmetric PSD matrix. Options are either “lanczos” or “symeig”. “lanczos” runs Lanczos while “symeig” runs LinearOperator.symeig.

Parameters:

method (str, optional) – Specify the method to use (“lanczos” or “symeig”). The method will be determined based on size if not specified.

Return type:

(torch.Tensor (… x M), ~linear_operator.operators._linear_operator.LinearOperator (… x N x M), optional)

Returns:

eigenvalues and eigenvectors representing the diagonalization.

dim()[source]

Alias of ndimension()

Return type:

int

div(other)[source]

Returns the product of this LinearOperator and the elementwise reciprocal of another matrix or constant.

Parameters:

other (float or torch.Tensor) – Object to divide against

Return type:

~linear_operator.operators._linear_operator.LinearOperator

Returns:

Result of division.

double(device_id=None)[source]

This method operates identically to torch.Tensor.double().

Parameters:

device_id (str, optional) – Device ID of GPU to use.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

eigh()[source]

Compute the symmetric eigendecomposition of the linear operator. This can be very slow for large tensors. Should be special-cased for tensors with particular structure.

Note

This method does NOT sort the eigenvalues.

Return type:

(torch.Tensor (… x N), ~linear_operator.operators._linear_operator.LinearOperator (… x N x N), optional)

Returns:

  • The eigenvalues (… x N)

  • The eigenvectors (… x N x N).

eigvalsh()[source]

Compute the eigenvalues of the symmetric linear operator. This can be very slow for large tensors. Should be special-cased for tensors with particular structure.

Note

This method does NOT sort the eigenvalues.

Return type:

torch.Tensor (… x N)

Returns:

the eigenvalues (… x N)

evaluate_kernel()[source]

Return a new LinearOperator representing the same matrix as this one, but with all lazily evaluated kernels actually evaluated.

expand(*sizes)[source]

Returns a new view of the self LinearOperator with singleton dimensions expanded to a larger size.

Passing -1 as the size for a dimension means not changing the size of that dimension.

The LinearOperator can also be expanded to a larger number of dimensions, and the new ones will be appended at the front. For the new dimensions, the size cannot be set to -1.

Expanding a LinearOperator does not allocate new memory, but only creates a new view on the existing LinearOperator where a dimension of size one is expanded to a larger size by setting the stride to 0. Any dimension of size 1 can be expanded to an arbitrary value without allocating new memory.

Parameters:

sizes (torch.Size or int) – the desired expanded size

Return type:

~linear_operator.operators._linear_operator.LinearOperator

Returns:

The expanded LinearOperator
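
A minimal sketch of broadcasting a single matrix into a batch without copying (values are illustrative):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> A = DenseLinearOperator(torch.randn(1, 3, 3))  # batch_shape = (1,)
>>> A_batch = A.expand(4, 3, 3)                    # a view with batch_shape = (4,)
>>> A_batch.shape
>>> # Returns: torch.Size([4, 3, 3])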

float(device_id=None)[source]

This method operates identically to torch.Tensor.float().

Parameters:

device_id (str, optional) – Device ID of GPU to use.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

half(device_id=None)[source]

This method operates identically to torch.Tensor.half().

Parameters:

device_id (str, optional) – Device ID of GPU to use.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

inv_quad(inv_quad_rhs, reduce_inv_quad=True)[source]

Computes an inverse quadratic form (w.r.t self) with several right hand sides, i.e.:

\[\text{tr}\left( \mathbf R^\top \mathbf A^{-1} \mathbf R \right),\]

where \(\mathbf A\) is the (positive definite) LinearOperator and \(\mathbf R\) represents the right hand sides (inv_quad_rhs).

If reduce_inv_quad is set to False (and inv_quad_rhs is supplied), the function instead computes

\[\text{diag}\left( \mathbf R^\top \mathbf A^{-1} \mathbf R \right).\]
Parameters:
  • inv_quad_rhs (torch.Tensor (… x N x M) or torch.Tensor (… x N)) – \(\mathbf R\) - the right hand sides of the inverse quadratic term (… x N x M)

  • reduce_inv_quad (bool) – Whether to compute \(\text{tr}\left( \mathbf R^\top \mathbf A^{-1} \mathbf R \right)\) or \(\text{diag}\left( \mathbf R^\top \mathbf A^{-1} \mathbf R \right)\).

Return type:

torch.Tensor (… x M) or torch.Tensor (…)

Returns:

The inverse quadratic term. If reduce_inv_quad=True, the inverse quadratic term is of shape (…). Otherwise, it is (… x M).
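
A minimal sketch checking the result against dense arithmetic (the operator and right hand sides are illustrative; agreement is up to solver tolerance):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(4, 4)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(4))         # positive definite
>>> R = torch.randn(4, 3)
>>> A.inv_quad(R)                                            # tr(R^T A^{-1} R), shape ()
>>> torch.trace(R.mT @ torch.linalg.solve(A.to_dense(), R))  # dense reference, approximately equal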

inv_quad_logdet(inv_quad_rhs=None, logdet=False, reduce_inv_quad=True)[source]

Calls both inv_quad() and logdet() on a positive definite matrix (or batch) \(\mathbf A\). However, calling this method is far more efficient and stable than calling each method independently.

Parameters:
  • inv_quad_rhs (torch.Tensor (… x N x M) or torch.Tensor (… x N), optional) – \(\mathbf R\) - the right hand sides of the inverse quadratic term

  • logdet (bool, optional) – Whether or not to compute the logdet term \(\log \vert \mathbf A \vert\).

  • reduce_inv_quad (bool, optional) – Whether to compute \(\text{tr}\left( \mathbf R^\top \mathbf A^{-1} \mathbf R \right)\) or \(\text{diag}\left( \mathbf R^\top \mathbf A^{-1} \mathbf R \right)\).

Return type:

(torch.Tensor (… x M) or torch.Tensor (…) or torch.Tensor (0), optional, torch.Tensor (…), optional)

Returns:

The inverse quadratic term (or None), and the logdet term (or None). If reduce_inv_quad=True, the inverse quadratic term is of shape (…). Otherwise, it is (… x M).
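
A common use is assembling a Gaussian log likelihood in a single pass; a minimal sketch (the covariance and observations here are hypothetical):

>>> import math
>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(5, 5)
>>> K = DenseLinearOperator(X @ X.mT + torch.eye(5))  # covariance operator
>>> y = torch.randn(5)
>>> inv_quad_term, logdet_term = K.inv_quad_logdet(inv_quad_rhs=y.unsqueeze(-1), logdet=True)
>>> -0.5 * (inv_quad_term + logdet_term + y.size(-1) * math.log(2 * math.pi))  # log N(y; 0, K)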

logdet()[source]

Computes the log determinant \(\log \vert \mathbf A \vert\).

Return type:

torch.Tensor (…)

property mT: Float[LinearOperator, '*batch N M']

Alias of transpose(-1, -2)

matmul(other)[source]

Performs \(\mathbf A \mathbf B\), where \(\mathbf A \in \mathbb R^{M \times N}\) is the LinearOperator and \(\mathbf B\) is a right hand side torch.Tensor (or LinearOperator).

Parameters:

other (torch.Tensor (… x N x P) or torch.Tensor (… x N) or ~linear_operator.operators._linear_operator.LinearOperator (… x N x P)) – \(\mathbf B\) - the matrix or vector to multiply against.

Return type:

torch.Tensor (… x M x P) or torch.Tensor (… x M) or ~linear_operator.operators._linear_operator.LinearOperator (… x M x P)

Returns:

The result of applying the linear operator to \(\mathbf B\). The return type will be the same as other’s type.
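
For example (a sketch with illustrative shapes; note the return type matches other’s type):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> A = DenseLinearOperator(torch.randn(3, 4))
>>> A.matmul(torch.randn(4, 2)).shape  # matrix right hand side -> torch.Tensor
>>> # Returns: torch.Size([3, 2])
>>> A.matmul(torch.randn(4)).shape     # vector right hand side
>>> # Returns: torch.Size([3])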

mul(other)[source]

Multiplies the matrix by a constant, or elementwise multiplies it by another matrix.

Parameters:

other (float or torch.Tensor (… x M x N) or ~linear_operator.operators._linear_operator.LinearOperator (… x M x N)) – Constant or matrix to elementwise multiply by.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

Returns:

Another linear operator representing the result of the multiplication. If other was a constant (or batch of constants), this will likely be a ConstantMulLinearOperator. If other was a matrix or LinearOperator, this will likely be a MulLinearOperator.

ndimension()[source]

Returns the number of dimensions.

Return type:

int

numel()[source]

Returns the number of elements.

Return type:

int

numpy()[source]

Returns the LinearOperator as a dense numpy array.

Return type:

numpy.ndarray

permute(*dims)[source]

Returns a view of the original tensor with its dimensions permuted.

Parameters:

dims (int or (int, …)) – The desired ordering of dimensions.

Return type:

~linear_operator.operators._linear_operator.LinearOperator

pivoted_cholesky(rank, error_tol=None, return_pivots=False)[source]

Performs a partial pivoted Cholesky factorization \(\mathbf L \mathbf L^\top \approx \mathbf A\) of the (positive definite) LinearOperator \(\mathbf A\). The partial pivoted Cholesky factor \(\mathbf L \in \mathbb R^{N \times \text{rank}}\) forms a low rank approximation to the LinearOperator.

The pivots are selected greedily, corresponding to the maximum diagonal element in the residual after each Cholesky iteration. See Harbrecht et al., 2012.

Parameters:
  • rank (int) – The size of the partial pivoted Cholesky factor.

  • error_tol (float, optional) – Defines an optional stopping criterion. If the residual of the factorization drops below error_tol, then the factorization exits early. This will result in a factor of rank \(\leq\) rank.

  • return_pivots (bool) – Whether or not to return the pivots alongside the partial pivoted Cholesky factor.

Return type:

torch.Tensor (… x N x R) or (torch.Tensor (… x N x R), torch.Tensor (… x N))

Returns:

The … x N x rank factor (and optionally the … x N pivots if return_pivots is True).
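
A minimal sketch (the operator is illustrative; L @ L.mT forms a rank-3 approximation of A):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(6, 6)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(6))  # positive definite
>>> L = A.pivoted_cholesky(rank=3)                    # 6 x 3 partial factor
>>> L_piv, pivots = A.pivoted_cholesky(rank=3, return_pivots=True)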

prod(dim)[source]

Returns the elementwise product of the batch of matrices along the batch dimension dim.

>>> linear_operator = DenseLinearOperator(torch.tensor([
...     [[2, 4], [1, 2]],
...     [[1, 1], [2., -1]],
...     [[2, 1], [1, 1.]],
...     [[3, 2], [2, -1]],
... ]))
>>> linear_operator.prod(dim=-3).to_dense()
>>> # Returns: tensor([[12., 8.], [4., 2.]])
Parameters:

dim (int) – Which dimension to compute the product along.

Return type:

~linear_operator.operators._linear_operator.LinearOperator or torch.Tensor

repeat(*sizes)[source]

Repeats this tensor along the specified dimensions.

Currently, this only works to create repeated batches of a 2D LinearOperator. I.e. all calls should be linear_operator.repeat(*batch_sizes, 1, 1).

>>> linear_operator = ToeplitzLinearOperator(torch.tensor([4., 1., 0.5]))
>>> linear_operator.repeat(2, 1, 1).to_dense()
tensor([[[4.0000, 1.0000, 0.5000],
         [1.0000, 4.0000, 1.0000],
         [0.5000, 1.0000, 4.0000]],
        [[4.0000, 1.0000, 0.5000],
         [1.0000, 4.0000, 1.0000],
         [0.5000, 1.0000, 4.0000]]])
Parameters:

sizes (int or (int, …)) – The number of times to repeat this tensor along each dimension.

Return type:

~linear_operator.operators._linear_operator.LinearOperator

Returns:

A LinearOperator with repeated dimensions.

representation()[source]

Returns the Tensors that are used to define the LinearOperator

Return type:

(torch.Tensor, …)

representation_tree()[source]

Returns a linear_operator.operators.LinearOperatorRepresentationTree tree object that recursively encodes the representation of this LinearOperator. In particular, if the definition of this LinearOperator depends on other LinearOperators, the tree is an object that can be used to reconstruct the full structure of this LinearOperator, including all subobjects. This is used internally.

Return type:

~linear_operator.operators.linear_operator_representation_tree.LinearOperatorRepresentationTree

requires_grad_(val)[source]

Sets requires_grad=val on all the Tensors that make up the LinearOperator. This is an in-place operation.

Parameters:

val (bool) – Whether or not to require gradients.

Return type:

~linear_operator.operators._linear_operator.LinearOperator

Returns:

self.

reshape(*sizes)[source]

Alias for expand

Return type:

~linear_operator.operators._linear_operator.LinearOperator

rmatmul(other)[source]

Performs \(\mathbf B \mathbf A\), where \(\mathbf A \in \mathbb R^{M \times N}\) is the LinearOperator and \(\mathbf B\) is a left hand side torch.Tensor (or LinearOperator).

Parameters:

other (torch.Tensor (… x P x M) or torch.Tensor (… x M) or ~linear_operator.operators._linear_operator.LinearOperator (… x P x M)) – \(\mathbf B\) - the matrix or vector that \(\mathbf A\) will right multiply against.

Return type:

torch.Tensor (… x P x N) or torch.Tensor (… x N) or ~linear_operator.operators._linear_operator.LinearOperator (… x P x N)

Returns:

The product \(\mathbf B \mathbf A\). The return type will be the same as other’s type.

root_decomposition(method=None)[source]

Returns a (usually low-rank) root decomposition linear operator of the PSD LinearOperator \(\mathbf A\). This can be used for sampling from a Gaussian distribution, or for obtaining a low-rank version of a matrix.

Parameters:

method (str, optional) – Which method to use to perform the root decomposition. Choices are: “cholesky”, “lanczos”, “symeig”, “pivoted_cholesky”, or “svd”.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x N x N)

Returns:

A tensor \(\mathbf R\) such that \(\mathbf R \mathbf R^\top \approx \mathbf A\).
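
A minimal sketch (the operator is illustrative; per the return type above, the returned N x N operator represents the full product \(\mathbf R \mathbf R^\top\)):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(4, 4)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(4))  # PSD
>>> A_root = A.root_decomposition(method="cholesky")
>>> torch.allclose(A_root.to_dense(), A.to_dense(), atol=1e-4)
>>> # Returns: True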

root_inv_decomposition(initial_vectors=None, test_vectors=None, method=None)[source]

Returns a (usually low-rank) inverse root decomposition linear operator of the PSD LinearOperator \(\mathbf A\). This can be used for sampling from a Gaussian distribution, or for obtaining a low-rank version of a matrix.

The root_inv_decomposition is performed using a partial Lanczos tridiagonalization.

Parameters:
  • initial_vectors (torch.Tensor, optional) – Vectors used to initialize the Lanczos decomposition. The best initialization vector (determined by test_vectors) will be chosen.

  • test_vectors (torch.Tensor, optional) – Vectors used to test the accuracy of the decomposition.

  • method (str, optional) – Root decomposition method to use (symeig, diagonalization, lanczos, or cholesky).

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x N x N) or torch.Tensor (… x N x N)

Returns:

A tensor \(\mathbf R\) such that \(\mathbf R \mathbf R^\top \approx \mathbf A^{-1}\).

size(dim=None)[source]

Returns the size of the LinearOperator (or the specified dimension).

Parameters:

dim (int, optional) – A specific dimension.

Return type:

torch.Size or int

solve(right_tensor, left_tensor=None)[source]

Computes a linear solve (w.r.t self = \(\mathbf A\)) with right hand side \(\mathbf R\). I.e. computes

\[\begin{equation} \mathbf A^{-1} \mathbf R, \end{equation}\]

where \(\mathbf R\) is right_tensor and \(\mathbf A\) is the LinearOperator.

If left_tensor is supplied, computes

\[\begin{equation} \mathbf L \mathbf A^{-1} \mathbf R, \end{equation}\]

where \(\mathbf L\) is left_tensor. Supplying this can reduce the number of solver calls required in the backward pass.

Parameters:
  • right_tensor (torch.Tensor (… x N x P) or torch.Tensor (N)) – \(\mathbf R\) - the right hand side

  • left_tensor (torch.Tensor (… x O x N), optional) – \(\mathbf L\) - the left hand side

Return type:

torch.Tensor (… x N x P) or torch.Tensor (… x N) or torch.Tensor (… x O x P) or torch.Tensor (… x O)

Returns:

\(\mathbf A^{-1} \mathbf R\) or \(\mathbf L \mathbf A^{-1} \mathbf R\).
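
For example (a sketch with hypothetical shapes matching the parameters above):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(4, 4)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(4))  # positive definite
>>> R = torch.randn(4, 2)
>>> L = torch.randn(3, 4)
>>> A.solve(R).shape     # A^{-1} R
>>> # Returns: torch.Size([4, 2])
>>> A.solve(R, L).shape  # L A^{-1} R
>>> # Returns: torch.Size([3, 2])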

solve_triangular(rhs, upper, left=True, unitriangular=False)[source]

Computes a triangular linear solve (w.r.t self = \(\mathbf A\)) with right hand side \(\mathbf R\). If left=True, computes the solution \(\mathbf X\) to

\[\begin{equation} \mathbf A \mathbf X = \mathbf R, \end{equation}\]

If left=False, computes the solution \(\mathbf X\) to

\[\begin{equation} \mathbf X \mathbf A = \mathbf R, \end{equation}\]

where \(\mathbf R\) is rhs and \(\mathbf A\) is the (triangular) LinearOperator.

Parameters:
  • rhs (torch.Tensor) – \(\mathbf R\) - the right hand side

  • upper (bool) – If True (False), consider \(\mathbf A\) to be upper (lower) triangular.

  • left (bool) – If True (False), solve for \(\mathbf A \mathbf X = \mathbf R\) (\(\mathbf X \mathbf A = \mathbf R\)).

  • unitriangular (bool) – Unsupported (must be False).

Return type:

torch.Tensor

Returns:

The solution \(\mathbf X\) to \(\mathbf A \mathbf X = \mathbf R\) (if left=True) or \(\mathbf X \mathbf A = \mathbf R\) (if left=False).
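
A typical pattern is solving against a Cholesky factor; a minimal sketch (the operator is illustrative):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(4, 4)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(4))  # positive definite
>>> L = A.cholesky()                                  # lower triangular
>>> b = torch.randn(4, 1)
>>> x = L.solve_triangular(b, upper=False)            # solves L x = b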

sqrt_inv_matmul(rhs, lhs=None)[source]

If the LinearOperator \(\mathbf A\) is positive definite, computes

\[\begin{equation} \mathbf A^{-1/2} \mathbf R, \end{equation}\]

where \(\mathbf R\) is rhs.

If lhs is supplied, computes

\[\begin{equation} \mathbf L \mathbf A^{-1/2} \mathbf R, \end{equation}\]

where \(\mathbf L\) is lhs. Supplying this can reduce the number of solver calls required in the backward pass.

Parameters:
  • rhs (torch.Tensor (… x N x P)) – \(\mathbf R\) - the right hand side

  • lhs (torch.Tensor (… x O x N), optional) – \(\mathbf L\) - the left hand side

Return type:

torch.Tensor (… x N x P) or (torch.Tensor (… x O x P), torch.Tensor (… x O))

Returns:

\(\mathbf A^{-1/2} \mathbf R\) or \(\mathbf L \mathbf A^{-1/2} \mathbf R\).

squeeze(dim)[source]

Removes the singleton dimension of a LinearOperator specified by dim.

Parameters:

dim (int) – Which singleton dimension to remove.

Return type:

~linear_operator.operators._linear_operator.LinearOperator or torch.Tensor

Returns:

The squeezed LinearOperator. Will be a torch.Tensor if the squeezed dimension was a matrix dimension; otherwise it will return a LinearOperator.

sub(other, alpha=None)[source]

Each element of the tensor other is multiplied by the scalar alpha and subtracted from each element of the LinearOperator. The resulting LinearOperator is returned.

\[\text{out} = \text{self} - \text{alpha} ( \text{other} )\]
Parameters:
  • other (torch.Tensor (… x M x N) or ~linear_operator.operators._linear_operator.LinearOperator (… x M x N)) – object to subtract against self.

  • alpha (float, optional) – Optional scalar multiple to apply to other.

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

Returns:

\(\mathbf A - \alpha \mathbf O\), where \(\mathbf A\) is the linear operator and \(\mathbf O\) is other.

sum(dim=None)[source]

Sum the LinearOperator across a dimension. The dim controls which batch dimension is summed over. If set to None, then sums all dimensions.

>>> linear_operator = DenseLinearOperator(torch.tensor([
...     [[2, 4], [1, 2]],
...     [[1, 1], [0, -1]],
...     [[2, 1], [1, 0]],
...     [[3, 2], [2, -1]],
... ]))
>>> linear_operator.sum(0).to_dense()
>>> # Returns: tensor([[8., 8.], [4., 0.]])
Parameters:

dim (int, optional) – Which dimension is being summed over (default=None).

Return type:

~linear_operator.operators._linear_operator.LinearOperator or torch.Tensor

Returns:

The summed LinearOperator. Will be a torch.Tensor if the summed dimension was a matrix dimension (or all dimensions); otherwise it will return a LinearOperator.

svd()[source]

Compute the SVD of the linear operator \(\mathbf A \in \mathbb R^{M \times N}\) s.t. \(\mathbf A = \mathbf{U S V^\top}\). This can be very slow for large tensors. Should be special-cased for tensors with particular structure.

Note

This method does NOT sort the singular values.

Return type:

(~linear_operator.operators._linear_operator.LinearOperator (… x M x M), torch.Tensor (… x min(M, N)), ~linear_operator.operators._linear_operator.LinearOperator (… x N x N))

Returns:

  • The left singular vectors \(\mathbf U\) (… x M x M),

  • The singular values \(\mathbf S\) (… x min(M, N)),

  • The right singular vectors \(\mathbf V\) (… x N x N).

t()[source]

Alias of transpose() for 2D LinearOperator. (Transposes the two dimensions.)

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x N x M)

to(*args, **kwargs)[source]

A device-agnostic method of moving the LinearOperator to the specified device or dtype. This method functions just like torch.Tensor.to().

Return type:

~linear_operator.operators._linear_operator.LinearOperator (… x M x N)

Returns:

New LinearOperator identical to self on specified device/dtype.

to_dense()[source]

Explicitly evaluates the matrix this LinearOperator represents. This function should return a torch.Tensor storing an exact representation of this LinearOperator.

Return type:

torch.Tensor (… x M x N)

transpose(dim1, dim2)[source]

Transpose the dimensions dim1 and dim2 of the LinearOperator.

>>> linear_op = linear_operator.operators.DenseLinearOperator(torch.randn(3, 5))
>>> linear_op.transpose(0, 1)
Parameters:
  • dim1 (int) – First dimension to transpose.

  • dim2 (int) – Second dimension to transpose.

Return type:

~linear_operator.operators._linear_operator.LinearOperator

type(dtype)[source]

A device-agnostic method of moving the LinearOperator to the specified dtype. This method operates similarly to torch.Tensor.type().

Parameters:

dtype (torch.dtype) – Target dtype.

Return type:

~linear_operator.operators._linear_operator.LinearOperator

unsqueeze(dim)[source]

Inserts a singleton batch dimension of a LinearOperator, specified by dim. Note that dim cannot correspond to a matrix dimension of the LinearOperator.

Parameters:

dim (int) – Where to insert singleton dimension.

Return type:

~linear_operator.operators._linear_operator.LinearOperator

Returns:

The unsqueezed LinearOperator.

zero_mean_mvn_samples(num_samples)[source]

Assumes that the LinearOperator \(\mathbf A\) is a covariance matrix, or a batch of covariance matrices. Returns samples from a zero-mean MVN, defined by \(\mathcal N( \mathbf 0, \mathbf A)\).

Parameters:

num_samples (int) – Number of samples to draw.

Return type:

torch.Tensor (num_samples x … x N)

Returns:

Samples from MVN \(\mathcal N( \mathbf 0, \mathbf A)\).
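
For example, a sketch that draws samples and checks the empirical covariance (the covariance is hypothetical; agreement is only approximate at finite sample sizes):

>>> import torch
>>> from linear_operator.operators import DenseLinearOperator
>>> X = torch.randn(4, 4)
>>> A = DenseLinearOperator(X @ X.mT + torch.eye(4))  # covariance matrix
>>> samples = A.zero_mean_mvn_samples(10000)          # shape: 10000 x 4
>>> samples.mT @ samples / 10000                      # empirical covariance, approx. A.to_dense()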