API

`torch` algorithm utils

lm_opt

lm_opt(f, theta0, ytrue, batch_dims=0, iters=10, residtol=None, loss_mult=1, loss_shift=0, f_kwargs_vec={}, f_kwargs_no_vec={}, lam0=1e-06, alpha0=1.0, lam_factors=[[1 / 2, 1, 2]], alpha_factors=[[1 / 2, 1, 2]], vmap_chunk_size=None, jacmode='auto', verbose=False, verbose_indent=4, quantiles_losses=[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100], quantiles_lams=[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100], quantiles_alphas=[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100], verbose_quantiles_losses=[5, 25, 50, 75, 90], verbose_quantiles_lams=[5, 25, 50, 75, 90], verbose_quantiles_alphas=[5, 25, 50, 75, 90], verbose_times=True, warn=True, store_data_iters=False, store_all_data=False)

Levenberg--Marquardt optimization

Parameters:

Name	Type	Description	Default
`f`	`func`	Residual function.	required
`theta0`	`Tensor`	Initial guess for parameters \(\theta\).	required
`ytrue`	`Tensor`	True `y` values, i.e. `f(theta_true)`.	required
`batch_dims`	`int`	Number of batch dimension.	`0`
`iters`	`int`	Number of iterations.	`10`
`residtol`	`float`	Non-negative tolerance on the maximum residual for early stopping, defaults to `1e-12` for `torch.float64` and `2.5e-4` for `torch.float32`.	`None`
`loss_mult`	`bool`	Scalar amount by which to multiply the loss so `loss = loss_multtorch.sum(resid*2,dim=-1)+loss_shift`.	`1`
`loss_shift`	`bool`	Scalar amount by which to shift the loss so `loss = loss_multtorch.sum(resid*2,dim=-1)+loss_shift`.	`0`
`f_kwargs_vec`	`dict`	Keyword arguments to `f` which will be vectorized over the first dimension.	`{}`
`f_kwargs_no_vec`	`dict`	Keyword arguments to `f` which will not be vectorized over the first dimension.	`{}`
`lam0`	`float`	Initial positive relaxation parameter \(\lambda\).	`1e-06`
`alpha0`	`float`	Initial positive step size \(\alpha\).	`1.0`
`lam_factors`	`Tensor`	Either a 1D `torch.Tensor` or list of 1d `torch.Tensor`. Passing in a `float` for `lam_factors` is equivalent to passing in `torch.tensor([lam_factors])` on the correct device If a 1D `torch.Tensor` for `lam_factors` will consider all `lam*lam_factors` options at each step. If a list of 1D `torch.Tensor`s are passed in for `lam_factors`, iterations will cycle through the list and then return to the start after exhausting the list.	`[[1 / 2, 1, 2]]`
`alpha_factors`	`Tensor`	Either a 1D `torch.Tensor` or list of 1d `torch.Tensor`. Passing in a `float` for `alpha_factors` is equivalent to passing in `torch.tensor([alpha_factors])` on the correct device If a 1D `torch.Tensor` for `alpha_factors` will consider all `lam*alpha_factors` options at each step. If a list of 1D `torch.Tensor`s are passed in for `alpha_factors`, iterations will cycle through the list and then return to the start after exhausting the list.	`[[1 / 2, 1, 2]]`
`vmap_chunk_size`	`int`	Parameter `chunksize` to pass to `torch.vmap`.	`None`
`jacmode`	`bool`	Choose between `torch.func.jacfwd` and `torch.func.jacrev` using options: `"fwd"`: Use `torch.func.jacfwd`. `"rev"`: Use `torch.func.jacrev`. `"auto"`: Choose between `torch.func.jacfwd` and `torch.func.jacrev` depending on the size of the inputs and outputs.	`'auto'`
`verbose`	`int`	Controls logging verbosity If `True`, perform logging. If a positive int, only log every verbose iterations. If `None`, set to a reasonable positive int based on the maximum number of iterations If `False`, don't log.	`False`
`verbose_indent`	`int`	Non-negative number of indentation spaces for logging.	`4`
`quantiles_losses`	`list`	Loss quantiles to record.	`[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100]`
`quantiles_lams`	`list`	\(\lambda\) quantiles to record.	`[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100]`
`quantiles_alphas`	`list`	\(\alpha\) quantiles to record.	`[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100]`
`verbose_quantiles_losses`	`list`	Loss quantiles to show in verbose log.	`[5, 25, 50, 75, 90]`
`verbose_quantiles_lams`	`list`	\(\lambda\) quantiles to show in verbose log.	`[5, 25, 50, 75, 90]`
`verbose_quantiles_alphas`	`list`	\(\alpha\) quantiles to show in verbose log.	`[5, 25, 50, 75, 90]`
`verbose_times`	`bool`	If `False`, do not show the times in the verbose log. This is mostly for testing where timing is not reproducible.	`True`
`warn`	`bool`	If `False`, then suppress warnings.	`True`
`store_data_iters`	`int`	Controls storage iterations with the same options as verbose. If `store_data_iters==0`, then the data is not collected or returned. If `True`, store every iteration. If a positive int, only store every `store_data_iters` iterations. If `None`, set to a reasonable positive int based on the maximum number of iterations If `False`, don't store data, and do not return data	`False`
`store_all_data`	`bool`	If `True`, store the `x` values as well as the metrics.	`False`

Returns:

Name	Type	Description
`theta`	`Tensor`	Optimized parameters.
`data`	`dict`	Iteration data, only returned when `store_data_iters>0`

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

Standard example

>>> x = torch.rand((10,4,),generator=rng)
>>> theta_true = torch.rand((4,),generator=rng)
>>> ytrue = torch.exp((x*theta_true).sum(-1)) # (10,)
>>> def f(theta):
...     yhat = torch.exp((x*theta[...,None,:]).sum(-1)) # (...,10)
...     return yhat
>>> theta_hat,data = lm_opt(
...     f = f, 
...     theta0 = torch.rand_like(theta_true,generator=rng),
...     ytrue = ytrue,
...     iters = 3,
...     batch_dims = 0,
...     verbose = True,
...     verbose_times = False,
...     store_data_iters = None,
...     store_all_data = True,
...     )
    iter i     | losses_quantiles                                          | lams_quantiles                                            | alphas_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 2.3e+01   | 2.3e+01   | 2.3e+01   | 2.3e+01   | 2.3e+01   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 3.4e+00   | 3.4e+00   | 3.4e+00   | 3.4e+00   | 3.4e+00   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   
    2          | 4.5e-02   | 4.5e-02   | 4.5e-02   | 4.5e-02   | 4.5e-02   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   
    3          | 4.8e-06   | 4.8e-06   | 4.8e-06   | 4.8e-06   | 4.8e-06   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   
>>> torch.allclose(theta_hat,theta_true,atol=5e-2)
True
>>> print_data_signatures(data)
    data['theta'].shape = (4,)
    data['iterrange'].shape = (4,)
    data['times'].shape = (4,)
    data['losses_quantiles']
        data['losses_quantiles']['0'].shape = (4,)
        data['losses_quantiles']['1'].shape = (4,)
        data['losses_quantiles']['5'].shape = (4,)
        data['losses_quantiles']['10'].shape = (4,)
        data['losses_quantiles']['25'].shape = (4,)
        data['losses_quantiles']['40'].shape = (4,)
        data['losses_quantiles']['50'].shape = (4,)
        data['losses_quantiles']['60'].shape = (4,)
        data['losses_quantiles']['75'].shape = (4,)
        data['losses_quantiles']['90'].shape = (4,)
        data['losses_quantiles']['95'].shape = (4,)
        data['losses_quantiles']['99'].shape = (4,)
        data['losses_quantiles']['100'].shape = (4,)
    data['lams_quantiles']
        data['lams_quantiles']['0'].shape = (4,)
        data['lams_quantiles']['1'].shape = (4,)
        data['lams_quantiles']['5'].shape = (4,)
        data['lams_quantiles']['10'].shape = (4,)
        data['lams_quantiles']['25'].shape = (4,)
        data['lams_quantiles']['40'].shape = (4,)
        data['lams_quantiles']['50'].shape = (4,)
        data['lams_quantiles']['60'].shape = (4,)
        data['lams_quantiles']['75'].shape = (4,)
        data['lams_quantiles']['90'].shape = (4,)
        data['lams_quantiles']['95'].shape = (4,)
        data['lams_quantiles']['99'].shape = (4,)
        data['lams_quantiles']['100'].shape = (4,)
    data['alphas_quantiles']
        data['alphas_quantiles']['0'].shape = (4,)
        data['alphas_quantiles']['1'].shape = (4,)
        data['alphas_quantiles']['5'].shape = (4,)
        data['alphas_quantiles']['10'].shape = (4,)
        data['alphas_quantiles']['25'].shape = (4,)
        data['alphas_quantiles']['40'].shape = (4,)
        data['alphas_quantiles']['50'].shape = (4,)
        data['alphas_quantiles']['60'].shape = (4,)
        data['alphas_quantiles']['75'].shape = (4,)
        data['alphas_quantiles']['90'].shape = (4,)
        data['alphas_quantiles']['95'].shape = (4,)
        data['alphas_quantiles']['99'].shape = (4,)
        data['alphas_quantiles']['100'].shape = (4,)
    data['thetas'].shape = (4, 4)
    data['losses'].shape = (4,)
    data['lams'].shape = (4,)
    data['alphas'].shape = (4,)

Batched example

>>> x = torch.rand((3,3,3,2,2),generator=rng)
>>> theta_true = torch.rand((4,4,2,2),generator=rng)
>>> ytrue = torch.exp((x*theta_true[...,None,None,None,:,:]).sum((-2,-1))) # (4,4,3,3,3)
>>> def f(theta):
...     yhat = torch.exp((x*theta[...,None,None,None,:,:]).sum((-2,-1))) # (...,3,3,3)
...     return yhat
>>> theta_hat,data = lm_opt(
...     f = f, 
...     theta0 = torch.rand_like(theta_true,generator=rng),
...     ytrue = ytrue,
...     iters = 2,
...     batch_dims = 2,
...     lam_factors = [torch.tensor([1/4,1/2,1,2,4])],
...     alpha_factors = [torch.tensor([2/3,1,3/2])],
...     verbose = True,
...     verbose_times = False,
...     store_data_iters = None,
...     store_all_data = True,
...     )
    iter i     | losses_quantiles                                          | lams_quantiles                                            | alphas_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 5.5e+00   | 2.5e+01   | 5.3e+01   | 1.3e+02   | 5.1e+02   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 7.4e-02   | 6.4e-01   | 1.1e+00   | 2.2e+00   | 7.7e+01   | 2.5e-07   | 2.5e-07   | 2.5e-07   | 2.5e-07   | 4.0e-06   | 6.7e-01   | 6.7e-01   | 1.0e+00   | 1.1e+00   | 1.5e+00   
    2          | 1.0e-05   | 6.9e-04   | 1.8e-03   | 3.9e-03   | 1.2e+00   | 6.2e-08   | 6.2e-08   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 6.7e-01   | 6.7e-01   | 1.0e+00   | 1.1e+00   | 1.5e+00   
>>> torch.allclose(theta_hat,theta_true,atol=5e-2)
False
>>> print_data_signatures(data)
    data['theta'].shape = (4, 4, 2, 2)
    data['iterrange'].shape = (3,)
    data['times'].shape = (3,)
    data['losses_quantiles']
        data['losses_quantiles']['0'].shape = (3,)
        data['losses_quantiles']['1'].shape = (3,)
        data['losses_quantiles']['5'].shape = (3,)
        data['losses_quantiles']['10'].shape = (3,)
        data['losses_quantiles']['25'].shape = (3,)
        data['losses_quantiles']['40'].shape = (3,)
        data['losses_quantiles']['50'].shape = (3,)
        data['losses_quantiles']['60'].shape = (3,)
        data['losses_quantiles']['75'].shape = (3,)
        data['losses_quantiles']['90'].shape = (3,)
        data['losses_quantiles']['95'].shape = (3,)
        data['losses_quantiles']['99'].shape = (3,)
        data['losses_quantiles']['100'].shape = (3,)
    data['lams_quantiles']
        data['lams_quantiles']['0'].shape = (3,)
        data['lams_quantiles']['1'].shape = (3,)
        data['lams_quantiles']['5'].shape = (3,)
        data['lams_quantiles']['10'].shape = (3,)
        data['lams_quantiles']['25'].shape = (3,)
        data['lams_quantiles']['40'].shape = (3,)
        data['lams_quantiles']['50'].shape = (3,)
        data['lams_quantiles']['60'].shape = (3,)
        data['lams_quantiles']['75'].shape = (3,)
        data['lams_quantiles']['90'].shape = (3,)
        data['lams_quantiles']['95'].shape = (3,)
        data['lams_quantiles']['99'].shape = (3,)
        data['lams_quantiles']['100'].shape = (3,)
    data['alphas_quantiles']
        data['alphas_quantiles']['0'].shape = (3,)
        data['alphas_quantiles']['1'].shape = (3,)
        data['alphas_quantiles']['5'].shape = (3,)
        data['alphas_quantiles']['10'].shape = (3,)
        data['alphas_quantiles']['25'].shape = (3,)
        data['alphas_quantiles']['40'].shape = (3,)
        data['alphas_quantiles']['50'].shape = (3,)
        data['alphas_quantiles']['60'].shape = (3,)
        data['alphas_quantiles']['75'].shape = (3,)
        data['alphas_quantiles']['90'].shape = (3,)
        data['alphas_quantiles']['95'].shape = (3,)
        data['alphas_quantiles']['99'].shape = (3,)
        data['alphas_quantiles']['100'].shape = (3,)
    data['thetas'].shape = (3, 4, 4, 2, 2)
    data['losses'].shape = (3, 4, 4)
    data['lams'].shape = (3, 4, 4)
    data['alphas'].shape = (3, 4, 4)

Source code in agsutil/algos.py

def lm_opt(
        f,
        theta0,
        ytrue,
        batch_dims = 0, 
        iters = 10,
        residtol = None,
        loss_mult = 1,
        loss_shift = 0,
        f_kwargs_vec = {},
        f_kwargs_no_vec = {},
        lam0 = 1e-6,
        alpha0 = 1e0,
        lam_factors = [[1/2,1,2]],
        alpha_factors = [[1/2,1,2]],
        vmap_chunk_size = None,
        jacmode = "auto",
        verbose = False, 
        verbose_indent = 4,
        quantiles_losses = [0,1,5,10,25,40,50,60,75,90,95,99,100],
        quantiles_lams =   [0,1,5,10,25,40,50,60,75,90,95,99,100],
        quantiles_alphas = [0,1,5,10,25,40,50,60,75,90,95,99,100],
        verbose_quantiles_losses = [5,25,50,75,90],
        verbose_quantiles_lams =   [5,25,50,75,90],
        verbose_quantiles_alphas = [5,25,50,75,90],
        verbose_times = True, 
        warn = True,
        store_data_iters = False,
        store_all_data = False, 
        ):
    r"""
    Levenberg--Marquardt optimization 

    Args:
        f (func): Residual function. 
        theta0 (torch.Tensor): Initial guess for parameters $\theta$. 
        ytrue (torch.Tensor): True `y` values, i.e. `f(theta_true)`. 
        batch_dims (int): Number of batch dimension. 
        iters (int): Number of iterations. 
        residtol (float): Non-negative tolerance on the maximum residual for early stopping, defaults to `1e-12` for `torch.float64` and `2.5e-4` for `torch.float32`.
        loss_mult (bool): Scalar amount by which to multiply the loss so `loss = loss_mult*torch.sum(resid**2,dim=-1)+loss_shift`.
        loss_shift (bool): Scalar amount by which to shift the loss so `loss = loss_mult*torch.sum(resid**2,dim=-1)+loss_shift`.
        f_kwargs_vec (dict): Keyword arguments to `f` which will be vectorized over the first dimension. 
        f_kwargs_no_vec (dict): Keyword arguments to `f` which will not be vectorized over the first dimension. 
        lam0 (float): Initial positive relaxation parameter $\lambda$.
        alpha0 (float): Initial positive step size $\alpha$.
        lam_factors (torch.Tensor): Either a 1D `torch.Tensor` or list of 1d `torch.Tensor`. 

            - Passing in a `float` for `lam_factors` is equivalent to passing in `torch.tensor([lam_factors])` on the correct device
            - If a 1D `torch.Tensor` for `lam_factors` will consider all `lam*lam_factors` options at each step. 
            - If a list of 1D `torch.Tensor`s are passed in for `lam_factors`, iterations will cycle through the list and then return to the start after exhausting the list.

        alpha_factors (torch.Tensor): Either a 1D `torch.Tensor` or list of 1d `torch.Tensor`. 

            - Passing in a `float` for `alpha_factors` is equivalent to passing in `torch.tensor([alpha_factors])` on the correct device
            - If a 1D `torch.Tensor` for `alpha_factors` will consider all `lam*alpha_factors` options at each step. 
            - If a list of 1D `torch.Tensor`s are passed in for `alpha_factors`, iterations will cycle through the list and then return to the start after exhausting the list. 

        vmap_chunk_size (int): Parameter `chunksize` to pass to `torch.vmap`.
        jacmode (bool): Choose between `torch.func.jacfwd` and `torch.func.jacrev` using options:

            - `"fwd"`: Use `torch.func.jacfwd`.
            - `"rev"`: Use `torch.func.jacrev`.
            - `"auto"`: Choose between `torch.func.jacfwd` and `torch.func.jacrev` depending on the size of the inputs and outputs. 

        verbose (int): Controls logging verbosity

            - If `True`, perform logging. 
            - If a positive int, only log every verbose iterations. 
            - If `None`, set to a reasonable positive int based on the maximum number of iterations
            - If `False`, don't log. 

        verbose_indent (int): Non-negative number of indentation spaces for logging.
        quantiles_losses (list): Loss quantiles to record.
        quantiles_lams (list): $\lambda$ quantiles to record.
        quantiles_alphas (list): $\alpha$ quantiles to record.
        verbose_quantiles_losses (list): Loss quantiles to show in verbose log.
        verbose_quantiles_lams (list): $\lambda$ quantiles to show in verbose log.
        verbose_quantiles_alphas (list): $\alpha$ quantiles to show in verbose log.
        verbose_times (bool): If `False`, do not show the times in the verbose log. This is mostly for testing where timing is not reproducible. 
        warn (bool): If `False`, then suppress warnings.
        store_data_iters (int): Controls storage iterations with the same options as verbose. If `store_data_iters==0`, then the data is not collected or returned. 

            - If `True`, store every iteration. 
            - If a positive int, only store every `store_data_iters` iterations. 
            - If `None`, set to a reasonable positive int based on the maximum number of iterations
            - If `False`, don't store data, and do not return data 

        store_all_data (bool): If `True`, store the `x` values as well as the metrics. 

    Returns:
        theta (torch.Tensor): Optimized parameters.
        data (dict): Iteration data, only returned when `store_data_iters>0`

    Examples:

        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)


    Standard example

        >>> x = torch.rand((10,4,),generator=rng)
        >>> theta_true = torch.rand((4,),generator=rng)
        >>> ytrue = torch.exp((x*theta_true).sum(-1)) # (10,)
        >>> def f(theta):
        ...     yhat = torch.exp((x*theta[...,None,:]).sum(-1)) # (...,10)
        ...     return yhat
        >>> theta_hat,data = lm_opt(
        ...     f = f, 
        ...     theta0 = torch.rand_like(theta_true,generator=rng),
        ...     ytrue = ytrue,
        ...     iters = 3,
        ...     batch_dims = 0,
        ...     verbose = True,
        ...     verbose_times = False,
        ...     store_data_iters = None,
        ...     store_all_data = True,
        ...     )
            iter i     | losses_quantiles                                          | lams_quantiles                                            | alphas_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 2.3e+01   | 2.3e+01   | 2.3e+01   | 2.3e+01   | 2.3e+01   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 3.4e+00   | 3.4e+00   | 3.4e+00   | 3.4e+00   | 3.4e+00   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   
            2          | 4.5e-02   | 4.5e-02   | 4.5e-02   | 4.5e-02   | 4.5e-02   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   
            3          | 4.8e-06   | 4.8e-06   | 4.8e-06   | 4.8e-06   | 4.8e-06   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-07   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   | 5.0e-01   
        >>> torch.allclose(theta_hat,theta_true,atol=5e-2)
        True
        >>> print_data_signatures(data)
            data['theta'].shape = (4,)
            data['iterrange'].shape = (4,)
            data['times'].shape = (4,)
            data['losses_quantiles']
                data['losses_quantiles']['0'].shape = (4,)
                data['losses_quantiles']['1'].shape = (4,)
                data['losses_quantiles']['5'].shape = (4,)
                data['losses_quantiles']['10'].shape = (4,)
                data['losses_quantiles']['25'].shape = (4,)
                data['losses_quantiles']['40'].shape = (4,)
                data['losses_quantiles']['50'].shape = (4,)
                data['losses_quantiles']['60'].shape = (4,)
                data['losses_quantiles']['75'].shape = (4,)
                data['losses_quantiles']['90'].shape = (4,)
                data['losses_quantiles']['95'].shape = (4,)
                data['losses_quantiles']['99'].shape = (4,)
                data['losses_quantiles']['100'].shape = (4,)
            data['lams_quantiles']
                data['lams_quantiles']['0'].shape = (4,)
                data['lams_quantiles']['1'].shape = (4,)
                data['lams_quantiles']['5'].shape = (4,)
                data['lams_quantiles']['10'].shape = (4,)
                data['lams_quantiles']['25'].shape = (4,)
                data['lams_quantiles']['40'].shape = (4,)
                data['lams_quantiles']['50'].shape = (4,)
                data['lams_quantiles']['60'].shape = (4,)
                data['lams_quantiles']['75'].shape = (4,)
                data['lams_quantiles']['90'].shape = (4,)
                data['lams_quantiles']['95'].shape = (4,)
                data['lams_quantiles']['99'].shape = (4,)
                data['lams_quantiles']['100'].shape = (4,)
            data['alphas_quantiles']
                data['alphas_quantiles']['0'].shape = (4,)
                data['alphas_quantiles']['1'].shape = (4,)
                data['alphas_quantiles']['5'].shape = (4,)
                data['alphas_quantiles']['10'].shape = (4,)
                data['alphas_quantiles']['25'].shape = (4,)
                data['alphas_quantiles']['40'].shape = (4,)
                data['alphas_quantiles']['50'].shape = (4,)
                data['alphas_quantiles']['60'].shape = (4,)
                data['alphas_quantiles']['75'].shape = (4,)
                data['alphas_quantiles']['90'].shape = (4,)
                data['alphas_quantiles']['95'].shape = (4,)
                data['alphas_quantiles']['99'].shape = (4,)
                data['alphas_quantiles']['100'].shape = (4,)
            data['thetas'].shape = (4, 4)
            data['losses'].shape = (4,)
            data['lams'].shape = (4,)
            data['alphas'].shape = (4,)

    Batched example 

        >>> x = torch.rand((3,3,3,2,2),generator=rng)
        >>> theta_true = torch.rand((4,4,2,2),generator=rng)
        >>> ytrue = torch.exp((x*theta_true[...,None,None,None,:,:]).sum((-2,-1))) # (4,4,3,3,3)
        >>> def f(theta):
        ...     yhat = torch.exp((x*theta[...,None,None,None,:,:]).sum((-2,-1))) # (...,3,3,3)
        ...     return yhat
        >>> theta_hat,data = lm_opt(
        ...     f = f, 
        ...     theta0 = torch.rand_like(theta_true,generator=rng),
        ...     ytrue = ytrue,
        ...     iters = 2,
        ...     batch_dims = 2,
        ...     lam_factors = [torch.tensor([1/4,1/2,1,2,4])],
        ...     alpha_factors = [torch.tensor([2/3,1,3/2])],
        ...     verbose = True,
        ...     verbose_times = False,
        ...     store_data_iters = None,
        ...     store_all_data = True,
        ...     )
            iter i     | losses_quantiles                                          | lams_quantiles                                            | alphas_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 5.5e+00   | 2.5e+01   | 5.3e+01   | 1.3e+02   | 5.1e+02   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 7.4e-02   | 6.4e-01   | 1.1e+00   | 2.2e+00   | 7.7e+01   | 2.5e-07   | 2.5e-07   | 2.5e-07   | 2.5e-07   | 4.0e-06   | 6.7e-01   | 6.7e-01   | 1.0e+00   | 1.1e+00   | 1.5e+00   
            2          | 1.0e-05   | 6.9e-04   | 1.8e-03   | 3.9e-03   | 1.2e+00   | 6.2e-08   | 6.2e-08   | 1.0e-06   | 1.0e-06   | 1.0e-06   | 6.7e-01   | 6.7e-01   | 1.0e+00   | 1.1e+00   | 1.5e+00   
        >>> torch.allclose(theta_hat,theta_true,atol=5e-2)
        False
        >>> print_data_signatures(data)
            data['theta'].shape = (4, 4, 2, 2)
            data['iterrange'].shape = (3,)
            data['times'].shape = (3,)
            data['losses_quantiles']
                data['losses_quantiles']['0'].shape = (3,)
                data['losses_quantiles']['1'].shape = (3,)
                data['losses_quantiles']['5'].shape = (3,)
                data['losses_quantiles']['10'].shape = (3,)
                data['losses_quantiles']['25'].shape = (3,)
                data['losses_quantiles']['40'].shape = (3,)
                data['losses_quantiles']['50'].shape = (3,)
                data['losses_quantiles']['60'].shape = (3,)
                data['losses_quantiles']['75'].shape = (3,)
                data['losses_quantiles']['90'].shape = (3,)
                data['losses_quantiles']['95'].shape = (3,)
                data['losses_quantiles']['99'].shape = (3,)
                data['losses_quantiles']['100'].shape = (3,)
            data['lams_quantiles']
                data['lams_quantiles']['0'].shape = (3,)
                data['lams_quantiles']['1'].shape = (3,)
                data['lams_quantiles']['5'].shape = (3,)
                data['lams_quantiles']['10'].shape = (3,)
                data['lams_quantiles']['25'].shape = (3,)
                data['lams_quantiles']['40'].shape = (3,)
                data['lams_quantiles']['50'].shape = (3,)
                data['lams_quantiles']['60'].shape = (3,)
                data['lams_quantiles']['75'].shape = (3,)
                data['lams_quantiles']['90'].shape = (3,)
                data['lams_quantiles']['95'].shape = (3,)
                data['lams_quantiles']['99'].shape = (3,)
                data['lams_quantiles']['100'].shape = (3,)
            data['alphas_quantiles']
                data['alphas_quantiles']['0'].shape = (3,)
                data['alphas_quantiles']['1'].shape = (3,)
                data['alphas_quantiles']['5'].shape = (3,)
                data['alphas_quantiles']['10'].shape = (3,)
                data['alphas_quantiles']['25'].shape = (3,)
                data['alphas_quantiles']['40'].shape = (3,)
                data['alphas_quantiles']['50'].shape = (3,)
                data['alphas_quantiles']['60'].shape = (3,)
                data['alphas_quantiles']['75'].shape = (3,)
                data['alphas_quantiles']['90'].shape = (3,)
                data['alphas_quantiles']['95'].shape = (3,)
                data['alphas_quantiles']['99'].shape = (3,)
                data['alphas_quantiles']['100'].shape = (3,)
            data['thetas'].shape = (3, 4, 4, 2, 2)
            data['losses'].shape = (3, 4, 4)
            data['lams'].shape = (3, 4, 4)
            data['alphas'].shape = (3, 4, 4)
    """
    if warn and (not torch.get_default_dtype()==torch.float64): warnings.warn('''
            torch.get_default_dtype() = %s, but lm_opt often requires high precision updates. We recommend using:
                torch.set_default_dtype(torch.float64)'''%str(torch.get_default_dtype()))
    assert torch.get_default_dtype() in [torch.float32,torch.float64]
    default_dtype = torch.get_default_dtype()
    device = str(theta0.device)
    default_device = str(torch.get_default_device())
    assert iters%1==0, "iters should be an int"
    assert iters>=0
    assert callable(f) 
    assert batch_dims>=0
    assert isinstance(theta0,torch.Tensor)
    batch_shape = tuple(theta0.shape[:batch_dims])
    R = int(torch.tensor(batch_shape).prod())
    nonbatch_theta_dims = theta0.ndim-batch_dims
    nonbatch_theta_shape = tuple(theta0.shape[batch_dims:])
    nonbatch_y_dims = ytrue.ndim-batch_dims
    nonbatch_y_shape = tuple(ytrue.shape[batch_dims:])
    K = int(torch.tensor(nonbatch_y_shape).prod())
    T = int(torch.tensor(nonbatch_theta_shape).prod())
    if batch_dims==0:
        theta = theta0[None,...]
    else: # batch_dims>0:
        theta = theta0.flatten(end_dim=batch_dims-1)
    if batch_dims==0:
        ytrue = ytrue[None,...]
    else: # batch_dims>0:
        ytrue = ytrue.flatten(end_dim=batch_dims-1)
    assert isinstance(f_kwargs_vec,dict)
    assert isinstance(f_kwargs_no_vec,dict)
    f_kwargs_vec_names = list(f_kwargs_vec.keys())
    f_kwargs_vec_vals = []
    for key in f_kwargs_vec_names:
        assert f_kwargs_vec[key].shape[:batch_dims]==batch_shape, "f_kwargs_vec['%s'].shape[:%d] = %s but bs = %s"%(key,batch_dims,f_kwargs_vec[key].shape[:batch_dims])
        if batch_dims==0:
            f_kwargs_vec_vals.append(f_kwargs_vec[key][None,...])
        else: # batch_dims>0
            f_kwargs_vec_vals.append(f_kwargs_vec[key].flatten(end_dim=batch_dims-1))
    f_kwargs_vec_names = ["ytrue"]+f_kwargs_vec_names
    f_kwargs_vec_vals = [ytrue]+f_kwargs_vec_vals
    if verbose is None: 
        verbose = max(1,iters//20)
    assert verbose%1==0
    assert verbose>=0 
    if store_data_iters is None: 
        store_data_iters = max(1,iters//1000)
    assert store_data_iters%1==0
    assert store_data_iters>=0 
    assert isinstance(store_all_data,bool)
    loss_mult = float(loss_mult)
    assert loss_mult!=0 
    signminimize = -1 if loss_mult<0 else 1
    loss_shift = float(loss_shift)
    if residtol is None: 
        if default_dtype==torch.float64:
            residtol = 1e-12
        elif default_dtype==torch.float32:
            residtol = 2.5e-4
        else:
            raise Exception("default_dtype = %s not parsed"%str(default_dtype))
    assert residtol>=0
    assert lam0>0
    assert alpha0>0
    if np.isscalar(lam_factors):
        lam_factors = [torch.tensor([lam_factors],device=device)]
    elif isinstance(lam_factors,torch.Tensor):
        lam_factors = [lam_factors.to(device)]
    lam_factors = [torch.tensor(list(lam_factors[i])).to(device) for i in range(len(lam_factors))]
    assert isinstance(lam_factors,list)
    assert all(isinstance(lam_factors[i],torch.Tensor) for i in range(len(lam_factors)))
    assert all(lam_factors[i].ndim==1 for i in range(len(lam_factors)))
    if np.isscalar(alpha_factors):
        alpha_factors = [torch.tensor([alpha_factors],device=device)]
    elif isinstance(alpha_factors,torch.Tensor):
        alpha_factors = [alpha_factors.to(device)]
    alpha_factors = [torch.tensor(list(alpha_factors[i])).to(device) for i in range(len(alpha_factors))]
    assert isinstance(alpha_factors,list)
    assert all(isinstance(alpha_factors[i],torch.Tensor) for i in range(len(alpha_factors)))
    assert all(alpha_factors[i].ndim==1 for i in range(len(alpha_factors)))
    if len(alpha_factors)==1:
        alpha_factors = alpha_factors*len(lam_factors)
    if len(lam_factors)==1:
        lam_factors = lam_factors*len(alpha_factors)
    assert len(lam_factors)==len(alpha_factors)
    assert isinstance(quantiles_losses,list)
    assert all(0<=qt<=100 for qt in quantiles_losses)
    assert isinstance(quantiles_lams,list)
    assert all(0<=qt<=100 for qt in quantiles_lams)
    assert isinstance(quantiles_alphas,list)
    assert all(0<=qt<=100 for qt in quantiles_alphas)
    assert isinstance(verbose_quantiles_losses,list)
    assert all(qt in quantiles_losses for qt in verbose_quantiles_losses)
    assert isinstance(verbose_quantiles_lams,list)
    assert all(qt in quantiles_lams for qt in verbose_quantiles_lams)
    assert isinstance(verbose_quantiles_alphas,list)
    assert all(qt in quantiles_alphas for qt in verbose_quantiles_alphas)
    assert verbose_indent%1==0 
    assert verbose_indent>=0
    assert isinstance(verbose_times,bool)
    if store_data_iters:
        iterrange = []
        times = []
        losses = []
        losses_quantiles = {str(qt):[] for qt in quantiles_losses}
        lams_quantiles = {str(qt):[] for qt in quantiles_lams}
        alphas_quantiles = {str(qt):[] for qt in quantiles_alphas}
        if store_all_data:
            thetas = []
            lams = []
            alphas = []
    def f_resid(theta, *f_kwargs_vec_vals):
        assert len(f_kwargs_vec_vals)==len(f_kwargs_vec_names)
        ytrue = f_kwargs_vec_vals[0]
        f_kwargs_vec = {f_kwargs_vec_names[i]:f_kwargs_vec_vals[i] for i in range(1,len(f_kwargs_vec_names))}
        all_args = f(theta,**f_kwargs_vec,**f_kwargs_no_vec)
        assert isinstance(all_args,tuple) or isinstance(all_args,torch.Tensor), "f must return a tuple or torch.Tensor"
        if isinstance(all_args,torch.Tensor):
            yhat = all_args
            args = ()
        else: #  isinstance(all_args,tuple)
            assert all(isinstance(arg,torch.Tensor) for arg in all_args)
            yhat = all_args[0]
            args = all_args[1:]
        resid = yhat-ytrue
        return resid,(resid,yhat,*args)
    assert jacmode in ["auto","fwd","rev"]
    if jacmode=="auto":
        if T<=K:
            jacfunc = torch.func.jacfwd
        else:
            jacfunc = torch.func.jacrev
    elif jacmode=="fwd":
        jacfunc = torch.func.jacfwd
        if warn and T>K: warnings.warn('''
            For T the number of inputs and K the number of outputs:
                torch.func.jacfwd performs best when T << K. 
                torch.func.jacrev performs best when K << T.
            You are using jacmode=='fwd' but T = %d > %d = K. 
            Try using jacmode=='rev' by setting jacfwd = False.'''%(T,K))
    elif jacmode=="rev":
        jacfunc = torch.func.jacrev
        if warn and T<K: warnings.warn('''
            For T the number of inputs and K the number of outputs:
                torch.func.jacfwd performs best when T << K. 
                torch.func.jacrev performs best when K << T.
            You are using jacmode=='rev' but T = %d < %d = K. 
            Try using jacmode=='fwd' for better performance.'''%(T,K))
    jac_ftilde = jacfunc(f_resid,argnums=(0,),has_aux=True)
    vjac_ftilde = torch.func.vmap(jac_ftilde,in_dims=(0,)+(0,)*len(f_kwargs_vec_names),chunk_size=vmap_chunk_size)
    eyeT = torch.eye(T,device=device)
    Rrange = torch.arange(R,device=device)
    lam = lam0*torch.ones(R,device=device)
    alpha = alpha0*torch.ones(R,device=device)
    if verbose:
        _h_iter = "%-10s "%"iter i"
        _h_times = "| %-10s"%"times" if verbose_times else ""
        _s_losses_qt = ("| %-9s "*len(verbose_quantiles_losses))%tuple(str(qt) for qt in verbose_quantiles_losses)
        _s_lams_qt = ("| %-9s "*len(verbose_quantiles_lams))%tuple(str(qt) for qt in verbose_quantiles_lams)
        _s_alphas_qt = ("| %-9s "*len(verbose_quantiles_alphas))%tuple(str(qt) for qt in verbose_quantiles_alphas)
        _h_losses_qt = "| losses_quantiles"+" "*(len(_s_losses_qt)-len("| losses_quantiles"))
        _h_lams_qt   = "| lams_quantiles"  +" "*(len(_s_lams_qt)  -len("| lams_quantiles"))
        _h_alphas_qt = "| alphas_quantiles"+" "*(len(_s_alphas_qt)-len("| alphas_quantiles"))
        _h = _h_iter+_h_losses_qt+_h_lams_qt+_h_alphas_qt+_h_times
        _s = " "*len(_h_iter)+_s_losses_qt+_s_lams_qt+_s_alphas_qt+("|"+" "*(len(_h_times)-1) if verbose_times else " "*len(_h_times))
        print(" "*verbose_indent+_h)
        print(" "*verbose_indent+_s)
        print(" "*verbose_indent+"~"*len(_s))
    timer = Timer(device=device)
    timer.tic()
    for i in range(iters+1):
        if i==iters:
            _,(resid,yhat,*args) = f_resid(theta,*f_kwargs_vec_vals)
        else:
            (Jfull,),(resid,yhat,*args) = vjac_ftilde(theta,*f_kwargs_vec_vals)
        assert Jfull.shape==(R,*nonbatch_y_shape,*nonbatch_theta_shape)
        breakcond = i==iters or resid.abs().amax()<=residtol
        loss = loss_mult*(resid**2).flatten(start_dim=1).sum(-1)+loss_shift
        times_i = timer.toc()
        losses_quantiles_i = {str(qt): loss.nanquantile(qt/100) for qt in quantiles_losses}
        lams_quantiles_i = {str(qt): lam.nanquantile(qt/100) for qt in quantiles_lams}
        alphas_quantiles_i = {str(qt): alpha.nanquantile(qt/100) for qt in quantiles_alphas}
        if store_data_iters and (i%store_data_iters==0 or breakcond):
            iterrange.append(i)
            losses.append(loss.reshape(batch_shape).to(default_device))
            times.append(times_i)
            for qt in quantiles_losses:
                losses_quantiles[str(qt)].append(losses_quantiles_i[str(qt)].to(default_device))
            for qt in quantiles_lams:
                lams_quantiles[str(qt)].append(lams_quantiles_i[str(qt)].to(default_device))
            for qt in quantiles_alphas:
                alphas_quantiles[str(qt)].append(alphas_quantiles_i[str(qt)].to(default_device))
            if store_all_data:
                thetas.append(theta.reshape(*batch_shape,*nonbatch_theta_shape).to(default_device))
                lams.append(lam.reshape(batch_shape).to(default_device))
                alphas.append(alpha.reshape(batch_shape).to(default_device))
        if verbose and (i%verbose==0 or i==iters):
            _s_iter = "%-10d "%i
            _s_losses_qt = ("| %-9.1e "*len(verbose_quantiles_losses))%tuple(losses_quantiles_i[str(qt)] for qt in verbose_quantiles_losses)
            _s_lams_qt = ("| %-9.1e "*len(verbose_quantiles_lams))%tuple(lams_quantiles_i[str(qt)] for qt in verbose_quantiles_lams)
            _s_alphas_qt = ("| %-9.1e "*len(verbose_quantiles_alphas))%tuple(alphas_quantiles_i[str(qt)] for qt in verbose_quantiles_alphas)
            _s_times = "| %-10.1f "%(times_i) if verbose_times else ""
            print(" "*verbose_indent+_s_iter+_s_losses_qt+_s_lams_qt+_s_alphas_qt+_s_times)
        if breakcond: break
        J = Jfull.reshape((R,K,T))
        residf = resid.reshape((R,K))
        gamma = torch.einsum("rij,ri->rj",J,residf)
        JtJ = torch.einsum("rij,ril->rjl",J,J) # (R,T,T)
        lam_factors_i = lam_factors[i%len(lam_factors)]
        alpha_factors_i = alpha_factors[i%len(alpha_factors)]
        Q_lams = len(lam_factors_i)
        Q_alphas = len(alpha_factors_i)
        lams_try = lam_factors_i[:,None]*lam # (Q_lams,R)
        alphas_try = alpha_factors_i[:,None]*alpha # (Q_alphas,R)
        JtJplam = JtJ[None,:,:,:]+lams_try[:,:,None,None]*eyeT # (Q_lams,R,T,T)
        L,fails = torch.linalg.cholesky_ex(JtJplam,upper=False) # L.shape==(Q_lams,R,T,T) and fails.shape==(Q_lams,R)
        success = ~fails.to(bool) # (Q_lams,R)
        deltaf = torch.nan*torch.ones((Q_lams,R,T),device=device)
        gammas = torch.ones((Q_lams,1,1),device=device)*gamma[None,:,:] # (Q_lams,R,T)
        deltaf[success] = torch.linalg.solve_triangular(L[success].transpose(dim0=-2,dim1=-1),torch.linalg.solve_triangular(L[success],gammas[success].unsqueeze(-1),upper=False),upper=True)[...,0] # (Q_lam,R,T)
        thetasf = torch.ones((Q_lams,1,1),device=device)*theta.reshape((1,R,T)) # (Q_lam,R,T)
        thetasf_new = torch.nan*torch.ones((Q_alphas,Q_lams,R,T),device=device)
        thetasf_new[:,success] = thetasf[success]-signminimize*alpha_factors_i[:,None,None]*deltaf[success]
        thetas_new = thetasf_new.reshape((Q_alphas,Q_lams,R,*nonbatch_theta_shape))
        f_kwargs_vec_vals_success = [(torch.ones((Q_alphas,Q_lams)+(1,)*f_kwargs_vec_vals[l].ndim,device=device)*f_kwargs_vec_vals[l][None,None,...])[:,success] for l in range(len(f_kwargs_vec_vals))]
        residf_new = torch.inf*torch.ones((Q_alphas,Q_lams,R,K),device=device)
        _,(resid_new_success,*_) = f_resid(thetas_new[:,success],*f_kwargs_vec_vals_success)
        residf_new[:,success] = resid_new_success.reshape((Q_alphas,resid_new_success.size(1),K))
        losses_new = loss_mult*(residf_new**2).sum(-1)+loss_shift
        ibest = losses_new.reshape((Q_alphas*Q_lams,R)).argmin(0) # (R,)
        ibest_alpha,ibest_lam = ibest//Q_lams,ibest%Q_lams
        lam_best_new = lams_try[ibest_lam,Rrange] # (R,)
        alpha_best_new = alphas_try[ibest_alpha,Rrange] # (R,)
        loss_best_new = losses_new[ibest_alpha,ibest_lam,Rrange] # (R,)
        thetas_best_new = thetas_new[ibest_alpha,ibest_lam,Rrange] # (R,*nonbatch_theta_shape)
        improved = loss_best_new<loss # (R,)
        lam[improved] = lam_best_new[improved]
        alpha[improved] = alpha_best_new[improved]
        theta[improved] = thetas_best_new[improved]
        lam[~improved] = lams_try[:,~improved].amax(0)
        alpha[~improved] = alphas_try[:,~improved].amin(0)
    theta = theta.reshape((*batch_shape,*nonbatch_theta_shape))
    if batch_shape==():
        args = [arg[0] for arg in args]
    if store_data_iters==0:
        if args==[]:
            return theta
        else:
            return theta,*args
    else:
        data = {
            "theta": theta.to(default_device), 
            "iterrange": torch.tensor(iterrange,dtype=int), 
            "times": torch.tensor(times), 
            "losses_quantiles": {str(qt):torch.tensor(losses_quantiles[str(qt)]) for qt in quantiles_losses},
            "lams_quantiles": {str(qt):torch.tensor(lams_quantiles[str(qt)]) for qt in quantiles_lams},
            "alphas_quantiles": {str(qt):torch.tensor(alphas_quantiles[str(qt)]) for qt in quantiles_alphas},
            }
        if store_all_data:
            data["thetas"] = torch.stack(thetas,dim=0)
            data["losses"] = torch.stack(losses,dim=0)
            data["lams"] = torch.stack(lams,dim=0)
            data["alphas"] = torch.stack(alphas,dim=0)
        if args==[]:
            return theta,data
        else:
            return theta,*args,data

minres

minres(A, B, X0=None, iters=None, residtol=None, verbose=False, verbose_indent=4, quantiles_losses=[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100], verbose_quantiles_losses=[5, 25, 50, 75, 90], verbose_times=True, warn=True, store_data_iters=False, store_all_data=False)

MINRES algorithm for solving linear systems \(AX=B\) where \(A\) is real-symmetric or complex-Hermitian

A translation of scipy.sparse.linalg.minres.

Parameters:

Name	Type	Description	Default
`A`	`Union[Tensor, callable]`	Symmetric matrix `A` with shape `(...,n,n)`, or `callable(A)` where `a(x)` should return the batch matrix multiplication of `A` and `X`,	required
`B`	`Tensor`	Right hand side tensor \(B\) with shape `(...,n,k)`	required
`X0`	`Tensor`	Initial guess for \(X\) with shape `(...,n,k)`, defaults to zeros.	`None`
`iters`	`int`	number of minres iterations, defaults to `5n`.	`None`
`residtol`	`float`	Non-negative tolerance on the maximum residual for early stopping, defaults to `1e-12` for `torch.float64` and `2.5e-4` for `torch.float32`.	`None`
`verbose`	`int`	Controls logging verbosity If `True`, perform logging. If a positive int, only log every `verbose` iterations. If `None`, set to a reasonable positive int based on the maximum number of iterations If `False`, don't log.	`False`
`verbose_indent`	`int`	Non-negative number of indentation spaces for logging.	`4`
`quantiles_losses`	`list`	Loss quantiles to record.	`[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100]`
`verbose_quantiles_losses`	`list`	Loss quantiles to show in verbose log.	`[5, 25, 50, 75, 90]`
`verbose_times`	`bool`	If `False`, do not show the times in the verbose log. This is mostly for testing where timing is not reproducible.	`True`
`warn`	`bool`	If `False`, then suppress warnings.	`True`
`store_data_iters`	`int`	Controls storage iterations with the same options as verbose. If `store_data_iters==0`, then the data is not collected or returned. If `True`, store every iteration. If a positive int, only store every `store_data_iters` iterations. If `None`, set to a reasonable positive int based on the maximum number of iterations If `False`, don't store data, and do not return data	`False`
`store_all_data`	`bool`	If `True`, store the `x` values as well as the metrics.	`False`

Returns:

Name	Type	Description
`x`	`Tensor`	Optimized \(X\).
`data`	`dict`	Iteration data, only returned when `store_data_iters>0`

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

Real-symmetric example with column vector \(b\)

>>> n = 5
>>> A = torch.randn(n,n,generator=rng)
>>> A = (A+A.T)/2
>>> b = torch.rand(n,generator=rng)
>>> x_true = torch.linalg.solve(A,b[...,None])[...,0]
>>> x_true
tensor([-0.1402,  0.4565,  0.2920,  0.2470,  0.3251])
>>> torch.allclose(A@x_true-b,torch.zeros_like(b))
True
>>> x_minres = minres(A,b[...,None],verbose=None,verbose_times=False)[...,0]
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 6.1e-01   | 6.1e-01   | 6.1e-01   | 6.1e-01   | 6.1e-01   
    2          | 2.1e-01   | 2.1e-01   | 2.1e-01   | 2.1e-01   | 2.1e-01   
    3          | 1.9e-01   | 1.9e-01   | 1.9e-01   | 1.9e-01   | 1.9e-01   
    4          | 7.3e-02   | 7.3e-02   | 7.3e-02   | 7.3e-02   | 7.3e-02   
    5          | 5.4e-16   | 5.4e-16   | 5.4e-16   | 5.4e-16   | 5.4e-16   
>>> torch.allclose(x_minres,x_true)
True

Complex-Hermitian example with column vector \(b\)

>>> n = 5
>>> A = torch.randn(n,n,dtype=torch.complex128,generator=rng)
>>> A = (A+A.adjoint())/2
>>> b = torch.rand(n,dtype=torch.complex128,generator=rng)
>>> x_true = torch.linalg.solve(A,b[...,None])[...,0]
>>> x_true
tensor([ 0.2207+0.2879j,  0.0928-0.0057j,  0.2681+1.3488j, -1.2520-0.4214j,
        -0.8860-0.6922j])
>>> torch.allclose(A@x_true-b,torch.zeros_like(b))
True
>>> x_minres = minres(A,b[...,None],verbose=None,verbose_times=False)[...,0]
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 8.7e-01   | 8.7e-01   | 8.7e-01   | 8.7e-01   | 8.7e-01   
    2          | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   
    3          | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   
    4          | 2.6e-01   | 2.6e-01   | 2.6e-01   | 2.6e-01   | 2.6e-01   
    5          | 3.0e-15   | 3.0e-15   | 3.0e-15   | 3.0e-15   | 3.0e-15   
>>> torch.allclose(x_minres,x_true)
True

Matrix \(B\)

>>> n = 5
>>> k = 3
>>> A = torch.randn(n,n,generator=rng)
>>> A = (A+A.T)/2
>>> B = torch.rand(n,k,generator=rng)
>>> X_true = torch.linalg.solve(A,B)
>>> X_true
tensor([[ 0.8801, -0.0116,  0.4805],
        [-1.1095, -1.6166, -0.7103],
        [-2.9918, -1.9201, -3.5855],
        [-4.1777, -3.6586, -5.1658],
        [ 1.5417,  0.9814,  1.3790]])
>>> torch.allclose(A@X_true-B,torch.zeros_like(B))
True
>>> X_minres = minres(A,B,verbose=None,verbose_times=False)
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 6.6e-01   | 7.5e-01   | 8.7e-01   | 8.8e-01   | 8.9e-01   
    2          | 6.2e-01   | 6.4e-01   | 6.6e-01   | 7.3e-01   | 7.6e-01   
    3          | 3.5e-01   | 4.7e-01   | 6.1e-01   | 6.3e-01   | 6.4e-01   
    4          | 1.5e-01   | 1.8e-01   | 2.2e-01   | 3.4e-01   | 4.1e-01   
    5          | 7.0e-15   | 9.1e-15   | 1.2e-14   | 2.2e-14   | 2.8e-14   
>>> torch.allclose(X_minres,X_true)
True

Tri-diagonal \(A\) with storage-saving multiplication function

>>> n = 5
>>> k = 3
>>> A_diag = torch.randn(n,generator=rng)
>>> A_off_diag = torch.randn(n-1,generator=rng) 
>>> A = torch.zeros(n,n)
>>> A[torch.arange(n),torch.arange(n)] = A_diag 
>>> A[torch.arange(n-1),torch.arange(1,n)] = A_off_diag
>>> A[torch.arange(1,n),torch.arange(n-1)] = A_off_diag
>>> A
tensor([[-0.2728, -0.1545,  0.0000,  0.0000,  0.0000],
        [-0.1545, -0.0275, -0.0120,  0.0000,  0.0000],
        [ 0.0000, -0.0120, -0.4436,  0.2802,  0.0000],
        [ 0.0000,  0.0000,  0.2802, -0.7303,  0.9724],
        [ 0.0000,  0.0000,  0.0000,  0.9724, -0.4180]])
>>> B = torch.rand(n,k,generator=rng)
>>> X_true = torch.linalg.solve(A,B)
>>> X_true
tensor([[-5.5996, -3.1344, -6.0989],
        [ 6.5450, -0.5211,  5.0505],
        [-1.8303, -0.9818, -0.4647],
        [ 0.5862,  0.9233,  1.2342],
        [ 1.3614,  1.4523,  1.9166]])
>>> torch.allclose(A@X_true-B,torch.zeros_like(B))
True
>>> def A_mult(x):
...     y = x*A_diag[:,None]
...     y[1:,:] += x[:-1,:]*A_off_diag[:,None]
...     y[:-1,:] += x[1:,:]*A_off_diag[:,None]
...     return y
>>> torch.allclose(A_mult(X_true),A@X_true)
True
>>> X_minres = minres(A_mult,B,verbose=None,verbose_times=False)
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 7.5e-01   | 8.1e-01   | 8.8e-01   | 9.3e-01   | 9.6e-01   
    2          | 4.5e-01   | 5.6e-01   | 7.1e-01   | 8.0e-01   | 8.5e-01   
    3          | 1.5e-01   | 1.9e-01   | 2.4e-01   | 2.9e-01   | 3.2e-01   
    4          | 5.7e-02   | 1.4e-01   | 2.4e-01   | 2.9e-01   | 3.2e-01   
    5          | 4.0e-15   | 5.4e-15   | 7.1e-15   | 1.9e-14   | 2.6e-14   
>>> torch.allclose(X_minres,X_true)
True

Batched tri-diagonal \(A\) with storage-saving multiplication function

>>> n = 100
>>> k = 3
>>> A_diag = torch.randn(2,1,4,n,generator=rng)
>>> A_off_diag = torch.randn(2,1,4,n-1,generator=rng) 
>>> A = torch.zeros(2,1,4,n,n)
>>> A[...,torch.arange(n),torch.arange(n)] = A_diag 
>>> A[...,torch.arange(n-1),torch.arange(1,n)] = A_off_diag
>>> A[...,torch.arange(1,n),torch.arange(n-1)] = A_off_diag
>>> B = torch.rand(2,6,1,n,k,generator=rng)
>>> X_true = torch.linalg.solve(A,B)
>>> torch.allclose(torch.einsum("...ij,...jk->...ik",A,X_true)-B,torch.zeros_like(B))
True
>>> def A_mult(x):
...     y = x*A_diag[...,:,None]
...     y[...,1:,:] += x[...,:-1,:]*A_off_diag[...,:,None]
...     y[...,:-1,:] += x[...,1:,:]*A_off_diag[...,:,None]
...     return y
>>> torch.allclose(A_mult(X_true),torch.einsum("...ij,...jk->...ik",A,X_true))
True
>>> X_minres,data = minres(A_mult,B,verbose=None,verbose_times=False,store_data_iters=None,store_all_data=True)
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    25         | 1.8e-01   | 2.3e-01   | 2.6e-01   | 2.9e-01   | 3.2e-01   
    50         | 9.4e-02   | 1.2e-01   | 1.5e-01   | 1.9e-01   | 2.1e-01   
    75         | 3.3e-02   | 5.1e-02   | 8.2e-02   | 1.5e-01   | 1.8e-01   
    ...   
>>> X_minres.shape
torch.Size([2, 6, 4, 100, 3])
>>> torch.allclose(X_minres,X_true)
True
>>> print_data_signatures(data)
    data['x'].shape = (2, 6, 4, 100, 3)
    data['iterrange'].shape = (177,)
    data['times'].shape = (177,)
    data['losses_quantiles']
        data['losses_quantiles']['0'].shape = (177,)
        data['losses_quantiles']['1'].shape = (177,)
        data['losses_quantiles']['5'].shape = (177,)
        data['losses_quantiles']['10'].shape = (177,)
        data['losses_quantiles']['25'].shape = (177,)
        data['losses_quantiles']['40'].shape = (177,)
        data['losses_quantiles']['50'].shape = (177,)
        data['losses_quantiles']['60'].shape = (177,)
        data['losses_quantiles']['75'].shape = (177,)
        data['losses_quantiles']['90'].shape = (177,)
        data['losses_quantiles']['95'].shape = (177,)
        data['losses_quantiles']['99'].shape = (177,)
        data['losses_quantiles']['100'].shape = (177,)
    data['xs'].shape = (177, 2, 6, 4, 100, 3)
    data['losses'].shape = (177, 2, 6, 4, 3)

Source code in agsutil/algos.py

def minres(
        A,
        B,
        X0 = None,
        iters = None,
        residtol = None,
        verbose = False, 
        verbose_indent = 4,
        quantiles_losses = [0,1,5,10,25,40,50,60,75,90,95,99,100],
        verbose_quantiles_losses = [5,25,50,75,90],
        verbose_times = True, 
        warn = True,
        store_data_iters = False, 
        store_all_data = False,
        ):
    r"""
    [MINRES algorithm](https://en.wikipedia.org/wiki/Minimal_residual_method) for solving linear systems $AX=B$ where $A$ is real-symmetric or complex-Hermitian

    A translation of [`scipy.sparse.linalg.minres`](https://github.com/scipy/scipy/blob/v1.17.0/scipy/sparse/linalg/_isolve/minres.py).

    Args:
        A (Union[torch.Tensor,callable]): Symmetric matrix `A` with shape `(...,n,n)`, or  
            `callable(A)` where `a(x)` should return the batch matrix multiplication of `A` and `X`,  
        B (torch.Tensor): Right hand side tensor $B$ with shape `(...,n,k)`
        X0 (torch.Tensor): Initial guess for $X$ with shape `(...,n,k)`, defaults to zeros. 
        iters (int): number of minres iterations, defaults to `5n`. 
        residtol (float): Non-negative tolerance on the maximum residual for early stopping, defaults to `1e-12` for `torch.float64` and `2.5e-4` for `torch.float32`.
        verbose (int): Controls logging verbosity

            - If `True`, perform logging. 
            - If a positive int, only log every `verbose` iterations. 
            - If `None`, set to a reasonable positive int based on the maximum number of iterations
            - If `False`, don't log. 

        verbose_indent (int): Non-negative number of indentation spaces for logging.
        quantiles_losses (list): Loss quantiles to record.
        verbose_quantiles_losses (list): Loss quantiles to show in verbose log.
        verbose_times (bool): If `False`, do not show the times in the verbose log. This is mostly for testing where timing is not reproducible. 
        warn (bool): If `False`, then suppress warnings.
        store_data_iters (int): Controls storage iterations with the same options as verbose. If `store_data_iters==0`, then the data is not collected or returned. 

            - If `True`, store every iteration. 
            - If a positive int, only store every `store_data_iters` iterations. 
            - If `None`, set to a reasonable positive int based on the maximum number of iterations
            - If `False`, don't store data, and do not return data 

        store_all_data (bool): If `True`, store the `x` values as well as the metrics. 

    Returns:
        x (torch.Tensor): Optimized $X$.
        data (dict): Iteration data, only returned when `store_data_iters>0`

    Examples:

        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

    Real-symmetric example with column vector $b$ 

        >>> n = 5
        >>> A = torch.randn(n,n,generator=rng)
        >>> A = (A+A.T)/2
        >>> b = torch.rand(n,generator=rng)
        >>> x_true = torch.linalg.solve(A,b[...,None])[...,0]
        >>> x_true
        tensor([-0.1402,  0.4565,  0.2920,  0.2470,  0.3251])
        >>> torch.allclose(A@x_true-b,torch.zeros_like(b))
        True
        >>> x_minres = minres(A,b[...,None],verbose=None,verbose_times=False)[...,0]
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 6.1e-01   | 6.1e-01   | 6.1e-01   | 6.1e-01   | 6.1e-01   
            2          | 2.1e-01   | 2.1e-01   | 2.1e-01   | 2.1e-01   | 2.1e-01   
            3          | 1.9e-01   | 1.9e-01   | 1.9e-01   | 1.9e-01   | 1.9e-01   
            4          | 7.3e-02   | 7.3e-02   | 7.3e-02   | 7.3e-02   | 7.3e-02   
            5          | 5.4e-16   | 5.4e-16   | 5.4e-16   | 5.4e-16   | 5.4e-16   
        >>> torch.allclose(x_minres,x_true)
        True

    Complex-Hermitian example with column vector $b$ 

        >>> n = 5
        >>> A = torch.randn(n,n,dtype=torch.complex128,generator=rng)
        >>> A = (A+A.adjoint())/2
        >>> b = torch.rand(n,dtype=torch.complex128,generator=rng)
        >>> x_true = torch.linalg.solve(A,b[...,None])[...,0]
        >>> x_true
        tensor([ 0.2207+0.2879j,  0.0928-0.0057j,  0.2681+1.3488j, -1.2520-0.4214j,
                -0.8860-0.6922j])
        >>> torch.allclose(A@x_true-b,torch.zeros_like(b))
        True
        >>> x_minres = minres(A,b[...,None],verbose=None,verbose_times=False)[...,0]
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 8.7e-01   | 8.7e-01   | 8.7e-01   | 8.7e-01   | 8.7e-01   
            2          | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   
            3          | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   | 5.4e-01   
            4          | 2.6e-01   | 2.6e-01   | 2.6e-01   | 2.6e-01   | 2.6e-01   
            5          | 3.0e-15   | 3.0e-15   | 3.0e-15   | 3.0e-15   | 3.0e-15   
        >>> torch.allclose(x_minres,x_true)
        True

    Matrix $B$

        >>> n = 5
        >>> k = 3
        >>> A = torch.randn(n,n,generator=rng)
        >>> A = (A+A.T)/2
        >>> B = torch.rand(n,k,generator=rng)
        >>> X_true = torch.linalg.solve(A,B)
        >>> X_true
        tensor([[ 0.8801, -0.0116,  0.4805],
                [-1.1095, -1.6166, -0.7103],
                [-2.9918, -1.9201, -3.5855],
                [-4.1777, -3.6586, -5.1658],
                [ 1.5417,  0.9814,  1.3790]])
        >>> torch.allclose(A@X_true-B,torch.zeros_like(B))
        True
        >>> X_minres = minres(A,B,verbose=None,verbose_times=False)
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 6.6e-01   | 7.5e-01   | 8.7e-01   | 8.8e-01   | 8.9e-01   
            2          | 6.2e-01   | 6.4e-01   | 6.6e-01   | 7.3e-01   | 7.6e-01   
            3          | 3.5e-01   | 4.7e-01   | 6.1e-01   | 6.3e-01   | 6.4e-01   
            4          | 1.5e-01   | 1.8e-01   | 2.2e-01   | 3.4e-01   | 4.1e-01   
            5          | 7.0e-15   | 9.1e-15   | 1.2e-14   | 2.2e-14   | 2.8e-14   
        >>> torch.allclose(X_minres,X_true)
        True

    Tri-diagonal $A$ with storage-saving multiplication function 

        >>> n = 5
        >>> k = 3
        >>> A_diag = torch.randn(n,generator=rng)
        >>> A_off_diag = torch.randn(n-1,generator=rng) 
        >>> A = torch.zeros(n,n)
        >>> A[torch.arange(n),torch.arange(n)] = A_diag 
        >>> A[torch.arange(n-1),torch.arange(1,n)] = A_off_diag
        >>> A[torch.arange(1,n),torch.arange(n-1)] = A_off_diag
        >>> A
        tensor([[-0.2728, -0.1545,  0.0000,  0.0000,  0.0000],
                [-0.1545, -0.0275, -0.0120,  0.0000,  0.0000],
                [ 0.0000, -0.0120, -0.4436,  0.2802,  0.0000],
                [ 0.0000,  0.0000,  0.2802, -0.7303,  0.9724],
                [ 0.0000,  0.0000,  0.0000,  0.9724, -0.4180]])
        >>> B = torch.rand(n,k,generator=rng)
        >>> X_true = torch.linalg.solve(A,B)
        >>> X_true
        tensor([[-5.5996, -3.1344, -6.0989],
                [ 6.5450, -0.5211,  5.0505],
                [-1.8303, -0.9818, -0.4647],
                [ 0.5862,  0.9233,  1.2342],
                [ 1.3614,  1.4523,  1.9166]])
        >>> torch.allclose(A@X_true-B,torch.zeros_like(B))
        True
        >>> def A_mult(x):
        ...     y = x*A_diag[:,None]
        ...     y[1:,:] += x[:-1,:]*A_off_diag[:,None]
        ...     y[:-1,:] += x[1:,:]*A_off_diag[:,None]
        ...     return y
        >>> torch.allclose(A_mult(X_true),A@X_true)
        True
        >>> X_minres = minres(A_mult,B,verbose=None,verbose_times=False)
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 7.5e-01   | 8.1e-01   | 8.8e-01   | 9.3e-01   | 9.6e-01   
            2          | 4.5e-01   | 5.6e-01   | 7.1e-01   | 8.0e-01   | 8.5e-01   
            3          | 1.5e-01   | 1.9e-01   | 2.4e-01   | 2.9e-01   | 3.2e-01   
            4          | 5.7e-02   | 1.4e-01   | 2.4e-01   | 2.9e-01   | 3.2e-01   
            5          | 4.0e-15   | 5.4e-15   | 7.1e-15   | 1.9e-14   | 2.6e-14   
        >>> torch.allclose(X_minres,X_true)
        True

    Batched tri-diagonal $A$ with storage-saving multiplication function 

        >>> n = 100
        >>> k = 3
        >>> A_diag = torch.randn(2,1,4,n,generator=rng)
        >>> A_off_diag = torch.randn(2,1,4,n-1,generator=rng) 
        >>> A = torch.zeros(2,1,4,n,n)
        >>> A[...,torch.arange(n),torch.arange(n)] = A_diag 
        >>> A[...,torch.arange(n-1),torch.arange(1,n)] = A_off_diag
        >>> A[...,torch.arange(1,n),torch.arange(n-1)] = A_off_diag
        >>> B = torch.rand(2,6,1,n,k,generator=rng)
        >>> X_true = torch.linalg.solve(A,B)
        >>> torch.allclose(torch.einsum("...ij,...jk->...ik",A,X_true)-B,torch.zeros_like(B))
        True
        >>> def A_mult(x):
        ...     y = x*A_diag[...,:,None]
        ...     y[...,1:,:] += x[...,:-1,:]*A_off_diag[...,:,None]
        ...     y[...,:-1,:] += x[...,1:,:]*A_off_diag[...,:,None]
        ...     return y
        >>> torch.allclose(A_mult(X_true),torch.einsum("...ij,...jk->...ik",A,X_true))
        True
        >>> X_minres,data = minres(A_mult,B,verbose=None,verbose_times=False,store_data_iters=None,store_all_data=True)
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            25         | 1.8e-01   | 2.3e-01   | 2.6e-01   | 2.9e-01   | 3.2e-01   
            50         | 9.4e-02   | 1.2e-01   | 1.5e-01   | 1.9e-01   | 2.1e-01   
            75         | 3.3e-02   | 5.1e-02   | 8.2e-02   | 1.5e-01   | 1.8e-01   
            ...   
        >>> X_minres.shape
        torch.Size([2, 6, 4, 100, 3])
        >>> torch.allclose(X_minres,X_true)
        True
        >>> print_data_signatures(data)
            data['x'].shape = (2, 6, 4, 100, 3)
            data['iterrange'].shape = (177,)
            data['times'].shape = (177,)
            data['losses_quantiles']
                data['losses_quantiles']['0'].shape = (177,)
                data['losses_quantiles']['1'].shape = (177,)
                data['losses_quantiles']['5'].shape = (177,)
                data['losses_quantiles']['10'].shape = (177,)
                data['losses_quantiles']['25'].shape = (177,)
                data['losses_quantiles']['40'].shape = (177,)
                data['losses_quantiles']['50'].shape = (177,)
                data['losses_quantiles']['60'].shape = (177,)
                data['losses_quantiles']['75'].shape = (177,)
                data['losses_quantiles']['90'].shape = (177,)
                data['losses_quantiles']['95'].shape = (177,)
                data['losses_quantiles']['99'].shape = (177,)
                data['losses_quantiles']['100'].shape = (177,)
            data['xs'].shape = (177, 2, 6, 4, 100, 3)
            data['losses'].shape = (177, 2, 6, 4, 3)
    """
    if warn and (not torch.get_default_dtype()==torch.float64): warnings.warn('''
            torch.get_default_dtype() = %s, but lm_opt often requires high precision updates. We recommend using:
                torch.set_default_dtype(torch.float64)'''%str(torch.get_default_dtype()))
    assert torch.get_default_dtype() in [torch.float32,torch.float64]
    default_dtype = torch.get_default_dtype()
    device = str(B.device)
    default_device = str(torch.get_default_device())
    assert B.ndim>=2, "B should have shape (...,n,k)"
    n = B.size(-2)
    k = B.size(-1)
    if X0 is None: 
        X0 = torch.zeros_like(B)
    if isinstance(A,torch.Tensor):
        assert A.shape[-2:]==(n,n)
        assert torch.allclose(A.adjoint(),A)
        matvec = lambda X: torch.einsum("...ij,...jk->...ik",A,X)
    else:
        assert callable(A)
        matvec = A
    if iters is None: 
        iters = 5*n 
    assert iters>=0
    assert iters%1==0
    if residtol is None: 
        if default_dtype==torch.float64:
            residtol = 1e-12
        elif default_dtype==torch.float32:
            residtol = 2.5e-4
        else:
            raise Exception("default_dtype = %s not parsed"%str(default_dtype))
    assert residtol>=0
    if verbose is None: 
        verbose = max(1,iters//20)
    assert verbose%1==0
    assert verbose>=0 
    if store_data_iters is None: 
        store_data_iters = max(1,iters//1000)
    assert store_data_iters%1==0
    assert store_data_iters>=0 
    assert isinstance(store_all_data,bool)
    assert isinstance(quantiles_losses,list)
    assert all(0<=qt<=100 for qt in quantiles_losses)
    assert isinstance(verbose_quantiles_losses,list)
    assert all(qt in quantiles_losses for qt in verbose_quantiles_losses)
    assert verbose_indent%1==0 
    assert verbose_indent>=0
    assert isinstance(verbose_times,bool)
    if verbose:
        _h_iter = "%-10s "%"iter i"
        _h_times = "| %-10s"%"times" if verbose_times else ""
        _s_losses_qt = ("| %-9s "*len(verbose_quantiles_losses))%tuple(str(qt) for qt in verbose_quantiles_losses)
        _h_losses_qt = "| losses_quantiles"+" "*(len(_s_losses_qt)-len("| losses_quantiles"))
        _h = _h_iter+_h_losses_qt+_h_times
        _s = " "*len(_h_iter)+_s_losses_qt+("|"+" "*(len(_h_times)-1) if verbose_times else " "*len(_h_times))
        print(" "*verbose_indent+_h)
        print(" "*verbose_indent+_s)
        print(" "*verbose_indent+"~"*len(_s))
    timer = Timer(device=device)
    timer.tic()
    psolve = lambda X: X # TODO: implement more involved preconditioned solver
    inner = lambda a,b: torch.einsum("...ij,...ij->...j",a.conj(),b)
    Anorm = 0
    eps = torch.finfo(B.dtype).eps
    x = X0 
    Ax = matvec(x)
    assert Ax.shape[-2:]==(n,k)
    batch_shape = tuple(Ax.shape[:-2])
    if store_data_iters:
        iterrange = []
        times = []
        losses = []
        losses_quantiles = {str(qt):[] for qt in quantiles_losses}
        if store_all_data:
            xs = []
    r1 = B-Ax # (...,n,k)
    y = psolve(r1) # (...,n,k)
    beta1 = torch.sqrt(inner(r1,y)) # (...,k)
    bnorm = torch.linalg.norm(B,dim=-2) # (...,k)
    oldb = 0
    beta = beta1
    dbar = 0
    epsln = torch.zeros(1,device=device)
    phibar = beta1
    tnorm2 = 0
    cs = -1
    sn = 0
    w = torch.zeros_like(B)
    w2 = torch.zeros_like(B)
    r2 = r1
    shift = 0 # TODO: If shift != 0 then the method solves (A - shift*I)x = b
    for i in range(iters+1):
        resid = matvec(x)-B 
        breakcond = i==iters or resid.abs().amax()<=residtol
        loss = torch.linalg.norm(resid,dim=-2)/bnorm
        times_i = timer.toc()
        losses_quantiles_i = {str(qt): loss.nanquantile(qt/100) for qt in quantiles_losses}
        if store_data_iters and (i%store_data_iters==0 or breakcond):
            iterrange.append(i)
            losses.append(loss.to(default_device))
            times.append(times_i)
            for qt in quantiles_losses:
                losses_quantiles[str(qt)].append(losses_quantiles_i[str(qt)].to(default_device))
            if store_all_data:
                xs.append(x.expand(resid.shape).to(default_device))
        if verbose and (i%verbose==0 or breakcond):
            _s_iter = "%-10d "%i
            _s_losses_qt = ("| %-9.1e "*len(verbose_quantiles_losses))%tuple(losses_quantiles_i[str(qt)] for qt in verbose_quantiles_losses)
            _s_times = "| %-10.1f "%(times_i) if verbose_times else ""
            print(" "*verbose_indent+_s_iter+_s_losses_qt+_s_times)
        if breakcond: break 
        s = 1/beta
        v = s[...,None,:]*y
        y = matvec(v)
        y = y-shift*v
        if i>0:
            y = y-(beta/oldb)[...,None,:]*r1
        alfa = inner(v,y)
        y = y-(alfa/beta)[...,None,:]*r2
        r1 = r2
        r2 = y
        y = psolve(r2)
        oldb = beta
        beta = inner(r2,y)
        beta = torch.sqrt(beta)
        tnorm2 += alfa**2+oldb**2+beta**2
        oldeps = epsln
        delta = cs*dbar+sn*alfa
        gbar = sn*dbar-cs*alfa
        epsln = sn*beta
        dbar = -cs*beta
        gamma = torch.linalg.norm(torch.stack([gbar,beta],dim=-1),dim=-1)
        gamma = torch.maximum(gamma,eps*torch.ones(1,device=device))
        cs = gbar/gamma
        sn = beta/gamma
        phi = cs*phibar
        phibar = sn*phibar
        denom = 1/gamma
        w1 = w2
        w2 = w
        w = (v-oldeps[...,None,:]*w1-delta[...,None,:]*w2)*denom[...,None,:]
        x = x+phi[...,None,:]*w
    if store_data_iters==0:
        return x 
    else:
        data = {
            "x": x.to(default_device), 
            "iterrange": torch.tensor(iterrange,dtype=int), 
            "times": torch.tensor(times), 
            "losses_quantiles": {str(qt):torch.tensor(losses_quantiles[str(qt)]) for qt in quantiles_losses},
            }
        if store_all_data:
            data["xs"] = torch.stack(xs,dim=0)
            data["losses"] = torch.stack(losses,dim=0)
        return x,data

minres_qlp_cs

minres_qlp_cs(A, B, X0=None, iters=None, residtol=None, verbose=False, verbose_indent=4, quantiles_losses=[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100], verbose_quantiles_losses=[5, 25, 50, 75, 90], verbose_times=True, warn=True, store_data_iters=False, store_all_data=False)

MINRES QLP algorith for complex-symmetric matrices.

A translation of the MATLAB version of MINRESQLP.

References

S.-C. Choi, C. C. Paige, and M. A. Saunders, MINRES-QLP: A Krylov subspace method for indefinite or singular symmetric systems, SIAM Journal of Scientific Computing, submitted on March 7, 2010.
S.-C. Choi's PhD Dissertation, Stanford University, 2006: http://www.stanford.edu/group/SOL/dissertations.html

Parameters:

Name	Type	Description	Default
`A`	`Union[Tensor, callable]`	Symmetric matrix `A` with shape `(...,n,n)`, or `callable(A)` where `a(x)` should return the batch matrix multiplication of `A` and `X`,	required
`B`	`Tensor`	Right hand side tensor \(B\) with shape `(...,n,k)`	required
`X0`	`Tensor`	Initial guess for \(X\) with shape `(...,n,k)`, defaults to zeros.	`None`
`iters`	`int`	number of minres iterations, defaults to `5n`.	`None`
`residtol`	`float`	Non-negative tolerance on the maximum residual for early stopping, defaults to `1e-12` for `torch.float64` and `2.5e-4` for `torch.float32`.	`None`
`verbose`	`int`	Controls logging verbosity If `True`, perform logging. If a positive int, only log every `verbose` iterations. If `None`, set to a reasonable positive int based on the maximum number of iterations If `False`, don't log.	`False`
`verbose_indent`	`int`	Non-negative number of indentation spaces for logging.	`4`
`quantiles_losses`	`list`	Loss quantiles to record.	`[0, 1, 5, 10, 25, 40, 50, 60, 75, 90, 95, 99, 100]`
`verbose_quantiles_losses`	`list`	Loss quantiles to show in verbose log.	`[5, 25, 50, 75, 90]`
`verbose_times`	`bool`	If `False`, do not show the times in the verbose log. This is mostly for testing where timing is not reproducible.	`True`
`warn`	`bool`	If `False`, then suppress warnings.	`True`
`store_data_iters`	`int`	Controls storage iterations with the same options as verbose. If `store_data_iters==0`, then the data is not collected or returned. If `True`, store every iteration. If a positive int, only store every `store_data_iters` iterations. If `None`, set to a reasonable positive int based on the maximum number of iterations If `False`, don't store data, and do not return data	`False`
`store_all_data`	`bool`	If `True`, store the `x` values as well as the metrics.	`False`

Returns:

Name	Type	Description
`x`	`Tensor`	Optimized \(X\).
`data`	`dict`	Iteration data, only returned when `store_data_iters>0`

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

Column vector \(b\)

>>> n = 5
>>> A = torch.randn(n,n,dtype=torch.complex128,generator=rng)
>>> A = (A+A.T)/2
>>> b = torch.rand(n,dtype=torch.complex128,generator=rng)
>>> x_true = torch.linalg.solve(A,b[...,None])[...,0]
>>> x_true
tensor([-0.6207-0.4121j,  0.5221+0.3249j, -1.0952+0.8594j,  0.9080-1.2110j,
        -0.9799+0.7372j])
>>> torch.allclose(A@x_true-b,torch.zeros_like(b))
True
>>> x_minres = minres_qlp_cs(A,b[...,None],verbose=None,verbose_times=False)[...,0]
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 9.6e-01   | 9.6e-01   | 9.6e-01   | 9.6e-01   | 9.6e-01   
    2          | 5.8e-01   | 5.8e-01   | 5.8e-01   | 5.8e-01   | 5.8e-01   
    3          | 4.5e-01   | 4.5e-01   | 4.5e-01   | 4.5e-01   | 4.5e-01   
    4          | 2.8e-01   | 2.8e-01   | 2.8e-01   | 2.8e-01   | 2.8e-01   
    5          | 1.7e-14   | 1.7e-14   | 1.7e-14   | 1.7e-14   | 1.7e-14   
>>> torch.allclose(x_minres,x_true)
True

Matrix \(B\)

>>> n = 5
>>> k = 3
>>> A = torch.randn(n,n,dtype=torch.complex128,generator=rng)
>>> A = (A+A.T)/2
>>> B = torch.rand(n,k,dtype=torch.complex128,generator=rng)
>>> X_true = torch.linalg.solve(A,B)
>>> X_true
tensor([[ 0.0142+0.7190j,  0.0097+0.9734j, -0.3620+0.6413j],
        [ 0.4527+0.7455j,  0.4270+0.4941j,  0.9255+0.5685j],
        [-0.7182-1.0284j,  0.3193-0.4463j, -0.3421-0.8780j],
        [-0.5973+0.4910j, -0.8147+0.8256j, -0.8821+0.7147j],
        [-0.0926+0.5500j, -0.5192+0.2817j, -0.8496+0.6976j]])
>>> torch.allclose(A@X_true-B,torch.zeros_like(B))
True
>>> X_minres = minres_qlp_cs(A,B,verbose=None,verbose_times=False)
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 9.0e-01   | 9.1e-01   | 9.2e-01   | 9.5e-01   | 9.7e-01   
    2          | 2.2e-01   | 3.1e-01   | 4.3e-01   | 5.0e-01   | 5.5e-01   
    3          | 2.0e-01   | 2.2e-01   | 2.3e-01   | 2.9e-01   | 3.3e-01   
    4          | 3.4e-02   | 5.0e-02   | 7.1e-02   | 8.1e-02   | 8.7e-02   
    5          | 1.9e-15   | 4.2e-15   | 7.0e-15   | 1.5e-14   | 2.0e-14   
>>> torch.allclose(X_minres,X_true)
True

Tri-diagonal \(A\) with storage-saving multiplication function

>>> n = 4
>>> k = 3
>>> A_diag = torch.randn(n,dtype=torch.complex128,generator=rng)
>>> A_off_diag = torch.randn(n-1,dtype=torch.complex128,generator=rng) 
>>> A = torch.zeros(n,n,dtype=torch.complex128)
>>> A[torch.arange(n),torch.arange(n)] = A_diag 
>>> A[torch.arange(n-1),torch.arange(1,n)] = A_off_diag
>>> A[torch.arange(1,n),torch.arange(n-1)] = A_off_diag
>>> A
tensor([[ 0.4070+0.4993j, -0.3137-0.5164j,  0.0000+0.0000j,  0.0000+0.0000j],
        [-0.3137-0.5164j, -0.2736+0.1860j, -0.2956-0.1092j,  0.0000+0.0000j],
        [ 0.0000+0.0000j, -0.2956-0.1092j,  0.4033-0.3862j, -0.0085+0.1981j],
        [ 0.0000+0.0000j,  0.0000+0.0000j, -0.0085+0.1981j, -0.1929-0.0194j]])
>>> B = torch.rand(n,k,dtype=torch.complex128,generator=rng)
>>> X_true = torch.linalg.solve(A,B)
>>> X_true
tensor([[-0.0284-1.1816j, -0.6400-0.1975j,  0.7656-1.1306j],
        [-1.6586-0.3570j, -2.4720-0.1880j, -1.1169-0.6640j],
        [-2.4585-1.2378j, -1.0485-2.0358j, -3.1323-0.3632j],
        [ 0.1591-7.4823j,  1.8120-2.6001j, -2.3910-7.6943j]])
>>> torch.allclose(A@X_true-B,torch.zeros_like(B))
True
>>> def A_mult(x):
...     y = x*A_diag[:,None]
...     y[1:,:] += x[:-1,:]*A_off_diag[:,None]
...     y[:-1,:] += x[1:,:]*A_off_diag[:,None]
...     return y
>>> torch.allclose(A_mult(X_true),A@X_true)
True
>>> X_minres = minres_qlp_cs(A_mult,B,verbose=None,verbose_times=False)
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    1          | 7.8e-01   | 8.3e-01   | 8.9e-01   | 9.4e-01   | 9.7e-01   
    2          | 5.1e-01   | 5.2e-01   | 5.2e-01   | 5.4e-01   | 5.5e-01   
    3          | 3.2e-01   | 3.2e-01   | 3.2e-01   | 3.2e-01   | 3.3e-01   
    4          | 9.1e-16   | 1.1e-15   | 1.3e-15   | 2.1e-15   | 2.6e-15   
>>> torch.allclose(X_minres,X_true)
True

Batched tri-diagonal \(A\) with storage-saving multiplication function

>>> n = 100
>>> k = 3
>>> A_diag = torch.randn(2,1,4,n,dtype=torch.complex128,generator=rng)
>>> A_off_diag = torch.randn(2,1,4,n-1,dtype=torch.complex128,generator=rng) 
>>> A = torch.zeros(2,1,4,n,n,dtype=torch.complex128)
>>> A[...,torch.arange(n),torch.arange(n)] = A_diag 
>>> A[...,torch.arange(n-1),torch.arange(1,n)] = A_off_diag
>>> A[...,torch.arange(1,n),torch.arange(n-1)] = A_off_diag
>>> B = torch.rand(2,6,1,n,k,dtype=torch.complex128,generator=rng)
>>> X_true = torch.linalg.solve(A,B)
>>> torch.allclose(torch.einsum("...ij,...jk->...ik",A,X_true)-B,torch.zeros_like(B))
True
>>> def A_mult(x):
...     y = x*A_diag[...,:,None]
...     y[...,1:,:] += x[...,:-1,:]*A_off_diag[...,:,None]
...     y[...,:-1,:] += x[...,1:,:]*A_off_diag[...,:,None]
...     return y
>>> torch.allclose(A_mult(X_true),torch.einsum("...ij,...jk->...ik",A,X_true))
True
>>> X_minres,data = minres_qlp_cs(A_mult,B,verbose=None,verbose_times=False,store_data_iters=None,store_all_data=True,iters=40)
    iter i     | losses_quantiles                                          
               | 5         | 25        | 50        | 75        | 90        
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
    2          | 6.6e-01   | 6.8e-01   | 7.0e-01   | 7.2e-01   | 7.3e-01   
    4          | 5.1e-01   | 5.4e-01   | 5.6e-01   | 5.8e-01   | 5.9e-01   
    6          | 4.2e-01   | 4.5e-01   | 4.6e-01   | 4.9e-01   | 5.0e-01   
    8          | 3.5e-01   | 3.8e-01   | 4.0e-01   | 4.2e-01   | 4.4e-01   
    10         | 3.0e-01   | 3.3e-01   | 3.6e-01   | 3.8e-01   | 4.0e-01   
    12         | 2.6e-01   | 2.9e-01   | 3.2e-01   | 3.4e-01   | 3.6e-01   
    14         | 2.3e-01   | 2.6e-01   | 2.9e-01   | 3.1e-01   | 3.3e-01   
    16         | 2.1e-01   | 2.3e-01   | 2.6e-01   | 2.8e-01   | 3.0e-01   
    18         | 1.9e-01   | 2.1e-01   | 2.4e-01   | 2.6e-01   | 2.8e-01   
    20         | 1.7e-01   | 1.9e-01   | 2.2e-01   | 2.3e-01   | 2.6e-01   
    22         | 1.5e-01   | 1.7e-01   | 2.0e-01   | 2.2e-01   | 2.4e-01   
    24         | 1.3e-01   | 1.6e-01   | 1.8e-01   | 2.0e-01   | 2.2e-01   
    26         | 1.2e-01   | 1.4e-01   | 1.7e-01   | 1.9e-01   | 2.1e-01   
    28         | 1.0e-01   | 1.3e-01   | 1.6e-01   | 1.8e-01   | 2.0e-01   
    30         | 9.3e-02   | 1.2e-01   | 1.4e-01   | 1.7e-01   | 1.9e-01   
    32         | 8.4e-02   | 1.1e-01   | 1.3e-01   | 1.6e-01   | 1.9e-01   
    34         | 7.6e-02   | 1.0e-01   | 1.2e-01   | 1.6e-01   | 1.8e-01   
    36         | 6.9e-02   | 9.3e-02   | 1.1e-01   | 1.5e-01   | 1.7e-01   
    38         | 6.4e-02   | 8.5e-02   | 1.0e-01   | 1.4e-01   | 1.7e-01   
    40         | 5.8e-02   | 7.5e-02   | 9.2e-02   | 1.4e-01   | 1.6e-01   
>>> X_minres.shape
torch.Size([2, 6, 4, 100, 3])
>>> torch.allclose(X_minres,X_true)
False
>>> print_data_signatures(data)
    data['x'].shape = (2, 6, 4, 100, 3)
    data['iterrange'].shape = (41,)
    data['times'].shape = (41,)
    data['losses_quantiles']
        data['losses_quantiles']['0'].shape = (41,)
        data['losses_quantiles']['1'].shape = (41,)
        data['losses_quantiles']['5'].shape = (41,)
        data['losses_quantiles']['10'].shape = (41,)
        data['losses_quantiles']['25'].shape = (41,)
        data['losses_quantiles']['40'].shape = (41,)
        data['losses_quantiles']['50'].shape = (41,)
        data['losses_quantiles']['60'].shape = (41,)
        data['losses_quantiles']['75'].shape = (41,)
        data['losses_quantiles']['90'].shape = (41,)
        data['losses_quantiles']['95'].shape = (41,)
        data['losses_quantiles']['99'].shape = (41,)
        data['losses_quantiles']['100'].shape = (41,)
    data['xs'].shape = (41, 2, 6, 4, 100, 3)
    data['losses'].shape = (41, 2, 6, 4, 3)

Source code in agsutil/algos.py

def minres_qlp_cs(
        A,
        B,
        X0 = None,
        iters = None,
        residtol = None,
        verbose = False, 
        verbose_indent = 4,
        quantiles_losses = [0,1,5,10,25,40,50,60,75,90,95,99,100],
        verbose_quantiles_losses = [5,25,50,75,90],
        verbose_times = True, 
        warn = True,
        store_data_iters = False, 
        store_all_data = False,
        ):
    # https://github.com/schoi32/sci498rms/blob/master/Sorokin_MinresQLP_Python_Workspace/MinresQLP/Algorithms/cs_mqlp.py
    r"""
    MINRES QLP algorith for complex-symmetric matrices. 

    A translation of the [MATLAB version of MINRESQLP](http://www.stanford.edu/group/SOL/software.html).

    References 

    1.  S.-C. Choi, C. C. Paige, and M. A. Saunders,
        MINRES-QLP: A Krylov subspace method for indefinite or singular symmetric systems,
        SIAM Journal of Scientific Computing, submitted on March 7, 2010.

    2.  S.-C. Choi's PhD Dissertation, Stanford University, 2006: 
        http://www.stanford.edu/group/SOL/dissertations.html

    Args:
        A (Union[torch.Tensor,callable]): Symmetric matrix `A` with shape `(...,n,n)`, or  
            `callable(A)` where `a(x)` should return the batch matrix multiplication of `A` and `X`,  
        B (torch.Tensor): Right hand side tensor $B$ with shape `(...,n,k)`
        X0 (torch.Tensor): Initial guess for $X$ with shape `(...,n,k)`, defaults to zeros. 
        iters (int): number of minres iterations, defaults to `5n`. 
        residtol (float): Non-negative tolerance on the maximum residual for early stopping, defaults to `1e-12` for `torch.float64` and `2.5e-4` for `torch.float32`.
        verbose (int): Controls logging verbosity

            - If `True`, perform logging. 
            - If a positive int, only log every `verbose` iterations. 
            - If `None`, set to a reasonable positive int based on the maximum number of iterations
            - If `False`, don't log. 

        verbose_indent (int): Non-negative number of indentation spaces for logging.
        quantiles_losses (list): Loss quantiles to record.
        verbose_quantiles_losses (list): Loss quantiles to show in verbose log.
        verbose_times (bool): If `False`, do not show the times in the verbose log. This is mostly for testing where timing is not reproducible. 
        warn (bool): If `False`, then suppress warnings.
        store_data_iters (int): Controls storage iterations with the same options as verbose. If `store_data_iters==0`, then the data is not collected or returned. 

            - If `True`, store every iteration. 
            - If a positive int, only store every `store_data_iters` iterations. 
            - If `None`, set to a reasonable positive int based on the maximum number of iterations
            - If `False`, don't store data, and do not return data 

        store_all_data (bool): If `True`, store the `x` values as well as the metrics. 

    Returns:
        x (torch.Tensor): Optimized $X$.
        data (dict): Iteration data, only returned when `store_data_iters>0`

    Examples:

        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

    Column vector $b$ 

        >>> n = 5
        >>> A = torch.randn(n,n,dtype=torch.complex128,generator=rng)
        >>> A = (A+A.T)/2
        >>> b = torch.rand(n,dtype=torch.complex128,generator=rng)
        >>> x_true = torch.linalg.solve(A,b[...,None])[...,0]
        >>> x_true
        tensor([-0.6207-0.4121j,  0.5221+0.3249j, -1.0952+0.8594j,  0.9080-1.2110j,
                -0.9799+0.7372j])
        >>> torch.allclose(A@x_true-b,torch.zeros_like(b))
        True
        >>> x_minres = minres_qlp_cs(A,b[...,None],verbose=None,verbose_times=False)[...,0]
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 9.6e-01   | 9.6e-01   | 9.6e-01   | 9.6e-01   | 9.6e-01   
            2          | 5.8e-01   | 5.8e-01   | 5.8e-01   | 5.8e-01   | 5.8e-01   
            3          | 4.5e-01   | 4.5e-01   | 4.5e-01   | 4.5e-01   | 4.5e-01   
            4          | 2.8e-01   | 2.8e-01   | 2.8e-01   | 2.8e-01   | 2.8e-01   
            5          | 1.7e-14   | 1.7e-14   | 1.7e-14   | 1.7e-14   | 1.7e-14   
        >>> torch.allclose(x_minres,x_true)
        True

    Matrix $B$

        >>> n = 5
        >>> k = 3
        >>> A = torch.randn(n,n,dtype=torch.complex128,generator=rng)
        >>> A = (A+A.T)/2
        >>> B = torch.rand(n,k,dtype=torch.complex128,generator=rng)
        >>> X_true = torch.linalg.solve(A,B)
        >>> X_true
        tensor([[ 0.0142+0.7190j,  0.0097+0.9734j, -0.3620+0.6413j],
                [ 0.4527+0.7455j,  0.4270+0.4941j,  0.9255+0.5685j],
                [-0.7182-1.0284j,  0.3193-0.4463j, -0.3421-0.8780j],
                [-0.5973+0.4910j, -0.8147+0.8256j, -0.8821+0.7147j],
                [-0.0926+0.5500j, -0.5192+0.2817j, -0.8496+0.6976j]])
        >>> torch.allclose(A@X_true-B,torch.zeros_like(B))
        True
        >>> X_minres = minres_qlp_cs(A,B,verbose=None,verbose_times=False)
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 9.0e-01   | 9.1e-01   | 9.2e-01   | 9.5e-01   | 9.7e-01   
            2          | 2.2e-01   | 3.1e-01   | 4.3e-01   | 5.0e-01   | 5.5e-01   
            3          | 2.0e-01   | 2.2e-01   | 2.3e-01   | 2.9e-01   | 3.3e-01   
            4          | 3.4e-02   | 5.0e-02   | 7.1e-02   | 8.1e-02   | 8.7e-02   
            5          | 1.9e-15   | 4.2e-15   | 7.0e-15   | 1.5e-14   | 2.0e-14   
        >>> torch.allclose(X_minres,X_true)
        True

    Tri-diagonal $A$ with storage-saving multiplication function 

        >>> n = 4
        >>> k = 3
        >>> A_diag = torch.randn(n,dtype=torch.complex128,generator=rng)
        >>> A_off_diag = torch.randn(n-1,dtype=torch.complex128,generator=rng) 
        >>> A = torch.zeros(n,n,dtype=torch.complex128)
        >>> A[torch.arange(n),torch.arange(n)] = A_diag 
        >>> A[torch.arange(n-1),torch.arange(1,n)] = A_off_diag
        >>> A[torch.arange(1,n),torch.arange(n-1)] = A_off_diag
        >>> A
        tensor([[ 0.4070+0.4993j, -0.3137-0.5164j,  0.0000+0.0000j,  0.0000+0.0000j],
                [-0.3137-0.5164j, -0.2736+0.1860j, -0.2956-0.1092j,  0.0000+0.0000j],
                [ 0.0000+0.0000j, -0.2956-0.1092j,  0.4033-0.3862j, -0.0085+0.1981j],
                [ 0.0000+0.0000j,  0.0000+0.0000j, -0.0085+0.1981j, -0.1929-0.0194j]])
        >>> B = torch.rand(n,k,dtype=torch.complex128,generator=rng)
        >>> X_true = torch.linalg.solve(A,B)
        >>> X_true
        tensor([[-0.0284-1.1816j, -0.6400-0.1975j,  0.7656-1.1306j],
                [-1.6586-0.3570j, -2.4720-0.1880j, -1.1169-0.6640j],
                [-2.4585-1.2378j, -1.0485-2.0358j, -3.1323-0.3632j],
                [ 0.1591-7.4823j,  1.8120-2.6001j, -2.3910-7.6943j]])
        >>> torch.allclose(A@X_true-B,torch.zeros_like(B))
        True
        >>> def A_mult(x):
        ...     y = x*A_diag[:,None]
        ...     y[1:,:] += x[:-1,:]*A_off_diag[:,None]
        ...     y[:-1,:] += x[1:,:]*A_off_diag[:,None]
        ...     return y
        >>> torch.allclose(A_mult(X_true),A@X_true)
        True
        >>> X_minres = minres_qlp_cs(A_mult,B,verbose=None,verbose_times=False)
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            1          | 7.8e-01   | 8.3e-01   | 8.9e-01   | 9.4e-01   | 9.7e-01   
            2          | 5.1e-01   | 5.2e-01   | 5.2e-01   | 5.4e-01   | 5.5e-01   
            3          | 3.2e-01   | 3.2e-01   | 3.2e-01   | 3.2e-01   | 3.3e-01   
            4          | 9.1e-16   | 1.1e-15   | 1.3e-15   | 2.1e-15   | 2.6e-15   
        >>> torch.allclose(X_minres,X_true)
        True

    Batched tri-diagonal $A$ with storage-saving multiplication function 

        >>> n = 100
        >>> k = 3
        >>> A_diag = torch.randn(2,1,4,n,dtype=torch.complex128,generator=rng)
        >>> A_off_diag = torch.randn(2,1,4,n-1,dtype=torch.complex128,generator=rng) 
        >>> A = torch.zeros(2,1,4,n,n,dtype=torch.complex128)
        >>> A[...,torch.arange(n),torch.arange(n)] = A_diag 
        >>> A[...,torch.arange(n-1),torch.arange(1,n)] = A_off_diag
        >>> A[...,torch.arange(1,n),torch.arange(n-1)] = A_off_diag
        >>> B = torch.rand(2,6,1,n,k,dtype=torch.complex128,generator=rng)
        >>> X_true = torch.linalg.solve(A,B)
        >>> torch.allclose(torch.einsum("...ij,...jk->...ik",A,X_true)-B,torch.zeros_like(B))
        True
        >>> def A_mult(x):
        ...     y = x*A_diag[...,:,None]
        ...     y[...,1:,:] += x[...,:-1,:]*A_off_diag[...,:,None]
        ...     y[...,:-1,:] += x[...,1:,:]*A_off_diag[...,:,None]
        ...     return y
        >>> torch.allclose(A_mult(X_true),torch.einsum("...ij,...jk->...ik",A,X_true))
        True
        >>> X_minres,data = minres_qlp_cs(A_mult,B,verbose=None,verbose_times=False,store_data_iters=None,store_all_data=True,iters=40)
            iter i     | losses_quantiles                                          
                       | 5         | 25        | 50        | 75        | 90        
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            0          | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   | 1.0e+00   
            2          | 6.6e-01   | 6.8e-01   | 7.0e-01   | 7.2e-01   | 7.3e-01   
            4          | 5.1e-01   | 5.4e-01   | 5.6e-01   | 5.8e-01   | 5.9e-01   
            6          | 4.2e-01   | 4.5e-01   | 4.6e-01   | 4.9e-01   | 5.0e-01   
            8          | 3.5e-01   | 3.8e-01   | 4.0e-01   | 4.2e-01   | 4.4e-01   
            10         | 3.0e-01   | 3.3e-01   | 3.6e-01   | 3.8e-01   | 4.0e-01   
            12         | 2.6e-01   | 2.9e-01   | 3.2e-01   | 3.4e-01   | 3.6e-01   
            14         | 2.3e-01   | 2.6e-01   | 2.9e-01   | 3.1e-01   | 3.3e-01   
            16         | 2.1e-01   | 2.3e-01   | 2.6e-01   | 2.8e-01   | 3.0e-01   
            18         | 1.9e-01   | 2.1e-01   | 2.4e-01   | 2.6e-01   | 2.8e-01   
            20         | 1.7e-01   | 1.9e-01   | 2.2e-01   | 2.3e-01   | 2.6e-01   
            22         | 1.5e-01   | 1.7e-01   | 2.0e-01   | 2.2e-01   | 2.4e-01   
            24         | 1.3e-01   | 1.6e-01   | 1.8e-01   | 2.0e-01   | 2.2e-01   
            26         | 1.2e-01   | 1.4e-01   | 1.7e-01   | 1.9e-01   | 2.1e-01   
            28         | 1.0e-01   | 1.3e-01   | 1.6e-01   | 1.8e-01   | 2.0e-01   
            30         | 9.3e-02   | 1.2e-01   | 1.4e-01   | 1.7e-01   | 1.9e-01   
            32         | 8.4e-02   | 1.1e-01   | 1.3e-01   | 1.6e-01   | 1.9e-01   
            34         | 7.6e-02   | 1.0e-01   | 1.2e-01   | 1.6e-01   | 1.8e-01   
            36         | 6.9e-02   | 9.3e-02   | 1.1e-01   | 1.5e-01   | 1.7e-01   
            38         | 6.4e-02   | 8.5e-02   | 1.0e-01   | 1.4e-01   | 1.7e-01   
            40         | 5.8e-02   | 7.5e-02   | 9.2e-02   | 1.4e-01   | 1.6e-01   
        >>> X_minres.shape
        torch.Size([2, 6, 4, 100, 3])
        >>> torch.allclose(X_minres,X_true)
        False
        >>> print_data_signatures(data)
            data['x'].shape = (2, 6, 4, 100, 3)
            data['iterrange'].shape = (41,)
            data['times'].shape = (41,)
            data['losses_quantiles']
                data['losses_quantiles']['0'].shape = (41,)
                data['losses_quantiles']['1'].shape = (41,)
                data['losses_quantiles']['5'].shape = (41,)
                data['losses_quantiles']['10'].shape = (41,)
                data['losses_quantiles']['25'].shape = (41,)
                data['losses_quantiles']['40'].shape = (41,)
                data['losses_quantiles']['50'].shape = (41,)
                data['losses_quantiles']['60'].shape = (41,)
                data['losses_quantiles']['75'].shape = (41,)
                data['losses_quantiles']['90'].shape = (41,)
                data['losses_quantiles']['95'].shape = (41,)
                data['losses_quantiles']['99'].shape = (41,)
                data['losses_quantiles']['100'].shape = (41,)
            data['xs'].shape = (41, 2, 6, 4, 100, 3)
            data['losses'].shape = (41, 2, 6, 4, 3)
    """
    if warn and (not torch.get_default_dtype()==torch.float64): warnings.warn('''
            torch.get_default_dtype() = %s, but lm_opt often requires high precision updates. We recommend using:
                torch.set_default_dtype(torch.float64)'''%str(torch.get_default_dtype()))
    assert torch.get_default_dtype() in [torch.float32,torch.float64]
    default_dtype = torch.get_default_dtype()
    device = str(B.device)
    default_device = str(torch.get_default_device())
    assert B.ndim>=2, "B should have shape (...,n,k)"
    n = B.size(-2)
    k = B.size(-1)
    assert B.dtype in [torch.complex64,torch.complex128]
    if X0 is None: 
        X0 = torch.zeros_like(B)
    if isinstance(A,torch.Tensor):
        assert A.shape[-2:]==(n,n)
        assert torch.allclose(A.T,A)
        matvec = lambda X: torch.einsum("...ij,...jk->...ik",A,X)
    else:
        assert callable(A)
        matvec = A
    if iters is None: 
        iters = 5*n 
    assert iters>=0
    assert iters%1==0
    if residtol is None: 
        if default_dtype==torch.float64:
            residtol = 1e-12
        elif default_dtype==torch.float32:
            residtol = 2.5e-4
        else:
            raise Exception("default_dtype = %s not parsed"%str(default_dtype))
    assert residtol>=0
    if verbose is None: 
        verbose = max(1,iters//20)
    assert verbose%1==0
    assert verbose>=0 
    if store_data_iters is None: 
        store_data_iters = max(1,iters//1000)
    assert store_data_iters%1==0
    assert store_data_iters>=0 
    assert isinstance(store_all_data,bool)
    assert isinstance(quantiles_losses,list)
    assert all(0<=qt<=100 for qt in quantiles_losses)
    assert isinstance(verbose_quantiles_losses,list)
    assert all(qt in quantiles_losses for qt in verbose_quantiles_losses)
    assert verbose_indent%1==0 
    assert verbose_indent>=0
    assert isinstance(verbose_times,bool)
    if verbose:
        _h_iter = "%-10s "%"iter i"
        _h_times = "| %-10s"%"times" if verbose_times else ""
        _s_losses_qt = ("| %-9s "*len(verbose_quantiles_losses))%tuple(str(qt) for qt in verbose_quantiles_losses)
        _h_losses_qt = "| losses_quantiles"+" "*(len(_s_losses_qt)-len("| losses_quantiles"))
        _h = _h_iter+_h_losses_qt+_h_times
        _s = " "*len(_h_iter)+_s_losses_qt+("|"+" "*(len(_h_times)-1) if verbose_times else " "*len(_h_times))
        print(" "*verbose_indent+_h)
        print(" "*verbose_indent+_s)
        print(" "*verbose_indent+"~"*len(_s))
    timer = Timer(device=device)
    timer.tic()
    psolve = lambda X: X # TODO: implement more involved preconditioned solver
    inner = lambda a,b: torch.einsum("...ij,...ij->...j",a.conj(),b)
    Anorm = 0
    x = X0 
    Ax = matvec(x)
    assert Ax.shape[-2:]==(n,k)
    batch_shape = tuple(Ax.shape[:-2])
    if store_data_iters:
        iterrange = []
        times = []
        losses = []
        losses_quantiles = {str(qt):[] for qt in quantiles_losses}
        if store_all_data:
            xs = []
    r2 = B  # (...,k)
    r3 = r2 # (...,k)
    r3 = psolve(r2)
    beta1 = torch.sqrt(inner(r3,r2))  # (...,k)
    bnorm = torch.linalg.norm(B,dim=-2) # (...,k)
    # TODO: Check if below variables are necessary    
    beta = torch.zeros_like(beta1)
    tau = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    taul = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    phi = beta1
    betan = beta1
    cs = -torch.ones((*batch_shape,k),dtype=B.dtype,device=device)
    sn = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    cr1 = torch.ones((*batch_shape,k),dtype=B.dtype,device=device)
    sr1 = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    cr2 = -torch.ones((*batch_shape,k),dtype=B.dtype,device=device)
    sr2 = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    dltan = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    eplnn = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    gama = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    gamal = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    gamal2 = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    eta = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    etal = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    etal2 = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    vepln = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    veplnl = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    veplnl2 = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    ul3 = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    ul2 = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    ul = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    u = torch.zeros((*batch_shape,k),dtype=B.dtype,device=device)
    w = torch.zeros_like(B)
    wl = torch.zeros_like(B)
    r1 = torch.zeros_like(B)
    xl2 = torch.zeros_like(B)
    alfa = torch.zeros_like(B)
    shift = 0 # TODO: If shift != 0 then the method solves (A - shift*I)x = b
    for i in range(iters+1):
        resid = matvec(x)-B 
        breakcond = i==iters or resid.abs().amax()<=residtol
        loss = torch.linalg.norm(resid,dim=-2)/bnorm
        times_i = timer.toc()
        losses_quantiles_i = {str(qt): loss.nanquantile(qt/100) for qt in quantiles_losses}
        if store_data_iters and (i%store_data_iters==0 or breakcond):
            iterrange.append(i)
            losses.append(loss.to(default_device))
            times.append(times_i)
            for qt in quantiles_losses:
                losses_quantiles[str(qt)].append(losses_quantiles_i[str(qt)].to(default_device))
            if store_all_data:
                xs.append(x.expand(resid.shape).to(default_device))
        if verbose and (i%verbose==0 or breakcond):
            _s_iter = "%-10d "%i
            _s_losses_qt = ("| %-9.1e "*len(verbose_quantiles_losses))%tuple(losses_quantiles_i[str(qt)] for qt in verbose_quantiles_losses)
            _s_times = "| %-10.1f "%(times_i) if verbose_times else ""
            print(" "*verbose_indent+_s_iter+_s_losses_qt+_s_times)
        if breakcond: break 
        betal = beta
        beta = betan
        v = r3/beta[...,None,:]    
        r3 = matvec(v.conj())-shift*v.conj()
        if i>0:
            r3 = r3-(beta/betal)[...,None,:]*r1
        alfa = inner(v,r3)
        r3 = r3-(alfa/beta)[...,None,:]*r2
        r1 = r2
        r2 = r3
        r3 = psolve(r2)
        betan = torch.sqrt(inner(r2,r3))
        dbar = dltan
        dlta = cs*dbar+sn*alfa
        epln = eplnn
        gbar = sn.conj()*dbar-cs*alfa
        eplnn = sn*betan
        dltan = -cs*betan
        dlta_QLP = dlta
        gamal3 = gamal2
        gamal2 = gamal
        gamal = gama
        (cs,sn,gama) = symOrtho(gbar,betan)
        gama_tmp = gama
        taul2  = taul
        taul = tau
        tau = cs*phi
        phi = sn.conj()*phi
        if i>1:
            veplnl2  = veplnl
            etal2 = etal
            etal = eta
            dlta_tmp = sr2*vepln - cr2*dlta
            veplnl = cr2*vepln+sr2.conj()*dlta
            dlta = dlta_tmp
            eta = sr2.conj()*gama
            gama = -cr2*gama
        if i>0:
            (cr1,sr1,gamal) = symOrtho(gamal.conj(),dlta.conj())
            gamal = gamal.conj()
            vepln = sr1.conj()*gama
            gama = -cr1*gama
        ul4 = ul3
        ul3 = ul2
        if i>1:
            ul2 = (taul2-etal2*ul4-veplnl2*ul3)/gamal2
        if i>0:
            ul = (taul-etal*ul3-veplnl*ul2)/gamal
        u = (tau - eta*ul2 - vepln*ul) / gama
        if i==0:
            wl2 = wl
            wl = v.conj()*sr1.conj()[...,None,:]
            w  = v.conj()*cr1[...,None,:]
        elif i==1:
            wl2 = wl
            wl = w*cr1[...,None,:]+v.conj()*sr1.conj()[...,None,:]
            w = w*sr1[...,None,:]-v.conj()*cr1[...,None,:]
        else:
            wl2 = wl
            wl = w
            w  = wl2*sr2[...,None,:]-v.conj()*cr2[...,None,:]
            wl2 = wl2*cr2[...,None,:]+v.conj()*sr2.conj()[...,None,:]
            v = wl*cr1[...,None,:]+w*sr1.conj()[...,None,:]
            w = wl*sr1[...,None,:]-w*cr1[...,None,:]
            wl = v
        xl2 = xl2+wl2*ul2[...,None,:]
        x = xl2+wl*ul[...,None,:]+w*u[...,None,:]
        pass
        (cr2,sr2,gamal) = symOrtho(gamal.conj(),eplnn.conj())
        gamal = gamal.conj()
    if store_data_iters==0:
        return x 
    else:
        data = {
            "x": x.to(default_device), 
            "iterrange": torch.tensor(iterrange,dtype=int), 
            "times": torch.tensor(times), 
            "losses_quantiles": {str(qt):torch.tensor(losses_quantiles[str(qt)]) for qt in quantiles_losses},
            }
        if store_all_data:
            data["xs"] = torch.stack(xs,dim=0)
            data["losses"] = torch.stack(losses,dim=0)
        return x,data

`torch` autograd utils

gradb

gradb(f, x, bkwargs={}, bdims=0, chunk_size=None)

Batched torch.func.grad and function evaluation

Parameters:

Name	Type	Description	Default
`f`	`callable`	Function to compute `torch.func.grad` of.	required
`x`	`Union[Tensor, Tuple]`	(batched) `Torch.Tensor` items whose gradients will be computed.	required
`bkwargs`	`dict`	(batched) `Torch.Tensor` items whose gradients will not be computed.	`{}`
`bdims`	`int`	number of batch dimensions	`0`
`chunk_size`	`int`	to be passed into `torch.func.vmap`.	`None`

Returns:

Name	Type	Description
`gradys`	`tuple`	`torch.Tensor` gradients, one with respect to each item in `x`.
`y`	`Tensor`	Function evaluations.

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

>>> f = lambda x: (x**2).sum(-1)
>>> x = torch.rand(5,generator=rng) 
>>> (grady,),y = gradb(f,x)
>>> grady.shape 
torch.Size([5])
>>> y.shape 
torch.Size([])
>>> torch.allclose(y,f(x))
True
>>> torch.allclose(grady,2*x)
True

>>> f = lambda x,z: (x**2*z**2).sum(-1)
>>> x = torch.rand(5,generator=rng) 
>>> z = torch.rand(5,generator=rng) 
>>> (grady_x,grady_z),y = gradb(f,(x,z))
>>> grady_x.shape 
torch.Size([5])
>>> grady_z.shape 
torch.Size([5])
>>> y.shape
torch.Size([])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(grady_x,2*x*z**2)
True
>>> torch.allclose(grady_z,x**2*2*z)
True

>>> f = lambda x: (x**2).sum(-1)
>>> x = torch.rand(3,4,5,generator=rng) 
>>> (grady,),y = gradb(f,x,bdims=2)
>>> grady.shape 
torch.Size([3, 4, 5])
>>> y.shape 
torch.Size([3, 4])
>>> torch.allclose(y,f(x))
True
>>> torch.allclose(grady,2*x)
True

>>> f = lambda x,z: (x**2*z**2).sum(-1)
>>> x = torch.rand(3,4,5,generator=rng) 
>>> z = torch.rand(3,4,5,generator=rng) 
>>> (grady_x,grady_z),y = gradb(f,(x,z),bdims=2)
>>> grady_x.shape 
torch.Size([3, 4, 5])
>>> grady_z.shape 
torch.Size([3, 4, 5])
>>> y.shape
torch.Size([3, 4])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(grady_x,2*x*z**2)
True
>>> torch.allclose(grady_z,x**2*2*z)
True

>>> f = lambda x,z: (x**2*z**2).sum((-2,-1))
>>> x = torch.rand(3,4,5,generator=rng) 
>>> z = torch.rand(3,4,5,generator=rng) 
>>> (grady_x,),y = gradb(f,x,bkwargs={"z":z},bdims=1)
>>> grady_x.shape 
torch.Size([3, 4, 5])
>>> y.shape
torch.Size([3])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(grady_x,2*x*z**2)
True

Source code in agsutil/autograd.py

def gradb(f, x, bkwargs={}, bdims=0, chunk_size=None):
    r"""
    Batched `torch.func.grad` and function evaluation 

    Args:
        f (callable): Function to compute `torch.func.grad` of. 
        x (Union[torch.Tensor,Tuple]): (batched) `Torch.Tensor` items whose gradients will be computed.
        bkwargs (dict): (batched) `Torch.Tensor` items whose gradients will not be computed.
        bdims (int): number of batch dimensions 
        chunk_size (int): to be passed into `torch.func.vmap`. 

    Returns: 
        gradys (tuple): `torch.Tensor` gradients, one with respect to each item in `x`. 
        y (torch.Tensor): Function evaluations. 

    Examples: 
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

        >>> f = lambda x: (x**2).sum(-1)
        >>> x = torch.rand(5,generator=rng) 
        >>> (grady,),y = gradb(f,x)
        >>> grady.shape 
        torch.Size([5])
        >>> y.shape 
        torch.Size([])
        >>> torch.allclose(y,f(x))
        True
        >>> torch.allclose(grady,2*x)
        True

        >>> f = lambda x,z: (x**2*z**2).sum(-1)
        >>> x = torch.rand(5,generator=rng) 
        >>> z = torch.rand(5,generator=rng) 
        >>> (grady_x,grady_z),y = gradb(f,(x,z))
        >>> grady_x.shape 
        torch.Size([5])
        >>> grady_z.shape 
        torch.Size([5])
        >>> y.shape
        torch.Size([])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(grady_x,2*x*z**2)
        True
        >>> torch.allclose(grady_z,x**2*2*z)
        True

        >>> f = lambda x: (x**2).sum(-1)
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> (grady,),y = gradb(f,x,bdims=2)
        >>> grady.shape 
        torch.Size([3, 4, 5])
        >>> y.shape 
        torch.Size([3, 4])
        >>> torch.allclose(y,f(x))
        True
        >>> torch.allclose(grady,2*x)
        True

        >>> f = lambda x,z: (x**2*z**2).sum(-1)
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> z = torch.rand(3,4,5,generator=rng) 
        >>> (grady_x,grady_z),y = gradb(f,(x,z),bdims=2)
        >>> grady_x.shape 
        torch.Size([3, 4, 5])
        >>> grady_z.shape 
        torch.Size([3, 4, 5])
        >>> y.shape
        torch.Size([3, 4])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(grady_x,2*x*z**2)
        True
        >>> torch.allclose(grady_z,x**2*2*z)
        True

        >>> f = lambda x,z: (x**2*z**2).sum((-2,-1))
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> z = torch.rand(3,4,5,generator=rng) 
        >>> (grady_x,),y = gradb(f,x,bkwargs={"z":z},bdims=1)
        >>> grady_x.shape 
        torch.Size([3, 4, 5])
        >>> y.shape
        torch.Size([3])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(grady_x,2*x*z**2)
        True
    """
    if isinstance(x,torch.Tensor): x = (x,)
    lenx = len(x)
    assert len(x)>=1
    batch_shape = list(x[0].shape[:bdims])
    for i,xi in enumerate(x): assert list(xi.shape[:bdims])==batch_shape, "input %d has shape = %s, but expected first dims to match batch_shape = %s"%(i,list(xi.shape),list(batch_shape))
    for k,v in bkwargs.items(): assert list(v.shape[:bdims])==batch_shape, "bkwargs['%s'].shape = %s, but expected first dims to match batch_shape = %s"%(k,list(v.shape),list(batch_shape))
    len_bkwargs = len(bkwargs) 
    bkwargs_keys = list(bkwargs.keys())
    def fwrap(*inputs):
        x = inputs[:lenx]
        bkwargs_vals = inputs[lenx:(lenx+len_bkwargs)]
        bkwargs = {bkwargs_keys[l]:bkwargs_vals[l] for l in range(len_bkwargs)}
        y = f(*x,**bkwargs)
        return y,y
    gradfwrap = torch.func.grad(fwrap,argnums=tuple(i for i in range(len(x))),has_aux=True)
    gradfwrapvec = torch.vmap(gradfwrap,in_dims=(0,)*(lenx+len_bkwargs),chunk_size=chunk_size)
    x_input = [xi.flatten(end_dim=bdims-1) if bdims>0 else xi[None,...] for xi in x]
    bkwargs_vals_input = [bkwargs[key].flatten(end_dim=bdims-1) if bdims>0 else bkwargs[key][None,...] for key in bkwargs_keys]
    gradys,y = gradfwrapvec(*x_input,*bkwargs_vals_input)
    if bdims==0:
        return tuple(grady[0] for grady in gradys),y[0]
    else:
        return tuple(grady.reshape(batch_shape+list(grady.shape[1:])) for grady in gradys),y.reshape(batch_shape+list(y.shape[1:]))

jacfwdb

jacfwdb(f, x, bkwargs={}, bdims=0, chunk_size=None)

Batched torch.func.jacfwd with function evaluation

Parameters:

Name	Type	Description	Default
`f`	`callable`	Function to compute `torch.func.jacfwd` of.	required
`x`	`Union[Tensor, Tuple]`	(batched) `Torch.Tensor` items whose jacobians will be computed.	required
`bkwargs`	`dict`	(batched) `Torch.Tensor` items whose jacobians will not be computed.	`{}`
`bdims`	`int`	number of batch dimensions	`0`
`chunk_size`	`int`	to be passed into `torch.func.vmap`.	`None`

Returns:

Name	Type	Description
`jacys`	`tuple`	`torch.Tensor` jacobians, one with respect to each item in `x`.
`y`	`Tensor`	Function evaluations.

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(5,generator=rng) 
>>> (jacy,),y = jacfwdb(f,x)
>>> jacy.shape 
torch.Size([3, 5])
>>> y.shape 
torch.Size([3])
>>> torch.allclose(y,f(x))
True
>>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x**torch.arange(1,4)[:,None])
True

>>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
>>> x = torch.rand(5,generator=rng) 
>>> z = torch.rand(5,generator=rng) 
>>> (jacy_x,jacy_z),y = jacfwdb(f,(x,z))
>>> jacy_x.shape 
torch.Size([2, 2, 5])
>>> jacy_z.shape 
torch.Size([2, 2, 5])
>>> y.shape
torch.Size([2, 2])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x**torch.arange(1,3)[:,None,None]*z**torch.arange(3,5)[None,:,None])
True
>>> torch.allclose(jacy_z,x**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z**torch.arange(2,4)[None,:,None])
True

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(3,4,5,generator=rng) 
>>> (jacy,),y = jacfwdb(f,x,bdims=2)
>>> jacy.shape 
torch.Size([3, 4, 3, 5])
>>> y.shape 
torch.Size([3, 4, 3])
>>> torch.allclose(y,f(x))
True
>>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x[...,None,:]**torch.arange(1,4)[:,None])
True

>>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
>>> x = torch.rand(3,4,5,generator=rng) 
>>> z = torch.rand(3,4,5,generator=rng) 
>>> (jacy_x,jacy_z),y = jacfwdb(f,(x,z),bdims=2)
>>> jacy_x.shape 
torch.Size([3, 4, 2, 2, 5])
>>> jacy_z.shape 
torch.Size([3, 4, 2, 2, 5])
>>> y.shape
torch.Size([3, 4, 2, 2])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x[...,None,None,:]**torch.arange(1,3)[:,None,None]*z[...,None,None,:]**torch.arange(3,5)[None,:,None])
True
>>> torch.allclose(jacy_z,x[...,None,None,:]**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z[...,None,None,:]**torch.arange(2,4)[None,:,None])
True

>>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum((-4,-3))
>>> x = torch.rand(3,4,5,6,generator=rng) 
>>> z = torch.rand(3,4,5,6,generator=rng) 
>>> (jacy_x,),y = jacfwdb(f,x,bkwargs={"z":z},bdims=2)
>>> jacy_x.shape 
torch.Size([3, 4, 2, 2, 5, 6])
>>> y.shape
torch.Size([3, 4, 2, 2])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None,None]*x[...,None,None,:,:]**torch.arange(1,3)[:,None,None,None]*z[...,None,None,:,:]**torch.arange(3,5)[None,:,None,None])
True

>>> def f(x):
...     y = (x[...,None]**torch.arange(2,4)).sum(-2)
...     u = (x[...,None]**torch.arange(3,5)).sum(-2)
...     v = (x[...,None]**torch.arange(4,6)).sum(-2)
...     return y,u,v
>>> x = torch.rand(5,generator=rng) 
>>> ((jacy_x,),(jacu_x,),(jacv_x,)),(y,u,v) = jacfwdb(f,x)
>>> jacy_x.shape
torch.Size([2, 5])
>>> jacu_x.shape
torch.Size([2, 5])
>>> jacv_x.shape
torch.Size([2, 5])
>>> y.shape 
torch.Size([2])
>>> u.shape 
torch.Size([2])
>>> v.shape
torch.Size([2])

>>> def f(x,z):
...     y = (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
...     u = (x[...,None,None]**torch.arange(3,5)[:,None]*z[...,None,None]**torch.arange(4,6)[None,:]).sum(-3)
...     v = (x[...,None,None]**torch.arange(4,6)[:,None]*z[...,None,None]**torch.arange(5,7)[None,:]).sum(-3)
...     return y,u,v
>>> x = torch.rand(3,4,5,generator=rng) 
>>> z = torch.rand(3,4,5,generator=rng) 
>>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),(y,u,v) = jacfwdb(f,(x,z),bdims=2)
>>> jacy_x.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacy_z.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacu_x.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacu_z.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacv_x.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacv_z.shape
torch.Size([3, 4, 2, 2, 5])
>>> y.shape 
torch.Size([3, 4, 2, 2])
>>> u.shape 
torch.Size([3, 4, 2, 2])
>>> v.shape
torch.Size([3, 4, 2, 2])

Source code in agsutil/autograd.py

def jacfwdb(f, x, bkwargs={}, bdims=0, chunk_size=None):
    r"""
    Batched `torch.func.jacfwd` with function evaluation 

    Args:
        f (callable): Function to compute `torch.func.jacfwd` of. 
        x (Union[torch.Tensor,Tuple]): (batched) `Torch.Tensor` items whose jacobians will be computed.
        bkwargs (dict): (batched) `Torch.Tensor` items whose jacobians will not be computed.
        bdims (int): number of batch dimensions 
        chunk_size (int): to be passed into `torch.func.vmap`. 

    Returns: 
        jacys (tuple): `torch.Tensor` jacobians, one with respect to each item in `x`. 
        y (torch.Tensor): Function evaluations. 

    Examples: 
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(5,generator=rng) 
        >>> (jacy,),y = jacfwdb(f,x)
        >>> jacy.shape 
        torch.Size([3, 5])
        >>> y.shape 
        torch.Size([3])
        >>> torch.allclose(y,f(x))
        True
        >>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x**torch.arange(1,4)[:,None])
        True

        >>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
        >>> x = torch.rand(5,generator=rng) 
        >>> z = torch.rand(5,generator=rng) 
        >>> (jacy_x,jacy_z),y = jacfwdb(f,(x,z))
        >>> jacy_x.shape 
        torch.Size([2, 2, 5])
        >>> jacy_z.shape 
        torch.Size([2, 2, 5])
        >>> y.shape
        torch.Size([2, 2])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x**torch.arange(1,3)[:,None,None]*z**torch.arange(3,5)[None,:,None])
        True
        >>> torch.allclose(jacy_z,x**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z**torch.arange(2,4)[None,:,None])
        True

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> (jacy,),y = jacfwdb(f,x,bdims=2)
        >>> jacy.shape 
        torch.Size([3, 4, 3, 5])
        >>> y.shape 
        torch.Size([3, 4, 3])
        >>> torch.allclose(y,f(x))
        True
        >>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x[...,None,:]**torch.arange(1,4)[:,None])
        True

        >>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> z = torch.rand(3,4,5,generator=rng) 
        >>> (jacy_x,jacy_z),y = jacfwdb(f,(x,z),bdims=2)
        >>> jacy_x.shape 
        torch.Size([3, 4, 2, 2, 5])
        >>> jacy_z.shape 
        torch.Size([3, 4, 2, 2, 5])
        >>> y.shape
        torch.Size([3, 4, 2, 2])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x[...,None,None,:]**torch.arange(1,3)[:,None,None]*z[...,None,None,:]**torch.arange(3,5)[None,:,None])
        True
        >>> torch.allclose(jacy_z,x[...,None,None,:]**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z[...,None,None,:]**torch.arange(2,4)[None,:,None])
        True

        >>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum((-4,-3))
        >>> x = torch.rand(3,4,5,6,generator=rng) 
        >>> z = torch.rand(3,4,5,6,generator=rng) 
        >>> (jacy_x,),y = jacfwdb(f,x,bkwargs={"z":z},bdims=2)
        >>> jacy_x.shape 
        torch.Size([3, 4, 2, 2, 5, 6])
        >>> y.shape
        torch.Size([3, 4, 2, 2])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None,None]*x[...,None,None,:,:]**torch.arange(1,3)[:,None,None,None]*z[...,None,None,:,:]**torch.arange(3,5)[None,:,None,None])
        True

        >>> def f(x):
        ...     y = (x[...,None]**torch.arange(2,4)).sum(-2)
        ...     u = (x[...,None]**torch.arange(3,5)).sum(-2)
        ...     v = (x[...,None]**torch.arange(4,6)).sum(-2)
        ...     return y,u,v
        >>> x = torch.rand(5,generator=rng) 
        >>> ((jacy_x,),(jacu_x,),(jacv_x,)),(y,u,v) = jacfwdb(f,x)
        >>> jacy_x.shape
        torch.Size([2, 5])
        >>> jacu_x.shape
        torch.Size([2, 5])
        >>> jacv_x.shape
        torch.Size([2, 5])
        >>> y.shape 
        torch.Size([2])
        >>> u.shape 
        torch.Size([2])
        >>> v.shape
        torch.Size([2])

        >>> def f(x,z):
        ...     y = (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
        ...     u = (x[...,None,None]**torch.arange(3,5)[:,None]*z[...,None,None]**torch.arange(4,6)[None,:]).sum(-3)
        ...     v = (x[...,None,None]**torch.arange(4,6)[:,None]*z[...,None,None]**torch.arange(5,7)[None,:]).sum(-3)
        ...     return y,u,v
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> z = torch.rand(3,4,5,generator=rng) 
        >>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),(y,u,v) = jacfwdb(f,(x,z),bdims=2)
        >>> jacy_x.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacy_z.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacu_x.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacu_z.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacv_x.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacv_z.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> y.shape 
        torch.Size([3, 4, 2, 2])
        >>> u.shape 
        torch.Size([3, 4, 2, 2])
        >>> v.shape
        torch.Size([3, 4, 2, 2])
    """
    if isinstance(x,torch.Tensor): x = (x,)
    lenx = len(x) 
    assert len(x)>=1
    batch_shape = list(x[0].shape[:bdims])
    for i,xi in enumerate(x): assert list(xi.shape[:bdims])==batch_shape, "input %d has shape = %s, but expected first dims to match batch_shape = %s"%(i,list(xi.shape),list(batch_shape))
    for k,v in bkwargs.items(): assert list(v.shape[:bdims])==batch_shape, "bkwargs['%s'].shape = %s, but expected first dims to match batch_shape = %s"%(k,list(v.shape),list(batch_shape))
    len_bkwargs = len(bkwargs) 
    bkwargs_keys = list(bkwargs.keys())
    def fwrap(*inputs):
        x = inputs[:lenx]
        bkwargs_vals = inputs[lenx:(lenx+len_bkwargs)]
        bkwargs = {bkwargs_keys[l]:bkwargs_vals[l] for l in range(len_bkwargs)}
        y = f(*x,**bkwargs)
        return y,y
    jacfwrap = torch.func.jacfwd(fwrap,argnums=tuple(i for i in range(len(x))),has_aux=True)
    jacfwrapvec = torch.vmap(jacfwrap,in_dims=(0,)*(lenx+len_bkwargs),chunk_size=chunk_size)
    x_input = [xi.flatten(end_dim=bdims-1) if bdims>0 else xi[None,...] for xi in x]
    bkwargs_vals_input = [bkwargs[key].flatten(end_dim=bdims-1) if bdims>0 else bkwargs[key][None,...] for key in bkwargs_keys]
    jacys,y = jacfwrapvec(*x_input,*bkwargs_vals_input)
    if isinstance(y,torch.Tensor):
        if bdims==0:
            return tuple(jacy[0] for jacy in jacys),y[0]
        else:
            return tuple(jacy.reshape(batch_shape+list(jacy.shape[1:])) for jacy in jacys),y.reshape(batch_shape+list(y.shape[1:]))
    else:
        if bdims==0:
            return tuple(tuple(jacyk[0] for jacyk in jacy) for jacy in jacys),tuple(yk[0] for yk in y)
        else:
            return tuple(tuple(jacyk.reshape(batch_shape+list(jacyk.shape[1:])) for jacyk in jacy) for jacy in jacys),tuple(yk.reshape(batch_shape+list(yk.shape[1:])) for yk in y)

jacrevb

jacrevb(f, x, bkwargs={}, bdims=0, chunk_size=None)

Batched torch.func.jacrev with function evaluation

Parameters:

Name	Type	Description	Default
`f`	`callable`	Function to compute `torch.func.jacrev` of.	required
`x`	`Tuple`	(Union[torch.Tensor,Tuple]) `Torch.Tensor` items whose jacobians will be computed.	required
`bkwargs`	`dict`	(batched) `Torch.Tensor` items whose jacobians will not be computed.	`{}`
`bdims`	`int`	number of batch dimensions	`0`
`chunk_size`	`int`	to be passed into `torch.func.vmap`.	`None`

Returns:

Name	Type	Description
`jacys`	`tuple`	`torch.Tensor` jacobians, one with respect to each item in `x`.
`y`	`Tensor`	Function evaluations.

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(5,generator=rng) 
>>> (jacy,),y = jacrevb(f,x)
>>> jacy.shape 
torch.Size([3, 5])
>>> y.shape 
torch.Size([3])
>>> torch.allclose(y,f(x))
True
>>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x**torch.arange(1,4)[:,None])
True

>>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
>>> x = torch.rand(5,generator=rng) 
>>> z = torch.rand(5,generator=rng) 
>>> (jacy_x,jacy_z),y = jacrevb(f,(x,z))
>>> jacy_x.shape 
torch.Size([2, 2, 5])
>>> jacy_z.shape 
torch.Size([2, 2, 5])
>>> y.shape
torch.Size([2, 2])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x**torch.arange(1,3)[:,None,None]*z**torch.arange(3,5)[None,:,None])
True
>>> torch.allclose(jacy_z,x**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z**torch.arange(2,4)[None,:,None])
True

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(3,4,5,generator=rng) 
>>> (jacy,),y = jacrevb(f,x,bdims=2)
>>> jacy.shape 
torch.Size([3, 4, 3, 5])
>>> y.shape 
torch.Size([3, 4, 3])
>>> torch.allclose(y,f(x))
True
>>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x[...,None,:]**torch.arange(1,4)[:,None])
True

>>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
>>> x = torch.rand(3,4,5,generator=rng) 
>>> z = torch.rand(3,4,5,generator=rng) 
>>> (jacy_x,jacy_z),y = jacrevb(f,(x,z),bdims=2)
>>> jacy_x.shape 
torch.Size([3, 4, 2, 2, 5])
>>> jacy_z.shape 
torch.Size([3, 4, 2, 2, 5])
>>> y.shape
torch.Size([3, 4, 2, 2])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x[...,None,None,:]**torch.arange(1,3)[:,None,None]*z[...,None,None,:]**torch.arange(3,5)[None,:,None])
True
>>> torch.allclose(jacy_z,x[...,None,None,:]**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z[...,None,None,:]**torch.arange(2,4)[None,:,None])
True

>>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum((-4,-3))
>>> x = torch.rand(3,4,5,6,generator=rng) 
>>> z = torch.rand(3,4,5,6,generator=rng) 
>>> (jacy_x,),y = jacrevb(f,x,bkwargs={"z":z},bdims=2)
>>> jacy_x.shape 
torch.Size([3, 4, 2, 2, 5, 6])
>>> y.shape
torch.Size([3, 4, 2, 2])
>>> torch.allclose(y,f(x,z))
True
>>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None,None]*x[...,None,None,:,:]**torch.arange(1,3)[:,None,None,None]*z[...,None,None,:,:]**torch.arange(3,5)[None,:,None,None])
True

>>> def f(x):
...     y = (x[...,None]**torch.arange(2,4)).sum(-2)
...     u = (x[...,None]**torch.arange(3,5)).sum(-2)
...     v = (x[...,None]**torch.arange(4,6)).sum(-2)
...     return y,u,v
>>> x = torch.rand(5,generator=rng) 
>>> ((jacy_x,),(jacu_x,),(jacv_x,)),(y,u,v) = jacrevb(f,x)
>>> jacy_x.shape
torch.Size([2, 5])
>>> jacu_x.shape
torch.Size([2, 5])
>>> jacv_x.shape
torch.Size([2, 5])
>>> y.shape 
torch.Size([2])
>>> u.shape 
torch.Size([2])
>>> v.shape
torch.Size([2])

>>> def f(x,z):
...     y = (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
...     u = (x[...,None,None]**torch.arange(3,5)[:,None]*z[...,None,None]**torch.arange(4,6)[None,:]).sum(-3)
...     v = (x[...,None,None]**torch.arange(4,6)[:,None]*z[...,None,None]**torch.arange(5,7)[None,:]).sum(-3)
...     return y,u,v
>>> x = torch.rand(3,4,5,generator=rng) 
>>> z = torch.rand(3,4,5,generator=rng) 
>>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),(y,u,v) = jacrevb(f,(x,z),bdims=2)
>>> jacy_x.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacy_z.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacu_x.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacu_z.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacv_x.shape
torch.Size([3, 4, 2, 2, 5])
>>> jacv_z.shape
torch.Size([3, 4, 2, 2, 5])
>>> y.shape 
torch.Size([3, 4, 2, 2])
>>> u.shape 
torch.Size([3, 4, 2, 2])
>>> v.shape
torch.Size([3, 4, 2, 2])

Source code in agsutil/autograd.py

def jacrevb(f, x, bkwargs={}, bdims=0, chunk_size=None):
    r"""
    Batched `torch.func.jacrev` with function evaluation 

    Args:
        f (callable): Function to compute `torch.func.jacrev` of. 
        x (Tuple): (Union[torch.Tensor,Tuple]) `Torch.Tensor` items whose jacobians will be computed.
        bkwargs (dict): (batched) `Torch.Tensor` items whose jacobians will not be computed.
        bdims (int): number of batch dimensions 
        chunk_size (int): to be passed into `torch.func.vmap`. 

    Returns: 
        jacys (tuple): `torch.Tensor` jacobians, one with respect to each item in `x`. 
        y (torch.Tensor): Function evaluations. 

    Examples: 
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(5,generator=rng) 
        >>> (jacy,),y = jacrevb(f,x)
        >>> jacy.shape 
        torch.Size([3, 5])
        >>> y.shape 
        torch.Size([3])
        >>> torch.allclose(y,f(x))
        True
        >>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x**torch.arange(1,4)[:,None])
        True

        >>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
        >>> x = torch.rand(5,generator=rng) 
        >>> z = torch.rand(5,generator=rng) 
        >>> (jacy_x,jacy_z),y = jacrevb(f,(x,z))
        >>> jacy_x.shape 
        torch.Size([2, 2, 5])
        >>> jacy_z.shape 
        torch.Size([2, 2, 5])
        >>> y.shape
        torch.Size([2, 2])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x**torch.arange(1,3)[:,None,None]*z**torch.arange(3,5)[None,:,None])
        True
        >>> torch.allclose(jacy_z,x**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z**torch.arange(2,4)[None,:,None])
        True

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> (jacy,),y = jacrevb(f,x,bdims=2)
        >>> jacy.shape 
        torch.Size([3, 4, 3, 5])
        >>> y.shape 
        torch.Size([3, 4, 3])
        >>> torch.allclose(y,f(x))
        True
        >>> torch.allclose(jacy,torch.arange(2,5)[:,None]*x[...,None,:]**torch.arange(1,4)[:,None])
        True

        >>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> z = torch.rand(3,4,5,generator=rng) 
        >>> (jacy_x,jacy_z),y = jacrevb(f,(x,z),bdims=2)
        >>> jacy_x.shape 
        torch.Size([3, 4, 2, 2, 5])
        >>> jacy_z.shape 
        torch.Size([3, 4, 2, 2, 5])
        >>> y.shape
        torch.Size([3, 4, 2, 2])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None]*x[...,None,None,:]**torch.arange(1,3)[:,None,None]*z[...,None,None,:]**torch.arange(3,5)[None,:,None])
        True
        >>> torch.allclose(jacy_z,x[...,None,None,:]**torch.arange(2,4)[:,None,None]*torch.arange(3,5)[None,:,None]*z[...,None,None,:]**torch.arange(2,4)[None,:,None])
        True

        >>> f = lambda x,z: (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum((-4,-3))
        >>> x = torch.rand(3,4,5,6,generator=rng) 
        >>> z = torch.rand(3,4,5,6,generator=rng) 
        >>> (jacy_x,),y = jacrevb(f,x,bkwargs={"z":z},bdims=2)
        >>> jacy_x.shape 
        torch.Size([3, 4, 2, 2, 5, 6])
        >>> y.shape
        torch.Size([3, 4, 2, 2])
        >>> torch.allclose(y,f(x,z))
        True
        >>> torch.allclose(jacy_x,torch.arange(2,4)[:,None,None,None]*x[...,None,None,:,:]**torch.arange(1,3)[:,None,None,None]*z[...,None,None,:,:]**torch.arange(3,5)[None,:,None,None])
        True

        >>> def f(x):
        ...     y = (x[...,None]**torch.arange(2,4)).sum(-2)
        ...     u = (x[...,None]**torch.arange(3,5)).sum(-2)
        ...     v = (x[...,None]**torch.arange(4,6)).sum(-2)
        ...     return y,u,v
        >>> x = torch.rand(5,generator=rng) 
        >>> ((jacy_x,),(jacu_x,),(jacv_x,)),(y,u,v) = jacrevb(f,x)
        >>> jacy_x.shape
        torch.Size([2, 5])
        >>> jacu_x.shape
        torch.Size([2, 5])
        >>> jacv_x.shape
        torch.Size([2, 5])
        >>> y.shape 
        torch.Size([2])
        >>> u.shape 
        torch.Size([2])
        >>> v.shape
        torch.Size([2])

        >>> def f(x,z):
        ...     y = (x[...,None,None]**torch.arange(2,4)[:,None]*z[...,None,None]**torch.arange(3,5)[None,:]).sum(-3)
        ...     u = (x[...,None,None]**torch.arange(3,5)[:,None]*z[...,None,None]**torch.arange(4,6)[None,:]).sum(-3)
        ...     v = (x[...,None,None]**torch.arange(4,6)[:,None]*z[...,None,None]**torch.arange(5,7)[None,:]).sum(-3)
        ...     return y,u,v
        >>> x = torch.rand(3,4,5,generator=rng) 
        >>> z = torch.rand(3,4,5,generator=rng) 
        >>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),(y,u,v) = jacrevb(f,(x,z),bdims=2)
        >>> jacy_x.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacy_z.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacu_x.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacu_z.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacv_x.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> jacv_z.shape
        torch.Size([3, 4, 2, 2, 5])
        >>> y.shape 
        torch.Size([3, 4, 2, 2])
        >>> u.shape 
        torch.Size([3, 4, 2, 2])
        >>> v.shape
        torch.Size([3, 4, 2, 2])
    """
    if isinstance(x,torch.Tensor): x = (x,)
    lenx = len(x) 
    assert len(x)>=1
    batch_shape = list(x[0].shape[:bdims])
    for i,xi in enumerate(x): assert list(xi.shape[:bdims])==batch_shape, "input %d has shape = %s, but expected first dims to match batch_shape = %s"%(i,list(xi.shape),list(batch_shape))
    for k,v in bkwargs.items(): assert list(v.shape[:bdims])==batch_shape, "bkwargs['%s'].shape = %s, but expected first dims to match batch_shape = %s"%(k,list(v.shape),list(batch_shape))
    len_bkwargs = len(bkwargs) 
    bkwargs_keys = list(bkwargs.keys())
    def fwrap(*inputs):
        x = inputs[:lenx]
        bkwargs_vals = inputs[lenx:(lenx+len_bkwargs)]
        bkwargs = {bkwargs_keys[l]:bkwargs_vals[l] for l in range(len_bkwargs)}
        y = f(*x,**bkwargs)
        return y,y
    jacfwrap = torch.func.jacrev(fwrap,argnums=tuple(i for i in range(len(x))),has_aux=True)
    jacfwrapvec = torch.vmap(jacfwrap,in_dims=(0,)*(lenx+len_bkwargs),chunk_size=chunk_size)
    x_input = [xi.flatten(end_dim=bdims-1) if bdims>0 else xi[None,...] for xi in x]
    bkwargs_vals_input = [bkwargs[key].flatten(end_dim=bdims-1) if bdims>0 else bkwargs[key][None,...] for key in bkwargs_keys]
    jacys,y = jacfwrapvec(*x_input,*bkwargs_vals_input)
    if isinstance(y,torch.Tensor):
        if bdims==0:
            return tuple(jacy[0] for jacy in jacys),y[0]
        else:
            return tuple(jacy.reshape(batch_shape+list(jacy.shape[1:])) for jacy in jacys),y.reshape(batch_shape+list(y.shape[1:]))
    else:
        if bdims==0:
            return tuple(tuple(jacyk[0] for jacyk in jacy) for jacy in jacys),tuple(yk[0] for yk in y)
        else:
            return tuple(tuple(jacyk.reshape(batch_shape+list(jacyk.shape[1:])) for jacyk in jacy) for jacy in jacys),tuple(yk.reshape(batch_shape+list(yk.shape[1:])) for yk in y)

jvpb

jvpb(f, x, p, bkwargs={}, bdims=0, chunk_size=None)

Batched torch.func.jvp with function evaluation (forward-mode auto-diff)

Parameters:

Name	Type	Description	Default
`f`	`callable`	Function to compute `torch.func.jvp` of.	required
`x`	`Union[Tensor, Tuple]`	(batched) `Torch.Tensor` primals.	required
`p`	`Union[Tensor, Tuple]`	(batched) `Torch.Tensor` tangents.	required
`bkwargs`	`dict`	(batched) `Torch.Tensor` items whose jacobians will not be computed.	`{}`
`bdims`	`int`	number of batch dimensions	`0`
`chunk_size`	`int`	to be passed into `torch.func.vmap`.	`None`

Returns:

Name	Type	Description
`jvp`	`tuple`	`torch.Tensor` jacobian vector products, one with respect to each item in `x`.
`y`	`Tensor`	Function evaluations.

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(5,generator=rng)
>>> p = torch.rand(5,generator=rng)
>>> (jvpy,),y = jvpb(f,x,p)
>>> jvpy.shape 
torch.Size([3])
>>> y.shape 
torch.Size([3])
>>> torch.allclose(y,f(x))
True
>>> (jac_x,),_ = jacfwdb(f,x)
>>> torch.allclose(jvpy,jac_x@p)
True

>>> f = lambda x,z: (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
>>> x = torch.rand(5,generator=rng)
>>> z = torch.rand(4,generator=rng)
>>> p = torch.rand(5,generator=rng)
>>> q = torch.rand(4,generator=rng)
>>> (jvpy,),y = jvpb(f,(x,z),(p,q))
>>> jvpy.shape
torch.Size([3])
>>> y.shape
torch.Size([3])
>>> torch.allclose(y,f(x,z))
True
>>> (jac_x,jac_z),_ = jacfwdb(f,(x,z))
>>> torch.allclose(jvpy,jac_x@p+jac_z@q)
True

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(6,4,5,generator=rng)
>>> p = torch.rand(6,4,5,generator=rng)
>>> (jvpy,),y = jvpb(f,x,p,bdims=2)
>>> jvpy.shape 
torch.Size([6, 4, 3])
>>> y.shape 
torch.Size([6, 4, 3])
>>> torch.allclose(y,f(x))
True
>>> (jac_x,),_ = jacfwdb(f,x,bdims=2)
>>> torch.allclose(jvpy,(jac_x*p[...,None,:]).sum(-1))
True

>>> f = lambda x,z: (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
>>> x = torch.rand(6,7,5,generator=rng)
>>> z = torch.rand(6,7,4,generator=rng)
>>> p = torch.rand(6,7,5,generator=rng)
>>> q = torch.rand(6,7,4,generator=rng)
>>> (jvpy,),y = jvpb(f,(x,z),(p,q),bdims=2)
>>> jvpy.shape
torch.Size([6, 7, 3])
>>> y.shape
torch.Size([6, 7, 3])
>>> torch.allclose(y,f(x,z))
True
>>> (jac_x,jac_z),_ = jacfwdb(f,(x,z),bdims=2)
>>> torch.allclose(jvpy,(jac_x*p[...,None,:]).sum(-1)+(jac_z*q[...,None,:]).sum(-1))
True

>>> f = lambda x,z: (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
>>> x = torch.rand(6,7,5,generator=rng)
>>> z = torch.rand(6,7,4,generator=rng)
>>> p = torch.rand(6,7,5,generator=rng)
>>> (jvpy,),y = jvpb(f,x,p,bkwargs={"z":z},bdims=2)
>>> jvpy.shape
torch.Size([6, 7, 3])
>>> y.shape
torch.Size([6, 7, 3])
>>> torch.allclose(y,f(x,z))
True
>>> (jac_x,jac_z),_ = jacfwdb(f,(x,z),bdims=2)
>>> torch.allclose(jvpy,(jac_x*p[...,None,:]).sum(-1))
True

>>> def f(x):
...     y = (x[...,:,None]**torch.arange(2,5)).sum(-2)
...     u = (x[...,:,None]**torch.arange(3,6)).sum(-2)
...     v = (x[...,:,None]**torch.arange(4,7)).sum(-2)
...     return y,u,v
>>> x = torch.rand(5,generator=rng)
>>> p = torch.rand(5,generator=rng)
>>> (jvpy,jvpu,jvpv),(y,u,v) = jvpb(f,x,p)
>>> jvpy.shape
torch.Size([3])
>>> jvpu.shape
torch.Size([3])
>>> jvpv.shape
torch.Size([3])
>>> y.shape
torch.Size([3])
>>> u.shape
torch.Size([3])
>>> v.shape
torch.Size([3])
>>> torch.allclose(y,f(x)[0])
True
>>> torch.allclose(u,f(x)[1])
True
>>> torch.allclose(v,f(x)[2])
True
>>> ((jacy_x,),(jacu_x,),(jacv_x,)),_ = jacfwdb(f,x)
>>> torch.allclose(jvpy,(jacy_x*p[...,None,:]).sum(-1))
True
>>> torch.allclose(jvpu,(jacu_x*p[...,None,:]).sum(-1))
True
>>> torch.allclose(jvpv,(jacv_x*p[...,None,:]).sum(-1))
True

>>> def f(x,z):
...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
...     return y,u,v
>>> x = torch.rand(6,7,5,generator=rng)
>>> z = torch.rand(6,7,4,generator=rng)
>>> p = torch.rand(6,7,5,generator=rng)
>>> q = torch.rand(6,7,4,generator=rng)
>>> (jvpy,jvpu,jvpv),(y,u,v) = jvpb(f,(x,z),(p,q),bdims=2)
>>> jvpy.shape
torch.Size([6, 7, 3])
>>> jvpu.shape
torch.Size([6, 7, 3])
>>> jvpv.shape
torch.Size([6, 7, 3])
>>> y.shape
torch.Size([6, 7, 3])
>>> u.shape
torch.Size([6, 7, 3])
>>> v.shape
torch.Size([6, 7, 3])
>>> torch.allclose(y,f(x,z)[0])
True
>>> torch.allclose(u,f(x,z)[1])
True
>>> torch.allclose(v,f(x,z)[2])
True
>>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z),bdims=2)
>>> torch.allclose(jvpy,(jacy_x*p[...,None,:]).sum(-1)+(jacy_z*q[...,None,:]).sum(-1))
True
>>> torch.allclose(jvpu,(jacu_x*p[...,None,:]).sum(-1)+(jacu_z*q[...,None,:]).sum(-1))
True
>>> torch.allclose(jvpv,(jacv_x*p[...,None,:]).sum(-1)+(jacv_z*q[...,None,:]).sum(-1))
True

Source code in agsutil/autograd.py

def jvpb(f, x, p, bkwargs={}, bdims=0, chunk_size=None):
    r"""
    Batched `torch.func.jvp` with function evaluation (forward-mode auto-diff)

    Args:
        f (callable): Function to compute `torch.func.jvp` of.
        x (Union[torch.Tensor,Tuple]): (batched) `Torch.Tensor` primals.
        p (Union[torch.Tensor,Tuple]): (batched) `Torch.Tensor` tangents.
        bkwargs (dict): (batched) `Torch.Tensor` items whose jacobians will not be computed.
        bdims (int): number of batch dimensions 
        chunk_size (int): to be passed into `torch.func.vmap`. 

    Returns: 
        jvp (tuple): `torch.Tensor` jacobian vector products, one with respect to each item in `x`. 
        y (torch.Tensor): Function evaluations. 

    Examples: 
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(5,generator=rng)
        >>> p = torch.rand(5,generator=rng)
        >>> (jvpy,),y = jvpb(f,x,p)
        >>> jvpy.shape 
        torch.Size([3])
        >>> y.shape 
        torch.Size([3])
        >>> torch.allclose(y,f(x))
        True
        >>> (jac_x,),_ = jacfwdb(f,x)
        >>> torch.allclose(jvpy,jac_x@p)
        True

        >>> f = lambda x,z: (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
        >>> x = torch.rand(5,generator=rng)
        >>> z = torch.rand(4,generator=rng)
        >>> p = torch.rand(5,generator=rng)
        >>> q = torch.rand(4,generator=rng)
        >>> (jvpy,),y = jvpb(f,(x,z),(p,q))
        >>> jvpy.shape
        torch.Size([3])
        >>> y.shape
        torch.Size([3])
        >>> torch.allclose(y,f(x,z))
        True
        >>> (jac_x,jac_z),_ = jacfwdb(f,(x,z))
        >>> torch.allclose(jvpy,jac_x@p+jac_z@q)
        True

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(6,4,5,generator=rng)
        >>> p = torch.rand(6,4,5,generator=rng)
        >>> (jvpy,),y = jvpb(f,x,p,bdims=2)
        >>> jvpy.shape 
        torch.Size([6, 4, 3])
        >>> y.shape 
        torch.Size([6, 4, 3])
        >>> torch.allclose(y,f(x))
        True
        >>> (jac_x,),_ = jacfwdb(f,x,bdims=2)
        >>> torch.allclose(jvpy,(jac_x*p[...,None,:]).sum(-1))
        True

        >>> f = lambda x,z: (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
        >>> x = torch.rand(6,7,5,generator=rng)
        >>> z = torch.rand(6,7,4,generator=rng)
        >>> p = torch.rand(6,7,5,generator=rng)
        >>> q = torch.rand(6,7,4,generator=rng)
        >>> (jvpy,),y = jvpb(f,(x,z),(p,q),bdims=2)
        >>> jvpy.shape
        torch.Size([6, 7, 3])
        >>> y.shape
        torch.Size([6, 7, 3])
        >>> torch.allclose(y,f(x,z))
        True
        >>> (jac_x,jac_z),_ = jacfwdb(f,(x,z),bdims=2)
        >>> torch.allclose(jvpy,(jac_x*p[...,None,:]).sum(-1)+(jac_z*q[...,None,:]).sum(-1))
        True

        >>> f = lambda x,z: (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
        >>> x = torch.rand(6,7,5,generator=rng)
        >>> z = torch.rand(6,7,4,generator=rng)
        >>> p = torch.rand(6,7,5,generator=rng)
        >>> (jvpy,),y = jvpb(f,x,p,bkwargs={"z":z},bdims=2)
        >>> jvpy.shape
        torch.Size([6, 7, 3])
        >>> y.shape
        torch.Size([6, 7, 3])
        >>> torch.allclose(y,f(x,z))
        True
        >>> (jac_x,jac_z),_ = jacfwdb(f,(x,z),bdims=2)
        >>> torch.allclose(jvpy,(jac_x*p[...,None,:]).sum(-1))
        True

        >>> def f(x):
        ...     y = (x[...,:,None]**torch.arange(2,5)).sum(-2)
        ...     u = (x[...,:,None]**torch.arange(3,6)).sum(-2)
        ...     v = (x[...,:,None]**torch.arange(4,7)).sum(-2)
        ...     return y,u,v
        >>> x = torch.rand(5,generator=rng)
        >>> p = torch.rand(5,generator=rng)
        >>> (jvpy,jvpu,jvpv),(y,u,v) = jvpb(f,x,p)
        >>> jvpy.shape
        torch.Size([3])
        >>> jvpu.shape
        torch.Size([3])
        >>> jvpv.shape
        torch.Size([3])
        >>> y.shape
        torch.Size([3])
        >>> u.shape
        torch.Size([3])
        >>> v.shape
        torch.Size([3])
        >>> torch.allclose(y,f(x)[0])
        True
        >>> torch.allclose(u,f(x)[1])
        True
        >>> torch.allclose(v,f(x)[2])
        True
        >>> ((jacy_x,),(jacu_x,),(jacv_x,)),_ = jacfwdb(f,x)
        >>> torch.allclose(jvpy,(jacy_x*p[...,None,:]).sum(-1))
        True
        >>> torch.allclose(jvpu,(jacu_x*p[...,None,:]).sum(-1))
        True
        >>> torch.allclose(jvpv,(jacv_x*p[...,None,:]).sum(-1))
        True

        >>> def f(x,z):
        ...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
        ...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
        ...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
        ...     return y,u,v
        >>> x = torch.rand(6,7,5,generator=rng)
        >>> z = torch.rand(6,7,4,generator=rng)
        >>> p = torch.rand(6,7,5,generator=rng)
        >>> q = torch.rand(6,7,4,generator=rng)
        >>> (jvpy,jvpu,jvpv),(y,u,v) = jvpb(f,(x,z),(p,q),bdims=2)
        >>> jvpy.shape
        torch.Size([6, 7, 3])
        >>> jvpu.shape
        torch.Size([6, 7, 3])
        >>> jvpv.shape
        torch.Size([6, 7, 3])
        >>> y.shape
        torch.Size([6, 7, 3])
        >>> u.shape
        torch.Size([6, 7, 3])
        >>> v.shape
        torch.Size([6, 7, 3])
        >>> torch.allclose(y,f(x,z)[0])
        True
        >>> torch.allclose(u,f(x,z)[1])
        True
        >>> torch.allclose(v,f(x,z)[2])
        True
        >>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z),bdims=2)
        >>> torch.allclose(jvpy,(jacy_x*p[...,None,:]).sum(-1)+(jacy_z*q[...,None,:]).sum(-1))
        True
        >>> torch.allclose(jvpu,(jacu_x*p[...,None,:]).sum(-1)+(jacu_z*q[...,None,:]).sum(-1))
        True
        >>> torch.allclose(jvpv,(jacv_x*p[...,None,:]).sum(-1)+(jacv_z*q[...,None,:]).sum(-1))
        True
    """
    if isinstance(x,torch.Tensor): x = (x,)
    if isinstance(p,torch.Tensor): p = (p,)
    lenx = len(x) 
    assert len(x)>=1
    assert len(p)==lenx
    batch_shape = list(x[0].shape[:bdims])
    for i,xi in enumerate(x): assert list(xi.shape[:bdims])==batch_shape, "x input %d has shape = %s, but expected first dims to match batch_shape = %s"%(i,list(xi.shape),list(batch_shape))
    for i,pi in enumerate(p): assert list(pi.shape[:bdims])==batch_shape, "p input %d has shape = %s, but expected first dims to match batch_shape = %s"%(i,list(xi.shape),list(batch_shape))
    for k,v in bkwargs.items(): assert list(v.shape[:bdims])==batch_shape, "bkwargs['%s'].shape = %s, but expected first dims to match batch_shape = %s"%(k,list(v.shape),list(batch_shape))
    len_bkwargs = len(bkwargs) 
    bkwargs_keys = list(bkwargs.keys())
    def jvpfwrap(*inputs):
        x,p,bkwargs_vals = inputs[:lenx],inputs[lenx:(2*lenx)],inputs[(2*lenx):]
        return torch.func.jvp(lambda *x: f(*x,*bkwargs_vals),x,p)
    jvpfwrapvec = torch.vmap(jvpfwrap,in_dims=(0,)*(2*lenx+len_bkwargs),chunk_size=chunk_size)
    x_input = [xi.flatten(end_dim=bdims-1) if bdims>0 else xi[None,...] for xi in x]
    p_input = [pi.flatten(end_dim=bdims-1) if bdims>0 else pi[None,...] for pi in p]
    bkwargs_vals_input = [bkwargs[key].flatten(end_dim=bdims-1) if bdims>0 else bkwargs[key][None,...] for key in bkwargs_keys]
    y,jvpy = jvpfwrapvec(*x_input,*p_input,*bkwargs_vals_input)
    if isinstance(jvpy,torch.Tensor): jvpy = (jvpy,)
    if isinstance(y,torch.Tensor):
        if bdims==0:
            return tuple(jvpyk[0] for jvpyk in jvpy),y[0]
        else:
            return tuple(jvpyk.reshape(batch_shape+list(jvpyk.shape[1:])) for jvpyk in jvpy),y.reshape(batch_shape+list(y.shape[1:]))
    else:
        if bdims==0:
            return tuple(jvpyk[0] for jvpyk in jvpy),tuple(yk[0] for yk in y)
        else:
            return tuple(jvpyk.reshape(batch_shape+list(jvpyk.shape[1:])) for jvpyk in jvpy),tuple(yk.reshape(batch_shape+list(yk.shape[1:])) for yk in y)

vjpb

vjpb(f, x, p, bkwargs={}, bdims=0, chunk_size=None)

Batched torch.func.vjp with function evaluation (forward-mode auto-diff)

Parameters:

Name	Type	Description	Default
`f`	`callable`	Function to compute `torch.func.vjp` of.	required
`x`	`Union[Tensor, Tuple]`	(batched) `Torch.Tensor` primals.	required
`p`	`Union[Tensor, Tuple]`	(batched) `Torch.Tensor` tangents.	required
`bkwargs`	`dict`	(batched) `Torch.Tensor` items whose jacobians will not be computed.	`{}`
`bdims`	`int`	number of batch dimensions	`0`
`chunk_size`	`int`	to be passed into `torch.func.vmap`.	`None`

Returns:

Name	Type	Description
`vjp`	`tuple`	`torch.Tensor` jacobian vector products, one with respect to each item in `x`.
`y`	`Tensor`	Function evaluations.

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(5,generator=rng)
>>> p = torch.rand(3,generator=rng)
>>> (vjpx,),y = vjpb(f,x,p)
>>> vjpx.shape 
torch.Size([5])
>>> y.shape 
torch.Size([3])
>>> torch.allclose(y,f(x))
True
>>> (jac_x,),_ = jacfwdb(f,x)
>>> torch.allclose(vjpx,p@jac_x)
True

>>> def f(x,z):
...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
...     return y,u,v
>>> x = torch.rand(5,generator=rng)
>>> z = torch.rand(6,generator=rng)
>>> p = torch.rand(3,generator=rng)
>>> q = torch.rand(3,generator=rng)
>>> r = torch.rand(3,generator=rng)
>>> (vjp_x,vjp_z),(y,u,v) = vjpb(f,(x,z),(p,q,r))
>>> vjp_x.shape
torch.Size([5])
>>> vjp_z.shape
torch.Size([6])
>>> y.shape
torch.Size([3])
>>> v.shape
torch.Size([3])
>>> torch.allclose(y,f(x,z)[0])
True
>>> torch.allclose(u,f(x,z)[1])
True
>>> torch.allclose(v,f(x,z)[1])
False
>>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z))
>>> torch.allclose(vjp_x,p@jacy_x+q@jacu_x+r@jacv_x)
True
>>> torch.allclose(vjp_z,p@jacy_z+q@jacu_z+r@jacv_z)
True

>>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
>>> x = torch.rand(7,5,generator=rng)
>>> p = torch.rand(7,3,generator=rng)
>>> (vjpx,),y = vjpb(f,x,p,bdims=1)
>>> vjpx.shape 
torch.Size([7, 5])
>>> y.shape 
torch.Size([7, 3])
>>> torch.allclose(y,f(x))
True
>>> (jac_x,),_ = jacfwdb(f,x,bdims=1)
>>> torch.allclose(vjpx,(p[...,None]*jac_x).sum(-2))
True

>>> def f(x,z):
...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
...     return y,u,v
>>> x = torch.rand(7,5,generator=rng)
>>> z = torch.rand(7,6,generator=rng)
>>> p = torch.rand(7,3,generator=rng)
>>> q = torch.rand(7,3,generator=rng)
>>> r = torch.rand(7,3,generator=rng)
>>> (vjp_x,vjp_z),(y,u,v) = vjpb(f,(x,z),(p,q,r),bdims=1)
>>> vjp_x.shape
torch.Size([7, 5])
>>> vjp_z.shape
torch.Size([7, 6])
>>> y.shape
torch.Size([7, 3])
>>> v.shape
torch.Size([7, 3])
>>> torch.allclose(y,f(x,z)[0])
True
>>> torch.allclose(u,f(x,z)[1])
True
>>> torch.allclose(v,f(x,z)[1])
False
>>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z),bdims=1)
>>> torch.allclose(vjp_x,(p[...,None]*jacy_x).sum(-2)+(q[...,None]*jacu_x).sum(-2)+(r[...,None]*jacv_x).sum(-2))
True
>>> torch.allclose(vjp_z,(p[...,None]*jacy_z).sum(-2)+(q[...,None]*jacu_z).sum(-2)+(r[...,None]*jacv_z).sum(-2))
True

>>> def f(x,z):
...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
...     return y,u,v
>>> x = torch.rand(7,5,generator=rng)
>>> z = torch.rand(7,6,generator=rng)
>>> p = torch.rand(7,3,generator=rng)
>>> q = torch.rand(7,3,generator=rng)
>>> r = torch.rand(7,3,generator=rng)
>>> (vjp_x,),(y,u,v) = vjpb(f,x,(p,q,r),bkwargs={"z":z},bdims=1)
>>> vjp_x.shape
torch.Size([7, 5])
>>> y.shape
torch.Size([7, 3])
>>> v.shape
torch.Size([7, 3])
>>> torch.allclose(y,f(x,z)[0])
True
>>> torch.allclose(u,f(x,z)[1])
True
>>> torch.allclose(v,f(x,z)[1])
False
>>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z),bdims=1)
>>> torch.allclose(vjp_x,(p[...,None]*jacy_x).sum(-2)+(q[...,None]*jacu_x).sum(-2)+(r[...,None]*jacv_x).sum(-2))
True

Source code in agsutil/autograd.py

def vjpb(f, x, p, bkwargs={}, bdims=0, chunk_size=None):
    r"""
    Batched `torch.func.vjp` with function evaluation (forward-mode auto-diff)

    Args:
        f (callable): Function to compute `torch.func.vjp` of.
        x (Union[torch.Tensor,Tuple]): (batched) `Torch.Tensor` primals.
        p (Union[torch.Tensor,Tuple]): (batched) `Torch.Tensor` tangents.
        bkwargs (dict): (batched) `Torch.Tensor` items whose jacobians will not be computed.
        bdims (int): number of batch dimensions 
        chunk_size (int): to be passed into `torch.func.vmap`. 

    Returns: 
        vjp (tuple): `torch.Tensor` jacobian vector products, one with respect to each item in `x`. 
        y (torch.Tensor): Function evaluations. 

    Examples: 
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(5,generator=rng)
        >>> p = torch.rand(3,generator=rng)
        >>> (vjpx,),y = vjpb(f,x,p)
        >>> vjpx.shape 
        torch.Size([5])
        >>> y.shape 
        torch.Size([3])
        >>> torch.allclose(y,f(x))
        True
        >>> (jac_x,),_ = jacfwdb(f,x)
        >>> torch.allclose(vjpx,p@jac_x)
        True

        >>> def f(x,z):
        ...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
        ...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
        ...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
        ...     return y,u,v
        >>> x = torch.rand(5,generator=rng)
        >>> z = torch.rand(6,generator=rng)
        >>> p = torch.rand(3,generator=rng)
        >>> q = torch.rand(3,generator=rng)
        >>> r = torch.rand(3,generator=rng)
        >>> (vjp_x,vjp_z),(y,u,v) = vjpb(f,(x,z),(p,q,r))
        >>> vjp_x.shape
        torch.Size([5])
        >>> vjp_z.shape
        torch.Size([6])
        >>> y.shape
        torch.Size([3])
        >>> v.shape
        torch.Size([3])
        >>> torch.allclose(y,f(x,z)[0])
        True
        >>> torch.allclose(u,f(x,z)[1])
        True
        >>> torch.allclose(v,f(x,z)[1])
        False
        >>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z))
        >>> torch.allclose(vjp_x,p@jacy_x+q@jacu_x+r@jacv_x)
        True
        >>> torch.allclose(vjp_z,p@jacy_z+q@jacu_z+r@jacv_z)
        True

        >>> f = lambda x: (x[...,None]**torch.arange(2,5)).sum(-2)
        >>> x = torch.rand(7,5,generator=rng)
        >>> p = torch.rand(7,3,generator=rng)
        >>> (vjpx,),y = vjpb(f,x,p,bdims=1)
        >>> vjpx.shape 
        torch.Size([7, 5])
        >>> y.shape 
        torch.Size([7, 3])
        >>> torch.allclose(y,f(x))
        True
        >>> (jac_x,),_ = jacfwdb(f,x,bdims=1)
        >>> torch.allclose(vjpx,(p[...,None]*jac_x).sum(-2))
        True

        >>> def f(x,z):
        ...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
        ...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
        ...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
        ...     return y,u,v
        >>> x = torch.rand(7,5,generator=rng)
        >>> z = torch.rand(7,6,generator=rng)
        >>> p = torch.rand(7,3,generator=rng)
        >>> q = torch.rand(7,3,generator=rng)
        >>> r = torch.rand(7,3,generator=rng)
        >>> (vjp_x,vjp_z),(y,u,v) = vjpb(f,(x,z),(p,q,r),bdims=1)
        >>> vjp_x.shape
        torch.Size([7, 5])
        >>> vjp_z.shape
        torch.Size([7, 6])
        >>> y.shape
        torch.Size([7, 3])
        >>> v.shape
        torch.Size([7, 3])
        >>> torch.allclose(y,f(x,z)[0])
        True
        >>> torch.allclose(u,f(x,z)[1])
        True
        >>> torch.allclose(v,f(x,z)[1])
        False
        >>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z),bdims=1)
        >>> torch.allclose(vjp_x,(p[...,None]*jacy_x).sum(-2)+(q[...,None]*jacu_x).sum(-2)+(r[...,None]*jacv_x).sum(-2))
        True
        >>> torch.allclose(vjp_z,(p[...,None]*jacy_z).sum(-2)+(q[...,None]*jacu_z).sum(-2)+(r[...,None]*jacv_z).sum(-2))
        True

        >>> def f(x,z):
        ...     y = (x[...,:,None,None]**torch.arange(2,5)*z[...,None,:,None]**torch.arange(1,4)).sum((-3,-2))
        ...     u = (x[...,:,None,None]**torch.arange(3,6)*z[...,None,:,None]**torch.arange(2,5)).sum((-3,-2))
        ...     v = (x[...,:,None,None]**torch.arange(4,7)*z[...,None,:,None]**torch.arange(3,6)).sum((-3,-2))
        ...     return y,u,v
        >>> x = torch.rand(7,5,generator=rng)
        >>> z = torch.rand(7,6,generator=rng)
        >>> p = torch.rand(7,3,generator=rng)
        >>> q = torch.rand(7,3,generator=rng)
        >>> r = torch.rand(7,3,generator=rng)
        >>> (vjp_x,),(y,u,v) = vjpb(f,x,(p,q,r),bkwargs={"z":z},bdims=1)
        >>> vjp_x.shape
        torch.Size([7, 5])
        >>> y.shape
        torch.Size([7, 3])
        >>> v.shape
        torch.Size([7, 3])
        >>> torch.allclose(y,f(x,z)[0])
        True
        >>> torch.allclose(u,f(x,z)[1])
        True
        >>> torch.allclose(v,f(x,z)[1])
        False
        >>> ((jacy_x,jacy_z),(jacu_x,jacu_z),(jacv_x,jacv_z)),_ = jacfwdb(f,(x,z),bdims=1)
        >>> torch.allclose(vjp_x,(p[...,None]*jacy_x).sum(-2)+(q[...,None]*jacu_x).sum(-2)+(r[...,None]*jacv_x).sum(-2))
        True
    """
    if isinstance(x,torch.Tensor): x = (x,)
    if isinstance(p,torch.Tensor): p = (p,)
    lenx = len(x) 
    assert len(x)>=1
    lenp = len(p)
    assert len(p)>=1
    batch_shape = list(x[0].shape[:bdims])
    for i,xi in enumerate(x): assert list(xi.shape[:bdims])==batch_shape, "x input %d has shape = %s, but expected first dims to match batch_shape = %s"%(i,list(xi.shape),list(batch_shape))
    for i,pi in enumerate(p): assert list(pi.shape[:bdims])==batch_shape, "p input %d has shape = %s, but expected first dims to match batch_shape = %s"%(i,list(xi.shape),list(batch_shape))
    for k,v in bkwargs.items(): assert list(v.shape[:bdims])==batch_shape, "bkwargs['%s'].shape = %s, but expected first dims to match batch_shape = %s"%(k,list(v.shape),list(batch_shape))
    len_bkwargs = len(bkwargs) 
    bkwargs_keys = list(bkwargs.keys())
    def vjpfwrap(*inputs):
        x,p,bkwargs_vals = inputs[:lenx],inputs[lenx:(lenx+lenp)],inputs[(lenx+lenp):]
        y,vjpfwrap_inner = torch.func.vjp(lambda *x: f(*x,*bkwargs_vals),*x)
        vjpywrap = vjpfwrap_inner(p[0] if isinstance(y,torch.Tensor) else p)
        return y,vjpywrap
    vjpfwrapvec = torch.vmap(vjpfwrap,in_dims=(0,)*(lenx+lenp+len_bkwargs),chunk_size=chunk_size)
    x_input = [xi.flatten(end_dim=bdims-1) if bdims>0 else xi[None,...] for xi in x]
    p_input = [pi.flatten(end_dim=bdims-1) if bdims>0 else pi[None,...] for pi in p]
    bkwargs_vals_input = [bkwargs[key].flatten(end_dim=bdims-1) if bdims>0 else bkwargs[key][None,...] for key in bkwargs_keys]
    y,vjpys = vjpfwrapvec(*x_input,*p_input,*bkwargs_vals_input)
    if isinstance(y,torch.Tensor):
        if bdims==0:
            return (vjpy[0] for vjpy in vjpys),y[0]
        else:
            return (vjpy.reshape(batch_shape+list(vjpy.shape[1:])) for vjpy in vjpys),y.reshape(batch_shape+list(y.shape[1:]))
    else:
        if bdims==0:
            return (vjpy[0] for vjpy in vjpys),(yk[0] for yk in y)
        else:
            return (vjpy.reshape(batch_shape+list(vjpy.shape[1:])) for vjpy in vjpys),(yk.reshape(batch_shape+list(yk.shape[1:])) for yk in y)

Misc utils

Timer

Timer(device)

Timer compatible with CPU and GPU operations.

Parameters:

Name	Type	Description	Default
`device`	`Union[device, str]`	device to perform timing for. For CPU devices, `time.perf_counter()` is used. For CUDA and MPS GPU devices, `torch.{cuda,mps}.Event(enable_timing=True)` is used.	required

Source code in agsutil/utils.py

def __init__(self, device):
    r"""
    Args:
        device (Union[torch.device,str]): device to perform timing for. 

            - For CPU devices, `time.perf_counter()` is used. 
            - For CUDA and MPS GPU devices, `torch.{cuda,mps}.Event(enable_timing=True)` is used. 
    """

    device_str = str(device) 
    if "cpu" in device_str:
        self.torch_backend = torch.cpu 
    elif "cuda" in device_str:
        self.torch_backend = torch.cuda 
    elif "mps" in device_str:
        self.torch_backend = torch.mps 
    else:
        raise Exception("undetected device = %s, should have 'cpu', 'cuda', or 'mps' in it."%device_str)

tic

tic()

Start the stopwatch.

Source code in agsutil/utils.py

def tic(self):
    r"""
    Start the stopwatch.
    """
    if self.torch_backend==torch.cpu:
        self.t0 = time.perf_counter()
    else:
        self.torch_backend.empty_cache()
        self.t0 = self.torch_backend.Event(enable_timing=True)
        self.tend = self.torch_backend.Event(enable_timing=True)
        self.t0.record()

def toc(self):
    r"""
    Lap the stopwatch. 

    Returns: 
        tdelta (float): time elapsed between the start of the stopwatch and the current lap.
    """
    if self.torch_backend==torch.cpu:
        tdelta = time.perf_counter()-self.t0
    else:
        self.tend.record()
        self.torch_backend.synchronize()
        tdelta = self.t0.elapsed_time(self.tend)/1000
    return float(tdelta)

print_data_signatures

print_data_signatures(data, name='data', print_devices=False, print_dtypes=False, verbose_indent=4)

Print data shapes and (optionally) devices.

Parameters:

Name	Type	Description	Default
`data`	`dict`	Dictiony with items that are tensors or dictionaries of tensors.	required
`print_devices`	`bool`	If `True`, also print the device.	`False`
`print_dtypes`	`bool`	If `True`, also print the dtypes.	`False`
`verbose_indent`	`int`	Non-negative number of indentation spaces for logging.	`4`

Examples:

>>> data = {
...     "a": torch.rand(2,3,4),
...     "b": torch.rand(3,4,5),
...     "subdata": {
...         "aa": torch.rand(2,3),
...         "bb": torch.rand(2,3),
...         "subnontensor": ["ags",7,7,7],
...         },
...     "nontensor": [7,7,7,"ags"]
...     }
>>> print_data_signatures(data,print_devices=True,print_dtypes=True,verbose_indent=0)
data['a'].shape = (2, 3, 4) on device = cpu with dtype = torch.float64
data['b'].shape = (3, 4, 5) on device = cpu with dtype = torch.float64
data['subdata']
    data['subdata']['aa'].shape = (2, 3) on device = cpu with dtype = torch.float64
    data['subdata']['bb'].shape = (2, 3) on device = cpu with dtype = torch.float64
    data['subdata']['subnontensor'] a list of length 4
data['nontensor'] a list of length 4

Source code in agsutil/utils.py

def print_data_signatures(data, name="data", print_devices=False, print_dtypes=False, verbose_indent=4):
    r""" 
    Print data shapes and (optionally) devices. 

    Args: 
        data (dict): Dictiony with items that are tensors or dictionaries of tensors. 
        print_devices (bool): If `True`, also print the device. 
        print_dtypes (bool): If `True`, also print the dtypes. 
        verbose_indent (int): Non-negative number of indentation spaces for logging.

    Examples:
        >>> data = {
        ...     "a": torch.rand(2,3,4),
        ...     "b": torch.rand(3,4,5),
        ...     "subdata": {
        ...         "aa": torch.rand(2,3),
        ...         "bb": torch.rand(2,3),
        ...         "subnontensor": ["ags",7,7,7],
        ...         },
        ...     "nontensor": [7,7,7,"ags"]
        ...     }
        >>> print_data_signatures(data,print_devices=True,print_dtypes=True,verbose_indent=0)
        data['a'].shape = (2, 3, 4) on device = cpu with dtype = torch.float64
        data['b'].shape = (3, 4, 5) on device = cpu with dtype = torch.float64
        data['subdata']
            data['subdata']['aa'].shape = (2, 3) on device = cpu with dtype = torch.float64
            data['subdata']['bb'].shape = (2, 3) on device = cpu with dtype = torch.float64
            data['subdata']['subnontensor'] a list of length 4
        data['nontensor'] a list of length 4
    """ 
    for key,val in data.items():
        if isinstance(val,torch.Tensor):
            _s = "%s['%s'].shape = %s"%(name,key,str(tuple(data[key].shape)))
            if print_devices:
                _s += " on device = %s"%str(data[key].device)
            if print_dtypes:
                _s += " with dtype = %s"%str(data[key].dtype)
            print(" "*verbose_indent+_s)
        elif isinstance(val,dict):
            print(" "*verbose_indent+"data['%s']"%key)
            for kkey,vval in val.items():
                if isinstance(vval,torch.Tensor):
                    _s = "%s['%s']['%s'].shape = %s"%(name,key,kkey,str(tuple(data[key][kkey].shape)))
                    if print_devices:
                        _s += " on device = %s"%str(data[key][kkey].device)
                    if print_dtypes:
                        _s += " with dtype = %s"%str(data[key][kkey].dtype)
                    print(" "*(verbose_indent+4)+_s)
                elif isinstance(vval,list):
                    print(" "*(verbose_indent+4)+"%s['%s']['%s'] a list of length %d"%(name,key,kkey,len(data[key][kkey])))
                else:
                    print(" "*(verbose_indent+4)+"%s['%s']['%s'] = %s"%(name,key,kkey,str(data[key][kkey])))
        elif isinstance(val,list):
            print(" "*verbose_indent+"%s['%s'] a list of length %d"%(name,key,len(data[key])))
        else:
            print(" "*verbose_indent+"%s['%s'] = %s"%(name,key,str(data[key])))

to_unitary_expskewh

to_unitary_expskewh(theta, n, complex_case=False)

Transform to a unitary matrix using the exponential of a skew Hermitian matrix.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	With `theta.size(-1) == n(n-1)//2` in the real case and `theta.size(-1) == n*2` in the complex case.	required
`complex_case`	`bool`	If `True`, parameterize a complex matrix, otherwise a real one.	`False`

Returns:

Name	Type	Description
`Q`	`Tensor`	With shape `(*theta.shape[:-1],n,n)` unitary matrices

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

Single matrix

>>> n = 5
>>> theta = torch.rand(n*(n-1)//2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n)
>>> Q 
tensor([[ 0.5027, -0.1038, -0.2508,  0.3769,  0.7291],
        [-0.6870,  0.3679,  0.2777,  0.3454,  0.4430],
        [-0.2330, -0.7975,  0.3425, -0.2999,  0.3201],
        [-0.4638, -0.3141, -0.8037,  0.1779, -0.0934],
        [-0.0769,  0.3453, -0.3111, -0.7855,  0.4013]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
True

Single complex matrix

>>> n = 4
>>> theta = torch.rand(n**2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n,complex_case=True)
>>> Q 
tensor([[ 0.1358-0.3038j, -0.4798+0.2431j,  0.1105+0.2308j, -0.2229+0.6963j],
        [-0.4395+0.6202j,  0.1556-0.1974j, -0.1838-0.0838j, -0.1181+0.5516j],
        [-0.4750+0.0762j, -0.7622+0.0020j, -0.2319-0.0263j,  0.2346-0.2795j],
        [-0.0745-0.2729j,  0.2531+0.0496j, -0.7624+0.5079j,  0.1119+0.0410j]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
True

Two matrices

>>> n = 3
>>> theta = torch.rand(2,n*(n-1)//2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n)
>>> Q 
tensor([[[ 0.3004,  0.6612,  0.6874],
         [-0.7630,  0.5990, -0.2428],
         [-0.5723, -0.4516,  0.6845]],
<BLANKLINE>
        [[ 0.4583,  0.7552,  0.4687],
         [-0.7977,  0.5820, -0.1579],
         [-0.3921, -0.3015,  0.8691]]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
True

Two complex matrices

>>> n = 3
>>> theta = torch.rand(2,n**2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n,complex_case=True)
>>> Q 
tensor([[[ 0.5591+0.3745j, -0.4342+0.5076j, -0.1283+0.2907j],
         [-0.4802+0.4013j,  0.5099+0.4179j, -0.2657+0.3211j],
         [-0.3843+0.0887j, -0.2972+0.1756j,  0.8185+0.2353j]],
<BLANKLINE>
        [[ 0.4860+0.5953j, -0.1642+0.0458j, -0.3162+0.5295j],
         [-0.5638+0.0805j,  0.3753+0.0607j,  0.2470+0.6856j],
         [-0.2898+0.0325j, -0.8621+0.2884j,  0.2790+0.1037j]]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
True

Batch support

>>> n = 10
>>> theta = torch.rand(2,3,4,n*(n-1)//2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n)
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
True
>>> theta = torch.rand(2,3,4,n**2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n,complex_case=True)
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
True

Source code in agsutil/utils.py

def to_unitary_expskewh(theta, n, complex_case=False):
    r"""
    Transform to a unitary matrix using the exponential of a skew Hermitian matrix.

    Args:
        theta (torch.Tensor): With `theta.size(-1) == n*(n-1)//2` in the real case and `theta.size(-1) == n**2` in the complex case.
        complex_case (bool): If `True`, parameterize a complex matrix, otherwise a real one. 

    Returns:
        Q (torch.Tensor): With shape `(*theta.shape[:-1],n,n)` unitary matrices

    Examples:
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

    Single matrix

        >>> n = 5
        >>> theta = torch.rand(n*(n-1)//2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n)
        >>> Q 
        tensor([[ 0.5027, -0.1038, -0.2508,  0.3769,  0.7291],
                [-0.6870,  0.3679,  0.2777,  0.3454,  0.4430],
                [-0.2330, -0.7975,  0.3425, -0.2999,  0.3201],
                [-0.4638, -0.3141, -0.8037,  0.1779, -0.0934],
                [-0.0769,  0.3453, -0.3111, -0.7855,  0.4013]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
        True

    Single complex matrix

        >>> n = 4
        >>> theta = torch.rand(n**2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n,complex_case=True)
        >>> Q 
        tensor([[ 0.1358-0.3038j, -0.4798+0.2431j,  0.1105+0.2308j, -0.2229+0.6963j],
                [-0.4395+0.6202j,  0.1556-0.1974j, -0.1838-0.0838j, -0.1181+0.5516j],
                [-0.4750+0.0762j, -0.7622+0.0020j, -0.2319-0.0263j,  0.2346-0.2795j],
                [-0.0745-0.2729j,  0.2531+0.0496j, -0.7624+0.5079j,  0.1119+0.0410j]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
        True

    Two matrices

        >>> n = 3
        >>> theta = torch.rand(2,n*(n-1)//2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n)
        >>> Q 
        tensor([[[ 0.3004,  0.6612,  0.6874],
                 [-0.7630,  0.5990, -0.2428],
                 [-0.5723, -0.4516,  0.6845]],
        <BLANKLINE>
                [[ 0.4583,  0.7552,  0.4687],
                 [-0.7977,  0.5820, -0.1579],
                 [-0.3921, -0.3015,  0.8691]]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
        True

    Two complex matrices

        >>> n = 3
        >>> theta = torch.rand(2,n**2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n,complex_case=True)
        >>> Q 
        tensor([[[ 0.5591+0.3745j, -0.4342+0.5076j, -0.1283+0.2907j],
                 [-0.4802+0.4013j,  0.5099+0.4179j, -0.2657+0.3211j],
                 [-0.3843+0.0887j, -0.2972+0.1756j,  0.8185+0.2353j]],
        <BLANKLINE>
                [[ 0.4860+0.5953j, -0.1642+0.0458j, -0.3162+0.5295j],
                 [-0.5638+0.0805j,  0.3753+0.0607j,  0.2470+0.6856j],
                 [-0.2898+0.0325j, -0.8621+0.2884j,  0.2790+0.1037j]]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
        True

    Batch support

        >>> n = 10
        >>> theta = torch.rand(2,3,4,n*(n-1)//2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n)
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
        True
        >>> theta = torch.rand(2,3,4,n**2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n,complex_case=True)
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
        True
    """
    batch_shape = tuple(theta.shape[:-1])
    batch_ones = torch.ones(batch_shape,dtype=int,device=theta.device)
    iut = torch.triu_indices(n,n,offset=1,device=theta.device)
    iutf = torch.einsum("...,i->...i",batch_ones,iut[0]*n+iut[1])
    iltf = torch.einsum("...,i->...i",batch_ones,iut[1]*n+iut[0])
    if complex_case:
        assert theta.size(-1)==(n**2)
        complex_dtype = (theta+0j).dtype
        H = torch.zeros((*batch_shape,n*n),dtype=complex_dtype,device=theta.device)
        idiag = torch.arange(n,device=theta.device)
        idiagf = torch.einsum("...,i->...i",batch_ones,idiag*n+idiag)
        diag = 1j*theta[...,:n].to(complex_dtype)
        off_real,off_imag = theta[...,n:n+(n**2-n)//2],theta[...,n+(n**2-n)//2:]
        off_vals = off_real.to(complex_dtype)+1j*off_imag.to(complex_dtype)
        H = H.scatter_add(-1,idiagf,diag)
        H = H.scatter_add(-1,iutf,off_vals)
        H = H.scatter_add(-1,iltf,-off_vals.conj())
    else:
        assert theta.size(-1)==(n*(n-1)//2)
        H = torch.zeros((*batch_shape,n*n),dtype=theta.dtype,device=theta.device)
        H = H.scatter_add(-1,iutf,theta)
        H = H.scatter_add(-1,iltf,-theta)
    H = H.reshape((*batch_shape,n,n))
    return torch.matrix_exp(H)

from_unitary_expskewh

from_unitary_expskewh(Q, complex_case=False)

Transform from a unitary matrix using the exponential of a skew Hermitian matrix.

Parameters:

Name	Type	Description	Default
`Q`	`Tensor`	With shape `(*theta.shape[:-1],n,n)` unitary matrices	required
`complex_case`	`bool`	If `True`, parameterize a complex matrix, otherwise a real one.	`False`

Returns:

Name	Type	Description
`theta`	`Tensor`	With `theta.size(-1) == n(n-1)//2` in the real case and `theta.size(-1) == n*2` in the complex case. Note that this `theta` is not unique, as shown in the following doctests.

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

Single matrix

>>> n = 10
>>> theta = torch.rand(n*(n-1)//2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n)
>>> theta2 = from_unitary_expskewh(Q)
>>> torch.allclose(theta,theta2)
False
>>> Q2 = to_unitary_expskewh(theta2,n)
>>> torch.allclose(Q,Q2)
True

Single complex matrix

>>> n = 10
>>> theta = torch.rand(n**2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n,complex_case=True)
>>> theta2 = from_unitary_expskewh(Q,complex_case=True)
>>> torch.allclose(theta,theta2)
False
>>> Q2 = to_unitary_expskewh(theta2,n,complex_case=True)
>>> torch.allclose(Q,Q2)
True

Two matrices

>>> n = 10
>>> theta = torch.rand(2,n*(n-1)//2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n)
>>> theta2 = from_unitary_expskewh(Q)
>>> torch.allclose(theta,theta2)
False
>>> Q2 = to_unitary_expskewh(theta2,n)
>>> torch.allclose(Q,Q2)
True

Two complex matrices

>>> n = 3
>>> theta = torch.rand(2,n**2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n,complex_case=True)
>>> theta2 = from_unitary_expskewh(Q,complex_case=True)
>>> Q2 = to_unitary_expskewh(theta2,n,complex_case=True)
>>> torch.allclose(Q,Q2)
True

Batch support

>>> n = 10
>>> theta = torch.rand(2,3,4,n*(n-1)//2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n)
>>> theta2 = from_unitary_expskewh(Q)
>>> Q2 = to_unitary_expskewh(theta2,n)
>>> torch.allclose(Q,Q2)
True
>>> theta = torch.rand(2,3,4,n**2,generator=rng)
>>> Q = to_unitary_expskewh(theta,n,complex_case=True)
>>> theta2 = from_unitary_expskewh(Q,complex_case=True)
>>> Q2 = to_unitary_expskewh(theta2,n,complex_case=True)
>>> torch.allclose(Q,Q2)
True

Source code in agsutil/utils.py

def from_unitary_expskewh(Q, complex_case=False):
    r"""
    Transform from a unitary matrix using the exponential of a skew Hermitian matrix.

    Args:
        Q (torch.Tensor): With shape `(*theta.shape[:-1],n,n)` unitary matrices
        complex_case (bool): If `True`, parameterize a complex matrix, otherwise a real one. 

    Returns:
        theta (torch.Tensor): With `theta.size(-1) == n*(n-1)//2` in the real case and `theta.size(-1) == n**2` in the complex case.  
            Note that this `theta` is not unique, as shown in the following doctests.

    Examples:
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

    Single matrix

        >>> n = 10
        >>> theta = torch.rand(n*(n-1)//2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n)
        >>> theta2 = from_unitary_expskewh(Q)
        >>> torch.allclose(theta,theta2)
        False
        >>> Q2 = to_unitary_expskewh(theta2,n)
        >>> torch.allclose(Q,Q2)
        True

    Single complex matrix

        >>> n = 10
        >>> theta = torch.rand(n**2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n,complex_case=True)
        >>> theta2 = from_unitary_expskewh(Q,complex_case=True)
        >>> torch.allclose(theta,theta2)
        False
        >>> Q2 = to_unitary_expskewh(theta2,n,complex_case=True)
        >>> torch.allclose(Q,Q2)
        True

    Two matrices

        >>> n = 10
        >>> theta = torch.rand(2,n*(n-1)//2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n)
        >>> theta2 = from_unitary_expskewh(Q)
        >>> torch.allclose(theta,theta2)
        False
        >>> Q2 = to_unitary_expskewh(theta2,n)
        >>> torch.allclose(Q,Q2)
        True

    Two complex matrices

        >>> n = 3
        >>> theta = torch.rand(2,n**2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n,complex_case=True)
        >>> theta2 = from_unitary_expskewh(Q,complex_case=True)
        >>> Q2 = to_unitary_expskewh(theta2,n,complex_case=True)
        >>> torch.allclose(Q,Q2)
        True

    Batch support

        >>> n = 10
        >>> theta = torch.rand(2,3,4,n*(n-1)//2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n)
        >>> theta2 = from_unitary_expskewh(Q)
        >>> Q2 = to_unitary_expskewh(theta2,n)
        >>> torch.allclose(Q,Q2)
        True
        >>> theta = torch.rand(2,3,4,n**2,generator=rng)
        >>> Q = to_unitary_expskewh(theta,n,complex_case=True)
        >>> theta2 = from_unitary_expskewh(Q,complex_case=True)
        >>> Q2 = to_unitary_expskewh(theta2,n,complex_case=True)
        >>> torch.allclose(Q,Q2)
        True
    """
    n = Q.size(-1)
    L = logm_unitary(Q)
    iut = torch.triu_indices(n,n,offset=1,device=Q.device)
    if complex_case:
        idiag = torch.arange(n,device=Q.device)
        diag = L[...,idiag,idiag].imag
        off_diag = L[...,iut[0],iut[1]]
        theta = torch.cat([diag,off_diag.real,off_diag.imag],dim=-1)
    else:
        theta = L[...,iut[0],iut[1]].real
    return theta

to_unitary_qr

to_unitary_qr(A)

Transform to a unitary matrix using the QR decomposition.

Parameters:

Name	Type	Description	Default
`A`	`Tensor`	With `A.shape == (...,n,n)`.	required

Returns:

Name	Type	Description
`Q`	`Tensor`	With shape `(*A.shape[:-1],n,n)` unitary matrices

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

Single matrix

>>> n = 5
>>> A = torch.rand(n,n,generator=rng)
>>> Q = to_unitary_qr(A)
>>> Q 
tensor([[ 0.1819,  0.2300,  0.8956,  0.3042,  0.1391],
        [ 0.5466, -0.1828, -0.3033,  0.7517, -0.1037],
        [ 0.0410,  0.0450,  0.1626, -0.0886, -0.9808],
        [ 0.5518, -0.6553,  0.2046, -0.4684,  0.0693],
        [ 0.6017,  0.6944, -0.1939, -0.3392,  0.0555]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
True

Single complex matrix

>>> n = 4
>>> A = torch.rand(n,n,generator=rng,dtype=torch.complex128)
>>> Q = to_unitary_qr(A)
>>> Q 
tensor([[ 0.3820+4.9592e-01j,  0.6227-3.3313e-01j,  0.2301+4.5375e-04j,
         -0.1762-1.5935e-01j],
        [ 0.3750+1.7952e-01j, -0.4719+1.6687e-01j,  0.1957+5.4894e-02j,
          0.2213-6.9735e-01j],
        [ 0.4702+1.8505e-01j,  0.0477+2.3842e-01j, -0.8138+2.9071e-02j,
          0.0823+1.2523e-01j],
        [ 0.4241+1.1515e-02j,  0.0064+4.3764e-01j,  0.4874-7.1170e-02j,
          0.2837+5.5258e-01j]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
True

Two matrices

>>> n = 3
>>> A = torch.rand(2,n,n,generator=rng)
>>> Q = to_unitary_qr(A)
>>> Q 
tensor([[[ 0.6587,  0.2967, -0.6914],
         [ 0.6269, -0.7246,  0.2864],
         [ 0.4160,  0.6221,  0.6633]],
<BLANKLINE>
        [[ 0.5190,  0.8009, -0.2985],
         [ 0.6364, -0.1289,  0.7605],
         [ 0.5706, -0.5847, -0.5766]]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
True

Two complex matrices

>>> n = 3
>>> A = torch.rand(2,n,n,generator=rng,dtype=torch.complex128)
>>> Q = to_unitary_qr(A)
>>> Q 
tensor([[[ 0.5834+0.1938j, -0.0560+0.3895j, -0.3185+0.6048j],
         [ 0.4323+0.4517j,  0.6040-0.4547j,  0.0490-0.1873j],
         [ 0.4700+0.1009j, -0.3015+0.4275j,  0.4046-0.5759j]],
<BLANKLINE>
        [[ 0.2769+0.2957j, -0.1093+0.8955j, -0.0100+0.1480j],
         [ 0.0858+0.6104j,  0.0647-0.1696j,  0.7218-0.2572j],
         [ 0.2022+0.6443j,  0.2591-0.2933j, -0.5677+0.2621j]]])
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
True

Batch support

>>> n = 10
>>> A = torch.rand(2,3,4,n,n,generator=rng)
>>> Q = to_unitary_qr(A)
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
True
>>> A = torch.rand(2,3,4,n,n,generator=rng,dtype=torch.complex128)
>>> Q = to_unitary_qr(A)
>>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
True

Source code in agsutil/utils.py

def to_unitary_qr(A):
    r"""
    Transform to a unitary matrix using the QR decomposition.

    Args:
        A (torch.Tensor): With `A.shape == (...,n,n)`.

    Returns:
        Q (torch.Tensor): With shape `(*A.shape[:-1],n,n)` unitary matrices

    Examples:
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

    Single matrix

        >>> n = 5
        >>> A = torch.rand(n,n,generator=rng)
        >>> Q = to_unitary_qr(A)
        >>> Q 
        tensor([[ 0.1819,  0.2300,  0.8956,  0.3042,  0.1391],
                [ 0.5466, -0.1828, -0.3033,  0.7517, -0.1037],
                [ 0.0410,  0.0450,  0.1626, -0.0886, -0.9808],
                [ 0.5518, -0.6553,  0.2046, -0.4684,  0.0693],
                [ 0.6017,  0.6944, -0.1939, -0.3392,  0.0555]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
        True

    Single complex matrix

        >>> n = 4
        >>> A = torch.rand(n,n,generator=rng,dtype=torch.complex128)
        >>> Q = to_unitary_qr(A)
        >>> Q 
        tensor([[ 0.3820+4.9592e-01j,  0.6227-3.3313e-01j,  0.2301+4.5375e-04j,
                 -0.1762-1.5935e-01j],
                [ 0.3750+1.7952e-01j, -0.4719+1.6687e-01j,  0.1957+5.4894e-02j,
                  0.2213-6.9735e-01j],
                [ 0.4702+1.8505e-01j,  0.0477+2.3842e-01j, -0.8138+2.9071e-02j,
                  0.0823+1.2523e-01j],
                [ 0.4241+1.1515e-02j,  0.0064+4.3764e-01j,  0.4874-7.1170e-02j,
                  0.2837+5.5258e-01j]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
        True

    Two matrices

        >>> n = 3
        >>> A = torch.rand(2,n,n,generator=rng)
        >>> Q = to_unitary_qr(A)
        >>> Q 
        tensor([[[ 0.6587,  0.2967, -0.6914],
                 [ 0.6269, -0.7246,  0.2864],
                 [ 0.4160,  0.6221,  0.6633]],
        <BLANKLINE>
                [[ 0.5190,  0.8009, -0.2985],
                 [ 0.6364, -0.1289,  0.7605],
                 [ 0.5706, -0.5847, -0.5766]]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
        True

    Two complex matrices

        >>> n = 3
        >>> A = torch.rand(2,n,n,generator=rng,dtype=torch.complex128)
        >>> Q = to_unitary_qr(A)
        >>> Q 
        tensor([[[ 0.5834+0.1938j, -0.0560+0.3895j, -0.3185+0.6048j],
                 [ 0.4323+0.4517j,  0.6040-0.4547j,  0.0490-0.1873j],
                 [ 0.4700+0.1009j, -0.3015+0.4275j,  0.4046-0.5759j]],
        <BLANKLINE>
                [[ 0.2769+0.2957j, -0.1093+0.8955j, -0.0100+0.1480j],
                 [ 0.0858+0.6104j,  0.0647-0.1696j,  0.7218-0.2572j],
                 [ 0.2022+0.6443j,  0.2591-0.2933j, -0.5677+0.2621j]]])
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
        True

    Batch support

        >>> n = 10
        >>> A = torch.rand(2,3,4,n,n,generator=rng)
        >>> Q = to_unitary_qr(A)
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q),torch.eye(n))
        True
        >>> A = torch.rand(2,3,4,n,n,generator=rng,dtype=torch.complex128)
        >>> Q = to_unitary_qr(A)
        >>> torch.allclose(torch.einsum("...ji,...jk->...ik",Q,Q.conj()),torch.eye(n,dtype=torch.complex128))
        True
    """
    n = A.size(-1)
    assert A.shape[-2:]==(n,n)
    Q,R = torch.linalg.qr(A)
    phases = torch.sgn(torch.diagonal(R,dim1=-2,dim2=-1))
    return Q*phases.unsqueeze(-2)

get_torch_rng

get_torch_rng(seed=None, device=None)

Get a torch.Generator() with a random seed when seed=None or a fixed seed when seed is an int.
This is necessary because torch.Generator() uses a fixed default seed.

Parameters:

Name	Type	Description	Default
`seed`	`Union[None, int, Generator()]`	Random seed.	`None`
`device`	`device`	Device on which to place the generator.	`None`

Returns:

Name	Type	Description
`rng`	`Generator`	The random number generator

Source code in agsutil/utils.py

def get_torch_rng(seed=None, device=None):
    r"""
    Get a `torch.Generator()` with a random seed when `seed=None` or a fixed seed when `seed` is an int.  
    This is necessary because torch.Generator() uses a fixed default seed.  


    Args:
        seed (Union[None,int,torch.Generator()]): Random seed. 
        device (torch.device): Device on which to place the generator.

    Returns: 
        rng (torch.Generator): The random number generator
    """
    if device is None: 
        device = torch.get_default_device()
    if isinstance(seed,torch.Generator):
        rng = seed 
    elif seed is None: 
        rng = torch.Generator(device=device)
        rng.seed()
    else:
        rng = torch.Generator(device=device).manual_seed(seed)
    return rng

logmultinomialcoeff

logmultinomialcoeff(n, *ks)

\(\log \binom{n}{k_1,\dots,k_m} = \log \left(\frac{n!}{k_1! \cdots k_m!}\right)\).
Note that we do not enforce \(k_1+\cdots+k_m=n\).

Parameters:

Name	Type	Description	Default
`n`	`Tensor`	\(n\).	required
`*ks`	`Tuple`	\(k_1,\dots,k_m\) where \(k_j\) is a `torch.Tensor`.	`()`

Returns:

Name	Type	Description
`y`	`Tensor`	\(\log \binom{n}{k_1,\dots,k_m}\).

Examples:

>>> n = torch.arange(6,12)
>>> k1 = torch.arange(0,2)
>>> k2 = torch.arange(2,4)
>>> k3 = torch.arange(4,6)
>>> logmultinomialcoeff(n[None,:],k1[:,None],k2[:,None],k3[:,None])
tensor([[2.7081e+00, 4.6540e+00, 6.7334e+00, 8.9306e+00, 1.1233e+01, 1.3631e+01],
        [8.8818e-16, 1.9459e+00, 4.0254e+00, 6.2226e+00, 8.5252e+00, 1.0923e+01]])
>>> logmultinomialcoeff(n[:,None,None,None],k1[None,:,None,None],k2[None,None,:,None],k3[None,None,None,:]).shape
torch.Size([6, 2, 2, 2])

Source code in agsutil/utils.py

def logmultinomialcoeff(n, *ks):
    r"""
    $\log \binom{n}{k_1,\dots,k_m} = \log \left(\frac{n!}{k_1! \cdots k_m!}\right)$.  
    Note that we do not enforce $k_1+\cdots+k_m=n$.

    Args:
        n (torch.Tensor): $n$.
        *ks (Tuple): $k_1,\dots,k_m$ where $k_j$ is a `torch.Tensor`.

    Returns:
        y (torch.Tensor): $\log \binom{n}{k_1,\dots,k_m}$.

    Examples:
        >>> n = torch.arange(6,12)
        >>> k1 = torch.arange(0,2)
        >>> k2 = torch.arange(2,4)
        >>> k3 = torch.arange(4,6)
        >>> logmultinomialcoeff(n[None,:],k1[:,None],k2[:,None],k3[:,None])
        tensor([[2.7081e+00, 4.6540e+00, 6.7334e+00, 8.9306e+00, 1.1233e+01, 1.3631e+01],
                [8.8818e-16, 1.9459e+00, 4.0254e+00, 6.2226e+00, 8.5252e+00, 1.0923e+01]])
        >>> logmultinomialcoeff(n[:,None,None,None],k1[None,:,None,None],k2[None,None,:,None],k3[None,None,None,:]).shape
        torch.Size([6, 2, 2, 2])
    """ 
    assert (n>=0).all()
    m = len(ks) 
    assert ((k>=0).all() for k in ks)
    y = torch.lgamma(n+1)
    for k in ks:
        y = y-torch.lgamma(k+1)
    return y

multinomialcoeff

multinomialcoeff(n, *ks)

\(\binom{n}{k_1,\dots,k_m} = \frac{n!}{k_1! \cdots k_m!}\).
Note that we do not enforce \(k_1+\cdots+k_m=n\), so we do not round the result to the nearest integer.

Parameters:

Name	Type	Description	Default
`n`	`Tensor`	\(n\).	required
`*ks`	`Tuple`	\(k_1,\dots,k_m\) where \(k_j\) is a `torch.Tensor`.	`()`

Returns:

Name	Type	Description
`y`	`Tensor`	\(\binom{n}{k_1,\dots,k_m}\).

Examples:

>>> n = torch.arange(6,12)
>>> k1 = torch.arange(0,2)
>>> k2 = torch.arange(2,4)
>>> k3 = torch.arange(4,6)
>>> multinomialcoeff(n[None,:],k1[:,None],k2[:,None],k3[:,None])
tensor([[1.5000e+01, 1.0500e+02, 8.4000e+02, 7.5600e+03, 7.5600e+04, 8.3160e+05],
        [1.0000e+00, 7.0000e+00, 5.6000e+01, 5.0400e+02, 5.0400e+03, 5.5440e+04]])
>>> multinomialcoeff(n[:,None,None,None],k1[None,:,None,None],k2[None,None,:,None],k3[None,None,None,:]).shape
torch.Size([6, 2, 2, 2])

Source code in agsutil/utils.py

def multinomialcoeff(n, *ks):
    r"""
    $\binom{n}{k_1,\dots,k_m} = \frac{n!}{k_1! \cdots k_m!}$.  
    Note that we do not enforce $k_1+\cdots+k_m=n$, so we do not round the result to the nearest integer.

    Args:
        n (torch.Tensor): $n$.
        *ks (Tuple): $k_1,\dots,k_m$ where $k_j$ is a `torch.Tensor`.

    Returns:
        y (torch.Tensor): $\binom{n}{k_1,\dots,k_m}$.

    Examples:
        >>> n = torch.arange(6,12)
        >>> k1 = torch.arange(0,2)
        >>> k2 = torch.arange(2,4)
        >>> k3 = torch.arange(4,6)
        >>> multinomialcoeff(n[None,:],k1[:,None],k2[:,None],k3[:,None])
        tensor([[1.5000e+01, 1.0500e+02, 8.4000e+02, 7.5600e+03, 7.5600e+04, 8.3160e+05],
                [1.0000e+00, 7.0000e+00, 5.6000e+01, 5.0400e+02, 5.0400e+03, 5.5440e+04]])
        >>> multinomialcoeff(n[:,None,None,None],k1[None,:,None,None],k2[None,None,:,None],k3[None,None,None,:]).shape
        torch.Size([6, 2, 2, 2])
    """ 
    return torch.exp(logmultinomialcoeff(n,*ks))

logfactorial

logfactorial(n)

\(\log(n!)\)

Parameters:

Name	Type	Description	Default
`n`	`Tensor`	\(n\).	required

Returns:

Name	Type	Description
`y`	`Tensor`	\(\log(n!)\).

Examples:

>>> logfactorial(torch.arange(1,8))
tensor([0.0000, 0.6931, 1.7918, 3.1781, 4.7875, 6.5793, 8.5252])

Source code in agsutil/utils.py

def logfactorial(n):
    r"""
    $\log(n!)$

    Args:
        n (torch.Tensor): $n$.

    Returns:
        y (torch.Tensor): $\log(n!)$.

    Examples:
        >>> logfactorial(torch.arange(1,8))
        tensor([0.0000, 0.6931, 1.7918, 3.1781, 4.7875, 6.5793, 8.5252])
    """ 
    return logmultinomialcoeff(n)

factorial

factorial(n)

\(n!\)

Parameters:

Name	Type	Description	Default
`n`	`Tensor`	\(n\).	required

Returns:

Name	Type	Description
`y`	`Tensor`	\(n!\).

Examples:

>>> factorial(torch.arange(1,8))
tensor([   1,    2,    6,   24,  120,  720, 5040])

Source code in agsutil/utils.py

def factorial(n):
    r"""
    $n!$

    Args:
        n (torch.Tensor): $n$.

    Returns:
        y (torch.Tensor): $n!$.

    Examples:
        >>> factorial(torch.arange(1,8))
        tensor([   1,    2,    6,   24,  120,  720, 5040])
    """ 
    return multinomialcoeff(n).round().to(int)

logcomb

logcomb(n, k)

\(\log \binom{n}{k} = \log \left(\frac{n!}{k!(n-k)!}\right)\)

Parameters:

Name	Type	Description	Default
`n`	`Tensor`	\(n\).	required
`k`	`Tensor`	\(k\).	required

Returns:

Name	Type	Description
`y`	`Tensor`	\(\log \binom{n}{k}\).

Examples:

>>> logcomb(torch.arange(8)[:,None],torch.arange(6)[None,:])
tensor([[0.0000,   -inf,   -inf,   -inf,   -inf,   -inf],
        [0.0000, 0.0000,   -inf,   -inf,   -inf,   -inf],
        [0.0000, 0.6931, 0.0000,   -inf,   -inf,   -inf],
        [0.0000, 1.0986, 1.0986, 0.0000,   -inf,   -inf],
        [0.0000, 1.3863, 1.7918, 1.3863, 0.0000,   -inf],
        [0.0000, 1.6094, 2.3026, 2.3026, 1.6094, 0.0000],
        [0.0000, 1.7918, 2.7081, 2.9957, 2.7081, 1.7918],
        [0.0000, 1.9459, 3.0445, 3.5553, 3.5553, 3.0445]])

Source code in agsutil/utils.py

def logcomb(n,k):
    r"""
    $\log \binom{n}{k} = \log \left(\frac{n!}{k!(n-k)!}\right)$

    Args:
        n (torch.Tensor): $n$.
        k (torch.Tensor): $k$.

    Returns:
        y (torch.Tensor): $\log \binom{n}{k}$.

    Examples:
        >>> logcomb(torch.arange(8)[:,None],torch.arange(6)[None,:])
        tensor([[0.0000,   -inf,   -inf,   -inf,   -inf,   -inf],
                [0.0000, 0.0000,   -inf,   -inf,   -inf,   -inf],
                [0.0000, 0.6931, 0.0000,   -inf,   -inf,   -inf],
                [0.0000, 1.0986, 1.0986, 0.0000,   -inf,   -inf],
                [0.0000, 1.3863, 1.7918, 1.3863, 0.0000,   -inf],
                [0.0000, 1.6094, 2.3026, 2.3026, 1.6094, 0.0000],
                [0.0000, 1.7918, 2.7081, 2.9957, 2.7081, 1.7918],
                [0.0000, 1.9459, 3.0445, 3.5553, 3.5553, 3.0445]])
    """ 
    return logmultinomialcoeff(n,k,n-k)

comb

comb(n, k)

\(\binom{n}{k} = \frac{n!}{k!(n-k)!}\)

Parameters:

Name	Type	Description	Default
`n`	`Tensor`	\(n\).	required
`k`	`Tensor`	\(k\).	required

Returns:

Name	Type	Description
`y`	`Tensor`	\(\binom{n}{k}\).

Examples:

>>> comb(torch.arange(8)[:,None],torch.arange(6)[None,:])
tensor([[ 1,  0,  0,  0,  0,  0],
        [ 1,  1,  0,  0,  0,  0],
        [ 1,  2,  1,  0,  0,  0],
        [ 1,  3,  3,  1,  0,  0],
        [ 1,  4,  6,  4,  1,  0],
        [ 1,  5, 10, 10,  5,  1],
        [ 1,  6, 15, 20, 15,  6],
        [ 1,  7, 21, 35, 35, 21]])

Source code in agsutil/utils.py

def comb(n,k):
    r"""
    $\binom{n}{k} = \frac{n!}{k!(n-k)!}$

    Args:
        n (torch.Tensor): $n$.
        k (torch.Tensor): $k$.

    Returns:
        y (torch.Tensor): $\binom{n}{k}$.

    Examples:
        >>> comb(torch.arange(8)[:,None],torch.arange(6)[None,:])
        tensor([[ 1,  0,  0,  0,  0,  0],
                [ 1,  1,  0,  0,  0,  0],
                [ 1,  2,  1,  0,  0,  0],
                [ 1,  3,  3,  1,  0,  0],
                [ 1,  4,  6,  4,  1,  0],
                [ 1,  5, 10, 10,  5,  1],
                [ 1,  6, 15, 20, 15,  6],
                [ 1,  7, 21, 35, 35, 21]])
    """ 
    return multinomialcoeff(n,k,n-k).round().to(int)

enumerate_sums

enumerate_sums(s, t)

Generator of all possible ways to choose \(s \in \mathbb{N}_0\) non-negative integers which sum to \(t \in \mathbb{N}_0\).
There are a total of \(\binom{t+s-1}{s-1} = \frac{(t+s-1)!}{t!(s-1)!}\) choices.

Parameters:

Name	Type	Description	Default
`s`	`int`	Number of integers to choose.	required
`t`	`int`	Number of items to allocate.	required

Returns:

Name	Type	Description
`g`	`generator`	Genrator of tuples of length \(s\).

Examples:

2 non-negative integers that sum to 3

>>> for v in enumerate_sums(2,3): 
...     print(v)
(0, 3)
(1, 2)
(2, 1)
(3, 0)
>>> len_enumerate_sums(2,3)
4

3 non-negative integers that sum to 2

>>> for v in enumerate_sums(3,2): 
...     print(v)
(0, 0, 2)
(0, 1, 1)
(0, 2, 0)
(1, 0, 1)
(1, 1, 0)
(2, 0, 0)
>>> len_enumerate_sums(3,2)
6

Source code in agsutil/utils.py

def enumerate_sums(s, t):
    r"""
    Generator of all possible ways to choose $s \in \mathbb{N}_0$ non-negative integers which sum to $t \in \mathbb{N}_0$.  
    There are a total of $\binom{t+s-1}{s-1} = \frac{(t+s-1)!}{t!(s-1)!}$ choices. 

    Args:
        s (int): Number of integers to choose. 
        t (int): Number of items to allocate. 

    Returns: 
        g (generator): Genrator of tuples of length $s$. 

    Examples: 

    2 non-negative integers that sum to 3 

        >>> for v in enumerate_sums(2,3): 
        ...     print(v)
        (0, 3)
        (1, 2)
        (2, 1)
        (3, 0)
        >>> len_enumerate_sums(2,3)
        4

    3 non-negative integers that sum to 2 

        >>> for v in enumerate_sums(3,2): 
        ...     print(v)
        (0, 0, 2)
        (0, 1, 1)
        (0, 2, 0)
        (1, 0, 1)
        (1, 1, 0)
        (2, 0, 0)
        >>> len_enumerate_sums(3,2)
        6
    """ 
    for combo in itertools.combinations(range(t+s-1),s-1):
        bars = (-1,)+combo+(t+s-1,)
        yield tuple(bars[i+1]-bars[i]-1 for i in range(s))

len_enumerate_sums

len_enumerate_sums(s, t)

\(\binom{t+s-1}{s-1} = \frac{(t+s-1)!}{t!(s-1)!}\), the number of ways to choose \(s \in \mathbb{N}_0\) non-negative integers which sum to \(t \in \mathbb{N}_0\).

Parameters:

Name	Type	Description	Default
`s`	`int`	Number of integers to choose.	required
`t`	`int`	Number of items to allocate.	required

Returns:

Name	Type	Description
`y`	`int`	\(\binom{t+s-1}{s-1}\).

Examples:

2 non-negative integers that sum to 3

>>> len_enumerate_sums(2,3)
4

3 non-negative integers that sum to 2

>>> len_enumerate_sums(3,2)
6

Source code in agsutil/utils.py

def len_enumerate_sums(s, t):
    r"""
    $\binom{t+s-1}{s-1} = \frac{(t+s-1)!}{t!(s-1)!}$, the number of ways to choose $s \in \mathbb{N}_0$ non-negative integers which sum to $t \in \mathbb{N}_0$.  

    Args:
        s (int): Number of integers to choose. 
        t (int): Number of items to allocate. 

    Returns: 
        y (int): $\binom{t+s-1}{s-1}$. 

    Examples: 

    2 non-negative integers that sum to 3 

        >>> len_enumerate_sums(2,3)
        4

    3 non-negative integers that sum to 2 

        >>> len_enumerate_sums(3,2)
        6
    """ 
    return comb(torch.tensor(t+s-1),torch.tensor(s-1)).item()

enumerate_partitions

enumerate_partitions(n, max_val=None)

Enumerate all partitions sizes of \(n\) objects.

Parameters:

Name	Type	Description	Default
`n`	`int`	Number of objects	required

Returns:

Name	Type	Description
`g`	`generator`	Generator of partitions.

Examples:

>>> for p in enumerate_partitions(3):
...     print(p)
[3]
[2, 1]
[1, 1, 1]
>>> for p in enumerate_partitions(5):
...     print(p)
[5]
[4, 1]
[3, 2]
[3, 1, 1]
[2, 2, 1]
[2, 1, 1, 1]
[1, 1, 1, 1, 1]

Source code in agsutil/utils.py

def enumerate_partitions(n, max_val=None):
    r"""
    Enumerate all partitions sizes of $n$ objects.

    Args:
        n (int): Number of objects

    Returns: 
        g (generator): Generator of partitions. 

    Examples: 
        >>> for p in enumerate_partitions(3):
        ...     print(p)
        [3]
        [2, 1]
        [1, 1, 1]
        >>> for p in enumerate_partitions(5):
        ...     print(p)
        [5]
        [4, 1]
        [3, 2]
        [3, 1, 1]
        [2, 2, 1]
        [2, 1, 1, 1]
        [1, 1, 1, 1, 1]
    """
    if max_val is None:
        max_val = n
    if n == 0:
        yield []
        return
    for i in range(min(n,max_val),0,-1):
        for p in enumerate_partitions(n-i,i):
            yield [i]+p

icdf_std_normal

icdf_std_normal(x)

Inverse CDF of the standard normal distribution \(\mathcal{N}(0,1)\).

Examples:

>>> torch.set_default_dtype(torch.float64)
>>> rng = torch.Generator().manual_seed(7)

>>> x = torch.rand(3,4,generator=rng)
>>> g = icdf_std_normal(x)
>>> g 
tensor([[-0.5847, -0.6017,  1.0898,  0.4034],
        [ 1.4223,  0.9926, -0.5397,  0.1528],
        [ 0.7161,  0.7042, -1.5299, -1.5756]])
>>> g2 = icdf_normal = torch.distributions.Normal(loc=torch.zeros(1),scale=torch.ones(1)).icdf(x)
>>> torch.allclose(g,g2)
True

Source code in agsutil/utils.py

def icdf_std_normal(x):
    r"""
    Inverse CDF of the standard normal distribution $\mathcal{N}(0,1)$. 

    Examples:
        >>> torch.set_default_dtype(torch.float64)
        >>> rng = torch.Generator().manual_seed(7)

        >>> x = torch.rand(3,4,generator=rng)
        >>> g = icdf_std_normal(x)
        >>> g 
        tensor([[-0.5847, -0.6017,  1.0898,  0.4034],
                [ 1.4223,  0.9926, -0.5397,  0.1528],
                [ 0.7161,  0.7042, -1.5299, -1.5756]])
        >>> g2 = icdf_normal = torch.distributions.Normal(loc=torch.zeros(1),scale=torch.ones(1)).icdf(x)
        >>> torch.allclose(g,g2)
        True
    """
    return np.sqrt(2)*torch.erfinv(2*x-1)

`matplotlib` plotting utils

mpl_setup

mpl_setup()

Setup matplotlib default parameters

Returns:

Name	Type	Description
`mplparams`	`dict`	matplotlib helpful parameters

Source code in agsutil/plots.py

def mpl_setup():
    r""" 
    Setup matplotlib default parameters

    Returns:
        mplparams (dict): matplotlib helpful parameters"""
    from matplotlib import pyplot 
    import seaborn as sns
    pyplot.style.use("seaborn-v0_8-whitegrid")
    sns.set_palette("colorblind") # Options: "deep", "muted", "pastel", "bright", "dark", "colorblind
    COLORS = pyplot.rcParams['axes.prop_cycle'].by_key()['color']
    PW = 30 # page width in inches
    FS = 30 # font size
    LINESTYLES = [
        'solid',
        'dotted',
        'dashdot',
        'dashed',
        (0, (3, 5, 1, 5, 1, 5)),
        (5, (10, 3)),
        (0, (1, 1)),
        (0, (1, 10)),
        (0, (1, 5)),
        (0, (5, 5)),
        ]
    MARKERS = [
        "o",
        "s",
        "D",
        "P",
        "^",
        "v",
        "<",
        ">",
        "$a$",
        "$b$",
        "$c$",
        "$d$",
        ]
    pyplot.rcParams['xtick.labelsize'] = FS
    pyplot.rcParams['ytick.labelsize'] = FS
    pyplot.rcParams['ytick.labelsize'] = FS
    pyplot.rcParams['axes.titlesize'] = FS
    pyplot.rcParams['figure.titlesize'] = FS
    pyplot.rcParams["axes.labelsize"] = FS
    pyplot.rcParams['legend.fontsize'] = FS
    pyplot.rcParams['font.size'] = FS
    pyplot.rcParams['lines.linewidth'] = 5
    pyplot.rcParams['lines.markersize'] = 15
    mplparams = {
        "PW": PW,
        "FS": FS,
        "COLORS": COLORS,
        "LINESTYLES": LINESTYLES,
        "MARKERS": MARKERS,
        }
    return mplparams

set_aspect

set_aspect(ax, ratio=1)

Set aspect ratio of the ax.

Parameters:

Name	Type	Description	Default
`ax`	`Axes`	axes to set the aspect ratio of.	required
`ratio`	`float`	positive aspect ratio for the axis	`1`

Source code in agsutil/plots.py

def set_aspect(ax, ratio=1):
    r"""
    Set aspect ratio of the ax. 

    Args:
        ax (Axes): axes to set the aspect ratio of. 
        ratio (float): positive aspect ratio for the axis
    """
    assert ratio>0
    xmin,xmax = ax.get_xlim()
    ymin,ymax = ax.get_ylim()
    aspect = ratio*(xmax-xmin)/(ymax-ymin)
    ax.set_aspect(aspect)

API

torch algorithm utils

lm_opt

minres

minres_qlp_cs

torch autograd utils

gradb

jacfwdb

jacrevb

jvpb

vjpb

Misc utils

Timer

tic

toc

print_data_signatures

to_unitary_expskewh

from_unitary_expskewh

to_unitary_qr

get_torch_rng

logmultinomialcoeff

multinomialcoeff

logfactorial

factorial

logcomb

comb

enumerate_sums

len_enumerate_sums

enumerate_partitions

icdf_std_normal

matplotlib plotting utils

mpl_setup

set_aspect

`torch` algorithm utils

`torch` autograd utils

`matplotlib` plotting utils