Examples
This section contains some usage examples for TorchJD.
Basic Usage provides a toy example using torchjd.backward to perform a step of Jacobian descent with the UPGrad aggregator.
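As a first orientation, here is a minimal sketch of what such a step can look like, assuming that torchjd.backward accepts the list of losses followed by the aggregator; the Basic Usage page contains the exact, up-to-date code.

```python
import torch
from torch.nn import Linear, MSELoss, ReLU, Sequential
from torch.optim import SGD

import torchjd
from torchjd.aggregation import UPGrad

# A model whose two outputs define two objectives to optimize jointly.
model = Sequential(Linear(10, 5), ReLU(), Linear(5, 2))
optimizer = SGD(model.parameters(), lr=0.1)
aggregator = UPGrad()
loss_fn = MSELoss()

x = torch.randn(16, 10)
target1 = torch.randn(16)
target2 = torch.randn(16)

output = model(x)
loss1 = loss_fn(output[:, 0], target1)
loss2 = loss_fn(output[:, 1], target2)

optimizer.zero_grad()
# One step of Jacobian descent: the Jacobian of [loss1, loss2] is aggregated
# by UPGrad into a single update direction, stored in the .grad fields.
torchjd.backward([loss1, loss2], aggregator)  # signature assumed, see Basic Usage
optimizer.step()
```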
Instance-Wise Risk Minimization (IWRM) provides an example in which we minimize the vector of per-instance losses, using stochastic sub-Jacobian descent (SSJD). It is compared to the usual minimization of the average loss, called empirical risk minimization (ERM), using stochastic gradient descent (SGD).
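The sketch below illustrates the difference with ERM: the per-instance losses are kept as a vector (reduction="none") and their Jacobian is aggregated, instead of averaging the losses and backpropagating a single gradient. Passing a loss vector directly to torchjd.backward is an assumption here; the IWRM example shows the exact code.

```python
import torch
from torch.nn import Linear, MSELoss, ReLU, Sequential
from torch.optim import SGD

import torchjd
from torchjd.aggregation import UPGrad

model = Sequential(Linear(10, 5), ReLU(), Linear(5, 1))
optimizer = SGD(model.parameters(), lr=0.1)
aggregator = UPGrad()

# reduction="none" keeps one loss per instance instead of averaging them.
loss_fn = MSELoss(reduction="none")

X = torch.randn(8, 16, 10)  # 8 batches of 16 instances with 10 features
Y = torch.randn(8, 16, 1)

for x, y in zip(X, Y):
    losses = loss_fn(model(x), y).squeeze(dim=1)  # one loss per instance
    optimizer.zero_grad()
    # SSJD step: aggregate the Jacobian of the per-instance losses with UPGrad.
    torchjd.backward(losses, aggregator)  # loss vector input assumed
    optimizer.step()
```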
Partial Jacobian Descent for IWRM provides an example in which we minimize the vector of per-instance losses using stochastic sub-Jacobian descent, similar to our IWRM example. However, this method bases the aggregation decision on the Jacobian of the losses with respect to only a subset of the model’s parameters, offering a trade-off between computational cost and aggregation precision.
Multi-Task Learning (MTL) provides an example of multi-task learning where Jacobian descent is used to optimize the vector of per-task losses of a multi-task model, using the dedicated backpropagation function mtl_backward.
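The following sketch shows the structure of one such training step. The keyword arguments of mtl_backward (losses, features, aggregator) are assumptions for illustration; the linked example gives the exact signature.

```python
import torch
from torch.nn import Linear, MSELoss, ReLU, Sequential
from torch.optim import SGD

from torchjd import mtl_backward
from torchjd.aggregation import UPGrad

shared = Sequential(Linear(10, 5), ReLU())  # shared feature extractor
head1 = Linear(5, 1)  # task 1 head
head2 = Linear(5, 1)  # task 2 head

params = [*shared.parameters(), *head1.parameters(), *head2.parameters()]
optimizer = SGD(params, lr=0.1)
aggregator = UPGrad()
loss_fn = MSELoss()

x = torch.randn(16, 10)
y1 = torch.randn(16, 1)
y2 = torch.randn(16, 1)

features = shared(x)
loss1 = loss_fn(head1(features), y1)
loss2 = loss_fn(head2(features), y2)

optimizer.zero_grad()
# The Jacobian of the per-task losses w.r.t. the shared parameters is
# aggregated by UPGrad; each head gets the gradient of its own loss.
mtl_backward(losses=[loss1, loss2], features=features, aggregator=aggregator)
optimizer.step()
```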
Instance-Wise Multi-Task Learning (IWMTL) shows how to combine multi-task learning with instance-wise risk minimization: one loss per task and per element of the batch, using the autogram.Engine and a GeneralizedWeighting.
Recurrent Neural Network (RNN) shows how to apply Jacobian descent to RNN training, with one loss per output sequence element.
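As a rough sketch, one loss can be computed per time step of the output sequence, and the resulting loss vector can then be handled exactly as in the IWRM sketch above (the torchjd.backward call is again an assumption).

```python
import torch
from torch.nn import RNN
from torch.optim import SGD

import torchjd
from torchjd.aggregation import UPGrad

rnn = RNN(input_size=10, hidden_size=5, batch_first=True)
optimizer = SGD(rnn.parameters(), lr=0.1)
aggregator = UPGrad()

x = torch.randn(16, 20, 10)      # batch of 16 sequences of length 20
target = torch.randn(16, 20, 5)  # one target per output sequence element

output, _ = rnn(x)  # output shape: (16, 20, 5)
# One loss per sequence element, averaged over the batch and feature dimensions.
losses = ((output - target) ** 2).mean(dim=(0, 2))  # shape: (20,)

optimizer.zero_grad()
torchjd.backward(losses, aggregator)  # signature assumed, see the RNN example
optimizer.step()
```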
Monitoring Aggregations shows how to monitor the aggregations performed by the aggregator, to check whether Jacobian descent is recommended for your use case.
PyTorch Lightning Integration showcases how to combine TorchJD with PyTorch Lightning, by providing an example implementation of a multi-task LightningModule optimized by Jacobian descent.
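The sketch below outlines what such a LightningModule can look like. It assumes Lightning's manual optimization mode, the lightning 2.x import path, and the same mtl_backward keyword arguments as above; the linked example contains the actual implementation.

```python
import torch
from torch.nn import Linear, MSELoss, ReLU, Sequential
from lightning import LightningModule

from torchjd import mtl_backward
from torchjd.aggregation import UPGrad

class MultiTaskModule(LightningModule):
    def __init__(self):
        super().__init__()
        # Jacobian descent replaces the usual loss.backward() call, so the
        # optimization loop is handled manually instead of by Lightning.
        self.automatic_optimization = False
        self.shared = Sequential(Linear(10, 5), ReLU())
        self.head1 = Linear(5, 1)
        self.head2 = Linear(5, 1)
        self.loss_fn = MSELoss()
        self.aggregator = UPGrad()

    def training_step(self, batch, batch_idx):
        x, y1, y2 = batch
        features = self.shared(x)
        loss1 = self.loss_fn(self.head1(features), y1)
        loss2 = self.loss_fn(self.head2(features), y2)

        opt = self.optimizers()
        opt.zero_grad()
        # Aggregate the per-task Jacobian instead of calling manual_backward.
        mtl_backward(losses=[loss1, loss2], features=features, aggregator=self.aggregator)
        opt.step()

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```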
Automatic Mixed Precision shows how to combine mixed precision training with TorchJD.