Examples

This section contains some usage examples for TorchJD.

  • Basic Usage provides a toy example using torchjd.backward to perform a step of Jacobian descent with the UPGrad aggregator (a minimal sketch of this pattern is given after this list).

  • Instance-Wise Risk Minimization (IWRM) provides an example in which we minimize the vector of per-instance losses using stochastic sub-Jacobian descent (SSJD); a sketch of this setup is also given after this list. It is compared to the usual minimization of the average loss, known as empirical risk minimization (ERM), with stochastic gradient descent (SGD).

  • Partial Jacobian Descent for IWRM provides an example in which, as in the IWRM example above, we minimize the vector of per-instance losses using stochastic sub-Jacobian descent. However, this method bases the aggregation decision on the Jacobian of the losses with respect to only a subset of the model’s parameters, offering a trade-off between computational cost and aggregation precision.

  • Multi-Task Learning (MTL) provides an example in which Jacobian descent is used to optimize the vector of per-task losses of a multi-task model, using the dedicated backpropagation function mtl_backward (sketched after this list).

  • Instance-Wise Multi-Task Learning (IWMTL) shows how to combine multi-task learning with instance-wise risk minimization, with one loss per task and per batch element, using the autogram.Engine and a GeneralizedWeighting.

  • Recurrent Neural Network (RNN) shows how to apply Jacobian descent to RNN training, with one loss per output sequence element.

  • Monitoring Aggregations shows how to monitor the aggregations performed by the aggregator, to check whether Jacobian descent is well-suited to your use case.

  • PyTorch Lightning Integration showcases how to combine TorchJD with PyTorch Lightning, by providing an example implementation of a multi-task LightningModule optimized by Jacobian descent.

  • Automatic Mixed Precision shows how to combine mixed precision training with TorchJD.
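
To give an idea of what these examples look like, here is a minimal sketch of the Basic Usage pattern: a toy regression model whose output is compared to two different targets, yielding two losses that torchjd.backward aggregates with UPGrad before the optimizer step. Model sizes, data, and hyper-parameters are illustrative assumptions; the example page and the API reference are authoritative for the exact call signature.

    import torch
    from torch.nn import Linear, MSELoss, ReLU, Sequential
    from torch.optim import SGD

    from torchjd import backward
    from torchjd.aggregation import UPGrad

    # Toy model, optimizer and aggregator (sizes and learning rate are illustrative).
    model = Sequential(Linear(10, 5), ReLU(), Linear(5, 1))
    optimizer = SGD(model.parameters(), lr=0.1)
    aggregator = UPGrad()
    loss_fn = MSELoss()

    # One batch of random inputs, compared to two different targets to obtain two losses.
    input = torch.randn(16, 10)
    target1 = torch.randn(16, 1)
    target2 = torch.randn(16, 1)

    output = model(input)
    loss1 = loss_fn(output, target1)
    loss2 = loss_fn(output, target2)

    optimizer.zero_grad()
    # Differentiate both losses, aggregate the resulting Jacobian with UPGrad,
    # and store the aggregated direction in the parameters' .grad fields.
    backward([loss1, loss2], aggregator)
    optimizer.step()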
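
The IWRM example follows the same pattern, but instead of reducing the batch to a scalar loss, it keeps one loss per instance and lets the aggregator combine their gradients. Below is a rough, self-contained sketch, assuming torchjd.backward accepts a tensor of losses with one element per instance; the per-instance squared errors are computed by hand here purely for illustration.

    import torch
    from torch.nn import Linear, ReLU, Sequential
    from torch.optim import SGD

    from torchjd import backward
    from torchjd.aggregation import UPGrad

    model = Sequential(Linear(10, 5), ReLU(), Linear(5, 1))
    optimizer = SGD(model.parameters(), lr=0.1)
    aggregator = UPGrad()

    input = torch.randn(16, 10)
    target = torch.randn(16)

    output = model(input).squeeze(dim=1)
    # Vector of per-instance squared errors: no averaging over the batch.
    losses = (output - target) ** 2

    optimizer.zero_grad()
    # Each element of `losses` contributes a row of the Jacobian; UPGrad
    # aggregates these rows into a single update direction (one SSJD step).
    backward(losses, aggregator)
    optimizer.step()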
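
Finally, a sketch of the MTL setup: a shared feature extractor with two task-specific heads, where mtl_backward backpropagates the per-task losses, aggregating the Jacobian with respect to the shared parameters while each head receives the gradient of its own loss. The keyword arguments shown (losses, features, aggregator) are assumptions based on the current documentation; check the API reference for the exact signature of mtl_backward.

    import torch
    from torch.nn import Linear, MSELoss, ReLU, Sequential
    from torch.optim import SGD

    from torchjd import mtl_backward
    from torchjd.aggregation import UPGrad

    # Shared feature extractor and two task-specific heads (illustrative sizes).
    shared_module = Sequential(Linear(10, 5), ReLU(), Linear(5, 3), ReLU())
    task1_head = Linear(3, 1)
    task2_head = Linear(3, 1)
    params = [
        *shared_module.parameters(),
        *task1_head.parameters(),
        *task2_head.parameters(),
    ]

    loss_fn = MSELoss()
    optimizer = SGD(params, lr=0.1)
    aggregator = UPGrad()

    input = torch.randn(16, 10)
    target1 = torch.randn(16, 1)
    target2 = torch.randn(16, 1)

    features = shared_module(input)
    loss1 = loss_fn(task1_head(features), target1)
    loss2 = loss_fn(task2_head(features), target2)

    optimizer.zero_grad()
    # Aggregate the Jacobian of the per-task losses w.r.t. the shared parameters
    # with UPGrad; each task head gets the gradient of its own loss.
    mtl_backward(losses=[loss1, loss2], features=features, aggregator=aggregator)
    optimizer.step()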