Bayesian Layers
We follow the paper "Weight Uncertainty in Neural Networks" (Blundell et al.) to implement Bayesian layers. The paper proposes replacing point estimates of the weights with trainable distributions: instead of optimizing the weights directly, we optimize the parameters of the distributions from which weights are sampled at every forward pass. Predictive uncertainty is obtained by approximating the integral over the weight distribution with Monte Carlo sampling, i.e. aggregating the predictions from several stochastic forward passes.
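For intuition, here is a minimal sketch of that Monte Carlo step (not part of DeepUncertainty's API; `bayesian_model` is a hypothetical model that redraws its weights on every call):

```julia
using Statistics

# Monte Carlo predictive uncertainty: run the stochastic model several times,
# each forward pass sampling fresh weights, then aggregate the predictions.
# `bayesian_model` is a placeholder for any model built from Bayesian layers.
function predict_with_uncertainty(bayesian_model, x; samples=20)
    preds = [bayesian_model(x) for _ in 1:samples]
    μ = mean(preds)                              # elementwise predictive mean
    σ = sqrt.(mean(p -> (p .- μ) .^ 2, preds))   # elementwise predictive std
    return μ, σ
end
```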
The first component needed to create Bayesian layers is trainable distributions. We use DistributionsAD to backpropagate through distributions with Zygote, the autodiff framework used by Flux. A trainable distribution should be a subtype of AbstractTrainableDist.
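As a rough illustration of the idea (a simplified sketch, not the package's actual TrainableMvNormal implementation), a trainable distribution stores its parameters as Flux-trainable arrays and draws samples via the reparameterization trick, so Zygote can differentiate through the sampling step:

```julia
using Flux

# A simplified trainable normal distribution: `mean` and `logstd` are the
# trainable parameters; sampling uses the reparameterization trick
# (sample = mean + stddev .* eps, with eps ~ N(0, I)) so gradients flow
# back to the distribution parameters.
struct ToyTrainableNormal{T}
    mean::T
    logstd::T
end
Flux.@functor ToyTrainableNormal

ToyTrainableNormal(shape::Tuple; init=Flux.glorot_normal) =
    ToyTrainableNormal(init(shape...), init(shape...))

# Calling the distribution returns a fresh weight sample with the given shape.
(d::ToyTrainableNormal)() = d.mean .+ exp.(d.logstd) .* randn(Float32, size(d.mean))
```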
DeepUncertainty.TrainableMvNormal — Type

TrainableMvNormal(shape; init=glorot_normal, device=cpu) <: AbstractTrainableDist
TrainableMvNormal(mean, stddev, sample, shape)
A Multivariate Normal distribution with trainable mean and stddev.
Fields
mean: Trainable mean vector of the distribution
stddev: Trainable standard deviation vector of the distribution
sample: The latest sample from the distribution, used in calculating the log-likelihood loss
shape::Tuple: The shape of the sample to be returned
Arguments
shape::Tuple: The shape of the sample to be returned from the distribution
init: glorot_normal; used to initialize the trainable mean and stddev parameters
device: cpu; the device to move the sample to, for convenience when working with both GPU and CPU
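A usage sketch based on the constructor documented above; note that the way a sample is drawn here (calling the distribution object) is an assumption suggested by the `sample` field and may differ from the package's actual interface:

```julia
using DeepUncertainty, Flux

# Create a trainable multivariate normal for a 32x64 weight matrix.
weight_dist = TrainableMvNormal((32, 64); init=Flux.glorot_normal, device=cpu)

# Assumed interface: each call draws a fresh weight sample of size `shape`,
# which is also stored in the `sample` field for the log-likelihood loss.
W = weight_dist()
```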