Variational Inference layers

DeepUncertainty.VariationalDenseType
VariationalDense(in, out, σ=identity;
                weight_init=TrainableDistribution, 
                bias_init=TrainableDistribution, 
                bias=true)
VariationalDense(weight_sampler, bias_sampler, act)

Creates a variational dense layer that computes a variational Bayesian approximation to the distribution over the parameters of the dense layer. The stochasticity enters during the forward pass: instead of using point estimates for the weights and biases, we sample them from trainable distributions. The distributions' learnable parameters are optimized using the reparameterization trick.

Reference: https://arxiv.org/abs/1505.05424. We use DistributionsAD (https://github.com/TuringLang/DistributionsAD.jl) to backpropagate through the sampling step.
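
To make the reparameterization trick concrete, here is a minimal sketch of a trainable Gaussian sampler. It is illustrative only, not the package's TrainableMvNormal implementation; the struct and field names are made up for this example.

using Flux

# Illustrative sampler: learnable mean and log-std, sampled via the
# reparameterization trick so gradients flow to μ and logσ.
struct GaussianSampler{T}
    μ::T
    logσ::T
end
Flux.@functor GaussianSampler

GaussianSampler(n::Integer) = GaussianSampler(randn(Float32, n), zeros(Float32, n))

# Sample as μ + σ ⊙ ε; the randomness lives in ε, which does not depend on
# the learnable parameters, so the sample stays differentiable in μ and logσ.
(s::GaussianSampler)() = s.μ .+ exp.(s.logσ) .* randn(Float32, length(s.μ))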

Fields

  • weight_sampler: A trainable distribution from which weights are sampled in every forward pass
  • bias_sampler: A trainable distribution from which biases are sampled in every forward pass
  • act: Activation function, applied to the logits after the layer transformation

Arguments

  • in::Integer: Input dimension size
  • out::Integer: Output dimension size
  • σ: Activation function, defaults to identity
  • init: Initialization for the distribution parameters, defaults to glorot_normal
  • weight_dist: Weight distribution, defaults to a trainable multivariate normal
  • bias_dist: Bias distribution, defaults to a trainable multivariate normal
source
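
A minimal usage sketch, assuming the constructor shown above is exported by DeepUncertainty and that Flux is available; the layer sizes and batch size are illustrative.

using Flux, DeepUncertainty

layer = VariationalDense(10, 5, relu)   # 10 inputs, 5 outputs, relu activation
x = randn(Float32, 10, 8)               # batch of 8 inputs

# Weights and biases are resampled from their trainable distributions on
# every call, so repeated forward passes give different outputs.
y1 = layer(x)
y2 = layer(x)                           # generally y1 ≠ y2
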
DeepUncertainty.VariationalConvType
VariationalConv(filter, in => out, σ = identity;
                stride = 1, pad = 0, dilation = 1, 
                groups = 1, 
                weight_dist = TrainableMvNormal, 
                bias_dist = TrainableMvNormal,
                [bias, weight, init])
VariationalConv(σ, weight_sampler, bias_sampler,
                stride, pad, dilation, groups)

Creates a variational conv layer that computes a variational Bayesian approximation to the distribution over the parameters of the conv layer. The stochasticity enters during the forward pass: instead of using point estimates for the weights and biases, we sample them from trainable distributions. The distributions' learnable parameters are optimized using the reparameterization trick.

Reference: https://arxiv.org/abs/1505.05424. We use DistributionsAD (https://github.com/TuringLang/DistributionsAD.jl) to backpropagate through the sampling step.

Fields

  • σ: Activation function, applied to the logits after the layer transformation
  • weight_sampler: A trainable distribution from which weights are sampled in every forward pass
  • bias_sampler: A trainable distribution from which biases are sampled in every forward pass
  • stride: Convolution stride
  • pad: Convolution padding
  • dilation: Kernel dilation
  • groups: Number of convolution groups

Arguments

  • filter::NTuple{N,Integer}: Kernel dimensions, e.g. (5, 5)
  • ch::Pair{<:Integer,<:Integer}: Input channels => output channels
  • σ::F=identity: Activation function, defaults to identity
  • weight_dist=TrainableMvNormal: Trainable distribution from which weights are sampled
  • bias_dist=TrainableMvNormal: Trainable distribution from which biases are sampled
source
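
As with the dense layer, a brief usage sketch under the same assumptions (exported constructor, Flux available); the input sizes and the Monte Carlo averaging are illustrative.

using Flux, Statistics, DeepUncertainty

layer = VariationalConv((3, 3), 3 => 16, relu; pad = 1)
x = randn(Float32, 32, 32, 3, 4)        # WHCN: 4 images of size 32×32×3

# Each call resamples the kernel and bias; averaging several stochastic
# forward passes gives a simple Monte Carlo estimate of the prediction.
ŷ = mean(layer(x) for _ in 1:10)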