Usage of constraints
Functions from the constraints module allow setting constraints (e.g. non-negativity) on network parameters during optimization.
The constraints are applied on a per-layer basis. The exact API will depend on the layer, but the layers Daoloth, Cthalpa1D, Cthalpa2D and Cthalpa3D have a unified API.
These layers expose 2 keyword arguments:
- kernel_constraint for the main weights matrix
- bias_constraint for the bias.
from cthulhu.constraints import max_norm
model.add(Daoloth(64, kernel_constraint=max_norm(2.)))
Available constraints
MaxNorm
cthulhu.constraints.MaxNorm(max_value=2, axis=0)
MaxNorm weight constraint.
Constrains the weights incident to each hidden unit to have a norm less than or equal to a desired value.
Arguments
- max_value: the maximum norm for the incoming weights.
- axis: integer, axis along which to calculate weight norms. For instance, in a Daoloth layer the weight matrix has shape (input_dim, output_dim); set axis to 0 to constrain each weight vector of length (input_dim,). In a Cthalpa2D layer with data_format="channels_last", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to [0, 1, 2] to constrain the weights of each filter tensor of size (rows, cols, input_depth).
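To make the effect of axis=0 concrete, here is a minimal pure-Python sketch of a max-norm projection on a (input_dim, output_dim) matrix. It is an illustration only, not the library's implementation: the names column_norms and max_norm_sketch are hypothetical, and the rescaling rule (shrink only the columns whose norm exceeds max_value) is an assumption about how such a constraint can be enforced.

```python
import math

def column_norms(weights):
    """L2 norm of each column (axis=0) of a matrix stored as a list of rows."""
    return [math.sqrt(sum(row[j] ** 2 for row in weights))
            for j in range(len(weights[0]))]

def max_norm_sketch(weights, max_value=2.0):
    """Hypothetical helper: shrink any column whose norm exceeds max_value."""
    scales = [max_value / n if n > max_value else 1.0
              for n in column_norms(weights)]
    return [[w * s for w, s in zip(row, scales)] for row in weights]

# Shape (input_dim=2, output_dim=2): the column norms are 5.0 and 0.5.
weights = [[3.0, 0.3],
           [4.0, 0.4]]
constrained = max_norm_sketch(weights, max_value=2.0)
```

Only the first column is rescaled here; the second already satisfies the bound and is left as it was.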
NonNeg
cthulhu.constraints.NonNeg()
Constrains the weights to be non-negative.
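One simple way to obtain non-negative weights is to set every negative entry to zero after an update. The sketch below shows that idea in plain Python; non_neg_sketch is a hypothetical name, and clamping to zero is an assumption about the enforcement rule rather than a statement of what the library does internally.

```python
def non_neg_sketch(weights):
    """Hypothetical helper: replace negative entries with 0.0, keep the rest."""
    return [[w if w >= 0 else 0.0 for w in row] for row in weights]

weights = [[0.5, -1.2],
           [-0.1, 2.0]]
constrained = non_neg_sketch(weights)
```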
UnitNorm
cthulhu.constraints.UnitNorm(axis=0)
Constrains the weights incident to each hidden unit to have unit norm.
Arguments
- axis: integer, axis along which to calculate weight norms. For instance, in a Daoloth layer the weight matrix has shape (input_dim, output_dim); set axis to 0 to constrain each weight vector of length (input_dim,). In a Cthalpa2D layer with data_format="channels_last", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to [0, 1, 2] to constrain the weights of each filter tensor of size (rows, cols, input_depth).
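The axis=[0, 1, 2] case can be sketched on a nested-list kernel of shape (rows, cols, input_depth, output_depth): one norm is computed per output filter, reducing over the first three axes, and each filter is divided by its norm. The function name unit_norm_filters and the small eps guard against division by zero are assumptions made for this illustration.

```python
import math

def unit_norm_filters(kernel, eps=1e-7):
    """Hypothetical helper: give each output filter (last axis) unit L2 norm."""
    n_out = len(kernel[0][0][0])
    # One norm per output filter, reduced over rows, cols and input_depth.
    norms = [
        math.sqrt(sum(cell[o] ** 2
                      for plane in kernel for line in plane for cell in line))
        for o in range(n_out)
    ]
    return [[[[cell[o] / (norms[o] + eps) for o in range(n_out)]
              for cell in line] for line in plane] for plane in kernel]

# Shape (rows=1, cols=2, input_depth=1, output_depth=2).
# Filter 0 holds the entries 3.0 and 4.0; filter 1 holds 1.0 and 0.0.
kernel = [[[[3.0, 1.0]], [[4.0, 0.0]]]]
normalized = unit_norm_filters(kernel)
```

After the call, filter 0 is approximately (0.6, 0.8) and filter 1 is approximately (1.0, 0.0), up to the eps term.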
MinMaxNorm
cthulhu.constraints.MinMaxNorm(min_value=0.0, max_value=1.0, rate=1.0, axis=0)
MinMaxNorm weight constraint.
Constrains the weights incident to each hidden unit to have the norm between a lower bound and an upper bound.
Arguments
- min_value: the minimum norm for the incoming weights.
- max_value: the maximum norm for the incoming weights.
- rate: rate for enforcing the constraint: weights will be rescaled to yield (1 - rate) * norm + rate * norm.clip(min_value, max_value). Effectively, this means that rate=1.0 stands for strict enforcement of the constraint, while rate<1.0 means that weights will be rescaled at each step to slowly move towards a value inside the desired interval.
- axis: integer, axis along which to calculate weight norms. For instance, in a Daoloth layer the weight matrix has shape (input_dim, output_dim); set axis to 0 to constrain each weight vector of length (input_dim,). In a Cthalpa2D layer with data_format="channels_last", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to [0, 1, 2] to constrain the weights of each filter tensor of size (rows, cols, input_depth).
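The rate formula can be checked by hand with a small pure-Python sketch. It computes the target norm (1 - rate) * norm + rate * clip(norm, min_value, max_value) for each column (axis=0) and rescales the column to that target. The helper names and the eps guard against division by zero are assumptions for illustration, not the library's code.

```python
import math

def column_norms(weights):
    """L2 norm of each column (axis=0) of a matrix stored as a list of rows."""
    return [math.sqrt(sum(row[j] ** 2 for row in weights))
            for j in range(len(weights[0]))]

def min_max_norm_sketch(weights, min_value=0.0, max_value=1.0,
                        rate=1.0, eps=1e-7):
    """Hypothetical helper: move each column norm towards [min_value, max_value]."""
    norms = column_norms(weights)
    # Target norm: blend of the current norm and its clipped value.
    desired = [(1 - rate) * n + rate * min(max(n, min_value), max_value)
               for n in norms]
    scales = [d / (n + eps) for d, n in zip(desired, norms)]
    return [[w * s for w, s in zip(row, scales)] for row in weights]

weights = [[3.0], [4.0]]  # a single column with norm 5.0
strict = min_max_norm_sketch(weights, max_value=1.0, rate=1.0)
partial = min_max_norm_sketch(weights, max_value=1.0, rate=0.5)
```

With rate=1.0 the column norm lands on the upper bound of 1.0 in one step. With rate=0.5 the target is 0.5 * 5.0 + 0.5 * 1.0 = 3.0, so the norm only moves part of the way towards the interval.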