Initializers

In addition to the initializers in tensorflow.keras.initializers, libspn-keras implements a few more useful initialization schemes for both leaf layers and sum weights.
Setting Defaults

Since accumulator initializers are often the same for all layers in an SPN, libspn-keras provides the following functions to get and set default accumulator initializers. These can still be overridden by providing the initializers explicitly at initialization of a layer.
libspn_keras.set_default_accumulator_initializer(initializer)
    Configure the default initializer that will be used for sum accumulators.

    Parameters
        initializer (Initializer) – The initializer which will be used by default for sum accumulators.

    Return type
        None
libspn_keras.get_default_accumulator_initializer()
    Obtain the default accumulator initializer.

    Return type
        Initializer

    Returns
        The default accumulator initializer that will be used in sum accumulators, unless specified explicitly at initialization.
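The fallback-and-override behaviour can be sketched with a plain-Python stand-in (the function names mirror the documented API, but `resolve_initializer` and the internals are hypothetical, not the library's actual code):

```python
# Stand-in sketch: a module-level default that a layer falls back to
# whenever no accumulator initializer is passed explicitly.
_default_accumulator_initializer = None

def set_default_accumulator_initializer(initializer):
    global _default_accumulator_initializer
    _default_accumulator_initializer = initializer

def get_default_accumulator_initializer():
    return _default_accumulator_initializer

def resolve_initializer(explicit=None):
    # Hypothetical helper: what a sum layer would do at construction time.
    return explicit if explicit is not None else get_default_accumulator_initializer()

set_default_accumulator_initializer("Dirichlet(alpha=0.1)")
default_used = resolve_initializer()                 # falls back to the default
explicit_used = resolve_initializer("Equidistant")   # explicit argument wins
```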
Location initializers

For a leaf distribution of the location-scale family, the following initializers can be used for initializing the location parameters.
class libspn_keras.initializers.PoonDomingosMeanOfQuantileSplit(data=None)
    Initializes the data according to the algorithm described in Poon and Domingos (2011).

    The data is divided over \(K\) quantiles, where \(K\) is the number of nodes along the last axis of the tensor to be initialized. The quantiles are computed over all samples in the provided data. Then, the mean per quantile is taken as the value for initialization.

    Parameters
        data (numpy.ndarray) – Data to compute quantiles over.

    References
        Sum-Product Networks: A New Deep Architecture, Poon and Domingos, 2011
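The quantile-split computation described above can be sketched in a few lines of NumPy (the helper name is mine; the real initializer wraps this logic in a Keras `Initializer`):

```python
import numpy as np

def quantile_split_means(data, K):
    # Sort the samples per variable, split them into K equally sized
    # quantile groups, and take the mean of each group: one location per node.
    sorted_data = np.sort(data, axis=0)              # [num_samples, num_vars]
    groups = np.array_split(sorted_data, K, axis=0)  # K groups of samples
    return np.stack([g.mean(axis=0) for g in groups], axis=-1)  # [num_vars, K]

data = np.arange(12, dtype=float).reshape(6, 2)  # 6 samples, 2 variables
means = quantile_split_means(data, K=3)
# means[0] == [1., 5., 9.] — the three quantile means of the first variable
```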
class libspn_keras.initializers.KMeans(data=None, samplewise_normalization=True, data_fraction=0.2, normalization_epsilon=0.01, stop_epsilon=0.0001, num_iters=100, group_centroids=True, max_num_clusters=8, jitter_factor=0.05, centroid_initialization='kmeans++', downsample=None, use_groups=False)
    Initializer learned through K-means on data.

    The centroids learned from K-means are used to initialize the location parameters of a location-scale leaf, such as a NormalLeaf. This is particularly useful for variables with dimensionality greater than 1.

    Notes
        Currently only works for spatial SPNs.

    Parameters
        data (numpy.ndarray) – Data on which to perform K-means.
        samplewise_normalization (bool) – Whether to normalize the data before learning centroids.
        data_fraction (float) – Fraction of the data to use for K-means (chosen randomly).
        normalization_epsilon (float) – Normalization constant (only used when samplewise_normalization is True).
        stop_epsilon (float) – Non-zero constant for the difference in MSE at which to stop K-means fitting.
        num_iters (int) – Maximum number of iterations.
        group_centroids (bool) – If True, performs another round of K-means to group the centroids along the scope axes.
        max_num_clusters (int) – Maximum number of clusters (use this to limit the memory needed).
        jitter_factor (float) – If the number of clusters is larger than allowed by max_num_clusters, the learned max_num_clusters centroids are repeated and then jittered with noise generated from a truncated normal distribution with a standard deviation of jitter_factor.
        centroid_initialization (str) – Centroid initialization algorithm. If "kmeans++", clusters are iteratively initialized far apart from each other; otherwise, the centroids are initialized randomly from the data.
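The core of the scheme is ordinary Lloyd-style K-means. A compact NumPy sketch, assuming a greedy farthest-point seeding (a simplified, deterministic cousin of "kmeans++"; the class additionally supports normalization, centroid grouping, and jittering, all omitted here):

```python
import numpy as np

def kmeans_centroids(data, k, num_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Farthest-point seeding: start from a random sample, then repeatedly
    # add the sample farthest from all centroids chosen so far.
    centroids = [data[rng.integers(len(data))]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(data - c, axis=1) for c in centroids], axis=0)
        centroids.append(data[d.argmax()])
    centroids = np.stack(centroids)
    # Lloyd iterations: assign each sample to its nearest centroid, then
    # move each centroid to the mean of its assigned samples.
    for _ in range(num_iters):
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        new_centroids = np.stack([data[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids

# Two well-separated 2-D blobs; the centroids land near (0, 0) and (5, 5),
# which would then initialize the locations of a location-scale leaf.
rng = np.random.default_rng(42)
blob_a = rng.normal(0.0, 0.1, size=(50, 2))
blob_b = rng.normal(5.0, 0.1, size=(50, 2))
centroids = kmeans_centroids(np.concatenate([blob_a, blob_b]), k=2)
```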
class libspn_keras.initializers.Equidistant(minval=0.0, maxval=1.0)
    Initializer that generates tensors where the last axis is initialized with 'equidistant' values.

    Parameters
        minval (float) – A Python scalar or a scalar tensor. Lower bound of the range of values to generate.
        maxval (float) – A Python scalar or a scalar tensor. Upper bound of the range of values to generate. Defaults to 1 for float types.
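What this produces can be sketched in NumPy: evenly spaced values along the last axis, identical across the leading axes (the helper name is mine):

```python
import numpy as np

def equidistant(shape, minval=0.0, maxval=1.0):
    # Evenly spaced values in [minval, maxval] along the last axis,
    # broadcast over all leading axes.
    values = np.linspace(minval, maxval, shape[-1])
    return np.broadcast_to(values, shape).copy()

x = equidistant((2, 4))
# x[0] == x[1] == [0., 1/3, 2/3, 1.]
```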
Scale initializers

For a leaf distribution of the location-scale family, the following initializers can be used for initializing the scale parameters.
class libspn_keras.initializers.PoonDomingosStddevOfQuantileSplit(data=None)
    Initializes the data according to the algorithm described in Poon and Domingos (2011).

    The data is divided over \(K\) quantiles, where \(K\) is the number of nodes along the last axis of the tensor to be initialized. The quantiles are computed over all samples in the provided data. Then, the stddev per quantile is taken as the value for initialization.

    Parameters
        data (numpy.ndarray) – Data to compute quantiles over.

    References
        Sum-Product Networks: A New Deep Architecture, Poon and Domingos, 2011
Weight initializers

class libspn_keras.initializers.Dirichlet(axis=-2, alpha=0.1)
    Initializes all values in a tensor with \(Dir(\alpha)\).

    Parameters
        axis (int) – The axis over which to sample from a \(Dir(\alpha)\).
        alpha (float) – The \(\alpha\) parameter of the Dirichlet distribution. If a scalar, this is broadcast along the given axis.

    Note
        Initializer for discrete EM (SumOpHardEMBackprop and SumOpUnweightedHardEMBackprop).
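Dirichlet samples can be drawn by normalizing independent Gamma draws along the chosen axis, which is what the sketch below does in NumPy (the helper name is mine; the real initializer returns a tensor of the requested shape):

```python
import numpy as np

def dirichlet_init(shape, axis=-2, alpha=0.1, seed=0):
    # Gamma(alpha, 1) draws normalized along `axis` are Dir(alpha)-distributed,
    # so each slice of weights along that axis sums to one.
    rng = np.random.default_rng(seed)
    gamma = rng.gamma(alpha, size=shape)
    return gamma / gamma.sum(axis=axis, keepdims=True)

w = dirichlet_init((3, 4, 5), axis=-2, alpha=0.1)
# w.sum(axis=-2) is all ones: every group of 4 weights is a distribution
```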
class libspn_keras.initializers.EpsilonInverseFanIn(axis=-2, epsilon=0.0001)
    Initializes all values in a tensor with \(\epsilon K^{-1}\), where \(K\) is the dimension at axis.

    This is particularly useful for (unweighted) hard EM learning and should generally be avoided otherwise.

    Parameters
        axis (int) – The axis for input nodes, so that \(K^{-1}\) is the inverse fan-in. Usually, this is -2.
        epsilon (float) – A small non-zero constant.
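The scheme is just a constant fill; in NumPy terms (the helper name is mine):

```python
import numpy as np

def epsilon_inverse_fan_in(shape, axis=-2, epsilon=1e-4):
    # Every entry is epsilon / K, where K is the fan-in (the size of `axis`).
    return np.full(shape, epsilon / shape[axis])

w = epsilon_inverse_fan_in((3, 4, 5))
# every entry equals 1e-4 / 4 == 2.5e-5
```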