Initializers

In addition to the initializers in tensorflow.keras.initializers, libspn-keras implements a few more useful initialization schemes for both leaf layers and sum weights.
Setting Defaults

Since accumulator initializers are often the same for all layers in an SPN, libspn-keras provides the following functions to get and set default accumulator initializers. These can still be overridden by providing the initializers explicitly at initialization of a layer.
libspn_keras.set_default_accumulator_initializer(initializer)
    Configure the default initializer that will be used for sum accumulators.

    Parameters
        initializer (Initializer) – The initializer which will be used by default for sum accumulators.

    Return type
        None
libspn_keras.get_default_accumulator_initializer()
    Obtain the default accumulator initializer.

    Return type
        Initializer

    Returns
        The default accumulator initializer that will be used in sum accumulators, unless specified explicitly at initialization.
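The fallback-and-override behaviour can be sketched with a plain-Python stand-in (the function names mirror the documented API, but `resolve_initializer` and the internals are hypothetical, not the library's actual code):

```python
# Stand-in sketch: a module-level default that a layer falls back to
# whenever no accumulator initializer is passed explicitly.
_default_accumulator_initializer = None

def set_default_accumulator_initializer(initializer):
    global _default_accumulator_initializer
    _default_accumulator_initializer = initializer

def get_default_accumulator_initializer():
    return _default_accumulator_initializer

def resolve_initializer(explicit=None):
    # Hypothetical helper: what a sum layer would do at construction time.
    return explicit if explicit is not None else get_default_accumulator_initializer()

set_default_accumulator_initializer("Dirichlet(alpha=0.1)")
default_used = resolve_initializer()                 # falls back to the default
explicit_used = resolve_initializer("Equidistant")   # explicit argument wins
```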
Location initializers

For a leaf distribution of the location-scale family, the following initializers can be used for initializing the location parameters.
class libspn_keras.initializers.PoonDomingosMeanOfQuantileSplit(data=None)
    Initializes the data according to the algorithm described in Poon and Domingos (2011).

    The data is divided over \(K\) quantiles, where \(K\) is the number of nodes along the last axis of the tensor to be initialized. The quantiles are computed over all samples in the provided data. Then, the mean per quantile is taken as the value for initialization.

    Parameters
        data (numpy.ndarray) – Data to compute quantiles over.

    References
        Sum-Product Networks: A New Deep Architecture, Poon and Domingos, 2011
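The quantile-split computation described above can be sketched in a few lines of NumPy (the helper name is mine; the real initializer wraps this logic in a Keras `Initializer`):

```python
import numpy as np

def quantile_split_means(data, K):
    # Sort the samples per variable, split them into K equally sized
    # quantile groups, and take the mean of each group: one location per node.
    sorted_data = np.sort(data, axis=0)              # [num_samples, num_vars]
    groups = np.array_split(sorted_data, K, axis=0)  # K groups of samples
    return np.stack([g.mean(axis=0) for g in groups], axis=-1)  # [num_vars, K]

data = np.arange(12, dtype=float).reshape(6, 2)  # 6 samples, 2 variables
means = quantile_split_means(data, K=3)
# means[0] == [1., 5., 9.] — the three quantile means of the first variable
```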
class libspn_keras.initializers.KMeans(data=None, samplewise_normalization=True, data_fraction=0.2, normalization_epsilon=0.01, stop_epsilon=0.0001, num_iters=100, group_centroids=True, max_num_clusters=8, jitter_factor=0.05, centroid_initialization='kmeans++', downsample=None, use_groups=False)
    Initializer learned through K-means on data.

    The centroids learned from K-means are used to initialize the location parameters of a location-scale leaf, such as a NormalLeaf. This is particularly useful for variables with dimensionality greater than 1.

    Notes
        Currently only works for spatial SPNs.

    Parameters
        data (numpy.ndarray) – Data on which to perform K-means.
        samplewise_normalization (bool) – Whether to normalize the data before learning centroids.
        data_fraction (float) – Fraction of the data to use for K-means (chosen randomly).
        normalization_epsilon (float) – Normalization constant (only used when samplewise_normalization is True).
        stop_epsilon (float) – Non-zero constant for the difference in MSE at which to stop K-means fitting.
        num_iters (int) – Maximum number of iterations.
        group_centroids (bool) – If True, performs another round of K-means to group the centroids along the scope axes.
        max_num_clusters (int) – Maximum number of clusters (use this to limit the memory needed).
        jitter_factor (float) – If the number of clusters is larger than allowed by max_num_clusters, the learned max_num_clusters centroids are repeated and then jittered with noise generated from a truncated normal distribution with a standard deviation of jitter_factor.
        centroid_initialization (str) – Centroid initialization algorithm. If "kmeans++", clusters are iteratively initialized far apart from each other; otherwise, the centroids are initialized randomly from the data.
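The core of the scheme is ordinary Lloyd-style K-means. A compact NumPy sketch, assuming a greedy farthest-point seeding (a simplified, deterministic cousin of "kmeans++"; the class additionally supports normalization, centroid grouping, and jittering, all omitted here):

```python
import numpy as np

def kmeans_centroids(data, k, num_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Farthest-point seeding: start from a random sample, then repeatedly
    # add the sample farthest from all centroids chosen so far.
    centroids = [data[rng.integers(len(data))]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(data - c, axis=1) for c in centroids], axis=0)
        centroids.append(data[d.argmax()])
    centroids = np.stack(centroids)
    # Lloyd iterations: assign each sample to its nearest centroid, then
    # move each centroid to the mean of its assigned samples.
    for _ in range(num_iters):
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        new_centroids = np.stack([data[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids

# Two well-separated 2-D blobs; the centroids land near (0, 0) and (5, 5),
# which would then initialize the locations of a location-scale leaf.
rng = np.random.default_rng(42)
blob_a = rng.normal(0.0, 0.1, size=(50, 2))
blob_b = rng.normal(5.0, 0.1, size=(50, 2))
centroids = kmeans_centroids(np.concatenate([blob_a, blob_b]), k=2)
```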
class libspn_keras.initializers.Equidistant(minval=0.0, maxval=1.0)
    Initializer that generates tensors where the last axis is initialized with 'equidistant' values.

    Parameters
        minval (float) – A Python scalar or a scalar tensor. Lower bound of the range of values to generate.
        maxval (float) – A Python scalar or a scalar tensor. Upper bound of the range of values to generate. Defaults to 1 for float types.
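What this produces can be sketched in NumPy: evenly spaced values along the last axis, identical across the leading axes (the helper name is mine):

```python
import numpy as np

def equidistant(shape, minval=0.0, maxval=1.0):
    # Evenly spaced values in [minval, maxval] along the last axis,
    # broadcast over all leading axes.
    values = np.linspace(minval, maxval, shape[-1])
    return np.broadcast_to(values, shape).copy()

x = equidistant((2, 4))
# x[0] == x[1] == [0., 1/3, 2/3, 1.]
```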
Scale initializers

For a leaf distribution of the location-scale family, the following initializers can be used for initializing the scale parameters.
class libspn_keras.initializers.PoonDomingosStddevOfQuantileSplit(data=None)
    Initializes the data according to the algorithm described in Poon and Domingos (2011).

    The data is divided over \(K\) quantiles, where \(K\) is the number of nodes along the last axis of the tensor to be initialized. The quantiles are computed over all samples in the provided data. Then, the stddev per quantile is taken as the value for initialization.

    Parameters
        data (numpy.ndarray) – Data to compute quantiles over.

    References
        Sum-Product Networks: A New Deep Architecture, Poon and Domingos, 2011
Weight initializers

class libspn_keras.initializers.Dirichlet(axis=-2, alpha=0.1)
    Initializes all values in a tensor with \(Dir(\alpha)\).

    Parameters
        axis (int) – The axis over which to sample from a \(Dir(\alpha)\).
        alpha (float) – The \(\alpha\) parameter of the Dirichlet distribution. If a scalar, this is broadcast along the given axis.

    Note
        Initializer for discrete EM (SumOpHardEMBackprop and SumOpUnweightedHardEMBackprop).
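Dirichlet samples can be drawn by normalizing independent Gamma draws along the chosen axis, which is what the sketch below does in NumPy (the helper name is mine; the real initializer returns a tensor of the requested shape):

```python
import numpy as np

def dirichlet_init(shape, axis=-2, alpha=0.1, seed=0):
    # Gamma(alpha, 1) draws normalized along `axis` are Dir(alpha)-distributed,
    # so each slice of weights along that axis sums to one.
    rng = np.random.default_rng(seed)
    gamma = rng.gamma(alpha, size=shape)
    return gamma / gamma.sum(axis=axis, keepdims=True)

w = dirichlet_init((3, 4, 5), axis=-2, alpha=0.1)
# w.sum(axis=-2) is all ones: every group of 4 weights is a distribution
```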
class libspn_keras.initializers.EpsilonInverseFanIn(axis=-2, epsilon=0.0001)
    Initializes all values in a tensor with \(\epsilon K^{-1}\), where \(K\) is the dimension at axis.

    This is particularly useful for (unweighted) hard EM learning and should generally be avoided otherwise.

    Parameters
        axis (int) – The axis for input nodes, so that \(K^{-1}\) is the inverse fan-in. Usually, this is -2.
        epsilon (float) – A small non-zero constant.
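The scheme is just a constant fill; in NumPy terms (the helper name is mine):

```python
import numpy as np

def epsilon_inverse_fan_in(shape, axis=-2, epsilon=1e-4):
    # Every entry is epsilon / K, where K is the fan-in (the size of `axis`).
    return np.full(shape, epsilon / shape[axis])

w = epsilon_inverse_fan_in((3, 4, 5))
# every entry equals 1e-4 / 4 == 2.5e-5
```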