Initializers

In addition to the initializers in tensorflow.keras.initializers, libspn-keras implements a few more initialization schemes that are useful for both leaf layers and sum weights.

Setting Defaults

Since accumulator initializers are often the same for all layers in an SPN, libspn-keras provides the following functions to get and set a default accumulator initializer. The default can still be overridden by passing an initializer explicitly when a layer is constructed.

libspn_keras.set_default_accumulator_initializer(initializer)

Configure the default initializer that will be used for sum accumulators.

Parameters

initializer (Initializer) – The initializer which will be used by default for sum accumulators.

Return type

None

libspn_keras.get_default_accumulator_initializer()

Obtain the default accumulator initializer.

Return type

Initializer

Returns

The default accumulator initializer that will be used in sum accumulators, unless specified explicitly at initialization.
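
For example, a default can be set once before any sum layers are constructed. A minimal sketch (Dirichlet is documented under Weight initializers below):

    import libspn_keras as spnk
    from libspn_keras.initializers import Dirichlet

    # Set the default once; sum layers constructed afterwards pick it up
    spnk.set_default_accumulator_initializer(Dirichlet(alpha=0.1))

    # Retrieve the current default, e.g. to verify the configuration
    print(spnk.get_default_accumulator_initializer())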

Location initializers

For a leaf distribution of the location-scale family, the following initializers can be used to initialize the location parameters.

class libspn_keras.initializers.PoonDomingosMeanOfQuantileSplit(data=None)

Initializes location parameters according to the algorithm described in (Poon and Domingos, 2011).

The data is divided over \(K\) quantiles where \(K\) is the number of nodes along the last axis of the tensor to be initialized. The quantiles are computed over all samples in the provided data. Then, the mean per quantile is taken as the value for initialization.

Parameters

data (numpy.ndarray) – Data to compute quantiles over.

References

Poon and Domingos (2011), Sum-Product Networks: A New Deep Architecture.
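
A minimal sketch of passing this initializer to the locations of a NormalLeaf layer. The import path for NormalLeaf and its num_components and location_initializer keyword arguments are assumptions here, not verbatim from this reference:

    import numpy as np
    from libspn_keras.initializers import PoonDomingosMeanOfQuantileSplit
    from libspn_keras.layers import NormalLeaf  # assumed import path

    # Training data flattened to (num_samples, num_variables)
    train_data = np.random.rand(1000, 784).astype(np.float32)

    # The quantile means of the data become the initial locations
    leaf = NormalLeaf(
        num_components=4,  # assumed keyword argument
        location_initializer=PoonDomingosMeanOfQuantileSplit(data=train_data),
    )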

class libspn_keras.initializers.KMeans(data=None, samplewise_normalization=True, data_fraction=0.2, normalization_epsilon=0.01, stop_epsilon=0.0001, num_iters=100, group_centroids=True, max_num_clusters=8, jitter_factor=0.05, centroid_initialization='kmeans++', downsample=None, use_groups=False)

Initializer learned through K-means from data.

The centroids learned through K-means are used to initialize the location parameters of a location-scale leaf, such as a NormalLeaf. This is particularly useful for variables with dimensionality greater than 1.

Notes

Currently only works for spatial SPNs.

Parameters
  • data (numpy.ndarray) – Data on which to perform K-means.

  • samplewise_normalization (bool) – Whether to normalize data before learning centroids.

  • data_fraction (float) – Fraction of the data to use for K-means (chosen randomly).

  • normalization_epsilon (float) – Normalization constant (only used when samplewise_normalization is True).

  • stop_epsilon (float) – Non-zero threshold on the change in MSE below which K-means fitting stops.

  • num_iters (int) – Maximum number of iterations.

  • group_centroids (bool) – If True, performs another round of K-means to group the centroids along the scope axes.

  • max_num_clusters (int) – Maximum number of clusters (use this to limit the memory needed).

  • jitter_factor (float) – If the required number of clusters exceeds max_num_clusters, the learned max_num_clusters centroids are repeated and then jittered with noise drawn from a truncated normal distribution with a standard deviation of jitter_factor.

  • centroid_initialization (str) – Centroid initialization algorithm. If "kmeans++", will iteratively initialize clusters far apart from each other. Otherwise, the centroids will be initialized from the data randomly.
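
Since this initializer targets spatial SPNs, a sketch would fit it on image-shaped data before handing it to a leaf layer (NormalLeaf usage and its keyword arguments are assumed, as above):

    import numpy as np
    from libspn_keras.initializers import KMeans
    from libspn_keras.layers import NormalLeaf  # assumed import path

    # Image data with shape (num_samples, height, width, channels)
    images = np.random.rand(1000, 28, 28, 1).astype(np.float32)

    location_init = KMeans(
        data=images,
        data_fraction=0.1,  # fit on a random 10% subset to limit memory
        max_num_clusters=8,
        group_centroids=True,
    )
    leaf = NormalLeaf(
        num_components=8,  # assumed keyword argument
        location_initializer=location_init,
    )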

class libspn_keras.initializers.Equidistant(minval=0.0, maxval=1.0)

Initializer that generates tensors where the last axis is filled with evenly spaced (‘equidistant’) values between minval and maxval.

Parameters
  • minval (float) – A Python scalar or a scalar tensor. Lower bound of the range of values to generate.

  • maxval (float) – A Python scalar or a scalar tensor. Upper bound of the range of values to generate. Defaults to 1.0.
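
Like any Keras initializer, it can be called directly on a shape to inspect the values it produces (a minimal sketch):

    from libspn_keras.initializers import Equidistant

    init = Equidistant(minval=0.0, maxval=1.0)

    # The last axis holds evenly spaced values between minval and maxval;
    # here each of the 2 rows holds 4 such values
    values = init(shape=(2, 4))
    print(values)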

Scale initializers

For a leaf distribution of the location-scale family, the following initializers can be used to initialize the scale parameters.

class libspn_keras.initializers.PoonDomingosStddevOfQuantileSplit(data=None)

Initializes scale parameters according to the algorithm described in (Poon and Domingos, 2011).

The data is divided over \(K\) quantiles where \(K\) is the number of nodes along the last axis of the tensor to be initialized. The quantiles are computed over all samples in the provided data. Then, the stddev per quantile is taken as the value for initialization.

Parameters

data (numpy.ndarray) – Data to compute quantiles over.

References

Poon and Domingos (2011), Sum-Product Networks: A New Deep Architecture.
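
This initializer naturally pairs with PoonDomingosMeanOfQuantileSplit on the same data. A sketch with assumed NormalLeaf keyword arguments, as in the location examples above:

    import numpy as np
    from libspn_keras.initializers import (
        PoonDomingosMeanOfQuantileSplit,
        PoonDomingosStddevOfQuantileSplit,
    )
    from libspn_keras.layers import NormalLeaf  # assumed import path

    train_data = np.random.rand(1000, 784).astype(np.float32)

    # Locations from quantile means, scales from quantile stddevs
    leaf = NormalLeaf(
        num_components=4,  # assumed keyword argument
        location_initializer=PoonDomingosMeanOfQuantileSplit(data=train_data),
        scale_initializer=PoonDomingosStddevOfQuantileSplit(data=train_data),
    )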

Weight initializers

class libspn_keras.initializers.Dirichlet(axis=-2, alpha=0.1)

Initializes all values in a tensor by sampling from a Dirichlet distribution \(\mathrm{Dir}(\alpha)\) along the given axis.

Parameters
  • axis (int) – The axis over which to sample from a \(\mathrm{Dir}(\alpha)\).

  • alpha (float) – The \(\alpha\) parameter of the Dirichlet distribution. If a scalar, this is broadcast along the given axis.

Note: Initializer for discrete EM (SumOpHardEMBackprop and SumOpUnweightedHardEMBackprop).
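
A sketch of passing it to a dense sum layer; the DenseSum import path and its num_sums and accumulator_initializer keyword arguments are assumptions about the layer API:

    from libspn_keras.initializers import Dirichlet
    from libspn_keras.layers import DenseSum  # assumed import path

    # Sample each set of accumulators from Dir(0.1) over the input-node axis
    sum_layer = DenseSum(
        num_sums=16,  # assumed keyword argument
        accumulator_initializer=Dirichlet(axis=-2, alpha=0.1),
    )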

class libspn_keras.initializers.EpsilonInverseFanIn(axis=-2, epsilon=0.0001)

Initializes all values in a tensor with \(\epsilon K^{-1}\), where \(K\) is the dimension at axis.

This is particularly useful for (unweighted) hard EM learning and should generally be avoided otherwise.

Parameters
  • axis (int) – The axis of the input nodes, so that \(K^{-1}\) is the inverse fan-in. Usually, this is -2.

  • epsilon (float) – A small non-zero constant.
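
A sketch for unweighted hard EM, reusing the assumed DenseSum keyword arguments from the previous example:

    from libspn_keras.initializers import EpsilonInverseFanIn
    from libspn_keras.layers import DenseSum  # assumed import path

    # Start accumulators at eps / fan-in, so the counts collected during
    # hard EM dominate the accumulator values almost immediately
    sum_layer = DenseSum(
        num_sums=16,  # assumed keyword argument
        accumulator_initializer=EpsilonInverseFanIn(axis=-2, epsilon=1e-4),
    )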