@hackage goal-graphical0.20

Optimization of latent variable and dynamical models with Goal

This library provides definitions and algorithms for various graphical models such as mixture models, kalman Filters, and restricted Boltzmann machines, as well as algorithms for fitting them e.g. expectation maximization and contrastive divergence minimization. Underlying all of these models is is a generalized linear object known as a Harmonium, and in the following I will briefly introduce them.

The core definition of this library is a Manifold of joint distributions I call of an AffineHarmonium

newtype AffineHarmonium f y x z w = AffineHarmonium (Affine f y z x, w)

which is a product Manifold composed of a of a Manifold of likelihood functions Affine f y z x, and a Manifold of distributions w that partially define the space of priors. AffineHarmoniums provide a bit more flexibility than what I call a Harmonium

type Harmonium f z w = AffineHarmonium f z w z w

which is a simpler object. Nevertheless, from a theoretical point of view, an AffineHarmonium is a special case of a Harmonium, and we may think of them more or less equivalently.

A Harmonium is a model over a observable variables and latent variables, and represents a sort of generalized linear joint distribution over the two of them. The theory for Harmoniums is summarized well by this paper and this paper. Although Harmoniums might seem like a little-studied, and esoteric object, various well known models, such as mixture models and restricted Boltzmann machines, are in fact Harmoniums, and various other models, such as factor analysis, Kalman filters, or hidden Markov models, can be expressed in terms of them.

All of the aforementioned models can be fit with Expectation-Maximization (EM), and EM can be expressed in an entirely general manner for Harmoniums. Firstly, the expectation step of a Harmonium is implemented by

expectationStep
    :: ( ExponentialFamily z, Map Natural f x y, Bilinear f y x
       , Translation z y, Translation w x, LegendreExponentialFamily w )
    => Sample z -- ^ Model Samples
    -> Natural # AffineHarmonium f y x z w -- ^ Harmonium
    -> Mean # AffineHarmonium f y x z w -- ^ Harmonium expected sufficient statistics
expectationStep zs hrm =
    let mzs = sufficientStatistic <$> zs
        mys = anchor <$> mzs
        pstr = fst . split $ transposeHarmonium hrm
        mws = transition <$> pstr >$> mys
        mxs = anchor <$> mws
        myx = (>$<) mys mxs
     in joinHarmonium (average mzs) myx $ average mws

In summary, what we do is

  • take some observations,
  • compute their sufficientStatistics,
  • map these statistics into the predictions of the latent variables,
  • transition these latent predictions from Natural coordinates to Mean coordinates,
  • and assemble the results into the Mean sufficientStatistics of the joint distribution.

The maximization step then consists simply of mapping the whole joint distribution from Mean back to Natural coordinates, such that expectationMaximization may be expressed as

expectationMaximization
    :: ( DuallyFlatExponentialFamily (AffineHarmonium f y x z w)
       , ExponentialFamily z, Map Natural f x y, Bilinear f y x
       , Translation z y, Translation w x, LegendreExponentialFamily w )
    => Sample z
    -> Natural # AffineHarmonium f y x z w
    -> Natural # AffineHarmonium f y x z w
expectationMaximization zs hrm = transition $ expectationStep zs hrm

As such, for a wide variety of models, we may reduce implementing expectation maximization to instantating the class requirements of the expectationMaximization function. This is rarely trivial, but in some sense, much more straight-forward and well-defined that deriving EM algorithms from scratch.

For in-depth tutorials visit my blog.