We elucidate the mathematical structure of Bayesian filtering, and Bayesian inference more broadly, by applying recent work on category-theoretic probability, specifically the concept of a strongly representable Markov category. We show that filtering, along with related concepts such as conjugate priors, arises from an adjunction: the process of taking a hidden Markov process is right adjoint to a forgetful functor. This has an interesting consequence. In practice, filtering is usually implemented using parametrised families of distributions. The Kalman filter, which uses Gaussians, is a particularly important example. Rather than calculating a new posterior each time, the implementation only needs to update the parameters. This structure arises naturally from our adjunction; the correctness of such a model is witnessed by a map from the model into the system being modelled. Conjugate priors arise from this construction as a special case.
In showing this we define a notion of unifilar machine, which has its origins in the literature on epsilon-machines. Unifilar machines are useful as models of the "observable behaviour" of stochastic systems; we show additionally that in the Kleisli category of the distribution monad there is a terminal unifilar machine, and its elements are controlled stochastic processes, mapping sequences of the input alphabet probabilistically to sequences of the output alphabet.
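As an informal illustration of the parameter-update idea, the following minimal sketch uses the standard Beta-Bernoulli conjugate pair rather than any construction from this paper. The names `GRID`, `interpret`, `update_params`, `bayes_update` and `max_gap` are hypothetical, and the discretised grid is an assumption made only so that the check can run numerically. The function `interpret` plays the role of the map from the model (parameters) into the system being modelled (distributions); the check is that updating parameters and then interpreting agrees with interpreting and then applying Bayes' rule.

```python
# Illustrative sketch only: Beta-Bernoulli conjugate updating on a discretised grid.
# All names here are hypothetical and not taken from the paper.

GRID = [(i + 0.5) / 200 for i in range(200)]  # candidate coin biases in (0, 1)


def interpret(params):
    """Map parameters (a, b) to a normalised Beta(a, b)-shaped belief on GRID."""
    a, b = params
    weights = [p ** (a - 1) * (1 - p) ** (b - 1) for p in GRID]
    total = sum(weights)
    return [w / total for w in weights]


def update_params(params, obs):
    """Parameter-level update: a 1 increments a, a 0 increments b."""
    a, b = params
    return (a + 1, b) if obs == 1 else (a, b + 1)


def bayes_update(belief, obs):
    """Distribution-level update: multiply by the likelihood and renormalise."""
    likelihood = [p if obs == 1 else 1 - p for p in GRID]
    unnormalised = [w * l for w, l in zip(belief, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def max_gap(params, observations):
    """Largest pointwise difference between the two routes after all observations."""
    belief = interpret(params)
    for obs in observations:
        params = update_params(params, obs)
        belief = bayes_update(belief, obs)
    return max(abs(x - y) for x, y in zip(interpret(params), belief))
```

Calling `max_gap((1, 1), [1, 0, 1, 1])` compares the two routes starting from a uniform prior; the result should differ from zero only by floating-point rounding, which is the sense in which the parametrised model is a correct filter in this toy setting.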