FEP and Active Inference Paper Repository

This repository provides a list of papers that I believe are interesting and influential for the Free Energy Principle (FEP) or Active Inference. If you believe I have missed any papers, please contact me at beren@millidge.name or make a pull request with the information about the paper. I will be happy to include it.

FEP Outline

This list covers papers focused specifically on the abstract mathematical formulation of the Free Energy Principle (FEP). The FEP is a theory which tries to determine the behaviours a non-equilibrium thermodynamical system must exhibit if it is to maintain itself as a separate entity over time. It argues that any such system must minimise a quantity called the free energy and that, over the course of this minimisation, behaviour much like action and perception must emerge.

The key prerequisites for the FEP are that a 'system' has a special kind of statistical separation from the world, called a Markov Blanket, which it must maintain if it is to remain a system, and that the system possesses a non-equilibrium steady state to which it self-organises and which it tries to maintain over time against the dissipative forces of entropy.
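
For orientation, the free energy in question is the variational free energy familiar from approximate Bayesian inference. In one standard notation (mine, not tied to any single paper below), with external states x, blanket/sensory states y, a generative model p, and an approximate posterior q:

```math
F[q] = \mathbb{E}_{q(x)}\big[\ln q(x) - \ln p(x, y)\big] = D_{\mathrm{KL}}\big[q(x)\,\|\,p(x \mid y)\big] - \ln p(y)
```

Since the KL divergence is non-negative, F upper-bounds the surprisal -ln p(y); a system that minimises F therefore both improves its implicit inference about external states and keeps itself within the sensory states it expects to occupy.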

Much of the work on the FEP has focused on applying its general tenets to understanding biological far-from-equilibrium systems, especially the brain.

If you are just starting out, I recommend reading all the papers in the 'Surveys' section in order. These are all great tutorials or overviews which should give you a solid grounding in the intuitions of the theory, and the later two tutorials should start building up much of the mathematical core of the theory (especially around predictive coding).

Surveys

This provides a great high-level introduction to the basic ideas and intuitions of the FEP, with a small amount of crucial mathematical background.

This provides a great overview of the initial intuitions behind the FEP and its application to the brain.

This is a great review which introduces the basics of predictive coding and the FEP, including the maths, and contains sample MATLAB code. If you want to start seriously diving into the maths, I would start here.

This is a fantastic review which presents a complete walkthrough of the mathematical basis of the Free Energy Principle and variational inference, and derives predictive coding and (continuous time and state) active inference. I would recommend reading this after Bogacz's tutorial (although be prepared -- it is a long and serious read).

A detailed and clear walkthrough of discrete-state-space active inference, including detailed MATLAB code for a sample implementation.
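
If you want a quick feel for what the predictive coding maths in the tutorials above boils down to, here is a minimal sketch in the spirit of Bogacz's worked example: a single hidden cause v with a Gaussian prior, observed through a nonlinearity g(v) = v², and inferred by gradient descent on the free energy. The specific numbers are illustrative only.

```python
# Generative model: prior v ~ N(v_p, sigma_p); likelihood u ~ N(g(v), sigma_u)
v_p, sigma_p = 3.0, 1.0      # prior mean and variance over the hidden cause
sigma_u = 1.0                # observation noise variance
u = 2.0                      # the observed sensory input

g  = lambda v: v ** 2        # nonlinear mapping from cause to observation
dg = lambda v: 2.0 * v       # its derivative

phi = v_p                    # posterior estimate, initialised at the prior mean
step = 0.01                  # integration step size

for _ in range(2000):
    eps_p = (phi - v_p) / sigma_p    # precision-weighted prior prediction error
    eps_u = (u - g(phi)) / sigma_u   # precision-weighted sensory prediction error
    phi += step * (eps_u * dg(phi) - eps_p)  # gradient descent on free energy

print(phi)  # converges to roughly 1.6, the approximate posterior mode
```

The point is that inference reduces to locally computable prediction errors driving the estimate up or down, which is the basic currency of the predictive coding papers below.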

Classics

This is Karl Friston's magisterial monograph, and contains the most comprehensive description of the FEP to date.

Perhaps the earliest paper describing the FEP. Provides a great description of the fundamental intuitions behind the theory (the need of living systems to reduce their internal entropy in order to keep conditions within homeostatic bounds).

An early but complete description of predictive coding as an application of the FEP and variational inference under Gaussian and Laplace assumptions. Also surprisingly readable. This is core reading on predictive coding and the FEP.

The earliest paper (I think) on active inference. Introduces the motivation behind the continuous state and time formulation of active inference. Shows how predictive coding can be used to learn actions as well as observations (by treating them the same).

Goes deep into the neuroscientific intuitions behind why, for biological brains, you might want to think about action as a predicted observation rather than a latent variable. Presents Karl's view that action happens primarily at the periphery through simple 'reflex arcs', while all the real work is done by the generative models generating predictions.

A great interview with Karl Friston. Goes into a lot of his personal motivations underlying his work on the FEP. If you know nothing of the FEP, this is perhaps an ideal place to start, to grasp the underlying motivations of what the theory is trying to explain.

A mathematical paper by Karl Friston and Ping Ao which begins to formally flesh out the notion of desires as attractors.

Makes a conjectured link between precision in predictive coding and attention in the brain.

Presents the 'full-construct' predictive coding model with both hierarchies and generalised coordinates.

Extends predictive coding to generalised coordinates, and derives the necessary inference algorithms for working with them -- i.e. DEM, dynamic expectation maximisation.

Foundational treatment of variational inference for dynamical systems, as represented in generalised coordinates. Also relates variational filtering to other non-variational schemes like particle filtering and Kalman filtering.

Philosophical Analyses

This paper critically analyses and deconstructs various philosophical claims about what the FEP is saying. First, it argues that there is not necessarily a connection between the statistical notion of a Markov Blanket and a functional notion, meaning that an actual dynamical separation (such as a cell membrane) does not necessarily imply a statistical separation in the form of a Markov Blanket, and vice versa. Second, it demonstrates and clarifies that the FEP only makes claims about the flow of internal states on average over counterfactual realisations of the system, and therefore cannot describe the individual trajectories of a system in terms of free energy minimisation.

Self-Organisation and Markov Blankets

A heuristic demonstration of the concept that Karl would later refer to as 'Bayesian mechanics', this paper surveys the notion that any random dynamical system with the right kind of coupling among its sub-systems (i.e. a Markov Blanket) will naturally appear as if it is performing a kind of approximate Bayesian inference. This argument is motivated by appeal to the existence of a non-equilibrium steady-state density, to which the system's probability distribution converges over time.

The 'free energy' response to the Fodorian notion of 'modularity' as an explanation of functional segregation, here motivated by an appeal to the stochastic dynamics of Markov-blanketed systems. Parr et al. argue that, given a particular conditional independence structure among the components that comprise a random dynamical system, one can interpret the system and its dynamics as entertaining a mean-field factorised generative model of its local environment, rather than appealing to philosophically or otherwise unsatisfying notions such as 'modularity'.

Information Geometry

This paper gives succinct and schematic treatments of several of the main concepts explored in 'A Free Energy Principle for a Particular Physics', particularly those related to Bayesian mechanics and information geometry. This work importantly delineates some of the conditions required of a system so that its internal states approximately parameterise beliefs about external states. Fluctuation theorems are also invoked to relate the probability of trajectories, or sequences of states, to existing concepts in the active inference world, such as information gain, risk, and ambiguity resolution.

Active Inference Outline

Active Inference is a process theory of neurobiological function inspired by and closely related to the FEP. However, Active Inference stands independent of the FEP: it can be true even if the FEP is not, and it can potentially be falsified without impacting the FEP. The core idea behind Active Inference is that the brain performs both action and perception by variational inference on a unified objective function.

In effect, the key idea is that our brains possess powerful probabilistic generative models and inference engines, and that to select actions we repurpose the machinery we use for perception to also infer potential actions. Hence, Active Inference.

This high-level description leaves open the exact type of models and inference being used for action inference in the brain. The active inference literature contains three clear strands of work, which correspond to different assumptions about the exact form of generative model proposed to be utilised by the brain. Discrete active inference focuses on models of discrete state-spaces parametrised by categorical distributions and transition matrices. Continuous active inference focuses on the continuous-time case with (generally) linear dynamics. Deep active inference focuses on using deep neural networks to 'scale up' active inference by amortising probability distributions with learned maps. The discrete-state-space work has close similarities with bandit problems and neuroscience tasks, and forms a tractable test-bed for understanding different kinds of behaviour; most of the work on creating active inference models of brain function (or dysfunction) lies within this paradigm. Continuous active inference, which is being used for robot control, has close links to classical control theory, while deep active inference has close links with reinforcement learning and machine learning.
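
As a concrete illustration of the discrete-state-space strand, the sketch below shows one-step action selection by expected free energy minimisation under a categorical generative model. This is an illustrative toy under my own naming conventions, not code from any paper in this list; all matrices and numbers are invented.

```python
import numpy as np

def entropy(p, eps=1e-16):
    """Shannon entropy of a categorical distribution."""
    return -np.sum(p * np.log(p + eps))

def expected_free_energy(A, B, C, qs, eps=1e-16):
    """One-step expected free energy G(a) for each action a.

    A:  (n_obs, n_states)               likelihood p(o|s)
    B:  (n_actions, n_states, n_states) transitions p(s'|s, a)
    C:  (n_obs,)                        prior encoding preferred observations
    qs: (n_states,)                     current posterior over states
    """
    G = np.zeros(B.shape[0])
    for a in range(B.shape[0]):
        qs_next = B[a] @ qs                    # predicted next states under action a
        qo = A @ qs_next                       # predicted observations
        risk = np.sum(qo * (np.log(qo + eps) - np.log(C + eps)))  # KL[q(o|a) || C]
        ambiguity = sum(qs_next[s] * entropy(A[:, s]) for s in range(A.shape[1]))
        G[a] = risk + ambiguity
    return G

# Toy model: two states, two observations, two actions.
A = np.array([[0.9, 0.1],                      # observations are informative about states
              [0.1, 0.9]])
B = np.array([[[1.0, 1.0], [0.0, 0.0]],        # action 0 drives the world to state 0
              [[0.0, 0.0], [1.0, 1.0]]])       # action 1 drives the world to state 1
C = np.array([0.95, 0.05])                     # the agent 'prefers' observation 0
qs = np.array([0.5, 0.5])                      # uncertain initial belief over states

G = expected_free_energy(A, B, C, qs)
print(G, "-> chosen action:", np.argmin(G))    # act to minimise expected free energy
```

Here the agent picks action 0, since that action is expected to yield the preferred observation under an informative likelihood.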

Inferring actions (which requires detailed models of future outcomes given those actions) is subtly more complex than simply inferring the immediate causes of sensory data, as in perceptual inference. It therefore requires different objective functionals (the expected free energy) and potentially more advanced message-passing inference algorithms. This work is summarised in the 'Message Passing and Free Energies' section.
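
For reference, the expected free energy of a policy π is often written in the risk-plus-ambiguity form that the sketch above computes, although conventions vary across the papers below:

```math
G(\pi) = \underbrace{D_{\mathrm{KL}}\big[q(o \mid \pi)\,\|\,p(o)\big]}_{\text{risk}} + \underbrace{\mathbb{E}_{q(s \mid \pi)}\big[\mathrm{H}[p(o \mid s)]\big]}_{\text{ambiguity}}
```

where p(o) encodes preferred outcomes. Rearranging the same quantity yields the expected-information-gain ('epistemic value') term responsible for the exploratory behaviour discussed in several entries below.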

Surveys and Tutorials

This is a great and thorough tutorial on discrete-state-space active inference. I would recommend it to everybody new to the field.

Discrete State Space Formulation

Introduces the main intuitions behind active inference, as well as the crucial epistemic foraging behaviour induced by the expected free energy. Illustrated on a simple T-maze task.

The first (I think) discrete-state-space paper on active inference. Notable for using the standard variational free energy as the objective function rather than the expected free energy. Describes some of the intuitions behind active inference.

Provides a very good and thorough description of discrete-state-space active inference, and ties its updates closely to neural physiology. I would recommend this after the Da Costa introduction.

Introduces the next stage of active inference: 'sophisticated' active inference, where agents make decisions based not just on their beliefs about the future, but on how their beliefs will change in the future. This allows the simulation of genuine epistemic value -- i.e. acting so as to change your beliefs in the future.

Discusses the relationship between active inference and dynamic programming solutions to reinforcement learning problems (i.e. Q-learning, value functions, etc.). Shows that they are largely equivalent, except with different objectives (expected free energy vs expected discounted reward).

Continuous Time Formulation

The earliest paper (I think) on active inference. Introduces the motivation behind the continuous state and time formulation of active inference. Shows how predictive coding can be used to learn actions as well as observations (by treating them the same).

Active inference in plants!

Shows that active inference under a linear Gaussian generative model can replicate PID control, while also providing a natural method for learning the tuning coefficients (by understanding them as precisions).

A key step towards understanding how active inference relates to classical control theory methods such as Kalman Filters and LQR control.

Another approach to understanding how active inference relates to and extends classical control theory methods.

An excellent overview and fantastic piece of work on the linear time-invariant formulation of active inference and its relation to classical control theory.

Message Passing and Free Energies

Introduces the general factor-graph message passing viewpoint on active inference. Also introduces hierarchical active inference models.

Discusses in depth the different potential message passing inference algorithms which can be used to implement active inference on factor graphs.

Introduces the Bethe free energy, which arises from making the Bethe approximation instead of the mean-field variational assumption when deriving the message passing algorithms.

Discusses whether the expected free energy objective function can be derived on principled grounds from the FEP, and compares different potential objective functions for active inference.

Discusses the relationship between Active Inference and Control as Inference, a variational framework for understanding action selection which has emerged from RL.

Active Inference for Control Theory/Robotics

Neuroscience and Computational Psychiatry Applications

A comprehensive review of neuroscientific and computational psychiatry applications of the FEP and Active Inference.

Deep Active Inference

Demonstrates that the exploration afforded by the expected free energy objective is useful in a deep reinforcement learning setting. Also maintains uncertainty through model ensembles in a model-based RL setting.

Implements Deep Active Inference in a model-based RL setting using explicit planning with a transition model.

Implements deep active inference in a model-free policy gradient setting by amortising the learning of the expected-free-energy value function. Uses a transition model for the state-information gain term in the expected free energy.

The first paper to try combining active inference with deep neural networks. Demonstrates the importance of the exploratory terms of the EFE for solving the mountain-car problem.

Acknowledgements

Many thanks to @conorheins, Tomasz Korbak, Ryan Smith, Mel Andrews, Casper Hesp, and Manuel Baltieri for their helpful suggestions.

Contributing

To contribute, please make pull requests adding entries to the bibtex file.

The README file was generated from the BibTeX file using the bibtex_to_md.py script. The keywords to use for each classification (Survey, Discrete-state-space, etc.) can be found at the bottom of the .py file.

The code and structure of this repository are heavily inspired by https://github.com/optimass/continual_learning_papers.
