Relevant Academic Literature
We have noticed the emergence of several core architectural principles, or axioms, that seem necessary for machine learning systems to progress.
Models Are Programs and Programs Are Transmitters
Gamalon was originally funded by one of the largest investments by DARPA and the US Federal Government in next-generation machine learning, the DARPA Probabilistic Programming for Advancing Machine Learning (PPAML) program. In the minds of most machine learning researchers and in the press, probabilistic programming is usually described as "you write a model as a program" and then "you solve it using single-site Metropolis-Hastings MCMC." Many leaders in the machine learning and deep learning community now embrace the tenet that models should be programs. This is a beautiful and important idea. The no-free-lunch theorem, however, teaches us to reject the notion of a one-size-fits-all solver for all models. We therefore apply a range of solvers that combine elements of automatic differentiation with stochastic gradient descent, Markov chain Monte Carlo techniques, variational methods, and evolutionary techniques. Here are some of our favorite publications that explain how a model should be a program:
- Deep Learning est mort. Vive Differentiable Programming!
- The Design and Implementation of Probabilistic Programming Languages
Noah D. Goodman and Andreas Stuhlmüller.
- A repository for generative models
- Human-level concept learning through probabilistic program induction
Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum
- Church: a language for generative models
Noah D. Goodman, Vikash K. Mansinghka, Daniel M. Roy, Keith Bonawitz & Joshua B. Tenenbaum
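To make the "you write a model as a program, then you solve it" description above concrete, here is a minimal sketch in plain Python. It is an illustration only, not Gamalon's system or any particular probabilistic programming language: the coin-flip model, the function names (`coin_model`, `log_joint`, `metropolis_hastings`), the step size, and the sample counts are all assumptions chosen for the example. With a single latent variable, single-site Metropolis-Hastings reduces to ordinary random-walk Metropolis-Hastings.

```python
import math
import random

def coin_model(rng, n_flips):
    """Generative program: draw a hidden bias, then simulate coin flips."""
    bias = rng.random()  # prior: Uniform(0, 1)
    flips = [rng.random() < bias for _ in range(n_flips)]
    return bias, flips

def log_joint(bias, flips):
    """Log probability of one execution of coin_model (up to a constant)."""
    if not 0.0 < bias < 1.0:
        return -math.inf  # outside the support of the uniform prior
    heads = sum(flips)
    tails = len(flips) - heads
    return heads * math.log(bias) + tails * math.log(1.0 - bias)

def metropolis_hastings(flips, n_samples=20000, step=0.2, seed=0):
    """Sample the hidden bias given observed flips (random-walk proposal)."""
    rng = random.Random(seed)
    bias = 0.5
    samples = []
    for _ in range(n_samples):
        proposal = bias + rng.gauss(0.0, step)  # symmetric proposal
        log_accept = log_joint(proposal, flips) - log_joint(bias, flips)
        if rng.random() < math.exp(min(0.0, log_accept)):
            bias = proposal
        samples.append(bias)
    return samples

# Hypothetical data: 7 heads and 3 tails.
observed = [True] * 7 + [False] * 3
samples = metropolis_hastings(observed)
kept = samples[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
# For this model the exact posterior is Beta(8, 4), whose mean is 8/12.
print(f"estimated posterior mean of bias: {posterior_mean:.3f}")
```

The same `log_joint` could just as well be handed to a gradient-based or variational solver, which is the point of keeping the model and the solver separate.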
A machine learning system can be usefully viewed as a receiver. It receives a signal and must solve an inverse problem in order to infer the "hidden" state of a "transmitter," the system that generated the data that we observe. The receiver contains a model of the transmitter, which is called a "generative model." The more accurate and detailed this model is (whether via training or by design), the better the receiver works. Over the past 75 years, the fields of information theory and communications theory have developed a very rich set of mathematical tools within this framework, and our cell phones and flash memories would not work without them. Read these:
- Information Theory, Inference and Learning Algorithms
David J. C. MacKay
- Iterative Receiver Design
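The receiver view described above can be sketched with a toy channel. This is an illustrative assumption, not an example taken from the references: the transmitter maps one hidden bit to +1 or -1, repeats it, and adds Gaussian noise; the receiver holds that same generative model and inverts it with Bayes' rule. The function names and parameter values are hypothetical.

```python
import math
import random

def transmit(bit, n_repeats, noise_std, rng):
    """Transmitter: map the hidden bit to +1/-1, repeat it, add Gaussian noise."""
    symbol = 1.0 if bit == 1 else -1.0
    return [symbol + rng.gauss(0.0, noise_std) for _ in range(n_repeats)]

def posterior_bit_is_one(received, noise_std, prior_one=0.5):
    """Receiver: posterior probability that the hidden bit was 1."""
    # Log-likelihood ratio of N(y; +1, sigma^2) against N(y; -1, sigma^2)
    # is 2*y / sigma^2 per received sample; independent samples add up.
    llr = sum(2.0 * y / noise_std ** 2 for y in received)
    llr += math.log(prior_one / (1.0 - prior_one))
    # Numerically stable logistic function.
    if llr >= 0:
        return 1.0 / (1.0 + math.exp(-llr))
    z = math.exp(llr)
    return z / (1.0 + z)

rng = random.Random(0)
received = transmit(bit=1, n_repeats=20, noise_std=0.5, rng=rng)
print(f"P(bit = 1 | received) = {posterior_bit_is_one(received, 0.5):.4f}")
```

If the receiver's `noise_std` does not match the transmitter's, the posterior becomes over- or under-confident, which is one small way to see why a more accurate model of the transmitter makes a better receiver.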
Variables Must Carry Uncertainty
Uncertainty is always present when solving complex inverse problems, because there are almost always many possible answers and you usually do not know for sure whether you have the right one. Although alternative formulations of uncertainty exist (such as Dempster-Shafer theory), we happen to think that probability theory is pretty cool. Read these:
- Probability Theory: The Logic of Science
- Understanding Belief Propagation and its Generalizations
Jonathan S. Yedidia, William T. Freeman, Yair Weiss
- An Introduction to Factor Graphs
- Everything written by Judea Pearl
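As a small illustration of why an inverse problem usually has many possible answers, consider observing only the sum of two fair dice and asking what the first die showed. The sketch below is a toy example of our own choosing, not drawn from the references above; it enumerates every outcome and returns a probability distribution over the hidden value rather than a single guess. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def posterior_first_die(observed_sum):
    """Posterior over the first die, given only the sum of two fair dice."""
    weights = {}
    for first, second in product(range(1, 7), repeat=2):
        if first + second == observed_sum:
            # Each (first, second) pair has prior probability 1/36.
            weights[first] = weights.get(first, 0) + Fraction(1, 36)
    total = sum(weights.values())
    return {face: w / total for face, w in weights.items()}

for observed in (7, 11, 12):
    print(observed, posterior_first_die(observed))
```

A sum of 12 pins the hidden value down completely, a sum of 11 leaves two equally likely answers, and a sum of 7 tells us nothing at all about the first die. A variable that carries a distribution can represent all three situations.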
Users Interact with "Hidden Layers" in a Model
We do not force users to program every time they would like to interact with a computer. Computer science added user interfaces to programs, so that users can readily influence the control flow of a conventional program while it is running. Gamalon has pioneered the invention of user interfaces for interaction directly with the "hidden" or "latent" variables in machine learning models while inference/learning is running. We have not had a chance to write about this for a scientific publication yet (we have been a little busy lately building products), but our patents are beginning to publish.
We see at least one more core architectural principle necessary for progress in machine learning and artificial intelligence, which we hope to talk about more in future posts.