We have noticed the emergence of several core architectural principles or axioms that seem necessary for machine learning systems to progress:

#### Models Are Programs, and Programs Are Transmitters

Gamalon was originally funded by one of the largest investments by DARPA and the US Federal Government in next-generation machine learning: the DARPA Probabilistic Programming for Advancing Machine Learning (PPAML) program. In the minds of most machine learning researchers, and in the press, probabilistic programming is usually described as “you write a model as a program” and then “you solve it using single-site Metropolis-Hastings MCMC.” Many leaders in the machine learning and deep learning community now embrace the tenet that **models should be programs.** This is a beautiful and important idea. The no-free-lunch theorem, however, teaches us to reject the notion of a one-size-fits-all solver for all models. We therefore apply solvers that range over, and combine elements of, automatic differentiation with stochastic gradient descent, Markov chain Monte Carlo techniques, variational methods, and evolutionary techniques. Here are some of our favorite publications that explain how a model should be a program:

**Deep Learning est mort. Vive Differentiable Programming!**
Yann LeCun

**The Design and Implementation of Probabilistic Programming Languages**
Noah D. Goodman and Andreas Stuhlmüller

**A repository for generative models**
Andreas Stuhlmüller

**Human-level concept learning through probabilistic program induction**
Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum

**Church: a language for generative models**
Noah D. Goodman, Vikash K. Mansinghka, Daniel M. Roy, Keith Bonawitz & Joshua B. Tenenbaum
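To make “write a model as a program, then solve it with Metropolis-Hastings MCMC” concrete, here is a minimal sketch of our own (not code from any of the publications above): a one-line generative model of coin flips with an unknown bias, inverted by a random-walk Metropolis-Hastings sampler over the single latent variable. The model, step size, and data are illustrative choices.

```python
import math
import random

def log_likelihood(theta, flips):
    """Log-probability of the observed flips given bias theta
    (uniform prior over (0, 1) is implicit via the support check)."""
    if not 0.0 < theta < 1.0:
        return float("-inf")
    heads = sum(flips)
    tails = len(flips) - heads
    return heads * math.log(theta) + tails * math.log(1.0 - theta)

def metropolis_hastings(flips, steps=5000, step_size=0.1, seed=0):
    """Single-site random-walk MH over the latent bias theta."""
    rng = random.Random(seed)
    theta = 0.5
    samples = []
    for _ in range(steps):
        proposal = theta + rng.gauss(0.0, step_size)
        log_accept = log_likelihood(proposal, flips) - log_likelihood(theta, flips)
        if math.log(rng.random()) < log_accept:
            theta = proposal  # accept the proposed move
        samples.append(theta)
    return samples

flips = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]  # 8 heads, 2 tails
samples = metropolis_hastings(flips)
burned_in = samples[1000:]
posterior_mean = sum(burned_in) / len(burned_in)  # roughly 0.75
```

The point of the exercise is that the *model* is just an ordinary program computing a (log-)density, while the *solver* is a separate, swappable component, which is exactly why no single solver need be privileged.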

A machine learning system can be usefully viewed as a receiver. It receives a signal and must solve an inverse problem to infer the “hidden” state of a “transmitter” – the system that generated the data we observe. **The receiver contains a model of the transmitter, which is called a “generative model.”** The more accurate and detailed this model is (whether via training or by design), the better the receiver works. Over the past 75 years, the fields of information theory and communications theory have developed a very rich set of mathematical tools within this framework; our cell phones and flash memories would not work without them. Read these:

**Information Theory, Inference and Learning Algorithms**
David J. C. MacKay

**Iterative Receiver Design**
Henk Wymeersch
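The receiver-transmitter view can be sketched in a few lines. In this toy example of our own (the channel and its noise level are assumptions, not drawn from the books above), a transmitter sends a bit through a Gaussian-noise channel, and the receiver inverts its generative model of that channel with Bayes' rule to recover a posterior over the hidden bit:

```python
import math

NOISE_STD = 0.8
PRIOR = {0: 0.5, 1: 0.5}  # receiver's prior belief over transmitted bits

def channel_likelihood(y, x):
    """The receiver's generative model of the transmitter + channel:
    p(y | x) under additive Gaussian noise (unnormalized)."""
    return math.exp(-((y - x) ** 2) / (2 * NOISE_STD ** 2))

def receive(y):
    """Solve the inverse problem: posterior over the hidden bit given y."""
    unnorm = {x: PRIOR[x] * channel_likelihood(y, x) for x in (0, 1)}
    z = sum(unnorm.values())
    return {x: p / z for x, p in unnorm.items()}

posterior = receive(0.9)  # a noisy observation close to 1
```

A more detailed channel model (a better “model of the transmitter”) would sharpen this posterior, which is the sense in which the quality of the generative model bounds the quality of the receiver.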

#### Variables Must Carry Uncertainty

Uncertainty is always present when solving complex inverse problems: there are almost always many possible answers, and you usually do not know for sure whether you have the right one. Although alternative formulations of uncertainty exist (such as Dempster-Shafer theory), we happen to think that probability theory is pretty cool. Read these:

**Probability Theory: The Logic of Science**
E. T. Jaynes

**Understanding Belief Propagation and its Generalizations**
Jonathan S. Yedidia, William T. Freeman, Yair Weiss

**An Introduction to Factor Graphs**
Hans-Andrea Loeliger

**Everything written by Judea Pearl**
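What “variables must carry uncertainty” means in practice is that a variable holds a belief, not a single value. A minimal sketch of our own (the variable, its hypotheses, and the likelihoods are invented for illustration): a discrete latent variable stored as a distribution and updated with evidence via Bayes' rule, so that even the winning hypothesis retains residual uncertainty.

```python
def normalize(belief):
    """Rescale a belief so its probabilities sum to one."""
    z = sum(belief.values())
    return {v: p / z for v, p in belief.items()}

def update(belief, likelihood):
    """Bayes update: weight each hypothesis by its likelihood, renormalize."""
    return normalize({v: p * likelihood(v) for v, p in belief.items()})

# A latent variable with three hypotheses, initially uniform.
belief = normalize({"cold": 1.0, "mild": 1.0, "hot": 1.0})

# Evidence that is much likelier under "hot".
belief = update(belief, lambda v: {"cold": 0.1, "mild": 0.3, "hot": 0.9}[v])

best = max(belief, key=belief.get)  # "hot" wins, but uncertainty survives
```

Collapsing `belief` to `best` after each step would discard exactly the information that lets later evidence revise an early wrong guess; keeping the full distribution is the point.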

#### Users Interact with “Hidden Layers” in a Model

We do not force users to program every time they would like to interact with a computer. Computer science added user interfaces to programs so that users can readily influence the control flow of a conventional program while it is running. Gamalon has pioneered user interfaces for interacting directly with the “hidden” or “latent” variables in machine learning models while inference/learning is running. We have not yet had a chance to write about this in a scientific publication (we have been a little busy lately building products), but our patents are beginning to publish:
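One way to picture the idea, purely as our own toy sketch and not Gamalon's actual mechanism: a user “clamps” the value of a latent variable mid-inference, and the inference procedure honors that edit while continuing to fill in the rest automatically. The clustering rule, variable names, and threshold below are all invented for illustration.

```python
def infer(observations, clamped=None):
    """Trivial latent-assignment inference with optional user clamping.

    Each observed item gets a hidden cluster label; if the user has
    clamped an item's label, inference keeps the user's choice and
    proceeds automatically for everything else.
    """
    clamped = clamped or {}
    assignments = {}
    for item, value in observations.items():
        if item in clamped:
            assignments[item] = clamped[item]  # honor the user's edit
        else:
            assignments[item] = "A" if value < 0.5 else "B"  # toy rule
    return assignments

obs = {"x1": 0.2, "x2": 0.7, "x3": 0.45}
auto = infer(obs)                         # fully automatic labels
edited = infer(obs, clamped={"x3": "B"})  # user overrides one latent
```

The interesting engineering is in making such edits cheap and consistent while inference keeps running; this sketch only shows the interface contract, not that machinery.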

#### Axiom #4

We see at least one more core architectural principle necessary for progress in machine learning and artificial intelligence, which we hope to talk about more in future posts.