We have noticed the emergence of several core architectural principles or axioms that seem necessary for machine learning systems to progress:

Models Are Programs, and Programs Are Transmitters

Gamalon was originally funded by one of the largest investments by DARPA and the US Federal Government in next-generation machine learning, the DARPA Probabilistic Programming for Advancing Machine Learning (PPAML) Program. In the minds of most machine learning researchers and in the press, probabilistic programming is usually described as “you write a model as a program” and then “you solve it using single-site Metropolis-Hastings MCMC.” Many leaders in the machine learning and deep learning community now embrace the tenet that models should be programs. This is a beautiful and important idea. The no-free-lunch theorem, however, teaches us to reject the notion of a one-size-fits-all solver for all models. We therefore apply a range of solvers that draw on and combine automatic differentiation with stochastic gradient descent, Markov chain Monte Carlo techniques, variational methods, and evolutionary techniques.
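
As a rough illustration of “write a model as a program, then solve it with single-site Metropolis-Hastings,” here is a minimal Python sketch. It is not Gamalon's implementation: the line-fitting model, the priors, and the names log_joint and single_site_mh are all assumptions made up for this example.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def log_joint(latents, xs, ys):
    """The model written as a program: it scores latents and data together.

    Hypothetical generative story: slope and intercept come from wide normal
    priors, then each y is drawn around slope * x + intercept with unit noise.
    """
    slope, intercept = latents["slope"], latents["intercept"]
    total = log_normal_pdf(slope, 0.0, 10.0) + log_normal_pdf(intercept, 0.0, 10.0)
    for x, y in zip(xs, ys):
        total += log_normal_pdf(y, slope * x + intercept, 1.0)
    return total


def single_site_mh(xs, ys, steps=20000, step_size=0.5, seed=0):
    """Single-site Metropolis-Hastings: perturb one latent variable per step."""
    rng = random.Random(seed)
    current = {"slope": 0.0, "intercept": 0.0}
    current_lp = log_joint(current, xs, ys)
    names = sorted(current)
    samples = []
    for _ in range(steps):
        site = rng.choice(names)  # pick the one latent to change
        proposal = dict(current)
        proposal[site] += rng.gauss(0.0, step_size)  # symmetric random-walk proposal
        proposal_lp = log_joint(proposal, xs, ys)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(dict(current))
    return samples


if __name__ == "__main__":
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    ys = [2.0 * x + 1.0 for x in xs]  # data generated by slope 2, intercept 1
    kept = single_site_mh(xs, ys)[5000:]  # discard early samples as burn-in
    print(sum(s["slope"] for s in kept) / len(kept))
    print(sum(s["intercept"] for s in kept) / len(kept))
```

The random-walk solver here is deliberately generic. It knows nothing about the model beyond its log_joint score, which is also why it is not the best choice for every model.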

A machine learning system can be usefully viewed as a receiver. It receives a signal and it must solve an inverse problem in order to infer the “hidden” state of a “transmitter” – the system that generated the data that we observe. The receiver contains a model of the transmitter which is called a “generative model.” The more accurate and detailed this model is (whether via training or by design), the better the receiver works. Over the past 75 years, the fields of information theory and communications theory have developed a very rich set of mathematical tools within this framework, and our cell phones and flash memories would not work without them.
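
The receiver view can be sketched with a toy channel. In this Python example, which rests on assumptions of our own choosing, the transmitter sends one hidden bit as a signal level of +1 or -1, repeats it, and adds Gaussian noise. The receiver then inverts that generative model with Bayes' rule. The function names and the channel itself are hypothetical.

```python
import math
import random


def transmit(bit, repeats, noise_std, rng):
    """Generative model of the transmitter: hidden bit -> noisy observations."""
    level = 1.0 if bit == 1 else -1.0
    return [level + rng.gauss(0.0, noise_std) for _ in range(repeats)]


def log_likelihood(received, bit, noise_std):
    """Log probability of the observations given the bit.

    Normalizing constants are dropped because they cancel when the two
    hypotheses are compared.
    """
    level = 1.0 if bit == 1 else -1.0
    return sum(-0.5 * ((y - level) / noise_std) ** 2 for y in received)


def posterior_bit_is_one(received, noise_std=1.0, prior_one=0.5):
    """Receiver: solve the inverse problem with Bayes' rule in log-odds form."""
    log_odds = (
        math.log(prior_one)
        - math.log(1.0 - prior_one)
        + log_likelihood(received, 1, noise_std)
        - log_likelihood(received, 0, noise_std)
    )
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    rng = random.Random(0)
    received = transmit(bit=1, repeats=5, noise_std=1.0, rng=rng)
    print(posterior_bit_is_one(received, noise_std=1.0))
```

The better the receiver's noise_std and signal levels match the real transmitter, the more trustworthy its posterior is, which is the point made above about the accuracy of the generative model.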

Variables Must Carry Uncertainty

Uncertainty is always present when solving complex inverse problems, because there are almost always many possible answers and you usually do not know for sure whether you have the right one. Although alternative formulations of uncertainty exist (such as Dempster-Shafer theory), we happen to think that probability theory is pretty cool.
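
To make “many possible answers” concrete, here is a small Python sketch of an inverse problem in which the variable of interest carries a full probability distribution rather than a single value. The two-dice setup and the function name are assumptions chosen purely for illustration.

```python
from fractions import Fraction
from itertools import product


def posterior_over_first_die(observed_sum):
    """Given only the sum of two fair dice, infer the first die.

    Returns a dict mapping each possible value of the first die to its exact
    posterior probability, found by enumerating every outcome of the model.
    """
    weights = {}
    for first, second in product(range(1, 7), repeat=2):
        if first + second == observed_sum:
            weights[first] = weights.get(first, Fraction(0)) + Fraction(1, 36)
    total = sum(weights.values())
    if total == 0:
        raise ValueError("observed_sum is impossible for two six-sided dice")
    return {value: weight / total for value, weight in weights.items()}


if __name__ == "__main__":
    for value, prob in sorted(posterior_over_first_die(4).items()):
        print(value, prob)
```

A sum of 4 leaves three equally plausible values for the first die, while a sum of 12 leaves only one. Carrying the distribution keeps that difference visible to whatever uses the variable next.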

Users Interact with “Hidden Layers” in a Model

We do not force users to program every time they would like to interact with a computer. Computer science added user interfaces to programs, so that users can readily influence the control flow of a conventional program while it is running. Gamalon has pioneered the invention of user interfaces for interaction directly with the “hidden” or “latent” variables in machine learning models while inference/learning is running. We have not had a chance to write about this for a scientific publication yet (we have been a little busy lately building products), but our patents are beginning to publish.

Axiom #4

We see at least one more core architectural principle necessary for progress in machine learning and artificial intelligence, which we hope to talk about more in future posts.



Ben Vigoda, Gamalon’s CEO, spoke recently at MIT Technology Review EmTech along with Pedro Domingos from the University of Washington, Noah Goodman from Stanford, Ruslan Salakhutdinov from Apple/CMU, Ilya Sutskever from OpenAI, Maya Gupta from Google, and Eric Horvitz from Microsoft.

He describes how deep learning and other state-of-the-art machine learning is like training a dog to provide a desired response to a stimulus – ‘ring the bell, give some food’, ‘ring the bell, give some food’, and so forth, except that with today’s machine learning you typically need to repeat this kind of labeled input/output pair 10,000 times.

By contrast, to teach a human we would just say, ‘This is a dinner bell, when I ring it I am going to serve you some food’ – you would insert that idea directly into their mind in between where the stimulus comes in and the response goes out – by talking to them. The person can still learn from stimulus-response experiences, but you can also teach them by communicating ideas to them. This is how Gamalon’s Idea Learning works.


In this video, we compare Gamalon’s new Idea Learning technology versus state-of-the-art deep learning while playing Pictionary: we draw something, and the system must guess what we drew.

We show that the Gamalon Idea Learning system learns from only a few examples, not millions. It can learn using a tablet processor, not hundreds of servers. It learns right away while we play with it, not over weeks or months. And it learns from just one person, not from thousands. Someday soon you might even have your own private machine intelligence running on your mobile device!



Our CEO, Ben Vigoda, gave a talk at TEDx Boston 2016 called “When Machines Have Ideas” that describes why building “stories” (i.e. Bayesian generative models) into machine intelligence systems can be very powerful.


Listen to Katherine Gorman interview our CEO, Ben Vigoda, on Talking Machines.