Simulated Senses


Written by:

Robert McIntyre

1 Background

Artificial Intelligence has tried and failed for more than half a century to produce programs as flexible, creative, and "intelligent" as the human mind itself. Clearly, we are still missing some important ideas concerning intelligent programs or we would have strong AI already. What idea could be missing?

When Turing first proposed his famous "Turing Test" in the groundbreaking paper Computing Machinery and Intelligence, he gave little importance to how a computer program might interact with the world:

“We need not be too concerned about the legs, eyes, etc. The example of Miss Helen Keller shows that education can take place provided that communication in both directions between teacher and pupil can take place by some means or other.”

And from the example of Helen Keller he went on to assume that the only means of communication a fledgling AI program would need is a teletypewriter. But Helen Keller did possess vision and hearing for the first nineteen months of her life, and her tactile sense was far richer than any text stream could hope to achieve. She possessed a body she could move freely, and had continual access to the real world to learn from her actions.

I believe that our programs suffer from too little sensory input to become really intelligent. Imagine for a moment that you lived in a world completely cut off from all sensory stimulation. You have no eyes to see, no ears to hear, no mouth to speak. No body, no taste, no feeling whatsoever. The only sense you get at all is a single point of light, flickering on and off in the void. If this were your life from birth, you would never learn anything and could never become intelligent. Actual humans placed in sensory deprivation chambers experience hallucinations and can begin to lose their sense of reality. Most of the time, the programs we write are in exactly this situation. They do not interface with cameras and microphones, and they do not control a real or simulated body or interact with any sort of world.

2 Simulation vs. Reality

I want to demonstrate that multiple senses are what enable intelligence. There are two ways of experimenting with senses and computer programs:

2.1 Simulation

The first is to go entirely with simulation: virtual world, virtual character, virtual senses. The advantage is that when everything is simulated, experiments are absolutely reproducible. It is also easier to change the character and the world to explore new situations and different sensory combinations.

If the world is to be simulated on a computer, then you have to worry not only about whether the character's senses are rich enough to learn from the world, but also about whether the world itself is rendered with enough detail and realism to give the character's senses enough working material. To name just a few difficulties facing modern physics simulators: destructibility of the environment, simulation of water and other fluids, large areas, non-rigid bodies, large numbers of objects, and smoke. I don't know of any computer simulation that would allow a character to take a rock and grind it into fine dust, then use that dust to make a clay sculpture, at least not without spending years calculating the interactions of every single small grain of dust. Perhaps a simulated world with today's limitations simply doesn't provide enough richness for real intelligence to evolve.
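To make the dust example concrete: a naive rigid-body simulator must consider every pair of bodies as a potential contact, so the work per timestep grows quadratically with the number of grains. The sketch below is illustrative and not taken from any particular engine (real simulators use broad-phase culling to prune pairs, but a pile of dust defeats that, since nearly every grain really does touch its neighbors):

```java
// Sketch of why fine-grained simulation is intractable: a naive physics
// step checks every unordered pair of bodies for contact, so the number
// of candidate interactions grows as n * (n - 1) / 2.
public class PairwiseCost {
    // Number of unordered pairs among n bodies.
    static long pairs(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        // A handful of rigid objects is cheap...
        System.out.println(pairs(100));            // 4950 pair checks
        // ...but a rock ground into a billion grains of dust is not:
        System.out.println(pairs(1_000_000_000L)); // ~5 * 10^17 per timestep
    }
}
```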

2.2 Reality

The other approach to playing with senses is to hook your software up to real cameras, microphones, robots, etc., and let it loose in the real world. This has the advantage of eliminating concerns about simulating the world, at the expense of increasing the complexity of implementing the senses. Instead of just grabbing the current rendered frame for processing, you have to use an actual camera with real lenses and interact with photons to get an image. It is also much harder to change the character, which is now partly a physical robot of some sort, since doing so involves changing things in the real world instead of modifying lines of code.

While the real world is very rich and definitely provides enough stimulation for intelligence to develop, as evidenced by our own existence, it is also uncontrollable: a particular situation cannot be recreated perfectly or saved for later use. It is harder to conduct science because it is harder to repeat an experiment.

The worst thing about using the real world instead of a simulation is the matter of time. Instead of simulated time you get the constant and unstoppable flow of real time. This severely limits the sorts of software you can use to program the AI, because all sense inputs must be handled in real time. Complicated ideas may have to be implemented in hardware, or may simply be impossible given the current speed of our processors. Contrast this with a simulation, in which the flow of time in the simulated world can be slowed down to accommodate the limitations of the character's programming. In terms of cost, doing everything in software is also far cheaper than building custom real-time hardware. All you need is a laptop and some patience.
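The decoupling of simulated time from wall-clock time can be sketched as a fixed-timestep loop. The class and names below are illustrative, not from any particular engine: each tick advances the world by a constant simulated interval, so arbitrarily expensive sense-processing code merely makes the simulation run slower than real time; it never causes the character to miss input.

```java
// Sketch of a fixed-timestep simulation clock. Simulated time advances by
// a constant DT per tick, regardless of how long each tick takes to compute
// in wall-clock time. Expensive AI code only slows the simulation down
// relative to real time; the character's experience is unaffected.
public class SimClock {
    static final double DT = 1.0 / 60.0; // one simulated "frame": 60 Hz

    private double simTime = 0.0;

    // One step of the world. In a real engine, physics and the character's
    // sense processing would run here; this sketch only tracks time.
    void tick() {
        // ... arbitrarily slow sense processing could go here ...
        simTime += DT;
    }

    double simTime() {
        return simTime;
    }

    public static void main(String[] args) {
        SimClock clock = new SimClock();
        // 60 ticks = one simulated second, however long it takes to compute.
        for (int i = 0; i < 60; i++) {
            clock.tick();
        }
        System.out.println(clock.simTime()); // approximately 1.0
    }
}
```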

3 Choose a Simulation Engine

Mainly because of issues with controlling the flow of time, I chose to simulate both the world and the character. I set out to make a world in which I could embed a character with multiple senses. My main goal is to make an environment where I can perform further experiments in simulated senses.

I examined many different 3D environments to find something I could use as the base for my simulation; eventually the choice came down to three engines: the Quake II engine, the Source Engine, and jMonkeyEngine.

3.1 Quake II/Jake2

I spent a bit more than a month working with the Quake II Engine from id Software to see if I could use it for my purposes. All the source code was released by id Software under the GNU GPL several years ago, and as a result it has been ported and modified for many different purposes. The engine was famous for its advanced use of realistic shading and had decent, fast physics simulation.

Researchers at Princeton used this code to study spatial information encoding in the hippocampal cells of mice. They created a special Quake II level that simulated a maze, and added an interface where a mouse could run on top of a ball in various directions to move the character through the simulated maze. They measured hippocampal activity during this exercise to try to tease out how spatial data was stored in that area of the brain. I find this promising because if a real living mouse can interact with a computer simulation of a maze in the same way it interacts with a real-world maze, then maybe that simulation is close enough to reality that a simulated sense of vision and motor control interacting with that simulation could reveal useful information about the real thing.

There is a Java port of the original C source code called Jake2. The port demonstrates Java's OpenGL bindings and runs anywhere from 90% to 105% as fast as the C version. After reviewing much of the source of Jake2, I rejected it because the engine is too tied to the concept of a first-person shooter game. One problem was that there does not seem to be any easy way to attach multiple cameras to a single character. There are also several physics clipping issues that are corrected in a way that applies only to the main character and not to arbitrary objects. And while there is a large community of level modders, I couldn't find a community supporting the use of the engine to make new kinds of things.

3.2 Source Engine

The Source Engine evolved from the Quake I and Quake II engines and is used by Valve in the Half-Life series of games. Its physics simulation is quite accurate, probably the best of all the engines I investigated, and there is an extensive community actively working with the engine. However, applications that use the Source Engine must be written in C++, the code is not open, it runs only on Windows, and the tools that come with the SDK for handling models and textures are complicated and awkward to use.

3.3 jMonkeyEngine3

jMonkeyEngine is a new library for creating games in Java. It uses OpenGL to render to the screen and uses a scene graph to organize the world and cull objects that do not appear on the screen. It has an active community and several games in the pipeline. The engine was not built to serve any particular game, but is instead meant to be used for any 3D game. After experimenting with each of these three engines and a few others for about two months, I settled on jMonkeyEngine. I chose it because it had the most features of all the open projects I looked at, and because it let me write my code in Clojure, a dialect of Lisp that runs on the JVM.

Created: 2015-04-19 Sun 07:04
