December 24, 2025

MIT’s AI Framework Simulates Evolution of Vision Systems

Why did humans evolve the eyes we have today? While scientists can’t go back in time to study the environmental pressures that shaped the evolution of the diverse vision systems in nature, a new computational framework developed by MIT researchers allows them to explore this evolution in artificial intelligence agents.

The framework, described as a “scientific sandbox,” enables embodied AI agents to evolve eyes and learn to see over many generations. Researchers can recreate different evolutionary trees by altering the structure of the world and the tasks AI agents complete, such as finding food or distinguishing objects.

Exploring Evolutionary Paths

This innovative approach allows scientists to investigate why certain animals evolved simple, light-sensitive patches as eyes, while others developed complex, camera-type eyes. The researchers’ experiments demonstrate how specific tasks drove the evolution of vision systems in the agents. For example, navigation tasks often led to the evolution of compound eyes with many individual units, akin to those of insects and crustaceans.

Conversely, agents focused on object discrimination tended to evolve camera-type eyes with irises and retinas. This framework could enable scientists to explore “what-if” questions about vision systems that are challenging to study experimentally. It may also guide the design of novel sensors and cameras for robots, drones, and wearable devices, balancing performance with real-world constraints like energy efficiency and manufacturability.

“While we can never go back and figure out every detail of how evolution took place, in this work we’ve created an environment where we can, in a sense, recreate evolution and probe the environment in all these different ways. This method of doing science opens the door to a lot of possibilities,” says Kushagra Tiwary, a graduate student at the MIT Media Lab and co-lead author of a paper on this research.

Building a Scientific Sandbox

The research, appearing in Science Advances, began as a conversation among the researchers about discovering new vision systems that could be useful in various fields, such as robotics. To test their “what-if” questions, the researchers decided to use AI to explore the numerous evolutionary possibilities.

“What-if questions inspired me when I was growing up to study science. With AI, we have a unique opportunity to create these embodied agents that allow us to ask the kinds of questions that would usually be impossible to answer,” Tiwary explains.

The researchers constructed this evolutionary sandbox by converting all the elements of a camera—sensors, lenses, apertures, and processors—into parameters that an embodied AI agent could learn. These building blocks served as the starting point for an algorithmic learning mechanism that an agent would use to evolve eyes over time.
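
To make this concrete, here is a minimal, purely illustrative sketch (not the researchers’ code) of how camera elements might be exposed as tunable parameters in a toy image-formation step: the number of photoreceptors, the field of view, the aperture, and where the eye points all shape what an agent gets to observe. The one-dimensional scene model and every name below are assumptions made for this example.

```python
# Illustrative only: camera elements treated as tunable parameters in a toy
# image-formation step. The 1-D circular scene and the blur model are
# assumptions, not the framework's actual renderer.
import math


def observe(scene, num_photoreceptors, field_of_view_deg, aperture, eye_angle_deg=0.0):
    """Sample a 1-D scene (brightness values around a circle) into a small
    photoreceptor array; a wider aperture averages over more of the scene."""
    fov = math.radians(field_of_view_deg)
    start = math.radians(eye_angle_deg) - fov / 2
    pixels = []
    for i in range(num_photoreceptors):
        # Direction this photoreceptor looks toward.
        angle = start + fov * (i + 0.5) / num_photoreceptors
        idx = int((angle % (2 * math.pi)) / (2 * math.pi) * len(scene)) % len(scene)
        # In this toy pinhole model, a wider aperture blends in light from
        # neighboring directions, trading sharpness for brightness.
        spread = max(1, int(aperture * len(scene) / 50))
        window = [scene[(idx + k) % len(scene)] for k in range(-spread, spread + 1)]
        pixels.append(sum(window) / len(window))
    return pixels
```

In the actual framework, parameters like these are not hand-set but evolved across generations, while a neural network learns within each lifetime to act on the resulting observations.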

“We couldn’t simulate the entire universe atom-by-atom. It was challenging to determine which ingredients we needed, which ingredients we didn’t need, and how to allocate resources over those different elements,” notes Brian Cheung, a postdoc in the Center for Brains, Minds, and Machines and an incoming assistant professor at the University of California San Francisco.

Testing Hypotheses and Future Directions

In their framework, an evolutionary algorithm can select which elements to evolve based on environmental constraints and the agent’s task. Each environment has a specific task, such as navigation, food identification, or prey tracking, designed to mimic the real visual challenges animals must overcome to survive. Agents start with a single photoreceptor and an associated neural network model that processes visual information.
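
The overall structure resembles a standard evolutionary loop, sketched below with assumed function names rather than the paper’s interfaces: each design in a population is scored by training an agent for one lifetime on the task, the strongest designs survive, and mutated copies of them fill out the next generation.

```python
# Illustrative outer loop of evolution over a population of eye designs.
# `evaluate_lifetime` and `mutate` are stand-ins for the framework's
# reinforcement-learning phase and genetic encoding, respectively.
import random


def evolve(population, evaluate_lifetime, mutate, generations=50, survivors=4):
    for _ in range(generations):
        # Rank designs by the reward their agents earn over a lifetime of training.
        ranked = sorted(population, key=evaluate_lifetime, reverse=True)
        parents = ranked[:survivors]
        # Keep the best designs and refill the population with mutated offspring.
        population = parents + [
            mutate(random.choice(parents))
            for _ in range(len(population) - survivors)
        ]
    return max(population, key=evaluate_lifetime)
```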

Over an agent’s lifetime, it is trained using reinforcement learning—a trial-and-error technique where the agent is rewarded for accomplishing its task. The environment also incorporates constraints, such as a limited number of pixels for an agent’s visual sensors.
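
As a rough illustration of how such a budget shapes design, the toy evaluation below caps the usable photoreceptors at a fixed pixel count and rewards an agent for localizing food on a one-dimensional line. It leaves out the learned policy entirely and only shows that finer sensing pays off up to the budget and no further; the environment, constants, and reward are assumptions rather than the paper’s setup, but a function like this could play the role of `evaluate_lifetime` in the loop sketched above.

```python
# Toy lifetime evaluation under a sensor budget (illustrative only): the agent
# must localize "food" on a 1-D line, and its eye may use at most PIXEL_BUDGET
# photoreceptors, so resolution is a limited resource.
import random

PIXEL_BUDGET = 16


def lifetime_reward(num_photoreceptors, episodes=200, world_size=64):
    pixels = min(num_photoreceptors, PIXEL_BUDGET)  # enforce the hard constraint
    bin_width = world_size / pixels
    total = 0.0
    for _ in range(episodes):
        food = random.uniform(0, world_size)
        observed_bin = int(food // bin_width)          # coarse observation
        guess = (observed_bin + 0.5) * bin_width       # best estimate given the blur
        total += 1.0 - abs(guess - food) / world_size  # closer guesses score higher
    return total / episodes
```

Run through the selection loop above, designs in this toy world would tend to accumulate photoreceptors only up to the budget, since extra resolution beyond it earns no additional reward.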

“These constraints drive the design process, the same way we have physical constraints in our world, like the physics of light, that have driven the design of our own eyes,” Tiwary says.

Over many generations, agents evolve different elements of their vision systems to maximize rewards. The framework employs a genetic encoding mechanism to computationally mimic evolution, where individual genes mutate to control an agent’s development.

For instance, morphological genes capture how the agent views the environment and control eye placement; optical genes determine how the eye interacts with light and dictate the number of photoreceptors; and neural genes control the learning capacity of the agents.
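
A hedged sketch of what such an encoding might look like appears below; the field names, value ranges, and mutation step are invented for illustration. The point is the structure: each gene group is perturbed independently, so a single mutation can reposition an eye, change its optics, or adjust the capacity of the network that learns to use it.

```python
# Illustrative genome split into the three gene groups described above.
# Field names, defaults, and mutation scales are assumptions, not the
# paper's actual encoding.
from dataclasses import dataclass, field
import random


@dataclass
class MorphologicalGenes:
    eye_placement_deg: float = 0.0  # where the eye sits on the agent's body
    num_eyes: int = 1


@dataclass
class OpticalGenes:
    num_photoreceptors: int = 1  # sensor resolution; evolution can grow this
    lens_curvature: float = 0.0  # 0.0 behaves like a flat, light-sensitive patch


@dataclass
class NeuralGenes:
    hidden_units: int = 8  # learning capacity of the agent's network


@dataclass
class Genome:
    morphology: MorphologicalGenes = field(default_factory=MorphologicalGenes)
    optics: OpticalGenes = field(default_factory=OpticalGenes)
    neural: NeuralGenes = field(default_factory=NeuralGenes)

    def mutate(self) -> "Genome":
        """Return a copy with each gene group independently perturbed."""
        return Genome(
            MorphologicalGenes(
                (self.morphology.eye_placement_deg + random.gauss(0, 10)) % 360,
                max(1, self.morphology.num_eyes + random.choice([-1, 0, 1])),
            ),
            OpticalGenes(
                max(1, self.optics.num_photoreceptors + random.choice([-1, 0, 1])),
                max(0.0, self.optics.lens_curvature + random.gauss(0, 0.1)),
            ),
            NeuralGenes(max(1, self.neural.hidden_units + random.choice([-2, 0, 2]))),
        )
```

In the researchers’ experiments, it is this kind of population-level mutation and selection that drives agents toward compound eyes on navigation tasks and camera-type eyes on object discrimination.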

In future research, the team aims to use this simulator to explore the best vision systems for specific applications, potentially aiding in the development of task-specific sensors and cameras. They also plan to integrate large language models (LLMs) into their framework to make it easier to pose “what-if” questions and explore additional possibilities.

“There’s a real benefit that comes from asking questions in a more imaginative way. I hope this inspires others to create larger frameworks, where instead of focusing on narrow questions that cover a specific area, they are looking to answer questions with a much wider scope,” Cheung says.

This work was supported, in part, by the Center for Brains, Minds, and Machines and the Defense Advanced Research Projects Agency (DARPA) Mathematics for the Discovery of Algorithms and Architectures (DIAL) program.