Researchers have created a fully integrated photonic processor capable of executing all of the key computations of a deep neural network using light, demonstrating the potential for faster and more energy-efficient deep learning in demanding tasks such as lidar, astronomical research, and navigation systems.
Deep neural network models, which power today’s advanced machine learning applications, have grown so large and complex that they are pushing the limits of conventional electronic computing hardware.
Photonic hardware, which performs machine-learning computations using light, offers a faster and more energy-efficient alternative. Nonetheless, some neural network operations still require off-chip electronics or other workarounds that add latency and reduce efficiency.
Building on a decade of research, scientists from MIT and other institutions have created a new photonic chip that overcomes these limitations. They demonstrated a fully integrated photonic processor capable of performing all the key computations of a deep neural network optically, on the chip itself.
This optical device completed the key computations of a machine-learning classification task in less than half a nanosecond while achieving more than 92 percent accuracy, performance comparable to traditional hardware.
The chip consists of interconnected modules that collectively form an optical neural network and is produced using standard commercial foundry techniques, potentially allowing for the technology to scale and integrate with electronic systems.
Looking ahead, this photonic processor could pave the way for faster and more energy-efficient deep learning in computation-heavy tasks such as lidar, scientific research in fields like astronomy and particle physics, and high-speed telecommunications.
“It’s not just about model performance; the speed of obtaining results is crucial in many cases. Now that we possess a complete system capable of running neural networks using optics on a nanosecond time scale, we can start to consider more advanced applications and algorithms,” notes Saumil Bandyopadhyay ’17, MEng ’18, PhD ’23, a visiting scientist in the Quantum Photonics and AI Group at the Research Laboratory of Electronics (RLE) and lead author of the published paper on this new chip.
Bandyopadhyay collaborated with Alexander Sludds ’18, MEng ’19, PhD ’23, Nicholas Harris PhD ’17, and Darius Bunandar PhD ’19; Stefan Krastanov, a former research scientist at RLE now serving as an assistant professor at the University of Massachusetts at Amherst; Ryan Hamerly, a visiting scientist at RLE and senior scientist at NTT Research; Matthew Streshinsky, who was previously the silicon photonics lead at Nokia and is now co-founder and CEO of Enosemi; Michael Hochberg, president of Periplous, LLC; and senior author Dirk Englund, a professor in the Department of Electrical Engineering and Computer Science and a principal investigator at RLE. Their findings are published today in Nature Photonics.
Harnessing Light for Machine Learning
Deep neural networks consist of many interconnected layers of nodes, or neurons, that process input data to produce an output. A key operation in these networks is matrix multiplication, a linear-algebra step that transforms the data as it passes from one layer to the next.
In addition to these linear processes, deep neural networks execute nonlinear operations that help them understand complex patterns. Nonlinear functions, such as activation functions, empower deep neural networks to tackle difficult problems effectively.
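To make the two kinds of operations concrete, here is a minimal NumPy sketch of a single network layer: a matrix multiplication followed by an elementwise nonlinearity. The names (`dense_layer`, `relu`) are illustrative, not part of the researchers' system.

```python
import numpy as np

def relu(z):
    """Elementwise nonlinear activation: keeps positive values, zeroes the rest."""
    return np.maximum(0.0, z)

def dense_layer(x, W, b):
    """One layer of a deep neural network: a linear matrix multiplication
    followed by a nonlinear activation function."""
    return relu(W @ x + b)

# Toy example: a 4-feature input passing through a layer of 3 neurons.
rng = np.random.default_rng(0)
x = rng.normal(size=4)       # input data
W = rng.normal(size=(3, 4))  # weight matrix (the linear operation)
b = np.zeros(3)              # bias vector
print(dense_layer(x, W, b))  # output that would feed the next layer
```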
In 2017, Englund’s group, along with researchers from Marin Soljačić’s lab, showcased an optical neural network on a single photonic chip capable of conducting matrix multiplication utilizing light.
However, at that stage, the device lacked the capability to execute nonlinear operations directly on the chip, necessitating the conversion of optical data into electrical signals for processing by a digital processor.
“Optical nonlinearity poses significant challenges because photons do not easily interact with one another. This makes it power-intensive to provoke optical nonlinearities, complicating the development of a scalable system,” Bandyopadhyay observes.
The team addressed this obstacle by developing devices called nonlinear optical function units (NOFUs), which merge electronics and optics to implement nonlinear operations on the chip.
The researchers constructed an optical deep neural network on a photonic chip featuring three layers of devices designed for both linear and nonlinear operations.
An Integrated Network
The system first encodes deep neural network parameters into light. A series of programmable beamsplitters, previously demonstrated in the 2017 study, carries out matrix multiplication on those inputs.
The resulting data then move to programmable NOFUs, which implement nonlinear functions by diverting a small fraction of light to photodiodes that convert optical signals into electrical currents. Remarkably, this process eliminates the need for an external amplifier and requires minimal energy.
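The sketch below is a toy, real-valued numerical model of this pipeline, assuming a brick-wall mesh of 2-by-2 beamsplitter rotations for the matrix multiplication and an invented saturating response standing in for the NOFU. The actual device manipulates complex optical amplitudes, and the published NOFU behavior differs; this is only meant to show the shape of the computation.

```python
import numpy as np

def beamsplitter(n, i, theta):
    """2x2 rotation acting on adjacent optical modes i and i+1, embedded in
    an n-mode identity. A toy real-valued stand-in for one programmable
    beamsplitter in the mesh."""
    T = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    T[i:i+2, i:i+2] = [[c, -s], [s, c]]
    return T

def mesh(n, thetas):
    """Compose a brick-wall mesh of beamsplitters into one n x n matrix,
    the optical analogue of a neural network's weight matrix."""
    M = np.eye(n)
    k = 0
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):  # alternating brick-wall layout
            M = beamsplitter(n, i, thetas[k]) @ M
            k += 1
    return M

def nofu(x, tap=0.1):
    """Caricature of a nonlinear optical function unit: a small fraction of
    the light is tapped off to a photodiode, and the resulting signal
    modulates what remains (a hypothetical saturating response)."""
    power = tap * x**2                  # photodiode sees the tapped optical power
    return (1 - tap) * x / (1 + power)  # remaining light, nonlinearly attenuated

n = 4
rng = np.random.default_rng(1)
thetas = rng.uniform(0, 2 * np.pi, size=n * (n - 1) // 2)  # programmable phases
x = rng.normal(size=n)                # input amplitudes encoded into light
y = nofu(mesh(n, thetas) @ x)         # linear optics, then on-chip nonlinearity
print(y)
```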
“We remain in the optical domain throughout the process, only converting to the electrical domain at the point we read the output. This allows us to achieve extremely low latency,” says Bandyopadhyay.
Such low latency enabled efficient training of a deep neural network on the chip, a method called in situ training, which typically demands substantial energy when done using digital hardware.
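As a rough illustration of why low latency matters here, the sketch below (reusing `mesh` and `nofu` from the previous example) estimates gradients by re-running the chip with perturbed phase settings. This finite-difference loop is a generic hardware-in-the-loop scheme, not necessarily the training procedure used in the paper; the point is that each update needs many forward passes, which is only practical when each pass takes nanoseconds.

```python
def chip_loss(thetas, x, target):
    """Task loss measured at the chip's output (squared error here)."""
    y = nofu(mesh(len(x), thetas) @ x)
    return np.sum((y - target) ** 2)

def in_situ_step(thetas, x, target, lr=0.05, eps=1e-3):
    """One hardware-in-the-loop update: nudge each programmable phase,
    re-run the low-latency chip, and estimate the gradient by central
    finite differences."""
    grad = np.zeros_like(thetas)
    for k in range(len(thetas)):
        d = np.zeros_like(thetas)
        d[k] = eps
        grad[k] = (chip_loss(thetas + d, x, target)
                   - chip_loss(thetas - d, x, target)) / (2 * eps)
    return thetas - lr * grad
```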
“This is particularly advantageous for systems processing optical signals in real-time, like navigation or telecommunications,” he adds.
The photonic system achieved over 96 percent accuracy during training tests and more than 92 percent accuracy during inference, similar to traditional hardware performance. Additionally, the chip executed crucial computations in less than half a nanosecond.
“This research illustrates that computing—essentially translating inputs to outputs—can be adapted onto new frameworks of linear and nonlinear physics, allowing for fundamentally different scaling laws in computation versus the effort required,” explains Englund.
The entire circuit was fabricated using the same infrastructure and foundry processes that produce CMOS computer chips, suggesting the chip could be mass-produced with reliable techniques that introduce very few fabrication errors.
Future work will focus on scaling the device and integrating it with practical electronics, such as cameras and telecommunications systems, Bandyopadhyay emphasizes. Additionally, the researchers aspire to investigate algorithms that can utilize the benefits of optical processing to enhance training speed and energy efficiency.
This research received funding from the National Science Foundation, the Air Force Office of Scientific Research, and NTT Research.