

Sipho, I have access to the article and have provided a relevant portion below. It's interesting that they mention the challenge of processing in GPUs, especially in edge devices, something TM directly spoke to as an area where POET can help using their platform.

 

Main

The capacity of computing systems is in an arms race with the massively growing amount of visual data they seek to understand. In a range of applications—including autonomous driving, robotic vision, smart homes, remote sensing, microscopy, surveillance, defence and the Internet of Things—computational imaging systems record and process unprecedented amounts of data that are not seen by a human but instead are interpreted by algorithms built on artificial intelligence (AI).

Across these applications, deep neural networks (DNNs) are rapidly becoming the standard algorithmic approach for visual data processing. This is primarily because DNNs achieve state-of-the-art results across the board—often by a large margin. Recent breakthroughs in deep learning have been fuelled by the immense processing power and parallelism of modern graphics processing units (GPUs) and the availability of massive visual datasets that enable DNNs to be efficiently trained using supervised machine learning strategies.

However, high-end GPUs and other accelerators running increasingly complex neural networks are hungry for power and bandwidth; they require substantial processing times and bulky form factors. These constraints make it challenging to adopt DNNs in edge devices, such as cameras, autonomous vehicles, robots or Internet of Things peripherals. Consider vision systems in autonomous cars, which have to make robust decisions instantaneously using limited computational resources. When driving at high speed, split-second decisions can decide between life or death. Indeed, virtually all edge devices would benefit from leaner computational imaging systems, offering lower latency and improvements in size, weight and power.

The computing requirements of the two stages of a DNN—training and inference—are very different. During the training stage, the DNN is fed massive amounts of labelled examples and, using iterative methods, its parameters are optimized for a specific task. Once trained, the DNN is used for inference where some input data, such as an image, is sent through the network once, in a feedforward pass, to compute the desired result. GPUs are used for inference in some applications, but for many edge devices this is impractical, owing to the aforementioned reasons.
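The asymmetry between the two stages can be made concrete with a toy example: inference is a single feedforward pass of matrix–vector products and nonlinear activations, with no iterative optimization. A minimal NumPy sketch (the layer sizes and random "pretrained" weights are illustrative, not from the paper):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def inference(x, weights):
    """Single feedforward pass: each layer is a matrix-vector
    product followed by a nonlinear activation."""
    for W in weights:
        x = relu(W @ x)
    return x

rng = np.random.default_rng(0)
# Toy "pretrained" weights for a 4 -> 8 -> 3 network (illustrative only).
weights = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
y = inference(rng.standard_normal(4), weights)
print(y.shape)  # (3,)
```

Training, by contrast, repeats many such passes while iteratively updating the entries of `weights`, which is why its hardware requirements differ so sharply from those of inference.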

Despite the flexibility of electronic AI accelerators, optical neural networks (ONNs) and photonic circuits could represent a paradigm shift in this and other machine learning applications. Optical computing systems promise massive parallelism in conjunction with small device form factors and, in some implementations, little to no power consumption. Indeed, optical interconnects that use light to achieve communications in computing systems are already widely used in data centres today, and the increasing use of optical interconnects deeper inside computing systems is probably essential for continued scaling. Unlike electrical interconnect technologies, optical interconnects offer the potential for orders of magnitude improvements in bandwidth density and in energy per bit in communications as we move to deeper integration of optics, optoelectronics and electronics. Such improved interconnects could allow hybrid electronic–optical DNNs, and the same low-energy, highly parallel integrated technologies could be used as part of analogue optical processors.

General-purpose optical computing has yet to mature into a practical technology despite the enormous potential of optical computers and about half a century of focused research efforts. However, inference tasks—especially for visual computing applications—are well suited for implementation with all-optical or hybrid optical–electronic systems. For example, linear optical elements can calculate convolutions, Fourier transforms, random projections and many other operations ‘for free’—that is, as a byproduct of light–matter interaction or light propagation. These operations are the fundamental building blocks of the DNN architectures that drive most modern visual computing algorithms. The possibility of executing these operations at the speed of light, potentially with little to no power requirements, holds transformative potential that we survey in this Perspective.
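The "for free" claim rests on identities such as the convolution theorem: a lens Fourier-transforms a coherent field optically, so a convolution reduces to a pointwise multiplication by a mask in the Fourier plane. The identity itself can be checked numerically; this is a minimal NumPy sketch of the mathematics, not a model of any specific optical setup:

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.standard_normal(64)
kernel = rng.standard_normal(64)

# Direct circular convolution, O(N^2) multiply-accumulates.
direct = np.array([sum(signal[k] * kernel[(n - k) % 64] for k in range(64))
                   for n in range(64)])

# Convolution theorem: a pointwise product in the Fourier domain,
# which optics provides as a byproduct of propagation through a lens.
via_fft = np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)).real

assert np.allclose(direct, via_fft)
```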

 

Historical overview of optical computing

Research into neuromorphic computing was intense in the 1980s. Following early pioneering work, Rumelhart, Hinton and Williams published a deeply influential paper in 1986 describing the error-backpropagation method for training multi-layer networks. Analogue implementations of neural networks emerged as a promising approach for dealing with the high computational load in training and reading large neural networks. Several analogue very-large-scale integration circuit implementations were demonstrated and, in parallel, analogue optical realizations were pursued. The first optical neural network was a modest demonstration of a fully connected network of 32 neurons with feedback. This demonstration triggered interesting new research in optical neural networks, reviewed by Denz. The next major step for optical neural networks was the introduction of dynamic nonlinear crystals for the implementation of adaptive connections between optoelectronic neurons arranged in planes. In addition to their dynamic nature, nonlinear crystals are inherently three-dimensional (3D) devices, and they allow the storage of a much larger number of weights. In an ambitious demonstration published in 1993, for example, an optical two-layer network was trained to recognize faces with very good accuracy by storing approximately 1 billion weights in a single photorefractive crystal.

Despite promising demonstrations of analogue hardware implementations, interest in custom optical hardware waned in the 1990s. There were three main reasons for this: (1) the advantages (power and speed) of the analogue accelerators are useful only for very large networks; (2) the technology for the optoelectronic implementation of the nonlinear activation function was immature; and (3) the difficulty in controlling analogue weights made it difficult to reliably control large optical networks.

The situation has changed in the intervening years. DNNs have emerged as one of the dominant algorithmic approaches for many applications. Moreover, major improvements in optoelectronics, and in silicon photonics in particular, coupled with the emergence of extremely large networks, have led many researchers to revisit the idea of implementing neural networks optically.

 

Photonic circuits for artificial intelligence

Modern DNN architectures are cascades of linear layers followed by nonlinear activation functions that repeat many times over. The most general type of linear layer is fully connected, which means that each output neuron is a weighted sum of all input neurons—a multiply–accumulate (MAC) operation. This is mathematically represented as a matrix–vector multiplication, which can be efficiently implemented in the optical domain. One specific change that has occurred since earlier optical computing work is the understanding that meshes of Mach–Zehnder interferometers (MZIs) in specific architectures (for example, those based on singular value matrix decomposition) can implement arbitrary matrix multiplication without fundamental loss; these architectures are also easily configured and controlled.

 

Specifically, recent silicon photonic neuromorphic circuits have demonstrated such singular value matrix decomposition implementations of matrix–vector products using coherent light. In this case, MZIs fabricated on a silicon chip implement the element-wise multiplications.  This design represents a truly parallel implementation of one of the most crucial building blocks of neural networks using light, and modern foundries could easily mass-fabricate this type of photonic system.
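The singular value decomposition underlying these meshes factors any matrix M into two unitary stages and one diagonal stage, each of which maps onto known photonic hardware: the unitaries onto MZI meshes, the diagonal onto per-channel amplitude modulation. A NumPy sketch of the factorization (matrix size is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))   # the weight matrix to implement
x = rng.standard_normal(4)        # the input vector (optical amplitudes)

# Singular value decomposition: M = U @ diag(s) @ Vh.
# U and Vh are unitary (realizable as MZI meshes); diag(s) is a set of
# per-channel amplitude modulations.
U, s, Vh = np.linalg.svd(M)

# Three physically realizable stages in sequence: mesh, modulators, mesh.
y = U @ (s * (Vh @ x))

assert np.allclose(y, M @ x)
```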

 

One of the challenges of such a design is that the number of MZIs grows as N² with the number of elements N in the vector, a necessary consequence of implementing an arbitrary matrix. As the size of the photonic circuits grows, losses, noise and imperfections also become larger issues. As a result, it becomes increasingly difficult to construct a sufficiently accurate model to train it on a computer. Approaches to overcoming this difficulty include designing the circuit with robustness to imperfections, automatically ‘perfecting’ the circuit, or training the photonic neuromorphic circuit in situ.
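The N² scaling can be made explicit. In Clements-style rectangular meshes, an N×N unitary requires N(N-1)/2 MZIs, so an SVD-based layer (two unitary meshes plus N amplitude modulators) needs exactly N² tunable elements. The MZI count per unitary is a standard result; the accounting below is our own illustration:

```python
def mzi_count(n):
    """Tunable elements for an SVD-based N x N linear layer:
    two Clements-style unitary meshes of N(N-1)/2 MZIs each,
    plus N per-channel amplitude modulators."""
    per_unitary = n * (n - 1) // 2
    return 2 * per_unitary + n

for n in (8, 64, 512):
    print(n, mzi_count(n))
```

Note that 2·N(N-1)/2 + N simplifies to exactly N², which is the quadratic growth the text refers to.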

 

As an alternative to MZI-based MACs, Feldmann et al. recently introduced an all-optical neurosynaptic network based on phase-change materials (PCM). In this design, PCM cells implement the weighting of the linear layer and a PCM cell coupled with a ring resonator implements a nonlinear activation function akin to a rectified linear unit (ReLU).  Micro-ring weight banks were also used by Tait et al. to implement a recurrent silicon photonic neural network.

 

Incorporating all-optical nonlinearities into photonic circuits is one of the key requirements for truly deep photonic networks. Yet, the challenge of efficiently implementing photonic nonlinear activation functions at low optical signal intensities was one of the primary reasons that interest in ONNs waned in the 1990s. Creative approaches from the last decade, such as nonlinear thresholders based on all-optical micro-ring resonators, saturable absorbers, electro-absorption modulators, or hybrid electro-optical approaches, represent possible solutions for overcoming this challenge in the near future. Earlier ‘self-electrooptic-effect’ device concepts may also offer hybrid solutions, especially with recent advances towards foundry-enabled mass fabrication of silicon-compatible versions of the energy-efficient quantum-confined Stark effect electro-absorption modulators on which they are based.
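To give a flavour of how a saturable absorber can serve as an activation function: its transmission rises from a low value toward unity as the input intensity exceeds a saturation intensity, producing a soft optical threshold. The transfer curve and parameters below are a toy illustration, not measured device values:

```python
import numpy as np

def saturable_absorber(intensity, t_min=0.1, i_sat=1.0):
    """Toy saturable-absorber transfer curve (illustrative parameters):
    transmission climbs from t_min toward 1 as intensity >> i_sat,
    so weak signals are suppressed and strong signals pass,
    a soft optical thresholding nonlinearity."""
    transmission = 1.0 - (1.0 - t_min) / (1.0 + intensity / i_sat)
    return intensity * transmission

x = np.linspace(0.0, 10.0, 5)
print(saturable_absorber(x))
```

Qualitatively this resembles the ReLU-like behaviour mentioned above: near-zero output for weak inputs, approximately linear response for strong ones.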

 

Comprehensive reviews of neuromorphic photonics and photonic MACs for neural networks were recently published. In one review, the authors provide a detailed comparison of photonic linear computing systems and their electronic counterparts, taking metrics such as energy, speed and computational density into account. The primary insight of this study was that photonic circuits exhibit advantages over electronic implementations in all of these metrics when considering large processor sizes, large vector sizes and low-precision operations.  However, the authors also point to the long-standing challenge of the high energy cost of electro–optical conversion, which is now rapidly approaching that of electronic links.

 

Photonic circuits could become fundamental building blocks of future AI systems. Although much progress has been made in the last 20 years, major challenges still lie ahead. Electronic computing platforms today offer programmability, mature and high-yield mass-fabrication technology, opportunities for 3D implementation, built-in signal restoration and gain, and robust memory solutions. Moreover, modern digital electronic systems offer high precision, which cannot be easily matched by analogue photonic systems. Yet, AI systems often do not require high precision, especially when used for inference tasks. Although programmability has traditionally been more difficult with photonic systems, first steps towards simplifying the process have recently been demonstrated.

 

Overall, the capabilities of photonic circuits have increased considerably over the last decade, and we have seen progress on some of the most crucial challenges that have hindered their utility in the past. Yet, to compete with their electronic counterparts, photonic computing systems still face fundamental engineering challenges. One direction that seems particularly well suited for optical and photonic processing is optical inference with incoherent light to rapidly process scene information under ambient lighting conditions. Such an approach presents many exciting opportunities for autonomous vehicles, robotics and computer vision.



From: Wetzstein et al.

https://www.nature.com/articles/s41586-020-2973-6

 
