The proposed invention is a computational approach that reduces the energy cost of convolutional neural network (CNN) execution for live computer vision applications.
In real-time video, every frame differs only slightly from the previous frames, yet a generic CNN accelerator runs the same expensive computation on every frame. An alternative strategy is to process the input video as a mixture of keyframes, which undergo full, precise CNN execution, and predicted frames, which use cheaper approximate execution.
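The keyframe/predicted-frame dispatch above can be sketched in software. This is a minimal illustration only: the function names (run_full_cnn, run_cnn_suffix, warp_activation), the fixed keyframe interval, and the toy single-offset motion estimate are all hypothetical stand-ins, not part of the described design, which selects keyframes adaptively.

```python
def run_full_cnn(frame):
    # Stand-in for precise full-CNN execution on a keyframe; returns the
    # intermediate activation (to cache) and the final result.
    activation = [x * 2 for x in frame]      # fake "prefix" activation
    return activation, sum(activation)       # fake final result

def run_cnn_suffix(activation):
    # Stand-in for running only the CNN layers after the warped activation.
    return sum(activation)

def warp_activation(activation, motion):
    # Stand-in for activation warping; applies a single toy motion offset.
    return [a + motion for a in activation]

def process_stream(frames, keyframe_interval=3):
    """Dispatch each frame to full execution or cheap predicted execution."""
    results, cached_act, key_frame = [], None, None
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            key_frame = frame
            cached_act, result = run_full_cnn(frame)   # full, precise pass
        else:
            motion = frame[0] - key_frame[0]           # toy motion estimate
            result = run_cnn_suffix(warp_activation(cached_act, motion))
        results.append(result)
    return results
```

The point of the structure is that predicted frames never touch the expensive prefix layers; they reuse the cached keyframe activation.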
Keyframes are selected by a proprietary activation motion compensation (AMC) algorithm, of which the warp engine is one of the main components. For keyframes, AMC sends the unmodified pixels to the layer accelerators and invokes them to run the full CNN. For predicted frames, AMC performs activation warping and invokes the layer accelerators to compute only the CNN suffix, i.e., the layers after the warped activation. AMC's activation warping takes an old CNN activation and a motion vector field, converted from pixel coordinates to activation coordinates, that describes motion in the input image, and produces an updated activation for the last spatial CNN layer.
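Activation warping as described above can be sketched as a per-location lookup into the old activation along the motion vector field, with bilinear interpolation at non-integer sample points. This is an illustrative software model under stated assumptions: `flow` gives a (dy, dx) offset per output location, out-of-range samples clamp to the border, and all names are placeholders rather than the design's actual interfaces.

```python
def bilinear_sample(act, y, x):
    """Bilinearly interpolate a 2-D activation channel at real-valued (y, x)."""
    h, w = len(act), len(act[0])
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = act[y0][x0] * (1 - fx) + act[y0][x1] * fx
    bot = act[y1][x0] * (1 - fx) + act[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def warp_activation(act, flow):
    """Produce an updated activation by sampling the old one along `flow`."""
    h, w = len(act), len(act[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y + dy, 0), h - 1)   # clamp to activation bounds
            sx = min(max(x + dx, 0), w - 1)
            out[y][x] = bilinear_sample(act, sy, sx)
    return out
```

A zero vector field reproduces the old activation exactly; fractional offsets blend the four neighboring activation values, which is what the hardware's bilinear interpolator computes.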
The warp engine's architecture consists of four sparsity decoder lanes, a bilinear interpolator, and a min unit. Compressed activation data are loaded into the sparsity decoder lanes. The min unit reads the zero-gap in each lane and broadcasts the minimum back to all lanes, each of which decrements its zero-gap by that amount. Every lane whose zero-gap reaches zero after the subtraction then supplies its value register as input to the bilinear interpolator; the warp engine feeds these activation outputs from the sparsity decoder lanes into the interpolator's four weighting units. Once all activations for the target warped layer have been produced, the CNN accelerators can begin the CNN suffix computation.
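The min-unit synchronization of the sparsity decoder lanes can be modeled in software. In this sketch, which is an assumption-laden illustration rather than the actual hardware behavior, each lane holds a stream of (zero_gap, value) pairs, where zero_gap counts the zeros preceding the value; the names decode_lanes and the emission-schedule return format are invented for the example.

```python
def decode_lanes(lanes):
    """Simulate min-unit synchronization across sparsity decoder lanes.

    Each cycle, the min unit finds the smallest zero-gap among the lane
    heads and broadcasts it; every lane subtracts that amount, and lanes
    whose gap hits zero emit their value register (feeding the bilinear
    interpolator) and reload their next (gap, value) pair. Returns the
    per-cycle list of (lane_index, value) emissions.
    """
    state = [list(lane) for lane in lanes]   # mutable copy of each stream
    emissions = []
    while any(state):
        heads = [s[0] for s in state if s]           # non-empty lane heads
        step = min(g for g, _ in heads)              # min unit output
        emitted = []
        for i, s in enumerate(state):
            if not s:
                continue                             # lane already drained
            gap, value = s[0]
            gap -= step                              # broadcast subtraction
            if gap == 0:
                emitted.append((i, value))           # lane emits its value
                s.pop(0)                             # load next pair
            else:
                s[0] = (gap, value)
        emissions.append(emitted)
    return emissions
```

The subtraction trick lets all lanes skip runs of zeros in a single step instead of decrementing one zero at a time, which is where the energy savings of the sparse encoding come from.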
Keywords: CNN activation, classic machine learning, scientific computing, social networking, signal processing
Name: Ryan Luebke