Accelerating Neural Network Evaluation

Neural networks work by activating the neurons in each row based on a linear combination of the prior row's activations, using weights determined by the network's training.

This is equivalent to the matrix operations:
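As a minimal sketch of that equivalence, one row of the network can be evaluated as a = f(Wx + b). The weight matrix `W`, bias vector `b`, sigmoid activation `f`, and the numbers used are all illustrative assumptions, not taken from this page:

```python
import math

def sigmoid(z):
    """Logistic activation; an assumed choice for illustration."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(weights, bias, inputs):
    """Evaluate one row of neurons: each output is the activation of a
    linear combination of the prior row's outputs."""
    outputs = []
    for row, b in zip(weights, bias):
        # Dot product of one weight row with the input vector, plus bias.
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(sigmoid(z))
    return outputs

# Hypothetical 2x2 example: weights, bias, and inputs are made up.
W = [[1.0, -1.0],
     [0.5,  0.5]]
b = [0.0, 0.0]
x = [2.0, 2.0]
activations = layer_forward(W, b, x)
```

Each output neuron depends only on its own weight row, which is why the rows of the product can be computed independently.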

Matrix operations are easily parallelized. Below is a table showing the number of operations needed to multiply an NxN matrix by a vector of length N, the number of cycles needed if optimally parallelized, and the number of cores to achieve optimal parallelization.

N    Operations  Parallelized Cycles  Cores Required
2    6           2                    4
4    28          3                    16
8    120         4                    64
16   496         5                    256
32   2,016       6                    1,024
64   8,128       7                    4,096
128  32,640      8                    16,384
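The table values match simple closed forms, which are inferred here from the numbers rather than stated on the page. N^2 multiplications plus N(N-1) additions give 2N^2 - N operations. With N^2 cores, all products take one cycle, and each row's N products can then be summed pairwise in log2(N) further cycles. The sketch below, using hypothetical function names, reproduces the table under those assumptions:

```python
def operation_count(n):
    """Serial cost: n*n multiplications plus n*(n-1) additions."""
    return n * n + n * (n - 1)

def parallel_cycles(n):
    """Cycles with n*n cores: one multiply cycle, then rounds of pairwise
    additions that halve the number of partial sums per row."""
    cycles = 1          # all n*n products are computed at once
    partial_sums = n    # products still to be combined in each row
    while partial_sums > 1:
        partial_sums = (partial_sums + 1) // 2  # one round of pairwise adds
        cycles += 1
    return cycles

def cores_required(n):
    """One core per matrix element, so every product is simultaneous."""
    return n * n

rows = [(n, operation_count(n), parallel_cycles(n), cores_required(n))
        for n in (2, 4, 8, 16, 32, 64, 128)]
for row in rows:
    print(*row)
```

The cycle count grows only logarithmically in N, while the core count grows quadratically, so the hardware requirement is what becomes limiting.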

Commercial CPUs top out at around 32 cores (e.g., the AMD Ryzen Threadripper 3970X). A 5x5 matrix needs 25 cores for optimal parallelization and a 6x6 matrix needs 36, so for anything larger than 5x5 we need to use GPUs, which have far more cores.
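The cutoff can be checked directly: with one core per matrix element, the largest N that fits is the integer square root of the core count. A small sketch, assuming the 32-core figure quoted above (the function name is hypothetical):

```python
import math

def largest_fully_parallel_n(cores):
    """Largest N such that an NxN matrix gets one core per element."""
    return math.isqrt(cores)

cpu_limit = largest_fully_parallel_n(32)  # 5, since 5*5 = 25 <= 32 < 36 = 6*6
```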
