0000000000901582

AUTHOR

Yu Qiao

showing 1 related works from this author

FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures

2020

Deep Learning is ubiquitous in a wide field of applications ranging from research to industry. In comparison to time-consuming iterative training of convolutional neural networks (CNNs), inference is a relatively lightweight operation making it amenable to execution on mobile devices. Nevertheless, lower latency and higher computation efficiency are crucial to allow for complex models and prolonged battery life. Addressing the aforementioned challenges, we propose FeatherCNN – a fast inference library for ARM CPUs – targeting the performance ceiling of mobile devices. FeatherCNN employs three key techniques: 1) A highly efficient TensorGEMM (generalized matrix multiplication) routine is app…

020203 distributed computingSource codeIterative methodComputer sciencebusiness.industrymedia_common.quotation_subjectDeep learningInference02 engineering and technologyParallel computingConvolutional neural networkMatrix multiplicationARM architectureComputational Theory and MathematicsHardware and ArchitectureSignal Processing0202 electrical engineering electronic engineering information engineeringArtificial intelligencebusinessmedia_commonIEEE Transactions on Parallel and Distributed Systems
researchProduct