Micro-kernels for portable and efficient matrix multiplication in deep learning

CC BY

Bibliographic Details
Main Authors: Guillermo Alaejos, Adrián Castelló, Héctor Martínez
Format: Book
Language: English
Published: Springer 2023
Online Access: https://link.springer.com/article/10.1007/s11227-022-05003-3
https://dlib.phenikaa-uni.edu.vn/handle/PNK/7328
Description: Our work exposes the structure of the template-based micro-kernels for ARM Neon (128-bit SIMD), ARM SVE (variable-length SIMD) and Intel AVX512 (512-bit SIMD), showing considerable performance on an NVIDIA Carmel processor (ARM Neon), a Fujitsu A64FX processor (ARM SVE) and an AMD EPYC 7282 processor (256-bit SIMD).
Institution: Digital Phenikaa
Collection: Digital Phenikaa
Subjects: template-based micro-kernels; AMD EPYC