Micro-kernels for portable and efficient matrix multiplication in deep learning
CC BY
| Main Authors: | Guillermo, Alaejos; Adrián, Castelló; Héctor, Martínez |
|---|---|
| Format: | Book |
| Language: | English |
| Published: | Springer, 2023 |
| Subjects: | template-based micro-kernels; AMD EPYC |
| Online Access: | https://link.springer.com/article/10.1007/s11227-022-05003-3 https://dlib.phenikaa-uni.edu.vn/handle/PNK/7328 |
| id |
oai:localhost:PNK-7328 |
|---|---|
| record_format |
dspace |
| spelling |
oai:localhost:PNK-7328 2023-03-30T04:06:20Z Micro-kernels for portable and efficient matrix multiplication in deep learning Guillermo, Alaejos; Adrián, Castelló; Héctor, Martínez template-based micro-kernels; AMD EPYC CC BY Our work exposes the structure of the template-based micro-kernels for ARM Neon (128-bit SIMD), ARM SVE (variable-length SIMD) and Intel AVX512 (512-bit SIMD), showing considerable performance for an NVIDIA Carmel processor (ARM Neon), a Fujitsu A64FX processor (ARM SVE) and on an AMD EPYC 7282 processor (256-bit SIMD). 2023-03-30T04:06:20Z 2023-03-30T04:06:20Z 2023 Book https://link.springer.com/article/10.1007/s11227-022-05003-3 https://dlib.phenikaa-uni.edu.vn/handle/PNK/7328 en application/pdf Springer |
| institution |
Digital Phenikaa |
| collection |
Digital Phenikaa |
| language |
English |
| topic |
template-based micro-kernels; AMD EPYC |
| spellingShingle |
template-based micro-kernels; AMD EPYC Guillermo, Alaejos; Adrián, Castelló; Héctor, Martínez Micro-kernels for portable and efficient matrix multiplication in deep learning |
| description |
CC BY |
| format |
Book |
| author |
Guillermo, Alaejos; Adrián, Castelló; Héctor, Martínez |
| author_facet |
Guillermo, Alaejos; Adrián, Castelló; Héctor, Martínez |
| author_sort |
Guillermo, Alaejos |
| title |
Micro-kernels for portable and efficient matrix multiplication in deep learning |
| title_short |
Micro-kernels for portable and efficient matrix multiplication in deep learning |
| title_full |
Micro-kernels for portable and efficient matrix multiplication in deep learning |
| title_fullStr |
Micro-kernels for portable and efficient matrix multiplication in deep learning |
| title_full_unstemmed |
Micro-kernels for portable and efficient matrix multiplication in deep learning |
| title_sort |
micro-kernels for portable and efficient matrix multiplication in deep learning |
| publisher |
Springer |
| publishDate |
2023 |
| url |
https://link.springer.com/article/10.1007/s11227-022-05003-3 https://dlib.phenikaa-uni.edu.vn/handle/PNK/7328 |
| _version_ |
1761821913614647296 |
| score |
8.893527 |
