Micro-kernels for portable and efficient matrix multiplication in deep learning
CC BY
Main Authors: | Guillermo Alaejos, Adrián Castelló, Héctor Martínez |
---|---|
Format: | Book |
Language: | English |
Published: | Springer 2023 |
Subjects: | template-based micro-kernels AMD EPYC |
Online Access: | https://link.springer.com/article/10.1007/s11227-022-05003-3 https://dlib.phenikaa-uni.edu.vn/handle/PNK/7328 |
id |
oai:localhost:PNK-7328 |
---|---|
record_format |
dspace |
spelling |
oai:localhost:PNK-7328; 2023-03-30T04:06:20Z; Micro-kernels for portable and efficient matrix multiplication in deep learning; Guillermo Alaejos, Adrián Castelló, Héctor Martínez; template-based micro-kernels AMD EPYC; CC BY; Our work exposes the structure of the template-based micro-kernels for ARM Neon (128-bit SIMD), ARM SVE (variable-length SIMD) and Intel AVX512 (512-bit SIMD), showing considerable performance for an NVIDIA Carmel processor (ARM Neon), a Fujitsu A64FX processor (ARM SVE) and on an AMD EPYC 7282 processor (256-bit SIMD); 2023-03-30T04:06:20Z; 2023-03-30T04:06:20Z; 2023; Book; https://link.springer.com/article/10.1007/s11227-022-05003-3 https://dlib.phenikaa-uni.edu.vn/handle/PNK/7328; en; application/pdf; Springer |
institution |
Digital Phenikaa |
collection |
Digital Phenikaa |
language |
English |
topic |
template-based micro-kernels AMD EPYC |
description |
CC BY |
format |
Book |
author |
Guillermo Alaejos, Adrián Castelló, Héctor Martínez |
title |
Micro-kernels for portable and efficient matrix multiplication in deep learning |
publisher |
Springer |
publishDate |
2023 |
url |
https://link.springer.com/article/10.1007/s11227-022-05003-3 https://dlib.phenikaa-uni.edu.vn/handle/PNK/7328 |