Commit Graph

3 Commits

Author SHA1 Message Date
Yoan Picchi
7054378c93 Speed up truffle with 256b TBL instructions
256b wide SVE vectors allow some simplification of truffle.
Up to 40% speedup on graviton3. Going from 12500 MB/s to 17000 MB/s
onhe microbenchmark.
SVE2 also offer this capability for 128b vector with a speedup around
25% compared to normal SVE

Add unit tests and benchmark for this wide variant

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
2024-05-22 16:13:53 +00:00
Yoan Picchi
c67076ce22 Add truffle SVE implementation
Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
2024-01-09 16:50:03 +00:00
Konstantinos Margaritis
ae81088193 add arm truffle block function 2021-10-12 11:51:35 +03:00