Yoan Picchi 938c026256 Speed up truffle with 256b TBL instructions
256b wide SVE vectors allow some simplification of truffle.
Up to 40% speedup on Graviton3, going from 12500 MB/s to 17000 MB/s
on the microbenchmark.
SVE2 also offers this capability for 128b vectors, with a speedup of around
25% compared to normal SVE.

Add unit tests and a benchmark for this wide variant.

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
2024-05-22 16:13:53 +00:00