Chang, Harry
|
dbd3f66e87
|
Reinforced Teddy with 1-byte approach, based on "shift-or" and AVX2.
|
2017-08-21 11:10:11 +10:00 |
|
Justin Viiret
|
b126cbf556
|
fdr/teddy: simplify computing of confirm base
|
2017-08-21 10:39:00 +10:00 |
|
Justin Viiret
|
4f32a167d5
|
teddy: align major structures to cachelines
|
2017-08-21 10:38:59 +10:00 |
|
Matthew Barr
|
a295c96198
|
rename vpshufb to pshufb_m256
|
2017-05-30 13:59:23 +10:00 |
|
Matthew Barr
|
8201183138
|
Check compiler architecture flags in one place
|
2017-04-26 15:18:26 +10:00 |
|
Alex Coyte
|
8af4850d85
|
remove 'fast teddy' models
|
2017-04-26 14:43:43 +10:00 |
|
Wang, Xiang W
|
df7bc22ae0
|
fdr: remove confirm split and pull-back
|
2017-04-26 14:43:09 +10:00 |
|
Justin Viiret
|
3d9a60d023
|
teddy: apply poison mask after prep_conf_ work
This simplifies the code, and removes all the all-ones p_mask uses,
which we were otherwise trusting the optimizer to remove.
|
2016-08-10 15:05:23 +10:00 |
|
Justin Viiret
|
9346a9090e
|
fdr: remove groups from struct FDR_Runtime_Args
|
2016-08-10 14:55:52 +10:00 |
|
Justin Viiret
|
42f23c2c91
|
teddy: no need to write control out at the end
|
2016-08-10 14:55:51 +10:00 |
|
Justin Viiret
|
b6a77b7329
|
teddy: remove extra control ptr
|
2016-08-10 14:55:51 +10:00 |
|
Matthew Barr
|
e3d416a6ea
|
Apply some consistency to the names we give shifts
|
2016-07-08 11:07:50 +10:00 |
|
Matthew Barr
|
1b3e795fc9
|
teddy: we only need the upper lane
Just use an extract, no need to shuffle first.
|
2016-07-08 11:07:50 +10:00 |
|
Matthew Barr
|
4d6934fc77
|
Move limex specific shuffle utils and ssse3 funcs
|
2016-07-08 11:07:50 +10:00 |
|
Mohammad Abdul Awal
|
ed772380c0
|
teddy: remove python codegen, refactor code
Major cleanup of the Teddy runtime code. Removes python code generation,
splits AVX2 models into their own file, improves readability.
|
2016-05-18 16:28:11 +10:00 |
|