[PATCH] cksum: Use AVX2 and AVX512 for speedup
From: Sam Russell
Subject: [PATCH] cksum: Use AVX2 and AVX512 for speedup
Date: Mon, 25 Nov 2024 17:04:20 +0100
I've added a sample benchmarking program to measure the difference without
hitting disk. The AVX2 build takes about 40% less time than the pclmul
build (3.018s vs 1.824s, roughly a 1.65x speedup):
$ time ./cksum_bench_pclmul 1048576 10000
Hash: EFA0B24F, length: 1048576
real 0m3.018s
user 0m3.018s
sys 0m0.000s
$ time ./cksum_bench_avx2 1048576 10000
Hash: EFA0B24F, length: 1048576
real 0m1.824s
user 0m1.804s
sys 0m0.020s
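
For readers without the attachment, here is a minimal sketch of what an in-memory benchmark of this kind could look like. It is not the attached cksum_bench.c: the names crc_update, cksum_buffer and bench are hypothetical, the fill pattern is arbitrary, and a slow bitwise CRC stands in for the routine under test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Bitwise POSIX cksum CRC (polynomial 0x04C11DB7, MSB first, zero
   initial value).  Slow reference code standing in for the routine
   being measured.  */
static uint32_t
crc_update (uint32_t crc, const unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      crc ^= (uint32_t) buf[i] << 24;
      for (int bit = 0; bit < 8; bit++)
        crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : crc << 1;
    }
  return crc;
}

/* cksum feeds the length after the data, least significant byte
   first, and inverts the final register.  */
static uint32_t
cksum_buffer (const unsigned char *buf, size_t len)
{
  uint32_t crc = crc_update (0, buf, len);
  for (size_t n = len; n != 0; n >>= 8)
    {
      unsigned char b = n & 0xFF;
      crc = crc_update (crc, &b, 1);
    }
  return ~crc;
}

/* Checksum a LEN-byte in-memory buffer REPS times, store the last
   hash in *HASH and return the CPU seconds used.  No file I/O takes
   place inside the timed loop.  */
static double
bench (size_t len, unsigned reps, uint32_t *hash)
{
  unsigned char *buf = malloc (len ? len : 1);
  assert (buf != NULL);
  for (size_t i = 0; i < len; i++)
    buf[i] = (unsigned char) i;   /* arbitrary fill pattern (assumption) */

  clock_t start = clock ();
  for (unsigned r = 0; r < reps; r++)
    {
      buf[0] = (unsigned char) r; /* vary the input so no pass can be elided */
      *hash = cksum_buffer (buf, len);
    }
  double secs = (double) (clock () - start) / CLOCKS_PER_SEC;

  free (buf);
  return secs;
}
```

Calling bench (1048576, 10000, &hash) would mirror the arguments used above, although the hash will not match EFA0B24F because the buffer contents differ.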
The code effectively replicates the existing pclmul code and has new
constants generated for the larger folds. The main gotcha was that the
previous CRC gets inserted at a weird offset due to endianness and byte
swapping.
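
As a scalar illustration of why the previous CRC ends up inside the data stream, the sketch below shows that for an MSB-first CRC, continuing from a previous value is equivalent to XORing that value, most significant byte first, into the first four data bytes and starting from zero. This is not the patch's vector code; the function names are hypothetical, and where the 32 bits land inside a byte-swapped 256-bit or 512-bit register is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise MSB-first CRC with the cksum polynomial 0x04C11DB7.  */
static uint32_t
crc_update (uint32_t crc, const unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      crc ^= (uint32_t) buf[i] << 24;
      for (int bit = 0; bit < 8; bit++)
        crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : crc << 1;
    }
  return crc;
}

/* Continue a CRC by folding PREV into a copy of the first four data
   bytes, big-endian, and then running the CRC from zero.  LEN must be
   between 4 and 64 in this sketch.  */
static uint32_t
crc_update_seeded_in_data (uint32_t prev, const unsigned char *buf,
                           size_t len)
{
  unsigned char tmp[64];
  assert (len >= 4 && len <= sizeof tmp);
  memcpy (tmp, buf, len);
  tmp[0] ^= (unsigned char) (prev >> 24);
  tmp[1] ^= (unsigned char) (prev >> 16);
  tmp[2] ^= (unsigned char) (prev >> 8);
  tmp[3] ^= (unsigned char) prev;
  return crc_update (0, tmp, len);
}
```

For any buffer of at least four bytes, crc_update (prev, buf, len) and crc_update_seeded_in_data (prev, buf, len) return the same value. A vector implementation that byte-swaps its loads has to place those four bytes in whichever lane position corresponds to the start of the data after the swap.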
I don't have a Skylake processor, so I spun up an AWS instance to test out
the AVX512 version. It turns out there's a bug where virtualisation
environments don't handle the AVX512 pclmul correctly despite the CPU
supporting it. It might be worth disabling this for now, as it does get
past the __builtin_cpu_supports() gate but then throws an illegal
instruction halfway through the function. It would still be nice if we
could at least validate it in the meantime.
AVX2 has been around for over 10 years, though, so this seems to be a
safer addition.
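
For context, a runtime gate of the kind mentioned above might look roughly like the following sketch. The enum, the function names and the exact feature strings tested are assumptions for illustration, not the patch's code; the builtins are the GCC/Clang x86 ones. A check like this only reports what CPUID advertises, so it cannot by itself catch the illegal-instruction case described above.

```c
#include <assert.h>

/* Hypothetical labels for the available implementations.  */
enum cksum_impl { IMPL_GENERIC, IMPL_PCLMUL, IMPL_AVX2, IMPL_AVX512 };

/* Pick the widest implementation the running CPU claims to support.
   The feature combinations tested here are an assumption.  */
static enum cksum_impl
pick_cksum_impl (void)
{
#if (defined __GNUC__ || defined __clang__) \
    && (defined __x86_64__ || defined __i386__)
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("avx512f")
      && __builtin_cpu_supports ("avx512bw")
      && __builtin_cpu_supports ("vpclmulqdq"))
    return IMPL_AVX512;
  if (__builtin_cpu_supports ("avx2")
      && __builtin_cpu_supports ("vpclmulqdq"))
    return IMPL_AVX2;
  if (__builtin_cpu_supports ("pclmul")
      && __builtin_cpu_supports ("avx"))
    return IMPL_PCLMUL;
#endif
  return IMPL_GENERIC;   /* portable fallback on other targets */
}

/* Human-readable name for logging or debug output.  */
static const char *
cksum_impl_name (enum cksum_impl impl)
{
  static const char *const names[] =
    { "generic", "pclmul", "avx2", "avx512" };
  assert ((unsigned) impl < sizeof names / sizeof names[0]);
  return names[impl];
}
```

Disabling the AVX512 path "for now" would then amount to skipping the first branch until it can be validated on real hardware.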
cksum_bench.c
Description: Text document
0001-cksum-Use-AVX2-and-AVX512-for-speedup.patch
Description: Binary data