coreutils

Re: [PATCH] cksum: Use AVX2 and AVX512 for speedup


From: Sam Russell
Subject: Re: [PATCH] cksum: Use AVX2 and AVX512 for speedup
Date: Mon, 25 Nov 2024 19:29:40 +0100

Thanks, sent key off-list

I also think I've been confusing myself: the benchmark program doesn't
check the CPU flags. I think I will need to change the logic though. Here's
the lscpu output from my Ryzen with AVX2:

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 5 5600 6-Core Processor
    CPU family:          25
    Model:               33
    Thread(s) per core:  2
    Core(s) per socket:  6
    Socket(s):           1
    Stepping:            2
    BogoMIPS:            6986.86
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
                         pge mca cmov pat pse36 clflush mmx fxsr sse sse2
                         ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm
                         constant_tsc rep_good nopl tsc_reliable
                         nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3
                         fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx
                         f16c rdrand hypervisor lahf_lm cmp_legacy
                         cr8_legacy abm sse4a misalignsse 3dnowprefetch
                         osvw topoext ibrs ibpb stibp vmmcall fsgsbase
                         bmi1 avx2 smep bmi2 erms rdseed adx smap
                         clflushopt clwb sha_ni xsaveopt xsavec xgetbv1
                         xsaves clzero xsaveerptr arat umip vaes
                         vpclmulqdq rdpid fsrm

So it does set vpclmulqdq but doesn't set any avx512 flag. Jeff's CPU has
both avx512f and vpclmulqdq, and the Skylake on EC2 has avx512f but does
NOT have vpclmulqdq. This might mean that we'll want AVX2 on any AVX2
processor with vpclmulqdq, and any AVX512 processor that does NOT have
vpclmulqdq set. Does that seem logical?
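
To double-check flag observations like the ones above without misreading
the wrapped lscpu output, here is a minimal sketch in C. It is not code
from the patch: has_flag and ryzen_flags are hypothetical names made up
for illustration, and the string is only a subset of the flags quoted
above. The helper matches whole tokens, so "avx" does not match inside
"avx2" and "pclmulqdq" does not match inside "vpclmulqdq".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not from the cksum patch): return true if NAME
   appears as a whole whitespace-separated token in FLAGS.  */
static bool
has_flag (const char *flags, const char *name)
{
  assert (flags != NULL && name != NULL);
  size_t len = strlen (name);
  if (len == 0)
    return false;

  for (const char *p = strstr (flags, name); p != NULL;
       p = strstr (p + 1, name))
    {
      /* The match must start at the beginning or after a separator...  */
      bool starts = p == flags || p[-1] == ' ' || p[-1] == '\n';
      /* ...and end at the end of the string or before a separator.  */
      bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
      if (starts && ends)
        return true;
    }
  return false;
}

/* Subset of the Ryzen 5 5600 flags from the lscpu output above.  */
static const char ryzen_flags[] =
  "pni pclmulqdq ssse3 fma sse4_1 sse4_2 aes xsave avx f16c "
  "bmi1 avx2 bmi2 vaes vpclmulqdq rdpid fsrm";
```

With this subset, has_flag (ryzen_flags, "avx2") and
has_flag (ryzen_flags, "vpclmulqdq") are true, while
has_flag (ryzen_flags, "avx512f") is false, matching what the lscpu
output shows for this machine.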

