guix-devel
fftw runtime cpu detection


From: Eric Bavier
Subject: fftw runtime cpu detection
Date: Thu, 5 Apr 2018 17:13:29 -0500
User-agent: Mutt/1.5.17 (2007-11-01)

Hello Guix,

I recently discovered that the FFTW library can do runtime CPU
detection.  To enable this, the package needs to be configured to
build the SIMD "codelets", as our 'fftw-avx' package currently does.
Then, based on the instruction support detected at runtime, the library
makes those kernels available to the FFTW "planner" for execution.
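
As a rough illustration of the idea (not FFTW's actual mechanism, which
queries the CPU directly), the sketch below lists which of the three
SIMD families mentioned in this message a given CPU could run.  The
helper name `available_simd` is hypothetical, and reading the flags
from /proc/cpuinfo assumes an x86 Linux system.

```shell
# Hypothetical helper: given a space-separated list of CPU flags, print the
# SIMD families (sse2, avx, avx2) that appear in that list, one per line.
available_simd() {
  for isa in sse2 avx avx2; do
    case " $1 " in
      *" $isa "*) printf '%s\n' "$isa" ;;
    esac
  done
}

# Assumption: on x86 Linux the first "flags" line of /proc/cpuinfo holds the
# CPU feature list. On other systems this prints nothing.
cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)
available_simd "$cpu_flags"
```

Only the families printed here would be candidates for the planner on
that machine; which ones it actually picks is decided at planning time.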

I tested this on two systems: 1) a system with SSE2, and 2) a system
with AVX2.  I configured the library with "--enable-sse2 --enable-avx
--enable-avx2", then ran the following on both systems:

1)
$ ./tests/bench --verbose=3 --verify 'ibcd11x7x6v10'
Planning ibcd11x7x6v10...
using plan_many_dft
estimate-planner time: 0.004355 s
using plan_many_dft
planner time: 0.035684 s
(dft-rank>=2/1
  (dft-vrank>=1-x11/1
    (dft-rank>=2/1
      (dft-vrank>=1-x7/1
        (dft-direct-6-x10 "n1bv_6_sse2"))
      (dft-direct-7-x60 "n1bv_7_sse2")))
  (dft-direct-11-x420 "n1bv_11_sse2"))
flops: 36800 add, 9700 mul, 26260 fma
estimated cost: 99057.699080, pcost = 115706.000000
ibcd11x7x6v10 4.33362e-16 7.27264e-16 8.46842e-16

2)
$ ./tests/bench --verbose=3 --verify 'ibcd11x7x6v10'
Planning ibcd11x7x6v10...
using plan_many_dft
estimate-planner time: 0.001485 s
using plan_many_dft
planner time: 0.025788 s
(dft-rank>=2/1
  (dft-rank>=2/1
    (dft-vrank>=1-x77/1
      (dft-direct-6-x10 "n1bv_6_sse2"))
    (dft-vrank>=1-x11/1
      (dft-direct-7-x60 "n1bv_7_avx")))
  (dft-direct-11-x420 "n1bv_11_avx"))
flops: 12280 add, 2810 mul, 6950 fma
estimated cost: 28996.283180, pcost = 40767.000000
ibcd11x7x6v10 2.24601e-07 3.90447e-07 2.42548e-07
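
To compare the two plans at a glance, a small filter can count the
selected codelets per SIMD suffix.  This is only a sketch: the helper
name `codelet_summary` is hypothetical, and it assumes the codelet names
appear in double quotes with the instruction set as the last
underscore-separated component, as in the output above.

```shell
# Hypothetical helper: read bench plan output on stdin and print one
# "suffix=count" line per SIMD suffix found in the quoted codelet names.
codelet_summary() {
  grep -o '"[^"]*"' |   # keep only the quoted codelet names
    tr -d '"' |         # drop the quotes
    sed 's/.*_//' |     # keep the text after the last underscore
    sort | uniq -c |    # count occurrences of each suffix
    awk '{print $2 "=" $1}'
}
```

Assuming the plan is printed on standard output, it could be used as
`./tests/bench --verbose=3 --verify 'ibcd11x7x6v10' | codelet_summary`.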


The attached patch is a WIP.

-- 
Eric Bavier, Scientific Libraries, Cray Inc.

Attachment: guix-fftw-codelets.patch
Description: Text Data

