guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: fftwf tests running for 25+ hours - is this normal?


From: Chris Marusich
Subject: Re: fftwf tests running for 25+ hours - is this normal?
Date: Tue, 15 Oct 2019 11:32:38 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Hi Bengt and Matteo,

Bengt Richter <address@hidden> writes:

> Have you checked sensors for overheating that might induce CPU clock 
> throttling?

Actually, yes, I happened to be watching dmesg output at the time, and I
did notice these messages (no similar messages have been printed since
then, which was many hours ago):

--8<---------------cut here---------------start------------->8---
[180270.045081] mce: CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 124733)
[180270.045567] mce: CPU1: Core temperature/speed normal
[180570.044352] mce: CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 134902)
[180570.044838] mce: CPU1: Core temperature/speed normal
[180875.663432] mce: CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 143897)
[180875.663918] mce: CPU1: Core temperature/speed normal
[181175.748616] mce: CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 153503)
[181175.749103] mce: CPU1: Core temperature/speed normal
[181476.496915] mce: CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 171377)
[181476.497401] mce: CPU1: Core temperature/speed normal
[181776.914264] mce: CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 185828)
[181776.914751] mce: CPU1: Core temperature/speed normal
[182076.914391] mce: CPU1: Core temperature/speed normal
[182377.322112] mce: CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 231578)
[182377.322598] mce: CPU1: Core temperature/speed normal
--8<---------------cut here---------------end--------------->8---

Is this throttling permanent, or is the throttling released after the
temperature returns to normal?

Matteo Frigo <address@hidden> writes:

> thanks for reporting this issue.  (By the way, did the test ever
> finish?)

It's still running.  It's been 38 hours now.

> Most likely, the problem is in FFTW and not in your machine.  Would you
> mind reporting the exact configure options (I care specifically about
> --enable-sse2, --enable-avx, etc.) and the exact command line that is
> taking 25h?

These two options were included in the configure invocation.

> The issue is probably related to the funny integer 34320.  The FFT
> algorithm depends on the factorization of the size, 34320 in this case,
> which is 34320=2*2*2*2*3*5*11*13.  FFTW tries to find the optimal way to
> decompose 34320 into small problems that it can solve.  For example, it
> may reduce a DFT of size 34320 into 34320/13 FFTs of size 13 followed by
> 13 FFTs of size 34320/13.  Or it may reduce a DFT of size 34320 into
> 34320/11 FFTs of size 11 followed by 11 FFTs of size 34320/11.  Because
> there are many distinct prime factors, the number of factorizations is
> really large and FFTW is pretty much trying them all.  The problem is
> even harder once you consider the fact that some sizes are odd, and thus
> they don't naturally match the vector length of SSE/AVX instructions.
>
> FFTW has some heuristics to limit the search space in cases like this,
> but these heuristics were implemented in a simpler world (before AVX)
> and maybe they are not adequate any longer, so I want to look into this.

Thank you for taking a look!  Here is how the test is being run.

Output of "pstree -lap 20098":

--8<---------------cut here---------------start------------->8---
make,20098 check -j 2
  `-bash,20099 -c fail=; \\\012if (target_option=k; case ${target_option-} in 
?) ;; *) echo "am__make_running_with_option: internal error: invalid" "target 
option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; 
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n 
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n 
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; }; 
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\\\[\\ \\\011]*) 
bs=\\\\; sane_makeflags=`printf '%s\\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs 
$bs\011]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf 
'%s\\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test 
$skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*) 
continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; 
-*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) 
strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) 
skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*) 
has_opt=yes; break;; esac; done; test $has_opt = yes); then \\\012  
failcom='fail=yes'; \\\012else \\\012  failcom='exit 1'; \\\012fi; 
\\\012dot_seen=no; \\\012target=`echo check-recursive | sed s/-recursive//`; 
\\\012case "check-recursive" in \\\012  distclean-* | maintainer-clean-*) 
list='support genfft kernel simd-support dft rdft reodft api libbench2 . 
threads tests mpi doc tools m4' ;; \\\012  *) list='support  kernel 
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;; 
\\\012esac; \\\012for subdir in $list; do \\\012  echo "Making $target in 
$subdir"; \\\012  if test "$subdir" = "."; then \\\012    dot_seen=yes; \\\012  
  local_target="$target-am"; \\\012  else \\\012    local_target="$target"; 
\\\012  fi; \\\012  (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && make  
$local_target) \\\012  || eval $failcom; \\\012done; \\\012if test "$dot_seen" 
= "no"; then \\\012  make  "$target-am" || exit 1; \\\012fi; test -z "$fail"
      `-bash,20267 -c fail=; \\\012if (target_option=k; case ${target_option-} 
in ?) ;; *) echo "am__make_running_with_option: internal error: invalid" 
"target option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; 
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n 
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n 
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; }; 
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\\\[\\ \\\011]*) 
bs=\\\\; sane_makeflags=`printf '%s\\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs 
$bs\011]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf 
'%s\\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test 
$skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*) 
continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; 
-*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) 
strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) 
skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*) 
has_opt=yes; break;; esac; done; test $has_opt = yes); then \\\012  
failcom='fail=yes'; \\\012else \\\012  failcom='exit 1'; \\\012fi; 
\\\012dot_seen=no; \\\012target=`echo check-recursive | sed s/-recursive//`; 
\\\012case "check-recursive" in \\\012  distclean-* | maintainer-clean-*) 
list='support genfft kernel simd-support dft rdft reodft api libbench2 . 
threads tests mpi doc tools m4' ;; \\\012  *) list='support  kernel 
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;; 
\\\012esac; \\\012for subdir in $list; do \\\012  echo "Making $target in 
$subdir"; \\\012  if test "$subdir" = "."; then \\\012    dot_seen=yes; \\\012  
  local_target="$target-am"; \\\012  else \\\012    local_target="$target"; 
\\\012  fi; \\\012  (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && make  
$local_target) \\\012  || eval $failcom; \\\012done; \\\012if test "$dot_seen" 
= "no"; then \\\012  make  "$target-am" || exit 1; \\\012fi; test -z "$fail"
          `-make,20268 check
              `-make,20269 check-local
                  `-perl,20431 -w ./check.pl -r -c=30 -v --nthreads=2 
/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8/tests/bench
                      `-bench,20572 -o nthreads=2 --verbose=1 --verify 
//obcd34320 --verify //ibcd34320 --verify //ofcd34320 --verify //ifcd34320 
--verify obcd34320 --verify ibcd34320 --verify ofcd34320 --verify ifcd34320 
--verify okd1200e11 --verify ikd1200e11 --verify obr9x5x6v12 --verify 
ibr9x5x6v12 --verify ofr9x5x6v12 --verify ifr9x5x6v12 --verify //obc9x5x6v12 
--verify //ibc9x5x6v12 --verify //ofc9x5x6v12 --verify //ifc9x5x6v12 --verify 
obc9x5x6v12 --verify ibc9x5x6v12 --verify ofc9x5x6v12 --verify ifc9x5x6v12 
--verify ok65e01x36e10 --verify ik65e01x36e10 --verify obr8x9x3*17 --verify 
ibr8x9x3*17 --verify ofr8x9x3*17 --verify ifr8x9x3*17 --verify //obc8x9x3*17 
--verify //ibc8x9x3*17 --verify //ofc8x9x3*17 --verify //ifc8x9x3*17 --verify 
obc8x9x3*17 --verify ibc8x9x3*17 --verify ofc8x9x3*17 --verify ifc8x9x3*17 
--verify ok7o10x28e00v21 --verify ik7o10x28e00v21 --verify obr8x4x4x5v3 
--verify ibr8x4x4x5v3 --verify ofr8x4x4x5v3 --verify ifr8x4x4x5v3 --verify 
//obc8x4x4x5v3 --verify //ibc8x4x4x5v3
                          `-{bench},20586
--8<---------------cut here---------------end--------------->8---

For some reason, the pstree output seems to have a lot of backslash
escapes in it.  That might be a little hard to read, so here is the same
thing from "ps --forest -ef", which for some reason does not have the
same number of backslash escapes:

--8<---------------cut here---------------start------------->8---
UID        PID  PPID  C STIME TTY          TIME CMD
[... omitted ...]
guixbui+ 20098 31417  0 Oct13 ?        00:00:00          \_ make check -j 2
guixbui+ 20099 20098  0 Oct13 ?        00:00:00              \_ 
/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash -c 
fail=; \ if (target_option=k; case ${target_option-} in ?) ;; *) echo 
"am__make_running_with_option: internal error: invalid" "target option 
'${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; 
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n 
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n 
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; }; 
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \?]*) bs=\\; 
sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs?]*//g"`;; 
esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed 
"s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && { 
skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt 
'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O'; 
skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l'; 
skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT]) 
skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac; 
done; test $has_opt = yes); then \   failcom='fail=yes'; \ else \   
failcom='exit 1'; \ fi; \ dot_seen=no; \ target=`echo check-recursive | sed 
s/-recursive//`; \ case "check-recursive" in \   distclean-* | 
maintainer-clean-*) list='support genfft kernel simd-support dft rdft reodft 
api libbench2 . threads tests mpi doc tools m4' ;; \   *) list='support  kernel 
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;; 
\ esac; \ for subdir in $list; do \   echo "Making $target in $subdir"; \   if 
test "$subdir" = "."; then \     dot_seen=yes; \     local_target="$target-am"; 
\   else \     local_target="$target"; \   fi; \   (CDPATH="${ZSH_VERSION+.}:" 
&& cd $subdir && make  $local_target) \   || eval $failcom; \ done; \ if test 
"$dot_seen" = "no"; then \   make  "$target-am" || exit 1; \ fi; test -z "$fail"
guixbui+ 20267 20099  0 Oct13 ?        00:00:00                  \_ 
/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash -c 
fail=; \ if (target_option=k; case ${target_option-} in ?) ;; *) echo 
"am__make_running_with_option: internal error: invalid" "target option 
'${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; 
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n 
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n 
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; }; 
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \?]*) bs=\\; 
sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs?]*//g"`;; 
esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed 
"s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && { 
skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt 
'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O'; 
skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l'; 
skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT]) 
skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac; 
done; test $has_opt = yes); then \   failcom='fail=yes'; \ else \   
failcom='exit 1'; \ fi; \ dot_seen=no; \ target=`echo check-recursive | sed 
s/-recursive//`; \ case "check-recursive" in \   distclean-* | 
maintainer-clean-*) list='support genfft kernel simd-support dft rdft reodft 
api libbench2 . threads tests mpi doc tools m4' ;; \   *) list='support  kernel 
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;; 
\ esac; \ for subdir in $list; do \   echo "Making $target in $subdir"; \   if 
test "$subdir" = "."; then \     dot_seen=yes; \     local_target="$target-am"; 
\   else \     local_target="$target"; \   fi; \   (CDPATH="${ZSH_VERSION+.}:" 
&& cd $subdir && make  $local_target) \   || eval $failcom; \ done; \ if test 
"$dot_seen" = "no"; then \   make  "$target-am" || exit 1; \ fi; test -z "$fail"
guixbui+ 20268 20267  0 Oct13 ?        00:00:00                      \_ make 
check
guixbui+ 20269 20268  0 Oct13 ?        00:00:00                          \_ 
make check-local
guixbui+ 20431 20269  0 Oct13 ?        00:00:00                              \_ 
perl -w ./check.pl -r -c=30 -v --nthreads=2 
/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8/tests/bench
guixbui+ 20572 20431 99 Oct13 ?        1-15:47:52                               
   \_ /tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8/tests/.libs/bench -o 
nthreads=2 --verbose=1 --verify //obcd34320 --verify //ibcd34320 --verify 
//ofcd34320 --verify //ifcd34320 --verify obcd34320 --verify ibcd34320 --verify 
ofcd34320 --verify ifcd34320 --verify okd1200e11 --verify ikd1200e11 --verify 
obr9x5x6v12 --verify ibr9x5x6v12 --verify ofr9x5x6v12 --verify ifr9x5x6v12 
--verify //obc9x5x6v12 --verify //ibc9x5x6v12 --verify //ofc9x5x6v12 --verify 
//ifc9x5x6v12 --verify obc9x5x6v12 --verify ibc9x5x6v12 --verify ofc9x5x6v12 
--verify ifc9x5x6v12 --verify ok65e01x36e10 --verify ik65e01x36e10 --verify 
obr8x9x3*17 --verify ibr8x9x3*17 --verify ofr8x9x3*17 --verify ifr8x9x3*17 
--verify //obc8x9x3*17 --verify //ibc8x9x3*17 --verify //ofc8x9x3*17 --verify 
//ifc8x9x3*17 --verify obc8x9x3*17 --verify ibc8x9x3*17 --verify ofc8x9x3*17 
--verify ifc8x9x3*17 --verify ok7o10x28e00v21 --verify ik7o10x28e00v21 --verify 
obr8x4x4x5v3 --verify ibr8x4x4x5v3 --verify ofr8x4x4x5v3 --verify ifr8x4x4x5v3 
--verify //obc8x4x4x5v3 --verify //ibc8x4x4x5v3
--8<---------------cut here---------------end--------------->8---

The configure invocation, according to the config.log, was:

--8<---------------cut here---------------start------------->8---
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.

It was created by fftw configure 3.3.8, which was
generated by GNU Autoconf 2.69.  Invocation command line was

  $ ./configure 
CONFIG_SHELL=/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash
 SHELL=/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash 
--prefix=/gnu/store/w21mrlm5bl2q54x8xlij0cbf4pjzxcii-fftwf-3.3.8 
--enable-fast-install --build=x86_64-unknown-linux-gnu --enable-single 
--enable-shared --enable-openmp --enable-threads --enable-sse2 --enable-avx 
--enable-avx2 --enable-avx512 --enable-avx-128-fma 
ax_cv_c_flags__mtune_native=no
--8<---------------cut here---------------end--------------->8---

I've attached the config.log (config.log.gz) to this email, since it
might be helpful.  I've also attached the build log
(fftwf-3.3.8.drv.bz2), which contains all the lines that the build logic
has printed up to this point.

Additionally, if you're familiar with Guix, you might find the package
definition to be helpful, as well.  It is defined here:

https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/algebra.scm?id=6c50e1dc0625f89884cff40b22627091efa37708#n774

To be specific, I'm building fftwf.  The fftwf package builds a
single-precision version of fftw by customizing the fftw package's
configure options a little bit.  I'm building fftwf from a custom Guix
checkout with a few personal commits unrelated to fftw, but the commit
6c50e1dc0625f89884cff40b22627091efa37708 is similar enough that you
should be able to reproduce the issue with Guix if you run "guix pull
--commit=6c50e1dc0625f89884cff40b22627091efa37708" and then run "guix
build --keep-failed fftwf".

For now, I'll keep on waiting and see what happens...  If any other info
would be helpful, let me know and I'll see what I can do!

Thank you for your help,

-- 
Chris

Attachment: config.log.gz
Description: Binary data

Attachment: fftwf-3.3.8.drv.bz2
Description: Binary data

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]