From: Mohammad Akhlaghi
Subject: [Reproduce-devel] [bug #56724] required installing of non-native openssh is a security bug
Date: Wed, 7 Aug 2019 05:53:42 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0

Follow-up Comment #1, bug #56724 (project reproduce):

This is a good point and I completely agree with you. I was also really
reluctant to include OpenSSH. The only reason I did so was that I hadn't
used OpenMPI and didn't know how to tell it not to use SSH.

The main problem was that OpenMPI wasn't called directly by the user! It was
loaded through a third-level dependency of Astropy: Astropy depends on h5py,
which depends on mpi4py, which depends on OpenMPI. In fact, the user who
reported this error wasn't even using OpenMPI! He just reported that loading
Astropy fails (with the error message in the P.S. below).

Since you know OpenMPI much better than I do, what would you recommend we do?
Is there an environment variable that we can set? I searched a little, but
couldn't find any.
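
For reference, this is the kind of setting I was hoping exists (a sketch
only, not verified: `OMPI_MCA_plm_rsh_agent' is my guess, based on the fact
that Open MPI reads MCA parameters from `OMPI_MCA_<parameter>' environment
variables, and `plm_rsh_agent' is the parameter named in the error below):

    # Sketch only, not verified: if Open MPI honors an environment
    # variable for the 'plm_rsh_agent' MCA parameter, pointing it at
    # an existing no-op program might stop the search for 'ssh'.
    export OMPI_MCA_plm_rsh_agent=/bin/true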

If there is no such environment variable, I can think of two suggestions:

* That we set `ssh' as a symbolic link to the host's SSH, like we do with
the `low-level-links
<http://git.savannah.nongnu.org/cgit/reproduce.git/tree/reproduce/software/make/basic.mk#n274>'
in `basic.mk'. But I am not sure how useful this will be, because we
completely clear the environment when running the script, and within the
pipeline we even re-set the HOME environment variable.

* Another trick/hack is to install `ssh' as a dummy shell script while
installing OpenMPI. Inside this `ssh' script we can put something like `echo
"SSH isn't reproducible!"'. In this way, the executable will exist and OpenMPI
won't crash, but if OpenMPI later actually needs SSH, it won't work (see the
sketch after this list).
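
As a rough sketch of this second suggestion (`$idir' is a placeholder for
the pipeline's install prefix, not an existing variable in our Makefiles):

    # Sketch only: write a dummy 'ssh' into the pipeline's program
    # directory so OpenMPI finds an executable, but any real use of
    # SSH fails loudly instead of silently contacting other machines.
    cat > $idir/bin/ssh <<'EOF'
    #!/bin/sh
    echo "SSH isn't reproducible, so it is not installed!" >&2
    exit 1
    EOF
    chmod +x $idir/bin/ssh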


P.S.

Error message from OpenMPI when trying to load Astropy:


python3 script.py input1 outer \
        21 151 \
        0.0 7.0 \
        231.0 21.0 1 \
out1 1.0
Putting child 0x226cbd0 (BDIR/out1) PID 53021 on the chain.
Live child 0x226cbd0 (BDIR/out1) PID 53021
--------------------------------------------------------------------------
The value of the MCA parameter "plm_rsh_agent" was set to a path
that could not be found:

  plm_rsh_agent: ssh : rsh

Please either unset the parameter, or check that the path is correct
--------------------------------------------------------------------------
[machine:53085] [[INVALID],INVALID] FORCE-TERMINATE AT Not found:-13 - error
plm_rsh_component.c(327)
[machine:53085] *** Process received signal ***
[machine:53085] Signal: Segmentation fault (11)
[machine:53085] Signal code: Address not mapped (1)
[machine:53085] Failing at address: (nil)
[machine:53085] [ 0] /lib64/libpthread.so.0(+0x12420)[0x15395a6ee420]
[machine:53085] *** End of error message ***
[machine:53021] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon
on the local node in file ess_singleton_module.c at line 532
[machine:53021] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon
on the local node in file ess_singleton_module.c at line 166
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value Unable to start a daemon on the local node (-127) instead
of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Unable to start a daemon on the local node" (-127) instead of
"Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[machine:53021] Local abort before MPI_INIT completed completed successfully,
but am not able to aggregate error messages, and not able to guarantee that
all other processes were killed!
Reaping losing child 0x226cbd0 PID 53021
make: *** [reproduce/analysis/make/psf_py.mk;77: BDIR/out1] Error 1
Removing child 0x226cbd0 PID 53021 from chain.


    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/bugs/?56724>

_______________________________________________
  Message sent via Savannah
  https://savannah.nongnu.org/



