help-cfengine

infinite loop in dry-run mode


From: Marion Hakanson
Subject: infinite loop in dry-run mode
Date: Tue, 24 Feb 2004 18:11:01 -0800

This problem seems to have appeared in cfengine-2.0.8p1; it also seems
to exist in 2.1.3.  We are running on the Solaris 9 SPARC platform.

We have a macro whose value is defined from the output
of a shell command, like so:

  control:
    tww_base = ( "/opt/TWWfsw/" )
    ssh_tww_root = ( ExecResult(/bin/sh -c "sed -ne blah blah"))


It is used later in a "links" section, like so:

  links:
    $(tww_base)openssh ->! $(ssh_tww_root) type=rel


Our code works perfectly when cfengine is run "for real", but as of
version 2.0.8p1, and also in 2.1.3, when run via "cfagent -n", it goes
into an infinite loop, allocating more and more memory until the machine
thrashes itself to death (or you get tired of waiting and kill it).

The problem stems from the fact that the "ssh_tww_root" macro does not get
a value when run in dry-run mode.  In addition to the fact that it is
impossible to see what such a cfengine program would do when run for real,
we also have the unpleasant side effects described above.
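
To illustrate the failure mode described above, here is a minimal C sketch
of the kind of guard that would keep an empty macro expansion from reaching
the link-building code.  It is not taken from the cfengine source: the
function name link_target_is_usable is hypothetical, and where such a check
would belong inside cfagent is an assumption.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of cfengine.
 * Returns 1 if an expanded link target contains at least one
 * non-blank character, 0 if it is NULL, empty, or only blanks.
 */
int link_target_is_usable(const char *expanded)
{
    if (expanded == NULL) {
        return 0;
    }

    /* Skip leading spaces and tabs left over from macro expansion. */
    while (*expanded == ' ' || *expanded == '\t') {
        expanded++;
    }

    /* Anything left means there is a real target to link to. */
    return *expanded != '\0';
}
```

With a check like this, a dry run could report the unset target and skip
the link instead of passing an empty string on to the path handling.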

Here's a trace from "cfagent -n -d0", showing the problem section:

  . . .
  ExpandVarstring(/opt/TWWfsw/openssh)
  ExpandVarstring()
  ExpandVarbinserv , (Binserver is cyclops)
  RelativeLink(/opt/TWWfsw/openssh,)
  CompressPath(,)
[ things stop here, until cfagent is killed ]
  cfengine:cyclops: Received signal 15 (SIGTERM) while doing [pre-lock-state]
  cfengine:cyclops: Logical start time Tue Feb 24 16:31:28 2004
  cfengine:cyclops: This sub-task started really at Tue Feb 24 16:31:30 2004


I ran the above cfagent under "truss", and found that cfagent makes
no system calls while it is looping.  I also ran it under "apptrace",
and it does not appear to make any shared-library calls, either.  Here
is where cfagent hangs/loops:

  6145:cfagent  -> libc.so.1:printf(format = 0x8fd10, ...) = 19
  6145:cfagent  -> libc.so.1:printf(format = 0x8fd28, ...) = 23
  6145:cfagent  -> libc.so.1:sscanf(0xffbfd180, 0x8fb70, 0xffbfb110)
  6145:cfagent  -> libc.so.1:strcat(dst = "", src = "") = ""
  6145:cfagent  -> libc.so.1:strlen(s = "") = 0x0
  6145:cfagent  -> libc.so.1:printf(format = 0x8c980, ...) = 35
  6145:cfagent  -> libc.so.1:printf(format = 0x8f5c4, ...) = 16
  6145:cfagent  -> libc.so.1:bzero(s = 0xffbfb110, n = 0x1000)
  6145:cfagent  -> libc.so.1:strncpy(dst = "", src = "", n = 0x0) = ""
  6145:cfagent  -> libc.so.1:strcmp(s1 = "", s2 = "/opt/TWWfsw/openssh") = 0xffffffd1
[ things stop here, until cfagent is killed, then... ]
  6145:cfagent  -> libc.so.1:snprintf(buf = 0x105e24, n = 8192, format = 0x8e8f4, ...) = 55
  6145:cfagent  -> libc.so.1:strlen(s = "Received signal 2 (S") = 0x37
  6145:cfagent  -> libc.so.1:strlen(s = "Received signal 2 (S") = 0x37
  6145:cfagent  -> libc.so.1:strlen(s = "Received signal 2 (S") = 0x37
  6145:cfagent  -> libc.so.1:strncpy(dst = "Received signal 2 (S", src = "Received signal 2 (S", n = 0x3fe) = "Received signal 2 (S"
  . . .


Does that help to track down where it's hanging/looping?
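
The traces above show no system or library calls after the strcmp of ""
against "/opt/TWWfsw/openssh", which is consistent with a loop running
entirely in cfagent's own code.  As an illustration only (this is not
cfengine's actual code, and common_dir_prefix is a hypothetical name), a
scan for the directory prefix shared by two paths, written with an index
and an explicit terminator check, stays inside its buffers even when one
path is empty:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of cfengine.
 * Returns the length of the longest common prefix of a and b that
 * ends in '/', or 0 if the two paths share no directory prefix.
 */
size_t common_dir_prefix(const char *a, const char *b)
{
    size_t i = 0;
    size_t last_sep = 0;  /* prefix length up to and including the last shared '/' */

    assert(a != NULL && b != NULL);

    /* Stop at the end of a or at the first differing character.
     * If a[i] is not NUL and equals b[i], then b[i] is not NUL either,
     * so neither string is read past its terminator. */
    while (a[i] != '\0' && a[i] == b[i]) {
        if (a[i] == '/') {
            last_sep = i + 1;
        }
        i++;
    }

    return last_sep;
}
```

For the strings in the trace, common_dir_prefix("", "/opt/TWWfsw/openssh")
returns 0 immediately, so a caller can detect the empty target and stop.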

Regards,

-- 
Marion Hakanson <hakanson@cse.ogi.edu>
CSE Computing Facilities





