Re: [STUMP] Interesting Bug...

stumpwm-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [STUMP] Interesting Bug...

From:	Brit Butler
Subject:	Re: [STUMP] Interesting Bug...
Date:	Wed, 27 Jan 2010 16:37:12 -0500

Ben,

Please let me apologize to you again for e-mailing you directly. It's
taken a while to remember gmail replies by default to individuals.

I decided that profiling might be a useful bit of information since
top on tty1 reveals that stumpwm consumes >90% cpu when it hangs. I
profiled the entire stumpwm package, induced a hang, aborted, called
report and unprofiled. Here are my results:


STUMPWM> (sb-profile:profile "STUMPWM")
; No value
STUMPWM> (loadrc)
; Evaluation aborted.
STUMPWM> (sb-profile:report)
WARNING:
   Function FIX-LAYOUT has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
WARNING:
   Function GET-PROC-FILE-FIELD has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
WARNING:
   Function FMT-CPU-USAGE-BAR has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
WARNING:
   Function CURRENT-CPU-USAGE has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
WARNING:
   Function FMT-CPU-TEMP has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
WARNING:
   Function FMT-CPU-USAGE has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
WARNING:
   Function FMT-CPU-FREQ has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
WARNING: Function SWANK has been redefined, so times may be inaccurate.
PROFILE it again to record calls to the new definition.
  seconds  |     consed    |  calls |  sec/call  |  name
------------------------------------------------------------
    12.259 | 2,965,863,488 |      3 |   4.086328 | RUN-PROG
     0.220 |    35,646,112 |      1 |   0.219956 | LOAD-RC-FILE
     0.180 |    27,966,576 |      3 |   0.059987 | LOAD-MODULE
     0.089 |     8,193,456 |      2 |   0.044491 | SETUP-WIN-GRAVITY
     0.022 |     2,273,024 |    220 |   0.000101 | FORMAT-EXPAND
     0.014 |       750,944 |     19 |   0.000736 | SPLIT-SEQ

I've truncated the output but you get the idea. I expect run-prog
shouldn't be taking so long per-call. I don't know about the consing.

Grepping shows it's here:
address@hidden stumpwm]$ grep "(run-prog " *.lisp
user.lisp:      (run-prog *shell-program* :args (list "-c" cmd) :wait nil)))
wrappers.lisp:    (run-prog prog :args args :output s :wait t)))

My guess is that this is being called by run-shell-command which is in
turn being called by make-screen-heads. Any thoughts? I hope this
helps some as I realize I haven't been great at helping to narrow
things down to date.

Regards,
Brit Butler

On 1/26/10, Brit Butler <address@hidden> wrote:
> Ben,
>
> Sorry again for the long gaps in my replies.
>
>> Does it happen consistently with particular applications?
>
> If you mean resizing on running a command (say lxrandr), that happens
> consistently with my terminals. If you mean not shrinking or expanding
> appropriately, that happens on all groups with all apps
> (chromium,emacs,pidgin,virtualbox). I think thunar (the file manager)
> as mentioned above is the only exception. It *seems* to resize
> properly. I'd have to check dimensions before and after to be sure.
> That's also the only group with more than one split, most of the
> others are a single fullscreen app or a single h-split (piding, for
> example).
>
>> Could you let me know what (...) looks like immediately after
>> switching...
>
> Sure. I actually have a nice little SLIME log here which might answer
> a few questions, note that evaluation aborted signals aborting the
> (loadrc) call to fix the "hang"\cpu-throttling nonsense.
>
> ; SLIME 2009-09-18
> CL-USER> (in-package :stumpwm)
> #<PACKAGE "STUMPWM">
> STUMPWM> (setf *top-level-error-action* :abort)
> :ABORT
> STUMPWM> *top-level-error-action*
> :ABORT
> STUMPWM> (screen-heads (current-screen))
> (#S(frame 0 NIL 0 0 1680 1050))
> STUMPWM> *top-level-error-action*
> :ABORT
> STUMPWM> (loadrc)
> NIL
> STUMPWM> (loadrc)
> ; Evaluation aborted.
> STUMPWM> *top-level-error-action*
> :ABORT
> STUMPWM> (setf *top-level-error-action* :break)
> :BREAK
> STUMPWM> (screen-heads (current-screen))
> (#S(frame 0 NIL 0 0 1280 800))
> STUMPWM> (loadrc)
> NIL
> STUMPWM> (loadrc)
> ; Evaluation aborted.
> STUMPWM> *top-level-error-action*
> :BREAK
> STUMPWM> (setf *top-level-error-action* :abort)
> :ABORT
> STUMPWM> (loadrc)
> ; Evaluation aborted.
> STUMPWM> (stumpwm::group-frames (current-group))
> (#S(frame 0 #S(TILE-WINDOW "screenshots - File Manager" #x800003) 0 0 640
> 800)
>  #S(frame 1 #S(TILE-WINDOW
> "address@hidden:~/builds/clbuild-dev/source/paktahn-dev"
> #x2200006) 640 0 640 400)
>  #S(frame 2 #S(TILE-WINDOW
> "address@hidden:~/projects/cl-web-apis" #x2400006) 640 400 640
> 400))
> STUMPWM> (stumpwm::group-frames (current-group))
> (#S(frame 0 #S(TILE-WINDOW "screenshots - File Manager" #x800003) 0 0 640
> 800)
>  #S(frame 1 #S(TILE-WINDOW "Display Settings" #x1800003) 640 0 640 400)
>  #S(frame 2 #S(TILE-WINDOW
> "address@hidden:~/projects/cl-web-apis" #x2400006) 640 400 640
> 400))
>
> This is interesting for two facts.
> 1) The frame dimensions are the same before and after. Maybe they're
> being resized properly and just not moved into the correct place?
> 2) Regardless of *top-level-error-action*'s setting I have to call
> (loadrc) as there is no "crash". The "hang" must be some loop that
> isn't terminating. That would fit with the high CPU usage. Calling
> (loadrc) and then aborting for some reasons kicks the loop out.
>
>> If you have the time it would be useful if you could set *debug-level* to
>> 4 and send me a log of the breakage happening.
>
> I'll try to do that this weekend, if not sooner. Again, sorry for the
> lag in my reply and thanks for your patience.
>
> Regards,
> Brit Butler
>
> On 1/23/10, Ben Spencer <address@hidden> wrote:
>> On Sat, Jan 23, 2010 at 10:18:15AM -0500, Brit Butler wrote:
>>> For example, I just switched from VGA to laptop in a group with a
>>> hsplit and a vsplit, a file manager on the left and two terminals on
>>> the right. The file manager resized immediately but the terminals
>>> didn't. Launching lxrandr in one of the terminals made it resize
>>> right away. Here's a screenshot:
>>> http://redlinernotes.com/docs/2010-01-23.png
>>
>> This behaviour is very odd.  Does it happen consistently with
>> particular applications?
>>
>> Could you let me know what (stumpwm::group-frames (current-group))
>> looks like immediately after switching (while things are still in the
>> wrong place).
>>
>> If you have the time it would be useful if you could set *debug-level*
>> to 4 and send me a log of the breakage happening.
>>
>>
>>> It's a bit embarrassing but I can't narrow this down enough to
>>> reproduce it with complete reliability. I just managed to cause the
>>> error a few times, once with *top-level-error-action* set to :break
>>> and once set to :abort but I only saw the same restarts from before.
>>>
>>> Essentially, the bug is: input becomes trapped in whatever frame I was
>>> in as stump is unresponsive, the mode-line goes blank. Switching to
>>> tty1 and running htop shows stumpwm using ~90% cpu, running emacs -nw
>>> in this tty and slime-connecting works, (in-package :stumpmw). I
>>> generally try to run (loadrc) and after a second it gives me the
>>> restart options I mailed before. Selecting abort returns the system to
>>> normal. It's worth noting that the lxrandr window doesn't close after
>>> switching screens as it should. It just stays open (and no frames
>>> resize) until I've selected a restart for the type-error.
>>
>> Hmm.  I was working on the basis that things were locking up *because*
>> you had it set to :break (which is intended to break out to the
>> debugger and thus offer you restarts).  In this case using :abort
>> should make it display an error and restart stumpwm immediately.
>>
>> So, obligatory stupid question: are you absolutely sure
>> stumpwm:*top-level-error-action* was set to :abort at the point the
>> error occurred?  I notice that your rc (which I probably should have
>> read in more detail earlier...) sets it to :abort when swank is
>> started, and swank is not started automatically.
>>
>> That said, I don't know why breaking would make it start hogging cpu,
>> so maybe I'm missing something here.
>>
>>
>> Ben
>>
>>
>> _______________________________________________
>> Stumpwm-devel mailing list
>> address@hidden
>> http://lists.nongnu.org/mailman/listinfo/stumpwm-devel
>>
>

[Prev in Thread]

Current Thread

[Next in Thread]

[STUMP] Interesting Bug..., Brit Butler, 2010/01/19
- Re: [STUMP] Interesting Bug..., Ben Spencer, 2010/01/20
  - Message not available
    - Re: [STUMP] Interesting Bug..., Ben Spencer, 2010/01/21
    - Re: [STUMP] Interesting Bug..., Brit Butler, 2010/01/21
    - Re: [STUMP] Interesting Bug..., Ben Spencer, 2010/01/21
    - Re: [STUMP] Interesting Bug..., Brit Butler, 2010/01/22
    - Re: [STUMP] Interesting Bug..., Ben Spencer, 2010/01/23
    - Re: [STUMP] Interesting Bug..., Brit Butler, 2010/01/23
    - Re: [STUMP] Interesting Bug..., Ben Spencer, 2010/01/23
    - Message not available
    - Re: [STUMP] Interesting Bug..., Brit Butler <=
    - Re: [STUMP] Interesting Bug..., Julian Stecklina, 2010/01/31
    - Message not available
    - Re: [STUMP] Interesting Bug..., Ben Spencer, 2010/01/31
    - Re: [STUMP] Interesting Bug..., Shawn Betts, 2010/01/21

Prev by Date: Re: [STUMP] Interesting Bug...
Next by Date: Re: [STUMP] Interesting Bug...
Previous by thread: Re: [STUMP] Interesting Bug...
Next by thread: Re: [STUMP] Interesting Bug...
Index(es):
- Date
- Thread