linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Why is kslowd accumulating so much CPU time?
@ 2010-06-09 18:36 Theodore Ts'o
  2010-06-09 18:56 ` David Howells
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Theodore Ts'o @ 2010-06-09 18:36 UTC (permalink / raw)
  To: Dave Airlie, David Howells; +Cc: linux-kernel


My laptop is running 2.6.35-rc2 (plus a few patches) and after not quite
four hours of beeing booted (3:45 to be precise), top shows that kslowd
has accumulated as surprising amount of CPU time.  From top display:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6173 root      15  -5     0    0    0 S    6  0.0  10:42.91 kslowd001
 1219 root      15  -5     0    0    0 S    5  0.0  11:00.68 kslowd000
 1765 root      15  -5     0    0    0 S    5  0.0  10:59.20 kslowd002
 6784 root      15  -5     0    0    0 S    4  0.0  10:34.37 kslowd003
 4946 root      20   0  144m  34m  17m S    3  0.9   2:24.62 Xorg

This came up because I was trying to figure out why mouse response was
being so jerky.  There aren't that many users of slow-work, and given
that I'm not using cifs, fscache, gfs2, my suspicions are falling on
drivers/gpu/drm/drm_crtc_helper.c (Would this be something that would be
in use on a Lenovo T400 laptop?)

Is there any way to easily see what is scheduling themselves to use the
slow workqueue?

Thanks,

						- Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-09 18:36 Why is kslowd accumulating so much CPU time? Theodore Ts'o
@ 2010-06-09 18:56 ` David Howells
  2010-06-09 19:00 ` David Howells
  2010-06-13  8:23 ` Dave Airlie
  2 siblings, 0 replies; 15+ messages in thread
From: David Howells @ 2010-06-09 18:56 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: dhowells, Dave Airlie, linux-kernel

Theodore Ts'o <tytso@mit.edu> wrote:

> Is there any way to easily see what is scheduling themselves to use the
> slow workqueue?

Slow work has debugging.  Turn on:

	CONFIG_SLOW_WORK_DEBUG=y

then do:

	watch -n0 cat /sys/kernel/debug/slow_work/runqueue

David

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-09 18:36 Why is kslowd accumulating so much CPU time? Theodore Ts'o
  2010-06-09 18:56 ` David Howells
@ 2010-06-09 19:00 ` David Howells
  2010-06-13  8:23 ` Dave Airlie
  2 siblings, 0 replies; 15+ messages in thread
From: David Howells @ 2010-06-09 19:00 UTC (permalink / raw)
  Cc: dhowells, Theodore Ts'o, Dave Airlie, linux-kernel

David Howells <dhowells@redhat.com> wrote:

> > Is there any way to easily see what is scheduling themselves to use the
> > slow workqueue?
> 
> Slow work has debugging.  Turn on:
> 
> 	CONFIG_SLOW_WORK_DEBUG=y
> 
> then do:
> 
> 	watch -n0 cat /sys/kernel/debug/slow_work/runqueue

I documented this, of course.  Look in:

	Documentation/slow-work.txt

in section "VIEWING EXECUTING AND QUEUED ITEMS".

David

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-09 18:36 Why is kslowd accumulating so much CPU time? Theodore Ts'o
  2010-06-09 18:56 ` David Howells
  2010-06-09 19:00 ` David Howells
@ 2010-06-13  8:23 ` Dave Airlie
  2010-06-13 19:49   ` tytso
  2 siblings, 1 reply; 15+ messages in thread
From: Dave Airlie @ 2010-06-13  8:23 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Dave Airlie, David Howells, linux-kernel

On Thu, Jun 10, 2010 at 4:36 AM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> My laptop is running 2.6.35-rc2 (plus a few patches) and after not quite
> four hours of beeing booted (3:45 to be precise), top shows that kslowd
> has accumulated as surprising amount of CPU time.  From top display:
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  6173 root      15  -5     0    0    0 S    6  0.0  10:42.91 kslowd001
>  1219 root      15  -5     0    0    0 S    5  0.0  11:00.68 kslowd000
>  1765 root      15  -5     0    0    0 S    5  0.0  10:59.20 kslowd002
>  6784 root      15  -5     0    0    0 S    4  0.0  10:34.37 kslowd003
>  4946 root      20   0  144m  34m  17m S    3  0.9   2:24.62 Xorg
>
> This came up because I was trying to figure out why mouse response was
> being so jerky.  There aren't that many users of slow-work, and given
> that I'm not using cifs, fscache, gfs2, my suspicions are falling on
> drivers/gpu/drm/drm_crtc_helper.c (Would this be something that would be
> in use on a Lenovo T400 laptop?)
>
> Is there any way to easily see what is scheduling themselves to use the
> slow workqueue?

Its an output polling feature of KMS, however it shouldn't cause
disruption elsewhere, but we are holding a mutex in there that there
isn't a real need to hold, so I suspect I'll drop that.

What GPU is it?

Dave.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-13  8:23 ` Dave Airlie
@ 2010-06-13 19:49   ` tytso
  2010-06-13 20:00     ` Dave Airlie
  0 siblings, 1 reply; 15+ messages in thread
From: tytso @ 2010-06-13 19:49 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Dave Airlie, David Howells, linux-kernel

On Sun, Jun 13, 2010 at 06:23:45PM +1000, Dave Airlie wrote:
> 
> Its an output polling feature of KMS, however it shouldn't cause
> disruption elsewhere, but we are holding a mutex in there that there
> isn't a real need to hold, so I suspect I'll drop that.

> What GPU is it?

It's a Lenovo T400, with an Intel GPU:

(--) PCI:*(0:0:2:0) 8086:2a42:17aa:20e4 Intel Corporation Mobile 4 Series Chipse
t Integrated Graphics Controller rev 7, Mem @ 0xf4400000/4194304, 0xd0000000/268
435456, I/O @ 0x00001800/8

Why does KMS need to poll so frequently?  40 minutes of CPU time
accumulated in 4 hours of uptime translates to 16% of the CPU being
consumed by kslowd daemons, which seems... excessive.

I've seen upgraded the userspace to Ubuntu Lucid (from Karmic) and the
kernel to a post 2.6.35-rc2 (without the vt memory corruptor), and
after 4 days, 5 hours of uptime, and I'm seeing something better:

root      6390  4.7  0.0      0     0 ?        S<   Jun12  49:12 [kslowd001]
root      6784  2.4  0.0      0     0 ?        S<   Jun09 150:50 [kslowd003]
root     20047  2.7  0.0      0     0 ?        S<   Jun10 122:38 [kslowd002]
root     32324  4.6  0.0      0     0 ?        S<   Jun12  49:54 [kslowd000]

So that's 372 minutes of CPU time accumulated by kslowd in 6060
minutes, or about 6% of the CPU --- but part of tha time the laptop
was suspended --- and it still seems high.

I have the following patch which I'm going to install later tonight to
see if I can figure out if it really is drm_crtc_helper.c which is
really responsible for all of the kslowd time being burned, but an
examination of the source doesn't seem to show any other that I'm
using that would likely be using the slow workqueue.

						- Ted


commit bd69ab2e31a329ad9d660e23a1dc806ae0a3afbd
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Sat Jun 12 21:54:03 2010 -0400

    Add slow workqueue debuging for drm_crtc_helper.c

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 9b2a541..e536e80 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -860,6 +860,16 @@ static void output_poll_execute(struct slow_work *work)
 	}
 }
 
+#ifdef CONFIG_SLOW_WORK_DEBUG
+static void drm_op_desc(struct slow_work *work, struct seq_file *m)
+{
+	struct delayed_slow_work *delayed_work = container_of(work, struct delayed_slow_work, work);
+	struct drm_device *dev = container_of(delayed_work, struct drm_device, mode_config.output_poll_slow_work);
+
+	seq_printf(m, "DRM_CRTC_HELPER: %s", dev->devname);
+}
+#endif
+
 void drm_kms_helper_poll_disable(struct drm_device *dev)
 {
 	if (!dev->mode_config.poll_enabled)
@@ -917,4 +927,7 @@ EXPORT_SYMBOL(drm_helper_hpd_irq_event);
 
 static struct slow_work_ops output_poll_ops = {
 	.execute = output_poll_execute,
+#ifdef CONFIG_SLOW_WORK_DEBUG
+	.desc = drm_op_desc,
+#endif
 };


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-13 19:49   ` tytso
@ 2010-06-13 20:00     ` Dave Airlie
  2010-06-14 18:42       ` Nick Bowler
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Airlie @ 2010-06-13 20:00 UTC (permalink / raw)
  To: tytso, Dave Airlie, Dave Airlie, David Howells, linux-kernel

On Mon, Jun 14, 2010 at 5:49 AM,  <tytso@mit.edu> wrote:
> On Sun, Jun 13, 2010 at 06:23:45PM +1000, Dave Airlie wrote:
>>
>> Its an output polling feature of KMS, however it shouldn't cause
>> disruption elsewhere, but we are holding a mutex in there that there
>> isn't a real need to hold, so I suspect I'll drop that.
>
>> What GPU is it?
>
> It's a Lenovo T400, with an Intel GPU:
>
> (--) PCI:*(0:0:2:0) 8086:2a42:17aa:20e4 Intel Corporation Mobile 4 Series Chipse
> t Integrated Graphics Controller rev 7, Mem @ 0xf4400000/4194304, 0xd0000000/268
> 435456, I/O @ 0x00001800/8
>
> Why does KMS need to poll so frequently?  40 minutes of CPU time
> accumulated in 4 hours of uptime translates to 16% of the CPU being
> consumed by kslowd daemons, which seems... excessive.
>
> I've seen upgraded the userspace to Ubuntu Lucid (from Karmic) and the
> kernel to a post 2.6.35-rc2 (without the vt memory corruptor), and
> after 4 days, 5 hours of uptime, and I'm seeing something better:
>
> root      6390  4.7  0.0      0     0 ?        S<   Jun12  49:12 [kslowd001]
> root      6784  2.4  0.0      0     0 ?        S<   Jun09 150:50 [kslowd003]
> root     20047  2.7  0.0      0     0 ?        S<   Jun10 122:38 [kslowd002]
> root     32324  4.6  0.0      0     0 ?        S<   Jun12  49:54 [kslowd000]
>
> So that's 372 minutes of CPU time accumulated by kslowd in 6060
> minutes, or about 6% of the CPU --- but part of tha time the laptop
> was suspended --- and it still seems high.
>
> I have the following patch which I'm going to install later tonight to
> see if I can figure out if it really is drm_crtc_helper.c which is
> really responsible for all of the kslowd time being burned, but an
> examination of the source doesn't seem to show any other that I'm
> using that would likely be using the slow workqueue.

It most likely is, but polling shouldn't really be taking huge amounts
of CPU, unless there are some u/mdelays in there which would be bad.

In theory on Intel with hotplug irqs we shouldn't be poilling at all,
I must check why, the other thing is you could be suffering from the
hotplug irq problem that others have reported, this would cause slow
work triggers which aren't part of the normal poll cycle.

Dave.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-13 20:00     ` Dave Airlie
@ 2010-06-14 18:42       ` Nick Bowler
  2010-06-14 21:46         ` tytso
  0 siblings, 1 reply; 15+ messages in thread
From: Nick Bowler @ 2010-06-14 18:42 UTC (permalink / raw)
  To: Dave Airlie; +Cc: tytso, Dave Airlie, David Howells, linux-kernel

On 06:00 Mon 14 Jun     , Dave Airlie wrote:
> On Mon, Jun 14, 2010 at 5:49 AM,  <tytso@mit.edu> wrote:
> > It's a Lenovo T400, with an Intel GPU:
> >
[...]
> > Why does KMS need to poll so frequently?  40 minutes of CPU time
> > accumulated in 4 hours of uptime translates to 16% of the CPU being
> > consumed by kslowd daemons, which seems... excessive.
> >
[...]
> > I have the following patch which I'm going to install later tonight to
> > see if I can figure out if it really is drm_crtc_helper.c which is
> > really responsible for all of the kslowd time being burned, but an
> > examination of the source doesn't seem to show any other that I'm
> > using that would likely be using the slow workqueue.
> 
> It most likely is, but polling shouldn't really be taking huge amounts
> of CPU, unless there are some u/mdelays in there which would be bad.
> 
> In theory on Intel with hotplug irqs we shouldn't be poilling at all,
> I must check why, the other thing is you could be suffering from the
> hotplug irq problem that others have reported, this would cause slow
> work triggers which aren't part of the normal poll cycle.

This sounds exactly like the issue I've been seeing on a T500 laptop, as
well (GM45 board).  The slowdowns render the system essentially
unusable, as it can spend a loooong time just moving the mouse cursor a
few pixels on the screen.  During this time, nothing else on the display
is updating (glxgears drops to 0fps).  Things generally seem to be
working fine if I am not moving the mouse, or if I'm not running X.

I do not have this issue on a desktop machine with a G45.

Unfortunately, bisection is proving difficult because the exact set of
conditions to trigger the problems seem to be eluding me: sometimes the
kernel will work perfectly fine for quite some time, and then go
downhill from there.  However, this is definitely a regression
introduced after 2.6.35-rc1.

-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-14 18:42       ` Nick Bowler
@ 2010-06-14 21:46         ` tytso
  2010-06-15 17:25           ` Andy Whitcroft
  2010-06-15 18:07           ` David Howells
  0 siblings, 2 replies; 15+ messages in thread
From: tytso @ 2010-06-14 21:46 UTC (permalink / raw)
  To: Dave Airlie, Dave Airlie, David Howells, linux-kernel

On Mon, Jun 14, 2010 at 02:42:44PM -0400, Nick Bowler wrote:
> 
> This sounds exactly like the issue I've been seeing on a T500 laptop, as
> well (GM45 board).  The slowdowns render the system essentially
> unusable, as it can spend a loooong time just moving the mouse cursor a
> few pixels on the screen.  During this time, nothing else on the display
> is updating (glxgears drops to 0fps).  Things generally seem to be
> working fine if I am not moving the mouse, or if I'm not running X.

I saw exactly the same behaviour when using the Ubuntu Karmic
userspace.  Basically, using a 2.6.35-rc2 (+ vt memory corrupter fix)
kernel, the system was essentially unusable.  Interestingly, the
problem went away (with the same kernel) once I updated to Ubuntu
Lucid.  I was going to mention that so that hopefully someone with far
more X.org-fu than I could figure out how much of this is a X server
bug, and how much of this was a kernel bug, but things worked just
*fine* with 2.6.34 kernel.

Occasionally it will happen that the mouse stops tracking, and then I
grumble and curse, and a few seconds later it resolves itself.  I
haven't had time to track it down; but the problem was breathtakingly
easy to reproduce with an Ubuntu Karmic userspace.  :-)

     		       	  	 		    - Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-14 21:46         ` tytso
@ 2010-06-15 17:25           ` Andy Whitcroft
  2010-06-15 18:07           ` David Howells
  1 sibling, 0 replies; 15+ messages in thread
From: Andy Whitcroft @ 2010-06-15 17:25 UTC (permalink / raw)
  To: tytso, Dave Airlie, Dave Airlie, David Howells, linux-kernel

On Mon, Jun 14, 2010 at 10:46 PM,  <tytso@mit.edu> wrote:
> On Mon, Jun 14, 2010 at 02:42:44PM -0400, Nick Bowler wrote:
>>
>> This sounds exactly like the issue I've been seeing on a T500 laptop, as
>> well (GM45 board).  The slowdowns render the system essentially
>> unusable, as it can spend a loooong time just moving the mouse cursor a
>> few pixels on the screen.  During this time, nothing else on the display
>> is updating (glxgears drops to 0fps).  Things generally seem to be
>> working fine if I am not moving the mouse, or if I'm not running X.
>
> I saw exactly the same behaviour when using the Ubuntu Karmic
> userspace.  Basically, using a 2.6.35-rc2 (+ vt memory corrupter fix)
> kernel, the system was essentially unusable.  Interestingly, the
> problem went away (with the same kernel) once I updated to Ubuntu
> Lucid.  I was going to mention that so that hopefully someone with far
> more X.org-fu than I could figure out how much of this is a X server
> bug, and how much of this was a kernel bug, but things worked just
> *fine* with 2.6.34 kernel.
>
> Occasionally it will happen that the mouse stops tracking, and then I
> grumble and curse, and a few seconds later it resolves itself.  I
> haven't had time to track it down; but the problem was breathtakingly
> easy to reproduce with an Ubuntu Karmic userspace.  :-)

I am seeing crunchy cursor behavior on boot on Ubuntu Lucid user space
with the 2.6.35-rc3 kernel as well.  It pretty much always does that
on boot, and often later as well.  Right now I have four of these
chaps each consuming 9% of a CPU.  Though they definatly come and go.
Very odd:

  938 root      15  -5     0    0    0 S    9  0.0   1:00.90 kslowd001
 1048 root      15  -5     0    0    0 S    9  0.0   1:01.12 kslowd002
  937 root      15  -5     0    0    0 S    8  0.0   1:00.92 kslowd000
 1544 root      15  -5     0    0    0 D    8  0.0   1:00.55 kslowd003

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-14 21:46         ` tytso
  2010-06-15 17:25           ` Andy Whitcroft
@ 2010-06-15 18:07           ` David Howells
  2010-06-16 11:37             ` Andy Whitcroft
  1 sibling, 1 reply; 15+ messages in thread
From: David Howells @ 2010-06-15 18:07 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: dhowells, tytso, Dave Airlie, Dave Airlie, linux-kernel

Andy Whitcroft <apw@canonical.com> wrote:

> I am seeing crunchy cursor behavior on boot on Ubuntu Lucid user space
> with the 2.6.35-rc3 kernel as well.  It pretty much always does that
> on boot, and often later as well.  Right now I have four of these
> chaps each consuming 9% of a CPU.  Though they definatly come and go.
> Very odd:
> 
>   938 root      15  -5     0    0    0 S    9  0.0   1:00.90 kslowd001
>  1048 root      15  -5     0    0    0 S    9  0.0   1:01.12 kslowd002
>   937 root      15  -5     0    0    0 S    8  0.0   1:00.92 kslowd000
>  1544 root      15  -5     0    0    0 D    8  0.0   1:00.55 kslowd003

Can you see what they're doing?

	watch -n0 cat /sys/kernel/debug/slow_work/runqueue

David

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-15 18:07           ` David Howells
@ 2010-06-16 11:37             ` Andy Whitcroft
  2010-06-16 13:33               ` Andy Whitcroft
  2010-06-16 14:36               ` Nick Bowler
  0 siblings, 2 replies; 15+ messages in thread
From: Andy Whitcroft @ 2010-06-16 11:37 UTC (permalink / raw)
  To: David Howells; +Cc: tytso, Dave Airlie, Dave Airlie, linux-kernel

> Can you see what they're doing?
>
>        watch -n0 cat /sys/kernel/debug/slow_work/runqueue

Turned on the debugging and applied the patch from Ted, and when
things are bad I see constant cycling of all four threads in the
output showing similar to the below, note only one thread shows at a
time:

Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue    Wed Jun 16 12:34:52 2010

THR PID   ITEM ADDR        FL MARK  DESC
=== ===== ================ == ===== ==========
   0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915@pci:0000:00:02.0

When things are working well I see the same output appearing for one
update, about every 10s.

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-16 11:37             ` Andy Whitcroft
@ 2010-06-16 13:33               ` Andy Whitcroft
  2010-06-16 14:36               ` Nick Bowler
  1 sibling, 0 replies; 15+ messages in thread
From: Andy Whitcroft @ 2010-06-16 13:33 UTC (permalink / raw)
  To: David Howells; +Cc: tytso, Dave Airlie, Dave Airlie, linux-kernel

On Wed, Jun 16, 2010 at 12:37 PM, Andy Whitcroft <apw@canonical.com> wrote:
>> Can you see what they're doing?
>>
>>        watch -n0 cat /sys/kernel/debug/slow_work/runqueue
>
> Turned on the debugging and applied the patch from Ted, and when
> things are bad I see constant cycling of all four threads in the
> output showing similar to the below, note only one thread shows at a
> time:
>
> Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue    Wed Jun 16 12:34:52 2010
>
> THR PID   ITEM ADDR        FL MARK  DESC
> === ===== ================ == ===== ==========
>   0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915@pci:0000:00:02.0
>
> When things are working well I see the same output appearing for one
> update, about every 10s.

Ok I've added some additional debug and these appear to be being
thrown a loop by the HPD interrupt which is firing continiusly, here
is a sample from some 4 minutes after boot:

[  284.862670] APW: drm_helper_hpd_irq_event: cancel/submit
[  284.882803] APW: drm_helper_hpd_irq_event: cancel/submit
[  284.902691] APW: drm_helper_hpd_irq_event: cancel/submit
[  284.922814] APW: drm_helper_hpd_irq_event: cancel/submit
[  284.942620] APW: drm_helper_hpd_irq_event: cancel/submit
[  284.962707] APW: drm_helper_hpd_irq_event: cancel/submit
[  284.982937] APW: drm_helper_hpd_irq_event: cancel/submit
[  285.004181] APW: drm_helper_hpd_irq_event: cancel/submit
[  285.022622] APW: drm_helper_hpd_irq_event: cancel/submit
[  285.042569] APW: drm_helper_hpd_irq_event: cancel/submit
[  285.062593] APW: drm_helper_hpd_irq_event: cancel/submit
[  285.082683] APW: drm_helper_hpd_irq_event: cancel/submit

Later when things calm down we see them but much much less often:

[  379.038239] APW: drm_helper_hpd_irq_event: cancel/submit
[  379.044904] APW: drm_helper_hpd_irq_event: cancel/submit
[  382.438106] APW: drm_helper_hpd_irq_event: cancel/submit
[  382.444754] APW: drm_helper_hpd_irq_event: cancel/submit
[  385.838072] APW: drm_helper_hpd_irq_event: cancel/submit
[  385.844735] APW: drm_helper_hpd_irq_event: cancel/submit
[  389.237876] APW: drm_helper_hpd_irq_event: cancel/submit
[  389.244523] APW: drm_helper_hpd_irq_event: cancel/submit
[  392.638048] APW: drm_helper_hpd_irq_event: cancel/submit
[  392.644740] APW: drm_helper_hpd_irq_event: cancel/submit

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-16 11:37             ` Andy Whitcroft
  2010-06-16 13:33               ` Andy Whitcroft
@ 2010-06-16 14:36               ` Nick Bowler
  2010-06-16 14:55                 ` Andy Whitcroft
  1 sibling, 1 reply; 15+ messages in thread
From: Nick Bowler @ 2010-06-16 14:36 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: David Howells, tytso, Dave Airlie, Dave Airlie, linux-kernel

On 12:37 Wed 16 Jun     , Andy Whitcroft wrote:
> > Can you see what they're doing?
> >
> >        watch -n0 cat /sys/kernel/debug/slow_work/runqueue
> 
> Turned on the debugging and applied the patch from Ted, and when
> things are bad I see constant cycling of all four threads in the
> output showing similar to the below, note only one thread shows at a
> time:
> 
> Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue    Wed Jun 16 12:34:52 2010
> 
> THR PID   ITEM ADDR        FL MARK  DESC
> === ===== ================ == ===== ==========
>    0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915@pci:0000:00:02.0

Indeed, this looks very similar to mine, except my DESC field is blank
for some reason.  The FL value is sometimes 12, sometimes 2.

When things are bad, all four of the kslowd threads are pegged at 16%
CPU usage.  On my T500, they _never_ calm down once they're pegged.
However, things seem to be working normally at boot.  It seems that
there is some initial trigger which pegs the threads, then the cursor
lag and other problems are just fallout after that.  The system is
essentially unusable after this point.

The threads sometimes get pegged immediately after booting the system,
sometimes it can last an hour or more before showing any problems.
Unfortunately, this is making bisection essentially impossible.

This seems to have been introduced somewhere between 2.6.35-rc1 and
-rc2.

-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-16 14:36               ` Nick Bowler
@ 2010-06-16 14:55                 ` Andy Whitcroft
  2010-06-17 13:25                   ` Nick Bowler
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Whitcroft @ 2010-06-16 14:55 UTC (permalink / raw)
  To: Andy Whitcroft, David Howells, tytso, Dave Airlie, Dave Airlie,
	linux-kernel

On Wed, Jun 16, 2010 at 3:36 PM, Nick Bowler <nbowler@elliptictech.com> wrote:
> On 12:37 Wed 16 Jun     , Andy Whitcroft wrote:
>> > Can you see what they're doing?
>> >
>> >        watch -n0 cat /sys/kernel/debug/slow_work/runqueue
>>
>> Turned on the debugging and applied the patch from Ted, and when
>> things are bad I see constant cycling of all four threads in the
>> output showing similar to the below, note only one thread shows at a
>> time:
>>
>> Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue    Wed Jun 16 12:34:52 2010
>>
>> THR PID   ITEM ADDR        FL MARK  DESC
>> === ===== ================ == ===== ==========
>>    0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915@pci:0000:00:02.0
>
> Indeed, this looks very similar to mine, except my DESC field is blank
> for some reason.  The FL value is sometimes 12, sometimes 2.

DESC is only filled in if you have Ted's patch applied (earlier in this thread).

> When things are bad, all four of the kslowd threads are pegged at 16%
> CPU usage.  On my T500, they _never_ calm down once they're pegged.
> However, things seem to be working normally at boot.  It seems that
> there is some initial trigger which pegs the threads, then the cursor
> lag and other problems are just fallout after that.  The system is
> essentially unusable after this point.
>
> The threads sometimes get pegged immediately after booting the system,
> sometimes it can last an hour or more before showing any problems.
> Unfortunately, this is making bisection essentially impossible.
>
> This seems to have been introduced somewhere between 2.6.35-rc1 and

I suspect this is introduced by the commit which pulled polling of DRM
connectors into the core.

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Why is kslowd accumulating so much CPU time?
  2010-06-16 14:55                 ` Andy Whitcroft
@ 2010-06-17 13:25                   ` Nick Bowler
  0 siblings, 0 replies; 15+ messages in thread
From: Nick Bowler @ 2010-06-17 13:25 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: David Howells, tytso, Dave Airlie, Dave Airlie, linux-kernel

On 15:55 Wed 16 Jun     , Andy Whitcroft wrote:
> On Wed, Jun 16, 2010 at 3:36 PM, Nick Bowler <nbowler@elliptictech.com> wrote:
> > The threads sometimes get pegged immediately after booting the system,
> > sometimes it can last an hour or more before showing any problems.
> > Unfortunately, this is making bisection essentially impossible.
> >
> > This seems to have been introduced somewhere between 2.6.35-rc1 and
> 
> I suspect this is introduced by the commit which pulled polling of DRM
> connectors into the core.

OK.  From this hint I looked at the log between -rc1 and -rc2 and found
commit fbf81762e385d ("drm/kms: disable/enable poll around switcheroo
on/off").

Reverting this commit brings everything back to normal.

-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2010-06-17 13:26 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-09 18:36 Why is kslowd accumulating so much CPU time? Theodore Ts'o
2010-06-09 18:56 ` David Howells
2010-06-09 19:00 ` David Howells
2010-06-13  8:23 ` Dave Airlie
2010-06-13 19:49   ` tytso
2010-06-13 20:00     ` Dave Airlie
2010-06-14 18:42       ` Nick Bowler
2010-06-14 21:46         ` tytso
2010-06-15 17:25           ` Andy Whitcroft
2010-06-15 18:07           ` David Howells
2010-06-16 11:37             ` Andy Whitcroft
2010-06-16 13:33               ` Andy Whitcroft
2010-06-16 14:36               ` Nick Bowler
2010-06-16 14:55                 ` Andy Whitcroft
2010-06-17 13:25                   ` Nick Bowler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).