From: Eric Dumazet <dada1@cosmosbay.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Chris Friesen <cfriesen@nortel.com>, Ingo Molnar <mingo@elte.hu>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Steven Rostedt <rostedt@goodmis.org>,
	David Miller <davem@davemloft.net>,
	linuxppc-dev@ozlabs.org, paulus@samba.org,
	netdev@vger.kernel.org
Subject: Re: question about softirqs
Date: Wed, 13 May 2009 16:54:44 +0200
Message-ID: <4A0ADF34.2040001@cosmosbay.com>
In-Reply-To: <87my9hkrmw.fsf@basil.nowhere.org>

Andi Kleen wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
> 
> 
>> Err, no. Chris is completely correct:
>>
>>         if (!in_interrupt())
>> 		wakeup_softirqd();
> 
> Yes you have to wake it up just in case, but it doesn't normally
> process the data because a normal softirq comes in faster. It's
> just a safety policy. 
> 
> You can check this by checking the accumulated CPU time on your
> ksoftirqs.  Mine are all 0 even on long running systems.
> 

Then it's a bug, Andi. It's quite easy to trigger ksoftirqd with a Gb Ethernet link.

Commit f5f293a4e3d0a0c52cec31de6762c95050156516 fixed part of this
(mpstat and top now display softirq time correctly in the per-CPU stats),
but apparently we still have a problem reporting correct CPU time for processes,
particularly ksoftirqd/x.

I have one SMP machine flooded by network frames, with CPU0 handling all
the work inside ksoftirqd/0 (NAPI processing: almost no hard interrupts are delivered any more).

Still, top and ps report no more than 30% of CPU time used by
ksoftirqd/0, while this CPU runs only ksoftirqd/0 (100% in sirq) and has no idle time.

$ ps -fp 4 ; mpstat -P 0 1 10 ; ps -fp 4
UID        PID  PPID  C STIME TTY          TIME CMD
root         4     2  1 15:35 ?        00:00:46 [ksoftirqd/0]
Linux 2.6.30-rc5-tip-01595-g6f75dad-dirty (svivoipvnx001)       05/13/2009      _i686_

04:45:01 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
04:45:02 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:03 PM    0    0.00    0.00    0.00    0.00    0.00   99.01    0.00    0.00    0.99
04:45:04 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:05 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:06 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:07 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:08 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:09 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:10 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
04:45:11 PM    0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
Average:       0    0.00    0.00    0.00    0.00    0.00   99.90    0.00    0.00    0.10
UID        PID  PPID  C STIME TTY          TIME CMD
root         4     2  1 15:35 ?        00:00:49 [ksoftirqd/0]

You can see here that the time accounted to ksoftirqd/0 during this 10-second time frame is *only* 3 seconds.

Therefore, we cannot trust ps, at least not with the current kernel.
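
The arithmetic behind that comparison can be sketched as follows. This is only an illustration using the numbers from the output above; the helper name ps_time_to_seconds is hypothetical, and it assumes the plain HH:MM:SS form of the ps TIME column (no day prefix). Since ps only shows whole seconds, the 3-second figure is accurate to about one second either way.

```python
def ps_time_to_seconds(value: str) -> int:
    """Convert a ps TIME value in HH:MM:SS form to seconds (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the two "ps -fp 4" runs above.
time_before = ps_time_to_seconds("00:00:46")
time_after = ps_time_to_seconds("00:00:49")

# "mpstat -P 0 1 10" takes 10 samples at 1-second intervals.
window_seconds = 10
# Average %soft reported by mpstat for CPU 0 over that window.
mpstat_soft_percent = 99.90

accounted_seconds = time_after - time_before
softirq_seconds = window_seconds * mpstat_soft_percent / 100
accounted_percent = 100 * accounted_seconds / window_seconds

print(f"ps accounts {accounted_seconds} s ({accounted_percent:.0f}% of the window)")
print(f"mpstat reports about {softirq_seconds:.2f} s spent in softirq on CPU 0")
```

The gap between the two figures is what suggests the per-task accounting is wrong, rather than ksoftirqd/0 being mostly idle.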

# cat /proc/4/stat ; sleep 10 ; cat /proc/4/stat
4 (ksoftirqd/0) R 2 0 0 0 -1 2216730688 0 0 0 0 0 15347 0 0 15 -5 1 0 6 0 0 4294967295 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 0 0 0 0 0 0
4 (ksoftirqd/0) R 2 0 0 0 -1 2216730688 0 0 0 0 0 15670 0 0 15 -5 1 0 6 0 0 4294967295 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 0 0 0 0 0 0
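
To read those two samples: per proc(5), field 14 of /proc/<pid>/stat is utime and field 15 is stime, both in clock ticks. Below is a minimal sketch that extracts stime from the two lines above and converts the difference to seconds. The function name stime_ticks is hypothetical, and USER_HZ = 100 is an assumption (the usual value; on a live system it can be queried with os.sysconf("SC_CLK_TCK")).

```python
# The two /proc/4/stat samples from above, taken 10 seconds apart.
SAMPLE_BEFORE = (
    "4 (ksoftirqd/0) R 2 0 0 0 -1 2216730688 0 0 0 0 0 15347 0 0 15 -5 1 0 6 0 0 "
    "4294967295 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 0 0 0 0 0 0"
)
SAMPLE_AFTER = (
    "4 (ksoftirqd/0) R 2 0 0 0 -1 2216730688 0 0 0 0 0 15670 0 0 15 -5 1 0 6 0 0 "
    "4294967295 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 0 0 0 0 0 0"
)

USER_HZ = 100  # assumption: clock ticks per second as exposed to user space


def stime_ticks(stat_line: str) -> int:
    """Return field 15 (stime, kernel-mode ticks) of a /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may contain spaces, so split
    # after the last ')' instead of splitting the whole line on spaces.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 15 is at index 15 - 3 = 12.
    return int(rest[12])


delta_ticks = stime_ticks(SAMPLE_AFTER) - stime_ticks(SAMPLE_BEFORE)
delta_seconds = delta_ticks / USER_HZ
share_percent = 100 * delta_seconds / 10  # the samples are 10 s apart

print(f"stime advanced by {delta_ticks} ticks = {delta_seconds:.2f} s "
      f"({share_percent:.1f}% of the 10 s window)")
```

Under that USER_HZ assumption, the raw counters agree with what ps displays, so the shortfall appears to come from the kernel's accounting itself and not from how ps formats it.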


> The reason Andrea originally added the softirqds was just that
> if you have very softirq intensive workloads they would tie
> up too much CPU time or not make enough process with the default
> "don't loop too often" heuristics. 
> 
>> We can not rely on irqs coming in when the softirq is raised from
> 
> You can't rely on it, but it happens in near all cases.
> 
> -Andi


