netdev.vger.kernel.org archive mirror
From: Willem de Bruijn <willemb@google.com>
To: Wei Wang <weiwan@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	David Miller <davem@davemloft.net>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Jakub Kicinski <kuba@kernel.org>,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH net] virtio-net: suppress bad irq warning for tx napi
Date: Tue, 2 Feb 2021 19:06:53 -0500
Message-ID: <CA+FuTSdkJcj_ikNnJmGadBZ1fa7q26MZ1g3ERf8Ax+YbXvgcng@mail.gmail.com>
In-Reply-To: <CA+FuTSe-6MSpB4hwwvwPgDqHkxYJoxMZMDbOusNqiq0Gwa1eiQ@mail.gmail.com>

On Tue, Feb 2, 2021 at 6:53 PM Willem de Bruijn <willemb@google.com> wrote:
>
> On Tue, Feb 2, 2021 at 6:47 PM Wei Wang <weiwan@google.com> wrote:
> >
> > On Tue, Feb 2, 2021 at 3:12 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Jan 28, 2021 at 04:21:36PM -0800, Wei Wang wrote:
> > > > With the implementation of napi-tx in virtio driver, we clean tx
> > > > descriptors from rx napi handler, for the purpose of reducing tx
> > > > complete interrupts. But this could introduce a race where tx complete
> > > > interrupt has been raised, but the handler found there is no work to do
> > > > because we have done the work in the previous rx interrupt handler.
> > > > This could lead to the following warning msg:
> > > > [ 3588.010778] irq 38: nobody cared (try booting with the
> > > > "irqpoll" option)
> > > > [ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted
> > > > 5.3.0-19-generic #20~18.04.2-Ubuntu
> > > > [ 3588.017940] Call Trace:
> > > > [ 3588.017942]  <IRQ>
> > > > [ 3588.017951]  dump_stack+0x63/0x85
> > > > [ 3588.017953]  __report_bad_irq+0x35/0xc0
> > > > [ 3588.017955]  note_interrupt+0x24b/0x2a0
> > > > [ 3588.017956]  handle_irq_event_percpu+0x54/0x80
> > > > [ 3588.017957]  handle_irq_event+0x3b/0x60
> > > > [ 3588.017958]  handle_edge_irq+0x83/0x1a0
> > > > [ 3588.017961]  handle_irq+0x20/0x30
> > > > [ 3588.017964]  do_IRQ+0x50/0xe0
> > > > [ 3588.017966]  common_interrupt+0xf/0xf
> > > > [ 3588.017966]  </IRQ>
> > > > [ 3588.017989] handlers:
> > > > [ 3588.020374] [<000000001b9f1da8>] vring_interrupt
> > > > [ 3588.025099] Disabling IRQ #38
> > > >
> > > > This patch adds a new param to struct vring_virtqueue, and we set it for
> > > > tx virtqueues if napi-tx is enabled, to suppress the warning in such
> > > > case.
> > > >
> > > > Fixes: 7b0411ef4aa6 ("virtio-net: clean tx descriptors from rx napi")
> > > > Reported-by: Rick Jones <jonesrick@google.com>
> > > > Signed-off-by: Wei Wang <weiwan@google.com>
> > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > >
> > >
> > > This description does not make sense to me.
> > >
> > > irq X: nobody cared
> > > only triggers after an interrupt is unhandled repeatedly.
> > >
> > > So something causes a storm of useless tx interrupts here.
> > >
> > > Let's find out what it was please. What you are doing is
> > > just preventing linux from complaining.
> >
> > The traffic that causes this warning is a netperf tcp_stream with at
> > least 128 flows between 2 hosts. And the warning gets triggered on the
> > receiving host, which has a lot of rx interrupts firing on all queues,
> > and a few tx interrupts.
> > And I think the scenario is: when the tx interrupt gets fired, it gets
> > coalesced with the rx interrupt. Basically, the rx and tx interrupts
> > get triggered very close to each other, and get handled in one round
> > of do_IRQ(). And the rx irq handler gets called first, which calls
> > virtnet_poll(). However, virtnet_poll() calls virtnet_poll_cleantx()
> > to try to do the work on the corresponding tx queue as well. That's
> > why when tx interrupt handler gets called, it sees no work to do.
> > And the reason for the rx handler to handle the tx work is here:
> > https://lists.linuxfoundation.org/pipermail/virtualization/2017-April/034740.html
>
> Indeed. It's not necessarily a storm. The warning occurs after one
> hundred such events since boot, which is a small number compared to
> real interrupt load.

Sorry, this is wrong. It is the other call to __report_bad_irq from
note_interrupt that applies here.
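
To make the distinction concrete, here is a rough user-space model of the
windowed accounting that, as far as I understand kernel/irq/spurious.c,
leads to the "nobody cared" message. It is a sketch, not the kernel code:
the model_* names and the two constants are my own stand-ins, and the
time-based aging of the unhandled counter that the real function performs
is left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-IRQ counters. */
struct model_irq_desc {
	unsigned int irq_count;      /* interrupts seen in the current window */
	unsigned int irqs_unhandled; /* how many of those nobody handled */
	bool disabled;               /* set once the warning would fire */
};

/* Assumed window size and threshold for this model. */
enum { MODEL_WINDOW = 100000, MODEL_THRESHOLD = 99900 };

/*
 * Account for one interrupt. "handled" is false when every handler
 * reported that it had no work. Returns true when this call would
 * print "nobody cared" and disable the line.
 */
bool model_note_interrupt(struct model_irq_desc *desc, bool handled)
{
	bool fire = false;

	if (!handled)
		desc->irqs_unhandled++;

	desc->irq_count++;
	if (desc->irq_count < MODEL_WINDOW)
		return false;

	/* End of window: warn only if nearly all interrupts were unhandled. */
	desc->irq_count = 0;
	if (desc->irqs_unhandled > MODEL_THRESHOLD) {
		desc->disabled = true;
		fire = true;
	}
	desc->irqs_unhandled = 0;
	return fire;
}
```

In this model a line where every interrupt goes unhandled trips the check
at the end of the first window, while a line where half of the interrupts
do useful work never does.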

> Occasionally seeing an interrupt with no work is expected after
> 7b0411ef4aa6 ("virtio-net: clean tx descriptors from rx napi"). As
> long as this rate of events is very low compared to useful interrupts,
> and total interrupt count is greatly reduced vs not having work
> stealing, it is a net win.
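
The work stealing described above can be sketched as a small user-space
model. This is only an illustration under simplifying assumptions: the
model_* names are hypothetical, the queue is reduced to a single counter,
and the tx handler cleans inline instead of scheduling napi as the driver
does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a tx virtqueue. */
struct model_txq {
	unsigned int used; /* completed tx descriptors not yet cleaned */
};

/*
 * Stand-in for the tx cleaning done from the rx napi poll: reclaims
 * every completed descriptor and returns how many were cleaned.
 */
unsigned int model_rx_poll_clean_tx(struct model_txq *txq)
{
	unsigned int cleaned = txq->used;

	txq->used = 0;
	return cleaned;
}

/*
 * Stand-in for the tx interrupt handler. Returns true when it found
 * work (the IRQ_HANDLED case) and false when the queue was already
 * empty (the IRQ_NONE case that feeds the warning).
 */
bool model_tx_interrupt(struct model_txq *txq)
{
	if (txq->used == 0)
		return false;

	txq->used = 0;
	return true;
}
```

If model_rx_poll_clean_tx() runs between the device raising the tx
interrupt and model_tx_interrupt() being called, the handler finds nothing
to do, which is the no-work case discussed in this thread.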
