Netdev Archive on lore.kernel.org
* [PATCH net v2 0/2] virtio-net: suppress bad irq warning for tx napi
@ 2021-02-20  1:44 Wei Wang
  2021-02-20  1:44 ` [PATCH net v2 1/2] virtio: add a new parameter in struct virtqueue Wei Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Wei Wang @ 2021-02-20  1:44 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, David S . Miller, Jakub Kicinski
  Cc: Willem de Bruijn, virtualization, netdev

With napi-tx enabled in the virtio-net driver, tx descriptors are also
cleaned from the rx napi handler in order to reduce the number of tx
completion interrupts. This introduces a race: a tx completion interrupt
may already have been raised, but by the time its handler runs there is
no work left to do, because the preceding rx handler has already cleaned
the descriptors. The tx handler then returns IRQ_NONE, and enough of
these spurious interrupts lead to the following warning:
[ 3588.010778] irq 38: nobody cared (try booting with the "irqpoll" option)
[ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.3.0-19-generic #20~18.04.2-Ubuntu
[ 3588.017940] Call Trace:
[ 3588.017942]  <IRQ>
[ 3588.017951]  dump_stack+0x63/0x85
[ 3588.017953]  __report_bad_irq+0x35/0xc0
[ 3588.017955]  note_interrupt+0x24b/0x2a0
[ 3588.017956]  handle_irq_event_percpu+0x54/0x80
[ 3588.017957]  handle_irq_event+0x3b/0x60
[ 3588.017958]  handle_edge_irq+0x83/0x1a0
[ 3588.017961]  handle_irq+0x20/0x30
[ 3588.017964]  do_IRQ+0x50/0xe0
[ 3588.017966]  common_interrupt+0xf/0xf
[ 3588.017966]  </IRQ>
[ 3588.017989] handlers:
[ 3588.020374] [<000000001b9f1da8>] vring_interrupt
[ 3588.025099] Disabling IRQ #38

This series contains two patches. The first adds a new field,
no_interrupt_check, to struct vring_virtqueue that controls whether the
bad irq warning is suppressed. The second sets that field for virtio-net
tx virtqueues when napi-tx is enabled.
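
To make the effect of the flag concrete, here is a small user-space model
of the handler decision and of the kernel's spurious-interrupt accounting.
It is only a sketch under stated assumptions: every model_* name is
hypothetical, the window of 100000 interrupts and the limit of 99900
unhandled ones reflect how note_interrupt() is commonly described and may
differ between kernel versions, and the timing heuristics of the real
accounting are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's irqreturn_t values. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical, reduced view of a virtqueue. */
struct model_vq {
	bool more_used;          /* stands in for more_used(vq) */
	bool no_interrupt_check; /* the flag added by patch 1/2 */
};

/* Mirrors the early-return logic that patch 1/2 adds to vring_interrupt(). */
static enum model_irqreturn model_vring_interrupt(const struct model_vq *vq)
{
	if (!vq->more_used) {
		if (vq->no_interrupt_check)
			return MODEL_IRQ_HANDLED;
		return MODEL_IRQ_NONE;
	}
	/* The real handler would invoke the virtqueue callback here. */
	return MODEL_IRQ_HANDLED;
}

/* Hypothetical, reduced view of the per-IRQ accounting state. */
struct model_irq_desc {
	unsigned int irq_count;
	unsigned int irqs_unhandled;
	bool disabled;
};

/* Assumed thresholds; see the note above the code block. */
#define MODEL_WINDOW          100000u
#define MODEL_UNHANDLED_LIMIT 99900u

/* Simplified accounting: disable the IRQ if nearly a whole window went unhandled. */
static void model_note_interrupt(struct model_irq_desc *desc,
				 enum model_irqreturn ret)
{
	assert(!desc->disabled);

	if (ret == MODEL_IRQ_NONE)
		desc->irqs_unhandled++;

	if (++desc->irq_count < MODEL_WINDOW)
		return;

	desc->irq_count = 0;
	if (desc->irqs_unhandled > MODEL_UNHANDLED_LIMIT)
		desc->disabled = true; /* "nobody cared", IRQ gets disabled */
	desc->irqs_unhandled = 0;
}

/* Deliver n tx interrupts that find no work; report whether the IRQ was disabled. */
static bool model_run_spurious(bool no_interrupt_check, unsigned int n)
{
	struct model_vq vq = {
		.more_used = false,
		.no_interrupt_check = no_interrupt_check,
	};
	struct model_irq_desc desc = { 0 };

	for (unsigned int i = 0; i < n && !desc.disabled; i++)
		model_note_interrupt(&desc, model_vring_interrupt(&vq));

	return desc.disabled;
}
```

In this model, model_run_spurious(false, 100000) ends with the IRQ
disabled, while model_run_spurious(true, 100000) does not, because every
no-work interrupt is then reported as handled.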

Wei Wang (2):
  virtio: add a new parameter in struct virtqueue
  virtio-net: suppress bad irq warning for tx napi

 drivers/net/virtio_net.c     | 19 ++++++++++++++-----
 drivers/virtio/virtio_ring.c | 16 ++++++++++++++++
 include/linux/virtio.h       |  2 ++
 3 files changed, 32 insertions(+), 5 deletions(-)

-- 
2.30.0.617.g56c4b15f3c-goog



* [PATCH net v2 1/2] virtio: add a new parameter in struct virtqueue
  2021-02-20  1:44 [PATCH net v2 0/2] virtio-net: suppress bad irq warning for tx napi Wei Wang
@ 2021-02-20  1:44 ` Wei Wang
  2021-02-20  1:44 ` [PATCH net v2 2/2] virtio-net: suppress bad irq warning for tx napi Wei Wang
  2021-02-23 14:25 ` [PATCH net v2 0/2] " Michael S. Tsirkin
  2 siblings, 0 replies; 6+ messages in thread
From: Wei Wang @ 2021-02-20  1:44 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, David S . Miller, Jakub Kicinski
  Cc: Willem de Bruijn, virtualization, netdev

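Add a no_interrupt_check field to struct vring_virtqueue, together with
a virtqueue_set_no_interrupt_check() setter. When the field is set,
vring_interrupt() returns IRQ_HANDLED instead of IRQ_NONE if it finds no
work to do, which suppresses the bad irq warning.
The virtio-net driver uses this in the following patch.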

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/virtio/virtio_ring.c | 16 ++++++++++++++++
 include/linux/virtio.h       |  2 ++
 2 files changed, 18 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 71e16b53e9c1..3c5ac1b26dff 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -105,6 +105,9 @@ struct vring_virtqueue {
 	/* Host publishes avail event idx */
 	bool event;
 
+	/* Suppress warning in interrupt handler */
+	bool no_interrupt_check;
+
 	/* Head of free buffer list. */
 	unsigned int free_head;
 	/* Number we've added since last sync. */
@@ -1604,6 +1607,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
 	vq->broken = false;
+	vq->no_interrupt_check = false;
 	vq->last_used_idx = 0;
 	vq->num_added = 0;
 	vq->packed_ring = true;
@@ -2037,6 +2041,9 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
 	if (!more_used(vq)) {
+		if (vq->no_interrupt_check)
+			return IRQ_HANDLED;
+
 		pr_debug("virtqueue interrupt with no work for %p\n", vq);
 		return IRQ_NONE;
 	}
@@ -2082,6 +2089,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
 	vq->broken = false;
+	vq->no_interrupt_check = false;
 	vq->last_used_idx = 0;
 	vq->num_added = 0;
 	vq->use_dma_api = vring_use_dma_api(vdev);
@@ -2266,6 +2274,14 @@ bool virtqueue_is_broken(struct virtqueue *_vq)
 }
 EXPORT_SYMBOL_GPL(virtqueue_is_broken);
 
+void virtqueue_set_no_interrupt_check(struct virtqueue *_vq, bool val)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
+	vq->no_interrupt_check = val;
+}
+EXPORT_SYMBOL_GPL(virtqueue_set_no_interrupt_check);
+
 /*
  * This should prevent the device from being used, allowing drivers to
  * recover.  You may need to grab appropriate locks to flush.
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 55ea329fe72a..27b374df78cc 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -84,6 +84,8 @@ unsigned int virtqueue_get_vring_size(struct virtqueue *vq);
 
 bool virtqueue_is_broken(struct virtqueue *vq);
 
+void virtqueue_set_no_interrupt_check(struct virtqueue *vq, bool val);
+
 const struct vring *virtqueue_get_vring(struct virtqueue *vq);
 dma_addr_t virtqueue_get_desc_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_avail_addr(struct virtqueue *vq);
-- 
2.30.0.617.g56c4b15f3c-goog



* [PATCH net v2 2/2] virtio-net: suppress bad irq warning for tx napi
  2021-02-20  1:44 [PATCH net v2 0/2] virtio-net: suppress bad irq warning for tx napi Wei Wang
  2021-02-20  1:44 ` [PATCH net v2 1/2] virtio: add a new parameter in struct virtqueue Wei Wang
@ 2021-02-20  1:44 ` Wei Wang
  2021-02-23 14:25 ` [PATCH net v2 0/2] " Michael S. Tsirkin
  2 siblings, 0 replies; 6+ messages in thread
From: Wei Wang @ 2021-02-20  1:44 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, David S . Miller, Jakub Kicinski
  Cc: Willem de Bruijn, virtualization, netdev

With napi-tx enabled in the virtio-net driver, tx descriptors are also
cleaned from the rx napi handler in order to reduce the number of tx
completion interrupts. This introduces a race: a tx completion interrupt
may already have been raised, but by the time its handler runs there is
no work left to do, because the preceding rx handler has already cleaned
the descriptors. The tx handler then returns IRQ_NONE, and enough of
these spurious interrupts lead to the following warning:
[ 3588.010778] irq 38: nobody cared (try booting with the "irqpoll" option)
[ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.3.0-19-generic #20~18.04.2-Ubuntu
[ 3588.017940] Call Trace:
[ 3588.017942]  <IRQ>
[ 3588.017951]  dump_stack+0x63/0x85
[ 3588.017953]  __report_bad_irq+0x35/0xc0
[ 3588.017955]  note_interrupt+0x24b/0x2a0
[ 3588.017956]  handle_irq_event_percpu+0x54/0x80
[ 3588.017957]  handle_irq_event+0x3b/0x60
[ 3588.017958]  handle_edge_irq+0x83/0x1a0
[ 3588.017961]  handle_irq+0x20/0x30
[ 3588.017964]  do_IRQ+0x50/0xe0
[ 3588.017966]  common_interrupt+0xf/0xf
[ 3588.017966]  </IRQ>
[ 3588.017989] handlers:
[ 3588.020374] [<000000001b9f1da8>] vring_interrupt
[ 3588.025099] Disabling IRQ #38

This patch sets no_interrupt_check on the tx virtqueue while napi-tx is
enabled, and clears it again when tx napi is disabled, to suppress the
warning in that case.

Fixes: 7b0411ef4aa6 ("virtio-net: clean tx descriptors from rx napi")
Reported-by: Rick Jones <jonesrick@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/net/virtio_net.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 508408fbe78f..18b14739d63e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1303,13 +1303,22 @@ static void virtnet_napi_tx_enable(struct virtnet_info *vi,
 		return;
 	}
 
+	/* With napi_tx enabled, free_old_xmit_skbs() could be called from
+	 * rx napi handler. Set no_interrupt_check to suppress bad irq warning
+	 * for IRQ_NONE case from tx complete interrupt handler.
+	 */
+	virtqueue_set_no_interrupt_check(vq, true);
+
 	return virtnet_napi_enable(vq, napi);
 }
 
-static void virtnet_napi_tx_disable(struct napi_struct *napi)
+static void virtnet_napi_tx_disable(struct virtqueue *vq,
+				    struct napi_struct *napi)
 {
-	if (napi->weight)
+	if (napi->weight) {
 		napi_disable(napi);
+		virtqueue_set_no_interrupt_check(vq, false);
+	}
 }
 
 static void refill_work(struct work_struct *work)
@@ -1835,7 +1844,7 @@ static int virtnet_close(struct net_device *dev)
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
 		napi_disable(&vi->rq[i].napi);
-		virtnet_napi_tx_disable(&vi->sq[i].napi);
+		virtnet_napi_tx_disable(vi->sq[i].vq, &vi->sq[i].napi);
 	}
 
 	return 0;
@@ -2315,7 +2324,7 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
 	if (netif_running(vi->dev)) {
 		for (i = 0; i < vi->max_queue_pairs; i++) {
 			napi_disable(&vi->rq[i].napi);
-			virtnet_napi_tx_disable(&vi->sq[i].napi);
+			virtnet_napi_tx_disable(vi->sq[i].vq, &vi->sq[i].napi);
 		}
 	}
 }
@@ -2440,7 +2449,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 	if (netif_running(dev)) {
 		for (i = 0; i < vi->max_queue_pairs; i++) {
 			napi_disable(&vi->rq[i].napi);
-			virtnet_napi_tx_disable(&vi->sq[i].napi);
+			virtnet_napi_tx_disable(vi->sq[i].vq, &vi->sq[i].napi);
 		}
 	}
 
-- 
2.30.0.617.g56c4b15f3c-goog



* Re: [PATCH net v2 0/2] virtio-net: suppress bad irq warning for tx napi
  2021-02-20  1:44 [PATCH net v2 0/2] virtio-net: suppress bad irq warning for tx napi Wei Wang
  2021-02-20  1:44 ` [PATCH net v2 1/2] virtio: add a new parameter in struct virtqueue Wei Wang
  2021-02-20  1:44 ` [PATCH net v2 2/2] virtio-net: suppress bad irq warning for tx napi Wei Wang
@ 2021-02-23 14:25 ` Michael S. Tsirkin
  2021-02-23 17:37   ` Wei Wang
  2 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2021-02-23 14:25 UTC (permalink / raw)
  To: Wei Wang
  Cc: Jason Wang, David S . Miller, Jakub Kicinski, Willem de Bruijn,
	virtualization, netdev

On Fri, Feb 19, 2021 at 05:44:34PM -0800, Wei Wang wrote:
> With the implementation of napi-tx in virtio driver, we clean tx
> descriptors from rx napi handler, for the purpose of reducing tx
> complete interrupts. But this could introduce a race where tx complete
> interrupt has been raised, but the handler found there is no work to do
> because we have done the work in the previous rx interrupt handler.
> This could lead to the following warning msg:
> [ 3588.010778] irq 38: nobody cared (try booting with the
> "irqpoll" option)
> [ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted
> 5.3.0-19-generic #20~18.04.2-Ubuntu
> [ 3588.017940] Call Trace:
> [ 3588.017942]  <IRQ>
> [ 3588.017951]  dump_stack+0x63/0x85
> [ 3588.017953]  __report_bad_irq+0x35/0xc0
> [ 3588.017955]  note_interrupt+0x24b/0x2a0
> [ 3588.017956]  handle_irq_event_percpu+0x54/0x80
> [ 3588.017957]  handle_irq_event+0x3b/0x60
> [ 3588.017958]  handle_edge_irq+0x83/0x1a0
> [ 3588.017961]  handle_irq+0x20/0x30
> [ 3588.017964]  do_IRQ+0x50/0xe0
> [ 3588.017966]  common_interrupt+0xf/0xf
> [ 3588.017966]  </IRQ>
> [ 3588.017989] handlers:
> [ 3588.020374] [<000000001b9f1da8>] vring_interrupt
> [ 3588.025099] Disabling IRQ #38
> 
> This patch series contains 2 patches. The first one adds a new param to
> struct vring_virtqueue to control if we want to suppress the bad irq
> warning. And the second patch in virtio-net sets it for tx virtqueues if
> napi-tx is enabled.

I'm going to be busy until March. I think there should be a better
way to fix this, though. I will think about it and respond in about a week.


> Wei Wang (2):
>   virtio: add a new parameter in struct virtqueue
>   virtio-net: suppress bad irq warning for tx napi
> 
>  drivers/net/virtio_net.c     | 19 ++++++++++++++-----
>  drivers/virtio/virtio_ring.c | 16 ++++++++++++++++
>  include/linux/virtio.h       |  2 ++
>  3 files changed, 32 insertions(+), 5 deletions(-)
> 
> -- 
> 2.30.0.617.g56c4b15f3c-goog



* Re: [PATCH net v2 0/2] virtio-net: suppress bad irq warning for tx napi
  2021-02-23 14:25 ` [PATCH net v2 0/2] " Michael S. Tsirkin
@ 2021-02-23 17:37   ` Wei Wang
  2021-02-23 19:13     ` Willem de Bruijn
  0 siblings, 1 reply; 6+ messages in thread
From: Wei Wang @ 2021-02-23 17:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, David S . Miller, Jakub Kicinski, Willem de Bruijn,
	virtualization, Linux Kernel Network Developers

On Tue, Feb 23, 2021 at 6:26 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 19, 2021 at 05:44:34PM -0800, Wei Wang wrote:
> > With the implementation of napi-tx in virtio driver, we clean tx
> > descriptors from rx napi handler, for the purpose of reducing tx
> > complete interrupts. But this could introduce a race where tx complete
> > interrupt has been raised, but the handler found there is no work to do
> > because we have done the work in the previous rx interrupt handler.
> > This could lead to the following warning msg:
> > [ 3588.010778] irq 38: nobody cared (try booting with the
> > "irqpoll" option)
> > [ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted
> > 5.3.0-19-generic #20~18.04.2-Ubuntu
> > [ 3588.017940] Call Trace:
> > [ 3588.017942]  <IRQ>
> > [ 3588.017951]  dump_stack+0x63/0x85
> > [ 3588.017953]  __report_bad_irq+0x35/0xc0
> > [ 3588.017955]  note_interrupt+0x24b/0x2a0
> > [ 3588.017956]  handle_irq_event_percpu+0x54/0x80
> > [ 3588.017957]  handle_irq_event+0x3b/0x60
> > [ 3588.017958]  handle_edge_irq+0x83/0x1a0
> > [ 3588.017961]  handle_irq+0x20/0x30
> > [ 3588.017964]  do_IRQ+0x50/0xe0
> > [ 3588.017966]  common_interrupt+0xf/0xf
> > [ 3588.017966]  </IRQ>
> > [ 3588.017989] handlers:
> > [ 3588.020374] [<000000001b9f1da8>] vring_interrupt
> > [ 3588.025099] Disabling IRQ #38
> >
> > This patch series contains 2 patches. The first one adds a new param to
> > struct vring_virtqueue to control if we want to suppress the bad irq
> > warning. And the second patch in virtio-net sets it for tx virtqueues if
> > napi-tx is enabled.
>
> I'm going to be busy until March, I think there should be a better
> way to fix this though. Will think about it and respond in about a week.
>
OK. Thanks.

>
> > Wei Wang (2):
> >   virtio: add a new parameter in struct virtqueue
> >   virtio-net: suppress bad irq warning for tx napi
> >
> >  drivers/net/virtio_net.c     | 19 ++++++++++++++-----
> >  drivers/virtio/virtio_ring.c | 16 ++++++++++++++++
> >  include/linux/virtio.h       |  2 ++
> >  3 files changed, 32 insertions(+), 5 deletions(-)
> >
> > --
> > 2.30.0.617.g56c4b15f3c-goog
>


* Re: [PATCH net v2 0/2] virtio-net: suppress bad irq warning for tx napi
  2021-02-23 17:37   ` Wei Wang
@ 2021-02-23 19:13     ` Willem de Bruijn
  0 siblings, 0 replies; 6+ messages in thread
From: Willem de Bruijn @ 2021-02-23 19:13 UTC (permalink / raw)
  To: Wei Wang
  Cc: Michael S. Tsirkin, Jason Wang, David S . Miller, Jakub Kicinski,
	virtualization, Linux Kernel Network Developers

On Tue, Feb 23, 2021 at 12:37 PM Wei Wang <weiwan@google.com> wrote:
>
> On Tue, Feb 23, 2021 at 6:26 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Feb 19, 2021 at 05:44:34PM -0800, Wei Wang wrote:
> > > With the implementation of napi-tx in virtio driver, we clean tx
> > > descriptors from rx napi handler, for the purpose of reducing tx
> > > complete interrupts. But this could introduce a race where tx complete
> > > interrupt has been raised, but the handler found there is no work to do
> > > because we have done the work in the previous rx interrupt handler.
> > > This could lead to the following warning msg:
> > > [ 3588.010778] irq 38: nobody cared (try booting with the
> > > "irqpoll" option)
> > > [ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted
> > > 5.3.0-19-generic #20~18.04.2-Ubuntu
> > > [ 3588.017940] Call Trace:
> > > [ 3588.017942]  <IRQ>
> > > [ 3588.017951]  dump_stack+0x63/0x85
> > > [ 3588.017953]  __report_bad_irq+0x35/0xc0
> > > [ 3588.017955]  note_interrupt+0x24b/0x2a0
> > > [ 3588.017956]  handle_irq_event_percpu+0x54/0x80
> > > [ 3588.017957]  handle_irq_event+0x3b/0x60
> > > [ 3588.017958]  handle_edge_irq+0x83/0x1a0
> > > [ 3588.017961]  handle_irq+0x20/0x30
> > > [ 3588.017964]  do_IRQ+0x50/0xe0
> > > [ 3588.017966]  common_interrupt+0xf/0xf
> > > [ 3588.017966]  </IRQ>
> > > [ 3588.017989] handlers:
> > > [ 3588.020374] [<000000001b9f1da8>] vring_interrupt
> > > [ 3588.025099] Disabling IRQ #38
> > >
> > > This patch series contains 2 patches. The first one adds a new param to
> > > struct vring_virtqueue to control if we want to suppress the bad irq
> > > warning. And the second patch in virtio-net sets it for tx virtqueues if
> > > napi-tx is enabled.
> >
> > I'm going to be busy until March, I think there should be a better
> > way to fix this though. Will think about it and respond in about a week.
> >
> OK. Thanks.

Yes, thanks for helping to think about a solution.

The warning went unreported for years. I'm a bit hesitant to make
actual datapath changes to suppress it if those carry a higher risk of
regressions for some workloads.

Unless they show a clear improvement, which may very well be possible
given the high spurious interrupt rate.

But the odd thing is that by virtue of the interrupt getting masked
once the warning hits, it may actually be difficult to improve on the
efficiency today.

As you pointed out, just probabilistically throttling how often to
steal work from the rx interrupt handler would be another low risk
approach to reduce the incidence rate.
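
One way to picture the throttling idea is the user-space sketch below. It
is only an illustration under assumptions: struct steal_throttle,
should_steal_tx_work() and the xorshift32 generator are hypothetical names
chosen here, not existing virtio-net code, and a kernel implementation
would rely on the kernel's own random helpers instead. The rx napi handler
would consult should_steal_tx_work() before cleaning tx descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue throttle state; not a field of any kernel struct. */
struct steal_throttle {
	uint32_t rng_state;   /* xorshift32 state, must stay non-zero */
	uint32_t denominator; /* steal on roughly 1 in `denominator` polls; 0 = never */
};

static void steal_throttle_init(struct steal_throttle *t, uint32_t seed,
				uint32_t denominator)
{
	t->rng_state = seed ? seed : 1u; /* xorshift32 cannot start from zero */
	t->denominator = denominator;
}

/* One xorshift32 step; a cheap stand-in for a kernel random helper. */
static uint32_t steal_throttle_next(struct steal_throttle *t)
{
	uint32_t x = t->rng_state;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	t->rng_state = x;
	return x;
}

/* Decide whether this rx poll should also clean tx descriptors. */
static bool should_steal_tx_work(struct steal_throttle *t)
{
	if (t->denominator == 0)
		return false;
	return steal_throttle_next(t) % t->denominator == 0;
}

/* Count how many of `polls` rx polls would steal tx work. */
static unsigned int count_steals(uint32_t denominator, unsigned int polls)
{
	struct steal_throttle t;
	unsigned int steals = 0;

	steal_throttle_init(&t, 2463534242u, denominator);
	for (unsigned int i = 0; i < polls; i++)
		if (should_steal_tx_work(&t))
			steals++;

	return steals;
}
```

With a denominator of 1 every rx poll steals tx work, which matches the
current behaviour; larger values trade fewer spurious tx interrupts
against later tx cleanup.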

Anyway, definitely no rush. This went unreported for a long time.

