All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
@ 2019-11-06 18:54 Salil Mehta
  2019-11-07 14:12 ` Marc Zyngier
  0 siblings, 1 reply; 3+ messages in thread
From: Salil Mehta @ 2019-11-06 18:54 UTC (permalink / raw)
  To: davem, maz
  Cc: edumazet, salil.mehta, yisen.zhuang, lipeng321, mehta.salil,
	netdev, linux-kernel, linuxarm

This patch fixes the problem of the spin locks, originally
meant for the netpoll path of hns driver, causing deadlock in
the normal NAPI poll path. The issue happened due presence of
the stray leftover spin lock code related to the netpoll,
whose support was earlier removed from the HNS[1], got activated
due to enabling of NET_POLL_CONTROLLER switch.

Earlier background:
The netpoll handling code originally had this bug(as identified
by Marc Zyngier[2]) of wrong spin lock API being used which did
not disable the interrupts and hence could cause locking issues.
i.e. if the lock were first acquired in context to thread like
'ip' util and this lock if ever got later acquired again in
context to the interrupt context like TX/RX (Interrupts could
always pre-empt the lock holding task and acquire the lock again)
and hence could cause deadlock.

Proposed Solution:
1. If the netpoll was enabled in the HNS driver, which is not
   right now, we could have simply used spin_[un]lock_irqsave()
2. But as netpoll is disabled, therefore, it is best to get rid
   of the existing locks and stray code for now. This should
   solve the problem reported by Marc.

@Marc,
Could you please test this patch and confirm if the problem is
fixed at your end?

Many Thanks

[1] https://git.kernel.org/torvalds/c/4bd2c03be7
[2] https://patchwork.ozlabs.org/patch/1189139/

Fixes: 4bd2c03be707 ("net: hns: remove ndo_poll_controller")
Cc: lipeng <lipeng321@huawei.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns/hns_enet.c | 22 +------------------
 1 file changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index a48396dd4ebb..14ab20491fd0 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -943,15 +943,6 @@ static int is_valid_clean_head(struct hnae_ring *ring, int h)
 	return u > c ? (h > c && h <= u) : (h > c || h <= u);
 }
 
-/* netif_tx_lock will turn down the performance, set only when necessary */
-#ifdef CONFIG_NET_POLL_CONTROLLER
-#define NETIF_TX_LOCK(ring) spin_lock(&(ring)->lock)
-#define NETIF_TX_UNLOCK(ring) spin_unlock(&(ring)->lock)
-#else
-#define NETIF_TX_LOCK(ring)
-#define NETIF_TX_UNLOCK(ring)
-#endif
-
 /* reclaim all desc in one budget
  * return error or number of desc left
  */
@@ -965,21 +956,16 @@ static int hns_nic_tx_poll_one(struct hns_nic_ring_data *ring_data,
 	int head;
 	int bytes, pkts;
 
-	NETIF_TX_LOCK(ring);
-
 	head = readl_relaxed(ring->io_base + RCB_REG_HEAD);
 	rmb(); /* make sure head is ready before touch any data */
 
-	if (is_ring_empty(ring) || head == ring->next_to_clean) {
-		NETIF_TX_UNLOCK(ring);
+	if (is_ring_empty(ring) || head == ring->next_to_clean)
 		return 0; /* no data to poll */
-	}
 
 	if (!is_valid_clean_head(ring, head)) {
 		netdev_err(ndev, "wrong head (%d, %d-%d)\n", head,
 			   ring->next_to_use, ring->next_to_clean);
 		ring->stats.io_err_cnt++;
-		NETIF_TX_UNLOCK(ring);
 		return -EIO;
 	}
 
@@ -994,8 +980,6 @@ static int hns_nic_tx_poll_one(struct hns_nic_ring_data *ring_data,
 	ring->stats.tx_pkts += pkts;
 	ring->stats.tx_bytes += bytes;
 
-	NETIF_TX_UNLOCK(ring);
-
 	dev_queue = netdev_get_tx_queue(ndev, ring_data->queue_index);
 	netdev_tx_completed_queue(dev_queue, pkts, bytes);
 
@@ -1055,16 +1039,12 @@ static void hns_nic_tx_clr_all_bufs(struct hns_nic_ring_data *ring_data)
 	int head;
 	int bytes, pkts;
 
-	NETIF_TX_LOCK(ring);
-
 	head = ring->next_to_use; /* ntu :soft setted ring position*/
 	bytes = 0;
 	pkts = 0;
 	while (head != ring->next_to_clean)
 		hns_nic_reclaim_one_desc(ring, &bytes, &pkts);
 
-	NETIF_TX_UNLOCK(ring);
-
 	dev_queue = netdev_get_tx_queue(ndev, ring_data->queue_index);
 	netdev_tx_reset_queue(dev_queue);
 }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net] net: hns: Fix the stray netpoll locks causing  deadlock in NAPI path
  2019-11-06 18:54 [PATCH net] net: hns: Fix the stray netpoll locks causing deadlock in NAPI path Salil Mehta
@ 2019-11-07 14:12 ` Marc Zyngier
  2019-11-07 14:20   ` Salil Mehta
  0 siblings, 1 reply; 3+ messages in thread
From: Marc Zyngier @ 2019-11-07 14:12 UTC (permalink / raw)
  To: Salil Mehta
  Cc: davem, edumazet, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm

Hi Salil,

On 2019-11-06 20:03, Salil Mehta wrote:
> This patch fixes the problem of the spin locks, originally
> meant for the netpoll path of hns driver, causing deadlock in
> the normal NAPI poll path. The issue happened due presence of
> the stray leftover spin lock code related to the netpoll,
> whose support was earlier removed from the HNS[1], got activated
> due to enabling of NET_POLL_CONTROLLER switch.
>
> Earlier background:
> The netpoll handling code originally had this bug(as identified
> by Marc Zyngier[2]) of wrong spin lock API being used which did
> not disable the interrupts and hence could cause locking issues.
> i.e. if the lock were first acquired in context to thread like
> 'ip' util and this lock if ever got later acquired again in
> context to the interrupt context like TX/RX (Interrupts could
> always pre-empt the lock holding task and acquire the lock again)
> and hence could cause deadlock.
>
> Proposed Solution:
> 1. If the netpoll was enabled in the HNS driver, which is not
>    right now, we could have simply used spin_[un]lock_irqsave()
> 2. But as netpoll is disabled, therefore, it is best to get rid
>    of the existing locks and stray code for now. This should
>    solve the problem reported by Marc.
>
> @Marc,
> Could you please test this patch and confirm if the problem is
> fixed at your end?

Yes, this fixes it, although you may want to fully get rid of
the now useless lock:

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c 
b/drivers/net/ethernet/hisilicon/hns/hnae.c
index 6d0457eb4faa..08339278c722 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.c
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
@@ -199,7 +199,6 @@ hnae_init_ring(struct hnae_queue *q, struct 
hnae_ring *ring, int flags)

  	ring->q = q;
  	ring->flags = flags;
-	spin_lock_init(&ring->lock);
  	ring->coal_param = q->handle->coal_param;
  	assert(!ring->desc && !ring->desc_cb && !ring->desc_dma_addr);

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h 
b/drivers/net/ethernet/hisilicon/hns/hnae.h
index e9c67c06bfd2..6ab9458302e1 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.h
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
@@ -274,9 +274,6 @@ struct hnae_ring {
  	/* statistic */
  	struct ring_stats stats;

-	/* ring lock for poll one */
-	spinlock_t lock;
-
  	dma_addr_t desc_dma_addr;
  	u32 buf_size;       /* size for hnae_desc->addr, preset by AE */
  	u16 desc_num;       /* total number of desc */

With that:

Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* RE: [PATCH net] net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
  2019-11-07 14:12 ` Marc Zyngier
@ 2019-11-07 14:20   ` Salil Mehta
  0 siblings, 0 replies; 3+ messages in thread
From: Salil Mehta @ 2019-11-07 14:20 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: davem, edumazet, Zhuangyuzeng (Yisen), lipeng (Y),
	mehta.salil, netdev, linux-kernel, Linuxarm

Hi Marc,

> From: Marc Zyngier [mailto:maz@kernel.org]
> Sent: Thursday, November 7, 2019 2:13 PM
> To: Salil Mehta <salil.mehta@huawei.com>
> 
> Hi Salil,
> 
> On 2019-11-06 20:03, Salil Mehta wrote:
> > This patch fixes the problem of the spin locks, originally
> > meant for the netpoll path of hns driver, causing deadlock in
> > the normal NAPI poll path. The issue happened due presence of
> > the stray leftover spin lock code related to the netpoll,
> > whose support was earlier removed from the HNS[1], got activated
> > due to enabling of NET_POLL_CONTROLLER switch.
> >
> > Earlier background:
> > The netpoll handling code originally had this bug(as identified
> > by Marc Zyngier[2]) of wrong spin lock API being used which did
> > not disable the interrupts and hence could cause locking issues.
> > i.e. if the lock were first acquired in context to thread like
> > 'ip' util and this lock if ever got later acquired again in
> > context to the interrupt context like TX/RX (Interrupts could
> > always pre-empt the lock holding task and acquire the lock again)
> > and hence could cause deadlock.
> >
> > Proposed Solution:
> > 1. If the netpoll was enabled in the HNS driver, which is not
> >    right now, we could have simply used spin_[un]lock_irqsave()
> > 2. But as netpoll is disabled, therefore, it is best to get rid
> >    of the existing locks and stray code for now. This should
> >    solve the problem reported by Marc.
> >
> > @Marc,
> > Could you please test this patch and confirm if the problem is
> > fixed at your end?
> 
> Yes, this fixes it, although you may want to fully get rid of
> the now useless lock:


Oops..missed them. I will fix these as well and float V2 version
of this patch.


> diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c
> b/drivers/net/ethernet/hisilicon/hns/hnae.c
> index 6d0457eb4faa..08339278c722 100644
> --- a/drivers/net/ethernet/hisilicon/hns/hnae.c
> +++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
> @@ -199,7 +199,6 @@ hnae_init_ring(struct hnae_queue *q, struct
> hnae_ring *ring, int flags)
> 
>   	ring->q = q;
>   	ring->flags = flags;
> -	spin_lock_init(&ring->lock);
>   	ring->coal_param = q->handle->coal_param;
>   	assert(!ring->desc && !ring->desc_cb && !ring->desc_dma_addr);
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h
> b/drivers/net/ethernet/hisilicon/hns/hnae.h
> index e9c67c06bfd2..6ab9458302e1 100644
> --- a/drivers/net/ethernet/hisilicon/hns/hnae.h
> +++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
> @@ -274,9 +274,6 @@ struct hnae_ring {
>   	/* statistic */
>   	struct ring_stats stats;
> 
> -	/* ring lock for poll one */
> -	spinlock_t lock;
> -
>   	dma_addr_t desc_dma_addr;
>   	u32 buf_size;       /* size for hnae_desc->addr, preset by AE */
>   	u16 desc_num;       /* total number of desc */
> 
> With that:
> 
> Acked-by: Marc Zyngier <maz@kernel.org>
> Tested-by: Marc Zyngier <maz@kernel.org>


Many thanks for your review and testing.


Best Regards
Salil.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-11-07 14:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-06 18:54 [PATCH net] net: hns: Fix the stray netpoll locks causing deadlock in NAPI path Salil Mehta
2019-11-07 14:12 ` Marc Zyngier
2019-11-07 14:20   ` Salil Mehta

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.