netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely
@ 2019-06-21 21:26 Guilherme G. Piccoli
  2019-06-23  5:13 ` [EXT] " Sudarsana Reddy Kalluru
  0 siblings, 1 reply; 4+ messages in thread
From: Guilherme G. Piccoli @ 2019-06-21 21:26 UTC (permalink / raw)
  To: GR-everest-linux-l2, netdev; +Cc: aelior, skalluru, gpiccoli, jay.vosburgh

Currently bnx2x ptp worker tries to read a register with timestamp
information in case of TX packet timestamping and in case it fails,
the routine reschedules itself indefinitely. This was reported as a
kworker always at 100% of CPU usage, which was narrowed down to be
bnx2x ptp_task.

By following the ioctl handler, we could narrow down the problem to an
NTP tool (chrony) requesting HW timestamping from bnx2x NIC with RX
filter zeroed; this isn't reproducible for example with linuxptp since
this tool request a supported RX filter. It seems the NIC HW timestamp
mechanism cannot work well with RX_FILTER_NONE - in driver's PTP filter
initialization routine, when there's not a supported filter request the
function does not perform a specific register write to the adapter.

This patch addresses the problem of the everlasting reschedule of the
ptp worker by limiting that to 3 attempts (the first one plus two
reschedules), in order to prevent the unbound resource consumption
from the driver. It's not correct behavior for a driver to not take
into account potential problems in a routine reading a device register,
be it an invalid RX filter (leading to a non-functional HW clock) or
even a potential device FW issue causing the register value to be wrong,
hence we believe the fix is relevant to ensure proper driver behavior.

This has no functional change in the succeeding path of the HW
timestamping code in the driver, only portion of code it changes
is the error path for TX timestamping. It was tested using both
linuxptp and chrony.

Reported-and-tested-by: Przemyslaw Hausman <przemyslaw.hausman@canonical.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h    |  1 +
 .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c    |  1 +
 .../net/ethernet/broadcom/bnx2x/bnx2x_main.c   | 18 +++++++++++++-----
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 6026b53137aa..349965135227 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -1838,6 +1838,7 @@ struct bnx2x {
 	bool timecounter_init_done;
 	struct sk_buff *ptp_tx_skb;
 	unsigned long ptp_tx_start;
+	u8 ptp_retry_count;
 	bool hwtstamp_ioctl_called;
 	u16 tx_type;
 	u16 rx_filter;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 008ad0ca89ba..990ec049f357 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3865,6 +3865,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			/* schedule check for Tx timestamp */
 			bp->ptp_tx_skb = skb_get(skb);
 			bp->ptp_tx_start = jiffies;
+			bp->ptp_retry_count = 0;
 			schedule_work(&bp->ptp_task);
 		}
 	}
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 03ac10b1cd1e..872ae672faaa 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -15233,16 +15233,24 @@ static void bnx2x_ptp_task(struct work_struct *work)
 		memset(&shhwtstamps, 0, sizeof(shhwtstamps));
 		shhwtstamps.hwtstamp = ns_to_ktime(ns);
 		skb_tstamp_tx(bp->ptp_tx_skb, &shhwtstamps);
-		dev_kfree_skb_any(bp->ptp_tx_skb);
-		bp->ptp_tx_skb = NULL;
-
 		DP(BNX2X_MSG_PTP, "Tx timestamp, timestamp cycles = %llu, ns = %llu\n",
 		   timestamp, ns);
+		goto clear;
 	} else {
 		DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp yet\n");
-		/* Reschedule to keep checking for a valid timestamp value */
-		schedule_work(&bp->ptp_task);
+		/* Reschedule twice to check again for a valid timestamp */
+		if (++bp->ptp_retry_count < 3) {
+			schedule_work(&bp->ptp_task);
+			return;
+		}
+		DP(BNX2X_MSG_PTP, "Gave up Tx timestamp, register read %u\n", val_seq);
+		netdev_warn_once(bp->dev,
+				 "Gave up Tx timestamp, register read %u\n", val_seq);
 	}
+clear:
+	dev_kfree_skb_any(bp->ptp_tx_skb);
+	bp->ptp_tx_skb = NULL;
+	bp->ptp_retry_count = 0;
 }
 
 void bnx2x_set_rx_ts(struct bnx2x *bp, struct sk_buff *skb)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* RE: [EXT] [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely
  2019-06-21 21:26 [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely Guilherme G. Piccoli
@ 2019-06-23  5:13 ` Sudarsana Reddy Kalluru
  2019-06-24  0:50   ` Guilherme Piccoli
  0 siblings, 1 reply; 4+ messages in thread
From: Sudarsana Reddy Kalluru @ 2019-06-23  5:13 UTC (permalink / raw)
  To: Guilherme G. Piccoli, GR-everest-linux-l2, netdev
  Cc: Ariel Elior, jay.vosburgh


> -----Original Message-----
> From: Guilherme G. Piccoli <gpiccoli@canonical.com>
> Sent: Saturday, June 22, 2019 2:57 AM
> To: GR-everest-linux-l2 <GR-everest-linux-l2@marvell.com>;
> netdev@vger.kernel.org
> Cc: Ariel Elior <aelior@marvell.com>; Sudarsana Reddy Kalluru
> <skalluru@marvell.com>; gpiccoli@canonical.com;
> jay.vosburgh@canonical.com
> Subject: [EXT] [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely
> 
> External Email
> 
> ----------------------------------------------------------------------
> Currently bnx2x ptp worker tries to read a register with timestamp
> information in case of TX packet timestamping and in case it fails, the routine
> reschedules itself indefinitely. This was reported as a kworker always at 100%
> of CPU usage, which was narrowed down to be bnx2x ptp_task.
> 
> By following the ioctl handler, we could narrow down the problem to an NTP
> tool (chrony) requesting HW timestamping from bnx2x NIC with RX filter
> zeroed; this isn't reproducible for example with linuxptp since this tool
> request a supported RX filter. It seems the NIC HW timestamp mechanism
> cannot work well with RX_FILTER_NONE - in driver's PTP filter initialization
> routine, when there's not a supported filter request the function does not
> perform a specific register write to the adapter.
> 
> This patch addresses the problem of the everlasting reschedule of the ptp
> worker by limiting that to 3 attempts (the first one plus two reschedules), in
> order to prevent the unbound resource consumption from the driver. It's not
> correct behavior for a driver to not take into account potential problems in a
> routine reading a device register, be it an invalid RX filter (leading to a non-
> functional HW clock) or even a potential device FW issue causing the register
> value to be wrong, hence we believe the fix is relevant to ensure proper
> driver behavior.
> 
> This has no functional change in the succeeding path of the HW
> timestamping code in the driver, only portion of code it changes is the error
> path for TX timestamping. It was tested using both linuxptp and chrony.
> 
> Reported-and-tested-by: Przemyslaw Hausman
> <przemyslaw.hausman@canonical.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x.h    |  1 +
>  .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c    |  1 +
>  .../net/ethernet/broadcom/bnx2x/bnx2x_main.c   | 18 +++++++++++++-----
>  3 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> index 6026b53137aa..349965135227 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> @@ -1838,6 +1838,7 @@ struct bnx2x {
>  	bool timecounter_init_done;
>  	struct sk_buff *ptp_tx_skb;
>  	unsigned long ptp_tx_start;
> +	u8 ptp_retry_count;
>  	bool hwtstamp_ioctl_called;
>  	u16 tx_type;
>  	u16 rx_filter;
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> index 008ad0ca89ba..990ec049f357 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> @@ -3865,6 +3865,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb,
> struct net_device *dev)
>  			/* schedule check for Tx timestamp */
>  			bp->ptp_tx_skb = skb_get(skb);
>  			bp->ptp_tx_start = jiffies;
> +			bp->ptp_retry_count = 0;
>  			schedule_work(&bp->ptp_task);
>  		}
>  	}
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index 03ac10b1cd1e..872ae672faaa 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -15233,16 +15233,24 @@ static void bnx2x_ptp_task(struct work_struct
> *work)
>  		memset(&shhwtstamps, 0, sizeof(shhwtstamps));
>  		shhwtstamps.hwtstamp = ns_to_ktime(ns);
>  		skb_tstamp_tx(bp->ptp_tx_skb, &shhwtstamps);
> -		dev_kfree_skb_any(bp->ptp_tx_skb);
> -		bp->ptp_tx_skb = NULL;
> -
>  		DP(BNX2X_MSG_PTP, "Tx timestamp, timestamp cycles =
> %llu, ns = %llu\n",
>  		   timestamp, ns);
> +		goto clear;
>  	} else {
>  		DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp
> yet\n");
> -		/* Reschedule to keep checking for a valid timestamp value
> */
> -		schedule_work(&bp->ptp_task);
> +		/* Reschedule twice to check again for a valid timestamp */
> +		if (++bp->ptp_retry_count < 3) {
> +			schedule_work(&bp->ptp_task);
> +			return;
> +		}
> +		DP(BNX2X_MSG_PTP, "Gave up Tx timestamp, register read
> %u\n", val_seq);
> +		netdev_warn_once(bp->dev,
> +				 "Gave up Tx timestamp, register read %u\n",
> val_seq);
>  	}
> +clear:
> +	dev_kfree_skb_any(bp->ptp_tx_skb);
> +	bp->ptp_tx_skb = NULL;
> +	bp->ptp_retry_count = 0;
>  }
> 
>  void bnx2x_set_rx_ts(struct bnx2x *bp, struct sk_buff *skb)
> --
> 2.21.0

Thanks for uncovering this issue, and the fix.
With the proposed fix, if HW latches the timestamp after 3 iterations then it would lead to erroneous PTP functionality. When driver receives the next PTP packet, driver sees that timestamp is available in the register and would assign this value (which corresponds to last/skipped ptp packet) to the PTP packet. In general, FW takes around 1ms to 100ms to perform the timestamp and may take more. Hence the promising solution for this issue would be to wait for timestamp for particular period of time. If Tx timestamp is not available for a pre-determined max period, then declare it as an error scenario and terminate the thread.

Just for the reference, similar fix was added for Marvell qede driver recently. Patch details are,
9adebac37e7d: (qede: Handle infinite driver spinning for Tx timestamp.)
https://www.spinics.net/lists/netdev/msg572838.html


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [EXT] [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely
  2019-06-23  5:13 ` [EXT] " Sudarsana Reddy Kalluru
@ 2019-06-24  0:50   ` Guilherme Piccoli
  2019-06-24 22:27     ` Guilherme Piccoli
  0 siblings, 1 reply; 4+ messages in thread
From: Guilherme Piccoli @ 2019-06-24  0:50 UTC (permalink / raw)
  To: Sudarsana Reddy Kalluru
  Cc: GR-everest-linux-l2, netdev, Ariel Elior, jay.vosburgh

On Sun, Jun 23, 2019 at 2:13 AM Sudarsana Reddy Kalluru
<skalluru@marvell.com> wrote:
>
> Thanks for uncovering this issue, and the fix.
> With the proposed fix, if HW latches the timestamp after 3 iterations then it would lead to erroneous PTP functionality. When driver receives the next PTP packet, driver sees that timestamp is available in the register and would assign this value (which corresponds to last/skipped ptp packet) to the PTP packet. In general, FW takes around 1ms to 100ms to perform the timestamp and may take more. Hence the promising solution for this issue would be to wait for timestamp for particular period of time. If Tx timestamp is not available for a pre-determined max period, then declare it as an error scenario and terminate the thread.
>
> Just for the reference, similar fix was added for Marvell qede driver recently. Patch details are,
> 9adebac37e7d: (qede: Handle infinite driver spinning for Tx timestamp.)
> https://www.spinics.net/lists/netdev/msg572838.html
>

Thanks a lot for the quick review and good suggestion Sudarsana!
I'll rework the fix based on your reference, and resubmit.

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [EXT] [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely
  2019-06-24  0:50   ` Guilherme Piccoli
@ 2019-06-24 22:27     ` Guilherme Piccoli
  0 siblings, 0 replies; 4+ messages in thread
From: Guilherme Piccoli @ 2019-06-24 22:27 UTC (permalink / raw)
  To: Sudarsana Reddy Kalluru
  Cc: GR-everest-linux-l2, netdev, Ariel Elior, jay.vosburgh

V2 submitted here: https://marc.info/?l=linux-netdev&m=156141504615972

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-06-24 22:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-21 21:26 [PATCH] bnx2x: Prevent ptp_task to be rescheduled indefinitely Guilherme G. Piccoli
2019-06-23  5:13 ` [EXT] " Sudarsana Reddy Kalluru
2019-06-24  0:50   ` Guilherme Piccoli
2019-06-24 22:27     ` Guilherme Piccoli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).