* [PATCH net v5 0/2] mlx5: ptp fifo bugfixes
@ 2023-02-02 17:13 Vadim Fedorenko
  2023-02-02 17:13 ` [PATCH net v5 1/2] mlx5: fix skb leak while fifo resync and push Vadim Fedorenko
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Vadim Fedorenko @ 2023-02-02 17:13 UTC (permalink / raw)
  To: Jakub Kicinski, Vadim Fedorenko, Rahul Rameshbabu, Tariq Toukan,
	Gal Pressman, Saeed Mahameed
  Cc: Vadim Fedorenko, netdev

The simple FIFO implementation for the PTP queue has several bugs which
lead to use-after-free and skb leaks. This series fixes those issues and
adds new checks to the FIFO implementation to catch similar problems in
the future.

v4 -> v5:
  Change check to WARN_ON_ONCE() in mlx5e_skb_fifo_pop()
  Change the check of OOO cqe as Jakub provided corner case
  Move OOO logic into separate function and add counter
v3 -> v4:
  Change pr_err to mlx5_core_err_rl as suggested
  Remove WARN_ONCE on fifo push because has_room() should catch the
  issue
v2 -> v3:
  Rearrange patches order and rephrase commit messages
  Remove counters as Gal confirmed FW bug, use KERN_ERR message instead
  Provide proper budget to napi_consume_skb as Jakub suggested
v1 -> v2:
  Update Fixes tag to proper commit.
  Change debug line to avoid double print of function name

Vadim Fedorenko (2):
  mlx5: fix skb leak while fifo resync and push
  mlx5: fix possible ptp queue fifo use-after-free

 .../net/ethernet/mellanox/mlx5/core/en/ptp.c  | 25 ++++++++++++++++---
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  4 ++-
 .../ethernet/mellanox/mlx5/core/en_stats.c    |  1 +
 .../ethernet/mellanox/mlx5/core/en_stats.h    |  1 +
 4 files changed, 27 insertions(+), 4 deletions(-)

-- 
2.30.2



* [PATCH net v5 1/2] mlx5: fix skb leak while fifo resync and push
  2023-02-02 17:13 [PATCH net v5 0/2] mlx5: ptp fifo bugfixes Vadim Fedorenko
@ 2023-02-02 17:13 ` Vadim Fedorenko
  2023-02-02 17:13 ` [PATCH net v5 2/2] mlx5: fix possible ptp queue fifo use-after-free Vadim Fedorenko
  2023-02-02 18:53 ` [PATCH net v5 0/2] mlx5: ptp fifo bugfixes Saeed Mahameed
  2 siblings, 0 replies; 4+ messages in thread
From: Vadim Fedorenko @ 2023-02-02 17:13 UTC (permalink / raw)
  To: Jakub Kicinski, Vadim Fedorenko, Rahul Rameshbabu, Tariq Toukan,
	Gal Pressman, Saeed Mahameed
  Cc: Vadim Fedorenko, netdev, Tariq Toukan

During the PTP resync operation SKBs were popped from the fifo but were
never freed, neither by napi_consume_skb() nor by dev_kfree_skb_any().
Add a call to napi_consume_skb() to properly free the SKBs.

Another leak was happening because mlx5e_skb_fifo_has_room() had an error
in the check. Comparing free-running counters works well unless C promotes
the types to something wider than the counter. In this case the counters
are u16, but the result of the subtraction is promoted to int, which makes
the check produce a wrong (negative) result when the producer has already
wrapped around but the consumer hasn't yet. An explicit cast to u16 fixes
the issue.
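
For illustration only, the promotion problem described above can be
reproduced in userspace. This is a minimal sketch, not driver code: the
function names and the plain uint16_t parameters are hypothetical
stand-ins for the fifo counters and mask, and it assumes int is wider
than 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the check before the fix. Both operands are
 * promoted to int, so the difference is negative once pc has wrapped
 * past 65535 while cc has not, and the comparison always succeeds.
 */
static bool has_room_buggy(uint16_t pc, uint16_t cc, uint16_t mask)
{
	return (pc - cc) < mask;
}

/*
 * Hypothetical stand-in for the check after the fix. The cast truncates
 * the difference back to 16 bits, which gives the real number of
 * entries in flight modulo 65536.
 */
static bool has_room_fixed(uint16_t pc, uint16_t cc, uint16_t mask)
{
	return (uint16_t)(pc - cc) < mask;
}
```

With pc = 2, cc = 65530 and mask = 7 there are 8 entries in flight. The
uncast difference is -65528, so the first variant reports room; the cast
difference is 8, so the second variant does not.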

Fixes: 58a518948f60 ("net/mlx5e: Add resiliency for PTP TX port timestamp")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c  | 6 ++++--
 drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index 8469e9c38670..b72de2b520ec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -86,7 +86,8 @@ static bool mlx5e_ptp_ts_cqe_drop(struct mlx5e_ptpsq *ptpsq, u16 skb_cc, u16 skb
 	return (ptpsq->ts_cqe_ctr_mask && (skb_cc != skb_id));
 }
 
-static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_cc, u16 skb_id)
+static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_cc,
+					     u16 skb_id, int budget)
 {
 	struct skb_shared_hwtstamps hwts = {};
 	struct sk_buff *skb;
@@ -98,6 +99,7 @@ static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_
 		hwts.hwtstamp = mlx5e_skb_cb_get_hwts(skb)->cqe_hwtstamp;
 		skb_tstamp_tx(skb, &hwts);
 		ptpsq->cq_stats->resync_cqe++;
+		napi_consume_skb(skb, budget);
 		skb_cc = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc);
 	}
 }
@@ -119,7 +121,7 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq,
 	}
 
 	if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_cc, skb_id))
-		mlx5e_ptp_skb_fifo_ts_cqe_resync(ptpsq, skb_cc, skb_id);
+		mlx5e_ptp_skb_fifo_ts_cqe_resync(ptpsq, skb_cc, skb_id, budget);
 
 	skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo);
 	hwtstamp = mlx5e_cqe_ts_to_ns(sq->ptp_cyc2time, sq->clock, get_cqe_ts(cqe));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index c10c6ab2e7bc..d5afad368a69 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -86,7 +86,7 @@ void mlx5e_free_txqsq_descs(struct mlx5e_txqsq *sq);
 static inline bool
 mlx5e_skb_fifo_has_room(struct mlx5e_skb_fifo *fifo)
 {
-	return (*fifo->pc - *fifo->cc) < fifo->mask;
+	return (u16)(*fifo->pc - *fifo->cc) < fifo->mask;
 }
 
 static inline bool
-- 
2.30.2



* [PATCH net v5 2/2] mlx5: fix possible ptp queue fifo use-after-free
  2023-02-02 17:13 [PATCH net v5 0/2] mlx5: ptp fifo bugfixes Vadim Fedorenko
  2023-02-02 17:13 ` [PATCH net v5 1/2] mlx5: fix skb leak while fifo resync and push Vadim Fedorenko
@ 2023-02-02 17:13 ` Vadim Fedorenko
  2023-02-02 18:53 ` [PATCH net v5 0/2] mlx5: ptp fifo bugfixes Saeed Mahameed
  2 siblings, 0 replies; 4+ messages in thread
From: Vadim Fedorenko @ 2023-02-02 17:13 UTC (permalink / raw)
  To: Jakub Kicinski, Vadim Fedorenko, Rahul Rameshbabu, Tariq Toukan,
	Gal Pressman, Saeed Mahameed
  Cc: Vadim Fedorenko, netdev

Fifo indexes are not checked during pop operations, which leads to a
potential use-after-free when popping from an empty queue. Such a case
was possible during the re-sync action. WARN_ON_ONCE covers future cases.
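
As a rough userspace illustration of why an unchecked pop is dangerous,
consider the ring below. Everything in it (struct ring, RING_SIZE,
ring_push, ring_pop) is hypothetical and is not the driver's fifo. Note
also that this sketch refuses to pop from an empty ring, whereas the
patch only adds a WARN_ON_ONCE and still performs the pop.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical size; must be a power of two */

struct ring {
	void *slot[RING_SIZE];
	uint16_t pc;	/* producer counter, free running */
	uint16_t cc;	/* consumer counter, free running */
};

/* Store an item; the caller is expected to have checked for room. */
static void ring_push(struct ring *r, void *item)
{
	r->slot[r->pc++ & (RING_SIZE - 1)] = item;
}

/*
 * Return NULL when the ring is empty. Without the pc == cc test the
 * slot at cc would still hold a pointer that was handed out (and
 * possibly freed) on an earlier pop.
 */
static void *ring_pop(struct ring *r)
{
	if (r->pc == r->cc)
		return NULL;
	return r->slot[r->cc++ & (RING_SIZE - 1)];
}
```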

Out-of-order CQEs were spotted which led to draining of the queue and
use-after-free because of the lack of a fifo pointer check. A special
check and a counter are added to avoid the resync operation if the SKB
could not exist in the fifo because of an OOO CQE (skb_id must be between
the consumer and producer indexes).
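
The "between consumer and producer index" condition can be sketched in
userspace as a modular distance comparison. This is only a sketch under
assumptions: ctr2idx() and the explicit mask parameter are hypothetical
stand-ins for PTP_WQE_CTR2IDX(), and mask is assumed to be a power of two
minus one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: reduce a free-running counter to a ring index. */
static uint16_t ctr2idx(uint16_t val, uint16_t mask)
{
	return val & mask;
}

/*
 * A CQE is treated as out of order when the distance from the consumer
 * index to skb_id, taken modulo the ring size, is not smaller than the
 * distance from the consumer index to the producer index. In that case
 * skb_id lies outside the window of entries currently in the fifo.
 */
static bool cqe_is_ooo(uint16_t skb_id, uint16_t cc, uint16_t pc,
		       uint16_t mask)
{
	uint16_t skb_cc = ctr2idx(cc, mask);
	uint16_t skb_pc = ctr2idx(pc, mask);

	return ctr2idx(skb_id - skb_cc, mask) >=
	       ctr2idx(skb_pc - skb_cc, mask);
}
```

For example, with mask = 0xff, cc = 250 and pc = 260 the window covers
indexes 250..255 and 0..3, so skb_id = 2 is accepted while skb_id = 100
is reported as out of order. An empty fifo (cc == pc) has a window of
size zero, so every skb_id is reported as out of order.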

Fixes: 58a518948f60 ("net/mlx5e: Add resiliency for PTP TX port timestamp")
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/ptp.c  | 19 ++++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  2 ++
 .../ethernet/mellanox/mlx5/core/en_stats.c    |  1 +
 .../ethernet/mellanox/mlx5/core/en_stats.h    |  1 +
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index b72de2b520ec..ae75e230170b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -86,6 +86,17 @@ static bool mlx5e_ptp_ts_cqe_drop(struct mlx5e_ptpsq *ptpsq, u16 skb_cc, u16 skb
 	return (ptpsq->ts_cqe_ctr_mask && (skb_cc != skb_id));
 }
 
+static bool mlx5e_ptp_ts_cqe_ooo(struct mlx5e_ptpsq *ptpsq, u16 skb_id)
+{
+	u16 skb_cc = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc);
+	u16 skb_pc = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_pc);
+
+	if (PTP_WQE_CTR2IDX(skb_id - skb_cc) >= PTP_WQE_CTR2IDX(skb_pc - skb_cc))
+		return true;
+
+	return false;
+}
+
 static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_cc,
 					     u16 skb_id, int budget)
 {
@@ -120,8 +131,14 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq,
 		goto out;
 	}
 
-	if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_cc, skb_id))
+	if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_cc, skb_id)) {
+		if (mlx5e_ptp_ts_cqe_ooo(ptpsq, skb_id)) {
+			/* already handled by a previous resync */
+			ptpsq->cq_stats->ooo_cqe_drop++;
+			return;
+		}
 		mlx5e_ptp_skb_fifo_ts_cqe_resync(ptpsq, skb_cc, skb_id, budget);
+	}
 
 	skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo);
 	hwtstamp = mlx5e_cqe_ts_to_ns(sq->ptp_cyc2time, sq->clock, get_cqe_ts(cqe));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index d5afad368a69..5646f0687f65 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -302,6 +302,8 @@ void mlx5e_skb_fifo_push(struct mlx5e_skb_fifo *fifo, struct sk_buff *skb)
 static inline
 struct sk_buff *mlx5e_skb_fifo_pop(struct mlx5e_skb_fifo *fifo)
 {
+	WARN_ON_ONCE(*fifo->pc == *fifo->cc);
+
 	return *mlx5e_skb_fifo_get(fifo, (*fifo->cc)++);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 6687b8136e44..4478223c1720 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -2138,6 +2138,7 @@ static const struct counter_desc ptp_cq_stats_desc[] = {
 	{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, abort_abs_diff_ns) },
 	{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, resync_cqe) },
 	{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, resync_event) },
+	{ MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, ooo_cqe_drop) },
 };
 
 static const struct counter_desc ptp_rq_stats_desc[] = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 375752d6546d..b77100b60b50 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -461,6 +461,7 @@ struct mlx5e_ptp_cq_stats {
 	u64 abort_abs_diff_ns;
 	u64 resync_cqe;
 	u64 resync_event;
+	u64 ooo_cqe_drop;
 };
 
 struct mlx5e_rep_stats {
-- 
2.30.2



* Re: [PATCH net v5 0/2] mlx5: ptp fifo bugfixes
  2023-02-02 17:13 [PATCH net v5 0/2] mlx5: ptp fifo bugfixes Vadim Fedorenko
  2023-02-02 17:13 ` [PATCH net v5 1/2] mlx5: fix skb leak while fifo resync and push Vadim Fedorenko
  2023-02-02 17:13 ` [PATCH net v5 2/2] mlx5: fix possible ptp queue fifo use-after-free Vadim Fedorenko
@ 2023-02-02 18:53 ` Saeed Mahameed
  2 siblings, 0 replies; 4+ messages in thread
From: Saeed Mahameed @ 2023-02-02 18:53 UTC (permalink / raw)
  To: Vadim Fedorenko
  Cc: Jakub Kicinski, Vadim Fedorenko, Rahul Rameshbabu, Tariq Toukan,
	Gal Pressman, netdev

On 02 Feb 09:13, Vadim Fedorenko wrote:
>Simple FIFO implementation for PTP queue has several bugs which lead to
>use-after-free and skb leaks. This series fixes the issues and adds new
>checks for this FIFO implementation to uncover the same problems in
>future.
>

Thanks Vadim, Applied to net-mlx5.

>v4 -> v5:
>  Change check to WARN_ON_ONCE() in mlx5e_skb_fifo_pop()
>  Change the check of OOO cqe as Jakub provided corner case
>  Move OOO logic into separate function and add counter
>v3 -> v4:
>  Change pr_err to mlx5_core_err_rl per suggest
>  Removed WARN_ONCE on fifo push because has_room() should catch the
>  issue
>v2 -> v3:
>  Rearrange patches order and rephrase commit messages
>  Remove counters as Gal confirmed FW bug, use KERN_ERR message instead
>  Provide proper budget to napi_consume_skb as Jakub suggested
>v1 -> v2:
>  Update Fixes tag to proper commit.
>  Change debug line to avoid double print of function name
>
>Vadim Fedorenko (2):
>  mlx5: fix skb leak while fifo resync and push
>  mlx5: fix possible ptp queue fifo use-after-free
>
> .../net/ethernet/mellanox/mlx5/core/en/ptp.c  | 25 ++++++++++++++++---
> .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  4 ++-
> .../ethernet/mellanox/mlx5/core/en_stats.c    |  1 +
> .../ethernet/mellanox/mlx5/core/en_stats.h    |  1 +
> 4 files changed, 27 insertions(+), 4 deletions(-)
>
>-- 
>2.30.2
>
