From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80795C54EBD for ; Mon, 9 Jan 2023 13:37:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237200AbjAINhA (ORCPT ); Mon, 9 Jan 2023 08:37:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47870 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236710AbjAINgk (ORCPT ); Mon, 9 Jan 2023 08:36:40 -0500 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2089.outbound.protection.outlook.com [40.107.243.89]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 75E023C39B for ; Mon, 9 Jan 2023 05:34:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=N1Rf8kdgX9IcKFFwv7cSAHE6v4sjOmJ47w+xEhgLCP5woZ/wfJQi/pctb3yehNLueifmMZ5kQMRDuM4mA9daQ6Ip0FG2OLiyNkb3gScgsZP4Y9qkR7q+z/2XnvPDWAXGzCTNfHoyDBEDExVM8uwNGU+2EeHkp7/G6iEQUis3oFHqiQ+4/MulVyrHk3MUqFEIiSutXgP49+c2QMwKi8DKrgA/gBi2Gl1gt7to/mOJHKzZIspiTRAdhyYbWg4UqgWv6y8ViRMPN8A85WlaXRjKj0tXS/cBYw+nDZbsSJcFCqcXJDGzUIHGM3hvrE+KzXoBGf2igFcERhXOEDNkxb+09Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=NANmqoZ6T5PXPpRQ4kuvruD3kBK+CEkoor76lSajFvI=; b=YNckRU0Hfnmj+lhdbu1J52otAy4QT8FcE4GFVnFdHHU9Qn5S5IY9rJg28yFWwio6ycafoRrdQPGa+SZrxeJc34DkHu5NhLt3JG7FeiQVgwAFaN+/AQG/mncNqcLj3RJkyKj2WTovbjUbbUo7Vn46IBHAJdPm3Pk2WzpJxYRcTi+yDYiBH+8sL7hITWMyKtRJpwn+qtyoQb7u2SbA2nbt5uOV+GTNq9XyMvvrZlDzLJMACm6vCwGW2h2HPLzAhykel7mMoigbhDcF0mCwCZt2iL9DnDO/lgiOA9Jj56FNU0KT/XF11ganAscVWE6ii4Hkrkm6ttfBuzU6PNlvkWiznw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=NANmqoZ6T5PXPpRQ4kuvruD3kBK+CEkoor76lSajFvI=; b=aWGEgHr7QNSCV+8RMwt20ZVq1/yXg4RvRvBHyzsCwv4QBRIHOCYTAc0vEQYnJbochcBZxDy9jJp5Yyofonmj7DbTuP45gxgZeRMPEMxRYajcLAGtCYABpyPf6J7kI45Xutt2eBv/Tkknwt/j5aFtfFOzc84mYqdF+yhl946oyIOGYBwvmp/htc2TlWDg9Kac4FkG8hqXXNtxQgpW2rQ00OiYl0vRyeix4EeY6byrb32YDShLyqPsyJN9gkdXiGoZxbL+7IYoUI7Elu1kYVg1PnXBdiyS4kTPb/cp8BrvQwIEd9O3InV9HRRhIBqwFsYjJc01FqQ5tSoldoOVMl1bZA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from SJ1PR12MB6075.namprd12.prod.outlook.com (2603:10b6:a03:45e::8) by MN2PR12MB4285.namprd12.prod.outlook.com (2603:10b6:208:1d7::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5986.18; Mon, 9 Jan 2023 13:34:03 +0000 Received: from SJ1PR12MB6075.namprd12.prod.outlook.com ([fe80::6d66:a4bb:4018:3c70]) by SJ1PR12MB6075.namprd12.prod.outlook.com ([fe80::6d66:a4bb:4018:3c70%6]) with mapi id 15.20.5986.018; Mon, 9 Jan 2023 13:34:03 +0000 From: Aurelien Aptel To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org Cc: Ben Ben-Ishay , Aurelien Aptel , aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com Subject: [PATCH v8 24/25] net/mlx5e: NVMEoTCP, data-path for DDP+DDGST offload Date: Mon, 9 Jan 2023 15:31:15 +0200 Message-Id: <20230109133116.20801-25-aaptel@nvidia.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com> References: <20230109133116.20801-1-aaptel@nvidia.com> Content-Transfer-Encoding: 8bit Content-Type: text/plain X-ClientProxiedBy: FR3P281CA0162.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:a2::14) To SJ1PR12MB6075.namprd12.prod.outlook.com (2603:10b6:a03:45e::8) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ1PR12MB6075:EE_|MN2PR12MB4285:EE_ X-MS-Office365-Filtering-Correlation-Id: 6f190d75-f1fc-4b93-aa2d-08daf2462c48 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: tpzuFaQpzppHcU9/XkK7O/UQ59+SNC6lrdsK59KFRPJanJYKz2HmEcEwkkrGic1uOU6hjW5VZfZSSmoP7uJrbPtm3WV0fjqVb9lD/wG2DwG98ylqEhouA6qe1Ff9odMcWK5FDLj8K+7uyK/jSd+CzFuTUiqnrvAUFfr0JYIxzcF1XpX1KMHX+SHY0BC8fLkVo23vmbr5viLZdBjhAJBVc6zQSZ8s4lkhKDIS+xjcCfzePETz3ZLAWUUK8I9X9QF3zuI/lwc7ba1MUh3AwKM4PryUC+/JFKG01dHKNGBcB0TVuoqJ7ZT7+14M3jjOusV1oEmFdrmWEJOEzRxVG+dlUQdiD3NlORb84MfsBD3LvIR4L6RlxXHihQfh2e8fdvLug7FcxR/KKypZzidFsuVb0EBcqaSbb9IV9upiIOhhRztuH4s3zUPbzvlLbliDzJXA0s/iyUwFBE85bo+hfCdjPKi+zZxSd5AVUmUfsvjtUe/6tOeAX5Gx8OuF/4KccXHybbE67Wzvc/C2MwJ1kB3tdbzSj+Q2quybG3PxC3e4apV9YPImw1Ke+48Gw3CWyr9qh3bEFAZqMhwfsCKn+0n07twd48m9Yx4UNU7t02T4xGqJzNrYArRDymdZFMbeOIhUQByTcZxRPILJBiat9Ta0Mg== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ1PR12MB6075.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(4636009)(346002)(366004)(39860400002)(396003)(376002)(136003)(451199015)(8936002)(2906002)(5660300002)(41300700001)(7416002)(66476007)(66556008)(4326008)(8676002)(316002)(66946007)(30864003)(54906003)(186003)(26005)(6512007)(2616005)(1076003)(38100700002)(83380400001)(86362001)(36756003)(107886003)(478600001)(6486002)(6506007)(6666004);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?8MsSKlqHDqLRL+yzJa9SldqDBWHjEGfllQlVg9FOZsB+kFGgSXsJAdvGMNLB?= =?us-ascii?Q?NiHLYAPNm6ZeMu3HcfYKAuQaS1EUEIU+s8ufDzVMKiuK+DoHrbaz6CnCPx7p?= =?us-ascii?Q?fiOXVOvllLi8jXxOwcQXBI+qsEJ7NKepamJmTSrh5Q/bdZ8cbuIzEeOiBR07?= =?us-ascii?Q?iRVfVzemBmLL+jfPMPAr58slC46RhpVSNeeXvsr/MjYxVjQksT94sUsezL1x?= =?us-ascii?Q?u4ya7yru3IECutRhL8GcNLY/qvEEpLkYbA1WQKlxIMqWZ6hMPXvPtIPFEyJb?= =?us-ascii?Q?lEcyWpM03In9hk1mst0XeKQRISFMjbr1baTY9GiGjJ4sO2ArQKVl+tkTjWux?= =?us-ascii?Q?58vuCRkJz+JcnDAmfI+eUh++v+5Lwr3T5NSkVTycE/9Oi/82mSoTPRQ4G7O0?= =?us-ascii?Q?6Y6cf0r6kA4PkI1TiXd0P4eh7LyDIaTOG2uqTQRyrd/On1TKhIc3+hULxXqM?= =?us-ascii?Q?ByRomogAbV6VZfGEXbE35eVY8li/XZgBNDvCxfbkHkjRnIi8ysIr+TNA3yj+?= =?us-ascii?Q?vqVRcWwguhm/+KRUk5E6XtZhwF0NdDd2o3unYp4HboKeUgpUVYle46LhXzO1?= =?us-ascii?Q?wVE0E4iAdIIAjexUyiG30+I63teY8vVQ7p/IGaMUlXdIXl4/QUud7eu/ePRd?= =?us-ascii?Q?CtqSvOHKtdTS6VmYUMiL5ND3hCMMRWIHofwunJHL58f39D6MjyjhWk7mJYWq?= =?us-ascii?Q?Pa8EL5gQbfaXVF2lS/4GeupZH61RqkGbeITIBwqVHS3t0SZpPkOZL+yBaUbi?= =?us-ascii?Q?mhs441Xy4Zs4T/ZMk6IylEORrlW/vv8VCKFV4LZyLtziIXsrIXR3blRAAA21?= =?us-ascii?Q?HBTedSdU/bZouk0C4zqyvlkJ4PbBnHoKAvddmrLDHoJaBn7HccvKAtS9uK9H?= =?us-ascii?Q?fhjzVQK2kKhWFQtZmmEcXJMdsqrD6BRxq0V5vaPd+6nDtiYVetkfzX12xFlI?= =?us-ascii?Q?FrmKAv23Y0TZuhNR6fx5MQR9rWqP5C2Wiw7y9b7qjAsaaByERVzew5JM5R5A?= =?us-ascii?Q?E9s4Jgbg1uByM5QB27AlIZ0Jpk7fc2O1sD/8kaG4rah5dHO4KtuJ42s1NYbb?= =?us-ascii?Q?0BAWJvM8ux1aTrzjTG06kLnXnJNOQc75qUtU7XXd8dtiFJvim3xF6rN/bWjH?= =?us-ascii?Q?5ThQNiZZzdhLVLEQxWtCeEhnZE8/cXNvUcxjLInCNqAboBaJetuopJxt623G?= =?us-ascii?Q?xf+1hmUP8x2KXaUJCe7+dtlozp/EaI1krAo1FaSvD7x3aejRyKAhBLNv8fzK?= =?us-ascii?Q?P2H+21DBqTWV+h7HtzSZDYgDIpa9gPJkWKhP9RQSTe+Pmr5p4xJHP6gE+oV4?= =?us-ascii?Q?47cd/mbddWcO9uJP3O/UXbnJA/N49pZhvSNka/k+rjtswK+WZ6bYz4Qq5/aK?= =?us-ascii?Q?E1/CsQWa136+x4ljGL2KbQ1cPETjAgc5g/H9/sYF2eFDsxX5urQVv1a5hfxn?= =?us-ascii?Q?kMpdmQIlDo5ifg4VgC1b6sI8u8FBdONlpesppNz6Ygammbk9QW/H2W/gSPy6?= =?us-ascii?Q?Vc+FJ1GvrWzUf/90+LDk1xiKbl9dyx647p+MPXDsDtndqc9fIVb7Rbxp5Ugl?= =?us-ascii?Q?Qg09/Ho8ziSijeUqwL14RzMJqIBubJHPLAHKpw5D?= X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 6f190d75-f1fc-4b93-aa2d-08daf2462c48 X-MS-Exchange-CrossTenant-AuthSource: SJ1PR12MB6075.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Jan 2023 13:34:03.3281 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: mB3tYoESrmKzsgEahvQAJElriu1IG+fVJMHHgfVq1gD0QoZgAkFY8b6lbE6rtNJmFXn2HSn7zAS0I79+ToDxKw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN2PR12MB4285 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Ben Ben-Ishay This patch implements the data-path for direct data placement (DDP) and DDGST offloads. NVMEoTCP DDP constructs an SKB from each CQE, while pointing at NVME destination buffers. In turn, this enables the offload, as the NVMe-TCP layer will skip the copy when src == dst. Additionally, this patch adds support for DDGST (CRC32) offload. HW will report DDGST offload only if it has not encountered an error in the received packet. We pass this indication in skb->ulp_crc up the stack to NVMe-TCP to skip computing the DDGST if all corresponding SKBs were verified by HW. This patch also handles context resynchronization requests made by NIC HW. The resync request is passed to the NVMe-TCP layer to be handled at a later point in time. Finally, we also use the skb->ulp_ddp bit to avoid skb_condense. This is critical as every SKB that uses DDP has a hole that fits perfectly with skb_condense's policy, but filling this hole is counter-productive as the data there already resides in its destination buffer. This work has been done on pre-silicon functional simulator, and hence data-path performance numbers are not provided. Signed-off-by: Ben Ben-Ishay Signed-off-by: Boris Pismenny Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack Signed-off-by: Aurelien Aptel Reviewed-by: Tariq Toukan --- .../net/ethernet/mellanox/mlx5/core/Makefile | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + .../ethernet/mellanox/mlx5/core/en/xsk/rx.c | 1 + .../ethernet/mellanox/mlx5/core/en/xsk/rx.h | 1 + .../mlx5/core/en_accel/nvmeotcp_rxtx.c | 316 ++++++++++++++++++ .../mlx5/core/en_accel/nvmeotcp_rxtx.h | 37 ++ .../net/ethernet/mellanox/mlx5/core/en_rx.c | 51 ++- 7 files changed, 395 insertions(+), 14 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 9df9999047d1..9804bd086bf4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -103,7 +103,7 @@ mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/ktls_stats.o \ en_accel/fs_tcp.o en_accel/ktls.o en_accel/ktls_txrx.o \ en_accel/ktls_tx.o en_accel/ktls_rx.o -mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o +mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o en_accel/nvmeotcp_rxtx.o mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \ steering/dr_matcher.o steering/dr_rule.o \ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 217b5480c3aa..dfd0f1aa650c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -628,6 +628,7 @@ struct mlx5e_rq; typedef void (*mlx5e_fp_handle_rx_cqe)(struct mlx5e_rq*, struct mlx5_cqe64*); typedef struct sk_buff * (*mlx5e_fp_skb_from_cqe_mpwrq)(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); typedef struct sk_buff * (*mlx5e_fp_skb_from_cqe)(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c index c91b54d9ff27..03b416989bba 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c @@ -229,6 +229,7 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h index 087c943bd8e9..22e972398d92 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h @@ -13,6 +13,7 @@ int mlx5e_xsk_alloc_rx_wqes_batched(struct mlx5e_rq *rq, u16 ix, int wqe_bulk); int mlx5e_xsk_alloc_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk); struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c new file mode 100644 index 000000000000..4c7dab28ef56 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c @@ -0,0 +1,316 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. + +#include "en_accel/nvmeotcp_rxtx.h" +#include + +#define MLX5E_TC_FLOW_ID_MASK 0x00ffffff +static void nvmeotcp_update_resync(struct mlx5e_nvmeotcp_queue *queue, + struct mlx5e_cqe128 *cqe128) +{ + const struct ulp_ddp_ulp_ops *ulp_ops; + u32 seq; + + seq = be32_to_cpu(cqe128->resync_tcp_sn); + ulp_ops = inet_csk(queue->sk)->icsk_ulp_ddp_ops; + if (ulp_ops && ulp_ops->resync_request) + ulp_ops->resync_request(queue->sk, seq, ULP_DDP_RESYNC_PENDING); +} + +static void mlx5e_nvmeotcp_advance_sgl_iter(struct mlx5e_nvmeotcp_queue *queue) +{ + struct mlx5e_nvmeotcp_queue_entry *nqe = &queue->ccid_table[queue->ccid]; + + queue->ccoff += nqe->sgl[queue->ccsglidx].length; + queue->ccoff_inner = 0; + queue->ccsglidx++; +} + +static inline void +mlx5e_nvmeotcp_add_skb_frag(struct net_device *netdev, struct sk_buff *skb, + struct mlx5e_nvmeotcp_queue *queue, + struct mlx5e_nvmeotcp_queue_entry *nqe, u32 fragsz) +{ + dma_sync_single_for_cpu(&netdev->dev, + nqe->sgl[queue->ccsglidx].offset + queue->ccoff_inner, + fragsz, DMA_FROM_DEVICE); + page_ref_inc(compound_head(sg_page(&nqe->sgl[queue->ccsglidx]))); + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, + sg_page(&nqe->sgl[queue->ccsglidx]), + nqe->sgl[queue->ccsglidx].offset + queue->ccoff_inner, + fragsz, + fragsz); +} + +static inline void +mlx5_nvmeotcp_add_tail_nonlinear(struct mlx5e_nvmeotcp_queue *queue, + struct sk_buff *skb, skb_frag_t *org_frags, + int org_nr_frags, int frag_index) +{ + while (org_nr_frags != frag_index) { + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, + skb_frag_page(&org_frags[frag_index]), + skb_frag_off(&org_frags[frag_index]), + skb_frag_size(&org_frags[frag_index]), + skb_frag_size(&org_frags[frag_index])); + page_ref_inc(skb_frag_page(&org_frags[frag_index])); + frag_index++; + } +} + +static void +mlx5_nvmeotcp_add_tail(struct mlx5e_nvmeotcp_queue *queue, struct sk_buff *skb, + int offset, int len) +{ + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, virt_to_page(skb->data), offset, len, + len); + page_ref_inc(virt_to_page(skb->data)); +} + +static void mlx5_nvmeotcp_trim_nonlinear(struct sk_buff *skb, skb_frag_t *org_frags, + int *frag_index, int remaining) +{ + unsigned int frag_size; + int nr_frags; + + /* skip @remaining bytes in frags */ + *frag_index = 0; + while (remaining) { + frag_size = skb_frag_size(&skb_shinfo(skb)->frags[*frag_index]); + if (frag_size > remaining) { + skb_frag_off_add(&skb_shinfo(skb)->frags[*frag_index], + remaining); + skb_frag_size_sub(&skb_shinfo(skb)->frags[*frag_index], + remaining); + remaining = 0; + } else { + remaining -= frag_size; + skb_frag_unref(skb, *frag_index); + *frag_index += 1; + } + } + + /* save original frags for the tail and unref */ + nr_frags = skb_shinfo(skb)->nr_frags; + memcpy(&org_frags[*frag_index], &skb_shinfo(skb)->frags[*frag_index], + (nr_frags - *frag_index) * sizeof(skb_frag_t)); + while (--nr_frags >= *frag_index) + skb_frag_unref(skb, nr_frags); + + /* remove frags from skb */ + skb_shinfo(skb)->nr_frags = 0; + skb->len -= skb->data_len; + skb->truesize -= skb->data_len; + skb->data_len = 0; +} + +static bool +mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0; + struct net_device *netdev = rq->netdev; + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_nvmeotcp_queue_entry *nqe; + skb_frag_t org_frags[MAX_SKB_FRAGS]; + struct mlx5e_nvmeotcp_queue *queue; + int org_nr_frags, frag_index; + struct mlx5e_cqe128 *cqe128; + u32 queue_id; + + queue_id = (be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK); + queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id); + if (unlikely(!queue)) { + dev_kfree_skb_any(skb); + return false; + } + + cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); + if (cqe_is_nvmeotcp_resync(cqe)) { + nvmeotcp_update_resync(queue, cqe128); + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + /* If a resync occurred in the previous cqe, + * the current cqe.crcvalid bit may not be valid, + * so we will treat it as 0 + */ + if (unlikely(queue->after_resync_cqe) && cqe_is_nvmeotcp_crcvalid(cqe)) { + skb->ulp_crc = 0; + queue->after_resync_cqe = 0; + } else { + if (queue->crc_rx) + skb->ulp_crc = cqe_is_nvmeotcp_crcvalid(cqe); + } + + skb->ulp_ddp = cqe_is_nvmeotcp_zc(cqe); + if (!cqe_is_nvmeotcp_zc(cqe)) { + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + /* cc ddp from cqe */ + ccid = be16_to_cpu(cqe128->ccid); + ccoff = be32_to_cpu(cqe128->ccoff); + cclen = be16_to_cpu(cqe128->cclen); + hlen = be16_to_cpu(cqe128->hlen); + + /* carve a hole in the skb for DDP data */ + org_nr_frags = skb_shinfo(skb)->nr_frags; + mlx5_nvmeotcp_trim_nonlinear(skb, org_frags, &frag_index, cclen); + nqe = &queue->ccid_table[ccid]; + + /* packet starts new ccid? */ + if (queue->ccid != ccid || queue->ccid_gen != nqe->ccid_gen) { + queue->ccid = ccid; + queue->ccoff = 0; + queue->ccoff_inner = 0; + queue->ccsglidx = 0; + queue->ccid_gen = nqe->ccid_gen; + } + + /* skip inside cc until the ccoff in the cqe */ + while (queue->ccoff + queue->ccoff_inner < ccoff) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(off_t, remaining, + ccoff - (queue->ccoff + queue->ccoff_inner)); + + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + /* adjust the skb according to the cqe cc */ + while (to_copy < cclen) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(int, remaining, cclen - to_copy); + + mlx5e_nvmeotcp_add_skb_frag(netdev, skb, queue, nqe, fragsz); + to_copy += fragsz; + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + if (cqe_bcnt > hlen + cclen) { + remaining = cqe_bcnt - hlen - cclen; + mlx5_nvmeotcp_add_tail_nonlinear(queue, skb, org_frags, + org_nr_frags, + frag_index); + } + + mlx5e_nvmeotcp_put_queue(queue); + return true; +} + +static bool +mlx5e_nvmeotcp_rebuild_rx_skb_linear(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0; + struct net_device *netdev = rq->netdev; + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_nvmeotcp_queue_entry *nqe; + struct mlx5e_nvmeotcp_queue *queue; + struct mlx5e_cqe128 *cqe128; + u32 queue_id; + + queue_id = (be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK); + queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id); + if (unlikely(!queue)) { + dev_kfree_skb_any(skb); + return false; + } + + cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); + if (cqe_is_nvmeotcp_resync(cqe)) { + nvmeotcp_update_resync(queue, cqe128); + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + /* If a resync occurred in the previous cqe, + * the current cqe.crcvalid bit may not be valid, + * so we will treat it as 0 + */ + if (unlikely(queue->after_resync_cqe) && cqe_is_nvmeotcp_crcvalid(cqe)) { + skb->ulp_crc = 0; + queue->after_resync_cqe = 0; + } else { + if (queue->crc_rx) + skb->ulp_crc = cqe_is_nvmeotcp_crcvalid(cqe); + } + + skb->ulp_ddp = cqe_is_nvmeotcp_zc(cqe); + if (!cqe_is_nvmeotcp_zc(cqe)) { + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + /* cc ddp from cqe */ + ccid = be16_to_cpu(cqe128->ccid); + ccoff = be32_to_cpu(cqe128->ccoff); + cclen = be16_to_cpu(cqe128->cclen); + hlen = be16_to_cpu(cqe128->hlen); + + /* carve a hole in the skb for DDP data */ + skb_trim(skb, hlen); + nqe = &queue->ccid_table[ccid]; + + /* packet starts new ccid? */ + if (queue->ccid != ccid || queue->ccid_gen != nqe->ccid_gen) { + queue->ccid = ccid; + queue->ccoff = 0; + queue->ccoff_inner = 0; + queue->ccsglidx = 0; + queue->ccid_gen = nqe->ccid_gen; + } + + /* skip inside cc until the ccoff in the cqe */ + while (queue->ccoff + queue->ccoff_inner < ccoff) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(off_t, remaining, + ccoff - (queue->ccoff + queue->ccoff_inner)); + + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + /* adjust the skb according to the cqe cc */ + while (to_copy < cclen) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(int, remaining, cclen - to_copy); + + mlx5e_nvmeotcp_add_skb_frag(netdev, skb, queue, nqe, fragsz); + to_copy += fragsz; + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + if (cqe_bcnt > hlen + cclen) { + remaining = cqe_bcnt - hlen - cclen; + mlx5_nvmeotcp_add_tail(queue, skb, + offset_in_page(skb->data) + + hlen + cclen, remaining); + } + + mlx5e_nvmeotcp_put_queue(queue); + return true; +} + +bool +mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + if (skb->data_len) + return mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(rq, skb, cqe, cqe_bcnt); + else + return mlx5e_nvmeotcp_rebuild_rx_skb_linear(rq, skb, cqe, cqe_bcnt); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h new file mode 100644 index 000000000000..a8ca8a53bac6 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */ +#ifndef __MLX5E_NVMEOTCP_RXTX_H__ +#define __MLX5E_NVMEOTCP_RXTX_H__ + +#ifdef CONFIG_MLX5_EN_NVMEOTCP + +#include +#include "en_accel/nvmeotcp.h" + +bool +mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt); + +static inline int mlx5_nvmeotcp_get_headlen(struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + struct mlx5e_cqe128 *cqe128; + + if (!cqe_is_nvmeotcp_zc(cqe)) + return cqe_bcnt; + + cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); + return be16_to_cpu(cqe128->hlen); +} + +#else + +static inline bool +mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ return true; } + +static inline int mlx5_nvmeotcp_get_headlen(struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ return cqe_bcnt; } + +#endif /* CONFIG_MLX5_EN_NVMEOTCP */ +#endif /* __MLX5E_NVMEOTCP_RXTX_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 1b3660d05350..d6dd611c1ebf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -53,7 +53,7 @@ #include "en_accel/macsec.h" #include "en_accel/ipsec_rxtx.h" #include "en_accel/ktls_txrx.h" -#include "en_accel/nvmeotcp.h" +#include "en_accel/nvmeotcp_rxtx.h" #include "en/xdp.h" #include "en/xsk/rx.h" #include "en/health.h" @@ -63,9 +63,11 @@ static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe); static void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe); @@ -1484,7 +1486,7 @@ static inline void mlx5e_handle_csum(struct net_device *netdev, #define MLX5E_CE_BIT_MASK 0x80 -static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, +static inline bool mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, u32 cqe_bcnt, struct mlx5e_rq *rq, struct sk_buff *skb) @@ -1495,6 +1497,13 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, skb->mac_len = ETH_HLEN; + if (IS_ENABLED(CONFIG_MLX5_EN_NVMEOTCP) && cqe_is_nvmeotcp(cqe)) { + bool ret = mlx5e_nvmeotcp_rebuild_rx_skb(rq, skb, cqe, cqe_bcnt); + + if (unlikely(!ret)) + return ret; + } + if (unlikely(get_cqe_tls_offload(cqe))) mlx5e_ktls_handle_rx_skb(rq, skb, cqe, &cqe_bcnt); @@ -1540,6 +1549,8 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, if (unlikely(mlx5e_skb_is_multicast(skb))) stats->mcast_packets++; + + return true; } static void mlx5e_shampo_complete_rx_cqe(struct mlx5e_rq *rq, @@ -1563,7 +1574,7 @@ static void mlx5e_shampo_complete_rx_cqe(struct mlx5e_rq *rq, } } -static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq, +static inline bool mlx5e_complete_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, u32 cqe_bcnt, struct sk_buff *skb) @@ -1572,7 +1583,7 @@ static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq, stats->packets++; stats->bytes += cqe_bcnt; - mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb); + return mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb); } static inline @@ -1810,7 +1821,8 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) goto free_wqe; } - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto free_wqe; if (mlx5e_cqe_regb_chain(cqe)) if (!mlx5e_tc_update_skb(cqe, skb)) { @@ -1863,7 +1875,8 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) goto free_wqe; } - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto free_wqe; if (rep->vlan && skb_vlan_tag_present(skb)) skb_vlan_pop(skb); @@ -1910,11 +1923,12 @@ static void mlx5e_handle_rx_cqe_mpwrq_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 skb = INDIRECT_CALL_2(rq->mpwqe.skb_from_cqe_mpwrq, mlx5e_skb_from_cqe_mpwrq_linear, mlx5e_skb_from_cqe_mpwrq_nonlinear, - rq, wi, cqe_bcnt, head_offset, page_idx); + rq, wi, cqe, cqe_bcnt, head_offset, page_idx); if (!skb) goto mpwrq_cqe_out; - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto mpwrq_cqe_out; mlx5e_rep_tc_receive(cqe, rq, skb); @@ -1959,12 +1973,18 @@ mlx5e_fill_skb_data(struct sk_buff *skb, struct mlx5e_rq *rq, } } +static inline u16 mlx5e_get_headlen_hint(struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + return min_t(u32, MLX5E_RX_MAX_HEAD, mlx5_nvmeotcp_get_headlen(cqe, cqe_bcnt)); +} + static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) { union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; - u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt); + u16 headlen = mlx5e_get_headlen_hint(cqe, cqe_bcnt); u32 frag_offset = head_offset + headlen; u32 byte_cnt = cqe_bcnt - headlen; union mlx5e_alloc_unit *head_au = au; @@ -2000,6 +2020,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) { union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; @@ -2195,7 +2216,8 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq if (likely(head_size)) *skb = mlx5e_skb_from_cqe_shampo(rq, wi, cqe, header_index); else - *skb = mlx5e_skb_from_cqe_mpwrq_nonlinear(rq, wi, cqe_bcnt, data_offset, + *skb = mlx5e_skb_from_cqe_mpwrq_nonlinear(rq, wi, cqe, + cqe_bcnt, data_offset, page_idx); if (unlikely(!*skb)) goto free_hd_entry; @@ -2270,11 +2292,12 @@ static void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cq mlx5e_skb_from_cqe_mpwrq_linear, mlx5e_skb_from_cqe_mpwrq_nonlinear, mlx5e_xsk_skb_from_cqe_mpwrq_linear, - rq, wi, cqe_bcnt, head_offset, page_idx); + rq, wi, cqe, cqe_bcnt, head_offset, page_idx); if (!skb) goto mpwrq_cqe_out; - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto mpwrq_cqe_out; if (mlx5e_cqe_regb_chain(cqe)) if (!mlx5e_tc_update_skb(cqe, skb)) { @@ -2611,7 +2634,9 @@ static void mlx5e_trap_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe if (!skb) goto free_wqe; - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto free_wqe; + skb_push(skb, ETH_HLEN); dl_port = mlx5e_devlink_get_dl_port(priv); -- 2.31.1