From: Michael Baum <michaelba@nvidia.com>
To: <dev@dpdk.org>
Cc: Matan Azrad <matan@nvidia.com>, Akhil Goyal <gakhil@marvell.com>,
	"Thomas Monjalon" <thomas@monjalon.net>
Subject: [PATCH v3 8/8] compress/mlx5: add support for LZ4 algorithm
Date: Tue, 21 Feb 2023 09:07:56 +0200
Message-ID: <20230221070756.3070819-9-michaelba@nvidia.com>
In-Reply-To: <20230221070756.3070819-1-michaelba@nvidia.com>

Add LZ4 decompression support to the mlx5 compress PMD.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
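Editor's note: the LZ4 decompress checks added to
mlx5_compress_xform_validate() can be summarized by the standalone sketch
below. It is a simplified model, not the driver code: every type, enum and
macro name in it (struct hca_attr, struct lz4_decomp_xform, the CHKSUM_* and
LZ4_FLAG_* values) is a hypothetical stand-in for the corresponding DPDK/mlx5
definition, and the flag bit positions are arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the DPDK checksum types. */
enum chksum {
	CHKSUM_NONE,
	CHKSUM_CRC32,
	CHKSUM_ADLER32,
	CHKSUM_CRC32_ADLER32,
	CHKSUM_XXHASH32,
};

/* Hypothetical flag bits; the real values come from the compressdev API. */
#define LZ4_FLAG_BLOCK_CHECKSUM     (1u << 0)
#define LZ4_FLAG_BLOCK_INDEPENDENCE (1u << 1)

/* Only the two HCA capability bits that the LZ4 path looks at. */
struct hca_attr {
	bool decomp_lz4_checksum_en;
	bool decomp_lz4_no_checksum_en;
};

struct lz4_decomp_xform {
	enum chksum chksum;
	uint8_t flags;
};

/* Return 0 if the device can run this LZ4 decompress xform, else -ENOTSUP. */
static int
lz4_decomp_validate(const struct lz4_decomp_xform *xform,
		    const struct hca_attr *attr)
{
	assert(xform != NULL && attr != NULL);
	/* At least one LZ4 decompress mode must be enabled at all. */
	if (!attr->decomp_lz4_checksum_en && !attr->decomp_lz4_no_checksum_en)
		return -ENOTSUP;
	/* The block-checksum flag selects which capability bit is needed. */
	if (xform->flags & LZ4_FLAG_BLOCK_CHECKSUM) {
		if (!attr->decomp_lz4_checksum_en)
			return -ENOTSUP;
	} else if (!attr->decomp_lz4_no_checksum_en) {
		return -ENOTSUP;
	}
	/* LZ4 accepts only xxHash32 or no output checksum. */
	if (xform->chksum != CHKSUM_XXHASH32 && xform->chksum != CHKSUM_NONE)
		return -ENOTSUP;
	return 0;
}
```

The point of the model is that the two capability bits are independent: a
device may support blocks with checksum, blocks without checksum, or both,
and the xform flag decides which bit is required.
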
 doc/guides/compressdevs/features/mlx5.ini |  18 ++-
 doc/guides/compressdevs/mlx5.rst          |  44 +++++-
 doc/guides/rel_notes/release_23_03.rst    |   4 +
 drivers/compress/mlx5/mlx5_compress.c     | 157 ++++++++++++++++++----
 4 files changed, 184 insertions(+), 39 deletions(-)
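
Editor's note: this patch replaces the static mlx5_caps[] table with a
per-device table filled at probe time, so that the LZ4 entry is advertised
only when the HCA reports an LZ4 decompress capability. A minimal standalone
sketch of that idea follows. The names (struct cap, enum algo, fill_caps,
count_caps, FF_LZ4_BLOCK_WITH_CHECKSUM) are hypothetical and the entries carry
far fewer fields than rte_compressdev_capabilities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical algorithm IDs; ALGO_END plays the role of the list terminator. */
enum algo { ALGO_END = 0, ALGO_NULL, ALGO_DEFLATE, ALGO_LZ4 };

#define NUM_SUP_ALGO 4 /* three algorithms plus the terminator */
#define FF_LZ4_BLOCK_WITH_CHECKSUM (1u << 0) /* hypothetical feature bit */

struct cap {
	enum algo algo;
	unsigned int flags;
};

/* Fill a terminated capability list according to the LZ4 capability bits. */
static void
fill_caps(struct cap out[NUM_SUP_ALGO], bool lz4_csum_en, bool lz4_no_csum_en)
{
	size_t n = 0;

	out[n++] = (struct cap){ .algo = ALGO_NULL };
	out[n++] = (struct cap){ .algo = ALGO_DEFLATE };
	if (lz4_csum_en || lz4_no_csum_en) {
		out[n] = (struct cap){ .algo = ALGO_LZ4 };
		/* Block checksum is advertised only when the HW supports it. */
		if (lz4_csum_en)
			out[n].flags |= FF_LZ4_BLOCK_WITH_CHECKSUM;
		n++;
	}
	assert(n < NUM_SUP_ALGO);
	/* The terminator always follows the last advertised algorithm. */
	out[n] = (struct cap){ .algo = ALGO_END };
}

/* Count entries the way a consumer walks the list: stop at the terminator. */
static size_t
count_caps(const struct cap *caps)
{
	size_t n = 0;

	while (caps[n].algo != ALGO_END)
		n++;
	return n;
}
```

Because consumers stop at the terminator, a device without LZ4 support simply
exposes a shorter list; the unused slot in the fixed-size array is never read.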

diff --git a/doc/guides/compressdevs/features/mlx5.ini b/doc/guides/compressdevs/features/mlx5.ini
index 891ce47936..28b050144a 100644
--- a/doc/guides/compressdevs/features/mlx5.ini
+++ b/doc/guides/compressdevs/features/mlx5.ini
@@ -4,10 +4,14 @@
 ; Supported features of 'MLX5' compression driver.
 ;
 [Features]
-HW Accelerated = Y
-Deflate        = Y
-Adler32        = Y
-Crc32          = Y
-Adler32&Crc32  = Y
-Fixed          = Y
-Dynamic        = Y
+HW Accelerated         = Y
+Deflate                = Y
+LZ4                    = Y
+Adler32                = Y
+Crc32                  = Y
+Adler32&Crc32          = Y
+xxHash32               = Y
+Fixed                  = Y
+Dynamic                = Y
+LZ4 Block Checksum     = Y
+LZ4 Block Independence = Y
diff --git a/doc/guides/compressdevs/mlx5.rst b/doc/guides/compressdevs/mlx5.rst
index 8bf4423882..751fae7184 100644
--- a/doc/guides/compressdevs/mlx5.rst
+++ b/doc/guides/compressdevs/mlx5.rst
@@ -39,11 +39,27 @@ Features
 
 Compress mlx5 PMD has support for:
 
-Compression/Decompression algorithm:
+- Compression
+- Decompression
+- DMA
 
-* DEFLATE.
+Algorithms
+----------
 
-NULL algorithm for DMA operations.
+NULL algorithm
+~~~~~~~~~~~~~~
+
+The NULL algorithm performs DMA operations.
+It can be invoked through either a compress or a decompress operation.
+
+Shareable transformation.
+
+Checksum generation:
+
+* CRC32, Adler32 and combined checksum.
+
+DEFLATE algorithm
+~~~~~~~~~~~~~~~~~
 
 Huffman code type:
 
@@ -60,11 +76,31 @@ Checksum generation:
 
 * CRC32, Adler32 and combined checksum.
 
+LZ4 algorithm
+~~~~~~~~~~~~~
+
+Support for flags:
+
+* ``RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM``
+* ``RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE``
+
+Window size support:
+
+1KB, 2KB, 4KB, 8KB, 16KB and 32KB.
+
+Shareable transformation.
+
+Checksum generation:
+
+* xxHash-32 checksum.
+
 Limitations
 -----------
 
 * Scatter-Gather, SHA and Stateful are not supported.
 * Non-compressed block is not supported in compress (supported in decompress).
+* Compress operation is not supported by BlueField-3.
+* LZ4 algorithm is not supported by BlueField-2.
 
 Driver options
 --------------
@@ -75,7 +111,7 @@ for an additional list of options shared with other mlx5 drivers.
 - ``log-block-size`` parameter [int]
 
   Log of the Huffman block size in the Deflate algorithm.
-  Values from [4-15]; value x means block size is 2^x.
+  Values from [4-15]; value x means block size is 2\ :sup:`x`.
   The default value is 15.
 
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 49c18617a5..8b9c47fd63 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -183,6 +183,10 @@ New Features
 
   Added LZ4 compression algorithm with xxHash-32 for the checksum.
 
+* **Updated NVIDIA mlx5 compress PMD.**
+
+  Added LZ4 algorithm support for the decompress operation.
+
 * **Updated the eventdev reconfigure logic for service based adapters.**
 
   * eventdev reconfig logic is enhanced to increment the
diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c
index 3ea3447f11..41d9752833 100644
--- a/drivers/compress/mlx5/mlx5_compress.c
+++ b/drivers/compress/mlx5/mlx5_compress.c
@@ -24,6 +24,7 @@
 #define MLX5_COMPRESS_DRIVER_NAME mlx5_compress
 #define MLX5_COMPRESS_MAX_QPS 1024
 #define MLX5_COMP_MAX_WIN_SIZE_CONF 6u
+#define MLX5_COMP_NUM_SUP_ALGO 4
 
 struct mlx5_compress_devarg_params {
 	uint32_t log_block_sz;
@@ -43,6 +44,7 @@ struct mlx5_compress_priv {
 	struct mlx5_common_device *cdev; /* Backend mlx5 device. */
 	struct mlx5_uar uar;
 	struct rte_compressdev_config dev_config;
+	struct rte_compressdev_capabilities caps[MLX5_COMP_NUM_SUP_ALGO];
 	LIST_HEAD(xform_list, mlx5_compress_xform) xform_list;
 	rte_spinlock_t xform_sl;
 	uint32_t log_block_sz;
@@ -70,36 +72,16 @@ static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
 
 int mlx5_compress_logtype;
 
-static const struct rte_compressdev_capabilities mlx5_caps[] = {
-	{
-		.algo = RTE_COMP_ALGO_NULL,
-		.comp_feature_flags = RTE_COMP_FF_ADLER32_CHECKSUM |
-				      RTE_COMP_FF_CRC32_CHECKSUM |
-				      RTE_COMP_FF_CRC32_ADLER32_CHECKSUM |
-				      RTE_COMP_FF_SHAREABLE_PRIV_XFORM,
-	},
-	{
-		.algo = RTE_COMP_ALGO_DEFLATE,
-		.comp_feature_flags = RTE_COMP_FF_ADLER32_CHECKSUM |
-				      RTE_COMP_FF_CRC32_CHECKSUM |
-				      RTE_COMP_FF_CRC32_ADLER32_CHECKSUM |
-				      RTE_COMP_FF_SHAREABLE_PRIV_XFORM |
-				      RTE_COMP_FF_HUFFMAN_FIXED |
-				      RTE_COMP_FF_HUFFMAN_DYNAMIC,
-		.window_size = {.min = 10, .max = 15, .increment = 1},
-	},
-	RTE_COMP_END_OF_CAPABILITIES_LIST()
-};
-
 static void
 mlx5_compress_dev_info_get(struct rte_compressdev *dev,
 			   struct rte_compressdev_info *info)
 {
-	RTE_SET_USED(dev);
-	if (info != NULL) {
+	if (dev != NULL && info != NULL) {
+		struct mlx5_compress_priv *priv = dev->data->dev_private;
+
 		info->max_nb_queue_pairs = MLX5_COMPRESS_MAX_QPS;
 		info->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED;
-		info->capabilities = mlx5_caps;
+		info->capabilities = priv->caps;
 	}
 }
 
@@ -236,6 +218,8 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id,
 	qp_attr.num_of_receive_wqes = 0;
 	qp_attr.num_of_send_wqbbs = RTE_BIT32(log_ops_n);
 	qp_attr.mmo = attr->mmo_compress_qp_en || attr->mmo_dma_qp_en ||
+		      attr->decomp_lz4_checksum_en ||
+		      attr->decomp_lz4_no_checksum_en ||
 		      attr->decomp_deflate_v1_en || attr->decomp_deflate_v2_en;
 	ret = mlx5_devx_qp_create(priv->cdev->ctx, &qp->qp,
 					qp_attr.num_of_send_wqbbs *
@@ -280,7 +264,11 @@ mlx5_compress_xform_validate(const struct rte_comp_xform *xform,
 			return -ENOTSUP;
 		} else if (!attr->mmo_compress_qp_en &&
 			   !attr->mmo_compress_sq_en) {
-			DRV_LOG(ERR, "Not enough capabilities to support compress operation, maybe old FW/OFED version?");
+			DRV_LOG(ERR, "Not enough capabilities to support compress operation.");
+			return -ENOTSUP;
+		}
+		if (xform->compress.algo == RTE_COMP_ALGO_LZ4) {
+			DRV_LOG(ERR, "LZ4 compression is not supported.");
 			return -ENOTSUP;
 		}
 		if (xform->compress.level == RTE_COMP_LEVEL_NONE) {
@@ -291,6 +279,10 @@ mlx5_compress_xform_validate(const struct rte_comp_xform *xform,
 			DRV_LOG(ERR, "SHA is not supported.");
 			return -ENOTSUP;
 		}
+		if (xform->compress.chksum == RTE_COMP_CHECKSUM_XXHASH32) {
+			DRV_LOG(ERR, "xxHash32 checksum isn't supported in compress operation.");
+			return -ENOTSUP;
+		}
 		break;
 	case RTE_COMP_DECOMPRESS:
 		switch (xform->decompress.algo) {
@@ -307,6 +299,44 @@ mlx5_compress_xform_validate(const struct rte_comp_xform *xform,
 				DRV_LOG(ERR, "Not enough capabilities to support decompress DEFLATE algorithm, maybe old FW/OFED version?");
 				return -ENOTSUP;
 			}
+			switch (xform->decompress.chksum) {
+			case RTE_COMP_CHECKSUM_NONE:
+			case RTE_COMP_CHECKSUM_CRC32:
+			case RTE_COMP_CHECKSUM_ADLER32:
+			case RTE_COMP_CHECKSUM_CRC32_ADLER32:
+				break;
+			case RTE_COMP_CHECKSUM_XXHASH32:
+			default:
+				DRV_LOG(ERR, "DEFLATE algorithm doesn't support %u checksum.",
+					xform->decompress.chksum);
+				return -ENOTSUP;
+			}
+			break;
+		case RTE_COMP_ALGO_LZ4:
+			if (!attr->decomp_lz4_no_checksum_en &&
+			    !attr->decomp_lz4_checksum_en) {
+				DRV_LOG(ERR, "Not enough capabilities to support decompress LZ4 algorithm, maybe old FW/OFED version?");
+				return -ENOTSUP;
+			}
+			if (xform->decompress.lz4.flags &
+			    RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM) {
+				if (!attr->decomp_lz4_checksum_en) {
+					DRV_LOG(ERR, "Not enough capabilities to support decompress LZ4 block with checksum param, maybe old FW/OFED version?");
+					return -ENOTSUP;
+				}
+			} else {
+				if (!attr->decomp_lz4_no_checksum_en) {
+					DRV_LOG(ERR, "Not enough capabilities to support decompress LZ4 block without checksum param, maybe old FW/OFED version?");
+					return -ENOTSUP;
+				}
+			}
+			if (xform->decompress.chksum !=
+			    RTE_COMP_CHECKSUM_XXHASH32 &&
+			    xform->decompress.chksum !=
+			    RTE_COMP_CHECKSUM_NONE) {
+				DRV_LOG(ERR, "LZ4 algorithm supports only xxHash32 checksum.");
+				return -ENOTSUP;
+			}
 			break;
 		default:
 			DRV_LOG(ERR, "Algorithm %u is not supported.",
@@ -383,6 +413,27 @@ mlx5_compress_xform_create(struct rte_compressdev *dev,
 		case RTE_COMP_ALGO_DEFLATE:
 			xfrm->opcode += MLX5_OPC_MOD_MMO_DECOMP <<
 							WQE_CSEG_OPC_MOD_OFFSET;
+			xfrm->gga_ctrl1 += WQE_GGA_DECOMP_DEFLATE <<
+						     WQE_GGA_DECOMP_TYPE_OFFSET;
+			break;
+		case RTE_COMP_ALGO_LZ4:
+			xfrm->opcode += MLX5_OPC_MOD_MMO_DECOMP <<
+							WQE_CSEG_OPC_MOD_OFFSET;
+			xfrm->gga_ctrl1 += WQE_GGA_DECOMP_LZ4 <<
+						     WQE_GGA_DECOMP_TYPE_OFFSET;
+			if (xform->decompress.lz4.flags &
+			    RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM)
+				xfrm->gga_ctrl1 +=
+				      MLX5_GGA_DECOMP_LZ4_BLOCK_WITH_CHECKSUM <<
+						   WQE_GGA_DECOMP_PARAMS_OFFSET;
+			else
+				xfrm->gga_ctrl1 +=
+				      MLX5_GGA_DECOMP_LZ4_BLOCK_WITHOUT_CHECKSUM
+						<< WQE_GGA_DECOMP_PARAMS_OFFSET;
+			if (xform->decompress.lz4.flags &
+			    RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE)
+				xfrm->gga_ctrl1 += 1u <<
+					WQE_GGA_DECOMP_BLOCK_INDEPENDENT_OFFSET;
 			break;
 		default:
 			goto err;
@@ -390,7 +441,7 @@ mlx5_compress_xform_create(struct rte_compressdev *dev,
 		xfrm->csum_type = xform->decompress.chksum;
 		break;
 	default:
-		DRV_LOG(ERR, "Algorithm %u is not supported.", xform->type);
+		DRV_LOG(ERR, "Operation %u is not supported.", xform->type);
 		goto err;
 	}
 	DRV_LOG(DEBUG, "New xform: gga ctrl1 = 0x%08X opcode = 0x%08X csum "
@@ -657,6 +708,10 @@ mlx5_compress_dequeue_burst(void *queue_pair, struct rte_comp_op **ops,
 						     ((uint64_t)rte_be_to_cpu_32
 					 (opaq[idx].data[crc32_idx + 1]) << 32);
 				break;
+			case RTE_COMP_CHECKSUM_XXHASH32:
+				op->output_chksum = (uint64_t)rte_be_to_cpu_32
+						    (opaq[idx].v2.xxh32);
+				break;
 			default:
 				break;
 			}
@@ -720,6 +775,49 @@ mlx5_compress_handle_devargs(struct mlx5_kvargs_ctrl *mkvlist,
 	return 0;
 }
 
+static void
+mlx5_compress_fill_caps(struct mlx5_compress_priv *priv,
+			const struct mlx5_hca_attr *attr)
+{
+	struct rte_compressdev_capabilities caps[] = {
+		{
+			.algo = RTE_COMP_ALGO_NULL,
+			.comp_feature_flags = RTE_COMP_FF_ADLER32_CHECKSUM |
+					RTE_COMP_FF_CRC32_CHECKSUM |
+					RTE_COMP_FF_CRC32_ADLER32_CHECKSUM |
+					RTE_COMP_FF_SHAREABLE_PRIV_XFORM,
+		},
+		{
+			.algo = RTE_COMP_ALGO_DEFLATE,
+			.comp_feature_flags = RTE_COMP_FF_ADLER32_CHECKSUM |
+					RTE_COMP_FF_CRC32_CHECKSUM |
+					RTE_COMP_FF_CRC32_ADLER32_CHECKSUM |
+					RTE_COMP_FF_SHAREABLE_PRIV_XFORM |
+					RTE_COMP_FF_HUFFMAN_FIXED |
+					RTE_COMP_FF_HUFFMAN_DYNAMIC,
+			.window_size = {.min = 10, .max = 15, .increment = 1},
+		},
+		{
+			.algo = RTE_COMP_ALGO_LZ4,
+			.comp_feature_flags = RTE_COMP_FF_XXHASH32_CHECKSUM |
+					RTE_COMP_FF_SHAREABLE_PRIV_XFORM |
+					RTE_COMP_FF_LZ4_BLOCK_INDEPENDENCE,
+			.window_size = {.min = 1, .max = 15, .increment = 1},
+		},
+		RTE_COMP_END_OF_CAPABILITIES_LIST()
+	};
+	priv->caps[0] = caps[0];
+	priv->caps[1] = caps[1];
+	if (attr->decomp_lz4_checksum_en || attr->decomp_lz4_no_checksum_en) {
+		priv->caps[2] = caps[2];
+		if (attr->decomp_lz4_checksum_en)
+			priv->caps[2].comp_feature_flags |=
+					RTE_COMP_FF_LZ4_BLOCK_WITH_CHECKSUM;
+		priv->caps[3] = caps[3];
+	} else
+		priv->caps[2] = caps[3];
+}
+
 static int
 mlx5_compress_dev_probe(struct mlx5_common_device *cdev,
 			struct mlx5_kvargs_ctrl *mkvlist)
@@ -740,7 +838,8 @@ mlx5_compress_dev_probe(struct mlx5_common_device *cdev,
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-	if (!attr->decomp_deflate_v1_en && !attr->decomp_deflate_v2_en &&
+	if (!attr->decomp_lz4_checksum_en && !attr->decomp_lz4_no_checksum_en &&
+	    !attr->decomp_deflate_v1_en && !attr->decomp_deflate_v2_en &&
 	    !attr->mmo_decompress_sq_en && !attr->mmo_compress_qp_en &&
 	    !attr->mmo_compress_sq_en && !attr->mmo_dma_qp_en &&
 	    !attr->mmo_dma_sq_en) {
@@ -763,7 +862,8 @@ mlx5_compress_dev_probe(struct mlx5_common_device *cdev,
 	compressdev->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED;
 	priv = compressdev->data->dev_private;
 	priv->log_block_sz = devarg_prms.log_block_sz;
-	if (attr->decomp_deflate_v2_en)
+	if (attr->decomp_deflate_v2_en || attr->decomp_lz4_checksum_en ||
+	    attr->decomp_lz4_no_checksum_en)
 		crc32_opaq_offset = offsetof(union mlx5_gga_compress_opaque,
 					     v2.crc32);
 	else
@@ -773,6 +873,7 @@ mlx5_compress_dev_probe(struct mlx5_common_device *cdev,
 	priv->crc32_opaq_offs = crc32_opaq_offset / 4;
 	priv->cdev = cdev;
 	priv->compressdev = compressdev;
+	mlx5_compress_fill_caps(priv, attr);
 	if (mlx5_devx_uar_prepare(cdev, &priv->uar) != 0) {
 		rte_compressdev_pmd_destroy(priv->compressdev);
 		return -1;
-- 
2.25.1
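
Editor's note: in mlx5_compress_xform_create() the LZ4 case builds gga_ctrl1
by adding shifted field values (decompress type, block checksum parameter,
block independence bit). Addition is equivalent to a bitwise OR here because
the fields occupy disjoint bit ranges. The sketch below illustrates this with
a made-up layout: all offsets and field values are assumptions chosen for the
example, not the real mlx5 PRM definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field layout; the real offsets live in the mlx5 PRM header. */
#define DECOMP_TYPE_OFFSET              8u
#define DECOMP_PARAMS_OFFSET            12u
#define DECOMP_BLOCK_INDEPENDENT_OFFSET 22u

/* Hypothetical field values. */
#define DECOMP_TYPE_LZ4            1u
#define LZ4_BLOCK_WITHOUT_CHECKSUM 1u
#define LZ4_BLOCK_WITH_CHECKSUM    2u

/* Compose the control word for an LZ4 decompress transformation. */
static uint32_t
lz4_ctrl_word(bool block_checksum, bool block_independent)
{
	uint32_t type = DECOMP_TYPE_LZ4 << DECOMP_TYPE_OFFSET;
	uint32_t params = (block_checksum ? LZ4_BLOCK_WITH_CHECKSUM :
					    LZ4_BLOCK_WITHOUT_CHECKSUM)
			  << DECOMP_PARAMS_OFFSET;
	uint32_t indep = block_independent ?
			 1u << DECOMP_BLOCK_INDEPENDENT_OFFSET : 0u;
	uint32_t ctrl = type | params | indep;

	/* Disjoint fields: summing them gives the same word as OR-ing them. */
	assert(ctrl == type + params + indep);
	return ctrl;
}
```

If two fields ever overlapped, addition would carry into a neighbouring field
while OR would not, so the equivalence holds only for a non-overlapping layout.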

