* [PATCH net-next 0/6] ConnectX-4 driver update 2015-07-23
From: Amir Vadai @ 2015-07-23 20:35 UTC
  To: David S. Miller; +Cc: netdev, Amir Vadai, Achiad Shochat, Or Gerlitz, Tal Alon

Hi Dave,

This patchset introduces some performance enhancements to the ConnectX-4 driver:
1. Improve RSS distribution, and make the RSS hash function controllable
   using ethtool.
2. Allocate memory that is written by the NIC and read by the host CPU on
   the NUMA node local to the processing CPU.
3. Support tx copybreak.
4. Use the hardware "blue flame" feature to save DMA reads when possible.

Another patch by Achiad fixes some cosmetic issues in the driver.

The patchset was applied and tested on top of commit 045a0fa ("ip_tunnel:
Call ip_tunnel_core_init() from inet_init()")

Thanks,
Amir

Achiad Shochat (4):
  net/mlx5e: Support TX packet copy into WQE
  net/mlx5e: TX latency optimization to save DMA reads
  net/mlx5e: Cosmetics: use BIT() instead of "1 <<", and others
  net/mlx5e: Input IPSEC.SPI into the RX RSS hash function

Saeed Mahameed (2):
  net/mlx5e: Support ETH_RSS_HASH_XOR
  net/mlx5e: Allocate DMA coherent memory on reader NUMA node

 drivers/net/ethernet/mellanox/mlx5/core/alloc.c    |  48 +++-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  47 ++--
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  92 ++++++++
 .../ethernet/mellanox/mlx5/core/en_flow_table.c    | 258 ++++++++++++++-------
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 134 ++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    |  34 ++-
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |  32 ++-
 drivers/net/ethernet/mellanox/mlx5/core/uar.c      |   6 +
 drivers/net/ethernet/mellanox/mlx5/core/wq.c       |  12 +-
 drivers/net/ethernet/mellanox/mlx5/core/wq.h       |   3 +-
 include/linux/mlx5/driver.h                        |  12 +-
 include/linux/mlx5/mlx5_ifc.h                      |   6 +-
 12 files changed, 535 insertions(+), 149 deletions(-)

-- 
2.4.3.413.ga5fe668

* [PATCH net-next 1/6] net/mlx5e: Support ETH_RSS_HASH_XOR
From: Amir Vadai @ 2015-07-23 20:35 UTC
  To: David S. Miller
  Cc: netdev, Amir Vadai, Achiad Shochat, Or Gerlitz, Tal Alon, Saeed Mahameed

From: Saeed Mahameed <saeedm@mellanox.com>

The ConnectX-4 HW implements an inverted XOR8 hash function. To make
it act as plain XOR, we re-order the HW RSS indirection table
accordingly.

Set XOR as the default RSS hash function and add an ethtool API to
control it.
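
For illustration, the re-ordering amounts to programming each table entry
at the bit-reversed index of its position. A minimal user-space sketch
mirroring the mlx5e_bits_invert() helper added below (the worked values are
derived from that code):

static int bits_invert(unsigned int i, int log_sz)
{
        int inv = 0;
        int b;

        /* reverse the low log_sz bits of i */
        for (b = 0; b < log_sz; b++)
                inv |= ((i >> (log_sz - b - 1)) & 1) << b;

        return inv;
}

/* for log_sz = 3: 0->0, 1->4, 2->2, 3->6, 4->1, 5->5, 6->3, 7->7 */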

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  1 +
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   | 39 ++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 46 +++++++++++++++++-----
 include/linux/mlx5/mlx5_ifc.h                      |  6 +--
 4 files changed, 79 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 3d23bd6..61d8433 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -195,6 +195,7 @@ struct mlx5e_params {
 	u16 rx_hash_log_tbl_sz;
 	bool lro_en;
 	u32 lro_wqe_sz;
+	u8  rss_hfunc;
 };
 
 enum {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 3889384..cb28535 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -662,6 +662,43 @@ out:
 	return err;
 }
 
+static int mlx5e_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
+			  u8 *hfunc)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+
+	if (hfunc)
+		*hfunc = priv->params.rss_hfunc;
+
+	return 0;
+}
+
+static int mlx5e_set_rxfh(struct net_device *netdev, const u32 *indir,
+			  const u8 *key, const u8 hfunc)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	int err = 0;
+
+	if (hfunc == ETH_RSS_HASH_NO_CHANGE)
+		return 0;
+
+	if ((hfunc != ETH_RSS_HASH_XOR) &&
+	    (hfunc != ETH_RSS_HASH_TOP))
+		return -EINVAL;
+
+	mutex_lock(&priv->state_lock);
+
+	priv->params.rss_hfunc = hfunc;
+	if (test_bit(MLX5E_STATE_OPENED, &priv->state)) {
+		mlx5e_close_locked(priv->netdev);
+		err = mlx5e_open_locked(priv->netdev);
+	}
+
+	mutex_unlock(&priv->state_lock);
+
+	return err;
+}
+
 const struct ethtool_ops mlx5e_ethtool_ops = {
 	.get_drvinfo       = mlx5e_get_drvinfo,
 	.get_link          = ethtool_op_get_link,
@@ -676,4 +713,6 @@ const struct ethtool_ops mlx5e_ethtool_ops = {
 	.set_coalesce      = mlx5e_set_coalesce,
 	.get_settings      = mlx5e_get_settings,
 	.set_settings      = mlx5e_set_settings,
+	.get_rxfh          = mlx5e_get_rxfh,
+	.set_rxfh          = mlx5e_set_rxfh,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 40206da..07d3627 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1158,6 +1158,24 @@ static void mlx5e_close_tises(struct mlx5e_priv *priv)
 		mlx5e_close_tis(priv, tc);
 }
 
+static int mlx5e_rx_hash_fn(int hfunc)
+{
+	return (hfunc == ETH_RSS_HASH_TOP) ?
+	       MLX5_RX_HASH_FN_TOEPLITZ :
+	       MLX5_RX_HASH_FN_INVERTED_XOR8;
+}
+
+static int mlx5e_bits_invert(unsigned long a, int size)
+{
+	int inv = 0;
+	int i;
+
+	for (i = 0; i < size; i++)
+		inv |= (test_bit(size - i - 1, &a) ? 1 : 0) << i;
+
+	return inv;
+}
+
 static int mlx5e_open_rqt(struct mlx5e_priv *priv)
 {
 	struct mlx5_core_dev *mdev = priv->mdev;
@@ -1166,11 +1184,10 @@ static int mlx5e_open_rqt(struct mlx5e_priv *priv)
 	void *rqtc;
 	int inlen;
 	int err;
-	int sz;
+	int log_tbl_sz = priv->params.rx_hash_log_tbl_sz;
+	int sz = 1 << log_tbl_sz;
 	int i;
 
-	sz = 1 << priv->params.rx_hash_log_tbl_sz;
-
 	inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + sizeof(u32) * sz;
 	in = mlx5_vzalloc(inlen);
 	if (!in)
@@ -1182,8 +1199,12 @@ static int mlx5e_open_rqt(struct mlx5e_priv *priv)
 	MLX5_SET(rqtc, rqtc, rqt_max_size, sz);
 
 	for (i = 0; i < sz; i++) {
-		int ix = i % priv->params.num_channels;
+		int ix = i;
+
+		if (priv->params.rss_hfunc == ETH_RSS_HASH_XOR)
+			ix = mlx5e_bits_invert(i, log_tbl_sz);
 
+		ix = ix % priv->params.num_channels;
 		MLX5_SET(rqtc, rqtc, rq_num[i], priv->channel[ix]->rq.rqn);
 	}
 
@@ -1254,12 +1275,16 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 		MLX5_SET(tirc, tirc, indirect_table,
 			 priv->rqtn);
 		MLX5_SET(tirc, tirc, rx_hash_fn,
-			 MLX5_TIRC_RX_HASH_FN_HASH_TOEPLITZ);
-		MLX5_SET(tirc, tirc, rx_hash_symmetric, 1);
-		netdev_rss_key_fill(MLX5_ADDR_OF(tirc, tirc,
-						 rx_hash_toeplitz_key),
-				    MLX5_FLD_SZ_BYTES(tirc,
-						      rx_hash_toeplitz_key));
+			 mlx5e_rx_hash_fn(priv->params.rss_hfunc));
+		if (priv->params.rss_hfunc == ETH_RSS_HASH_TOP) {
+			void *rss_key = MLX5_ADDR_OF(tirc, tirc,
+						     rx_hash_toeplitz_key);
+			size_t len = MLX5_FLD_SZ_BYTES(tirc,
+						       rx_hash_toeplitz_key);
+
+			MLX5_SET(tirc, tirc, rx_hash_symmetric, 1);
+			netdev_rss_key_fill(rss_key, len);
+		}
 		break;
 	}
 
@@ -1700,6 +1725,7 @@ static void mlx5e_build_netdev_priv(struct mlx5_core_dev *mdev,
 		MLX5E_PARAMS_DEFAULT_RX_HASH_LOG_TBL_SZ;
 	priv->params.num_tc                = 1;
 	priv->params.default_vlan_prio     = 0;
+	priv->params.rss_hfunc             = ETH_RSS_HASH_XOR;
 
 	priv->params.lro_en = false && !!MLX5_CAP_ETH(priv->mdev, lro_cap);
 	priv->params.lro_wqe_sz            =
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 6d2f6fe..c60a62b 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1936,9 +1936,9 @@ enum {
 };
 
 enum {
-	MLX5_TIRC_RX_HASH_FN_HASH_NONE           = 0x0,
-	MLX5_TIRC_RX_HASH_FN_HASH_INVERTED_XOR8  = 0x1,
-	MLX5_TIRC_RX_HASH_FN_HASH_TOEPLITZ       = 0x2,
+	MLX5_RX_HASH_FN_NONE           = 0x0,
+	MLX5_RX_HASH_FN_INVERTED_XOR8  = 0x1,
+	MLX5_RX_HASH_FN_TOEPLITZ       = 0x2,
 };
 
 enum {
-- 
2.4.3.413.ga5fe668

* [PATCH net-next 2/6] net/mlx5e: Allocate DMA coherent memory on reader NUMA node
From: Amir Vadai @ 2015-07-23 20:35 UTC
  To: David S. Miller
  Cc: netdev, Amir Vadai, Achiad Shochat, Or Gerlitz, Tal Alon, Saeed Mahameed

From: Saeed Mahameed <saeedm@mellanox.com>

Through affinity hints and XPS, each mlx5e channel is assigned a CPU
core.

Channel DMA coherent memory that is written by the NIC and read by
SW (e.g. the CQ buffer) is allocated on the NUMA node of the CPU
core assigned to the channel.

Channel DMA coherent memory that is written by SW and read by the
NIC (e.g. the SQ/RQ buffers) is allocated on the NUMA node of the
NIC.

The doorbell record (written by SW and read by the NIC) is an
exception: since SW accesses it more frequently, it too is allocated
on the channel CPU's node.
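
The core of the change is a helper that temporarily re-homes the PCI
device onto the requested NUMA node, so that dma_zalloc_coherent() picks
pages from that node. A minimal sketch of the pattern (not the literal
code; node, size and dma_handle are stand-ins, and the mutex is needed
because the device node is global state):

        mutex_lock(&priv->alloc_mutex);
        orig_node = dev_to_node(&dev->pdev->dev);
        set_dev_node(&dev->pdev->dev, node); /* e.g. cpu_to_node(channel cpu) */
        buf = dma_zalloc_coherent(&dev->pdev->dev, size, &dma_handle, GFP_KERNEL);
        set_dev_node(&dev->pdev->dev, orig_node);
        mutex_unlock(&priv->alloc_mutex);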

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/alloc.c   | 48 +++++++++++++++++++----
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 11 ++++--
 drivers/net/ethernet/mellanox/mlx5/core/main.c    |  6 ++-
 drivers/net/ethernet/mellanox/mlx5/core/wq.c      | 12 +++---
 drivers/net/ethernet/mellanox/mlx5/core/wq.h      |  3 +-
 include/linux/mlx5/driver.h                       |  8 ++++
 6 files changed, 70 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index 0715b49..6cb3830 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -45,15 +45,34 @@
  * register it in a memory region at HCA virtual address 0.
  */
 
-int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf)
+static void *mlx5_dma_zalloc_coherent_node(struct mlx5_core_dev *dev,
+					   size_t size, dma_addr_t *dma_handle,
+					   int node)
+{
+	struct mlx5_priv *priv = &dev->priv;
+	int original_node;
+	void *cpu_handle;
+
+	mutex_lock(&priv->alloc_mutex);
+	original_node = dev_to_node(&dev->pdev->dev);
+	set_dev_node(&dev->pdev->dev, node);
+	cpu_handle = dma_zalloc_coherent(&dev->pdev->dev, size,
+					 dma_handle, GFP_KERNEL);
+	set_dev_node(&dev->pdev->dev, original_node);
+	mutex_unlock(&priv->alloc_mutex);
+	return cpu_handle;
+}
+
+int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size,
+			struct mlx5_buf *buf, int node)
 {
 	dma_addr_t t;
 
 	buf->size = size;
 	buf->npages       = 1;
 	buf->page_shift   = (u8)get_order(size) + PAGE_SHIFT;
-	buf->direct.buf   = dma_zalloc_coherent(&dev->pdev->dev,
-						size, &t, GFP_KERNEL);
+	buf->direct.buf   = mlx5_dma_zalloc_coherent_node(dev, size,
+							  &t, node);
 	if (!buf->direct.buf)
 		return -ENOMEM;
 
@@ -66,6 +85,11 @@ int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf)
 
 	return 0;
 }
+
+int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf)
+{
+	return mlx5_buf_alloc_node(dev, size, buf, dev->priv.numa_node);
+}
 EXPORT_SYMBOL_GPL(mlx5_buf_alloc);
 
 void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_buf *buf)
@@ -75,7 +99,8 @@ void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_buf *buf)
 }
 EXPORT_SYMBOL_GPL(mlx5_buf_free);
 
-static struct mlx5_db_pgdir *mlx5_alloc_db_pgdir(struct device *dma_device)
+static struct mlx5_db_pgdir *mlx5_alloc_db_pgdir(struct mlx5_core_dev *dev,
+						 int node)
 {
 	struct mlx5_db_pgdir *pgdir;
 
@@ -84,8 +109,9 @@ static struct mlx5_db_pgdir *mlx5_alloc_db_pgdir(struct device *dma_device)
 		return NULL;
 
 	bitmap_fill(pgdir->bitmap, MLX5_DB_PER_PAGE);
-	pgdir->db_page = dma_alloc_coherent(dma_device, PAGE_SIZE,
-					    &pgdir->db_dma, GFP_KERNEL);
+
+	pgdir->db_page = mlx5_dma_zalloc_coherent_node(dev, PAGE_SIZE,
+						       &pgdir->db_dma, node);
 	if (!pgdir->db_page) {
 		kfree(pgdir);
 		return NULL;
@@ -118,7 +144,7 @@ static int mlx5_alloc_db_from_pgdir(struct mlx5_db_pgdir *pgdir,
 	return 0;
 }
 
-int mlx5_db_alloc(struct mlx5_core_dev *dev, struct mlx5_db *db)
+int mlx5_db_alloc_node(struct mlx5_core_dev *dev, struct mlx5_db *db, int node)
 {
 	struct mlx5_db_pgdir *pgdir;
 	int ret = 0;
@@ -129,7 +155,7 @@ int mlx5_db_alloc(struct mlx5_core_dev *dev, struct mlx5_db *db)
 		if (!mlx5_alloc_db_from_pgdir(pgdir, db))
 			goto out;
 
-	pgdir = mlx5_alloc_db_pgdir(&(dev->pdev->dev));
+	pgdir = mlx5_alloc_db_pgdir(dev, node);
 	if (!pgdir) {
 		ret = -ENOMEM;
 		goto out;
@@ -145,6 +171,12 @@ out:
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(mlx5_db_alloc_node);
+
+int mlx5_db_alloc(struct mlx5_core_dev *dev, struct mlx5_db *db)
+{
+	return mlx5_db_alloc_node(dev, db, dev->priv.numa_node);
+}
 EXPORT_SYMBOL_GPL(mlx5_db_alloc);
 
 void mlx5_db_free(struct mlx5_core_dev *dev, struct mlx5_db *db)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 07d3627..57cc896 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -272,6 +272,8 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
 	int err;
 	int i;
 
+	param->wq.db_numa_node = cpu_to_node(c->cpu);
+
 	err = mlx5_wq_ll_create(mdev, &param->wq, rqc_wq, &rq->wq,
 				&rq->wq_ctrl);
 	if (err)
@@ -502,6 +504,8 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 	if (err)
 		return err;
 
+	param->wq.db_numa_node = cpu_to_node(c->cpu);
+
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq,
 				 &sq->wq_ctrl);
 	if (err)
@@ -702,7 +706,8 @@ static int mlx5e_create_cq(struct mlx5e_channel *c,
 	int err;
 	u32 i;
 
-	param->wq.numa = cpu_to_node(c->cpu);
+	param->wq.buf_numa_node = cpu_to_node(c->cpu);
+	param->wq.db_numa_node  = cpu_to_node(c->cpu);
 	param->eq_ix   = c->ix;
 
 	err = mlx5_cqwq_create(mdev, &param->wq, param->cqc, &cq->wq,
@@ -1000,7 +1005,7 @@ static void mlx5e_build_rq_param(struct mlx5e_priv *priv,
 	MLX5_SET(wq, wq, log_wq_sz,        priv->params.log_rq_size);
 	MLX5_SET(wq, wq, pd,               priv->pdn);
 
-	param->wq.numa   = dev_to_node(&priv->mdev->pdev->dev);
+	param->wq.buf_numa_node = dev_to_node(&priv->mdev->pdev->dev);
 	param->wq.linear = 1;
 }
 
@@ -1014,7 +1019,7 @@ static void mlx5e_build_sq_param(struct mlx5e_priv *priv,
 	MLX5_SET(wq, wq, log_wq_stride, ilog2(MLX5_SEND_WQE_BB));
 	MLX5_SET(wq, wq, pd,            priv->pdn);
 
-	param->wq.numa = dev_to_node(&priv->mdev->pdev->dev);
+	param->wq.buf_numa_node = dev_to_node(&priv->mdev->pdev->dev);
 }
 
 static void mlx5e_build_common_cq_param(struct mlx5e_priv *priv,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index afad529..c34eafb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -455,7 +455,7 @@ static int mlx5_irq_set_affinity_hint(struct mlx5_core_dev *mdev, int i)
 	struct mlx5_priv *priv  = &mdev->priv;
 	struct msix_entry *msix = priv->msix_arr;
 	int irq                 = msix[i + MLX5_EQ_VEC_COMP_BASE].vector;
-	int numa_node           = dev_to_node(&mdev->pdev->dev);
+	int numa_node           = priv->numa_node;
 	int err;
 
 	if (!zalloc_cpumask_var(&priv->irq_info[i].mask, GFP_KERNEL)) {
@@ -668,6 +668,10 @@ static int mlx5_dev_init(struct mlx5_core_dev *dev, struct pci_dev *pdev)
 	INIT_LIST_HEAD(&priv->pgdir_list);
 	spin_lock_init(&priv->mkey_lock);
 
+	mutex_init(&priv->alloc_mutex);
+
+	priv->numa_node = dev_to_node(&dev->pdev->dev);
+
 	priv->dbg_root = debugfs_create_dir(dev_name(&pdev->dev), mlx5_debugfs_root);
 	if (!priv->dbg_root)
 		return -ENOMEM;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.c b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
index 8388411..ce21ee5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
@@ -73,13 +73,14 @@ int mlx5_wq_cyc_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 	wq->log_stride = MLX5_GET(wq, wqc, log_wq_stride);
 	wq->sz_m1 = (1 << MLX5_GET(wq, wqc, log_wq_sz)) - 1;
 
-	err = mlx5_db_alloc(mdev, &wq_ctrl->db);
+	err = mlx5_db_alloc_node(mdev, &wq_ctrl->db, param->db_numa_node);
 	if (err) {
 		mlx5_core_warn(mdev, "mlx5_db_alloc() failed, %d\n", err);
 		return err;
 	}
 
-	err = mlx5_buf_alloc(mdev, mlx5_wq_cyc_get_byte_size(wq), &wq_ctrl->buf);
+	err = mlx5_buf_alloc_node(mdev, mlx5_wq_cyc_get_byte_size(wq),
+				  &wq_ctrl->buf, param->buf_numa_node);
 	if (err) {
 		mlx5_core_warn(mdev, "mlx5_buf_alloc() failed, %d\n", err);
 		goto err_db_free;
@@ -108,13 +109,14 @@ int mlx5_cqwq_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 	wq->log_sz = MLX5_GET(cqc, cqc, log_cq_size);
 	wq->sz_m1 = (1 << wq->log_sz) - 1;
 
-	err = mlx5_db_alloc(mdev, &wq_ctrl->db);
+	err = mlx5_db_alloc_node(mdev, &wq_ctrl->db, param->db_numa_node);
 	if (err) {
 		mlx5_core_warn(mdev, "mlx5_db_alloc() failed, %d\n", err);
 		return err;
 	}
 
-	err = mlx5_buf_alloc(mdev, mlx5_cqwq_get_byte_size(wq), &wq_ctrl->buf);
+	err = mlx5_buf_alloc_node(mdev, mlx5_cqwq_get_byte_size(wq),
+				  &wq_ctrl->buf, param->buf_numa_node);
 	if (err) {
 		mlx5_core_warn(mdev, "mlx5_buf_alloc() failed, %d\n", err);
 		goto err_db_free;
@@ -144,7 +146,7 @@ int mlx5_wq_ll_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 	wq->log_stride = MLX5_GET(wq, wqc, log_wq_stride);
 	wq->sz_m1 = (1 << MLX5_GET(wq, wqc, log_wq_sz)) - 1;
 
-	err = mlx5_db_alloc(mdev, &wq_ctrl->db);
+	err = mlx5_db_alloc_node(mdev, &wq_ctrl->db, param->db_numa_node);
 	if (err) {
 		mlx5_core_warn(mdev, "mlx5_db_alloc() failed, %d\n", err);
 		return err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index e0ddd69..6c2a8f9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -37,7 +37,8 @@
 
 struct mlx5_wq_param {
 	int		linear;
-	int		numa;
+	int		buf_numa_node;
+	int		db_numa_node;
 };
 
 struct mlx5_wq_ctrl {
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 5722d88..1c0d5d0 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -463,6 +463,10 @@ struct mlx5_priv {
 	/* end: mr staff */
 
 	/* start: alloc staff */
+	/* protect buffer allocation according to numa node */
+	struct mutex            alloc_mutex;
+	int                     numa_node;
+
 	struct mutex            pgdir_mutex;
 	struct list_head        pgdir_list;
 	/* end: alloc staff */
@@ -672,6 +676,8 @@ void mlx5_health_cleanup(void);
 void  __init mlx5_health_init(void);
 void mlx5_start_health_poll(struct mlx5_core_dev *dev);
 void mlx5_stop_health_poll(struct mlx5_core_dev *dev);
+int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size,
+			struct mlx5_buf *buf, int node);
 int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf);
 void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_buf *buf);
 struct mlx5_cmd_mailbox *mlx5_alloc_cmd_mailbox_chain(struct mlx5_core_dev *dev,
@@ -773,6 +779,8 @@ void mlx5_eq_debugfs_cleanup(struct mlx5_core_dev *dev);
 int mlx5_cq_debugfs_init(struct mlx5_core_dev *dev);
 void mlx5_cq_debugfs_cleanup(struct mlx5_core_dev *dev);
 int mlx5_db_alloc(struct mlx5_core_dev *dev, struct mlx5_db *db);
+int mlx5_db_alloc_node(struct mlx5_core_dev *dev, struct mlx5_db *db,
+		       int node);
 void mlx5_db_free(struct mlx5_core_dev *dev, struct mlx5_db *db);
 
 const char *mlx5_command_str(int command);
-- 
2.4.3.413.ga5fe668

* [PATCH net-next 3/6] net/mlx5e: Support TX packet copy into WQE
From: Amir Vadai @ 2015-07-23 20:35 UTC
  To: David S. Miller; +Cc: netdev, Amir Vadai, Achiad Shochat, Or Gerlitz, Tal Alon

From: Achiad Shochat <achiad@mellanox.com>

AKA inline WQE: a TX latency optimization that saves the data-gather
DMA reads by copying the packet (or at least its headers) into the
WQE itself.
Controlled by the ETHTOOL_TX_COPYBREAK tunable.
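
Once applied, the threshold is exposed through the standard ethtool
tunables interface and can be adjusted from user space, e.g. (assuming an
ethtool build with tunable support; eth0 is a placeholder):

        ethtool --set-tunable eth0 tx-copybreak 256
        ethtool --get-tunable eth0 tx-copybreak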

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  2 +
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   | 53 ++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 13 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    | 10 +++-
 4 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 61d8433..d9dc506 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -196,6 +196,7 @@ struct mlx5e_params {
 	bool lro_en;
 	u32 lro_wqe_sz;
 	u8  rss_hfunc;
+	u16 tx_max_inline;
 };
 
 enum {
@@ -520,3 +521,4 @@ static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
 }
 
 extern const struct ethtool_ops mlx5e_ethtool_ops;
+u16 mlx5e_get_max_inline_cap(struct mlx5_core_dev *mdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index cb28535..14fd82c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -699,6 +699,57 @@ static int mlx5e_set_rxfh(struct net_device *netdev, const u32 *indir,
 	return err;
 }
 
+static int mlx5e_get_tunable(struct net_device *dev,
+			     const struct ethtool_tunable *tuna,
+			     void *data)
+{
+	const struct mlx5e_priv *priv = netdev_priv(dev);
+	int err = 0;
+
+	switch (tuna->id) {
+	case ETHTOOL_TX_COPYBREAK:
+		*(u32 *)data = priv->params.tx_max_inline;
+		break;
+	default:
+		err = -EINVAL;
+		break;
+	}
+
+	return err;
+}
+
+static int mlx5e_set_tunable(struct net_device *dev,
+			     const struct ethtool_tunable *tuna,
+			     const void *data)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct mlx5e_params new_params;
+	u32 val;
+	int err = 0;
+
+	switch (tuna->id) {
+	case ETHTOOL_TX_COPYBREAK:
+		val = *(u32 *)data;
+		if (val > mlx5e_get_max_inline_cap(mdev)) {
+			err = -EINVAL;
+			break;
+		}
+
+		mutex_lock(&priv->state_lock);
+		new_params = priv->params;
+		new_params.tx_max_inline = val;
+		err = mlx5e_update_priv_params(priv, &new_params);
+		mutex_unlock(&priv->state_lock);
+		break;
+	default:
+		err = -EINVAL;
+		break;
+	}
+
+	return err;
+}
+
 const struct ethtool_ops mlx5e_ethtool_ops = {
 	.get_drvinfo       = mlx5e_get_drvinfo,
 	.get_link          = ethtool_op_get_link,
@@ -715,4 +766,6 @@ const struct ethtool_ops mlx5e_ethtool_ops = {
 	.set_settings      = mlx5e_set_settings,
 	.get_rxfh          = mlx5e_get_rxfh,
 	.set_rxfh          = mlx5e_set_rxfh,
+	.get_tunable       = mlx5e_get_tunable,
+	.set_tunable       = mlx5e_set_tunable,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 57cc896..c55fad4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -41,6 +41,7 @@ struct mlx5e_rq_param {
 struct mlx5e_sq_param {
 	u32                        sqc[MLX5_ST_SZ_DW(sqc)];
 	struct mlx5_wq_param       wq;
+	u16                        max_inline;
 };
 
 struct mlx5e_cq_param {
@@ -514,6 +515,7 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 	sq->wq.db       = &sq->wq.db[MLX5_SND_DBR];
 	sq->uar_map     = sq->uar.map;
 	sq->bf_buf_size = (1 << MLX5_CAP_GEN(mdev, log_bf_reg_size)) / 2;
+	sq->max_inline  = param->max_inline;
 
 	err = mlx5e_alloc_sq_db(sq, cpu_to_node(c->cpu));
 	if (err)
@@ -1020,6 +1022,7 @@ static void mlx5e_build_sq_param(struct mlx5e_priv *priv,
 	MLX5_SET(wq, wq, pd,            priv->pdn);
 
 	param->wq.buf_numa_node = dev_to_node(&priv->mdev->pdev->dev);
+	param->max_inline = priv->params.tx_max_inline;
 }
 
 static void mlx5e_build_common_cq_param(struct mlx5e_priv *priv,
@@ -1703,6 +1706,15 @@ static int mlx5e_check_required_hca_cap(struct mlx5_core_dev *mdev)
 	return 0;
 }
 
+u16 mlx5e_get_max_inline_cap(struct mlx5_core_dev *mdev)
+{
+	int bf_buf_size = (1 << MLX5_CAP_GEN(mdev, log_bf_reg_size)) / 2;
+
+	return bf_buf_size -
+	       sizeof(struct mlx5e_tx_wqe) +
+	       2 /*sizeof(mlx5e_tx_wqe.inline_hdr_start)*/;
+}
+
 static void mlx5e_build_netdev_priv(struct mlx5_core_dev *mdev,
 				    struct net_device *netdev,
 				    int num_comp_vectors)
@@ -1721,6 +1733,7 @@ static void mlx5e_build_netdev_priv(struct mlx5_core_dev *mdev,
 		MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_USEC;
 	priv->params.tx_cq_moderation_pkts =
 		MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_PKTS;
+	priv->params.tx_max_inline         = mlx5e_get_max_inline_cap(mdev);
 	priv->params.min_rx_wqes           =
 		MLX5E_PARAMS_DEFAULT_MIN_RX_WQES;
 	priv->params.rx_hash_log_tbl_sz    =
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 03f28f4..351ac69 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -112,7 +112,15 @@ u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
 static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq *sq,
 					    struct sk_buff *skb)
 {
-#define MLX5E_MIN_INLINE 16 /* eth header with vlan (w/o next ethertype) */
+	/* Some NIC TX decisions, e.g. loopback, are based on the packet
+	 * headers and occur before the data gather.
+	 * Therefore these headers must be copied into the WQE
+	 */
+#define MLX5E_MIN_INLINE (ETH_HLEN + 2/*vlan tag*/)
+
+	if (skb_headlen(skb) <= sq->max_inline)
+		return skb_headlen(skb);
+
 	return MLX5E_MIN_INLINE;
 }
 
-- 
2.4.3.413.ga5fe668

* [PATCH net-next 4/6] net/mlx5e: TX latency optimization to save DMA reads
From: Amir Vadai @ 2015-07-23 20:35 UTC
  To: David S. Miller; +Cc: netdev, Amir Vadai, Achiad Shochat, Or Gerlitz, Tal Alon

From: Achiad Shochat <achiad@mellanox.com>

A regular TX WQE execution involves two or more DMA reads -
one to fetch the WQE, and another one per WQE gather entry.

These DMA reads obviously increase the TX latency.
There are two mlx5 mechanisms to bypass these DMA reads:
1) Inline WQE
2) Blue Flame (BF)

An inline WQE contains the whole packet, thus saving the DMA read(s)
of the regular WQE's gather entry/entries. Inline WQE support was
added in the previous commit.

A BF WQE is written directly to the device's I/O-mapped memory, thus
saving the DMA read that fetches the WQE.

The BF WQE I/O write must be done at cache-line granularity, and
therefore uses the CPU write-combining mechanism.
A BF WQE I/O write also acts as a TX doorbell, notifying the
device of new TX WQEs.
A BF WQE is written to the same I/O mapped address as the regular TX
doorbell, so this address is mapped twice - once by ioremap()
and once by io_mapping_map_wc().

While both mechanisms reduce the TX latency, they both consume more CPU
cycles than the regular WQE path:
- A BF WQE must still be written to host memory, in addition to being
  written directly to the device I/O mapped memory.
- An inline WQE involves copying the SKB data into it.

To handle this tradeoff, we introduce a heuristic algorithm that
avoids using these two mechanisms when the TX queue is being
back-pressured by the device, and limits their usage rate otherwise.

An inline WQE will always be "Blue Flamed" (written directly to the
device I/O mapped memory) while a BF WQE may not be inlined (may contain
gather entries).

Preliminary testing using netperf UDP_RR shows that the latency goes down
from 17.5us to 16.9us, while the message rate (tested with pktgen) stays
the same.
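
For reference, the heuristic in mlx5e_sq_xmit() boils down to the sketch
below: the BF budget is refreshed only when the SQ consumer counter has
advanced since the last xmit, and only an empty SQ (cc == pc, i.e. no
back-pressure) earns a fresh budget:

        if (sq->cc != sq->prev_cc) {
                sq->prev_cc = sq->cc;
                sq->bf_budget = (sq->cc == sq->pc) ? MLX5E_SQ_BF_BUDGET : 0;
        }
        bf = sq->bf_budget && !skb->xmit_more && !skb_shinfo(skb)->nr_frags;

Note also that __iowrite64_copy() takes its count in 8-byte words, which
is why bf_sz is computed as num_wqebbs << 3 (a 64-byte WQEBB is eight such
words).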

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      | 24 +++++++++++++++------
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 12 ++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   | 26 ++++++++++++++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/main.c    | 26 +++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/uar.c     |  6 ++++++
 include/linux/mlx5/driver.h                       |  4 +++-
 6 files changed, 79 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index d9dc506..b66edd2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -60,6 +60,7 @@
 
 #define MLX5E_TX_CQ_POLL_BUDGET        128
 #define MLX5E_UPDATE_STATS_INTERVAL    200 /* msecs */
+#define MLX5E_SQ_BF_BUDGET             16
 
 static const char vport_strings[][ETH_GSTRING_LEN] = {
 	/* vport statistics */
@@ -268,7 +269,9 @@ struct mlx5e_sq {
 	/* dirtied @xmit */
 	u16                        pc ____cacheline_aligned_in_smp;
 	u32                        dma_fifo_pc;
-	u32                        bf_offset;
+	u16                        bf_offset;
+	u16                        prev_cc;
+	u8                         bf_budget;
 	struct mlx5e_sq_stats      stats;
 
 	struct mlx5e_cq            cq;
@@ -281,9 +284,10 @@ struct mlx5e_sq {
 	struct mlx5_wq_cyc         wq;
 	u32                        dma_fifo_mask;
 	void __iomem              *uar_map;
+	void __iomem              *uar_bf_map;
 	struct netdev_queue       *txq;
 	u32                        sqn;
-	u32                        bf_buf_size;
+	u16                        bf_buf_size;
 	u16                        max_inline;
 	u16                        edge;
 	struct device             *pdev;
@@ -493,8 +497,10 @@ int mlx5e_update_priv_params(struct mlx5e_priv *priv,
 			     struct mlx5e_params *new_params);
 
 static inline void mlx5e_tx_notify_hw(struct mlx5e_sq *sq,
-				      struct mlx5e_tx_wqe *wqe)
+				      struct mlx5e_tx_wqe *wqe, int bf_sz)
 {
+	u16 ofst = MLX5_BF_OFFSET + sq->bf_offset;
+
 	/* ensure wqe is visible to device before updating doorbell record */
 	dma_wmb();
 
@@ -505,9 +511,15 @@ static inline void mlx5e_tx_notify_hw(struct mlx5e_sq *sq,
 	 */
 	wmb();
 
-	mlx5_write64((__be32 *)&wqe->ctrl,
-		     sq->uar_map + MLX5_BF_OFFSET + sq->bf_offset,
-		     NULL);
+	if (bf_sz) {
+		__iowrite64_copy(sq->uar_bf_map + ofst, &wqe->ctrl, bf_sz);
+
+		/* flush the write-combining mapped buffer */
+		wmb();
+
+	} else {
+		mlx5_write64((__be32 *)&wqe->ctrl, sq->uar_map + ofst, NULL);
+	}
 
 	sq->bf_offset ^= sq->bf_buf_size;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index c55fad4..4a87e9d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -514,6 +514,7 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 
 	sq->wq.db       = &sq->wq.db[MLX5_SND_DBR];
 	sq->uar_map     = sq->uar.map;
+	sq->uar_bf_map  = sq->uar.bf_map;
 	sq->bf_buf_size = (1 << MLX5_CAP_GEN(mdev, log_bf_reg_size)) / 2;
 	sq->max_inline  = param->max_inline;
 
@@ -524,11 +525,12 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 	txq_ix = c->ix + tc * priv->params.num_channels;
 	sq->txq = netdev_get_tx_queue(priv->netdev, txq_ix);
 
-	sq->pdev    = c->pdev;
-	sq->mkey_be = c->mkey_be;
-	sq->channel = c;
-	sq->tc      = tc;
-	sq->edge    = (sq->wq.sz_m1 + 1) - MLX5_SEND_WQE_MAX_WQEBBS;
+	sq->pdev      = c->pdev;
+	sq->mkey_be   = c->mkey_be;
+	sq->channel   = c;
+	sq->tc        = tc;
+	sq->edge      = (sq->wq.sz_m1 + 1) - MLX5_SEND_WQE_MAX_WQEBBS;
+	sq->bf_budget = MLX5E_SQ_BF_BUDGET;
 	priv->txq_to_sq_map[txq_ix] = sq;
 
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 351ac69..64380bc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -57,7 +57,7 @@ void mlx5e_send_nop(struct mlx5e_sq *sq, bool notify_hw)
 
 	if (notify_hw) {
 		cseg->fm_ce_se = MLX5_WQE_CTRL_CQ_UPDATE;
-		mlx5e_tx_notify_hw(sq, wqe);
+		mlx5e_tx_notify_hw(sq, wqe, 0);
 	}
 }
 
@@ -110,7 +110,7 @@ u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
 }
 
 static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq *sq,
-					    struct sk_buff *skb)
+					    struct sk_buff *skb, bool bf)
 {
 	/* Some NIC TX decisions, e.g. loopback, are based on the packet
 	 * headers and occur before the data gather.
@@ -118,7 +118,7 @@ static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq *sq,
 	 */
 #define MLX5E_MIN_INLINE (ETH_HLEN + 2/*vlan tag*/)
 
-	if (skb_headlen(skb) <= sq->max_inline)
+	if (bf && (skb_headlen(skb) <= sq->max_inline))
 		return skb_headlen(skb);
 
 	return MLX5E_MIN_INLINE;
@@ -137,6 +137,7 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 
 	u8  opcode = MLX5_OPCODE_SEND;
 	dma_addr_t dma_addr = 0;
+	bool bf = false;
 	u16 headlen;
 	u16 ds_cnt;
 	u16 ihs;
@@ -149,6 +150,11 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 	else
 		sq->stats.csum_offload_none++;
 
+	if (sq->cc != sq->prev_cc) {
+		sq->prev_cc = sq->cc;
+		sq->bf_budget = (sq->cc == sq->pc) ? MLX5E_SQ_BF_BUDGET : 0;
+	}
+
 	if (skb_is_gso(skb)) {
 		u32 payload_len;
 
@@ -161,7 +167,10 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 		sq->stats.tso_packets++;
 		sq->stats.tso_bytes += payload_len;
 	} else {
-		ihs = mlx5e_get_inline_hdr_size(sq, skb);
+		bf = sq->bf_budget &&
+		     !skb->xmit_more &&
+		     !skb_shinfo(skb)->nr_frags;
+		ihs = mlx5e_get_inline_hdr_size(sq, skb, bf);
 		MLX5E_TX_SKB_CB(skb)->num_bytes = max_t(unsigned int, skb->len,
 							ETH_ZLEN);
 	}
@@ -233,14 +242,21 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
 	}
 
 	if (!skb->xmit_more || netif_xmit_stopped(sq->txq)) {
+		int bf_sz = 0;
+
+		if (bf && sq->uar_bf_map)
+			bf_sz = MLX5E_TX_SKB_CB(skb)->num_wqebbs << 3;
+
 		cseg->fm_ce_se = MLX5_WQE_CTRL_CQ_UPDATE;
-		mlx5e_tx_notify_hw(sq, wqe);
+		mlx5e_tx_notify_hw(sq, wqe, bf_sz);
 	}
 
 	/* fill sq edge with nops to avoid wqe wrap around */
 	while ((sq->pc & wq->sz_m1) > sq->edge)
 		mlx5e_send_nop(sq, false);
 
+	sq->bf_budget = bf ? sq->bf_budget - 1 : 0;
+
 	sq->stats.packets++;
 	return NETDEV_TX_OK;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c34eafb..603a8b0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -654,6 +654,22 @@ static int mlx5_core_set_issi(struct mlx5_core_dev *dev)
 }
 #endif
 
+static int map_bf_area(struct mlx5_core_dev *dev)
+{
+	resource_size_t bf_start = pci_resource_start(dev->pdev, 0);
+	resource_size_t bf_len = pci_resource_len(dev->pdev, 0);
+
+	dev->priv.bf_mapping = io_mapping_create_wc(bf_start, bf_len);
+
+	return dev->priv.bf_mapping ? 0 : -ENOMEM;
+}
+
+static void unmap_bf_area(struct mlx5_core_dev *dev)
+{
+	if (dev->priv.bf_mapping)
+		io_mapping_free(dev->priv.bf_mapping);
+}
+
 static int mlx5_dev_init(struct mlx5_core_dev *dev, struct pci_dev *pdev)
 {
 	struct mlx5_priv *priv = &dev->priv;
@@ -808,10 +824,13 @@ static int mlx5_dev_init(struct mlx5_core_dev *dev, struct pci_dev *pdev)
 		goto err_stop_eqs;
 	}
 
+	if (map_bf_area(dev))
+		dev_err(&pdev->dev, "Failed to map blue flame area\n");
+
 	err = mlx5_irq_set_affinity_hints(dev);
 	if (err) {
 		dev_err(&pdev->dev, "Failed to alloc affinity hint cpumask\n");
-		goto err_free_comp_eqs;
+		goto err_unmap_bf_area;
 	}
 
 	MLX5_INIT_DOORBELL_LOCK(&priv->cq_uar_lock);
@@ -823,7 +842,9 @@ static int mlx5_dev_init(struct mlx5_core_dev *dev, struct pci_dev *pdev)
 
 	return 0;
 
-err_free_comp_eqs:
+err_unmap_bf_area:
+	unmap_bf_area(dev);
+
 	free_comp_eqs(dev);
 
 err_stop_eqs:
@@ -881,6 +902,7 @@ static void mlx5_dev_cleanup(struct mlx5_core_dev *dev)
 	mlx5_cleanup_qp_table(dev);
 	mlx5_cleanup_cq_table(dev);
 	mlx5_irq_clear_affinity_hints(dev);
+	unmap_bf_area(dev);
 	free_comp_eqs(dev);
 	mlx5_stop_eqs(dev);
 	mlx5_free_uuars(dev, &priv->uuari);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/uar.c b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
index 9ef8587..eb05c84 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/uar.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
@@ -32,6 +32,7 @@
 
 #include <linux/kernel.h>
 #include <linux/module.h>
+#include <linux/io-mapping.h>
 #include <linux/mlx5/driver.h>
 #include <linux/mlx5/cmd.h>
 #include "mlx5_core.h"
@@ -246,6 +247,10 @@ int mlx5_alloc_map_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar)
 		goto err_free_uar;
 	}
 
+	if (mdev->priv.bf_mapping)
+		uar->bf_map = io_mapping_map_wc(mdev->priv.bf_mapping,
+						uar->index << PAGE_SHIFT);
+
 	return 0;
 
 err_free_uar:
@@ -257,6 +262,7 @@ EXPORT_SYMBOL(mlx5_alloc_map_uar);
 
 void mlx5_unmap_free_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar)
 {
+	io_mapping_unmap(uar->bf_map);
 	iounmap(uar->map);
 	mlx5_cmd_free_uar(mdev, uar->index);
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 1c0d5d0..5fe0cae 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -380,7 +380,7 @@ struct mlx5_uar {
 	u32			index;
 	struct list_head	bf_list;
 	unsigned		free_bf_bmap;
-	void __iomem	       *wc_map;
+	void __iomem	       *bf_map;
 	void __iomem	       *map;
 };
 
@@ -435,6 +435,8 @@ struct mlx5_priv {
 	struct mlx5_uuar_info	uuari;
 	MLX5_DECLARE_DOORBELL_LOCK(cq_uar_lock);
 
+	struct io_mapping	*bf_mapping;
+
 	/* pages stuff */
 	struct workqueue_struct *pg_wq;
 	struct rb_root		page_root;
-- 
2.4.3.413.ga5fe668

* [PATCH net-next 5/6] net/mlx5e: Cosmetics: use BIT() instead of "1 <<", and others
From: Amir Vadai @ 2015-07-23 20:36 UTC
  To: David S. Miller; +Cc: netdev, Amir Vadai, Achiad Shochat, Or Gerlitz, Tal Alon

From: Achiad Shochat <achiad@mellanox.com>

No logical change in this commit.
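
For reference, BIT(n) expands to (1UL << (n)) in include/linux/bitops.h:

        #define BIT(nr) (1UL << (nr))

so e.g. BIT(MLX5E_TT_IPV4) is value-identical to (1 << MLX5E_TT_IPV4) for
these small shift counts.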

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  16 +-
 .../ethernet/mellanox/mlx5/core/en_flow_table.c    | 166 +++++++++++----------
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  20 +--
 3 files changed, 104 insertions(+), 98 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index b66edd2..39294f2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -330,14 +330,14 @@ struct mlx5e_channel {
 };
 
 enum mlx5e_traffic_types {
-	MLX5E_TT_IPV4_TCP = 0,
-	MLX5E_TT_IPV6_TCP = 1,
-	MLX5E_TT_IPV4_UDP = 2,
-	MLX5E_TT_IPV6_UDP = 3,
-	MLX5E_TT_IPV4     = 4,
-	MLX5E_TT_IPV6     = 5,
-	MLX5E_TT_ANY      = 6,
-	MLX5E_NUM_TT      = 7,
+	MLX5E_TT_IPV4_TCP,
+	MLX5E_TT_IPV6_TCP,
+	MLX5E_TT_IPV4_UDP,
+	MLX5E_TT_IPV6_UDP,
+	MLX5E_TT_IPV4,
+	MLX5E_TT_IPV6,
+	MLX5E_TT_ANY,
+	MLX5E_NUM_TT,
 };
 
 enum {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
index 120db80..cca34f6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
@@ -105,25 +105,25 @@ static void mlx5e_del_eth_addr_from_flow_table(struct mlx5e_priv *priv,
 {
 	void *ft = priv->ft.main;
 
-	if (ai->tt_vec & (1 << MLX5E_TT_IPV6_TCP))
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV6_TCP))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_IPV6_TCP]);
 
-	if (ai->tt_vec & (1 << MLX5E_TT_IPV4_TCP))
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV4_TCP))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_IPV4_TCP]);
 
-	if (ai->tt_vec & (1 << MLX5E_TT_IPV6_UDP))
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV6_UDP))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_IPV6_UDP]);
 
-	if (ai->tt_vec & (1 << MLX5E_TT_IPV4_UDP))
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV4_UDP))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_IPV4_UDP]);
 
-	if (ai->tt_vec & (1 << MLX5E_TT_IPV6))
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV6))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_IPV6]);
 
-	if (ai->tt_vec & (1 << MLX5E_TT_IPV4))
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV4))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_IPV4]);
 
-	if (ai->tt_vec & (1 << MLX5E_TT_ANY))
+	if (ai->tt_vec & BIT(MLX5E_TT_ANY))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_ANY]);
 }
 
@@ -156,33 +156,33 @@ static u32 mlx5e_get_tt_vec(struct mlx5e_eth_addr_info *ai, int type)
 		switch (eth_addr_type) {
 		case MLX5E_UC:
 			ret =
-				(1 << MLX5E_TT_IPV4_TCP) |
-				(1 << MLX5E_TT_IPV6_TCP) |
-				(1 << MLX5E_TT_IPV4_UDP) |
-				(1 << MLX5E_TT_IPV6_UDP) |
-				(1 << MLX5E_TT_IPV4)     |
-				(1 << MLX5E_TT_IPV6)     |
-				(1 << MLX5E_TT_ANY)      |
+				BIT(MLX5E_TT_IPV4_TCP)       |
+				BIT(MLX5E_TT_IPV6_TCP)       |
+				BIT(MLX5E_TT_IPV4_UDP)       |
+				BIT(MLX5E_TT_IPV6_UDP)       |
+				BIT(MLX5E_TT_IPV4)           |
+				BIT(MLX5E_TT_IPV6)           |
+				BIT(MLX5E_TT_ANY)            |
 				0;
 			break;
 
 		case MLX5E_MC_IPV4:
 			ret =
-				(1 << MLX5E_TT_IPV4_UDP) |
-				(1 << MLX5E_TT_IPV4)     |
+				BIT(MLX5E_TT_IPV4_UDP)       |
+				BIT(MLX5E_TT_IPV4)           |
 				0;
 			break;
 
 		case MLX5E_MC_IPV6:
 			ret =
-				(1 << MLX5E_TT_IPV6_UDP) |
-				(1 << MLX5E_TT_IPV6)     |
+				BIT(MLX5E_TT_IPV6_UDP)       |
+				BIT(MLX5E_TT_IPV6)           |
 				0;
 			break;
 
 		case MLX5E_MC_OTHER:
 			ret =
-				(1 << MLX5E_TT_ANY)      |
+				BIT(MLX5E_TT_ANY)            |
 				0;
 			break;
 		}
@@ -191,23 +191,23 @@ static u32 mlx5e_get_tt_vec(struct mlx5e_eth_addr_info *ai, int type)
 
 	case MLX5E_ALLMULTI:
 		ret =
-			(1 << MLX5E_TT_IPV4_UDP) |
-			(1 << MLX5E_TT_IPV6_UDP) |
-			(1 << MLX5E_TT_IPV4)     |
-			(1 << MLX5E_TT_IPV6)     |
-			(1 << MLX5E_TT_ANY)      |
+			BIT(MLX5E_TT_IPV4_UDP) |
+			BIT(MLX5E_TT_IPV6_UDP) |
+			BIT(MLX5E_TT_IPV4)     |
+			BIT(MLX5E_TT_IPV6)     |
+			BIT(MLX5E_TT_ANY)      |
 			0;
 		break;
 
 	default: /* MLX5E_PROMISC */
 		ret =
-			(1 << MLX5E_TT_IPV4_TCP) |
-			(1 << MLX5E_TT_IPV6_TCP) |
-			(1 << MLX5E_TT_IPV4_UDP) |
-			(1 << MLX5E_TT_IPV6_UDP) |
-			(1 << MLX5E_TT_IPV4)     |
-			(1 << MLX5E_TT_IPV6)     |
-			(1 << MLX5E_TT_ANY)      |
+			BIT(MLX5E_TT_IPV4_TCP)       |
+			BIT(MLX5E_TT_IPV6_TCP)       |
+			BIT(MLX5E_TT_IPV4_UDP)       |
+			BIT(MLX5E_TT_IPV6_UDP)       |
+			BIT(MLX5E_TT_IPV4)           |
+			BIT(MLX5E_TT_IPV6)           |
+			BIT(MLX5E_TT_ANY)            |
 			0;
 		break;
 	}
@@ -226,6 +226,7 @@ static int __mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
 	u8   *match_criteria_dmac;
 	void *ft   = priv->ft.main;
 	u32  *tirn = priv->tirn;
+	u32  *ft_ix;
 	u32  tt_vec;
 	int  err;
 
@@ -261,51 +262,51 @@ static int __mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
 
 	tt_vec = mlx5e_get_tt_vec(ai, type);
 
-	if (tt_vec & (1 << MLX5E_TT_ANY)) {
+	ft_ix = &ai->ft_ix[MLX5E_TT_ANY];
+	if (tt_vec & BIT(MLX5E_TT_ANY)) {
 		MLX5_SET(dest_format_struct, dest, destination_id,
 			 tirn[MLX5E_TT_ANY]);
 		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
 						match_criteria, flow_context,
-						&ai->ft_ix[MLX5E_TT_ANY]);
-		if (err) {
-			mlx5e_del_eth_addr_from_flow_table(priv, ai);
-			return err;
-		}
-		ai->tt_vec |= (1 << MLX5E_TT_ANY);
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_ANY);
 	}
 
 	match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
 	MLX5_SET_TO_ONES(fte_match_param, match_criteria,
 			 outer_headers.ethertype);
 
-	if (tt_vec & (1 << MLX5E_TT_IPV4)) {
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV4];
+	if (tt_vec & BIT(MLX5E_TT_IPV4)) {
 		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
 			 ETH_P_IP);
 		MLX5_SET(dest_format_struct, dest, destination_id,
 			 tirn[MLX5E_TT_IPV4]);
 		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
 						match_criteria, flow_context,
-						&ai->ft_ix[MLX5E_TT_IPV4]);
-		if (err) {
-			mlx5e_del_eth_addr_from_flow_table(priv, ai);
-			return err;
-		}
-		ai->tt_vec |= (1 << MLX5E_TT_IPV4);
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV4);
 	}
 
-	if (tt_vec & (1 << MLX5E_TT_IPV6)) {
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV6];
+	if (tt_vec & BIT(MLX5E_TT_IPV6)) {
 		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
 			 ETH_P_IPV6);
 		MLX5_SET(dest_format_struct, dest, destination_id,
 			 tirn[MLX5E_TT_IPV6]);
 		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
 						match_criteria, flow_context,
-						&ai->ft_ix[MLX5E_TT_IPV6]);
-		if (err) {
-			mlx5e_del_eth_addr_from_flow_table(priv, ai);
-			return err;
-		}
-		ai->tt_vec |= (1 << MLX5E_TT_IPV6);
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV6);
 	}
 
 	MLX5_SET_TO_ONES(fte_match_param, match_criteria,
@@ -313,70 +314,75 @@ static int __mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
 	MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
 		 IPPROTO_UDP);
 
-	if (tt_vec & (1 << MLX5E_TT_IPV4_UDP)) {
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_UDP];
+	if (tt_vec & BIT(MLX5E_TT_IPV4_UDP)) {
 		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
 			 ETH_P_IP);
 		MLX5_SET(dest_format_struct, dest, destination_id,
 			 tirn[MLX5E_TT_IPV4_UDP]);
 		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
 						match_criteria, flow_context,
-						&ai->ft_ix[MLX5E_TT_IPV4_UDP]);
-		if (err) {
-			mlx5e_del_eth_addr_from_flow_table(priv, ai);
-			return err;
-		}
-		ai->tt_vec |= (1 << MLX5E_TT_IPV4_UDP);
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV4_UDP);
 	}
 
-	if (tt_vec & (1 << MLX5E_TT_IPV6_UDP)) {
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_UDP];
+	if (tt_vec & BIT(MLX5E_TT_IPV6_UDP)) {
 		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
 			 ETH_P_IPV6);
 		MLX5_SET(dest_format_struct, dest, destination_id,
 			 tirn[MLX5E_TT_IPV6_UDP]);
 		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
 						match_criteria, flow_context,
-						&ai->ft_ix[MLX5E_TT_IPV6_UDP]);
-		if (err) {
-			mlx5e_del_eth_addr_from_flow_table(priv, ai);
-			return err;
-		}
-		ai->tt_vec |= (1 << MLX5E_TT_IPV6_UDP);
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV6_UDP);
 	}
 
 	MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
 		 IPPROTO_TCP);
 
-	if (tt_vec & (1 << MLX5E_TT_IPV4_TCP)) {
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_TCP];
+	if (tt_vec & BIT(MLX5E_TT_IPV4_TCP)) {
 		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
 			 ETH_P_IP);
 		MLX5_SET(dest_format_struct, dest, destination_id,
 			 tirn[MLX5E_TT_IPV4_TCP]);
 		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
 						match_criteria, flow_context,
-						&ai->ft_ix[MLX5E_TT_IPV4_TCP]);
-		if (err) {
-			mlx5e_del_eth_addr_from_flow_table(priv, ai);
-			return err;
-		}
-		ai->tt_vec |= (1 << MLX5E_TT_IPV4_TCP);
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV4_TCP);
 	}
 
-	if (tt_vec & (1 << MLX5E_TT_IPV6_TCP)) {
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_TCP];
+	if (tt_vec & BIT(MLX5E_TT_IPV6_TCP)) {
 		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
 			 ETH_P_IPV6);
 		MLX5_SET(dest_format_struct, dest, destination_id,
 			 tirn[MLX5E_TT_IPV6_TCP]);
 		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
 						match_criteria, flow_context,
-						&ai->ft_ix[MLX5E_TT_IPV6_TCP]);
-		if (err) {
-			mlx5e_del_eth_addr_from_flow_table(priv, ai);
-			return err;
-		}
-		ai->tt_vec |= (1 << MLX5E_TT_IPV6_TCP);
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV6_TCP);
 	}
 
 	return 0;
+
+err_del_ai:
+	mlx5e_del_eth_addr_from_flow_table(priv, ai);
+
+	return err;
 }
 
 static int mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 4a87e9d..8194c32 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1252,13 +1252,13 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 
 #define ROUGH_MAX_L2_L3_HDR_SZ 256
 
-#define MLX5_HASH_IP     (MLX5_HASH_FIELD_SEL_SRC_IP   |\
-			  MLX5_HASH_FIELD_SEL_DST_IP)
+#define MLX5_HASH_IP            (MLX5_HASH_FIELD_SEL_SRC_IP   |\
+				 MLX5_HASH_FIELD_SEL_DST_IP)
 
-#define MLX5_HASH_ALL    (MLX5_HASH_FIELD_SEL_SRC_IP   |\
-			  MLX5_HASH_FIELD_SEL_DST_IP   |\
-			  MLX5_HASH_FIELD_SEL_L4_SPORT |\
-			  MLX5_HASH_FIELD_SEL_L4_DPORT)
+#define MLX5_HASH_IP_L4PORTS    (MLX5_HASH_FIELD_SEL_SRC_IP   |\
+				 MLX5_HASH_FIELD_SEL_DST_IP   |\
+				 MLX5_HASH_FIELD_SEL_L4_SPORT |\
+				 MLX5_HASH_FIELD_SEL_L4_DPORT)
 
 	if (priv->params.lro_en) {
 		MLX5_SET(tirc, tirc, lro_enable_mask,
@@ -1305,7 +1305,7 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 		MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
 			 MLX5_L4_PROT_TYPE_TCP);
 		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
-			 MLX5_HASH_ALL);
+			 MLX5_HASH_IP_L4PORTS);
 		break;
 
 	case MLX5E_TT_IPV6_TCP:
@@ -1314,7 +1314,7 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 		MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
 			 MLX5_L4_PROT_TYPE_TCP);
 		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
-			 MLX5_HASH_ALL);
+			 MLX5_HASH_IP_L4PORTS);
 		break;
 
 	case MLX5E_TT_IPV4_UDP:
@@ -1323,7 +1323,7 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 		MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
 			 MLX5_L4_PROT_TYPE_UDP);
 		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
-			 MLX5_HASH_ALL);
+			 MLX5_HASH_IP_L4PORTS);
 		break;
 
 	case MLX5E_TT_IPV6_UDP:
@@ -1332,7 +1332,7 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 		MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
 			 MLX5_L4_PROT_TYPE_UDP);
 		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
-			 MLX5_HASH_ALL);
+			 MLX5_HASH_IP_L4PORTS);
 		break;
 
 	case MLX5E_TT_IPV4:
-- 
2.4.3.413.ga5fe668

* [PATCH net-next 6/6] net/mlx5e: Input IPSEC.SPI into the RX RSS hash function
From: Amir Vadai @ 2015-07-23 20:36 UTC
  To: David S. Miller; +Cc: netdev, Amir Vadai, Achiad Shochat, Or Gerlitz, Tal Alon

From: Achiad Shochat <achiad@mellanox.com>

Input the IPSEC SPI into the RX RSS hash function, in addition to the
source/destination IP addresses which are already hashed.
Only for unicast traffic for now.
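
Concretely, for the new AH/ESP traffic types the TIR hash input becomes
the 3-tuple (source IP, destination IP, SPI), selected by the field-select
mask added below:

        #define MLX5_HASH_IP_IPSEC_SPI  (MLX5_HASH_FIELD_SEL_SRC_IP   |\
                                         MLX5_HASH_FIELD_SEL_DST_IP   |\
                                         MLX5_HASH_FIELD_SEL_IPSEC_SPI)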

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  4 +
 .../ethernet/mellanox/mlx5/core/en_flow_table.c    | 92 +++++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 32 ++++++++
 3 files changed, 127 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 39294f2..b710e9b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -334,6 +334,10 @@ enum mlx5e_traffic_types {
 	MLX5E_TT_IPV6_TCP,
 	MLX5E_TT_IPV4_UDP,
 	MLX5E_TT_IPV6_UDP,
+	MLX5E_TT_IPV4_IPSEC_AH,
+	MLX5E_TT_IPV6_IPSEC_AH,
+	MLX5E_TT_IPV4_IPSEC_ESP,
+	MLX5E_TT_IPV6_IPSEC_ESP,
 	MLX5E_TT_IPV4,
 	MLX5E_TT_IPV6,
 	MLX5E_TT_ANY,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
index cca34f6..70ec31b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
@@ -105,6 +105,22 @@ static void mlx5e_del_eth_addr_from_flow_table(struct mlx5e_priv *priv,
 {
 	void *ft = priv->ft.main;
 
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV6_IPSEC_ESP))
+		mlx5_del_flow_table_entry(ft,
+					  ai->ft_ix[MLX5E_TT_IPV6_IPSEC_ESP]);
+
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV4_IPSEC_ESP))
+		mlx5_del_flow_table_entry(ft,
+					  ai->ft_ix[MLX5E_TT_IPV4_IPSEC_ESP]);
+
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV6_IPSEC_AH))
+		mlx5_del_flow_table_entry(ft,
+					  ai->ft_ix[MLX5E_TT_IPV6_IPSEC_AH]);
+
+	if (ai->tt_vec & BIT(MLX5E_TT_IPV4_IPSEC_AH))
+		mlx5_del_flow_table_entry(ft,
+					  ai->ft_ix[MLX5E_TT_IPV4_IPSEC_AH]);
+
 	if (ai->tt_vec & BIT(MLX5E_TT_IPV6_TCP))
 		mlx5_del_flow_table_entry(ft, ai->ft_ix[MLX5E_TT_IPV6_TCP]);
 
@@ -160,6 +176,10 @@ static u32 mlx5e_get_tt_vec(struct mlx5e_eth_addr_info *ai, int type)
 				BIT(MLX5E_TT_IPV6_TCP)       |
 				BIT(MLX5E_TT_IPV4_UDP)       |
 				BIT(MLX5E_TT_IPV6_UDP)       |
+				BIT(MLX5E_TT_IPV4_IPSEC_AH)  |
+				BIT(MLX5E_TT_IPV6_IPSEC_AH)  |
+				BIT(MLX5E_TT_IPV4_IPSEC_ESP) |
+				BIT(MLX5E_TT_IPV6_IPSEC_ESP) |
 				BIT(MLX5E_TT_IPV4)           |
 				BIT(MLX5E_TT_IPV6)           |
 				BIT(MLX5E_TT_ANY)            |
@@ -205,6 +225,10 @@ static u32 mlx5e_get_tt_vec(struct mlx5e_eth_addr_info *ai, int type)
 			BIT(MLX5E_TT_IPV6_TCP)       |
 			BIT(MLX5E_TT_IPV4_UDP)       |
 			BIT(MLX5E_TT_IPV6_UDP)       |
+			BIT(MLX5E_TT_IPV4_IPSEC_AH)  |
+			BIT(MLX5E_TT_IPV6_IPSEC_AH)  |
+			BIT(MLX5E_TT_IPV4_IPSEC_ESP) |
+			BIT(MLX5E_TT_IPV6_IPSEC_ESP) |
 			BIT(MLX5E_TT_IPV4)           |
 			BIT(MLX5E_TT_IPV6)           |
 			BIT(MLX5E_TT_ANY)            |
@@ -377,6 +401,72 @@ static int __mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
 		ai->tt_vec |= BIT(MLX5E_TT_IPV6_TCP);
 	}
 
+	MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
+		 IPPROTO_AH);
+
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_IPSEC_AH];
+	if (tt_vec & BIT(MLX5E_TT_IPV4_IPSEC_AH)) {
+		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+			 ETH_P_IP);
+		MLX5_SET(dest_format_struct, dest, destination_id,
+			 tirn[MLX5E_TT_IPV4_IPSEC_AH]);
+		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+						match_criteria, flow_context,
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV4_IPSEC_AH);
+	}
+
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_IPSEC_AH];
+	if (tt_vec & BIT(MLX5E_TT_IPV6_IPSEC_AH)) {
+		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+			 ETH_P_IPV6);
+		MLX5_SET(dest_format_struct, dest, destination_id,
+			 tirn[MLX5E_TT_IPV6_IPSEC_AH]);
+		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+						match_criteria, flow_context,
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV6_IPSEC_AH);
+	}
+
+	MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
+		 IPPROTO_ESP);
+
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_IPSEC_ESP];
+	if (tt_vec & BIT(MLX5E_TT_IPV4_IPSEC_ESP)) {
+		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+			 ETH_P_IP);
+		MLX5_SET(dest_format_struct, dest, destination_id,
+			 tirn[MLX5E_TT_IPV4_IPSEC_ESP]);
+		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+						match_criteria, flow_context,
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV4_IPSEC_ESP);
+	}
+
+	ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_IPSEC_ESP];
+	if (tt_vec & BIT(MLX5E_TT_IPV6_IPSEC_ESP)) {
+		MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+			 ETH_P_IPV6);
+		MLX5_SET(dest_format_struct, dest, destination_id,
+			 tirn[MLX5E_TT_IPV6_IPSEC_ESP]);
+		err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+						match_criteria, flow_context,
+						ft_ix);
+		if (err)
+			goto err_del_ai;
+
+		ai->tt_vec |= BIT(MLX5E_TT_IPV6_IPSEC_ESP);
+	}
+
 	return 0;
 
 err_del_ai:
@@ -731,7 +821,7 @@ static int mlx5e_create_main_flow_table(struct mlx5e_priv *priv)
 	if (!g)
 		return -ENOMEM;
 
-	g[0].log_sz = 2;
+	g[0].log_sz = 3;
 	g[0].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
 	MLX5_SET_TO_ONES(fte_match_param, g[0].match_criteria,
 			 outer_headers.ethertype);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8194c32..355a10a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1260,6 +1260,10 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 				 MLX5_HASH_FIELD_SEL_L4_SPORT |\
 				 MLX5_HASH_FIELD_SEL_L4_DPORT)
 
+#define MLX5_HASH_IP_IPSEC_SPI  (MLX5_HASH_FIELD_SEL_SRC_IP   |\
+				 MLX5_HASH_FIELD_SEL_DST_IP   |\
+				 MLX5_HASH_FIELD_SEL_IPSEC_SPI)
+
 	if (priv->params.lro_en) {
 		MLX5_SET(tirc, tirc, lro_enable_mask,
 			 MLX5_TIRC_LRO_ENABLE_MASK_IPV4_LRO |
@@ -1335,6 +1339,34 @@ static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
 			 MLX5_HASH_IP_L4PORTS);
 		break;
 
+	case MLX5E_TT_IPV4_IPSEC_AH:
+		MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+			 MLX5_L3_PROT_TYPE_IPV4);
+		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+			 MLX5_HASH_IP_IPSEC_SPI);
+		break;
+
+	case MLX5E_TT_IPV6_IPSEC_AH:
+		MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+			 MLX5_L3_PROT_TYPE_IPV6);
+		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+			 MLX5_HASH_IP_IPSEC_SPI);
+		break;
+
+	case MLX5E_TT_IPV4_IPSEC_ESP:
+		MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+			 MLX5_L3_PROT_TYPE_IPV4);
+		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+			 MLX5_HASH_IP_IPSEC_SPI);
+		break;
+
+	case MLX5E_TT_IPV6_IPSEC_ESP:
+		MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+			 MLX5_L3_PROT_TYPE_IPV6);
+		MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+			 MLX5_HASH_IP_IPSEC_SPI);
+		break;
+
 	case MLX5E_TT_IPV4:
 		MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
 			 MLX5_L3_PROT_TYPE_IPV4);
-- 
2.4.3.413.ga5fe668

* Re: [PATCH net-next 0/6] ConnectX-4 driver update 2015-07-23
From: David Miller @ 2015-07-27  7:29 UTC
  To: amirv; +Cc: netdev, achiad, ogerlitz, talal

From: Amir Vadai <amirv@mellanox.com>
Date: Thu, 23 Jul 2015 23:35:55 +0300

> This patchset introduces some performance enhancements to the ConnectX-4 driver:
> 1. Improve RSS distribution, and make the RSS hash function controllable
>    using ethtool.
> 2. Allocate memory that is written by the NIC and read by the host CPU on
>    the NUMA node local to the processing CPU.
> 3. Support tx copybreak.
> 4. Use the hardware "blue flame" feature to save DMA reads when possible.
> 
> Another patch by Achiad fixes some cosmetic issues in the driver.
> 
> The patchset was applied and tested on top of commit 045a0fa ("ip_tunnel:
> Call ip_tunnel_core_init() from inet_init()")

Series applied, thanks.
