* [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting
@ 2016-01-14 14:08 kan.liang
  2016-01-14 14:08 ` [PATCH V3 2/6] net/ethtool: support get coalesce per queue kan.liang
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: kan.liang @ 2016-01-14 14:08 UTC (permalink / raw)
  To: netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet, ben,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Introduce a new ioctl, ETHTOOL_PERQUEUE, for setting per-queue
parameters. The following patches will enable sub-commands for
per-queue settings.
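
As an illustration only (not part of the patch), the new command is
reached through the regular SIOCETHTOOL path, with ifr_data pointing at
a buffer that starts with struct ethtool_per_queue_op. A minimal
userspace sketch follows; the interface name "eth0" and the helper name
are hypothetical:

#include <string.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <linux/sockios.h>

/* Sketch: issue an ETHTOOL_PERQUEUE request on "eth0".  cmd_buf must
 * begin with struct ethtool_per_queue_op (cmd = ETHTOOL_PERQUEUE),
 * followed by the per-queue payload of the chosen sub-command.  fd is
 * any socket, e.g. socket(AF_INET, SOCK_DGRAM, 0).
 */
static int ethtool_perqueue_ioctl(int fd, void *cmd_buf)
{
	struct ifreq ifr;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);
	ifr.ifr_data = cmd_buf;

	return ioctl(fd, SIOCETHTOOL, &ifr);
}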

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

Changes since V1:
 - Check the sub-command number to determine whether the command
   requires CAP_NET_ADMIN
 - Refine the struct ethtool_per_queue_op and improve the comments

No changes since V2

 include/uapi/linux/ethtool.h | 17 +++++++++++++++++
 net/core/ethtool.c           | 18 ++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index cd16291..1913d37 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -1144,6 +1144,21 @@ enum ethtool_sfeatures_retval_bits {
 #define ETHTOOL_F_WISH          (1 << ETHTOOL_F_WISH__BIT)
 #define ETHTOOL_F_COMPAT        (1 << ETHTOOL_F_COMPAT__BIT)
 
+#define MAX_NUM_QUEUE		4096
+
+/**
+ * struct ethtool_per_queue_op - apply sub command to the queues in mask.
+ * @cmd: ETHTOOL_PERQUEUE
+ * @sub_command: the sub command which apply to each queues
+ * @queue_mask: Bitmap of the queues which sub command apply to
+ * @data: A complete command structure following for each of the queues addressed
+ */
+struct ethtool_per_queue_op {
+	__u32	cmd;
+	__u32	sub_command;
+	unsigned long queue_mask[BITS_TO_LONGS(MAX_NUM_QUEUE)];
+	char	data[];
+};
 
 /* CMDs currently supported */
 #define ETHTOOL_GSET		0x00000001 /* Get settings. */
@@ -1226,6 +1241,8 @@ enum ethtool_sfeatures_retval_bits {
 #define ETHTOOL_GTUNABLE	0x00000048 /* Get tunable configuration */
 #define ETHTOOL_STUNABLE	0x00000049 /* Set tunable configuration */
 
+#define ETHTOOL_PERQUEUE	0x0000004a /* Set per queue options */
+
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
 #define SPARC_ETH_SSET		ETHTOOL_SSET
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 09948a7..ac45597 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1748,6 +1748,20 @@ out:
 	return ret;
 }
 
+static int ethtool_set_per_queue(struct net_device *dev, void __user *useraddr)
+{
+	struct ethtool_per_queue_op per_queue_opt;
+
+	if (copy_from_user(&per_queue_opt, useraddr, sizeof(per_queue_opt)))
+		return -EFAULT;
+
+	switch (per_queue_opt.sub_command) {
+
+	default:
+		return -EOPNOTSUPP;
+	};
+}
+
 /* The main entry point in this file.  Called from net/core/dev_ioctl.c */
 
 int dev_ethtool(struct net *net, struct ifreq *ifr)
@@ -1799,6 +1813,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_GEEE:
 	case ETHTOOL_GTUNABLE:
 		break;
+	case ETHTOOL_PERQUEUE:
 	default:
 		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 			return -EPERM;
@@ -1991,6 +2006,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_STUNABLE:
 		rc = ethtool_set_tunable(dev, useraddr);
 		break;
+	case ETHTOOL_PERQUEUE:
+		rc = ethtool_set_per_queue(dev, useraddr);
+		break;
 	default:
 		rc = -EOPNOTSUPP;
 	}
-- 
1.8.3.1


* [PATCH V3 2/6] net/ethtool: support get coalesce per queue
  2016-01-14 14:08 [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting kan.liang
@ 2016-01-14 14:08 ` kan.liang
  2016-01-17  2:06   ` Ben Hutchings
  2016-01-14 14:08 ` [PATCH V3 3/6] net/ethtool: support set " kan.liang
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: kan.liang @ 2016-01-14 14:08 UTC (permalink / raw)
  To: netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet, ben,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

This patch implements the sub-command ETHTOOL_GCOALESCE for the
ETHTOOL_PERQUEUE ioctl. It introduces an interface,
get_per_queue_coalesce, to read the coalescing parameters of each
masked queue from the device driver. The interrupt coalescing
parameters are then copied back to user space, one queue at a time.
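
As an illustration only (not part of the patch), user space selects the
queues in queue_mask and receives one struct ethtool_coalesce per
selected queue in "data", in bit order. A sketch, assuming the UAPI
header from patch 1/6 is usable from user space and reusing the
hypothetical ethtool_perqueue_ioctl() helper sketched there:

#include <stdlib.h>
#include <linux/ethtool.h>

/* Sketch: read the coalescing parameters of queues 0-3. */
static struct ethtool_per_queue_op *get_coalesce_q0_q3(int fd)
{
	size_t len = sizeof(struct ethtool_per_queue_op) +
		     4 * sizeof(struct ethtool_coalesce);
	struct ethtool_per_queue_op *op = calloc(1, len);

	if (!op)
		return NULL;

	op->cmd = ETHTOOL_PERQUEUE;
	op->sub_command = ETHTOOL_GCOALESCE;
	op->queue_mask[0] = 0xf;	/* bit n selects queue n */

	if (ethtool_perqueue_ioctl(fd, op) < 0) {
		free(op);
		return NULL;
	}

	/* On success the kernel has written four struct ethtool_coalesce
	 * entries starting at op->data, one per set bit.
	 */
	return op;
}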

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

Changes since V1:
 - Use bitmap functions to parse queue mask
 - Improve comments

No changes since V2

 include/linux/ethtool.h | 10 +++++++++-
 net/core/ethtool.c      | 34 ++++++++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 653dc9c..bd01feb 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -201,6 +201,13 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
  * @get_module_eeprom: Get the eeprom information from the plug-in module
  * @get_eee: Get Energy-Efficient (EEE) supported and status.
  * @set_eee: Set EEE status (enable/disable) as well as LPI timers.
+ * @get_per_queue_coalesce: Get interrupt coalescing parameters per queue.
+ *	It needs to do range check for the input queue number. Only if
+ *	neither RX nor TX queue number is in the range, a negative error code
+ *	returns. For the case that only RX or only TX is not in the range,
+ *	zero should return. But related unavailable fields should be set to ~0,
+ *	which indicates RX or TX is not in the range.
+ * 	Returns a negative error code or zero.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account.  Callers must
@@ -279,7 +286,8 @@ struct ethtool_ops {
 			       const struct ethtool_tunable *, void *);
 	int	(*set_tunable)(struct net_device *,
 			       const struct ethtool_tunable *, const void *);
-
+	int	(*get_per_queue_coalesce)(struct net_device *, int,
+					  struct ethtool_coalesce *);
 
 };
 #endif /* _LINUX_ETHTOOL_H */
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index ac45597..572c103 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1748,6 +1748,31 @@ out:
 	return ret;
 }
 
+static int ethtool_get_per_queue_coalesce(struct net_device *dev,
+					  void __user *useraddr,
+					  struct ethtool_per_queue_op *per_queue_opt)
+{
+	int bit, ret;
+
+	if (!dev->ethtool_ops->get_per_queue_coalesce)
+		return -EOPNOTSUPP;
+
+	useraddr += sizeof(*per_queue_opt);
+
+	for_each_set_bit(bit, per_queue_opt->queue_mask, MAX_NUM_QUEUE) {
+		struct ethtool_coalesce coalesce = { .cmd = ETHTOOL_GCOALESCE };
+
+		ret = dev->ethtool_ops->get_per_queue_coalesce(dev, bit, &coalesce);
+		if (ret != 0)
+			return ret;
+		if (copy_to_user(useraddr, &coalesce, sizeof(coalesce)))
+			return -EFAULT;
+		useraddr += sizeof(coalesce);
+	}
+
+	return 0;
+}
+
 static int ethtool_set_per_queue(struct net_device *dev, void __user *useraddr)
 {
 	struct ethtool_per_queue_op per_queue_opt;
@@ -1756,7 +1781,8 @@ static int ethtool_set_per_queue(struct net_device *dev, void __user *useraddr)
 		return -EFAULT;
 
 	switch (per_queue_opt.sub_command) {
-
+	case ETHTOOL_GCOALESCE:
+		return ethtool_get_per_queue_coalesce(dev, useraddr, &per_queue_opt);
 	default:
 		return -EOPNOTSUPP;
 	};
@@ -1768,7 +1794,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 {
 	struct net_device *dev = __dev_get_by_name(net, ifr->ifr_name);
 	void __user *useraddr = ifr->ifr_data;
-	u32 ethcmd;
+	u32 ethcmd, sub_cmd;
 	int rc;
 	netdev_features_t old_features;
 
@@ -1814,6 +1840,10 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_GTUNABLE:
 		break;
 	case ETHTOOL_PERQUEUE:
+		if (copy_from_user(&sub_cmd, useraddr + sizeof(ethcmd), sizeof(sub_cmd)))
+			return -EFAULT;
+		if (sub_cmd == ETHTOOL_GCOALESCE)
+			break;
 	default:
 		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 			return -EPERM;
-- 
1.8.3.1


* [PATCH V3 3/6] net/ethtool: support set coalesce per queue
  2016-01-14 14:08 [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting kan.liang
  2016-01-14 14:08 ` [PATCH V3 2/6] net/ethtool: support get coalesce per queue kan.liang
@ 2016-01-14 14:08 ` kan.liang
  2016-01-17  2:13   ` Ben Hutchings
  2016-01-14 14:08 ` [PATCH V3 4/6] i40e: queue-specific settings for interrupt moderation kan.liang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: kan.liang @ 2016-01-14 14:08 UTC (permalink / raw)
  To: netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet, ben,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

This patch implements the sub-command ETHTOOL_SCOALESCE for the
ETHTOOL_PERQUEUE ioctl. It introduces an interface,
set_per_queue_coalesce, to pass the coalescing parameters for each
masked queue to the device driver. The requested coalescing parameters
are stored in "data", one entry per masked queue, and are copied from
user space.
If setting the parameters for one of the queues fails, the values that
were already applied to the preceding queues are rolled back.
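
For illustration only (again not part of the patch), the set path uses
the same layout, with user space supplying one filled-in struct
ethtool_coalesce per selected queue; a sketch under the same
assumptions as the earlier examples:

#include <stdlib.h>
#include <linux/ethtool.h>

/* Sketch: set rx-usecs on queues 0 and 2.  A real caller would usually
 * read the current parameters first (ETHTOOL_GCOALESCE per queue) and
 * modify only the fields of interest.
 */
static int set_rx_usecs_q0_q2(int fd, __u32 rx_usecs)
{
	size_t len = sizeof(struct ethtool_per_queue_op) +
		     2 * sizeof(struct ethtool_coalesce);
	struct ethtool_per_queue_op *op = calloc(1, len);
	struct ethtool_coalesce *ec;
	int i, ret;

	if (!op)
		return -1;

	op->cmd = ETHTOOL_PERQUEUE;
	op->sub_command = ETHTOOL_SCOALESCE;
	op->queue_mask[0] = 0x5;	/* queues 0 and 2 */

	ec = (struct ethtool_coalesce *)op->data;
	for (i = 0; i < 2; i++) {
		ec[i].cmd = ETHTOOL_SCOALESCE;
		ec[i].rx_coalesce_usecs = rx_usecs;
	}

	ret = ethtool_perqueue_ioctl(fd, op);	/* kernel may roll back on failure */
	free(op);
	return ret;
}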

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

Changes since V1:
 - Use bitmap functions to parse queue mask
 - Improve comments
 - Add rollback support

No changes since V2

 include/linux/ethtool.h |  8 +++++++
 net/core/ethtool.c      | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index bd01feb..e68f950 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -208,6 +208,12 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
  *	zero should return. But related unavailable fields should be set to ~0,
  *	which indicates RX or TX is not in the range.
  * 	Returns a negative error code or zero.
+ * @set_per_queue_coalesce: Set interrupt coalescing parameters per queue.
+ * 	It needs to do range check for the input queue number. Only if
+ * 	neither RX nor TX queue number is in the range, a negative error code
+ * 	returns. For the case that only RX or only TX is not in the range,
+ * 	zero should return. The related unavailable fields should be avoid.
+ * 	Returns a negative error code or zero.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account.  Callers must
@@ -288,6 +294,8 @@ struct ethtool_ops {
 			       const struct ethtool_tunable *, const void *);
 	int	(*get_per_queue_coalesce)(struct net_device *, int,
 					  struct ethtool_coalesce *);
+	int	(*set_per_queue_coalesce)(struct net_device *, int,
+					  struct ethtool_coalesce *);
 
 };
 #endif /* _LINUX_ETHTOOL_H */
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 572c103..d0e6d29 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1773,6 +1773,63 @@ static int ethtool_get_per_queue_coalesce(struct net_device *dev,
 	return 0;
 }
 
+static int ethtool_set_per_queue_coalesce(struct net_device *dev,
+					  void __user *useraddr,
+					  struct ethtool_per_queue_op *per_queue_opt)
+{
+	int bit, i, ret = 0;
+	int queue_num = bitmap_weight(per_queue_opt->queue_mask, MAX_NUM_QUEUE);
+	struct ethtool_coalesce *backup = NULL, *tmp = NULL;
+	bool rollback = true;
+
+	if (!dev->ethtool_ops->set_per_queue_coalesce)
+		return -EOPNOTSUPP;
+
+	if (!dev->ethtool_ops->get_per_queue_coalesce)
+		rollback = false;
+
+	useraddr += sizeof(*per_queue_opt);
+
+	if (rollback)
+		tmp = backup = kmalloc(queue_num * sizeof(*backup), GFP_KERNEL);
+
+	for_each_set_bit(bit, per_queue_opt->queue_mask, MAX_NUM_QUEUE) {
+		struct ethtool_coalesce coalesce;
+
+		if (rollback) {
+			if (dev->ethtool_ops->get_per_queue_coalesce(dev, bit, tmp)) {
+				ret = -EFAULT;
+				goto roll_back;
+			}
+			tmp += sizeof(struct ethtool_coalesce);
+		}
+
+		if (copy_from_user(&coalesce, useraddr, sizeof(coalesce))) {
+			ret = -EFAULT;
+			goto roll_back;
+		}
+
+		ret = dev->ethtool_ops->set_per_queue_coalesce(dev, bit, &coalesce);
+		if (ret != 0)
+			goto roll_back;
+
+		useraddr += sizeof(coalesce);
+	}
+
+roll_back:
+	if (rollback) {
+		if (ret != 0) {
+			tmp = backup;
+			for_each_set_bit(i, per_queue_opt->queue_mask, bit - 1) {
+				dev->ethtool_ops->set_per_queue_coalesce(dev, i, tmp);
+				tmp += sizeof(struct ethtool_coalesce);
+			}
+		}
+		kfree(backup);
+	}
+	return ret;
+}
+
 static int ethtool_set_per_queue(struct net_device *dev, void __user *useraddr)
 {
 	struct ethtool_per_queue_op per_queue_opt;
@@ -1783,6 +1840,8 @@ static int ethtool_set_per_queue(struct net_device *dev, void __user *useraddr)
 	switch (per_queue_opt.sub_command) {
 	case ETHTOOL_GCOALESCE:
 		return ethtool_get_per_queue_coalesce(dev, useraddr, &per_queue_opt);
+	case ETHTOOL_SCOALESCE:
+		return ethtool_set_per_queue_coalesce(dev, useraddr, &per_queue_opt);
 	default:
 		return -EOPNOTSUPP;
 	};
-- 
1.8.3.1


* [PATCH V3 4/6] i40e: queue-specific settings for interrupt moderation
  2016-01-14 14:08 [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting kan.liang
  2016-01-14 14:08 ` [PATCH V3 2/6] net/ethtool: support get coalesce per queue kan.liang
  2016-01-14 14:08 ` [PATCH V3 3/6] net/ethtool: support set " kan.liang
@ 2016-01-14 14:08 ` kan.liang
  2016-01-14 22:41   ` Nelson, Shannon
  2016-01-14 14:08 ` [PATCH V3 5/6] i40e/ethtool: support coalesce getting by queue kan.liang
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: kan.liang @ 2016-01-14 14:08 UTC (permalink / raw)
  To: netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet, ben,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

In the i40e driver, each vector has its own ITR register. However,
there is no concept of queue-specific settings in the driver proper;
only global variables are used to store ITR values. That causes
problems, especially when a vector is reset, because the
queue-specific ITR values can be lost.
This patch moves rx_itr_setting and tx_itr_setting into i40e_ring so
that each queue stores its own ITR register setting.
i40e_get_coalesce and i40e_set_coalesce are also modified accordingly
to support queue-specific settings. To stay compatible with old
ethtool, if the user does not specify a queue number, i40e_get_coalesce
returns queue 0's values, while i40e_set_coalesce applies the values to
all queues.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

New patch since V2

 drivers/net/ethernet/intel/i40e/i40e.h         |   7 --
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c |  15 ++-
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 142 ++++++++++++++++---------
 drivers/net/ethernet/intel/i40e/i40e_main.c    |  12 +--
 drivers/net/ethernet/intel/i40e/i40e_txrx.c    |   9 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h    |   8 ++
 6 files changed, 123 insertions(+), 70 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index c202f9b..d51c650 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -499,13 +499,6 @@ struct i40e_vsi {
 	struct i40e_ring **tx_rings;
 
 	u16 work_limit;
-	/* high bit set means dynamic, use accessor routines to read/write.
-	 * hardware only supports 2us resolution for the ITR registers.
-	 * these values always store the USER setting, and must be converted
-	 * before programming to a register.
-	 */
-	u16 rx_itr_setting;
-	u16 tx_itr_setting;
 	u16 int_rate_limit;  /* value in usecs */
 
 	u16 rss_table_size; /* HW RSS table size */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index 10744a6..40d49f4 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -533,6 +533,10 @@ static void i40e_dbg_dump_vsi_seid(struct i40e_pf *pf, int seid)
 			 "    rx_rings[%i]: vsi = %p, q_vector = %p\n",
 			 i, rx_ring->vsi,
 			 rx_ring->q_vector);
+		dev_info(&pf->pdev->dev,
+			 "    rx_rings[%i]: rx_itr_setting = %d (%s)\n",
+			 i, rx_ring->rx_itr_setting,
+			 ITR_IS_DYNAMIC(rx_ring->rx_itr_setting) ? "dynamic" : "fixed");
 	}
 	for (i = 0; i < vsi->num_queue_pairs; i++) {
 		struct i40e_ring *tx_ring = ACCESS_ONCE(vsi->tx_rings[i]);
@@ -583,14 +587,15 @@ static void i40e_dbg_dump_vsi_seid(struct i40e_pf *pf, int seid)
 		dev_info(&pf->pdev->dev,
 			 "    tx_rings[%i]: DCB tc = %d\n",
 			 i, tx_ring->dcb_tc);
+		dev_info(&pf->pdev->dev,
+			 "    tx_rings[%i]: tx_itr_setting = %d (%s)\n",
+			 i, tx_ring->tx_itr_setting,
+			 ITR_IS_DYNAMIC(tx_ring->tx_itr_setting) ? "dynamic" : "fixed");
 	}
 	rcu_read_unlock();
 	dev_info(&pf->pdev->dev,
-		 "    work_limit = %d, rx_itr_setting = %d (%s), tx_itr_setting = %d (%s)\n",
-		 vsi->work_limit, vsi->rx_itr_setting,
-		 ITR_IS_DYNAMIC(vsi->rx_itr_setting) ? "dynamic" : "fixed",
-		 vsi->tx_itr_setting,
-		 ITR_IS_DYNAMIC(vsi->tx_itr_setting) ? "dynamic" : "fixed");
+		 "    work_limit = %d\n",
+		 vsi->work_limit);
 	dev_info(&pf->pdev->dev,
 		 "    max_frame = %d, rx_hdr_len = %d, rx_buf_len = %d dtype = %d\n",
 		 vsi->max_frame, vsi->rx_hdr_len, vsi->rx_buf_len, vsi->dtype);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 29d5833..26ad7bd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -1845,23 +1845,37 @@ static int i40e_set_phys_id(struct net_device *netdev,
  * 125us (8000 interrupts per second) == ITR(62)
  */
 
-static int i40e_get_coalesce(struct net_device *netdev,
-			     struct ethtool_coalesce *ec)
+static int __i40e_get_coalesce(struct net_device *netdev,
+			       struct ethtool_coalesce *ec,
+			       int queue)
 {
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
+	struct i40e_pf *pf = vsi->back;
 
 	ec->tx_max_coalesced_frames_irq = vsi->work_limit;
 	ec->rx_max_coalesced_frames_irq = vsi->work_limit;
 
-	if (ITR_IS_DYNAMIC(vsi->rx_itr_setting))
+	/* rx and tx usecs has per queue value. If user doesn't specify the queue,
+	 * return queue 0's value to represent.
+	 */
+	if (queue < 0) {
+		queue = 0;
+	} else if (queue >= vsi->num_queue_pairs) {
+		netif_info(pf, drv, netdev, "Invalid queue value, queue range is 0 - %d\n",
+			   vsi->num_queue_pairs - 1);
+		return -EINVAL;
+	}
+
+	if (ITR_IS_DYNAMIC(vsi->rx_rings[queue]->rx_itr_setting))
 		ec->use_adaptive_rx_coalesce = 1;
 
-	if (ITR_IS_DYNAMIC(vsi->tx_itr_setting))
+	if (ITR_IS_DYNAMIC(vsi->tx_rings[queue]->tx_itr_setting))
 		ec->use_adaptive_tx_coalesce = 1;
 
-	ec->rx_coalesce_usecs = vsi->rx_itr_setting & ~I40E_ITR_DYNAMIC;
-	ec->tx_coalesce_usecs = vsi->tx_itr_setting & ~I40E_ITR_DYNAMIC;
+	ec->rx_coalesce_usecs = vsi->rx_rings[queue]->rx_itr_setting & ~I40E_ITR_DYNAMIC;
+	ec->tx_coalesce_usecs = vsi->tx_rings[queue]->tx_itr_setting & ~I40E_ITR_DYNAMIC;
+
 	/* we use the _usecs_high to store/set the interrupt rate limit
 	 * that the hardware supports, that almost but not quite
 	 * fits the original intent of the ethtool variable,
@@ -1874,15 +1888,57 @@ static int i40e_get_coalesce(struct net_device *netdev,
 	return 0;
 }
 
-static int i40e_set_coalesce(struct net_device *netdev,
+static int i40e_get_coalesce(struct net_device *netdev,
 			     struct ethtool_coalesce *ec)
 {
-	struct i40e_netdev_priv *np = netdev_priv(netdev);
+	return __i40e_get_coalesce(netdev, ec, -1);
+}
+
+static void i40e_set_itr_per_queue(struct i40e_vsi *vsi,
+				   struct ethtool_coalesce *ec,
+				   int queue)
+{
+	struct i40e_pf *pf = vsi->back;
+	struct i40e_hw *hw = &pf->hw;
 	struct i40e_q_vector *q_vector;
+	u16 vector, intrl;
+
+	intrl = INTRL_USEC_TO_REG(vsi->int_rate_limit);
+
+	vsi->rx_rings[queue]->rx_itr_setting = ec->rx_coalesce_usecs;
+	vsi->tx_rings[queue]->tx_itr_setting = ec->tx_coalesce_usecs;
+
+	if (ec->use_adaptive_rx_coalesce)
+		vsi->rx_rings[queue]->rx_itr_setting |= I40E_ITR_DYNAMIC;
+	else
+		vsi->rx_rings[queue]->rx_itr_setting &= ~I40E_ITR_DYNAMIC;
+
+	if (ec->use_adaptive_tx_coalesce)
+		vsi->tx_rings[queue]->tx_itr_setting |= I40E_ITR_DYNAMIC;
+	else
+		vsi->tx_rings[queue]->tx_itr_setting &= ~I40E_ITR_DYNAMIC;
+
+	q_vector = vsi->rx_rings[queue]->q_vector;
+	q_vector->rx.itr = ITR_TO_REG(vsi->rx_rings[queue]->rx_itr_setting);
+	vector = vsi->base_vector + q_vector->v_idx;
+	wr32(hw, I40E_PFINT_ITRN(I40E_RX_ITR, vector - 1), q_vector->rx.itr);
+
+	q_vector = vsi->tx_rings[queue]->q_vector;
+	q_vector->tx.itr = ITR_TO_REG(vsi->tx_rings[queue]->tx_itr_setting);
+	vector = vsi->base_vector + q_vector->v_idx;
+	wr32(hw, I40E_PFINT_ITRN(I40E_TX_ITR, vector - 1), q_vector->tx.itr);
+
+	wr32(hw, I40E_PFINT_RATEN(vector - 1), intrl);
+	i40e_flush(hw);
+}
+
+static int __i40e_set_coalesce(struct net_device *netdev,
+			       struct ethtool_coalesce *ec,
+			       int queue)
+{
+	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 	struct i40e_pf *pf = vsi->back;
-	struct i40e_hw *hw = &pf->hw;
-	u16 vector;
 	int i;
 
 	if (ec->tx_max_coalesced_frames_irq || ec->rx_max_coalesced_frames_irq)
@@ -1899,59 +1955,49 @@ static int i40e_set_coalesce(struct net_device *netdev,
 		return -EINVAL;
 	}
 
-	vector = vsi->base_vector;
-	if ((ec->rx_coalesce_usecs >= (I40E_MIN_ITR << 1)) &&
-	    (ec->rx_coalesce_usecs <= (I40E_MAX_ITR << 1))) {
-		vsi->rx_itr_setting = ec->rx_coalesce_usecs;
-	} else if (ec->rx_coalesce_usecs == 0) {
-		vsi->rx_itr_setting = ec->rx_coalesce_usecs;
+	if (ec->rx_coalesce_usecs == 0) {
 		if (ec->use_adaptive_rx_coalesce)
 			netif_info(pf, drv, netdev, "rx-usecs=0, need to disable adaptive-rx for a complete disable\n");
-	} else {
-		netif_info(pf, drv, netdev, "Invalid value, rx-usecs range is 0-8160\n");
-		return -EINVAL;
+	} else if ((ec->rx_coalesce_usecs < (I40E_MIN_ITR << 1)) ||
+		   (ec->rx_coalesce_usecs > (I40E_MAX_ITR << 1))) {
+			netif_info(pf, drv, netdev, "Invalid value, rx-usecs range is 0-8160\n");
+			return -EINVAL;
 	}
 
 	vsi->int_rate_limit = ec->rx_coalesce_usecs_high;
 
-	if ((ec->tx_coalesce_usecs >= (I40E_MIN_ITR << 1)) &&
-	    (ec->tx_coalesce_usecs <= (I40E_MAX_ITR << 1))) {
-		vsi->tx_itr_setting = ec->tx_coalesce_usecs;
-	} else if (ec->tx_coalesce_usecs == 0) {
-		vsi->tx_itr_setting = ec->tx_coalesce_usecs;
+	if (ec->tx_coalesce_usecs == 0) {
 		if (ec->use_adaptive_tx_coalesce)
 			netif_info(pf, drv, netdev, "tx-usecs=0, need to disable adaptive-tx for a complete disable\n");
-	} else {
-		netif_info(pf, drv, netdev,
-			   "Invalid value, tx-usecs range is 0-8160\n");
-		return -EINVAL;
+	} else if ((ec->tx_coalesce_usecs < (I40E_MIN_ITR << 1)) ||
+		   (ec->tx_coalesce_usecs > (I40E_MAX_ITR << 1))) {
+			netif_info(pf, drv, netdev, "Invalid value, tx-usecs range is 0-8160\n");
+			return -EINVAL;
 	}
 
-	if (ec->use_adaptive_rx_coalesce)
-		vsi->rx_itr_setting |= I40E_ITR_DYNAMIC;
-	else
-		vsi->rx_itr_setting &= ~I40E_ITR_DYNAMIC;
-
-	if (ec->use_adaptive_tx_coalesce)
-		vsi->tx_itr_setting |= I40E_ITR_DYNAMIC;
-	else
-		vsi->tx_itr_setting &= ~I40E_ITR_DYNAMIC;
-
-	for (i = 0; i < vsi->num_q_vectors; i++, vector++) {
-		u16 intrl = INTRL_USEC_TO_REG(vsi->int_rate_limit);
-
-		q_vector = vsi->q_vectors[i];
-		q_vector->rx.itr = ITR_TO_REG(vsi->rx_itr_setting);
-		wr32(hw, I40E_PFINT_ITRN(0, vector - 1), q_vector->rx.itr);
-		q_vector->tx.itr = ITR_TO_REG(vsi->tx_itr_setting);
-		wr32(hw, I40E_PFINT_ITRN(1, vector - 1), q_vector->tx.itr);
-		wr32(hw, I40E_PFINT_RATEN(vector - 1), intrl);
-		i40e_flush(hw);
+	/* rx and tx usecs has per queue value. If user doesn't specify the queue,
+	 * apply to all queues.
+	 */
+	if (queue < 0) {
+		for (i = 0; i < vsi->num_queue_pairs; i++)
+			i40e_set_itr_per_queue(vsi, ec, i);
+	} else if (queue < vsi->num_queue_pairs) {
+		i40e_set_itr_per_queue(vsi, ec, queue);
+	} else {
+		netif_info(pf, drv, netdev, "Invalid queue value, queue range is 0 - %d\n",
+			   vsi->num_queue_pairs - 1);
+		return -EINVAL;
 	}
 
 	return 0;
 }
 
+static int i40e_set_coalesce(struct net_device *netdev,
+			     struct ethtool_coalesce *ec)
+{
+	return __i40e_set_coalesce(netdev, ec, -1);
+}
+
 /**
  * i40e_get_rss_hash_opts - Get RSS hash Input Set for each flow type
  * @pf: pointer to the physical function struct
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1598fb3..02c4557 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -3098,11 +3098,11 @@ static void i40e_vsi_configure_msix(struct i40e_vsi *vsi)
 		struct i40e_q_vector *q_vector = vsi->q_vectors[i];
 
 		q_vector->itr_countdown = ITR_COUNTDOWN_START;
-		q_vector->rx.itr = ITR_TO_REG(vsi->rx_itr_setting);
+		q_vector->rx.itr = ITR_TO_REG(vsi->rx_rings[i]->rx_itr_setting);
 		q_vector->rx.latency_range = I40E_LOW_LATENCY;
 		wr32(hw, I40E_PFINT_ITRN(I40E_RX_ITR, vector - 1),
 		     q_vector->rx.itr);
-		q_vector->tx.itr = ITR_TO_REG(vsi->tx_itr_setting);
+		q_vector->tx.itr = ITR_TO_REG(vsi->tx_rings[i]->tx_itr_setting);
 		q_vector->tx.latency_range = I40E_LOW_LATENCY;
 		wr32(hw, I40E_PFINT_ITRN(I40E_TX_ITR, vector - 1),
 		     q_vector->tx.itr);
@@ -3194,10 +3194,10 @@ static void i40e_configure_msi_and_legacy(struct i40e_vsi *vsi)
 
 	/* set the ITR configuration */
 	q_vector->itr_countdown = ITR_COUNTDOWN_START;
-	q_vector->rx.itr = ITR_TO_REG(vsi->rx_itr_setting);
+	q_vector->rx.itr = ITR_TO_REG(vsi->rx_rings[0]->rx_itr_setting);
 	q_vector->rx.latency_range = I40E_LOW_LATENCY;
 	wr32(hw, I40E_PFINT_ITR0(I40E_RX_ITR), q_vector->rx.itr);
-	q_vector->tx.itr = ITR_TO_REG(vsi->tx_itr_setting);
+	q_vector->tx.itr = ITR_TO_REG(vsi->tx_rings[0]->tx_itr_setting);
 	q_vector->tx.latency_range = I40E_LOW_LATENCY;
 	wr32(hw, I40E_PFINT_ITR0(I40E_TX_ITR), q_vector->tx.itr);
 
@@ -7284,8 +7284,6 @@ static int i40e_vsi_mem_alloc(struct i40e_pf *pf, enum i40e_vsi_type type)
 	set_bit(__I40E_DOWN, &vsi->state);
 	vsi->flags = 0;
 	vsi->idx = vsi_idx;
-	vsi->rx_itr_setting = pf->rx_itr_default;
-	vsi->tx_itr_setting = pf->tx_itr_default;
 	vsi->int_rate_limit = 0;
 	vsi->rss_table_size = (vsi->type == I40E_VSI_MAIN) ?
 				pf->rss_table_size : 64;
@@ -7454,6 +7452,7 @@ static int i40e_alloc_rings(struct i40e_vsi *vsi)
 			tx_ring->flags = I40E_TXR_FLAGS_WB_ON_ITR;
 		if (vsi->back->flags & I40E_FLAG_OUTER_UDP_CSUM_CAPABLE)
 			tx_ring->flags |= I40E_TXR_FLAGS_OUTER_UDP_CSUM;
+		tx_ring->tx_itr_setting = pf->tx_itr_default;
 		vsi->tx_rings[i] = tx_ring;
 
 		rx_ring = &tx_ring[1];
@@ -7470,6 +7469,7 @@ static int i40e_alloc_rings(struct i40e_vsi *vsi)
 			set_ring_16byte_desc_enabled(rx_ring);
 		else
 			clear_ring_16byte_desc_enabled(rx_ring);
+		rx_ring->rx_itr_setting = pf->rx_itr_default;
 		vsi->rx_rings[i] = rx_ring;
 	}
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index e9e9a37..19b82dc 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1810,6 +1810,7 @@ static inline void i40e_update_enable_itr(struct i40e_vsi *vsi,
 	bool rx = false, tx = false;
 	u32 rxval, txval;
 	int vector;
+	int idx = q_vector->v_idx;
 
 	vector = (q_vector->v_idx + vsi->base_vector);
 
@@ -1819,17 +1820,17 @@ static inline void i40e_update_enable_itr(struct i40e_vsi *vsi,
 	rxval = txval = i40e_buildreg_itr(I40E_ITR_NONE, 0);
 
 	if (q_vector->itr_countdown > 0 ||
-	    (!ITR_IS_DYNAMIC(vsi->rx_itr_setting) &&
-	     !ITR_IS_DYNAMIC(vsi->tx_itr_setting))) {
+	    (!ITR_IS_DYNAMIC(vsi->rx_rings[idx]->rx_itr_setting) &&
+	     !ITR_IS_DYNAMIC(vsi->tx_rings[idx]->tx_itr_setting))) {
 		goto enable_int;
 	}
 
-	if (ITR_IS_DYNAMIC(vsi->rx_itr_setting)) {
+	if (ITR_IS_DYNAMIC(vsi->rx_rings[idx]->rx_itr_setting)) {
 		rx = i40e_set_new_dynamic_itr(&q_vector->rx);
 		rxval = i40e_buildreg_itr(I40E_RX_ITR, q_vector->rx.itr);
 	}
 
-	if (ITR_IS_DYNAMIC(vsi->tx_itr_setting)) {
+	if (ITR_IS_DYNAMIC(vsi->tx_rings[idx]->tx_itr_setting)) {
 		tx = i40e_set_new_dynamic_itr(&q_vector->tx);
 		txval = i40e_buildreg_itr(I40E_TX_ITR, q_vector->tx.itr);
 	}
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 3f081e2..ea1f479 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -246,6 +246,14 @@ struct i40e_ring {
 	u8 dcb_tc;			/* Traffic class of ring */
 	u8 __iomem *tail;
 
+	/* high bit set means dynamic, use accessor routines to read/write.
+	 * hardware only supports 2us resolution for the ITR registers.
+	 * these values always store the USER setting, and must be converted
+	 * before programming to a register.
+	 */
+	u16 rx_itr_setting;
+	u16 tx_itr_setting;
+
 	u16 count;			/* Number of descriptors */
 	u16 reg_idx;			/* HW register index of the ring */
 	u16 rx_hdr_len;
-- 
1.8.3.1


* [PATCH V3 5/6] i40e/ethtool: support coalesce getting by queue
  2016-01-14 14:08 [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting kan.liang
                   ` (2 preceding siblings ...)
  2016-01-14 14:08 ` [PATCH V3 4/6] i40e: queue-specific settings for interrupt moderation kan.liang
@ 2016-01-14 14:08 ` kan.liang
  2016-01-14 14:08 ` [PATCH V3 6/6] i40e/ethtool: support coalesce setting " kan.liang
  2016-01-17  2:00 ` [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting Ben Hutchings
  5 siblings, 0 replies; 12+ messages in thread
From: kan.liang @ 2016-01-14 14:08 UTC (permalink / raw)
  To: netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet, ben,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

This patch implements get_per_queue_coalesce for the i40e driver.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

Changes since V1:
 - Correct the way the vector for a specific queue is found.

Changes since V2:
 - Move the per-queue coalesce get code to patch 4.

 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 26ad7bd..2fe878d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -1894,6 +1894,12 @@ static int i40e_get_coalesce(struct net_device *netdev,
 	return __i40e_get_coalesce(netdev, ec, -1);
 }
 
+static int i40e_get_per_queue_coalesce(struct net_device *netdev, int queue,
+				       struct ethtool_coalesce *ec)
+{
+	return __i40e_get_coalesce(netdev, ec, queue);
+}
+
 static void i40e_set_itr_per_queue(struct i40e_vsi *vsi,
 				   struct ethtool_coalesce *ec,
 				   int queue)
@@ -2858,6 +2864,7 @@ static const struct ethtool_ops i40e_ethtool_ops = {
 	.get_ts_info		= i40e_get_ts_info,
 	.get_priv_flags		= i40e_get_priv_flags,
 	.set_priv_flags		= i40e_set_priv_flags,
+	.get_per_queue_coalesce	= i40e_get_per_queue_coalesce,
 };
 
 void i40e_set_ethtool_ops(struct net_device *netdev)
-- 
1.8.3.1


* [PATCH V3 6/6] i40e/ethtool: support coalesce setting by queue
  2016-01-14 14:08 [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting kan.liang
                   ` (3 preceding siblings ...)
  2016-01-14 14:08 ` [PATCH V3 5/6] i40e/ethtool: support coalesce getting by queue kan.liang
@ 2016-01-14 14:08 ` kan.liang
  2016-01-17  2:00 ` [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting Ben Hutchings
  5 siblings, 0 replies; 12+ messages in thread
From: kan.liang @ 2016-01-14 14:08 UTC (permalink / raw)
  To: netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet, ben,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

This patch implements set_per_queue_coalesce for the i40e driver.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

Changes since V1:
 - Correct the way the vector for a specific queue is found.

Changes since V2:
 - Move the per-queue coalesce set code to patch 4.

 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 2fe878d..54a8e96 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -2004,6 +2004,12 @@ static int i40e_set_coalesce(struct net_device *netdev,
 	return __i40e_set_coalesce(netdev, ec, -1);
 }
 
+static int i40e_set_per_queue_coalesce(struct net_device *netdev, int queue,
+				       struct ethtool_coalesce *ec)
+{
+	return __i40e_set_coalesce(netdev, ec, queue);
+}
+
 /**
  * i40e_get_rss_hash_opts - Get RSS hash Input Set for each flow type
  * @pf: pointer to the physical function struct
@@ -2865,6 +2871,7 @@ static const struct ethtool_ops i40e_ethtool_ops = {
 	.get_priv_flags		= i40e_get_priv_flags,
 	.set_priv_flags		= i40e_set_priv_flags,
 	.get_per_queue_coalesce	= i40e_get_per_queue_coalesce,
+	.set_per_queue_coalesce	= i40e_set_per_queue_coalesce,
 };
 
 void i40e_set_ethtool_ops(struct net_device *netdev)
-- 
1.8.3.1


* RE: [PATCH V3 4/6] i40e: queue-specific settings for interrupt moderation
  2016-01-14 14:08 ` [PATCH V3 4/6] i40e: queue-specific settings for interrupt moderation kan.liang
@ 2016-01-14 22:41   ` Nelson, Shannon
  0 siblings, 0 replies; 12+ messages in thread
From: Nelson, Shannon @ 2016-01-14 22:41 UTC (permalink / raw)
  To: Liang, Kan, netdev, davem, bwh
  Cc: Brandeburg, Jesse, andi, f.fainelli, alexander.duyck, Kirsher,
	Jeffrey T, Wyborny, Carolyn, Skidmore, Donald C, Williams,
	Mitch A, ogerlitz, edumazet, jiri, sfeldma, gospo, sasha.levin,
	dsahern, tj, cascardo, corbet, ben@decadent.org.uk

> From: Kan Liang <kan.liang@intel.com>
> 
> In the i40e driver, each vector has its own ITR register. However,
> there is no concept of queue-specific settings in the driver proper;
> only global variables are used to store ITR values. That causes
> problems, especially when a vector is reset, because the
> queue-specific ITR values can be lost.
> This patch moves rx_itr_setting and tx_itr_setting into i40e_ring so
> that each queue stores its own ITR register setting.
> i40e_get_coalesce and i40e_set_coalesce are also modified accordingly
> to support queue-specific settings. To stay compatible with old
> ethtool, if the user does not specify a queue number, i40e_get_coalesce
> returns queue 0's values, while i40e_set_coalesce applies the values to
> all queues.
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
> 

Acked-by: Shannon Nelson <shannon.nelson@intel.com>


* Re: [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting
  2016-01-14 14:08 [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting kan.liang
                   ` (4 preceding siblings ...)
  2016-01-14 14:08 ` [PATCH V3 6/6] i40e/ethtool: support coalesce setting " kan.liang
@ 2016-01-17  2:00 ` Ben Hutchings
  2016-01-18  2:25   ` Liang, Kan
  5 siblings, 1 reply; 12+ messages in thread
From: Ben Hutchings @ 2016-01-17  2:00 UTC (permalink / raw)
  To: kan.liang, netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet


On Thu, 2016-01-14 at 09:08 -0500, kan.liang@intel.com wrote:
[...]
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -1144,6 +1144,21 @@ enum ethtool_sfeatures_retval_bits {
>  #define ETHTOOL_F_WISH          (1 << ETHTOOL_F_WISH__BIT)
>  #define ETHTOOL_F_COMPAT        (1 << ETHTOOL_F_COMPAT__BIT)
>  
> +#define MAX_NUM_QUEUE		4096
> +
> +/**
> + * struct ethtool_per_queue_op - apply sub command to the queues in mask.
> + * @cmd: ETHTOOL_PERQUEUE
> + * @sub_command: the sub command which apply to each queues
> + * @queue_mask: Bitmap of the queues which sub command apply to
> + * @data: A complete command structure following for each of the queues addressed
> + */
> +struct ethtool_per_queue_op {
> +	__u32	cmd;
> +	__u32	sub_command;
> +	unsigned long queue_mask[BITS_TO_LONGS(MAX_NUM_QUEUE)];
[...]

I thought we already agreed that this won't work for 32-bit userland on
a big-endian 64-bit kernel?
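
As a minimal standalone sketch of the incompatibility: on a big-endian
machine the byte that holds bit 0 of an unsigned long depends on
sizeof(long), so a 32-bit process and a 64-bit kernel disagree about
where "queue 0" lives in queue_mask[] (little-endian happens to agree
in both cases). The program below is purely illustrative:

#include <stdio.h>
#include <string.h>

int main(void)
{
	unsigned long word = 1UL;	/* bit 0 set, i.e. "queue 0" */
	unsigned char bytes[sizeof(word)];
	size_t i;

	memcpy(bytes, &word, sizeof(word));
	for (i = 0; i < sizeof(word); i++)
		if (bytes[i])
			printf("bit 0 is stored in byte %zu of %zu\n",
			       i, sizeof(word));
	/* big-endian, 8-byte long: byte 7; big-endian, 4-byte long: byte 3;
	 * little-endian: byte 0 either way.
	 */
	return 0;
}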

Ben.

-- 
Ben Hutchings
Theory and practice are closer in theory than in practice.
                                - John Levine, moderator of comp.compilers



* Re: [PATCH V3 2/6] net/ethtool: support get coalesce per queue
  2016-01-14 14:08 ` [PATCH V3 2/6] net/ethtool: support get coalesce per queue kan.liang
@ 2016-01-17  2:06   ` Ben Hutchings
  0 siblings, 0 replies; 12+ messages in thread
From: Ben Hutchings @ 2016-01-17  2:06 UTC (permalink / raw)
  To: kan.liang, netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet


On Thu, 2016-01-14 at 09:08 -0500, kan.liang@intel.com wrote:
[...]
> @@ -1814,6 +1840,10 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>  	case ETHTOOL_GTUNABLE:
>  		break;
>  	case ETHTOOL_PERQUEUE:
> +		if (copy_from_user(&sub_cmd, useraddr + sizeof(ethcmd), sizeof(sub_cmd)))
> +			return -EFAULT;
> +		if (sub_cmd == ETHTOOL_GCOALESCE)
> +			break;

ETHTOOL_PERQUEUE should be handled before the containing switch(), so
we can then apply the switch() to sub_cmd; something like this:

	if (ethcmd == ETHTOOL_PERQUEUE) {
		if (copy_from_user(&sub_cmd, useraddr + sizeof(ethcmd), sizeof(sub_cmd)))
			return -EFAULT;
	} else {
		sub_cmd = ethcmd;
	}

	switch (sub_cmd) {
	...
	}

Ben.

>  	default:
>  		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
>  			return -EPERM;
-- 
Ben Hutchings
Theory and practice are closer in theory than in practice.
                                - John Levine, moderator of comp.compilers



* Re: [PATCH V3 3/6] net/ethtool: support set coalesce per queue
  2016-01-14 14:08 ` [PATCH V3 3/6] net/ethtool: support set " kan.liang
@ 2016-01-17  2:13   ` Ben Hutchings
  0 siblings, 0 replies; 12+ messages in thread
From: Ben Hutchings @ 2016-01-17  2:13 UTC (permalink / raw)
  To: kan.liang, netdev, davem, bwh
  Cc: jesse.brandeburg, andi, f.fainelli, alexander.duyck,
	jeffrey.t.kirsher, shannon.nelson, carolyn.wyborny,
	donald.c.skidmore, mitch.a.williams, ogerlitz, edumazet, jiri,
	sfeldma, gospo, sasha.levin, dsahern, tj, cascardo, corbet


On Thu, 2016-01-14 at 09:08 -0500, kan.liang@intel.com wrote:
[...]
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -1773,6 +1773,63 @@ static int ethtool_get_per_queue_coalesce(struct net_device *dev,
>  	return 0;
>  }
>  
> +static int ethtool_set_per_queue_coalesce(struct net_device *dev,
> +					  void __user *useraddr,
> +					  struct ethtool_per_queue_op *per_queue_opt)
> +{
> +	int bit, i, ret = 0;
> +	int queue_num = bitmap_weight(per_queue_opt->queue_mask, MAX_NUM_QUEUE);
> +	struct ethtool_coalesce *backup = NULL, *tmp = NULL;
> +	bool rollback = true;
> +
> +	if (!dev->ethtool_ops->set_per_queue_coalesce)
> +		return -EOPNOTSUPP;
> +
> +	if (!dev->ethtool_ops->get_per_queue_coalesce)
> +		rollback = false;

I think that get_per_queue_coalesce should be a required operation.
Why would a driver not implement both?

Then there's no need to make the rollback code conditional.

> +	useraddr += sizeof(*per_queue_opt);
> +
> +	if (rollback)
> +		tmp = backup = kmalloc(queue_num * sizeof(*backup), GFP_KERNEL);

Use kmalloc_array() as it checks for overflow (shouldn't be possible
here, but it's best to check).
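
A sketch of that change, keeping the patch's variables (the -ENOMEM
check is an addition in this sketch):

	if (rollback) {
		tmp = backup = kmalloc_array(queue_num, sizeof(*backup),
					     GFP_KERNEL);
		if (!backup)
			return -ENOMEM;
	}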

> +	for_each_set_bit(bit, per_queue_opt->queue_mask, MAX_NUM_QUEUE) {
> +		struct ethtool_coalesce coalesce;
> +
> +		if (rollback) {
> +			if (dev->ethtool_ops->get_per_queue_coalesce(dev, bit, tmp)) {
> +				ret = -EFAULT;

This should propagate the error code from get_per_queue_coalesce rather
than substituting EFAULT.

> +				goto roll_back;
> +			}
> +			tmp += sizeof(struct ethtool_coalesce);
> +		}
> +
> +		if (copy_from_user(&coalesce, useraddr, sizeof(coalesce))) {
> +			ret = -EFAULT;
> +			goto roll_back;
> +		}
> +
> +		ret = dev->ethtool_ops->set_per_queue_coalesce(dev, bit, &coalesce);
> +		if (ret != 0)
> +			goto roll_back;
> +
> +		useraddr += sizeof(coalesce);
> +	}
> +
> +roll_back:
> +	if (rollback) {
> +		if (ret != 0) {
> +			tmp = backup;
> +			for_each_set_bit(i, per_queue_opt->queue_mask, bit - 1) {

The upper bound is excluded so it should be bit, not bit - 1.

Ben.

> +				dev->ethtool_ops->set_per_queue_coalesce(dev, i, tmp);
> +				tmp += sizeof(struct ethtool_coalesce);
> +			}
> +		}
> +		kfree(backup);
> +	}
> +	return ret;
> +}
[...]

-- 
Ben Hutchings
Theory and practice are closer in theory than in practice.
                                - John Levine, moderator of comp.compilers



* RE: [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting
  2016-01-17  2:00 ` [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting Ben Hutchings
@ 2016-01-18  2:25   ` Liang, Kan
  2016-01-19 19:30     ` David Decotigny
  0 siblings, 1 reply; 12+ messages in thread
From: Liang, Kan @ 2016-01-18  2:25 UTC (permalink / raw)
  To: Ben Hutchings, netdev, davem, bwh, decot
  Cc: Brandeburg, Jesse, andi, f.fainelli, alexander.duyck, Kirsher,
	Jeffrey T, Nelson, Shannon, Wyborny, Carolyn, Skidmore, Donald C,
	Williams, Mitch A, ogerlitz, edumazet, jiri, sfeldma, gospo,
	sasha.levin, dsahern, tj, cascardo, corbet@lwn.net

> 
> On Thu, 2016-01-14 at 09:08 -0500, kan.liang@intel.com wrote:
> [...]
> > --- a/include/uapi/linux/ethtool.h
> > +++ b/include/uapi/linux/ethtool.h
> > @@ -1144,6 +1144,21 @@ enum ethtool_sfeatures_retval_bits {
> >  #define ETHTOOL_F_WISH          (1 << ETHTOOL_F_WISH__BIT)
> >  #define ETHTOOL_F_COMPAT        (1 << ETHTOOL_F_COMPAT__BIT)
> >
> > +#define MAX_NUM_QUEUE		4096
> > +
> > +/**
> > + * struct ethtool_per_queue_op - apply sub command to the queues in
> mask.
> > + * @cmd: ETHTOOL_PERQUEUE
> > + * @sub_command: the sub command which apply to each queues
> > + * @queue_mask: Bitmap of the queues which sub command apply to
> > + * @data: A complete command structure following for each of the
> > +queues addressed  */ struct ethtool_per_queue_op {
> > +	__u32	cmd;
> > +	__u32	sub_command;
> > +	unsigned long
> queue_mask[BITS_TO_LONGS(MAX_NUM_QUEUE)];
> [...]
> 
> I thought we already agreed that this won't work for 32-bit userland on a
> big-endian 64-bit kernel?

I found David Decotigny's patch: https://lkml.org/lkml/2015/12/14/728
It looks like it has not been merged yet.
Any idea when it will be merged into net-next or net, so that I can
build my patches on top of it?

Thanks,
Kan

> 
> Ben.
> 
> --
> Ben Hutchings
> Theory and practice are closer in theory than in practice.
>                                 - John Levine, moderator of comp.compilers


* Re: [PATCH V3 1/6] net/ethtool: introduce a new ioctl for per queue setting
  2016-01-18  2:25   ` Liang, Kan
@ 2016-01-19 19:30     ` David Decotigny
  0 siblings, 0 replies; 12+ messages in thread
From: David Decotigny @ 2016-01-19 19:30 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Ben Hutchings, netdev, davem, bwh, Brandeburg, Jesse, andi,
	f.fainelli, alexander.duyck, Kirsher, Jeffrey T, Nelson, Shannon,
	Wyborny, Carolyn, Skidmore, Donald C, Williams, Mitch A,
	ogerlitz, edumazet, jiri, sfeldma, gospo, sasha.levin@oracle.com

I have a v6 ready; I didn't have a chance to test it and send it out
for review yet. I will try to get that done by tomorrow morning (PST).

On Sun, Jan 17, 2016 at 6:25 PM, Liang, Kan <kan.liang@intel.com> wrote:
>>
>> On Thu, 2016-01-14 at 09:08 -0500, kan.liang@intel.com wrote:
>> [...]
>> > --- a/include/uapi/linux/ethtool.h
>> > +++ b/include/uapi/linux/ethtool.h
>> > @@ -1144,6 +1144,21 @@ enum ethtool_sfeatures_retval_bits {
>> >  #define ETHTOOL_F_WISH          (1 << ETHTOOL_F_WISH__BIT)
>> >  #define ETHTOOL_F_COMPAT        (1 << ETHTOOL_F_COMPAT__BIT)
>> >
>> > +#define MAX_NUM_QUEUE              4096
>> > +
>> > +/**
>> > + * struct ethtool_per_queue_op - apply sub command to the queues in
>> mask.
>> > + * @cmd: ETHTOOL_PERQUEUE
>> > + * @sub_command: the sub command which apply to each queues
>> > + * @queue_mask: Bitmap of the queues which sub command apply to
>> > + * @data: A complete command structure following for each of the
>> > +queues addressed  */ struct ethtool_per_queue_op {
>> > +   __u32   cmd;
>> > +   __u32   sub_command;
>> > +   unsigned long
>> queue_mask[BITS_TO_LONGS(MAX_NUM_QUEUE)];
>> [...]
>>
>> I thought we already agreed that this won't work for 32-bit userland on a
>> big-endian 64-bit kernel?
>
> I found David Decotigny's patch. https://lkml.org/lkml/2015/12/14/728
> But it looks like it is not merged yet.
> Any idea when it will be merged to net-next or net?
> So I can build my patch on top of his patch.
>
> Thanks,
> Kan
>
>>
>> Ben.
>>
>> --
>> Ben Hutchings
>> Theory and practice are closer in theory than in practice.
>>                                 - John Levine, moderator of comp.compilers

