All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
@ 2018-01-26  2:26 Cong Wang
  2018-01-26  2:26 ` [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len() Cong Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Cong Wang @ 2018-01-26  2:26 UTC (permalink / raw)
  To: netdev; +Cc: john.fastabend, Cong Wang

This pathcset restores the pfifo_fast qdisc behavior of dropping
packets based on latest dev->tx_queue_len. Patch 1 introduces
a helper, patch 2 introduces a new Qdisc ops which is called when
we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.

Please see each patch for details.

---
v3: use skb_array_resize_multiple()
v2: handle error case for ->change_tx_queue_len()

Cong Wang (3):
  net: introduce helper dev_change_tx_queue_len()
  net_sched: plug in qdisc ops change_tx_queue_len
  net_sched: implement ->change_tx_queue_len() for pfifo_fast

 include/linux/netdevice.h |  1 +
 include/net/sch_generic.h |  2 ++
 net/core/dev.c            | 29 +++++++++++++++++++++++++++
 net/core/net-sysfs.c      | 25 +----------------------
 net/core/rtnetlink.c      | 18 +++++------------
 net/sched/sch_generic.c   | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 89 insertions(+), 37 deletions(-)

-- 
2.13.0

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len()
  2018-01-26  2:26 [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast Cong Wang
@ 2018-01-26  2:26 ` Cong Wang
  2018-06-27 16:14   ` Eric Dumazet
  2018-01-26  2:26 ` [Patch net-next v3 2/3] net_sched: plug in qdisc ops change_tx_queue_len Cong Wang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Cong Wang @ 2018-01-26  2:26 UTC (permalink / raw)
  To: netdev; +Cc: john.fastabend, Cong Wang

This patch promotes the local change_tx_queue_len() to a core
helper function, dev_change_tx_queue_len(), so that rtnetlink
and net-sysfs could share the code. This also prepares for the
following patch.

Note, the -EFAULT in the original code doesn't make sense,
we should propagate the errno from notifiers.

Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/linux/netdevice.h |  1 +
 net/core/dev.c            | 28 ++++++++++++++++++++++++++++
 net/core/net-sysfs.c      | 25 +------------------------
 net/core/rtnetlink.c      | 18 +++++-------------
 4 files changed, 35 insertions(+), 37 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 24a62d590350..0804e1d38c78 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3330,6 +3330,7 @@ int dev_get_alias(const struct net_device *, char *, size_t);
 int dev_change_net_namespace(struct net_device *, struct net *, const char *);
 int __dev_set_mtu(struct net_device *, int);
 int dev_set_mtu(struct net_device *, int);
+int dev_change_tx_queue_len(struct net_device *, unsigned long);
 void dev_set_group(struct net_device *, int);
 int dev_set_mac_address(struct net_device *, struct sockaddr *);
 int dev_change_carrier(struct net_device *, bool new_carrier);
diff --git a/net/core/dev.c b/net/core/dev.c
index 4670ccabe23a..e0b0c2784070 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7048,6 +7048,34 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
 EXPORT_SYMBOL(dev_set_mtu);
 
 /**
+ *	dev_change_tx_queue_len - Change TX queue length of a netdevice
+ *	@dev: device
+ *	@new_len: new tx queue length
+ */
+int dev_change_tx_queue_len(struct net_device *dev, unsigned long new_len)
+{
+	unsigned int orig_len = dev->tx_queue_len;
+	int res;
+
+	if (new_len != (unsigned int)new_len)
+		return -ERANGE;
+
+	if (new_len != orig_len) {
+		dev->tx_queue_len = new_len;
+		res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
+		res = notifier_to_errno(res);
+		if (res) {
+			netdev_err(dev,
+				   "refused to change device tx_queue_len\n");
+			dev->tx_queue_len = orig_len;
+			return res;
+		}
+	}
+
+	return 0;
+}
+
+/**
  *	dev_set_group - Change group this device belongs to
  *	@dev: device
  *	@new_group: group this device should belong to
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index c4a28f4667b6..60a5ad2c33ee 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -346,29 +346,6 @@ static ssize_t flags_store(struct device *dev, struct device_attribute *attr,
 }
 NETDEVICE_SHOW_RW(flags, fmt_hex);
 
-static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
-{
-	unsigned int orig_len = dev->tx_queue_len;
-	int res;
-
-	if (new_len != (unsigned int)new_len)
-		return -ERANGE;
-
-	if (new_len != orig_len) {
-		dev->tx_queue_len = new_len;
-		res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
-		res = notifier_to_errno(res);
-		if (res) {
-			netdev_err(dev,
-				   "refused to change device tx_queue_len\n");
-			dev->tx_queue_len = orig_len;
-			return -EFAULT;
-		}
-	}
-
-	return 0;
-}
-
 static ssize_t tx_queue_len_store(struct device *dev,
 				  struct device_attribute *attr,
 				  const char *buf, size_t len)
@@ -376,7 +353,7 @@ static ssize_t tx_queue_len_store(struct device *dev,
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
 
-	return netdev_store(dev, attr, buf, len, change_tx_queue_len);
+	return netdev_store(dev, attr, buf, len, dev_change_tx_queue_len);
 }
 NETDEVICE_SHOW_RW(tx_queue_len, fmt_dec);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 97874daa1336..6fa6b9c60694 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2291,19 +2291,11 @@ static int do_setlink(const struct sk_buff *skb,
 
 	if (tb[IFLA_TXQLEN]) {
 		unsigned int value = nla_get_u32(tb[IFLA_TXQLEN]);
-		unsigned int orig_len = dev->tx_queue_len;
-
-		if (dev->tx_queue_len ^ value) {
-			dev->tx_queue_len = value;
-			err = call_netdevice_notifiers(
-			      NETDEV_CHANGE_TX_QUEUE_LEN, dev);
-			err = notifier_to_errno(err);
-			if (err) {
-				dev->tx_queue_len = orig_len;
-				goto errout;
-			}
-			status |= DO_SETLINK_MODIFIED;
-		}
+
+		err = dev_change_tx_queue_len(dev, value);
+		if (err)
+			goto errout;
+		status |= DO_SETLINK_MODIFIED;
 	}
 
 	if (tb[IFLA_GSO_MAX_SIZE]) {
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Patch net-next v3 2/3] net_sched: plug in qdisc ops change_tx_queue_len
  2018-01-26  2:26 [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast Cong Wang
  2018-01-26  2:26 ` [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len() Cong Wang
@ 2018-01-26  2:26 ` Cong Wang
  2018-01-26 14:16   ` Michael S. Tsirkin
  2018-01-26  2:26 ` [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast Cong Wang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Cong Wang @ 2018-01-26  2:26 UTC (permalink / raw)
  To: netdev; +Cc: john.fastabend, Cong Wang

Introduce a new qdisc ops ->change_tx_queue_len() so that
each qdisc could decide how to implement this if it wants.
Previously we simply read dev->tx_queue_len, after pfifo_fast
switches to skb array, we need this API to resize the skb array
when we change dev->tx_queue_len.

To avoid handling race conditions with TX BH, we need to
deactivate all TX queues before change the value and bring them
back after we are done, this also makes implementation easier.

Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/net/sch_generic.h |  2 ++
 net/core/dev.c            |  1 +
 net/sched/sch_generic.c   | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index eac43e8ca96d..e2ab13687fb9 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -200,6 +200,7 @@ struct Qdisc_ops {
 					  struct nlattr *arg,
 					  struct netlink_ext_ack *extack);
 	void			(*attach)(struct Qdisc *sch);
+	int			(*change_tx_queue_len)(struct Qdisc *, unsigned int);
 
 	int			(*dump)(struct Qdisc *, struct sk_buff *);
 	int			(*dump_stats)(struct Qdisc *, struct gnet_dump *);
@@ -489,6 +490,7 @@ void qdisc_class_hash_remove(struct Qdisc_class_hash *,
 void qdisc_class_hash_grow(struct Qdisc *, struct Qdisc_class_hash *);
 void qdisc_class_hash_destroy(struct Qdisc_class_hash *);
 
+int dev_qdisc_change_tx_queue_len(struct net_device *dev);
 void dev_init_scheduler(struct net_device *dev);
 void dev_shutdown(struct net_device *dev);
 void dev_activate(struct net_device *dev);
diff --git a/net/core/dev.c b/net/core/dev.c
index e0b0c2784070..c8443cfaa17a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7070,6 +7070,7 @@ int dev_change_tx_queue_len(struct net_device *dev, unsigned long new_len)
 			dev->tx_queue_len = orig_len;
 			return res;
 		}
+		return dev_qdisc_change_tx_queue_len(dev);
 	}
 
 	return 0;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 1816bde47256..08f9fa27e06e 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1178,6 +1178,39 @@ void dev_deactivate(struct net_device *dev)
 }
 EXPORT_SYMBOL(dev_deactivate);
 
+static int qdisc_change_tx_queue_len(struct net_device *dev,
+				     struct netdev_queue *dev_queue)
+{
+	struct Qdisc *qdisc = dev_queue->qdisc_sleeping;
+	const struct Qdisc_ops *ops = qdisc->ops;
+
+	if (ops->change_tx_queue_len)
+		return ops->change_tx_queue_len(qdisc, dev->tx_queue_len);
+	return 0;
+}
+
+int dev_qdisc_change_tx_queue_len(struct net_device *dev)
+{
+	bool up = dev->flags & IFF_UP;
+	unsigned int i;
+	int ret = 0;
+
+	if (up)
+		dev_deactivate(dev);
+
+	for (i = 0; i < dev->num_tx_queues; i++) {
+		ret = qdisc_change_tx_queue_len(dev, &dev->_tx[i]);
+
+		/* TODO: revert changes on a partial failure */
+		if (ret)
+			break;
+	}
+
+	if (up)
+		dev_activate(dev);
+	return ret;
+}
+
 static void dev_init_scheduler_queue(struct net_device *dev,
 				     struct netdev_queue *dev_queue,
 				     void *_qdisc)
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast
  2018-01-26  2:26 [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast Cong Wang
  2018-01-26  2:26 ` [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len() Cong Wang
  2018-01-26  2:26 ` [Patch net-next v3 2/3] net_sched: plug in qdisc ops change_tx_queue_len Cong Wang
@ 2018-01-26  2:26 ` Cong Wang
  2018-01-26  3:57   ` Jason Wang
  2018-01-29  5:35 ` [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change " John Fastabend
  2018-01-29 17:43 ` David Miller
  4 siblings, 1 reply; 21+ messages in thread
From: Cong Wang @ 2018-01-26  2:26 UTC (permalink / raw)
  To: netdev; +Cc: john.fastabend, Cong Wang

pfifo_fast used to drop based on qdisc_dev(qdisc)->tx_queue_len,
so we have to resize skb array when we change tx_queue_len.

Other qdiscs which read tx_queue_len are fine because they
all save it to sch->limit or somewhere else in qdisc during init.
They don't have to implement this, it is nicer if they do so
that users don't have to re-configure qdisc after changing
tx_queue_len.

Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/sch_generic.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 08f9fa27e06e..190570f21b20 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -763,6 +763,23 @@ static void pfifo_fast_destroy(struct Qdisc *sch)
 	}
 }
 
+static int pfifo_fast_change_tx_queue_len(struct Qdisc *sch,
+					  unsigned int new_len)
+{
+	struct pfifo_fast_priv *priv = qdisc_priv(sch);
+	struct skb_array *bands[PFIFO_FAST_BANDS];
+	int prio;
+
+	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
+		struct skb_array *q = band2list(priv, prio);
+
+		bands[prio] = q;
+	}
+
+	return skb_array_resize_multiple(bands, PFIFO_FAST_BANDS, new_len,
+					 GFP_KERNEL);
+}
+
 struct Qdisc_ops pfifo_fast_ops __read_mostly = {
 	.id		=	"pfifo_fast",
 	.priv_size	=	sizeof(struct pfifo_fast_priv),
@@ -773,6 +790,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
 	.destroy	=	pfifo_fast_destroy,
 	.reset		=	pfifo_fast_reset,
 	.dump		=	pfifo_fast_dump,
+	.change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
 	.owner		=	THIS_MODULE,
 	.static_flags	=	TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
 };
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast
  2018-01-26  2:26 ` [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast Cong Wang
@ 2018-01-26  3:57   ` Jason Wang
  2018-01-26  4:01     ` Cong Wang
  2018-01-26 13:48     ` Michael S. Tsirkin
  0 siblings, 2 replies; 21+ messages in thread
From: Jason Wang @ 2018-01-26  3:57 UTC (permalink / raw)
  To: Cong Wang, netdev; +Cc: john.fastabend, Michael S. Tsirkin



On 2018年01月26日 10:26, Cong Wang wrote:
> pfifo_fast used to drop based on qdisc_dev(qdisc)->tx_queue_len,
> so we have to resize skb array when we change tx_queue_len.
>
> Other qdiscs which read tx_queue_len are fine because they
> all save it to sch->limit or somewhere else in qdisc during init.
> They don't have to implement this, it is nicer if they do so
> that users don't have to re-configure qdisc after changing
> tx_queue_len.
>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>   net/sched/sch_generic.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 08f9fa27e06e..190570f21b20 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -763,6 +763,23 @@ static void pfifo_fast_destroy(struct Qdisc *sch)
>   	}
>   }
>   
> +static int pfifo_fast_change_tx_queue_len(struct Qdisc *sch,
> +					  unsigned int new_len)
> +{
> +	struct pfifo_fast_priv *priv = qdisc_priv(sch);
> +	struct skb_array *bands[PFIFO_FAST_BANDS];
> +	int prio;
> +
> +	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
> +		struct skb_array *q = band2list(priv, prio);
> +
> +		bands[prio] = q;
> +	}
> +
> +	return skb_array_resize_multiple(bands, PFIFO_FAST_BANDS, new_len,
> +					 GFP_KERNEL);
> +}
> +
>   struct Qdisc_ops pfifo_fast_ops __read_mostly = {
>   	.id		=	"pfifo_fast",
>   	.priv_size	=	sizeof(struct pfifo_fast_priv),
> @@ -773,6 +790,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
>   	.destroy	=	pfifo_fast_destroy,
>   	.reset		=	pfifo_fast_reset,
>   	.dump		=	pfifo_fast_dump,
> +	.change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
>   	.owner		=	THIS_MODULE,
>   	.static_flags	=	TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
>   };

Is __skb_array_empty() in pfifo_fast_dequeue() still safe after this change?

Thanks

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast
  2018-01-26  3:57   ` Jason Wang
@ 2018-01-26  4:01     ` Cong Wang
  2018-01-26 14:10       ` Michael S. Tsirkin
  2018-01-29  3:31       ` Jason Wang
  2018-01-26 13:48     ` Michael S. Tsirkin
  1 sibling, 2 replies; 21+ messages in thread
From: Cong Wang @ 2018-01-26  4:01 UTC (permalink / raw)
  To: Jason Wang
  Cc: Linux Kernel Network Developers, John Fastabend, Michael S. Tsirkin

On Thu, Jan 25, 2018 at 7:57 PM, Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2018年01月26日 10:26, Cong Wang wrote:
>>
>> pfifo_fast used to drop based on qdisc_dev(qdisc)->tx_queue_len,
>> so we have to resize skb array when we change tx_queue_len.
>>
>> Other qdiscs which read tx_queue_len are fine because they
>> all save it to sch->limit or somewhere else in qdisc during init.
>> They don't have to implement this, it is nicer if they do so
>> that users don't have to re-configure qdisc after changing
>> tx_queue_len.
>>
>> Cc: John Fastabend <john.fastabend@gmail.com>
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> ---
>>   net/sched/sch_generic.c | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index 08f9fa27e06e..190570f21b20 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -763,6 +763,23 @@ static void pfifo_fast_destroy(struct Qdisc *sch)
>>         }
>>   }
>>   +static int pfifo_fast_change_tx_queue_len(struct Qdisc *sch,
>> +                                         unsigned int new_len)
>> +{
>> +       struct pfifo_fast_priv *priv = qdisc_priv(sch);
>> +       struct skb_array *bands[PFIFO_FAST_BANDS];
>> +       int prio;
>> +
>> +       for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
>> +               struct skb_array *q = band2list(priv, prio);
>> +
>> +               bands[prio] = q;
>> +       }
>> +
>> +       return skb_array_resize_multiple(bands, PFIFO_FAST_BANDS, new_len,
>> +                                        GFP_KERNEL);
>> +}
>> +
>>   struct Qdisc_ops pfifo_fast_ops __read_mostly = {
>>         .id             =       "pfifo_fast",
>>         .priv_size      =       sizeof(struct pfifo_fast_priv),
>> @@ -773,6 +790,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
>>         .destroy        =       pfifo_fast_destroy,
>>         .reset          =       pfifo_fast_reset,
>>         .dump           =       pfifo_fast_dump,
>> +       .change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
>>         .owner          =       THIS_MODULE,
>>         .static_flags   =       TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
>>   };
>
>
> Is __skb_array_empty() in pfifo_fast_dequeue() still safe after this change?

Yes, we sync with dequeue path before calling ->change_tx_queue_len().
I already mentioned this in patch 2/3.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast
  2018-01-26  3:57   ` Jason Wang
  2018-01-26  4:01     ` Cong Wang
@ 2018-01-26 13:48     ` Michael S. Tsirkin
  1 sibling, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2018-01-26 13:48 UTC (permalink / raw)
  To: Jason Wang; +Cc: Cong Wang, netdev, john.fastabend

On Fri, Jan 26, 2018 at 11:57:59AM +0800, Jason Wang wrote:
> 
> 
> On 2018年01月26日 10:26, Cong Wang wrote:
> > pfifo_fast used to drop based on qdisc_dev(qdisc)->tx_queue_len,
> > so we have to resize skb array when we change tx_queue_len.
> > 
> > Other qdiscs which read tx_queue_len are fine because they
> > all save it to sch->limit or somewhere else in qdisc during init.
> > They don't have to implement this, it is nicer if they do so
> > that users don't have to re-configure qdisc after changing
> > tx_queue_len.
> > 
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >   net/sched/sch_generic.c | 18 ++++++++++++++++++
> >   1 file changed, 18 insertions(+)
> > 
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index 08f9fa27e06e..190570f21b20 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -763,6 +763,23 @@ static void pfifo_fast_destroy(struct Qdisc *sch)
> >   	}
> >   }
> > +static int pfifo_fast_change_tx_queue_len(struct Qdisc *sch,
> > +					  unsigned int new_len)
> > +{
> > +	struct pfifo_fast_priv *priv = qdisc_priv(sch);
> > +	struct skb_array *bands[PFIFO_FAST_BANDS];
> > +	int prio;
> > +
> > +	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
> > +		struct skb_array *q = band2list(priv, prio);
> > +
> > +		bands[prio] = q;
> > +	}
> > +
> > +	return skb_array_resize_multiple(bands, PFIFO_FAST_BANDS, new_len,
> > +					 GFP_KERNEL);
> > +}
> > +
> >   struct Qdisc_ops pfifo_fast_ops __read_mostly = {
> >   	.id		=	"pfifo_fast",
> >   	.priv_size	=	sizeof(struct pfifo_fast_priv),
> > @@ -773,6 +790,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
> >   	.destroy	=	pfifo_fast_destroy,
> >   	.reset		=	pfifo_fast_reset,
> >   	.dump		=	pfifo_fast_dump,
> > +	.change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
> >   	.owner		=	THIS_MODULE,
> >   	.static_flags	=	TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
> >   };
> 
> Is __skb_array_empty() in pfifo_fast_dequeue() still safe after this change?
> 
> Thanks

I think it isn't.  If you want to use resize *and* use unlocked variants,
you must lock all producers and consumers when resizing yourself.

-- 
MST

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast
  2018-01-26  4:01     ` Cong Wang
@ 2018-01-26 14:10       ` Michael S. Tsirkin
  2018-01-29  2:33         ` Cong Wang
  2018-01-29  3:31       ` Jason Wang
  1 sibling, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2018-01-26 14:10 UTC (permalink / raw)
  To: Cong Wang; +Cc: Jason Wang, Linux Kernel Network Developers, John Fastabend

On Thu, Jan 25, 2018 at 08:01:42PM -0800, Cong Wang wrote:
> On Thu, Jan 25, 2018 at 7:57 PM, Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > On 2018年01月26日 10:26, Cong Wang wrote:
> >>
> >> pfifo_fast used to drop based on qdisc_dev(qdisc)->tx_queue_len,
> >> so we have to resize skb array when we change tx_queue_len.
> >>
> >> Other qdiscs which read tx_queue_len are fine because they
> >> all save it to sch->limit or somewhere else in qdisc during init.
> >> They don't have to implement this, it is nicer if they do so
> >> that users don't have to re-configure qdisc after changing
> >> tx_queue_len.
> >>
> >> Cc: John Fastabend <john.fastabend@gmail.com>
> >> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> >> ---
> >>   net/sched/sch_generic.c | 18 ++++++++++++++++++
> >>   1 file changed, 18 insertions(+)
> >>
> >> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> >> index 08f9fa27e06e..190570f21b20 100644
> >> --- a/net/sched/sch_generic.c
> >> +++ b/net/sched/sch_generic.c
> >> @@ -763,6 +763,23 @@ static void pfifo_fast_destroy(struct Qdisc *sch)
> >>         }
> >>   }
> >>   +static int pfifo_fast_change_tx_queue_len(struct Qdisc *sch,
> >> +                                         unsigned int new_len)
> >> +{
> >> +       struct pfifo_fast_priv *priv = qdisc_priv(sch);
> >> +       struct skb_array *bands[PFIFO_FAST_BANDS];
> >> +       int prio;
> >> +
> >> +       for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
> >> +               struct skb_array *q = band2list(priv, prio);
> >> +
> >> +               bands[prio] = q;
> >> +       }
> >> +
> >> +       return skb_array_resize_multiple(bands, PFIFO_FAST_BANDS, new_len,
> >> +                                        GFP_KERNEL);
> >> +}
> >> +
> >>   struct Qdisc_ops pfifo_fast_ops __read_mostly = {
> >>         .id             =       "pfifo_fast",
> >>         .priv_size      =       sizeof(struct pfifo_fast_priv),
> >> @@ -773,6 +790,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
> >>         .destroy        =       pfifo_fast_destroy,
> >>         .reset          =       pfifo_fast_reset,
> >>         .dump           =       pfifo_fast_dump,
> >> +       .change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
> >>         .owner          =       THIS_MODULE,
> >>         .static_flags   =       TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
> >>   };
> >
> >
> > Is __skb_array_empty() in pfifo_fast_dequeue() still safe after this change?
> 
> Yes, we sync with dequeue path before calling ->change_tx_queue_len().
> I already mentioned this in patch 2/3.


This part?

+       bool up = dev->flags & IFF_UP;
+       unsigned int i;
+       int ret = 0;
+
+       if (up)
+               dev_deactivate(dev);
+
+       for (i = 0; i < dev->num_tx_queues; i++) {
+               ret = qdisc_change_tx_queue_len(dev, &dev->_tx[i]);
+
+               /* TODO: revert changes on a partial failure */
+               if (ret)
+                       break;
+       }
+
+       if (up)
+               dev_activate(dev);


I wonder whether it really is safe to read dev->flags like that
without any locks.

-- 
MST

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 2/3] net_sched: plug in qdisc ops change_tx_queue_len
  2018-01-26  2:26 ` [Patch net-next v3 2/3] net_sched: plug in qdisc ops change_tx_queue_len Cong Wang
@ 2018-01-26 14:16   ` Michael S. Tsirkin
  2018-01-29  2:31     ` Cong Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2018-01-26 14:16 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, john.fastabend

On Thu, Jan 25, 2018 at 06:26:23PM -0800, Cong Wang wrote:
> Introduce a new qdisc ops ->change_tx_queue_len() so that
> each qdisc could decide how to implement this if it wants.
> Previously we simply read dev->tx_queue_len, after pfifo_fast
> switches to skb array, we need this API to resize the skb array
> when we change dev->tx_queue_len.
> 
> To avoid handling race conditions with TX BH, we need to
> deactivate all TX queues before change the value and bring them
> back after we are done, this also makes implementation easier.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  include/net/sch_generic.h |  2 ++
>  net/core/dev.c            |  1 +
>  net/sched/sch_generic.c   | 33 +++++++++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+)
> 
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index eac43e8ca96d..e2ab13687fb9 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -200,6 +200,7 @@ struct Qdisc_ops {
>  					  struct nlattr *arg,
>  					  struct netlink_ext_ack *extack);
>  	void			(*attach)(struct Qdisc *sch);
> +	int			(*change_tx_queue_len)(struct Qdisc *, unsigned int);
>  
>  	int			(*dump)(struct Qdisc *, struct sk_buff *);
>  	int			(*dump_stats)(struct Qdisc *, struct gnet_dump *);
> @@ -489,6 +490,7 @@ void qdisc_class_hash_remove(struct Qdisc_class_hash *,
>  void qdisc_class_hash_grow(struct Qdisc *, struct Qdisc_class_hash *);
>  void qdisc_class_hash_destroy(struct Qdisc_class_hash *);
>  
> +int dev_qdisc_change_tx_queue_len(struct net_device *dev);
>  void dev_init_scheduler(struct net_device *dev);
>  void dev_shutdown(struct net_device *dev);
>  void dev_activate(struct net_device *dev);
> diff --git a/net/core/dev.c b/net/core/dev.c
> index e0b0c2784070..c8443cfaa17a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7070,6 +7070,7 @@ int dev_change_tx_queue_len(struct net_device *dev, unsigned long new_len)
>  			dev->tx_queue_len = orig_len;
>  			return res;
>  		}
> +		return dev_qdisc_change_tx_queue_len(dev);
>  	}
>  
>  	return 0;
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 1816bde47256..08f9fa27e06e 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -1178,6 +1178,39 @@ void dev_deactivate(struct net_device *dev)
>  }
>  EXPORT_SYMBOL(dev_deactivate);
>  
> +static int qdisc_change_tx_queue_len(struct net_device *dev,
> +				     struct netdev_queue *dev_queue)
> +{
> +	struct Qdisc *qdisc = dev_queue->qdisc_sleeping;
> +	const struct Qdisc_ops *ops = qdisc->ops;
> +
> +	if (ops->change_tx_queue_len)
> +		return ops->change_tx_queue_len(qdisc, dev->tx_queue_len);
> +	return 0;
> +}
> +
> +int dev_qdisc_change_tx_queue_len(struct net_device *dev)
> +{
> +	bool up = dev->flags & IFF_UP;
> +	unsigned int i;
> +	int ret = 0;
> +
> +	if (up)
> +		dev_deactivate(dev);


This drops all packets in the queue. I don't think tweaking the queue
length did this previously - did it?
If not this change might surprise some people.

> +
> +	for (i = 0; i < dev->num_tx_queues; i++) {
> +		ret = qdisc_change_tx_queue_len(dev, &dev->_tx[i]);
> +
> +		/* TODO: revert changes on a partial failure */
> +		if (ret)
> +			break;
> +	}
> +
> +	if (up)
> +		dev_activate(dev);
> +	return ret;
> +}
> +
>  static void dev_init_scheduler_queue(struct net_device *dev,
>  				     struct netdev_queue *dev_queue,
>  				     void *_qdisc)
> -- 
> 2.13.0

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 2/3] net_sched: plug in qdisc ops change_tx_queue_len
  2018-01-26 14:16   ` Michael S. Tsirkin
@ 2018-01-29  2:31     ` Cong Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Cong Wang @ 2018-01-29  2:31 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Linux Kernel Network Developers, John Fastabend

On Fri, Jan 26, 2018 at 6:16 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> This drops all packets in the queue. I don't think tweaking the queue
> length did this previously - did it?

No, because previously only enqueue reads the value.

> If not this change might surprise some people.

It is hard to say which behavior is better, it depends on what you
expect:

1) If you want to tx_queue_len change immediately takes affects,
it should drop those in queue too, maybe not all, but at least when
we shrinking the queue length, we should drop those beyond limit.

2) If you want to tx_queue_len change takes affects after all pending
packets are pushed out. This is literally the old behavior, so at some
moments, the number of packets in queue could be larger than the
new tx_queue_len.

I don't see which one is obviously better than the other one,
therefore it is hard to say which one people really expect.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast
  2018-01-26 14:10       ` Michael S. Tsirkin
@ 2018-01-29  2:33         ` Cong Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Cong Wang @ 2018-01-29  2:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Linux Kernel Network Developers, John Fastabend

On Fri, Jan 26, 2018 at 6:10 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> This part?

Yes, dev_deactivate() as you quote.

>
> +       bool up = dev->flags & IFF_UP;
> +       unsigned int i;
> +       int ret = 0;
> +
> +       if (up)
> +               dev_deactivate(dev);
> +
> +       for (i = 0; i < dev->num_tx_queues; i++) {
> +               ret = qdisc_change_tx_queue_len(dev, &dev->_tx[i]);
> +
> +               /* TODO: revert changes on a partial failure */
> +               if (ret)
> +                       break;
> +       }
> +
> +       if (up)
> +               dev_activate(dev);
>
>
> I wonder whether it really is safe to read dev->flags like that
> without any locks.

I really to hate to point it out again we have RTNL here. You
missed my previous response to John. ;)

Please read v1 and v2 when you response to v3.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast
  2018-01-26  4:01     ` Cong Wang
  2018-01-26 14:10       ` Michael S. Tsirkin
@ 2018-01-29  3:31       ` Jason Wang
  1 sibling, 0 replies; 21+ messages in thread
From: Jason Wang @ 2018-01-29  3:31 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, John Fastabend, Michael S. Tsirkin



On 2018年01月26日 12:01, Cong Wang wrote:
>> Is __skb_array_empty() in pfifo_fast_dequeue() still safe after this change?
> Yes, we sync with dequeue path before calling ->change_tx_queue_len().
> I already mentioned this in patch 2/3.

Aha, ok, I think I get the synchronize_net() trick in dev_deactivate_many().

Thanks

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
  2018-01-26  2:26 [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast Cong Wang
                   ` (2 preceding siblings ...)
  2018-01-26  2:26 ` [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast Cong Wang
@ 2018-01-29  5:35 ` John Fastabend
  2018-01-29  5:57   ` Cong Wang
  2018-01-29  6:01   ` Cong Wang
  2018-01-29 17:43 ` David Miller
  4 siblings, 2 replies; 21+ messages in thread
From: John Fastabend @ 2018-01-29  5:35 UTC (permalink / raw)
  To: Cong Wang, netdev

On 01/25/2018 06:26 PM, Cong Wang wrote:
> This pathcset restores the pfifo_fast qdisc behavior of dropping
> packets based on latest dev->tx_queue_len. Patch 1 introduces
> a helper, patch 2 introduces a new Qdisc ops which is called when
> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
> 
> Please see each patch for details.
> 

Overall this series is better than what we have at the moment, but
a better fix would preallocate the memory, to avoid ENOMEM errors,
and add a ptr_ring API to use the preallocated buffers.

We have time (its only in net-next) so lets do the complete fix
rather than this series IMO.

.John

> ---
> v3: use skb_array_resize_multiple()
> v2: handle error case for ->change_tx_queue_len()
> 
> Cong Wang (3):
>   net: introduce helper dev_change_tx_queue_len()
>   net_sched: plug in qdisc ops change_tx_queue_len
>   net_sched: implement ->change_tx_queue_len() for pfifo_fast
> 
>  include/linux/netdevice.h |  1 +
>  include/net/sch_generic.h |  2 ++
>  net/core/dev.c            | 29 +++++++++++++++++++++++++++
>  net/core/net-sysfs.c      | 25 +----------------------
>  net/core/rtnetlink.c      | 18 +++++------------
>  net/sched/sch_generic.c   | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 89 insertions(+), 37 deletions(-)
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
  2018-01-29  5:35 ` [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change " John Fastabend
@ 2018-01-29  5:57   ` Cong Wang
  2018-01-29  6:09     ` John Fastabend
  2018-01-29  6:01   ` Cong Wang
  1 sibling, 1 reply; 21+ messages in thread
From: Cong Wang @ 2018-01-29  5:57 UTC (permalink / raw)
  To: John Fastabend; +Cc: Linux Kernel Network Developers

On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 01/25/2018 06:26 PM, Cong Wang wrote:
>> This pathcset restores the pfifo_fast qdisc behavior of dropping
>> packets based on latest dev->tx_queue_len. Patch 1 introduces
>> a helper, patch 2 introduces a new Qdisc ops which is called when
>> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
>>
>> Please see each patch for details.
>>
>
> Overall this series is better than what we have at the moment, but
> a better fix would preallocate the memory, to avoid ENOMEM errors,

I am not against for better ENOMEM error handling, but I still have to
remind you that attach_one_default_qdisc() doesn't handle it either.
Since no one complained about it, why this one is so special?


> and add a ptr_ring API to use the preallocated buffers.


What ptr_ring API could cure netdev tx queues problem here?
Looks like you still don't understand the problem here.


>
> We have time (its only in net-next) so lets do the complete fix
> rather than this series IMO.
>

Why not just accept this and complete the error handling later
given the fact that I already add a TODO? IOW, why it is now?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
  2018-01-29  5:35 ` [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change " John Fastabend
  2018-01-29  5:57   ` Cong Wang
@ 2018-01-29  6:01   ` Cong Wang
  1 sibling, 0 replies; 21+ messages in thread
From: Cong Wang @ 2018-01-29  6:01 UTC (permalink / raw)
  To: John Fastabend; +Cc: Linux Kernel Network Developers

On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> We have time (its only in net-next) so lets do the complete fix
> rather than this series IMO.

Also have to remind you: this patchset fixes a regression introduced
by your patchset in net-next, it is not a new feature. I don't think it is
a serious regression, but it is still one...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
  2018-01-29  5:57   ` Cong Wang
@ 2018-01-29  6:09     ` John Fastabend
  2018-01-29  6:25       ` Cong Wang
  0 siblings, 1 reply; 21+ messages in thread
From: John Fastabend @ 2018-01-29  6:09 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers

On 01/28/2018 09:57 PM, Cong Wang wrote:
> On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>> On 01/25/2018 06:26 PM, Cong Wang wrote:
>>> This pathcset restores the pfifo_fast qdisc behavior of dropping
>>> packets based on latest dev->tx_queue_len. Patch 1 introduces
>>> a helper, patch 2 introduces a new Qdisc ops which is called when
>>> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
>>>
>>> Please see each patch for details.
>>>
>>
>> Overall this series is better than what we have at the moment, but
>> a better fix would preallocate the memory, to avoid ENOMEM errors,
> 
> I am not against for better ENOMEM error handling, but I still have to
> remind you that attach_one_default_qdisc() doesn't handle it either.
> Since no one complained about it, why this one is so special?

Its not we should fix both cases. And also clean up the multiple
net sync calls in these paths as well.

> 
> 
>> and add a ptr_ring API to use the preallocated buffers.
> 
> 
> What ptr_ring API could cure netdev tx queues problem here?
> Looks like you still don't understand the problem here.
> 

The resize multiple array API can only fail due to alloc failures.
We need to break this API into two pieces preallocate the memory
and then commit array changes. Alternatively the qdisc layer could
allocate new arrays and then swap them after all the arrays been
initialized correctly using ptr_ring APIs below the ptr_ring
resize API calls.

Having ptr_ring API operations to support this seems best.

> 
>>
>> We have time (its only in net-next) so lets do the complete fix
>> rather than this series IMO.
>>
> 
> Why not just accept this and complete the error handling later
> given the fact that I already add a TODO? IOW, why it is now?
> 

Because its not correct and its not too much work to get it so
the error is avoided.

I don't mind if the patches go in as-is a follow up patch can
also fix the TODO item.

.John

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
  2018-01-29  6:09     ` John Fastabend
@ 2018-01-29  6:25       ` Cong Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Cong Wang @ 2018-01-29  6:25 UTC (permalink / raw)
  To: John Fastabend; +Cc: Linux Kernel Network Developers

On Sun, Jan 28, 2018 at 10:09 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 01/28/2018 09:57 PM, Cong Wang wrote:
>> On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
>> <john.fastabend@gmail.com> wrote:
>>> On 01/25/2018 06:26 PM, Cong Wang wrote:
>>>> This pathcset restores the pfifo_fast qdisc behavior of dropping
>>>> packets based on latest dev->tx_queue_len. Patch 1 introduces
>>>> a helper, patch 2 introduces a new Qdisc ops which is called when
>>>> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
>>>>
>>>> Please see each patch for details.
>>>>
>>>
>>> Overall this series is better than what we have at the moment, but
>>> a better fix would preallocate the memory, to avoid ENOMEM errors,
>>
>> I am not against for better ENOMEM error handling, but I still have to
>> remind you that attach_one_default_qdisc() doesn't handle it either.
>> Since no one complained about it, why this one is so special?
>
> Its not we should fix both cases. And also clean up the multiple
> net sync calls in these paths as well.

Now can we agree error handling can be improved later? You
already agree this is not a new problem introduced by this patchset,
why do you want to block a regression fix just because of an old
problem which I will fix later?


>
>>
>>
>>> and add a ptr_ring API to use the preallocated buffers.
>>
>>
>> What ptr_ring API could cure netdev tx queues problem here?
>> Looks like you still don't understand the problem here.
>>
>
> The resize multiple array API can only fail due to alloc failures.
> We need to break this API into two pieces preallocate the memory
> and then commit array changes. Alternatively the qdisc layer could
> allocate new arrays and then swap them after all the arrays been
> initialized correctly using ptr_ring APIs below the ptr_ring
> resize API calls.
>
> Having ptr_ring API operations to support this seems best.
>


Apparently qdisc layer doesn't care about ptr_ring, as you describe
here this potentially needs to change two layers: 1) qdisc layer
2) ptr_ring API. It is more work than just a simple error handling,
potentially bigger than this patchset itself.


>>
>>>
>>> We have time (its only in net-next) so lets do the complete fix
>>> rather than this series IMO.
>>>
>>
>> Why not just accept this and complete the error handling later
>> given the fact that I already add a TODO? IOW, why it is now?
>>
>
> Because its not correct and its not too much work to get it so
> the error is avoided.

So you must want a fix for net too, because attach_one_default_qdisc()
"is not correct" either and "it is not too much work".

I can't agree on any of your claims here. Sorry.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
  2018-01-26  2:26 [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast Cong Wang
                   ` (3 preceding siblings ...)
  2018-01-29  5:35 ` [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change " John Fastabend
@ 2018-01-29 17:43 ` David Miller
  2018-01-30  0:12   ` Cong Wang
  4 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2018-01-29 17:43 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, john.fastabend

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 25 Jan 2018 18:26:21 -0800

> This pathcset restores the pfifo_fast qdisc behavior of dropping
> packets based on latest dev->tx_queue_len. Patch 1 introduces
> a helper, patch 2 introduces a new Qdisc ops which is called when
> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
> 
> Please see each patch for details.
> 
> ---
> v3: use skb_array_resize_multiple()
> v2: handle error case for ->change_tx_queue_len()

Series applied, thanks Cong.

Please follow up with John about making the queue allocation use
a prepare + commit phase.

And as for the TX queue state handling on change, I think it's
fine to purge the whole queue.  That is definitely better than
allowing the queue length to be visibly over the tx_queue_len
setting.

Thank you.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast
  2018-01-29 17:43 ` David Miller
@ 2018-01-30  0:12   ` Cong Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Cong Wang @ 2018-01-30  0:12 UTC (permalink / raw)
  To: David Miller; +Cc: Linux Kernel Network Developers, John Fastabend

On Mon, Jan 29, 2018 at 9:43 AM, David Miller <davem@davemloft.net> wrote:
>
> Please follow up with John about making the queue allocation use
> a prepare + commit phase.

Will do it once net-next is re-open.


>
> And as for the TX queue state handling on change, I think it's
> fine to purge the whole queue.  That is definitely better than
> allowing the queue length to be visibly over the tx_queue_len
> setting.
>

OK. Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len()
  2018-01-26  2:26 ` [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len() Cong Wang
@ 2018-06-27 16:14   ` Eric Dumazet
  2018-06-27 17:41     ` Cong Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2018-06-27 16:14 UTC (permalink / raw)
  To: Cong Wang, netdev; +Cc: john.fastabend



On 01/25/2018 06:26 PM, Cong Wang wrote:
> This patch promotes the local change_tx_queue_len() to a core
> helper function, dev_change_tx_queue_len(), so that rtnetlink
> and net-sysfs could share the code. This also prepares for the
> following patch.
> 
> Note, the -EFAULT in the original code doesn't make sense,
> we should propagate the errno from notifiers.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  include/linux/netdevice.h |  1 +
>  net/core/dev.c            | 28 ++++++++++++++++++++++++++++
>  net/core/net-sysfs.c      | 25 +------------------------
>  net/core/rtnetlink.c      | 18 +++++-------------
>  4 files changed, 35 insertions(+), 37 deletions(-)
> 

Hi Cong

What about using dev_change_tx_queue_len() helper from SIOCSIFTXQLEN path in
net/core/dev_ioctl.c ?

This would make sure we call dev_qdisc_change_tx_queue_len() in this case.

Thanks !

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len()
  2018-06-27 16:14   ` Eric Dumazet
@ 2018-06-27 17:41     ` Cong Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Cong Wang @ 2018-06-27 17:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Developers, John Fastabend

On Wed, Jun 27, 2018 at 9:14 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 01/25/2018 06:26 PM, Cong Wang wrote:
> > This patch promotes the local change_tx_queue_len() to a core
> > helper function, dev_change_tx_queue_len(), so that rtnetlink
> > and net-sysfs could share the code. This also prepares for the
> > following patch.
> >
> > Note, the -EFAULT in the original code doesn't make sense,
> > we should propagate the errno from notifiers.
> >
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >  include/linux/netdevice.h |  1 +
> >  net/core/dev.c            | 28 ++++++++++++++++++++++++++++
> >  net/core/net-sysfs.c      | 25 +------------------------
> >  net/core/rtnetlink.c      | 18 +++++-------------
> >  4 files changed, 35 insertions(+), 37 deletions(-)
> >
>
> Hi Cong
>
> What about using dev_change_tx_queue_len() helper from SIOCSIFTXQLEN path in
> net/core/dev_ioctl.c ?
>
> This would make sure we call dev_qdisc_change_tx_queue_len() in this case.
>

Good catch! Will send a patch for net-next.

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-06-27 17:41 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-26  2:26 [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change for pfifo_fast Cong Wang
2018-01-26  2:26 ` [Patch net-next v3 1/3] net: introduce helper dev_change_tx_queue_len() Cong Wang
2018-06-27 16:14   ` Eric Dumazet
2018-06-27 17:41     ` Cong Wang
2018-01-26  2:26 ` [Patch net-next v3 2/3] net_sched: plug in qdisc ops change_tx_queue_len Cong Wang
2018-01-26 14:16   ` Michael S. Tsirkin
2018-01-29  2:31     ` Cong Wang
2018-01-26  2:26 ` [Patch net-next v3 3/3] net_sched: implement ->change_tx_queue_len() for pfifo_fast Cong Wang
2018-01-26  3:57   ` Jason Wang
2018-01-26  4:01     ` Cong Wang
2018-01-26 14:10       ` Michael S. Tsirkin
2018-01-29  2:33         ` Cong Wang
2018-01-29  3:31       ` Jason Wang
2018-01-26 13:48     ` Michael S. Tsirkin
2018-01-29  5:35 ` [Patch net-next v3 0/3] net_sched: reflect tx_queue_len change " John Fastabend
2018-01-29  5:57   ` Cong Wang
2018-01-29  6:09     ` John Fastabend
2018-01-29  6:25       ` Cong Wang
2018-01-29  6:01   ` Cong Wang
2018-01-29 17:43 ` David Miller
2018-01-30  0:12   ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.